News & Events

1,298 News items and Awards found.



  •  AWARD    Jonathan Le Roux elevated to IEEE Fellow
    Date: January 1, 2024
    Awarded to: Jonathan Le Roux
    MERL Contact: Jonathan Le Roux
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • MERL Distinguished Scientist and Speech & Audio Senior Team Leader Jonathan Le Roux has been elevated to IEEE Fellow, effective January 2024, "for contributions to multi-source speech and audio processing."

      Mitsubishi Electric celebrated Dr. Le Roux's elevation and that of another researcher from the company, Dr. Shumpei Kameyama, with a worldwide news release on February 15.

      Dr. Jonathan Le Roux has made fundamental contributions to the field of multi-speaker speech processing, especially to the areas of speech separation and multi-speaker end-to-end automatic speech recognition (ASR). His contributions constituted a major advance in realizing a practically usable solution to the cocktail party problem, enabling machines to replicate humans’ ability to concentrate on a specific sound source, such as a certain speaker within a complex acoustic scene—a long-standing challenge in the speech signal processing community. Additionally, he has made key contributions to the measures used for training and evaluating audio source separation methods, developing several new objective functions to improve the training of deep neural networks for speech enhancement, and analyzing the impact of metrics used to evaluate the signal reconstruction quality. Dr. Le Roux’s technical contributions have been crucial in promoting the widespread adoption of multi-speaker separation and end-to-end ASR technologies across various applications, including smart speakers, teleconferencing systems, hearables, and mobile devices.

      IEEE Fellow is the highest grade of membership of the IEEE. It honors members with an outstanding record of technical achievements, contributing importantly to the advancement or application of engineering, science and technology, and bringing significant value to society. Each year, following a rigorous evaluation procedure, the IEEE Fellow Committee recommends a select group of recipients for elevation to IEEE Fellow. Less than 0.1% of voting members are selected annually for this member grade elevation.
  •  AWARD    Honorable Mention Award at NeurIPS 23 Instruction Workshop
    Date: December 15, 2023
    Awarded to: Lingfeng Sun, Devesh K. Jha, Chiori Hori, Siddarth Jain, Radu Corcodel, Xinghao Zhu, Masayoshi Tomizuka and Diego Romeres
    MERL Contacts: Radu Corcodel; Chiori Hori; Siddarth Jain; Devesh K. Jha; Diego Romeres
    Research Areas: Artificial Intelligence, Machine Learning, Robotics
    Brief
    • MERL researchers received an Honorable Mention Award at the Workshop on Instruction Tuning and Instruction Following at the NeurIPS 2023 conference in New Orleans. The workshop focused on instruction tuning and instruction following for Large Language Models (LLMs). MERL researchers presented their work on interactive planning using LLMs for partially observable robotic tasks during the workshop's oral presentation session.
  •  AWARD    MERL team wins the Audio-Visual Speech Enhancement (AVSE) 2023 Challenge
    Date: December 16, 2023
    Awarded to: Zexu Pan, Gordon Wichern, Yoshiki Masuyama, François Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux
    MERL Contacts: François Germain; Chiori Hori; Sameer Khurana; Jonathan Le Roux; Zexu Pan; Gordon Wichern
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • MERL's Speech & Audio team ranked 1st out of 12 teams in the 2nd COG-MHEAR Audio-Visual Speech Enhancement Challenge (AVSE). The team was led by Zexu Pan and also included Gordon Wichern, Yoshiki Masuyama, François Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux.

      The AVSE challenge aims to design better speech enhancement systems by harnessing the visual aspects of speech (such as lip movements and gestures) in a manner similar to the brain’s multi-modal integration strategies. MERL’s system was a scenario-aware audio-visual TF-GridNet that incorporates the face recording of a target speaker as a conditioning factor and also recognizes whether the predominant interference signal is speech or background noise. In addition to outperforming all competing systems by a wide margin in terms of objective metrics, MERL’s model achieved the best overall word intelligibility score in a listening test: 84.54%, compared to 57.56% for the baseline and 80.41% for the next best team. Fisher’s least significant difference (LSD) was 2.14%, indicating that our model offered statistically significant speech intelligibility improvements compared to all other systems.
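      The significance claim above can be checked with simple arithmetic: under Fisher's LSD, two mean scores are considered significantly different when their gap exceeds the LSD. The sketch below uses only the three scores quoted in this item; the dictionary keys and the function name `significant_gaps` are illustrative, not part of the challenge tooling.

```python
# Word intelligibility scores (%) quoted above for the AVSE listening test.
scores = {"MERL": 84.54, "next best team": 80.41, "baseline": 57.56}
LSD = 2.14  # Fisher's least significant difference, in percentage points


def significant_gaps(scores, reference, lsd):
    """Return {system: (gap, exceeds_lsd)} for every system except `reference`."""
    ref = scores[reference]
    return {
        name: (round(ref - value, 2), (ref - value) > lsd)
        for name, value in scores.items()
        if name != reference
    }


gaps = significant_gaps(scores, "MERL", LSD)
for name, (gap, significant) in gaps.items():
    print(f"MERL vs {name}: +{gap:.2f} points, exceeds LSD: {significant}")
```

      The gap to the next best team (4.13 points) is roughly twice the LSD, which is what supports the statement that the improvement is statistically significant.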
  •  NEWS    Karl Berntorp joins the Editorial Board of IEEE Transactions on Control Systems Technology
    Date: December 7, 2023
    MERL Contact: Karl Berntorp
    Research Areas: Control, Dynamical Systems
    Brief
    • Karl Berntorp has joined the Editorial Board of the IEEE Transactions on Control Systems Technology (T-CST) as an Associate Editor. The IEEE T-CST publishes peer-reviewed papers on technological advances in the design, realization, and operation of control systems, and bridges the gap between the theory and practice of control engineering.
  •  NEWS    Ankush Chakrabarty served as Co-Chair of ACM BALANCES 2023
    Date: November 14, 2023
    Where: Istanbul, Turkey
    MERL Contact: Ankush Chakrabarty
    Research Areas: Control, Data Analytics, Machine Learning, Multi-Physical Modeling, Optimization
    Brief
    • Ankush Chakrabarty, Principal Research Scientist in the Multiphysical Systems team at MERL, served as Co-Chair of the 3rd ACM International Workshop on Big Data and Machine Learning for Smart Buildings and Cities (BALANCES'23). The workshop spotlighted two IEA EBC Annexes: Annex 81 - Data-Driven Smart Buildings and Annex 82 - Energy Flexible Buildings Towards Resilient Low Carbon Energy Systems.
  •  NEWS    MERL Researchers give a Tutorial Talk on Quantum Machine Learning for Sensing and Communications at IEEE VCC
    Date: November 28, 2023 - November 30, 2023
    Where: Virtual
    MERL Contacts: Toshiaki Koike-Akino; Pu (Perry) Wang
    Research Areas: Artificial Intelligence, Communications, Computational Sensing, Machine Learning, Signal Processing
    Brief
    • On November 28, 2023, MERL researchers Toshiaki Koike-Akino and Pu (Perry) Wang will give a 3-hour tutorial presentation at the first IEEE Virtual Conference on Communications (VCC). The talk, titled "Post-Deep Learning Era: Emerging Quantum Machine Learning for Sensing and Communications," addresses recent trends, challenges, and advances in sensing and communications. P. Wang presents use cases, industry trends, signal processing, and deep learning for Wi-Fi integrated sensing and communications (ISAC), while T. Koike-Akino discusses the future of deep learning, giving a comprehensive overview of artificial intelligence (AI) technologies, natural computing, emerging quantum AI, and their diverse applications. The tutorial is conducted virtually.

      IEEE VCC is a new, fully virtual conference launched by the IEEE Communications Society. It gathers researchers from academia and industry who are unable to travel but wish to present their recent scientific results and engage in interactive discussions with fellow researchers working in their fields. It is designed to overcome potential hardships such as pandemic restrictions, visa issues, travel problems, or financial difficulties.
  •  NEWS    Anoop Cherian gives a podcast interview with AI Business
    Date: September 26, 2023
    Where: Virtual
    MERL Contact: Anoop Cherian
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
    Brief
    • Anoop Cherian, a Senior Principal Research Scientist in the Computer Vision team at MERL, gave a podcast interview with award-winning journalist Deborah Yao. Deborah is the editor of AI Business, a leading content platform for artificial intelligence and its real-world applications that delivers up-to-the-minute insights into how AI technologies are affecting the global economy and society. The podcast was based on recent research by Anoop and his MERL colleagues with collaborators at MIT, which attempts to objectively answer the question: are current deep neural networks smarter than second graders? The podcast discusses the shortcomings this research reveals in recent artificial general intelligence systems with regard to knowledge abstraction, learning, and generalization.
  •  NEWS    Invited talk given by Diego Romeres at Bentley University
    Date: November 1, 2023
    MERL Contact: Diego Romeres
    Research Areas: Artificial Intelligence, Machine Learning, Robotics
    Brief
    • Principal Research Scientist and Team Leader Diego Romeres gave an invited talk titled 'Applications of Machine Learning to Robotics' in the Machine Learning graduate course at Bentley University. The presentation focused mainly on Reinforcement Learning research applied to robotics. The audience consisted mostly of Master’s in Business Analytics (MSBA) students and students in the MBA with Business Analytics Concentration program.
  •  NEWS    MERL co-organizes the 2023 Sound Demixing (SDX2023) Challenge and Workshop
    Date: January 23, 2023 - November 4, 2023
    Where: International Society for Music Information Retrieval Conference (ISMIR)
    MERL Contacts: Jonathan Le Roux; Gordon Wichern
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • MERL Speech & Audio team members Gordon Wichern and Jonathan Le Roux co-organized the 2023 Sound Demixing Challenge along with researchers from Sony, Moises AI, Audioshake, and Meta.

      The SDX2023 Challenge was hosted on the AIcrowd platform and had a prize pool of $42,000 distributed to the winning teams across two tracks: Music Demixing and Cinematic Sound Demixing. A unique aspect of this challenge was the ability to test the audio source separation models developed by challenge participants on non-public songs from Sony Music Entertainment Japan for the music demixing track, and movie soundtracks from Sony Pictures for the cinematic sound demixing track. The challenge ran from January 23rd to May 1st, 2023, and had 884 participants distributed across 68 teams submitting 2828 source separation models. The winners will be announced at the SDX2023 Workshop, which will take place as a satellite event of the International Society for Music Information Retrieval Conference (ISMIR) in Milan, Italy on November 4, 2023.

      MERL’s contribution to SDX2023 focused mainly on the cinematic demixing track. In addition to sponsoring the prizes awarded to the winning teams for that track, the baseline system and initial training data were MERL’s Cocktail Fork separation model and Divide and Remaster dataset, respectively. MERL researchers also contributed to a Town Hall kicking off the challenge, co-authored a scientific paper describing the challenge outcomes, and co-organized the SDX2023 Workshop.
  •  NEWS    MERL researchers presenting four papers and organizing the VLAR-SMART-101 Workshop at ICCV 2023
    Date: October 2, 2023 - October 6, 2023
    Where: Paris/France
    MERL Contacts: Moitreya Chatterjee; Anoop Cherian; Michael J. Jones; Toshiaki Koike-Akino; Suhas Lohit; Tim K. Marks; Pedro Miraldo; Kuan-Chuan Peng; Ye Wang
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
    Brief
    • MERL researchers are presenting 4 papers and organizing the VLAR-SMART-101 workshop at the ICCV 2023 conference, which will be held in Paris, France October 2-6. ICCV is one of the most prestigious and competitive international conferences in computer vision. Details are provided below.

      1. Conference paper: “Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis,” by Nithin Gopalakrishnan Nair, Anoop Cherian, Suhas Lohit, Ye Wang, Toshiaki Koike-Akino, Vishal Patel, and Tim K. Marks

      Conditional generative models typically demand large annotated training sets to achieve high-quality synthesis. As a result, there has been significant interest in plug-and-play generation, i.e., using a pre-defined model to guide the generative process. In this paper, we introduce Steered Diffusion, a generalized framework for fine-grained photorealistic zero-shot conditional image generation using a diffusion model trained for unconditional generation. The key idea is to steer the image generation of the diffusion model during inference via designing a loss using a pre-trained inverse model that characterizes the conditional task. Our model shows clear qualitative and quantitative improvements over state-of-the-art diffusion-based plug-and-play models, while adding negligible computational cost.

      2. Conference paper: "BANSAC: A dynamic BAyesian Network for adaptive SAmple Consensus," by Valter Piedade and Pedro Miraldo

      We derive a dynamic Bayesian network that updates individual data points' inlier scores while iterating RANSAC. At each iteration, we apply weighted sampling using the updated scores. Our method works with or without prior data point scorings. In addition, we use the updated inlier/outlier scoring for deriving a new stopping criterion for the RANSAC loop. Our method outperforms the baselines in accuracy while needing less computational time.

      3. Conference paper: "Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes," by Fabien Delattre, David Dirnfeld, Phat Nguyen, Stephen Scarano, Michael J. Jones, Pedro Miraldo, and Erik Learned-Miller

      We present a novel approach to estimating camera rotation in crowded, real-world scenes captured using a handheld monocular video camera. Our method uses a novel generalization of the Hough transform on SO(3) to efficiently find the camera rotation most compatible with the optical flow. Because this setting is not addressed well by other datasets, we provide a new dataset and benchmark, with high-accuracy and rigorously annotated ground truth on 17 video sequences. Our method is more accurate by almost 40 percent than the next best method.

      4. Workshop paper: "Tensor Factorization for Leveraging Cross-Modal Knowledge in Data-Constrained Infrared Object Detection" by Manish Sharma*, Moitreya Chatterjee*, Kuan-Chuan Peng, Suhas Lohit, and Michael Jones

      While state-of-the-art object detection methods for RGB images have reached some level of maturity, the same is not true for Infrared (IR) images. The primary bottleneck in bridging this gap is the lack of sufficient labeled training data for IR images. To address this issue, we present TensorFact, a novel tensor decomposition method which splits the convolution kernels of a CNN into low-rank factor matrices with fewer parameters. This compressed network is first pre-trained on RGB images and then augmented with only a few parameters. The augmented network is then trained on IR images, while freezing the weights trained on RGB. This prevents it from over-fitting, allowing it to generalize better. Experiments show that our method outperforms the state of the art.

      5. “Vision-and-Language Algorithmic Reasoning (VLAR) Workshop and SMART-101 Challenge” by Anoop Cherian,  Kuan-Chuan Peng, Suhas Lohit, Tim K. Marks, Ram Ramrakhya, Honglu Zhou, Kevin A. Smith, Joanna Matthiesen, and Joshua B. Tenenbaum

      MERL researchers, along with researchers from MIT, Georgia Tech, Math Kangaroo USA, and Rutgers University, are jointly organizing a workshop on vision-and-language algorithmic reasoning at ICCV 2023 and conducting a challenge based on the SMART-101 puzzles described in the paper "Are Deep Neural Networks SMARTer than Second Graders?". A focus of this workshop is to bring together outstanding faculty and researchers working at the intersections of vision, language, and cognition to provide their opinions on the recent breakthroughs in large language models and artificial general intelligence, as well as to showcase cutting-edge research that could inspire the audience to search for the missing pieces in our quest towards solving the puzzle of artificial intelligence.

      Workshop link: https://wvlar.github.io/iccv23/
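      As a rough illustration of the score-weighted sampling idea summarized for the BANSAC paper (item 2 above), the following minimal sketch fits a 2D line with RANSAC while drawing minimal samples in proportion to per-point inlier scores. It is not the paper's method: BANSAC updates scores with a dynamic Bayesian network and derives a new stopping criterion, whereas this sketch substitutes a simple exponential-moving-average update toward the best consensus set and runs a fixed number of iterations. All names, parameters, and the toy data are assumptions made for illustration.

```python
import random


def fit_line(p, q):
    """Line through p and q as (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        return None  # degenerate sample: identical points
    return a / norm, b / norm, -(a * x1 + b * y1) / norm


def weighted_ransac_line(points, threshold=0.1, iterations=200, step=0.1, seed=0):
    """RANSAC line fit whose minimal samples are drawn in proportion to inlier scores."""
    rng = random.Random(seed)
    n = len(points)
    scores = [0.5] * n  # assumed uninformative prior score for every point
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # Weighted sampling of two distinct point indices.
        i = rng.choices(range(n), weights=scores)[0]
        j = i
        while j == i:
            j = rng.choices(range(n), weights=scores)[0]
        model = fit_line(points[i], points[j])
        if model is None:
            continue
        a, b, c = model
        inliers = [k for k, (x, y) in enumerate(points)
                   if abs(a * x + b * y + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
        # Heuristic update (NOT the paper's Bayesian update): move each score
        # toward 1 if the point is in the best consensus set so far, else toward 0.
        # Clipping keeps every point sampleable.
        best_set = set(best_inliers)
        for k in range(n):
            target = 1.0 if k in best_set else 0.0
            scores[k] = min(0.99, max(0.01, (1 - step) * scores[k] + step * target))
    return best_model, best_inliers, scores


# Toy data: 20 points on y = 2x + 1 followed by 5 gross outliers.
points = [(0.5 * k, 2 * (0.5 * k) + 1) for k in range(20)]
points += [(1.0, 9.0), (3.0, -4.0), (5.0, 20.0), (7.0, 2.0), (8.0, 30.0)]

model, inliers, scores = weighted_ransac_line(points)
print("inliers found:", len(inliers))
```

      Because points that repeatedly fall in the consensus set accumulate higher scores, later iterations spend more of their samples on likely inliers, which is the intuition behind adaptive sampling schemes of this kind.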
  •  AWARD    Best paper award at PHMAP 2023
    Date: September 14, 2023
    Awarded to: Dehong Liu, Anantaram Varatharajan, and Abraham Goldsmith
    MERL Contacts: Abraham Goldsmith; Dehong Liu
    Research Areas: Electric Systems, Signal Processing
    Brief
    • MERL researchers Dehong Liu, Anantaram Varatharajan, and Abraham Goldsmith were awarded one of three best paper awards at the Asia Pacific Conference of the Prognostics and Health Management Society 2023 (PHMAP23), held in Tokyo from September 11th to 14th, 2023, for their co-authored paper titled 'Extracting Broken-Rotor-Bar Fault Signature of Varying-Speed Induction Motors.'

      PHMAP is a biennial international conference specializing in prognostics and health management. PHMAP23 attracted more than 300 attendees from around the world and published more than 160 regular papers from academia and industry, covering fields such as aerospace, production, civil engineering, and electronics.
  •  AWARD    Joint University of Padua-MERL team wins Challenge 'AI Olympics With RealAIGym'
    Date: August 25, 2023
    Awarded to: Alberto Dalla Libera, Niccolò Turcato, Giulio Giacomuzzo, Ruggero Carli, Diego Romeres
    MERL Contact: Diego Romeres
    Research Areas: Artificial Intelligence, Machine Learning, Robotics
    Brief
    • A joint team consisting of members of the University of Padua and MERL ranked 1st in the IJCAI 2023 Challenge "AI Olympics With RealAIGym: Is AI Ready for Athletic Intelligence in the Real World?". The team was composed of MERL researcher Diego Romeres and a group from the University of Padua (UniPD) consisting of Alberto Dalla Libera, Ph.D., Ph.D. candidates Niccolò Turcato and Giulio Giacomuzzo, and Prof. Ruggero Carli.

      The International Joint Conference on Artificial Intelligence (IJCAI) is a premier gathering for AI researchers and organizes several competitions. This year the competition CC7 "AI Olympics With RealAIGym: Is AI Ready for Athletic Intelligence in the Real World?" consisted of two stages: simulation and real-robot experiments on two under-actuated robotic systems. The two robotics systems were treated as separate tracks and one final winner was selected for each track based on specific performance criteria in the control tasks.

      The UniPD-MERL team competed and won in both tracks. The team's system made strong use of a Model-based Reinforcement Learning algorithm called MC-PILCO that we recently published in the journal IEEE Transactions on Robotics.
  •  AWARD    Best Paper Award at SDEMPED 2023
    Date: August 30, 2023
    Awarded to: Bingnan Wang, Hiroshi Inoue, and Makoto Kanemaru
    MERL Contact: Bingnan Wang
    Research Areas: Applied Physics, Data Analytics, Multi-Physical Modeling
    Brief
    • MERL and Mitsubishi Electric's paper titled “Motor Eccentricity Fault Detection: Physics-Based and Data-Driven Approaches” was awarded one of three best paper awards at the 14th IEEE International Symposium on Diagnostics for Electric Machines, Power Electronics and Drives (SDEMPED 2023). MERL Senior Principal Research Scientist Bingnan Wang presented the paper and received the award at the symposium. Co-authors of the paper include Mitsubishi Electric researchers Hiroshi Inoue and Makoto Kanemaru.

      SDEMPED was established as the only international symposium entirely devoted to the diagnostics of electrical machines, power electronics and drives. It is now a regular biennial event. The 14th edition, SDEMPED 2023, was held in Chania, Greece from August 28th to 31st, 2023.
  •  NEWS    MERL presents 9 papers at 2023 IFAC World Congress
    Date: July 9, 2023 - July 14, 2023
    MERL Contacts: Karl Berntorp; Scott A. Bortoff; Ankush Chakrabarty; Stefano Di Cairano; Christopher R. Laughman; Diego Romeres; Abraham P. Vinod
    Research Areas: Control, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Optimization, Robotics
    Brief
    • MERL researchers presented 9 papers and organized 2 invited/workshop sessions at the 2023 IFAC World Congress held in Yokohama, Japan.

      MERL's contributions covered topics including decision-making for autonomous vehicles, statistical and learning-based estimation for GNSS and energy systems, impedance control for delta robots, learning for system identification of rigid body dynamics and time-varying systems, and meta-learning for deep state-space modeling using data from similar systems. The invited session (MERL co-organizer: Ankush Chakrabarty) was on the topic of “Estimation and observer design: theory and applications” and the workshop (MERL co-organizer: Karl Berntorp) was on “Gaussian Process Learning for Systems and Control”.
  •  NEWS    MERL researchers present 3 papers on Dexterous Manipulation at RSS 23
    Date: July 11, 2023
    Where: Daegu, Korea
    MERL Contacts: Siddarth Jain; Devesh K. Jha; Arvind Raghunathan
    Research Areas: Artificial Intelligence, Machine Learning, Robotics
    Brief
    • MERL researchers presented 3 papers at the 19th edition of the Robotics: Science and Systems (RSS) conference in Daegu, Korea. RSS is the flagship conference of the RSS foundation and is run as a single-track conference presenting a limited number of high-quality papers. This year the main conference had a total of 112 papers presented. MERL researchers presented 2 papers in the main conference on planning and perception for dexterous manipulation. Another paper was presented in a workshop on learning for dexterous manipulation. More details can be found at https://roboticsconference.org.
  •  NEWS    Keynote address given by Philip Orlik at 9th annual IEEE SMARTCOMP conference
    Date: June 26, 2023
    Where: International Conference on Smart Computing (SMARTCOMP), Vanderbilt University, Nashville, Tennessee
    MERL Contact: Philip V. Orlik
    Research Areas: Communications, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Signal Processing
    Brief
    • VP & Research Director Philip Orlik gave a keynote titled "Smart Technologies for Smarter Buildings" at the 9th edition of the IEEE International Conference on Smart Computing (SMARTCOMP), focusing on some of the research challenges and opportunities that arise as we seek to achieve net-zero emissions in smart building environments.

      SMARTCOMP is the premier conference on smart computing. Smart computing is a multidisciplinary domain based on the synergistic influence of advances in sensor-based technologies, Internet of Things, cyber-physical systems, edge computing, big data analytics, machine learning, cognitive computing, and artificial intelligence.
  •  AWARD    MERL Intern and Researchers Win ICASSP 2023 Best Student Paper Award
    Date: June 9, 2023
    Awarded to: Darius Petermann, Gordon Wichern, Aswin Subramanian, Jonathan Le Roux
    MERL Contacts: Jonathan Le Roux; Gordon Wichern
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • Former MERL intern Darius Petermann (Ph.D. Candidate at Indiana University) has received a Best Student Paper Award at the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023) for the paper "Hyperbolic Audio Source Separation", co-authored with MERL researchers Gordon Wichern and Jonathan Le Roux, and former MERL researcher Aswin Subramanian. The paper presents work performed during Darius's internship at MERL in summer 2022. It introduces a framework for audio source separation using embeddings on a hyperbolic manifold that compactly represent the hierarchical relationship between sound sources and time-frequency features. The code associated with the paper is publicly available at https://github.com/merlresearch/hyper-unmix.

      ICASSP is the flagship conference of the IEEE Signal Processing Society (SPS). ICASSP 2023 was held on the Greek island of Rhodes from June 04 to June 10, 2023, and it was the largest ICASSP in history, with more than 4000 participants, 6128 submitted papers and 2709 accepted papers. Darius’s paper was first recognized as one of the Top 3% of all papers accepted at the conference, before receiving one of only 5 Best Student Paper Awards during the closing ceremony.
  •  AWARD    MERL’s Paper on Wi-Fi Sensing Earns Top 3% Paper Recognition at ICASSP 2023, Selected as a Best Student Paper Award Finalist
    Date: June 9, 2023
    Awarded to: Cristian J. Vaca-Rubio, Pu Wang, Toshiaki Koike-Akino, Ye Wang, Petros Boufounos and Petar Popovski
    MERL Contacts: Petros T. Boufounos; Toshiaki Koike-Akino; Pu (Perry) Wang; Ye Wang
    Research Areas: Artificial Intelligence, Communications, Computational Sensing, Dynamical Systems, Machine Learning, Signal Processing
    Brief
    • A MERL paper on Wi-Fi sensing was recognized as a Top 3% Paper among all 2709 accepted papers at the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023). Co-authored by Cristian Vaca-Rubio and Petar Popovski from Aalborg University, Denmark, and MERL researchers Pu Wang, Toshiaki Koike-Akino, Ye Wang, and Petros Boufounos, the paper "MmWave Wi-Fi Trajectory Estimation with Continuous-Time Neural Dynamic Learning" was also a Best Student Paper Award finalist.

      Performed during Cristian’s stay at MERL first as a visiting Marie Skłodowska-Curie Fellow and then as a full-time intern in 2022, this work capitalizes on standards-compliant Wi-Fi signals to perform indoor localization and sensing. The paper uses a neural dynamic learning framework to address technical issues such as low sampling rate and irregular sampling intervals.

      ICASSP, a flagship conference of the IEEE Signal Processing Society (SPS), was hosted on the Greek island of Rhodes from June 04 to June 10, 2023. ICASSP 2023 marked the largest ICASSP in history, boasting over 4000 participants and 6128 submitted papers, out of which 2709 were accepted.
  •  AWARD    Joint CMU-MERL team wins DCASE2023 Challenge on Automated Audio Captioning
    Date: June 1, 2023
    Awarded to: Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, François Germain, Jonathan Le Roux, Shinji Watanabe
    MERL Contacts: François Germain; Jonathan Le Roux; Gordon Wichern
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • A joint team consisting of members of WavLab, the group of CMU Professor and MERL alumnus Shinji Watanabe, and members of MERL's Speech & Audio team ranked 1st out of 11 teams in the DCASE2023 Challenge's Task 6A "Automated Audio Captioning". The team was led by student Shih-Lun Wu and also featured Ph.D. candidate Xuankai Chang, Postdoctoral research associate Jee-weon Jung, Prof. Shinji Watanabe, and MERL researchers Gordon Wichern, François Germain, and Jonathan Le Roux.

      The IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE Challenge), started in 2013, has been organized yearly since 2016, and gathers challenges on multiple tasks related to the detection, analysis, and generation of sound events. This year, the DCASE2023 Challenge received over 428 submissions from 123 teams across seven tasks.

      The CMU-MERL team competed in the Task 6A track, Automated Audio Captioning, which aims at generating informative descriptions for various sounds from nature and/or human activities. The team's system made strong use of large pretrained models, namely a BEATs transformer as part of the audio encoder stack, an Instructor Transformer encoding ground-truth captions to derive an audio-text contrastive loss on the audio encoder, and ChatGPT to produce caption mix-ups (i.e., grammatical and compact combinations of two captions) which, together with the corresponding audio mixtures, increase not only the amount but also the complexity and diversity of the training data. The team's best submission obtained a SPIDEr-FL score of 0.327 on the hidden test set, largely outperforming the 2nd best team's 0.315.
  •  NEWS    MERL researchers presenting four papers and co-organizing a workshop at CVPR 2023
    Date: June 18, 2023 - June 22, 2023
    Where: Vancouver/Canada
    MERL Contacts: Anoop Cherian; Michael J. Jones; Suhas Lohit; Kuan-Chuan Peng
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
    Brief
    • MERL researchers are presenting 4 papers and co-organizing a workshop at the CVPR 2023 conference, which will be held in Vancouver, Canada June 18-22. CVPR is one of the most prestigious and competitive international conferences in computer vision. Details are provided below.

      1. “Are Deep Neural Networks SMARTer than Second Graders?” by Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Kevin Smith, and Joshua B. Tenenbaum

      We present SMART: a Simple Multimodal Algorithmic Reasoning Task and the associated SMART-101 dataset for evaluating the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed for children in the 6-8 age group. Our experiments using SMART-101 reveal that powerful deep models are not better than random accuracy when analyzed for generalization. We also evaluate large language models (including ChatGPT) on a subset of SMART-101 and find that while these models show convincing reasoning abilities, their answers are often incorrect.

      Paper: https://arxiv.org/abs/2212.09993

      2. “EVAL: Explainable Video Anomaly Localization,” by Ashish Singh, Michael J. Jones, and Erik Learned-Miller

      This work presents a method for detecting unusual activities in videos by building a high-level model of activities found in nominal videos of a scene. The high-level features used in the model are human understandable and include attributes such as the object class and the directions and speeds of motion. Such high-level features allow our method to not only detect anomalous activity but also to provide explanations for why it is anomalous.

      Paper: https://arxiv.org/abs/2212.07900

      3. "Aligning Step-by-Step Instructional Diagrams to Video Demonstrations," by Jiahao Zhang, Anoop Cherian, Yanbin Liu, Yizhak Ben-Shabat, Cristian Rodriguez, and Stephen Gould

      The rise of do-it-yourself (DIY) videos on the web has made it possible even for an unskilled person (or a skilled robot) to imitate and follow instructions to complete complex real world tasks. In this paper, we consider the novel problem of aligning instruction steps that are depicted as assembly diagrams (commonly seen in Ikea assembly manuals) with video segments from in-the-wild videos. We present a new dataset: Ikea Assembly in the Wild (IAW) and propose a contrastive learning framework for aligning instruction diagrams with video clips.

      Paper: https://arxiv.org/pdf/2303.13800.pdf

      4. "HaLP: Hallucinating Latent Positives for Skeleton-Based Self-Supervised Learning of Actions," by Anshul Shah, Aniket Roy, Ketul Shah, Shlok Kumar Mishra, David Jacobs, Anoop Cherian, and Rama Chellappa

      In this work, we propose a new contrastive learning approach to train models for skeleton-based action recognition without labels. Our key contribution is a simple module, HaLP: Hallucinating Latent Positives for contrastive learning. HaLP explores the latent space of poses in suitable directions to generate new positives. Our experiments using HaLP demonstrate strong empirical improvements.

      Paper: https://arxiv.org/abs/2304.00387

      The 4th Workshop on Fair, Data-Efficient, and Trusted Computer Vision

      MERL researcher Kuan-Chuan Peng is co-organizing the fourth Workshop on Fair, Data-Efficient, and Trusted Computer Vision (https://fadetrcv.github.io/2023/) in conjunction with CVPR 2023 on June 18, 2023. This workshop provides a focused venue for discussing and disseminating research in the areas of fairness, bias, and trust in computer vision, as well as adjacent domains such as computational social science and public policy.
  •  
  •  NEWS    Mitsubishi Electric Corporation Press Release Announces World's First GaN Power Amplifier Capable of Wideband Operation for 4G, 5G and Beyond 5G/6G.
    Date: June 8, 2023
    MERL Contacts: Toshiaki Koike-Akino; Koon Hoo Teo
    Research Areas: Communications, Electronic and Photonic Devices, Machine Learning, Signal Processing
    Brief
    • Mitsubishi Electric Corporation announced today it has developed what is believed to be the world's first gallium nitride (GaN) power amplifier that achieves a frequency range of 3,400MHz using a single power amplifier, which the company has demonstrated can be used for 4G, 5G and Beyond 5G/6G communication systems operating at different frequencies in a single base station. The amplifier is expected to enable the radio unit (transceiver) to be shared with different communication systems and lead to more power-efficient base stations.

      Mitsubishi Electric researchers Toshiaki Koike-Akino and Koon Hoo Teo helped develop the technology and device. Technical details will be presented at the IEEE International Microwave Symposium 2023 this month.

      Please see the link below for the full press release from Mitsubishi Electric.
  •  
  •  NEWS    Abraham Vinod gave an invited talk at the University of California Santa Cruz
    Date: June 8, 2023
    Where: Zoom
    MERL Contact: Abraham P. Vinod
    Research Areas: Artificial Intelligence, Control, Dynamical Systems, Optimization, Robotics
    Brief
    • Abraham Vinod gave an invited talk at the Electrical and Computer Engineering Department, the University of California Santa Cruz, titled "Motion Planning under Constraints and Uncertainty using Data and Reachability". His presentation covered recent work on fast and safe motion planners that can allow for coordination among agents, mitigate uncertainty arising from sensing limitations and simplified models, and tolerate the possibility of failures.
  •  
  •  NEWS    Ankush Chakrabarty co-organized three sessions at the ACC2023, and was nominated for Best Energy Systems Paper.
    Date: May 30, 2023 - June 2, 2023
    Where: San Diego, CA
    MERL Contact: Ankush Chakrabarty
    Research Areas: Applied Physics, Artificial Intelligence, Control, Data Analytics, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Optimization, Robotics
    Brief
    • Ankush Chakrabarty (researcher, Multiphysical Systems Team) co-organized and spoke at 3 sessions at the 2023 American Control Conference in San Diego, CA. These include: (1) A tutorial session (w/ Stefano Di Cairano) on "Physics Informed Machine Learning for Modeling and Control": an effort with contributions from multiple academic institutes and US research labs; (2) An invited session on "Energy Efficiency in Smart Buildings and Cities" in which his paper (w/ Chris Laughman) on "Local Search Region Constrained Bayesian Optimization for Performance Optimization of Vapor Compression Systems" was nominated for Best Energy Systems Paper Award; and, (3) A special session on Diversity, Equity, and Inclusion to improve recruitment and retention of underrepresented groups in STEM research.
  •  
  •  AWARD    MERL Researchers Win Best Workshop Poster Award at the 2023 IEEE International Conference on Robotics and Automation (ICRA)
    Date: June 2, 2023
    Awarded to: Yuki Shirai, Devesh Jha, Arvind Raghunathan and Dennis Hong
    MERL Contacts: Devesh K. Jha; Arvind Raghunathan
    Research Areas: Artificial Intelligence, Optimization, Robotics
    Brief
    • MERL's paper titled "Closed-Loop Tactile Controller for Tool Manipulation" won the Best Poster Award in the workshop on "Embracing contacts: Making robots physically interact with our world". First author and MERL intern Yuki Shirai was presented with the award at a ceremony held at ICRA in London. MERL researchers Devesh Jha, Principal Research Scientist, and Arvind Raghunathan, Senior Principal Research Scientist and Senior Team Leader, as well as Prof. Dennis Hong of the University of California, Los Angeles, are also coauthors.

      The paper presents a technique for closed-loop manipulation of an object with a tool, using vision-based tactile sensors. More information about the workshop and the various speakers can be found at https://sites.google.com/view/icra2023embracingcontacts/home.
  •  
  •  NEWS    Abraham Vinod serves as a panelist at the Student Networking Event at American Control Conference 2023
    Date: June 1, 2023
    Where: San Diego, CA
    MERL Contact: Abraham P. Vinod
    Research Areas: Control, Optimization
    Brief
    • The student networking event provides an opportunity for all interested students attending American Control Conference 2023 to receive career advice from professionals working in industry, academia, and national laboratories during a structured event. The event aims to give students an engaging experience that illustrates the benefits of involvement in the control community and to encourage their continued participation as the future leaders in the field.
  •