Machine Learning
Data-driven approaches to designing intelligent algorithms.
MERL has a long history of research activity in machine learning, including the development of various boosting algorithms and contributions to the theory and practice of highly scalable collaborative filtering. Our recent work has focused on deep learning and reinforcement learning, with applications in a wide range of areas including automotive, robotics, factory automation, transportation, and building and home systems.
Researchers
Toshiaki Koike-Akino
Ye Wang
Jonathan Le Roux
Ankush Chakrabarty
Anoop Cherian
Gordon Wichern
Tim K. Marks
Philip V. Orlik
Michael J. Jones
Stefano Di Cairano
Daniel N. Nikovski
Kieran Parsons
Devesh K. Jha
Christopher R. Laughman
Diego Romeres
Pu (Perry) Wang
Karl Berntorp
Chiori Hori
Bingnan Wang
Yebin Wang
Suhas Lohit
Mouhacine Benosman
Hassan Mansour
Matthew Brand
Petros T. Boufounos
Arvind Raghunathan
Moitreya Chatterjee
Abraham P. Vinod
Jianlin Guo
Siddarth Jain
Kuan-Chuan Peng
Scott A. Bortoff
Vedang M. Deshpande
Jing Liu
Hongtao Qiao
William S. Yerazunis
Radu Corcodel
François Germain
Chungwei Lin
Dehong Liu
Saviz Mowlavi
Hongbo Sun
Wataru Tsujita
Sameer Khurana
Pedro Miraldo
James Queeney
Anthony Vetro
Jinyun Zhang
Jose Amaya
Abraham Goldsmith
Yanting Ma
Joshua Rapp
Avishai Weiss
Janek Ebbers
-
Awards
-
AWARD Jonathan Le Roux elevated to IEEE Fellow
Date: January 1, 2024
Awarded to: Jonathan Le Roux
MERL Contact: Jonathan Le Roux
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief
- MERL Distinguished Scientist and Speech & Audio Senior Team Leader Jonathan Le Roux has been elevated to IEEE Fellow, effective January 2024, "for contributions to multi-source speech and audio processing."
Mitsubishi Electric celebrated Dr. Le Roux's elevation and that of another researcher from the company, Dr. Shumpei Kameyama, with a worldwide news release on February 15.
Dr. Jonathan Le Roux has made fundamental contributions to the field of multi-speaker speech processing, especially to the areas of speech separation and multi-speaker end-to-end automatic speech recognition (ASR). His contributions constituted a major advance in realizing a practically usable solution to the cocktail party problem, enabling machines to replicate humans’ ability to concentrate on a specific sound source, such as a certain speaker within a complex acoustic scene—a long-standing challenge in the speech signal processing community. Additionally, he has made key contributions to the measures used for training and evaluating audio source separation methods, developing several new objective functions to improve the training of deep neural networks for speech enhancement, and analyzing the impact of metrics used to evaluate the signal reconstruction quality. Dr. Le Roux’s technical contributions have been crucial in promoting the widespread adoption of multi-speaker separation and end-to-end ASR technologies across various applications, including smart speakers, teleconferencing systems, hearables, and mobile devices.
IEEE Fellow is the highest grade of membership of the IEEE. It honors members with an outstanding record of technical achievements, contributing importantly to the advancement or application of engineering, science and technology, and bringing significant value to society. Each year, following a rigorous evaluation procedure, the IEEE Fellow Committee recommends a select group of recipients for elevation to IEEE Fellow. Less than 0.1% of voting members are selected annually for this member grade elevation.
-
AWARD Honorable Mention Award at NeurIPS 23 Instruction Workshop
Date: December 15, 2023
Awarded to: Lingfeng Sun, Devesh K. Jha, Chiori Hori, Siddharth Jain, Radu Corcodel, Xinghao Zhu, Masayoshi Tomizuka and Diego Romeres
MERL Contacts: Radu Corcodel; Chiori Hori; Siddarth Jain; Devesh K. Jha; Diego Romeres
Research Areas: Artificial Intelligence, Machine Learning, Robotics
Brief
- MERL researchers received an "Honorable Mention" award at the Workshop on Instruction Tuning and Instruction Following at the NeurIPS 2023 conference in New Orleans. The workshop covered instruction tuning and instruction following for Large Language Models (LLMs). MERL researchers presented their work on interactive planning using LLMs for partially observable robotic tasks during the workshop's oral presentation session.
-
AWARD MERL team wins the Audio-Visual Speech Enhancement (AVSE) 2023 Challenge
Date: December 16, 2023
Awarded to: Zexu Pan, Gordon Wichern, Yoshiki Masuyama, Francois Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux
MERL Contacts: François Germain; Chiori Hori; Sameer Khurana; Jonathan Le Roux; Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief
- MERL's Speech & Audio team ranked 1st out of 12 teams in the 2nd COG-MHEAR Audio-Visual Speech Enhancement Challenge (AVSE). The team was led by Zexu Pan, and also included Gordon Wichern, Yoshiki Masuyama, Francois Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux.
The AVSE challenge aims to design better speech enhancement systems by harnessing the visual aspects of speech (such as lip movements and gestures) in a manner similar to the brain’s multi-modal integration strategies. MERL’s system was a scenario-aware audio-visual TF-GridNet that incorporates the face recording of a target speaker as a conditioning factor and also recognizes whether the predominant interference signal is speech or background noise. In addition to outperforming all competing systems by a wide margin on objective metrics, MERL’s model achieved the best overall word intelligibility score in a listening test: 84.54%, compared to 57.56% for the baseline and 80.41% for the next best team. The Fisher’s least significant difference (LSD) was 2.14%, indicating that our model offered statistically significant speech intelligibility improvements compared to all other systems.
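As a rough check of the significance claim above, the reported scores can be compared against the LSD threshold. The sketch below assumes that a gap between two systems counts as significant when it exceeds the reported LSD of 2.14 percentage points. The function name `significant_gaps` and the dictionary layout are illustrative and are not part of the challenge tooling.

```python
# Word intelligibility scores (%) as reported for the listening test.
scores = {"MERL": 84.54, "next best team": 80.41, "baseline": 57.56}

# Fisher's least significant difference (%), as reported.
lsd = 2.14


def significant_gaps(scores, reference, lsd):
    """Return {system: gap} for systems whose gap to `reference` exceeds the LSD.

    Assumption: a difference larger than the LSD is treated as significant.
    """
    ref = scores[reference]
    return {
        name: round(ref - value, 2)
        for name, value in scores.items()
        if name != reference and abs(ref - value) > lsd
    }


gaps = significant_gaps(scores, "MERL", lsd)
print(gaps)
```

Under this assumption, both gaps (to the next best team and to the baseline) exceed the LSD, which is consistent with the statement that the improvement is significant against all other systems.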
-
News & Events
-
NEWS MERL researchers present 9 papers at ACC 2024
Date: July 10, 2024 - July 12, 2024
Where: Toronto, Canada
MERL Contacts: Karl Berntorp; Ankush Chakrabarty; Vedang M. Deshpande; Stefano Di Cairano; Christopher R. Laughman; Arvind Raghunathan; Abraham P. Vinod; Yebin Wang; Avishai Weiss
Research Areas: Artificial Intelligence, Control, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Optimization, Robotics
Brief
- MERL researchers presented 9 papers at the recently concluded American Control Conference (ACC) 2024 in Toronto, Canada. The papers covered a wide range of topics including data-driven spatial monitoring using heterogeneous robots, aircraft approach management near airports, computational fluid dynamics-based motion planning for drones facing winds, trajectory planning for coordinated monitoring using a team of drones and a ground carrier vehicle, ensemble Kalman smoothing-based model predictive control for motion planning for autonomous vehicles, system identification for lithium-ion batteries, physics-constrained deep Kalman filters for vapor compression systems, switched reference governors for constrained systems, and distributed road-map monitoring using onboard sensors.
As a sponsor of the conference, MERL maintained a booth for open discussions with researchers and students, and hosted a special session to discuss highlights of MERL research and work philosophy.
In addition, Abraham Vinod served as a panelist at the Student Networking Event at the conference. The student networking event provides an opportunity for all interested students to network with professionals working in industry, academia, and national laboratories during a structured event, and encourages their continued participation as the future leaders in the field.
-
NEWS Jianlin Guo delivered a keynote in IEEE ICC 2024 Workshop
Date: June 13, 2024
Where: IEEE International Conference on Communications (ICC)
MERL Contacts: Jianlin Guo; Philip V. Orlik; Kieran Parsons; Pu (Perry) Wang
Research Areas: Communications, Machine Learning, Signal Processing
Brief
- Jianlin Guo delivered a keynote titled "Private IoT Networks" at the IEEE International Conference on Communications (ICC) 2024 Workshop "Industrial Private 5G-and-Beyond Wireless Networks", held in Denver, Colorado from June 9-13. The ICC is one of the IEEE Communications Society’s two flagship conferences.
Abstract: With the advent of private 5G-and-Beyond communication technologies, private IoT networks have been emerging. In private IoT networks, network owners have full control on the network resource management. However, to fully realize private IoT networks, the upper layer technologies need to be developed as well. This keynote presents machine learning based anomaly detection in manufacturing systems, innovative multipath TCP technologies over heterogeneous wireless IoT networks, novel channel resource scheduling in private 5G networks and efficient wireless coexistence of the heterogeneous wireless systems.
-
Research Highlights
- TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models
- Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-Aware Spatio-Temporal Sampling
- Steered Diffusion
- Edge-Assisted Internet of Vehicles for Smart Mobility
- Robust Machine Learning
- mmWave Beam-SNR Fingerprinting (mmBSF)
- Video Anomaly Detection
- Biosignal Processing for Human-Machine Interaction
- MERL Shopping Dataset
-
Internships
-
CI2091: Robust AI for Operational Technology Security
MERL is seeking a highly motivated and qualified intern to work on operational technology security. The ideal candidate would have significant research experience in cybersecurity for operational technology, anomaly detection, robust machine learning, and defenses against adversarial examples. A mature understanding of modern machine learning methods, proficiency with Python, and familiarity with deep learning frameworks are expected. Candidates at or beyond the middle of their Ph.D. program are encouraged to apply. The expected duration is 3 months with flexible start dates.
-
OR2196: Visuo-tactile Learning for Dexterous Manipulation
MERL is looking for a highly motivated individual to work on robotic manipulation using visuo-tactile learning. The research will develop robot motor skills for complex, dexterous manipulation using vision and tactile perception. The ideal candidate should have experience in one or more of the following topics: manipulation, tactile sensing, reinforcement learning, sim-to-real techniques for manipulation, and grasping. Senior PhD students in robotics and engineering with a focus on contact-rich manipulation are encouraged to apply. Prior experience working with physical robotic systems (and vision and tactile sensors) is required, as results need to be implemented on physical hardware. Good coding skills with Python ML libraries such as PyTorch are required. A successful internship will result in submission of results to a peer-reviewed robotics journal in collaboration with MERL researchers. The expected duration of the internship is 4-5 months, with a start date in August/September 2024. An onsite internship at MERL is preferred.
-
CA2132: Optimization Algorithms for Motion Planning and Predictive Control
MERL is looking for a highly motivated and qualified individual to work on tailored computational algorithms for optimization-based motion planning and predictive control applications in autonomous systems (vehicles, mobile robots). The ideal candidate should have experience in one or more of the following topics: convex and non-convex optimization, stochastic predictive control (e.g., scenario trees), interaction-aware motion planning, machine learning, learning-based model predictive control, mathematical programs with complementarity constraints (MPCCs), optimal control, and real-time optimization. PhD students in engineering or mathematics, especially those whose research relates to any of the above topics, are encouraged to apply. Publication of relevant results in conference proceedings or journals is expected. The ability to implement the designs and algorithms in MATLAB/Python is required; coding parts of the algorithms in C/C++ is a plus. The expected duration of the internship is 3 months, and the start date is flexible.
-
Openings
-
EA2051: Research Scientist - Control & Learning
-
OR2137: Research Scientist - Optimization & Intelligent Robotics
-
Recent Publications
- "Disentangled Acoustic Fields For Multimodal Physical Scene Understanding", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2024. TR2024-125
  @inproceedings{Yin2024sep,
    author = {Yin, Jie and Luo, Andrew and Du, Yilun and Cherian, Anoop and Marks, Tim K. and Le Roux, Jonathan and Gan, Chuang},
    title = {Disentangled Acoustic Fields For Multimodal Physical Scene Understanding},
    booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
    year = 2024,
    month = sep,
    url = {https://www.merl.com/publications/TR2024-125}
  }
- "PARIS: Pseudo-AutoRegressIve Siamese Training for Online Speech Separation", Interspeech, September 2024. TR2024-124
  @inproceedings{Pan2024sep,
    author = {Pan, Zexu and Wichern, Gordon and Germain, François G and Saijo, Kohei and Le Roux, Jonathan},
    title = {PARIS: Pseudo-AutoRegressIve Siamese Training for Online Speech Separation},
    booktitle = {Interspeech},
    year = 2024,
    month = sep,
    url = {https://www.merl.com/publications/TR2024-124}
  }
- "MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception", European Conference on Computer Vision (ECCV), September 2024. TR2024-117
  @inproceedings{Rahman2024sep,
    author = {Rahman, Mahbub and Yataka, Ryoma and Kato, Sorachi and Wang, Pu and Li, Peizhao and Cardace, Adriano and Boufounos, Petros T.},
    title = {MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = 2024,
    month = sep,
    url = {https://www.merl.com/publications/TR2024-117}
  }
- "Supervised Contrastive Learning for Electric Motor Bearing Fault Detection", International Conference on Electrical Machines (ICEM), September 2024. TR2024-120
  @inproceedings{Zhang2024sep,
    author = {Zhang, Hengrui and Wang, Bingnan},
    title = {Supervised Contrastive Learning for Electric Motor Bearing Fault Detection},
    booktitle = {International Conference on Electrical Machines (ICEM)},
    year = 2024,
    month = sep,
    url = {https://www.merl.com/publications/TR2024-120}
  }
- "MPC of Uncertain Nonlinear Systems with Meta-Learning for Fast Adaptation of Neural Predictive Models", International Conference on Automation Science and Engineering (CASE), August 2024. TR2024-115
  @inproceedings{Yan2024aug,
    author = {Yan, Jiaqi and Chakrabarty, Ankush and Rupenyan, Alisa and Lygeros, John},
    title = {MPC of Uncertain Nonlinear Systems with Meta-Learning for Fast Adaptation of Neural Predictive Models},
    booktitle = {International Conference on Automation Science and Engineering (CASE)},
    year = 2024,
    month = aug,
    url = {https://www.merl.com/publications/TR2024-115}
  }
- "Assessing Building Control Performance Using Physics-Based Simulation Models and Deep Generative Networks", IEEE Conference on Control Technology and Applications (CCTA) 2024, August 2024. TR2024-113
  @inproceedings{Chakrabarty2024aug,
    author = {Chakrabarty, Ankush and Vanfretti, Luigi and Bortoff, Scott A. and Deshpande, Vedang M. and Wang, Ye and Paulson, Joel A. and Zhan, Sicheng and Laughman, Christopher R.},
    title = {Assessing Building Control Performance Using Physics-Based Simulation Models and Deep Generative Networks},
    booktitle = {IEEE Conference on Control Technology and Applications (CCTA) 2024},
    year = 2024,
    month = aug,
    url = {https://www.merl.com/publications/TR2024-113}
  }
- "Bayesian Forecasting with Deep Generative Disturbance Models in Stochastic MPC for Building Energy Systems", IEEE Conference on Control Technology and Applications (CCTA), August 2024. TR2024-110
  @inproceedings{Sorouifar2024aug,
    author = {Sorouifar, Farshud and Paulson, Joel A. and Wang, Ye and Quirynen, Rien and Laughman, Christopher R. and Chakrabarty, Ankush},
    title = {Bayesian Forecasting with Deep Generative Disturbance Models in Stochastic MPC for Building Energy Systems},
    booktitle = {IEEE Conference on Control Technology and Applications (CCTA)},
    year = 2024,
    month = aug,
    url = {https://www.merl.com/publications/TR2024-110}
  }
- "Safe multi-agent motion planning under uncertainty for drones using filtered reinforcement learning", IEEE Transactions on Robotics, DOI: 10.1109/TRO.2024.3387010, Vol. 40, pp. 2529-2542, July 2024. TR2024-048
  @article{Safaoui2024jul,
    author = {Safaoui, Sleiman and Vinod, Abraham P. and Chakrabarty, Ankush and Quirynen, Rien and Yoshikawa, Nobuyuki and Di Cairano, Stefano},
    title = {Safe multi-agent motion planning under uncertainty for drones using filtered reinforcement learning},
    journal = {IEEE Transactions on Robotics},
    year = 2024,
    volume = 40,
    pages = {2529--2542},
    month = jul,
    doi = {10.1109/TRO.2024.3387010},
    url = {https://www.merl.com/publications/TR2024-048}
  }
-
Software & Data Downloads
- DeepBornFNO
- ComplexVAD Dataset
- Millimeter-wave Multi-View Radar Dataset
- Gear Extensions of Neural Radiance Fields
- Long-Tailed Anomaly Detection (LTAD) Dataset
- Target-Speaker SEParation
- Pixel-Grounded Prototypical Part Networks
- Steered Diffusion
- BAyesian Network for adaptive SAmple Consensus
- Simple Multimodal Algorithmic Reasoning Task Dataset
- Partial Group Convolutional Neural Networks
- SOurce-free Cross-modal KnowledgE Transfer
- Audio-Visual-Language Embodied Navigation in 3D Environments
- Nonparametric Score Estimators
- 3D MOrphable STyleGAN
- Instance Segmentation GAN
- Audio Visual Scene-Graph Segmentor
- Generalized One-class Discriminative Subspaces
- Hierarchical Musical Instrument Separation
- Generating Visual Dynamics from Sound and Context
- Adversarially-Contrastive Optimal Transport
- Online Feature Extractor Network
- MotionNet
- FoldingNet++
- Quasi-Newton Trust Region Policy Optimization
- Landmarks’ Location, Uncertainty, and Visibility Likelihood
- Robust Iterative Data Estimation
- Gradient-based Nikaido-Isoda
- Circular Maze Environment
- Discriminative Subspace Pooling
- Kernel Correlation Network
- Fast Resampling on Point Clouds via Graphs
- FoldingNet
- Deep Category-Aware Semantic Edge Detection
- MERL Shopping Dataset