Machine Learning
Data-driven approaches to design intelligent algorithms.
MERL has a long history of research activity in machine learning, including the development of various boosting algorithms and contributions to the theory and practice of highly scalable collaborative filtering. Our recent work has focused on deep learning and reinforcement learning, with applications in a wide range of areas including automotive, robotics, factory automation, and transportation, as well as building and home systems.
Researchers
Toshiaki
Koike-Akino
Jonathan
Le Roux
Ye
Wang
Ankush
Chakrabarty
Anoop
Cherian
Gordon
Wichern
Tim K.
Marks
Philip V.
Orlik
Michael J.
Jones
Daniel N.
Nikovski
Stefano
Di Cairano
Devesh K.
Jha
Kieran
Parsons
Christopher R.
Laughman
Diego
Romeres
Pu
(Perry)
Wang
Karl
Berntorp
Chiori
Hori
Bingnan
Wang
Yebin
Wang
Suhas
Lohit
Mouhacine
Benosman
Hassan
Mansour
Matthew
Brand
Arvind
Raghunathan
Petros T.
Boufounos
Moitreya
Chatterjee
Abraham P.
Vinod
Jianlin
Guo
Siddarth
Jain
Kuan-Chuan
Peng
William S.
Yerazunis
Scott A.
Bortoff
Radu
Corcodel
Chungwei
Lin
Vedang M.
Deshpande
François
Germain
Dehong
Liu
Jing
Liu
Saviz
Mowlavi
Hongtao
Qiao
Hongbo
Sun
Wataru
Tsujita
Sameer
Khurana
Pedro
Miraldo
James
Queeney
Anthony
Vetro
Ryoma
Yataka
Jinyun
Zhang
Jose
Amaya
Abraham
Goldsmith
Yanting
Ma
Joshua
Rapp
Avishai
Weiss
Janek
Ebbers
Ryo
Hase
Shinya
Tsuruta
-
Awards
-
AWARD Jonathan Le Roux elevated to IEEE Fellow
Date: January 1, 2024
Awarded to: Jonathan Le Roux
MERL Contact: Jonathan Le Roux
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief: MERL Distinguished Scientist and Speech & Audio Senior Team Leader Jonathan Le Roux has been elevated to IEEE Fellow, effective January 2024, "for contributions to multi-source speech and audio processing."
Mitsubishi Electric celebrated Dr. Le Roux's elevation and that of another researcher from the company, Dr. Shumpei Kameyama, with a worldwide news release on February 15.
Dr. Jonathan Le Roux has made fundamental contributions to the field of multi-speaker speech processing, especially to the areas of speech separation and multi-speaker end-to-end automatic speech recognition (ASR). His contributions constituted a major advance in realizing a practically usable solution to the cocktail party problem, enabling machines to replicate humans’ ability to concentrate on a specific sound source, such as a certain speaker within a complex acoustic scene—a long-standing challenge in the speech signal processing community. Additionally, he has made key contributions to the measures used for training and evaluating audio source separation methods, developing several new objective functions to improve the training of deep neural networks for speech enhancement, and analyzing the impact of metrics used to evaluate the signal reconstruction quality. Dr. Le Roux’s technical contributions have been crucial in promoting the widespread adoption of multi-speaker separation and end-to-end ASR technologies across various applications, including smart speakers, teleconferencing systems, hearables, and mobile devices.
IEEE Fellow is the highest grade of membership of the IEEE. It honors members with an outstanding record of technical achievements, contributing importantly to the advancement or application of engineering, science and technology, and bringing significant value to society. Each year, following a rigorous evaluation procedure, the IEEE Fellow Committee recommends a select group of recipients for elevation to IEEE Fellow. Less than 0.1% of voting members are selected annually for this member grade elevation.
-
AWARD Honorable Mention Award at NeurIPS 23 Instruction Workshop
Date: December 15, 2023
Awarded to: Lingfeng Sun, Devesh K. Jha, Chiori Hori, Siddharth Jain, Radu Corcodel, Xinghao Zhu, Masayoshi Tomizuka and Diego Romeres
MERL Contacts: Radu Corcodel; Chiori Hori; Siddarth Jain; Devesh K. Jha; Diego Romeres
Research Areas: Artificial Intelligence, Machine Learning, Robotics
Brief: MERL researchers received an Honorable Mention award at the Workshop on Instruction Tuning and Instruction Following at the NeurIPS 2023 conference in New Orleans. The workshop focused on instruction tuning and instruction following for large language models (LLMs). MERL researchers presented their work on interactive planning using LLMs for partially observable robotic tasks during the oral presentation session at the workshop.
-
AWARD MERL team wins the Audio-Visual Speech Enhancement (AVSE) 2023 Challenge
Date: December 16, 2023
Awarded to: Zexu Pan, Gordon Wichern, Yoshiki Masuyama, Francois Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux
MERL Contacts: François Germain; Chiori Hori; Sameer Khurana; Jonathan Le Roux; Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief: MERL's Speech & Audio team ranked 1st out of 12 teams in the 2nd COG-MHEAR Audio-Visual Speech Enhancement Challenge (AVSE). The team was led by Zexu Pan, and also included Gordon Wichern, Yoshiki Masuyama, Francois Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux.
The AVSE challenge aims to design better speech enhancement systems by harnessing the visual aspects of speech (such as lip movements and gestures) in a manner similar to the brain’s multi-modal integration strategies. MERL’s system was a scenario-aware audio-visual TF-GridNet that incorporates the face recording of a target speaker as a conditioning factor and also recognizes whether the predominant interference signal is speech or background noise. MERL’s model outperformed all competing systems on objective metrics by a wide margin and, in a listening test, achieved the best overall word intelligibility score of 84.54%, compared to 57.56% for the baseline and 80.41% for the next best team. Fisher’s least significant difference (LSD) was 2.14%, indicating that our model offered statistically significant speech intelligibility improvements compared to all other systems.
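The significance claim above comes down to simple arithmetic: a gap between two systems counts as significant when it exceeds the LSD. The short Python sketch below only re-checks the three scores quoted in the text. The function name `significant_gaps` and the dictionary layout are illustrative assumptions, not part of the challenge's evaluation code, and the scores of the remaining teams are not given here (they are assumed to be at or below the next best team's 80.41%).

```python
# Word intelligibility scores (%) and Fisher's LSD as quoted in the text above.
scores = {
    "MERL": 84.54,
    "next best team": 80.41,
    "baseline": 57.56,
}
lsd = 2.14  # Fisher's least significant difference, in percentage points


def significant_gaps(scores, lsd, reference="MERL"):
    """Compare the reference system against every other listed system.

    Returns {system: (gap_in_points, gap_exceeds_lsd)}.
    This is a hypothetical helper for illustration only.
    """
    ref = scores[reference]
    return {
        name: (round(ref - value, 2), (ref - value) > lsd)
        for name, value in scores.items()
        if name != reference
    }


gaps = significant_gaps(scores, lsd)
for name, (gap, significant) in gaps.items():
    print(f"MERL vs {name}: gap = {gap:.2f} points, exceeds LSD: {significant}")
```

The gaps are 4.13 points over the next best team and 26.98 points over the baseline, both larger than the 2.14-point LSD.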
-
News & Events
-
NEWS Jianlin Guo delivered a keynote in IEEE ICC 2024 Workshop
Date: June 13, 2024
Where: IEEE International Conference on Communications (ICC)
MERL Contacts: Jianlin Guo; Philip V. Orlik; Kieran Parsons; Pu (Perry) Wang
Research Areas: Communications, Machine Learning, Signal Processing
Brief: Jianlin Guo delivered a keynote titled "Private IoT Networks" at the IEEE International Conference on Communications (ICC) 2024 Workshop "Industrial Private 5G-and-Beyond Wireless Networks", held in Denver, Colorado from June 9-13. ICC is one of the IEEE Communications Society’s two flagship conferences.
Abstract: With the advent of private 5G-and-Beyond communication technologies, private IoT networks have been emerging. In private IoT networks, network owners have full control on the network resource management. However, to fully realize private IoT networks, the upper layer technologies need to be developed as well. This keynote presents machine learning based anomaly detection in manufacturing systems, innovative multipath TCP technologies over heterogeneous wireless IoT networks, novel channel resource scheduling in private 5G networks and efficient wireless coexistence of the heterogeneous wireless systems.
-
NEWS MERL at the International Conference on Robotics and Automation (ICRA) 2024
Date: May 13, 2024 - May 17, 2024
Where: Yokohama, Japan
MERL Contacts: Anoop Cherian; Radu Corcodel; Stefano Di Cairano; Chiori Hori; Siddarth Jain; Devesh K. Jha; Jonathan Le Roux; Diego Romeres; William S. Yerazunis
Research Areas: Artificial Intelligence, Machine Learning, Optimization, Robotics, Speech & Audio
Brief: MERL made significant contributions to both the organization and the technical program of the International Conference on Robotics and Automation (ICRA) 2024, which was held in Yokohama, Japan from May 13th to May 17th.
MERL was a Bronze sponsor of the conference, and exhibited a live robotic demonstration, which attracted a large audience. The demonstration showcased an Autonomous Robotic Assembly technology executed on MELCO's Assista robot arm and was the collaborative effort of the Optimization and Robotics Team together with the Advanced Technology department at Mitsubishi Electric.
MERL researchers from the Optimization and Robotics, Speech & Audio, and Control for Autonomy teams also presented 8 papers and 2 invited talks covering robotic assembly, applications of LLMs to robotics, human-robot interaction, safe and robust path planning for autonomous drones, transfer learning, perception, and tactile sensing.
-
Research Highlights
-
TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models -
Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-Aware Spatio-Temporal Sampling -
Steered Diffusion -
Edge-Assisted Internet of Vehicles for Smart Mobility -
Robust Machine Learning -
mmWave Beam-SNR Fingerprinting (mmBSF) -
Video Anomaly Detection -
Biosignal Processing for Human-Machine Interaction -
MERL Shopping Dataset
-
-
Internships
-
OR2196: Visuo-tactile Learning for Dexterous Manipulation
MERL is looking for a highly motivated individual to work on robotic manipulation using visuo-tactile learning. The research will develop robot motor skills for complex, dexterous manipulation using vision and tactile perception. The ideal candidate should have experience in one or more of the following topics: manipulation, tactile sensing, reinforcement learning, sim-to-real techniques for manipulation, and grasping. Senior PhD students in robotics and engineering with a focus on contact-rich manipulation are encouraged to apply. Prior experience working with physical robotic systems (and vision and tactile sensors) is required, as results need to be implemented on physical hardware. Good coding skills in Python and ML libraries such as PyTorch are required. A successful internship will result in submission of results to a peer-reviewed robotics journal in collaboration with MERL researchers. The expected duration of the internship is 4-5 months, with a start date in Aug/Sept 2024. An onsite internship at MERL is preferred.
-
SA2073: Multimodal scene-understanding
We are looking for a graduate student interested in helping advance the field of multimodal scene understanding, with a focus on scene understanding using natural language for robot dialog and/or indoor monitoring using a large language model. The intern will collaborate with MERL researchers to derive and implement new models and optimization methods, conduct experiments, and prepare results for publication. Internships regularly lead to one or more publications in top-tier venues, which can later become part of the intern's doctoral work. The ideal candidates are senior Ph.D. students with experience in deep learning for audio-visual, signal, and natural language processing. Good programming skills in Python and knowledge of deep learning frameworks such as PyTorch are essential. Multiple positions are available with flexible start dates (not just Spring/Summer but throughout 2024) and durations (typically 3-6 months).
-
OR2103: Human Robot Collaboration in Assembly Tasks
MERL is looking for a self-motivated and qualified candidate to work on human-robot interaction for collaborative manipulation and assembly scenarios. The ideal candidate is a PhD student with experience and a track record in one or more of the following areas: 1) control, estimation, and perception for robotic manipulation; 2) task and motion planning; 3) learning-from-demonstration algorithms applied to robotic manipulation; 4) machine learning techniques for modeling and control as well as regression and classification problems; 5) experience working with robotic systems and familiarity with physics engine simulators such as MuJoCo, Isaac Gym, and PyBullet. The successful candidate will be expected to develop, in collaboration with MERL employees, state-of-the-art algorithms to solve complex manipulation tasks that involve human-robot collaboration. Proficiency in Python and ROS is required. The expectation is that the research will lead to one or more scientific publications. The expected duration is 3-4 months, with a flexible starting date.
-
Openings
-
OR2137: Research Scientist - Optimization & Intelligent Robotics
-
EA2051: Research Scientist - Electric Systems Automation
-
Recent Publications
- "Safe multi-agent motion planning under uncertainty for drones using filtered reinforcement learning", IEEE Transactions on Robotics, DOI: 10.1109/TRO.2024.3387010, Vol. 40, pp. 2529-2542, July 2024. (TR2024-048)
- @article{Safaoui2024jul,
- author = {Safaoui, Sleiman and Vinod, Abraham P. and Chakrabarty, Ankush and Quirynen, Rien and Yoshikawa, Nobuyuki and Di Cairano, Stefano},
- title = {Safe multi-agent motion planning under uncertainty for drones using filtered reinforcement learning},
- journal = {IEEE Transactions on Robotics},
- year = 2024,
- volume = 40,
- pages = {2529--2542},
- month = jul,
- doi = {10.1109/TRO.2024.3387010},
- url = {https://www.merl.com/publications/TR2024-048}
- }
- "Physics-Informed Road Monitoring and Suspension Control using Crowdsourced Vehicle Data", European Control Conference (ECC), June 2024. (TR2024-084)
- @inproceedings{Wang2024jun4,
- author = {Wang, Yanbing and Berntorp, Karl and Menner, Marcel},
- title = {Physics-Informed Road Monitoring and Suspension Control using Crowdsourced Vehicle Data},
- booktitle = {European Control Conference (ECC)},
- year = 2024,
- month = jun,
- url = {https://www.merl.com/publications/TR2024-084}
- }
- "Data-driven monitoring with mobile sensors and charging stations using multi-arm bandits and coordinated motion planners", American Control Conference (ACC), June 2024. (TR2024-078)
- @inproceedings{Nayak2024jun,
- author = {Nayak, Siddharth and Greiff, Marcus and Raghunathan, Arvind and Di Cairano, Stefano and Vinod, Abraham P.},
- title = {Data-driven monitoring with mobile sensors and charging stations using multi-arm bandits and coordinated motion planners},
- booktitle = {American Control Conference (ACC)},
- year = 2024,
- month = jun,
- url = {https://www.merl.com/publications/TR2024-078}
- }
- "SuperLoRA: Parameter-Efficient Unified Adaptation for Large Vision Models", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2024. (TR2024-062)
- @inproceedings{Chen2024jun,
- author = {Chen, Xiangyu and Liu, Jing and Wang, Ye and Wang, Pu and Brand, Matthew and Wang, Guanghui and Koike-Akino, Toshiaki},
- title = {SuperLoRA: Parameter-Efficient Unified Adaptation for Large Vision Models},
- booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
- year = 2024,
- month = jun,
- url = {https://www.merl.com/publications/TR2024-062}
- }
- "Long-Tailed Anomaly Detection with Learnable Class Names", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2024. (TR2024-040)
- @inproceedings{Ho2024jun,
- author = {Ho, Chih-Hui and Peng, Kuan-Chuan and Vasconcelos, Nuno},
- title = {Long-Tailed Anomaly Detection with Learnable Class Names},
- booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
- year = 2024,
- month = jun,
- url = {https://www.merl.com/publications/TR2024-040}
- }
- "TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2024. (TR2024-059)
- @inproceedings{Ni2024jun,
- author = {Ni, Haomiao and Egger, Bernhard and Lohit, Suhas and Cherian, Anoop and Wang, Ye and Koike-Akino, Toshiaki and Huang, Sharon X. and Marks, Tim K.},
- title = {TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models},
- booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
- year = 2024,
- month = jun,
- url = {https://www.merl.com/publications/TR2024-059}
- }
- "SIRA: Scalable Inter-frame Relation and Association for Radar Perception", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2024. (TR2024-041)
- @inproceedings{Yataka2024jun,
- author = {Yataka, Ryoma and Wang, Pu and Boufounos, Petros T. and Takahashi, Ryuhei},
- title = {SIRA: Scalable Inter-frame Relation and Association for Radar Perception},
- booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
- year = 2024,
- month = jun,
- url = {https://www.merl.com/publications/TR2024-041}
- }
- "Adversarial Imitation Learning from Visual Observations using Latent Information", Transactions on Machine Learning Research (TMLR), June 2024. (TR2024-068)
- @article{Giammarino2024jun,
- author = {Giammarino, Vittorio and Queeney, James and Paschalidis, Ioannis Ch.},
- title = {Adversarial Imitation Learning from Visual Observations using Latent Information},
- journal = {Transactions on Machine Learning Research (TMLR)},
- year = 2024,
- month = jun,
- url = {https://www.merl.com/publications/TR2024-068}
- }
-
Software & Data Downloads
-
Long-Tailed Anomaly Detection (LTAD) Dataset -
DeepBornFNO -
ComplexVAD Dataset -
Pixel-Grounded Prototypical Part Networks -
Steered Diffusion -
BAyesian Network for adaptive SAmple Consensus -
Simple Multimodal Algorithmic Reasoning Task Dataset -
SOurce-free Cross-modal KnowledgE Transfer -
Audio-Visual-Language Embodied Navigation in 3D Environments -
Nonparametric Score Estimators -
3D MOrphable STyleGAN -
Instance Segmentation GAN -
Audio Visual Scene-Graph Segmentor -
Generalized One-class Discriminative Subspaces -
Hierarchical Musical Instrument Separation -
Generating Visual Dynamics from Sound and Context -
Adversarially-Contrastive Optimal Transport -
Online Feature Extractor Network -
MotionNet -
FoldingNet++ -
Quasi-Newton Trust Region Policy Optimization -
Landmarks’ Location, Uncertainty, and Visibility Likelihood -
Robust Iterative Data Estimation -
Gradient-based Nikaido-Isoda -
Circular Maze Environment -
Discriminative Subspace Pooling -
Kernel Correlation Network -
Fast Resampling on Point Clouds via Graphs -
FoldingNet -
Deep Category-Aware Semantic Edge Detection -
MERL Shopping Dataset -
Partial Group Convolutional Neural Networks
-