Artificial Intelligence
Making machines smarter for improved safety, efficiency and comfort.
Our AI research encompasses advances in computer vision, speech and audio processing, and data analytics. Key research themes include improved perception based on machine learning techniques, learning control policies through model-based reinforcement learning, and cognition and reasoning based on learned semantic representations. We apply our work to a broad range of automotive and robotics applications, as well as building and home systems.
Researchers
Jonathan Le Roux
Toshiaki Koike-Akino
Ye Wang
Gordon Wichern
Anoop Cherian
Tim K. Marks
Chiori Hori
Michael J. Jones
Kieran Parsons
Daniel N. Nikovski
Devesh K. Jha
François Germain
Suhas Lohit
Philip V. Orlik
Matthew Brand
Diego Romeres
Petros T. Boufounos
Pu (Perry) Wang
Moitreya Chatterjee
Siddarth Jain
Sameer Khurana
Hassan Mansour
Kuan-Chuan Peng
William S. Yerazunis
Mouhacine Benosman
Jing Liu
Radu Corcodel
Arvind Raghunathan
Hongbo Sun
Yebin Wang
Jianlin Guo
Chungwei Lin
Yanting Ma
Bingnan Wang
Ryo Aihara
Stefano Di Cairano
Pedro Miraldo
Saviz Mowlavi
James Queeney
Anthony Vetro
Jinyun Zhang
Jose Amaya
Karl Berntorp
Ankush Chakrabarty
Vedang M. Deshpande
Janek Ebbers
Dehong Liu
Alexander Schperberg
Wataru Tsujita
Abraham P. Vinod
Na Li
Awards
AWARD MERL team wins the Listener Acoustic Personalisation (LAP) 2024 Challenge
Date: August 29, 2024
Awarded to: Yoshiki Masuyama, Gordon Wichern, Francois G. Germain, Christopher Ick, and Jonathan Le Roux
MERL Contacts: François Germain; Jonathan Le Roux; Gordon Wichern
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief: MERL's Speech & Audio team ranked 1st out of 7 teams in Task 2 of the 1st SONICOM Listener Acoustic Personalisation (LAP) Challenge, which focused on "Spatial upsampling for obtaining a high-spatial-resolution HRTF from a very low number of directions". The team was led by Yoshiki Masuyama, and also included Gordon Wichern, Francois Germain, MERL intern Christopher Ick, and Jonathan Le Roux.
The LAP Challenge workshop and award ceremony were hosted by the 32nd European Signal Processing Conference (EUSIPCO 24) on August 29, 2024 in Lyon, France. Yoshiki Masuyama presented the team's method, "Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization", and received the award from Prof. Michele Geronazzo (University of Padova, IT, and Imperial College London, UK), Chair of the Challenge's Organizing Committee.
The LAP challenge aims to explore challenges in the field of personalized spatial audio, with the first edition focusing on the spatial upsampling and interpolation of head-related transfer functions (HRTFs). HRTFs with dense spatial grids are required for immersive audio experiences, but their recording is time-consuming. Although HRTF spatial upsampling has recently shown remarkable progress with approaches involving neural fields, HRTF estimation accuracy remains limited when upsampling from only a few measured directions, e.g., 3 or 5 measurements. The MERL team tackled this problem by proposing a retrieval-augmented neural field (RANF). RANF retrieves a subject whose HRTFs are close to those of the target subject at the measured directions from a library of subjects. The HRTF of the retrieved subject at the target direction is fed into the neural field in addition to the desired sound source direction. The team also developed a neural network architecture that can handle an arbitrary number of retrieved subjects, inspired by a multi-channel processing technique called transform-average-concatenate.
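To make the retrieval-augmented idea concrete, here is a minimal sketch in Python of the two components described above: retrieving a library subject from a few measured directions, and conditioning a neural field on the retrieved subject's HRTF. All names, shapes, and the mean-squared-error retrieval metric are illustrative assumptions rather than the team's actual implementation, and the transform-average-concatenate mechanism for handling multiple retrieved subjects is omitted.

import torch
import torch.nn as nn

def retrieve_nearest_subject(target_measured, library, measured_idx):
    """Return the library subject whose HRTFs best match the target's few
    measured directions (e.g., 3 or 5 measurements).
    target_measured: (num_measured, num_freq_bins) tensor; each library
    value: (num_all_dirs, num_freq_bins) tensor on a dense grid."""
    best_id, best_err = None, float("inf")
    for subject_id, hrtfs in library.items():
        # Compare only at the directions where the target was measured.
        err = torch.mean((hrtfs[measured_idx] - target_measured) ** 2).item()
        if err < best_err:
            best_id, best_err = subject_id, err
    return best_id

class RetrievalConditionedField(nn.Module):
    """MLP mapping a source direction, plus the retrieved subject's HRTF at
    that direction, to the target subject's HRTF (hypothetical architecture)."""
    def __init__(self, num_freq_bins=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + num_freq_bins, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_freq_bins),
        )

    def forward(self, direction_xyz, retrieved_hrtf):
        # Condition the field on the retrieved HRTF in addition to the
        # desired sound source direction, as described in the brief.
        return self.net(torch.cat([direction_xyz, retrieved_hrtf], dim=-1))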
AWARD Jonathan Le Roux elevated to IEEE Fellow
Date: January 1, 2024
Awarded to: Jonathan Le Roux
MERL Contact: Jonathan Le Roux
Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
Brief: MERL Distinguished Scientist and Speech & Audio Senior Team Leader Jonathan Le Roux has been elevated to IEEE Fellow, effective January 2024, "for contributions to multi-source speech and audio processing."
Mitsubishi Electric celebrated Dr. Le Roux's elevation and that of another researcher from the company, Dr. Shumpei Kameyama, with a worldwide news release on February 15.
Dr. Jonathan Le Roux has made fundamental contributions to the field of multi-speaker speech processing, especially to the areas of speech separation and multi-speaker end-to-end automatic speech recognition (ASR). His contributions constituted a major advance in realizing a practically usable solution to the cocktail party problem, enabling machines to replicate humans’ ability to concentrate on a specific sound source, such as a certain speaker within a complex acoustic scene—a long-standing challenge in the speech signal processing community. Additionally, he has made key contributions to the measures used for training and evaluating audio source separation methods, developing several new objective functions to improve the training of deep neural networks for speech enhancement, and analyzing the impact of metrics used to evaluate the signal reconstruction quality. Dr. Le Roux’s technical contributions have been crucial in promoting the widespread adoption of multi-speaker separation and end-to-end ASR technologies across various applications, including smart speakers, teleconferencing systems, hearables, and mobile devices.
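One concrete example of the kind of training objective referred to above is the scale-invariant signal-to-distortion ratio (SI-SDR), widely used for training and evaluating source separation and enhancement networks. The sketch below is the textbook formulation, offered for illustration rather than as the exact variant used in any particular MERL system.

import numpy as np

def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB between an estimated and a target signal."""
    # Project the estimate onto the target to find the optimal scaling,
    # making the measure invariant to the estimate's overall gain.
    alpha = np.dot(estimate, target) / (np.dot(target, target) + eps)
    scaled_target = alpha * target
    noise = estimate - scaled_target
    return 10 * np.log10(
        (np.dot(scaled_target, scaled_target) + eps) / (np.dot(noise, noise) + eps)
    )

# Example: a lightly corrupted copy of the target scores around 20 dB.
rng = np.random.default_rng(0)
target = rng.standard_normal(16000)
print(si_sdr(target + 0.1 * rng.standard_normal(16000), target))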
IEEE Fellow is the highest grade of membership of the IEEE. It honors members with an outstanding record of technical achievements, contributing importantly to the advancement or application of engineering, science and technology, and bringing significant value to society. Each year, following a rigorous evaluation procedure, the IEEE Fellow Committee recommends a select group of recipients for elevation to IEEE Fellow. Less than 0.1% of voting members are selected annually for this member grade elevation.
AWARD Honorable Mention Award at NeurIPS 23 Instruction Workshop
Date: December 15, 2023
Awarded to: Lingfeng Sun, Devesh K. Jha, Chiori Hori, Siddarth Jain, Radu Corcodel, Xinghao Zhu, Masayoshi Tomizuka and Diego Romeres
MERL Contacts: Radu Corcodel; Chiori Hori; Siddarth Jain; Devesh K. Jha; Diego Romeres
Research Areas: Artificial Intelligence, Machine Learning, Robotics
Brief: MERL researchers received an Honorable Mention Award at the Workshop on Instruction Tuning and Instruction Following at the NeurIPS 2023 conference in New Orleans. The workshop focused on instruction tuning and instruction following for large language models (LLMs). MERL researchers presented their work on interactive planning using LLMs for partially observable robotic tasks during the workshop's oral presentation session.
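As a rough illustration of the interactive-planning pattern mentioned in the brief, here is a hypothetical Python sketch in which an LLM proposes one action at a time and the accumulating observation history stands in for the unobserved parts of the state. The prompt format, the query_llm stub, and the toy environment are illustrative assumptions, not the method of the awarded paper.

def query_llm(prompt):
    """Stand-in for a call to a large language model; in practice this
    would be replaced by a real LLM API client."""
    return "inspect(object_1)"

class DummyEnv:
    """Toy environment that succeeds after one action, for illustration."""
    def step(self, action):
        return f"result of {action}", True

def interactive_plan(goal, env, max_steps=10):
    """Alternate between asking the LLM for an action and executing it,
    feeding observations back so the plan adapts under partial observability."""
    history = []
    for _ in range(max_steps):
        prompt = (
            f"Goal: {goal}\n"
            f"Observations and actions so far: {history}\n"
            "Propose the single next action."
        )
        action = query_llm(prompt)
        observation, done = env.step(action)
        history.append((action, observation))
        if done:
            break
    return history

print(interactive_plan("find the red block", DummyEnv()))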
See All Awards for Artificial Intelligence
News & Events
TALK [MERL Seminar Series 2024] Zhaojian Li presents talk titled A Multi-Arm Robotic System for Robotic Apple Harvesting
Date & Time: Wednesday, October 2, 2024; 1:00 PM
Speaker: Zhaojian Li, Michigan State University
MERL Host: Yebin Wang
Research Areas: Artificial Intelligence, Computer Vision, Control, Robotics
Abstract: Harvesting labor is the single largest cost in apple production in the U.S. The surging cost and growing shortage of labor have forced the apple industry to seek automated harvesting solutions. Despite considerable progress in recent years, existing robotic harvesting systems still fall short of performance expectations, lacking robustness and proving inefficient or overly complex for practical commercial deployment. In this talk, I will present the development and evaluation of a new dual-arm robotic apple harvesting system. This work is the result of an ongoing collaboration between Michigan State University and the U.S. Department of Agriculture.
-
TALK [MERL Seminar Series 2024] Tom Griffiths presents talk titled Tools from cognitive science to understand the behavior of large language models
Date & Time: Wednesday, September 18, 2024; 1:00 PM
Speaker: Tom Griffiths, Princeton University
MERL Host: Mouhacine Benosman
Research Areas: Artificial Intelligence, Data Analytics, Machine Learning, Human-Computer Interaction
Abstract: Large language models have been found to have surprising capabilities, even what have been called “sparks of artificial general intelligence.” However, understanding these models involves some significant challenges: their internal structure is extremely complicated, their training data is often opaque, and getting access to the underlying mechanisms is becoming increasingly difficult. As a consequence, researchers often have to resort to studying these systems based on their behavior. This situation is, of course, one that cognitive scientists are very familiar with: human brains are complicated systems trained on opaque data and typically difficult to study mechanistically. In this talk I will summarize some of the tools of cognitive science that are useful for understanding the behavior of large language models. Specifically, I will talk about how thinking about different levels of analysis (and Bayesian inference) can help us understand some behaviors that don’t seem particularly intelligent, how tasks like similarity judgment can be used to probe internal representations, how axiom violations can reveal interesting mechanisms, and how associations can reveal biases in systems that have been trained to be unbiased.
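To illustrate one of the tools mentioned in the abstract, the sketch below probes a model's representations by comparing embedding similarities against human similarity judgments. The embeddings and the "human" ratings here are fabricated purely for illustration; in practice the embeddings would come from the model under study and the ratings from a behavioral experiment.

import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Hypothetical embeddings for a few words (stand-ins for model hidden states).
rng = np.random.default_rng(0)
embeddings = {w: rng.standard_normal(64) for w in ["cat", "dog", "car", "truck"]}
pairs = [("cat", "dog"), ("car", "truck"), ("cat", "car"), ("dog", "truck")]

model_sims = [cosine(embeddings[a], embeddings[b]) for a, b in pairs]
human_sims = [0.9, 0.85, 0.2, 0.15]  # illustrative human similarity ratings

# Rank agreement between the model's similarity structure and human judgments.
rho, _ = spearmanr(model_sims, human_sims)
print(f"Spearman correlation: {rho:.2f}")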
See All News & Events for Artificial Intelligence
Research Highlights
- PS-NeuS: A Probability-guided Sampler for Neural Implicit Surface Rendering
- TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models
- Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-Aware Spatio-Temporal Sampling
- Steered Diffusion
- Robust Machine Learning
- mmWave Beam-SNR Fingerprinting (mmBSF)
- Video Anomaly Detection
- Biosignal Processing for Human-Machine Interaction
Internships
SA0041: Internship - Audio separation, generation, and analysis
We are seeking graduate students interested in helping advance the fields of generative audio, source separation, speech enhancement, spatial audio, and robust ASR in challenging multi-source and far-field scenarios. The interns will collaborate with MERL researchers to derive and implement new models and optimization methods, conduct experiments, and prepare results for publication. Internships regularly lead to one or more publications in top-tier venues, which can later become part of the intern's doctoral work. The ideal candidates are senior Ph.D. students with experience in some of the following: audio signal processing, microphone array processing, spatial audio reproduction, probabilistic modeling, deep generative modeling, and physics-informed machine learning techniques (e.g., neural fields, PINNs, sound field and reverberation modeling). Multiple positions are available with flexible start dates (not just Spring/Summer but throughout 2025) and duration (typically 3-6 months).
SA0040: Internship - Sound event and anomaly detection
We are seeking graduate students interested in helping advance the fields of sound event detection/localization, anomaly detection, and physics-informed deep learning for machine sounds. The interns will collaborate with MERL researchers to derive and implement novel algorithms, record data, conduct experiments, integrate audio signals with other sensors (electrical, vision, vibration, etc.), and prepare results for publication. Internships regularly lead to one or more publications in top-tier venues, which can later become part of the intern's doctoral work. The ideal candidates are senior Ph.D. students with experience in some of the following: audio signal processing, microphone array processing, physics-informed machine learning, outlier detection, and unsupervised learning. Multiple positions are available with flexible start dates (not just Spring/Summer but throughout 2025) and duration (typically 3-6 months).
SA0045: Internship - Universal Audio Compression and Generation
We are seeking graduate students interested in helping advance the fields of universal audio compression and generation. We aim to build a single generative model that can perform multiple audio generation tasks conditioned on multimodal context. The interns will collaborate with MERL researchers to derive and implement new models and optimization methods, conduct experiments, and prepare results for publication. Internships regularly lead to one or more publications in top-tier venues, which can later become part of the intern's doctoral work. The ideal candidates are Ph.D. students with experience in some of the following: deep generative modeling, large language models, and neural audio codecs. The internship typically lasts 3-6 months.
See All Internships for Artificial Intelligence
Recent Publications
- "Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection", European Conference on Computer Vision (ECCV), September 2024.BibTeX TR2024-130 PDF Video Presentation
- @inproceedings{Hegde2024sep,
- author = {{Hegde, Deepti and Lohit, Suhas and Peng, Kuan-Chuan and Jones, Michael J. and Patel, Vishal M.}},
- title = {Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection},
- booktitle = {European Conference on Computer Vision (ECCV)},
- year = 2024,
- month = sep,
- url = {https://www.merl.com/publications/TR2024-130}
- }
, - "A Probability-guided Sampler for Neural Implicit Surface Rendering", European Conference on Computer Vision (ECCV), September 2024.BibTeX TR2024-129 PDF
- @inproceedings{Pais2024sep,
- author = {Pais, Goncalo and Piedade, Valter and Chatterjee, Moitreya and Greiff, Marcus and Miraldo, Pedro}},
- title = {A Probability-guided Sampler for Neural Implicit Surface Rendering},
- booktitle = {European Conference on Computer Vision (ECCV)},
- year = 2024,
- month = sep,
- url = {https://www.merl.com/publications/TR2024-129}
- }
, - "TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement", International Workshop on Acoustic Signal Enhancement (IWAENC), September 2024.BibTeX TR2024-126 PDF Software
- @inproceedings{Saijo2024sep2,
- author = {Saijo, Kohei and Wichern, Gordon and Germain, François G and Pan, Zexu and Le Roux, Jonathan}},
- title = {TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement},
- booktitle = {International Workshop on Acoustic Signal Enhancement (IWAENC)},
- year = 2024,
- month = sep,
- url = {https://www.merl.com/publications/TR2024-126}
- }
, - "Few-shot Transparent Instance Segmentation for Bin Picking", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2024.BibTeX TR2024-127 PDF
- @inproceedings{Cherian2024sep,
- author = {Cherian, Anoop and Jain, Siddarth and Marks, Tim K.}},
- title = {Few-shot Transparent Instance Segmentation for Bin Picking},
- booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
- year = 2024,
- month = sep,
- url = {https://www.merl.com/publications/TR2024-127}
- }
, - "Disentangled Acoustic Fields For Multimodal Physical Scene Understanding", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2024.BibTeX TR2024-125 PDF
- @inproceedings{Yin2024sep,
- author = {Yin, Jie and Luo, Andrew and Du, Yilun and Cherian, Anoop and Marks, Tim K. and Le Roux, Jonathan and Gan, Chuang}},
- title = {Disentangled Acoustic Fields For Multimodal Physical Scene Understanding},
- booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
- year = 2024,
- month = sep,
- url = {https://www.merl.com/publications/TR2024-125}
- }
, - "Speech Dereverberation Constrained on Room Impulse Response Characteristics", Interspeech, DOI: 10.21437/Interspeech.2024-1173, September 2024, pp. 622-626.BibTeX TR2024-121 PDF
- @inproceedings{Bahrman2024sep,
- author = {Bahrman, Louis and Fontaine, Mathieu and Le Roux, Jonathan and Richard, Gaël}},
- title = {Speech Dereverberation Constrained on Room Impulse Response Characteristics},
- booktitle = {Interspeech},
- year = 2024,
- pages = {622--626},
- month = sep,
- doi = {10.21437/Interspeech.2024-1173},
- issn = {2958-1796},
- url = {https://www.merl.com/publications/TR2024-121}
- }
, - "Sound Event Bounding Boxes", Interspeech, DOI: 10.21437/Interspeech.2024-2075, September 2024, pp. 562-566.BibTeX TR2024-118 PDF Software
- @inproceedings{Ebbers2024sep,
- author = {Ebbers, Janek and Germain, François G and Wichern, Gordon and Le Roux, Jonathan}},
- title = {Sound Event Bounding Boxes},
- booktitle = {Interspeech},
- year = 2024,
- pages = {562--566},
- month = sep,
- doi = {10.21437/Interspeech.2024-2075},
- issn = {2958-1796},
- url = {https://www.merl.com/publications/TR2024-118}
- }
, - "ZeroST: Zero-Shot Speech Translation", Interspeech, DOI: 10.21437/Interspeech.2024-1088, September 2024, pp. 392-396.BibTeX TR2024-122 PDF
- @inproceedings{Khurana2024sep,
- author = {Khurana, Sameer and Hori, Chiori and Laurent, Antoine and Wichern, Gordon and Le Roux, Jonathan}},
- title = {ZeroST: Zero-Shot Speech Translation},
- booktitle = {Interspeech},
- year = 2024,
- pages = {392--396},
- month = sep,
- doi = {10.21437/Interspeech.2024-1088},
- issn = {2958-1796},
- url = {https://www.merl.com/publications/TR2024-122}
- }
Videos
Software & Data Downloads
- DeepBornFNO
- Transformer-based model with LOcal-modeling by COnvolution
- Sound Event Bounding Boxes
- Enhanced Reverberation as Supervision
- Gear Extensions of Neural Radiance Fields
- Long-Tailed Anomaly Detection (LTAD) Dataset
- neural-IIR-field
- Target-Speaker SEParation
- Pixel-Grounded Prototypical Part Networks
- Steered Diffusion
- Hyperbolic Audio Source Separation
- Simple Multimodal Algorithmic Reasoning Task Dataset
- Partial Group Convolutional Neural Networks
- SOurce-free Cross-modal KnowledgE Transfer
- Audio-Visual-Language Embodied Navigation in 3D Environments
- Nonparametric Score Estimators
- 3D MOrphable STyleGAN
- Instance Segmentation GAN
- Audio Visual Scene-Graph Segmentor
- Generalized One-class Discriminative Subspaces
- Goal directed RL with Safety Constraints
- Hierarchical Musical Instrument Separation
- Generating Visual Dynamics from Sound and Context
- Adversarially-Contrastive Optimal Transport
- Online Feature Extractor Network
- MotionNet
- FoldingNet++
- Quasi-Newton Trust Region Policy Optimization
- Landmarks’ Location, Uncertainty, and Visibility Likelihood
- Robust Iterative Data Estimation
- Gradient-based Nikaido-Isoda
- Discriminative Subspace Pooling