Artificial Intelligence

Making machines smarter for improved safety, efficiency and comfort.

Our AI research encompasses advances in computer vision, speech and audio processing, and data analytics. Key research themes include improved perception based on machine learning techniques, learning control policies through model-based reinforcement learning, and cognition and reasoning based on learned semantic representations. We apply our work to a broad range of automotive and robotics applications, as well as to building and home systems.

  • Researchers

  • Awards

    •  AWARD   MERL Ranked 1st Place in Cross-Subject Transfer Learning Task and 4th Place Overall at the NeurIPS 2021 BEETL Competition for EEG Transfer Learning.
      Date: November 11, 2021
      Awarded to: Niklas Smedemark-Margulies, Toshiaki Koike-Akino, Ye Wang, Deniz Erdogmus
      MERL Contacts: Toshiaki Koike-Akino; Ye Wang
      Research Areas: Artificial Intelligence, Signal Processing, Human-Computer Interaction
      Brief
      • The MERL Signal Processing group achieved first place in the cross-subject transfer learning task and fourth place overall in the NeurIPS 2021 BEETL AI Challenge for EEG Transfer Learning. The team included Niklas Smedemark-Margulies (intern from Northeastern University), Toshiaki Koike-Akino, Ye Wang, and Prof. Deniz Erdogmus (Northeastern University). The challenge addressed two types of transfer learning tasks for EEG biosignals: a homogeneous transfer learning task for cross-subject domain adaptation, and a heterogeneous transfer learning task for cross-data domain adaptation. Among the 110+ registered teams in this competition, MERL ranked 1st in the homogeneous transfer learning task, 7th in the heterogeneous transfer learning task, and 4th in combined overall score. For the homogeneous transfer learning task, MERL developed a new pre-shot learning framework based on feature disentanglement techniques that is robust to inter-subject variation, enabling calibration-free brain-computer interfaces (BCI). MERL was invited to present the pre-shot learning technique at the NeurIPS 2021 workshop.
    •  AWARD   Daniel Nikovski receives Outstanding Reviewer Award at NeurIPS'21
      Date: October 18, 2021
      Awarded to: Daniel Nikovski
      MERL Contact: Daniel N. Nikovski
      Research Areas: Artificial Intelligence, Machine Learning
      Brief
      • Daniel Nikovski, Group Manager of MERL's Data Analytics group, has received an Outstanding Reviewer Award from the 2021 conference on Neural Information Processing Systems (NeurIPS'21). NeurIPS is the world's premier conference on neural networks and related technologies.
    •  AWARD   Best Poster Award and Best Video Award at the International Society for Music Information Retrieval Conference (ISMIR) 2020
      Date: October 15, 2020
      Awarded to: Ethan Manilow, Gordon Wichern, Jonathan Le Roux
      MERL Contacts: Jonathan Le Roux; Gordon Wichern
      Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
      Brief
      • Former MERL intern Ethan Manilow and MERL researchers Gordon Wichern and Jonathan Le Roux won the Best Poster Award and the Best Video Award at the 2020 International Society for Music Information Retrieval Conference (ISMIR 2020) for the paper "Hierarchical Musical Source Separation". The conference was held October 11-14 in a virtual format. The Best Poster and Best Video Awards were decided by popular vote among the conference attendees.

        The paper proposes a new method for isolating individual sounds in an audio mixture that accounts for the hierarchical relationship between sound sources. Many sounds we are interested in analyzing are hierarchical in nature, e.g., during a music performance, a hi-hat note is one of many such hi-hat notes, which is one of several parts of a drumkit, itself one of many instruments in a band, which might be playing in a bar with other sounds occurring. Inspired by this, the paper re-frames the audio source separation problem as hierarchical, combining similar sounds together at certain levels while separating them at other levels, and shows on a musical instrument separation task that a hierarchical approach outperforms non-hierarchical models while also requiring less training data. The paper, poster, and video can be seen on the paper page on the ISMIR website.
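The hierarchical framing above can be illustrated with a toy sketch (this is not the paper's model; all "magnitudes" below are hypothetical integer values): under linear mixing, each coarser-level source is simply the sum of its children, so separation targets can be defined consistently at every level of the hierarchy.

```python
# Toy illustration of hierarchical source targets (hypothetical signals,
# not the paper's model): with linear mixing, a "parent" source at a
# coarser hierarchy level is the sum of its child sources.

# Hypothetical per-frame magnitudes for three leaf sources.
hi_hat = [2, 5, 1]
kick   = [7, 1, 3]
guitar = [4, 4, 6]

def mix(*sources):
    """Sum sources frame-wise (linear mixing assumption)."""
    return [sum(vals) for vals in zip(*sources)]

# Hierarchy: hi-hat is part of the drum kit, which is part of the band.
drum_kit = mix(hi_hat, kick)        # coarser-level target
band     = mix(drum_kit, guitar)    # full mixture

# A separator trained at the "drum kit" level should recover the same
# signal as summing its fine-grained children.
assert drum_kit == mix(hi_hat, kick)
print(band)  # → [13, 10, 10]
```

The consistency constraint (parent equals the sum of its children) is what lets a hierarchical model share information across levels instead of training independent separators for each granularity.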

    See All Awards for Artificial Intelligence
  • News & Events

    •  EVENT   Prof. Melanie Zeilinger of ETH to give keynote at MERL's Virtual Open House
      Date & Time: Thursday, December 9, 2021; 1:00pm - 5:30pm EST
      Speaker: Prof. Melanie Zeilinger, ETH
      Location: Virtual Event
      Research Areas: Applied Physics, Artificial Intelligence, Communications, Computational Sensing, Computer Vision, Control, Data Analytics, Dynamical Systems, Electric Systems, Electronic and Photonic Devices, Machine Learning, Multi-Physical Modeling, Optimization, Robotics, Signal Processing, Speech & Audio, Digital Video, Human-Computer Interaction, Information Security
      Brief
      • MERL is excited to announce the second keynote speaker for our Virtual Open House 2021:
        Prof. Melanie Zeilinger from ETH.

        Our virtual open house will take place on December 9, 2021, 1:00pm - 5:30pm (EST).

        Join us to learn more about who we are, what we do, and discuss our internship and employment opportunities. Prof. Zeilinger's talk is scheduled for 3:15pm - 3:45pm (EST).

        Registration: https://mailchi.mp/merl/merlvoh2021

        Keynote Title: Control Meets Learning - On Performance, Safety and User Interaction

        Abstract: With increasing sensing and communication capabilities, physical systems today are becoming one of the largest generators of data, making learning a central component of autonomous control systems. While this paradigm shift offers tremendous opportunities to address new levels of system complexity, variability and user interaction, it also raises fundamental questions of learning in a closed-loop dynamical control system. In this talk, I will present some of our recent results showing how even safety-critical systems can leverage the potential of data. I will first briefly present concepts for using learning for automatic controller design and for a new safety framework that can equip any learning-based controller with safety guarantees. The second part will then discuss how expert and user information can be utilized to optimize system performance, where I will particularly highlight an approach developed together with MERL for personalizing the motion planning in autonomous driving to the individual driving style of a passenger.
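One way to picture a safety framework that equips a learning-based controller with guarantees is a safety filter: the learned policy proposes an action, and a supervisory layer minimally modifies it so the next state stays in a safe set. The sketch below is purely illustrative, with hypothetical dynamics, bounds, and policy; it is not the speaker's framework.

```python
# Illustrative safety-filter sketch (hypothetical dynamics and bounds):
# a learned policy proposes an acceleration, and the filter projects it
# onto the set of accelerations that keep the velocity within bounds.

DT = 0.1       # discretization step (hypothetical)
V_MAX = 1.0    # safe set: |velocity| <= V_MAX (hypothetical)

def learned_policy(v):
    """Stand-in for a learned controller (hypothetical): always
    accelerates aggressively, regardless of the current state."""
    return 5.0

def safety_filter(v, a_proposed):
    """Return the safe acceleration closest to the proposed one,
    i.e., clip it so that |v + DT * a| <= V_MAX."""
    a_min = (-V_MAX - v) / DT
    a_max = (V_MAX - v) / DT
    return min(max(a_proposed, a_min), a_max)

v = 0.9                                    # already close to the bound
a = safety_filter(v, learned_policy(v))    # proposed 5.0 gets reduced
v_next = v + DT * a
print(a, v_next)  # filtered action keeps v_next within the safe set
```

The learned controller is treated as a black box; only the filter needs a (possibly conservative) model of the dynamics to certify safety.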
    •  EVENT   Prof. Ashok Veeraraghavan of Rice University to give keynote at MERL's Virtual Open House
      Date & Time: Thursday, December 9, 2021; 1:00pm - 5:30pm EST
      Speaker: Prof. Ashok Veeraraghavan, Rice University
      Location: Virtual Event
      Research Areas: Applied Physics, Artificial Intelligence, Communications, Computational Sensing, Computer Vision, Control, Data Analytics, Dynamical Systems, Electric Systems, Electronic and Photonic Devices, Machine Learning, Multi-Physical Modeling, Optimization, Robotics, Signal Processing, Speech & Audio, Digital Video, Human-Computer Interaction, Information Security
      Brief
      • MERL is excited to announce the first keynote speaker for our Virtual Open House 2021:
        Prof. Ashok Veeraraghavan from Rice University.

        Our virtual open house will take place on December 9, 2021, 1:00pm - 5:30pm (EST).

        Join us to learn more about who we are, what we do, and discuss our internship and employment opportunities. Prof. Veeraraghavan's talk is scheduled for 1:15pm - 1:45pm (EST).

        Registration: https://mailchi.mp/merl/merlvoh2021

        Keynote Title: Computational Imaging: Beyond the limits imposed by lenses.

        Abstract: The lens has long been a central element of cameras, since its early use in the mid-nineteenth century by Niépce, Talbot, and Daguerre. The role of the lens, from the Daguerreotype to modern digital cameras, is to refract light to achieve a one-to-one mapping between a point in the scene and a point on the sensor. This effect enables the sensor to compute a particular two-dimensional (2D) integral of the incident 4D light-field. We propose a radical departure from this practice and the many limitations it imposes. In this talk, we focus on two inter-related research projects that attempt to go beyond lens-based imaging.

        First, we discuss our lab’s recent efforts to build flat, extremely thin imaging devices by replacing the lens in a conventional camera with an amplitude mask and computational reconstruction algorithms. These lensless cameras, called FlatCams, can be less than a millimeter thick and enable applications where size, weight, thickness, or cost are the driving factors. Second, we discuss high-resolution, long-distance imaging using Fourier Ptychography, where the need for a large-aperture, aberration-corrected lens is replaced by a camera array and associated phase retrieval algorithms, resulting again in order-of-magnitude reductions in size, weight, and cost. Finally, I will spend a few minutes discussing how the holistic computational imaging approach can be used to create ultra-high-resolution wavefront sensors.
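The mask-based imaging idea can be sketched with a generic separable measurement model (an illustrative toy, not FlatCam's actual calibration or reconstruction pipeline): the sensor reading is Y = Phi_L X Phi_R^T for a scene X, and the scene is recovered by solving the vectorized linear system by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy separable lensless-camera model (illustrative only): the mask's
# effect factors into row and column operators Phi_L and Phi_R, so the
# sensor measures Y = Phi_L @ X @ Phi_R.T instead of an image of X.
n = 4   # scene is n x n
m = 6   # sensor is m x m (more measurements than unknowns)
Phi_L = rng.standard_normal((m, n))
Phi_R = rng.standard_normal((m, n))
X_true = rng.random((n, n))

Y = Phi_L @ X_true @ Phi_R.T

# Computational reconstruction: vec(Phi_L X Phi_R.T) = (Phi_R ⊗ Phi_L) vec(X)
# with column-major vectorization, so solve the Kronecker system by
# least squares (noiseless, so the recovery is essentially exact).
A = np.kron(Phi_R, Phi_L)
x_hat, *_ = np.linalg.lstsq(A, Y.flatten(order="F"), rcond=None)
X_hat = x_hat.reshape((n, n), order="F")

print(np.allclose(X_hat, X_true))
```

In practice the system is far larger, the measurements are noisy, and regularized solvers exploit the separable structure directly rather than forming the Kronecker product; this sketch only shows why a mask plus computation can replace the lens's one-to-one mapping.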

    See All News & Events for Artificial Intelligence
  • Research Highlights

  • Internships

    • CV1568: Uncertainty Estimation in 3D Face Landmark Tracking

      We are seeking a highly motivated intern to conduct original research extending MERL's work on uncertainty estimation in face landmark localization (the LUVLi model) to the domains of 3D faces and video sequences. The successful candidate will collaborate with MERL researchers to design and implement new models, conduct experiments, and prepare results for publication. The candidate should be a PhD student in computer vision and machine learning with a strong publication record. Experience in deep learning-based face landmark estimation, video tracking, and 3D face modeling is preferred. Strong programming skills, experience developing and implementing new models in deep learning platforms such as PyTorch, and broad knowledge of machine learning and deep learning methods are expected.

    • CA1728: Safe data-driven control of dynamical systems under uncertainty

      MERL is looking for a highly motivated individual to work on safe control of data-driven, uncertain dynamical systems. The research will develop novel optimization- and learning-based control algorithms to guarantee safety and performance in various industrial applications, including autonomous driving. The ideal candidate will have experience in one or more of the following topics: optimal control under uncertainty, (robust and stochastic) model predictive control, (convex and non-convex) optimization, and (reinforcement and statistical) learning. Ph.D. students in engineering or mathematics with a focus on control, optimization, and learning are encouraged to apply. A successful internship will result in the submission of relevant results to peer-reviewed conference proceedings and journals, and in the development of well-documented (Python/MATLAB) code for MERL. The expected duration of the internship is 3-6 months, with a start date in Summer 2022. The internship is preferably onsite at MERL, but may be performed remotely if the COVID pandemic makes it necessary.

    • CV1722: Multimodal Embodied AI

      MERL is looking for a self-motivated intern to work on problems at the intersection of video understanding, audio processing, and language models. The ideal candidate would be a senior PhD student with a strong background in machine learning and computer vision (as demonstrated via top-tier publications). The candidate must have prior experience in developing deep learning methods for audio-visual-language data. Expertise in popular embodied AI environments as well as a strong background in reinforcement learning will be beneficial. The intern is expected to collaborate with researchers in computer vision and speech teams at MERL to develop algorithms and prepare manuscripts for scientific publications. This internship requires work that can only be done at MERL.


    See All Internships for Artificial Intelligence
  • Recent Publications

    •  Wang, Z.-Q., Wichern, G., Le Roux, J., "On The Compensation Between Magnitude and Phase in Speech Separation", IEEE Signal Processing Letters, November 2021.
      BibTeX TR2021-137 PDF
      • @article{Wang2021nov2,
      • author = {Wang, Zhong-Qiu and Wichern, Gordon and Le Roux, Jonathan},
      • title = {On The Compensation Between Magnitude and Phase in Speech Separation},
      • journal = {IEEE Signal Processing Letters},
      • year = 2021,
      • month = nov,
      • url = {https://www.merl.com/publications/TR2021-137}
      • }
    •  Demir, A., Koike-Akino, T., Wang, Y., Erdogmus, D., Haruna, M., "EEG-GNN: Graph Neural Networks for Classification of Electroencephalogram (EEG) Signals", International IEEE EMBS Conference on Neural Engineering, October 2021.
      BibTeX TR2021-136 PDF Video Presentation
      • @inproceedings{Demir2021oct,
      • author = {Demir, Andac and Koike-Akino, Toshiaki and Wang, Ye and Erdogmus, Deniz and Haruna, Masaki},
      • title = {EEG-GNN: Graph Neural Networks for Classification of Electroencephalogram (EEG) Signals},
      • booktitle = {International IEEE EMBS Conference on Neural Engineering},
      • year = 2021,
      • month = oct,
      • url = {https://www.merl.com/publications/TR2021-136}
      • }
    •  Rakin, A.S., Wang, Y., Aeron, S., Koike-Akino, T., Moulin, P., Parsons, K., "Towards Universal Adversarial Examples and Defenses", IEEE Information Theory Workshop, DOI: 10.1109/ITW48936.2021.9611439, October 2021.
      BibTeX TR2021-125 PDF Video
      • @inproceedings{Rakin2021oct,
      • author = {Rakin, Adnan S and Wang, Ye and Aeron, Shuchin and Koike-Akino, Toshiaki and Moulin, Pierre and Parsons, Kieran},
      • title = {Towards Universal Adversarial Examples and Defenses},
      • booktitle = {IEEE Information Theory Workshop},
      • year = 2021,
      • month = oct,
      • publisher = {IEEE},
      • doi = {10.1109/ITW48936.2021.9611439},
      • isbn = {978-1-6654-0312-2},
      • url = {https://www.merl.com/publications/TR2021-125}
      • }
    •  Wang, Z.-Q., Wichern, G., Le Roux, J., "Convolutive Prediction for Reverberant Speech Separation", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), October 2021.
      BibTeX TR2021-127 PDF
      • @inproceedings{Wang2021oct4,
      • author = {Wang, Zhong-Qiu and Wichern, Gordon and Le Roux, Jonathan},
      • title = {Convolutive Prediction for Reverberant Speech Separation},
      • booktitle = {IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
      • year = 2021,
      • month = oct,
      • url = {https://www.merl.com/publications/TR2021-127}
      • }
    •  Wichern, G., Chakrabarty, A., Wang, Z.-Q., Le Roux, J., "Anomalous sound detection using attentive neural processes", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), October 2021.
      BibTeX TR2021-129 PDF
      • @inproceedings{Wichern2021oct,
      • author = {Wichern, Gordon and Chakrabarty, Ankush and Wang, Zhong-Qiu and Le Roux, Jonathan},
      • title = {Anomalous sound detection using attentive neural processes},
      • booktitle = {IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
      • year = 2021,
      • month = oct,
      • url = {https://www.merl.com/publications/TR2021-129}
      • }
    •  Chatterjee, M., Ahuja, N., Cherian, A., "A Hierarchical Variational Neural Uncertainty Model for Stochastic Video Prediction", IEEE International Conference on Computer Vision (ICCV), October 2021.
      BibTeX TR2021-096 PDF
      • @inproceedings{Chatterjee2021oct2,
      • author = {Chatterjee, Moitreya and Ahuja, Narendra and Cherian, Anoop},
      • title = {A Hierarchical Variational Neural Uncertainty Model for Stochastic Video Prediction},
      • booktitle = {IEEE International Conference on Computer Vision (ICCV)},
      • year = 2021,
      • month = oct,
      • url = {https://www.merl.com/publications/TR2021-096}
      • }
    •  Chatterjee, M., Le Roux, J., Ahuja, N., Cherian, A., "Visual Scene Graphs for Audio Source Separation", IEEE International Conference on Computer Vision (ICCV), October 2021.
      BibTeX TR2021-095 PDF
      • @inproceedings{Chatterjee2021oct,
      • author = {Chatterjee, Moitreya and Le Roux, Jonathan and Ahuja, Narendra and Cherian, Anoop},
      • title = {Visual Scene Graphs for Audio Source Separation},
      • booktitle = {IEEE International Conference on Computer Vision (ICCV)},
      • year = 2021,
      • month = oct,
      • url = {https://www.merl.com/publications/TR2021-095}
      • }
    •  Cherian, A., Pais, G., Jain, S., Marks, T.K., Sullivan, A., "InSeGAN: A Generative Approach to Segmenting Identical Instances in Depth Images", IEEE International Conference on Computer Vision (ICCV), October 2021.
      BibTeX TR2021-097 PDF
      • @inproceedings{Cherian2021oct,
      • author = {Cherian, Anoop and Pais, Goncalo and Jain, Siddarth and Marks, Tim K. and Sullivan, Alan},
      • title = {InSeGAN: A Generative Approach to Segmenting Identical Instances in Depth Images},
      • booktitle = {IEEE International Conference on Computer Vision (ICCV)},
      • year = 2021,
      • month = oct,
      • url = {https://www.merl.com/publications/TR2021-097}
      • }
    See All Publications for Artificial Intelligence
  • Videos

  • Software Downloads