News & Events

1,503 News items, Awards, Events and Talks related to MERL and its staff.



Learn about the MERL Seminar Series.



  •  EVENT    MERL Contributes to ICASSP 2024
    Date: Sunday, April 14, 2024 - Friday, April 19, 2024
    Location: Seoul, South Korea
    MERL Contacts: Petros T. Boufounos; François Germain; Chiori Hori; Sameer Khurana; Toshiaki Koike-Akino; Jonathan Le Roux; Hassan Mansour; Zexu Pan; Kieran Parsons; Joshua Rapp; Anthony Vetro; Pu (Perry) Wang; Gordon Wichern; Ryoma Yataka
    Research Areas: Artificial Intelligence, Computational Sensing, Machine Learning, Robotics, Signal Processing, Speech & Audio
    Brief
    • MERL has made numerous contributions to both the organization and technical program of ICASSP 2024, which is being held in Seoul, Korea from April 14-19, 2024.

      Sponsorship and Awards

      MERL is proud to be a Bronze Patron of the conference and will participate in the student job fair on Thursday, April 18. Please join this session to learn more about employment opportunities at MERL, including openings for research scientists, post-docs, and interns.

      MERL is pleased to be the sponsor of two IEEE Awards that will be presented at the conference. We congratulate Prof. Stéphane G. Mallat, the recipient of the 2024 IEEE Fourier Award for Signal Processing, and Prof. Keiichi Tokuda, the recipient of the 2024 IEEE James L. Flanagan Speech and Audio Processing Award.

      Jonathan Le Roux, MERL Speech and Audio Senior Team Leader, will also be recognized during the Awards Ceremony for his recent elevation to IEEE Fellow.

      Technical Program

      MERL will present 13 papers in the main conference on a wide range of topics including automated audio captioning, speech separation, audio generative models, speech and sound synthesis, spatial audio reproduction, multimodal indoor monitoring, radar imaging, depth estimation, physics-informed machine learning, and integrated sensing and communications (ISAC). Three workshop papers have also been accepted for presentation on audio-visual speaker diarization, music source separation, and music generative models.

      Perry Wang is the co-organizer of the Workshop on Signal Processing and Machine Learning Advances in Automotive Radars (SPLAR), held on Sunday, April 14. It features keynote talks from leaders in both academia and industry, peer-reviewed workshop papers, and lightning talks from ICASSP regular tracks on signal processing and machine learning for automotive radar and, more generally, radar perception.

      Gordon Wichern will present an invited keynote talk on analyzing and interpreting audio deep learning models at the Workshop on Explainable Machine Learning for Speech and Audio (XAI-SA), held on Monday, April 15. He will also appear in a panel discussion on interpretable audio AI at the workshop.

      Perry Wang also co-organizes a two-part special session on Next-Generation Wi-Fi Sensing (SS-L9 and SS-L13) which will be held on Thursday afternoon, April 18. The special session includes papers on PHY-layer oriented signal processing and data-driven deep learning advances, and supports upcoming 802.11bf WLAN Sensing Standardization activities.

      Petros Boufounos is participating as a mentor in ICASSP’s Micro-Mentoring Experience Program (MiME).

      About ICASSP

      ICASSP is the flagship conference of the IEEE Signal Processing Society, and the world's largest and most comprehensive technical conference focused on the research advances and latest technological development in signal and information processing. The event attracts more than 3000 participants.
  •  
  •  AWARD    Jonathan Le Roux elevated to IEEE Fellow
    Date: January 1, 2024
    Awarded to: Jonathan Le Roux
    MERL Contact: Jonathan Le Roux
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • MERL Distinguished Scientist and Speech & Audio Senior Team Leader Jonathan Le Roux has been elevated to IEEE Fellow, effective January 2024, "for contributions to multi-source speech and audio processing."

      Mitsubishi Electric celebrated Dr. Le Roux's elevation and that of another researcher from the company, Dr. Shumpei Kameyama, with a worldwide news release on February 15.

      Dr. Jonathan Le Roux has made fundamental contributions to the field of multi-speaker speech processing, especially to the areas of speech separation and multi-speaker end-to-end automatic speech recognition (ASR). His contributions constituted a major advance in realizing a practically usable solution to the cocktail party problem, enabling machines to replicate humans’ ability to concentrate on a specific sound source, such as a certain speaker within a complex acoustic scene—a long-standing challenge in the speech signal processing community. Additionally, he has made key contributions to the measures used for training and evaluating audio source separation methods, developing several new objective functions to improve the training of deep neural networks for speech enhancement, and analyzing the impact of metrics used to evaluate the signal reconstruction quality. Dr. Le Roux’s technical contributions have been crucial in promoting the widespread adoption of multi-speaker separation and end-to-end ASR technologies across various applications, including smart speakers, teleconferencing systems, hearables, and mobile devices.

      IEEE Fellow is the highest grade of membership of the IEEE. It honors members with an outstanding record of technical achievements, contributing importantly to the advancement or application of engineering, science and technology, and bringing significant value to society. Each year, following a rigorous evaluation procedure, the IEEE Fellow Committee recommends a select group of recipients for elevation to IEEE Fellow. Less than 0.1% of voting members are selected annually for this member grade elevation.
  •  
  •  TALK    [MERL Seminar Series 2024] Melanie Mitchell presents talk titled "The Debate Over 'Understanding' in AI's Large Language Models"
    Date & Time: Tuesday, February 13, 2024; 1:00 PM
    Speaker: Melanie Mitchell, Santa Fe Institute
    MERL Host: Suhas Lohit
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning, Human-Computer Interaction
    Abstract
    • I will survey a current, heated debate in the AI research community on whether large pre-trained language models can be said to "understand" language -- and the physical and social situations language encodes -- in any important sense. I will describe arguments that have been made for and against such understanding, and, more generally, will discuss what methods can be used to fairly evaluate understanding and intelligence in AI systems. I will conclude with key questions for the broader sciences of intelligence that have arisen in light of these discussions.
  •  
  •  TALK    [MERL Seminar Series 2024] Greta Tuckute presents talk titled Computational models of human auditory and language processing
    Date & Time: Wednesday, January 31, 2024; 12:00 PM
    Speaker: Greta Tuckute, MIT
    MERL Host: Sameer Khurana
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Abstract
    • Advances in machine learning have led to powerful models for audio and language, proficient in tasks like speech recognition and fluent language generation. Beyond their immense utility in engineering applications, these models offer valuable tools for cognitive science and neuroscience. In this talk, I will demonstrate how these artificial neural network models can be used to understand how the human brain processes language. The first part of the talk will cover how audio neural networks serve as computational accounts for brain activity in the auditory cortex. The second part will focus on the use of large language models, such as those in the GPT family, to non-invasively control brain activity in the human language system.
  •  
  •  AWARD    Honorable Mention Award at NeurIPS 23 Instruction Workshop
    Date: December 15, 2023
    Awarded to: Lingfeng Sun, Devesh K. Jha, Chiori Hori, Siddharth Jain, Radu Corcodel, Xinghao Zhu, Masayoshi Tomizuka and Diego Romeres
    MERL Contacts: Radu Corcodel; Chiori Hori; Siddarth Jain; Devesh K. Jha; Diego Romeres
    Research Areas: Artificial Intelligence, Machine Learning, Robotics
    Brief
    • MERL researchers received an Honorable Mention Award at the Workshop on Instruction Tuning and Instruction Following at the NeurIPS 2023 conference in New Orleans. The workshop focused on instruction tuning and instruction following for Large Language Models (LLMs). MERL researchers presented their work on interactive planning using LLMs for partially observable robotic tasks during the oral presentation session at the workshop.
  •  
  •  AWARD    MERL team wins the Audio-Visual Speech Enhancement (AVSE) 2023 Challenge
    Date: December 16, 2023
    Awarded to: Zexu Pan, Gordon Wichern, Yoshiki Masuyama, François Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux
    MERL Contacts: François Germain; Chiori Hori; Sameer Khurana; Jonathan Le Roux; Zexu Pan; Gordon Wichern
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • MERL's Speech & Audio team ranked 1st out of 12 teams in the 2nd COG-MHEAR Audio-Visual Speech Enhancement Challenge (AVSE). The team was led by Zexu Pan, and also included Gordon Wichern, Yoshiki Masuyama, François Germain, Sameer Khurana, Chiori Hori, and Jonathan Le Roux.

      The AVSE challenge aims to design better speech enhancement systems by harnessing the visual aspects of speech (such as lip movements and gestures) in a manner similar to the brain’s multi-modal integration strategies. MERL’s system was a scenario-aware audio-visual TF-GridNet that incorporates the face recording of a target speaker as a conditioning factor and also recognizes whether the predominant interference signal is speech or background noise. In addition to outperforming all competing systems in terms of objective metrics by a wide margin, in a listening test, MERL’s model achieved the best overall word intelligibility score of 84.54%, compared to 57.56% for the baseline and 80.41% for the next best team. The Fisher’s least significant difference (LSD) was 2.14%, indicating that our model offered statistically significant speech intelligibility improvements compared to all other systems.
  •  
  •  NEWS    Karl Berntorp joins the Editorial Board of IEEE Transactions on Control Systems Technology
    Date: December 7, 2023
    MERL Contact: Karl Berntorp
    Research Areas: Control, Dynamical Systems
    Brief
    • Karl Berntorp has joined the Editorial Board of the IEEE Transactions on Control Systems Technology (T-CST) as an Associate Editor. The IEEE T-CST publishes peer-reviewed papers on technological advances in the design, realization, and operation of control systems, and bridges the gap between the theory and practice of control engineering.
  •  
  •  TALK    [MERL Seminar Series 2023] Dr. Kristina Monakhova presents talk titled Robust and Physics-informed machine learning for low light imaging
    Date & Time: Tuesday, November 28, 2023; 12:00 PM
    Speaker: Kristina Monakhova, MIT and Cornell
    MERL Host: Joshua Rapp
    Research Areas: Computational Sensing, Computer Vision, Machine Learning, Signal Processing
    Abstract
    • Imaging in low light settings is extremely challenging due to low photon counts, both in photography and in microscopy. In photography, imaging under low light, high gain settings often results in highly structured, non-Gaussian sensor noise that’s hard to characterize or denoise. In this talk, we address this by developing a GAN-tuned physics-based noise model to more accurately represent camera noise at the lowest light, and highest gain settings. Using this noise model, we train a video denoiser using synthetic data and demonstrate photorealistic videography at starlight (submillilux levels of illumination) for the first time.

      For multiphoton microscopy, which is a form of scanning microscopy, there’s a trade-off between field of view, phototoxicity, acquisition time, and image quality, often resulting in noisy measurements. While deep learning-based methods have shown compelling denoising performance, can we trust these methods enough for critical scientific and medical applications? In the second part of this talk, I’ll introduce a learned, distribution-free uncertainty quantification technique that can both denoise and predict pixel-wise uncertainty to gauge how much we can trust our denoiser’s performance. Furthermore, we propose to leverage this learned, pixel-wise uncertainty to drive an adaptive acquisition technique that rescans only the most uncertain regions of a sample. With our sample and algorithm-informed adaptive acquisition, we demonstrate a 120X improvement in total scanning time and total light dose for multiphoton microscopy, while successfully recovering fine structures within the sample.
  •  
  •  NEWS    Ankush Chakrabarty served as Co-Chair of ACM BALANCES 2023
    Date: November 14, 2023
    Where: Istanbul, Turkey
    MERL Contact: Ankush Chakrabarty
    Research Areas: Control, Data Analytics, Machine Learning, Multi-Physical Modeling, Optimization
    Brief
    • Ankush Chakrabarty, Principal Research Scientist in the Multiphysical Systems team at MERL, served as Co-Chair at the 3rd ACM International Workshop on Big Data and Machine Learning for Smart Buildings and Cities (BALANCES'23). The workshop places a spotlight on two IEA EBC Annexes: Annex 81 - Data-Driven Smart Buildings and Annex 82 - Energy Flexible Buildings Towards Resilient Low Carbon Energy Systems.
  •  
  •  NEWS    MERL Researchers give a Tutorial Talk on Quantum Machine Learning for Sensing and Communications at IEEE VCC
    Date: November 28, 2023 - November 30, 2023
    Where: Virtual
    MERL Contacts: Toshiaki Koike-Akino; Pu (Perry) Wang
    Research Areas: Artificial Intelligence, Communications, Computational Sensing, Machine Learning, Signal Processing
    Brief
    • On November 28, 2023, MERL researchers Toshiaki Koike-Akino and Pu (Perry) Wang will give a 3-hour tutorial presentation at the first IEEE Virtual Conference on Communications (VCC). The talk, titled "Post-Deep Learning Era: Emerging Quantum Machine Learning for Sensing and Communications," addresses recent trends, challenges, and advances in sensing and communications. P. Wang presents use cases, industry trends, signal processing, and deep learning for Wi-Fi integrated sensing and communications (ISAC), while T. Koike-Akino discusses the future of deep learning, giving a comprehensive overview of artificial intelligence (AI) technologies, natural computing, emerging quantum AI, and their diverse applications. The tutorial is conducted virtually.

      IEEE VCC is a new fully virtual conference launched by the IEEE Communications Society, gathering researchers from academia and industry who are unable to travel but wish to present their recent scientific results and engage in interactive discussions with fellow researchers working in their fields. It is designed to address potential hardships such as pandemic restrictions, visa issues, travel problems, or financial difficulties.
  •  
  •  TALK    [MERL Seminar Series 2023] Gioele Zardini presents talk titled Co-Design of Complex Systems: From Autonomy to Future Mobility
    Date & Time: Tuesday, November 21, 2023; 11:00 AM
    Speaker: Gioele Zardini, ETH Zürich and MIT
    MERL Host: Karl Berntorp
    Research Areas: Control, Dynamical Systems
    Abstract
    • When designing complex systems, we need to consider multiple trade-offs at various abstraction levels and scales, and choices of single components need to be studied jointly. For instance, the design of future mobility solutions (e.g., autonomous vehicles, micromobility) and the design of the mobility systems they enable are closely coupled. Indeed, knowledge about the intended service of novel mobility solutions would impact their design and deployment process, whilst insights about their technological development could significantly affect transportation management policies. Optimally co-designing sociotechnical systems is a complex task for at least two reasons. On one hand, the co-design of interconnected systems (e.g., large networks of cyber-physical systems) involves the simultaneous choice of components arising from heterogeneous natures (e.g., hardware vs. software parts) and fields, while satisfying systemic constraints and accounting for multiple objectives. On the other hand, components are connected via collaborative and conflicting interactions between different stakeholders (e.g., within an intermodal mobility system). In this talk, I will present a framework to co-design complex systems, leveraging a monotone theory of co-design and tools from game theory. The framework will be instantiated in the task of designing future mobility systems, all the way from the policies that a city can design, to the autonomy of vehicles part of an autonomous mobility-on-demand service. Through various case studies, I will show how the proposed approaches allow one to efficiently answer heterogeneous questions, unifying different modeling techniques and promoting interdisciplinarity, modularity, and compositionality. I will then discuss open challenges for compositional systems design optimization, and present my agenda to tackle them.
  •  
  •  NEWS    Anoop Cherian gives a podcast interview with AI Business
    Date: September 26, 2023
    Where: Virtual
    MERL Contact: Anoop Cherian
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
    Brief
    • Anoop Cherian, a Senior Principal Research Scientist in the Computer Vision team at MERL, gave a podcast interview with award-winning journalist Deborah Yao. Deborah is the editor of AI Business -- a leading content platform for artificial intelligence and its applications in the real world, delivering its readers up-to-the-minute insights into how AI technologies are currently affecting the global economy and society. The podcast was based on recent research that Anoop and his MERL colleagues conducted with collaborators at MIT; this research attempts to objectively answer the pertinent question: are current deep neural networks smarter than second graders? The podcast discusses shortcomings in recent artificial general intelligence systems with regard to their capabilities for knowledge abstraction, learning, and generalization, which are brought out by this research.
  •  
  •  TALK    [MERL Seminar Series 2023] Prof. Flavio Calmon presents talk titled Multiplicity in Machine Learning
    Date & Time: Tuesday, November 7, 2023; 12:00 PM
    Speaker: Flavio Calmon, Harvard University
    MERL Host: Ye Wang
    Research Areas: Artificial Intelligence, Machine Learning
    Abstract
    • This talk reviews the concept of predictive multiplicity in machine learning. Predictive multiplicity arises when different classifiers achieve similar average performance for a specific learning task yet produce conflicting predictions for individual samples. We discuss a metric called “Rashomon Capacity” for quantifying predictive multiplicity in multi-class classification. We also present recent findings on the multiplicity cost of differentially private training methods and group fairness interventions in machine learning.

      This talk is based on work published at ICML'20, NeurIPS'22, ACM FAccT'23, and NeurIPS'23.
  •  
  •  NEWS    Invited talk given by Diego Romeres at Bentley University
    Date: November 1, 2023
    MERL Contact: Diego Romeres
    Research Areas: Artificial Intelligence, Machine Learning, Robotics
    Brief
    • Principal Research Scientist and Team Leader Diego Romeres gave an invited talk titled 'Applications of Machine Learning to Robotics' in the Machine Learning graduate course at Bentley University. The presentation focused mainly on Reinforcement Learning research applied to robotics. The audience consisted mostly of Master’s in Business Analytics (MSBA) students and students in the MBA with Business Analytics Concentration program.
  •  
  •  TALK    [MERL Seminar Series 2023] Dr. Tanmay Gupta presents talk titled Visual Programming - A compositional approach to building General Purpose Vision Systems
    Date & Time: Tuesday, October 31, 2023; 2:00 PM
    Speaker: Tanmay Gupta, Allen Institute for Artificial Intelligence
    MERL Host: Moitreya Chatterjee
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
    Abstract
    • Building General Purpose Vision Systems (GPVs) that can perform a huge variety of tasks has been a long-standing goal for the computer vision community. However, end-to-end training of these systems to handle different modalities and tasks has proven to be extremely challenging. In this talk, I will describe a lucrative neuro-symbolic alternative to the common end-to-end learning paradigm called Visual Programming. Visual Programming is a general framework that leverages the code-generation abilities of LLMs, existing neural models, and non-differentiable programs to enable powerful applications. Some of these applications continue to remain elusive for the current generation of end-to-end trained GPVs.
  •  
  •  EVENT    Prof. Yuejie Chi of Carnegie Mellon University to Deliver Keynote at MERL's Virtual Open House
    Date & Time: Wednesday, November 15, 2023; 3:00-3:40pm (EST)
    Location: Virtual Event
    Speaker: Prof. Yuejie Chi, Carnegie Mellon University
    MERL Contact: Bingnan Wang
    Brief
    • MERL is excited to announce the featured keynote speaker for our Virtual Open House 2023: Prof. Yuejie Chi from Carnegie Mellon University.

      Our virtual open house this year will take place on November 15, 2023, 1:00pm - 5:30pm (EST). Prof. Chi’s talk is scheduled for 3:00-3:40pm (EST). For details and agenda of the event, please visit: https://merl.com/events/voh23

      Join us to learn more about who we are, what we do, and discuss our internship, post-doc, and full-time employment opportunities. To register, go to: https://mailchi.mp/merl/voh23


      Title: Sample Complexity of Q-learning: from Single-agent to Federated Learning

      Abstract: Q-learning, which seeks to learn the optimal Q-function of a Markov decision process (MDP) in a model-free fashion, lies at the heart of reinforcement learning practices. However, theoretical understandings on its non-asymptotic sample complexity remain unsatisfactory, despite significant recent efforts. In this talk, we first show a tight sample complexity bound of Q-learning in the single-agent setting, together with a matching lower bound to establish its minimax sub-optimality. We then show how federated versions of Q-learning allow collaborative learning using data collected by multiple agents without central sharing, where an importance averaging scheme is introduced to unveil the blessing of heterogeneity.
  •  
  •  NEWS    MERL co-organizes the 2023 Sound Demixing (SDX2023) Challenge and Workshop
    Date: January 23, 2023 - November 4, 2023
    Where: International Society for Music Information Retrieval Conference (ISMIR)
    MERL Contacts: Jonathan Le Roux; Gordon Wichern
    Research Areas: Artificial Intelligence, Machine Learning, Speech & Audio
    Brief
    • MERL Speech & Audio team members Gordon Wichern and Jonathan Le Roux co-organized the 2023 Sound Demixing Challenge along with researchers from Sony, Moises AI, Audioshake, and Meta.

      The SDX2023 Challenge was hosted on the AIcrowd platform and had a prize pool of $42,000 distributed to the winning teams across two tracks: Music Demixing and Cinematic Sound Demixing. A unique aspect of this challenge was the ability to test the audio source separation models developed by challenge participants on non-public songs from Sony Music Entertainment Japan for the music demixing track, and movie soundtracks from Sony Pictures for the cinematic sound demixing track. The challenge ran from January 23rd to May 1st, 2023, and had 884 participants distributed across 68 teams submitting 2828 source separation models. The winners will be announced at the SDX2023 Workshop, which will take place as a satellite event at the International Society for Music Information Retrieval Conference (ISMIR) in Milan, Italy on November 4, 2023.

      MERL’s contribution to SDX2023 focused mainly on the cinematic demixing track. In addition to sponsoring the prizes awarded to the winning teams for that track, the baseline system and initial training data were MERL’s Cocktail Fork separation model and Divide and Remaster dataset, respectively. MERL researchers also contributed to a Town Hall kicking off the challenge, co-authored a scientific paper describing the challenge outcomes, and co-organized the SDX2023 Workshop.
  •  
  •  EVENT    MERL's Virtual Open House 2023
    Date & Time: Wednesday, November 15, 2023; 1:00 - 5:30 PM EST
    Location: Virtual Event
    MERL Contact: Bingnan Wang
    Brief
    • Join us for MERL's Virtual Open House (VOH) 2023 on November 15th. Live sessions will be held from 1:00-5:30pm EST, including an overview of recent activities by our research groups, a featured guest speaker and live interaction with our research staff through the Gather platform. Registered attendees will be able to browse our virtual booths at their convenience and connect with our research staff to learn about engagement opportunities, including internship/post-doc openings as well as visiting faculty positions.

      For agenda and details of the event: https://www.merl.com/events/voh23

      To register for the VOH, go to:
      https://mailchi.mp/merl/voh23
  •  
  •  TALK    [MERL Seminar Series 2023] Prof. Shaoshuai Mou presents talk titled Inverse Optimal Control for Autonomous Systems
    Date & Time: Tuesday, October 10, 2023; 1:00 PM
    Speaker: Shaoshuai Mou, Purdue University
    MERL Host: Yebin Wang
    Research Areas: Control, Dynamical Systems, Robotics
    Abstract
    • Inverse Optimal Control (IOC) aims to achieve an objective function corresponding to a certain task from an expert robot driven by optimal control, which has become a powerful tool in many applications in robotics. We will present our recent solutions to IOC based on incomplete observations of systems' trajectories, which enable an autonomous system to “sense-and-adapt”, i.e., to incrementally improve the learning of objective functions as new data arrives. This also leads to a distributed algorithm to solve IOC in multi-agent systems, in which each agent can only access part of the overall trajectory of an optimal control system and cannot solve IOC by itself. This is perhaps the first distributed method for IOC. Applications of IOC to human prediction will also be given.
  •  
  •  NEWS    MERL researchers presenting four papers and organizing the VLAR-SMART-101 Workshop at ICCV 2023
    Date: October 2, 2023 - October 6, 2023
    Where: Paris, France
    MERL Contacts: Moitreya Chatterjee; Anoop Cherian; Michael J. Jones; Toshiaki Koike-Akino; Suhas Lohit; Tim K. Marks; Pedro Miraldo; Kuan-Chuan Peng; Ye Wang
    Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
    Brief
    • MERL researchers are presenting 4 papers and organizing the VLAR-SMART-101 workshop at the ICCV 2023 conference, which will be held in Paris, France October 2-6. ICCV is one of the most prestigious and competitive international conferences in computer vision. Details are provided below.

      1. Conference paper: “Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis,” by Nithin Gopalakrishnan Nair, Anoop Cherian, Suhas Lohit, Ye Wang, Toshiaki Koike-Akino, Vishal Patel, and Tim K. Marks

      Conditional generative models typically demand large annotated training sets to achieve high-quality synthesis. As a result, there has been significant interest in plug-and-play generation, i.e., using a pre-defined model to guide the generative process. In this paper, we introduce Steered Diffusion, a generalized framework for fine-grained photorealistic zero-shot conditional image generation using a diffusion model trained for unconditional generation. The key idea is to steer the image generation of the diffusion model during inference via designing a loss using a pre-trained inverse model that characterizes the conditional task. Our model shows clear qualitative and quantitative improvements over state-of-the-art diffusion-based plug-and-play models, while adding negligible computational cost.

      2. Conference paper: "BANSAC: A dynamic BAyesian Network for adaptive SAmple Consensus," by Valter Piedade and Pedro Miraldo

      We derive a dynamic Bayesian network that updates individual data points' inlier scores while iterating RANSAC. At each iteration, we apply weighted sampling using the updated scores. Our method works with or without prior data point scorings. In addition, we use the updated inlier/outlier scoring for deriving a new stopping criterion for the RANSAC loop. Our method outperforms the baselines in accuracy while needing less computational time.

      3. Conference paper: "Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes," by Fabien Delattre, David Dirnfeld, Phat Nguyen, Stephen Scarano, Michael J. Jones, Pedro Miraldo, and Erik Learned-Miller

      We present a novel approach to estimating camera rotation in crowded, real-world scenes captured using a handheld monocular video camera. Our method uses a novel generalization of the Hough transform on SO(3) to efficiently find the camera rotation most compatible with the optical flow. Because the setting is not addressed well by other data sets, we provide a new dataset and benchmark, with high-accuracy and rigorously annotated ground truth on 17 video sequences. Our method is more accurate by almost 40 percent than the next best method.

      4. Workshop paper: "Tensor Factorization for Leveraging Cross-Modal Knowledge in Data-Constrained Infrared Object Detection," by Manish Sharma*, Moitreya Chatterjee*, Kuan-Chuan Peng, Suhas Lohit, and Michael Jones

      While state-of-the-art object detection methods for RGB images have reached some level of maturity, the same is not true for Infrared (IR) images. The primary bottleneck towards bridging this gap is the lack of sufficient labeled training data for IR images. Towards addressing this issue, we present TensorFact, a novel tensor decomposition method which splits the convolution kernels of a CNN into low-rank factor matrices with fewer parameters. This compressed network is first pre-trained on RGB images and then augmented with only a few parameters. This augmented network is then trained on IR images, while freezing the weights trained on RGB. This prevents it from over-fitting, allowing it to generalize better. Experiments show that our method outperforms the state of the art.

      5. “Vision-and-Language Algorithmic Reasoning (VLAR) Workshop and SMART-101 Challenge” by Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Tim K. Marks, Ram Ramrakhya, Honglu Zhou, Kevin A. Smith, Joanna Matthiesen, and Joshua B. Tenenbaum

      MERL researchers, along with researchers from MIT, Georgia Tech, Math Kangaroo USA, and Rutgers University, are jointly organizing a workshop on vision-and-language algorithmic reasoning at ICCV 2023 and conducting a challenge based on the SMART-101 puzzles described in the paper "Are Deep Neural Networks SMARTer than Second Graders?" A focus of this workshop is to bring together outstanding faculty/researchers working at the intersections of vision, language, and cognition to provide their opinions on the recent breakthroughs in large language models and artificial general intelligence, as well as showcase their cutting-edge research that could inspire the audience to search for the missing pieces in our quest towards solving the puzzle of artificial intelligence.

      Workshop link: https://wvlar.github.io/iccv23/
  •  TALK    [MERL Seminar Series 2023] Prof. Komei Sugiura presents talk titled The Confluence of Vision, Language, and Robotics
    Date & Time: Thursday, September 28, 2023; 12:00 PM
    Speaker: Komei Sugiura, Keio University
    MERL Host: Chiori Hori
    Research Areas: Artificial Intelligence, Machine Learning, Robotics, Speech & Audio
    Abstract
    • Recent advances in multimodal models that fuse vision and language are revolutionizing robotics. In this lecture, I will begin by introducing recent multimodal foundational models and their applications in robotics. The second topic of this talk will address our recent work on multimodal language processing in robotics. The shortage of home care workers has become a pressing societal issue, and the use of domestic service robots (DSRs) to assist individuals with disabilities is seen as a possible solution. I will present our work on DSRs that are capable of open-vocabulary mobile manipulation, referring expression comprehension and segmentation models for everyday objects, and future captioning methods for cooking videos and DSRs.
  •  TALK    [MERL Seminar Series 2023] Prof. Zac Manchester presents talk titled Composable Optimization for Robotic Simulation, Planning, and Control
    Date & Time: Wednesday, September 27, 2023; 1:00 PM
    Speaker: Zac Manchester, Carnegie Mellon University
    MERL Host: Devesh K. Jha
    Research Areas: Optimization, Robotics
    Abstract
    • Contact interactions are pervasive in key real-world robotic tasks like manipulation and walking. However, the non-smooth dynamics associated with impacts and friction remain challenging to model, and motion planning and control algorithms that can fluently and efficiently reason about contact remain elusive. In this talk, I will share recent work from my research group that takes an “optimization-first” approach to these challenges: collision detection, physics, motion planning, and control are all posed as constrained optimization problems. We then build a set of algorithmic and numerical tools that allow us to flexibly compose these optimization sub-problems to solve complex robotics problems involving discontinuous, unplanned, and uncertain contact mechanics.
  •  TALK    [MERL Seminar Series 2023] Prof. Faruque Hasan presents talk titled A Process Systems Engineering Perspective on Carbon Capture: Key Challenges and Opportunities
    Date & Time: Tuesday, September 19, 2023; 1:00 PM
    Speaker: Faruque Hasan, Texas A&M University
    MERL Host: Scott A. Bortoff
    Research Areas: Applied Physics, Machine Learning, Multi-Physical Modeling, Optimization
    Abstract
    • Carbon capture, utilization, and storage (CCUS) is a promising pathway to decarbonize fossil-based power and industrial sectors and is a bridging technology for a sustainable transition to a net-zero emission energy future. This talk aims to provide an overview of design and optimization of CCUS systems. I will also attempt to give a brief perspective on emerging interests in process systems engineering research (e.g., systems integration, multiscale modeling, strategic planning, and optimization under uncertainty). The purpose is not to cover all aspects of PSE research for CCUS but rather to foster discussion by presenting some plausible future directions and ideas.
  •  AWARD    Best paper award at PHMAP 2023
    Date: September 14, 2023
    Awarded to: Dehong Liu, Anantaram Varatharajan, and Abraham Goldsmith
    MERL Contacts: Abraham Goldsmith; Dehong Liu
    Research Areas: Electric Systems, Signal Processing
    Brief
    • MERL researchers Dehong Liu, Anantaram Varatharajan, and Abraham Goldsmith were awarded one of three best paper awards at the Asia Pacific Conference of the Prognostics and Health Management Society 2023 (PHMAP23), held in Tokyo from September 11 to 14, 2023, for their co-authored paper titled 'Extracting Broken-Rotor-Bar Fault Signature of Varying-Speed Induction Motors.'

      PHMAP is a biennial international conference specializing in prognostics and health management. PHMAP23 attracted more than 300 attendees from around the world and published more than 160 regular papers from academia and industry, spanning fields such as aerospace, production, civil engineering, and electronics.
  •  AWARD    Joint University of Padua-MERL team wins Challenge 'AI Olympics With RealAIGym'
    Date: August 25, 2023
    Awarded to: Alberto Dalla Libera, Niccolò Turcato, Giulio Giacomuzzo, Ruggero Carli, Diego Romeres
    MERL Contact: Diego Romeres
    Research Areas: Artificial Intelligence, Machine Learning, Robotics
    Brief
    • A joint team consisting of members of the University of Padua and MERL ranked 1st in the IJCAI 2023 Challenge "AI Olympics With RealAIGym: Is AI Ready for Athletic Intelligence in the Real World?". The team consisted of MERL researcher Diego Romeres and a group from the University of Padua (UniPD): Alberto Dalla Libera, Ph.D.; Ph.D. candidates Niccolò Turcato and Giulio Giacomuzzo; and Prof. Ruggero Carli.

      The International Joint Conference on Artificial Intelligence (IJCAI) is a premier gathering for AI researchers and organizes several competitions. This year, the competition CC7 "AI Olympics With RealAIGym: Is AI Ready for Athletic Intelligence in the Real World?" consisted of two stages: simulation and real-robot experiments on two under-actuated robotic systems. The two robotic systems were treated as separate tracks, and one final winner was selected for each track based on specific performance criteria in the control tasks.

      The UniPD-MERL team competed and won in both tracks. The team's system made strong use of a model-based reinforcement learning algorithm called MC-PILCO, which we recently published in the journal IEEE Transactions on Robotics.
  •