-
ST0096: Internship - Multimodal Tracking and Imaging
MERL is seeking a motivated intern to assist in developing hardware and algorithms for multimodal imaging applications. The project involves integration of radar, camera, and depth sensors in a variety of sensing scenarios. The ideal candidate should have experience with FMCW radar and/or depth sensing, and be fluent in Python and scripting methods. Familiarity with optical tracking of humans and experience with hardware prototyping are desired. Good knowledge of computational imaging and/or radar imaging methods is a plus.
Required Specific Experience
- Experience with Python and Python Deep Learning Frameworks.
- Experience with FMCW radar and/or Depth Sensors.
- Research Areas: Computer Vision, Machine Learning, Signal Processing, Computational Sensing
- Host: Petros Boufounos
-
CA0129: Internship - LLM-guided Active SLAM for Mobile Robots
MERL is seeking interns passionate about robotics to contribute to the development of an Active Simultaneous Localization and Mapping (Active SLAM) framework guided by Large Language Models (LLMs). The core objective is to achieve autonomous behavior for mobile robots. The methods will be implemented and evaluated in high-performance simulators and (time permitting) on actual robotic platforms, such as legged and wheeled robots. The expectation at the end of the internship is a publication at a top-tier robotics or computer vision conference and/or journal.
The internship has a flexible start date (Spring/Summer 2025), with a duration of 3-6 months depending on agreed scope and intermediate progress.
Required Specific Experience
- Current/Past Enrollment in a PhD Program in Computer Engineering, Computer Science, Electrical Engineering, Mechanical Engineering, or related field
- Experience employing and fine-tuning LLMs and/or Visual Language Models (VLMs) for high-level context-aware planning and navigation
- 2+ years of experience with 3D computer vision (e.g., point clouds, voxels, camera pose estimation) and mapping, filter-based methods (e.g., EKF), and at least some of: motion planning algorithms, factor graphs, control, and optimization
- Excellent programming skills in Python and/or C/C++, with prior knowledge of ROS2 and high-fidelity simulators such as Gazebo, Isaac Lab, and/or MuJoCo
Additional Desired Experience
- Prior experience with implementation and/or development of SLAM algorithms on robotic hardware, including acquisition, processing, and fusion of multimodal sensor data such as proprioceptive and exteroceptive sensors
- Research Areas: Artificial Intelligence, Computer Vision, Control, Machine Learning, Optimization, Robotics
- Host: Alexander Schperberg
-
OR0127: Internship - Deep Learning for Robotic Manipulation
MERL is looking for a highly motivated and qualified intern to work on deep learning methods for detection and pose estimation of objects using vision and tactile sensing in manufacturing and assembly environments. This role involves developing, fine-tuning, and deploying models on existing hardware. The methods will be applied to robotic manipulation, where accurate knowledge of the position and orientation of objects in the scene allows the robot to interact with them. The ideal candidate would be a Ph.D. student familiar with state-of-the-art methods for pose estimation and tracking of objects. The successful candidate will work closely with MERL researchers to develop and implement novel algorithms, conduct experiments, and publish research findings at a top-tier conference. The start date and expected duration of the internship are flexible. Interested candidates are encouraged to apply with their updated CV and a list of relevant publications.
Required Specific Experience
- Prior experience in Computer Vision and Robotic Manipulation.
- Experience with ROS and deep learning frameworks such as PyTorch is essential.
- Strong programming skills in Python.
- Experience with simulation tools, such as PyBullet, Isaac Lab, or MuJoCo.
- Research Areas: Computer Vision, Robotics, Artificial Intelligence
- Host: Siddarth Jain
-
CV0063: Internship - Visual Simultaneous Localization and Mapping
MERL is looking for a self-motivated graduate student to work on Visual Simultaneous Localization and Mapping (V-SLAM). Based on the candidate’s interests, the intern can work on a variety of topics such as (but not limited to): camera pose estimation, feature detection and matching, visual-LiDAR data fusion, pose-graph optimization, loop closure detection, and image-based camera relocalization. The ideal candidate would be a PhD student with a strong background in 3D computer vision and good programming skills in C/C++ and/or Python. The candidate must have published at least one paper in a top-tier computer vision, machine learning, or robotics venue, such as CVPR, ECCV, ICCV, NeurIPS, ICRA, or IROS. The intern will collaborate with MERL researchers to derive and implement new algorithms for V-SLAM, conduct experiments, and report findings. A submission to a top-tier conference is expected. The duration of the internship and start date are flexible.
Required Specific Experience
- Experience with 3D Computer Vision and Simultaneous Localization & Mapping.
- Research Areas: Computer Vision, Robotics, Control
- Host: Pedro Miraldo
-
CV0060: Internship - Video Anomaly Detection
MERL is looking for a self-motivated intern to work on the problem of video anomaly detection. The intern will help develop new ideas for improving the state of the art in detecting anomalous activity in videos. The ideal candidate would be a Ph.D. student with a strong background in machine learning and computer vision and some experience with video anomaly detection in particular. Proficiency in Python programming and PyTorch is necessary. The successful candidate is expected to have published at least one paper in a top-tier computer vision or machine learning venue, such as CVPR, ECCV, ICCV, WACV, ICML, ICLR, NeurIPS, or AAAI. The intern will collaborate with MERL researchers to develop and test algorithms and prepare manuscripts for scientific publications. The internship is for 3 months and the start date is flexible.
Required Specific Experience
- Graduate student in a Ph.D. program.
- Experience with PyTorch.
- Prior publication in a computer vision or machine learning conference/journal.
- Research Area: Computer Vision
- Host: Mike Jones
-
CV0101: Internship - Multimodal Algorithmic Reasoning
MERL is looking for a self-motivated intern to conduct research on problems at the intersection of multimodal large language models and neural algorithmic reasoning. An ideal intern would be a Ph.D. student with a strong background in machine learning and computer vision. The candidate must have prior experience with training multimodal LLMs for solving vision-and-language tasks. Experience participating in and winning mathematical Olympiads is desired. Publications in theoretical machine learning venues would be a strong plus. The intern is expected to collaborate with researchers in the computer vision team at MERL to develop algorithms and prepare manuscripts for scientific publications.
Required Specific Experience
- Experience with training large vision-and-language models
- Experience with solving mathematical reasoning problems
- Experience with programming in Python using PyTorch
- Enrolled in a PhD program
- Strong track record of publications in top-tier computer vision and machine learning venues (such as CVPR, NeurIPS, etc.).
- Research Areas: Artificial Intelligence, Computer Vision, Machine Learning
- Host: Anoop Cherian
-
CV0075: Internship - Multimodal Embodied AI
MERL is looking for a self-motivated intern to work on problems at the intersection of multimodal large language models and embodied AI in dynamic indoor environments. The ideal candidate would be a PhD student with a strong background in machine learning and computer vision, as demonstrated by top-tier publications. The candidate must have prior experience in designing synthetic scenes (e.g., 3D games) using popular graphics software, as well as experience with embodied AI, large language models, reinforcement learning, and simulators such as Habitat/SoundSpaces. Hands-on experience in using animated 3D human shape models (e.g., SMPL and variants) is desired. The intern is expected to collaborate with researchers in computer vision at MERL to develop algorithms and prepare manuscripts for scientific publications.
Required Specific Experience
- Experience in designing 3D interactive scenes
- Experience with vision-based embodied AI using simulators (implementation on real robotic hardware would be a plus).
- Experience training large language models on multimodal data
- Experience with training reinforcement learning algorithms
- Strong foundations in machine learning and programming
- Strong track record of publications in top-tier computer vision and machine learning venues (such as CVPR, NeurIPS, etc.).
- Research Areas: Artificial Intelligence, Computer Vision, Speech & Audio, Robotics, Machine Learning
- Host: Anoop Cherian