ST0238: Internship - Multi-Modal Sensing and Understanding
The Computational Sensing team at MERL is seeking a highly motivated intern to conduct fundamental research on multi-modal sensing and understanding: algorithms that can understand, explain, and act on multi-sensor data (e.g., RF, infrared, LiDAR, event camera). Ideal candidates will be comfortable bridging state-of-the-art perception (detection/segmentation/tracking) with higher-level semantic understanding and reasoning. Experience with text, visual, and multimodal reasoning is a plus. The intern will work closely with MERL researchers to develop novel algorithms, design experiments on MERL’s in-house testbeds, and prepare results for patents and publication. The internship is expected to last 3 months, with a flexible start date.
Required Specific Experience
- Expertise in physical sensing across RF (radar, UWB, Wi-Fi), infrared, LiDAR, and event-camera modalities, including experience with radar systems and concepts such as FMCW and MIMO configurations, Doppler signature interpretation, radar point-cloud and heatmap representations, and raw ADC waveforms;
- Solid understanding of state-of-the-art transformer-based (e.g., DETR) and diffusion-based (e.g., DiffusionDet) detection frameworks;
- Demonstrated work in text, visual, and multimodal semantic understanding and reasoning;
- Hands-on experience with open large-scale multi-sensor datasets (e.g., nuScenes, Waymo Open Dataset, Argoverse) and open radar datasets (e.g., MMVR, HIBER, RT-Pose, K-Radar);
- Proficiency in Python and deep learning frameworks (PyTorch/JAX), plus experience with GPU cluster job scheduling and scalable data pipelines;
- Proven publication record in top-tier venues such as CVPR, ICCV, ECCV, NeurIPS, ICLR, and ICML (or equivalent).
The pay range for this internship position will be $6,000 to $8,000 per month.
- Research Areas: Artificial Intelligence, Computational Sensing, Machine Learning, Signal Processing
- Host: Perry Wang