TR2025-169

Sim-to-Real Contact-Rich Pivoting via Optimization-Guided RL with Vision and Touch

- Shirai, Y., Ota, K., Jha, D.K., Romeres, D., "Sim-to-Real Contact-Rich Pivoting via Optimization-Guided RL with Vision and Touch", Embodied World Models for Decision Making, NeurIPS Workshop, December 2025.
  @inproceedings{Shirai2025dec,
    author = {Shirai, Yuki and Ota, Kei and Jha, Devesh K. and Romeres, Diego},
    title = {{Sim-to-Real Contact-Rich Pivoting via Optimization-Guided RL with Vision and Touch}},
    booktitle = {NeurIPS 2025 Workshop on Embodied World Models for Decision Making},
    year = 2025,
    month = dec,
    url = {https://www.merl.com/publications/TR2025-169}
  }
MERL Contact:
- Diego Romeres
Research Area:

Robotics

Abstract:

Non-prehensile manipulation is challenging due to complex contact interactions between objects, the environment, and robots. Model-based approaches can efficiently generate complex trajectories of robots and objects under contact constraints. However, they tend to be sensitive to model inaccuracies and require access to privileged information (e.g., object mass, size, pose), making them less suitable for novel objects. In contrast, learning-based approaches are typically more robust to modeling errors but require large amounts of data. In this paper, we bridge these two approaches to propose a framework for learning closed-loop pivoting manipulation. By leveraging computationally efficient Contact-Implicit Trajectory Optimization (CITO), we design demonstration-guided deep Reinforcement Learning (RL), leading to sample-efficient learning. We also present a sim-to-real transfer approach using a privileged training strategy, enabling the robot to perform pivoting manipulation using only proprioception, vision, and force sensing without access to privileged information. Our method is evaluated on several pivoting tasks, demonstrating that it can successfully perform sim-to-real transfer.

Related News & Events

NEWS MERL Researchers at NeurIPS 2025 presented 2 conference papers, 5 workshop papers, and organized a workshop.
Date: December 2, 2025 - December 7, 2025
Where: San Diego
MERL Contacts: Petros T. Boufounos; Anoop Cherian; Radu Corcodel; Stefano Di Cairano; Chiori Hori; Christopher R. Laughman; Suhas Lohit; Pedro Miraldo; Saviz Mowlavi; Kuan-Chuan Peng; Arvind Raghunathan; Diego Romeres; Abraham P. Vinod; Pu (Perry) Wang
Research Areas: Artificial Intelligence, Computational Sensing, Computer Vision, Control, Data Analytics, Dynamical Systems, Machine Learning, Multi-Physical Modeling, Optimization, Robotics, Signal Processing, Speech & Audio
Brief
- MERL researchers presented 2 main-conference papers and 5 workshop papers, and organized a workshop, at NeurIPS 2025.
  
  Main Conference Papers:
  
  1) Sorachi Kato, Ryoma Yataka, Pu Wang, Pedro Miraldo, Takuya Fujihashi, and Petros Boufounos, "RAPTR: Radar-based 3D Pose Estimation using Transformer", Code available at: https://github.com/merlresearch/radar-pose-transformer
  
  2) Runyu Zhang, Arvind Raghunathan, Jeff Shamma, and Na Li, "Constrained Optimization From a Control Perspective via Feedback Linearization"
  
  Workshop Papers:
  
  1) Yuyou Zhang, Radu Corcodel, Chiori Hori, Anoop Cherian, and Ding Zhao, "SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs", NeurIPS 2025 Workshop on SPACE in Vision, Language, and Embodied AI (SpaVLE) (Best Paper Runner-up)
  
  2) Xiaoyu Xie, Saviz Mowlavi, and Mouhacine Benosman, "Smooth and Sparse Latent Dynamics in Operator Learning with Jerk Regularization", Workshop on Machine Learning and the Physical Sciences (ML4PS)
  
  3) Spencer Hutchinson, Abraham Vinod, François Germain, Stefano Di Cairano, Christopher Laughman, and Ankush Chakrabarty, "Quantile-SMPC for Grid-Interactive Buildings with Multivariate Temporal Fusion Transformers", Workshop on UrbanAI: Harnessing Artificial Intelligence for Smart Cities (UrbanAI)
  
  4) Yuki Shirai, Kei Ota, Devesh Jha, and Diego Romeres, "Sim-to-Real Contact-Rich Pivoting via Optimization-Guided RL with Vision and Touch", Workshop on Embodied World Models for Decision Making
  
  5) Mark Van der Merwe and Devesh Jha, "In-Context Policy Iteration for Dynamic Manipulation", Workshop on Embodied World Models for Decision Making
  
  Workshop Organized:
  
  MERL members co-organized the Multimodal Algorithmic Reasoning (MAR) Workshop (https://marworkshop.github.io/neurips25/). Organizers: Anoop Cherian (Mitsubishi Electric Research Laboratories), Kuan-Chuan Peng (Mitsubishi Electric Research Laboratories), Suhas Lohit (Mitsubishi Electric Research Laboratories), Honglu Zhou (Salesforce AI Research), Kevin Smith (Massachusetts Institute of Technology), and Joshua B. Tenenbaum (Massachusetts Institute of Technology).

Related Publication

Shirai, Y., Ota, K., Jha, D.K., Romeres, D., "Learning Non-prehensile Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations", IEEE Robotics and Automation Letters, January 2026.


@article{Shirai2026jan,
author = {Shirai, Yuki and Ota, Kei and Jha, Devesh K. and Romeres, Diego},
title = {{Learning Non-prehensile Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations}},
journal = {IEEE Robotics and Automation Letters},
year = 2026,
month = jan,
url = {https://www.merl.com/publications/TR2026-011}
}

MERL Contact:

Diego Romeres
