TR2025-046

Learning global control of underactuated double pendulum with Model-Based Reinforcement Learning


    •  Turcato, N., Cali, M., Dalla Libera, A., Giacomuzzo, G., Carli, R., Romeres, D., "Learning global control of underactuated double pendulum with Model-Based Reinforcement Learning", IEEE International Conference on Robotics and Automation (ICRA) - 3rd AI Olympics with RealAIGym Competition, April 2025.
      BibTeX:
      @inproceedings{Turcato2025apr,
        author = {Turcato, Niccolò and Cali, Marco and Dalla Libera, Alberto and Giacomuzzo, Giulio and Carli, Ruggero and Romeres, Diego},
        title = {{Learning global control of underactuated double pendulum with Model-Based Reinforcement Learning}},
        booktitle = {IEEE International Conference on Robotics and Automation (ICRA) - 3rd AI Olympics with RealAIGym Competition},
        year = 2025,
        month = apr,
        url = {https://www.merl.com/publications/TR2025-046}
      }
  • Research Area: Robotics

Abstract:

This report describes our proposed solution for the third edition of the "AI Olympics with RealAIGym" competition, held at ICRA 2025. We employed Monte-Carlo Probabilistic Inference for Learning Control (MC-PILCO), a Model-Based Reinforcement Learning (MBRL) algorithm recognized for its exceptional data efficiency across various low-dimensional robotic tasks, including cart-pole, ball & plate, and Furuta pendulum systems. MC-PILCO learns a model of the system dynamics from interaction data and refines the policy by simulating that model, rather than optimizing the policy directly on data collected from the system. This approach has proven highly effective on physical systems, offering greater data efficiency than Model-Free (MF) alternatives. Notably, MC-PILCO won the first two editions of this competition, demonstrating its robustness in both simulated and real-world environments. Besides briefly reviewing the algorithm, we discuss the most critical aspects of the MC-PILCO implementation for the tasks at hand: learning a global policy for the pendubot and acrobot systems.
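
The model-based loop described in the abstract (collect interaction data, fit a dynamics model, improve the policy on simulated rollouts) can be illustrated with a deliberately tiny sketch. This is not MC-PILCO and not the authors' code: the scalar toy system `true_step`, the linear least-squares model, the quadratic cost, and the grid search over a single feedback gain are all assumptions made here for brevity, and every function name is hypothetical. It only shows the general pattern of estimating expected cost by Monte Carlo over sampled initial states ("particles") on a learned model.

```python
import random


def true_step(x, u, rng):
    """Hypothetical 'real' system, unknown to the learner (assumption)."""
    return 0.9 * x + 0.5 * u + rng.gauss(0.0, 0.01)


def fit_linear_model(data):
    """Least-squares fit of x_next ~ a*x + b*u from (x, u, x_next) tuples."""
    sxx = sum(x * x for x, u, y in data)
    sxu = sum(x * u for x, u, y in data)
    suu = sum(u * u for x, u, y in data)
    sxy = sum(x * y for x, u, y in data)
    suy = sum(u * y for x, u, y in data)
    det = sxx * suu - sxu * sxu  # determinant of the 2x2 normal equations
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b


def simulated_cost(gain, model, particles, horizon=20):
    """Monte Carlo estimate of the cost of u = -gain * x on the learned model."""
    a, b = model
    total = 0.0
    for x in particles:
        for _ in range(horizon):
            u = -gain * x
            x = a * x + b * u  # simulated step, no real-system interaction
            total += x * x + 0.01 * u * u
    return total / len(particles)


def run(trials=3, seed=0):
    rng = random.Random(seed)
    data, gain = [], 0.0
    model = None
    for _ in range(trials):
        # 1) Interact with the real system: current policy plus exploration noise.
        x = rng.gauss(1.0, 0.1)
        for _ in range(30):
            u = -gain * x + rng.gauss(0.0, 0.3)
            x_next = true_step(x, u, rng)
            data.append((x, u, x_next))
            x = x_next
        # 2) Refit the dynamics model on all data collected so far.
        model = fit_linear_model(data)
        # 3) Improve the policy using simulated rollouts only.
        particles = [rng.gauss(1.0, 0.1) for _ in range(50)]
        candidates = [i * 0.05 for i in range(61)]  # gains from 0.0 to 3.0
        gain = min(candidates, key=lambda k: simulated_cost(k, model, particles))
    return model, gain


if __name__ == "__main__":
    learned_model, learned_gain = run()
    print("learned (a, b):", learned_model)
    print("selected gain:", learned_gain)
```

In this sketch the real system is queried only 30 times per trial, while the policy search evaluates 61 candidate gains on 50 simulated particles each. That asymmetry is the source of the data efficiency the abstract attributes to model-based methods.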