TR2021-032

Data-Efficient Learning for Complex and Real-Time Physical Problem Solving using Augmented Simulation


    •  Ota, K., Jha, D., Romeres, D., van Baar, J., Smith, K., Semitsu, T., Oiki, T., Sullivan, A., Nikovski, D.N., Tenenbaum, J., "Data-Efficient Learning for Complex and Real-Time Physical Problem Solving using Augmented Simulation", IEEE Robotics and Automation Letters, DOI: 10.1109/LRA.2021.3068887, Vol. 6, No. 2, March 2021.
      @article{Ota2021mar,
        author = {Ota, Kei and Jha, Devesh and Romeres, Diego and van Baar, Jeroen and Smith, Kevin and Semitsu, Takayuki and Oiki, Tomoaki and Sullivan, Alan and Nikovski, Daniel N. and Tenenbaum, Joshua},
        title = {Data-Efficient Learning for Complex and Real-Time Physical Problem Solving using Augmented Simulation},
        journal = {IEEE Robotics and Automation Letters},
        year = 2021,
        volume = 6,
        number = 2,
        month = mar,
        doi = {10.1109/LRA.2021.3068887},
        url = {https://www.merl.com/publications/TR2021-032}
      }
  • Research Areas:

    Computer Vision, Control, Machine Learning, Robotics

Abstract:

Humans quickly solve tasks in novel systems with complex dynamics, without requiring much interaction. While deep reinforcement learning algorithms have achieved tremendous success in many complex tasks, these algorithms need a large number of samples to learn meaningful policies. In this paper, we present a task for navigating a marble to the center of a circular maze. While this system is very intuitive and easy for humans to solve, it can be very difficult and inefficient for standard reinforcement learning algorithms to learn meaningful policies. We present a model that learns to move a marble in the complex environment within minutes of interacting with the real system. Learning consists of initializing a physics engine with parameters estimated using data from the real system. The error in the physics engine is then corrected using Gaussian process regression, which is used to model the residual between real observations and physics engine simulations. The physics engine augmented with the residual model is then used to control the marble in the maze environment using a model-predictive feedback over a receding horizon. To the best of our knowledge, this is the first time that a hybrid model consisting of a full physics engine along with a statistical function approximator has been used to control a complex physical system in real-time using nonlinear model-predictive control (NMPC).
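The residual-learning idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional "marble on an incline" dynamics, the drag term, the hand-written RBF Gaussian process, and all names (`sim_step`, `real_step`, `ResidualGP`, `augmented_step`) are assumptions made for illustration only. A nominal simulator stands in for the physics engine, a GP is fitted to the difference between "real" and simulated next-step velocities, and the augmented model adds the GP mean to the simulator prediction.

```python
import numpy as np

DT, G = 0.05, 9.81  # time step [s] and gravity [m/s^2]; illustrative values


def sim_step(v, u):
    """Nominal simulator (stand-in for a physics engine): frictionless incline."""
    return v + DT * G * np.sin(u)


def real_step(v, u, drag=0.8):
    """Hypothetical 'real' system: same physics plus unmodeled viscous drag."""
    return v + DT * (G * np.sin(u) - drag * v)


def rbf(A, B, length=1.0, var=1.0):
    """Squared-exponential kernel between row-wise inputs A (n, d) and B (m, d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / length**2)


class ResidualGP:
    """GP regression (posterior mean only) on the real-minus-simulated residual."""

    def __init__(self, X, y, noise=1e-4):
        self.X = X
        K = rbf(X, X) + noise * np.eye(len(X))
        self.alpha = np.linalg.solve(K, y)

    def mean(self, Xq):
        return rbf(Xq, self.X) @ self.alpha


# Collect a small data set of (velocity, tilt angle) pairs from the "real" system.
rng = np.random.default_rng(0)
X_train = np.column_stack([rng.uniform(-1, 1, 40), rng.uniform(-0.2, 0.2, 40)])
y_train = real_step(X_train[:, 0], X_train[:, 1]) - sim_step(X_train[:, 0], X_train[:, 1])
gp = ResidualGP(X_train, y_train)


def augmented_step(v, u):
    """Simulator prediction corrected by the learned residual."""
    Xq = np.column_stack([np.atleast_1d(v), np.atleast_1d(u)])
    return sim_step(Xq[:, 0], Xq[:, 1]) + gp.mean(Xq)


# Compare one-step prediction errors on held-out inputs.
X_test = np.column_stack([rng.uniform(-1, 1, 200), rng.uniform(-0.2, 0.2, 200)])
truth = real_step(X_test[:, 0], X_test[:, 1])
sim_err = np.abs(truth - sim_step(X_test[:, 0], X_test[:, 1])).mean()
aug_err = np.abs(truth - augmented_step(X_test[:, 0], X_test[:, 1])).mean()
print(f"simulator-only error: {sim_err:.5f}")
print(f"augmented-model error: {aug_err:.5f}")
```

In a control setting, a model such as `augmented_step` would be the one rolled out inside the model-predictive controller over the receding horizon; that part is omitted here to keep the sketch short.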

 

  • Related Publication

  •  Ota, K., Jha, D., Romeres, D., van Baar, J., Smith, K., Semitsu, T., Oiki, T., Sullivan, A., Nikovski, D.N., Tenenbaum, J., "Data-Efficient Learning for Complex and Real-Time Physical Problem Solving using Augmented Simulation", arXiv, December 2020.
    @article{Ota2020dec,
      author = {Ota, Kei and Jha, Devesh and Romeres, Diego and van Baar, Jeroen and Smith, Kevin and Semitsu, Takayuki and Oiki, Tomoaki and Sullivan, Alan and Nikovski, Daniel N. and Tenenbaum, Joshua},
      title = {Data-Efficient Learning for Complex and Real-Time Physical Problem Solving using Augmented Simulation},
      journal = {arXiv},
      year = 2020,
      month = dec,
      url = {https://arxiv.org/abs/2011.07193}
    }