Learning Robust, Real-world Visuomotor Skills from Generated Data

Speaker

Ge Yang
MIT CSAIL

Host

Phillip Isola
MIT CSAIL

Abstract:
The mainstream approach in robot learning today relies heavily on imitation learning from real-world human demonstrations. These methods are sample-efficient in controlled environments and scale easily to a large number of skills. However, I will present algorithmic arguments for why merely scaling up imitation learning is insufficient to advance robotics. Instead, my talk will focus on developing performant visuomotor policies in simulation, and on the techniques that make them robust enough to transfer directly to real-world color observations.

I will introduce LucidSim, our recent breakthrough in producing perceptive, real-world robot policies from synthetic data. Using only generated images, we trained a robot dog to perform parkour through obstacles at high speed, relying solely on a color camera for visual input. I will discuss how we generate diverse and physically accurate image sequences within simulated environments for learning, and address the system challenges we overcame to scale up. Finally, I will outline our push for versatility and our plan to acquire three hundred language-aware visuomotor skills by the end of this year. These are the first steps toward fully autonomous, embodied agents that possess deeper levels of intelligence.

Bio:
Ge Yang is a postdoctoral researcher working with Phillip Isola at MIT CSAIL. His research focuses on developing the algorithmic and system foundations for computational visuomotor learning, with an emphasis on learning from synthetic data and sim-to-real transfer. Ge's work is dedicated to making robots capable, versatile, and intelligent.

Before transitioning into AI and robotics, Ge earned his Ph.D. in Physics from the University of Chicago and a Bachelor of Science in Mathematics and Physics from Yale University. His background in physics motivates a multidisciplinary approach to problem-solving in AI. He is a recipient of the NSF Institute for AI and Fundamental Interactions Postdoctoral Fellowship and the Best Paper Award at the 2024 Conference on Robot Learning (CoRL), selected from 499 submissions.