Recent Progress on Foundation Model Supervision for Robot Learning

Speaker

Jason Ma
University of Pennsylvania

Host

Leslie Kaelbling

Abstract:
Achieving general-purpose robotics requires robots to quickly learn diverse tasks without extensive training data or hand-engineered controllers for each scenario. While recent efforts in crowd-sourcing robot datasets have expanded the available training data, these datasets remain orders of magnitude smaller than those used to train vision or language foundation models. Rather than focusing solely on scaling robot data, my research develops algorithms that train new foundation models and leverage existing ones from non-robot domains to provide scalable supervision across diverse robot embodiments, tasks, and policy learning approaches. In short, these algorithms enable robot learning from foundation model supervision. This approach automates task learning while bypassing labor-intensive controller design and data collection.

In this talk, I will present some recent progress in these directions. First, I will discuss Eurekaverse, an LLM-based environment curriculum generation algorithm that enables the acquisition of complex parkour skills in the real world. Second, I will present Generative Value Learning, a new approach to learning universal value functions, enabled by in-context learning with long-context VLMs.

Bio:
Jason Ma is a final-year PhD student at the University of Pennsylvania. His research interests include foundation models for robotics, robot learning, and reinforcement learning. His work was a Best Paper Finalist at ICRA 2024, was named among the Top 10 NVIDIA Research Projects of 2023, and has been covered by popular media such as The Economist, Fox, Yahoo, and TechCrunch. Jason is supported by the Apple Scholar in AI/ML PhD Fellowship and the OpenAI Superalignment Fellowship.