Robot Foundation Models: Training Robots with Internet Scale Data
Jie Tan - Google
JMP 2116 or https://umd.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=66a19ef9-e9cd-434e-b0f3-b06000fc2ff5
Friday, November 10, 2023, 2:00-3:00 pm
Abstract

Foundation Models, after training on Internet-scale data, show an excellent understanding of natural language and images, possess common sense, and can perform logical reasoning and prediction. All of these capabilities are essential for developing intelligent and autonomous robots. How can we tap into the power of these Foundation Models in robotics? In this talk, I will cover three recent papers from Google DeepMind about building Robot Foundation Models: SayCan, RT-2, and ROSIE. They not only demonstrate how to ground large language models (LLMs) and vision-language models (VLMs) in the robots' physical capabilities and real-world environments, but also discuss how to leverage generative AI models for large-scale data augmentation. Pre-trained on Internet-scale text and image data, and fine-tuned on robotic data collected in the real world or imagined through diffusion models, the Robot Foundation Models show unprecedented capabilities for long-horizon task planning and generalizable low-level skills.

 

Host

Dinesh Manocha

Bio

Jie Tan is a Senior Staff Research Scientist at Google DeepMind. He leads the Robot Mobility and Embodied General Reasoning Teams, whose mission is to build intelligent and autonomous robots that can assist humans with daily tasks in human-centered environments. His research focuses on applying foundation models and deep reinforcement learning methods to robots, with interests spanning locomotion, navigation, manipulation, simulation, and sim-to-real transfer.

This talk is organized by Samuel Malede Zewdu