NVIDIA AI Podcast: Bringing Robots to Life with AI – The Three Computer Revolution
Episode 274 | September 17, 2025
Host: Noah Kravitz
Guest: Yashraj Narang, Senior Research Manager, NVIDIA; Head of Seattle Robotics Lab
Episode Overview
In this episode, Noah Kravitz speaks with Yashraj Narang, head of NVIDIA’s Seattle Robotics Lab, about the evolving landscape of robotics at the intersection of AI, simulation, and cutting-edge hardware. The conversation explores the “Three Computer” concept central to NVIDIA’s robotics strategy, the growing capabilities of both traditional and humanoid robots, the critical role of simulation, and the field’s major limitations and future directions.
Key Discussion Points & Insights
1. Inside NVIDIA’s Seattle Robotics Lab
- Founding & Mission:
- The lab was founded in 2017 under Dieter Fox, inspired by NVIDIA CEO Jensen Huang's push to create a world-class robotics research initiative.
- "Jensen thinks way far out into the future. And at that point, he was getting really excited about robotics. And he said, you know, essentially that we need a research effort in robotics at NVIDIA, and that's really how the lab started." – Yashraj Narang [01:28]
- Academic focus with high conference engagement and research output, plus growing collaboration with NVIDIA’s product and engineering groups to bring research to industry.
- UW Partnership:
- Longstanding collaboration with the University of Washington through internships and research connections. Yash recently stepped into the lab’s leadership [03:07].
2. What Does It Mean to Bring Robots to Life?
- Defining a Robot:
- “A robot is a synthetic system that can perceive the world, can plan out sequences of actions, can make changes in the world, and it can be programmed… It typically serves some purpose of automation.” – Yashraj Narang [03:39]
- From Automation to Intelligence:
- Traditional robots in factory settings are impressive but lack adaptability and true “aliveness” due to limited real-time intelligence.
- Bringing robots to life involves intelligence capable of robust adaptation, learning from experience, and interacting with changing environments [04:22].
3. The Three Computer Revolution in Robotics
Three Pillars:
- First Computer: NVIDIA DGX
- Powerful AI supercomputers for training large models and running inference on them (e.g., GB200 systems, Grace Blackwell Superchips).
- Used for foundational understanding—taking in sensory data (visual, language) and mapping to actions.
- Second Computer: Omniverse & Cosmos
- Omniverse: Developer platform for simulation, rendering, and developing robotics applications (used in Isaac Sim, Isaac Lab).
- Enables data generation and testing in realistic virtual environments.
- Cosmos: Set of world models for robotics, including video prediction (Cosmos Predict), scene transformation (Cosmos Transfer), and vision-language reasoning (Cosmos Reason).
- Omniverse and Cosmos “generate data, generate experience, and evaluate robots in simulation... can come before or after the first computer.” – Yashraj Narang [08:21]
- Third Computer: Jetson AGX (incl. Thor)
- Onboard compute for robots to run inference in real-time on-device; critical for autonomy in dynamic settings [09:01].
- Naming Breakdown:
“D is apparently for deep learning and A is apparently for autonomous.” [08:53]
4. Recent Transformations in Robotics
- Key Innovations:
- Proliferation of compute power and simulation tools like Omniverse and Cosmos.
- Sim2Real breakthroughs, e.g., OpenAI Rubik’s Cube (simulated learning transferred to real dexterous manipulation) [11:31].
- Adoption of transformers and generative AI in robotics.
- Integration of large language models (LLMs, VLMs) to guide tasks, construct rewards, and automate 3D asset generation, progressively reducing dependence on manual human input [13:47].
- Quote:
“You can kind of view this transformation... as kind of taking the human or human ingenuity more and more out of the process or at higher and higher levels... so we're able to automate more and more of that.” – Yashraj Narang [14:25]
5. Modes of Robot Learning: Imitation vs. Reinforcement
- Imitation Learning:
- Robots learn by example—mimicking human demonstrations.
- “The purpose of imitation learning is to essentially mimic those demonstrations. The behaviors would ideally look as I have demonstrated it.” [16:35]
- Reinforcement Learning (RL):
- Robots learn through trial and error, discovering the best action sequences via rewards.
- “The key difference here is that I am not providing very much guidance… I'm letting the robot explore, try out many different things, and then come up with its own strategy.” [17:34]
- Comparison:
- Imitation learning = more human-like, efficient; RL = superhuman potential, better for difficult/demonstration-resistant tasks, but often less efficient.
- RL has enabled robots (and AI agents in games) to exceed human capabilities, especially in speed and optimization [19:21].
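The contrast above can be sketched in a toy one-dimensional setting (hypothetical illustration, not from the episode): behavior cloning regresses the policy toward whatever the demonstrations show, while reinforcement learning explores and keeps whatever scores better on the reward.

```python
import random

# Toy 1-D problem (hypothetical): the robot picks an action in [0, 1];
# the true optimum is 0.8, but the human demonstrations are slightly off.
OPTIMAL_ACTION = 0.8

def reward(action: float) -> float:
    """Task reward: higher is better, peaked at the optimum."""
    return -(action - OPTIMAL_ACTION) ** 2

def imitation_learning(demos, steps=1000, lr=0.1) -> float:
    """Behavior cloning: regress the policy toward the demonstrated actions."""
    policy = 0.0
    for _ in range(steps):
        target = random.choice(demos)
        policy += lr * (target - policy)  # squared-error step toward the demo
    return policy

def reinforcement_learning(steps=2000, noise=0.1) -> float:
    """Trial and error: perturb the policy, keep changes that score better."""
    policy = 0.0
    for _ in range(steps):
        candidate = policy + random.gauss(0, noise)  # exploration
        if reward(candidate) > reward(policy):
            policy = candidate
    return policy
```

With demonstrations of 0.70 and 0.75, the imitation learner settles near their mean (only as good as its teacher), while the reward-driven learner finds the optimum it was never shown, mirroring the "superhuman potential" point above.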
6. Modular vs. End-to-End Robotics Brains
- Modular Approach:
- Classic “perceive-plan-act” model; allows specialized teams, easier debugging, certification [22:02].
- End-to-End Approach:
- Direct mapping from raw sensory input (e.g., video, force data) to motor commands, with no intermediate steps; leverages more data, requires less human engineering.
- Trend:
- Robotics (like autonomous vehicles) is evolving toward hybrid models combining strengths of both paradigms [24:22, 25:15].
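A schematic contrast between the two paradigms (hypothetical toy code, not any particular robot stack): the modular pipeline exposes hand-designed intermediate representations that can be inspected and debugged stage by stage, while the end-to-end policy is a single learned map from observation to motor command.

```python
from typing import List

# --- Modular "perceive-plan-act": each stage is separately engineered,
# joined by human-designed intermediate representations. ---

def perceive(pixels: List[float]) -> dict:
    """Perception: raw pixels -> symbolic state (toy brightness feature)."""
    return {"brightness": sum(pixels) / len(pixels)}

def plan(state: dict) -> str:
    """Planning: symbolic state -> discrete decision."""
    return "approach" if state["brightness"] > 0.5 else "search"

def act(decision: str) -> List[float]:
    """Control: decision -> wheel velocities."""
    return [1.0, 1.0] if decision == "approach" else [0.5, -0.5]

def modular_policy(pixels: List[float]) -> List[float]:
    return act(plan(perceive(pixels)))

# --- End-to-end: one function straight from pixels to motor commands;
# the "weights" stand in for a trained network, and there is no
# inspectable intermediate plan. ---

def end_to_end_policy(pixels: List[float], w=(1.0, 1.0, 0.5, -0.5)) -> List[float]:
    x = sum(pixels) / len(pixels)
    g = 1.0 if x > 0.5 else 0.0  # stand-in for a learned nonlinearity
    return [g * w[0] + (1 - g) * w[2], g * w[1] + (1 - g) * w[3]]
```

The hybrid trend the episode describes amounts to keeping some modular structure (for debugging and certification) while letting learned end-to-end components absorb more of the pipeline as data grows.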
7. Traditional Robots vs. Humanoids
- Definitions:
- Traditional robots: fixed-purpose machines (e.g., factory arms).
- Humanoid robots: designed to mimic human form and capabilities for environments built for people (stairs, doors, tools) [27:41].
- Humanoid Momentum:
- “This perfect storm... advancements in intelligence and advancements in the hardware. So basically the body and the brain and kind of going back and forth…” – Yashraj Narang [29:00]
- World designed for humans means humanoids can be far more versatile in day-to-day environments.
- Quadruped Robots:
- Dog-like robots are called “quadrupeds” [27:41].
- Prediction:
- Future will see traditional and humanoid robots coexisting, each excelling in their respective domains [30:05].
8. Simulation, Synthetic Data, and Sim2Real
- Data Challenge:
- Unlike language/vision, robotics lacks a large, internet-scale corpus of data—hence, heavy reliance on simulation to generate synthetic datasets [30:53].
- Synthetic vs. Real Data:
- Synthetic data scalable via simulation (visual, physics), but real data remains critical for grounding and final validation.
- Sim2Real Gap:
- Simulation differs from reality in perception (visual quality), physics, sensor latency, etc.
- Bridging the gap:
- High-fidelity modeling (labor-intensive).
- Domain randomization: Randomize backgrounds, physics, sensor properties during training to improve robustness [34:04].
- Domain adaptation/invariance: Tailor or strip data to match real-world deployment context [35:34].
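Domain randomization can be sketched in a few lines (the parameter names and ranges below are illustrative, not from any specific simulator): each training episode draws fresh physics and sensing parameters, so the policy never overfits to one simulated world and, ideally, treats reality as just another draw from the distribution.

```python
import random

# Illustrative randomization ranges; real setups perturb many more
# properties (textures, lighting, camera pose, contact models, ...).
RANDOMIZATION_RANGES = {
    "friction":     (0.5, 1.5),   # surface friction coefficient
    "mass_scale":   (0.8, 1.2),   # multiplier on nominal link masses
    "sensor_delay": (0.0, 0.04),  # seconds of sensor/actuator latency
    "light_level":  (0.2, 1.0),   # rendering brightness
}

def sample_episode_config(rng: random.Random) -> dict:
    """Draw one training episode's simulator parameters uniformly from
    each range; robustness comes from training across many such draws."""
    return {name: rng.uniform(lo, hi)
            for name, (lo, hi) in RANDOMIZATION_RANGES.items()}
```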
9. Reasoning in Robotics: Vision-Language-Action Models
- Reasoning Models:
- Enable step-by-step task decomposition, planning, and execution.
- “Reasoning… often means in simple terms, thinking step by step.” [36:43]
- Vision-Language-Action (VLA) models are being used to, for example, break down and execute complex tasks like setting a table [38:01].
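The table-setting example can be caricatured as follows (the subtask names and primitives are hypothetical stand-ins for what a VLA model would produce): the "reasoning" step decomposes a language goal into subtasks, each of which is mapped to a learned motor primitive.

```python
# Stand-in for the model's step-by-step reasoning over a language goal.
SUBTASK_PLANS = {
    "set the table": ["fetch plate", "place plate", "fetch fork", "place fork"],
}

# Stand-in for learned visuomotor skills, keyed by subtask.
PRIMITIVES = {
    "fetch plate": "grasp(plate)",
    "place plate": "put(plate, table)",
    "fetch fork":  "grasp(fork)",
    "place fork":  "put(fork, table)",
}

def execute(goal: str) -> list:
    """Think step by step: decompose the goal, then run each primitive."""
    return [PRIMITIVES[step] for step in SUBTASK_PLANS[goal]]
```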
10. Current Limitations & Near-Future Outlook
- Biggest Current Bottlenecks:
- Sim2Real and Real2Sim Gaps:
- Need seamless data/experience transfer between simulation and real-world robots.
- “Wouldn’t it be great if we could just take some images or take some videos of the real world and instantly have a simulation that also has physics properties.” [41:36]
- Data Scarcity:
- No large deployed “fleet” of robots collecting data, unlike autonomous driving, where fleets of cars with dashcams gather it continuously.
- Data pyramid approach: mix YouTube/video, synthetic simulation data, and limited real-world data for generalization [41:36].
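The data pyramid idea can be sketched as a weighted mixture (the tiers come from the episode; the weights are illustrative): abundant, loosely labeled web video at the base, a large synthetic tier generated in simulation, and a thin, expensive layer of real-robot data on top.

```python
import random

# Illustrative mixture weights for the three tiers of the data pyramid.
DATA_PYRAMID = {
    "web_video":     0.60,  # abundant but loosely labeled
    "synthetic_sim": 0.35,  # scalable, generated in simulation
    "real_robot":    0.05,  # scarce and expensive, but grounds the model
}

def sample_batch_sources(batch_size: int, rng: random.Random) -> list:
    """Label each example in a training batch with its source tier,
    drawn according to the pyramid's mixture weights."""
    tiers = list(DATA_PYRAMID)
    weights = list(DATA_PYRAMID.values())
    return rng.choices(tiers, weights=weights, k=batch_size)
```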
- Laundry Folding Robots?
- Progress is promising using imitation learning and demonstration-rich datasets.
- “We're getting closer and closer, closer than certainly I've ever seen on tasks like laundry folding.” [39:55]
11. Cutting-Edge Research: Neural Robot Dynamics (NERD)
- Teaser for CoRL Conference:
- Neural simulators (NERD) replacing explicit simulation with neural network models, offering:
- Differentiability for optimization,
- Ease of fine-tuning with new data (including continuous adaptation as robots wear/tear),
- Speed improvements leveraging AI-optimized hardware.
- “If you can transform a typical simulator into a neural network, then you can really take advantage of all of these speed benefits that come with the latest compute... make accurate predictions over a long timescale.” [47:21]
Notable Quotes & Memorable Moments
- On Growing Humanoid Robotics:
- “I think the most common answer... is that the world has been designed for humans.” – Yashraj Narang [29:00]
- On Simulation as the Key to Data:
- “In contrast with a number of other areas like language and vision, robotics is widely acknowledged to have a data problem... that's really why so many people in robotics are very, very interested in simulation.” [30:53]
- On The Blurring Line Between Sim and Real:
- “The boundaries between sim and real, I think, will start to be a little bit blurred…” [41:50]
- On the Future:
- “On the brain side of things, there’s also these questions of, you know, modular versus end to end paradigms… I can imagine that robotics … will start to follow a similar trajectory… probably converge upon hybrid architectures until we collect enough data that an end to end model is actually all we need.” – Yashraj Narang [48:33]
Key Timestamps
- 01:28 – Seattle Robotics Lab origins & mission
- 03:39 – What is a robot? What does it mean to bring robots “to life”?
- 05:08 – The Three Computer Revolution explained
- 09:01 – AGX systems and the importance of on-device compute
- 11:31 – Sim2Real paradigm shift (e.g., Rubik’s Cube manipulation)
- 13:47 – Transition to generative AI and automation in robot learning
- 16:35 – Imitation vs. Reinforcement Learning
- 22:02 – Modular and end-to-end brain architectures
- 27:41 – Quadruped vs. humanoid robots
- 30:53 – The critical role of simulation and synthetic data
- 34:04 – Strategies to close the Sim2Real gap
- 36:43 – Reasoning in vision-language-action robotics models
- 39:55 – Laundry folding robots—how close are we?
- 41:36 – Key limitations: Sim2Real, Real2Sim, and the data pyramid
- 44:00 – Neural Robot Dynamics (NERD) research teaser
- 48:33 – Predictions for the next decade of robotics
Resources Mentioned
- NVIDIA AI Podcast
- CoRL Conference (corl.org)
- NVIDIA Seattle Robotics Lab
- Omniverse Platform
- NVIDIA Isaac Sim
Final Thoughts: Predicting the Future
Yash concludes that both traditional and humanoid robots will coexist, each excelling in their optimal domains. Modular versus end-to-end strategies will continue to mix in future architectures, with eventual full end-to-end models possible once dataset scale is sufficient. The line between simulation and reality will continue to blur, improving generalizability and pace of robotic intelligence.
- Closing Quote:
“We’ll be able to capture the complexity of the real world and make predictions in a very fluid way, perhaps using a combination of physics simulators and these world models… like Cosmos.” – Yashraj Narang [50:50]
For more interviews and research, see the official NVIDIA AI Podcast and the CoRL conference page.
