NVIDIA AI Podcast — Episode 278
What Open Source Teaches Us About Making AI Better
Date: October 21, 2025
Host: Noah Kravitz
Guests:
- Bryan Catanzaro, VP of Applied Deep Learning Research, NVIDIA
- Jonathan Cohen, VP of Applied Research, NVIDIA
Episode Overview
This episode explores the philosophy and strategy behind NVIDIA's open source AI initiative, "Nemotron." The conversation goes deep into why openness accelerates AI development, how collaboration shapes the future of AI, and the unique technical innovations that make Nemotron stand out. The guests reveal how Nemotron acts as a cornerstone for NVIDIA's end-to-end AI platform, discuss its impact on efficiency, transparency, and customizable enterprise AI workflows, and give listeners a peek at what's next for the Nemotron family.
Key Discussion Points & Insights
1. What is Nemotron? (01:29–03:33)
- Nemotron is NVIDIA's flagship open technology for AI, including:
  - A family of large language models (LLMs), both text-only and multimodal.
  - Released datasets, algorithms, and methodologies.
- Aimed at supporting the global community in building customizable AI deeply integrated into businesses.
- Model Sizes ("Weight Classes"): Nano (small), Super (medium), Ultra (frontier/large).
Quote:
"Nemotron includes models that we train...data sets that we release, as well as algorithms and methodologies. Our goal...is to support the community in building customizable AI that can be integrated deeply and tightly into the beating heart of every business around the world."
— Bryan Catanzaro (01:45)
2. Nemotron's Role in NVIDIA's AI Strategy (03:33–05:01)
- Co-Design Approach: Nemotron allows NVIDIA to optimize across hardware, software stack, networking, and models.
- Full Stack Optimization: Enables efficient, low-latency, high-throughput, energy-efficient AI systems.
- Learning by Doing: Directly building models enables NVIDIA to learn and iterate rapidly.
Quote:
"Our success comes from this full-stack co-design and optimization."
— Jonathan Cohen (04:40)
3. The Importance of Smart, Refined Data (05:01–07:23)
- Quality of Data Matters: Model convergence time can be improved 4x by refining pre-training datasets.
- Synthetic Data & Data Curation: Data curation is essential for efficient and effective AI; not all web text is useful.
- End-to-End Optimization: Smart datasets are now considered as critical as hardware acceleration.
Quote:
"We've accelerated pre-training by a factor of 4x just by having a smarter pre-training dataset..."
— Bryan Catanzaro (05:09)
4. Efficiency in Inference & Reasoning (07:23–08:19)
- Token Efficiency: Smarter models can generate quality answers in fewer tokens, improving inference by up to 5x.
- Accelerated computing is about capabilities, not just speed.
Quote:
"You care if you can generate the same quality answer in 2,000 tokens instead of 10,000 tokens. That's effectively a 5x speedup."
— Jonathan Cohen (07:35)
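The 5x figure follows directly from the token counts: at a fixed per-token generation rate, a model that reaches the same answer in fewer tokens finishes proportionally faster. A minimal sketch of that arithmetic (token counts from the quote; the per-token decode rate is an assumed illustrative value):

```python
# Effective speedup from token efficiency: at a fixed decode rate,
# answer latency scales linearly with the number of tokens generated.

def answer_latency(tokens: int, seconds_per_token: float) -> float:
    """Time to generate an answer of `tokens` tokens."""
    return tokens * seconds_per_token

SECONDS_PER_TOKEN = 0.02  # illustrative decode rate (assumption)

verbose = answer_latency(10_000, SECONDS_PER_TOKEN)  # long-winded answer
concise = answer_latency(2_000, SECONDS_PER_TOKEN)   # same-quality, shorter

speedup = verbose / concise
print(f"effective speedup: {speedup:.0f}x")  # prints "effective speedup: 5x"
```

The point of the sketch is that the speedup is a property of the model's reasoning efficiency, not the hardware: the decode rate cancels out of the ratio.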
5. Openness and Trust — Why Open Source? (08:43–10:56)
- Transparency Builds Trust: Open source enables enterprises to understand, adapt, and trust their AI deployments.
- Industry Differentiation: Open platforms let industries create solutions tailored to their needs, much like the internet did across sectors.
- Respect for Privacy & Sovereignty: Open models allow sensitive or specific data to be safely included, without losing control.
Quote:
"We really think that it's important for AI to be trusted and widely deployed. And in order for that to happen, we think it's important that enterprises have the option to understand the data sets and the technologies behind AI and fine-tune them for their own problems..."
— Bryan Catanzaro (09:03)
6. Open Source as a Delivery Mechanism (10:56–11:46)
- Disseminating Improvements: Sharing models, algorithms, and datasets through open source is the only way to deliver these improvements broadly.
Quote:
"Open source is a delivery mechanism for the technology that's going into our platform."
— Jonathan Cohen (11:36)
7. Enterprise Customization & Sovereign AI (11:46–14:03)
- Custom Models: Organizations can build, fine-tune, and adapt Nemotron to their needs—ensuring privacy and domain specificity.
- Transparent Recipes: Openly shared methods and data enable full reproducibility, making exclusions, customizations, and tweaks possible.
Quote:
"Everything that we did is transparent. And so you can make these modifications yourself."
— Jonathan Cohen (13:18)
8. Transparency: Speed & Trust in Adoption (14:03–17:14)
- Inspectability: Being able to see what's inside fosters organizational trust.
- Integration & Flexibility: Models can run locally or in the cloud—fits security needs.
- Accelerated Progress: More collaboration → faster innovation (examples: OpenAI, Alibaba, Meta/Llama).
Quote:
"If you don't know what's in a technology, it's harder to trust it. And every business has different ways of thinking about the problems they're solving."
— Bryan Catanzaro (14:03)
9. Internal Collaboration & Scaling AI Projects (18:35–21:37)
- Huge Team Effort: Models go through pre-training, post-training, alignment; multiple teams collaborate.
- Beyond Conway's Law: Scaling AI projects requires internal openness and breaking classic modular boundaries.
- AI Development Culture: Mature, egoless, and collaborative cultures are now essential.
Quote:
"All of these things have to get combined together...the modularity is less than in software engineering...there's a real struggle in scaling up an effort like this to a very large team."
— Jonathan Cohen (19:44)
10. AI Changed by Collaboration (21:37–23:07)
- From Lone Genius to Teams: The days of single researchers making state-of-the-art advances are over; industrial-scale collaboration is key.
- Openness Drives Breakthroughs: Internal openness at NVIDIA, now extended to the broader ecosystem.
Quote:
"Organizations that can figure out how to collaborate and work together succeed."
— Bryan Catanzaro (22:10)
11. Fully Integrated, Disaggregated for Ecosystem (23:07–26:20)
- End-to-End Products: Like NVIDIA hardware, Nemotron is designed as a full system but offered in pieces so users can build their solution.
- Ecosystem Philosophy: Take what you want — software, data, recipes, or finished models — everything is interoperable.
Quote:
"We're open and interoperable...we're not really locking anyone out at all. Right. We're including everybody."
— Jonathan Cohen (24:45)
12. Technical Breakthroughs in Nemotron (26:20–30:47)
- Hybrid State Space Models: Nemotron Nano V2 combines transformers with state space models for 6–20x efficiency gains.
- 4-Bit Floating Point Arithmetic: Successful world-class models trained at ultra-low precision (energy and cost savings).
- Hardware-Software Co-Design: NVIDIA’s Blackwell GPU and Transformer Engine innovations support new architectures.
Quote:
"We released a model...a hybrid state space model...on the same hardware compared with other models of the same intelligence, we're about six to twenty times faster."
— Bryan Catanzaro (26:50)
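One way to see where that speedup comes from: a state-space layer carries a fixed-size hidden state from token to token, so per-token inference cost stays constant, whereas attention must revisit a key-value cache that grows with sequence length. A toy sketch of a diagonal linear state-space recurrence (not the actual Nemotron architecture; all shapes and values here are illustrative):

```python
import numpy as np

# Toy diagonal linear state-space recurrence:
#   h_t = a * h_{t-1} + b * x_t,    y_t = c . h_t
# The state h has a fixed size no matter how many tokens have been
# processed, so the work per token is constant.

rng = np.random.default_rng(0)
d_state = 8                              # illustrative state size
a = rng.uniform(0.5, 0.99, d_state)      # per-channel decay
b = rng.standard_normal(d_state)         # input projection
c = rng.standard_normal(d_state)         # output projection

h = np.zeros(d_state)                    # recurrent state
for x_t in rng.standard_normal(16):      # 16 toy input tokens
    h = a * h + b * x_t                  # O(d_state) work per token
    y_t = c @ h                          # scalar output for this token

print(h.shape)  # state size never grows: (8,)
```

Contrast this with attention, where generating token t requires attending over all t previous positions, so per-token cost and memory grow with context length.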
Quote:
"It's amazing that four bits is enough...you can have any number you want as long as it's one of these 16 and somehow...it still works."
— Jonathan Cohen (30:36)
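The "one of these 16" follows from the bit width: 4 bits give 2^4 = 16 encodings. A hedged sketch of round-to-nearest quantization onto an FP4-style grid (the value set below mirrors the common E2M1 layout; it is illustrative, and real low-precision training also uses per-block scale factors, omitted here):

```python
import numpy as np

# 4 bits yield 2**4 = 16 encodings. The E2M1-style magnitudes below
# give 15 distinct values once +0 and -0 coincide (illustrative grid,
# not necessarily the exact format used in any given training recipe).
_MAGNITUDES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = np.unique(np.concatenate([-_MAGNITUDES, _MAGNITUDES]))

def quantize_fp4(x: np.ndarray) -> np.ndarray:
    """Round each element to the nearest representable FP4 value."""
    idx = np.abs(x[..., None] - FP4_GRID).argmin(axis=-1)
    return FP4_GRID[idx]

w = np.array([-5.2, -0.3, 0.07, 1.4, 2.6, 7.5])
print(quantize_fp4(w))  # nearest grid values: -6, -0.5, 0, 1.5, 3, 6
```

Note how coarse the grid is: every weight in a layer gets snapped to one of these few values, which is why it is surprising, as the quote says, that training still works.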
13. What’s Next & How to Get Started (31:10–32:49)
- Future Directions:
- Even larger models coming soon.
- Increased multimodal capabilities (audio, speech, etc.).
- Continued advances in reasoning and cross-model integration.
- How to Access Nemotron:
- Models available on Hugging Face and build.nvidia.com.
- Central Nemotron landing page on nvidia.com (content expanding regularly).
- Open to community contributions and feedback.
Quote:
"We have some of the world's best open weight speech recognition models...that technology hasn't really been incorporated into Nemotron and we're working towards adding audio and these kinds of capabilities."
— Jonathan Cohen (31:15)
Quote:
"You should expect us to train some big models...Incorporate more multimodal technology...bringing all of the best technology across Nvidia and concentrating it in Nemotron."
— Jonathan Cohen (31:12)
Quote:
"The models are available now...on Hugging Face...on build.nvidia.com...and our Nemotron landing page."
— Bryan Catanzaro & Jonathan Cohen (32:28–32:49)
Notable Quotes & Memorable Moments
- On Open Source's Acceleration Effect
  "The more we're collaborating, the faster we move as a whole."
  — Noah Kravitz (17:14)
- On Full-Stack Ecosystem Philosophy
  "We're including everybody...it's why we've been so successful."
  — Jonathan Cohen (24:45)
- On the Four-Bit Training Revolution
  "With four bits you can represent 16...if you're going to draw a picture using 4-bit numbers, it's actually going to be pretty hard to make it look smooth."
  — Bryan Catanzaro (29:16)
- Summary Reflections
  "Organizations that can figure out how to collaborate and work together succeed. And that's one of the reasons also that we really believe in Nemotron as an open project."
  — Bryan Catanzaro (22:10)
Important Timestamps
- Introduction to Nemotron and NVIDIA's Open Strategy: 01:29–03:33
- Stack-wide Co-Design and Accelerated Computing: 03:33–05:01
- The Crucial Role of Data Quality: 05:01–07:23
- Efficiency in Inference & Reasoning: 07:23–08:19
- Openness, Collaboration, and Trust: 08:43–11:46
- Enterprise Customization/Sovereign AI: 11:46–14:03
- Moving from Research to Production & Internal Collaboration: 18:35–21:37
- Integration Philosophy & Ecosystem Approach: 23:07–26:20
- State Space Models & Four-Bit Arithmetic: 26:20–30:47
- Future Direction & How to Get Started: 31:10–32:49
Conclusion
This episode delivers an insider's understanding of how open source principles, community-driven research, and end-to-end platform thinking are driving the next generation of AI at NVIDIA. Nemotron stands as a unique, open family of models and methodologies engineered for flexibility, trust, and cross-industry adoption. It exemplifies a philosophy of openness not just for technical gain, but as a foundation for trust, speed, and opportunity in the AI age.
Resources
- NVIDIA AI Podcast
- Nemotron models on Hugging Face
- build.nvidia.com
- Nemotron landing page on nvidia.com
For anyone seeking to deploy, customize, or just understand cutting-edge AI, Nemotron and this conversation provide a practical, transparent jumping-off point.
