a16z Podcast – How OpenAI Builds for 800 Million Weekly Users: Model Specialization and Fine-Tuning

Date: November 28, 2025
Host: Andreessen Horowitz (a16z), featuring Martin Casado
Guest: Sherman Wu, Head of Engineering for OpenAI's Developer Platform

Overview: The Dawn of Model Specialization at Scale

This episode examines how OpenAI develops horizontal (API/developer platform) and vertical (first-party apps like ChatGPT) products at the same time, for a user base approaching 800 million weekly users. Martin Casado and Sherman Wu discuss the shift away from the “one model to rule them all” vision, the challenge of balancing platform and product ecosystems, the stickiness and differentiation of AI models, the growth of model fine-tuning, open source strategy, and the changing infrastructure of AI development.
Throughout, the conversation highlights how OpenAI has adapted to technical and market realities, including pricing models, enterprise demands, and practical agent-building.

Sherman Wu's Background and OpenAI’s Evolution ([01:51])
OpenAI’s Platform vs. Product (API vs. ChatGPT) ([08:09])
The Platform Paradox: Empowering Competitors ([08:57])
Models as Anti-Disintermediation Technology ([11:22])
Proliferation of Specialized AI Models ([16:10], [17:04])
Fine-Tuning, Reinforcement Learning, & Data Utility ([20:21])
Pricing AI: Usage-Based vs. Outcome-Based ([31:28])
Open Source Models: Cannibalization & Ecosystem Impact ([36:01])
Verticalization and Application-Specific Models ([39:27])
Language vs. Pixel Models (Text vs. Image/Video) ([41:45])
Evolving Agent Design: Determinism vs. AGI ([45:07])
SOPs, Regulated Use Cases, and Constrained Agents ([48:51])
Notable Quotes & Memorable Moments

1. Sherman Wu’s Background and OpenAI’s Evolution ([01:51]-[08:08])

Sherman Wu recounts his journey:
- Current: Leads Engineering for OpenAI’s developer platform (API, classified deployments e.g. Los Alamos Labs)
- Past: Six years at Opendoor (asset pricing ML), Quora (newsfeed ML, formative team)
- Education: MIT, CS+Master’s “crammed in”
On his move to OpenAI:
- Attracted by the exceptional team and intrigue from his Quora network.
- “OpenAI kind of kept a quiet profile. I had always kind of kept tabs on them because a bunch of the Quora people I knew ended up there... they were like, yeah, something crazy is happening here.” ([07:39])
Company culture contrasts:
- Opendoor: “business operations and by the book”
- OpenAI: “very different”

2. OpenAI’s Platform vs. Product (API vs. ChatGPT) ([08:09]-[10:39])

Wu describes OpenAI as unusual in combining horizontal (API) and vertical (ChatGPT) strategies early on.
Internal tension is present but “since day one, Sam [Altman] and Greg [Brockman]... have always told us, we want ChatGPT as a first-party app. We also want the API.” ([08:57])
“The mission of OpenAI, which is to create AGI and then distribute the benefits as broadly as possible... You want it in as many surfaces as you want.” ([09:08])

Notable Quote:

“A tenth of the globe uses it every week. Every week.” – Sherman Wu ([09:43])

3. The Platform Paradox: Empowering Competitors ([08:57]-[10:59])

ChatGPT's explosive growth has created “some tension” with API customers building competing products.
Wu: Given the rapid growth, competitors are less of a concern; the tension comes mainly from API customers who fear feature overlap with ChatGPT.
Industry-standard scenario: platforms enabling customers who may become direct competitors.

4. Models as Anti-Disintermediation Technology ([11:22]-[15:20])

Traditional platforms risk being “abstracted away.”
Wu and Casado assert that with AI models, abstraction is hard: users want to know which model they’re using.
- “You always know you’re using GPT-5.” – Martin Casado ([12:39])
- Model “stickiness” is high; a model is not a fungible software layer.
Emotional/user loyalty:
- “[O]n the product side... with GPT-5 launch... so many people liked o3 and 4o and all of that.”
- Users notice and care about personality and performance changes ([13:04])

Notable Quote:

“Retention of people building on our API is like surprisingly high, especially when people thought you could just kind of swap things around.” – Sherman Wu ([13:57])

5. Proliferation of Specialized AI Models ([16:10]-[18:32])

The one-model-to-rule-them-all era is over.
Now: “It’s becoming increasingly clear... there will be room for a bunch of specialized models.”
- OpenAI itself now maintains GPT-4.1, GPT-4o, GPT-5, Codex, and more.
Benefits: More models = more use cases, healthier ecosystem.

Notable Quotes:

“The crazy thing about all this is just how everyone’s thinking has just changed... Even within OpenAI, the thinking was that there would be one model that rules them all... It’s like definitely completely changed since then.” – Sherman Wu ([17:04])

6. Fine-Tuning, Reinforcement Learning, & Data Utility ([20:21]-[23:59])

Fine-Tuning API origin: massive demand from clients to “customize the models more.”
Companies have “giant treasure troves of data” and need to unlock its value beyond basic RAG (retrieval-augmented generation).
Early offerings struggled (“only useful for instruction following”); the recent unlock is “reinforcement fine tuning” (RFT), which allows reaching “SOTA level on a particular use case.”
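
To make the reinforcement fine-tuning idea more concrete: approaches of this kind depend on a grader that scores each model output, and that score becomes the training signal. The block below is a minimal sketch of such a grader in plain Python. It is not OpenAI's API; the names `Sample`, `grade_answer`, and `mean_reward`, the exact-match-plus-partial-credit rule, and the sample strings are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    prompt: str
    reference: str  # the answer a domain expert considers correct


def grade_answer(model_output: str, reference: str) -> float:
    """Return a score in [0.0, 1.0] using a hypothetical scoring rule."""
    out = model_output.strip().lower()
    ref = reference.strip().lower()
    if out == ref:
        return 1.0  # exact match earns the full reward
    ref_tokens = set(ref.split())
    if not ref_tokens:
        return 0.0
    overlap = len(ref_tokens & set(out.split()))
    # Partial credit is capped at 0.5 so only exact matches earn full reward.
    return 0.5 * overlap / len(ref_tokens)


def mean_reward(outputs: list[str], samples: list[Sample]) -> float:
    """Average grader score over a batch of model outputs."""
    scores = [grade_answer(o, s.reference) for o, s in zip(outputs, samples)]
    return sum(scores) / len(scores) if scores else 0.0
```

In practice the grader is where a company's proprietary data and domain expertise enter: the more faithfully it reflects what a correct answer looks like for one use case, the more useful the reward signal.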

Data Sharing Economics:

OpenAI is piloting discounted inference, and potentially free training, for clients willing to share their fine-tuning data.
- “If you actually build with the reinforcement fine tuning API, you can actually get discounted inference and potentially free training too. If you’re willing to share the data.” ([23:33])

7. Pricing AI: Usage-Based vs. Outcome-Based ([31:28]-[36:01])

API Pricing: Usage-based (cost-plus) due to high, variable compute costs.
“Usage based pricing... gets closer and closer to your true utility.”
- “Once you get a taste of usage based pricing, you’re never going to go back.” – Ben Cot via Sherman Wu ([33:35])
Outcome-based pricing is under consideration but is hard to measure and tends to correlate with usage.
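
The observation that outcome-based pricing tends to correlate with usage can be illustrated with a little arithmetic. The sketch below assumes every task consumes the same token budget; the per-token prices, the per-outcome price, and the function names are hypothetical and are not OpenAI's actual rates.

```python
def usage_cost(input_tokens: int, output_tokens: int,
               price_in_per_million: float, price_out_per_million: float) -> float:
    """Usage-based bill: tokens consumed times a per-million-token price."""
    total = input_tokens * price_in_per_million + output_tokens * price_out_per_million
    return total / 1_000_000


def outcome_cost(resolved_tasks: int, price_per_outcome: float) -> float:
    """Outcome-based bill: a flat price per successfully resolved task."""
    return resolved_tasks * price_per_outcome


def bills_for(tasks: int) -> tuple[float, float]:
    """Both bills for a workload where each task uses a fixed token budget (assumed)."""
    usage = usage_cost(tasks * 4_000, tasks * 1_000, 1.00, 4.00)
    outcome = outcome_cost(tasks, 0.02)
    return usage, outcome
```

Under this assumption, doubling the number of tasks doubles both bills, so the ratio between them stays fixed. Outcome-based pricing only diverges from usage-based pricing when tokens per successful outcome vary widely.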

Notable Quote:

“It actually ends up correlating quite a bit with usage based pricing... Maybe at the end of the day usage based pricing is all you need.” – Sherman Wu ([35:34])

8. Open Source Models: Cannibalization & Ecosystem Impact ([36:01]-[39:27])

OpenAI recently released open-source models (GPT-OSS) after years of internal discussion.
No observed cannibalization; different use cases & customer base. “Inference is super hard.” ([38:37])
Open-sourcing seen as brand-building and ecosystem-growing, not a competitive threat.
Only a few “crown jewel” models matter for revenue/impact; open-sourcing high-end models wouldn’t undercut OpenAI due to infrastructure/inference requirements.

Notable Quotes:

“It was interesting because... OpenAI hadn’t launched anything, it just seemed like it was super anti open source. But... we were just trying to think, how can we sequence it?” – Sherman Wu ([36:41])
“To be clear, like we have not seen cannibalization at all. ... The use cases are very different.” – Sherman Wu ([38:25])

9. Verticalization and Application-Specific Models ([39:27]-[41:45])

Wu and Casado discuss the possibility and practice of deeply verticalized, product-specific models.
More common in image (diffusion models); harder with large text models due to compute/infrastructure.
“[With image models] You can fine tune an image diffusion model to be extremely good at editing faces... much heavier motion on the text side.” – Sherman Wu ([41:36])

10. Language vs. Pixel Models (Text vs. Image/Video) ([41:45]-[44:51])

OpenAI operates “world simulation” (image/video) and language teams as mostly separate orgs and infrastructures.
“Props to Mark on our research team...” ([42:14])
Wu: Sora app, DALL·E 2, ImageGen, and others available in the API; image/video use cases continue to expand.

11. Evolving Agent Design: Determinism vs. AGI ([45:07]-[48:20])

Agents: Not defined as a separate modality; OpenAI sees agents (like Sora, Codex) as different interfaces to core intelligence.
Agent Builder: OpenAI’s recent release takes a “deterministic” (node-based/low-code) approach, catering to real-world needs for standardized, reliable automation, often for procedural or SOP-driven work.
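
As a rough illustration of what a node-based, deterministic flow means, the sketch below wires a few plain Python functions into an explicit graph in which every transition is spelled out. It is not Agent Builder's actual format or API; the node names, the keyword rule, the reply strings, and the `run` helper are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A node takes the shared state and returns (updated state, next node name or None to stop).
Node = Callable[[dict], Tuple[dict, Optional[str]]]


def classify(state: dict) -> Tuple[dict, Optional[str]]:
    # A real agent might call a model here; this sketch uses a fixed keyword rule.
    is_refund = "refund" in state["message"].lower()
    state["intent"] = "refund" if is_refund else "other"
    return state, ("refund_policy" if is_refund else "handoff")


def refund_policy(state: dict) -> Tuple[dict, Optional[str]]:
    state["reply"] = "Refunds are processed within 5 business days."
    return state, None


def handoff(state: dict) -> Tuple[dict, Optional[str]]:
    state["reply"] = "Routing you to a human agent."
    return state, None


WORKFLOW: Dict[str, Node] = {
    "classify": classify,
    "refund_policy": refund_policy,
    "handoff": handoff,
}


def run(workflow: Dict[str, Node], start: str, state: dict, max_steps: int = 10) -> dict:
    """Walk the graph from `start`, recording the path taken for auditing."""
    current: Optional[str] = start
    path: List[str] = []
    for _ in range(max_steps):
        if current is None:
            break
        path.append(current)
        state, current = workflow[current](state)
    state["path"] = path
    return state
```

Because the set of nodes and edges is fixed up front, the same input always follows the same path, and that path can be logged and reviewed.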

Product Philosophy:

Two major work archetypes:
1. Highly creative, undirected (e.g., coding, data analysis)
2. Procedural, SOP-bound (e.g., customer support, regulated tasks)
Wu: Determinism in agents is essential for the latter, and underappreciated in Silicon Valley.

Notable Quotes:

“There’s a huge need on that side to have determinism, of which an agent builder with nodes... ends up being very, very helpful.” – Sherman Wu ([48:20])

12. SOPs, Regulated Use Cases, and Constrained Agents ([48:51]-[51:54])

Discussion of industries (esp. regulated, e.g., finance, healthcare) that require agent determinism:
- Only allow certain responses, pass in logic as code, validate agent output
- Gaming analogy: NPC behavior logic in code to stay within defined actions
Wu: OpenAI’s Agent Builder targets this use case by explicitly allowing high-structure constraints.
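
The "only allow certain responses, validate agent output" pattern can be sketched as a small guard that sits between the model and whatever executes its decision. This is a minimal illustration, not part of any OpenAI product; the action names, the JSON shape, and the `validate_agent_output` function are assumptions.

```python
import json

# Hypothetical allow-list for a regulated support workflow.
ALLOWED_ACTIONS = {"check_balance", "send_statement", "escalate_to_human"}


def validate_agent_output(raw: str) -> dict:
    """Parse a model's proposed action and reject anything outside the allow-list."""
    fallback = {"action": "escalate_to_human", "reason": "invalid or disallowed output"}
    try:
        proposal = json.loads(raw)
    except json.JSONDecodeError:
        return fallback  # output was not valid JSON
    if not isinstance(proposal, dict):
        return fallback  # valid JSON, but not the expected object shape
    action = proposal.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return fallback  # action is missing, malformed, or not permitted
    return {"action": action, "reason": "allowed"}
```

The design choice here is to fail closed: anything the guard cannot parse or recognize is routed to a human rather than executed.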

Notable Quote:

“If you do not give it any of this, like, it can just kind of go off and do whatever. And yet there are like regulatory concerns around this and that is the exact use case that I think we’re trying to target with Agent Builder.” – Sherman Wu ([51:54])

13. Notable Quotes & Memorable Moments

“10% of the globe uses [ChatGPT] every week.” – Sherman Wu ([09:43])
“Retention of people building on our API is surprisingly high...” – Sherman Wu ([13:57])
“Even within OpenAI, the thinking was that there would be one model that rules them all... It’s like definitely completely changed since then.” – Sherman Wu ([17:04])
“Usage based pricing... gets closer and closer to your true utility.” – Sherman Wu ([33:35])
“Inference is super hard.” – Martin Casado ([38:37])
“There’s a huge need on that side to have determinism.” – Sherman Wu ([48:20])
“That is the exact use case... we’re trying to target with Agent Builder.” – Sherman Wu ([51:54])

Key Timestamps for Major Discussion Points

[01:51] Sherman Wu’s background and team culture contrasts
[08:09] OpenAI’s dual (API vs. app) approach; strategy tensions
[11:22] API platform paradox and “anti-abstraction” of models
[13:43] Model loyalty and user experience stickiness
[16:10], [17:04] Proliferation of specialized models and the end of "one model"
[20:21] Sophistication and value in fine-tuning; role of customer data
[23:33] Incentives and economics of sharing fine-tuning data
[31:28] Approaches to pricing, shift to usage-based, outcome-based considerations
[36:01] OpenAI's open source release: strategy, risk, and real impact
[39:27] Verticalization of models and contrast with image AI
[41:45] Text vs. pixel models: infrastructure and operational lessons
[45:07] Philosophy of agent design and OpenAI's Agent Builder
[48:51] Standard operating procedures in enterprise agent use; regulatory/industry needs

Summary Tone

Casado and Wu’s conversation is open, highly technical, and consistently pragmatic. They debate not only OpenAI’s strategic decisions but also industry-wide shifts, while acknowledging how fast, and how unpredictably, things change when building AI as both infrastructure and end-user product.
