Podcast Summary: Azeem Azhar's Exponential View

Episode: AI in 2025 – A Global Perspective, with Kai-Fu Lee

Date: January 2, 2025
Host: Azeem Azhar
Guest: Kai-Fu Lee, AI researcher, investor, and CEO of Sinovation Ventures

Episode Overview

This episode features a wide-ranging conversation between host Azeem Azhar and renowned AI researcher and investor Kai-Fu Lee. They discuss how AI—particularly generative AI—is evolving globally, focusing on developments in China and Asia compared to the US and Europe. The conversation covers emerging AI applications, shifting business models, infrastructure and cost reductions, the future of search, multimodal AI, robotics, and key developments expected in 2025.

Key Discussion Points & Insights

1. The Generative AI Wave: US/EU vs China/Asia

US/EU's Viral "ChatGPT Moment": In the US, ChatGPT’s release triggered viral mainstream adoption. In Asia, and China specifically, local alternatives emerged more gradually (01:04).
Chinese Catch-Up: Chinese models are now close to GPT-4 in capability and notably cheaper, spurring broader experimentation (03:05).
Local Enterprise Approach: China’s lack of a mature SaaS ecosystem means enterprises “roll their own,” which requires deep, hands-on AI integration and leads to messier, more customized solutions (03:05).

“Chinese companies are going deeper and actually at this formative stage of generative AI technology, one probably has to go deep to extract the most value…”
—Kai-Fu Lee, (03:52)

2. AI-Native and AI-First Applications

AI-First vs. AI-Assisted: Lee distinguishes between tools where AI supports users (like Copilot) and AI-native tools where the user “prompts and tweaks” but AI does the heavy lifting (05:24).
Example: POP AI—an app designed for AI-led document creation, with human “tweaking” and a feature to "humanize" output.

“The key difference with Microsoft Copilot is AI does the writing, human does the tweaking.”
—Kai-Fu Lee, (06:51)

Humanization Trend: Tools now include features (“Humanize”) to make LLM output less detectable by AI-detection tools and more stylistically human (07:56).

3. Changing Workflows & Search Experiences

How Lee Uses Generative Tools: Lee feeds multiple documents into an AI tool, questions it for insights, then has it summarize and “humanize” the final results (09:07).
10x Users: The gap between average users and power users is growing in generative AI; more skillful prompting yields outsized productivity (09:43).
Next-Gen Search:
- Beagle: An AI-native search engine that delivers one answer to complex queries, pulling from verified and recent sources via RAG (Retrieval-Augmented Generation) (10:35).
- Shift Toward "Answer Engines": AI search will soon supplant conventional engines—anticipating a move from “search engines” to “answer engines” within a couple of years (13:26).
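
The retrieval step behind an "answer engine" of this kind can be illustrated with a toy sketch. This is not how Beagle is implemented, which the episode does not detail; it is a minimal illustration of Retrieval-Augmented Generation under stated assumptions. The `Doc` type, the `retrieve` and `build_prompt` functions, the keyword-overlap scoring, the one-year recency window, and the example sources are all hypothetical, and the final call to a language model is left out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    text: str
    source: str       # hypothetical source label
    published: date
    verified: bool    # assumption: sources carry a "verified" flag


def tokenize(text: str) -> set[str]:
    """Lowercase whitespace tokenization; real systems use embeddings instead."""
    return set(text.lower().split())


def retrieve(query: str, docs: list[Doc], today: date,
             max_age_days: int = 365, k: int = 2) -> list[Doc]:
    """Keep verified, recent documents and rank them by keyword overlap with the query."""
    q = tokenize(query)
    candidates = [d for d in docs
                  if d.verified and (today - d.published).days <= max_age_days]
    ranked = sorted(candidates, key=lambda d: len(q & tokenize(d.text)), reverse=True)
    # Drop documents that share no words with the query at all.
    return [d for d in ranked[:k] if q & tokenize(d.text)]


def build_prompt(query: str, retrieved: list[Doc]) -> str:
    """Assemble the grounded prompt that would be handed to a language model."""
    context = "\n".join(f"[{i + 1}] {d.text} ({d.source})"
                        for i, d in enumerate(retrieved))
    return f"Answer using only the sources below.\n{context}\nQuestion: {query}"


docs = [
    Doc("inference cost per million tokens fell sharply",
        "verified-news.example", date(2024, 12, 1), True),
    Doc("inference cost per million tokens was very high",
        "old-report.example", date(2022, 1, 1), True),
    Doc("inference cost rumors per million tokens",
        "unverified-forum.example", date(2024, 12, 15), False),
    Doc("humanoid robots learn to dance",
        "verified-robotics.example", date(2024, 11, 1), True),
]

query = "inference cost per million tokens"
hits = retrieve(query, docs, today=date(2025, 1, 2))
prompt = build_prompt(query, hits)
print(prompt)
```

In this sketch the outdated report and the unverified forum post are filtered out before ranking, so the single answer would be grounded only in the one verified, recent source.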

4. AI Agents, Multimodal Models, and Cost Dynamics in 2025

Agents:
- Lee expects the term “agent” to gain prominence, but sees near-term value in robust, vertical-specific applications.
- Multimodal (text-to-image) is maturing, while text-to-video and agents still face technical and economic barriers (14:04).
Robustness for Broad Application: General-purpose tools must achieve high reliability; vertical apps can thrive with more constraints (15:26).
Voice as Interface:
- Voice-driven agents excite Lee but are limited by legacy interface design (PC/apps built for keyboard/mouse, not voice) (15:38).
- Phones may see more impact from voice-first agents (17:01).

5. The End of Scaling Laws? Compute, Data, and Inference

Peak Data & Diminishing Returns:
- Ilya Sutskever, an OpenAI co-founder, argued that pre-training scaling is approaching its limits because of data exhaustion (17:25).
- Lee views scaling laws as reaching diminishing returns at the frontier but sees more room for “catch up” models (18:50).
Growing Importance of Inference:
- Compute demand is shifting from training toward inference—longer, more detailed answers require more inference compute (21:00).
- Example: 01.AI achieved world-class model rankings at low training cost ($3m), and has driven inference prices dramatically down (22:00, 23:07).
- Cost Milestones: Inference cost per million tokens dropped from $200 to $0.10 in a few years, enabling wider experimentation (23:07-24:21).

“If you were back in the early days of GPT4, it would have been 75 per million tokens. That would be 75 cents per search, which would bankrupt any company, including Google.”
—Kai-Fu Lee, (24:21)
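
The arithmetic behind that quote can be worked through in a few lines. This is a rough sketch under stated assumptions: the "75" is read as $75 per million tokens, and the figure of 10,000 tokens per search is not stated in the summary but is what the quote's own numbers imply ($75 per million tokens giving 75 cents per search). The function and variable names are illustrative only.

```python
def cost_per_query(price_per_million_tokens: float, tokens_per_query: int) -> float:
    """Dollar cost of one query at a given price per million tokens."""
    return price_per_million_tokens * tokens_per_query / 1_000_000


# Assumption: about 10,000 tokens per search, implied by the quoted figures.
TOKENS_PER_SEARCH = 10_000

early_cost = cost_per_query(75.0, TOKENS_PER_SEARCH)     # early GPT-4 pricing, per the quote
current_cost = cost_per_query(0.10, TOKENS_PER_SEARCH)   # the $0.10 milestone cited above

print(f"Early GPT-4 era: ${early_cost:.2f} per search")
print(f"At $0.10 per million tokens: ${current_cost:.4f} per search")
print(f"Reduction factor: {early_cost / current_cost:.0f}x")
```

Under these assumptions a search falls from 75 cents to a tenth of a cent, which is the gap between a cost that would sink a search business and one that allows broad experimentation.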

6. Infrastructure, Optimization & the Role of Memory

Speeding Up Inference:
- Innovations include: Mixture of Experts (MoE) for activating only needed parameters, optimization of memory/cache architecture, and algorithmic tricks (25:56-29:16).
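
The Mixture of Experts idea, activating only the parameters a given input needs, can be sketched as top-k routing. This is a toy illustration rather than the architecture of any model discussed in the episode: the experts here are plain scalar functions, and the gate scores are fixed by hand, whereas a real model computes them from the input with a learned router. The names `top_k_route` and `moe_forward` are hypothetical.

```python
import math


def top_k_route(gate_scores: list[float], k: int) -> dict[int, float]:
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    top = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)[:k]
    peak = max(gate_scores[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(gate_scores[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_forward(x: float, experts, gate_scores: list[float], k: int = 2):
    """Run only the selected experts and mix their outputs by routing weight."""
    weights = top_k_route(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in weights.items())
    return output, sorted(weights)


# Four toy "experts"; only two of them run for any given input.
experts = [
    lambda x: x + 1,
    lambda x: 2 * x,
    lambda x: -x,
    lambda x: x * x,
]

# Assumption: hand-picked gate scores standing in for a learned router's output.
gate_scores = [0.1, 2.0, -1.0, 2.0]

output, active = moe_forward(3.0, experts, gate_scores, k=2)
print(f"active experts: {active} of {len(experts)}, output: {output}")
```

The saving comes from the skipped experts: with k=2 out of four, half of the expert computation is never executed for this input, which is the sense in which MoE lowers inference cost.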
UX Implications:
- Users need flexibility—some tasks require speed, others demand accuracy (30:09-31:37).
- Need for interface designers who understand both UX and AI system constraints (32:19).

7. Emerging Business Models & Global AI Ecosystems

China’s Experimental Edge: Companies experiment more due to cheap models and the absence of rigid SaaS stacks.
New Models for SaaS: Firms in China and beyond may move to “service as software,” aligning incentives via value-sharing business models (e.g., JV with a patent company for cost savings/productization) (34:12-38:41).
US Market Advantages: Despite China’s flexibility, the US market offers more early-stage business opportunities thanks to mature infrastructure and a larger addressable market.

“It used to be software as a service, now it’s service as software…”
—Azeem Azhar referencing Sequoia paper, (38:41)

8. Reflections on 2024 and Expectations for 2025

Surprises: Tech progress and model accessibility exceeded Lee’s expectations in 2024, though fewer “killer apps” appeared due to earlier high inference costs (39:58).
Multimodal Lag: Despite its critical importance for real-world AI and robotics, multimodal AI remains largely at the demo stage, but Lee anticipates breakthroughs in 2025 (41:18).
Robotics & Autonomous Vehicles:
- Lee is bullish on autonomous vehicles (AVs) and embodied AI, with caveats around ecosystem maturity, regulatory risk, and expectations vs reality in humanoid robotics (44:29-48:00).

“With humanoid robots…people are going to get disappointed…you set a very high bar, which is why…early efforts to deploy natural language processing for customer service [failed].”
—Kai-Fu Lee, (47:01)

9. What’s Next for 2025

Lee’s Wishlist:
- Real multimodal apps
- Proliferation of AI-native apps replacing traditional ones
- Mature agents (48:31)
Expectations:
- On apps: Most people will be surprised at the “fast and furious” emergence of new apps (49:00).
- On models/tech: Cautious about over-promising (especially from OpenAI); risk of disappointment as breakthrough pace plateaus (49:00-50:09).
- User need: 2025 will be a year of continuous learning, experimenting, and adaptation.

Notable Quotes & Memorable Moments

On Global AI Ecosystems:
“Chinese companies are going deeper…one probably has to go deep to extract the most value…” —Kai-Fu Lee (03:52)
On the Rise of AI-Native Tools:
“The key difference with Microsoft Copilot is AI does the writing, human does the tweaking.” —Kai-Fu Lee (06:51)
On Search’s Next Frontier:
“I would bet…in two or three years we will mostly move out of the current search engine into what I would consider answering.” —Kai-Fu Lee (13:14)
On Cost Plummeting:
“If you were back in the early days of GPT4…that would be 75 cents per search, which would bankrupt any company, including Google.” —Kai-Fu Lee (24:21)
On Humanoid Robots:
“You set a very high bar, which is why…you are unhappy. But now…the question is, you build this human that dances…then it enters your home…then you’re saying, hey, why are you falling down here?…that might create a wave of disappointment.” —Kai-Fu Lee (47:01)
On 2025 App Explosion:
“I think on apps…most people will be surprised with how fast and furious these will come to the upside.” —Kai-Fu Lee (49:00)

Timestamps for Key Segments

[01:04] Generative AI in China & “The ChatGPT Moment”
[03:05] Market penetration, slow start, cheap models in China
[05:24] AI-First applications: Copilot vs. POP AI
[10:35] The next AI-native “answer engine” (Beagle)
[14:04] Turning points: Agents, multimodal models, and costs
[17:25] The end of scaling laws? Debating Ilya Sutskever’s comments
[22:00] Inference cost breakthroughs & compute implications
[30:09] UX tradeoffs: Speed vs. quality in AI responses
[34:12] Cheap models and the Chinese hands-on enterprise approach
[38:41] “Service as software”: New business models
[39:58] Reflections: 2024’s surprises and disappointments
[41:18] Multimodal & robotics—challenges, expectations, and future
[48:31] Kai-Fu Lee’s 2025 wishlist
[49:00] 2025: Upside surprises and the risk of overpromising

Conclusion

This in-depth conversation between Azeem Azhar and Kai-Fu Lee provides a rich, global perspective on the dramatic shifts and new frontiers in AI expected in 2025. The dialogue highlights the rapid democratization and cost reduction in model building, the importance of application innovation, emerging business models, and the coming push for true multimodal and agentic AI systems. While both are optimistic, they stress the importance of flexibility, experimentation, and realistic expectations in navigating the exponential pace of change.
