Big Technology Podcast Summary
Episode: Google DeepMind CTO: Advancing AI Frontier, New Reasoning Methods, Video Generation’s Potential
Release Date: May 20, 2025
Host: Alex Kantrowitz
Guest: Koray Kavukcuoglu, Chief Technology Officer of Google DeepMind
Introduction
In this episode of the Big Technology Podcast, host Alex Kantrowitz speaks with Koray Kavukcuoglu, the Chief Technology Officer of Google DeepMind. The discussion centers on the latest advancements in artificial intelligence, including the balance between scaling models and developing new techniques, the pursuit of Artificial General Intelligence (AGI), and recent developments in video generation.
The Scale vs. Techniques Debate in AI
A primary topic of discussion is whether scaling AI models or developing new techniques plays a more pivotal role in advancing AI capabilities.
Koray Kavukcuoglu emphasizes a balanced approach:
"It's rare that in any research problem you would have a dimension that pretty confident would give you improvements of course, with maybe diminishing returns, but most of the time with research, it's always like that."
(02:07)
He contends that while scaling is important, architectural innovations, algorithms, data quality, and inference-time techniques are equally critical in pushing the boundaries of AI models.
The Quest for AGI: Perspectives and Research Directions
Responding to skepticism that AGI can be reached through scaling alone, a question prompted by remarks from AI researcher Yann LeCun, Koray offers his perspective:
"From my point of view, we are investing in such a broad spectrum of research that I think that is what is necessary."
(07:35)
He underscores the necessity of exploring multiple research avenues and innovations beyond mere scaling to attain AGI, highlighting the ambitious and multifaceted research agenda at DeepMind.
Unveiling Deep Think: Enhancing Reasoning Capabilities
One of the episode's highlights is the introduction of Deep Think, a new mode for Google DeepMind's Gemini 2.5 Pro model designed to enhance reasoning during inference.
Koray Kavukcuoglu explains:
"Deep Think... is a mode that we are enabling our 2.5 Pro model so that it can spend a lot more time during inference, time to think, to build hypotheses."
(11:02)
Deep Think allows the model to build and reason over multiple hypotheses in parallel, rather than following a single chain of thought, before it commits to an answer.
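The episode does not describe how Deep Think is implemented. As a rough illustration of the general idea of reasoning over parallel hypotheses, the sketch below uses a simple, well-known stand-in: sample several independent candidate answers and keep the one most of them agree on. Every name here (`sample_hypothesis`, `majority_answer`, `answer_with_parallel_hypotheses`) is hypothetical, and the "reasoning path" is a toy random function, not a model call.

```python
import random
from collections import Counter


def sample_hypothesis(question, rng):
    """Hypothetical stand-in for one reasoning path.

    The toy 'question' is a tuple of integers to add. The path returns the
    right sum about 60% of the time and a nearby wrong value otherwise.
    """
    correct = sum(question)
    if rng.random() < 0.6:
        return correct
    return correct + rng.choice([-2, -1, 1, 2])


def majority_answer(candidates):
    """Return the most common candidate and the share of paths that chose it."""
    answer, count = Counter(candidates).most_common(1)[0]
    return answer, count / len(candidates)


def answer_with_parallel_hypotheses(question, n_paths=25, seed=0):
    """Explore several independent hypotheses, then aggregate by majority vote."""
    rng = random.Random(seed)
    candidates = [sample_hypothesis(question, rng) for _ in range(n_paths)]
    return majority_answer(candidates)


if __name__ == "__main__":
    answer, agreement = answer_with_parallel_hypotheses((5, 7))
    print(f"answer={answer}, agreement={agreement:.2f}")
```

The design point is that any single path can be wrong, but spending more inference-time compute on several paths and aggregating them tends to give a more reliable answer. A production system would likely use something more sophisticated than a vote.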
Velocity of Model Improvement: No Slowdown Seen
When asked about the pace of model improvement, especially given perceived plateaus at other AI labs, Koray sees no slowdown:
"I have no velocity slowdown is what I'm hearing from you."
(16:24)
He highlights the consistent progress of the Gemini model series, noting significant enhancements in multimodality and reasoning, which contribute to robust and steady improvements in AI capabilities.
Quantifying Model Improvements: 10% vs. 50%
The discussion turns to the practical value of incremental improvements in AI models. Koray argues that the value of an improvement depends on the metric being measured:
"If we can improve 10% by its understanding in math, understanding of really highly complex reasoning problems, I think that is a huge improvement."
(16:54)
He emphasizes that even modest improvements can significantly expand a model's applicability and effectiveness across various domains, aligning with user needs and real-world applications.
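To see why a modest-sounding gain can matter, here is a small worked example. The numbers are hypothetical and do not come from the episode: it assumes a benchmark of 100 problems where the model goes from 80 solved to 90 solved, and the helper name `error_reduction` is made up for illustration.

```python
def error_reduction(correct_before, correct_after, total):
    """Relative drop in the number of failed problems between two models."""
    errors_before = total - correct_before
    errors_after = total - correct_after
    if errors_before == 0:
        raise ValueError("no failures to reduce")
    return (errors_before - errors_after) / errors_before


# Hypothetical benchmark: 100 problems, 80 solved before, 90 solved after.
print(error_reduction(80, 90, 100))  # prints 0.5
```

Under these assumed numbers, a 10-point gain in accuracy cuts the number of failures in half, which is one way a "10%" improvement on hard problems can translate into a large practical difference.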
Advancements in Video Generation: From Veo 1 to Veo 3 and Beyond
Google DeepMind's progress in video generation is explored, focusing on the evolution from the Veo 1 to the Veo 3 models. Koray elaborates on the enhancements:
"With Veo 2, I think for the first time, we could comfortably say that for many, many cases, the model has understood the dynamics of the world."
(21:01)
Veo 3 introduces synchronized sound generation, creating more immersive and realistic videos. The accompanying product, Flow, allows users to storyboard and extend generated scenes, making AI-generated video more practical to work with.
Open Source vs. Proprietary Models: Coordinating the AI Ecosystem
A question from an AI researcher addresses the tension between open-source and proprietary AI models. Koray responds by highlighting Google DeepMind's commitment to both:
"I think it's not an either or. I think there are different kinds of use cases and communities that actually benefit from different kinds of models."
(24:54)
He explains that Google DeepMind releases open-weight models (the Gemma family) while also developing proprietary models under the Gemini umbrella, with the aim of ensuring responsible and impactful use of advanced AI technologies.
Vibe Coding: Democratizing Application Development
In a brief but enthusiastic exchange, Koray touches on vibe coding, the practice of building software by describing what you want to an AI model in natural language, which lowers the barrier to application development:
"All of a sudden it enables a lot of people who are not necessarily. Who do not necessarily have that coding background to build applications. It's a whole new world that is opening."
(28:02)
He expresses excitement about how such technologies empower a broader audience to create dynamic and interactive applications, fostering innovation and accessibility in tech development.
Conclusion
The episode concludes with Koray Kavukcuoglu reiterating Google DeepMind's dedication to advancing AI responsibly and effectively. The podcast also teases an upcoming interview with Google DeepMind CEO Demis Hassabis, leaving listeners with an overview of the current AI landscape, DeepMind's strategic direction, and the prospects ahead in AI research and application.
Notable Quotes:
- Koray Kavukcuoglu (02:07): "It's rare that in any research problem you would have a dimension that pretty confident would give you improvements of course, with maybe diminishing returns, but most of the time with research, it's always like that."
- Koray Kavukcuoglu (07:35): "From my point of view, we are investing in such a broad spectrum of research that I think that is what is necessary."
- Koray Kavukcuoglu (11:02): "Deep Think... is a mode that we are enabling our 2.5 Pro model so that it can spend a lot more time during inference, time to think, to build hypotheses."
- Koray Kavukcuoglu (16:54): "If we can improve 10% by its understanding in math, understanding of really highly complex reasoning problems, I think that is a huge improvement."
- Koray Kavukcuoglu (21:01): "With Veo 2... the model has understood the dynamics of the world."
- Koray Kavukcuoglu (24:54): "I think it's not an either or. I think there are different kinds of use cases and communities that actually benefit from different kinds of models."
- Koray Kavukcuoglu (28:02): "It enables a lot of people who are not necessarily... to build applications. It's a whole new world that is opening."
This summary covers the discussion between Alex Kantrowitz and Koray Kavukcuoglu, offering listeners and readers insight into the evolving landscape of artificial intelligence and Google DeepMind's role in shaping it.
