
Alex
What's going on in the heart of Google's AI research operation? We'll find out with Google DeepMind's Chief Technology Officer right after this.
Alex
Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation of the tech world and beyond. We have a great show for you today, a bonus show just as Google's I/O news hits the wire. We have so much to talk about, including what's going on with the company, what it's announced today, but also what is happening in the research effort underlying it all. And we have a great guest for you. Joining us today is Korai Kavacholu. He is the Chief Technology Officer of DeepMind. We're going to speak with Korai today and then tomorrow you'll hear from DeepMind CEO Demis Hassabis. Korai, great to see you. Welcome to the show.
Korai Kavacholu
Thank you very much folks.
Alex
By the way, if you're watching on video, Korai and I are in two separate conference rooms in Google's, I don't know, it's a pretty cool new building that they have. It's called, what, Gradient Wave or something?
Korai Kavacholu
We call it the Gradient Canopy.
Alex
Gradient Canopy. Anyway, we're here and I wanted to ask you a question that we've been asking on the show a lot, which is the scale question. Now Google has a tremendous amount of compute at its disposal, and so you basically have the option: is it scale that you want to throw at these models, or is it new techniques? So let me just ask it to you as plainly as I can. Is scale the star right now, or is it a supporting actor in terms of trying to get models to the next step?
Korai Kavacholu
It's a good question, I think, also in the way you framed it, because it is definitely an important factor. The way I'd like to think about this is it's rare that in any research problem you would have a dimension that you're pretty confident would give you improvements, of course with maybe diminishing returns, but most of the time with research, it's always like that. When we think about our research right now, in the case of generative AI models, scale is definitely one of those, but it's one of those things that are equally important with other things. When we are thinking about our architectures, like the architectural elements, the algorithms that we put in there that make up the model, they are as important as the scale. We of course analyze and understand, with scale, how these different architectures and different algorithms become more and more effective. That's an important part, because you know that you are putting in more computational capacity and you want to make sure that you research the kinds of architectures and algorithms that pay off the best under that kind of scaling property. Right? But as I said, that's not the only one. Data is really important. I think it is as critical as any other thing. The algorithms, architectures, and modules that we put into the system are important. Understanding their properties with data, with more compute, that is as important. And then of course, inference-time techniques are as important as well, because now that you have a particular architecture, a particular model, you can multiply its reasoning capabilities by making sure that you can use that model over and over again through different techniques at inference time.
Alex
To me, it's both hopeful and puzzling to hear about all the different techniques to make these models better. And I'll explain that. It's hopeful because it seems like we're definitely going to see a lot of improvement from where the models are today. And the models are already pretty good. The thing that's puzzling to me is the idea with scale was that there was effectively limitless potential in making these AI models bigger. And you said the words diminishing returns. And we've heard that from you and basically everybody working on this problem. And it's no secret, right, that right now we've been waiting forever for GPT-5. Meta had some problems with Llama. Anthropic has been trying to tell us there's a new Claude Opus model coming out forever. We haven't seen it. So clearly a lot of the research houses, maybe with the exception of Google, are struggling with what you get when you make the models bigger. I just want to ask you about that. It seems like it's nice that there are all these techniques, but again, thinking about this one technique that was supposed to have limitless potential, is that a disappointment for the generative AI field overall, if that's not going to be the case?
Korai Kavacholu
Yeah, I really don't think about it that way because we have been able to push the capabilities of the models quite effectively. Right? I think in a way the whole scale discussion starts from the scaling laws. Scaling laws explain the performance of the models under data, compute, and number of parameters. Researching all three in combination is the important thing. When I look at the kind of progress that we are getting from that general technology, I think it is still improving. What I think is important is to make sure that there is a broad spectrum of research that is going on across the board. And rather than thinking about scaling only in one dimension, there are actually many different ways to think about it and to invest in those. And we can see the returns: I think across the field, really not just here at Google but across the field, many different models are improving with quite significant steps. Right? So I think as a field the progress has been quite stellar. I think it's very exciting. And in Google we are very excited about the progress that we have been having with Gemini models going from 1.5 to 2 to 2.5. I think we had very steady progress, very steady improvement in the capabilities of models, both in the spectrum of the capabilities that we have, but also at the quality level for each capability as well. So I think what I'm excited about is we are pushing the frontier all the time and we see returns in many research directions and many different dimensions of research directions. And I'm excited that there's actually, I think, a lot more progress to make, and there's a lot more progress that needs to happen for reaching AGI as well.
Alex
We had Yann LeCun on the show a couple of weeks ago. You worked in Yann's lab. Yann emphatically stated there is no way the AI industry is going to reach human-level intelligence, which is his term for AGI, just by scaling up LLMs. Do you agree?
Korai Kavacholu
Well, I mean, I think that's a hypothesis, right? That might turn out to be true or not. But also I don't think that there is any research lab that is trying to only do scaling up the LLM. I don't know if anyone is actually trying to negate that hypothesis or not. We are not. From my point of view, we are investing in such a broad spectrum of research that I think that is what is necessary. Clearly, many of the researchers that I talk to, and me myself, think that there are a lot more critical elements that need to be invented. There are critical innovations on our path to AGI that we need to get through. That's why we are still looking at this as a very ambitious research problem. I think it is important to keep that critical thinking in mind. With any research problem, you always try to look at multiple different hypotheses, try to look at many different solutions. A research problem this ambitious is probably the most important problem that we are working on in our lifetimes. It is maybe the hardest problem we are working on as a research problem in our work. I think having that really ambitious research agenda and portfolio and making investments in many different directions is the important thing. From my point of view, what is important is defining where the goal is. Our goal is AGI. Our goal is not to build AGI in a particular way. What's important is to build AGI in the right way, in a way that is positively impactful, so that, building on it, we can bring a huge amount of benefits to the world. That's why we are trying to research AGI. That's why we are trying to build AGI. AGI in itself, sometimes it might come across as a goal in itself. The goal in itself is the fact that if we do that, then we can hugely benefit all of society, all of the world. That's the goal. So with that responsibility, it's not very important to me if that particular hypothesis is important or not. 
What is important is that we reach that by pursuing a very ambitious research agenda and building a very strong understanding of the field of intelligence.
Alex
Okay, so let's get to a little bit of that research agenda. One of the announcements that you're making at I/O, which is this week, and which will just have been made when this airs, is that there's a new product called DeepThink that you're releasing, which relies on reasoning or, as you put it, test-time compute. I think I have that right in terms of what the product is going to look like. How effective has including reasoning in these models been in advancing them? I mean, when you think about all the different techniques that you've discussed so far today, scaling included, what sort of magnitude of improvement are you seeing by using reasoning? And talk a little bit about DeepThink.
Korai Kavacholu
Okay. I mean, first of all, DeepThink is not like a separate product. It is a mode that we are enabling in our 2.5 Pro model so that it can spend a lot more time during inference time to think, to build hypotheses. And the important thing is to build parallel hypotheses: rather than a single chain of thought, it can build parallel ones, and then it can reason over multiple of those, build an understanding over those, and then continue building those parallel chains of thought.
Alex
But this one thinks a little bit longer than your traditional reasoning model.
Korai Kavacholu
It will. I mean, in the current setup, yes, it takes longer, because understanding those parallel thoughts and building those parallel thoughts is a much longer process. But one thing that we are also positioning it as is, right now, it's research. We are sharing some initial research results. We are excited about it. We are excited about the technique and what it can actually enable in terms of new capabilities and new performance levels. But it's early days, and that's why we are only sharing it in a limited way right now. We are going to start sharing it with safety researchers and some trusted testers, because we also want to understand the kinds of problems that people want to solve with it, the kinds of new capabilities it brings, and how we should train it the way that we want to train it. So it is early days on that, but it is what I think is an exciting research direction that we found in the inference-time thinking model space.
Alex
Yeah. So can you talk about what precisely it does differently from traditional reasoning models?
Korai Kavacholu
The current reasoning or thinking models, most of the time, at least as I can talk about them from our research point of view, build a single chain of thought. Then as you build a single chain of thought, and as the model continues to attend to its chain of thought, it builds a better understanding of what response it wants to give you. It can alternate between different hypotheses and reflect on what it has done before. Now, of course, if you think about it also in a visual kind of space, one kind of scalability that you can bring to the table is: can you have multiple parallel chains of thought, so that you can actually analyze different hypotheses in parallel? Then you will have more capacity for exploring different kinds of hypotheses. And then you can compare those, and you can eliminate some of them, or you can continue pursuing and expand on particular ones. It's a very intuitive process in a way, but of course it is more involved.
Alex
I just want to cap this segment by asking you about the pace of improvement of models. I'm just going to use the OpenAI schema to give an example. This is something that everybody who comes on this show says: the progress of going from GPT-3 to GPT-4 was undeniable. GPT-4 to 4.5, less of a leap. So I want to ask you, just in terms of the velocity of improvement, if that's the right way to put it, are we coming back down to earth a little bit right now?
Korai Kavacholu
Again, when I look at our model family going from Gemini 1 to 1.5 to 2 to now to 2.5, I'm very excited about the pace that we have when I look at the capabilities that we keep adding. We have always designed Gemini models to be multimodal. From the beginning. That was our ambition because we want to build AGI. We want to make sure that we have models that can fulfill the capabilities that we expect from a general intelligence. Multimodality was key from the beginning. As the versions have been progressing, we have been adding that natural multimodality more and more and more. And when I look at the pace of improvement in our reasoning capabilities, like lately we have added the thinking capabilities. And I think with 2.5 Pro, we wanted to make a big leap in our reasoning capabilities, our coding capabilities. And I think one of the critical things is we are bringing all this together in one single model family, and that is actually one of the catalyzers of improvement. And improvement at pace as well. It's harder. But we find that creating a single model that can understand the world and then you can ask questions about, oh, can you code me this sort of simulation of a tree growing? And then it can do it, right. That requires understanding of all of the things, not just how to code. Because, again, we are trying to bring these models to be useful, to be usable by a very broad audience. And I think our pace has been really reflective of the research investments that we have been doing across the board.
Alex
So no velocity slowdown is what I'm hearing from you.
Korai Kavacholu
Let me just put it in the way that I'm very excited about everything that we have been doing as Gemini progresses and research is getting more and more exciting. Of course, like, for us folks who are doing research, it is really good.
Alex
Okay, so I want to ask you. You're on the model side. I want to ask you. Basically, sometimes we debate on the show what the value is of improving models. So let me just put a thought experiment to you. What do you think the value of improving these models by 10% would get us?
Korai Kavacholu
The question there is, how do we define 10%? Right? That is where the value is defined already. One of the important things about doing research and improving the models is quantifying progress. We use many different ways to quantify progress. And not every one of them is linear. And not every one of them is linear with the same slope. When we say improving by 10%, if we can improve 10% by its understanding in math, understanding of really highly complex reasoning problems, I think that is a huge improvement, because that would indicate that the general knowledge and the capabilities of the models have expanded a lot. You would expect that that would make the model a lot more applicable to a broader range of problems.
Alex
And what about if you improved the model by like 50%? What would that get you? Is your product team like saying there are things that we can build if this model was just like 50% better?
Korai Kavacholu
Again, I think we work with product teams a lot, right? Taking a step back, that's actually quite an important thing for me. Thinking about AGI as a goal, I think that also goes through working with the product teams. Because it is important that when we are building AGI, it's a research problem. We are doing research. But the most critical thing is that we actually understand from the users what kinds of problems to solve and in what kinds of domains to evolve these models, so that user feedback and that knowledge from the interaction with the users is actually quite critical. So when our product teams tell us, okay, here is an area that we want to improve on, then that is actually quite important feedback for us that we can then turn into metrics and pursue. As you ask, as we increase the capabilities of the model, I think what is important is improvement across a broad range of metrics, which I think we have been seeing in Gemini. As I said, from 1.5 to 2.5, you can see the capability increases across the model. A lot more people can actually use the models in their daily life to help them either learn something new or solve an issue that they see. That's the goal. At the end of the day, again, the reason we build this technology is to build something that is helpful. And the products are a critical aspect of how we measure and how we understand what is helpful and what is not. And as we increase more in that, I think that's our main ambition. That's great.
Alex
Let's take a concrete example that, again, Google is releasing today, talking about today, which is Veo 3. This is your video generation model. I think we've really seen an unbelievable acceleration in terms of what these models can do from the first generation to the second generation to the third. And for listeners and viewers, what Google is doing now is not only are you able to generate scenes, you're able to generate them with sound. And having watched one of these videos or a couple of them, I can tell you the sound matches. And then there's this other crazy product that Google's putting out. I think it's called Flow, where you could just extend the scene that you've generated and storyboard out your own, basically, short film. So I'd love to hear your perspective on how this happened. I kind of asked you, what do we get at 10%, 50%? But is this kind of that perfect example of the model getting better, producing something that goes from, you know, that's a fun little video to, oh, I can really use this now?
Korai Kavacholu
Yes. I think the main progress going from Veo 1 to Veo 2 was a lot more about understanding the physics and the dynamics of the world. With Veo 2, I think for the first time, we could comfortably say that for many, many cases, the model has understood the dynamics of the world well. That's very important to be able to have a model that can generate scenes, and complex scenes, where there is a dynamic environment and there are also interactions of objects happening. I remember one of the things that was quite viral was cutting the tomato, where the video generated by Veo 2 was so precise that it looked so realistic, as if a person was slicing tomatoes. And the dynamics there, not just of any single object, like how the hand moves, but also the interaction between different objects, the blade, the tomato, how the slice falls down and everything. It was very precise. So that interactive element was important. Understanding the dynamics is about not just understanding the dynamics of a particular single object, but also multiple objects interacting with each other, which is much, much more complex. I think there we had a big jump. With Veo 3, I think we are doing another jump in that aspect. But I see the sound as orthogonal, a new capability that is coming in. Of course, in our real world, we have multiple senses, and vision and sound go hand in hand. They are perfectly correlated. We perceive them all at the same time, and they complement each other. So to be able to have a model that understands that interactivity, that complementarity, and being able to generate scenes and videos that have both at the same time, I think that speaks to the new capability level of the model and the quality. I think this is the first step. There are very impressive examples. 
There are examples that fall a little short of what you would call really natural, but I think this is an exciting step in terms of expanding that capability. And as you said, I'm excited to see how this kind of technology can be useful. Right? You just said that it is becoming useful. I think that is great to hear. Now this is a technology that can be built on, and I think Flow is an experiment in that direction, to give it to users so that people can experiment and build something with it.
Alex
Yeah, you prompt a scene and then it creates a scene. Then you prompt the next scene and you can continue to have a story flow, which is a good name for it. All right, this next question comes to me from a pretty smart AI researcher. They basically talked about how there's a tension between open source and proprietary. And of course we have companies like Google. Obviously, "Attention Is All You Need," the transformer, came from Google. Now Google's building proprietary models. We saw DeepSeek push the state of the art forward, you could argue. So this person wanted to know, and I think it's a really good question, is there a coordination possible between open source and proprietary? Maybe we see OpenAI doing their new open source model or teasing it. Or should each side try to get its own part of the market? What do you think?
Korai Kavacholu
I think I want to say a couple of things. First and foremost, again, take a step back. There's a lot of research that went into building this technology. Of course, in the last two or three years, I think it became so accessible and so general that people are using it in their daily lives. But there's a long history of research that built up to this point. As a research lab, Google, and before that, of course, there were DeepMind and Google Brain, two separate labs that were working in tandem on different aspects. Many of the technologies that we see today have been built as research prototypes, as research ideas, and have been published in papers. As you said, Transformers, the most critical technology that is underlying things. Then models like AlphaGo, AlphaFold, all of these kinds of things, all these research ideas have been evolving into building the knowledge space that we have right now. For all that research, I think publications and open sourcing have been a critical element, because we were really in the exploratory space at those times. Nowadays, I think the other thing that we always need to remember is that at Google we have our Gemma models, which are the open weights models. Just like the Llama open weights models, we have the Gemma open weights models. The reason to do those, for us, is that there's a different community of developers and users who want to interact with those models, who actually need to be able to download those weights into their own environment and use that and build with that. I feel like it's not an either or. I think there are different kinds of use cases and communities that actually benefit from different kinds of models. But what is most important is, at the end of the day, in the path towards AGI, of course, it's important that we are being conscious about what we enable with the technologies that we develop. 
When we develop our frontier technologies, we choose to develop them under the Gemini umbrella, which are not open weights models, because we want to also make sure that we can be responsible in the way that they are used as well.
Alex
Right, right.
Korai Kavacholu
But at the end of the day, what really matters is the research that goes into building the technology, and doing that research and pushing the frontier of the technology and building it the right way with the positive impact. And I think it can happen both in the open weights ecosystem and in the closed system. But when I think about the whole umbrella of things that we are trying to do, we have quite ambitious goals: building AGI and doing it the right way with the positive impact. That's how we develop our Gemini models.
Alex
Okay, I have like 30 seconds left with you. You're Chief Technology Officer. Are you a fan of vibe coding?
Korai Kavacholu
Yes, exactly. I find it really exciting, right? Because what it does is all of a sudden it enables a lot of people who do not necessarily have that coding background to build applications. It's a whole new world that is opening. Right? You can actually say, oh, I want an application like this. And then you see it. You can imagine what kinds of things could be possible in the space of learning. Right? You want to learn about something. You can have a textual representation, but you can ask the model to build you an application that explains certain concepts to you. And it would do it. This is the beginning. Some things it does well, some things it doesn't do well. But I find it really exciting. These are the kinds of things that the technology brings. All of a sudden, the whole space of building applications, the whole space of building dynamic interactive applications, becomes accessible to a much broader community and set of people.
Alex
Korai, great to see you. Thank you so much for coming on the show.
Korai Kavacholu
Yeah, thank you very much. Thanks for inviting me, Alex.
Alex
Definitely. We'll have to do it again in person sometime. All right, everybody, thank you for listening. We'll have Demis Hassabis, the CEO of Google DeepMind, on tomorrow, and so we invite you to join us then. We'll see you next time on Big Technology Podcast.
Big Technology Podcast Summary
Episode: Google DeepMind CTO: Advancing AI Frontier, New Reasoning Methods, Video Generation’s Potential
Release Date: May 20, 2025
Host: Alex Kantrowitz
Guest: Korai Kavacholu, Chief Technology Officer of Google DeepMind
In this insightful episode of the Big Technology Podcast, host Alex Kantrowitz engages in a deep conversation with Korai Kavacholu, the Chief Technology Officer of Google DeepMind. The discussion centers around the latest advancements in artificial intelligence, including the balance between scaling models and innovating new techniques, the pursuit of Artificial General Intelligence (AGI), and groundbreaking developments in video generation technologies.
A primary topic of discussion is whether scaling AI models or developing new techniques plays a more pivotal role in advancing AI capabilities.
Korai Kavacholu emphasizes a balanced approach:
"It's rare that in any research problem you would have a dimension that you're pretty confident would give you improvements, of course with maybe diminishing returns, but most of the time with research, it's always like that."
(02:07)
He contends that while scaling is important, architectural innovations, algorithms, data quality, and inference-time techniques are equally critical in pushing the boundaries of AI models.
Addressing skepticism around achieving AGI solely through scaling, inspired by remarks from AI luminary Yann LeCun, Korai provides his perspective:
"From my point of view, we are investing in such a broad spectrum of research that I think that is what is necessary."
(07:35)
He underscores the necessity of exploring multiple research avenues and innovations beyond mere scaling to attain AGI, highlighting the ambitious and multifaceted research agenda at DeepMind.
One of the episode's highlights is the introduction of DeepThink, a new mode in DeepMind's Gemini 2.5 Pro model designed to enhance reasoning during inference.
Korai Kavacholu explains:
"DeepThink... is a mode that we are enabling in our 2.5 Pro model so that it can spend a lot more time during inference time to think, to build hypotheses."
(11:02)
DeepThink allows the model to build and reason over multiple parallel hypotheses, thus advancing its reasoning capabilities beyond traditional single-chain-of-thought models.
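The compare, eliminate, and expand loop Korai describes can be illustrated with a toy search. This is only a minimal sketch of the general idea of keeping several hypotheses alive at once; it is not how DeepThink is implemented, which the episode does not detail. The function names, the bit-string "hypotheses", and the scoring rule are all hypothetical stand-ins for a model's chains of thought and whatever signal is used to compare them.

```python
import heapq
from typing import Callable, List


def parallel_hypothesis_search(
    seeds: List[str],
    expand: Callable[[str], List[str]],
    score: Callable[[str], float],
    keep: int = 2,
    rounds: int = 3,
) -> str:
    """Extend several hypotheses per round, compare them, and keep the best few."""
    frontier = list(seeds)
    for _ in range(rounds):
        # Grow every surviving hypothesis (conceptually, these run in parallel).
        candidates = [child for h in frontier for child in expand(h)]
        if not candidates:
            break
        # Compare all candidates and eliminate everything except the top `keep`.
        frontier = heapq.nlargest(keep, candidates, key=score)
    return max(frontier, key=score)


# Hypothetical stand-ins for a model: a "hypothesis" is a bit string,
# each step appends one bit, and the score counts the 1s.
def toy_expand(h: str) -> List[str]:
    return [h + "0", h + "1"]


def toy_score(h: str) -> float:
    return float(h.count("1"))


best = parallel_hypothesis_search(["0", "1"], toy_expand, toy_score)
print(best)
```

A single chain of thought corresponds to `keep=1`; raising `keep` spends more inference-time compute to explore more alternatives before committing, which matches the longer thinking time mentioned in the interview.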
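When questioned about the pace of AI model advancements, especially in light of perceived plateaus at other AI labs, and after Alex sums up his answer as "no velocity slowdown," Korai responds:
"Let me just put it in the way that I'm very excited about everything that we have been doing as Gemini progresses and research is getting more and more exciting."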
(16:24)
He highlights the consistent progress of the Gemini model series, noting significant enhancements in multimodality and reasoning, which contribute to robust and steady improvements in AI capabilities.
The discussion delves into the tangible benefits of incremental improvements in AI models. Korai articulates that the value of enhancements depends on the metrics used:
"If we can improve 10% by its understanding in math, understanding of really highly complex reasoning problems, I think that is a huge improvement."
(16:54)
He emphasizes that even modest improvements can significantly expand a model's applicability and effectiveness across various domains, aligning with user needs and real-world applications.
DeepMind's strides in video generation technology are explored, focusing on the evolution from Veo 1 to Veo 3. Korai elaborates on the enhancements:
"With Veo 2, I think for the first time, we could comfortably say that for many, many cases, the model has understood the dynamics of the world."
(21:01)
Veo 3 introduces synchronized sound generation, creating more immersive and realistic videos. The accompanying product, Flow, allows users to storyboard and extend generated scenes, marking a significant leap in making AI-generated multimedia content more functional and user-friendly.
A pertinent question from an AI researcher addresses the tension between open-source and proprietary AI models. Korai responds by highlighting DeepMind's commitment to both:
"I feel like it's not an either or. I think there are different kinds of use cases and communities that actually benefit from different kinds of models."
(24:54)
He explains that while Google releases the open-weight Gemma models, which he compares to the Llama open-weight models, it develops its frontier models under the Gemini umbrella, which are not open-weight, to ensure responsible and impactful use of advanced AI technologies.
In a brief yet enthusiastic exchange, Korai touches upon vibe coding, a practice that lowers the barrier to application development:
"All of a sudden it enables a lot of people who are not necessarily. Who do not necessarily have that coding background to build applications. It's a whole new world that is opening."
(28:02)
He expresses excitement about how such technologies empower a broader audience to create dynamic and interactive applications, fostering innovation and accessibility in tech development.
The episode concludes with Korai Kavacholu reiterating DeepMind's dedication to advancing AI responsibly and effectively. As the podcast teases an upcoming interview with DeepMind CEO Demis Hassabis, listeners are left with a comprehensive understanding of the current AI landscape, DeepMind's strategic direction, and the exciting future prospects in AI research and application.
Notable Quotes:
Korai Kavacholu (02:07): "It's rare that in any research problem you would have a dimension that you're pretty confident would give you improvements, of course with maybe diminishing returns, but most of the time with research, it's always like that."
Korai Kavacholu (07:35): "From my point of view, we are investing in such a broad spectrum of research that I think that is what is necessary."
Korai Kavacholu (11:02): "DeepThink... is a mode that we are enabling in our 2.5 Pro model so that it can spend a lot more time during inference time to think, to build hypotheses."
Korai Kavacholu (16:54): "If we can improve 10% by its understanding in math, understanding of really highly complex reasoning problems, I think that is a huge improvement."
Korai Kavacholu (21:01): "With Veo 2... the model has understood the dynamics of the world."
Korai Kavacholu (24:54): "I feel like it's not an either or. I think there are different kinds of use cases and communities that actually benefit from different kinds of models."
Korai Kavacholu (28:02): "It enables a lot of people who are not necessarily... to build applications. It's a whole new world that is opening."
This comprehensive summary encapsulates the multifaceted discussion between Alex Kantrowitz and Korai Kavacholu, providing listeners and readers alike with valuable insights into the evolving landscape of artificial intelligence and DeepMind's pivotal role in shaping its future.