Fei-Fei Li: World Models and the Multiverse - AI + a16z


Podcast Summary: Fei-Fei Li – World Models and the Multiverse

Podcast: AI + a16z
Episode: Fei-Fei Li: World Models and the Multiverse
Date: December 23, 2025
Host: a16z (Eric, main host; joined by Martin Casado, General Partner at a16z)
Featured Guest: Fei-Fei Li, Co-founder & CEO of World Labs

Episode Overview

This episode delves into why spatial intelligence and world modeling are critical next steps for artificial intelligence, moving AI beyond the current dominance of language-based models. Fei-Fei Li, a pioneer in the intersection of data and AI, and Martin Casado, a16z general partner, discuss the limitations of language models, the promise of AI that understands and acts in 3D space, and the implications for robotics, creativity, and the concept of a digital “multiverse.”

Key Discussion Points & Insights

1. The Case for World Models: Beyond Language in AI

AI’s Current Focus is Language: Most current AI innovation centers around Large Language Models (LLMs), but this neglects a more fundamental component of intelligence: spatial understanding.
Fei-Fei Li’s Perspective:
- "That space, the 3D space, the space out there, the space in your mind's eye, the spatial intelligence that enable people to do so many things, that's beyond language is a critical part of intelligence." (00:00, 11:33)
Origins of World Labs:
- The idea came from repeated observations—both by Fei-Fei and Martin Casado—that AI needed a paradigm shift toward “world models” that capture and reason about the 3D physical world. (05:09)

2. Intellectual Partnership & Founding World Labs

Why Martin Casado as First Investor:
- Fei-Fei wanted not just a financial backer but an “intellectual partner.” She sought a computer scientist who understood both technology and market dynamics, someone who could “be on the phone or in person with me every moment of the day as an intellectual partner.” (03:36)
Early Conversations Around World Models:
- They realized that most investors and technologists didn’t fully grasp the concept of a world model, often offering only "polite nods." Martin stood out as someone who truly understood the idea: "The way he defined it about an AI model that truly understand the 3D structure, shape and the compositionality of the world was exactly what I was talking about." (05:39)

3. Why LLMs Are Not Enough

Human Intelligence Is Deeply Spatial:
- “Language is a lossy way to capture the world.” (07:07)
- The physical, perceptual, visual world exists independent of language; evolutionary intelligence is built on spatial and embodied experience.
Limitations of Language Models:
- Language is great for abstract thought but insufficient for encoding spatial or physical reality—crucial for robotics, creativity, and interacting with the environment.
- "If I put you in a room and blindfolded you and I just described the room and then I asked you to do a task, the chances of you being able to do it are very little." (08:59, Martin Casado)

4. Spatial Intelligence: Evolutionary Perspective

Language vs. Spatial Reasoning:
- Martin: "The part of our brain that actually deals with language is actually pretty recent ... but the part of the brain that actually does the navigation, you know, the spatial, has been around ... 500 million years." (10:24–11:07)
- Fei-Fei: "That double helix in 3D space. There's no way you can use language alone to reason that out." (11:33)

5. Applications and the Digital Multiverse

Creativity and Design:
- "Creativity is very visual … From design to movie to architecture to industry design … that alone is a highly visual perceptual spatial area.” (12:58)
Robotics and Embodied AI:
- All robots, humanoid or otherwise, must understand and navigate 3D space, which requires new AI capabilities.
The Multiverse Concept:
- Advanced AI world models will allow us to “create infinite universes”—spaces for robots, creativity, travel, storytelling, and socialization—in both physical and digital worlds. (13:23; repeated from the opening quote)

6. Horizontal Technology: Foundational Impact

Horizontal Scope:
- Martin analogizes world models to LLMs: “The same LLM we use for an emotional conversation, we use it to write code… So with these [world] models, you can take a view of the world … and then you could actually create a 3D full representation…” (14:44)
Generativity:
- World models enable full 3D reconstructions, manipulations, and creations in both the digital and physical realms. They reach into gaming, art, robotics, architecture, and more. (15:30)

7. Why 3D is Fundamental — Not Just 2D

Limitations of 2D:
- "Physics happens in 3D, and interaction happens in 3D. Navigating behind the back of the table needs to happen in 3D, composing the world...needs to happen in 3D. So fundamentally, the problem is a 3D problem." (17:21, Fei-Fei Li)
- Martin: “If that’s 2D [for a robot], and then you ask the robot ... distance or to grab something, that information’s missing ... you need to provide that information ... so that you can actually navigate in 3D space. And so 2D video is great if it’s a human, because we already can turn it into 3D. But ... any computer program ... need[s] to be 3D.” (18:02)
A Personal Lens:
- Fei-Fei Li’s own loss of stereo vision after a cornea injury led to firsthand insight:
  - "I was just driving in my own neighborhood, and I realized I don't have a good distance measure between my car and the parked car on a local small road ... That was exactly why we needed stereo vision." (19:12)
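
The depth argument in this section can be made concrete with a small sketch. It uses the textbook pinhole-camera model, not anything described in the episode: the focal length (`focal_px`), the camera spacing (`baseline_m`), and the function names are all assumptions chosen for illustration. The sketch shows that a single 2D view maps a near point and a far point to the same pixel, while a second, offset view recovers the missing distance.

```python
# Illustrative sketch only: a textbook pinhole camera, not World Labs' method.
# focal_px and baseline_m are assumed values picked to keep the arithmetic exact.

def project(point, camera_x=0.0, focal_px=800.0):
    """Project a 3D point (x, y, z), in metres, to 2D pixel coordinates.

    The camera sits at (camera_x, 0, 0) and looks down the +z axis.
    Dividing by z is the step that throws the depth away.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_px * (x - camera_x) / z, focal_px * y / z)


def depth_from_disparity(u_left, u_right, focal_px=800.0, baseline_m=0.0625):
    """Recover depth from the horizontal pixel shift between two views.

    Standard stereo relation: depth = focal_px * baseline_m / disparity.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_m / disparity


near = (0.5, 0.0, 2.0)  # 2 m away
far = (1.0, 0.0, 4.0)   # 4 m away, on the same line of sight

# One eye: both points land on the same pixel, so distance is ambiguous.
same_pixel = project(near) == project(far)

# Two eyes: the second camera is shifted by the baseline, and the shift
# in pixel position differs with distance.
baseline = 0.0625
near_depth = depth_from_disparity(project(near)[0], project(near, camera_x=baseline)[0])
far_depth = depth_from_disparity(project(far)[0], project(far, camera_x=baseline)[0])

print(same_pixel, near_depth, far_depth)
```

With one view the two points are indistinguishable; with two views the nearer point shifts more between the images, which is the cue that stereo vision supplies and that a flat 2D output leaves out.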

8. Technological State-of-the-Art in World Models

Pioneering Techniques and Team:
- The field builds on innovations like Neural Radiance Fields (NeRF), Gaussian splat representations, GANs, and style transfer.
- “At World Labs we just have the conviction that we're going to be all in on this one singular big North Star problem, concentrating on the world's smartest people … All of them coming to this one team and try to make this work and to productize this.” (20:06)
Multidisciplinary Team Necessary:
- Martin: "You need experts both in AI ... and graphics, which is like, how do you actually represent these things in memory ... It takes a very special team to actually crack this problem, which Fei-Fei has managed to put together." (22:03)

Notable Quotes & Memorable Moments

On the need for spatial intelligence in AI:
- Fei-Fei Li, 00:00 & 11:33: “That space, the 3D space, the space out there, the space in your mind's eye, the spatial intelligence that enable people to do so many things, that's beyond language is a critical part of intelligence.”
On the power of world models:
- Fei-Fei Li, 12:58 & 13:23: “We can actually create infinite universes. Some are for robots, some are for creativity, some are for socialization, some are for travel, some are for stories. It suddenly will enable us to live in a multiverse way. The imagination is boundless.”
On the evolutionary basis for spatial intelligence:
- Martin Casado, 10:24–11:07: “The part of our brain that actually deals with language is actually pretty recent ... the part that does navigation ... has been around ... 500 million years.”
On the real-world importance of depth perception:
- Fei-Fei Li, 19:12: "I realized I don't have a good distance measure ... That was exactly why we needed stereo vision."
On the need for a multidisciplinary approach:
- Martin Casado, 22:03: "To solve this problem you need experts both in AI ... and graphics ... It takes a very special team to actually crack this problem, which Fei-Fei has managed to put together."

Timestamps for Key Segments

00:00–00:40: Opening theme – spatial intelligence and world models as the next frontier in AI.
02:06–04:06: Fei-Fei’s background and what she needed in an "intellectual partner" to launch World Labs.
05:09–06:10: The challenge of explaining the significance of world models to others; Martin's unique understanding.
07:07–08:59: Limitations of language for encoding and acting upon the physical world.
10:23–11:33: The evolutionary foundation and depth of spatial intelligence.
12:58–14:44: Real-world and horizontal applications for world models; the multiverse vision.
17:21–18:37: Why AI must operate in 3D; limitations of 2D approaches.
19:12–20:01: Personal anecdote on stereo vision and embodied intelligence.
20:06–22:03: State of the art in computer vision and the technical team at World Labs.

Conclusion

Fei-Fei Li and Martin Casado argue that building AI systems capable of perceiving, reasoning, and acting within 3D worlds—"world models"—is essential for general intelligence. This approach enables new horizons for robotics, creativity, and virtual environments, pushing AI beyond the boundaries of language into the boundless possibility of a digital multiverse. Their conversation underscores a pivotal shift in industry focus, powered by deep technical expertise and a vision that spatial intelligence is the core substrate of truly intelligent machines.


Transcript

[00:01]
Fei-Fei Li
That space, the 3D space, the space out there, the space in your mind's eye, the spatial intelligence that enable people to do so many things, that's beyond language is a critical part of intelligence.
[00:15]
Martin Casado
Fei-Fei leans over to me, she's like, you know what we're missing? And I said, what are we missing? She said, we're missing a world model.
[00:20]
Fei-Fei Li
And I'm like, yes, we can actually create infinite universes. Some are for robots, some are for creativity, some are for socialization, some are for travel, some are for storytelling. It suddenly will enable us to live in a multiverse way. The imagination is boundless.
[00:40]
Podcast Host
When we talk about AI today, the conversation is dominated by language, LLMs, tokens, prompts. But what if we're missing something more fundamental? Not words, but space, the physical world we move through and shape? My guests today think we are. Fei-Fei Li, a pioneer in modern AI, helped usher in the deep learning era by putting data at the center of machine learning. Now she's co-founder and CEO of World Labs, building world models, AI systems that perceive and act in 3D space. She's joined by a16z general partner Martin Casado, computer scientist, repeat founder and one of the first people Fei-Fei called when forming the company. Today, they explain why spatial intelligence is core to general intelligence and why it's time to go beyond language. Let's get into it. As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. Please note that a16z and its affiliates may also maintain investments in the companies discussed in this podcast. For more details, including a link to our investments, please see a16z.com/disclosures.
[01:59]
Eric
Fei-Fei, thank you so much for joining us here today. Martin, why don't you briefly brag on behalf of Fei-Fei a little bit? And how would you summarize her contributions to AI for people unfamiliar?
[02:06]
Martin Casado
Yeah, someone that doesn't need a lot of introduction and she's done so many things that I can't fill in. So maybe I'll just do the ones that are appropriate to this. Of course, she was on the Twitter board. She was a Google exec, founder and CEO of World Labs. But very, very importantly, like we all know AI and we all talk about kind of neural networks, and there's a number of people that focused on making those effective. But Fei-Fei really singularly brought in data to the equation, which now we're recognizing is actually probably the bigger problem, the more interesting. So she truly is the godmother of AI, as everybody calls her.
[02:34]
Eric
And Fei-Fei, why did you have to have Martin as the first investor?
[02:37]
Fei-Fei Li
Well, first of all, I knew Martin for more than a decade, long time. You know, I joined Stanford in 2009 as a young assistant professor and Martin was finishing his PhD there. So I always know. And of course, Martin's advisor, Nick McKeown, was a good friend, and I always know Martin went on to become a very successful entrepreneur and very successful investor. So we see each other, we talk about things. But as I was formulating the idea of World Labs, I was looking for what I would call my unicorn investor. I don't know if that's a word, but that's how I think about this. Who is not only obviously a very established and successful investor, who can be with entrepreneurs on this journey through the ups and downs, who can be very insightful, who can bring the kind of knowledge, advice, resources. But I was also particularly looking for an intellectual partner, because what we are doing at World Labs is very deep tech. We are trying to do something no one else has done. We know with a lot of conviction it will change the world, literally. But I need someone who is a computer scientist, who is a student of AI, understands product, market, go-to-market, and just can be on the phone or in person with me every moment of the day as an intellectual partner. And here we are. We talk almost every single day.
[04:07]
Martin Casado
It is true.
[04:07]
Fei-Fei Li
Yeah.
[04:08]
Eric
Amazing.
[04:09]
Martin Casado
Actually, the origin story of us first connecting is actually pretty interesting. So Fei-Fei has clearly been thinking about this idea for a very long time, like, well before starting, so maybe years even. And she has this very deep intuition of what AI needs in order to basically navigate the world. Right? But we were at one of Mark's fancy lunches and there's a bunch of AI people, and everybody was so excited about LLMs, right? And it was talking about language. And I'd come to this independent conclusion just because I've actually done a lot of image investing, that, like, that wasn't the end of the story. And so the end of this table, all these people talking about it. Fei-Fei leans over to me and she's like, you know what we're missing? I said, what are we missing? She said, we're missing a world model. And I'm like, yes. And it fell into place then because I'd been like, thinking about stuff at a high level. But as she does, she just kind of perfectly articulated this. So she had a year's worth of thinking about this and talked to people, et cetera. And so in some way we kind of in our own crooked paths had arrived at a very similar intuition. Hers was like way more filled out, mine was just this kind of fancy thing. But then after that, we actually had a number of conversations where we both agreed that we were aligned on this kind of idea.
[05:09]
Fei-Fei Li
Actually, I don't know if you know this, so of course during that lunch we hit it off on this world model idea. But I was at that point already talking to various people, not just computer scientists, technologists, but also investors, potentially business partners. And to be honest, most people didn't get it. You know, when I say world model, they nod, but I can just tell that was just a polite nod. So I called Martin. I'm like, do you mind coming over to Stanford campus and have coffee with me?
[05:39]
Martin Casado
Coupa Cafe?
[05:39]
Fei-Fei Li
Yeah, go to cafe. And then as soon as Martin came and sat down, I said, Martin, can you define your world model to me? I really wanted to hear if Martin actually meant it. And the way he defined it about an AI model that truly understand the 3D structure, shape and the compositionality of the world was exactly what I was talking about. And I was like, wow, he's the only person so far I've talked to who actually meant it. It's not just nodding.
[06:11]
Eric
Wow. Okay, so we're going to get to World Labs and the specifics of this. But first I want to take you back both to your PhD days, your professor days, and reflect on if you could go back in time and sort of have knowledge of what's happened the preceding 10 years in AI? What do you think would have been the biggest surprises? Or what's the thing that you didn't see coming that would have shocked your younger self? Or did you have a good sense of how this field would play out?
[06:33]
Fei-Fei Li
Yeah. It's ironic to say because as Martin said, I was the person who brought data into the AI world. But I still continue to be so surprised. Not surprised intellectually, but surprised emotionally that the data hungry models, the data driven AI, can come this far and genuinely have incredible emergent behaviors of thinking machine. Right, yeah.
[07:02]
Eric
Let's get into the specifics. Why start another foundation model company? Why aren't LLMs enough?
[07:07]
Fei-Fei Li
My intellectual journey is not about company or papers. It's about finding the North Star problem. So it's not like I woke up and said I have to do a company. I woke up every day, day after day for the past few years thinking that there is so much more than language. The language is an incredibly powerful encoding of thoughts and information, but it's actually not a powerful encoding of the 3D physical world that all animals and living things live in. And if you look at human intelligence, so much is beyond the realm of language. Language is a lossy way to capture the world. And also one subtlety of language is that it is purely generative. Language doesn't exist in nature. We look around, there's not a syllable or word, whereas the entire physical, perceptual, visual world is there. And animals' entire evolutionary history is built upon so much perceptual and eventually embodied intelligence. Humans, not only we survive, live, work, but we build civilization beyond language upon constructing the world and changing the world. So that's the problem I want to tackle. And in order to tackle that problem, obviously research was important and I spent years doing that as an academic and it's still fun. But I do realize, and especially talking to Martin, that the time has come that concentrated industry-grade effort, focused effort in terms of compute, data, talent is really the answer to bringing this to life. And that's why I wanted to start World Labs.
[09:00]
Martin Casado
Amazing, Eric. You can do a very simple thought experiment that kind of highlights the difference between language and space. So if I put you in a room and I blindfolded you and I just described the room and then I asked you to do a task, the chances of you being able to do it are very little. I'm like, oh, 10 foot in front of you is like a cup. It's this very inaccurate way to convey reality because reality is so complex and it's so exact, right. On the other hand, if I took off the blindfold and you could see the actual space, right? And what your brain is doing is actually reconstructing the 3D, right. Then you can actually go and manipulate things and touch things. Right. And so one way to think about it is we do a lot of language processing and we use that to communicate high level ideas, et cetera. But when it comes to navigating the actual world, we really, really rely on the world itself and our ability to reconstruct that.
[09:49]
Eric
And how and when did you realize that language wasn't enough? Because it seems like it's not super widely known. I don't hear about this all the time.
[09:56]
Martin Casado
Well, so if you ask me, like what is this surprising breakthrough? It's that language went first because we've worked so hard on robotics, right. I mean I feel like even look at autonomous vehicles as an industry, we've invested like $100 billion in it. I remember when Sebastian Thrun, like actually won like the DARPA Grand Challenge 2006 and we're like, Hooray. AV is done, right. And then 20 years later, like we're finally there. $100 billion in etc. This is like a 2D problem.
[10:24]
Fei-Fei Li
Yeah.
[10:25]
Martin Casado
So that was the path we were going on is do you actually solve world navigation? And it's hard. Then out of Nowhere comes these LLMs and they are unit economic positive. They solve all of these language problems basically immediately. And so it just took me a moment actually. Fei fei said it beautifully early on when we were talking, which is the part of our brain that actually deals with language is actually pretty recent. And so we're actually pretty inefficient at it. Right. And so the fact that a computer does it better is not super surprising, but the part of the brain that actually does the navigation, you know, the spatial, has been around. It's a million brains, maybe the reptilian brain, about 4 million.
[11:01]
Fei-Fei Li
It's even more than that. It's a trilobite break.
[11:04]
Martin Casado
Yeah, yeah, right.
[11:05]
Fei-Fei Li
Trilobite head break.
[11:06]
Martin Casado
Right.
[11:06]
Fei-Fei Li
500 million years.
[11:07]
Martin Casado
Yeah. So it's almost like we're unrolling evolution. Right. So the language part is actually very, very important for like high level concepts and like the laptop class type work, which is what it's impacting right now. But when it comes to space, and this is everything from robotics, so anything where you're trying to construct something physical, you have to solve this problem. And then we know from AV that it's a very tough problem. And then maybe this is what is worth talking about. Like the generative wave gave us some insight in how you might want to do it. So it really felt like that was the time.
[11:33]
Fei-Fei Li
My journey is very different because I've always been vision. Right. So I feel like I didn't need LLM to convince me LWM is important. I do want to say we're not here bashing language. I'm just so excited. In fact, seeing ChatGPT and LLMs and these foundation models having such breakthrough success inspires us to realize the moment is closer for world models. But Martin said it so beautifully. It's that space, the 3D space, the space out there, the space in your mind's eye, the spatial intelligence that enable people to do so many things. That's beyond language is a critical part of intelligence. It goes from ancient animals all the way to humanity's most innovative findings such as the structure of DNA. Right, that double helix in 3D space. There's no way you can use language alone to reason that out. So that's just one example. Another one of my favorite scientific examples is the buckyball, a carbon molecule structure that is so beautifully constructed. That kind of example shows how incredibly profound space and 3D world is.
[12:49]
Eric
Let's paint even more of a picture. When World Labs has achieved its vision, or world models have achieved their vision, what are some applications or use cases that we can present to the audience to help make it concrete?
[12:58]
Fei-Fei Li
Yeah, there is a lot, right? For example, creativity is very visual. We have creators, from design to movie to architecture to industry design. Creativity is not just only for entertainment. It could be for productivity, for machinery, for many things. That alone is a highly visual perceptual spatial area or areas of work. Of course, we mentioned robotics. Robotics to me is any embodied machines. It's not just humanoids or cars, there's so much in between. But all of them have to somehow figure out the 3D space it lives in, have to be trained to understand the 3D space, and have to do things, sometimes even collaboratively with humans. And that needs spatial intelligence. And of course, I think one thing that's very exciting for me is that for the entirety of human civilization, we all collectively as people, lived in one 3D world. And that is the physical Earth 3D world. A few of us went to the moon, but, you know, it's a very small number. But that's one world. But that's what makes the digital virtual world incredible. With this technology, which we should talk about, is the combination of generation and reconstruction. Suddenly we can actually create infinite universes. Some are for robots, some are for creativity, some are for socialization, some are for travel, some are for storytelling. It suddenly will enable us to live in a multiverse way. The imagination is boundless.
[14:45]
Martin Casado
I think it's very important because these conversations can sound abstract, but they're actually not. But the reason they sound abstract is because it's truly horizontal, just like LLMs are, right? So if you guys say, what are LLMs good at? The same LLM we use for an emotional conversation, we use it to write code, we use it to do lists, we use it for self actualization. And so I think we can get actually pretty concrete about what these models do. Right? And so let me just give it a shot. And then Fei-Fei is the expert, of course. So with these models, you can take a view of the world, like a 2D view of the world, and then you could actually create a 3D full representation, including what you're not seeing, like the back of the table, for example, within the computer. So given just a 2D view, you have the full thing. And then you ask, okay, well, what can you do with that thing, for example? Well, you can manipulate it, you can move it, you can measure it, you can stack it. So anything that you would do in a space you could do, right. That means you could do architecture, you could do design. But it turns out the ability to fill out the back of the table means that you can fill out stuff that was never there to begin with. Right. So let's say that I just had a 2D picture of this. I could create a 360 of everything. Right. And so now you have fully generative. And so what does that mean? That means that's video games, that's creativity. And so it's a super horizontal piece that takes basically a computer with a single view in the world, or maybe multiple views in the world, and creates a full 3D representation that that computer then can act on. And so you can see that that's a very concrete, pivotal thing from everything from like robotics to video games to art and design.
[16:13]
Eric
Yeah, it seems like we haven't fully been appreciating sort of 3D components until now. Is that fair to say?
[16:19]
Fei-Fei Li
It is fair to say. In fact, I think it took evolution a long time. 3D is not an easy problem, but I always come back to the fact that I had a conversation with my 6 year old years ago about why trees don't have eyes. And the fundamental thing is trees don't move, they don't need eyes. So the fact that the entire basis of animal life is moving and doing things and interacting gives life to perception and spatial intelligence. And in turn, spatial intelligence is going to reinvent horizontally, as Martin said, so many of the way of work and life that humans are doing.
[17:04]
Eric
Yeah, fascinating.
[17:05]
Martin Casado
But it is definitely worth asking the question, why can't you just use 2D video for this? Right. Like, 3D is very, very fundamental to this.
[17:11]
Eric
Fei-Fei, you suggested let's get deeper into the technology. What can we share more about how it works or what the breakthrough is, or what's worth commenting on the technology? To Martin's point, does it need to be 3D or why can't you just use 2D?
[17:22]
Fei-Fei Li
I think you can do a lot of things using 2D, and the fact is that 2D will get you very far. In fact, today's multimodal LLMs are already making a big difference in the robotic learning world, helping guide you to know what's next, the state of the world. But fundamentally, physics happens in 3D, and interaction happens in 3D. Navigating behind the back of the table needs to happen in 3D, composing the world, whether physically, digitally, needs to happen in 3D. So fundamentally, the problem is a 3D problem.
[18:02]
Martin Casado
One way to think about it is if it's a human being looking at, say, a 2D video, the human being can reconstruct the 3D in their head, Right? But let's say I've got a robot that has the output of the model. If that's 2D, and then you ask the robot to do, I don't know, distance or to grab something, that information's missing. You've got the XYZ plane, the Z plane just isn't there at all. Right. And so for many things that are spatial, you need to provide that information to the computer so that you can actually navigate in 3D space. And so 2D video is great if it's a human, because we already can turn it into 3D. But like, for any computer program, it'll need to be 3D.
[18:38]
Fei-Fei Li
Actually, I want to tell you a personal story. About five years ago, ironically, I lost my stereo vision for a few months because I had a cornea injury. And that means I was literally seeing with one eye. And like Martin said, my whole life has been trained with stereo vision. So even if I was seeing with one eye, I kind of know what the 3D world looked like. But it was a fascinating period as a computer vision scientist, for me to experiment what the world is. And one thing that truly drove home literally, was I was frightened to drive.
[19:12]
Eric
Wow.
[19:13]
Fei-Fei Li
First of all, I couldn't get on highway that speed. I could not, you know, but I was just driving in my own neighborhood, and I realized I don't have a good distance measure between my car and the parked car on a local small road. Even though I have perfect understanding of how big is my car, almost how big is the neighbors, the parked cars, I know the roads for years and years, but just driving there, I had to be so slow, like almost 10 miles an hour, so that I don't scratch the cars. And that was exactly why we needed stereo vision.
[19:48]
Martin Casado
That's actually a great articulation of why 3D is just actually key if you're doing some processing, right?
[19:52]
Fei-Fei Li
Yeah. So I don't recommend it. But if you're stereo, park your car 1 and drive your car 2 with one eye and Feel it. That's your own car.
[20:01]
Eric
On the tech side with LLMs, a lot of the research was done at the big companies. What's the state of the research here?
[20:07]
Fei-Fei Li
This is definitely a newer area of research compared to LLM. It's not totally fair to say new because in computer vision as a field, we have been doing bits and pieces. For example, one important revolution that has happened in 3D computer vision was Neural Radiance Fields, or NeRF. And that was done by our co-founder, Ben Mildenhall and his colleagues at Berkeley. And that was a way to do 3D reconstruction using deep learning that was really taking the world by storm about four years ago. We've also got a co-founder, Christoph Lassner, whose pioneering work was part of the reason Gaussian splat representation started to again become really popular as a way to represent volumetric 3D. And of course, Justin Johnson, who was my former student, also co-founder of World Labs, was among the first generation of deep learning computer vision students who did so much foundational work in image generation. Before Transformers were out, we were using GANs to do image generation and then style transfer, which was really popularized. Some of the components or ingredients of what we're doing here. So things were happening in academia, things were happening in industry. But I agree with what is exciting now is that at World Labs we just have the conviction that we're going to be all in on this one singular big North Star problem, concentrating on the world's smartest people in computer vision, in diffusion models, in computer graphics, in optimization, in AI, in data. All of them coming to this one team and try to make this work and to productize this.
[22:04]
Martin Casado
I will say from an outsider standpoint, and so I'm not an expert in any of these spaces, but it really feels like to solve this problem you need experts both in AI, and that's like the data and the models, like the actual model architecture, and graphics, which is like, how do you actually represent these things in memory in a computer and then on the screen? So it takes a very special team to actually crack this problem, which Fei-Fei has managed to put together.
[22:29]
Eric
Well, that's an inspiring note to wrap on. Fei-Fei, thank you so much for joining us.
[22:31]
Fei-Fei Li
Thank you. Thank you, Eric.
[22:35]
Podcast Host
Thanks for listening to the a16z podcast. If you enjoyed the episode, let us know by leaving a review at ratethispodcast.com/a16z. We've got more great conversations coming your way. See you next time.