Summary

Podcast Summary: Latent Space — Priscilla Chan and Mark Zuckerberg: Frontier AI + Virtual Biology to Solve All Diseases

Date: November 6, 2025
Guests: Priscilla Chan (CZI), Mark Zuckerberg (CZI & Meta)
Hosts: Alessio (Kernel Labs), Swyx (Latent Space)

Episode Overview

This episode marks the 10-year anniversary of the Chan Zuckerberg Initiative (CZI) and dives deep into the intersection of AI and biology. Hosts Alessio and Swyx explore CZI’s mission to "cure, prevent, or manage all diseases" with Priscilla Chan and Mark Zuckerberg, focusing on foundational and frontier biological research augmented by next-gen AI models and in-house scientific tool development. The conversation traverses the history, impact, technical approaches, challenges, and future vision for virtual biology and disease prevention.

Key Discussion Points & Insights

1. Origins and Philosophy of CZI

CZI vs. Gates Foundation: CZI started with an experimental approach, learning by doing, rather than focusing immediately on translational medicine as the Gates Foundation does ([00:51]).

"We should just kind of dig in and start doing a few different iterations on it and see what we enjoy and where we think we can have an impact..."
— Mark Zuckerberg [00:51]
Unique Role: CZI emphasizes building research tools and fostering fundamental science, rather than conventional grant-giving ([04:26], [10:56]).

"A lot of what we're doing is actually building up these institutes and building labs to do that kind of research ourselves..."
— Mark Zuckerberg [04:26]

2. Mission to Cure, Prevent, or Manage All Diseases

Ambition and Skepticism: Internally, the mission is seen as realistic; AI experts see it as inevitable, while biologists approach it more cautiously ([07:18], [09:00]).

"When you ask AI people, it’s like, that should be really easy. Why are you so unambitious that you’re shooting for just the end of the century?"
— Mark Zuckerberg [07:35]
Ecosystem Approach: CZI’s Biohub promotes collaboration among scientists, AI researchers, engineers, and physicians, breaking traditional silos ([10:56], [14:15]).

"It is... amazing how much progress you can make if you just have people from different disciplines sit together."
— Mark Zuckerberg [10:56]

3. Data Creation, Tooling, and Modeling

Cell Atlas and Data Growth: The Human Cell Atlas project seeded large-scale, collaborative data collection over roughly a decade; newer efforts such as the billion-cell project gather far more data in months and at a fraction of the price ([14:15]).

"We now have one of the largest corpus of RNA transcriptomes. 125 million cells cost a lot of money. And the really cool thing... if we could seed the effort and make it easy for people to contribute, it happened."
— Priscilla Chan [14:15]
Custom Instruments & Imaging: Most cutting-edge microscopes are custom built. Imaging speed, dimensionality (spatial, time), and data fusion remain bottlenecks ([15:46], [16:13], [17:45]).

"...with the cryo model it will get fast again and you just have to repeat it."
— Priscilla Chan [14:15]

4. Virtual Biology and Iterative Model Development

The Virtual Cell & Immune System: Stepwise modeling—protein → cell → system—mirrors advances in AI, with current work just at the beginning ([10:56], [45:12]).

"You kind of need systems that understand data at all different levels... and then... you build a richer and richer model of how these cells work."
— Mark Zuckerberg [10:56]
AI Validity Loops: Unlike rapid LLM testing, AI in bio requires wet lab cycles for feedback, which are still orders of magnitude slower ([25:08], [26:06]).

"You have to actually take it to the wet lab, run the experiment, find out if it actually happened as predicted, and feed it back into the model."
— Priscilla Chan [25:08]
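
The predict, test, and feed-back cycle described above can be sketched in a few lines of Python. This is only an illustrative toy, not CZI's pipeline: ToyModel, run_wet_lab, validation_cycle, the condition names, and the "ground truth" numbers are all hypothetical, and the simulated experiment returns instantly where a real wet-lab run would take days or weeks.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for a biological model: one predicted number per condition."""
    estimates: dict = field(default_factory=dict)
    default: float = 0.0

    def predict(self, condition: str) -> float:
        return self.estimates.get(condition, self.default)

    def update(self, condition: str, measured: float, rate: float = 0.5) -> None:
        # Move the stored estimate part of the way toward the lab measurement.
        current = self.predict(condition)
        self.estimates[condition] = current + rate * (measured - current)


def run_wet_lab(condition: str) -> float:
    """Simulated experiment with made-up outcomes (assumption, not real data)."""
    ground_truth = {"drug_a": 1.0, "drug_b": -2.0}
    return ground_truth[condition]


def validation_cycle(model: ToyModel, conditions: list, rounds: int) -> list:
    """Predict, test in the simulated lab, feed results back.

    Returns the mean absolute prediction error for each round.
    """
    errors = []
    for _ in range(rounds):
        round_error = 0.0
        for condition in conditions:
            predicted = model.predict(condition)
            measured = run_wet_lab(condition)  # the slow step in real life
            round_error += abs(predicted - measured)
            model.update(condition, measured)  # feed the result back into the model
        errors.append(round_error / len(conditions))
    return errors


model = ToyModel()
errors = validation_cycle(model, ["drug_a", "drug_b"], rounds=4)
print(errors)
```

In this toy, the error halves each round because every measurement is fed back into the model. The number of rounds is the scarce resource: each one stands for a full wet-lab cycle rather than an automated benchmark run.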

5. Biohub Evolution and Announcements

Frontier Lab Model: Analogous to a frontier AI lab, the Biohub unifies top talent (notably the EvolutionaryScale team led by Alex Rives, which is joining CZI) for deep AI-biology convergence ([28:42]).

"I think it’s sort of an interesting decision... to have the AI person basically be running the overall program partnering with these leading biologists..."
— Mark Zuckerberg [28:50]
Compute as an Enabler: CZI built large compute clusters specifically for biological research—an atypical move in the science world ([29:48]).

6. A Vision for Precision and Proactive Medicine

Clinical Translation: The real milestone is clinical impact. One major anticipated advance: identifying the effect of genetic variants with precision ([30:36]).

"That is actually the future of medicine, where we think about each one of your biology based on your genetics, your exposure, and how it predisposes you or not to disease."
— Priscilla Chan [30:36]
Doctors’ Evolving Role: As models advance, physicians will focus more on care, interpretability, and guiding patients, rather than routine diagnosis ([36:59], [37:14]).

"Care and compassion and sort of walking patients through understanding, I think understanding why leads to trust in both the science and in the clinical pathway."
— Priscilla Chan [37:14]
Proactive vs. Reactive Healthcare ([38:29]):

"Everyone wants the health system to be more proactive and less reactive.... The goal... is to be much more proactive about this."
— Mark Zuckerberg [38:32]

7. Philosophy of Scale and Timeframes

Is 'Ending Death' the Goal? Focus is on maximizing quality and length of life, not necessarily radical life extension or 'ending death' ([39:53]).

"I'm a pediatrician. I think about babies and like very sad things happen to very small people. And like, I think a lot about that and how do we like maximize life quality..."
— Priscilla Chan [40:23]
Acceleration Depends on AI: Ultimate pace is closely tied to AI progress; biology must keep producing frontier datasets and tools ([48:37], [49:47]).

"If we're predicting... whether it's going to take 10 or 20 or 40 years, that is probably more a function of the pace of AI development than it is a pace of the pure biology side."
— Mark Zuckerberg [48:37]
Collaborative Science Culture: Large-scale, unglamorous data work (e.g., the 120 millionth cell) is essential, necessitating new models of scientific reward and collaboration ([49:47]).

"We need to be continuing to push the research and the methodologies. And I want to say that the cell atlas was not glamorous work...."
— Priscilla Chan [49:47]

Notable Quotes & Memorable Moments

"Believing is the first step." — Priscilla Chan [10:15]
"I think you want the models... to basically build up different levels of abstraction and pattern matching. And that's here too." — Mark Zuckerberg [44:16]
"We need lots of people coming together to do this work." — Priscilla Chan [53:04]
"Check out the models. They're early, but I think it's kind of an interesting sense of where things are going, and we'd love feedback on it..." — Mark Zuckerberg [52:47]

Important Timestamps

00:51 – CZI’s iterative, learn-by-doing philosophy in philanthropy
04:26 – Tool building in science; differences from traditional grant-giving
10:56 – Breaking down academic silos and enabling mixed-discipline teams
14:15 – Success of the cell atlas and community data efforts
25:08 – The necessity and slowness of wet lab validation cycles for AI models in biology
28:50 – Bringing in the Evolutionary Scale team; AI-first leadership in biology
30:36 – The vision of precision medicine: deeply personalized disease modeling
36:59 – The evolving, increasingly patient-centered and empathetic physician
45:12 – Virtual immune system: opportunities, clinical implications, and technology
48:37 – Timeline questions; pace of AI as primary determinant
52:47 – Call to action: try the models; collaborative plea to the global research community

Conclusion & Call to Action

CZI is pushing the boundaries of what’s possible in fundamental bioscience by uniting AI and biology experts, building bespoke tools and datasets, and embracing collaborative, long-term frameworks that break from academic tradition. Clinical impact remains the north star. The guests urge biologists, engineers, and AI practitioners to explore CZI's models, participate in the ecosystem, and help build the tools and data that will underpin the next era of medicine.

"Let's do this together." — Priscilla Chan [53:04]

Explore models and learn more at: https://latent.space

Transcript

[00:00]
A
Hey everyone. Welcome to the Latent Space podcast. This is Alessio, founder of Kernel Labs, and I'm joined by Swyx, editor of Latent Space.
[00:07]
B
Hello. We're so delighted to be in the Imaging Institute of CZI with literally C and Z.
[00:12]
A
Welcome.
[00:12]
C
Mark and Priscilla, thanks for having us. Thanks for getting nerdy.
[00:16]
D
Yeah, we're excited to do this.
[00:18]
B
We so don't often get to see this side of you, so thank you for taking some time out to talk about this. It's sort of the 10-year anniversary of CZI, so I just wanted to introduce people, in case they have not been caught up. One of the interesting things that we found out just from talking to your teams is there's an interesting difference between how you guys started CZI and the Gates Foundation. And I heard that Bill Gates is a mentor of yours. So maybe you could tell that story of deciding to start CZI and deciding to pursue basic science instead of translational work.
[00:52]
D
Well, I mean, I think one of the core things for us with CZI was just getting started. Earlier we got some advice that basically philanthropy and doing science just like any other discipline, requires practice and you're not going to be good at it overnight. So we should just kind of dig in and start doing a few different iterations on it and see what we enjoy and where we think we can have an impact and go from there. So, yeah, I mean, like you mentioned, I mean, this is. We're coming up on in November, the 10 year anniversary of when we started CZI. And there's a lot of work that we're really proud of that we've been a part of, including work in education and supporting communities. But when we reflect on it, we feel like the work that we've done in science really has had the biggest impact and in a lot of ways is accelerating. And especially with all the advances in AI that are coming, I think the ability to have an even bigger impact over the coming decade is. It just seems really clear like this is coming into focus. So, you know, for the next period, we really want to make science the main focus of what we're doing. And specifically the biohub organization that we're really proud of, this model that we've helped pioneer that we can go into detail on is really going to be like the main focus of our philanthropy. And it's just something that we're very excited about.
[02:13]
C
Yeah, when we started 10 years ago, we had this idea like, okay, I bring experience as a physician, Mark's an engineer and he builds things, and we have an opportunity to give back resources to make an impact on this world. And we sort of just, we tried a bunch of things. And the thing that, in running a philanthropy, I'm incredibly envious of in people who run companies is that, like, you guys can have a dashboard and there's, like, financial results and people tell you if you're on the right track, on the wrong track, and there's clarity. But in philanthropy, there's so much you can do, and it takes a long time for you to get a sense of, like, what has momentum. What are we doing that is actually bringing all of our skills and resources to maximal impact? So over the past 10 years, I would say we've been getting a sense of what is that thing that really allows us to have the impact and makes the most of what we bring to the table. And it's really been around AI and biology, where we're like, oh my gosh, this is it. And the ecosystem is big. We really think our ability to bring great scientists, great AI researchers together between the wet lab and the compute, the ability to bring physicians and patients into the picture, that's a unique niche for us at the biohub. And we need others to take the work to translation. The Gates Foundation has a strong focus on translation and the field, and we have had a number of really awesome collaborations and continue to where we really look at sort of the basic fundamental research. And being able to partner with someone who's thinking about the translation layer is incredible.
[04:02]
A
We kind of see the first decade, and I would love to get your take, as a decade of creating data, creating a science ecosystem, and then starting to work on some of the models. And the next decade maybe is more of the applied modeling side. At what point did you decide that doing the tooling was what mattered, versus, say, trying to cure malaria in Africa or some other disease?
[04:27]
D
Yeah, I mean, take a step back. And this is kind of related to your first question too. Like Priscilla was saying, the space is huge. There are lots of other philanthropies, including Gates, who I think they would say that they're primarily focused on public health and sort of administering. Once you know what a cure is, just getting it out to the world is a huge thing too, and someone needs to do that. And that's a lot of work and a lot of resources, and it's good that they're doing that. Basic science is another completely different part of the kind of innovation funnel to enable that. And our view is that the federal government basically dwarfs everyone else in terms of how much they invest through NIH. But there's a certain pattern to how they invest, which is really enabling a lot of individual investigators to do work. And our kind of observation was that if you look at the history of science, a lot of major advances are basically preceded by new tools or new ways of observing things. So the initial telescope allowed a lot of advances in astronomy. The microscope, the invention of that allowed a lot of understanding of biology. And similarly, I think we're at a point in history where a lot of new tools are being built, computational tools, tools to instrument the body in different ways and understand things. And often that tool development just takes a longer-term timeframe and sometimes a larger commitment of capital. Including the way to do it isn't necessarily just to make grants to a lot of different people. You need to really operate it yourself. Which I think is one thing that's different about the way that we've operated than others is most times when you think about philanthropy, you think about giving money away in terms of grants. And a lot of what we're doing is actually building up these institutes and building labs to do that kind of research ourselves by bringing in leading scientists and engineers and all that. But that's kind of the strategy. 
We feel like there's a lot of new tools to develop. There's sort of been a hole in the ecosystem where tool development and kind of the 10 to 15 year runway that you need to do that and often hundreds of millions of dollars to build things like the microscopes and imaging that you're seeing in this institute here. I think that that's been sort of underfunded. And that's where we think that if we do that kind of work, it can just give all these other scientists way more tools to accelerate the pace of research, hopefully discover cures. And then you have folks who are focused on public health who bring that out to the world and kind of deploy it to everyone.
[06:57]
C
Yeah, I mean, our mission is to cure, prevent all diseases. And that's not going to happen just in our four walls. So the strategy has to be how do we make every single scientist and everyone better and more effective? And you know, the strategy Mark talked about is sort of where we landed on how to actually maximally move the field forward.
[07:19]
B
Yeah, the mission is cure, prevent all diseases. By the way, a lot of people outside of the CZI world still find this concept very alien. But talking to the CZI people, they really truly believe it. And it's impressive how you pick the right mission to motivate everyone to work towards this enormous task.
[07:35]
D
Well, it's kind of a funny thing. We like to talk about the mission as, like, helping scientists do it, right? Because we're not actually curing the diseases. We're just trying to build the tools. Tools, data, models. Yeah. Like, basically accelerating scientists' work towards that. But, you know, a funny thing about it is we had this initial time frame of by the end of the century. And, you know, when you ask biologists, there's a lot of questions around, okay, that's really ambitious. Are we going to be able to do that? And then when you ask AI people, it's like, that should be really easy. Like, why are you so unambitious that you're shooting for just the end of the century? And I do think that at the pace that AI is improving things, I think it might be possible significantly sooner than that. I mean, I don't think it's necessarily worth putting a number on it or a date, but I think that, to your point about the first decade was sort of about doing work like the cell atlas to be able to help understand basically all of the kind of specifics and data about all the different configurations of every cell in the body. When we did that, we kind of had this vague notion that that would be useful to advance science. But I think that, like a lot of people in the tech industry, we have even been impressed by how quickly AI has accelerated. But that ended up being a really valuable thing to have done over the last 10 years, especially for where AI is now, and now the models that can get built with that.
[09:01]
C
But the thing that's interesting, don't you agree, is like, okay, so from a. I totally agree that in our intersection of AI and biology, the AI folks are like, yep. The biologists are like, hmm. And I think it's actually that confluence of conversations that lead both the biologists to be like, okay, I'm really uncomfortable about this idea and timeline. But if I'm really pinned down to think about it, what are, like, you really force people to think through? Like, okay, what are actually the barriers? What would you need to do? And you're forcing that conversation from the biologist side and from the AI side, really getting a sense of, okay, what? Like, data is not just data. You guys know this. Like, you need to know sort of how the data was collected and from where. And being able to connect the AI researchers to the folks who are actually gathering the data on a daily basis makes their work better. And so it's that conversation that's happening here that I think makes people outside so excited about this because it's credible. And they sort of have worked, really dug in and thought through how that would work, and they're excited and they believe. And believing is the first step.
[10:16]
B
Believing is the first step. There's a general pattern of software eating the world, and I think AI eating the world is kind of like the next version of this. I was talking with Garrett outside, who says he's a biologist, but I think he's using models like SAM from Meta.
[10:28]
C
You're like, you don't look like only a biologist.
[10:31]
D
What does a biologist look like?
[10:32]
A
I don't know.
[10:33]
C
He's working on models out there. And that's like biologists are working using models, right? Not just in imagination, like just using. In the wet lab.
[10:41]
B
Yeah, totally. Referencing the wet lab, one of the key approaches that you're pursuing is a virtual cell, turning things from mostly wet lab into something in silico. How far along are we?
[10:57]
D
I mean, it's pretty early, right? I mean, I think the first step, which I think is easy to overlook, is basically what Priscilla was talking about of just getting these folks together. It almost. It's worth taking a beat just to talk about this, just because I think most people assume that this is like, obviously you would go do that, but it's somewhat novel in science because of, I think, the way that a lot of funding has been done that is basically you grant individual teams, relatively small grants, and people do a lot of science independently. It is, I think, pretty amazing how much progress you can make if you just have people from different disciplines sit together. I mean, this is like, over my career. I mean, both at Meta and here, it's like you have teams that are not working together for some reason or they disagree on something. It's like, okay, physically just have them next to each other, and it actually is super helpful. So here, what are we doing? It's not just bringing together the biologists and the engineers, which was a core part of the initial biohub model, but it was also unlocking the ability for people to work together across institutions. So the first biohub that we started out here between Stanford, UCSF and Berkeley, allowed a lot more collaboration between scientists and engineers at those universities than was in practice happening before. And it's like, you can look at this and be like, all right, that seems really obvious. But it actually was sort of an interesting and novel experiment and one that I'm really happy to see others also implementing because I think it's just such a clear win, just the kind of the human side of bringing people together and having them sit together. So anyway, that I would say is kind of step one or step zero and is probably quite overlooked, but is sort of a fundamental part of the model that I guess also goes back to this idea of, like, we're not just kind of like, granting funds to other people. 
We're building an institution and we're having people sit together. So then you get that, and then you get these people who are like half biologist, half AI engineer, because they kind of have some experience doing it. And, I mean, I don't know. I mean, we can talk through the specific models, and there's a lot of exciting stuff there, but I'd say it's an early glimpse of where this is all going. I think you want to kind of build up these models hierarchically, so you give them a lot of data about specific proteins, and they can model specific proteins in the cells, and then you can model different cell behavior. And then eventually you kind of zoom out and you're modeling a virtual immune system or something like that. And it's sort of hard to simulate the immune system without having a good understanding of how a cell might work. And it's kind of hard to understand or simulate how a cell might work if you don't really understand how the proteins interact. So you kind of need systems that understand data at all different levels of this, and then you kind of pull them together. And then if you look at the different models, there are versions that are kind of focused on, all right, like, which parts of the genome are kind of being expressed in different ways. I mean, the cryo model that I think is very interesting, that's built off of the data here. The only model that I'm aware of that's like a spatial model of basically how these cells work. And you just want to be able to look at stuff from different perspectives and then put them together, and you build a richer and richer model of kind of how these cells work. But we are definitely at the beginning of this journey.
[14:15]
C
But it's like, slow and fast. Slow and fast, right? So when we built the Human Cell Atlas we started 10 years ago, it was one of our first RFAs. And we actually, the first RFA was to fund the methodologies of how you would get a single cell transcriptome. And it took us about 10 years to get to a place where we now have one of the largest corpus of RNA transcriptomes. 125 million cells cost a lot of money. And the really cool thing we discovered through that process was if we could seed the effort and make it easy for people to contribute, it happened. That's CELLxGENE. We actually were responsible for maybe 25% of the data and the rest of the ecosystem contributed 75% of that. That's an incredible asset and has been very important in modeling work. Similarly, if you look at AlphaFold, they built off publicly available data that was collected for 30 years prior. Right. So that takes a long time. But now we're doing the billion cell project and that is taking months and at a fraction of the price. You know, really slow to fast. But it's a single dimension and cells are so complicated. And here we're looking, like Mark said, at the three dimensional imaging structures. That's in its slow and expensive phase. But with the cryo model it will get fast again and you just have to repeat it. And so I think we'll get growth spurts but it's all happening just faster and faster.
[15:47]
A
How do you think about the layers? So you have compute and we'll talk about that later. On the data side, you build these amazing microscopes. I learned that they're all built for you by spec. They're not off the shelf things that anybody design partners. How much of a bottleneck is that still? Can we convert the world of atoms into bits now at the right perceptible or do we need more work on the microscopes themselves too?
[16:14]
D
I mean you're never done, right?
[16:15]
C
Yeah, well, for here, speed has been a big question, just getting the process through. So here we've worked on sort of the speed at which we can look at tomograms and the sort of contrast in resolution. And that's where the laser phase plate comes in. So to be able to make the data better and faster to get the data. But it's a bottleneck in so much as there's only, I don't know the exact number. There are like maybe tens of these microscopes in the world. So that's one bottleneck and I think really is like when I was saying it's slow and then fast. There's so many other dimensions that we don't have yet. Like, the cool thing here is with the transcriptome work we're looking at cellular expression and with the imaging work you're able to localize it in space. And now you want to connect those two, but that's still like two dimensions connected. Time is another dimension. We need to get dynamic imaging in place.
[17:12]
B
Oh God, that's.
[17:16]
C
Yeah, right. But like really cool biological innovation. We need innovation in the way we can look at things like stain free, dye free, so we can look at things without sort of human intervention. With time as a dimension is another because like we are not frozen slices. So I think it's just continuously looking at what the next dimension. We want to sort of be able to either understand deeply or connect to our existing corpus of data and knowledge.
[17:45]
D
And obviously the ideal would be you want to increasingly be able to image things inside living cells. Right. So I mean you can kind of, you can simulate it a bit by okay, you can take a cell out or some culture. It's all destructive. Yeah, it's like, okay, it's living for a little bit or something. But I mean you really want to be able to kind of as much as possible actually understand what's going on in living organisms.
[18:08]
B
Can that be done? Is there, what, what are the approaches?
[18:10]
D
Well, the better it gets.
[18:11]
C
Well, there's this cool methodology. So there is a really high intensity X ray methodology you can use. The organ has to be dead. So like you can just shoot X rays, high intensity X rays at like a lung and understand at like a sort of molecular level how the lung is assembled. And then you can correlate that with living imagery. Right. MRIs of the lungs, CTs of the lungs and look at the associations between the living images in real patients with the sample that you put into the high intensity X ray. So that's another example of like correlating data types so that we can get that sort of high level specificity with clinical data that impacts humans.
[18:56]
D
But I mean in some level that's sort of the point about building these AI biological models is you can have a lot of data and you can interpolate on that space and understand that.
[19:07]
C
Yes.
[19:09]
D
So one of the models that again, it's really early work, but the RBIO model, the idea of doing reasoning is that then you don't just get correlation, but you get some understanding of logic over how these things get together too. So yeah, I think it's probably going to be a while and people don't have great hypotheses on how you'd actually do like molecular imaging, like of a cell deep inside a living organism. But the goal is to be able to approximate that as much as possible with like this kind of surround view of different things that you can image.
[19:45]
C
You guys like to see cool stuff. It's not here, but at our San Francisco site, we do image see-through fish called zebrafish.
[19:53]
B
Zebrafish, yes.
[19:54]
D
That's another, it's another good example. Like another good model. All right. It's like, what's a good way to image a living thing? It's like take a sequel, see through.
[20:01]
C
Things and then use a model to say, how does this see through thing actually relate to us? Right. Like, I'm like not that interested in curing disease, cure, prevent, manage all disease for zebrafish. I am very interested for zebrafish.
[20:17]
A
Yeah.
[20:17]
C
Mark, Mark's pro zebrafish. I'm okay on zebrafish. But you, you need to use another application of large language models is looking at how, what is conserved and what is actually relevant and important to the way human biology works in a fish model. And so being able to have that translation be more effective so we don't waste our time on things that won't apply in a model organism is another really interesting way to elevate biology.
[20:44]
A
On the data side, can you just give an overview of how far we are, like, what percentage of all cells that we image and do we have? What's the distribution of them? When you say 150 million to 1 billion cells, is that a lot? Is that 10%?
[21:01]
C
The funny thing is, until recently we didn't know how many cell types.
[21:05]
D
Yeah, I mean, the human wild thing, I mean, this was a big part of the cell atlas project is like there wasn't even. It's kind of like imagine the periodic table in chemistry. But you, you know, it doesn't end well.
[21:15]
C
It's.
[21:15]
A
You don't have the squares.
[21:16]
C
We know it's billions. We know there are billions of cell types in a human and we've only truly looked at a fraction of them. And we looked at it in largely healthy cells. And so like just the number of permutations of like age, well, species, because not all research is in humans. Right. So species ancestries, like what is your sort of genetic background? Age, like babies are different than old people, Gender, all of those things actually are permutations. Environmental exposures, all of those things are permutations on the cell that actually you, you want to be able to understand in healthy and disease states. I feel confident that we are at the beginning of this.
[21:59]
B
I'll ask a little bit of an obvious question in terms of the intersection of AI and bio, which is don't we want precision in biology? Don't we want some grounding in a world model maybe that we don't normally get in a language Model?
[22:15]
D
Yeah, I mean, I think that that's sort of the point of doing all the measurement and being able to have all this real. So you have the, the diffusion model for generating cells that we put out. And it's like one of the recent models. And it's cool because you can basically, you have a model now that you can describe the conditions and it'll basically give you a synthetic cell. But yeah, you want it to be increasingly grounded. And that's a lot of the point of the biology and the engineering that we're doing is to be able to have these different facets of that. So the Imaging Institute is one part that gets you the spatial data that's very helpful. And the work that we're doing in the other biohubs on cellular engineering and instrumenting inflammation and things like that, it's basically, it's scientific work to build new types of tools that allow us to measure new types of things that generate data, that allow us to ground the models in different ways. One framing that we have on this that I think is pretty interesting is that there's this concept of a frontier AI lab that is like, okay, it's building AI models that are sort of at the frontier of what's possible. And I think you can think about biology in that way too. And there's sort of a concept of a frontier biology lab. Like, what is the idea of labs that are kind of at the cutting edge of building the most advanced imaging, like measuring inflammation or doing cellular engineering in the most advanced ways? Whatever the problem space is that you're at. And then I think that there's this interesting problem space of what happens if you're at the intersection of those two areas.
[23:54]
A
Right.
[23:54]
D
So you mentioned the work that DeepMind did on AlphaFold, which is great. That's an example of a frontier AI lab using a data set that was just generated by other scientists over decades. But I think part of what we're trying to unlock here with Biohub is the idea of what happens if you do frontier biology and frontier AI in sync together. And you're designing the tools on the frontier biology side in order to specifically collect and be able to learn types of data that you then want to feed into specific types of models that you want to build so that it can understand the cells and the body at different types of resolution. I think you can just kind of, I don't know, it's like a much more integrated approach that allows designing the things that you need that should eventually get towards more grounding and not just allowing folks who are good at AI to do the best they can with whatever biological data happens to be available.
[24:53]
A
What's the hill climbing in this scenario? So with language models, you have benchmarks, you look at the benchmark, you just make that go better. With these things, you have to bring it back to the real world. So as you build these models, how do you bring the two teams together to give feedback?
[25:08]
C
I think it's very similar to what Mark just said. You want to be able to validate on the accuracy question. We don't expect that these models, they will get increasingly accurate, but you want to be able to have feedback. And it's not as easy as being like, you know, this output doesn't make sense. You have to actually take it to the wet lab, run the experiment, find out if it actually happened as predicted, and feed it back into the model. And that's the virtuous cycle we want to build to help the AI best serve the biologists and the biologists be part of continuously improving the models.
[25:44]
A
From like a numbers perspective in a language model, you can run tens of thousands of tests. There you go. Theory false. Yeah.
[25:51]
D
And we have to build a lot of them out.
[25:53]
A
Yeah, yeah. And then on going to the wet lab, what do you think that's going to be like, the feedback cycle? Like, as you start to have more of these things to be tested in the wet lab, do you feel like that's going to be a bottleneck, that we cannot take that many or.
[26:06]
C
I don't know the answer to that yet. I think the throughput on sort of established metrics in the wet lab is actually getting quite fast. You can parallelize a lot of experimentation, but it's not easily at the tens of thousands of verifications. But it will have to. Well, we actually have to see. We'll probably need to be smart about how we do it.
[26:32]
D
But I mean, there's, you know, a lot of people I think, often take these things to the extreme and are like, okay, pretty soon if you have these models, you're just going to be able to run experiments with the models without even having to go to a wet lab. And it's like, no, I mean, I think that's kind of like. I think that that's sort of the biological version of like, eventually AI is going to automate every single thing in society. It's like, look, maybe you get there, right? And I think that there's like some chance over time, but, well, before you do, you're going to be able to have models that can help generate hypotheses and scientists can apply their taste on which ideas or kind of suggestions come from this are worth testing. And then you test them and then you feed it back into the model, which I think is basically the way that every AI model is deployed into, even in coding and other places.
[27:20]
C
Totally. Yeah. Like you right now, because the wet lab is so expensive and relatively slow compared to sort of computational experimentation. Like people are choosing, like, I need something to hit so people are going for hypotheses or ideas that are like, you know, to use a sports analogy, like singles or doubles and. But like they. It's just too risky. They only have so much grant funding and they need something to help move their work along. But like, if we have a model that can help de risk some of the bigger, riskier ideas, that's going to move science faster. And I think makes the science and those ideas both can be sourced with AI as a tool. But really it's really about making the scientist less hesitant to explore big ideas.
[28:09]
B
Yeah, obviously that's a lot of the success of the CZI model, which is serving this part of research that is underserved because there was basically no benefactor or no funding mechanism by which to do this. One thing that we're announcing when we release this podcast is this unification of the sort of biohub model. I think it's very analogous to the foundation model and frontier lab approach where you bring together people, different disciplines, you have much longer time horizons than anyone else. Are there any other key elements to the strategy of the biohub that you're taking?
[28:43]
D
Well, I mean, one thing that we haven't talked about is the EvolutionaryScale team and Alex Rives and his team.
[28:48]
B
Joining and they're like, let's talk about the announcement. Yeah.
[28:50]
D
Yes. This is probably the most talented team working on AI and biology at the intersection of doing basically good biology background. And also they've just been working on ESM3. Yeah. Some of the top protein models for a long period of time. Yeah. I mean, I think if you want to build an organization that is doing frontier biology and frontier AI, you need to have like world leading AI researchers. And we're doing that by basically combining the team that we have that's already put out all the models that we're talking about today, plus having the EvolutionaryScale team, which is just like very renowned, join and Alex is basically going to be running the program. So I think it's sort of an interesting decision, I think, to have the AI person basically be running the overall program partnering with these leading biologists, I think gives a sense of how optimistic we are about the AI work being very fundamental to this. But we're very serious about building out like a leading part, a leading lab on the AI side as well. That goes for both the talent and the compute. I think we were probably the first to build out a large scale compute cluster for biological research. I think now there are some others who are doing it too. But we're also building on that and we plan to release frontier models on this.
[30:11]
A
Do you see that as the 10 year output, like in the next 10 years?
[30:15]
C
We look back at that 10 years yesterday. They say it's faster than that, but.
[30:19]
D
AI people are always in a hurry.
[30:22]
A
We have AGI in two years. Would that be a satisfactory result for you guys? You fast forward 10 years, you have the three best models in biology. Or is there a further goal that you want to have as an output of the foundation?
[30:37]
C
I have to bring it back to the patient. I think the AI models I think will be very excited, both if we have great models and scientists are using them. But you really want to make sure that it's accelerating clinical impact. That's the goal. The AI models is a very challenging milestone that we are working very hard on and we will get there. But how do you actually take those models and apply them to actually change the way people live? And there's. So there's two variants that I think about in the application of these models. Why are they important? One is like, each one of our genetics is incredibly diverse and different. First, like, first of all, we are just, all the four of us are unique people, but we also have things like that are sort of known indicators of disease and unknown indicators of disease. And I actually find the variants of unknown significance to be the most interesting and the most frustrating. Say someone that you love, it's sort of a diagnostic mystery. They need to go in and look at the genetics. Most likely they'll come back and be like, there are these three things that are not usual, but we also don't know why. And you're like, okay, like, should I panic? Should I not panic? Like, what do I do now? And what you really want to do, and I think these models will be able to do is look at those variants and actually model out what is the impact in the different cells, how it influences cellular behavior and whether or not that is tied to a pathway to disease or not. Like that's a big deal. And I think we should be doing that. That is actually the future of medicine, where we think about each one of your biology based on your genetics, your exposure, and how it predisposes you or not to disease. That's huge. And we want to be able to see that clinical application, but we can't. It's too expensive, too hard to model each person, impossible to model each person in the lab. But if we can build models around this, it is possible.
And then we can start thinking with extreme precision. And I'm not just talking about rare disease. They're common diseases. I'll just say depression right now. It's empirical. Right. We just say, like, you're depressed. Like, here, let's try this antidepressant. And it's like, usually the one that the patient, the doctor's more familiar with, or maybe one that you've heard of, but like, and then you have to try it for months before it's like, did it work? Did it not work?
[33:05]
B
Months, yes. That's the cycle. I don't have familiarity with this. It's horrible.
[33:10]
C
And meanwhile, if it doesn't work, it means the person's suffering. And this applies to, like, almost every disease. Right. There has to be some biological explanation as to why some medications work and don't. So can we actually then look at each patient and say, based on who you are, we think this medication is going to work best for you? That's the future I want to live in, where we can actually understand individuals as individuals and use the biology and science very directly to keep them well.
[33:41]
D
Yeah.
[33:41]
B
So if there's a name for this tool that has the clinical impact that is on the scale of the electron, how do you envision it? I guess I feel like it's almost going to be the CZI app, I guess.
[33:56]
C
Oh, well, it won't be. First of all, that's not what we're building right now. We're building the basics. We're understanding cells and molecules. So I'm painting.
[34:05]
B
Someone else will do it.
[34:05]
C
We're taking. Painting a picture. Like, we need partnerships. This is. You asked about the ecosystem before. Like, there are experts along the way of this pathway. And so we sort of are at the fundamental research side.
[34:16]
B
Yeah.
[34:17]
C
And you need to be able to partner with folks to bring this all the way through impact. But the way I think about it, people call it different things, but essentially you want to get to medicine where we. It's truly precision medicine. It's N of 1. We're understanding you and designing therapeutics for you.
[34:35]
B
Yeah, I like the mission of Rare As One as well. That's a great framing.
[34:39]
A
Do you feel like that's possible, like almost treating the body as like a compiler? It's like because I know exactly what it looks like, I know exactly what's going to happen. Or is the body just like there's too many outside inputs and like over time it kind of deviates from what you have?
[34:54]
D
Well, I think we'll see how far we can get, but I mean I'm pretty optimistic that we'll be able to make a bunch of progress. And yeah, I mean there's like you basically, what format does this take technologically? I would imagine you're taking these different types of virtual cell models and eventually merging them into the equivalent of a biological omni model. Kind of like how on the language model side you had people that did language and then people who did different kinds of media models and perception and all that and then eventually you just merged that and then you aim to get positive transfer by merging it. So that way it's not just combining capabilities, but getting everything else to be stronger. So yeah, I mean technologically I think that's basically what it looks like is over whatever it is, a five or ten year period, we're building up a series of biohub models that increasingly get all these different dimensions of data and capabilities that can be used to help run individual science experiments and potentially eventually help with finding individual therapies for patients. Although we're going to be less on the clinical side, we're going to be more on the kind of scientific tool development side. And the kind of main tool, if you will, is this like these biohub virtual cell models.
[36:11]
C
I would say five years ago, without sort of the large language model supporting this, I don't think it would have been possible to really. Because biology is incredibly complex. And what we're essentially trying to do is break it down from a discovery based science where you kind of get lucky, you kind of get clever and you sort of figure out a hack to learn something new, to really making it closer to an engineering problem of like this is how the system works and when this breaks, what happens to the rest of the system? But like you said, there's just, there's far too many dimensions for us to hold in our brains. That's why we're so excited about this intersection at this moment, because it is possible to consider so many more dimensions matching the complexity of biology.
[36:59]
A
What is the role of the doctor in that future? Right. If you can like predict everything out and then if you take personal superintelligence Seriously, do you kind of distribute some of the diagnosis and all of that work, or how do you envision that?
[37:14]
C
I've been thinking about this a lot, and I think one is the model's not going to take you all the way. You're still going to need to really look at individual clinical situations and the doctor is going to be a form of data input into the model. Right. And so the doctor. There's some judgment that comes into place, but there's already a lot of models that make doctors really good at what they do. For instance, looking at your skin, like, AI is really, really good at detecting lesions in your skin that are concerning. It's excellent. Retinal issues, it is excellent. So the AI modeling and mapping is really, really good. So it's already happening. So I think about what should future doctors be trained to do? And I really think care and compassion and sort of walking patients through understanding, I think understanding why leads to trust in both the science and in the clinical pathway. And really walking alongside patients on that journey is, you know, it was the original calling of physicians to be healers and to be using great tools to heal patients.
[38:29]
A
Wow.
[38:30]
B
So bedside manner, ultimately.
[38:33]
D
I mean, I also think you can zoom out, though, from like, the role of a doctor to. I think everyone wants the health system to be more proactive and less reactive. Right. So today it's like you show up when you're sick and then you have someone treat you or understand what's going on. I think the goal with a lot of these systems is to be much more proactive about this. So when we say that the vision is to try to help scientists cure and prevent all diseases, it doesn't mean that there's going to be no bacteria in the world and no one ever starts to get an infection. It's just that. All right, ideally, you can kind of understand all of that really early. Right. Similarly, if someone gets a mutation, it looks like it might become cancerous, then you can just treat it a lot better if you know that early, rather than showing up to a doctor when it's already metastasized. And you have a bunch of issues on that. So, I don't know. I think that there are going to be a lot of opportunities to fundamentally improve the healthcare system overall. But I agree with everything that you said on this. And I also just think that when we say that we think it's going to be possible to prevent and cure all diseases, it's not literally that no one ever gets the beginning of a sickness. It's just that it kind of can be managed in a way where everything is sort of manageable.
[39:53]
B
I think we discover more diseases the longer we live. Is it possible to not die? Obviously that's a meme that's coming to fruition. If you theoretically cure all diseases, maybe death is a disease.
[40:07]
C
Mark just said we had extreme alignment, which I love. Thank you, honey.
[40:14]
D
This is one that we don't necessarily.
[40:15]
C
This is one that I'm not sure we have extreme alignment on. I in fact, just haven't thought about this one very much because I think there is so much.
[40:23]
B
There's other things to do.
[40:24]
C
There's so much to do in terms of, you know, I'm a pediatrician. I think about babies and like very sad things happen to very small people. And like, I think a lot about that and how do we like maximize life quality and the things that harm small people. I'm biased and I haven't thought as much on the other end of the spectrum, but I don't know, I'm 40, maybe I should, but I feel like I can still focus on the little ones.
[40:51]
D
I think the strategy is the same, right? I mean, it's like we're basically choosing to not focus on any specific disease and like verticalize. Our strategy is one of trying to accelerate scientific progress overall. And I think that there are a lot of people who are going to focus on each of these individual things.
[41:08]
C
So I don't know, but we don't have to because that's not our strategy. Our strategy is to make sure that we have tools that make people do the best science possible out there.
[41:17]
B
I'll put to you that, you know, because of, because aging and environments and mutations are so diverse, you have a high concentration of grouping in the early years and it should have more diversity in terms of the cell types and the problems that you face in the later years. There might be some imbalance in terms of where all these things happen, but I'm not pitching in any particular direction.
[41:43]
D
No, I think it's clearly. If you look at the trend over the last. I don't know what it is, 100 years. I mean there was this flip. If you pay attention to the history of science where it changed to kind of hypothesis driven scientific method of like we're going to run tests and have controlled experiments. And since that happened, the average life expectancy has basically increased by. I think it's about a quarter of a year every year over the last hundred years. Now a lot of that, like Priscilla said, is basically making it so that a lot of people don't die young. So far it's had somewhat less of an impact on extending the maximum human life expectancy. Although the oldest people today, I do think in general are older than the oldest people 20 or 30 or 40 years ago. But there's been a little bit less of an increase there and more just kind of making it so that people don't suffer and die prematurely from things. But I mean, there's other things that you want to focus on here too. It's not just like how long you live. It's like quality of the life while you're. I think it's like you can live a full life and have that be high quality, or you can get sick in different ways that kind of add up over time. And I think there's lots of different ways to improve. There's all these different analogies that you could throw at this, but I think there's just a lot of room to improve here.
[43:10]
B
And then the other element I wanted to come back to on the engineering side, which is when you're presented with a high dimensionality problem, you want to reduce things into little boxes that you can sort of manipulate at a higher abstraction. And that's something I try to do with the folks outside. And we really struggled because over here you're imaging on the atomic level, and then you're also worrying about proteins, and then you're also trying to build a cell model. Is every abstraction leaky? Where's the boxes? I can move around and not worry about it. My physics analogy is because in the regular world you don't have to worry about quantum physics, but here we kind of do.
[43:45]
D
I think you want to build it up a little bit hierarchically. And when you're trying to understand proteins, understanding molecules makes a big difference. But at some level, you can kind of just look at correlations in cells. But if you want to really have the most accurate model and if you want to be able to reason about things, then you probably also want to understand proteins well, and then I think that kind of extends. But yeah, I mean, that's part of the interesting challenge of this, is that it's not just like one resolution that you're looking at it. I think in order to do it.
[44:13]
C
Well.
[44:16]
D
You have some amount of abstraction, but I think you want the models just like language models or I think how our brains work to basically build up different levels of abstraction and pattern matching. And that that's here too. And you basically just need to be kind of like, have some basic excellence and understanding at each of these different levels.
[44:34]
B
It's weird, the number of levels at which you have to telescope up and down. It's mind boggling. And I think when people say dimensions, they typically mean orthogonal dimensions. But here it's sort of like nested.
[44:46]
D
And just different scales that are oddly different disciplines to understand each specific scale. And it's like in a way that the people who are good at understanding one scale are like they've never spoken.
[45:00]
C
People at the next scale.
[45:01]
D
Yeah.
[45:02]
B
Physics is there, chemistry is here, bio is there. It's nice to hear about it. But when you see it and you meet the people, you're like, oh, this is real. And they are actually working together.
[45:13]
A
And then there's this goal of the virtual immune system that you're working towards. I would love for you to chat about that. And also if that happens, what should other people build? So there's obviously CRISPR and some of those technologies, like the people should maybe ramp throughput for like how do you think about the future?
[45:30]
C
The virtual immune system, I think is obviously, I think of a subset of sort of the generalized model eventually we'll get to. But the virtual immune system is super interesting for a couple of reasons. One, it's individual cells interacting with each other. There's a number of cells that we don't even fully understand what they do. B cells, T cells, NK cells. And so we can use our current technologies to understand these cells at a more granular level. So that's cool from a biology standpoint, but the clinical impact is huge of understanding the immune system because biology turns out has already given us a way to keep the body healthy. And it also sometimes goes awry and causes disease with autoimmune diseases. Right. And so it's a very complex system that has to stay in balance. And if it goes out of balance in either direction, you get sick. It can also go into your body and it's a privileged system that is mobile and can go into places like your brain, your pancreas, your heart to sort of either do maintenance or to collect signal that's it's built in. So if we can understand this system, we can use it to keep people healthy. We already kind of do. So there's CAR T cells where we reprogram T cells to go in and fight cancer. And our New York biohub, we're doing cellular engineering to say like, hey, can you go in to this person's heart, check if they have plaques that are causing problems, read it into your DNA, self-lyse, and then we can read out the signal of cell-free DNA and give us a binary answer, yes or no. Then we can put in other engineered cells and imagine where you go in and you clear out the plaques using engineered immune cells that are your own. That is incredible. That is a tool that is realistic too. I know it sounds sci fi. It is realistic, it is happening. And then on the other end of understanding the balance, so many autoimmune diseases, MS, lupus, those are the examples of ones we know.
I think there are other things that are autoimmune that we don't understand. Like dementia can have. Autoimmunity can play a large role in that. And so if we can understand the fine balance that the system needs to be kept in, then we can actually impact a lot of the ways the human body is maintained. So I think it's both interesting from a biology perspective and feasible to model and probably one of the highest now impact systems if we can learn how to manipulate.
[48:07]
B
Amazing.
[48:08]
D
But it's only one system, right? I mean I think it's like the.
[48:11]
C
So it's a subset.
[48:12]
D
If you're focused on curing and preventing diseases, the immune system is a pretty important one. And I think it's also interesting for all the, and unique in a lot of the ways that you said. But there's like lots of other parts of the body to understand too.
[48:23]
A
And I think we're running out of time. So we have two questions to close one again, 100 years maybe is too long, right? What would it take to do it in 50, in 25 and to make those happen, like what should other people build to support your work?
[48:38]
D
I mean, I think a lot of this is going to end up coming down to how far a lot of these AI methods get. I think that there's like people have, there's just this constant ongoing debate around what are the time frames for getting to very strong AI. And I think if you get that, then I think it's pretty optimistic that with the right investments in frontier biology, you should be able to get these systems that can allow you to have virtual cells that allow you to do the kind of precision treatments and preventative care that can achieve this kind of mission significantly sooner. But at the end of the day, I think a lot of that time frame will probably come down to the AI timeframe. There's obviously a ton of stuff to do in biology, so it's not. I mean, I think that what should other people do? I mean other people doing more frontier biology and helping to collect this type of data and solve these problems is super helpful to that too. It doesn't automatically happen. But I guess if we're predicting whether it's going to take 10 or 20 or 40 years, that is probably more a function of the pace of AI development than it is a pace of the pure biology side.
[49:48]
C
Yeah, I was going to agree with you. I think a lot needs to. I think we're on a path to get a lot of important biological data through advances in laboratory technique, but it's not a given. And there are different groups that are expert at this all across the nation and across the world. And so we need to be continuing to push the research and the methodologies. And I want to say that, like, you know, the cell atlas was not glamorous work. People were not going to get their tenure track paper by sort of analyzing the hundred and twenty millionth cell. That is just not it. Right. And so rethinking the way that this work gets done in a collaborative, like doing big things together in science, that's what is going to need to happen to sort of get the knowledge we need to build models that give us this type of insight.
[50:43]
D
I guess one thought on the type of biology that I think should get done is there is a certain orientation around choosing problems that will help generate data that can help make the models a lot smarter. I think that you do that when you are very optimistic about the pace of progress and what AI is going to enable. Because the classic reason that scientists generated data sets is so that they could basically look through the data sets to make advances. So it is a little bit of an inversion in the thinking, which is like, I'm now going to do this so I can help train this other thing to be better and create more advances. And I think in a world where you really believe that there's going to be very significant AI progress, I think more frontier biology should be done in that way. But these data sets aren't going to get created by themselves. There's a lot of work that needs to get done and a lot of investment there. And at some level, you could probably have the smartest AI model in the world. But if it doesn't actually have the data to understand this stuff, it's like, okay, you can't just reason from first principles about all these things. I mean, a lot of human knowledge comes empirically, not from first principles reasoning. I think that more, this is kind of the whole biohub network idea that we're building. And I've been really happy to see other folks, especially a lot of people in technology, I think, have this orientation, too. They believe a lot in AI. They believe in the technological progress. They've generated some significant wealth building their companies, and now they're investing in science research. And I think that's great. And I think doing it in this way where you're building up these networks to basically build specific tools that generate data that make the models better, it's one approach.
It's not that all science should go in that direction, but it's one of the things that I'm quite optimistic about that I think is going to make a very big difference. Cool.
[52:39]
B
It's probably all the time we have, but I'll just leave it to you guys for any calls to action. Anything that you want biologists or engineers to check out.
[52:47]
D
I mean, check out the models. Check out the models.
[52:50]
B
The tooling.
[52:50]
D
Yeah. I mean, they're early, but I think it's kind of an interesting sense of where things are going, and we'd love feedback on it, and it'll kind of just help this feedback loop of, like, what we should build next.
[53:05]
C
Yeah, I would say let's do this together. We need lots of people coming together to do this work.
[53:11]
B
Well, thank you for organizing it and solving and curing all diseases, trying to.
[53:16]
D
Help others do it.
[53:18]
B
All right, thank you.
[53:19]
C
Thank you.