Robert Endres, "The Unreasonable Likelihood of Being: Origin of Life, Terraforming, and AI" (arXiv, 2025) - New Books Network


Podcast Summary: New Books Network
Episode: Robert Endres, "The Unreasonable Likelihood of Being: Origin of Life, Terraforming, and AI" (arXiv, 2025)
Host: Gregory McNiff
Guest: Prof. Robert Endres (Imperial College London)
Date: February 24, 2026

Overview

In this wide-ranging and thought-provoking episode, host Gregory McNiff interviews Prof. Robert Endres about his recent paper, "The Unreasonable Likelihood of Being: Origin of Life, Terraforming, and AI." The conversation dives into profound questions about how life may have emerged on Earth, the interdisciplinary challenges of studying this problem, the roles of AI and mechanistic models, the intriguing possibility of directed panspermia, and the importance of retaining human understanding as datasets and models grow ever more complex.

Main Discussion Points and Insights

1. Motivation and Target Audience

Why the Paper?
Prof. Endres shares that after years in academia, he sought to explore "the big questions" that originally drew him to science (02:18), noting that the origin of life is one of the most fundamental and fascinating puzzles.
Interdisciplinarity:
The problem naturally intersects physics, chemistry, and biology, demanding a fresh, quantitative, and cross-disciplinary approach.

"You want to answer the big questions and you wonder where we came from and all these six big important questions... I wanted to work on something super interesting." (02:18)

2. Abiogenesis vs. Evolution

The paper focuses on abiogenesis—the pre-Darwinian emergence of living systems—rather than biological evolution (04:14).
Endres distinguishes between the emergence of initial replicators (protocells) and subsequent Darwinian processes.

"As a physicist, you would work before all that happens, before the biology kicks in." (04:14)

3. Our Understanding of Life’s Origin

Earliest Fossils:
Microfossils from around 3.5 to 3.8 billion years ago provide the first hard evidence for life, supplemented by bioinformatics estimating the "last universal common ancestor" (LUCA) at approximately 4.2 billion years ago. (05:45)
Surprising Complexity:
LUCA exhibited significant metabolic sophistication, immune mechanisms (e.g., CRISPR-Cas), and gene shuffling by horizontal transfer—implying life's early emergence was both rapid and complex.
Puzzle:
The complexity and timing are "super puzzling," especially given the short span between Earth's habitability and the appearance of advanced cellular mechanisms. (05:45, 20:19)

"That very early life was already very sophisticated... had some basic immune system and, you know, complicated metabolism." (05:45)
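
The dating logic described in this section (more sequence differences imply a more distant common ancestor) can be illustrated with a toy molecular clock. This is only a minimal sketch: the two sequences and the substitution rate are made-up values, the function names are hypothetical, and real LUCA estimates rely on far more elaborate phylogenetic models than a constant-rate clock.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(a != b for a, b in zip(seq_a, seq_b))
    return diffs / len(seq_a)


def divergence_time(distance: float, rate_per_site_per_year: float) -> float:
    """Years since the common ancestor under a constant-rate clock.

    The factor 2 accounts for changes accumulating along both lineages.
    Valid only for small distances, where repeated hits at a site are rare.
    """
    return distance / (2.0 * rate_per_site_per_year)


# Hypothetical aligned gene fragments from two lineages (illustration only).
SEQ_A = "ATGGCCATTGTAATGGGCCGC"
SEQ_B = "ATGGCTATTGTTATGGGACGC"
ASSUMED_RATE = 1e-9  # assumed substitutions per site per year

d = p_distance(SEQ_A, SEQ_B)
print(f"distance = {d:.3f}, divergence ~ {divergence_time(d, ASSUMED_RATE):.2e} years")
```

The point of the sketch is the direction of the inference: genes shared across bacteria, archaea, and eukaryotes that differ more push the estimated common ancestor further into the past.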

4. Role of AI and Mechanistic Models

Mechanistic Models:
Traditional small-scale models allow for clear predictions and insights; AI offers massive scale, but risks becoming a 'black box.' (09:03)
Kolmogorov Complexity:
By applying measures like Kolmogorov complexity (‘the size of the shortest computer program describing a cell’), Endres estimates how much information must accumulate to build a cell, and thus how plausible it is for life to emerge within a given timespan. (09:03)
Current AI Limitations:
While AI can explore vast hypothesis spaces, true conceptual breakthroughs ("abductive leaps") remain uniquely human, possibly due to qualities like passion, embodied purpose, and awareness of finiteness. (12:47, 13:58)

"...although current AI systems excel at synthesizing knowledge ... their creativity remains largely one of rearrangement rather than genuine conceptual invention..." (12:47)

"A computer wouldn't care about the origin of life. So I think the human input is actually very important in that sense that we select questions we care about where we have meaning and what we value." (13:58)
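
The information-rate argument in this section reduces to simple arithmetic: divide the information content of a minimal cell by the available time window. The sketch below is a back-of-the-envelope illustration, not the equation from the paper. The time window follows from the ages mentioned in the episode (Earth at about 4.5 billion years, LUCA at about 4.2 billion years); the information content is a placeholder assumption, and all names are hypothetical.

```python
def required_information_rate(total_bits: float, window_years: float) -> float:
    """Lower bound on the average rate (bits per year) at which information
    must accumulate to reach total_bits within window_years."""
    if window_years <= 0:
        raise ValueError("window_years must be positive")
    return total_bits / window_years


# Placeholder assumption, NOT a figure from the episode or the paper.
ASSUMED_CELL_BITS = 1e9
# Roughly 4.5 billion years (Earth's age) minus 4.2 billion years (LUCA).
ASSUMED_WINDOW_YEARS = 3e8

rate = required_information_rate(ASSUMED_CELL_BITS, ASSUMED_WINDOW_YEARS)
print(f"required average rate: {rate:.2f} bits per year")
```

Whether nature can sustain such a rate is the feasibility question Endres raises; changing either input shifts the bound proportionally.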

5. Earliest Evidence & The LUCA Puzzle

Western Australia Fossils:
Some of the oldest reported microfossils, found in ancient basalts, are dated to almost 4 billion years ago (19:03).
LUCA’s Features:
CRISPR systems, ATP synthesis, horizontal gene transfer, and advanced metabolism—all emerged very early, further deepening the puzzle.

"We're not talking about some simple replicator 4.2 billion years ago. We're talking about a sophisticated microbe, which is puzzling to me and mind blowing." (20:19)

6. Are We Alone? (Drake Equation & Exoplanets)

Statistics of Life in the Universe:
With ~10²⁴ exoplanets potentially in the universe, even a single instance (Earth) makes the emergence of life plausible elsewhere, though this is fraught with "zero times infinity" statistical pitfalls. (23:17)
No Easy Answers:
The discussion highlights the limits of statistical inference when based on a single sample.

"It doesn't really answer for easy questionnaire because there's only one data point." (23:17)
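
The "one data point" argument can be written out as a short calculation. The sketch below assumes a Poisson model, which the episode does not spell out, and takes the per-planet probability to be one success in 10²⁴ planets, so the expected number of life-bearing planets is one. The names `prob_at_least`, `N_PLANETS`, and `P_LIFE` are illustrative.

```python
import math


def prob_at_least(k: int, lam: float) -> float:
    """P(X >= k) for a Poisson random variable with mean lam."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below


N_PLANETS = 1e24          # exoplanet count quoted in the episode
P_LIFE = 1.0 / N_PLANETS  # assumed: one known success among N_PLANETS
lam = N_PLANETS * P_LIFE  # expected number of life-bearing planets (about 1)

# Probability of two or more life-bearing planets: 1 - P(0) - P(1) = 1 - 2/e.
p_two_or_more = prob_at_least(2, lam)
# Same question, conditioned on the fact that at least one (Earth) exists.
p_given_one = p_two_or_more / prob_at_least(1, lam)

print(f"P(at least 2)               = {p_two_or_more:.3f}")
print(f"P(at least 2 | at least 1)  = {p_given_one:.3f}")
```

Under these assumptions the unconditional value is 1 - 2/e, roughly a quarter. The fragility is visible in the inputs: the result depends entirely on multiplying an enormous number by a tiny probability that was itself inferred from a single observation.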

7. Emergence: Simple Rules, Complex Outcomes

Scientific Principle:
Following thinkers like Kauffman and Phil Anderson ("More is different"), Endres gives examples from flocking birds and superconductivity to illustrate how simple, local laws can generate complex, emergent order. (26:12)
Levels of Description:
Not everything must be derived from first principles; emergent laws at different scales suffice.

"At every level we find new laws and we don't have to deal with all the complexity." (26:12)

8. Siloed Disciplines in Origin of Life Research

The fragmentation of research between fields (biologists, physicists, chemists, astrobiologists) has delayed progress, as each discipline gravitates to its own favored explanation, e.g., the RNA World, metabolism-first, or compartmentalization. (28:53)

"Every little community has its own solution to the problem because it's such a multifaceted problem." (29:13)

9. Kolmogorov Complexity, Assembly Theory, and the “Assembly Desert”

Kolmogorov Complexity:
"The smallest code to produce a cell" gives a handle on the minimal information needed to 'assemble' life (32:35).
Assembly Theory:
The 'difficulty' of assembling complex structures matters—a 3D cell can’t be put together in just any order (35:29).
Assembly Desert:
There is a "desert" of low-probability intermediate-complexity structures between simple inorganics and biological molecules—a major hurdle for abiogenesis (38:24).

"You have actually a desert of extremely low complexity index ... that's a severe constraint on the emergence of life." (38:30)
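
Kolmogorov complexity itself cannot be computed exactly, but the length of a compressed file gives a practical upper bound on it. The sketch below uses `zlib` as a stand-in for "the shortest program", which is an assumption made for illustration and not the estimation method used in the paper. The two byte strings are synthetic.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of zlib-compressed data: a crude upper bound
    (up to an additive constant) on the data's Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


random.seed(0)
# 10,000 bytes with an obvious short description ("repeat ACGT 2500 times").
repetitive = b"ACGT" * 2500
# 10,000 bytes drawn independently from the same alphabet, no built-in pattern.
scrambled = bytes(random.choice(b"ACGT") for _ in range(10_000))

print("repetitive:", compressed_size(repetitive), "bytes")
print("scrambled: ", compressed_size(scrambled), "bytes")
```

Both inputs have the same length and alphabet, yet the regular one compresses to a small fraction of the other. That gap is the sense in which "the smallest code to produce a cell" measures information rather than raw size.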

10. The Improbability of Random Emergence

For life to emerge via purely random processes, "it would take a hundred trillion universes stacked end to end at the high end, to 10 million times the universe's current age at the low end." (39:25)
Necessity for Directed Processes:
Progress towards complexity needs "immense persistence" or non-random, energetically driven processes (39:56).

"This random process is a very inefficient process because the progress essentially goes as a square root of time. ... you need a lot of persistence, basically." (39:56)
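
The square-root scaling mentioned in the quote is the standard behavior of an unbiased random walk: after t steps the typical distance from the start grows like the square root of t, not like t. The simulation below is a generic one-dimensional walk used only to illustrate that scaling; it is not a model of prebiotic chemistry, and the function name and parameters are illustrative.

```python
import math
import random


def rms_displacement(steps: int, walkers: int, seed: int = 0) -> float:
    """Root-mean-square distance from the origin after `steps` unbiased
    +1/-1 steps, averaged over `walkers` independent walks."""
    rng = random.Random(seed)
    total_sq = 0
    for _ in range(walkers):
        position = sum(rng.choice((-1, 1)) for _ in range(steps))
        total_sq += position * position
    return math.sqrt(total_sq / walkers)


# Quadrupling the number of steps should roughly double the distance covered.
for steps in (100, 400, 1600):
    simulated = rms_displacement(steps, walkers=1000)
    print(f"steps={steps:5d}  simulated={simulated:6.1f}  sqrt(steps)={math.sqrt(steps):6.1f}")
```

Reaching a target ten times farther away therefore takes about a hundred times longer, which is why undirected progress demands the "persistence" Endres describes.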

11. Directed Panspermia and the Question of Terraforming

Crick’s Hypothesis:
The idea that life on Earth could have been seeded (terraforming) by advanced civilizations is scientifically possible but remains speculative, and would merely "outsource the problem." (42:48)

"Directed panspermia by, you know, aliens, let's say, is physically possible...But say, of course one should be careful in order to stay up, stay away from, let's say, you know, sci fi, it has to be scientific." (42:48)

12. Caution about Tools and AI: Understandability and Risk

Limits of Understanding:
Endres warns of the risk that, like the parables of Gödel and Turing, we could end up with explanations we cannot grasp—"an ape before a lightning struck fire." (45:13)
AI Black Boxes:
He invokes Douglas Adams to illustrate the perils of inscrutable answers from "black box" models (e.g., the number 42) and stresses the need for scientific explanations that retain human understandability (46:11).

“...if thinking is outsourced for too long because AI tools take over, ... there are some dangers that we really are sort of puzzled by the answers and we don't understand them anymore.” (46:11)

Notable Quotes & Memorable Moments

| Timestamp | Speaker | Quote |
|-----------|---------|-------|
| 02:18 | Robert Endres | "You want to answer the big questions and you wonder where we came from and all these six big important questions... I wanted to work on something super interesting." |
| 05:45 | Robert Endres | “That very early life was already very sophisticated... had some basic immune system and, you know, complicated metabolism.” |
| 12:47 | Greg McNiff (reading Endres) | "...although current AI systems excel at synthesizing knowledge ... their creativity remains largely one of rearrangement rather than genuine conceptual invention..." |
| 13:58 | Robert Endres | "A computer wouldn't care about the origin of life. So I think the human input is actually very important in that sense that we select questions we care about where we have meaning and what we value." |
| 20:19 | Robert Endres | "We're not talking about some simple replicator 4.2 billion years ago. We're talking about a sophisticated microbe, which is puzzling to me and mind blowing." |
| 26:12 | Robert Endres | "At every level we find new laws and we don't have to deal with all the complexity." |
| 38:30 | Robert Endres | "You have actually a desert of extremely low complexity index ... that's a severe constraint on the emergence of life." |
| 39:25 | Greg McNiff | "...it would take a hundred trillion universes stacked end to end...without immense persistence, life's emergence becomes cosmologically implausible..." |
| 42:48 | Robert Endres | "Directed panspermia by, you know, aliens, let's say, is physically possible...But say, of course one should be careful in order to stay up, stay away from...sci fi, it has to be scientific." |
| 45:13 | Greg McNiff (reading Endres) | "...a living parable of Gödel's incompleteness and Turing's undecidability. Systems entangled in their own logic, unable to fully explain themselves." |
| 46:11 | Robert Endres | "...if thinking is outsourced for too long because AI tools take over...there are some dangers that we really are sort of puzzled by the answers and we don't understand them anymore." |

Key Timestamps

02:18 — Motivation for the paper and questions about the origin of life
04:14 — Differentiating abiogenesis from evolution
05:45 — Evidence for LUCA’s age and puzzling complexity
09:03 — AI, mechanistic models, and Kolmogorov complexity
13:58 — Human creativity vs. AI in understanding life’s emergence
19:03 — Fossil record and early microfossils
20:19 — Details on LUCA and its advanced mechanisms
23:17 — Estimating the likelihood that life exists elsewhere
26:12 — Simple rules and the emergence of complexity
29:13 — Effects of disciplinary silos
32:35 — Kolmogorov complexity and informational requirements
38:30 — The "assembly desert" and hurdles in abiogenesis
39:25 — Improbability of life’s random emergence
42:48 — Directed panspermia and scientific caution
45:13, 46:11 — Final warning about black-box models and the importance of human understanding

Tone and Style Notes

Engaged, thoughtful, and interdisciplinary: The tone reflects the immense complexity of the topic but aims to make it approachable through analogies (e.g., chess, bird flocks, puzzles) and careful explanation.
Reflective and occasionally philosophical: Both host and guest discuss not just technical aspects but the limits of knowledge, the role of human curiosity, and the philosophical implications of powerful AI.

In Summary
This episode offers a deep dive into the mysteries of life’s origin, the likelihood (or unlikelihood) of abiogenesis, the roles humans and AIs have in solving such problems, and the need to ensure that, as our explanatory tools become more powerful, we retain the capacity to understand the answers they provide.


Transcript

[00:29]
B
Welcome to the New Books Network. I'm your host Gregory McNiff and I'm excited to be joined by Robert Endres, the author of a recent paper, "The Unreasonable Likelihood of Being: Origin of Life, Terraforming, and AI." Professor Robert Endres leads the Complex Adaptive and Living Matter Group, which focuses on the quantitative understanding of sensing and signaling, and co-directs the Physics of Life Network of Excellence at Imperial College London. He also designed and introduced the Master's program in Systems and Synthetic Biology and is highly involved in training undergraduate students in systems biology. Before joining Imperial College in 2007 as senior lecturer, Robert was a postdoctoral researcher with Professor Ned Wingreen of the Molecular Biology Department at Princeton University. At Princeton, his main research accomplishments included advancing the understanding of the remarkable signaling properties of bacterial chemotaxis and developing atomistic predictions of protein-DNA binding sites. I selected this article, this paper because it tackles one of the biggest questions in science in a clear and quantitative way, namely, what is the origin of life and how did it happen? Professor Endres brings together physics, chemistry and information theory to sort of present an overall thesis. He also incorporates the use of AI and assembly theory to frame the origin of life problem as something fresh and thought provoking, which the paper definitely is. Robert, thank you for joining me today to discuss your paper.
[02:07]
A
Yes, thank you Greg for the nice introduction and I'm happy to discuss.
[02:10]
B
Perfect. Robert, I ask all my guests, first of all, why did you write this article and who is the target or I should say this academic paper and who is the target audience?
[02:19]
A
Yeah, I mean it's a good question. So I mean sometimes I wondered myself so last summer I had some time and I thought your academic life has become so hectic and stressful. I mean there's so many obligations, you know, funding acquisition, grant writing, teaching preparations. I don't know, there's a lot of admin as well. And I mean a lot of the things we originally thought we are getting into science for play sort of a minor role nowadays. So I know when you get originally when you want to get into science, you want to answer the big questions and you wonder where we came from and all these six big important questions. And then during your career you make a lot of compromises and then you end up working on something which is more or less a bit dictated by the current climate and research and academia. There are certain topics which are very hot and people generally work on, from cancer to antimicrobial resistance and climate research and so on. And then these interesting topics you originally got into science for all of a sudden play a minor role. So I thought, you know, this is. So I thought, you know, last summer when I had some time, I wanted to work on something super interesting. And then I have to say, I work in biological physics, so I'm a physicist by training, working on biological problems. And whenever physicists work at the interface to biology, these kind of questions emerge very generally and, and very often. So we ask a physicist very simple question, generally, you know, what is the principles of living matter compared to non living matter? And once you ask these kind of questions and you go after these minimal answers and principles, you know, since these kind of things emerge automatically, basically. So yeah, it came quite naturally.
[04:01]
B
Yeah, no, it's definitely a big idea paper and I have a few follow ups here. First of all, the paper seems more focused on abiogenesis than the process of evolution. Is that correct?
[04:15]
A
Yeah. So the idea is, you know, when you work on the physical principles of the origin of life, you ask these sort of questions of ab initio emergence of the first protocell. And so the idea with evolution, according to Darwin, that would then happen afterwards. So first you need something which replicates, which mimics what a cell is about before having an actual cell. And then once you have a cell with all the complex machinery with heritable information in the form of DNA, then you can have Darwinian evolution with random mutation and a selection based on phenotypes: how well an organism can survive in its environment, how well it can reproduce, and then the offspring can inherit these features and can evolve and adapt to its environment. So as a physicist, you would work before all that happens, before the biology kicks in. And you also have these dogmas in biology saying nothing makes sense except in the light of evolution, or cells come from cells, these kind of statements. So that makes it a bit hard then to understand where the origins or how the origins work. But then as a physicist, it comes quite naturally. You ask what is a minimal replicator and where does it come from based on physics and chemistry. So that comes basically before standard biology and evolutionary biology.
[05:36]
B
Okay. And then I want to get basically right into the crux of this: what is our basic understanding today of the origin of life?
[05:45]
A
Yeah, it's a good question. I mean, it's a difficult question in a sense that it touches on a lot of different disciplines from geology and of course evolutionary biology and so on. And of course everything happened more or less 4 billion years ago. But I mean, what we understand is basically there are fossils, there's a fossil record. Of course, these are very subtle fossils from microorganisms, so microfossils. So they are dated to around 3.5 to 3.8 billion years ago. So it's more or less hard evidence. But then there are also bioinformatics studies, basically backtracking how old life is or where it came from in terms of how much different domains of life, like bacteria and archaea and eukaryotes, how they differ. And basically the difference or the mutations indicate an evolutionary distance going into the past. You know, there are mutations, things change, genes which code for proteins, and the more difference you have, the more you go essentially in the past. And then you can look for common denominators and the common denominators which are shared between the different domains of life. They would then indicate something like the last universal common ancestor, or basically our super old ancestor. And that is dated. I mean, it's super interesting stuff. I mean, it involves lots of data analysis and of course, bioinformatics tools. And that is dated back to almost 4.2 billion years ago, I think. So this is remarkable. And that also, that was one of the sort of smaller reasons why I ultimately wanted to write this, because to me it seemed like super puzzling. Yeah, 4.2 billion years ago is essentially right after the late heavy bombardment by asteroids of the planet. And life couldn't have started essentially earlier. This is more or less the earliest life could have evolved. And the other fascinating thing is, I mean, the main fascinating thing is it's not only early, but this is a rather complicated organism, the last universal common ancestor.
So it shares a lot of things with modern microorganisms and that indicated, you know, that very early life was already very sophisticated and we can talk more about it, but had some basic immune system and, you know, complicated metabolism. And it indicated, you know, there was a rather complicated community of microorganisms back then, where you had shuffling of genes between organisms by horizontal gene transfer. So there are a lot of advanced innovations which didn't come much later. They were there very early and so that was super puzzling to me and I thought I have to do something about it or learn more about it.
[08:35]
B
Perfect answer. Robert, I want to ask you, you referenced AI and in this paper you talk about the role of. Against this backdrop of conceptual and disciplinary constraints, one thing is certain. With the rise of powerful AI tools and mechanistic models, we now have entirely new ways of exploring biological complexity. Could you talk about maybe how AI and these mechanistic models can help us better understand the origin of life?
[09:04]
A
Yeah, I mean, mechanistic models are essentially what we'd normally aim for when we do theory in biology, or generally theory in physics, when we do mathematical modeling. So mechanistic modeling is very important because it guides expensive and time consuming experiments. It provides insight into what's happening in experiments. And so those are normally very simple models which make predictions and they're tested by experiment and so on. This is a scientific cycle, basically that's a normal scientific process. And then of course, with the onset of AI, maybe sort of 15 years ago, then of course all of a sudden we had very big models different from these small mechanistic models. So the good thing about small mechanistic models is they're small and they don't have many parameters and that makes them quite predictive and they make lots of important statements. The AI models are very different, they have millions of parameters. And of course there's a bit of the danger, they work very well, but there's a bit of the danger that the understanding and the insight and the interpretation is lost in these big black box models. But you know, of course they are fascinating and they can do fantastic stuff. So how I mentioned it. So of course these mechanistic models we have, and they're being currently further developed, but then we have these very big models nowadays and they can help us also understand the complexity of a living cell. So for instance, you know, I have this one equation, some fundamental equation in my paper which basically describes sort of the lower bound of how information has to accumulate to make something complex like a cell. It's an information rate. So that information has to have accumulated in the time window we observe from fossil records and so on, and on the last universal common ancestor.
So we have basically information complexity of a cell describing a minimal cell which you can estimate and then a time window. And this is sort of the minimum amount of information we have to accumulate to make a cell. And if that. Yeah, so that's an important lower bound. And can we achieve it basically. Or can nature achieve it or not? This is an important question. And so if you want to estimate the complexity of a simple cell, well, it's not so simple actually. Then we can use these big programs which appeared during the last few years. This is sort of virtual cells, meaning it's a very detailed model of cells. And we can estimate the complexity of a cell with something like. It's called Kolmogorov complexity. It's basically estimating how big the program is to describe it. That's quite a simple estimate. And then, you know, these complicated models, they have a purpose for them to estimate it. And AI models are that type of model, so many parameters, they become very close to actual cells in terms of complexity. And that of course allows people also to estimate things like, or to test perturbations or drugs on these cells without having to do actual experiments. So that's sort of a new development over the last 10, 15 years. And that can help us constrain how much information is in a cell, how much is needed to make a cell in a certain time window. And is it feasible? If it's feasible, then we have a good case for abiotic emergence of protocells. And if it's not feasible, then we have to think about other alternative scenarios.
[12:47]
B
Okay, I definitely want to ask you about an alternative scenario because you cite one, including by a former Nobel Prize winner. That's fascinating, but I just want to hit you with a follow up on AI. Here you acknowledge AI may itself play a decisive role. But you then say, although current AI systems excel at synthesizing knowledge and exploring enormous hypothesis spaces, their creativity remains largely one of rearrangement rather than genuine conceptual invention. I am personally skeptical that deep scientific insight can be achieved without the qualities that shape human creativity: purpose, passion, embodiment and awareness of our own finiteness. And you suggest there still may be a role for the lone genius driven by irrational motivations. And then you actually go on to say future AI may either approximate such motivational structures or overcome their absence through overwhelming exploratory capacity. For now, however, the kind of abductive leaps required to explain life's emergence remain uniquely human. Do you expect that to be the case? And what is that uniquely human element that allows us to understand this emergence of life?
[13:58]
A
I mean, yeah, that's. It's not an easy question to answer. So I mean, there's a different ways of thinking about AI essentially. So, you know, say biology is very complex biology. And of course, if you talk about the origin of Life, you would think this is also a very complex process which lead to it, which led to the origin of life. And then one school of thinking would be, because it's so complex, the answer has to be also very complicated. And so of course, if it's a very complicated answer, then maybe it is required that we have tools like AI, which are very good, dealing with lots of different and large data sources, and the human can't process that information. It's just too complex and we need AI for that purpose. This is not really normally how we think of complex problems in science. Normally we think of complex problems in science that we can actually reduce essence to it. So let's say underlying physical, simple principles. And that's how science has worked over the last hundreds of years. And that has been very successful, finding these simple principles. And that has also to do with the idea of emergence, that complex things can emerge even if the underlying principles are very. Or ingredients are very simple. And this basically the whole scientific thing that at every level we find new laws and we don't have to deal with all the complexity. And even if it's very complex, there's principles we can normally understand as a human being. Yeah. And then of course, that makes it, I think the question is quite hard to answer because it touches on a lot of different things. What do we mean by computation and intelligence? And is AI actually doing it sort of also resonating in that question, or is it more like a toolbox? Which is. And the danger here is to understand if it's interpreted with the answer. Basically, if you can understand the answer, then of course there's a question. 
Do we need some sort of human like insight to answer these very difficult questions or these fundamental questions? So, I mean, that's a fascinating question. I mean, to me it's a bit like the question, you know, people thought some decades ago, you know, once you have a computer which can play chess and beat a human being. So a few decades ago, before Garry Kasparov was beaten, that would be a marvelous development. And now machines are intelligent and human like. But, you know, this is certainly not true. I mean, after Garry Kasparov was beaten, I don't know, it was, I think around 1990, there was a bit of a wave of excitement. But then, so what, you know, it's a machine which can search through different algorithms, have decision trees. The trick is maybe to evaluate positions on the chessboard. But then it's not super surprising that a computer can brute force calculate all kinds of stuff and come up with the best answer. I can come up with a very simple algorithm right now, which is a good chess algorithm which can win games. This is basically just. I'm deviating slightly from the question. But if you have an algorithm which promotes moves which allow you to have many future moves, that's a good thing because then you have lots of possibilities in the future. If you have 0 moves left, you're checkmated and it's over. So you don't even need to have insight into the game of chess to have a good algorithm. So I don't think a computer winning against a human in chess all of a sudden means that this is intelligent. So that has changed, I think, in a way, in terms of. So what I wanted to basically say is a computer can search through all kinds of scenarios, but a computer can't provide a value for information it's producing, so it can't make decisions which we care about. And I think the human mind is important to understand what is important to us. 
And of course, questions of our finiteness and motivation through which we go through life, I think quite important to make important decisions. So there are some human aspect, I think, which drives us to understand this. A computer wouldn't care about the origin of life. So I think the human input is actually very important in that sense that we select questions we care about where we have meaning and what we value. And a computer is completely agnostic to it. So I think it's a fundamental human question. To answer the question briefly.
[18:38]
B
No, great answer. And I want to circle back later to this idea of do we understand the tools? Because you end the paper on sort of a warning that, hey, these are powerful tools, but we need to make sure we understand how to use them and interpret them. However, I do want to get into the crux of this paper here. Can you talk about when we think life first appeared on Earth? I believe in Western Australia?
[19:04]
A
Sorry. I mean, that's one of the first fossil records you're talking about, right? I mean, I don't know the exact age of these rocks. I think it's a time window or these fossils. So these are basically very old basalt rocks which are almost as old as a planet. And the fossils in there basically dated almost 4 billion years ago. So it's not like we have now particular. You know, we can't. We can't do sort of forensic science. How we would see it in a. In a sort of crime movie or on TV or anything like that. Everything is basically destroyed. It's more about the Shapes of these fossils which remind people of bacteria, even existing bacteria nowadays, modern bacteria, microbes. So, so these are one of the earliest pieces of evidence pointing to a very early start of the origin of life on Earth.
[19:55]
B
And, and I want to then move on to this idea of the last universal common ancestor. We think that appeared about 4.2 billion years ago and it was still pretty complex. Is that right? I mean, it had an anaerobic system, was metabolically similar to modern prokaryotes and equipped with ATP synthesis. Is that accurate? How do you interpret that?
[20:20]
A
Yeah, I mean that's what I mentioned earlier on in the interview. That was one of the sort of smaller reasons why I ultimately then started working on this or I wanted to learn more about it because that was essentially, I thought, marvelous. Yeah, I mean if life would have arisen 3.5 billion years ago and Earth is 4.5 billion years old and you know, maybe that 1 billion years is enough for life to emerge. But going back and back in time, the more we know now, 4.2 billion years ago, which I said is right after the late heavy bombardment by meteors. That is fascinating. And the organism apparently which lived back then was sophisticated. It had basically the CRISPR-Cas system, which is now a gene editing tool, which is a very well known tool now, but that existed already. This is essentially an adaptive immune system of the early microbes. So it basically allows. It also implies that cells were already bombarded by viruses. So what they can do is basically when they are attacked, they can take out bits of DNA and store it in their own DNA in their own genome, which is a kind of a memory. And that basically can then be activated and used later to defend against future viral infections. So that's super sophisticated. There's horizontal gene transfer, which is another one. ATP synthesis just means ATP is an energy building block which is used by all cells. It's a small high energy molecule which has to be produced when the cell is feeding. And that is essentially the basis. All the motor proteins and everything is based on ATP, DNA replication. So we are not talking about some simple replicator 4.2 billion years ago. We're talking about a sophisticated microbe, which is puzzling to me and mind blowing.
[22:14]
B
Yeah. And we think Earth is only about 4.5 billion years old. And the fact that LUCA, the last universal common ancestor, may have, excuse me, lived around 4.2 billion years ago, with a protocell that emerged even earlier. Right? I mean, through either RNA or other replicator systems.
[22:34]
A
Yeah. So that's the fascinating thing: if LUCA was so early, 4.2 billion years ago, and was that sophisticated, that of course means that earlier versions of cells, these protocells, the very basic replicators which we as physicists would think of as the origin of life, had to be there before. Of course, we don't know when, because that wasn't the usual evolutionary biology which kept a record in the DNA sequence. So, yeah, it has to be even earlier. So that makes the problem even more puzzling. It's a fascinating problem which is puzzling me and many others.
[23:13]
B
No, absolutely. Are we alone in the universe?
[23:18]
A
Yes, that's a good question to have. I mean, there's essentially that famous Drake equation, which tries to estimate the number of civilizations based on how many suns there are with planets, how many planets are in the habitable zone, and so on. There have been more recent estimates based now on exoplanet data: planets which are in the habitable zone and have certain features, maybe certain atmospheric features, which we can detect. There's a lot of effort in detecting these, with maybe around 10 to the 24 or so exoplanets being estimated in the visible universe. And now we can make basic estimates from that. So there was recently a paper, I think by, what was his name, Frank Adams, I think, that basically estimated that it's relatively likely that there's at least one more civilization out there. And the argument was relatively simple. We basically know there are 10 to the 24 exoplanets, and we know at least one, Earth, which produced life. So the average rate of producing life is basically 1 over 10 to the 24 per planet, and 10 to the 24 planets times that probability gives an expected number of planets producing life, which is one. And now we can say there's a constant rate of that, and a constant rate means the underlying distribution of planets with life is Poisson distributed. Then one can estimate the probability of no life in the universe and the probability of exactly one planet with life, Earth, and one minus these probabilities is the probability of having at least two civilizations. And these guys found some probability of, I think, around 30%. But, you know, the tricky part is you're basically multiplying numbers which are essentially zero with numbers which are essentially infinite: 10 to the 24 versus 1 over 10 to the 24. So that's of course a bit ill defined. And you get these interesting answers out.
So on one hand it sounds optimistic: with a certain probability, yes, there's another civilization. But if you take the exact number, what it means is that it's based on one exoplanet having produced life, 1 over 10 to the 24, and a bit of playing around with the numbers. So yeah, it doesn't really answer the question easily, because there's only one data point.
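The arithmetic described in this answer can be sketched in a few lines. This is a minimal illustration, assuming a Poisson distribution for the number of life-bearing planets and the round figures quoted in the conversation; the variable names are hypothetical, and it is not a reproduction of the paper Endres mentions.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Round numbers quoted in the conversation (assumptions, not measurements):
n_planets = 1e24                         # estimated exoplanets in the visible universe
p_life_per_planet = 1.0 / n_planets      # rate inferred from the single known case, Earth
expected_living_planets = n_planets * p_life_per_planet   # close to 1

p_none = poisson_pmf(0, expected_living_planets)
p_exactly_one = poisson_pmf(1, expected_living_planets)
p_two_or_more = 1.0 - p_none - p_exactly_one
# Same quantity, conditioned on the fact that at least one living planet exists.
p_two_or_more_given_one = p_two_or_more / (1.0 - p_none)

print(f"expected living planets: {expected_living_planets:.3f}")
print(f"P(at least two): {p_two_or_more:.3f}")
print(f"P(at least two | at least one): {p_two_or_more_given_one:.3f}")
```

With a mean of one, the first probability comes out near 26% and the conditional one near 42%, the same order as the "around 30%" recalled in the episode. Which variant the cited paper reports is not stated here, and the result rests entirely on a rate inferred from a single data point.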
[25:59]
B
Can you discuss how simple rules can lead to complex organized systems? And here I'm referencing the work, I think by Kauffman and maybe Sara Walker and a few others that we talked about earlier.
[26:13]
A
Yeah, I mean, that's essentially how science works. We understand the fundamental rules, how things interact, how atoms interact, how electrons interact in a metal, for instance. And then based on these very simple ingredients, complex things can emerge. For instance, if you talk about modeling in biology, you would say, let's describe a bird flock as collective behavior, these beautiful forms of bird flocks. How do they coordinate their movement? Is there a leader bird or not? Apparently not. You need only very few ingredients. You have basically a model of agents which are actively moving. And you need to know they're interacting only with their neighbors. They keep certain distances: they don't want to come too close, and they don't want to lose them either. They align their movement with their neighbors, just their vicinity. And these simple rules are enough to describe a bird flock. You don't need a leader bird to lead to this beautiful behavior. But that can be applied to any process in nature, so things like conductivity in a metal, or superconductivity. This is a more interesting example. Superconductivity was super puzzling. You have a metal which has no resistance anymore. It expels magnetic fields. And this is sort of counterintuitive because we know what goes in. We know electrons interact by Coulomb interactions, and they are scattering, and that leads to resistance. But, you know, under certain conditions, very small tweaks, small effects, can have big outcomes. And all of a sudden you get new collective behavior in a metal, a bit like the bird flock. So emergent behavior is abundant in nature. It's also one reason why science works, because we don't have to explain everything based on first principles. You know, we can say for cell biology it's enough to know molecular biology, and for molecular biology it's enough to know chemistry.
Or if we want to explain human behavior, we don't have to go back to quantum mechanics. So every level of hierarchy in nature has its new laws, which are emergent. Basically most laws are emergent. So that's fascinating. And it allows complex behavior to emerge based on very simple principles. There was this famous article by Phil Anderson from the 1970s or something, '72 maybe, called "More Is Different." He talks about how hierarchical nature is, that at each level there are new laws, and that we don't need to derive everything from first principles. It's actually impossible.
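The three flocking rules described in this answer (align with neighbors, stay near them, do not get too close) can be sketched as a toy simulation. This is a minimal illustration in the spirit of classic "boids"-style models, not the model from the paper; every function name, weight, and radius below is an assumption chosen for readability.

```python
import math
import random

def make_flock(n, seed=0):
    """Birds are (x, y, hx, hy): a position plus a unit heading vector."""
    rng = random.Random(seed)
    flock = []
    for _ in range(n):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        flock.append((rng.random(), rng.random(), math.cos(angle), math.sin(angle)))
    return flock

def step(birds, radius=0.5, too_close=0.1, speed=0.02,
         w_align=1.0, w_cohere=0.5, w_separate=2.0):
    """Advance every bird one tick using only information about nearby birds."""
    updated = []
    for i, (x, y, hx, hy) in enumerate(birds):
        align_x = align_y = cohere_x = cohere_y = sep_x = sep_y = 0.0
        neighbors = 0
        for j, (ox, oy, ohx, ohy) in enumerate(birds):
            if i == j:
                continue
            dx, dy = ox - x, oy - y
            dist = math.hypot(dx, dy)
            if dist >= radius:
                continue                  # birds outside the vicinity are ignored
            neighbors += 1
            align_x += ohx                # rule 1: align with neighbors' headings
            align_y += ohy
            cohere_x += dx                # rule 2: do not lose the neighbors
            cohere_y += dy
            if 0.0 < dist < too_close:    # rule 3: do not come too close
                sep_x -= dx / dist
                sep_y -= dy / dist
        new_hx, new_hy = hx, hy
        if neighbors:
            new_hx += (w_align * align_x + w_cohere * cohere_x) / neighbors + w_separate * sep_x
            new_hy += (w_align * align_y + w_cohere * cohere_y) / neighbors + w_separate * sep_y
        norm = math.hypot(new_hx, new_hy)
        if norm > 0.0:                    # keep the old heading if the sum cancels exactly
            hx, hy = new_hx / norm, new_hy / norm
        updated.append((x + speed * hx, y + speed * hy, hx, hy))
    return updated

def polarization(birds):
    """Length of the mean heading: near 0 for disordered headings, 1 if all agree."""
    sum_x = sum(b[2] for b in birds)
    sum_y = sum(b[3] for b in birds)
    return math.hypot(sum_x, sum_y) / len(birds)

flock = make_flock(30, seed=1)
print(f"polarization at start: {polarization(flock):.2f}")
for _ in range(200):
    flock = step(flock)
print(f"polarization after 200 steps: {polarization(flock):.2f}")
```

No bird is marked as a leader anywhere in the update rule; any collective order that appears in the polarization value comes from the local interactions alone.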
[28:53]
B
You talk about the search for the origins of life being siloed across academic disciplines. What has been the impact on this research into how life began, the fact that the disciplines were sort of siloed, fragmented, and not communicating with each other?
[29:14]
A
So on one hand it's a very positive thing that the origin of life touches on different scientific disciplines like chemistry, physics, biology, obviously geology, and so on. But on the other hand, traditionally a lot of scientific disciplines haven't interacted well with one another, or it takes effort. And so when certain disciplines try to describe something, that stays within the discipline: biologists focus on evolutionary aspects, physicists try to understand very simple physical principles, astrobiologists talk about other things again. So that has been recognized as one of the drawbacks. It's just because the origin of life touches on very different disciplines, and merging them is generally difficult. I mean, I work in biological physics, and merging physics and biology is fascinating, but it also has its challenges, you know, because physicists think very differently about things than biologists. But that also makes it fascinating: minimalistic physicists, complexity-embracing biologists, and that has worked quite well, I would say, even though it's challenging. But then going to the origin of life, it's even more so. There's more: there's chemistry, and there's of course a big part about where life happened, and that makes it even more challenging for the communities to discuss things with one another. And the thing is also, you know, every little community has its own solution to the problem, because it's such a multifaceted problem. Some theorists say it started with the so-called RNA world, because we need something like DNA with a sequence to store heritable information. RNA is beautiful because it can be used as a template to make a new RNA strand with the same sequence; the new strand can grow on the existing one, so it can be copied. So that touches on a lot of things. RNA can also catalyze reactions, like an enzyme, which is important in modern ribosomes.
So it covers one aspect and it works beautifully on certain questions, but then it doesn't cover other things. It still has a chicken-and-egg problem: where does a complicated RNA molecule come from? And then other communities talk more about how important metabolism is, which avoids the whole DNA or RNA problem. It's more about energy transduction and how things in a network catalyze each other and how that becomes self-sustained, which is lifelike. Other communities talk more about how important compartments are for making cells, mimicking early membranes, effectively to avoid dilution and separate things from the outside, to make it more cell-like. So every discipline had its own successes, which also hindered the communication amongst them.
[32:16]
B
Okay, I want to move into really the meat here. And I will say there's a fair amount of formulas, but you also walk us through and provide the conceptual background for those formulas. And with that as a pretext, what is Kolmogorov complexity and how does it help us provide an estimate of the likelihood of life emerging on Earth?
[32:35]
A
Yeah, so Kolmogorov complexity: I use this in one of the estimates, for the complexity of a protocell. So basically saying, to make a protocell of a certain complexity in the observed amount of time, let's say half a billion years, this is the minimal information rate I need to make a protocell. And the question is, does nature provide this? If so, we are fine with this abiotic emergence of protocells; if not, we have to worry about other mechanisms, potentially. And how do we estimate the complexity of a cell? You know, I mentioned relatively complicated programs nowadays, AI codes and so on, which mimic cellular behavior, either virtual cells, whole-cell models, but also these AI models for protein folding, which are very successful. So we can use these programs and the Kolmogorov complexity to estimate the complexity of a cell, at least parts of it, like the structural components, for instance. Basically, Kolmogorov complexity asks: what is the smallest computer code that produces an outcome? If something is very complex and can't be compressed easily into something simple, like simple building blocks, then it has a large Kolmogorov complexity. It's a bit like a random string of numbers. It can't be compressed because it's random, so we have to store the whole string. The program is long; you basically have to code each number. But if it's very reproducible, it can be compressed into repetitive sequences, let's say of numbers. Then the Kolmogorov complexity is very low. So essentially, once you have computer programs which are quasi-realistic, we can use them to estimate parts of it, such as the structural complexity of cells. The DNA complexity can be estimated just from the sequence. There are of course dynamic aspects as well, like metabolism and so on. So it can be used to estimate some of the complexity. And that was one input in my so-called rate-distortion theory formula.
So I basically said there's a minimum amount of information needed, based on this complexity of a protocell and the amount of time given. And then the left side is about the entropy, the combinatorial possibilities of complex molecules in the environment within their lifetime, basically, and how to assemble a protocell, what's possible based on that. And if that information rate on the left of the equation is larger than the minimum I had on the right side, then we are fine in terms of abiotic emergence of a protocell. And if what can be done by nature is too small, then we have a problem, basically.
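The contrast drawn here between a random string and a repetitive one can be made concrete with an ordinary compressor. Kolmogorov complexity itself is uncomputable, so the sketch below uses compressed size as a rough upper-bound stand-in; this is a common illustration, assumed here for clarity, and not the estimation procedure used in the paper. The example strings are made up.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """Bytes needed after zlib compression: a crude proxy for description length."""
    return len(zlib.compress(data, 9))

# A highly repetitive sequence: a short rule ("repeat ACGT") regenerates it.
repetitive = b"ACGT" * 2500

# A pseudo-random sequence of the same length: no short pattern for zlib to exploit.
rng = random.Random(0)
random_like = bytes(rng.getrandbits(8) for _ in range(len(repetitive)))

for label, data in (("repetitive", repetitive), ("random-like", random_like)):
    size = compressed_size(data)
    print(f"{label:12s} raw={len(data)} bytes  compressed={size} bytes")
```

The repetitive sequence shrinks to a tiny fraction of its raw size, while the random-like one stays at roughly its full length, matching the description of having to "store the whole string". The pseudo-random bytes come from a short seeded generator, so their true Kolmogorov complexity is actually small; the compressor simply cannot discover that, which is why compressed size is only an upper bound.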
[35:23]
B
And I want to get back to that. But quick follow up question. Could you talk about assembly complexity as well?
[35:29]
A
So Kolmogorov complexity talks about the complexity of something which is in front of you, for instance a cell or some structure you build, like a statue or a mosaic or a puzzle. And assembly theory talks about the difficulty of assembling something complex, in particular in 3D. So just describing the complexity by itself is not everything. You also have to be able to assemble it. And of course the thing also has to work while you assemble it. So you can't, for instance, build the membrane of a cell and then later on realize, oh, I should have put something inside. You have to make sure the small things assemble first, then the next layer assembles, and then the outer layer, and so on. So when something is very complicated, the assembly pathway gets more and more funneled, or linear in some sense. You can't simply assemble it in parallel, like a two-dimensional puzzle where you can start at every corner and assemble the pieces easily in any order you want. No, it's like a 3D puzzle where you have to make sure you put in the stuff in the middle first and then assemble the outer layers. So the path becomes very important for these complicated structures. And this is essentially assembly theory. The assembly index, which is part of the theory, is sort of a product of the complexity of the molecules you're trying to assemble times their abundance. In nature, let's say, what is very common is low-complexity molecules like CO2, H2, O2; simple molecules are highly abundant. So the complexity is very low, but the abundance is huge, and the theory would predict a complexity index which is quite high, based on the huge abundance. But then you only very rarely have complicated molecules. And because they're so rare, you don't have a big assembly index.
However, once you have life on a planet, and you have evolution and reproduction, you all of a sudden have lots of complicated molecules. So this complexity index can have another maximum, basically describing a signature of life on a planet, which can be useful: it can distinguish the signatures of chemistry on a nonliving planet from those of a living planet. A nonliving planet has a single peak, let's say, in this complexity landscape, from low-complexity, highly abundant molecules, and a planet with a biosphere would have another peak with high complexity and high abundance, because of amplification by cells and life on the planet.
[38:25]
B
And Robert, are you referring to the assembly desert there when you refer to these two peaks?
[38:30]
A
Yeah, so that was another term I used. Exactly. You can have one peak commonly on an abiotic planet, let's say, and then another peak, making it a bimodal distribution, on a planet with a biosphere. But in between you actually have a desert of extremely low complexity index. And this is a problem, because the question is how to cross to the other side. One could naively expect that if the emergence of life happens abiotically, then there should be mechanisms to make molecules of intermediate complexity, at least to a level where you can somehow get from one peak, the low-complexity peak, to the high-complexity peak. But it looks like there's very little in between, which is a puzzle. So that's a severe constraint on the emergence of life.
[39:25]
B
Depending on your assumptions around time, I believe you conclude that for a random diffusive process, it would take a hundred trillion universes stacked end to end at the high end, to 10 million times the universe's current age at the low end, thereby concluding, in other words, that without immense persistence, life's emergence becomes cosmologically implausible, potentially pointing to alternative mechanisms. What do you mean by immense persistence?
[39:57]
A
Okay, that's a good question. So there are basically, you know, two ways to make something complex. One would be random exploration, by adding things and removing things, and that can be a completely natural process: things diffuse around, bind temporarily, and then something else can bind. Complexity could grow; it could also fall apart again. And this random process is very inefficient, because the progress essentially goes as the square root of time. That's a characteristic of diffusion. So the progress is very slow; it levels off. It can be random but persistent, so there can be periods of persistence where you add more and more complexity before it falls apart. There's some sort of memory in it which makes it more persistent. So the more persistence you have, the better, in a way. But then on the other hand, you could also have something directed, a biased random walk if you like, where there's some sort of drift towards higher complexity. But of course these things don't happen naturally. Things don't drift naturally to higher complexity. So it would need to be in line with physical principles like thermodynamics, for instance. The second law of thermodynamics says that entropy in the universe has to increase, because it's a closed system. And so when you create order in one place, if you have progress in terms of order and complexity, that comes at the expense that even more entropy has to be produced somewhere else, so that the laws of thermodynamics are respected. So you need to couple it to an energetic process somehow, which dissipates and creates entropy. And that's of course going after the physical mechanism, which is an unknown chemical mechanism. But if it's just random exploration, which doesn't really cost energy, you know, things come together and fall apart again.
And exploring that way is very inefficient. It depends a bit on how long these intermediates persist, but it's still random. So you need a lot of persistence, basically. I estimated it at thousands of years at a minimum, maybe longer: thousands of years of persistence where you make progress before it falls apart again, so that you have more exploration. So it becomes very unlikely that this is how it works. As I said, to get something which goes towards complexity, you need something which dissipates, which is in line with thermodynamics.
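The difference between diffusive and directed progress described in this answer can be illustrated with a one-dimensional random walk, where position stands in loosely for "complexity gained". This is a toy sketch under that assumption, not the model from the paper; the function names, step counts, and the bias value of 0.6 are all hypothetical.

```python
import math
import random

def walk_endpoints(n_walkers, n_steps, p_forward, seed=0):
    """Final positions of independent walkers taking +1 or -1 steps."""
    rng = random.Random(seed)
    endpoints = []
    for _ in range(n_walkers):
        position = 0
        for _ in range(n_steps):
            position += 1 if rng.random() < p_forward else -1
        endpoints.append(position)
    return endpoints

def rms(values):
    """Root-mean-square distance from the origin."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def mean(values):
    return sum(values) / len(values)

for n_steps in (100, 400):
    unbiased = walk_endpoints(500, n_steps, p_forward=0.5, seed=1)
    biased = walk_endpoints(500, n_steps, p_forward=0.6, seed=2)
    print(f"steps={n_steps:4d}  "
          f"unbiased RMS distance={rms(unbiased):6.1f} (sqrt(t)={math.sqrt(n_steps):5.1f})  "
          f"biased mean distance={mean(biased):6.1f} (0.2*t={0.2 * n_steps:5.1f})")
```

Quadrupling the number of steps roughly doubles the typical distance reached by the unbiased walkers, since it grows as the square root of time, while the biased walkers' mean distance roughly quadruples. That gap is the inefficiency of purely random exploration; in the physical picture, the bias is what has to be paid for by a dissipative process.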
[42:36]
B
If it isn't random, one option you suggest is this idea that Earth was terraformed, or I think it's called directed panspermia, as the origin of life. How likely do you think that is?
[42:48]
A
Yeah, I mean, it was originally introduced by Crick and Orgel in the 1970s, I think, and they basically discussed how unlikely abiotic emergence of a protocell would be. And so they said that directed panspermia by, you know, aliens, let's say, is physically possible, and it can't be ruled out. But of course one should be careful to stay away from, let's say, sci-fi; it has to be scientific. So the key thing they did is to turn it into a testable hypothesis, or at least formulate the problem like that. They said, you know, how do we test this? I mean, it's very hard to test. But certain signatures might be indicative of directed panspermia. And that might be things like the universal genetic code. If life emerged naturally, you could imagine, say, very distinct life forms on this planet. Of course, that can also be dismissed relatively easily, by one successful life form taking over the whole planet because it just outcompetes other life forms. So that would be one thing. Or the fact, which I think they mentioned as well in their papers, that there are certain metals in enzymes which are trace metals, so they are not very abundant, but biology heavily relies on them for their catalytic abilities. So these are indicative things, as is the very early origin of life. But one has to be careful, of course, in terms of biases, because we came very late. If we are very late and it's very hard to evolve intelligent humans, let's say, then life had to emerge very early on. So there are these scenarios which make it sound like directed panspermia is reasonable, but of course it doesn't solve the problem, because it would just outsource the problem. And as I said, these kinds of scenarios can also be explained by regular, conventional science. So yeah, it's hard to estimate how likely it is.
It's more like, say, a last resort, if it turned out that abiotic emergence of life would be impossible or very, very difficult. It's hard to estimate how likely it is.
[45:14]
B
Yeah, I want to end with the last paragraph of your paper, in which you say we end with a note of caution. There is a real possibility that seeking to understand life origins, we become a living parable of Gödel's incompleteness and Turing's undecidability. Systems entangled in their own logic, unable to fully explain themselves. And you suggest we could stand dumbstruck like an ape before a lightning-struck fire. We must ensure that our tools, however powerful, can still speak in terms we understand. I feel like we're on the cusp of that danger with AI and these large language models that frankly, are evolving in ways we don't understand. Do you think we're past this point? I mean, is there still a chance to manage and control these tools? Because they are very, very powerful, and it seems like they're leading us in directions we don't have a map for.
[46:11]
A
Yeah, so I was just alluding to the fact that, of course, to explain something very complicated, we might need these big models to go through large amounts of data to come up with very complicated answers to these questions. But personally, I don't think that's really how nature works. I alluded to that earlier. And as you say, these complex AI models which have emerged are so complicated that they can be complete black boxes. Most people don't understand anymore how they work. So even if it comes up with answers, and I think I also cite Douglas Adams, you know, The Hitchhiker's Guide to the Galaxy, where it comes up with some answer after millions of years, 42, I think, in the book, then there's some danger that one doesn't even understand the answer anymore, or maybe has even forgotten the question. It's all part of the black box. But, you know, I personally don't worry too much about it. I think proper science reflects how nature works: hierarchical, based on principles that can be understood, even if you can't recapitulate the exact path, the assembly path, to a first protocell. That might be very difficult, maybe impossible, or maybe some AI tool comes up with some answer. But I think we can understand the principles at least, so that we have a predictive theory: if these conditions are fulfilled, there's a high chance to produce something which is lifelike. I think one can hope for the principles. So I personally don't think there's a danger that one doesn't understand the answer anymore. But if you outsource thinking, let's say for future generations of students, if thinking is outsourced for too long because AI tools take over, as in education, and one starts forgetting how to answer critical questions, how to think, then of course there is some danger that we really are puzzled by the answers and don't understand them anymore.
But I think that a lot of things have to go wrong to go down that path.
[48:19]
B
Oh, well, that's an optimistic outlook and a great way to end our interview. Robert, thank you so much for taking the time to join me to discuss this fascinating paper. I look forward to the next iteration and hopefully a book at some point. Thank you very much for your time.
[48:36]
A
Thank you very much. It was my pleasure.