Summary

Machine Learning Street Talk (MLST) — Episode #75

Emergence [Special Edition] with Dr. Daniele Grattarola
Date: April 29, 2022

Episode Overview

In this special edition, the MLST hosts are joined by Dr. Daniele Grattarola to deeply explore the concept of "emergence" as it appears in complex systems, AI, and in particular, through the lens of cellular automata and their generalizations to graphs. The episode weaves together foundational philosophical distinctions (weak vs. strong emergence), insights from leading scientists and philosophers, and cutting-edge research at the intersection of neural networks, biology, and computation. The conversation is rigorous yet exploratory, traversing cellular automata (CA), neural CA, morphogenesis, graph-based systems, and prospects for robust, self-organizing, self-healing AI systems.

Key Discussion Points & Insights

1. Defining Emergence

[00:58–08:00]

Levels of Abstraction:
Host 2 discusses how phenomena at higher abstraction levels (like trust, culture) become hard to quantify, and parallels this to emergent phenomena in machine learning and society.
Specialization vs. Generalization:
Dr. Grattarola notes population-driven algorithms highlight specialization, diverging from ML’s default emphasis on generalization. Populations tend to produce many specialized "exotic" behaviors not seen in generalist models.

"The population implies... I want to see a lot of different things and like hyper specializations to all kinds of exotic things that... the generalist won't do." — Dr. Daniele Grattarola [02:20]
Philosophical Foundations:
Citing Melanie Mitchell, John Locke, and biologists, the hosts explain how simple agents, when aggregated, can show "superorganism" behavior — collective intelligence not reducible to individuals (e.g., army ants).
Reductionism vs. Relationism:
Tim summarizes connections to physics and philosophy, exploring how Western science’s reductionism contrasts with relationism, which emphasizes system interactions and context.

"[Reductionism] keeps chopping up things into smaller and smaller pieces... stark contrast with relationism." — Host 1 (Tim) [04:07]

2. Weak vs. Strong Emergence

[08:00–21:00]

Weak Emergence:
- Defined as new properties arising from the interactions of simple entities, visible only at scale and often derivable only by simulation (computational irreducibility).
- Mark Bedau’s definition: a macrostate P of a system S with microdynamics D is weakly emergent "if and only if P can be derived from D and S’s external conditions, but only by simulation."
Strong Emergence:
- Properties not deducible, even in principle, from lower-level facts; often connected to the "mystery" of consciousness.
Chalmers and Hossenfelder:
Professor David Chalmers and Dr. Sabine Hossenfelder debate the meaning and legitimacy of "strong emergence," especially in the context of consciousness and free will.
Example from Chalmers’s work:
- "He argues that consciousness isn't a logical necessity... he could imagine a universe... with the same physical laws where he would be a philosophical zombie." — Host 1 (Tim) [18:14]

3. Infinity and Computability in the Universe

[30:42–34:00]

Infinity vs. Unboundedness:
Keith and a guest debate whether actual infinity or just unboundedness exists in our universe, referencing Gödel’s results and the limits of language and computation.
Implication for Emergence:
Computationally irreducible systems, especially those involving real infinity, might always escape complete formalization — supporting "strong emergence" in principle.

4. Cellular Automata: Discrete, Continuous, and Universal Computation

[36:00–53:40]

Basics and Historical Context:
- Cellular Automata (CA): simple rule-based models that, through purely local computation, can yield enormously complex behavior.
- Conway’s Game of Life as a canonical example.
Morphogenesis and Self-Organization:
- Biological systems specify only simple developmental rules, which self-organize into complex, robust forms (the "genomic bottleneck").
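The local-rule idea above can be made concrete with a minimal sketch of Conway’s Game of Life. This is an illustration only, not code from the episode: the set-of-coordinates representation and the names `step` and `glider` are assumptions chosen for brevity.

```python
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (row, col) live cells."""
    # For every cell adjacent to a live cell, count its live neighbours.
    counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


# The standard five-cell glider (illustrative starting pattern).
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

# After four generations the same shape reappears one cell down and to the right.
shifted = {(r + 1, c + 1) for (r, c) in glider}
print(state == shifted)
```

Nothing in `step` mentions a glider; the moving pattern is visible only when the rule is iterated, which is the sense of emergence the hosts describe.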

5. Emergence in Neural Cellular Automata & Graphs

[47:01–57:23]

Alexander Mordvintsev’s Work:
Neural CAs can self-assemble (morphogenesis) and be trained to grow and recover desired global patterns (like images), with robustness and adaptability to perturbation.

"They've turned a self healing image generation process into an emergent phenomena..." — Host 1 (Tim) [47:01]
Decentralization vs. Centralization:
Spontaneous centralization (leader election) often emerges in decentralized systems (brains, societies), but because it emerges rather than being imposed, the system retains adaptability and robustness.

6. Dr. Grattarola’s Research: Learning Graph Cellular Automata

[50:20–105:03]

Graph Cellular Automata (GCAs):
Dr. Grattarola generalizes CA by defining the local rule not over a regular grid but over an arbitrary graph, using graph neural networks (GNNs) to learn the rule.
Morphogenesis on Arbitrary Graphs:
- Demonstrated with point clouds (e.g., forming the shape of a bunny) via only local updates governed by a GNN.
- Emergence happens as a distributed, decentralized process on arbitrary geometries.
- Notable quote:
  
  "Does a rule exist that, starting from a random configuration of points, actually morphs these points into this coherent shape?... Yes, this can be expressed as a process that, iteratively and locally, kind of grows the image into what we want." — Dr. Daniele Grattarola [98:11–98:54]
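As a rough sketch of what "a local rule over an arbitrary graph" means, the following applies one shared transition rule to every node of a small irregular graph. It is not Dr. Grattarola’s method: in his work the rule is learned by a graph neural network, whereas here a hand-written majority rule stands in for it, and the names `gca_step`, `majority_rule`, and the example graph are all hypothetical.

```python
def gca_step(adjacency, state, rule):
    """Synchronously update every node from its own state and its neighbours' states."""
    return {
        node: rule(state[node], [state[nbr] for nbr in adjacency[node]])
        for node in adjacency
    }


def majority_rule(own, neighbour_states):
    """Hypothetical hand-written rule (a stand-in for a learned GNN):
    adopt the majority value among the node and its neighbours."""
    votes = [own] + neighbour_states
    ones = sum(votes)
    if 2 * ones == len(votes):
        return own  # tie: keep the current state
    return 1 if 2 * ones > len(votes) else 0


# An irregular graph: a triangle (a, b, c) with a tail (c - d - e).
adjacency = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
state = {"a": 1, "b": 1, "c": 0, "d": 1, "e": 0}

for _ in range(3):
    state = gca_step(adjacency, state, majority_rule)
print(state)
```

The update never looks beyond a node’s immediate neighbours and does not assume a grid, so the same function runs unchanged on a point-cloud neighbourhood graph; swapping `majority_rule` for a trained network is the step this sketch leaves out.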

7. Universality, Computation, and Iterative Dynamics

[91:29–96:23]

Neural CAs and Computation:
The neural CAs described (both grid and graph-based) use complicated neural rules (learned, not hand-coded) and often operate via many iterations.
- "Without that iterative capability, without that kind of working space, without that temporal dynamics, you don’t get this kind of behavior." — Keith [95:13]
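To illustrate the point about iteration, here is a minimal sketch of an elementary cellular automaton in which one fixed local rule is applied repeatedly. Rule 110 is used as a stand-in because it is a well-known Turing-complete elementary rule; it is not the learned neural rule discussed in the episode, and the function name `eca_step`, the row width, and the step count are assumptions.

```python
def eca_step(cells, rule_number):
    """One synchronous update of an elementary CA with periodic boundaries."""
    n = len(cells)
    # Bit k of rule_number is the next state for the neighbourhood whose
    # (left, centre, right) bits encode the integer k.
    return [
        (rule_number >> (cells[(i - 1) % n] << 2 | cells[i] << 1 | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


# Start from a single live cell and iterate the same rule 16 times.
cells = [0] * 31 + [1]
for _ in range(16):
    print("".join("#" if c else "." for c in cells))
    cells = eca_step(cells, 110)
```

A single call to `eca_step` is a trivial lookup; the structured behaviour only appears over many iterations, which is the "working space" and "temporal dynamics" Keith refers to.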

8. Practical Implications & Future Directions

[109:17–114:38]

Robust, Self-Healing Systems:
Potential engineering applications: systems that can self-heal (drawing from morphogenesis), stay robust to changes, or coordinate in complex environments.
Where to Learn More:
Dr. Grattarola recommends resources:
- For cutting-edge CA/neural CA: Twitter and GitHub communities.
- For academic work on biological emergence: Michael Levin’s lab work (e.g. "xenobots").
- For visually engaging explanations: YouTube channel "Emergent Garden".
Theory Frontiers:
Quantifying emergence via entropy, exploring the edge of chaos (Langton), and the notion that simple rules at multiple system levels yield recurring emergence.
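The episode does not name a specific entropy measure, so the following is only one possible sketch: the Shannon entropy of the distribution of short windows in a one-dimensional cell row. The function name `block_entropy` and the window length are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def block_entropy(cells, block=3):
    """Shannon entropy (in bits) of the length-`block` windows of a periodic row."""
    n = len(cells)
    windows = Counter(
        tuple(cells[(i + k) % n] for k in range(block)) for i in range(n)
    )
    probs = [count / n for count in windows.values()]
    return sum(-p * log2(p) for p in probs)


print(block_entropy([0] * 16))      # a uniform row: only one window type occurs
print(block_entropy([0, 1] * 8))    # a period-2 row: two window types, equally likely
```

Very ordered rows score low and fully random rows approach the maximum of `block` bits; tracking where a system sits between those extremes over time is one hedged way to approach the "edge of chaos" question raised here.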

Notable Quotes & Memorable Moments

On the Surprising Power of Simplicity:

"There is absolutely no reason why this should work, like at all. There is nothing that we can observe that says that these kinds of rules should exist at all. This model, in principle, it's like it's too simple for it to actually work. But in fact it turns out that these models are... universal models." — Dr. Daniele Grattarola [91:29]
On Weak vs. Strong Emergence:

"Any endorsement of strong emergence is a rejection of physicalism and reductionism, which is to say an appeal to magic and esoterica. Whereas weak emergence can be used to support the physicalist picture of the world..." — Host 1 (Tim) [20:41]
On the Recurring Nature of Discrete and Continuous:

"As you move along scales of emergence or reduction, you keep coming across the need to either view things as a continuum or as a discrete spectrum... this alternating series... never converges." — Guest 2 (Keith) [76:26]
Neural Networks & Iteration:

"A neural network as it's typically conceived... by itself is not Turing complete. What you need is the ability to do this iterative computation... and that's exactly what we have in this work." — Keith [95:13]
On the Practical Side of Science:

"What I came to realize is that at some point it becomes a matter of solving the problem. Solving the problem is more important than the way you solve it, in a sense." — Dr. Grattarola [105:42]

Timestamps for Key Sections

[00:58] – Defining emergence, levels of abstraction, population vs. individual
[03:08–05:00] – Melanie Mitchell on complexity, ant colonies, and reductionism
[08:00–13:00] – Weak/strong emergence, Mark Bedau, David Chalmers, Hossenfelder
[18:14] – Chalmers, consciousness, philosophical zombies
[30:42] – Infinity, computability, fundamental limits
[36:00–43:00] – Cellular automata, morphogenesis, nature’s robust design
[47:01] – Neural CAs, self-organization, real-world examples
[50:20–53:43] – Dr. Grattarola’s background and graph neural network journey
[57:23–63:35] – CA as modeling tools: across scale, continuum vs. discrete
[91:29] – Neural CA universality and surprise
[98:11] – Point cloud "bunny" morphogenesis on a graph
[109:56] – Further resources and recommendations
[114:38] – Edge of chaos, entropy, the future of studying emergence

Additional Resources Mentioned

Books:
"Complexity: A Guided Tour" — Melanie Mitchell
"A New Kind of Science" — Stephen Wolfram
"Gödel, Escher, Bach" — Douglas Hofstadter
Papers/Authors:
Sabine Hossenfelder — "The Case for Strong Emergence"
Sebastian Risi — "The Future of Artificial Intelligence is Self-organizing and Self-assembling"
Michael Levin (biological emergence and morphogenesis)
Alexander Mordvintsev — "Growing Neural Cellular Automata"
Langton (Edge of Chaos)
Online:
Twitter & GitHub CA communities
YouTube: Max Robinson’s "Emergent Garden"

Takeaways

Emergence lies at the heart of both the natural world and artificial intelligence.
Whether weak or strong, emergent phenomena highlight how a system’s complexity cannot always be deduced from its simple parts and is sometimes accessible only via simulation and iteration.
Neural CA and graph CA research point to new ways of building robust, adaptable, self-organizing AI, but also raise profound theoretical and engineering questions.
The field is rapidly evolving, bridging theory (physics, computation, philosophy) with practical applications (robust AI, neural modeling, biological insights).
Staying at the bleeding edge means following both academic work and informal hacker/enthusiast communities online.

Transcript

[00:00]
Dr. Daniele Grattarola
In this episode of Street Talk Unplugged. And so one thing that I find fascinating is that there is absolutely no reason why this should work, like, at all. There is nothing that we can observe that says that these kinds of rules should exist at all. This model, in principle, it's like it's too simple for it to actually work.
[00:25]
Host 1 (Tim)
Welcome back to Street Talk. This week we are coming live from Lisbon in Portugal.
[00:33]
Host 2
I haven't had access to my studio.
[00:35]
Host 1 (Tim)
Or any of my, you know, normal.
[00:36]
Host 2
Recording equipment, so it's going to be a bit of an interesting one. But I've been working out of coffee shops, building an introduction, doing all my stuff. So, yeah, it's going to be a show about strong and weak emergence, about cellular automata. And we're going to be interviewing Dr. Daniele Grattarola and speaking all about his work on graph cellular automata.
[00:56]
Host 1 (Tim)
So I hope you enjoy it, folks.
[00:58]
Host 2
See you soon. One thing that really fascinates me is a lot of the interesting phenomena happens at a different level, a different rung of the emergence ladder, if that makes sense. And I'm starting to see this everywhere. Like, even at work, I'm building a code review platform and at the low level, the metrics are obvious. I know Ken talks about the tyranny of metrics, by the way, but you know, it's how many code reviews has an engineer done? How many customer engineers do I have?
[01:27]
Host 1 (Tim)
It's easy.
[01:28]
Host 2
And then I start going up the levels of abstraction. I'm talking to the senior leaders and now I'm starting to use much more abstract language like vertical information flows and trust and engineering culture. And all of a sudden it's impossible for me to quantify. And if I do, I'm making it up. And it's the same thing you're talking about these population scale phenomena that happen now. So I've got all of these intelligent agents, they're doing things and, and I can try and. Because now I've got a meta optimization problem, right? So I want to encourage interesting phenomena in the emergent scale. So I might say, well, this type of thing is interesting, I want more of that. But I'm kind of, I'm reaching because I don't know how to describe it.
[02:10]
Dr. Daniele Grattarola
I mean, I think one other characteristic maybe you can point to that kind.
[02:14]
Guest 2 (Keith)
Of separates like population from just individual is sort of specialization versus generalization.
[02:21]
Dr. Daniele Grattarola
Like, I think population driven algorithms sort of implicitly are more about specialization a lot of the time, because like each member of the population, you want them to be doing some different thing. So they're kind of becoming specialists. But I think there's a huge amount of generalization snobbery kind of within machine learning. Like we're looking for the ultimate generalist all the time. It's like get it to do all the tasks you can possibly do and then throw more in and the data set just gets bigger. And we're all very impressed with that. The population implies, I feel like, something in spirit different, because it's more just like, actually, I want to see a lot of different things and like hyper specializations to all kinds of exotic things that like probably the generalist won't do, because basically all it cares about is being general. This comes down to like the focus on like the particular level of abstraction or the level of agency that we have.
[03:08]
Host 1 (Tim)
Professor Melanie Mitchell wrote a beautiful book on complexity about 10 years ago. I hope one day we can get her back on the podcast and discuss it in detail. In the book, she led with a quote from John Locke: "I call complex, such as are beauty, gratitude, a man, an army, the universe." The animal kingdom has several examples of what I would call externalized or collective intelligence. Melanie quotes biologist Nigel Franks in her book: "The solitary army ant is behaviorally one of the least sophisticated animals imaginable. If 100 army ants are placed on a flat surface, they will walk around and around in never decreasing circles until they die of exhaustion." Yet if you put half a million of them together, the group as a whole becomes what some have called a superorganism with collective intelligence. The whole is in some sense more than the sum of its parts, although we need to be quite careful with the language that we use here.
An emergent behavior or emergent property can appear when a number of simple entities or agents operate in an environment, forming more complex behaviors as a collective. If emergence happens over disparate sized scales, then the reason is usually a causal relation between different scales. Western science has a strong tendency towards reductionism because it assumes that things have essences. So science keeps chopping up things into smaller and smaller pieces to find their essence. It's an intellectual and philosophical position which interprets a complex system as the sum of its parts. This is in stark contrast with relationism.
[05:02]
Host 2
Which you could say is related to.
[05:04]
Host 1 (Tim)
The philosophical ideas we were discussing with Andrew Lampinen from DeepMind last week. Biologist Peter Corning asserted that this whole discussion rather misses the point. He said that wholes produce unique combined effects, but many of these effects may be co-determined by the context and the interactions between the whole and its environment.
Now, weak emergence describes new properties arising in systems as a result of low-level interactions. These might be interactions between components of the system or between components and their environment. Emerging properties are scale dependent, though, and can only be observed at a large enough system scale. One reason emergent behavior is hard to predict is that the number of interactions between a system's components increases exponentially with the number of components, thus allowing for many new and subtle types of behavior to emerge. Emergence is often a product of particular patterns of interaction. Negative feedback introduces constraints that serve to fix structures or behaviors. In contrast, positive feedback promotes change, allowing local variations to grow into global patterns. On the other hand, merely having a large number of interactions is not enough by itself to guarantee emergent behavior. Many of the interactions may be negligible or irrelevant, or may cancel each other out. In some cases, a large number of interactions can in fact hinder the emergence of interesting behavior by creating a lot of noise to drown out any emerging signal. The system has to reach a combined threshold of diversity, organization, and connectivity before emergent behavior appears.
Mark Bedau said in his 1999 paper titled Weak Emergence that an innocent form of emergence, what he called weak emergence, is now commonplace in the thriving interdisciplinary nexus of scientific activity sometimes called the sciences of complexity.
He elected to put that phrase in air quotes for some reason. Interestingly, he said that this included connectionist modeling, nonlinear dynamics (now commonly known as chaos theory), and indeed artificial life. He gave two interesting hallmarks of emergent phenomena in his opinion: one, emergent phenomena are somehow constituted by and generated from an underlying process, and two, emergent phenomena are somehow autonomous from the underlying process. So he said that emergence is a perennial philosophical puzzle, and at best the idea raises the specter of illegitimately getting something from nothing. He said that any defence of emergence should aim to explain, that is to say, explain away, the apparent illegitimate metaphysics and indeed demonstrate emergence to be entirely compatible with materialism. He argued that emergence must be more than intellectual masturbation (putting words in his mouth here) and actually demonstrate tangible value to the empirical sciences and be a constructive player in our understanding of the natural world.
He argued that weak emergence meets these goals, but that stronger forms of emergence are entirely irrelevant. He said that the failings of strong emergence can be traced back to this idea of strong downward causation, which is this notion that things in the lower resolution emergent domain can cause things in the high resolution domain. Mark said that strong emergence is uncomfortably like magic. How does a supervenient but irreducibly downward causal power arise, since by definition it cannot be the result of the high resolution domain? He said this would discomfort reasonable forms of materialism and pay homage to the idea that it's possible to get something from nothing. Mark concluded by saying that strong emergence is just a mystery which we don't need.
It's interesting to note that his definition of weak emergence is as follows: macrostate P of S with microdynamic D is weakly emergent if and only if P can be derived from D and S's external conditions, but only by simulation. So interestingly, his definition incorporates the necessity for computational irreducibility, but not the notion of whether it is effectively computable. One of the main hallmarks of weak emergence is underivability except by finite simulation. The exponential divergence of trajectories, or indeed the so-called butterfly effect describing the sensitivity of a physical simulation to its starting parameters, is a well-known feature of chaotic systems. But Mark says that weak emergence is present in almost all complex systems, regardless of whether they produce chaotic dynamics, which leads to weak emergence being part of the definition of what it means to be a complex system.
The popular physics YouTuber Dr. Sabine Hossenfelder wrote a paper called The Case for Strong Emergence. She felt that weak emergence was too deterministic, an affront to free will, if you like. She used to think that we're all made of tiny particles which follow strict laws and human behavior is really just a consequence of these particles' laws. Needless to say, she's since changed her mind and she thinks that you should as well. She led by saying reductionism works. Large things are made of smaller things, and if you know what the smaller things do, you know what the larger things do. Physicists call this idea reductionism. Now you might not like it, but it works pretty well. Arguably, reductionism allowed us to understand molecular bonds and chemical elements, atomic fission and fusion, the behavior of an atom's constituents and the constituents of those constituents, and who knows what physicists will come up with next, she said. She admits that the best explanation for the world around us right now is almost certainly incomplete.
Sabine decided to discuss the concept of emergence in respect to physical theories and how fundamental they are. She said that a physical theory is a set of mathematically consistent axioms combined with an identification of some of the theory's mathematical structures with observables. If two physical theories give the same predictions for all possible observables, then they are physically equivalent. She displayed a figure depicting a directed graph of physical theories. An edge between two theories meant that one was more fundamental than the other. She said that a physical theory A is more fundamental than B if B can be derived from A, but not the other way around. In this case, the theory B is weakly emergent from A. A physical theory is fundamental if it is, to the best of current knowledge, not emergent from any other theory. So this is quite interesting. Weakly emergent is the opposite of more fundamental. The idea is that the theory at low resolution is always weakly emergent: it can be derived, at least in principle, from the theory at high resolution.
Sabine also discussed the causal exclusion argument, which, roughly speaking, says that if a lower resolution effect can be derived from a theory at high resolution, then the effect cannot have another cause. The causal exclusion argument combined with effective field theory is the main reason why physicists believe that reductionism is correct and, in a sense, why strong emergence is not a thing. She also spoke about top-down causation, which is this idea that the laws of a system at low resolution can dictate the laws at high resolution. A good example of this is the mental states in our brain causing our bodies to perform physical actions. So it's important not to think of the emergent layers as being independent or assuming that they could or should be modeled in isolation. Interestingly, though, in Sabine's article she denied that top-down causation even exists at all.
In her conclusion, Sabine did a 180-degree turn and decided that in fact there are many examples where there isn't a clear effective computational or functional path between physical theories. She gave a hypothetical example of a function which cannot be computed for negative values of x or a Taylor series expansion around zero. And she said that if there are any points where the coupling can't be continued between resolutions, you'll need new initial values which would need to be determined by measurement, and therefore strong emergence is viable. She said it's only fair on philosophers who believe that strong emergence exists that physicists first show the coupling constants of a quantum field theory can always be continued to low energies for physically realistic systems.
So what is emergence? Emergence is just the interpretation of a phenomenon from the perspective of a different scale, at least according to Professor David Chalmers. He wrote a paper called Strong and Weak Emergence where he lamented the abuse of the term strong emergence by complex system scientists and cognitive scientists. Echoing Mark Bedau before him, Chalmers says that it is strong emergence which is most common in the philosophical parlance of emergence and in particular used by the British emergentists of the 1920s. He thought that we could say a high-level phenomenon is strongly emergent with respect to a low-level domain when the high-level phenomenon arises from the low-level domain, but truths concerning that phenomenon are not deducible, even in principle, from truths in the low-level domain. Now, I think deducible is a bit of a weasel word, but we'll talk more about that in a minute. He says that weak emergence does not yield the same sort of radical metaphysical expansion in our conception of the world as strong emergence. But it's no less interesting. He says that you can think of weak emergence in terms of the ease of understanding of one level in terms of another level. Emergent properties are usually properties which are more easily understood in their own right than in terms of properties at a lower level, indicating that weak emergence appears to be an observer-relative property. How interesting is this high-level phenomenon to an observer, and how difficult is it to deduce this phenomenon from the lower level? That is emergence.
So Chalmers takes emergence in the general sense to mean surprising or interesting and indeed unexpected phenomena. And he uses the strong versus weak designation to delineate a radical paradigmatic surprise. He says that the emergence of high-level patterns in cellular automata, a paradigm of emergence in recent complex systems theory, provides a clear example. If one is given only basic rules governing a cellular automaton, then the formation of complex high-level patterns, such as gliders, may well be unexpected. Therefore, the patterns are weakly emergent, but the formation of those patterns is straightforwardly deducible from the rules and the initial conditions. He concedes that this might take a fair amount of computation, which he indicates as a reason why the emergent behavior wasn't obvious to start with. And I assume by the word obvious, he kind of means it as an antonym to unexpected. Cellular automata are provably computationally irreducible. This means that there are no analytical shortcuts to perform the effective calculation without resorting to running the sequential simulation in its entirety. Since the computational domain is exponentially large in the case of discrete cellular automata, and infinitely large in the case of continuous cellular automata, if you were trying to find the initial conditions and rules for a given behavior, or even if you had to recompute the simulation, we would argue that this constitutes at least a semi-strong designation of emergence because of the effective computability. Right?
The effective computability must come into it. Professor Chalmers says that strong emergence has much more radical consequences than weak emergence. If there are phenomena that are strongly emergent with respect to the domain of physics, then our conception of the natural world would need to be revolutionized to accommodate them with new fundamental theories. Now, I find this a little bit strange. I mean, given that Class 4 cellular automata are Turing complete, which is to say that they can represent any computer program, it seems like a contentious point that there's no possible output of a cellular automaton which would be paradigmatically surprising. Maybe I'm wrong. To be clear, Chalmers is a materialist, right? He's not subscribing to any kooky views by saying this. He's a computationalist in the sense that he agrees that if you replicated him atom by atom in the natural world according to our universe, then the replica would have consciousness.
[18:15]
Dr. Daniele Grattarola
Right.
[18:15]
Host 1 (Tim)
But he argues that consciousness isn't a logical necessity. He could imagine a universe which has all the same physical laws where he would be a philosophical zombie because it's not logically necessary.
[18:26]
Guest 3 (Keith)
Yeah, so this term emergence is, you know, it's so woolly and ambiguous. I mean, it gets used for so many different things in the sciences. In philosophy, any kind of phenomenon of a complex system that we don't fully understand, we say, oh yeah, well, it's emergent. And then the question is, okay, well, great, well, what's the cash value of that? And I've found it useful to distinguish, as you were, weak and strong emergence, where weak emergence is mostly a matter of complexity, where, for example, you've got some simple rules at the bottom level that give rise to some high-level macroscopic phenomenon, which is complex and surprising, hard to predict and derive as a practical matter, but it's really more of a practical limitation. You can still see in principle why those bottom-level principles, say laws of physics or rules in a cellular automaton, would in principle give rise to these high-level phenomena, derivable in principle, if not in practice, whereas strong emergence would require something that's not even derivable in principle. And I guess I think that most of the things you get in AI or complex systems theory and so on involve weak emergence. Certainly I was very influenced by Doug Hofstadter here, whose, you know, Gödel, Escher, Bach is in some ways all about the powers of weak emergence, how really simple processes at one level could give you complex processes at a higher level and actually get these tangled hierarchies, he called them, or strange loops. You go up a few levels and then you'd come down. So I guess I'm probably more sympathetic with Hofstadter's picture of weak emergence than, say, George Ellis's, where causation is always within a level. I think there are very complex relations between the levels, and some of them may be best understood as causal. You could think of it, you know, the butterfly flapping its wings, having some causal relation to some sociological event days later.
So you do get these tangled hierarchies, but all that is still weak emergence.
[20:41]
Host 1 (Tim)
So he's agreeing with Mark Bedau by saying that any endorsement of strong emergence is a rejection of physicalism and reductionism, which is to say an appeal to magic and esoterica. Whereas weak emergence can be used to support the physicalist picture of the world by showing how all sorts of phenomena which might seem novel and irreducible at first sight can nevertheless be grounded in underlying simple physical laws. Chalmers thinks that there is exactly one clear example of strong emergence in our universe, which is, guess what, our consciousness. We can say that a system is conscious when there is something it is like to be that system, which is to say it has a phenomenological experience. Chalmers argues that it is a fact of nature that the universe contains conscious systems. We are existence proofs of that. And there's reason to believe that the facts about consciousness are not deducible from any number of physical facts. He makes the argument that there could be a world physically identical to this one, but lacking consciousness entirely, which is very similar to that philosophical zombies argument that I just spoke about, or even containing conscious experiences which are potentially different to our own. Roger Penrose said that the human ability to understand is undecidable and requires consciousness. If this is true, it might be a mathematical proof that consciousness is strongly emergent, exactly as Chalmers claims.
[22:11]
Guest 2 (Keith)
And so the way I view strong emergence, at least for right now, is that if I have these different formalizations at different levels, and it's just not possible in any practical scheme whatsoever for me to directly go from a lower level to a higher level. Like, for example, I just can't computationally do it. Or there's no mathematics that can ever hope to symbolically, you know, prove that the properties I observe at a higher level derive from a lower level. Maybe, you know, people say, well, in principle you could, but in reality, you just may never be able to do that. You know, does that qualify as strong emergence? Or is that a bad definition of it? And do you think there is such a thing as strongly emergent behavior? Or can we ultimately just reduce everything down to hypergraphs or loop quantum gravity or whatever?
[23:06]
Guest 4
Every level, in a sense, is independent. You cannot expect it to be fully reduced to a lower one or a higher one. Each level has its own value. It has its concepts, it has its conclusions, it has problems that are suitable to be solved at that level, not higher or lower. To me, that's the first principle. But the second one is you don't want to push it too far. You don't want to say all the layers have nothing to do with each other. So after all, we are talking about the same object. We are talking about the Google map of the same area, even though you zoom in and zoom out at different levels. If it is two maps of different areas, that's a different story. Okay, so as far as all those theories are, in a sense, about the same object but at different levels of description, they're correlated. But yeah, kind of like very overall high level. No, high level is the wrong way to say it. It is confusing the relation. It's kind of like you have an overall large scale correlation, but you don't have one to one mapping among the concepts. That's also my opinion about the relation, for example, between neurons and concepts. Of course they're related, but there is no one to one mapping, or not even many to one mapping. It's more like many to many mapping. And also it's a very messy mapping, except if you want to limit your discussion to a very special phenomenon at a certain level. For example, we know that some basic concepts in chemistry can be explained very well in physics, right? Because physics talks about the details of the same story. Something in biology can be explained very well with physics and chemistry, which talk about the details. So that's true. But on the other hand, if you say that biology overall can be eventually reduced to chemistry and physics, I say not only is that practically wrong, it's even theoretically wrong. 
Because when you're saying that you're ignoring the cognitive capability of the researcher and the user of your theory, you cannot really reduce everything to the lower level without greatly increasing the number of concepts and computational costs.
[25:54]
Host 1 (Tim)
Melanie Mitchell pointed out that it's incredibly mysterious how the intricate machinery of the immune system fights disease. Or how a group of cells organizes itself to be an eye or a brain. Or how independent members of an economy, each working chiefly for their own gain, produce complex but structured global markets. Or most mysteriously, how the phenomena we call intelligence or consciousness emerge from non intelligent, non conscious material substrates. The cognitive scientist Douglas Hofstadter, in his book Gödel, Escher, Bach, made an extended analogy between ant colonies and brains, both being complex systems in which relatively simple components with only limited communication among themselves collectively give rise to complicated and sophisticated system wide global behavior. The ants in our human brain are of course our neurons. They communicate with each other in a similarly simplistic manner. Yet our intelligence and arguably our consciousness emerge from this low level primitive communication. Markets are also complex, emergent and self organizing entities, if you like. Melanie said in her book that they are self organized on the microscopic and the macroscopic level. She said that on the microscopic level, individuals, companies and markets try to increase their profitability by learning about the behavior of other individuals and companies. The microscopic self interest has historically been thought to push markets as a whole on the macroscopic level towards a so called Nash equilibrium. Now the process by which markets obtain this equilibrium is called market efficiency. The 18th century economist Adam Smith called this self organizing behavior of markets the invisible hand. It arises from the myriad microscopic actions of individual buyers and sellers. The individual actions on a trading floor give rise to the hard to predict large scale behavior of financial markets. Now Melanie gives three core properties of complex systems in her book. 
One, complex collective behavior. Large networks of individual components, each one following relatively simple rules with no central control or leader. It's the collective action of vast numbers of components that gives rise to the complex, hard to predict and changing patterns of behavior which fascinate us so much. Two, signaling and information processing. Complex systems use information and signals from both their internal and external environments. And three, adaptation. All of these systems adapt, i.e. they change their behavior to improve their chances of survival or success through learning or some evolutionary process. So Melanie then goes on to give her definition of a complex system as a system in which a large network of components with no central control and simple rules of operation give rise to complex collective behavior, sophisticated information processing and adaptation by learning or evolution. Now I spoke with our friend Dr. Duggar on strong emergence and he said that in his opinion it describes behaviors which cannot be analytically derived nor effectively computed from a lower level or higher resolution theory. This would place glider wars in a cellular automaton firmly in the domain of strong emergence. A cellular automaton is computationally irreducible. There's no effective computational path from the lower level rules to the higher level behavior. The only thing you can do is run the simulation again from scratch. He thought that Chalmers and Hossenfelder evade the issue and or beg the question by phrases like deducible in principle or a fair amount of computation or follows from that at least in principle, etc. So, you know, what he's saying is that they make claims that we can in principle do something, but they can't actually demonstrate or perform it with a reasonable amount of computation. In physical strong emergence, you can't even run the computation. At least in a continuous cellular automaton, you're pretty much in the same boat as the N body problem. Now, a lot of this discussion comes down to whether you believe infinity exists or not. Actual infinity. This is a teaser clip from our conversation with Dr. Joscha Bach.
[30:43]
Guest 2 (Keith)
Can the universe, can our actual universe that we're in right now be actually infinite in spatial extent?
[30:51]
Guest 5
The problem is that it can have unboundedness in the sense that you have a computation that doesn't stop giving you results, but you cannot take the last result of such computation and go to the next step. You cannot have a computation that relies on knowing the last digit of pi before it goes to the next step. In this sense, you don't have an infinity. But the infinities are about the conclusion of such a function. It means that you actually run this function to the end and then do something with the result. Unboundedness is different in the sense that you will always get something new that you didn't expect, that you cannot predict. But it's just going on and on without this end. And I think it's completely conceivable that our universe is in this class of systems in the sense that it doesn't end, but it doesn't mean that there is anything that gives you the result of an infinite computation. Because if that was the case, then it could not be expressed in any language. It also means if something cannot be expressed in any language, that you cannot actually properly think about it. Because when you think, you need to think in some kind of language, not in English, but in some kind of language of sort, or in a mathematical language that doesn't have contradictions. And what Gödel has shown is that the language that he hoped to reason in about infinities breaks, that it has contradictions in it, that at some point it blows itself apart. So the languages that we can build are only those in which we have to assume that infinities cannot be built. So infinity in this sense is meaningless because we cannot make it in any kind of language.
[32:25]
Guest 2 (Keith)
So the thing is, though, I'm not limiting what the universe is capable of based on human, you know, mental and linguistic limitations or even mathematical limitations. Like I'm asking you if it's possible for this universe that we're in to ontically be right now actually infinite in spatial extent.
[32:46]
Guest 5
The thing is that you try to make a reference to something that you cannot observe and cannot conceive of, other than making a model in some kind of language. And to have that model make sense, the language needs to work right. Otherwise you are just maybe in some kind of delusional thing.
[33:04]
Host 1 (Tim)
You can't get to infinity from non infinity and you can't get to discrete from analog. So Keith believes that there are actual infinities, in stark contrast to people like Stephen Wolfram. But our brains are computationally bound. We are conducting what is a discrete computation in our mind. But we might have access to oracles, which is to say we're connected to a Turing machine, but we can only sample at a certain rate. Keith believes in infinity, therefore there may be many strongly emergent phenomena because they're not computable. He therefore doesn't think the universe can even run on a computer, or indeed that we exist inside a simulation. The simplest way to prove the constructivist hypothesis that natural systems need to perform computation in order to succeed and adapt with respect to their environment is to create an idealized version of the problem. That is to say, let's simplify it as much as possible while still retaining the features that make the problem interesting. And that is exactly what a cellular automaton does. Cellular automata are a class of computational models that exhibit rich dynamics weakly emerging from the local interactions of cells arranged on a regular lattice, for example a two dimensional grid. Cellular automata were invented by John von Neumann back in the 1940s. They exhibit extremely complex behavior that's difficult or impossible to predict from the cell update rule. Now Melanie Mitchell commented in her book that this is one of the great ironies of computer science, since cellular automata are often referred to as non von Neumann style architectures, in contrast with the von Neumann style architectures that he also invented. Von Neumann was also able to show that his cellular automaton was equivalent to a universal Turing machine and therefore capable of universal computation, which is to say computing anything which a Turing machine can. In 1970, John Conway invented his own cellular automaton called the Game of Life. 
And it had significantly simpler update rules than von Neumann's version. The most simple version is on a 2D grid with discrete binary values where the alive or dead state of every single cell depends on its eight neighboring cells. The rules are as follows. 1. Any live cell with two or three live neighbors survives. 2. Any dead cell with exactly three live neighbors becomes a live cell. 3. All other live cells die in the next generation. Similarly, all other dead cells stay dead. Even though the Game of Life doesn't pretend to be the most sophisticated way to understand complex systems, it is a wonderfully simple way to get acquainted with the ideas of complexity science, and in particular, weak emergence. Now, many of the patterns are incredibly lifelike, and that's because these are Class 4 automata. They are Turing complete, which is to say they're capable of representing any computation. Now, being weakly emergent doesn't preclude useful analysis. I mean, for example, it's still possible to model how frequently phenomena like gliders appear in the emergent domain. Given many random initializations, laws governing the weakly emergent states almost certainly exist, but can only be discovered through empirical analysis and observation and simulation. We can identify motifs, systems, behaviors, mechanisms, high level abstractions even in the emergent layer, but nothing from first principles. In what sense do natural systems compute? At a very general level, one might say that computation is what a complex system does with information in order to succeed or adapt in its environment. Morphogenesis means the generation of form. It's colloquially described as a biological process that causes a cell or a tissue or an organism to develop its shape. But in an artificial intelligence context, we can think of it as meaning the blueprint of emergence of any physical form. 
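The three Game of Life rules listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the episode: the function name `life_step` and the choice to store the grid as a set of live `(row, col)` coordinates on an unbounded plane are assumptions made for brevity.

```python
from collections import Counter


def life_step(live):
    """Advance one generation. `live` is a set of (row, col) live cells."""
    # Count, for every cell adjacent to at least one live cell,
    # how many of its eight neighbors are alive.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Rule 1: a live cell with two or three live neighbors survives.
    # Rule 2: a dead cell with exactly three live neighbors becomes alive.
    # Rule 3: every other cell is dead in the next generation.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# A glider: five cells whose shape reappears every four generations,
# shifted one cell diagonally.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
state = glider
for _ in range(4):
    state = life_step(state)
```

Nothing in `life_step` mentions gliders or motion, yet after four steps `state` is the original shape moved one row down and one column right, which is a small concrete instance of the weak emergence being discussed.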
Professor Sebastian Risi recently wrote an article called The Future of Artificial Intelligence is Self-Organizing and Self-Assembling. And before you ask, yes, we'll be inviting him to MLST. He spoke of a current movement which combines ideas from deep learning with ideas from self organization and collective systems. It's a wonderful treatise for emergentist, open ended and biologically inspired AI enthusiasts. Searching for parameters of self organizing systems which produce particular patterns is a difficult optimization problem. Trying to make self organization programmable is a research field of its own called morphogenetic engineering. He said that the merger of these ideas could ultimately allow our AI systems to escape their current limitations, such as being brittle and rigid and not being able to deal with novel situations. However, the combination of these methods also poses new challenges and requires novel ways of training to work as efficiently as possible. Risi said that one of the most fascinating aspects of nature is that groups with millions or even trillions of elements can self assemble into complex forms based only on local interactions and display what is called a collective type of intelligence. Sebastian gave the example of ants, which can join forces to create bridges and rafts or navigate difficult terrain. Termites can build nests several meters high without an externally imposed plan. And thousands of bees work together as an integrated whole to make accurate decisions on when to search for food or a new nest. He said that achieving these incredible abilities is a result of following relatively simple behavioral rules through a process of self organization. Camazine et al. defined self organization in 2001 as a process in which a pattern at the global level of a system emerges solely from the numerous interactions among the lower level components of the system. 
Moreover, the rules specifying interactions among the system's components are executed using only local information without reference to the global pattern. In short, the pattern is an emergent property of the system rather than being imposed on the system by an external ordering influence. With the emergence of powerful machine learning algorithms, Sebastian said that the key question is, instead of hand designing the algorithms for self assembly, can we learn these algorithms instead, allowing more complex forms to be created? Sebastian said that self organizing systems are made out of many components which are highly interconnected. The absence of any centralized control allows them to quickly adjust to new stimuli and changing environmental conditions. Additionally, because these collective intelligence systems are made of many simpler individuals, they have built in redundancy with a high degree of resilience and robustness. Individuals in this collective system can fail without the overall system breaking down. Sebastian points out that evolution was able to exploit self organizational processes to create artifacts of remarkable complexity. However, human made designs are normally put together piece by piece. This is similar to the idea of whether an AI architecture and knowledge should be human engineered or evolved blank slate style, such as Professor Rich Sutton pointed out in his Bitter Lesson essay. The amount of information it takes to specify the wiring of a sophisticated brain directly is far greater than the information stored in the genome. Instead of storing a specific configuration of synapses, the genome encodes a much smaller number of rules that govern how to wire up a brain through self organizing processes and how synapses should change based on the activation of neurons. This amazing compression has also been called the genomic bottleneck. 
Now, when humans engineer bridges or teach curricula, there's always a plan, a pedagogy, a curriculum. In biological construction, there's no blueprint, well, not one which defines the outcome. Evolution is a kind of meta optimizer. And our DNA is incredibly compressed. It can't possibly describe the complex configuration of our brains, a form of optimization which transgresses rungs of the ladder of emergence. Sebastian says that our genes contain the information to make the structure by controlling a sequence of events during morphogenesis. Our final physical form is merely a kind of sampled materialization of this lower level process. This is very similar to this concept of inverse diffusion which happens in the OpenAI DALL-E 2 model, by the way. Now, as Sebastian says in his article, deep neural networks are totally human engineered, whether it's the architecture itself or indeed the optimization algorithm, which is stochastic gradient descent. Given enough data, deep learning algorithms can learn to decompose any space into a highly sophisticated geometrically tessellated, nested, compressed representation. The problem is that this representation is extremely brittle and breaks with even minor changes in the environment. Deep learning models efficiently compress what they have seen before with laser like effectiveness. But the problem is that many domains are open ended and combinatorially large and are not amenable to memorization in this way. Sebastian argues that using emergence and self organization might help robustify neural networks in a similar way to how biological systems are robust. Although he conceded that self organization is not the only principle that allows biological organisms to display high level robustness. Anyway, I highly recommend you check out Sebastian's article.
[43:46]
Dr. Daniele Grattarola
It's brilliant to follow up a bit on this idea of centralized versus decentralized. If we look at decentralized systems, not always, but very often, they sort of self organize into a centralized system. For example, the brain has the sort of the prefrontal cortex directing everything. If we look at humans, the first thing humans do is they band together and they elect a leader, right? If we build decentralized computing systems, there's always like one leader. And so how much do you think the emergence of properties such as intelligence or whatnot is a property of really decentralized computing? Or how important is the sort of leader election among decentralized systems and can we do without it?
[44:38]
Guest 6
Oh yeah, I think that's a great question. Again, you always ask the great question. So you're right. Like in many decentralized systems like our brain or in civilization, eventually something like a centralized system is formed and usually maybe via our genotype, our genes. The same centralized system is usually formed across all humans and in even societal structures. Like typically you have a leader or a few leaders and they govern the society in a few types of ways, right? But I think the emergence of that structure is very important compared to designing it top down at the beginning. Let me tell you why, because take the example of say our bodies or the brain, right? The way the structure emerges may be the same for most people, but for people with certain disabilities, unfortunate disabilities from birth or accidents, we're able to see brains or structures evolve differently, but they still function as a whole. Certain infants are known to have half their brain not functioning at birth, and it has to be removed at birth. But they still grow into a functional brain that has a different structure than what we traditionally know. For most humans, and even for people with disabilities like blindness or deafness, their brain structures would change functionalities. A blind person would use their visual cortex to process audio, for instance. The emergent property is very useful for tackling changes in the environments, as I mentioned in the beginning. So ultimately the goal is to have something that will work even when the environment changes. But it'll work maybe optimally when the environment is expected, but it's not going to completely not work when the environment changes.
[47:02]
Host 1 (Tim)
Alexander Mordvintsev, another guy that we definitely need to get on the show, wrote a fascinating article called Growing Neural Cellular Automata. I've been looking into this and essentially it's a convolutional neural network type architecture which produces what appears to be an RGB value for every single pixel, but actually the output is a 16 channel space including a bunch of other information. They've turned a self healing image generation process into an emergent phenomenon. And so they're kind of continuously applying this convolutional neural network over the input space, much like you would do with a traditional cellular automaton. Except this one, of course, is a continuous cellular automaton, which is learned with a neural network. But then it has this incredible thing where if you interactively delete components of the image, or perturb components of the image, it'll dynamically repair itself, which is fascinating. Just imagine some of the applications for this where you could have self healing systems and you could have these agents that learn to heal a system as an emergent behavior. So there's this really interesting intellectual journey here which starts with discrete cellular automata, which are binary and run on a regular lattice, which is to say a two dimensional grid. And then all sorts of interesting things happen when we increase the resolution, or run on different manifolds, or have continuous values, and then even use something like a learnable neural network for performing the update rules. Dr. Daniele Grattarola is a scientist and postdoctoral researcher at EPFL. He recently published a fascinating paper called Learning Graph Cellular Automata, which was published at NeurIPS. And in that work he focused on a generalized version of a typical cellular automaton called a graph cellular automaton, in which the lattice structure is replaced by an arbitrary graph. 
In particular, they extended the previous work which I just showed you from Alex Mordvintsev, you know, where they learned a 2D convolutional neural network for applying the cellular automaton update rule, to now using a graph neural network and learning the update rule on that with message passing. It's absolutely fascinating. So now I give you Daniele Grattarola. Right. Let me get my notes out.
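To make the idea of replacing the lattice with an arbitrary graph concrete, here is a minimal Python sketch of one synchronous step of a graph cellular automaton. It is an assumption-laden illustration, not the method from the paper: the paper learns the transition rule with a graph neural network, whereas the hypothetical `majority_rule` below is written by hand, and the names `gca_step`, `adjacency` and `state` are invented for this example.

```python
def gca_step(adjacency, state, rule):
    """Synchronously update every node from its own state and its neighbors' states."""
    # All new states are computed from the old `state`, so the update
    # order of the nodes does not matter (as in a classic cellular automaton).
    return {
        node: rule(state[node], [state[nbr] for nbr in adjacency[node]])
        for node in adjacency
    }


def majority_rule(own, neighbor_states):
    """Hypothetical hand-written rule: take the strict majority of self plus neighbors."""
    votes = [own] + neighbor_states
    return 1 if 2 * sum(votes) > len(votes) else 0


# A small irregular graph: nodes have different numbers of neighbors,
# which a regular grid cannot express.
adjacency = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [3]}
state = {0: 1, 1: 1, 2: 0, 3: 0, 4: 1}
next_state = gca_step(adjacency, state, majority_rule)
```

The rule only ever sees a node's own state and an unordered collection of neighbor states, which is the same locality constraint a message passing layer respects. Swapping `majority_rule` for a trained network is, roughly, the step the paper takes.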
[49:06]
Host 2
By the way, I, I've just been on a crash course in cellular automata.
[49:09]
Dr. Daniele Grattarola
Oh, nice.
[49:11]
Host 2
It, it is absolutely fascinating. Yeah, I'm completely hooked on it actually.
[49:17]
Dr. Daniele Grattarola
Yeah, it, it is, it is fascinating.
[49:19]
Guest 2 (Keith)
Tim, I'm gonna, I have to share with you my little cellular automaton that tries to do light casting or shadow casting for roguelike games.
[49:32]
Dr. Daniele Grattarola
Oh nice. Yeah, many of those, like tiny games. Like even I think it's called Gnome Fortress, something like that. It was an old school Linux game. It used like cellular automata to generate the terrain and stuff like that. It's super fascinating.
[49:46]
Guest 2 (Keith)
Yeah, I've played around with simple cellular automata quite a bit in little hobby, you know, hobby games or simulations. I mean even, you know, Tim, that Galton board simulation in our video, that was a cellular automaton.
[50:02]
Host 2
Yeah, I hadn't thought about that. Of course. Yeah, it's fascinating. I mean, and by the way, I mean because you linked Alexander's article, you know, he did the kind of the, the 2D gridded CNN version of morphogenesis and I mean, maybe you should just introduce. I'll tell you what, we're doing this all wrong. Daniele, why don't you introduce.
[50:21]
Dr. Daniele Grattarola
All right. Right. So yeah, I, so my name is Daniele. I am currently, I just graduated actually from USI in Lugano. So I'm currently working at EPFL in Lausanne. So I moved to the French speaking part of Switzerland and so right now I'm working in the domain of proteins. And my formal training during my PhD was in graph neural networks. And at some point I reached out to my current, let's say supervisors or PI that were starting out this project on applying graph neural networks to the protein domain. In particular protein design, which is like essentially the inverse problem to AlphaFold, if you've heard about AlphaFold recently. So AlphaFold goes from the sequence of amino acids to the folded structure and one still open and very interesting problem is how to do the opposite. So if I want a particular structure, what is the sequence that would fold into that structure? And you would think that having solved one direction would essentially mean you solved the other, but it's still computationally expensive to go over all possible sequences and try and see if they fold into the correct state. And so there's still this open question of whether the structure of a folded protein somehow informs the sequence and if you can predict one from the other. And so I'm working in that whole domain right now. But as I said, my background is in graph neural networks. During my PhD, I've worked on a thousand different things related to graph neural networks. And by the way, I started like 2017, I started my PhD, so it was still at the time where graph neural networks were starting to emerge a little bit. So there was like this feeding frenzy of finding applications and trying to see if stuff worked, which was a really exciting time to be in graph neural networks, I should say. 
And then at some point during my PhD, towards the end, I decided to link back to one of my oldest passions, which was this idea of the cellular automata and trying to see if some of the tools that I had been working on would actually be useful to say something about that whole world. And it turned out they were. So, yeah.
[52:40]
Guest 2 (Keith)
Yeah.
[52:40]
Host 2
Well, I mean, I'm so inspired by geometric deep learning after I spoke with Michael Bronstein and his friends. But yeah, I mean, so much of the work that we've been brought up on is Euclidean or gridded data. And then when you start to think about some of the applications that you can do with graphs and, you know, curved surfaces and so on, it blows my mind. But before we get there, why don't we go on a kind of intellectual journey here and start talking about cellular automata. Now, anyone who's had the misfortune of doing LeetCode challenges in the tech industry probably would have had to implement Conway's Game of Life at some point. And usually the way, you know, these challenges are formulated is they are binary, which means the cells are 1 or 0, and it's on a regular lattice, usually a 2D grid, and you have a whole bunch of update rules which are a function of the neighbouring cells. And then you just kind of execute all of these rules and you just get these emergent phenomena happening when you zoom out. It's fascinating. So can you just tell us a little bit about cellular automata?
[53:44]
Dr. Daniele Grattarola
Yeah, sure. So basically the short story is what you just said. So you have this, essentially it's a computer program or a computational model that has a state. And typically the state is what you said. It's just a bunch of cells arranged in these regular structures which can be 1D or 2D or even 3D or whatever. And then every cell has a particular state of its own. And then you have this transition rule, or update function, however you want to call it, that is applied synchronously to every cell and essentially decides what the next state of the cell will be as a function of the cell itself and the neighbors. And really, the cool thing that you find is that even though the complexity at the level of the rule is fairly low, so you have pretty simple rules that you can define, you know, the behavior that emerges can be, like, very lifelike. Like the tiny creatures that you see emerging on these grids, really, you know, they kind of click with our pattern matching system as humans because they look like tiny creatures moving around the grid. They're able to spawn new creatures. They're able to, you know, move coherently, and they have periodicity over time. And so whether it's living things or engineered things, they have the same kind of regularity that we recognize as interesting. And that's just the basic version. But then with time, people have started to complicate the definition of a cellular automaton, right? So, for example, instead of binary, you can ask the question of, okay, what happens if I allow the states to be n possible states over the grid and I can color the states differently? Or I can ask what happens if the state is continuous, right? 
And as you start doing that, what you see is that the behavior becomes more and more complex in a sense, even though, let's say the Kolmogorov complexity at the level of the rule remains fairly low, as you start introducing just that tiny bit more of complication, you see these insanely complicated patterns that emerge as a result. And so, for example, at some point, people started, let's say, playing with the definition of the neighborhood. So you make it a bit larger, which is equivalent to increasing the resolution of your grid approximation, which, if you think about it, like a grid is just a discretization of 3D space or 2D space, right? But at the same time, we are like, as humans, we are very far from that kind of level of discretization of space, if it even exists, right? So we're used to thinking about high resolution, in a sense. And so what you observe if you start to increase the resolution of these models, is that their behavior starts to become, you know, eerily like living things. And so you see, like tiny cells forming and moving around, and then they start to organize into membranes and stuff like that. And this all happens by that same convolution like process that's happening on this grid. And that's why I think they're so fascinating. Leading back also to your comment before, like, they let you observe behaviors that typically you only see in nature, but at the same time, you are aware of this inherent simplicity that the behavior stems from. And so I'm already kind of diverging because this topic automatically makes me go on rants of how these things are super simple and yet super complicated and super fascinating.
[57:24]
Guest 2 (Keith)
Maybe, maybe just playing devil's advocate here, because I do love cellular automata, as we were talking about before we started the show, but just to perhaps pull things back to some grounding here, which is that in a way, folks working on cellular automata have converged in some ways to a very old set of numerical techniques called finite difference, you know, modeling, right? So what engineers do with finite difference is they say, okay, look, I have this set of partial differential equations, right? And as we know, tons of things in the world, physical phenomena, can be described by PDEs, partial differential equations. And they say, okay, I can't solve these symbolically, but I can do it numerically. If I have a grid, then I can start to write down how the PDEs, you know, result in changing continuous values based on kind of neighboring grid cells. And they do this exact thing. They create a mesh. They write down what transition rules the PDEs would imply for each individual cell. And then you run simulations, and so you wind up with things like, you know, the equivalent of cellular automata with continuous values for the wave equation or for diffusion equations or for all this kind of thing. And so in one sense, it shouldn't surprise us that cellular automata can reproduce all the behaviors that we see in life, because indeed, the majority of that or all of it, you know, stems from partial differential equations at some level. And we know that we can approximate those solutions with cellular automata via the, you know, finite difference route. But at the same time, it definitely feels mysterious. And I think there is something deep there. And this even goes back to, say, Mandelbrot, right, with his discovery of fractals. I mean, what the heck? 
I just have this simple little quadratic equation in complex space, and if I just iterate that map over and over again, I wind up with this absurdly complex, you know, in a sense, like infinitely complex, you know, boundary and interesting connections. So there's these kind of two opposing viewpoints. I wonder how you reconcile those. Like, on the one hand, it seems very interesting and mysterious, and yet on the other hand, it seems like, of course, if we go to a high enough resolution, we can simulate arbitrary differential equations.
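The finite-difference view Keith describes can be made concrete with a minimal sketch, assuming the 2D diffusion equation on a wrap-around grid. The function name `diffusion_step`, the grid size, and the coefficient `alpha` are illustrative choices, not anything from the episode. Each cell is updated only from its four nearest neighbours, which is the same shape as a cellular automaton transition rule with continuous values.

```python
import numpy as np

def diffusion_step(u, alpha=0.2):
    """One explicit finite-difference step of 2D diffusion on a periodic grid."""
    # Discrete Laplacian: the four nearest neighbours minus four times the cell.
    laplacian = (
        np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
        + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1)
        - 4.0 * u
    )
    # alpha stands for D * dt / dx**2; this explicit scheme is stable for alpha <= 0.25.
    return u + alpha * laplacian

grid = np.zeros((33, 33))
grid[16, 16] = 1.0  # a single "hot" cell in the middle of the grid
for _ in range(50):
    grid = diffusion_step(grid)
```

The same local rule is applied everywhere and at every step; the smooth spreading blob is a property of the whole grid rather than of any single update.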
[59:50]
Dr. Daniele Grattarola
I don't know if it's an answer, but one possible explanation of this apparent dichotomy is that it's a matter of scale, right? So if you go low enough with the scale, and by low I mean, you know, modeling the low level physics, in a sense, you probably get to the point where the scale is so low that everything makes sense to be modeled as a continuous differential equation, right? And so at some point, you know, cellular automata or a set of PDEs, what's the difference? It's not even that well defined. It's like a sort of spectrum that you can decide where you want to place yourself in. What is interesting though is that I think that with this idea of locality and having update rules or transition rules or whatever execution kind of engine you want to have for your inherent program, as you go up in the abstraction hierarchy, it just makes sense to consider your elements, your cells, as discrete objects. And then this is maybe not true if you're modeling, for example, the flow of some water or whatever. So, for example, I had this professor during my Bachelor's who was using cellular automata to model how water moves inside coffee, for example, right. And then you can ask the question of, okay, is it a cellular automata? Should I be using something else that is actually continuous? But what I think is really fascinating about the idea at least of CAs, is that as you go up in the hierarchy, then it makes more and more sense to think about discrete agents interacting according to some rules. And at that point the continuum kind of gets lost anyway, or at least it gets lost in the way you want to model it. Right? So if you're modeling, again, low level physics, it makes sense to be working with a grid.
But if you're working at modeling human interaction, for example, which you can probably still model fairly well as a sort of local kind of dynamical system, then does it make sense to model humans as living on a continuous grid? Well, no. You kind of want to model them as individual objects in a sense, right?
[62:10]
Host 1 (Tim)
Yeah.
[62:11]
Guest 2 (Keith)
What's interesting to me is, as you're bringing up, so you were going in this direction of if we're at a very low level, it might make sense to think of a continuum, but then at some level of abstraction, things really behave as these discrete, you know, objects. Or say, in the case of Karl Friston, you know, a Markov boundary, a thing that has this kind of stochastic dynamic, you know, Markov boundary, yet it behaves as a unified, you know, whole. And what's odd is that neither at the highest scale nor at the lower scale does either continuity or discrete nature ever disappear. It just seems to be continuously. I shouldn't have said continuously. It just seems to be forever intertwined. You know, that, okay, if I keep going down in this direction, I get to a continuum, but maybe if I go even deeper, I get back to some discrete, you know, hypergraph. Or if I'm going in the opposite direction, I get up to, say, the discrete level of molecules. But if I keep going, it starts to behave as a continuous fluid. If I go further, it starts to behave as a discrete cell. If I go further than that, you know, you keep interleaving back and forth between discrete being the correct level of analysis versus continuous being the correct level of analysis. That's kind of the mystery, that they don't exist at either extreme. They just are constantly interleaved.
[63:35]
Dr. Daniele Grattarola
That's a real cool thing that. It's like, it appears very arbitrary that we as humans at some point decide, no, this is like, I recognize this as a layer of this abstraction. It's like, you know, I can pretty clearly distinguish between being at the layer of atoms and being at layers of cells. Although if you look close enough, then cells are composed of proteins and proteins are composed of atoms. So you can kind of always go back, but at some point there is also this kind of limit that defines or that decides how far can a layer kind of communicate with the other layers. Right? So we as humans, we have no agency to interact with atoms, although we are made of atoms, but we have kind of essentially zero ability to interact with that layer. And so it's like there is a sort of intrinsic boundary that lets you operate on some levels of these abstractions. And these levels appear to be fairly close to one another. So I might be able to influence the level above me and below me, which might be. I don't know if we want to look at some discretization, maybe I can act on my organs and I can act on society, right? One level down, one level up, but already I cannot act on my proteins, right, or my individual cells. It's going to be much, much harder. And as you go down in the hierarchy, and of course, up in the hierarchy, the contribution of a particular layer becomes less and less relevant, right? So it's like you're definitely able to recognize those boundaries or like, I don't know if you can give a precise definition, but there definitely appears to be some level of discreteness or at least some ranges where it makes sense to talk about a level of abstraction that stands on its own.
[65:20]
Host 2
I would say, yeah, I'm really interested actually that there's a kind of observer relative problem that, you know, depending on the ladder of the.
[65:29]
Host 1 (Tim)
Let's call it.
[65:29]
Host 2
An emergence ladder, we're on a rung of the emergence ladder and maybe that determines how we can formalize and understand phenomena on different rungs of the emergence ladder. But just to pull the discussion back a tiny bit. So this is all quite new to me and I'm fascinated by it. I've actually just started reading A New Kind of Science by Wolfram, inspired by all the links that you sent us. And you know, one of the first things to learn about with cellular automata is the basic discrete one-dimensional version. And Wolfram's actually given all of them names, right? Because if you think about it, in the one-dimensional discrete cellular automata you have a neighborhood of three. And then if it's binary, you've got two to the power of three, which means you've got eight patterns. And then for every single pattern the output could be one or zero. So you've got two to the power of eight, which is 256 rules. And so, for example, there's Rule 110, which he says exhibits Class 4 behavior. Now this is interesting as well because I'm interested in, you know, did he just pluck this out of thin air, that you can kind of classify the behavior subjectively of phenomena in the emergent domain. But anyway, he said Class 4 behavior, which is neither completely random nor completely repetitive: localized structures appear and interact in various complicated-looking ways. And then there's this guy called Matthew Cook who used to work for Wolfram, and he said these structures are rich enough to support universality.
[66:51]
Dr. Daniele Grattarola
Right.
[66:51]
Host 2
This result is interesting because Rule 110 apparently is an extremely simple one-dimensional system and difficult to engineer to perform a specific behavior. Anyway, so I'm reading Wolfram's book and I'm interested, like, what's he talking about? You know, universal computation. And he has this wonderful image where, you know, look, here are the occurrences of progressively longer blocks in the pattern generated by Rule 30 starting from a single black cell. And as far as he could tell, all of the possible blocks will eventually appear, potentially letting the pattern serve as a kind of directory of all possible computations. And that was kind of like his argument about the Turing universality. So what's your take on that?
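The counting argument above (a neighborhood of three binary cells gives 2^3 = 8 patterns, and one output bit per pattern gives 2^8 = 256 rules) can be sketched in a few lines, using Wolfram's numbering convention where the rule number's binary digits are the outputs. The helper names `rule_table`, `step`, and `run`, the wrap-around boundary, and the grid width are illustrative assumptions.

```python
def rule_table(rule_number):
    """Map each (left, centre, right) neighbourhood to the next centre state."""
    # Reading the neighbourhood as a 3-bit number selects one bit of the rule number.
    return {
        (l, c, r): (rule_number >> (l * 4 + c * 2 + r)) & 1
        for l in (0, 1) for c in (0, 1) for r in (0, 1)
    }

def step(cells, table):
    """Apply the rule to every cell at once, with wrap-around boundaries."""
    n = len(cells)
    return [table[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])] for i in range(n)]

def run(rule_number, width=31, steps=15):
    """Evolve a rule from a single black cell and return every row."""
    table = rule_table(rule_number)
    cells = [0] * width
    cells[width // 2] = 1  # start from a single black cell
    history = [cells]
    for _ in range(steps):
        cells = step(cells, table)
        history.append(cells)
    return history

if __name__ == "__main__":
    for row in run(110):
        print("".join("#" if x else "." for x in row))
```

Swapping `run(110)` for `run(30)` changes nothing but the eight-entry lookup table, yet the printed pattern looks qualitatively different.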
[67:31]
Dr. Daniele Grattarola
Yeah, so they're definitely fascinating objects in that regard. So I think that for Rule 110 in particular, the argument there is that you can essentially build logic gates with them. And I might be messing up the specific one of the 256, but at some point, you're able to shoot these kinds of rays in one direction and another in another direction. And at some point, if they interact, the rays disappear, and if they don't interact, the ray goes on. And so it's like a zero and a one, depending on whether the ray survives or dies. And all the arguments for universality of these models, by the way, are not arguments for their efficiency in doing computation. They're just showing equivalence, that is, that you can essentially build the basic machinery that you would need to perform computation, which is basic manipulation like the NAND gate and stuff like that. And I think that the formal proof of universality actually goes in a completely different direction with a different computational model. But the general idea is that you can simulate a Turing machine just by encoding the various rules in essentially the state. And that's a very fascinating thing that I've been thinking about a lot recently, because what we see and much of the things we can do with cellular automata are not necessarily given by the rule itself. So the rule is always simple. And in fact, it's not necessarily in A New Kind of Science, but I think in the next work by Wolfram, which is the Wolfram Physics Project, he kind of makes this argument that the rules that govern the universal cellular automata should be fairly simple, right? And so there is this bias towards simple rules, but the complexity that we typically observe comes from the configuration of the states, right? And so all of the computation that you might want to do with a cellular automata comes from configuring the state in the correct way.
In fact, if you Google it, it should be fairly easy to find. At some point, people had this challenge on Stack Overflow of implementing a functional clock in the Game of Life. And as it turns out, you can implement the functional clock in the Game of Life. You just need to configure the states in a particularly smart way. And the states will evolve on their own, and they will create digits and then change the digits according to the time once every second, and so on. And so this is what I find fascinating about the kind of computation that CAs do, right, because you can prove that they are universal because you can do basic operations with them. But then much of the complexity that you can actually observe comes from the actual states that you have to configure. And that of course is by no means trivial. Which is also why people haven't been using CAs to implement computers. You use different architectures and computational models.
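The point that the rule stays fixed while the behavior comes from the configured state can be illustrated with a minimal Game of Life sketch. The wrap-around grid, the grid size, and the name `life_step` are illustrative assumptions; the glider is the standard five-cell pattern, which reappears shifted one cell diagonally every four generations.

```python
import numpy as np

def life_step(grid):
    """One synchronous Game of Life update on a wrap-around (toroidal) grid."""
    # Count the eight neighbours of every cell by summing shifted copies of the grid.
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # Birth on exactly 3 live neighbours; survival on 2 or 3.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(np.uint8)

# The rule above never changes; only this initial configuration does.
glider = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [1, 1, 1]], dtype=np.uint8)

grid = np.zeros((10, 10), dtype=np.uint8)
grid[1:4, 1:4] = glider
start = grid.copy()
for _ in range(4):
    grid = life_step(grid)
```

Constructions such as the clock mentioned above are built from much larger arrangements of this kind, with `life_step` left untouched.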
[70:27]
Guest 2 (Keith)
Yeah, and to that point, and by the way, so Cook proved Rule 110, you know, Turing complete by mapping it to another Turing complete system, like the cyclic tag systems or something, I think it was. And then I think there's only one other rule that might be Turing complete, but it hasn't been proven, you know, one way or the other. Like Rule 30 may or may not be Turing complete. I think this is quite fascinating, right, that just this little simple set of rules, and it may be the simplest Turing complete system out there. But as you were just saying, the difficulty lies in going from a simple set of rules and some complex initial conditions to then being able to really predict what the large scale behavior is going to be. And I think Wolfram, you know, points this out himself, which is that, you know, he has this concept of computational irreducibility which essentially says, look, predicting what this computation is going to do is just as hard as actually running the computation. So there's no shortcuts in a sense. And it's related to, you know, Turing completeness as well, or rather the, you know, the halting problem, which is essentially a similar kind of problem. It says that trying to compute whether or not an arbitrary Turing machine will halt is just as difficult as running it itself. And therefore you can't, yeah, you can't do it without halting or without failing to halt, that sort of thing. So my question to you is, and this is kind of this emergence question, is this just an impossible barrier? So in other words, is it the case that, you know, something in physical reality runs things, you know, whether you believe it's a computation or not, whatever, there's some substrate that's executing this, you know, this automata of the universe, okay, and it displays all this multiscale behavior. And so you get these emergent, you know, things happening.
Human beings, planets, you know, superclusters in the galaxy, whatever. Is it even possible in any sense to predict from lower levels, higher level emergent behavior? Or is there just this barrier, in a sense, this irreducible complexity, where there's no shortcuts? We can't actually predict a higher layer of emergence or abstraction from a lower one without just running the universe itself.
[72:53]
Dr. Daniele Grattarola
It's a very interesting question, and I don't think I'm really equipped to answer that without making anybody angry, I would say. So there's a lot of speculation that you can do.
[73:04]
Guest 2 (Keith)
Don't worry. I've made people angry by just asking a question.
[73:08]
Dr. Daniele Grattarola
So what I can say is that, first of all, like we were saying before, there seems to be some degree of communication or ability to predict what the layer above or below you will do, right? So if you are, let's say, configuring a cellular automata, if we accept that that is how the universe runs. So if you're trying to configure or to study a cellular automata at a particular level, you might have some intuition, a priori, of what it will do, right? So it is possible to engineer an emergent system, you know, even without running it necessarily. You probably can come up with some smart rules that have a desired behavior that you might want to have. Now, it's a much harder question to say, can we build a particular cellular automata that acts like we see in the universe? Because that requires us to answer the question of what are we actually trying to model, and at what scale are we trying to model the physical reality? Are we trying to model a brain or a network of humans? And I guess you can also make that irreducibility argument for intelligence in general. So do I necessarily need to model the brain in order to have intelligence, or can I just approximate intelligent behaviors and maybe just train a neural network to do that? Right. So I think it's a much more profound question than just what is the cellular automata doing and how do I reproduce that in a computer? It's a question about the nature of reality itself and whether the only possible way to obtain a particular computation is to simulate it starting from the lowest possible level. Now, we have some evidence that this is not necessarily the case. We have some models of computation that are able to emulate some of the phenomena we see. And so I guess this, you know, gives us some kind of hope that we might be able to design one such system.
And people, for example, have started using machine learning to try and design cellular automata that do a particular thing, right? So there might be some degree of approximation that we can achieve without necessarily computing everything starting from the lower level.
[75:28]
Guest 2 (Keith)
And this is, in a sense, of course, Wolfram's physics program, I think he calls it, which is all about, can we start at the lowest possible, and I mean, like, finest scale. You know, I don't even know, 10 to the minus 100 meters or something, you know, a substrate in the form of this hypergraph. And then with these simple rules, does the universe as we know it, you know, emerge from that? Or rather, that's his program, and it could be possible. I think there's so many open questions there and of course, it's a very active area of research and almost a new branch of mathematics too, or physics at least. And I think Wolfram himself said if it can do these predictions, like, for example, if we can derive general relativity or, you know, quantum mechanics, it's at least a century.
[76:26]
Dr. Daniele Grattarola
Yeah.
[76:27]
Guest 2 (Keith)
A century out. So, you know, we have a while to wait. But I'm curious what you think about the possibility of that. So here's a question. We talked earlier about the interleaved discrete and continuous. You know, that as you move along kind of scales of emergence or reduction, you keep coming across the need to either view things as a continuum or as a discrete, you know, spectrum of something. And so it's like this alternating series, you know, in math that we learned, that never converges. Right. It's plus one, minus one. It never converges. Well, if you go to zero, so if you go to that end of infinity, all the way down to the smallest possible scale, do you think we arrive at a discrete system, like a hypergraph, like Wolfram's envisioning, or is zero actually a continuum? I'm curious what you think that answer is and what you think about Wolfram's paradigm of physics.
[77:21]
Dr. Daniele Grattarola
So I think that I was reading about this some time ago and the question of whether at the lowest possible level, you actually can see something like a cell and that kind of discreteness to it. I think it's still an open question in physics. Yeah, exactly. The atom of space. That's probably, from what I understand, it's still an entirely open question in physics. Like they haven't been able to answer that. And the question there would be like, can we actually discretize the notion of space and the notion of time, because that would be answering yes to that question, would be a fairly strong argument for Wolfram's theory of the universe being this huge L system that's constantly rewriting itself. But, of course, it's difficult to say. And at some point, one should even ask the question of whether, for us, it's even important to be going at that particular level, which is what you were asking before. Do we necessarily need to answer the question of whether the universe is continuous to be doing interesting things? And then it becomes a matter of goals. So, personally, I find it fascinating to be talking about all this localized, emerging computation because I see it as a potential path forward towards AGI, for example. And the question there at that point becomes, can we achieve something like that without necessarily needing to go and simulate the whole universe just to get, you know, you simulate this huge environment, just that maybe it will develop intelligence that seems insanely costly to do for achieving something that we can describe fairly well. And so I don't know, really, if at some point it will become obvious or people will be able to answer the question of whether the universe is continuous, which would probably discredit the cellular automata theory. But I think it's interesting that this idea of cellular automata, especially how Wolfram has instantiated it in one particular way, but the general paradigm is much more flexible. In a sense. 
It's interesting that we can describe things with this model. And whatever the level we decide to start at, we also know that if we set the rules right, then we will be able to observe the same kinds of emergence. And so it becomes kind of arbitrary for us to decide where do we start simulating the universe or the system we are trying to simulate? Right.
[79:55]
Host 2
Yeah. I mean, there are so many things to unpack here. I find this absolutely fascinating. So how does our universe work? And, you know, Wolfram was kind of making the point that underneath all of this richness and complexity that we see in physics, there could just be really simple rules. So is the universe an emergent phenomenon? And it kind of. I mean, it's very subjective, right? It's very vague. It seems like it is, because when you recreate so many of these emergent systems, they produce phenomena that look a lot like the universe.
[80:26]
Dr. Daniele Grattarola
Right.
[80:27]
Host 2
But then there's this notion of, well, could we find the exact simple rules and the exact representation to create something like the universe? There's this notion of irreducibility. And then it becomes very, very vague. But what's fascinating, though, is just by applying the same rules over and over again, you can produce something that looks really, really complicated, and that's just not what our intuition tells us at all.
[80:49]
Dr. Daniele Grattarola
Yeah, that's true. And I think, like, I just had this kind of follow up to the discussion, which is this idea that at some point you're interested in understanding the universe at some particular scale, right? So you might be interested in understanding intelligent systems or just living system if you want to be at one level lower. And so at some point, the questions you need to be asking, it goes back to what we were saying before, right? If you're interested in a particular level, you don't necessarily need to go all the way down and start there. You might just go one level lower and start modeling things there and know that maybe you will introduce some approximation errors, but if you do things well enough, then you will kind of see that emergence and you can build on top of that layer after layer after layer. Right. And I agree that it is fascinating that this kind of emergence appears to be like a sort of constant throughout the different layers. Right. So it doesn't really matter the specific rules that are acting at a particular layer. Right. What matters is that the same kind of computational engine appears to be working at every layer. And so you might have the similar phenomena happening at the physics level or at the cell level or at the society level, and the rules will change. And in fact, Michael actually makes this very interesting argument about different layers being recognizable by the kind of goals they're trying to solve. Right. And so, for example, you might recognize the layer of society because it's trying to solve the goal, I don't know, of surviving as a species, for example. And the rules that you will find at that layer will be, in a sense, emerging to solve that particular goal. Exactly. Like, I don't know, at a very more simple level, atoms will have very simple rules that just try to satisfy some electrical or physical constraints. Right. 
And so as you move up and down, you will find different rules, but the same computational principle. So we'll have a rule that tries to solve an objective through localized computation. And that is what I really find fascinating about CAs and just the general idea of this kind of model, because they appear to be reasonable to explain different layers in this architecture. And so this is really interesting to me.
[83:11]
Host 2
So, I mean, this is getting to something we were just previously talking about with our other guest about formalizing what happens in that emergent space. And actually when we speak with AI alignment people, they bring up Asimov's Laws and you can get into, you know, utilitarianism versus deontology and so on. But you know, the thing is, because we're going to get on to your work when we're talking about morphogenesis, what's fascinating there is that actually you almost want to evaluate the rules that you're creating based on the emergent phenomena. And then there's this thing about, well, how do I formalize the nature of that phenomenon? I mean, if I was Stephen Wolfram, for example, how would I formalize universe-like behavior?
[83:53]
Dr. Daniele Grattarola
Right.
[83:53]
Host 2
Oh, this thing's emerging and it looks like the universe. How do I formalize that?
[83:59]
Dr. Daniele Grattarola
So, yeah, I think what you're describing is a property that we might find in different aspects of life, right. Once you have the same rules applied in an interconnected system of sorts. And so there's like this interconnected graph of information flow and how you act on that graph. So as you have that system and it goes on over time, probably at some point you will see some emergent phenomena. And I guess at that point the question is, how do I control that emergence? Like, how can I introduce some sort of metric of my own to try and understand what's going on and how to control it? To which I would answer that it's impossible to tell a priori. Like, it's so task dependent and it's so dependent on what you're doing that if you are trying to optimize for something, then you should go at the level that we were saying before. Like you should go at that level of abstraction and try to understand what your goals are at that point. Right? So it could be that you're optimizing for the overall success of the company or whatever it is that you're developing code for. Right? And so you might introduce that as a higher level goal and hope that what's happening at the lower level gets optimized to essentially achieve that objective. Right. Which nature has done through selective evolution. But if you're trying to introduce that signal into your own control process, you might need to do some particular actions at the lower level so that maybe you can achieve that higher level objective.
[85:28]
Host 2
I know, but this is the tyranny of objectives though, right? There's the shortcut rule, there's all of these. Because as soon as you formalize something, you block stepping stones and you exclude the actual behavior that might lead you to where you actually want to go. It's fascinating, but I do want to move on a little bit. I want to kind of slowly move the conversation towards your work. But I want to go via this fascinating article that you shared with me about morphogenesis. And actually it was by this guy, I think he's at Google, he's called Alexander Mordvintsev. And the article was called Growing Neural Cellular Automata. And it was performing morphogenesis. I mean, that sounds like a ridiculously complicated word, but it's about essentially, if I perturb something or if I have some initial starting state, how could I create something that I want to create?
[86:15]
Dr. Daniele Grattarola
Right.
[86:15]
Host 2
So this was using a cellular automata on an image on the 2D plane. And he was able to design a cellular automata and an update rule which would quickly converge to a desired image. So there is a picture of a lizard.
[86:28]
Dr. Daniele Grattarola
Yeah.
[86:28]
Host 2
And you could, you could damage it and perturb it and then it would, it would just come back to the lizard. And it's just, it's incredible to me, right, because it's like from these local low level rules, you could actually create something that had global coherence. And I've just never seen anything like that before. So tell me a bit about that article.
[86:50]
Dr. Daniele Grattarola
Yeah, that's amazing. So that article in particular was inspired by this idea of the flatworm, which is like a tiny creature that if you take it, you cut it in half, both halves are able to regrow into the whole thing independently. And so the question there was, how can this animal be doing this? Because the decision to grow is a local decision and the decision to stop must be a global decision. So there must be something inside the growing process that's somehow coordinating the global shape. Right. And in fact, this idea of morphogenesis in cellular automata was already present in the literature, of course, and people had been doing it even 10 years prior, trying to generate flags. So you have a three-state cellular automata with different colors, and you try to arrange the cells of a particular color in a particular region of the image so that you can generate, like, the flag of Italy or the UK or whatever. Right. So you can kind of see that it grows into that shape. And what they did in that paper was bringing it to the absolute next level. Right. So it was looking at this flatworm and saying, okay, it's able to regrow into the full thing. Okay, can we do the same thing with a neural network? And so can we learn to do the same thing? Because this goes back to probably what we were saying before, which is that designing these kinds of rules can be insanely hard. Like you don't know what the rule is that just by letting it evolve locally on input of pixels, it eventually gives you the lizard. Or you know, they have these smiley emojis, they have different kinds of emojis and images. And so how do we design that? Well, the answer in that case was, well, we'll take a convolutional neural network, which if you look at it has essentially the exact same kind of shape as the cellular automata. So it has like a local 3x3 kernel that updates the state of every pixel synchronously. So there's a lot of overlap there.
And they said, okay, let's just train this neural network in a recurrent way. So we're just gonna propagate forward and then use backpropagation through time to adjust the weights so that after some steps of computation, we land in a particular objective state. And so this is what they did. And they were able to make this very, very robust actually. So they were able to have the image grow from a single pixel into the full thing. And then you can start trying to make that robust to perturbation. So what happens if I cut the image in half? I would like it to grow back into the full thing. And you can actually introduce that type of input-output example in the training process. And so that's what they did. And then they have this whole analysis of what the cellular automata does or the neural cellular automata does. For example, they let it evolve way past the training horizon that they've trained it for. And so they let it compute over and over and over again. And at some point what happens is that it breaks the stability and it kind of starts producing the same pattern everywhere on the image. So the single lizard kind of becomes a pattern of textured lizards everywhere. And in fact, they actually have done some really cool follow-up work where they do the same to actually generate textures. So you can use that instability to generate textures on an image, which is very cool. And nature, like if you look at the images.
[90:27]
Host 2
Well, it is possible to, because I know they did a whole bunch of stuff to robustify it, if you like, but it is possible to make it lose its global coherence. So on the lizard I would press in the middle of the lizard and if I kind of oscillate the mouse cursor a little bit, I could make the lizard grow another pair of legs and feet. But what fascinates me is that, I don't think it was synchronous either. I think, to make it resemble real life a little bit more, they performed the update rules randomly and stochastically. But it's incredible though, isn't it, that just by having essentially what is a filter bank. I think it's a bit more complicated than that. They have a notion of if it's growing and dead and alpha. They've got about 16 different values, haven't they? Not just RGB for every single pixel, but basically it's a gridded CNN. And you're just updating this thing and you get an insane amount of global coherence from these bottom-up rules. And there are loads of people we're speaking to that think that AI must be top-down. You know, it's not possible for it to be bottom-up. But this is fascinating, right?
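The update described in this exchange (a shared 3x3 local rule, around 16 values per pixel, cells firing stochastically) can be sketched as a forward pass only. This is a rough illustration under stated assumptions, not the article's implementation: the weights here are random and untrained, the hidden width `HIDDEN`, the seed layout, and the function names are hypothetical, and both the alive-cell masking and the backpropagation-through-time training loop are left out.

```python
import numpy as np

CHANNELS = 16  # values per pixel, as mentioned in the conversation
HIDDEN = 32    # hypothetical hidden width, chosen only for illustration

def neighbourhood(state):
    """Stack each cell's 3x3 neighbourhood: (H, W, C) -> (H, W, 9 * C)."""
    shifted = [
        np.roll(np.roll(state, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
    ]
    return np.concatenate(shifted, axis=-1)

def nca_step(state, w1, w2, rng, fire_rate=0.5):
    """One update: every cell applies the same small network to its neighbourhood."""
    perception = neighbourhood(state)
    hidden = np.maximum(perception @ w1, 0.0)  # shared per-cell layer with ReLU
    delta = hidden @ w2                        # proposed change to each cell's state
    # Stochastic update: only a random subset of cells fires on each step.
    fires = rng.random(state.shape[:2] + (1,)) < fire_rate
    return state + delta * fires

rng = np.random.default_rng(0)
# Untrained random weights; a trained model would learn these against a target image.
w1 = rng.normal(0.0, 0.1, size=(9 * CHANNELS, HIDDEN))
w2 = rng.normal(0.0, 0.1, size=(HIDDEN, CHANNELS))

state = np.zeros((32, 32, CHANNELS))
state[16, 16, 3:] = 1.0  # hypothetical single seed cell in the middle of the grid
for _ in range(8):
    state = nca_step(state, w1, w2, rng)
```

Because there are no bias terms, a cell whose whole neighbourhood is zero stays zero, so information can only spread one cell per step outward from the seed. That locality is what makes any global coherence in the trained version notable.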
[91:30]
Dr. Daniele Grattarola
It is, it is. And so one thing that I find fascinating is that there is absolutely no reason why this should work, like at all. There is nothing that we can observe that says that these kinds of rules should exist at all. This model, in principle, is too simple for it to actually work. But in fact it turns out that these models, these CNNs in particular (and then in my own work we proved it for generic graphs), are universal models. So if there is some kind of computation that can be expressed as a cellular automaton, and by extension as a GNN, as a CNN, then the CNN can implement that computation. And I think that it is really fascinating that this computation exists. So what this paper answers to me is not the question of whether we can do it with a CNN. The really fascinating thing is that yes, this can be expressed as a process that, you know, iteratively and locally kind of grows the image into what we want. And that's a really fascinating thing, that they were able to actually do this at all. And it is not at all obvious that they could, but yeah, they could do it. Yeah.
[92:42]
Guest 2 (Keith)
I think the important thing for the listeners to know is that yes, it has the small grid which is its input. However, the cellular automata is absolutely not the simple rules that we're used to. Right. Like it's actually a relatively deep neural network behind that. Taking a look at that input, deciding what to do. Is that correct?
[93:05]
Dr. Daniele Grattarola
Yeah, I mean, it depends on what you mean by complexity. I would say that being able to compress the information of the image into a relatively small kernel of a CNN, it's still a fairly simple way of doing it, right? So maybe it's not as simple as you would have with the Game of Life. So you still have several thousands or probably even millions of parameters in that neural network, but it's still encoding a lot of information. And especially what's fascinating to me is not that it's just outputting the image in one shot, which of course there would be better ways to do, but the fact that it's doing it iteratively. This is a process that, you know, by applying the same rule at every pixel and doing so iteratively, is able to output the image. And so this is what I think is really fascinating.
[93:55]
Guest 2 (Keith)
Yeah, so I agree with that. I'm just trying to set a baseline here. So one thing for the listeners to understand is that there is this quite complicated, you know, neural network that's looking at the small window and then deciding, you know, how to update. So it isn't old-school cellular automata that have update rules that can be written down in three lines of code or something. Okay. And then the other thing that very much interests me about this project in particular is, you know, I get on this soapbox pretty often, okay: a neural network as it's typically conceived, which takes, you know, some inputs and does what is ultimately equivalent to a forward pass through a fixed-depth, you know, thing (you can always unroll it that way), and then you get an output, is by itself not Turing complete. What you need is the ability to do this iterative, you know, computation, if you will, on kind of a working space. And that's exactly what we have, you know, in this work: there's this plane, and it's learned this computation that, if it gets iterated over and over again, can do very fascinating things. And so I think it's just important for everyone to understand that without that iterative capability, without that kind of working space, without those temporal dynamics, you know, you don't get this kind of behavior.
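For contrast with the learned update, the "old-school" kind of rule mentioned here really does fit in a few lines. The sketch below is Conway's Game of Life on a small wrap-around grid; the toroidal boundary and the names `life_step` and `blinker` are illustrative assumptions, not anything taken from the work under discussion.

```python
import numpy as np

def life_step(grid):
    """One synchronous Game of Life update on a wrap-around (toroidal) grid."""
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # A cell is alive next step with exactly 3 neighbours, or 2 if it is already alive.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(int)

blinker = np.zeros((5, 5), dtype=int)
blinker[2, 1:4] = 1                 # horizontal bar of three live cells
after_one = life_step(blinker)      # the bar turns vertical
after_two = life_step(after_one)    # and flips back to horizontal
```

The neural version keeps the same structure (look at the 3x3 neighbourhood, apply one shared rule everywhere, repeat) and only swaps the hand-written rule for a trained network.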
[95:14]
Dr. Daniele Grattarola
Exactly, exactly. And that's exactly what happens. Right. And like, for people working in the GNN community, this would be much more trivial to see. But what's happening is that by recurrently feeding the output back as input into the network, every cell essentially is able to see farther and farther away from itself. Right. Because the receptive field in a sense aggregates information from the neighborhood. But then so does every other receptive field of every other cell. So after two iterations, you will have reached like a neighborhood of size, I guess, four by four instead of three by three. And so you go on growing. And this is exactly what we do in graph neural networks. Like every layer lets you go one step further. And so what's happening with this iterative computation is that as the iteration progresses, every cell gets access to essentially a larger and larger view of the whole system. And at some point this information kind of bounces around like waves in a pond, if you think about it, and comes back at some point.
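The growth of the receptive field can be checked with a few lines of NumPy. This is a sketch under the assumption of a plain 3x3 neighbourhood with no dilation; `reachable_after` is a hypothetical helper name. For that setting the window that can influence a cell after t iterations has side 2t + 1, so two iterations give 5x5, slightly larger than the off-the-cuff "four by four" in the conversation.

```python
import numpy as np

def reachable_after(steps, size=11):
    """Mark which cells can influence the centre cell after `steps` 3x3 updates."""
    reach = np.zeros((size, size), dtype=bool)
    c = size // 2
    reach[c, c] = True
    for _ in range(steps):
        padded = np.pad(reach, 1)               # pads with False
        grown = np.zeros_like(reach)
        for dy in range(3):
            for dx in range(3):
                grown |= padded[dy:dy + size, dx:dx + size]
        reach = grown
    return reach

for t in range(4):
    side = int(np.sqrt(reachable_after(t).sum()))
    print(t, f"{side}x{side}")                  # 1x1, 3x3, 5x5, 7x7
```

The same counting applies to graph neural networks, with "one more ring of pixels" replaced by "one more hop in the graph" per layer or iteration.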
[96:23]
Guest 2 (Keith)
But that's the key. But that's the key, right? There is at some point. And the problem is that with irreducible computations, you don't know at what point that is. And so this T, this T parameter is open ended. The only thing you can do is sit there computing, computing, computing. That's why it can never be compressed into any fixed number of layers. It's like you have to have this open ended, you know, T, you have to have an open ended number of layers.
[96:50]
Host 2
Yeah, I wanted to unpack this a little bit as well because there's something magic about this. And it's exactly as Keith said.
[96:57]
Dr. Daniele Grattarola
Right.
[96:57]
Host 2
You know, people even say consciousness itself, what's emergent and magical about it is its reflexive property, the fact that it's constantly going round and round in circles. And when we look at this cellular automata, even this gridded CNN version, it appears very lifelike. And that's why, because we used to think that neural networks were kind of like performing iterative computations. And now, based on our conversation with Randall Balestriero, we know that actually they're just decomposing Euclidean space up into these kind of polytopes. And the amount of computation is finite. So it's a different type of computation. But anyway, I wanted to move the discussion on to your work, Daniele. So we've discussed these gridded cellular automata with CNNs. And then graph neural networks are absolutely fascinating because they extend the notion of a CNN into this world where you can have any structure at all. So you know, the concept of message passing, for example, it extends the CNN, right? So you still have this notion of a neighborhood, but you're not on this grid manifold anymore. And you've done exactly the same thing, right? You've done this morphogenesis, but with a point cloud. So can you tell us about that? It's absolutely amazing.
[98:11]
Dr. Daniele Grattarola
Yeah, it was fun because that whole paper was about exploring this very idea. People have been complicating everything about CAs, and at some point the question becomes, can you complicate the underlying geometry? And you get the graph cellular automata. And then we were trying to show that actually graph neural networks are universal engines to perform these kinds of computation on GCAs. And so the task at that point became like, okay, can we do morphogenesis on a graph? What does it look like on a graph? Right? And let's say the typical visualizable example on the graph is to take something that somehow represents space, something that we are used to interacting with as humans. And so this was the point cloud, so points in space. And we took several of them. We took this bunny-like thing, we took a graph that more or less represented writing, so some letters, so something that has a spatial kind of notion to it. And we were trying to ask the question, like we were saying before: does a rule exist that, starting from a random configuration of points, actually morphs these points into this coherent shape? And here, this shape is actually a real shape, if you think about it. And again, the cool thing is that this must happen just through sheer local message passing. So every node at some point will read the neighbors, will read where it is at that point, and it will decide where to go next. And just by this continuous exchange of information, which at some point again bounces around through the graph. And by the way, we have evidence that not many steps are needed for this kind of information to bounce around. What you really need is at least as many exchanges as the diameter of the graph, meaning the greatest distance that you have between any pair of nodes.
And so if you can do that, what you see is that in fact there exists a rule that takes you from random points to bunny, and it gets there fairly stably as well. You can train it to be fairly stable. And so, for example, what we saw is that, if you think about it, there are these two regimes that the neural network must learn. So it must learn to go from random to bunny and then from bunny to bunny. So it has to remain stable once it gets there, right? Which is what you try to do with the lizard. Like, you would like the lizard to remain a lizard even if you perturb it. And the same applies here. You would like the bunny to remain a bunny even if you perturb it. And so once it gets to the bunny, it has to stay at the bunny. And what you see is that in fact the network is able to very quickly put everything where it's supposed to be. Almost immediately, like in two or three steps, the random point cloud becomes essentially a bunny. And then it learns to gradually adjust the remaining points. And it is fascinating that everything is happening as the same rule gets applied everywhere. And so we kind of explored that and it worked fairly well. Although sometimes we had this weird effect that you probably get as you train a recurrent neural network as a dynamical system, where instead of converging immediately, it starts oscillating around the bunny or around whatever target you have. And it does these weird oscillations where it goes from bunny to random to bunny to random and so on forever. And it's really fun when you animate it because it looks like the bunny is stomping on the ground, because there is this weird oscillation on the foot. So it was really nice to work in that space.
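The remark about needing at least as many exchanges as the diameter of the graph can be illustrated with plain breadth-first search. This is a toy sketch, not code from the paper: the path graph and the function names are hypothetical, and the graph is assumed to be connected (on a disconnected graph the last function would never terminate).

```python
from collections import deque

def bfs_distances(adjacency, source):
    """Hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def diameter(adjacency):
    """Largest shortest-path distance between any pair of nodes (connected graph assumed)."""
    return max(max(bfs_distances(adjacency, n).values()) for n in adjacency)

def steps_until_everyone_informed(adjacency, source):
    """Rounds of purely local message passing until a signal from `source` reaches all nodes."""
    informed = {source}
    rounds = 0
    while len(informed) < len(adjacency):
        informed |= {nb for n in informed for nb in adjacency[n]}
        rounds += 1
    return rounds

# Hypothetical toy graph: a path of six nodes, 0 - 1 - 2 - 3 - 4 - 5.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
print(diameter(path))                            # 5
print(steps_until_everyone_informed(path, 0))    # 5: an end node needs `diameter` rounds
print(steps_until_everyone_informed(path, 2))    # 3: a central node needs fewer
```

In the worst case over starting nodes, the number of rounds equals the diameter, which is why fewer exchanges than that cannot let every node hear from every other node.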
[101:49]
Host 2
Amazing. I mean, in a way, this is an entirely new... And I know you got the work published at NeurIPS, which I think should give an indication of how impactful it is. But you can think of it as an entirely new model of computation in a way. I mean, you said in the paper that you could apply it to things like swarm optimization and control and modeling epidemiological transmission and even improve our understanding of complex biological systems in the brain.
[102:11]
Guest 4
Right?
[102:12]
Guest 2 (Keith)
Yeah, forget all that. I'm just looking for a first person shooter where I'm playing in a world that's the cellular automata that self repairs anytime somebody does damage to it.
[102:22]
Dr. Daniele Grattarola
Yeah, you can do that type of stuff. Like that's the cool thing about this, is that once you break free from the grid, in a sense, every time you have this kind of local interaction. And the cool thing is that the interaction can signify anything at that point. It doesn't need to be a discrete projection of 3D space anymore, which is what the image is. You project the 3D world into a 2D plane and then you discretize that. So it can be anything. It can be relations between humans, it can be relations between neurons, and whatever is doing this kind of computation through local exchanges in whatever geometry of the cells you want to have. And by the way, one thing we also tested was the setting of the dynamical graph. So a setting in which the graph changes at every iteration, if you want. And so all of that kind of unlocks a lot of possibilities, because now what you have is that, again, if you can specify the correct objectives, which is not something you can always do or always know how to do. But if you can, and if everything turns out to be differentiable, which is another probably big limitation, then you can train this object kind of end to end to give you the desired behavior. Right?
[103:42]
Guest 2 (Keith)
Is that the idea? So with the protein, protein modeling or synthesis, you know, trying to find the DNA sequence that ultimately will give you the desired protein structure. What's the connection there?
[103:56]
Dr. Daniele Grattarola
So the connection is that there is no real connection so far. Like I'm not trying to apply this type of computation to the protein design problem because it introduces an extra layer of complexity, and I don't think we're quite there yet to be able to use this GCA stuff reliably to solve any kind of problem. And I'm not even entirely sure that any function should be expressed as this sort of recurrent computation on graphs. So what we're doing with the protein design is just trying to solve different kinds of sub-problems, which is for example, what function should the protein have and what does that look like in terms of structure? And once I have the structure, what does the sequence look like? So there is no idea of recurrent computation in that space. But in a sense it's similar in that you could say you're trying to find, let's say, a description for something that has a higher-level behavior. But in that regard the CA stuff kind of lives on its own for now. I'm hoping that we'll eventually apply it to modeling protein dynamics or something like that.
[105:04]
Guest 2 (Keith)
Well, what I'm curious about is, because the problem of protein folding itself is obviously difficult. But suppose you could design a cellular automata that could take as input a DNA sequence, some type of relatively simple, you know, input range on the DNA, and then it could run, run, run, iterate and give you the folded protein as a result. Then if that cellular automata was actually invertible, you could run the reverse computation to get back to a possible sequence that would have given you that. Is there research into invertible cellular automata of this nature?
[105:42]
Dr. Daniele Grattarola
Okay, so let's unpack that because there are two things. The first is that AlphaFold actually kind of works like that, meaning that it has this refinement procedure, meaning that it predicts the structure and then it kind of feeds it back and then tries to iterate on the predicted structure. And there is also work on reversible cellular automata, although I have yet to hear about reversible neural cellular automata or so on, because that would imply a reversible neural network and that would require the whole thing to be invertible, essentially. Right. But so, yeah, what you say is probably a possible way to do it. What I'm wondering is if it's the best way to solve the problem, because at some point, what I came to realize working, so I had the luck to work with many real scientists, in a sense, meaning people that actually work with the brain or biology and people that actually have a deep biological knowledge. And what I came to realize is that at some point it becomes a matter of solving the problem. So it's like solving the problem is more important than the way you solve it, in a sense. Right. And so if you're trying to solve protein design, it feels like an exercise in style to try and do it with a cellular automata, for example, because you don't have any strong evidence that the function that goes from structure to sequence is actually one such recurrent kind of computation that you need to do. And so at this point, what we're trying to do, and this is actually something that I've tried to force myself to do because it can become difficult at times, is to not try and use the big guns immediately. So you kind of want to step back and go back to basics and, you know, start from a multilayer perceptron and see what it does and see if it can actually work and try to, you know, solve the problem. Actually solve the problem. So you start simple and you, you know, stress your neural network until you hit a wall.
And once you've hit that wall, then you try and make the model more complicated and you try and see if different things could work. And I'm seeing that that's probably a good way to approach this problem. So right now, multilayer perceptrons are the way to go.
[108:00]
Guest 2 (Keith)
Well, wait, have you taken it one step further towards simplicity and said, let's start with a linear model? And if that doesn't work, then.
[108:09]
Dr. Daniele Grattarola
Yeah, no, but you see, that's when sometimes you already know that some models are not good. Right? Because the thing that I'm trying to do is highly nonlinear. And so a linear model probably wouldn't work. You can try even. I mean, you can try, you should try, probably, but you know, it won't work. What I'm talking about is whether you already need to go and look at that structural source of information. Do you need the graph immediately or can you solve the problem, let's say, from the sequence or from just the coordinates without the graph representation? And so that's what we've been doing a lot in this protein design space, which as I said is kind of orthogonal to the work on cellular automata. Because with cellular automata the question is whether this computation exists. If I can formulate the objectives, can I find the solution with a neural network? It's a different set of questions and they're more like, I don't want to say essential, but they're at a more basic level than just trying to actually solve a problem. It's more asking questions about the universe or this particular computational model and seeing if there are answers. Right.
[109:18]
Host 2
Yeah, this whole thing blows my mind. I've only just discovered this and from an engineering point of view, I'm fascinated by this notion of having systems that can be self healing in some sense or even having multiple agents that are going around my system and kind of repairing things that get broken by people. So I guess, I mean, we'll slowly wrap, but I wanted to, if there are folks that are interested in some of these topics we've spoken about, so complex systems theory, you know, things like graph neural networks and so on, and the work you're doing, where should they look? And also I'm interested just to know personally, what other areas are you interested in?
[109:57]
Dr. Daniele Grattarola
Right. So as far as resources go, you can approach it at different levels. So if you just want to learn about cellular automata, there's tons of resources out there. And what I would suggest people do is that they go and look on Twitter, which as weird as it may sound, is where like the actual hacker community is doing this type of insanely complex cellular automata that, once you see them, it's really hard to think about them being cellular automata because the behavior is just so complex and lifelike. And so sometimes you will find some of these hackers in this community that show their cellular automata and then a side-by-side comparison with the real-world living system. And they're exactly the same, moving in the same way. And it's like, okay, so is the CA predicting what's happening in biology? What does that tell you about the nature of the world? And so, yeah, so I think, for.
[110:57]
Guest 2 (Keith)
The record, I think you may be the first guest that's recommended Twitter as.
[111:01]
Dr. Daniele Grattarola
Yeah, but that's the reason for that.
[111:03]
Guest 3 (Keith)
I love it.
[111:04]
Dr. Daniele Grattarola
That's awesome. That's what it is. As weird, as unscientific as it may sound, that's where interesting things are happening. Like, you know, Twitter's the new arXiv, huh? Yeah, Twitter is the new arXiv and GitHub is the new arXiv. Like you will find, you know, bleeding-edge cellular automata on GitHub. You don't necessarily see them published at all. Right now, like, we are starting to see this NCA stuff popping up at, you know, NeurIPS, ICML, ICLR, and it's like starting to make a breakthrough, but, you know, it's more.
[111:38]
Guest 2 (Keith)
So there's. Tim, there's hope for us. You and I may yet become researchers.
[111:41]
Dr. Daniele Grattarola
If now we can just do it.
[111:43]
Guest 2 (Keith)
On GitHub and Twitter.
[111:45]
Dr. Daniele Grattarola
Yeah. And if you're. Let's say if you're interested in more academic work, especially on the biology side, there is the work of the entire lab of Michael Levin. And it's like he does a lot of work regarding studying emergence in real biological systems. And so it's like they study the cells of frogs and they take cells out of frogs and see what they do in a different environment. And they were able to create these, like, small biological robots that essentially grow out of skin cells to create what they call these xenobots. And that whole group.
[112:27]
Guest 2 (Keith)
Group.
[112:27]
Dr. Daniele Grattarola
And by the way, he's also part of that morphogenesis paper with CNNs. And so that whole group is doing excellent work in that regard. I would say they are the pioneers of this entire idea of emergence.
[112:42]
Host 2
I discovered a YouTube channel called Emergent Garden, and it's by a guy called Max Robinson, and he's got some really cool videos. I definitely recommend you guys check that out.
[112:52]
Dr. Daniele Grattarola
Right, right.
[112:53]
Guest 2 (Keith)
Okay. But does he have a Twitter feed? Because if he.
[112:56]
Dr. Daniele Grattarola
That's the real point. And, yeah, you know, you can also keep an eye out. As I said, like, papers are starting to pop up, and this is all in terms of neural cellular automata. If you look at the literature on cellular automata on their own, like, there's a whole bunch of literature that goes back to the 60s. So, like, any kind of variation on the theme has been explored and proposed, even for graphs. And so right now we're starting to see this convergence between neural and cellular in a sense. But there's tons of literature in that whole space if you just approach it from a perspective of dynamical systems. And you can see actually really interesting things. So Tim, before, was asking me about this idea of clustering the behavior of rules into classes 1, 2, 3, 4. And as it turns out, there are pretty clear entropy measures that naturally cluster the rules according to their abstract behavior. And so it's really fascinating stuff. And you can really see that Class 4 rules actually have their own space in this entropy description. So, yeah, there's a lot of things you can look into. But, yeah, Twitter all the way, man. If you want to see nice images, that's where you go.
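As a rough illustration of what an entropy measure on CA rules can look like, the sketch below runs a few elementary (one-dimensional, binary) cellular automata from a random start and computes the Shannon entropy of 3-cell blocks in the final row. This is only a crude stand-in and an assumption on my part: the specific measures referred to in the conversation are not named, and the block length, grid width, step count, and choice of rules 0, 4, 30 and 110 (commonly cited examples of classes 1 to 4) are all illustrative.

```python
import numpy as np

def eca_step(row, rule):
    """One step of an elementary (1D, binary, radius-1) CA with periodic boundary."""
    left, right = np.roll(row, 1), np.roll(row, -1)
    index = 4 * left + 2 * row + right          # neighbourhood read as a number 0..7
    table = (rule >> np.arange(8)) & 1          # bit k of the rule number = output for index k
    return table[index]

def block_entropy(row, k=3):
    """Shannon entropy (bits) of the distribution of length-k blocks in `row`."""
    blocks = sum(np.roll(row, -i) << (k - 1 - i) for i in range(k))
    counts = np.bincount(blocks, minlength=2 ** k)
    p = counts[counts > 0] / counts.sum()
    return float(0.0 - (p * np.log2(p)).sum())

def final_entropy(rule, width=256, steps=256, seed=0):
    """Entropy of the last row after evolving a random initial row under `rule`."""
    row = np.random.default_rng(seed).integers(0, 2, width)
    for _ in range(steps):
        row = eca_step(row, rule)
    return block_entropy(row)

for rule in (0, 4, 30, 110):
    print(rule, round(final_entropy(rule), 3))
```

A single number from a single row will not separate the classes cleanly; studies of this kind typically look at several quantities at once (for example entropy over space and over time, and how it varies), which is where the clustering described above would show up.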
[114:20]
Host 1 (Tim)
Fascinating.
[114:21]
Host 2
Yeah. I mean, on the entropy thing, I'm sure Carl Friston and even people like Kenneth Stanley, you know, the people that study artificial life, they think of lifelike. I don't know if agents is the right way to do it, but, you know, information accumulation is something that's super important for the characteristics of intelligence and life.
[114:39]
Dr. Daniele Grattarola
One fascinating concept is the Edge of Chaos and the work by Langton, for example. And that's also something that pops up continuously inside the route, for example, and elsewhere. Oh, yeah, sure, sure, sure, sure.
[114:56]
Host 2
Well, Dr. Daniele Grattarola, it's been an absolute honor. This has been a really fascinating conversation, actually. I think we need to do loads more content on this area. This feels like an area that we've just not really done enough on. So, yeah, this has been amazing. Thank you so much.
[115:11]
Dr. Daniele Grattarola
Thank you very much for having me. It was a real honor. Thank you.
[115:15]
Guest 2 (Keith)
Pleasure.
[115:16]
Host 1 (Tim)
Remember to, like, comment and subscribe. We love reading your comments. I really hope you've enjoyed this episode. If you don't mind, please rate us five stars on the Apple Podcasts app and we'll see you back next week.