Sebastian
We can take an example of how nature evolved intelligence and use evolution instead. When you use a static fixed network that is not changing the weights during its lifetime, if you cut off a leg, it will probably fail because it can't adapt. But these Hebbian networks, they change the weights all the time. It's basically like a continually learning, updating system where you can cut off a leg and oftentimes it will still be able to function, even though it has never seen this kind of variation during training.
Host
Let me jump in with a little explanation before we get started. This is a very technical podcast, but one of the more interesting ones that I've recorded in a while, and I want as many people as possible to benefit from it. I'll begin by explaining in simple terms what gradient descent is, which is used in most neural networks today, as opposed to neuroevolution, which is what this podcast is about. An illustration of gradient descent is standing blindfolded on a mountainside with the goal of finding the lowest point in the landscape. That lowest point is the solution. Your distance from it is the error, or loss. To get to the solution, you try to reduce that error step by step. So you feel around with your foot to find which direction the ground slopes downward. You then take a step in that direction. You repeat that process until every direction feels uphill. At that point, you've reached a low point called a minimum, though not necessarily the lowest point in the whole mountain range, which would be called the global minimum. With neuroevolution, imagine a plane flies over the whole mountain range and drops many people with different search strategies in many different places. One wanders to his left, another to his right. One walks in widening circles, another takes big jumps, another small ones. After a while, you see who ended up at the lowest point. You keep the best strategies, make variations of them, combine some of them, maybe the jumping and the walking in a widening circle. And then you send out a new group of people with those strategies. Over time, your people get better and better at finding the lowest point, even though none of them ever knew which way was downhill. That is the difference. You have a better chance of finding the global minimum, the lowest point in the entire mountain range. With gradient descent, you improve by following the slope. With neuroevolution, you improve by variation and selection.
You try many candidates, score the results, keep the better ones, and make new variants from them. No one has to know which way is downhill to begin.
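A minimal Python sketch of the contrast the host describes, under stated assumptions: the one-dimensional landscape `height`, the starting point, the step size, and the population settings are all invented for illustration. The landscape has a shallow valley near x = 1.9 and a deeper one near x = -2.1. Gradient descent started on the right-hand slope settles in the shallow valley, while a population dropped at evenly spaced points and improved by variation and selection ends up in the deeper one.

```python
import random

def height(x):
    # Toy landscape: shallow valley near x = 1.9, deeper valley near x = -2.1.
    return 0.25 * x**4 - 2.0 * x**2 + 0.5 * x

def slope(x):
    # Analytic derivative of height(x); this is the "feel with your foot" step.
    return x**3 - 4.0 * x + 0.5

def gradient_descent(x, step_size=0.01, steps=2000):
    # Repeatedly take a small step in the downhill direction.
    for _ in range(steps):
        x -= step_size * slope(x)
    return x

def evolve(pop_size=20, keep=5, generations=50, sigma=0.3, seed=0):
    rng = random.Random(seed)
    # Drop the population at evenly spaced points across [-4, 4].
    population = [-4.0 + 8.0 * i / (pop_size - 1) for i in range(pop_size)]
    for _ in range(generations):
        population.sort(key=height)          # score everyone
        parents = population[:keep]          # keep the better ones
        children = [rng.choice(parents) + rng.gauss(0.0, sigma)
                    for _ in range(pop_size - keep)]  # make new variants
        population = parents + children
    return min(population, key=height)

gd_x = gradient_descent(3.0)
evo_x = evolve()
print(f"gradient descent: x={gd_x:.2f}, height={height(gd_x):.2f}")
print(f"evolution:        x={evo_x:.2f}, height={height(evo_x):.2f}")
```

Note that `evolve` never calls `slope`: it only compares heights, which is the point of the analogy. On a landscape this simple, restarting gradient descent from several points would also find the deeper valley, so the sketch illustrates the mechanism rather than a general advantage.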
Interviewer
You have this new book out, Neuroevolution, so maybe you can start by explaining what neuroevolution is in AI.
Sebastian
Sure. So neuroevolution is the idea of combining evolutionary algorithms with neural networks. Instead of training networks with gradient descent or reinforcement learning, we can take an example of how nature evolved intelligence and use evolution instead. So you're applying genetic algorithms, evolution strategies, many different flavors of evolutionary algorithms to optimize some part of a neural network. And the nice thing is it doesn't only have to be the weights, like in standard backpropagation. It can be the architecture, it can be some learning rule, it can be hyperparameters, it can be many different parts, and it doesn't have to be differentiable. So it's quite versatile how you can apply it.
Interviewer
But what do you mean it doesn't have to be differentiable?
Sebastian
So when you typically train a neural network with something like supervised learning, you need to be able to differentiate through the network. When you use an algorithm like backpropagation, which is what basically most machine learning is built on, you need to be able to differentiate through the network to know, based on the loss, meaning how well it performs, how much to change each weight in the network. So the network needs certain properties, like this differentiability, for you to be able to apply this algorithm. And if you don't have that, then it's more complicated to apply something like backpropagation, whereas evolution basically still works. So for backpropagation you have to have specific activation functions that are differentiable. The architecture has to be differentiable. Everything has to be kind of smooth instead of being discrete. And if you want to do something like discrete actions, then you have to use some tricks. Evolution doesn't really care whether anything about it is differentiable or not.
Host
For gradient descent to work well, the landscape has to be smooth enough that wherever you are, you can feel a clear downhill direction. That is what differentiability means. If the terrain is too jagged, covered with boulders and hillocks, the tiny patch under your feet does not reliably tell you which way leads downhill overall. In that case, following the slope is much harder, and methods like neuroevolution may work better.
Sebastian
If you have a function, you can take the derivative of that function, and that's basically what you're doing. When you train a neural network, you view the whole neural network as a big function. And if you take the derivative of it, like from high school math, then it tells you the slope. It tells you which direction you have to push the weights for the error to get lower. That's all it does. It gives you the slope of the function, and that tells you: should I move this weight in this direction or the other direction? If I move it in this direction the error increases, and in this direction the error goes down. So let's say you have a network with three weights. Depending on how you vary those weights, you get an error surface, and if you get the slope, it tells you which way you should go down. And if you have a million-parameter network, it's a million-dimensional space, and backpropagation is very good at that. If you can do that, then it's great at finding that point, the minimum or the maximum. But if you can't, then it's very difficult to navigate that space, and that's what you can use evolution for. So basically the difference if you use evolution is that you don't need the gradient, because you have a population that is distributed on this landscape. You don't need to have this error signal. You can just sample. In an evolution strategy, for example, you are somewhere on that surface, and what you do is slightly change the weights. You create, say, a hundred different mutations around you, and then you go in the direction of the ones that do better. So you sample locally, but you have a population, so you're not only sampling here but in many places at the same time. And that can give you a direction to go.
And the nice thing is also that you don't only need to do mutations, slightly going in one direction. You can also do big jumps by doing crossover. The idea of crossover is that you take the genes of one parent and the genes of another parent, and by combining them maybe you get the best of both worlds, because each one has good building blocks. By combining them you get something even better. But if you want to do that in neuroevolution, you need to take care. You can't just randomly take half of one network and half of another network and assume that it works well. So neuroevolution researchers have been developing algorithms that allow a sensible crossover. You don't want to suddenly have two left hands; you want to have a left and a right hand. And the same applies to neural networks.
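The sample-and-select idea described above can be sketched in a few lines of Python. This is a simplified loop in the spirit of an evolution strategy, not any specific published algorithm, and the target weights, the staircase `loss`, and the settings (100 mutations per generation, mutation size `sigma`) are assumptions made for illustration. The loss is deliberately flat almost everywhere, so its derivative carries no information, yet comparing sampled mutations still gives a direction to move in.

```python
import math
import random

TARGET = [1.5, -2.0, 0.7]  # hypothetical "ideal" weights for this toy problem

def loss(weights):
    # Staircase error: constant on small plateaus, so the slope is zero
    # wherever it exists and gradient descent has nothing to follow.
    return sum(math.floor(abs(w - t) * 10) for w, t in zip(weights, TARGET))

def mutate(weights, sigma, rng):
    # Slightly change every weight with Gaussian noise.
    return [w + rng.gauss(0.0, sigma) for w in weights]

def search(generations=200, offspring=100, sigma=0.2, seed=0):
    rng = random.Random(seed)
    current = [0.0, 0.0, 0.0]
    for _ in range(generations):
        # Sample a cloud of mutations around the current weights.
        candidates = [mutate(current, sigma, rng) for _ in range(offspring)]
        best = min(candidates, key=loss)
        # Move only if the best mutation is at least as good as where we are.
        if loss(best) <= loss(current):
            current = best
    return current

start_loss = loss([0.0, 0.0, 0.0])
found = search()
print("loss before:", start_loss, "loss after:", loss(found))
```

Crossover, as discussed in the same turn, would add a step that combines two candidates. It is left out here because doing it sensibly for networks is exactly the hard part the speaker points to.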
Interviewer
And then it can combine them in a way that's still executable.
Sebastian
Right, exactly, in a way that makes sense, where you don't lose functionality.
Interviewer
So that's neuroevolution as it applies to AI. And you've been working on, and this is an area that fascinates me, plasticity of networks, self-growing networks, and to some extent recursive self-improvement. Are you working on that at all?
Sebastian
Yeah, exactly. We're basically trying to see what building blocks from nature we don't have in our current systems that might hopefully make them a lot better. And one of those things is this plasticity that you mentioned. One of the mechanisms through which our brains learn is that if two neurons always fire together, then the connection between them gets stronger. So it's a local learning rule, instead of backpropagation, which is like this outside thing that changes everything about the network. And so what we have been working on is: what if we only train those learning rules? For each synapse, we train this local learning rule instead of having a global signal, and we train it through evolution. So we did experiments where we trained those Hebbian learning rules, and they take into account how much the presynaptic neuron fires, the source neuron, and how much the postsynaptic neuron fires. Then, depending on how much they fire together, we have a learning rule that says: if this one fires often, or that one, or both together, then maybe make the connection stronger or make it weaker. And for every connection in the network we evolve its own rule. And then we showed that if you do that, starting from when the agent is born, we can start from a completely random network. The only thing it has is the learning rules; otherwise the weights are completely random. And in a few steps the network can self-organize, because the learning rules are trained by evolution to self-organize into a network that can, for example, control a car driving around or control a quadrupedal robot. And the interesting part is with this quadruped: when you use a static, fixed network that is not changing the weights during its lifetime, if you cut off a leg, it will probably fail because it can't adapt.
But these Hebbian networks change the weights all the time. It's basically a continually learning, updating system where you can cut off a leg and oftentimes it will still be able to function, even though it has never seen this kind of variation during training. And so now we're trying to extend those to more complicated tasks, more like continual learning tasks. But the main idea is that the weights never stop changing. Your brain is not frozen at some point; it keeps changing. So I think this is a very promising direction toward continually learning agents that are based on their own evolved learning rule, which could, for example, be optimized to facilitate continual learning through some kind of meta-learning.
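A minimal Python sketch of a per-connection Hebbian update of the kind described above. The exact form of the rule is an assumption: the conversation only says that each connection has its own rule depending on presynaptic activity, postsynaptic activity, and the two together, so this sketch uses one common parameterization with four coefficients `(A, B, C, D)` per connection. The coefficients here are random stand-ins; in the approach described, evolution would be what sets them.

```python
import math
import random

def forward(weights, inputs):
    # One layer: each output is tanh of the weighted sum of the inputs.
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def hebbian_step(weights, rules, inputs, lr=0.1):
    # Run the layer, then update every weight in place with its own local rule.
    outputs = forward(weights, inputs)
    for j, post in enumerate(outputs):       # postsynaptic (target) neuron
        for i, pre in enumerate(inputs):     # presynaptic (source) neuron
            a, b, c, d = rules[j][i]
            # Assumed rule: correlation term + pre term + post term + constant.
            weights[j][i] += lr * (a * pre * post + b * pre + c * post + d)
    return outputs

rng = random.Random(0)
# Weights start completely random; only the rules would carry evolved knowledge.
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
rules = [[tuple(rng.uniform(-1, 1) for _ in range(4)) for _ in range(3)]
         for _ in range(2)]

for _ in range(5):
    outputs = hebbian_step(weights, rules, [1.0, 0.5, -0.5])
print(weights)
```

Because the update runs at every step of the agent's lifetime, the weights never freeze, which is the property the speaker credits for recovery after damage.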
Interviewer
And the problem of course with continual learning on a fixed network is you overwrite weights and you forget information.
Sebastian
Right.
Interviewer
How do you deal with that?
Sebastian
Yeah, so that can still happen with those networks. One thing that people have been experimenting with: in the brain we have this Hebbian learning, but we also have this thing called neuromodulation. Neuromodulation is another type of system in the brain that tells some parts of the brain when they should learn. It does many other things, but one functionality is that it tells parts when they should switch learning on and when they should switch it off. So we and others have been experimenting with adding another type of neuron to a neural network that can then tell other parts of the network when learning should be switched on and off. That's one way toward a more continual learning system, where the system itself senses: okay, I should maybe overwrite this part, these weights are fine, the other parts maybe shouldn't be changed. The other thing we have been working on, as part of this EU project GROW-AI, is that we are also trying to learn not just in a fixed network, but actually learning to grow a network, taking more inspiration from neurogenesis and morphogenesis in nature. We're not given this brain; this brain has been growing. And that's one thing that we skip in machine learning. We just say, this is the neural network you have. But in nature things are grown, starting from a single cell. So we're trying to replicate that process, starting with one neuron and growing. We have a system that can do that, and currently it works for simple tasks. But ultimately the idea is to also take the environment into account during the growth process. There's already some information in the environment, so why not take advantage of it when the network is created and developed? And so that's something we're working on.
And also we have been doing a combination, and we call this a neural developmental program. It's basically like you're learning another small neural network, a copy of which runs in every neuron of a normal neural network. That small network can then decide when another node should be created, or how the connection between two nodes should change based on the activation. So it can learn, in principle, any type of learning rule, which also makes it harder to optimize. But it's basically a graph neural network type of system, one that is dynamic and can change while the agent is born and interacting with its environment.
Interviewer
You're growing parameters, right?
Sebastian
Yeah, in effect we're growing parameters, but not trainable parameters. The trainable parameters are like the DNA, the small program. The big network then also has parameters, but those it has to learn to change by itself. It's like our DNA and the developmental process: your brain is then not relying on evolution anymore, but on whatever learning mechanism is running inside that system.
Interviewer
You have, say, a three-layer network with four nodes in each layer.
Sebastian
Right.
Interviewer
So in the hidden layer, the middle layer, one of the neurons gets enough information, or more information than it can handle, and then splits off a new neuron?
Sebastian
Yeah, except that it would be up to the system. So there are two things. One is that it could just grow without getting any information at the start, like how our cells grow before they get any sensory information from the outside. You could just have a fixed program where the nodes communicate with each other and exchange information, and they figure out: okay, you should grow five times and I grow two times. This is without any outside activation. That's the first process that can run. But then it could be that they figure out: okay, I have too much. The system itself learns to do this. We're not telling the system what to do if it has too much information. In each node you have another recurrent network running, basically this genetic program, the developmental program. And that could figure out: okay, I'm getting all this information, so maybe I should split the cell so that I have more capacity. But that is something that we don't program in. The algorithm would have to figure that out by itself. And how it would figure it out is that genetic programs that do that would get a higher fitness than genetic programs that don't. So the other ones would be selected out, and some that do this a little bit would initially get better fitness, and then they would be selected. And that is a bit of a challenge, because the space of what you could learn is so large. You could learn any very weird developmental program. So to evolve really complicated things, it would probably require a good curriculum of tasks. First you have to do some small task, and then we make the task more and more complicated.
There's also some research that we talk about in the Neuroevolution book, like one system called POET, where you're evolving the environment and the agent together so that both can scaffold off of each other. Something like this is probably required to get that approach to work for really complicated problems.
Interviewer
And just on a very concrete level, you're dealing with computer code, right? So is there a function in the code that replicates a neuron and then adjusts that new neuron's weights?
Sebastian
Yeah, basically the small developmental program is a neural network that has an output that says, create another node. When that output is over some threshold, we just add another node connected to the parent node. And then there's another output in the network: if you give it as input the state of two nodes, it tells you how much you should change the connection between them.
Interviewer
I see. Yeah.
Sebastian
So it's an extra output of the developmental program network.
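The mechanism just described (a shared small program with a "create another node" output that is compared against a threshold, plus a second output that adjusts the connection between two nodes) can be sketched roughly as follows. Everything concrete here is a hypothetical stand-in: the `program` function is a single tanh unit rather than a real network, and the threshold, the parameter values, and the rule that a child starts with half its parent's state are invented so the example is small enough to follow by hand.

```python
import math

GROW_THRESHOLD = 0.5  # hypothetical threshold on the "create node" output

def program(inputs, params):
    # Stand-in for the small shared network: one linear unit followed by tanh.
    return math.tanh(sum(p * x for p, x in zip(params, inputs)))

def develop(grow_params, edge_params, steps=3):
    states = [1.0]   # start from a single node
    edges = {}       # (parent, child) -> connection weight
    for _ in range(steps):
        # Only nodes that existed at the start of this step may grow.
        for parent in range(len(states)):
            grow_signal = program([states[parent], 1.0], grow_params)
            if grow_signal > GROW_THRESHOLD:
                child = len(states)
                states.append(0.5 * states[parent])  # assumed child initialization
                edges[(parent, child)] = 0.0
        # Second output: given the states of two connected nodes,
        # how much should the connection between them change?
        for src, dst in edges:
            edges[(src, dst)] += program([states[src], states[dst]], edge_params)
    return states, edges

states, edges = develop(grow_params=[2.0, 0.0], edge_params=[0.5, 0.5])
print(len(states), "nodes,", len(edges), "connections")
```

In the approach discussed, `grow_params` and `edge_params` would play the role of the "DNA" that evolution optimizes, while the grown network's own weights change only through the program.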
Interviewer
Yeah. And. And so far, how many nodes have you grown? I mean.
Sebastian
Right.
Interviewer
The network has grown from what to what?
Sebastian
Yeah, so the network grows from one single node to, I think the biggest we had... We tried it on some robotic tasks, but I think the biggest one was on a small version of MNIST, and maybe a few thousand nodes, something like this. So it's orders of magnitude smaller than current networks.
Interviewer
But it grows by orders of magnitude. And if you can scale this up... I mean, first of all, are you confident enough? Are you at the point in this research where it's time to scale it, or are you still working on the base algorithms?
Sebastian
I think it could be almost time. I think there are some challenges with balancing growth and plasticity. What it sometimes likes to do is grow almost as much as it can and then use this whole network and just change the connections between nodes. So there has to be some kind of pressure toward being sparse, or not using too much. In nature you have this energy consumption, so you can't just grow and use all that energy, but if you don't give the system any pressure, it will. So it's a little bit harder to find the balance. But in neuroevolution you often also have multi-objective optimization. That means you can say: one thing that's important is the fitness, and the other thing that's important is, for example, how many nodes there are, some pressure on how big the structure is. And that helps. But there are still a few things we need to figure out. One thing that's still a little tricky: you want to learn to grow something, and you want to be able to elaborate on that. Nature is very good at it. It learned how to make a butterfly, and then it learned how to make a butterfly with eye spots, and different eye spots. These types of representations are also called indirect encodings, because you have this indirect way of making a network. And one thing that's difficult is that you want it to grow a certain network, and then you want it to learn more and add another structure to the network. But it should do that without forgetting how to grow the first part. So there is a kind of continual learning problem in developmental systems, and we're still trying to figure out how best to do that.
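The multi-objective idea mentioned above, fitness on one axis and network size on the other, is often handled by keeping the candidates that no other candidate beats on both objectives at once (the Pareto front). A minimal Python sketch, with made-up `(fitness, node_count)` pairs; it illustrates the selection principle only and is not the selection scheme of any particular system.

```python
def dominates(a, b):
    # a and b are (fitness, node_count); higher fitness and fewer nodes are better.
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better

def pareto_front(candidates):
    # Keep every candidate that no other candidate dominates.
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical grown networks: (fitness, number of nodes).
candidates = [(0.90, 500), (0.90, 120), (0.85, 40), (0.60, 45), (0.95, 900)]
front = pareto_front(candidates)
print(front)
```

Here the 500-node network is dropped because a 120-node network reaches the same fitness, which is the kind of pressure against growing as much as possible that the speaker describes.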
Interviewer
If you can get this to work, what are the scaling laws that apply? I mean, could you then grow this indefinitely? Or does something start happening beyond a certain point in the size of the network?
Sebastian
I think if we really figured out how to grow it, we should be able to really scale it up. And there's actually some interesting work that was presented at NeurIPS on reinforcement learning. Normally you have quite small networks for reinforcement learning compared to language models. But they showed that if you really scale the network up, like having hundreds of layers, you actually get much more interesting dynamics out of the system by itself, if you train this in a supervised way. They did this in some robotics tasks, and that is just scaling up a typical feed-forward network. So imagine there might be really interesting dynamics hidden if you scale it up but let the structure be grown and determined by this process, where it's not a typical feed-forward architecture. I would like to grow something where, I don't know, maybe it discovers how to grow a cortical column, and then it should be able to replicate that cortical column many times, or some other structure that's important in biological systems. And then I think we could get really interesting dynamics out of the system. But the difficulty is having a system that can learn to grow important neural motifs and is able to copy them, and also maybe create slight variations of them, because maybe it wants to use this kind of cognitive map for something and then slightly change it to use it for another modality. But I think if we figure out how to make this really efficient, then there are a lot of interesting things we can do.
Interviewer
Yeah. And once you, I mean growing is one problem, right?
Sebastian
Yeah.
Interviewer
Training is another, and then inference is another. So when you grow this network, you're starting with a trained network, right? A pre-trained one?
Sebastian
So what is trained is the program that grows it.
Interviewer
Right, but can you then train that network the way you would a GPT model?
Sebastian
So you can also train it in a more supervised way or through reinforcement learning; we tried some approaches like that. It just seems to be easier to train it with evolution. But the issue with this approach, and in machine learning and evolutionary computation in general, is deception: it's easy to get a decent score, but if you want to get all the way to the goal you might first have to go another way that decreases your performance before it can become better. There's this classical example of a maze where you can get very close to the goal, but to actually get to the goal you would have to go all the way around the maze. So getting a decent score is okay, but if you want the really good score you have to get worse first. And so these kinds of problems likely require approaches that can deal with this kind of deception. That's why in neuroevolution people have been developing more open-ended search methods, methods that don't just go for one target. It's under this umbrella term, quality diversity. You want to have an approach that explores much more of the space but also takes the quality of the solution into account. So for this kind of growing approach to work really well, we have to combine it with these quality diversity approaches, because all of these things work together. Otherwise it's quite difficult to explore the space. And we also did some work back in the day that shows just the difficulty of learning to learn and plasticity. There's a typical experiment people use in biology, this T-maze.
So you have a maze that looks like a T, and the mouse goes to one part of the maze and has to learn to remember: was there a big reward here, or was it there? After it has collected the reward, you put it back at the start of the maze. We did experiments where we used Hebbian learning for that. Imagine you learned to always go to the small reward. You go to the small reward, then you get put back, and then the high reward is here, so you learn to go to the small reward. This is the worst thing you can do. It's worse than always going to one side of the maze, because then you would at least get 50% right. But in terms of how close that network is to actually learning, it's closer than the network that is just not reacting and always stupidly going to the right side. So if you use a traditional approach, this will be the worst. This will directly be sorted out. So you need different methods for evolving these more cognitive skills than just saying, this is the fitness. Because otherwise you will get stuck in that solution of going 50% stupidly to one arm of the maze. So everything has to work together. You want to learn to learn, you want to learn to grow, you want to do everything at the same time. And that's kind of the challenge.
Interviewer
What I was referring to in learning is that right now we have large models that have been trained on a tremendous amount of data. And if you do too much training, post-training, you end up overwriting.
Sebastian
Right.
Interviewer
The idea here is that you could have a network that's trained, but as it operates during inference in the world, it learns new things, and rather than overwriting, it would grow new nodes and store that learning in those nodes.
Sebastian
Yeah.
Interviewer
Is that right?
Sebastian
Yeah. I think that will be the ultimate goal: to combine these smaller-scale experiments, and what we learn there, to ultimately allow language models to do this kind of continual learning. And there's some work by Sakana that goes a little bit in this direction, which is called evolutionary model merging. We have many networks that are already trained, like the thousands of language models that you can download online. So why not take advantage of all these networks that are already there? In this model merging approach, you take some layers from one network and layers from another network, and then you merge them together and let evolution figure out how to do it. So colleagues at Sakana have done this: you take a model that is good at Japanese, you take a model that's good at math, and you let evolution figure out how to combine them to have a model that's good at Japanese and math. And so the next thing could then be: could you have a model that's good at Japanese that you can teach incrementally how to be good at math or something else? But I think we're not there yet. But I think this is where the field is moving.
Interviewer
Yeah, the model merging is fascinating. And so you can take, if you had the source code.
Sebastian
Yeah.
Interviewer
GPT-5 and Grok, wherever they are now.
Sebastian
Yeah.
Interviewer
And. And combine them. I mean, theoretically.
Sebastian
Theoretically, yeah. Those would probably be a little big, but if you had the resources, you could do that. The trick is that you have to be able to see whether the combination is better. You have to be able to evaluate it, but you can evaluate on many of the benchmarks that you have, and then you could merge them together. For these really big models there could be some other challenges. The approaches people have done so far use slightly smaller models, also because with the smaller models we know that this model is not good at math and we know this one is not good at Japanese, whereas for the really big models it's a little harder to even know what they are not good at. So the evaluation is a little more tricky there. But in principle you could do it.
Interviewer
Yeah, you could distill smaller models from the parents and then merge those. How large are the models that you merge?
Sebastian
They're a few, I don't know, maybe 100 million parameters. Actually, I don't remember exactly, but not big compared to those.
Interviewer
And where do you see all of this going? You're also doing some really interesting work in evolving, what is it called, life, right? Virtual life?
Sebastian
Yeah, yeah. Artificial life. Yeah. Yeah, what do you call it? Artificial life?
Interviewer
Artificial life, yeah.
Sebastian
Right.
Interviewer
So is that related to this or is that completely separate?
Sebastian
No, no, that is very related. In artificial life, the idea is that life as we know it is one example of life, but artificial life is about life as it could be. People simulate things that have lifelike properties. And one lifelike property is growth. Self-organization and growth and self-replication are very essential to life. And those are also things we explore with these growing networks. But we also explore them with what are called neural cellular automata. It's also basically neural networks, copies of one network. Imagine just replacing the traditional rules of cellular automata, like the Game of Life, which has these fixed rules: if you have three neighbors, you create a new cell; if you have four, the cell dies. You can replace that with a neural network. So instead, each cell asks the neural network: what should I do? What state should I become next? And we were able to scale that to 3D. We have a paper where we're growing Minecraft structures with this. And the fun thing is you can grow a salamander in Minecraft, you cut it in half, and then it grows into two salamanders. And the nice thing is you can train those with supervised learning. So if you have a target, say you want to grow a house or a tree, then you can teach it to grow that. If you don't have a target, you can use evolution. We also use it to train soft robots, squishy robots, where we don't know what a good morphology for locomotion is. There you tell it to grow a structure, put it in an environment, and see how well it works. If it doesn't work, then we throw it out. And through this process we can grow structures that are able to locomote, and then we are also able to damage those. You can cut off parts of the structure.
And if it's trained to recover from that, then it can regrow based only on local information. It doesn't need any other information. It just needs to sense the local part, like a salamander that can regrow its tail. These methods can be used to do that. And there's this community, artificial life, and a few people at Sakana are also working on these kinds of artificial life ideas. It's an interesting direction that's a little bit outside mainstream machine learning, but I think there's a lot of promise in taking some of these properties from biological systems and putting them there. One is being resilient. Biological systems are incredibly resilient, while with deep learning you still often find these weird examples where it completely fails. So I think there's a lot of promise in using these systems that can self-organize based on local communication. They have an inbuilt resilience that I think we could exploit to make these deep learning systems more robust, and also adaptive.
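The step from a classic cellular automaton to a neural one can be sketched by making the update rule a pluggable function. In the Python sketch below, `step` applies any rule to every cell using only that cell's state and its eight neighbors, `life_rule` is the fixed Game of Life rule, and `make_learned_rule` is a hypothetical stand-in for a trained network: a single thresholded linear unit with hand-picked weights, far simpler than the networks used in neural cellular automata. The wrap-around grid is also an assumption made to keep the code short.

```python
def step(grid, rule):
    # Apply the same local rule to every cell; edges wrap around.
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbors = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            new[r][c] = rule(grid[r][c], neighbors)
    return new

def life_rule(alive, neighbors):
    # Fixed Game of Life rule: birth on exactly 3 neighbors, survival on 2 or 3.
    if alive:
        return 1 if neighbors in (2, 3) else 0
    return 1 if neighbors == 3 else 0

def make_learned_rule(w_alive, w_neighbors, bias):
    # Hypothetical stand-in for a network: each cell "asks" this function
    # what state it should take next.
    def rule(alive, neighbors):
        return 1 if w_alive * alive + w_neighbors * neighbors + bias > 0 else 0
    return rule

# A "blinker": three cells in a row that flip between horizontal and vertical.
blinker = [[0] * 5 for _ in range(5)]
for c in (1, 2, 3):
    blinker[2][c] = 1

after = step(blinker, life_rule)
for row in after:
    print("".join("#" if cell else "." for cell in row))

# Swapping in a different rule requires no change to step().
other = step(blinker, make_learned_rule(0.0, 1.0, -2.5))
```

Because every cell only reads its immediate neighborhood, damage elsewhere on the grid does not change what a cell can compute, which is the locality the speaker connects to regrowth.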
Interviewer
And adaptive. That's another area for evolutionary systems that use evolutionary strategies to find solutions, right? To find novel solutions to a problem, and then systems that can combine models based on the quality of their outputs and improve through generations. It seems like that would be very applicable to scientific research, because with gradient descent, what we were talking about earlier, you're going toward the local minimum and hoping it's the global one.
Sebastian
Right.
Interviewer
But you were talking about being able to cover a much larger landscape. Can you talk about what the implications are for scientific research?
Sebastian
Yeah, that's also something we're exploring at Sakana. There's this idea of an AI scientist, for example with The AI Scientist, but also this thing we call ShinkaEvolve, which is kind of like AlphaEvolve. It's a combination of evolution and large language models. Large language models are good at, for example, generating code and generating ideas. But to explore that space, it can be really useful to use evolution. Basically, you can use a language model as a mutation operator. One example is circle packing: you have a space and a number of circles, and you want to pack the maximum number of circles into that space. What you can do is ask a language model to give you a new solution, or multiple solutions. Then you evaluate those solutions based on fitness: what's the score it gets packing those circles? Then you do this again from the best one, or from a number of the best individuals. You ask the language model again to give you variations of the solution, and you do it over and over until you find a good solution that packs the most circles into the space. So you use evolution to navigate the space, but you're using the language model as a mutation operator. And you can do this for scientific ideas, for example. You can start with one idea and let the model generate variations of it. The only thing you need is to be able to somehow score it based on some fitness function. That is a little easier if you have circle packing. It's a little harder if you have some scientific idea, where it's more complicated to say whether it's a good idea or a bad idea. 
But this is a direction that Sakana is pursuing a lot, and other companies are pursuing too, where you have this combination of evolution, because it's creative in what it can discover, but a little more grounded, because you have a language model as the mutation operator. People in evolution have for a long time done things like evolving programs with genetic programming, but those were always very hand-tailored to the problem at hand. Now that you can use a language model, you can let it output code, ask it to modify the code, and navigate this space while applying all the lessons we have learned from neuroevolution, like more open-ended setups and using things like quality diversity to navigate the space and hopefully not get stuck in too many of these local optima. And I think it will change how science is done: you have this kind of AI scientist or co-scientist that you can exchange ideas with. It's navigating some space, it's giving you some hypothesis to test. This is the direction things are moving in.
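The loop described above (propose variations, score them, keep the best, repeat) can be sketched as follows. This is a minimal illustration, not ShinkaEvolve or AlphaEvolve. Two assumptions are made loudly: the `mutate` function is a random perturbation standing in for a language model call, and the task is a simplified circle packing variant in which a fixed number of equal circles is placed in the unit square and the score is the largest radius they could all have without overlapping. All function names are hypothetical.

```python
import math
import random

def fitness(centers):
    """Largest common radius for which circles at these centers fit in the unit square without overlap."""
    wall = min(min(x, 1 - x, y, 1 - y) for x, y in centers)
    pair = min(math.dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]) / 2
    return min(wall, pair)

def clamp(v):
    return min(1.0, max(0.0, v))

def mutate(centers, rng, scale=0.05):
    """Stand-in for the language model: return a varied copy of a parent solution.

    Assumption: a real system would prompt a model with the parent and ask for a variation.
    """
    return [(clamp(x + rng.gauss(0, scale)), clamp(y + rng.gauss(0, scale))) for x, y in centers]

def evolve(n_circles=5, pop_size=20, n_parents=5, generations=50, seed=0):
    rng = random.Random(seed)
    population = [[(rng.random(), rng.random()) for _ in range(n_circles)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)   # evaluate and rank
        parents = population[:n_parents]             # keep the best individuals
        history.append(fitness(parents[0]))
        children = [mutate(rng.choice(parents), rng) for _ in range(pop_size - n_parents)]
        population = parents + children              # parents survive, so the best score never drops
    return max(population, key=fitness), history

best, history = evolve()
print(f"best radius: first generation {history[0]:.3f}, final {fitness(best):.3f}")
```

The only problem-specific pieces are `fitness` and `mutate`. Swapping the random perturbation for a model call leaves the surrounding selection loop unchanged, which is why the approach depends so heavily on having a usable scoring function.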
Interviewer
Yeah. And you guys at Sakana, I mean where are you in that research? Have you.
Sebastian
Right.
Interviewer
Are you still at the architecture level, or have you run experiments to see if it will output something useful?
Sebastian
Yeah, so I think it's going this way. There's The AI Scientist from Sakana, and the recent version, the new version, was actually able to generate a paper that got accepted at a workshop. So it shows that it can generate some generally interesting things. Now the question is how far you can push this approach to generate something truly groundbreaking. I think that's the holy grail of the field, and it's still an open question. It can definitely generate some new things, but how far from the training distribution? How creative can it be? Is it just about prompting it and getting those ideas out, or does it need to run its own experiments, and do we need to fine-tune it on those results and then iteratively make it better? And there's also this idea of self-improvement, so that the model itself gets better and better, and it gets better at getting better. Those are ideas that we're also exploring.
Interviewer
Yeah. On the paper that was submitted, I think that was ICLR. I mean, it got accepted.
Sebastian
Right, right.
Interviewer
Was as I recall, it was kind of proving a negative. Right. How did that start?
Sebastian
Yeah, I wasn't part of that paper. But I think the main thing is that that's probably the worst it will ever be. This was with an older model, and you get this paper, right? If you replicated it now using Gemini or some other model, it would probably push further on this. The nice thing about the framework is that you can use the same framework and switch out the language model you're using. So the better the language models become, the better the papers they should be able to write. And I think that's the main thing: not necessarily what ideas were generated, but showing that you can automate the whole pipeline and that it will get better and better with better models. But for me, the interesting part is also how we can use this as a kind of co-scientist, because at least for some time there will be humans and AIs working closely together. How can you make sure that it can take both sets of ideas into account? How can you make sure that AIs and humans talk in the same language? There was an interesting keynote by Melanie Mitchell where she showed that what a model comes up with can look great on the benchmarks, look like the right solution, and get a good score, but it solved the task in a very different way that was not even intended by the humans. It exploited some feature of the domain that wasn't even built in. So we have to find a kind of language in which we talk in the same way. If we want to collaborate, we have to find common ground to be able to do that. And I guess there's already some common ground. 
It's natural language, it's trained on text, you can communicate with it, but you might not be sure about its intentions. So I think to collaborate well, we need to do a lot of work that goes beyond it just being able to write its own papers. How do we best combine what humans are good at and what machines are good at? I've always been interested in this kind of co-intelligence or hybrid intelligence: how can we combine the best of both worlds? Before, it was a little bit more clear what humans and computers are each good at. Now it's becoming less clear. So I think that's something we need to figure out.
Interviewer
Yeah, you said something interesting before. You said its creativity is constrained by the training data. Like, is it really going to come up with ideas that aren't in some way embedded in the training data? Two questions on that one. Maybe that's the case. But the training data can be very rich in scientific ideas that have never been explored. I mean, even evolutionary AI has not gotten the attention until very recently that it probably should because people get attracted by other things. AI is not like that. It'll look at everything. So aren't there insights to be discovered within the training data, within the body of scientific research as it exists?
Sebastian
I think there are some results that show it can generate something new. It's just a question of how far it can be pushed beyond what was in the training data. So I wouldn't say that it cannot produce anything new. For me it's just less clear how far that is outside of what it has seen. There are mathematicians, I think Terence Tao, who used it, and it found a proof that humans forgot or didn't know about. So it can certainly be helpful. People have also been applying it to optimize robot morphologies, and it came up with new morphologies that didn't exist before. But the question is, yeah, how far can we push it outside that?
Interviewer
And what you're saying is that, you know, human knowledge contains like a finite amount of all possible knowledge.
Sebastian
Right.
Interviewer
And how do you go beyond.
Sebastian
Yeah.
Interviewer
Current human knowledge?
Sebastian
Let's assume, I don't know, 50 years ago, you had a model that was trained only up to that point. Would it then have invented the iPhone at some point?
Interviewer
Yeah, I've heard this.
Sebastian
Yeah. I don't think the current ones would. They would probably invent other things, and then you could imagine that they invent something, then they could be trained on those inventions, and then maybe ultimately they would get there.
Interviewer
But maybe they would come up with something.
Sebastian
Something else.
Interviewer
Yeah, yeah. I mean the iPhone is. Is it. It certainly was revolutionary but it was. There was no groundbreaking tech in the iPhone.
Sebastian
Right.
Interviewer
It was an engineering exercise. We brought together mp3 player and.
Sebastian
And yeah.
Interviewer
So I mean given the right target it might yeah. Evolve an iPhone, but it might evolve something that.
Sebastian
Right.
Interviewer
Is better than an iPhone.
Sebastian
I think the thing is also that it probably has to be able to run its own experiments. If you just went back 50 years and only thought in your head about what you could invent, I don't know how far people would have gotten. You have to be able to either run an experiment, or manufacture something and then see how it does. And some people are working on combining this now, like Lila Sciences. They combine these ideas, and they have robots and manufacturing, to ultimately try to see what the model makes and then use it to automate science.
Interviewer
What kind of a factory? Because, you know, I'm thinking of Insilico, the drug discovery company.
Sebastian
Yeah yeah.
Interviewer
That has. Is using AI for you know, compound.
Sebastian
Right.
Interviewer
Yeah discovery. But then they have it attached to an automated wet lab that synthesizes the molecules and in this case what. What kind of a manufacturing.
Sebastian
Yeah, actually I don't know exactly what it looks like, but I could imagine it's robots that try to mix things together and see what happens. I think they're doing many different things. But I guess there are a few labs and companies that are moving in that direction, also automating materials science. And I think that's what probably has to happen, because if it's only able to output language, I don't know how much it will be able to do. I think it has to be able to affect the world or run experiments somehow, because that's how we humans, at least, learn. But I think things are moving in that direction.
Interviewer
Yeah. And so we talked about artificial life, we talked about growing or plastic neural networks, about evolutionary strategies. What. What are the other areas that you're focused on?
Sebastian
Yeah, a lot is also about open-endedness, like culture. How can you create a system, and that's very much tied to the current LLMs, that can keep innovating, keep producing interesting new things? That's why people are using language models more and more for that as well. But there are a lot of techniques in the neuroevolution community that people have already developed that are now getting augmented using language models. One is, for example, evolving both the environment and the agent at the same time.
Interviewer
Yeah, you mentioned that. What does that mean?
Sebastian
Right. So one of these examples is an algorithm called POET, where the agent might be a bipedal robot and the environment is the terrain. If you have a flat terrain, it's easy for the robot to walk, but then you can introduce gaps or obstacles, and the agent has to learn to deal with those. In this approach, you start with a very simple environment, but over time you make it more and more difficult. At the end it's solving crazy environments where it goes down or has to jump over things. It is really impressive what it can do in the end. And the interesting part is that if you had started with the really complicated final environment, it wouldn't have been able to solve it. So it needs to go through these stages to discover the stepping stones in behavior that let it do the final thing. People have also extended that, which we also talk about in the book. One approach is called OMNI, where you can extend this to things like environment generation in Unity, for example, where you don't just have this two-dimensional flat landscape. Now that we have language models, you can have the language model produce code that creates an environment. Initially it might create an environment that's very simple, and then it creates more and more complex environments. I think that's really interesting, and it could allow neuroevolution to scale up to really complicated tasks. And now you can even imagine combining this with neural networks that can simulate whole 3D worlds. If you combine neuroevolution with a controllable world that you can just prompt (what should this world look like? Should it be a simple world? Should it be a very difficult world, with more predators, more of this and that?), then I think we could really get an increase in what agents we can evolve this way.
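The stepping-stone argument above can be illustrated with a deliberately tiny toy. This is not POET, which evolves a population of paired environments and agents and transfers agents between them. The sketch assumes a one-dimensional "skill" and "difficulty", and a hypothetical learner that only improves when the task is within reach of its current skill. All names and constants are made up for illustration.

```python
def train(skill, difficulty, steps=10, reach=1.0, lr=0.2):
    """Toy learner: it improves only while the task is within `reach` of its current skill.

    Assumption: a task that is too hard gives no learning signal at all.
    """
    for _ in range(steps):
        if difficulty - skill > reach:
            break  # too hard: no signal, no progress
        skill = min(difficulty, skill + lr)  # never exceed what the task demands
    return skill

def curriculum(target, step_size=0.8):
    """Make the environment slightly harder each round and retrain the same agent on it."""
    skill, difficulty = 0.0, 0.0
    while difficulty < target:
        difficulty = min(target, difficulty + step_size)  # "mutate" the environment
        skill = train(skill, difficulty)
    return skill

direct = train(0.0, 5.0)   # start on the hardest environment
staged = curriculum(5.0)   # pass through easier environments first
print(direct, staged)      # prints "0.0 5.0"
```

Under these assumptions the agent trained directly on the hard environment never improves, while the agent that passes through intermediate environments reaches the target, which mirrors the observation that the final terrain is unsolvable without the earlier stages.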
Interviewer
And Sakana is building world models, right?
Sebastian
Yeah.
Interviewer
And are they explicit? You know, I had Fei-Fei Li on recently, right? And her Marble world model, I played with that, it expresses the output as an explicit representation of
Sebastian
Right.
Interviewer
A world. I'm, I'm interested in how you, you use the internal representation of a world for the, for the model to learn as it interacts with the external environment. So understands physics, it understands, you know, whatever it learns from the external world.
Sebastian
Right.
Interviewer
So which is Sakana working on one of the.
Sebastian
David Ha, our CEO, wrote one of the first papers on world models. That was maybe, I think, 2017, where he trained a simple world model at the time. Then you could train inside of this dream and have the agent get better and better. He trained this on a kind of 3D Doom task. And I think one of the most interesting things was that even back then, when the simulation wasn't perfect and you could see that it was a hallucinated world, you could already train the agent inside the world model to then be better in the real world. So that's something that David has explored in the past. Something we looked into a little bit is: when do you need a world model, and when can you just ask a language model for the answer? If there's something about physics, when can you just ask, and when do you actually need to run a simulation? Those are some things we have been looking at a little bit. Basically, like you said, answering the question: when is a world model useful? I might not need to ask a world model about every little step I take, right? But if I'm thinking about jumping over the table, maybe I should ask the world model: will I likely get to the other side or not? Can you learn when to run which process?
Interviewer
That's interesting. And so you're not using generated environments like that?
Sebastian
No.
Interviewer
Evolutionary AI has not been front and center. I mean there's, you know, the transformers took over and everything's about optimizing transformer architectures or looking beyond transformer architectures. Is there a reason why evolutionary AI is swinging back into focus? Yeah, maybe it always has been and I'm just not attending the right co.
Sebastian
I think it definitely wasn't the main focus, but I think it's just a very good pairing now, for example LLMs with evolutionary algorithms. That's a really good match, and that's why more people are now interested in combining generative AI with evolution. Before, in neuroevolution, you always had to think about what your representation looks like, and with a particular representation you might restrict the type of solutions you're getting. Now that you can use a language model, you don't have to worry so much about that. You can just have it come up with the representation, or be the representation. In that setting it's difficult to differentiate or use gradient descent. You can use the language model to give you ideas or generate artifacts, and for searching that space it's beneficial to use evolution, because it's hard to backpropagate through that space with gradient descent. So I think that's one reason why it's becoming more popular now. Approaches like AlphaEvolve and model merging, I think, fit really well together. And that's also something that Sakana is very much looking into. We want to go beyond the transformer. One of our co-founders, Llion Jones, is one of the authors of the transformer paper, and we think this is not where we should stop; we should go beyond what the transformer can do. One of our approaches is the Continuous Thought Machine, where the network isn't just input-output, where you get something in and put something out. The network itself can decide to think about a problem for longer time periods, so the external input is actually not as important as the internal thought process itself. And we made an architecture that allows the model to do that. 
And again, it incorporates some more biologically inspired algorithms, like an activation memory of what comes into each neuron, making each neuron more complex. So not just the simple neuron models that are used most of the time, but slightly more complex ones, where each neuron is its own neural network. And there's also this idea that biological brains often use synchronization and oscillations to do processing. I mean, I love neuroscientists, but these seem to play a really important role in biological brains, how neurons oscillate together or synchronize together. And so that's one of the key components of the Continuous Thought Machine: you look at how much the neurons synchronize, and you use that as a representation.
Interviewer
And synchronize, you're talking about the, what's the Hebbian "fire together"?
Sebastian
Actually, in this case it's more like, yeah, basically looking at how much they fire together, how much they correlate, so it's very related to the Hebbian one. And in this case you can train the whole thing with gradient descent. So it's not that you should only use evolution. I think it really depends on the type of problem you have: where is evolution useful, where is gradient descent useful? There's also some other work we published with neural cellular automata where we used gradient descent to optimize them. So I think both techniques are very complementary and useful for different aspects. And we also talk about that in the book.
Interviewer
Yeah, the thought machine and the data it's taking in. So do you apply a continuous stream of data, or is it...
Sebastian
Right, so the benefit of it is that it can operate on a continuous stream of data, but it can also be used for domains that don't inherently have a sequential dimension. One thing we applied it to is image classification. What happens there is that it gets the image, it learns to look around the image, and based on what it sees, it slowly builds up confidence: oh, I'm seeing an image of a dog. The other domains we used it in are a reinforcement learning task and also maze navigation. There it's quite a lot better than a standard LSTM, a long short-term memory network, at learning to navigate out of a maze. So these are the tasks we applied it to, but in the future we also want to apply it to more complicated tasks. One of the very interesting parts for me is that it looks eerily human. It sometimes looks at the eyes, or it looks at particular parts. If you compared it to eye tracking of what humans look at, I think it seems pretty similar. And you can track how confident it is that it sees a certain image. It's interesting too that some images are more difficult. With, I don't know, a dryer, it might take more time looking around to be sure, and with other images it looks very briefly and knows what it is. You could then take that into account and say, okay, when you're confident enough you stop computing, and if not, it can go on for a while. And in language models, people added this kind of reasoning, but it's often reasoning that you elicit through reinforcement learning, and you're reasoning in text. You can see the model's "oh, I think this and that."
But this is another type of reasoning that's happening in the neural substrate itself. So there's an interesting difference. And the question now is: can we take that and apply it to language models, and how would that look?
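The "stop computing once you are confident enough" idea from this answer can be sketched in a few lines. This is not the Continuous Thought Machine architecture. It assumes a hypothetical model that adds a fixed amount of evidence per class on every internal step, and it halts once the top class probability crosses a threshold. The function names, the threshold, and the evidence values are all illustrative assumptions.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_with_early_stop(evidence_per_tick, threshold=0.9, max_ticks=50):
    """Accumulate per-step evidence; stop as soon as the top class probability reaches `threshold`.

    Assumption: `evidence_per_tick` stands in for whatever the network computes per internal step.
    Returns (predicted class, confidence, number of steps used).
    """
    logits = [0.0] * len(evidence_per_tick)
    for tick in range(1, max_ticks + 1):
        logits = [acc + e for acc, e in zip(logits, evidence_per_tick)]
        probs = softmax(logits)
        if max(probs) >= threshold:
            break  # confident enough: stop thinking
    return probs.index(max(probs)), max(probs), tick

easy = classify_with_early_stop([1.5, 0.0, 0.0])  # clear image: strong evidence per step
hard = classify_with_early_stop([0.3, 0.2, 0.0])  # ambiguous image: weak, competing evidence
print(easy[2], hard[2])  # the ambiguous input uses more steps than the clear one
```

The design choice worth noting is that the amount of computation becomes a function of the input rather than a fixed property of the network, which is what allows an easy image to be resolved quickly and a difficult one to receive more internal steps.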
Interviewer
Yeah. Wow, that's fascinating, because it seems like all of these threads could converge, right?
Sebastian
Yeah, that's the hope.
Interviewer
Yeah. Yeah. Okay, Sebastian, I know you got a plan again, so I'll leave it there.
Eye On A.I. – Episode #330: Sebastian Risi – Why AI Should Be Grown, Not Trained
April 6, 2026
Host: Craig S. Smith
Guest: Sebastian Risi
This episode explores the concept of "growing" artificial intelligence, in contrast to the conventional idea of "training" via gradient descent. Host Craig S. Smith interviews Sebastian Risi, a prominent AI researcher, about the field of neuroevolution—a biologically inspired approach to creating adaptive, robust, and continually learning AI systems. The conversation covers how drawing from nature's methods—evolution, growth, plasticity, and self-organization—can overcome current limitations in AI and potentially lead to more flexible, resilient, and creative artificial agents.
On Resilience:
“Biological systems are incredibly resilient… deep learning, often you find these weird examples and it completely fails. So I think there’s a lot of promise in using these systems that can self-organize… they have inbuilt resilience that I think we could exploit…”
— Sebastian Risi [33:20]
On Model Merging:
“You can take a model that is good at Japanese, you take a model that’s good at math, and you let evolution figure out how to combine them together.”
— Sebastian Risi [28:50]
On Collaborative Intelligence:
“How do we best kind of combine it with what humans are good at and what machines are good at? … How can we combine the best of both worlds?”
— Sebastian Risi [41:40]
Limits of LLMs:
“Its creativity is constrained by the training data… Is it really going to come up with ideas that aren’t in some way embedded in the training data?”
— Interviewer [42:55]
The tone of the episode is intellectually adventurous and technically in-depth but remains accessible and engaging. Both host and guest are enthusiastic about the future of AI that draws more deeply from the lessons of biology—evolution, growth, continual adaptation, resilience, and creativity.
This episode is a comprehensive dive into why tomorrow’s AI may look less like a rigidly trained machine and more like a living, evolving, and endlessly adapting organism.