Transcript
Francois Chollet (0:00)
I think we're probably looking at AGI 2030, around the time that we're going to be releasing, like, maybe ARC 6 or ARC 7. You're not going to stop AI progress; I think it's too late for that. So the next question is: okay, AI progress is here, and it's actually going to keep accelerating. How do you make use of it? How do you leverage it? How do you ride the wave? That's the question to ask.
Interviewer 2 (0:31)
We're joined by Francois Chollet, founder of the ARC Prize, a global competition to solve the ARC-AGI benchmark. His latest project is Ndea, a lab exploring a new paradigm in frontier AI research. Francois is one of the best people in the world to help us understand the current AI moment and where all of this is going. Francois, thank you so much for joining us today, and congrats on the launch of ARC-AGI-3.
Francois Chollet (0:58)
Thanks so much for having me. I'm super excited to be here. Super exciting time to talk about AI.
Interviewer 3 (1:02)
So, Francois, tell us a little bit about Ndea. What exactly is it, and what are you trying to achieve?
Francois Chollet (1:08)
Right, so Ndea is this new AGI research lab, and we are trying some very different ideas. Our goal is basically to build a new branch of machine learning that will be much closer to optimal, unlike deep learning.
Interviewer 2 (1:24)
All of us right now are sort of taken by what's going on with code. I'm having sort of a viral moment right now where I got to 40,000 stars this morning on GStack. So it's like, oh, this is an open source project that is now one of the biggest ones, and I have more than 100 PRs from contributors to deal with. I guess you're, you know, one of the best people to talk to about this, because you're literally coming up with something that is a totally different pathway.
Francois Chollet (1:51)
That's right. That's right. So what we're doing at Ndea is program synthesis research. And when I talk about program synthesis, people often ask me, oh, so are you doing codegen? Are you building an alternative to coding agents? And that's actually not at all what we are doing. We are working at a much, much lower level than that. What we're actually doing is trying to build a new branch of machine learning, an alternative to deep learning itself, rather than coding agents. Coding agents are this very, very high-level, last-layer piece of the stack, and we're actually trying to rebuild the whole stack on top of different foundations. So we're building a new learning substrate that's very different from parametric learning, from deep learning.

If you go back to the problem of machine learning, you have some input data and some target data, and you're trying to find a function that will map the inputs to the targets and that will hopefully generalize to new inputs. If you're doing deep learning, you have this parametric curve that serves as your function, as your model, and you're trying to fit the parameters of the curve via gradient descent. And this is basically what we are doing, except we are replacing the parametric curve with a symbolic model that is meant to be as small as possible. It's the simplest possible model that explains the data, that models what's going on. And of course, if you're doing that, you cannot apply gradient descent anymore. So we are building something that we call symbolic descent, which is the symbolic-space equivalent of gradient descent. The idea is to build a new machine learning engine that gives you extremely concise symbolic models of the data you're feeding into it, and then we're going to make it scale.
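The contrast Chollet draws here can be sketched in a toy Python example: the same five data points are fit two ways, once by gradient descent on a parametric line, and once by a brute-force search over tiny symbolic programs that returns the first (shortest) exact fit. The function names and the enumeration strategy are illustrative assumptions for this sketch, not Ndea's actual symbolic descent method.

```python
import itertools

# Data generated by the hidden rule y = 2*x + 1.
xs = [0, 1, 2, 3, 4]
ys = [2 * x + 1 for x in xs]

def fit_parametric(xs, ys, lr=0.01, steps=5000):
    """Deep-learning-style route: fit y = a*x + b by gradient descent
    on mean squared error. Returns approximate, continuous parameters."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        da = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
        db = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
        a -= lr * da
        b -= lr * db
    return a, b

def fit_symbolic(xs, ys, max_const=5):
    """Toy symbolic route: enumerate programs of the form c1*x + c2 in
    order of description length (here |c1| + |c2|) and return the first
    one that reproduces the data exactly -- i.e. the shortest fit."""
    consts = range(-max_const, max_const + 1)
    candidates = sorted(itertools.product(consts, consts),
                        key=lambda p: abs(p[0]) + abs(p[1]))
    for c1, c2 in candidates:
        if all(c1 * x + c2 == y for x, y in zip(xs, ys)):
            return c1, c2  # exact, discrete, maximally concise model
    return None

a, b = fit_parametric(xs, ys)
c1, c2 = fit_symbolic(xs, ys)
print(round(a, 2), round(b, 2))  # close to 2.0 and 1.0
print(c1, c2)                    # exactly 2 and 1
```

The parametric route only ever gets approximately close to the generating rule, while the symbolic search recovers it exactly and as the smallest program in its space; real symbolic descent would of course need a far richer program space and a smarter search than this exhaustive enumeration.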
And so everything you're doing with machine learning today with parametric curves, we should be able to do with symbolic models in the future, in a way that is much, much closer to optimality. Closer to optimality in the sense that you're going to need much less data to obtain the models, and the models are going to run much more efficiently at inference time because they're going to be so small. And because they are so small, they will also generalize much better and compose much better. You know the minimum description length principle: the model of the data that is most likely to generalize is the shortest one. And I think you cannot find a model like this if you're doing parametric learning. You need to try symbolic learning.
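The minimum-description-length point can be made concrete with a small illustrative example (not from the interview): two models that fit the training data equally well, where only the shorter one generalizes to a held-out input.

```python
# Hidden rule generating the data: y = x^2.
train_x = [0, 1, 2]
train_y = [x * x for x in train_x]

# Short model: one general rule.
def short_model(x):
    return x * x

# Long model: a lookup table that memorizes the training set. It is a
# longer description of the same data and encodes nothing general.
table = dict(zip(train_x, train_y))
def long_model(x):
    return table.get(x, 0)

# Both fit the training data perfectly...
assert all(short_model(x) == y for x, y in zip(train_x, train_y))
assert all(long_model(x) == y for x, y in zip(train_x, train_y))

# ...but on a held-out input only the shorter model extrapolates.
print(short_model(5), long_model(5))  # 25 vs 0
```

The lookup table's description length grows with the dataset, while the rule's stays constant; picking the shortest adequate model is exactly the bias toward generalization that Chollet is appealing to.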
