Y Combinator Startup Podcast

Episode: How François Chollet Is Building A New Path To AGI

Date: March 27, 2026
Guest: François Chollet
Main Theme:
A deep dive into alternative approaches to Artificial General Intelligence (AGI), focusing on François Chollet’s NDIA lab, the ARC Prize and benchmark, and the broader landscape of AI research beyond large language models (LLMs).

1. Overview

In this episode, the Y Combinator team interviews François Chollet, founder of the NDIA AGI research lab and creator of the ARC Prize, which challenges teams globally to solve the ARC AGI benchmark. Chollet discusses why he believes current deep learning approaches are not the ultimate path to AGI, details NDIA's radically different machine learning paradigm, and explains the motivations and evolution behind the ARC AGI benchmark (now in its third version). The discussion is both philosophical and highly technical, providing perspective on the future of intelligence, the inefficiencies of current methods, and why exploring alternative AI architectures is vital.

2. Key Discussion Points & Insights

AGI Trajectory & Inevitability

AGI by 2030?
Chollet predicts that AGI could arrive around 2030 or in the early 2030s, roughly coinciding with the release of ARC AGI version 6 or 7.

"I think we're probably looking at AGI 2030, early 2030s most likely. So around the time that we're going to be releasing maybe arc 6 or arc 7." (00:00, 45:50)
The Wave of Progress
It’s too late to "stop" AI—adoption and acceleration are inevitable. The right question is how to ride and benefit from the wave.

"You're not going to stop AI progress. I think it's too late for that. And so the next question is...how do you leverage? How do you ride the wave? That's the question to ask." (00:00, 55:44)

NDIA’s New Paradigm: Symbolic Program Synthesis vs. Deep Learning

NDIA’s Goal
NDIA aims to build a fundamentally new branch of machine learning: optimal symbolic learning, not neural net-based parametric learning.

"We are trying to build this new branch of machine learning that will be much closer to optimal. Unlike deep learning." (01:08)
Program Synthesis Explained
Instead of fitting huge parameterized models via gradient descent, NDIA is building new learning engines that generate concise symbolic models (short programs) to explain data. The stated aim is to cut the dependence on raw scale and very large datasets.

"We're building a new learning substrate...replacing the parametric curve with a symbolic model that is meant to be as small as possible...you need to try symbolic learning." (01:51)
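To make "a symbolic model that is meant to be as small as possible" concrete, here is a minimal sketch of enumerative program synthesis. It is a textbook-style toy, not NDIA's method: the four-primitive DSL, the function names, and the example data are all assumptions made for illustration. The search tries programs in order of increasing length and returns the first one that reproduces every example, so the result is a shortest consistent program within the DSL.

```python
from itertools import product

# Hypothetical toy DSL (an assumption, not from the episode):
# each primitive maps an int to an int.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "neg": lambda x: -x,
}


def run(program, x):
    """Apply the named primitives to x, left to right."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x


def synthesize(examples, max_len=4):
    """Return a shortest program consistent with all (input, output) pairs, or None."""
    for length in range(max_len + 1):  # shorter programs are tried first
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None


# Three observations of an unknown rule; here the rule is f(x) = (x + 1) ** 2.
examples = [(1, 4), (2, 9), (3, 16)]
program = synthesize(examples)
print(program)           # ('inc', 'square')
print(run(program, 10))  # 121
```

Three examples are enough to pin down a two-step program that also holds for unseen inputs, which is the sample-efficiency argument in miniature. The cost is that exhaustive enumeration grows exponentially with program length, so any practical system would need a far better search procedure than this one.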
Symbolic Descent
NDIA’s symbolic descent is a "symbolic space" analog of gradient descent, meant to yield highly efficient, generalizable models.

"We're building something that we call symbolic descent, which is like the symbolic space equivalent of gradient descent." (01:51)
Efficiency and Optimality
Chollet expects symbolic models to need less data, run more efficiently at inference time, be more interpretable, and generalize better because of their simplicity.

"Much closer to optimality in the sense that you're going to need much less data...the models are going to run much more efficiently...they will also generalize much better." (01:51)

Why Not Just Scale Up LLMs?

Dangers of Monoculture in Research
It’s counterproductive for everyone to work on the same LLM stack. Chollet doubts LLMs will be the long-term foundation for AI/AGI.

"I personally don't think that machine learning or AI in 50 years is still going to be built on this stack...I think it's inevitable that the world of AI will trend over time towards optimality." (04:39)
On Defining Intelligence and AGI
Widely used definitions (automating most valuable work) miss the essence of intelligence. In Chollet’s view, general intelligence is measured by how efficiently a system learns new tasks, with human learning efficiency as the reference point.

"AGI is...a system that can approach any new problem...and become competent at it with the same degree of efficiency as a human could." (09:52)
LLMs Lack Sample Efficiency
Current LLMs are not sample-efficient like humans—even if they achieve similar task automation, they do it through massive data and compute.

"LLMs aren’t fundamentally sample-efficient. I do believe, however, [building AGI on LLMs] would be the wrong thing to do because it would be very inefficient." (11:41)

ARC AGI: Benchmarking Intelligence Progress

ARC v1 & v2: Evolution of Challenges

ARC AGI started as a reasoning and program synthesis benchmark highlighting deep learning’s limitations. LLMs performed poorly on v1 and v2 until reasoning/agentic models and verifiable reward harnesses were introduced.

"Base LLMs were scoring extremely low on v1, like sub 10%...scaling up pre training alone was not going to crack the benchmark." (17:55)
Agentic reasoning and fine-tuning on verifiable signals (as in code or math) allow models to automate narrower domains efficiently, but this does not equate to fluid intelligence.

"It's not that the models are smarter, it's that they're suddenly more useful...the models don’t have higher fluid intelligence per se, it’s just that they’re way better trained." (22:36, 23:00)

ARC v3: Testing Agentic Intelligence

ARC v3 moves from static, passive pattern matching to interactive, agent-in-environment tests (think ‘mini games’).

"V3...is completely different. We are trying to measure agentic intelligence. So it's interactive, it's active. Like, the data is not provided to you. You must go get it." (26:08)
To succeed, an AI agent must explore, discover goals, build a model of its environment, plan, and act efficiently—just like humans do when playing a new game with no instructions.

"Your agent is dropped into a new environment, which is kind of like a mini video game...and it must figure out everything on its own via trial and error." (26:08)
ARC v3 focuses on agentic intelligence, exploration efficiency, and true fluid reasoning. The benchmark is much harder to brute-force with massive compute or overfitting, because agents are scored on action efficiency.

"If you try to brute force mine the space of every possible game state...you would score extremely low. Even if you solve the level...you're scored on efficiency. You must match human level efficiency." (31:20)
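The effect of efficiency-based scoring can be illustrated with a small sketch. The formula below is a hypothetical stand-in, not the official ARC v3 metric, which the episode does not spell out: it assumes a per-level score of zero for an unsolved level and otherwise the ratio of a human baseline action count to the agent's action count, capped at 1. The function names and the numbers are invented for illustration.

```python
def level_score(solved, agent_actions, human_actions):
    """Hypothetical per-level score: 0 if unsolved, else human/agent action ratio capped at 1."""
    if not solved or agent_actions <= 0:
        return 0.0
    return min(1.0, human_actions / agent_actions)


def benchmark_score(results):
    """Mean per-level score over (solved, agent_actions, human_actions) tuples."""
    if not results:
        return 0.0
    return sum(level_score(*r) for r in results) / len(results)


# Invented numbers: the human baseline is 40 actions on every level.
results = [
    (True, 40, 40),    # agent matches the human baseline
    (True, 4000, 40),  # agent solves the level by near-exhaustive search
    (False, 500, 40),  # agent never solves the level
]
for r in results:
    print(r, level_score(*r))
print(benchmark_score(results))
```

Under this assumed scoring, the agent that needs 100 times more actions than a human earns almost nothing for a level it technically solved, which matches the point in the quote that solving a level by brute force still scores extremely low.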
Development required building a full video game studio and creating hundreds of unique, knowledge-neutral environments to avoid test leakage.

"We set up an entire video game studio...We hired a team of game developers and we built our own game engine." (29:37, 30:20)

The Future of Benchmarks: ARC 4, 5, ... AGI On the Horizon

ARC AGI will continue to evolve as a "moving target," tracking new frontiers in fluid, continual, and inventive intelligence until no measurable gap remains between human and AI learning efficiency—that will be the true AGI moment.

"The point...is not to say, here's this test. If you pass it, this is AGI. Instead...we're targeting the residual gap of fair capabilities...Eventually there will be no measurable difference between human capabilities and frontier AI...this is the AGI moment." (44:04)

Philosophical and Practical Takeaways

Conciseness is Ultimate Intelligence
The shortest, most elegant models generalize best: science as symbolic compression, not curve fitting.

"Science is fundamentally a symbolic compression process...you're compressing that down to a very simple symbolic rule...we are building science incarnate: science, the scientific method in algorithmic form." (39:40)
Learning from the Past
Chollet argues that revisiting and scaling up older ideas (genetic algorithms, 1980s research, etc.) could yield breakthroughs, and that the current research focus is too narrow.

"Earlier in the history of the AI research timeline, people were exploring more things and very different things. I think genetic algorithms are actually a very good example of that." (48:41)
Remove Humans from the Improvement Loop
Successful future systems scale with compute, not human engineering; improvement must be automated and self-compounding wherever possible.

"You want to be in a setup where the system can improve its capabilities with no human in the loop, with no human involved." (50:22)
Advice to New Builders
A deep focus on usability, onboarding, and community building is key to long-lasting open-source impact. Hire your power users!

"There was this big focus on making the API simple and intuitive...focus on usability...But you have to put a lot of investment into community building...Hire your power users." (53:12)
Actionable Mindset for the Next Generation
AI is not to be feared but leveraged. Those with expertise—especially in programming—will be able to ride the wave and create new opportunities.

"The more expertise you have...the better you're able to use and leverage these tools for your own benefit...AI progress is actually empowerment." (55:44)

3. Notable Quotes & Memorable Moments

On Foresight and Long-Shot Bets in Research:

"If you have a big idea and it has a very low chance of success, but if it works, it's going to be big, and no one else is working on it... then you should try." (06:09)
On Recapturing Symbolic Insight:

"Science is not about curve fitting. Science is about finding the equation, finding the most compressive symbolic model of your pile of observation." (39:40)
On What AGI Will Eventually Be:

"When you create AGI retrospectively, it will turn out that it’s a code base that’s less than 10,000 lines of code and that if you had known about it back in the 1980s, you could have done AGI back then..." (36:30)
On the Human Element:

"The fact that you need humans to engineer these harnesses is also a sign that we're short of AGI today. Because if we had AGI, AGI would just make its own harness." (25:07)

4. Key Timestamps for Important Segments

| Timestamp | Segment/Topic |
|-----------|--------------|
| 00:00 | François Chollet’s AGI timeline and inevitability of progress |
| 01:08 | NDIA’s symbolic program synthesis: moving beyond deep learning |
| 04:39 | Why not just keep building on the LLM stack? |
| 09:52 | Defining AGI: Efficiency in skill acquisition |
| 17:55 | How ARC benchmarks revealed LLM and reasoning limitations |
| 26:08 | ARC AGI v3: Measuring agentic intelligence with interactive games |
| 31:20 | Why brute-forcing ARC v3 won’t work: efficiency as the bar |
| 39:40 | Science as symbolic compression; NDIA’s core insight |
| 44:04 | The future of ARC AGI: a moving, expanding benchmark |
| 48:41 | Value in revisiting overlooked AI techniques from previous eras |
| 50:22 | Automating improvement: removing the human bottleneck |
| 53:12 | Chollet’s lessons on open-source projects and community |
| 55:44 | Advice to the next generation: treating AI as empowerment |

5. Listener Takeaways

AGI is likely arriving within the next decade, but not along the current mainstream LLM trajectory.
Alternative paradigms—especially those focused on symbolic, concise, and agentic intelligence—are vital.
The ARC AGI benchmarks are positioned as a barometer for fundamental advances, not just incremental improvements.
The most important attribute for future AI systems is the ability to learn new tasks efficiently and autonomously, like a human.
For builders: challenge orthodoxy, focus on elegance and reuse, and cultivate a thriving user community.
For all: AI can and should be a tool for individual and collective empowerment, not just a cause for anxiety.

Closing Note:
François Chollet’s approach is a reminder that radical progress in AI might come from off the beaten path—from those willing to reimagine the entire stack and rethink what "intelligence" really means.
