Intelligent Machines (Audio) | TWiT
Episode 844: "Poob Has It For You - Spiky Superintelligence vs. Generality"
Date: November 6, 2025
Hosts: Leo Laporte, Paris Martineau, Jeff Jarvis
Guest: Jeremy Berman, Post-training Researcher at Reflection AI
Episode Overview
This wide-ranging and lively episode brings on Jeremy Berman of Reflection AI, one of the hottest new AI labs, to explore what’s happening at the AI frontier. The main thread is a deep discussion of the distinction between “spiky superintelligence” (elite, domain-specific AI) and true general intelligence, as Berman and the hosts dig into post-training, reinforcement learning, the challenge of generality, benchmarks like ARC-AGI, and the philosophies driving the next wave of intelligent machines. The conversation also touches on the open vs. closed model debate, existential risk, and the realities of building and deploying AI in business and society.
Main Themes
- What truly separates domain-specific AI (“spiky superintelligence”) from general intelligence?
- How reinforcement learning is changing the post-training landscape
- The realities of contemporary AI research: benchmarks, lunchroom debates, business needs
- Openness, risk, and the future: from open weight models to existential questions
Key Discussion Points & Insights
1. Pre-training vs. Post-training (04:00–08:50)
- Jeremy Berman kicks off with a clear distinction (a code sketch of the two objectives follows this section):
- Pre-training: “We can stuff the entire Internet into these deep neural networks... So you have a pre-trained model and it's just a document completer.”
- Post-training: Makes models helpful and aligned to human tasks. “ChatGPT is a post-trained model in that it knows what a user is... these are not things that come out of the box.”
- On the role of post-training:
- “From 2023 to now, post-training is making it useful... we can imbue a personality and this is all by showing it examples of what we want.” (Jeremy Berman, 05:33)
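To make the distinction concrete, here is a minimal, hypothetical PyTorch sketch (my illustration, not code from the episode or from Reflection AI). Both stages minimize the same next-token cross-entropy; what changes is the data (raw web text versus curated prompt/response pairs) and, in post-training, the masking that scores only the assistant’s reply.

```python
# Minimal sketch: pre-training vs. supervised post-training (SFT).
# The model is a toy stand-in; a real LM would be a transformer.
import torch
import torch.nn.functional as F

vocab_size, dim = 100, 32
model = torch.nn.Sequential(
    torch.nn.Embedding(vocab_size, dim),   # stand-in for a full transformer
    torch.nn.Linear(dim, vocab_size),
)

def next_token_loss(tokens, loss_mask=None):
    """Cross-entropy of predicting token t+1 from token t."""
    logits = model(tokens[:-1])             # predictions for positions 1..N
    targets = tokens[1:]                    # the tokens that actually follow
    loss = F.cross_entropy(logits, targets, reduction="none")
    if loss_mask is None:                   # pre-training: every token counts
        return loss.mean()
    loss = loss * loss_mask[1:]             # post-training: score the reply only
    return loss.sum() / loss_mask[1:].sum()

# Pre-training: "stuff the internet in" -- the model is a document completer.
web_text = torch.randint(0, vocab_size, (16,))
pretrain_loss = next_token_loss(web_text)

# Post-training (SFT): same objective, but curated dialogue data with the
# prompt masked out -- this is how the model learns "what a user is."
prompt_len = 6
dialogue = torch.randint(0, vocab_size, (16,))
mask = torch.cat([torch.zeros(prompt_len), torch.ones(16 - prompt_len)])
sft_loss = next_token_loss(dialogue, loss_mask=mask)
print(f"pretrain loss {pretrain_loss:.3f}, sft loss {sft_loss:.3f}")
```

The mask is the key design choice: the model is never rewarded for predicting the user’s words, only for completing them helpfully.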
2. The Evolution of Training Objectives (07:13–08:53)
- Blurry lines: Is reinforcement learning part of post-training? Berman says, “At Reflection we called it training... maybe you can think about it like, what is the objective?... In pre-training... and some of post-training it’s next token prediction. But then there’s... reinforcement learning or on-policy learning.”
- RL as the next-gen step (a toy version of this loop is sketched below): “...letting the model learn for itself...having the model try out different answers and then taking the best ones and then reinforcing on those and doing that in a loop.”
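Berman’s “try out different answers, take the best ones, reinforce, repeat” description maps loosely onto best-of-n sampling with rejection-sampling style fine-tuning. Here is a hypothetical toy version of that loop; `sample`, `reward`, and `finetune_on` are invented stand-ins, not Reflection AI’s actual training stack.

```python
# Toy "sample, select, reinforce, repeat" loop (rejection-sampling style RL).
# All components are hypothetical stand-ins for illustration only.
import random

def sample(model, prompt):
    """Draw one candidate answer from the current policy (toy stand-in)."""
    return f"{prompt} -> guess: {random.randint(0, 9)}"

def reward(prompt, answer):
    """Verifier: 1.0 if the answer is judged correct, else 0.0 (toy stand-in)."""
    return 1.0 if answer.endswith("7") else 0.0

def finetune_on(model, examples):
    """Reinforce: push up the probability of winning answers (no-op here)."""
    return model  # a real loop would take gradient steps on `examples`

model = object()  # placeholder for a language model
prompts = ["what digit am I thinking of?"]
for step in range(3):                                  # "...doing that in a loop"
    best = []
    for p in prompts:
        candidates = [sample(model, p) for _ in range(8)]  # try different answers
        top = max(candidates, key=lambda a: reward(p, a))  # take the best one
        if reward(p, top) > 0:
            best.append((p, top))
    model = finetune_on(model, best)                   # reinforce on those
```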
3. The Significance of ARC AGI Benchmark (10:53–14:45)
- What is ARC AGI?
“ARC AGI is like an IQ test for machines... you have grids, input and output grids, and there’s a common transformation rule... simple children’s puzzles. And the key is, can you, given an input, generate the output grid? Children can do it. The best LLMs in 2024? 4–6%.” (Jeremy Berman, 10:53)
- The challenge of true generalization:
“Language models at the time were not trained to think for themselves fundamentally... ARC AGI requires you to be able to think on your feet because you've never seen these challenges before. You can't hack this test.”
- On Reinforcement Learning & Advancements (a toy ARC-style task is sketched after this section):
“I was able to get top score...then two weeks after that, OpenAI's o1 model [with RL] smoked my score. This totally lit a light bulb...this is really the new paradigm.”
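For readers new to the benchmark format: each ARC-AGI task provides a few input/output grid pairs sharing a hidden transformation rule, and the solver must apply the inferred rule to a fresh input. The puzzle below is invented for illustration (it is not a real ARC task); its hidden rule is “mirror each row left to right.”

```python
# Toy task in the spirit of ARC-AGI (invented example, not from the benchmark).
# Grids are lists of rows of color indices.
train_pairs = [
    ([[1, 0], [2, 3]], [[0, 1], [3, 2]]),
    ([[5, 5, 0]],      [[0, 5, 5]]),
]
test_input = [[7, 8, 9]]

def candidate_rule(grid):
    """One hypothesis: reverse each row (mirror left-to-right)."""
    return [list(reversed(row)) for row in grid]

# A hypothesis must explain *every* training pair before we trust it on the
# test input -- this is what makes the test hard to hack with memorization.
assert all(candidate_rule(x) == y for x, y in train_pairs)
print(candidate_rule(test_input))   # -> [[9, 8, 7]]
```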
4. Reflection AI’s Vision & The Spiky vs. General Debate (14:48–20:05)
- Reflection AI's roots:
“The founders of Reflection were at DeepMind when DeepMind was really pushing forward the reinforcement learning paradigm... So at Reflection, we have this reinforcement learning paradigm which we know produces superhuman intelligence. So why don’t we combine them?” - Spiky superintelligence vs. generality:
“Are we building spiky superintelligences where we know that if you have a distribution of data, deep neural nets can learn that...but if you give them data that it hasn't seen before that's different, it will do less well. This is the spiky paradigm. The other paradigm is we figure out how to build generality, build the skill that builds other skills.” (Jeremy Berman, 18:00) - Berman’s bet:
“I have a theory for why I think that we are going to be able to inject generality into these models, but it's not certain.” (19:08)
5. On AI Generalization, Dead Zones, and the Limits of Data (21:43–23:55)
- “Dead zones” and the problem of brittle generalization (a toy illustration follows this section):
“When you ask a language model a task that it hasn’t been trained on, it’s like an adversarial attack...it hits weights that it has memorized in pre-training...And I think this is...why exactly that is happening.”
- The need for models that reason:
“I think the right answer is we need to build the right environments and build the right training paradigm such that the models internalize reasoning for all domains in a general way.” (21:43)
- Notable Quote:
“Reasoning with sufficient creativity is the engine of generality.”
— Jeremy Berman (20:05)
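Berman’s “adversarial attack” framing is, at bottom, about distribution shift. As a loose analogy (my illustration, not a claim about how LLM dead zones are actually measured), the sketch below fits a high-capacity model on a narrow training range and then queries it outside that range: accurate in distribution, wildly wrong out of it.

```python
# Toy analogy for "spiky" competence: great in distribution, brittle outside it.
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, 200)          # the training distribution
y_train = np.sin(3 * x_train)                  # the "true" function
coeffs = np.polyfit(x_train, y_train, deg=9)   # high-capacity fit

def model(x):
    return np.polyval(coeffs, x)

x_in, x_out = 0.3, 2.5                         # in-distribution vs. "dead zone"
print("in-dist error :", abs(model(x_in) - np.sin(3 * x_in)))
print("out-dist error:", abs(model(x_out) - np.sin(3 * x_out)))
```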
6. Lunchroom Debates & Theories at Reflection AI (21:50–23:55)
- On how innovation happens:
“They’ll tell me why I’m wrong, I’ll tell them why I think I’m right...A lot of paper sharing and saying ‘look, I told you this.’...I think fundamentally we need to reinforce its own circuits.” (22:01)
7. Open Weight Models: Business & Philosophy (28:33–29:58)
- Why open models?
“If I’m a government, I want to use my own models on my own hardware...I can’t use frontier language models...Chinese models are currently overwhelmingly at the top for open weight models...there are a lot of enterprises that really want to run their own models.”
- Philosophical angle:
“It’s really great to contribute to the community...If we’re able to build a great model, we can give it to researchers.”
8. AGI: Consistent Reasoning Across Domains (29:58–32:43)
- What’s still missing:
“Humans are able to extend reasoning from one discipline to another...AI isn’t good at that. They have dead reasoning zones.”
- On knowledge webs and reinforcement learning:
“Pre-training is stored like a knowledge web...But you are missing this causal understanding...What I think RL does is it slowly shifts knowledge from this web to more of a cohesive knowledge tree.”
9. Prospects, Optimism, and Existential Risk (33:10–40:43)
- Are we on track to AGI?
“We have a goal. At least we’re not sure if this is going to work, but we have a pretty good idea of what we want.”
- Risks:
“I think it’s reasonable to assume that [AGI systems] will be around as dangerous as nuclear weapons...But I don’t think it is more than that.”
- Will Reflection AI’s models be publicly available?
“No, they’re internal, but...it’s a goal.”
- Memorable quote:
“If we are able to do what I’m describing...everyone on the planet has an Einstein in their pocket.” (39:43)
10. Scale vs. Smarter Training: The Next Paradigm (35:54–38:24)
- Jeff Jarvis asks:
“Are you LLM side still or got to do something new? Are you scale side still or something else?”
- Berman: open to all avenues:
“I am more confident than not that language models will get us there...I think generally models that are bigger are better as long as they're trained appropriately...But I don't think that's necessary.”
- Skepticism of “LLMs are dead” takes:
“This is the problem I have with Yann LeCun and others...in general with the speculation...We don’t know, right?” (36:35)
Notable Quotes & Moments
On generality vs. spikiness:
“Are we building spiky superintelligences...or do we figure out how to build generality, how to build the skill that builds other skills?... That is what I’m most interested in.”
— Jeremy Berman (18:00)
On reinforcement learning’s importance:
"We have cracked the code for how to teach...the general into distribution. And that is through reinforcement learning."
– Jeremy Berman (14:15)
On real-world risk:
“It’s reasonable to assume that they will be around as dangerous as nuclear weapons...But I don’t think it is more than that.”
— Jeremy Berman (39:34)
On the value of open models:
“It’s really great to be able to contribute to the community...If we’re able to build a great model, we can give it to researchers.”
— Jeremy Berman (29:58)
On the nature of deep learning:
“Fundamentally neural networks learn the distribution they are trained on...But I actually don’t think it’s very hard to fit all of reasoning into that body.”
— Jeremy Berman, steelmanning the ‘skeptical’ case (37:19)
Timestamps for Key Segments
- Post-training explained: 03:27–08:53
- ARC AGI benchmark & generalization crisis: 10:53–14:45
- Reflection AI’s founding theory, reinforcement learning lineage: 14:48–16:23
- Spiky superintelligence vs. generality and Berman’s theory: 18:00–20:05
- “Dead zones” and limits of current models: 21:43–23:55
- Open weight models—business & research case: 28:33–29:58
- Consistent reasoning across domains (AGI goal): 29:58–32:43
- Risk, optimism, and societal impacts: 33:01–40:43
- Scale vs. new paradigms, LLMs vs. multimodal models: 35:54–38:24
Flow & Tone
Throughout, the discussion is deeply technical but remains engaging and accessible, with all speakers frequently inserting quips, contesting each other's theories, and maintaining the playful, skeptical spirit that defines TWiT podcasts. Berman comes across as open, thoughtful, and candid about the immense uncertainty in AI research—and the personal, philosophical, and societal stakes involved.
Additional Memorable Moments
- Lunchroom at Reflection AI: “Lunches are very fun. We always have lunch together...It's a lot of paper sharing...I would say that’s a window into our lunch conversation.” (22:01)
- Vibe coding vs. programmatic coding:
Leo Laporte: “So you’re vibe coding your solution, in effect.” (27:02)
Jeremy Berman: “Yes, that's a good way to think about it.”
- On the experience of working on AGI:
“You're in the middle of the most exciting thing to happen, I think, in human life.”
— Leo Laporte (40:41)
- Paris on existential threat worries:
“I think there are a million other things to worry about before we [get to AGI]...this technology seems uniquely predisposed to make things worse.” (56:17)
Useful for…
- Anyone seeking a state-of-the-art understanding of how LLMs are trained, fine-tuned, and potentially made more general
- Listeners curious about the business, regulatory, and philosophical stakes in open source AI
- Those looking for an accessible, real-world view of what working at a top AI lab entails—including internal debates
- Anyone interested in whether AI risk fears (“if AGI, we all die!”) are fueled primarily by evidence or ideology
Summary Table
| Section | Timestamp | Speaker(s) | Key Point/Quote |
|---------|-----------|------------|-----------------|
| Pre-training vs. Post | 03:27–08:53 | Jeremy Berman, all | “Pre-training is stuffing the internet... Post-training is making it useful.” |
| ARC AGI explained | 10:53–14:45 | Jeremy Berman | “Children can get 85–100%...best LLMs were getting 4–6%. You can’t hack this test.” |
| Reinforcement learning | 14:48–16:23 | Jeremy Berman | “We brought RL to language models—let them truly think for themselves.” |
| Spiky vs. general debate | 18:00–20:05 | Jeff Jarvis, Jeremy Berman | “Are we building spiky superintelligences...or true generality?” |
| Theories & dead zones | 21:43–23:55 | Jeremy Berman, Leo Laporte | “When you ask a language model a task it hasn’t been trained on, it’s like an adversarial attack.” |
| Open weight models | 28:33–29:58 | Jeremy Berman | “We’ve found this gap in the market for open weight models. Philosophically, it’s inspiring.” |
| Generality & risk | 33:01–40:43 | All | “If we build general intelligence, everyone on the planet has basically an Einstein in their pocket.” |
Conclusion
This episode is a must-listen for anyone invested in the future of AI, offering some of the sharpest lines on what makes current models impressive yet limited and where the next breakthrough might emerge. Berman provides optimism grounded in hands-on research, while the hosts press on both the technical and ethical stakes of a world soon to be “inhabited” by intelligent machines.