Intelligent Machines 844: “Poob Has It For You”
TWiT.tv • November 6, 2025
Host: Leo Laporte
Co-hosts: Paris Martineau, Jeff Jarvis
Special Guest: Jeremy Berman (Post Training Researcher, Reflection AI)
Overview
This episode centers on the state-of-the-art in artificial intelligence—especially the ongoing quest for artificial general intelligence (AGI)—with expert insight from Jeremy Berman of Reflection AI. The panel explores the anatomy of large language model training, the significance of the ARC AGI benchmark, the evolving landscape of open-weight frontier models, and the philosophical as well as practical implications of "spiky" vs. general AI. With typical TWiT charm and humor, the hosts and guest dig deep into technical and societal questions, highlight the competitive landscape, and reflect on both risks and massive opportunities ahead.
Key Discussion Points & Insights
1. Understanding Pre-training vs. Post-training in AI
- [02:51] Jeremy Berman lays out the differences:
- Pre-training: Consists of training neural networks on vast Internet datasets, teaching them to predict the next word in any sample of data—effectively compressing the “intelligence” of the internet. Pretrained models excel at document completion but aren’t useful as assistants out-of-the-box.
- Post-training: The phase where models are made useful for human tasks, fine-tuned for behavior like helpfulness or coding, and given personalities by training on curated datasets.
- Reinforcement Learning (RL): Considered part of post-training, this has the model generate its own outputs and learn from them, iteratively improving its performance and reasoning abilities.
Quote:
"Pre-training is the process of stuffing basically the Internet into a deep neural network ... But the problem is these models are not useful ... Post-training is the process of making that useful for humans and tasks."
— Jeremy Berman [03:06]
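Berman's pre-training/post-training split can be caricatured in a few lines of Python. The sketch below is entirely our illustration (nothing from the show): it uses bigram counts as a toy stand-in for a neural network, where "pre-training" learns next-token statistics from a corpus and "post-training" biases the same model toward curated demonstrations.

```python
from collections import Counter, defaultdict

# "Pre-training" as next-token prediction, reduced to bigram counts.
corpus = "the cat sat on the mat the cat ate".split()
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the most likely next token seen during 'pre-training'."""
    return bigrams[word].most_common(1)[0][0]

# The pretrained predictor completes documents but isn't an assistant:
assert predict_next("the") == "cat"

# "Post-training": bias the same model toward preferred behavior by
# up-weighting curated demonstration data (a stand-in for fine-tuning).
demonstrations = [("the", "mat"), ("the", "mat"), ("the", "mat")]
for prev, nxt in demonstrations:
    bigrams[prev][nxt] += 1
assert predict_next("the") == "mat"
```

The point of the toy: both phases update the same model, but the data source changes, from raw Internet-scale text to curated examples of the behavior you actually want.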
2. Frontiers and Benchmarks: Reflection AI and the ARC AGI Test
- [09:53] Berman recounts his recent achievement: top scores on the ARC AGI test, an “IQ test for machines” designed by François Chollet. While children easily ace these pattern-recognition puzzles, LLMs historically performed poorly (4–5%).
- Reinforcement learning dramatically changed the landscape, as OpenAI’s o1 model “smoked” Berman's high score using these techniques.
- Reflection AI’s Approach: The company leverages expertise from DeepMind’s reinforcement learning (RL) (e.g., AlphaGo) and applies RL techniques to language models for more human-like, generalizable reasoning.
Quote:
"What changed the game completely is ... let’s teach these models to do it from scratch ... This is the power of letting the models think for themselves."
— Jeremy Berman [14:18]
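For readers unfamiliar with ARC, each task supplies a few input/output grid pairs and asks the solver to infer the underlying transformation and apply it to a new input. The hypothetical sketch below brute-forces a tiny library of candidate rules against the training pairs; real ARC tasks, and Berman's actual solutions, are far richer than this.

```python
# Toy ARC-style task (illustrative only, not an actual ARC puzzle).
def flip_h(grid):      # candidate rule: mirror each row
    return [row[::-1] for row in grid]

def transpose(grid):   # candidate rule: swap rows and columns
    return [list(col) for col in zip(*grid)]

CANDIDATE_RULES = [flip_h, transpose]

def solve(train_pairs, test_input):
    """Pick the first candidate rule consistent with every training pair."""
    for rule in CANDIDATE_RULES:
        if all(rule(x) == y for x, y in train_pairs):
            return rule(test_input)
    return None  # no candidate explains the examples

# One training pair is enough here to identify the horizontal flip:
train = [([[1, 0], [0, 2]], [[0, 1], [2, 0]])]
assert solve(train, [[3, 0], [0, 0]]) == [[0, 3], [0, 0]]
```

What makes real ARC hard is that the rule space is open-ended: no fixed candidate library covers it, which is why pattern-matching LLMs scored in the single digits before reasoning-trained models arrived.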
3. The “Spiky” AI Debate: Specialization vs. Generalization
- [17:30] The panel examines whether the march towards AGI is distracted by the proven excellence of narrow, “spiky” superintelligences (e.g., AlphaGo), or whether the real challenge/goal lies in building systems with generalized, transferable reasoning.
- Berman believes it’s likely we’ll achieve “spiky superintelligence” soon, but is optimistic about integrating true generality—reasoning that transfers across domains, covering "dead reasoning zones."
Quote:
"There’s one world where okay, let’s just build a dataset for literally everything ... but the right answer is we need to build the right environments and the right training paradigm such that the models internalize reasoning for all domains in a general way."
— Jeremy Berman [20:54]
4. Inside Reflection AI: Open-Weight Frontier Models
- [28:03] Reflection AI aims to deliver open-weight models at the frontier, unlike incumbents (OpenAI, Anthropic) whose models are closed-access. This serves enterprise needs (e.g., privacy for governments, regulatory compliance) and promotes research accessibility.
- Chinese open-weight models currently lead the market, but Reflection sees a gap in non-Chinese, cutting-edge, open models.
5. Training Paradigms: Reinforcement, Distillation, and Research Culture
- [21:31] Berman describes lively lunchroom debates at Reflection AI, referencing the recent “on-policy distillation” paper, a teacher-student technique in which a student model learns more efficiently by having a teacher score the student's own sampled outputs.
- Important internal disagreements revolve around whether models must learn from their own traces or whether expert demonstrations suffice.
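The "own traces" distinction has a concrete form. In on-policy distillation, the loss is estimated on tokens the student itself samples, giving a Monte-Carlo estimate of the reverse KL divergence from student to teacher. The sketch below is our illustrative sketch of that idea (toy categorical distributions, not the paper's implementation):

```python
import math
import random

random.seed(0)

student = {"a": 0.7, "b": 0.2, "c": 0.1}   # student's token distribution
teacher = {"a": 0.5, "b": 0.4, "c": 0.1}   # teacher's token distribution

def sample(dist):
    return random.choices(list(dist), weights=list(dist.values()))[0]

def on_policy_distill_loss(n=10_000):
    """Monte-Carlo estimate of reverse KL(student || teacher),
    computed on tokens drawn from the student's OWN policy."""
    total = 0.0
    for _ in range(n):
        tok = sample(student)              # learn from the model's own traces
        total += math.log(student[tok]) - math.log(teacher[tok])
    return total / n

# The estimate should match the exact reverse KL (~0.097 here):
exact = sum(p * math.log(p / teacher[t]) for t, p in student.items())
assert abs(on_policy_distill_loss() - exact) < 0.02
```

Sampling from the student (rather than replaying expert demonstrations) means the loss concentrates exactly where the student actually goes, which is the efficiency argument the lunchroom debates turn on.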
6. Current Limitations & Promising Directions
- [29:28] AGI remains elusive because AIs continue to struggle with consistent reasoning across domains ("dead reasoning zones").
- Current models store knowledge as a web, lacking the "coherent causal relationship" that reinforcement learning can impart.
- Berman speculates that future breakthroughs will require new training paradigms—perhaps with broader, more creative post-training phases.
7. The Scale Debate and AGI Outlook
- [35:21] The panel discusses whether continued scaling and LLM architectures will bring AGI, or whether a paradigm shift is needed (as argued by Yann LeCun and others).
- Berman maintains that scaling, especially focused on post-training and reasoning, is promising. He’s confident we’ll solve spiky superintelligence and cautiously optimistic about general reasoning “within the next 10 years.”
8. Risks, Openness, and Social Implications
- [39:05] AGI could be as dangerous as nuclear weapons if universally accessible. The potential for misuse is real—“everyone on the planet has basically an Einstein in their pocket” ([39:13]).
- Models from Reflection AI are not yet public, but the company is committed to openness.
Notable Quotes & Moments
- On the leap to AGI:
“We will know we've achieved AGI when we can't create tasks that are easy for humans but hard for AI.”
— Leo Laporte [29:28]
- On the “AI bubble” and cultural moments:
“I look down. I hold up a bubble. The bubble's filling up the screen. The words ‘AI’ are in it, in a bubble. ... Suddenly Freddy Krueger's claws come and pop the bubble.”
— Paris Martineau (reacting to an on-air video gag) [73:50]
- On spiky superintelligence vs. true generality:
“I am very confident ... we will build spiky superintelligence, ... I think it’s more likely than not that in the next 10 years we have new ideas that will lead to true general reasoning.”
— Jeremy Berman [33:29]
- On the “lunchroom culture” at Reflection:
“People at Reflection are very smart ... I’ll toss out a few theories. They’ll tell me why I'm wrong. I'll tell them why they're wrong ... It’s paper-sharing and saying, ‘See, told you this is OK...’”
— Jeremy Berman [21:31]
- On open-weight models for the community:
“It’s really great to be able to contribute to the community ... If we’re able to build a great model, we can give it to researchers ... for scientific discovery.”
— Jeremy Berman [28:03]
- On AGI safety (and jokes about doomsaying):
“You’re not with what’s-his-name ... Yudkowsky says if we get there, we’re all dead?”
— Leo Laporte [38:56]
“It’s reasonable to assume they’ll be about as dangerous as nuclear weapons ... but not more than that.”
— Jeremy Berman [39:11]
Timestamps for Important Segments
- [02:51] — Pre-training vs. Post-training, RL explained
- [09:53] — Berman’s path to AI; ARC AGI test and breakthroughs
- [14:18] — AlphaGo’s reinforcement learning as paradigm inspiration
- [17:30] — Spiky vs. general AI and field-wide debates
- [21:31] — Reflection’s research culture, lunchroom debates, on-policy distillation
- [28:03] — Open-weight models: technical, business, and philosophical rationale
- [29:28] — Consistent reasoning, “dead zones,” challenges ahead
- [33:29] — AGI outlook: spiky superintelligence, forecast for general reasoning
- [39:05] — Societal risks of AGI and comparison to nuclear arms
- [73:50] — Podcast in-joke: “AI bubble” video bit with Paris
- [81:23] — Paris' Halloween as Log Lady, whimsical tangents (charm of TWiT)
- [129:03] — Data centers in space, practical AI applications
- [135:03] — Favorite AI tools (Claude, ChatGPT, Perplexity), research best practices
Conclusion
This episode is a goldmine for anyone following AGI progress, LLM architectures, or the interplay between open research and real-world applications. Jeremy Berman’s grounded, lucid explanations offer a roadmap to where AI is today and where it could go—demystifying "post-training," the importance of RL, and the choppy waters between specialized and general intelligence. The conversation is peppered with insight about the industry, caution about risks, and a genuine sense of curiosity and excitement about the possibilities ahead.
Additional Tidbits
- Paris is heading to the American Museum of Tort Law—one of many quirky cultural asides.
- The open-source, researcher-friendly ethos remains strong at Reflection AI, paralleling broader calls for transparency in AI systems.
- The co-hosts digress into tangents on AI-generated ads, copyright debates, and even vintage pizza—showcasing the podcast’s signature blend of depth, warmth, and humor.
Further Listening/Reading
- Jeremy Berman’s Substack (for technical breakdowns and ARC AGI solutions)
- ARC AGI benchmark (François Chollet)
- Open-weight models and the shifting enterprise AI landscape
Closing
"Thank you, Jeremy, for helping us... You're right in the middle of the most exciting thing to happen, I think, in human life."
— Leo Laporte [40:10]
For a detailed, accessible dive into the most important issues surrounding AGI, this episode of Intelligent Machines delivers expertise, clarity, and plenty of memorable moments.