Lex Fridman Podcast #490 Summary
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI
Guests: Sebastian Raschka & Nathan Lambert
Release Date: February 1, 2026
Episode Overview
In this episode, Lex Fridman is joined by respected AI researchers and educators Sebastian Raschka and Nathan Lambert to explore the "state of the art" in artificial intelligence as of early 2026. The conversation covers rapid advances in large language models (LLMs), the global competition (especially between the US and China), technical innovations (architectures, scaling laws, tool use), open vs. closed models, the future of coding, the economics and culture of AI development, and speculation on AGI timelines and the societal impact of AI.
The discussion keeps a technical edge but aims to be accessible. There are candid takes on hype, practical use, and the human side of working in research labs. They also reflect on the joy and challenge of learning, teaching, and building in the AI field.
Key Discussion Points & Insights
1. The "DeepSeek Moment" & Global AI Competition
[16:55–23:30]
- The "DeepSeek moment" refers to Jan 2025, when Chinese company DeepSeek released R1, an open weight model with near state-of-the-art performance at much lower compute cost, sparking rapid competition and innovation, especially in China.
- Sebastian: No one company will have exclusive technological control: researchers move and ideas spread fast. The real differentiators will be access to compute, budgets, and hardware at scale, not secret algorithms.
"I don't currently see a winner-takes-all scenario." – Sebastian [18:00]
- Nathan: The US has strong labs (OpenAI, Google, Anthropic), but China is catching up and open models from companies like DeepSeek, Z AI, Minimax, and Kimi are gaining global influence.
- Chinese open models often run locally, bypassing Western corporate or state security concerns.
2. Model Trends: Open vs. Closed, Differentiation, and Hype
[23:31–34:44]
- Nathan: Anthropic's Claude Opus 4.5 is hyped within AI circles, especially for code, but the mass market still uses OpenAI (ChatGPT) and Google (Gemini), reflecting brand momentum, not just pure capabilities.
- Customization, memory features, and the emergence of multiple concurrent subscriptions/use-cases are shaping user habits.
- Choice and "muscle memory" matter:
"You use it until it breaks, and then you explore other options." – Sebastian [33:23]
- While enthusiasts chase the latest, most people stick with familiar platforms unless pushed by failure.
3. Coding with LLMs: Human + AI Collaboration
[36:11–39:00]
- Tools like Cursor, Claude Code, Codex, and VS Code plugins have changed programming workflows.
- Lex: Uses Claude Code to “build the skill of programming with English,” emphasizing the shift from manual code editing to guiding higher-level design.
- Sebastian: Prefers tools that do not fully automate programming, wanting to maintain control and see the process.
- AI as a pair programmer is appreciated for reducing loneliness and improving both the debugging and the social sides of programming.
4. Open Models Landscape
[43:02–46:24]
- China leads in releasing large, open-weight models (DeepSeek, Kimi, MiniMax, Z AI, Qwen, etc.).
- US and Europe see increased efforts (OLMo from AI2, LLM360, Hugging Face’s SmolLM, Nvidia’s Nemotron).
- OpenAI’s first open model since GPT-2 (GPT-OSS) is seen as a significant move.
- Open weights are crucial for enterprise use, customization, and research; permissive Chinese licenses are attractive.
5. Technical Innovations in Architectures & Training
[54:41–62:38]
- Most SOTA (state of the art) models are still transformers, with core tweaks:
- Mixture of Experts (MoE) architectures (sparse routing for efficiency and specialization).
- Novel attention mechanisms: multi-head latent attention, group query attention, sliding window attention, etc.
- Focus on making long-context handling efficient (memory, cache).
- Despite hype, most progress is incremental:
“Fundamentally, it is still the same architecture.” – Sebastian [58:42]
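The MoE tweak mentioned above can be sketched in a few lines: a small router scores every expert per token, and only the top-k experts actually run. This is a toy illustration, not any lab's implementation; sizes like `n_experts` and `top_k` are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts, top_k, d = 8, 2, 16                 # hypothetical sizes
router_w = rng.normal(size=(d, n_experts))     # router projection
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # toy "experts"

def moe_layer(x):
    """Route a token vector x to its top-k experts and mix their outputs."""
    logits = x @ router_w                  # one score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over only the chosen experts
    # Only top_k of the n_experts matrices run per token: that sparsity is
    # what makes MoE cheaper than a dense layer of equal total capacity.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_layer(rng.normal(size=d))
print(out.shape)  # (16,)
```

The "specialization" claim follows from the same mechanism: because the router learns which experts fire for which tokens, different experts can drift toward different domains.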
6. Scaling Laws: Still Alive, But Changing
[62:38–78:45]
- Nathan: Scaling laws (predictable power-law improvements with more compute/data) still hold, but easy wins are mostly gone, especially for pretraining.
- New progress comes from post-training and inference-time scaling: making models smarter by letting them "think longer" per problem.
- Reinforcement Learning with Verifiable Rewards (RLVR) is central—letting models iteratively try and self-evaluate answers, driving big leaps in reasoning/coding/math performance.
- Economic reality: Serving costs now outweigh training costs at scale; decisions (hardware, model size) are influenced by this shift.
- Trade-off: Should money/compute go to bigger models, longer training, more inference, or richer post-training?
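The "predictable power-law" framing above can be made concrete with the parametric loss form popularized by the Chinchilla work, L(N, D) = E + A/N^α + B/D^β. The constants below are approximately the fits reported by Hoffmann et al. (2022) and are used purely as an illustration of the shape of the curve, not as authoritative values:

```python
def scaling_loss(n_params, n_tokens,
                 E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Parametric scaling law L(N, D) = E + A/N**alpha + B/D**beta.

    E is the irreducible loss floor; the two power-law terms shrink as
    parameters (N) and training tokens (D) grow. Constants are roughly
    the Chinchilla fits and should be treated as illustrative.
    """
    return E + A / n_params**alpha + B / n_tokens**beta

# Doubling data at fixed model size shaves off a predictable, shrinking
# sliver of loss: the "easy wins are mostly gone" point in miniature.
small = scaling_loss(7e9, 1e12)
big = scaling_loss(7e9, 2e12)
print(small > big)
```

Note how both terms flatten: each doubling buys less than the last, which is exactly why labs now weigh that marginal pretraining gain against spending the same compute on post-training or inference.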
7. Data, Quality, and Curation
[79:58–91:19]
- Pretraining now mixes massive, highly filtered data (PDFs, Reddit, books, code, web).
- Quality > Quantity: Labs target carefully curated datasets for specific skills (reasoning, math, code).
- Licensing, legality, and copyright lawsuits (e.g., Anthropic’s $1.5B settlement over book training data) are increasingly shaping dataset construction.
- There is concern about LLM-generated data polluting the web and code repositories; a human curation layer remains valuable, and expert-written summaries still outperform raw LLM output.
8. Post-Training: RLHF, RLVR, & Unlocking Reasoning
[111:43–141:40]
- “Post-training” now includes:
- RL with verifiable rewards (RLVR) – models learn by trial/error, iterative feedback, especially for math/code (“aha moments”).
- RLHF (reinforcement learning from human feedback) – still useful for formatting, personality, user preferences.
- Synthetic data, distillation, tool use.
- Scaling in post-training (with RLVR) rivals pretraining in importance—some labs use as much compute for post-training as for initial training.
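The RLVR loop described above boils down to: sample several candidate answers, score each with a programmatic verifier, and reinforce the ones that pass. A toy sketch of that signal-generating loop, where `sample_answers` is an invented stub standing in for a real model:

```python
import random

random.seed(0)

def verifier(question, answer):
    """Verifiable reward: 1.0 if the answer checks out programmatically, else 0.0."""
    return 1.0 if answer == eval(question) else 0.0

def sample_answers(question, k=4):
    """Stand-in for an LLM: returns k noisy candidate answers."""
    true = eval(question)
    return [true + random.choice([-1, 0, 0, 1]) for _ in range(k)]

def rlvr_step(question):
    """One RLVR-style step: keep only candidates the verifier rewards.

    A real trainer would turn these rewards into a policy-gradient update;
    this sketch only shows the try-and-check loop that produces the signal.
    """
    candidates = sample_answers(question)
    rewards = [verifier(question, a) for a in candidates]
    return [a for a, r in zip(candidates, rewards) if r > 0]

good = rlvr_step("2 + 3 * 4")
print(all(a == 14 for a in good))  # every surviving candidate is correct
```

The key property is that the reward is checkable without a human, which is why math and code (where an answer can be executed or compared exactly) were the first domains to benefit.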
9. Education, Learning, and Struggle
[132:43–147:24]
- Learning LLMs “from scratch” (as per Sebastian’s books) is powerful for understanding, even if you never use a small custom model in production.
- Struggle is essential—using LLMs to support, not shortcut, learning. Educational AIs should facilitate, not instantly solve, problems.
- Fine-tuning and research in niche areas (e.g., character, style) require less compute and remain accessible for individuals.
10. Career Advice & Lab Culture
[151:36–157:15]
- Trade-offs: academic research (publish freely, low resources, high personal credit); open labs (open models, mid-level resources, community); frontier companies (high pay, closed models, less credit, and a grueling "996" culture: 9am–9pm, six days a week).
- Burnout is real; leapfrogging competition drives a relentless pace.
- Bubbles (e.g., Silicon Valley) can be productive but risk missing global perspective.
11. New Model Architectures & Tool Use
[163:18–173:17]
- Text diffusion models: Potential alternative to transformers for fast, parallel text generation; currently not SOTA, but may power “free” or quick queries.
- Tool use: LLMs integrating search, code, calculators; reduces hallucination, but not a magic solution.
- Open models are catching up with proprietary labs in tool use capabilities.
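The tool-use pattern mentioned above amounts to a dispatch loop: the model emits a structured tool call, the host executes it, and the result is fed back for the final answer. A minimal sketch of the host side, where the tool registry and the `fake_model` stub are invented for illustration:

```python
import json

def calculator(expression):
    """A verifiable tool: exact arithmetic instead of model guesswork."""
    return eval(expression, {"__builtins__": {}})

TOOLS = {"calculator": calculator}

def fake_model(prompt):
    """Stand-in for an LLM that decides to call a tool for this prompt."""
    return json.dumps({"tool": "calculator",
                       "args": {"expression": "17 * 23"}})

def run_with_tools(prompt):
    """Host-side loop: parse the model's structured tool call and execute it."""
    call = json.loads(fake_model(prompt))
    result = TOOLS[call["tool"]](**call["args"])
    # In a real system the result is appended to the context and the model
    # writes a final answer; grounding arithmetic in a real calculator is
    # what reduces hallucination, though it is not a magic fix.
    return result

print(run_with_tools("What is 17 * 23?"))  # 391
```

The same loop generalizes to search and code execution: only the entries in the tool registry change, which is why open models can close the gap mainly by learning to emit well-formed tool calls.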
12. Open Source, Geopolitics, and Policy
[243:57–252:23]
- The US lags behind China in open-weight models; policy initiatives (e.g., the ATOM Project, the White House AI Action Plan) aim to change this.
- Open models are essential for research, talent development, and educational access.
- Banning open models is seen as infeasible and counter-productive.
- Competition with China drives both open and closed US labs.
13. AGI/ASI Timelines, Generalization, & Limits
[194:04–223:04]
- Definitions of AGI and ASI (artificial superintelligence) are hotly debated; remote worker replacement and fully-automated super-coder are often used yardsticks.
- Most likely future: “Jagged” progress—superhuman on some tasks, weak on others. Full automation of software engineering will come progressively, not all at once.
- It’s likely that the “one-model-for-everything” dream is plateauing:
“That dream is actually kind of dying...” – Nathan [220:00]
- The biggest, quietest impact: LLMs make human knowledge radically more accessible to the world, not just experts or English speakers.
14. Hardware & The Role of Key Figures
[254:43–262:35]
- Nvidia (Jensen Huang) dominates, thanks not just to chips but the CUDA ecosystem; competitors may arise, but momentum is strong.
- Hardware specialization—training and inference chips diverging.
- "Great Man" Theory: Singular leaders (Jensen, Steve Jobs, Sam Altman, Demis Hassabis, etc.) do accelerate progress, but broader dynamics would have led to AI advances eventually—just slower.
15. Looking Forward – Social Impact & Hope
[263:09–278:58]
- In 100 years: Compute and connectivity are likely to be remembered as the inflection points, not the details of LLMs or transformers.
- The future will bring specialized robots, but local community, human connection, and the value of in-person experience will persist.
- Flood of “AI slop” will increase the value of physical, human-made, or truly meaningful experiences.
- Open models, multi-agent systems, and social changes are expected; worries about deepfakes, trust, and information verification remain.
- AI tools are “just tools”—humans retain agency and direction. Cautious optimism about steering civilization through the disruptive phases of AI progress.
Notable Quotes & Memorable Moments
- On global AI competition: "The real differentiators will be budget and hardware constraints... the ideas will be everywhere." – Sebastian [17:26]
- On using LLMs until they fail: "You use it until it breaks, then you try something else... like your text editor or browser." – Sebastian [33:23]
- On current LLM architectures: "It's not really fundamentally that different. It's still the same architecture." – Sebastian [58:42]
- On coding with LLMs: "It's genuinely more fun to program with an LLM... like having a pair. It's less lonely." – Lex [108:23]
- On education: "If you're not struggling as part of this process, you're not fully following the proper process for learning." – Lex [145:15]
- On lab burnout: "...being in a culture that is super tight and having this competitive dynamic is going to make you work hard and create things that are better. That comes at the cost of human capital." – Nathan [157:15]
- On the quiet force of LLMs: "[LLMs] make all human knowledge accessible to the world... that's how we get to Mars, that's how we build these things. It's this quiet force that permeates everything." – Lex [223:04]
- On future risks: "All the capabilities we’ve been talking about can be used to destabilize humans or civilization, even with relatively dumb AI applied at scale." – Lex [274:56]
- On what gives hope: "Humans do tend to find a way. That’s what humans are built for, to have community and figure out problems. We have to go through that long period of hard, distraught AI discussions if we want to have the lasting benefits." – Nathan [275:31]
- Final words: "It is not that I'm so smart, but I stay with the questions much longer." – Albert Einstein, quoted by Lex [279:04]
Timestamp Guide to Key Segments
- DeepSeek & Global Competition: 16:55–23:30
- Model Trends & User Habits: 23:31–34:44
- LLMs for Coding: 36:11–39:00
- Open Models in 2026: 43:02–46:24
- Technical Innovations: 54:41–62:38
- Scaling Laws & Economics: 62:38–78:45
- Data Curation & Copyright: 79:58–91:19
- Post-Training: RLHF & RLVR: 111:43–141:40
- Education, Learning, Struggle: 132:43–147:24
- Lab Culture & '996': 151:36–157:15
- Alternative Architectures/Diffusion Models: 163:18–173:17
- AGI/ASI Timeline Discussion: 194:04–223:04
- Open Source & Policy: 243:57–252:23
- Hardware & Key Figures: 254:43–262:35
- Long-term Social Outlook & Hope: 263:09–278:58
Overall Tone & Takeaways
- Candid, pragmatic, with a balance of technical depth and big-picture reflection.
- Realism about hype cycles, the limits of current technology, and the persistent opportunities in open collaboration.
- Strong commitment to teaching, learning by struggle, and empowering people to find agency amid rapid change.
For Listeners Who Haven't Tuned In
This conversation gives a sweeping overview of the current AI landscape in 2026, demystifying much of the technical progress in LLMs, highlighting the shifting balance of global competition, and showing how both the mundane and profound aspects of the technology are set to reshape how we live, work, and learn. The human stories of learning, collaboration, burnout, and hope for agency thread through the discussion, reminding us—amidst the hype and headlines—that AI's history is both deeply technical and deeply human.
