Podcast Summary: Last Week in AI – Episode #215
Episode Date: July 8, 2025
Hosts: Andrej (“A”) & Jeremy (“B”, co-founder of Gladstone AI, a generative AI startup)
Theme: “Runway games, Meta Superintelligence, ERNIE 4.5, Adaptive Tree Search”
Description: A wide-ranging, fast-paced weekly roundup of the most notable events, developments, research, and drama in AI—focusing especially on tools, corporate maneuvers, cutting-edge research, safety, regulation, and the growing impact of AI on society.
Overview
This episode dives into a week rich in AI news, with no single dominating story but an abundance of notable advances and controversies. Key themes include:
- The shifting competitive landscape among major tech companies (especially Meta’s “Superintelligence” push and high-profile hires)
- Open-source progress from China’s tech giants
- New research pushing the boundaries of model reasoning, scaling, and agentic capabilities
- AI policy battles in the US and Europe
- Economic, labor market, and cybersecurity impacts of ever-more capable models
Key Stories and Insights
1. Tools, Apps & Business Updates
[00:11–03:54] Cloudflare Moves Against AI Scrapers
- Cloudflare now blocks AI scrapers by default; sites must explicitly opt in to allow AI bot data collection.
- Significance: As Cloudflare handles ~20% of global web traffic, this is a precedent-setting move in controlling data access for model training.
- Quote [03:54, Jeremy]:
“The ultimate question is… can you meaningfully distinguish between human and bot traffic? Maybe not for long, especially with economic incentives to scrape. But it’s still precedent-setting.”
[05:01–09:32] Runway’s Foray into AI-Generated Gaming
- Runway (known for AI video editors) is introducing tools for AI-generated interactive games, seemingly in the style of enhanced “AI Dungeon.”
- Runway/Meta acquisition talks recently ended; Runway chooses to remain independent for now.
- Quote [06:55, Jeremy]:
“Gaming companies are adopting this stuff much faster than Hollywood—less union baggage, more agility. Indie gamers will pick it up really fast.”
[11:24–15:10] Google Releases Gemini AI Tools for Education
- 30+ new AI-powered features tailored for schools and teachers, including generation of lesson plans, quizzes, and “Gems” (custom AI experts).
- Google also updated device management tools for classroom Chromebooks.
- Quote [13:24, Jeremy]:
“When do we move from ‘AI tools for educators’ to ‘AI tools as educators’? Professors are kind of competing against a suite of products that’s increasingly optimized to do a better job than they can.”
[15:10–20:22] The Rise of AI Notetakers in Meetings
- Growing trend of meetings being attended (and summarized) by more AI notetakers than humans, especially in large companies.
- Discussion touches on meeting culture differences between startups and corporates.
- Quote [17:42, Jeremy]:
“Obviously, meetings are the enemy by default in startups… I wonder what the failure modes are when most everybody in these meetings is just an AI agent.”
[18:05–21:21] Google’s AI Apps: Doppel & Imagen 4
- Doppl: New Google app lets users virtually try on outfits via AI.
- Imagen 4 (and 4 Ultra): New text-to-image capabilities, focus on prompt adherence and spatial detail, but little public excitement.
- Quote [20:22, Jeremy]:
“The incremental advantage of these models over each other feels pretty opaque to me… Surely we must be saturating?”
2. Big Tech Moves & AI Talent Competition
[21:21–27:40] Meta Goes All-In on Superintelligence
- Meta launches a new “Superintelligence Labs” division, led by ex-OpenAI, Scale AI, and GitHub talent, with huge compensation packages (rumored $100–$300M+).
- Sam Altman reportedly likened poaching to a “break-in”; Meta’s stock hits all-time high.
- Tensions between new hires and Yann LeCun’s open-source, skeptical-of-LLMs philosophy.
- Quote [24:09, Andrej]:
“Within OpenAI, Sam Altman sent a memo saying Meta’s been pretty aggressively recruiting senior researchers…it was cast as: ‘someone has broken into our home.’”
- Quote [24:59, Jeremy]:
“What a repudiation of Yann LeCun’s philosophy…Zuck says, ‘we’re doing superintelligence, we’re calling it that, and we’re hiring the OpenAI guys.’…Meta had to refound the AI part of the company.”
[27:40–34:57] Anthropic Loses Key Talent to Cursor (Anysphere)
- Cursor, a top AI coding tool, poaches two leaders of Anthropic’s Claude Code team. Cursor is used by top developers and offers the flexibility to leverage the best available LLMs.
- Cursor’s ARR is now >$500M; Anthropic is hitting $4B in annualized revenue while burning several billion dollars a year.
- Quote [36:28, Jeremy]:
“Our expectation is that we’ll never hire another developer with less than 10 years of experience. Again. That’s pretty amazing.”
[35:10–38:04] Anthropic Launches Economic Futures Program
- Anthropic launches a research program to study the labor-market and economic effects of AI—timely, given predictions that AI could eliminate up to 50% of entry-level white-collar jobs within 1–5 years.
- Grants, symposia, and partnerships to focus on empirical evidence and strategy.
3. Hardware & Infrastructure
[38:04–41:09] OpenAI and Chips: No Google TPUs, Own Chip in the Works
- OpenAI declines to use Google’s TPUs; building its own chip with Broadcom, hitting “tape out” milestone this year—a major hardware independence move.
- Quote [39:05, Andrej]:
“OpenAI’s trying to shake itself loose from Microsoft more and more… At the same time, Google is just starting to push out in the direction of third-party partnerships for TPUs.”
[41:09–46:58] Data Centers, Power, and Supply Chain Bottlenecks
- Emerald AI (Nvidia-backed) working to optimize data center power loads—potential to unlock up to 100GW supply.
- TSMC Arizona chips are flown to Taiwan for packaging, illustrating ongoing U.S. dependence on Taiwan for crucial semiconductor steps.
- Quote [45:00, Jeremy]:
“Everybody’s talking about packaging as if it’s solved. But if you look under the hood, there are reasons why it could take longer…You can’t make chips.”
4. Open Source & Research Breakthroughs
[46:58–56:03] China’s Open-Source LLM Surge: Baidu’s ERNIE 4.5 & Tencent’s Hunyuan A13B
- Baidu releases ERNIE 4.5: a family of models under Apache 2.0; the top model has 424B parameters and bests DeepSeek V3 on many benchmarks.
- Notable for greater open-source detail and tooling.
- Tencent releases Hunyuan A13B: an MoE model with only 13B active parameters, state-of-the-art on some reasoning and agentic benchmarks.
- Introduces “dual mode chain of thought” for fast vs. slow reasoning.
- Quote [53:25, Jeremy]:
“Yet another model build in the Chinese ecosystem that mirrors the DeepSeek training approach…It’s another impressive player.”
[56:03–61:13] Other Notable Open Source & Research Releases
- DeepSWE from Together AI: an RL-trained open code agent with strong software-engineering results—incremental but solid progress.
- GLM-4.1V-Thinking: from Tsinghua & Zhipu AI, a 9B-parameter VLM advancing multimodal (image, video, text) reasoning.
- Apple & HKU: Masked Diffusion LLMs for code generation—experiments with diffusion architectures for LLMs, still unusual for text.
[66:20–74:06] Advances in LLM Reasoning & Evaluation
- Adaptive Tree Search for Reasoning (Lightning Deep-Dive): New dynamic approach to breadth/depth tradeoffs in LLM inference, uses Thompson sampling, enables ensemble reasoning with multiple LLMs (“meta-models”).
- NanoGPT Speedrun Benchmark: evaluates AI agents’ ability to reproduce stepwise scientific optimization (e.g., training-time reductions), testing the generalizability and automation of research.
- METR’s Agentic Time Horizon Tracking: latest results show Claude Opus 4 can now reliably complete 80-minute tasks, up from 65 minutes for Claude Sonnet 4—evidence of steadily increasing agentic coherence.
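The adaptive tree search idea above—dynamically deciding whether to widen the search (more candidate branches) or deepen it (extend a promising branch), guided by Thompson sampling—can be sketched in miniature. This is a toy Beta-Bernoulli bandit over two hypothetical actions, not the paper's actual algorithm; the action names, reward probabilities, and trial counts are all illustrative assumptions:

```python
import random

class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over a set of arms
    (here: hypothetical search actions 'widen' vs 'deepen')."""

    def __init__(self, arms):
        # Beta(1, 1) prior = uniform belief over each arm's success rate
        self.stats = {arm: [1, 1] for arm in arms}  # [alpha, beta]

    def choose(self):
        # Sample a plausible success rate for each arm; pick the highest
        samples = {arm: random.betavariate(a, b)
                   for arm, (a, b) in self.stats.items()}
        return max(samples, key=samples.get)

    def update(self, arm, success):
        # Successes bump alpha, failures bump beta
        self.stats[arm][0 if success else 1] += 1

random.seed(0)
sampler = ThompsonSampler(["widen", "deepen"])
# Toy environment: deepening pays off 70% of the time, widening 40%
true_p = {"widen": 0.4, "deepen": 0.7}
for _ in range(500):
    arm = sampler.choose()
    sampler.update(arm, random.random() < true_p[arm])

# After training, the sampler should strongly favor the better action
picks = [sampler.choose() for _ in range(100)]
print("deepen" if picks.count("deepen") > picks.count("widen") else "widen")
```

The same machinery extends naturally to ensembling: treat each candidate LLM as an arm and let the posterior allocate inference budget to whichever model is succeeding most often.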
[81:15–88:26] Research: LLM Capabilities, Transfer, and Error Analysis
- Encoder-decoder vs. Decoder-Only for system prediction: Encoder-decoder preferable for structured, non-language tasks.
- Math Reasoning Transferability: RL fine-tuning transfers reasoning better than supervised learning, but can also cause negative transfer if not done carefully.
- Error Correlation Among LLMs: Empirical study across 349 LLMs—models’ mistakes are highly correlated; implications for ensembling and risk management.
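The error-correlation finding can be illustrated with a minimal version of the underlying computation: a Pearson correlation between per-question correctness vectors of two models. The vectors below are toy data, not the study's; only the method (correlating 0/1 correctness across shared items) matches the description:

```python
import math

def error_correlation(a, b):
    """Pearson correlation between two 0/1 correctness vectors."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    sa = math.sqrt(sum((x - ma) ** 2 for x in a) / n)
    sb = math.sqrt(sum((y - mb) ** 2 for y in b) / n)
    return cov / (sa * sb)

# Toy correctness vectors (1 = answered correctly) for hypothetical models
model_a = [1, 1, 0, 0, 1, 0, 1, 1]
model_b = [1, 1, 0, 0, 1, 0, 0, 1]  # misses mostly the same items as A
model_c = [0, 1, 1, 0, 0, 1, 1, 0]  # misses different items

print(round(error_correlation(model_a, model_b), 2))  # high: shared failure modes
print(round(error_correlation(model_a, model_c), 2))  # low: complementary errors
```

High correlation (as the study found across 349 models) means ensembling buys less than independent errors would suggest, and shared blind spots become a systemic risk.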
5. Policy & Safety
[89:04–95:20] Biosecurity Risk Forecasting with LLMs
- New expert-forecasting study suggests LLMs materially increase the risk of man-made epidemics, though the risk can be mitigated with safety measures (the hosts remain skeptical that mitigation would be complete).
- Highest risk estimates came from the most accurate subject-matter experts.
[95:20–101:38] Offensive Cybersecurity: AI’s Task Length Horizons
- Blog post adapts METR’s task-horizon methodology to cyber “capture the flag” and hacking tasks; current LLMs solve ~6-minute tasks at a 50% success rate, improving steadily with a four-to-six-month doubling time in task-length horizon.
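A doubling time on task length implies plain exponential growth, so the claim is easy to extrapolate. The sketch below assumes a 6-minute starting horizon and a 5-month doubling time (the midpoint of the stated range) and assumes the trend continues cleanly, which is a strong assumption:

```python
def horizon_after(months, start_minutes=6.0, doubling_months=5.0):
    """Extrapolated 50%-success task-length horizon, assuming
    uninterrupted exponential growth (a strong assumption)."""
    return start_minutes * 2 ** (months / doubling_months)

for months in (0, 12, 24, 36):
    print(f"{months:>2} months: ~{horizon_after(months):.0f} min")
```

Under these assumptions the horizon passes roughly half an hour within a year and several hours within three—which is why the hosts treat the trend line, not the current 6-minute figure, as the story.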
[103:45–112:49] The US “One Big Beautiful Bill” AI Preemption Attempt
- Major US lobbying push led by Meta, A16Z, and OpenAI tried (unsuccessfully) to block most state-level AI regulation for 10 years in the federal budget bill; provision ultimately removed 99–1 in Senate.
- Quote [103:45, Jeremy]:
“Rather than having states regulate this, we should regulate at the federal level...which sounds great until you realize the federal government has been gridlocked...so by saying, ‘let’s preempt any state regulation for 10 years’—when OpenAI predicts superintelligence could hit in three—it seems insane.”
- Highlights the importance of preserving optionality and the bipartisan nature of AI regulation debates.
[112:49–113:46] Denmark Will Give People Legal Copyright Over Their Face/Voice (Deepfakes)
- Major law proposed to grant people copyright rights over their own likeness and voice to combat deepfakes—potentially a model for other countries.
Memorable Quotes
- [03:54, Jeremy]: “The end of the CAPTCHA era, in every possible sense of the term.”
- [06:55, Jeremy]: “Gaming companies are moving to adopt this much faster than Hollywood…less union baggage, more agility.”
- [13:24, Jeremy]: “When do we move from ‘AI tools for educators’ to ‘AI tools as educators’?”
- [24:09, Andrej]: “Within OpenAI, Sam Altman sent a memo saying Meta’s been aggressively recruiting…‘someone has broken into our home.’”
- [24:59, Jeremy]: “What a repudiation of Yann LeCun’s philosophy…Meta had to refound the AI part of the company.”
- [36:28, Jeremy]: “Our expectation is that we will never hire another developer with less than 10 years of experience. Again. That's pretty amazing.”
- [45:00, Jeremy]: “If you onshore a lot of fab for the logic dies but can't onshore packaging, you still have to ship chips back to Taiwan…. You can’t make chips.”
- [103:45, Jeremy]: “It just seems so... [bold] Let’s enshrine this in law for ten years, as superintelligence may come and go. That’s some balls, dude.”
Important Timestamps Index
- 00:11: Introduction & Episode Overview
- 03:54: Cloudflare’s Anti-AI Scraper Setting
- 05:01: Runway Move into AI-generated Games
- 11:24: Google Gemini AI Tools for Education
- 15:10: AI Notetakers Overtaking Meetings
- 18:05: Google Doppl (AI Outfit Try-On), Imagen 4 Model
- 21:21: Meta's Superintelligence Lab—Massive Hiring Wave
- 27:40: Anthropic Talent Loss—Cursor, Claude Code, Coding Tools
- 35:10: Anthropic’s Economic Futures Program
- 38:04: OpenAI Chips, Google TPUs, Hardware Moves
- 41:09: Emerald AI, Data Center Power, TSMC Packaging Bottleneck
- 46:58: Baidu’s ERNIE 4.5 Announcement
- 50:32: Tencent’s Hunyuan A13B—MoE Reasoning Model
- 56:03: Together AI’s DeepSwe, Other Open Models
- 66:20: Adaptive Tree Search for Reasoning
- 74:06: NanoGPT Speedrun Evaluation Benchmark
- 78:01: METR Agentic Time Horizons Update
- 81:15: System Performance Prediction Model (Encoder-decoder)
- 87:27: Error Correlation in LLMs
- 89:04: Biosecurity LLM Risk Forecast
- 95:20: Cybersecurity Task Horizon Benchmarks
- 103:45: US AI Legislation, Federal Preemption Battle
- 112:49: Denmark’s Copyright Over Face/Voice Law
Tone, Style & Closing Remarks
The hosts deliver with their trademark blend of in-depth technical insight, skepticism, dry humor, occasional swearing, and accessible analogies. Listeners new to the field or outside “AI Twitter” will find the episode fast-paced but rich in context and analysis, covering corporate intrigue, research, real-world AI impact, and regulatory chess moves with a critical but often irreverent voice.
[114:16, Jeremy: on Denmark’s deepfake law]
“How much can you modify a face until it's not your face anymore? Where is AI an adornment versus a fundamental change of appearance? AI keeps fuzzing the boundaries around everything…”
Closing Freestyle Theme Break [115:04]:
“Last week in AI, come and take a ride / Hit the lowdown on tech and let it slide / From the labs to the streets, AI’s reaching high / Algorithm shaping up the future—tune in, get the latest with ease.”
For the full stories, tech deep-dives, and further reading, check out links in the episode description.
