Last Week in AI – Episode #235 Summary
Date: March 3, 2026
Hosts: Andrey Kurenkov, Jeremie Harris
Main Theme: A packed week of major AI model releases, advances in AI benchmarking and optimization, fierce hardware and geopolitical competition, and a dramatic standoff between Anthropic and the Pentagon.
Episode Overview
The hosts recap two weeks’ worth of breakneck AI news: multiple high-impact LLM updates, advances in agentic AI tooling, major hardware deals and challenges, ongoing interpretability research, and—most notably—Anthropic’s escalation with the U.S. Department of War over military AI usage. The episode is a whirlwind tour of technical updates and power struggles, delivered with the podcast’s blend of technical rigor, speculation, and dry humor.
1. Listener Updates and Podcast Housekeeping
- Hosts regret missing a previous week; Andrey mentions Astrocade recently raised a Series B and is hiring.
- Podcast reviews highlight appreciation for the show’s blend of technical depth and directness around political topics.
- Jeremie jokes, “If anybody wants to see me squirm, this is going to be the week.” (05:55)
2. Tool & App Highlights
Anthropic’s Sonnet 4.6
- Anthropic follows up the Opus 4.6 release with Sonnet 4.6: nominally a “0.1 version bump,” but seen as a major real-world jump, now with a 1-million-token context window.
- The rapid pace of model iteration is attributed to “retraining, reinforcement learning,” and likely to leveraging internal Claude Code usage data.
Quote:
“Anthropic is on fire.” – Jeremie, (08:44)
- Sonnet’s benchmark: ~60.4% on ARC-AGI-2, close to the top models in its weight class, though surpassed by Gemini 3 Deep Think and GPT-5.2.
ARC-AGI Benchmarks Primer
- Designed to measure LLMs’ human-level generalization; emphasizes problems that require out-of-distribution reasoning (e.g., “Okay, now do Connect Five”).
Quote:
"A couple different ways to kind of win... given limited compute and limited data to then get a score that is very strong." – Andrej, (08:44)
Google Gemini 3.1 Pro
- Google rolls out Gemini 3.1 Pro with 77.1% on ARC-AGI-2 (up from 31.1%).
- Noteworthy for its “multimodal” capabilities and lower API pricing ($2 per million input tokens vs. Claude's $5).
- Observations on model pricing: as models get closer in quality, Anthropic’s premium pricing may become unsustainable.
Grok 4.20 Public Beta (xAI)
- Unclear whether this is a new model or just a new inference strategy: four agent “personas” debate before issuing a consensus answer.
- Elon Musk hypes “order of magnitude” gains (regarded skeptically by hosts).
- Model pivots: Grok previously marketed as “uncensored,” now targeting “real world, concrete capabilities” (e.g., medicine, engineering).
- Use in U.S. classified systems confirmed this week (see policy below).
3. Agentic Tools & AI Agents
Anthropic’s “Remote Control” for Claude Code
- Lets a mobile device access a persistent Claude Code session on the user’s machine.
- Clearer security boundaries than OpenAI’s earlier agentic approaches: pull-based rather than push-based, so local files needn’t be transferred off-device.
Perplexity “Computer”
- New AI agent coordinator that orchestrates sub-agents for long-running tasks (e.g., marketing campaigns or app building).
- Perplexity pivots toward multi-agent infrastructure (competing with OpenRouter) to differentiate beyond search.
4. Business & Hardware Shake-ups
Meta & AMD: $100 Billion Chip Deal
- Meta commits up to $100B for AMD chips (part of a $600B data-center expansion), pushing the frontier of AI infrastructure.
- Equity/warrant structure to align incentives; AMD’s stock must triple for full vesting.
- Meta’s stated aim: “personal superintelligence,” but has lagged in releasing flagship models recently.
MatX: Nvidia Challenger Raises $500 Million
- Building chips specialized for Transformer-based LLMs, targeting “10x” Nvidia’s throughput (shipping in 2027).
- A bet on the “hardware lottery”: the chips pay off only if Transformers remain dominant.
World Labs: $1B Raise for World Models
- Startup aiming to commercialize “world models” for 3D simulation and agent training, with applications in robotics and autonomy.
Simile: $100M for Simulating and Predicting Human Behavior
- Spinoff from the Stanford “AI village” work: agents that simulate realistic human behavior, a critical next step for AI social simulations and consumer modeling.
OpenAI Stargate Data Centers: Delays
- Friction among OpenAI, Oracle, and SoftBank over the $50B+ Stargate centers; disagreements over control are slowing progress.
- OpenAI reportedly at risk of “running out of cash by mid-2027.”
Chinese Chip Capacity Ambitions
- China aims to quintuple 7nm and 5nm chip production by 2027, but faces yield and tooling limitations due to export controls.
5. Research & Technical Advances
Adaptive Optimizers: Surprising Masking Effectiveness
- A Google paper shows that randomly skipping weight updates, with the masking kept aligned to momentum in adaptive optimizers (e.g., Adam), yields substantial gains (up to 19% lower perplexity for billion-parameter LLMs); a hedged sketch follows the quote below.
Quote:
“These kinds of ideas that seem so basic, we’re still discovering them… there’s a lot of low-hanging fruit.” – Jeremie, (46:26)
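A minimal PyTorch sketch of the idea as described on the show. The masking rule, the choice to freeze the first moment on skipped coordinates, and all hyperparameters are illustrative assumptions, not the paper’s exact algorithm:

```python
import torch

def masked_adam_step(p, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
                     eps=1e-8, keep_prob=0.9):
    # Randomly choose which coordinates actually take a step this round.
    keep = torch.rand_like(p) < keep_prob
    # "Aligned" masking (assumed): the first moment advances only on
    # coordinates that step, so momentum tracks the updates actually applied.
    m.copy_(torch.where(keep, b1 * m + (1 - b1) * g, m))
    v.mul_(b2).add_((1 - b2) * g * g)      # second moment as in vanilla Adam
    m_hat = m / (1 - b1 ** t)              # standard bias correction
    v_hat = v / (1 - b2 ** t)
    p.sub_(keep * lr * m_hat / (v_hat.sqrt() + eps))
```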
Measuring LLM “Deep Thinking”
- New metric: “deep-thinking tokens” are those whose internal representations change most in the late layers of LLMs.
- Higher fractions of these tokens correlate with better output accuracy, suggesting good models keep deliberating into the late layers rather than converging early.
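A small sketch of how such a metric could be computed, assuming per-layer hidden states like those a Hugging Face model returns with `output_hidden_states=True`; the layer window, cosine-distance scoring, and threshold are guesses at the paper’s setup:

```python
import torch
import torch.nn.functional as F

def deep_thinking_fraction(hidden_states, late_window=4, thresh=0.3):
    """Score each token by how much its representation keeps changing
    across the last `late_window` layers; return the fraction of tokens
    above `thresh` (the "deep-thinking" tokens)."""
    late = torch.stack(hidden_states[-late_window:])   # [W, seq_len, d_model]
    # Cosine distance between consecutive late layers, per token position.
    drift = 1 - F.cosine_similarity(late[:-1], late[1:], dim=-1)  # [W-1, seq]
    per_token = drift.mean(dim=0)          # average late-layer drift per token
    return (per_token > thresh).float().mean().item()
```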
Attractor States in LLM Dialogue
- Analysis of LLMs “talking to themselves” finds they converge to highly model-specific attractor states: Claude becomes existentially silent, GPT degenerates into code, Grok spews memes, Gemini becomes grandiose, etc.; the self-dialogue setup is sketched after the quote below.
- Raises questions about LLM “personalities” and failure modes in long-term autonomous agents.
Quote:
“Grok is unhinged and a meme lover; Claude is more philosophical and thoughtful.” – Andrey, (62:38)
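The self-dialogue setup is easy to reproduce. A hedged sketch, where `chat` is a hypothetical stand-in for any chat-completion call and the seed prompt and turn count are arbitrary:

```python
def self_dialogue(chat, turns=20, seed="Hello! What's on your mind?"):
    """Let one model talk to itself: roles are flipped each turn so the
    previous speaker's messages read as 'user' turns to the next speaker."""
    msgs = [{"role": "user", "content": seed}]
    transcript = [seed]
    for _ in range(turns):
        reply = chat(msgs)                    # current speaker responds
        transcript.append(reply)
        # Flip perspective: every prior message changes sides, and the new
        # reply arrives as the other speaker's 'user' turn.
        msgs = [{"role": "assistant" if m["role"] == "user" else "user",
                 "content": m["content"]} for m in msgs]
        msgs.append({"role": "user", "content": reply})
    return transcript  # inspect the tail for attractor-like behavior
```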
Mechanistic Interpretability: Counting, Manifolds, and Text Wrapping
- An Anthropic paper shows Claude 3.5 Haiku encodes counting as a one-dimensional manifold inside a six-dimensional subspace, offering insight into how models internally represent seemingly simple tasks (a sketch of the general probing approach follows).
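A common way to probe for this kind of low-dimensional structure is a linear dimensionality check on the relevant activations. This is a hedged sketch of the general approach, not Anthropic’s exact procedure; `acts` would be residual-stream activations collected at the “current count” position for counts 1..N:

```python
import numpy as np

def effective_dims(acts, var_thresh=0.99):
    """How many principal directions explain `var_thresh` of the variance
    in an [N, d_model] activation matrix? A ~6-D answer, with the points
    tracing a curve, would match the episode's description."""
    centered = acts - acts.mean(axis=0)
    _, s, _ = np.linalg.svd(centered, full_matrices=False)
    var_ratio = s**2 / (s**2).sum()
    n_dims = int(np.searchsorted(np.cumsum(var_ratio), var_thresh)) + 1
    return n_dims, var_ratio[:8]   # subspace size, leading variance ratios
```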
Bridging Model/Human Task Completion Times
- A new method infers human-equivalent completion times for AI tasks using “item response theory,” scaling time-horizon benchmarks beyond expensive, human-baselined METR-style tasks (sketched below).
- Issues remain with variance and untested extrapolation to long-horizon tasks, but the density of human-calibrated benchmarks may improve.
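The 2PL item-response model itself is standard psychometrics; the calibration step below is a hedged reconstruction of the bridging method as described on the show:

```python
import numpy as np

def p_solve(theta, a, b):
    """Two-parameter logistic (2PL) IRT: probability that an agent of
    ability `theta` solves an item with discrimination `a`, difficulty `b`."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Assumed bridge: fit item difficulties b_j from model pass/fail data, then
# regress log human completion time on b_j over the human-timed subset,
#     log_time_j ~ w0 + w1 * b_j,
# and use that fit to assign human-equivalent times to never-timed tasks.
```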
Safety Backstops: NESSE Benchmark
- Simple sanity-check benchmark: if your model fails on basic, easy-to-instruct safety tasks, something’s wrong.
“Least Understood Driver” of AI Progress (Epoch AI)
- Synthesis post argues that most “algorithms & training breakthroughs” for LLMs are really just better data curation and scaling laws—software & data progress remain deeply opaque.
Persona Selection in LLMs
- Anthropic posits that LLMs have no persistent “self” but dynamically condition into “personas” as prompted, explaining both generalized behaviors and misalignment via character inference.
6. Policy & Geopolitics: Anthropic vs. The Pentagon
Anthropic’s Pentagon Showdown
- Anthropic refuses Pentagon (Department of War) request to drop restrictions on model use for autonomous weapons/surveillance.
- DoW threatens “supply chain risk” designation (potentially blacklisting Anthropic, as happened to Huawei); also floats invoking the Defense Production Act to conscript Anthropic as a contractor.
- Amodei’s statement:
Quote:
“These threats do not change our position. We cannot in good conscience accede to their request.” – Anthropic Statement, (89:58)
- Context: Anthropic is the first major AI lab to supply large-scale LLMs to the military; DoW wants “any lawful use.”
- U.S. strategic dilemma: balancing AI innovation, commercial independence, and national security imperatives.
- At the same time, Elon Musk and xAI/Grok agree to license their models to the Pentagon for “any lawful use,” filling the gap.
Distillation Attacks: China & AI Security
- Anthropic uncovers large-scale attempts (16 million exchanges via 24,000 accounts) by Chinese companies (DeepSeek, Moonshot, MiniMax) to “distill” Claude via automated scraping, highlighting dual-use knowledge-transfer risks; a sketch of black-box distillation follows the quote below.
- Jeremie:
Quote:
“Distillation works... It gives you crazy leverage, asymmetrical leverage if you’re compute constrained.” – Jeremie, (98:13)
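Because the attackers see only Claude’s text outputs, this is black-box distillation: fine-tuning a student model on scraped prompt/response pairs. A minimal sketch assuming a Hugging Face-style causal LM interface (all names illustrative):

```python
def distill_step(student, batch, optimizer):
    """One supervised fine-tuning step on scraped teacher data.
    `batch["input_ids"]` holds prompt + teacher reply; `batch["labels"]`
    masks prompt tokens with -100 so loss falls only on the reply."""
    out = student(input_ids=batch["input_ids"], labels=batch["labels"])
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()
```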
OpenAI’s “Malicious Use” Monthly Report
- OpenAI catalogs cases of model abuse (malware, organized crime, authoritarian censorship assistance) and touts increasing detection/prevention efforts.
7. Notable & Memorable Moments
- Jeremie: “Anthropic is on fire.” (08:44)
- Jeremie (on U.S.-China AI rivalry): “As long as our labs are penetrated... we’re dragging our adversaries along with us.” (100:48)
- Existential Claude sample:
“Stillness enough. Letting the conversation rest. We’re both explaining why we’re not responding while responding. Stopping now.” (59:54)
- Jeremie, on the Anthropic-DoW standoff: “This could not be more important.” (94:51)
8. Key Timestamps
- 05:55 – Listener reviews, political commentary
- 06:00–17:44 – Major model updates: Sonnet 4.6, Gemini 3.1 Pro, Grok 4.20
- 22:00–26:28 – Claude Code Remote Control & Perplexity Computer
- 26:29–39:32 – Meta/AMD, MatX, World Labs, Simile, OpenAI Stargate, Chinese chip ambitions
- 43:21–68:40 – Deep dives: Adaptive optimizers, “Deep thinking” tokens, LLM attractor states, interpretability
- 68:40–74:49 – Benchmarking/model-human task bridge, evaluation woes
- 87:52–100:48 – Policy & safety: Pentagon vs. Anthropic, DoW moves, distillation and export controls, OpenAI’s AI abuse report
9. Overall Tone
Technical, wry, and occasionally irreverent (banter about “420” and “6.9” Grok versions). The hosts strike a balance between wonkish technical detail, strategic business/policy analysis, and a sense of mounting stakes as models and institutions race ahead.
10. Recommended For
Anyone wanting a comprehensive, critical, and accessible summary of February/March 2026’s most important AI happenings—especially those tracking the intersection of leading-edge technical advances and global power maneuvering in AI.
End of summary.
