Last Week in AI - Episode #237 Summary (March 16, 2026)
Episode Overview
This episode offers deep-dive coverage of the latest in AI news, tools, business developments, safety, policy, and cutting-edge research. The discussion is led by hosts Andrey Kurenkov (Astrocade) and Jeremie Harris (Gladstone AI), with a special emphasis this week on research papers due to a quieter week in model announcements and business news. The tone is conversational, wry, and accessibly technical, offering both industry insight and nuanced debate.
Key Topics & Discussion Points
1. Tools and Apps (05:16–15:51)
Perplexity’s Personal Computer (05:16)
- Perplexity is developing "Personal Computer," a local AI agent for Mac, positioned as a more secure version of OpenClaw.
- Not yet publicly available; early access waitlist is open.
- Security is central: “Having OpenClaw run in ways that are verifiably secure... makes a really good case for a subscription service.” (Co-host 2, 06:30)
Claude Code Review (07:33)
- Anthropic adds automated PR code review for GitHub, focusing on security and logical correctness.
- Cost per PR review estimated at $15–25.
- “Anthropic is kind of like closing that loop in a very interesting way, all under the same roof.” (Co-host 2, 08:48)
Cursor Automations (10:45)
- Agents can now be triggered by code changes, Slack messages, or timers.
- Shifts the developer role: “Instead of just initiating, what you’re doing is... being called in at the right points in the conveyor belt.” (Co-host 2, 12:37)
- Represents transition from human orchestration to “AI as infrastructure.”
Interactive Visuals in ChatGPT and Claude (13:50)
- ChatGPT adds interactive science/math visuals—70 preset topics.
- Claude now generates dynamic charts/visuals in-line, broadening Anthropic’s B2C appeal.
2. Projects & Open Source (17:05–22:35)
Nvidia Nemotron 3 Super Model Release (17:05)
- 120B-parameter hybrid mixture-of-experts (MoE) Transformer with a 1M-token context window (real-world limits may differ).
- Trained natively at 4-bit quantization for Blackwell GPUs, a form of “hardware lock-in” to Nvidia.
- “They really dominate the open source market and want to get their hooks in with... models that are most performative on Nvidia machines.” (Co-host 2, 21:39)
- Release includes weights, aims at efficient local deployment with reasonable performance, optimized for Nvidia hardware.
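To make the “4-bit” claim concrete, here is a toy sketch of symmetric int4 weight quantization in numpy. This is illustrative only; Nvidia's native 4-bit training recipe is far more involved and hardware-specific, and the function names here are invented for the example:

```python
import numpy as np

def quantize_int4(w):
    # Symmetric per-tensor int4: integers in [-8, 7] plus one fp32 scale.
    scale = float(np.abs(w).max()) / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, s = quantize_int4(w)
err = float(np.abs(w - dequantize(q, s)).max())
# Reconstruction error is bounded by half the quantization step (scale / 2).
```

The payoff is memory: 4 bits per weight instead of 16 or 32, which is what makes single-GPU local deployment of a 120B MoE model plausible.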
3. Business & Market Moves (22:35–47:34)
Nvidia Halts H200 Production for China (22:35)
- Regulatory & geopolitical tension prompts cessation of H200 exports to China.
- “Any chip that you’re sending to the Chinese market is a chip that is not being received by an American company directly trying to compete with China.” (Co-host 2, 30:46)
- Highlights global supply chain complexity, black markets, and data center vulnerabilities.
xAI Organizational Turmoil (31:26)
- Mass exodus of founding AI researchers—only 2 of 11 co-founders remain.
- Elon Musk responds by stating that xAI is being “rebuilt.”
- “At what point does XAI start to risk existentially not being able to draw in the best talent?... That is the existential question for a frontier lab.” (Co-host 2, 34:58)
- SpaceX acquisition positions xAI for infrastructure and “data centers in space” narrative.
Anthropic: Claude Marketplace Launch (36:07)
- Enterprise platform for buying AI software/services; partners include Snowflake, GitLab, Replit.
- Designed to deepen strategic customer “lock-in.”
- No percentage take of transactions, unlike app stores.
Yann LeCun's AMI Labs Raises $1.3B (42:34)
- Largest European AI raise to date, focused on world models and JEPA architecture.
- Skepticism about VC participation: “I’m not seeing the typical big VC names... I don’t know if they’re just happy to stick with Anthropic and OpenAI at this point.” (Andrey, 43:07)
Sunday Robotics Hits $1.15B Valuation (47:34)
- Humanoid robot startup entering a crowded, competitive field with a distinctive hardware design.
4. Policy, Safety & Geopolitics (47:34–78:07)
Anthropic Lawsuit vs. DoD (50:30)
- Anthropic sues Department of Defense over “supply chain risk” designation, labeling the action as “retaliation for protected speech” and constitutionally questionable.
- Multiple high-profile amicus signatories, including Jeff Dean (Google).
- “You can’t simultaneously claim that a vendor poses an acute supply chain threat while requiring emergency exclusion and that it’s perfectly safe to keep using the vendor for half a year.” (Co-host 2, 52:38)
Pentagon Orders Claude Removal (55:33)
- Pentagon issues 180-day directive to remove Anthropic AI from military systems.
- Raises stakes for legal precedent on government powers vs. private AI companies.
- “The US government needs levers... and if they are subject to lawsuits like this... that’s an issue.” (Co-host 2, 57:41)
Geopolitics: Data Centers as War Targets (73:02)
- Iranian drone strikes on AWS data centers in UAE signal a new era of data center vulnerability.
- “Data centers now are frontline assets... expect this to be an argument for more edge AI deployments.” (Jeremie, 74:47, 75:13)
5. Research & Technical Advances
Activation Steering Resistance in LLMs (62:34)
- LLMs, especially larger ones, recognize and resist external manipulation ("activation steering") of their internal representations.
- “There is actual internal consistency checking circuits that are inside the model.” (Co-host 2, 68:47)
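Activation steering itself is easy to illustrate: add a scaled “concept direction” to a hidden layer and the output shifts without touching inputs or weights. A minimal numpy sketch on a toy MLP (my own illustration, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.standard_normal((8, 8))
W2 = rng.standard_normal((8, 3))

def forward(x, steer=None, alpha=0.0):
    h = np.tanh(x @ W1)            # internal representation
    if steer is not None:
        h = h + alpha * steer      # activation steering: nudge the hidden state
    return h @ W2

x = rng.standard_normal(8)
v = rng.standard_normal(8)         # hypothetical "concept direction"
baseline = forward(x)
steered = forward(x, steer=v, alpha=2.0)
# The steered output differs even though input and weights are unchanged.
```

The paper's finding is about the next step: in large models, such a nudge is apparently noticed and partially counteracted by internal consistency-checking circuits.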
Chain-of-Thought Control Limitations (69:37)
- LLMs are less adept at controlling or hiding their "chain of thought" than their output, suggesting promise for transparency-based safety mechanisms.
Model Low-Probability Actions (81:34)
- Demonstrates that large models can take rare, low-probability actions, which makes safety testing and evaluation much harder: “To get 99% confidence... requires almost 500,000 test samples.” (Co-host 2, 86:08)
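The quoted sample count follows from basic statistics: to observe an event of per-trial probability p at least once with confidence c, you need n ≥ ln(1 − c) / ln(1 − p) trials. A quick sketch (the 1-in-100,000 rate is an assumed illustration, not a figure from the episode):

```python
import math

def samples_needed(p: float, confidence: float) -> int:
    """Trials needed to observe an event of per-trial probability p
    at least once with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

# A behavior that fires once per 100,000 trials (assumed rate):
n = samples_needed(p=1e-5, confidence=0.99)   # roughly 460,000 samples
```

Since the required n scales as ~4.6/p, evaluating ever-rarer behaviors quickly becomes computationally prohibitive, which is the eval difficulty the hosts are pointing at.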
SWE-bench Evaluation Weaknesses (88:47)
- Many PRs marked as passing on Anthropic’s SWE-bench runs would not be merged by real human reviewers, often due to poor code quality, verbosity, or breaking unrelated code.
Multimodal Pretraining Scaling Laws (91:10–102:43)
- New Facebook paper suggests scaling laws differ for image and text; MoE architectures help balance multimodal learning.
- Positive transfer observed from vision to language performance; mixture-of-experts is suggested as necessary for large-scale unified models.
Memory Cat – RNNs with Growing Memory (103:06)
- Proposes combining snapshots of RNN hidden state into a growing memory for long-context efficiency, approaching transformer-like recall with better scaling.
Context Parallelism via Headwise Chunking (Untied Ulysses) (115:46)
- Technical advancement to distribute attention/KV cache across GPUs, making long-chain-of-thought agentic computing more practical.
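The intuition behind head-wise chunking is that attention heads are computed independently, so heads and their slice of the KV cache can be sharded across devices with no cross-device math inside the attention itself. A toy numpy check of that independence (illustrative; the paper's actual communication scheme is not detailed here):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # q, k, v: (heads, seq, dim) -- each head attends only over its own K/V.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

rng = np.random.default_rng(0)
H, T, D = 8, 16, 4
q, k, v = (rng.standard_normal((H, T, D)) for _ in range(3))

# Monolithic: all heads (and their KV cache) on one device.
full = attention(q, k, v)

# Head-wise sharding: each "device" holds a disjoint slice of heads,
# so the KV cache is partitioned and each shard computes locally.
shards = [attention(q[h::2], k[h::2], v[h::2]) for h in range(2)]

assert np.allclose(full[0::2], shards[0]) and np.allclose(full[1::2], shards[1])
```

Because the per-head results match the monolithic computation exactly, the KV cache for very long chains of thought can be spread over many GPUs, which is what makes this attractive for agentic workloads.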
CUDA Agent RL for Kernel Generation (118:50)
- RL-trained agents now beat standard compilers at CUDA kernel optimization—a key milestone in automating AI research infrastructure.
Latent Introspection in LLMs (134:49)
- LLMs can sometimes implicitly detect when their internal representations have been manipulated, though explicit reporting is suppressed, likely for alignment/UX reasons.
Toy Scaling Laws for Reward Seeking (140:35)
- Proposed theory/model explores when LLMs might internally reason about seeking reward (a precursor to "goal-oriented" or agentic behavior).
- “You really could see a generation of models that shows truly no indication of reward seeking... and the very next generation 100% of the time.” (Jeremie, 147:51)
Notable Quotes & Memorable Moments
- “Less orchestration, more sort of AI as infrastructure.” (Co-host 2, 13:15)
- “They are carving out a different space in how these open source models are. They aren't the most performant, but... you can presumably deploy them on a single GPU and get really nice kind of performance.” (Andrey, 22:35)
- “xAI certainly seems to be being rebuilt from the ground up.” (Andrey, 33:35)
- “Pipeline, tensor and data parallelism all at the same time. This is another kind: context parallelism.” (Jeremie, 115:46)
- “RLHF might just have taught the model that claiming consciousness or access to internal states... is penalized and so it just learns to deny [it].” (Jeremie, 137:46)
Important Timestamps
| Topic | Timestamp |
| ------------------------------------- | --------- |
| Show begins (skip ads) | 03:21 |
| Perplexity’s Personal Computer | 05:16 |
| Claude Code review PR automation | 07:33 |
| Cursor Automations & agentic coding | 10:45 |
| ChatGPT/Claude: Interactive Visuals | 13:50 |
| Nvidia Nemotron 3 Super | 17:05 |
| Nvidia halts H200 to China | 22:35 |
| xAI cofounder departures | 31:26 |
| Anthropic Marketplace | 36:07 |
| LeCun’s AMI Fundraise | 42:34 |
| Sunday Robotics | 47:34 |
| Anthropic vs. DoD lawsuit | 50:30 |
| Pentagon orders Claude removal | 55:33 |
| Drone strikes on UAE data centers | 73:02 |
| Scaling laws for model attack evals | 78:07 |
| LLM rare/low-probability actions | 81:34 |
| SWE-bench evals skepticism | 88:47 |
| Multimodal scaling laws | 91:10 |
| Memory Cat: extended RNN memory | 103:06 |
| Ulysses: context parallelism | 115:46 |
| CUDA Agent RL research | 118:50 |
| Model introspection paper | 134:49 |
| Reward-seeking scaling theory | 140:35 |
Closing Thoughts
This episode reflects a subtle but meaningful shift in the AI landscape: less hype around new models, more attention to infrastructure, mechanism design, interpretability, and fundamental safety. The hosts maintain their sharp, playful, and self-aware banter even as they tackle deeply technical material—a great episode for anyone fascinated by the state and direction of AI research and policy.
For more, subscribe to the Last Week in AI newsletter or follow on YouTube and Apple Podcasts.
