Jeremy Harris (106:56)
little bit, your memory requirements are going to grow as the text gets longer, but in a more controlled way, and in a way that you can directly control and kind of customize. And so there's this interesting argument in the paper that maybe this is a kind of unification of the transformer and recurrence approaches. I think that's a little bit generous to say, but it certainly is interesting. It's a totally architecture-agnostic plug-in for any RNN, so you can use an RNN that's already been trained and then just slap this on, which is a kind of cool capability. And yeah, this does happen at every layer, right? Your RNN still has many layers, and at every layer you're maintaining a kind of memory stash, that little vector we talked about, so you are running this algorithm at every stage.

They look at a couple of different ways of doing this. The first is, I alluded to it, this checkpoint mode where you start the RNN, you start scanning your text, and you take snapshots every 100 pages or whatever. But the other approach is to use what they call independent compressor mode. And so here, every hundred pages, instead of continuing to let that memory vector evolve, what they're going to do is actually reset it and get it to start from scratch instead of carrying, if you will, the baggage from the previous hundred pages. I mean, you could view it as baggage, or you could view it as context that's actually kind of helpful, and that's why they tried both approaches.

So the challenge is, at the end of the day, when you have this kind of Frankenstein stitched-together set of memory vectors, you now have to decide, okay, how am I going to aggregate these together to get a single output? And they try a bunch of different strategies. There's this thing called residual memory, where they just sum up all the memories together with the current one at each step. Very simple, but it treats all past segments equally, regardless of how relevant they are, which is a challenge, right? Because the whole point of attention is that it allows you to focus more, well, attention on one part of the text or another. With this approach, if you're just going to glue all these together and kind of average the values, you're not really caring whether this hundred pages of the book, for example, is more relevant to the question I'm asking versus, say, the first hundred pages. They also try other techniques to do this. I won't go into too much detail, but it is actually really interesting to look at how they're trying to solve this problem.

I have a couple of gripes with this. One of them is, they talk about, well, I talked about 100 pages, right? Maybe every 100 pages you take a snapshot. They go, well, if you take a snapshot at every single token, in the L equals 1 limit (a segment length of one), you recover a transformer. I mean, they don't quite say that, but that's kind of the framing here. Setting aside that it's still fundamentally an RNN update rule versus an attention update rule, I don't know that that's the case. They do show that this approach is expressive enough to get a transformer back in some special cases, but it's not a generalizable fact about this. So to say that it's fully a unification of recurrence and transformers, I think, is a bit of a stretch, but it's a step on the way and quite interesting.
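To make those two modes and the simplest aggregation strategy concrete, here's a rough sketch; the segment length, the update rule, and the function names are illustrative assumptions on my part, not the paper's actual equations.

```python
import numpy as np

# Rough sketch of the two compression modes discussed above. The fixed
# segment length, the tanh update, and all names are illustrative
# assumptions, not the paper's method.

SEGMENT_LEN = 100  # "every hundred pages"


def rnn_update(memory, token_embedding):
    # Stand-in for whatever recurrent update the underlying RNN layer uses.
    return np.tanh(memory + token_embedding)


def checkpoint_mode(tokens, d):
    """One evolving memory; snapshot it at the end of every segment."""
    memory = np.zeros(d)
    snapshots = []
    for i, tok in enumerate(tokens, start=1):
        memory = rnn_update(memory, tok)
        if i % SEGMENT_LEN == 0:
            snapshots.append(memory.copy())  # carries "baggage"/context forward
    return snapshots, memory


def independent_compressor_mode(tokens, d):
    """Reset the memory at each segment boundary and compress from scratch."""
    memory = np.zeros(d)
    snapshots = []
    for i, tok in enumerate(tokens, start=1):
        memory = rnn_update(memory, tok)
        if i % SEGMENT_LEN == 0:
            snapshots.append(memory.copy())
            memory = np.zeros(d)  # start fresh, no carry-over from earlier segments
    return snapshots, memory


def residual_memory(snapshots, current_memory):
    """Simplest aggregation strategy mentioned: sum everything, which weights
    all past segments equally regardless of relevance."""
    return current_memory + sum(snapshots)


# Toy usage on a random "document"
tokens = [np.random.randn(64) for _ in range(350)]
snaps, mem = checkpoint_mode(tokens, d=64)
out = residual_memory(snaps, mem)
```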
So they do show improvements over RNN baseline models, right, relative to when you don't do the snapshot strategy, and across many benchmarks it's very clear that it is adding value. The key question is how it compares to transformers, on recall tasks especially: it's competitive, but it's not superior. And it is better from an efficiency standpoint than transformers at long context lengths, which you would expect, because that's where you start to see issues with transformers just from the sheer size of the N-squared cost over all those tokens. Anyway, they look at a whole bunch of different parameter scales and token scales, but they don't show the kind of loss-versus-compute scaling law curves that would tell you whether this approach closes the gap with transformers as you scale up. That seems like a really big gap; I would love to see that. The question every time you see a paper, and you've heard Andre and me say this a lot, is not how impressive is this paper; it's, does this paper suggest that the result can operate at scale better than the alternative? And without clear scaling curves, it's really hard to tell. It's a relatively small scale being experimented on here: 1.3-billion-parameter models and a 100-billion-token budget, which is fairly modest when you look at some of the multi-trillion-token corpora out there, the Llama series, 7-billion-parameter models everywhere. So this is a relatively small-scale test, which again is why I'd really love to see scaling laws here. And you still see transformers win on recall tasks, which is not surprising. I mean, transformers crush it on recall; they're literally looking at everything in memory at the same time. So the framing here really is more about closing the gap. But again, given that the whole point is to close the gap, you need to tell a scaling story, so I really wish that had been included. And there are also no inference-time scaling results. So those are some things, but this is a really interesting starting point and something that should be looked at in more detail.

All right, so next up we have Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking, which is really hard to say three times fast. This is a paper out of Together AI, and we've covered a whole bunch of Together AI papers. As a reminder, it's one of the kind of proliferating number of organizations focused on decentralized AI training, the sort of torrenting version of the future of AI, where you should be able to train AI models a little bit on everybody's local laptop or whatever, in the extreme case. And so you see, out of a lot of these groups, and especially Together, a lot of hardware-level innovation. It reminds me of some of the stuff you'll see out of DeepSeek: that kind of focus on how we can get these pieces of hardware to work together with our models in a very co-optimized way. So these tend to be some of the most interesting papers to read.

One of the core problems this paper focuses on is that when you're training, especially with agents that generate huge amounts of context, their chains of thought can be truly massive. You have essentially a situation where a single GPU cannot necessarily hold the entire chain of thought in its head at the same time, right? There's just so much material that the KV cache explodes on you, the KV cache being that part of attention that holds all the context, the sort of numerical representation of the context.
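Just to give a sense of the scale involved, here's a back-of-the-envelope KV-cache calculation; the model shape is an illustrative assumption (roughly a 7B-class dense decoder in fp16), not numbers from the paper.

```python
# Back-of-the-envelope KV-cache size. The model shape below is an
# illustrative assumption (roughly a 7B-class decoder with full
# multi-head attention in fp16), not taken from the paper.
layers = 32
kv_heads = 32
head_dim = 128
bytes_per_value = 2  # fp16/bf16


def kv_cache_bytes(seq_len, batch=1):
    # 2x for keys and values, stored for every layer and every KV head.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * seq_len * batch


for seq_len in (8_000, 128_000, 1_000_000):
    print(f"{seq_len:>9} tokens -> {kv_cache_bytes(seq_len) / 1e9:.1f} GB")

# Roughly 0.5 MB per token: ~4 GB at 8k tokens, ~67 GB at 128k, ~524 GB at 1M.
# Long agentic chains of thought quickly outgrow a single GPU's memory.
```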
And so you need to find a way to share that context across devices. Now, this is going to be a new kind of parallelism, right, and an increasingly important one. We've talked in the past on the show about a whole bunch of different parallelisms. There's data parallelism, where I have a chunk of data that I want to send to one GPU and a chunk of data I want to send to another and to another, or to one server rack or to another server rack. There's also pipeline parallelism, where you're going to send a few layers of the model to different GPUs or different racks. And then there's tensor parallelism, where, okay, now we can even cut layers in two or in three and send chunks of layers to individual GPUs. And you typically do all of these at the same time, so multi-dimensional parallelism: pipeline, tensor, and data parallelism all at once.

This is another kind of parallelism, where you can also parallelize the context itself, so this big chunk of text, whether it's the prompt, but more typically the response, and you parallelize that across devices. And so this is for cases where a single GPU just can't hold the full attention matrix in memory. What they're going to do is split the context along the sequence dimension, so essentially the first part of the context goes to one GPU, the second to another, and so on.

And it's really important to be able to orchestrate and coordinate all this activity. Every time you add a new kind of parallelism, especially when you're doing reinforcement learning, and I'll explain why in just a second, you are introducing more orchestration headaches. You're introducing more opportunities for GPU 1 to finish its job way before GPU 2 and then just be sitting idle, and for GPU 3 to be too fast or too slow. And so this ends up leaving you with massive gaps, and those gaps get resolved through orchestration. And the reason this is so important with RL in particular is that you'll often have a situation where you take a model and then you have to send that model out to a bunch of nodes, a bunch of GPUs or whatever, to generate rollouts. And those rollouts take time, and each one takes a different amount of time.
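Before getting further into the RL angle, just to picture what splitting along the sequence dimension looks like, here's a tiny conceptual sketch of a Ulysses-style sequence-to-head exchange; the shapes and the exact exchange pattern are my own illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Conceptual sketch of context (sequence) parallelism, not the paper's actual
# implementation: shard a long sequence across devices, then do a
# Ulysses-style exchange so each "device" sees the full sequence for only a
# subset of attention heads. All sizes here are toy values.

num_devices = 4
seq_len, num_heads, head_dim = 8, 4, 16

# Full activations for one layer: (seq_len, num_heads, head_dim)
x = np.random.randn(seq_len, num_heads, head_dim)

# 1) Sequence split: device i holds its slice of tokens for ALL heads.
seq_shards = np.split(x, num_devices, axis=0)

# 2) All-to-all head exchange: afterwards, device i holds ALL tokens but only
#    its subset of heads, so it can run full attention for those heads without
#    any single device materializing the whole thing for every head.
heads_per_dev = num_heads // num_devices
head_shards = [
    np.concatenate(
        [shard[:, i * heads_per_dev:(i + 1) * heads_per_dev, :] for shard in seq_shards],
        axis=0,
    )
    for i in range(num_devices)
]

assert head_shards[0].shape == (seq_len, heads_per_dev, head_dim)
```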