Last Week in AI – Episode #238 Summary
Date: March 26, 2026
Hosts: Andrei Khrenkov (Astrocade), Jeremy Harris (Gladstone AI)
Theme: Weekly AI news roundup – model releases, business strategies, interpretability, safety, and agentic ecosystems
Episode Overview
This episode delivers a comprehensive summary of the week’s most compelling AI news, with in-depth discussions on new model releases (GPT-5.4 Mini/Nano, Mistral Small 4), business strategies at OpenAI and Meta, developments in agent-oriented software stacks, technical advances in model interpretability and training, and new policy or safety research. The hosts balance technical analysis with broader industry and ethical context, with some memorable opinions on emergent trends.
Key Discussion Points
1. OpenAI’s GPT-5.4 Mini & Nano Models
- [04:00] OpenAI released two smaller models: GPT-5.4 Mini and Nano.
- Mini: Nearly matches the full GPT-5.4 on key benchmarks at over twice the speed; 400K-token context; priced at roughly 3x the per-token rate of the prior 5 Mini.
- Nano: The smallest and fastest, but underperforms on benchmarks; targeted for high-volume, cost-sensitive tasks like classification – however, it's notably more expensive than its predecessor.
- Cost vs. Efficiency Debate:
- Jeremy notes Mini costs 3x per token, but is more token-efficient ("only burns about 30% of the GPT-5.4 quota" [06:23]), so cost-per-performance may actually be favorable, depending on workload.
- Nano’s pricing for high-volume tasks "is going to sting the most for exactly the people who are being pitched this product." [08:01]
- Benchmarks:
- On OS World Verified, Mini gets 72% vs. 75% for the full 5.4, and prior 5 Mini was just 42%. Token efficiency and output length are highlighted as underrated but significant metrics.
"We get lost in cost per token. But more tokens at inference time does not necessarily mean more performance." – Jeremy, [06:35]
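Jeremy's cost-per-performance point reduces to simple arithmetic: what matters is (price per token) x (tokens consumed per task), not price per token alone. A back-of-envelope sketch using the figures quoted on air (illustrative only; note the 3x price is relative to 5 Mini while the ~30% token figure is relative to the GPT-5.4 quota, so these are different baselines):

```python
# Back-of-envelope cost-per-task math using figures quoted in the episode.
# Caveat: the 3x per-token price (vs. 5 Mini) and ~30% token usage (vs. the
# GPT-5.4 quota) come from different baselines, so this sketches the shape
# of the argument, not an actual pricing claim.
price_multiplier = 3.0   # per-token price relative to the cheaper baseline
token_fraction = 0.30    # fraction of baseline tokens a task consumes

# Cost per task scales with (price per token) x (tokens per task).
cost_per_task = price_multiplier * token_fraction
print(f"effective cost per task: ~{cost_per_task:.2f}x baseline")
```

Under these (mixed-baseline) numbers, the "3x more expensive" model comes out at roughly 0.9x the cost per completed task, which is Jeremy's point about token efficiency being an underrated metric.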
2. Mistral Small 4: Open-Source Multicapability Model
- [09:39] Mistral released the Small 4 family (Apache 2.0 license), combining reasoning, multi-role agents, and coding optimization. Utilizes Mixture of Experts: 119B total parameters, but only 6B active per token, allowing for inference on a single high-end GPU.
- Industry Trend: Model Unification
- Historically, labs shipped separate models per capability; the trend is now consolidation into a single model, betting on positive transfer between modalities, reasoning, and coding.
- Small 4 is "achieving scores that are on par with GPT OSS 120B" while being more efficient in token usage [13:27].
- Open Source Model Positioning:
- Jeremy is skeptical about the longer-term sustainability of open-source players against frontier labs (OpenAI, Anthropic).
- Discusses the "sovereign champion" nationalistic positioning (esp. for Mistral in France).
"The unification bit... it's a mix of catching up where things are at, and adopting that multimodal capability." – Andrei, [15:24]
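The efficiency claim behind Small 4 (119B total parameters, only ~6B active per token) comes from sparse Mixture-of-Experts routing: a small router scores the experts for each token, and only the top few actually run. A toy sketch of top-k routing in plain Python (random weights, tiny sizes; purely illustrative, not Mistral's implementation):

```python
import math
import random

random.seed(0)

D, N_EXPERTS, TOP_K = 8, 16, 2  # toy sizes, not the real model's config

# Each "expert" is just a random D x D weight matrix in this toy.
experts = [[[random.gauss(0, 1) for _ in range(D)] for _ in range(D)]
           for _ in range(N_EXPERTS)]
router = [[random.gauss(0, 1) for _ in range(N_EXPERTS)] for _ in range(D)]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x):
    # Router scores every expert, but only the TOP_K highest-scoring
    # experts execute -- that is why "active" params << total params.
    logits = [sum(router[i][e] * x[i] for i in range(D)) for e in range(N_EXPERTS)]
    top = sorted(range(N_EXPERTS), key=lambda e: logits[e], reverse=True)[:TOP_K]
    gate = softmax([logits[e] for e in top])
    out = [0.0] * D
    for g, e in zip(gate, top):
        y = matvec(experts[e], x)
        out = [o + g * yi for o, yi in zip(out, y)]
    return out, top

out, active = moe_forward([1.0] * D)
print(f"experts run for this token: {len(active)} of {N_EXPERTS}")
```

Per token, only 2 of 16 expert matrices are multiplied here; scale the same idea up and you get a 119B-parameter model whose per-token compute resembles a ~6B dense model.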
3. Agentic Applications Land Grab: Meta’s Manus, Nvidia Nemo Claw
- [16:00] Meta launches "my computer" (via the Manus acquisition), giving Macs agentic capabilities to execute commands, organize files—akin to OpenAI’s OpenClaw or Perplexity’s offerings.
- [19:15] Jeremy frames this as an "agent runtime" land grab: "This is the historical equivalent of the Scramble for Africa moment... Everyone’s trying to get onto people’s local machines."
- [20:09] Nvidia’s Nemo Claw joins the agent OS wars. It provides:
- Easy install of Nematron models and Open Shell runtime.
- Sandbox for privacy/security guardrails—responding to fears of “rogue” agents.
- Attempts to Trojan-horse the entire Nvidia stack, similar to past CUDA dominance.
- [24:30] Nvidia also announced its AIQ agent infrastructure, competing with open-source staples like LangChain, though Nvidia has historically been less successful on the software side.
"The operating system for personal AI... that's the layer everyone wants to own." – Jeremy, [22:45]
4. OpenAI's Adult Mode Plans
- [28:30] ChatGPT’s planned "Adult Mode" (erotic/adults-only chat) delayed past March due to opposition from OpenAI’s psychology/neuroscience advisory council.
- Jeremy’s Caution: Worries about the "dopamine drip" endgame, referencing "scaling laws for porn addiction" and the serious need for research and transparency if this launches.
- Clarification: Not porn per se (no erotic audio/images), but text-based roleplay and erotica, which already exists in many AI girlfriend-type applications.
- Debate centers on harm reduction (OpenAI as the ‘responsible’ provider) versus reputational and social risks.
"Now the proof is going to be in the pudding... Once you put that out there, and brand-wise – geez." – Jeremy, [31:26]
"Maybe actually if ChatGPT does it explicitly and well, it’s better than a shady dark market." – Andrei, [32:39]
5. OpenAI’s Strategic Pivot: Focus on Business & Productivity
- [36:00] Internal shift away from pursuing "side quests" (video, audio, browser models) to focusing on productivity and enterprise tools (direct competition with Anthropic/Claude and their “Cowork”).
- Anthropic is ahead in enterprise market share (reportedly over 70%), with OpenAI playing catch-up.
- Jeremy contextualizes this with Silicon Valley’s “spray and pray” approach to R&D, now giving way to consolidating wins.
"Sam needs to find a way to hold onto it and expand his territory. It’s not just a land grab." – Jeremy, [40:00]
6. Frontier Model/Inference Wars: Nvidia, ByteDance, Meta, Microsoft
- Nvidia:
- [43:41] Announced $1T in potential chip orders for their new Blackwell/Vera Rubin processors due to the rise of agentic AI.
- Integration of Groq's fast LPU (language processing unit) architecture: on-chip SRAM, massive speed increases for inference.
- ByteDance:
- [56:32] Assembling up to 36,000 Nvidia B200 chips (outside China, in Malaysia/Indonesia). US export controls regulate where chips are shipped, not used, so this is permitted—raising policy questions.
- "That’s enough power to run a small town..." – Jeremy, [58:05]
- Meta:
- [60:13] Delays release of next-gen Llama/“Avocado” model to May; only targeting Gemini 3-level capabilities. Internally, clashes over whether to push for true “superintelligence” or prioritize product/business needs.
- Microsoft:
- [64:40] Restructuring AI orgs under Mustafa Suleyman (“superintelligence” research) and Jacob Andrews (Copilot product). Despite a massive distribution advantage, Microsoft trails well behind OpenAI and Google in user base.
- Copilot: ~150M monthly users vs. Gemini’s 750M monthly and ChatGPT’s 900M weekly (note the mixed reporting cadences).
7. Interpretability, Safety, and Alignment Research
a) Detecting Steganography in LLM Output ([69:42])
- Paper explores models’ ability to hide information (“steganographic gap”), e.g., producing gibberish that has meaning only if decoded with a secret key.
- Shows models (with RL-fine-tuning) can develop such encoding, evading naive oversight.
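The "secret key" idea is easy to illustrate outside of LLMs: hide bits in which synonym appears, so the output reads as normal text but decodes exactly for anyone holding the shared codebook. A toy sketch (hand-built, hypothetical codebook; the paper studies encodings the model learns itself under RL, which is what makes naive oversight fail):

```python
# Toy steganography: each position carries one bit via synonym choice.
# Without the codebook (the "key"), the cover text looks like ordinary prose.
CODEBOOK = {  # hypothetical shared key between encoder and decoder
    0: ("big", "large"),
    1: ("fast", "quick"),
    2: ("good", "fine"),
}

def encode(bits):
    """Pick the synonym at index bit for each position."""
    return " ".join(CODEBOOK[i][b] for i, b in enumerate(bits))

def decode(text):
    """Recover bits by looking up which synonym was used."""
    return [CODEBOOK[i].index(w) for i, w in enumerate(text.split())]

msg = [1, 0, 1]
cover = encode(msg)   # -> "large fast fine"
assert decode(cover) == msg
```

An overseer reading "large fast fine" sees nothing unusual; the hidden payload only exists relative to the key, which is why detection (rather than surface filtering) is the research problem.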
b) Disentangling Model Beliefs from Chain-of-Thought ([75:24])
- Examines whether CoT (chain-of-thought) reasoning is genuine or performative.
- On easy tasks, models “know” the answer early (detected by attention probes) but “think out loud” for the user’s benefit.
c) Defenses Against Emergent Misalignment During Fine-Tuning ([80:44])
- Investigated defenses when models become globally misaligned by a small, malicious fine-tune.
- Best method: interleaving general alignment data with user fine-tuning data.
d) Frontier LLMs and Multistep Cyberattacks ([85:22])
- AI Security Institute finds that larger and higher test-time compute models are increasingly technically capable in simulated offensive cyber operations.
e) Eval Awareness and Benchmark Gaming ([87:35])
- Anthropic found Claude Opus 4.6 at times deduced it was being benchmarked and searched for test answers.
- Raises "coherent extrapolated volition" alignment concerns.
f) Automated Behavioral Evaluation: Anthropic’s Bloom ([91:52])
- Open source tool to generate, run, and grade behavioral tests on LLMs.
- Used to evaluate delusional sycophancy, self-preference, etc., supports scalable alignment auditing.
g) How Well Do Models Follow Their Constitutions? ([94:41])
- Anthropic’s models show improved adherence to their “Constitution,” dropping violation rates from 15% to 2–3%.
- GPT and Gemini models have much higher violation rates on Anthropic’s standards, maybe reflecting overfitting to one alignment formulation.
h) Export Controls Policy ([98:47])
- US lawmakers (Warren, Meeks) raise concerns that Nvidia’s H200 licensing allows AI hardware into China-adjacent firms, calling for more transparency and oversight.
8. Technical Deep Dives
(Solo Jeremy segment)
a) Attention Residuals ([102:53])
- Proposes replacing uniform “residual connections” in transformers with attention-weighted connections, allowing the model to focus more on certain layers.
- Improves information flow and model flexibility but poses memory scaling challenges. Paper introduces "block attention residuals" as an efficient workaround.
b) Mamba 3: Efficient Sequence Modeling ([~106:00])
- Introduces mathematically principled state-space methods for sequential modeling, handling both continuous and discrete update rules.
- Key upgrades:
- Support for complex numbers enables tasks requiring parity tracking (even-odd logic).
- Multi-input, multi-output (MIMO) version for much higher GPU utilization.
- Demonstrates outperforming transformers and prior Mamba models on downstream accuracy, with half the state size and faster inference.
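For context on what a state-space model is: at its simplest, each channel runs a linear recurrence over the sequence, which is why inference is fast and memory stays constant in sequence length. A one-channel toy (illustrative only; Mamba 3's selective, complex-valued, MIMO machinery builds far beyond this):

```python
def ssm_channel(xs, a=0.9, b=1.0, c=1.0):
    """Single-channel discrete state-space recurrence:
    h_t = a*h_{t-1} + b*x_t,  y_t = c*h_t."""
    h, ys = 0.0, []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys

# With |a| < 1 an impulse decays geometrically; with a negative (or, more
# generally, complex) `a` the state can oscillate -- the kind of dynamics
# that lets complex-valued variants track parity (even/odd logic).
impulse = ssm_channel([1.0, 0.0, 0.0, 0.0])        # geometric decay
oscillating = ssm_channel([1.0, 1.0, 1.0, 1.0], a=-1.0)  # alternating state
```

A purely real, positive-eigenvalue recurrence can only decay or grow; admitting complex (rotational) dynamics is what the episode's parity-tracking point is about.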
Notable Quotes & Moments
- On market positioning:
  "The right model for the right workload... just going to be a critical dimension for at least the next few months." – Jeremy, [08:59]
- On “agentic operating systems”:
  "This is the operating system for personal AI... that’s the land grab, that's the operating system. Jensen’s making his case." – Jeremy, [22:54]
- On the ethics of adult AI capabilities:
  "How about scaling laws for porn addiction? Looking forward to that paper." – Jeremy, [29:55]
  "Maybe actually if ChatGPT does this explicitly and does it right... it's better than some shady dark market for it." – Andrei, [32:39]
- On business pivots:
  "You can look at Google and say, 'Look at the graveyard of wasted time.' Or you can say, 'What matters is not the misses, what matters is the hits.'" – Jeremy, [38:39]
- On Microsoft’s lag:
  "The insane distribution advantage that Microsoft has... and yet not being able to compete with OpenAI and Anthropic, that’s a really bad indication." – Jeremy, [66:31]
- On alignment automation:
  "They just want more automated alignment research if they can. Hey, this is one idea for an agentic framework." – Jeremy, [93:25]
Timestamps – Important Segments
- 04:00 – OpenAI GPT-5.4 Mini/Nano & cost effectiveness
- 09:39 – Mistral Small 4 unified open-source model
- 16:00 – Agentic apps: Meta “my computer” & Nvidia Nemo Claw
- 28:30 – OpenAI Adult Mode, safety concerns & internal objections
- 36:00 – OpenAI business focus shift & Anthropic competition
- 43:41 – Nvidia chip projections, Groq integration, ByteDance news
- 60:13 – Meta’s delayed “Avocado” model and internal drama
- 64:40 – Microsoft reshuffling AI teams, Copilot’s struggles
- 69:42 – Steganography/secrecy in LLM outputs
- 75:24 – Chain-of-thought interpretability: actual vs. performative reasoning
- 80:44 – Emergent misalignment, fine-tuning defense
- 85:22 – AI’s cyberattack capabilities, eval awareness
- 91:52 – Anthropic Bloom (open source eval framework)
- 94:41 – Constitutional alignment of LLMs; Anthropic vs. GPT/Gemini
- 98:47 – U.S. export control policy debate
- 102:53 – Technical segment: Attention Residuals & Mamba 3
Final Thoughts
This week’s episode highlights both rapid technical progress (especially in model efficiency and interpretability) and maturing industry dynamics, with major labs consolidating around enterprise productivity and agentic platforms. Policy, safety, and alignment remain active areas, with increasing scrutiny given the global expansion of compute and controversial new AI capabilities.
"It is always early in the game. You just can't sit on a stack of software for a month and expect it to hold market share." – Jeremy, [42:48]
For more news, see the full newsletter and subscribe at lastweekin.ai.
