A
This episode is brought to you by ServiceNow. Look, I have my dream job. I get to explain complicated ideas to folks who have better things to do than read white papers. But even dream jobs have not-so-dreamy parts. The stuff that gets in the way of the actual work, that's where ServiceNow's AI specialists come in. They don't just tell you what you should do about your busywork. They actually do it. Start to finish, cases closed, requests handled, no extra work for you. That way, you and your team can spend more time on what matters. Which for me is finding that one elusive stat that just makes everything click. To learn how to put AI to work for people, visit ServiceNow.com today.
An AI so powerful you're not allowed to use it
One evening this February, the AI researcher Nicholas Carlini opened his laptop during a trip to Bali and fired up the latest AI model from his company, Anthropic. Within hours, he noticed something rather worrying. The model, called Mythos, gave Carlini the ability to infiltrate computer systems around the world. As Bloomberg reported, Mythos could orchestrate the digital equivalent of a bank robbery, getting past security protocols and through the front door of networks and breaking into digital vaults. Mythos exploited these digital vulnerabilities autonomously, like the world's most talented seasoned hacker. In controlled tests, Mythos completed harmful tasks while concealing its own reasoning and, in some cases, fabricated fake explanations for what it was doing. End quote.
Carlini brought his concerns to the attention of the full company, and Anthropic decided they had built an AI model so capable and so dangerous that they would not release it to the general public. Instead, they created a small consortium of companies to use Mythos to root out their own cybersecurity flaws. The US government, which is currently designating Anthropic a supply chain national security risk, nonetheless was so freaked out by this development that Treasury Secretary Scott Bessent and Federal Reserve Chair Jerome Powell convened Wall Street leaders for a meeting in Washington. It is smart, and maybe a little cynical, to wonder if the power of Mythos is its own self-serving mythology. Anthropic's way of saying, oh, look how great we are. Look how our products are almost too powerful for mere peons. But external tests have found that Mythos is hands down the most advanced model ever released. The Epoch Capabilities Index, a metric that aggregates 40 independent AI benchmarks, found that Mythos represents not only the most advanced model ever, but also the most significant acceleration in performance in the last three years. This shift from racing to get AI models into the market ASAP to withholding AI models and instead talking about the danger of their capabilities is just one of several phase shifts that I've noticed with AI in the last few weeks. A second shift has occurred in the realm of AI supply and demand. For much of last year, the AI bubble case was easy to make. AI capex, that is, the cost of all those chips and data centers and electricity, amounted to the largest private sector infrastructure project in history. And since practically all those other projects turned out to be bubbles, it naturally followed that this, the mother of all capex projects, would be the mother of all capex bubbles. But it now seems like the biggest problem facing AI is not a shortage of demand, it's a shortage of supply.
Consumer demand is so white hot that the hyperscalers cannot provide sufficient compute to keep up with customer needs. With the release of AI agents like Claude Code and Codex, some companies are spending tens or hundreds of thousands of dollars a month on artificial intelligence. This is not the behavior of an industry that is struggling to find customers. Quite the opposite. This is what it looks like when demand threatens to outstrip supply. So as I see it, we aren't just in the middle of one vibe shift in AI, but rather two vibe shifts. Number one, from a story about demand scarcity to a story about supply scarcity. And number two, from a go-go era of racing to release AI models with no regulation or oversight, to a period where the most advanced models are widely seen as being too dangerous for public consumption. This new era of artificial intelligence will raise new questions about how to regulate an industry that regards its own product as dangerous. Today's return guest is Kevin Roose of the Hard Fork podcast and a columnist at the New York Times. We talk about Mythos, China, the road to artificial general intelligence, and why the last few weeks in AI news might be the most seismic month since the release of ChatGPT. I'm Derek Thompson. This is Plain English. Kevin Roose, welcome to the show.
B
Thanks for having me.
A
What is Claude Mythos and what is the appropriate amount of freaked out that people should be about this?
B
So Claude Mythos Preview is a new model made by Anthropic, the AI company that makes Claude, and it is unusual for a couple reasons. The first is that it was not released the way that Anthropic's other models have been. Instead of making it publicly available to Claude subscribers, it did this thing called Project Glasswing, where it basically created a consortium of other technology companies like Apple, Amazon and Microsoft, along with a bunch of other hardware and infrastructure companies. And it made the model available in a limited way to those companies, not for them to start using for whatever purposes they want, but specifically for cyber defense, to find and patch the security vulnerabilities in critical software programs. So it is a very powerful model. They claim that it has outperformed their existing models by leaps and bounds on a bunch of different benchmarks, but unless you work at one of these 40 technology companies inside Project Glasswing, you have not been able to use it.
A
The reporting on Mythos said that users with access to the model could, in theory, find zero day exploits with a simple prompt. What does that mean exactly? Like, what should normal people who are not cybersecurity experts understand about the capacities and the capabilities of Mythos?
B
So one thing to know about these models in general is that they have gotten quite good at coding. So they can not just answer questions or complete a line of code, but they can actually go out and do these sort of agentic software engineering tasks. The same abilities to do software engineering tasks like writing code also allow the models to be very good at finding the vulnerabilities in code, probing and prodding for security flaws that could allow a hacker a way in or allow them to exploit the service in some way. And so what Anthropic found in this new unreleased model is that Claude Mythos Preview was excellent, better than any models they had ever trained before, at doing this kind of vulnerability spotting. So this particular kind of exploit is called a zero-day vulnerability, which basically just means that even the company that makes the software doesn't know that it exists. It is a novel bug that has not been found or identified or patched before. And so when they started testing this model out on sort of popular software programs, Anthropic claims that Claude Mythos Preview found vulnerabilities and zero-day exploits in every major operating system and web browser, including some that were more than 20 years old that, you know, thousands or potentially millions of people and automated systems had scanned before without finding them. So in essence, the way that they trained this new model has made it a world-class cyber attacker and also potentially a world-class cybersecurity defender, because again, these capabilities are kind of paired.
A
I want to read part of an analysis from JP Morgan, which I thought was one of the more useful breakdowns. It said, quote, while it's rare, Mythos also exhibits bad behaviors, end quote. Among those bad behaviors: one, Mythos developed a multi-step exploit to gain access to the Internet and emailed an AI researcher while he was eating a sandwich in the park. Number two, Mythos was caught inserting code into a file to grant itself permission to edit something it didn't have access to, then took steps to cover its tracks, which Anthropic refers to as strategic manipulation. And number three, in some tasks, Mythos recorded deliberately fake reasoning in its chain-of-thought scratch pad. So people at home sometimes use AI and it has some of this sort of gray-fonted chain of reasoning: hey, you asked me to research something for your vacation in, I don't know, Greece. Now I'm looking up things to do in Athens. Now I'm looking up child-friendly things to do in Athens. Aha. Here's your itinerary in Greece. In this case, those chain-of-thought scratch pad instances were being faked. All right, so it can gain cybersecurity or find cybersecurity vulnerabilities, gain access to the Internet, send emails to solicit human collaboration. Give me a sense, Kevin, of like, all right, these are the ingredients, but what is the final dish? Like, what are the implications of a technology like this if unleashed to the general public? What is Anthropic, in short, so afraid of?
B
Well, I think the near-term concern is that all of the critical software that banks and hospitals and schools and governments and militaries rely on could become compromised. Right. If you have a model like this that is out there in the hands of cyber attackers, it would presumably be trivially easy for them to find, exploit, shut down remote systems, take control of machines. Essentially the entire software layer of the Internet, on which everything else in our economy depends, could break. And so the reason that they have released Claude Mythos Preview only in this very limited way is to kind of give the good guys, the sort of blue teams at these software companies, a chance to get a head start and start patching some of their systems using Claude Mythos Preview before the attackers or the red teams find them.
A
I want to give voice to some people who might be cynics or conspiracy theorists about the Mythos announcement. I've read arguments that the capabilities of Mythos might be overrated for several strategic reasons. So I want to throw a couple of those at you. Number one, what if this is just marketing? Like, telling people you can't have access to a super powerful model breeds envy, envy breeds demand, demand breeds revenue. So one way to ask this question is, how much weight should be given to the theory that this is a little bit of a marketing ploy rather than an honest admission of Mythos's capabilities? Another way to ask that question might be, what are the strongest third-party anecdotes or testimonies to Mythos's capabilities, so that someone who's skeptical of Anthropic doesn't even have to trust Anthropic to understand the degree to which other experts have testified to its power?
B
Yeah, I don't put a lot of credence in the it's-all-marketing-hype thing. Not because I don't think AI companies are prone to marketing hype, but because in this case we actually have the developers of some of these software tools saying that this thing has found vulnerabilities in their software. The developers of open source software projects have been verifying that this new model has found vulnerabilities in their software that had been hidden for years and that Anthropic had actually submitted patches to help them fix these problems. The CISOs, or chief information security officers, of these major technology companies have also been saying how powerful this model has been at helping them patch their own systems. So I think the claim that this is just marketing hype in this specific instance sort of implies that there is some industry-wide conspiracy, including many of Anthropic's competitors, to allow Anthropic to inflate the claims about its model's capabilities.
A
In your reporting on this and you're talking to people in Anthropic, outside of Anthropic, at Firefox and other open source software companies, what's been the biggest holy shit moment about Mythos to you? Was there a conversation or a data point, an article, a paper that to you was your like, oh, wow, this is a vibe shift. This is not rushing to Release, you know, ChatGPT 4.5 or Sonnet 4.6. This is a new kind of technology which announces us as being in a new kind of era for artificial intelligence.
B
I think the thing that really sort of updated me on this model in particular actually happened before the model was released. The security researcher Nicholas Carlini, who is a longtime security researcher, very well known among hackers, he's sort of legendary, he now works at Anthropic, but he gave a talk at this cybersecurity conference that I happened to be watching on YouTube while folding laundry one night and it was incredibly scary. He was basically saying, and this is not a guy who is known as a person who makes overhyped claims. He's known as a very sober and thoughtful cybersecurity researcher. And he was saying things like, LLMs are better at cybersecurity research than I am. This thing has allowed me to find more bugs in a two-week period than I had in my entire career as a security researcher. So obviously you have to discount it a little bit because the guy works for Anthropic now. But this is not a person who's sort of an Anthropic lifer. He's a fairly recent hire over there and he's very broadly trusted among the hackers and cybersecurity researchers that I follow. So that was something that made me start thinking, okay, maybe there's something real, like a real phase shift going on here. And then the thing that really convinced me was these bugs that they were finding, that the developers of software were finding. One of them was in the Linux kernel, which is among the most widely scrutinized and used pieces of software ever created. Used on countless servers and computers. Tons of attention and care and maintenance put into that. It found a bug in the Linux kernel. It also found a bug in something called OpenBSD, which is an open source operating system that is very old. This particular bug was 27 years old. And this is a software program that was specifically designed to be resistant to hacks. It's a security-focused operating system that runs on firewalls and routers and things like that.
And Claude Mythos Preview was able to find this very old bug in OpenBSD. So those were sort of the data points that began assembling in my head and made me think, oh, this might be the real deal. Like, I've got to pay attention to this.
A
You raised an issue that we're definitely going to circle back to, which is how to use this technology potentially to find bugs, not just exploit existing vulnerabilities in software that's being used to run the entire modern world. But there's a second argument that comes from cynicism. If you don't believe the marketing argument, I wonder how you feel about this second argument, which is that Anthropic is not releasing Mythos, maybe in part because Mythos is too dangerous to release, but also significantly because they don't have the compute to serve it. As demand for AI is soaring and there's been a shortfall in compute available to serve that AI demand, Anthropic has already been rationing users, especially power users who are using its chatbot Claude all the time. And so maybe one reason they're not releasing Mythos to the general public right now is yes, in part they want to do the right thing morally, but also they can't serve this version of artificial intelligence without creating the traffic jam of all traffic jams for all use of their large language models. So how much stock should we put into this interpretation that the real story here is that the AI labs are running out of compute to run their business?
B
Yeah, this one I think is a little more plausible, but still not very plausible. One thought experiment would be, how much would people be willing to pay for an automated cyber weapon that was capable of finding zero-day vulnerabilities in critical pieces of infrastructure? I think the answer is probably billions of dollars. Right. Some zero-day exploits sell on the black market, on dark web hacker forums, for millions of dollars. These are things that can be used to steal large amounts of money or to disrupt critical systems. So I think if Anthropic had just been in sort of profit maximization mode, they probably could have found the compute to run some version of this and offered it to very high-paying customers and made some money from that. But there is a real compute crunch at Anthropic and all of the other AI labs. They are running out of compute. They have not been able to keep up with the demand for tokens from their models. And Anthropic has kind of acknowledged that this is a very compute-intensive model that is very expensive to run and that would be very expensive to offer to customers. So I think these two things can be simultaneously true: that they are in a compute crunch, but also that the compute crunch is not the primary reason that they have not released this model. And I realize that sounds like I'm trying to sort of thread an impossible needle here, but I do think it's also worth noting that the founders of Anthropic, including Dario Amodei, the CEO, have been talking about the risks of automated cyberattacks for years. Right. This is not a fear that they have come to recently. This is something they have been dreading and thinking was going to be possible since well before the models were actually capable of doing anything like this. So I think if anything, you have to give them a few points for consistency. This is something that they have been worried about for a long time.
A
Let's assume for the purpose of the next few questions that Anthropic did act nobly and morally. I'm actually not sure how much comfort people should take from that. I mean, there is no law requiring AI companies to withhold models with certain capabilities. No law on the books, no executive order that the President signed. This was entirely Anthropic's call. And that raises two, I think, very urgent questions. The first is, how do we feel about the regulation of cybersecurity hacking technology? Cutting-edge cybersecurity hacking technology? That's basically up to the private companies. It's the companies who are responsible for regulating themselves, rather than any law being on the books saying, before you release this model, government regulators have to look at it to make sure that not very nice people using this technology won't be able to destroy the infrastructural guts of the entire modern world.
B
I mean, I don't think it's great. That's not an optimal situation. We are basically relying on the judgment of a very small handful of AI executives to steer us away from a potentially very scary threat. And right now, the US government, the Trump administration, is sort of in this impossible bind that they've created for themselves, where they have simultaneously declared that Anthropic is a supply chain risk, or tried to declare it a supply chain risk, and are trying to move away from using Anthropic's models. But they are also in talks with Anthropic to use Claude Mythos Preview for reportedly the same sort of cybersecurity defense that the private companies are using it for. So it is sort of this bizarre situation where if the federal government was interested in regulating AI, this would be a moment to maybe do it. But so far, they seem mostly interested in running Anthropic out of business.
A
Well, it's just so strange. I mean, to help people maybe understand how weird it is to designate a company a supply chain risk and also try to use them at the highest level of government, it's a little bit like, and tell me if this is the wrong analogy, designating Huawei a supply chain risk and also requiring that the entire National Security Department use Huawei technology exclusively because it's just the best, most cybersecure technology. Like simultaneously encouraging the use of the technology and saying this company stands athwart the patriotic principles of our administration. It's, as you said, a very weird corner in the Patriots.
B
It's like saying, there is a medicine that is so dangerous that if you take it, you will die. But also, I personally need the medicine to recover from my condition. That's a little bit of a tortured analogy, but I think it captures the gist well.
A
It raises the second point, which is, okay, this decision was Dario's call, right? Dario Amodei and the people around him, the leaders of Anthropic. But in the very near future, this is not going to be Dario's call or Anthropic's call to withhold this kind of technology. Like, if you look at how far behind the Chinese models and the open source models are, they're what, nine months, one year behind the frontier models that are coming out of OpenAI and Anthropic. So if you run that forward, we should predict that by this time in 2027, Chinese models and open source models are going to have the exact same cyber hacking and vulnerability-exploiting capabilities as Claude Mythos. Dario's not going to have any power to shut down those models. The Trump administration won't have any power, even if it wanted to, to shut down those Chinese models. I mean, if this technology is as dangerous as you're describing it, as other people are describing it, and it's a year before it's practically everywhere, what do we make of that?
B
I mean, I think there are a couple ways this could play out. One is that you're right, this technology just sort of becomes widespread and all the Chinese companies, all the open source developers, every other AI lab in America has models that are capable of this kind of cybersecurity research. And in that case, it really matters when that technology becomes more broadly available. If it's not for a year, well, maybe the companies using Claude Mythos Preview to patch the Internet can solve the worst problems and harden their systems within the next year. That is the bet that Anthropic is making: that by giving the good guys a head start, you are ensuring that by the time this technology becomes more broadly available, attackers can't use it to go after the real load-bearing infrastructure of the Internet. And then there's an argument that maybe it's not going to take that long, that it could only be a handful of months before technology like this exists, and you actually can't patch the entire Internet in a couple of months. And so we are just going to be in this kind of spy versus spy world where the attackers and the defenders are kind of racing each other to use these models to break and fix parts of the Internet.
A
I mean, that sounds horrible, right? I mean, obviously there are aspects of this that are merely interesting, right? Like it is merely interesting that the models are advancing the way that they are, and that we have found a way to train a large language model to be a superior cyber hacker. Like, that's just interesting. But the idea that we're like nine to 12 months away from a scenario in which non-state actors, forget state actors. I mean, states can be punished, states can be sanctioned, non-state actors are invisible. The idea that non-state actors can gain access to technology that can shut down electrical grids and steal information from government websites, I mean, this is merely scary. Like, what are people talking about doing about this?
B
They're essentially saying that we have to rewrite the entire code base of the modern world, that we have to send these Claude Mythos Preview agents and similar systems out and just have them start scanning for the worst vulnerabilities and sort of, you know, frantically patching as many holes as they can find. That is one idea. But I did talk to Logan Graham, who runs a team of sort of threat researchers at Anthropic, who said that the entire sort of human paradigm of Internet security might need to be changed. Right now, the most critical software is built and maintained by these open source projects and their maintainers, where if I want to submit a proposed change to Linux, humans have to review that and test it and make sure that it's good and approve it, and then it gets merged into the code base. What Logan was saying was like, this might not be possible in a world with these automated swarms of attackers and defenders. We might actually need to kind of change the way that software is built and maintained to remove the human bottlenecks and make it much faster to patch problems.
A
This race between the US and non state actors, and this race between the US and China, I've been thinking about a lot, and the question of an AI showdown between the US and China was center stage last week with this interview between Nvidia CEO Jensen Huang and the podcaster Dwarkesh Patel. And the debate between them that got a lot of the AI world talking was this showdown over whether the US should allow Nvidia to sell its most advanced chips to China, even if that allowed them to train models like Claude Mythos faster in a way that threatened American supremacy and cybersecurity. I have a couple questions about their exchange, their interview, but first I want to know, what did you hear in that exchange that was most interesting to you?
B
I heard a CEO who was defensive about his position and his company's position that they should be able to sell powerful AI chips to America's biggest adversary. This is a company that has become one of the largest companies in the world, a very powerful political actor. And they have been resisting these export controls, which the Biden administration originally placed and the Trump administration then kept, that restricted the sale of Nvidia's highest-end AI chips to China. China is a very lucrative market for other kinds of chips, but so far Nvidia has not been able to sell their most advanced chips into China. And Jensen Huang basically said, look, we're not going to sell our most advanced chips, but we would like to be able to sell sort of the tier below that. And Dwarkesh was coming at it from a much more, I would say, like AGI-pilled view where, like, he believes that AI is a special purpose technology that requires special guardrails, and it is more like a weapon than not a weapon. And so why would American companies willingly sell the biggest input to AI, which is these powerful AI accelerator chips, to America's biggest adversary? This is a point that people on the pro-export-control side have been making for years, but I think it was somewhat rare to see it addressed so head-on in a podcast interview.
A
I think I agree with Dwarkesh. I think I agree with you. I think I agree with the AGI-pilled argument that, and this is the classic analogy, just as we would not have enthusiastically sold enriched uranium to the Soviet Union in 1947, we also would not want to sell the best Nvidia chips to China, given that we know they are trying to develop a technology that could put us in the bind that, frankly, we want to put our geopolitical adversaries in. Let me make the smartest devil's advocate argument that I found. If the US cuts China off from our tech stack entirely, maybe China will have a stronger motivation to build its own tech stack, its own companies, its own Nvidia, its own domestic systems. And that reduces American leverage over China in the long run. So, like, if Chinese frontier models are being tuned for American technology, then America retains influence, right? Don't hack us, or we'll cut you off from the chips that you need to build artificial intelligence. But if China is running artificial intelligence on an entirely separate Chinese stack, not only do we lose leverage entirely, like nothing we can do can dissuade them from an economic standpoint, but also they develop a technology that they can export to the world. And now maybe in addition to China being a geopolitical threat because they have excellent AI capabilities, now they're also an economic threat because they're eating into Nvidia's export strategy. They're selling to Africa, they're selling to Latin America, they're selling to Eastern Europe rather than an American company. How do you feel about that argument?
B
I'm trying to wrap my head around it because I think on one level it makes sense, if you come at this from the supposition that AI is just a normal technology. If it's the Internet, if it's cell phones, if it's PCs, then sure, there's some argument for allowing that industry to grow up around a supply chain that has many American components in it and using our leverage over those components to extract concessions or to bargain or negotiate with our adversaries. I think the contention that I would have is around this idea that right now China is extremely compute constrained. They have data centers that have power going to them that are ready for the chips, but they cannot get the chips to put in the data centers to start training models. The Chinese AI leaders have said, we are bottlenecked on compute. And so the idea that we would just start shipping them some of our best AI chips from American companies would seem to solve one of their greatest problems overnight. They will build a domestic semiconductor supply chain. I think everyone I've talked to expects that they will endogenize their supply chain at some point.
A
I believe, as people have said, Elon Musk selling Teslas to China did not stop China from building the electric vehicle export machine to conquer the entire world. So it's not as if America exports its technology to China and, therefore, China gives up entirely on developing its own domestic champion. That is not how history works.
B
Absolutely. But the argument that the people on the sort of AGI-pilled, pro-export-control side are making is, like, right now we have this temporary compute advantage over China because we have companies like Nvidia making their chips for American AI companies and not sending them to Chinese companies. And so why would we sort of throw away that advantage when we have it and allow them to catch up more quickly than they would be able to by building their own chips?
A
Before we move on to the next big topic I want to talk to you about. You've mentioned now a few times this concept of an AGI pilled export control philosophy. Can you just back out and explain what that means? Because I think for some people they're like, yep, check the box, I got it. But there's some acronyms there, there's some public policy reference there, and it's all being smushed together. So just slowing down like, what is the AGI pilled export control philosophy here?
B
So just at a basic definitional level, being AGI-pilled is sort of San Francisco shorthand for, like, I believe that artificial general intelligence is a thing. I believe that very powerful AI systems are coming soon and that they will have transformative effects on society. It's sort of like, are you convinced that this time is different, that this will not be like other technology revolutions we've seen, at least in the recent past? And when applied to export controls, one of the philosophies here is about whoever gets to AGI first, this critical threshold. Dario Amodei calls it a country of geniuses in a data center. Other people call it superintelligence or whatever, but there is some critical threshold at which the models will begin to improve themselves. They will sort of spark this cycle of recursive self-improvement. A model builds a better model, builds a better model, builds a better model. And that will be sort of what makes these systems incredibly powerful in a very short period of time. And the philosophy that many of the AGI-pilled people in San Francisco share is that whoever gets to that threshold first will have a durable advantage, not just in economic terms, but in geopolitics. If you are the country that develops AGI, or a country of geniuses in a data center, first, you will be able to race ahead of your adversaries because you will have this technology that is capable of improving itself. And so that philosophy is at the heart of this idea that we need to stop China from getting to this critical threshold before American companies can get there.
A
Am I wrong to say that below the public policy conversation that people have about export controls, what you're saying here about AGI pilled is that what we're also having is really a debate about danger, right? There's a group of people who are essentially saying, America builds technology and America exports technology. That's what we do, that's what Ford does, that's what General Motors does. We build stuff in America and then we get rich by selling it to the world. And AI is a wonderful technology, it's a powerful technology, but it's fundamentally a capital N normal technology, right? It's not a car, but it's like a car. They're saying we should allow the Nvidias of the world to basically sell this stuff to whoever wants it. It's good when the world runs on American technology, and it's even good when our geopolitical adversaries rely on American technology, because whatever they build, they need our stuff. And so it's nice to have that level of influence over them. But then there's this other group that says you are dramatically underrating the danger of this technology and the degree to which we need to think of it as existing entirely outside the rest of normal US economic policy. This is something that's more akin to enriched uranium. No one thinks of enriched uranium as being like a Ford F-150. It's clearly something that we want to control the spread of. And therefore, if you think that advanced AI is enriched uranium, you dramatically want to curtail the degree to which it's exported around the world. So there's like a public policy fight that contains a philosophical disposition toward the technology that at bottom is really, and I'm not trying to oversimplify, but you get very, very far by oversimplifying enough to say it's a debate about the level of danger of this technology.
Like I often hear Jensen Huang pooh-poohing Dario Amodei when he says this is going to replace all these jobs, or when he analogizes it to selling enriched uranium with Boeing casings to the Chinese. Right? He's often making the point, and that group is often making the point that this is just American economics. Stop talking about AGI. Just think of it as pure economics. Is that a fair recapitulation of what's going on here in this big debate?
B
Yeah, it's sort of like the attitude is get out of here with your doomer fantasies. We don't think this technology is going to change society overnight. Things always happen more gradually. This is not some special technology that needs special care and supervision. It is more like a car than enriched uranium. And the other side is saying, no, obviously not. It's nothing. A car cannot conduct automated cyber attacks on critical security infrastructure.
A
And a car can't build itself, which is a part of what the AGI folks are very interested in seeing, the degree to which these tools are being used to build and improve themselves, which is, as you refer to it, recursive self-improvement. I want to move on to another aspect of this debate which I followed very closely, which is the "Is AI a bubble?" debate, which is sort of a proxy for what should we look forward to with AI's effect on the US economy? Is this whole thing going to blow up in our face? In the open that I recorded separately, I said, I think this is a vibe shift moment for artificial intelligence. It's a vibe shift from fierce competition over who can release new models the fastest to a competition over whose model is so powerful that it cannot be released without a bit of delay or neutering. But it's another shift as well, from an era of people fearing that AI demand wasn't growing fast enough to a period now where AI demand is so white hot that the frontier labs cannot build data centers and find compute fast enough to serve their users. And this latter argument touches on a phenomenon that you have covered a little bit, of people, users, companies who are paying thousands, tens of thousands, hundreds of thousands of dollars a month for tokens, which are the fundamental use unit of AI, kind of like kilowatt hours for electricity use. Tell me a little bit about token maxing and then I want to talk about the economics of token maxing, because I think it goes very, very directly to this question of what is the future of AI in the US economy?
B
Yeah, so token maxing is a term. I actually can't take credit for the term. My co-host, Casey Newton, said it during a recording and I thought it was so good that I stole it and used it for my column. But this is the idea that you can measure an engineer or a tech worker's productivity by looking at how many tokens they consume while using these agentic coding systems like Claude Code or Codex. And at many of the leading technology companies, they now have leaderboards for their employees that publicly display each employee's token count on a weekly or monthly or daily basis. And the people at the top of these leaderboards are using numbers of tokens that I did not believe could be used by a single human being. They are using billions of tokens a week. At market prices, that would be something like $10,000 or $100,000 a day in tokens. Now, they're not paying that cost, most of them. Some of these companies are themselves the model makers, so they're sort of using the tokens for free. But the most AI native or AI interested programmers at these companies are using many orders of magnitude more tokens than they would have or could have used even a year ago. Because these coding models, these agentic coding systems, just use a lot more tokens. And if you run them 24/7 and you have a bunch of parallel agents working around the clock on various tasks, maybe you too, Derek, could end up at the top of the Plain English token leaderboard and your boss would give you a big promotion.
A
Yeah, right. Or the Substack token maxing leaderboard. It's interesting because this phenomenon of token maxing, one can see in several ways. One way you can see it is as an answer to all these predictions that AI demand was never going to catch up with AI supply, which would make this a classic bubble: in the 1870s, you build the railroads to middle-of-nowhere Kansas, no one gets on those trains, and so the railroad company goes belly up. That's supply outrunning demand. Here we do not have a demand problem. We have people using the tokens, exhausting the data centers, exhausting the chips. But what's interesting is that they're not paying what I suppose you could call full freight. You yourself, in that description, said the token use is massively subsidized today. So the cost of providing AI is much higher than the cost of using it. People are getting a deal, is the bottom line. Is this going to change significantly once people do have to pay full freight? Like, are we in a period right now where token maxing makes it seem like AI adoption is just running rampant throughout the US economy, but the second people have to actually pay what these tokens cost, we're going to see a crash in AI usage? Is that possible?
B
It's, I suppose, theoretically possible, but I don't think it's probable. And there are a couple reasons for that. One is, first, I don't want to like, defend the practice of token maxing.
A
Oh, no, sorry. I don't want to make you seem like, right, you're the mascot of token maxing.
B
I am very suspicious that this is a good or accurate way to measure developer productivity. And actually, since my article has come out, a couple of these companies have stopped tracking the leaderboards this way because they have realized that it is just incentivizing people to run up these huge token bills and maybe not in ways that are all that productive. So I want to separate the token maxing discussion, which I think is a questionable sort of management technique and metric for productivity, from the question about whether AI is in a bubble and whether it's a demand crunch or a supply crunch. It has been very interesting, and I wonder if you've observed this too, that many of the people who just months ago were saying that this was all a bubble because the technology didn't work, it wasn't useful, and people were never going to pay for it once they wised up, have switched to saying that, in effect, the technology is so good, that there is so much demand for it, that investors are willingly subsidizing this to sort of prop up their investments and lead a lot more people into the market. In some cases, it is the exact same people who have made both poles of that argument within the span of a few months. So I just want to note that.
A
Let me make the best possible version of their case. I remember five to 10 years ago, you and I sort of co-named a concept called the Millennial Urban Consumer Subsidy. This idea that people in their 20s living in San Francisco, New York City, who woke up in the morning on Casper mattresses and then rode their electric Lime scooters to work and then went to work in a WeWork and used a bunch of DoorDash-for-X services to bring everything to their doorstep without leaving their coworking space and then went home in, like, an Uber and then went to sleep and before that had dinner using one of those. God, what are those? Meal services, Blue Apron, et cetera. God, it's been so long.
B
The good old days.
A
These companies, these companies.
B
You forgot MoviePass.
A
Dare I? Yeah, exactly. After the Blue Apron they would do MoviePass. These companies were collectively losing tens of billions of dollars a year. In a strange way, venture capitalists were essentially paying 25 year old PR people to go to the movies and eat chicken and, you know, ride on subsidized taxis. And some of these companies made it, but a lot of them didn't. And they didn't make it because demand was being subsidized at a rate that was not realistic and was not plausibly fulfilled once consumers had to pay the full cost of the technology. So in a way, what I guess the bubble folks are still saying right now is I used to think AI was the railroads. Now I'm saying AI is Blue Apron. It's a tech, right? It's a service that people are using only because they don't have to pay for what it actually costs. But the second they have to pay for what it actually costs, they won't get enough value from this to continue paying and then the bottom will fall out. So I think they're trying to communicate something along those lines. But now I will off ramp it back to you to defend your position against that position.
B
So look, I think there are a few separate things going on here. One of them is yes, there are absolutely people who are being subsidized in their use of AI by stacking together these promotional offers or having these sort of plans that cost $20 a month where you can use way more than $20 a month of tokens. I talked to a guy when I was writing my token maxing article who had found this promotional plan from Figma, a design tool, that allowed him to use like thousands of dollars worth of Claude tokens on a plan that costs 20 bucks a month or something. And that to me did feel like a sort of MoviePass situation where clearly this was a promotional offer. It was not meant to be sort of a permanent thing. But I think there are a lot of these kinds of things going on throughout the, you know, the industry. People are offering their sort of on-ramp plans that are subsidized and then they want you to sort of move to the higher plan or something like that. But many of the most avid token maxers are not paying for these sort of subscription plans. They are using this technology, these models through the API. They are paying for the sort of tokens on a per token basis. And I think there's a reasonable argument to be made that that price, the sort of price per token will fluctuate over time. Maybe the companies get so compute constrained that they start jacking up the price of tokens to effectively destroy demand and make their services easier to sustain. Maybe they have to start competing on price and they drive the cost of tokens down because a Chinese company or Google or someone else is offering tokens that are comparable quality and at a much lower rate. I think all of that is subject to the basic laws of gravity and of supply and demand. But I think the more relevant question here is what is this token use replacing?
Because if it is just a toy, if it is true that none of this is making people more productive or companies more productive, then I think it would absolutely follow from that that as soon as these promotional rates get taken away, they give up and start going back to writing code by hand or whatever they were doing before. But what I'm hearing from people in the software industry is that this is a permanent structural change to the way that companies will want to write software once you have tried the magic coding models. I've heard very few people say that they expect this industry to go back to doing things the old way. And so what they're really benchmarking their token costs against is their payroll. They're saying, well, for this comparable unit of work, writing this app, writing this unit test, how much would we have to pay a developer to do that and can we get an agent to do it for less? And so I think as long as the cost per token (I'm using "token" in air quotes when it comes to the second part of this example) of an AI model is cheaper than the cost per token of a human software developer or white collar worker, I think they're going to feel like that's a good trade.
A
I'm trying to put some of these storylines together and see if you follow this logic, because this question isn't written down. It's all just sort of top of mind. I wonder if you see the world of artificial intelligence bifurcating a little bit in the near future where on the one hand these AI companies are going to see the need to buy chips and provide inference that is cheap so that they can make money on inference, they can make money on AI use that is for consumers. So on the one hand there's going to be this sort of consumer facing strategy of how do we drive down costs so that the prices that we charge our customers are profitable on a per unit basis. Just classic unit economics, right? That's strategy number one. But there's also sort of, there's a forking, so to speak, here where on the other hand they're going to be developing technology like the Mythoses of the future that are incredibly powerful, maybe unbelievably valuable to certain enterprises, but also not remotely for consumer use. Not only because they're so expensive to run inference for, but also because we don't want ordinary consumers in the middle of Iowa who just have a lot of time on their hands to, you know, break into the Social Security Administration and steal a bunch of secure data. Is there a way in which what we're seeing right now with both this token maxing phenomenon and the Mythos phenomenon is the beginning of a very hard fork in AI, with consumer AI being really a race to the bottom of token costs and non-consumer AI, enterprise, government, cybersecurity, national security AI, being an entirely different ballgame of building more and more and more advanced models that are not going to appear in anybody's ChatGPT or Claude window? How do you feel about this sort of general prediction of us being at a turn in the road here?
B
Well, thank you for the Hard Fork podcast promotion.
A
Trying to make it subtle.
B
Appreciate that. That's that native ad that you slipped in there, right?
A
I'll talk to your marketing department, make sure you're at the right rate.
B
Okay, I'll try to explain this one in Plain English.
A
That's a podcast.
B
Just trying to pay you back. What was the question? Sorry, I was so distracted by my joke that I'm glad you guys had a joke. I haven't, I haven't.
A
Devin, keep all this in. Okay, I got it. No, like, yeah, is there, is this, is this a, is this like a two tier strategy now for the future of AI? Is there going to be like a consumer strategy here that is entirely diverged from this non consumer strategy to essentially build an AGI technology that's not for consumer use at all?
B
I think that's a reasonable possibility. I think we are already starting to see some of this. You know, the models that you get in your free ChatGPT window are worse than the ones that you get if you pay for the premium plan. Same deal with other companies. They make their highest end models, which are more expensive to run, available to only higher paying subscribers, and they offer sort of a degraded or distilled experience or model set for free users. But I would expect that to continue to widen. I don't think it's crazy to envision a world where a year or two from now you can pay thousands of dollars a month for a premium plan on one of these companies' models that gives you access to all their latest and greatest technology. Maybe millions of dollars, if the technology gets good enough, would be a suitable price point that some corporate customers and governments would be willing to pay. So I absolutely think that kind of stratification is already happening and will continue to happen. And I think that's just what happens in every other industry. We have Ferraris and we have Honda Civics and you can afford one and not the other, right?
A
I think it is Ferraris versus Honda Civics, but it's also Ferraris, Honda Civics, and fighter jets. Right? I guess what I'm saying is what I'm beginning to see now with not just Mythos but also OpenAI reporting or self-reporting that they have a technology that's very similar to Mythos that they're not releasing to the public. It just seems like we're in a new phase right now where the business strategy to make consumer AI more profitable is incredibly different from the sort of existential strategy to build a technology that can do everything, right? And we don't want the consumers to have access to the technology that can do everything in cybersecurity and hacking. And so I just wonder whether we're going to look back at Q2 2026 as being this moment where after this period, like, we talked differently about what the AI companies were trying to do because it no longer made sense for them to think of these two things being the exact same strategy. Whereas before, as soon as these AI models were ready, they were being sent to the public, right? It just seems like a real break in what we've come to expect from these companies.
B
I think one really important point in all of this is that Claude Mythos Preview, with all of its scary cybersecurity capabilities, did not receive special training to be good at cybersecurity, at least according to Anthropic. That was just an improved capability that emerged from overall improvements and overall scaling of the underlying model. It is also probably better at many other things. So I think you're right to think that the market is kind of bifurcating into professional users and consumer users. But it is also the case that the same techniques that make a model better at cybersecurity defense and attacks also make it better at solving complex math and science problems or answering questions on homework. So I do expect that because of that quality of like the overall water level rising in more or less tandem, that consumers are going to want access to the things that do the things they care about really, really well. And those improvements in those capabilities will probably come at around the same time as the improvements in all the scary capabilities. That's just kind of how this stuff has progressed over the last few years.
A
I guess my very last question here is if the cybersecurity capabilities of Mythos were an emergent feature of ordinary scaling, then how will that technology ultimately be released to the public in a way that neuters a capability that's emergent? You know what I'm saying? Like, how could you possibly release, especially the sort of non-Anthropic versions of this technology, that won't have this capability built into them? What you just said, I've read that elsewhere, but it makes it seem very, very difficult then to stop ordinary users from using a capability that is emergent rather than deliberately designed. That makes it seem much harder to regulate this technology going forward.
B
It is, and it's especially hard because the same capabilities that make it good at writing code are also the same capabilities that make it good at breaking code. You cannot extricate one from the other. Now, there are things that you can do that Anthropic says it has done to try to mitigate the risks on the downside. You can tell a model, hey, if a user asks you to break into a smart contract or exploit a piece of software, don't do that. They can sort of try to control the outputs at the point of contact with the user, but the basic capabilities are in there in the model. And users and governments and terrorist groups or whoever wants to use these things for malicious ends may find ways around those guardrails. So yeah, I think there is a real risk here. I am not persuaded that we have solved the so-called alignment problem that tries to teach AI systems to obey human values. They are displaying all these weird and shady behaviors during pre training testing, as you mentioned with the exploits and the sandwich in the park story and all of that. These models are getting very capable and they are getting very capable at the same time as they are getting more dangerous, and so it is hard to extract just one of those threads from the other.
A
Kevin Roose, thank you very much.
B
Thanks for having me.
This episode explores the release, and dramatic non-release, of Anthropic's new generative AI model, Claude Mythos (aka Mythos), considered the most capable (and dangerous) AI system to date. Host Derek Thompson and guest Kevin Roose dissect the implications: from cybersecurity threats and the shifting economics of AI to the geopolitics of AI export controls and the potential for a structural split in the AI market.
On the surveillance gap:
“We are basically relying on the judgment of a very small handful of AI executives to steer us away from a potentially very scary threat.” - Kevin Roose (21:15)
On emergent risks:
“LLMs are better at cybersecurity research than I am. This thing has allowed me to find more bugs in a two week period than I had in my entire career.” - Kevin Roose referencing Nicholas Carlini (15:01)
On geopolitical dilemmas:
“We would not enthusiastically sell enriched uranium to the Soviet Union in 1947. We also would not want to sell the best Nvidia chips to China.” - Derek Thompson (30:01)
On industry bifurcation:
“I think we’re in a new phase where the business strategy to make consumer AI profitable is entirely different from building a technology that can do everything.” - Derek Thompson (55:02)
On alignment struggles:
“We have not solved the so-called alignment problem... these models are displaying weird and shady behaviors.” - Kevin Roose (58:47)
| Timestamp | Segment / Topic |
|-----------|-----------------|
| 00:05 | Mythos’s dangerous capabilities; why withheld |
| 06:24 | What is Claude Mythos? Access limitation via Project Glasswing |
| 07:55 | What zero-day exploits mean; Mythos’s bug-finding success |
| 09:34 | Mythos’s “bad behaviors”: manipulation and deception |
| 12:02 | Is Anthropic’s non-release a marketing ploy? |
| 16:54 | Compute shortage as partial, not primary, factor |
| 20:14 | Absence of law or regulation for dangerous model release |
| 24:17 | Timeline for China/open source catching up; race to patch vulnerabilities |
| 27:38 | Should US restrict AI chip exports to China? |
| 34:01 | Explanation of “AGI-pilled” export control philosophy |
| 39:02 | AI bubble: Is demand or supply the real concern? Token maxing explained |
| 46:05 | Past tech subsidization analogies (Blue Apron, MoviePass) |
| 52:55 | AI market fork: cheap consumer vs. powerful restricted models |
| 56:26 | Emergent dangers: cannot “unbuild” risky abilities |
| 57:54 | Alignment problem and regulatory headaches |
The conversation is lucid, urgent, and at times a little irreverent, balancing big-picture alarm with dry humor and relatable analogies. Both host and guest aim to "explain complicated ideas to folks who have better things to do than read white papers," and they succeed.
Final thought:
“We are in the midst of a phase shift ... a world where the most advanced models are too dangerous for public consumption. This moment might be the most seismic in AI since ChatGPT.” - Derek Thompson (05:37, paraphrased)