![Dylan Patel - The Infinite Demand for Tokens, Claude Mythos, and Supply Constraints - [Invest Like the Best, EP.469] — Invest Like the Best with Patrick O'Shaughnessy cover](https://megaphone.imgix.net/podcasts/ceb34422-3ec7-11f1-97a5-8bd036d5980c/image/e8a75bd4ab83f059c746a594afd471a8.jpg?ixlib=rails-4.3.1&max-w=3000&max-h=3000&fit=crop&auto=format,compress)
Patrick O'Shaughnessy
Most software companies try to maximize your time on their app to juice engagement. Ramp does the exact opposite. Ramp understands that no one wants to spend hours chasing receipts, reviewing expense reports and checking for policy violations, so they built their tools to give that time back, using AI to automate 85% of expense reviews with 99% accuracy. And since Ramp saves companies 5%, it's no wonder that Shopify runs on Ramp, Stripe runs on Ramp, and my business does too. To see what happens when you eliminate the busy work, check out ramp.com invest
Felix by Rogo is a personal finance agent that turns a single prompt into finished, client-ready work using your firm's own templates, context and standards. Send Felix an email like "Take these comments and turn them for me," or "Update my tracker with the context of these emails," or "Run the ability-to-pay math on this buyer," and Felix sends back finished PowerPoint decks, Excel models and sourced research. Felix works the way your team already does, delivering work quickly and accurately around the clock. Learn more at Rogo AI Felix.
OpenAI, Cursor, Anthropic, Perplexity and Vercel all have something in common: they all use WorkOS. Here's why. To achieve enterprise adoption at scale, you have to deliver on core capabilities like SSO, SCIM, RBAC and audit logs. That's where WorkOS comes in. Instead of spending months building these mission-critical capabilities yourself, you can just use WorkOS APIs to gain all of them on day zero. That's why so many of the top AI teams you hear about already run on WorkOS. WorkOS is the fastest way to become enterprise ready and stay focused on what matters most, your product. Visit workos.com to get started.
Hello and welcome everyone. I'm Patrick O'Shaughnessy and this is Invest Like the Best. This show is an open-ended exploration of markets, ideas, stories and strategies that will help you better invest both your time and your money. If you enjoy these conversations and want to go deeper, check out Colossus, our quarterly publication with in-depth profiles of the people shaping business and investing. You can find Colossus along with all of our podcasts at colossus.com.
Narrator
Patrick O'Shaughnessy is the CEO of Positive Sum. All opinions expressed by Patrick and podcast guests are solely their own opinions and do not reflect the opinion of Positive Sum. This podcast is for informational purposes only and should not be relied upon as a basis for investment decisions. Clients of Positive Sum may maintain positions in the securities discussed in this podcast. To learn more, visit PSum VC.
Patrick O'Shaughnessy
This is my second conversation with Dylan Patel. Dylan is the founder and CEO of SemiAnalysis, where he tracks the semiconductor supply chain and AI infrastructure buildout. This conversation is about the supply and demand of tokens. On demand, Dylan describes something completely explosive. He explains why the frontier model is the only model anyone wants and willingness to pay for it is nearly unbounded. His own firm has gone from tens of thousands of dollars in AI spend last year to a 7 million run rate this year. On supply, we walk through the bottlenecks across memory, logic and fab equipment that will determine how fast any of this can scale. We also cover Mythos and what the leading labs need to do to fix their growing perception problem. Please enjoy my conversation with Dylan Patel.
Interviewer
You told me this incredible story about how your own team's use of tokens has changed dramatically this year.
Dylan Patel
Yeah.
Interviewer
Can you retell that story and what it is teaching you about what's going on in the world?
Dylan Patel
Last year we thought we were heavy users of AI. Everyone was using ChatGPT, everyone was using Claude, and we provided whatever subscriptions anyone wanted, on the order of tens of thousands of dollars of spend for our firm. This year the spend has just skyrocketed. It really started in late December with Opus. That included Doug O'Laughlin, who's president. He's very much leading the charge in the sense of non-technical people using AI for coding, and so he's basically pilled the whole firm slowly over time. I think he's been the leader in doing that. Obviously the engineers were using AI anyways, but spend in January just started to inflect and rocket and rocket and rocket and rocket. We signed an enterprise contract with Anthropic, and it's gone to the point where, I think when I last talked to you, it was a 5 million spend rate. It's actually 7 million right now. So we're spending 7 million.
Interviewer
That's last week, by the way.
Dylan Patel
And a lot of that is just the usage. People who have never coded before are using Claude Code and spending thousands of dollars, sometimes in a day. And it oscillates. Some people spend thousands of dollars one day, then a couple hundred dollars for a couple of days, and then they go back to a thousand dollars. It's very variable across each individual user, but across the firm we're now spending $7 million a year on Claude Code at the current rate, versus our salary expense being in the neighborhood of $25 million. So we're north of 25% of spend on Claude Code as a percentage of salary. If this trajectory continues, then, you know, we'll spend more than 100% by the end of the year, which is a bit terrifying. Thankfully, I don't have to decide between people and AI because our company's growing so fast. It's more like, okay, I don't have to hire nearly as fast, I can spend a lot more on AI, it works, and we just grow faster. But I think other folks will start to reckon with the fact that, huh, if this person can do the work of 5 to 10 to 15 people using Claude Code, then all of a sudden I should probably cut people. And the use cases are so broad.
Interviewer
Give a couple examples.
Dylan Patel
Okay, so for example, one thing is we have a reverse engineering lab in Oregon that we've been building for a year and a half. We have a bunch of fancy microscopes, scanning electron microscopes. The whole purpose of this is you reverse engineer chips. You get the architecture out of it, you get the materials that they're using to manufacture, and this is some of the data we sell. Analyzing that data is a very slow process. Instead, one person on the team, with a couple thousand dollars of Claude tokens, has been able to create an application that is GPU accelerated and runs on a server that we have at CoreWeave. Anytime we send it an image, it will take the picture of the chip and overlay where every single material is: oh, this part is copper, this part of the gate is tantalum, this part of the gate is germanium, this part of the gate is cobalt. And so you can do a finite element analysis of the entire stack-up of the chip very, very quickly. It's visual, with a dashboard GUI. It's everything, for a few thousand dollars of Claude. The person previously worked at Intel, and he said that was an entire team's job to build and maintain. Rack that up across the entire firm. It's insane.
Another example that I think is super fun is Malcolm. He was an economist at a major bank before; their economist department was like 100 or 200 people. What he built was the most incredible thing ever. He piped in all of this different data, FRED data, employment reports and all these other things, from various APIs. We signed a couple contracts with folks to get API access to data, pulled it all in, started running regressions, started looking at the impact of various economic revolutions on the economy from a deflationary or inflationary perspective. The Bureau of Labor Statistics has this entire set of 2,000 tasks. And so he went through that with AI, which ones can be done by AI and which ones cannot, grading them against a rubric. About 3% are doable now with AI. And so he's created this metric so that you can measure the things that can be done by AI, what the cost of doing those with AI is, and therefore the deflationary aspect of it. Phantom GDP is what he's called it. Output can go up, but because cost falls so much, GDP actually theoretically shrinks. So he created this whole analysis and a brand new benchmark of language models, a set of 2,000 different evals.
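The spend arithmetic Dylan gives earlier in this answer can be checked with a short sketch. The two dollar figures are the ones quoted in the conversation; the function and variable names are illustrative assumptions, not anything from SemiAnalysis.

```python
# Figures quoted in the conversation (annualized, USD).
claude_code_spend = 7_000_000   # current run rate on Claude Code
salary_expense = 25_000_000     # "in the neighborhood of $25 million"


def spend_share(ai_spend: float, salaries: float) -> float:
    """Return AI spend as a fraction of salary expense."""
    return ai_spend / salaries


share = spend_share(claude_code_spend, salary_expense)
print(f"AI spend is {share:.0%} of salaries")

# Multiple by which the run rate would have to grow to equal salary expense.
multiple_to_parity = salary_expense / claude_code_spend
print(f"Run rate must grow {multiple_to_parity:.2f}x to match salaries")
```

On these numbers the share is 28%, consistent with "north of 25%", and the run rate would need to grow a bit more than 3.5x to pass 100% of salaries.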
Interviewer
He did this all by himself.
Dylan Patel
It was all by himself. Yeah. And he's like, dude, this would have taken the team of 200 economists a year. He's like completely cracked out on Claude. He's like, everything has changed.
Interviewer
How do you think about it as a business owner going from close to 0 to 25%, accelerating towards whatever percent of total spend? At what point are you like, whoa, I need to put the brakes on this and be careful how much we're spending? Maybe we don't need to spend on the most cutting-edge model, Opus 4.7, which came out today. Maybe I can throttle it back to something that's a little bit cheaper.
Dylan Patel
I'm in the information business. We sell analysis, we do consulting, we create data sets. I don't see why this wouldn't be completely commoditized on a pretty rapid basis. Take the first product I was selling as a data set: there are more people trying to do it now. We've made it constantly better and better and better and more detailed, and so it still sells in the market. But the way we were doing it in 2023 is basically what everyone else is doing now. If I don't move up the bar, then I will be commoditized. If I don't move fast enough, I will also lose my edge. So yes, AI commoditizes things, just like it commoditizes software. Those who can move fast, keep control of their customers, keep providing them an awesome service and keep improving the service won't shrink, they'll grow faster. Those who are incumbent and not doing anything, they're going to lose. And so it's a bit existential: if I don't adopt AI, someone else will and they will beat me.
Another easy example is the energy space. We've had a few energy analysts for like a year now, and we've been trying to build out this energy model. It's very complex. The energy data services market is something like $900 million, so obviously a huge market for me to try and break into. We've been slowly grinding at it, and it's been helpful for our data services business, but we really hadn't broken into the energy data services business despite a year of having multiple people on the team. Then Claude Code psychosis hits Jeremy, one of the people who leads the data center, energy and industrial business at SemiAnalysis. And now all of a sudden, in three weeks, he spent a lot. He was spending like $6,000 a day. It was an insane amount. But he scraped every single power plant in the US and every single transmission line above a certain voltage, and created this entire mapping of the entire US grid, as well as a lot of demand sources, all from various public sources of data. And it's got this dashboard where you can view and check, and you can see all the micro regions of the US where there are power deficits and surpluses. All of these details were built in a handful of weeks. We started showing it to some of our customers who buy our data center data set but are energy traders, and they're like, wow, how long did this take you? This is really good. This is better than XYZ company. And then we dig deeper: XYZ company has a hundred people and has been working on this for a decade. Obviously our thing is not fully as robust, but in some ways it is better. I'm going to commoditize these energy data services companies. Who's going to come commoditize me if I don't move faster? And so the question from a business owner's perspective is, yeah, I'm spending a lot, but what is that spend getting? It's getting more revenue.
Interviewer
Are you worried that, in the limit, the people that control capital and invest in capital, who are often hiring you for what you do, will just say, well, we have analysts too who are really smart about this, we'll just build this ourselves? If it's getting that easy, at what point does it all just pull into the investment firms that stand to gain the most, because they have the most leverage on top of the data or the insights that they glean?
Dylan Patel
First of all, with any information services business, obviously I don't generate as much value as my customer does from said information. If I sell you information for a dollar, you're only buying it for a dollar because you know that information helps you make a decision that lets you make more than $1. And so you have made more money off of the information than I did. These investment funds all have their own information services, especially the Jane Streets of the world and the Citadels. They're really detailed on their data, and yet these sorts of folks also purchase data from us and continue to do so and continue to grow with us. I think there's just some it factor, right? We move faster, we're more nimble at the edge, we're a smaller team that's focused on just one specific thing: AI infrastructure and the huge revolution that causes in AI and tokenomics and all these things. We see where it's headed, and so we're moving faster and building faster. I think investment professionals, yes, they'll try and build some of the stuff we do, but more likely they'll just buy the data from us. It's cheaper for them to buy the data from us and then build on top of it than it is to build it themselves.
Interviewer
I feel like every conversation I have with you, what I'm always getting at is just supply and demand of tokens. That's the thing that's interesting to me in the world right now. What has this experience taught you about the demand? Has it changed your view on the demand side of that equation? Just feeling it viscerally yourself.
Dylan Patel
If we take a step back and look through the macro lens, right, Anthropic has gone from 9 billion in revenue to what they're at now, 35, 40 billion. Probably by the time this airs, 40, 45 billion, who knows. Their compute has not grown to the same degree. And if you do the calculations, you have to assume they didn't decrease their research and development compute, and they clearly didn't. They've released; they have Mythos, they have Opus 4.7. So they clearly didn't decrease their research compute spend. So ultimately, even if you assume all the incremental compute they've gotten has gone towards inference, their margins are at a floor of 72%. In reality, some of that incremental compute probably went to research and development, so it may be higher than 72% gross margins. To be clear, at the start of the year there was a leak from their funding round docs. Someone leaked it: 30-something percent gross margins. Where on earth does a business like this grow margins like that? It's in principle, right? Their demand is so high that they're able to cut back on usage limits, rate limits, all these things. What really matters is having an Anthropic rep, having an enterprise contract with them and getting the rate limit increases that you need, because otherwise tokens are ultimately super, super in demand by whoever can pay for them. Anthropic has the same problem. I mean, not a problem, it's just the reality of how capitalism works. Yes, people are sending them $40 billion ARR for tokens, but those tokens are generating way more than $40 billion in value. Various businesses will have different value generation per token, but as the models get more and more intelligent, what really matters is access to these most intelligent tokens and you, as a person, deciding what is the best way to leverage these tokens to grow business and generate value.
Because a lot of folks will want tokens and generate tokens, but the shitty SaaS startup in SF that is using Claude to generate their software product is not necessarily creating a ton of value, and therefore they're going to get priced out of tokens soon enough.
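The margin-floor reasoning above rests on the identity gross margin = (revenue - serving cost) / revenue. The sketch below shows what the quoted figures imply. The $40 billion revenue and 72% floor come from the conversation; the 80% inference share is a purely hypothetical number chosen to illustrate why the true margin could sit above the floor, and all names are assumptions.

```python
def implied_serving_cost(revenue: float, gross_margin: float) -> float:
    """Serving cost implied by a gross margin: cost = revenue * (1 - margin)."""
    return revenue * (1.0 - gross_margin)


def gross_margin(revenue: float, serving_cost: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - serving_cost) / revenue


revenue = 40e9        # roughly the run-rate revenue quoted in the conversation
margin_floor = 0.72   # the floor Dylan derives

# The floor assumes ALL incremental compute was spent serving inference.
max_cost = implied_serving_cost(revenue, margin_floor)
print(f"Implied serving cost at the floor: ${max_cost / 1e9:.1f}B")

# HYPOTHETICAL: suppose only 80% of that compute actually served inference
# and the rest went to research and development.
inference_share = 0.80
margin = gross_margin(revenue, max_cost * inference_share)
print(f"Gross margin if {inference_share:.0%} went to inference: {margin:.1%}")
```

At the floor, the quoted figures imply about $11.2 billion of serving cost. Any compute diverted to research lowers the serving cost and pushes the margin above 72%, which is the direction of the argument, not an estimate of Anthropic's actual split.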
Patrick O'Shaughnessy
As your business scales up, everything gets more complex, especially your compliance and security needs. With so many tools offering band-aids and patches, it's unfortunately far too easy for something to slip through the cracks. Fortunately, Vanta is a powerful tool designed to simplify and automate your security work and deliver a single source of truth for compliance and risk. There's a reason that Ramp, Cursor and Snowflake all use Vanta. It frees them to focus on building amazing differentiated products, knowing that compliance and security are under control. Invest Like the Best listeners get a special offer of $1,000 off Vanta when you go to Vanta.com invest.
I know firsthand how complex the tech stack is for asset management firms, and seemingly every new tool and data source makes the problem even worse, adding more complexity, more headcount and more risk. Ridgeline offers a better way forward: one unified platform that automates away that complexity across portfolio accounting, reconciliation, reporting, trading, compliance and more, all at scale. Ridgeline is revolutionizing investment management, helping ambitious firms scale faster, operate smarter and stay ahead of the curve. See what Ridgeline can unlock for your firm. Schedule a demo at Ridgeline AI.
Interviewer
I had this experience just today where, on the flight here, I got rate limited out on something. I saw 4.7 came out, and what I immediately wanted was to be on 4.7. That second, I couldn't think about using 4.6 anymore now that 4.7 is out, and I was perfectly happy with 4.6 for the last many weeks. It's amazing. Are you surprised that people are so insistent on going to the most expensive leading-edge thing to the degree they are?
Dylan Patel
Without a doubt. I think one of my funniest memories in the past month and a half is myself and a buddy of mine, Leopold, being on our knees in front of an Anthropic co-founder, begging him for access to Mythos, and him pretending it doesn't exist, even though we knew it existed. We're like, please give us access. And he's like, I don't know what you're talking about.
Interviewer
What was your reaction to that rate card or that eval card coming out?
Dylan Patel
It was rumored in the Bay Area. We knew it was supposed to be really good. If you just look at the benchmarks, and obviously benchmarks change over time, Mythos is potentially the biggest step up in model capabilities in two years. I think that's a really, really important detail: it's so good that they don't want to release it, even though they already announced the price to the people they did a selective release for, for cyber, and it's 5 or 10x the token cost. They just don't want to release it because they're worried about the impact on the world, and they're releasing a worse version, Opus 4.7, to us. And they explicitly said in the model card, hey, we actually preferentially made it worse at cyber. I don't know if you read that. Whoever you are, if you have enough capital, you should get a freaking enterprise Anthropic subscription where you pay per token, not one of these subscriptions, because then you won't get rate limited much. And then you need to figure out how to leverage those tokens on the highest value task and make money off of it. Because ultimately, maybe a year or two years from now, what you're doing as a business is actually just arbitraging tokens, right? The tokens are amazing, but let's figure out what direction to point them in. And then three or four years from now, the model will know what to do with the tokens and how to make the most value. You can look at this retroactively. Pick any benchmark. Hitting a certain capability tier used to cost X and now it costs 1/100 or 1/1000 of that. DeepSeek, for example, was 1/600 the cost of GPT-4. And since then, the costs have fallen further for GPT-4 class models. Of course, no one gives a crap about GPT-4 class models. They want the frontier because the frontier lets them create the economically valuable things. But GPT-4 class models can still be used in stuff, and so people are using them in some tiny use cases. It's just that the cost falling so fast is not really what's driving the demand.
What's driving demand is all these new use cases. You have the current Opus 4.6 or Opus 4.7 tier models. A year from now, my spend for the same exact quality of model would probably be like 70k. I bet you it'll be 100 times cheaper. Irrelevant, because I'm going to be using a way, way, way better model which can do way, way better things. Anthropic's Mythos is more expensive as a model, but it spends a lot fewer tokens to do the thing. And therefore it is actually cheaper on most tasks than Opus 4.6, because it's just way more efficient, even though each individual token is smarter. So, yeah, there are crazy geniuses creating huge cost efficiency improvements every day. They work at the labs and they're making the models way more efficient. You see it every generation. GPT, what was it, 5 nano or whatever was better than GPT-4, or 5 mini was better than GPT-4, and it was like 1/100 the cost. This just happens and we accept it at face value. But ultimately, you keep making things cheaper, and then you keep scaling them up, and you keep getting humongous improvements.
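Two pieces of arithmetic in this answer can be made concrete: the "100 times cheaper" projection on a $7 million run rate, and the claim that a model priced at 5 or 10x per token can still be cheaper per task if it uses few enough tokens. In the sketch below, the $7 million, the 100x and the 5x/10x multiples come from the conversation; the baseline price, the token counts and the function names are hypothetical.

```python
def cost_per_task(price_per_token: float, tokens_per_task: float) -> float:
    """Total cost of one task: price per token times tokens consumed."""
    return price_per_token * tokens_per_task


def breakeven_token_fraction(price_multiple: float) -> float:
    """Fraction of the baseline token count below which the pricier model wins."""
    return 1.0 / price_multiple


# Same-quality spend a year out, using the figures quoted in the conversation.
current_spend = 7_000_000
same_quality_spend = current_spend / 100   # "100 times cheaper"
print(f"Same-quality spend: ${same_quality_spend:,.0f}")

# HYPOTHETICAL baseline: price of 1.0 per token, 1,000,000 tokens per task.
baseline_cost = cost_per_task(1.0, 1_000_000)
for multiple in (5, 10):                   # "5 or 10x the token cost"
    fraction = breakeven_token_fraction(multiple)
    print(f"At {multiple}x the price, cheaper below {fraction:.0%} of baseline tokens")

# HYPOTHETICAL: a 5x-priced model that needs only 15% of the baseline tokens.
pricier_cost = cost_per_task(5.0, 150_000)
print(f"Pricier model cheaper per task: {pricier_cost < baseline_cost}")
```

The break-even is simply the reciprocal of the price multiple: at 5x the price the model must finish the task in under a fifth of the tokens, and at 10x in under a tenth, to come out cheaper per task.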
Interviewer
When I last saw you, Mythos had just come out maybe the day before or something, or the card had just come out, and you said something like, it actually made you feel like a little scared. It was so good. What did you mean by that?
Dylan Patel
Anthropic's whole goal in 2025, and even a lot of 2024, was, hey, by the end of 2025, we need an L4 software engineer in our model. And they by and large achieved that with Opus 4.6. What they didn't say is that, if you look at Mythos and compare benchmarks, it's like an L6 engineer. L4 is pretty new. L6 is quite well experienced. I think Anthropic said that the model was available internally in February. So in two months they've gone from an L4 engineer to an L6 engineer. What's next? When you think about model progress, it's only accelerated. Anthropic's release cadence has compressed. OpenAI's release cadence has compressed. Why? Because generally, to make a better model, you need a few things, right? You need amazing compute. Compute is very expensive and it has a timescale that we track. It's growing, but it's set in stone for the short term. What you've already signed is set in stone. There will be delays and shifts, and somehow you can find a little more, but it's generally pretty set in stone. Then there are amazing researchers that people are paying tens of millions of dollars for. And then lastly there's implementation. Implementation historically has been very difficult. If I have an idea, I have to implement it, and implementing is hard. Now, ideas are there and implementation is very easy. It's expensive, but it's very easy. So how does one decide what ideas to implement? It turns out that if implementation is just so much easier, you can implement more ideas and move on the treadmill faster and faster and faster. That could be AI model research, and so now your model release cadence has shrunk down to two months from where it was six months before. Or I want to take every power plant in the US and every transmission line and model it and run regressions and see the micro supply and demand. I can also do that. The idea is cheap. Which idea makes sense?
Which idea is worth the capital that you have to spend on the tokens? Because the implementation is there. That's the key learning. And implementation costs continue to tank. We don't even have Mythos yet. It's only been a handful of hours since Opus 4.7 launched, but my team is pretty excited about it internally. What now comes to the world? It's a complete reordering of how economies work. What used to matter a lot was that execution was very, very fucking difficult and ideas were cheap. Now ideas are cheap and plentiful, but execution is very easy. So really only the good ideas are the ones that can justify the spend on super cheap implementation.
Interviewer
So are you actually scared? Or does it just introduce an uncertainty that's hard to grapple with?
Dylan Patel
Uncertainty is there. But I do think that causes some fear in terms of how society reforms itself. How does one exist in a world where your ability to implement something is not actually that important? Your ability to choose the correct idea for AI to implement, and then your ability to sell that idea or sell what the AI has implemented, is what matters. Your ability to garner capital towards that is what matters. And going back to the point that it's very important to always have the newest model: who's going to have access to the newest model? Take Anthropic's project. I know it's not called Earwig, but I troll Anthropic people by calling it Earwig. Glasswig. Anthropic Earwig, where they only release Mythos to certain companies for cyber. That's just going to be something that continues. Models will have less and less broad deployment. I know OpenAI and Anthropic and all these people are like, we want to have great AI for everyone. AI is very fucking expensive. Who's going to pay for the trillion dollars of infrastructure? People who have money and can build useful things with AI. And then you don't want people to distill your model, so you don't release it broadly. You release it to a smaller and smaller set of customers. Those customers are also now wrestling over the tokens, unless Anthropic jacks prices. They could double their pricing on Opus and I would continue to pay. And I bet most users would continue to pay. I bet that wouldn't solve the humongous capacity problem that they have. So then the question becomes, where does this cycle end, where token usage, and therefore the benefits of those tokens, the additional value generated on top of those tokens, aggregates among fewer and fewer and fewer companies? I don't have Mythos. You know who has Mythos? Top freaking banks. Now, they're only using it for cybersecurity.
But at some point I can envision a world where, hey, maybe because I have an enterprise Anthropic contract and because Anthropic people kind of like me, they're willing to give us slightly earlier access or slightly higher rate limits or something for a model. I hope that's what happens. And then my competitor, whoever that is, doesn't have that and I'm able to fucking crush them. There are people like Ken Griffin of Citadel, super well connected and super rich. He goes and signs a deal with OpenAI or Anthropic that's like, yeah, I'm going to get access to your models and I'll buy the first $10 billion worth of tokens each year. So whenever you release the model, I'll spend the first 10 billion tokens and then everyone else can get the model after that.
Interviewer
Yeah.
Dylan Patel
And it's like, okay, well now what does that do? Now he's going to crush everyone in the markets. That's just an example. It could be any number of things. It could be cyber like Anthropic is worried about, oh, now I can hack people. It could be information services business like myself where I crush someone else. I think it's such a broad base, we don't know what these models can do. Anthropic doesn't know what these models can do. No one knows what these models can do. It's up to the end user to figure out where they can leverage the tokens to see what they can build and imagine which is tremendously productive and uplifting humanity. But then what happens to the concentration of resources and usage of it?
Interviewer
Presumably right now robotics or robots consume relatively zero tokens versus everything else. What's your view of that? If that's like a second demand curve that could start to ratchet, there's a new startup every single day within a mile of here trying to build something interesting in robotics.
Dylan Patel
So there's this concept of a software-only singularity, which is that the world has an AI singularity, but only in software. And now what about the rest of the world? The vast majority of the world is physical. You can see the world orient around hardware, not software. That's actually why I think the software-only singularity is just a blip, and we do get everything else. Because once software is super easy, what makes robots really hard? Programming microcontrollers and actuators and controlling all this stuff is very difficult. And right now the interesting thing about AI models is that they're actually really inefficient at learning. It's just that we're able to give them so much data that they're able to learn and pass us in certain ways. Currently the robot models, VLAs, vision-language-action models, which are very popular right now, are probably not going to be the thing that ultimately scales. They're inefficient with data and we can't scale the data for them fast enough. There is going to be some way to large-scale pre-train robot models, just like humans see all this data throughout their lives. And what's interesting is that the reason humans are so good is we're sample efficient. One example, two examples, we're good. So apply that to robotics. Once you have this software-only singularity, implementation is super cheap. Anyone can start to build these models, and now robots are actually useful. And so I think in the next 6 to 18 months we'll start seeing real breakthroughs in robotics that enable few-shot learning. That is, there's a pre-trained robot model, and now there's a robot that you have hired or bought or whatever, and you show it a few examples and it's able to do it. Right now, you know, there are a lot of companies doing robots for advertisement or robots for simple stuff like that. It'll be like, oh, folding clothes, sure, sure, sure.
No, it's going to get really niche: robots just for cleaning chalkboards, and it's a rental service, or it'll be a model package that you download onto your standard robot that then does that. Anyway, it'll be a huge explosion in physical goods acceleration and deflationary effects there. But that's ultimately going to keep token demand going crazy. I don't think token demand slows down, personally.
Interviewer
Did you learn anything else about the world based on Mythos results and how it was built? It's my way of asking if you break down the components of the scaling laws.
Dylan Patel
Mythos is a materially larger model than prior models. What chip it's trained on is not really relevant. It's the scale. And obviously 100,000 Blackwells is equivalent to hundreds of thousands of prior generation chips. TPUs and Trainium have their own release cadence, so it's not exactly mirrored one to one. But ultimately Mythos is a significantly larger model. It's proof that the scaling laws still work. Everything about it shows the trend line continues: more compute into the model makes the model better. And along the whole way it's not just more compute into the model makes the model better. Along the whole way we're also getting these compute efficiency wins, which is what all this research compute that the labs are spending is actually turning into. If I want an X capability tier model, every six months or every two months that cost is dramatically decreasing. But then if I scale it up massively, I get a humongous capability jump as well. And so yes, it's proof that this is still happening. Google and Anthropic are not heavy, heavy users of GPUs on the training side, but OpenAI is, and they'll start having their new class of models. I think they're taking a more sensible, principled approach to scaling in small steps. Anthropic really went for a huge jump. We'll see better and better models throughout the year, and the release cadence is only going to get faster.
Interviewer
We've gone a long way in the conversation with saying almost nothing about OpenAI, which would have been so strange 12 months ago.
Dylan Patel
So this is an interesting thing. Everyone's like, okay, so Anthropic's just won, right? They had Mythos in February. They never even released it because they didn't feel the need to. They're already sold out. Their revenue's already adding $10 billion a month. And then you've got Opus 4.7 today, all before OpenAI's alleged spud release, which media such as The Information and others have posted about. So clearly, Anthropic is in the lead and OpenAI is cooked. What's interesting is that Anthropic has such bounds on compute and they can only grow it so fast. And to that point, Dario used to gloat about how OpenAI was being too aggressive on compute and Anthropic was more sensible in their scaling. And now Anthropic is like, fuck, I wish we had a lot more compute. OpenAI is able to pay the bills perfectly fine. In fact, they raised a ton of money to get incremental compute in addition to the irresponsible levels of compute that they were buying from Oracle and CoreWeave, even SoftBank and all these people, and Microsoft. Now they're getting Trainium as well from Amazon. They've done this insane thing on compute. They also know they need more. But what's interesting is if you were to say, Opus 4.6, let's ignore models getting better over time. Let's just take diffusion of this technology. You and I may jump on the model immediately, day one, but other businesses take time and it takes time for people to learn. And the spark of the oh shit, Claude psychosis moment doesn't hit everyone at the same time. And so by the end of the year, let's say the economy would spend $100 billion on a 4.6 Opus-tier model. I don't think that's unreasonable. It's spending 40 billion right now.
Interviewer
That's like a linear extrapolation.
Dylan Patel
It's a linear extrapolation, not an exponential. To get the exponential, you need the better models. Anthropic won't have enough compute to do that. And presumably OpenAI and Google will hit that tier soon enough, whoever hits that tier next. Sure, Anthropic may get to charge 70 plus percent gross margins, but if OpenAI hits it next and they charge 50% gross margins, they still get all of this incremental demand. And probably they also won't have enough compute to serve all the users. Sure, maybe Mythos is a model where if the world had enough compute, it'd be $500 billion in revenue or something crazy. There is such demand for these tokens and such limitations on compute. We see this with H100 prices skyrocketing and all these other things. The useful life of these GPUs continues to extend and extend. It's pretty clear even the Tier 2 lab is going to be sold out of tokens, let alone the Tier 1 lab. The Tier 1 lab will have better margins, but the Tier 2 lab will be sold out, and probably the Tier 3 lab will also be close to sold out. The economic value that the best model can deliver is growing faster than our ability to actually serve those tokens to people via the infrastructure. And so this gap will continue to grow and the model labs will continue to have expanding margins until people in the hardware supply chain and infrastructure supply chain are like, wait, no, why don't I just jack up my margins?
Interviewer
So suffice to say, I think the assessment today, or your assessment, of the demand side is that it's completely explosive, in your own particular example here at SemiAnalysis, but also more broadly with what you call AI psychosis, as people fall into this experience of what they can do with the implementation difficulty going completely away. I've certainly felt that. My own token spend is just through the absolute roof, just in a matter of weeks. So that feels like a pretty good assessment. Anything we're missing on the demand side?
Dylan Patel
If you don't use more tokens, you'll never escape the permanent underclass either. You use more tokens and you generate economic value, outsize economic value, for the use of those tokens. A lot of people are doing it the boring, lazy way. Oh, I guess I'll just work one hour a day instead of eight hours a day and I'll have AI do most of my job. That's the boring way. The cool way is I'll still work 8 hours a day and I'll do 8x the work and maybe I'll make 5x the money. You can't do this with a job, obviously. There's people who have multiple jobs. There's people who start companies and start selling stuff. There's people who are hustling, which is what I view you and I as doing. We're mostly hustling. Get that economic value on this AI before everyone is using it and it's table stakes, because it's still not table stakes. So you have to use more tokens, generate the value from them, and capture that value. There's three different problems here: using more tokens, generating value from those tokens, and capturing value from the value that you created from the tokens. If you don't do these three things, you'll never escape the permanent underclass, i.e., as models continue to skyrocket in capability and the concentration of resources potentially happens.
Interviewer
All right, let's talk about supply. What is changing at the frontier of supplying the entire stack that's required to serve all these tokens as the demand curve explodes?
Dylan Patel
As demand skyrockets, prices are going up for everything on the supply side. Whether it be the end GPUs, their prices are going up. In addition, their useful life is extending.
Interviewer
H100 prices look like this.
Dylan Patel
Yeah, exactly. There's people who have argued GPU useful lives are less than five years. Complete nonsense. There are clusters now re-signing: three- or four-year-old Hopper clusters re-signing for three or four more years. There are A100 clusters that are re-signing for another couple of years. So the useful life is clearly not five years. It's maybe even seven or eight years. Arguably we don't know yet. We'll see when Hopper gets there. But it's clearly not five years. So useful life is extending and the prices are going up on that renewal. In effect, the gross margin was not 35% on a cluster. It's beyond that. So margins are expanding in the cloud layer. Margins are extremely healthy on the hardware layer, with Nvidia still charging 75 or whatever percent gross margin. As we move down the stack to memory, obviously margins have skyrocketed. Then in places like optics and logic, there are large prepayments and margins are growing more slowly, so the companies that are making chips, like Nvidia, are paying huge prepayments. So in effect, with the cost of capital or timing of cash flow, the return on invested capital is going up even if the gross margin isn't. And you see this across the whole supply chain. You see ASML is completely sold out and they need Carl Zeiss to expand faster. Everyone's either sold out and margins are going up, or they're getting prepayments, which increases the return on invested capital because the invested capital is lower. And so this is a consistent trend across any part. Even to make a PCB requires copper foil. And that copper foil is sold out and people are making prepayments for it. Anything and everything that has a pulse and is sold out, people are jumping to get more incremental supply and fighting over the supply for the years after.
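Editor's sketch: the two mechanisms in the answer above (prepayments raising return on invested capital, and a longer useful life raising a cluster's lifetime margin) can be made concrete with a few lines of arithmetic. Everything here is an illustrative assumption: the function names are hypothetical, the dollar figures are invented round numbers rather than figures from the conversation, and the margin calculation ignores power and other operating costs.

```python
def return_on_invested_capital(operating_profit, capital_required, customer_prepayment=0.0):
    """Profit divided by the capital the supplier itself has to fund."""
    invested = capital_required - customer_prepayment
    if invested <= 0:
        raise ValueError("prepayment covers all required capital; ROIC is undefined")
    return operating_profit / invested


def lifetime_gross_margin(upfront_cost, annual_revenue, years_in_service):
    """Gross margin over the whole rental life, ignoring power and other operating costs."""
    total_revenue = annual_revenue * years_in_service
    return 1.0 - upfront_cost / total_revenue


# Hypothetical supplier: same profit, but the customer prepays 40 of the 100 in capex.
roic_without_prepay = return_on_invested_capital(20.0, 100.0)
roic_with_prepay = return_on_invested_capital(20.0, 100.0, customer_prepayment=40.0)

# Hypothetical cluster: costs 65 and rents for 20 per year at flat pricing.
margin_five_years = lifetime_gross_margin(65.0, 20.0, 5)
margin_eight_years = lifetime_gross_margin(65.0, 20.0, 8)

print(f"ROIC: {roic_without_prepay:.1%} -> {roic_with_prepay:.1%}")
print(f"Lifetime margin: {margin_five_years:.1%} -> {margin_eight_years:.1%}")
```

With these made-up inputs, the prepayment lifts ROIC from 20% to roughly 33% with no change in profit, and stretching the rental life from five to eight years lifts the lifetime margin from 35% to roughly 59%, before any price increase on renewal.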
Interviewer
What do you think are the most important bottlenecks? Typically in economic history, when there's this kind of demand, supply reorients and rises very, very quickly to meet the demand. It seems like it's almost impossible for supply right now, in this moment, to keep up. Famous last words. Every shortage is followed by a glut, historically. But what are the most interesting bottlenecks to you across the supply side?
Dylan Patel
Supply chains are usually very fast to react. One unique thing is that our supply chains now are more complex than ever and the things we're building are more complex than ever. And therefore the lead times are longer. And it's not like we haven't seen 18-month-long lead times in other industries. It's just that building incremental supply didn't take years. And this is the case with memory. Memory can only grow capacity low double digit percentages a year, right? 20s, 30% a year, even less for NAND, a little bit higher for DRAM, but whatever. Even though the demand signal was very strong at the end of 2025, the memory companies immediately started reacting. None of that incremental capacity really gets here until the second that they've decided to do. In addition to the typical 20 to 30%, they can stretch a little bit. But really the true incremental supply doesn't come till '28, which is a very unique thing. Even if they wanted to build as fast as possible, it doesn't come till '28, late '27 at best. So the result is memory prices have gone through the roof and guess what? They're going to double and triple again, at least on DRAM especially. People are like, oh, the memory story is overplayed. Everyone gets in and it's like no, no, no, you don't get it. DRAM will double or triple from here still because that's how much capacity is required. And they have to steal capacity from somewhere else. And the only way to steal capacity from somewhere else in a capitalist economy is demand destruction via higher pricing. We're not rationing stuff here. And so ultimately that's what's going to happen. And so margins continue to go up. I think logic also has humongous capacity problems. TSMC just had their earnings, and they keep upping capex. Ultimately it takes them quite some time to build fabs. They're trying to do everything they can to squeeze every little output out of every fab that they have. But ultimately they're not raising prices fast because they're good people.
It seems like single digit price increases instead of triple digit price increases like the memory guys have had. So you ultimately have this market where yeah, TSMC is a great company, but are they actually going to extract all the value? I mentioned things like copper foil, glass fibers for PCBs, lasers. These are things that are well understood in niche supply chains, but they're very, very tight. And ultimately upstream, the semiconductor wafer fabrication equipment supply chain is one that's gone up a lot, but it's still very underappreciated. TSMC capex this year, they say 56. We've had 57.4 billion since January, and we may up it slightly more just because we see some ways that they can get incremental capex. But what people aren't focusing on is what does that mean next year and what does that mean the year after? And it turns out three years from now TSMC is going to spend $100 billion on capex. Maybe two years from now; it might be '28. Sincerely, they may spend $100 billion on capex in 2028, and people just can't fathom that. But what does that mean for their downstream supply chains? Companies like Lam Research or Applied Materials or ASML, or their further downstream supply chains like MKSI and all these other companies. The tailwhip, it just gets whipped harder and harder and harder, and that's a shortage. If TSMC wants to spend $100 billion in 2028, which is a real possibility, I think people would think that's insane, but that's a real, real possibility.
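Editor's sketch: to put the capex numbers above on a single scale, the snippet below computes the compound annual growth rate implied by going from the 57.4 billion estimate for this year to 100 billion in 2028. It assumes "this year" is 2026 (the episode date), so the gap is two years; the function name is hypothetical.

```python
def implied_annual_growth(start_value, end_value, years):
    """Compound annual growth rate that takes start_value to end_value in `years` years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# 57.4 (billion USD, the estimate quoted for this year) growing to 100 in 2028.
# Assumption: the base year is 2026, so the span is two years.
capex_growth = implied_annual_growth(57.4, 100.0, 2)

print(f"Implied capex growth per year: {capex_growth:.1%}")
```

Under that assumption the implied growth is a little over 30% a year for two years running. If the $100 billion arrives a year later, the same function with `years=3` gives a lower annual rate.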
Interviewer
What about other parts of the chip ecosystem where GPUs have been completely dominant? What about like CPUs or ASICs or things that start to pop out as both opportunities and bottlenecks beyond just Nvidia's GPU dominance?
Dylan Patel
Yeah, I mean ASICs are obviously taking off, but I'll pivot away from AI chips to talk about these other things. There's a project we did on FPGAs, and it turns out there's 120 FPGAs per next generation AI rack. And then what about all the FPGA names? CPU-wise, all these reinforcement learning environments, plus all the slop code you and I are generating that is now running on some Vercel instance or whatever it is, or some AWS instance or some bucket that we've spun up, all of that requires CPU. And so CPUs are completely sold out and demand is skyrocketing there.
Interviewer
To help people understand the role that the CPU plays in everything.
Dylan Patel
There's two main reasons why you need tons of CPUs. One is when you're doing reinforcement learning, the CPU is very critical to that. So before, you would throw all the Internet's data into the model, train it, and it spits some stuff out. Now you put all the Internet data into the model, then you put it in this environment. This environment is like, hey model, try this out, and it tries stuff out, tries a bunch of different things. And in the end, there is an environment which scores whether or not what it tried out is successful, and it grades it. And these environments can be anything. It can be, hey, check if the text was outputted in the right way, structured outputs. It can be very simple stuff or it can be very complex stuff. And people are starting to get into very complex things. Like, hey, I want you to open this file, change it, edit it, update it, submit it to this website. I want you to open up this physics simulation from Siemens and edit this CAD model. So the environments can get more and more complex. And those environments run on CPUs. They don't run on GPUs, they don't run on ASICs. The ASICs run the model that takes the input data from the environment and runs it through the model. The model creates outputs of various different trajectories, ways that it thinks it could solve it in different instances. Those trajectories are graded, slash, scored. And the ones that are successful, you train on, and you update and you iterate, iterate, iterate. And so CPUs are very useful for that. That's one. And then once you have these great models and you're deploying them, those models are generating code, they're generating useful output. That useful output doesn't go from a GPU straight to the human brain. It goes from a GPU or an ASIC through to a deployed app that you're deploying somewhere that actually just runs on CPUs. So that's another area where there's a lot of demand and things are sold out in a large, large way.
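Editor's sketch: the reinforcement learning loop described above (the model proposes several trajectories, a CPU-side environment grades them, and the successful ones are trained on) can be shown as a toy program. This is a minimal illustration under loud assumptions, not any lab's actual training code: `ToyModel` stands in for the model on the GPU or ASIC and simply reweights a fixed list of canned outputs, and the environment is the simple "structured output" check mentioned in the conversation. All names are hypothetical.

```python
import random


def structured_output_environment(trajectory: str) -> float:
    """CPU-side grader: 1.0 if the output looks like 'name=digits', else 0.0."""
    key, sep, value = trajectory.partition("=")
    return 1.0 if sep and key.isidentifier() and value.isdigit() else 0.0


class ToyModel:
    """Stand-in for the model on the accelerator: a weighted choice over canned outputs."""

    def __init__(self, candidates):
        self.weights = {candidate: 1.0 for candidate in candidates}

    def sample(self, rng, n):
        outputs = list(self.weights)
        return rng.choices(outputs, weights=[self.weights[o] for o in outputs], k=n)

    def train_on(self, successful):
        # "Training" here just makes successful outputs more likely next time.
        for trajectory in successful:
            self.weights[trajectory] += 1.0


def rl_step(model, environment, rng, num_trajectories=8):
    """One iteration: sample trajectories, grade them, train on the successful ones."""
    trajectories = model.sample(rng, num_trajectories)
    scored = [(t, environment(t)) for t in trajectories]
    successful = [t for t, score in scored if score > 0.0]
    model.train_on(successful)
    return len(successful) / num_trajectories


rng = random.Random(0)
model = ToyModel(["count=3", "count is three", "=3", "count=three"])
for _ in range(50):
    rl_step(model, structured_output_environment, rng)

best_output = max(model.weights, key=model.weights.get)
print(best_output)
```

The point of the sketch is the division of labor: `structured_output_environment` is ordinary CPU code, and in a real system it could be something far heavier, such as a browser session or a simulation, which is where the CPU demand comes from.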
Patrick O'Shaughnessy
Your finance team isn't losing money on big mistakes. It's leaking through a thousand tiny decisions nobody's watching. Ramp puts guardrails on spending before it happens. Real-time limits, automatic rules, zero firefighting. Try it at ramp.com/invest. As your business grows, Vanta scales with you, automating compliance and giving you a single source of truth for security and risk. Learn more at vanta.com/invest. Ridgeline is redefining asset management technology as a true partner, not just a software vendor. They've helped firms 5x in scale, enabling faster growth, smarter operations, and a competitive edge. Visit ridgelineapps.com to see what they can unlock for your firm. Every investment firm is unique, and generic AI doesn't understand your process. Rogo does. It's an AI platform built specifically for Wall Street, connected to your data, understanding your process and producing real outputs. Check them out at rogo.ai/invest. The best AI and software companies, from OpenAI to Cursor to Perplexity, use WorkOS to become enterprise ready overnight, not in months. Visit workos.com to skip the unglamorous infrastructure work and focus on your product.
Interviewer
As you continue to assess and try to be the world's best informed person on both the trajectory of supply and demand, what are things that you wish you knew, to make that understanding, that you don't know?
Dylan Patel
I think the hardest area for us and for everyone is understanding tokenomics, the economics of tokens. I think we have really tremendously good insight into how much it costs to run infrastructure, what the cost of tokens is, what the cost of models is, what the margins of these labs are. But the usage and adoption is what's really difficult to model continuously, right? In January we had crazy estimates for February. Anthropic smashed them. How do we calibrate this model? What are the data sources for this? In February we had crazy assumptions for March. I know people were like, you're crazy, Dylan. And then they smashed it. Everyone sees the number of 10 billion and they're like, what the f? How do they add 10 billion of revenue? Who is using all these tokens? Why are they using them? What are they building with them? And then more importantly, with what they're building with these tokens, how is that actually diffusing into the economy? And what value is that generating? Because it's not really something that you can capture in any GDP statistic. All of the value of the tokens that I use gets transformed into better information, which I then sell at a discount, relatively, to what people used to sell information for. And therefore that information is now making its way throughout the economy and people are making better investment decisions, or better competitive decisions if they're a semiconductor company or data center company or hyperscaler. What is the value of this? What has that done to the economy? It's clearly, by every subjective metric, amazing. But where is the phantom GDP? What is the phantom GDP? How do we track the real economic value? Because the GDP metrics are not accurate. If you were to say, what is the GDP that Dylan Patel is making? It's tiny compared to the value that I think is being created. And I think you would say the same for Patrick. What is the value being created by these tokens? Not on a simple basis, but what is the knock-on effect of all the things that these things are doing? I think that's the real question and challenge that's hard to measure. I think we've got a tremendous reading on the supply side of things. I think we've got a tremendous reading on even a lot of the demand side signals. But it's the value these tokens are generating that's hard to quantify and measure.
Interviewer
I hope we get a chance to do this every three months because this changes so quickly. What do you think is going to happen next? When I come back three months from now and we're in San Francisco together again, what do you expect?
Dylan Patel
Large scale protests.
Interviewer
Really?
Dylan Patel
Yeah. I think there will be a large scale protest against Anthropic and OpenAI. People hate AI. AI is less popular than ICE, less popular than politicians. With Anthropic adding so much revenue, that's going to start causing business changes downstream. People are going to get more and more scared of AI. They'll start blaming it for more and more of their own problems. And things that are global and have been deep-seated problems for a long time, those will bubble up and be blamed on AI. Probably some politician or some influencer will be able to start taking and weaponizing AI against people. You look at the comments of news articles where Sam Altman had a Molotov cocktail thrown in his house twice in two weeks, and people are cheering it on. And this is just the beginning. So I think we'll see large scale protests against AI in three months.
Interviewer
What is the counterweight to that? How should the AI industry head that off?
Dylan Patel
First of all, Sam Altman and Dario have to stop getting on interviews. They're so uncharismatic. I don't know what they're doing. Every interview they do is like, wow, normal people are going to hate you even more. Sam being on Tucker Carlson probably made all Republicans hate OpenAI. I'm just guessing. Same with Dario. I think that's first. Two, they need to start showing uplifting things that can be done with AI. Three, they need to stop talking about how the capabilities are going to change the whole world constantly. Because then people are going to get fearful of that capability, because they have no connection.
Interviewer
They don't know how to use it.
Dylan Patel
Yeah, there's no connection to it either. The average person doesn't know an Anthropic employee. The average person doesn't know an OpenAI employee. The average person doesn't know who these people are or what their goals are. And they just view them as a sneaky cabal of 5,000 people at this company that are going to change the world and automate all the jobs and destroy society. That's what they view it as. And as people who are funding the building of all these data centers and power plants that are going to pollute the world. They don't quite understand what's happening, so the labs have to stop talking about the future thing that's going to happen and only talk about the present, how uplifting AI is. I think it's a huge reorg and rebranding that needs to be done.
Interviewer
This is so much fun. I love doing this with you. Thanks for your time.
Dylan Patel
Awesome Biggs.
Patrick O'Shaughnessy
If you enjoyed this episode, visit colossus.com. You'll find every episode of this podcast complete with hand-edited transcripts. You can also subscribe to Colossus, our quarterly print, digital and private audio publication featuring in-depth profiles of the founders, investors and companies that we admire most. Learn more at colossus.com/subscribe.
Invest Like the Best with Patrick O'Shaughnessy
Guest: Dylan Patel, Founder and CEO of SemiAnalysis
Episode: The Infinite Demand for Tokens, Claude Mythos, and Supply Constraints
Date: April 23, 2026
In this wide-ranging and energetic episode, Patrick O’Shaughnessy sits down for a second time with Dylan Patel, founder and CEO of SemiAnalysis, to dissect the explosive surge in demand for AI tokens, the ongoing supply chain bottlenecks constraining model development, and the earth-shaking implications of Anthropic’s Mythos model. Through firsthand stories and deep industry analysis, Dylan describes a market where demand is "nearly unbounded," business practices stand at the edge of reinvention, and the future competitive landscape, and even social dynamics, are being radically redrawn by relentless advances in AI.
“Last year we thought we were heavy users of AI… This year the spend is just skyrocketed.” (03:19, Dylan Patel)
“The person previously worked at Intel, and he said that was an entire team's job to build that and maintain that. I'll rack that up across the entire firm. It's insane.” (05:36, Dylan Patel)
“He piped all this different data… and started running regressions… He did this all by himself… would have taken the team of 200 economists a year.” (06:29 & 07:20, Dylan Patel)
“If I don't move up the bar, then I will be commoditized. If I don't move fast enough, I will also lose my edge.” (07:50, Dylan Patel)
“XYZ company has a hundred people and have been working on this for a decade… in some ways [our product] is better.” (09:27, Dylan Patel)
“We move faster, we're more nimble at the edge… they'll try and build some of the stuff we do and more likely they'll just buy the data from us.” (10:49, Dylan Patel)
Frontier Model Obsession
Businesses always want access to the latest and most powerful models, even at significant cost premiums.
“I got really limited out on something… I saw 4.7 came out and what I immediately wanted was to be on 4.7.” (14:53, Patrick O’Shaughnessy)
“One of my funniest memories… is myself and a buddy of mine, Leopold, being on our knees in front of an Anthropic co-founder, begging him for access to Mythos.” (15:17, Dylan Patel)
Mythos as a Game-Changer
Mythos represents such a leap in capabilities it “made you feel a little scared.” (18:27)
“It's so good that they don't want to release it… even though they already announced the price… they're worried about the impact on the world.” (15:44, Dylan Patel)
Access as a Strategic Weapon
Early or exclusive access to new models could decide the fate of whole industries.
“Maybe because I have an enterprise Anthropic contract and… they're willing to give us slightly earlier access… I hope that's what happens. And then my competitor… doesn't have that and I'm able to fucking crush them.” (22:15, Dylan Patel)
Implementation Overhead Plummets, Ideas Dominate
Cheap and fast implementation means competitive advantage comes from imagination, identifying valuable ideas, and being first to harness models for those ideas.
“What used to matter a lot was execution was very, very fucking difficult. And ideas were cheap. Now ideas are cheap and plentiful, but execution is very easy.” (19:55, Dylan Patel)
AI’s Value Often Becomes Invisible
Traditional GDP metrics understate AI’s effect, as much value is felt through deflation and hidden productivity gains ("phantom GDP").
“What has that done to the economy? It's clearly, by every subjective metric, amazing. But where is the phantom GDP? What is the phantom GDP? How do we track the real economic value? Because the GDP metrics are not accurate.” (41:36, Dylan Patel)
Growing Societal Pushback and Narrative Risk
AI is deeply unpopular and at risk of becoming the scapegoat for broader societal woes.
“I think there will be a large scale protest against Anthropic and OpenAI. People hate AI. AI is less popular than ICE, less popular than politicians.” (42:44, Dylan Patel)
Advice to AI Labs
Founders need to improve their public communication and demonstrate the current uplifting impact of AI.
“First of all, Sam Altman and Dario have to stop getting on interviews. They're so uncharismatic… They need to start showing uplifting things that can be done with AI.” (43:37, Dylan Patel)
GPU and Hardware Supply
Demand for compute hardware is so fierce that used clusters are getting multi-year lease renewals at rising prices (31:56).
“The useful life is clearly not five years. It's maybe even seven or eight years.” (31:58, Dylan Patel)
Bottlenecks across the Supply Chain:
“DRAM will double or triple from here still because that's how much capacity is required. And they have to steal capacity from somewhere else. And the only way… is demand destruction via higher pricing.” (34:40, Dylan Patel)
“Three years from now TSMC is going to spend $100 billion on capex… people just can't fathom that.” (35:44, Dylan Patel)
“CPUs are completely sold out and demand is skyrocketing there.” (37:31, Dylan Patel)
“I think in the next six to 18 months we'll start seeing real breakthroughs in robotics that enable few shot learning.” (24:55, Dylan Patel)
On Business Outpacing Employees:
“If this person can do the work of 5 to 10 to 15 people using Claude code, then all of a sudden I should probably cut people.” (04:57, Dylan Patel)
The Arms Race for the Frontier:
“No one gives a crap about GPT-4 class models. They want the frontier because the frontier lets them create the economically valuable things.” (17:53, Dylan Patel)
On Mythos’ Leap:
“If you compare benchmarks, it's like an L6 engineer… in two months they've gone from L4 engineer to L6 engineer. What's next?” (18:37, Dylan Patel)
Competitive Implications:
“There are people like Ken Griffin of Citadel… I'll buy the first $10 billion worth of tokens each year. So whenever you release the model, I'll spend the first 10 billion tokens and then everyone else can get the model after that.” (22:34, Dylan Patel)
Permanent Underclass Warning:
“If you don't use more tokens, you'll never escape the permanent underclass either.” (30:38, Dylan Patel)
From Dylan's firsthand transformation at SemiAnalysis to large-scale industry shifts, the conversation weaves together immediate business impacts, accelerating technical breakthroughs, and far-reaching implications for labor, capital, and society. As demand for tokens and model capabilities spirals upward—checked only by stubborn supply-side bottlenecks—the question shifts from “who can generate the most value from AI?” to “who gets access at all?” Whether it’s the future of work, competitive dominance, or social cohesion, Dylan and Patrick leave listeners with a sense that we are living through the fever pitch of an AI revolution whose outcome, both exhilarating and unsettling, remains undecided.
For more insight and the full transcript, visit Colossus.