
Dylan Patel, founder of SemiAnalysis and one of my go-to experts on semiconductors and data center infrastructure, joins me to discuss AI in 2025. Several key themes emerged about where AI might be headed: (1) the hyperscalers are racing ahead in capital expenditure, (2) the expected explosion in AI workloads is drawing in a wave of new specialized GPU cloud providers, (3) by 2027, AI data centers alone could account for 10% or more of total US electricity consumption, straining America’s aging infrastructure, and (4) open-source models like Llama 3.1 are driving commoditization at speed, slicing away the profit margins of plain-vanilla model serving.
A
Hi. Happy holidays. While I'm away this holiday season, I wanted to drop here a few of my conversations with leading experts in AI. I discussed how AI might surprise us in 2025 with Ethan Mollick, Dylan Patel, Nathan Benaich and Kai-Fu Lee. The conversations were initially released for members of Exponential View. First one up, my discussion with Dylan Patel. Enjoy. So it's Azeem here from Exponential View. And this is part of our series of conversations about what surprises artificial intelligence will have in store for us in 2025. It's a particularly hard question, of course, because they're surprises. If we knew what they were, they wouldn't be surprises. But to help us work through one of those questions, I've got Dylan Patel, who is the founder, the boss, the supremo of really my favorite vehicle for analyzing semiconductor and data center infrastructure questions. Dylan, thank you for joining us.
B
Yeah, thank you for having me, Azeem. And I'm here with Azeem, the, you know, the leader of Exponential View, right?
A
Well, we try, we try. So I'm in a hotel in New York, which explains the slightly bizarre, vanilla background. I was here for the DealBook Summit, Andrew Ross Sorkin and the New York Times DealBook. So I was able to hear Jeff Bezos, Sundar Pichai and Sam Altman speak yesterday. Well, also Prince Harry and Serena Williams. A few of them talked about AI, and the thing that really came across was a slightly different message between Sundar and Sam. Sam said there is no wall, things are still going to keep going, there's lots of room to run. And Sundar said the low-hanging fruit has been picked and it's going to get a little bit harder from now. And Jeff said Amazon is really turning its attention to this, and we've just released this new vertical stack of chips and models, new pricing and so on. Who was right out of them, do you think?
B
Yeah, I think it's quite interesting, right? One's just getting started. One's saying, hey, it's going to be more incremental from here. And one's a hypebeast: you know, we're going to keep going exponential from here. Right. I think this is sponsored by Smartwater somehow.
A
Right.
B
But the interesting thing is you also have to think about what each person's motivations are for their statements. Right. You know, Amazon is just getting started, right? If you look back a year ago, they were woefully behind, and today they're still woefully behind. Their new models are still very much not even top five in the world, right? As the leading cloud company they should be in that top five, but they're not. You look at OpenAI, their motivation is that they must raise, right, given the level of scaling that they want. They're going to spend all of the money that they just raised, the 6 billion in equity and 4 billion in debt, in like a year and a half. And they would like to commit to more compute than they got with that money. And so they need to raise again, like in a quarter or two, if they want to really get that money and then start committing to even more compute. And then lastly you have Sundar, who, you know, I think they're still in release paralysis. They haven't been able to release their Gemini Ultra models in a while. Right. They keep releasing iterations of the Pro. They keep inching up, but they haven't released this flagship model yet that OpenAI has, Anthropic has, and so on and so forth. So they're really lagging behind on that front. So I think it's interesting that they say it's only incremental from here, given they still haven't even gotten to the top rung, the top echelon. So that's a little bit concerning. As far as who's right, I think it's closer to Sam or Sundar, or, sorry, Sam or Bezos, because we are just getting started in AI, right?
There's scaling on many different vectors, whether it be synthetic data, whether it be compute, whether it be post-training, whether it be reasoning, whether it be AI infrastructure and inference rollout. All of this has just started, right? We're nowhere close to this being pervasive.
A
Right. And what you've just described reminds me of the Charlie Munger quote, which is, show me the incentive and I'll show you the outcome. And these three different leaders have got different incentives to give us a particular answer. As you say, Sam has to exponentiate and continue to do so in order to keep raising. So you say, and I agree with that, that we are just getting started. It's easy to mistake turbulence on the journey for the plane not getting to its destination. And of course, engineering new products of this scale is bound to hit roadblocks. When we talk about exponential growth, and we think about what's happening with these AI models, there are lots of exponentials going on: the amount of data they need, the amount of compute they need, the speed with which they're growing. Whenever we've looked at technologies like this, like semiconductors back in the 70s, 80s and 90s, there are often cracks, moments where it just becomes hard to get around that hurdle and you need a new approach. At what layer do you think we'll first see a crack in the exponentials that are driving AI? Will it be design, manufacturing, materials, deployment?
B
Yeah. Interesting. So you mentioned glimmers of cracks, right? Obviously one angle is, hey, models keep getting better, but are they getting better at the pace that people are espousing? And so then there's the whole debate over the last couple of months about scaling laws slowing down. Then the flip side is cracks around deployment, and that's where there are actually a lot of cracks. Microsoft, every earnings call, says they don't have enough capacity to deploy the models as pervasively as they'd like. OpenAI constantly says they don't have enough capacity to deploy the models as they'd like. And other folks as well: Google and Anthropic have definitely had issues deploying the models as fast as they'd like to, just due to capacity. So as you look at capacity, that's...
A
GPU data center capacity.
B
Right, exactly. And then the flip side is, hey, getting enterprises to adopt AI is taking time, right? It's not a flip of the switch as it has been with startups and some of the tech companies. So that's the challenge there. I think the crack that we see now is really that no one gets to make money besides whoever is first or second. Everyone else, their margins are really bad. When we're talking about serving Llama-tier models, serving DeepSeek-tier models, that's great, right? But Meta and DeepSeek aren't making money from open sourcing yet. They've got other tertiary benefits, of course. But then the people who take these open-source models and serve them plain vanilla aren't making money, because there's such intense competition. And then you kind of have to have built some value-add on top, and that value-add is more sticky and customer specific. And that's where I think there are a lot of cracks: how can you do the sticky, customer-specific stuff without the base model and instruct model stuff dominating your cost structure and being really bad? So there's sort of a catch-22 there.
A
Right. So in a sense, Marc Andreessen has gone off and said, look, this is a race to the bottom and these models are going to get commoditized really, really rapidly. So is that the observation that you're making? If you look at some of these people who serve models, like Together or Groq, they're serving very, very cheaply, which means extremely small margins. But not only that, they're not necessarily serving at the Claude Sonnet or GPT-4 level.
B
Yeah. So I think the real challenge here... So last year we talked about this for Mistral. We said it was a race to the bottom, that inference was a race to the bottom and that everyone was just going to take these models and commoditize them. Anytime there's an open-source model, that commoditizes everything below that tier. And that's been the case, right? With the release of Llama 3.1 405B and 70B, that's continued to commoditize a lot of language models. So when we look at, hey, Amazon's new model, well, it doesn't beat 405B, and it's just a commodity at that point. Like, what am I doing here that's differentiated? And so far the only two companies that have a good margin on serving are OpenAI and Anthropic, and then their partners who get to serve their models: OpenAI with Microsoft, and Anthropic with Amazon, and now they recently announced the Snowflake thing. So there are very few folks that get to make money serving, and it's really just dominated by that first and second place player. And the question is, hey, what happens when xAI gets to that first or second place? Does that knock one of them out? But given that they've been spending so much, do they even get there? Does that mean that there's now a first, second and third, or do first, second and third now make less money, and that tail continues on? What happens when Meta releases Llama 4? Does that just wipe out another humongous tier of models?
A
Let's assume that xAI's model is actually better than GPT-4 or Claude 3.5. How long would it need to be better than those models for developers to actually switch? Because the history of technology markets is that there is no number three, there is no number four, there is no number five. And it was true even in databases back in the late 80s, early 90s. Very, very quickly you had Oracle and Sybase and Informix, I think it was. You couldn't really think of who four or five were, and the dominance of the top two was really substantial. So why would this technology market be any different?
B
Yeah, so I think one thing about language models is that they're very replaceable, right? So we talk about, hey, this commoditization is bad for some businesses. It is. But guess what, it's driving so much adoption. The fact that people can adopt Llama 3.1 70B without paying anyone a margin is driving the cost down so much. And in fact, in some cases VCs are subsidizing it through inference providers that they keep funding, and clouds as well, with loss-leader type strategies. So it turns out that these models commoditizing are driving adoption faster and faster. It's just that there's no one really making money off it. Every time someone open-sources a model, a lot of any margin that the model makers were making gets wiped away. And then what's left there? It's the margin that Nvidia has and the margin that the data center folks have. So it ends up being that you've knocked out half of this margin. You look at Anthropic's and OpenAI's margins, their gross margins are 60, 70 plus percent. So that whole thing just gets wiped away. And 4o mini, I bet it's not making a ton of money, because it's so small and there's so much competition from Gemini Flash and Anthropic and Llama, and the list goes on. So if a company like xAI or Google were to come out with, hey, this is definitively right up there, the only way they're going to get market share is by pricing it a little bit lower. That drives people down, right? That drives the margin maybe from 70% to 60% or 50%. All of that delta is money that's lost. But it's also like semiconductors, right? Moore's Law: every two years we doubled the number of transistors and halved the cost. Guess what? The market never shrank, it kept growing.
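As a rough illustration of the margin arithmetic described here, the sketch below assumes a fixed cost to serve one unit of tokens (the `unit_cost` value and the helper name `price_for_margin` are hypothetical, not from the conversation) and shows how far the price has to fall for gross margin to move from 70% to 60% or 50%.

```python
def price_for_margin(unit_cost: float, gross_margin: float) -> float:
    """Price that yields the given gross margin on a fixed unit cost.

    Gross margin is defined as (price - cost) / price.
    """
    if not 0 <= gross_margin < 1:
        raise ValueError("gross_margin must be in [0, 1)")
    return unit_cost / (1.0 - gross_margin)


# Hypothetical cost to serve one unit of tokens; only the ratios matter.
unit_cost = 1.0
base_price = price_for_margin(unit_cost, 0.70)

for margin in (0.70, 0.60, 0.50):
    price = price_for_margin(unit_cost, margin)
    cut = 1.0 - price / base_price  # price cut relative to the 70% margin price
    print(f"margin {margin:.0%}: price {price:.2f}x cost, {cut:.0%} below the 70% price")
```

Under that fixed-cost assumption, moving from a 70% to a 50% gross margin corresponds to a 40% price cut for the same serving cost.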
A
That is the bet, right? That increasing demand, expanding the market, there's a positive elasticity of demand as price falls, is the bet that is partly being taken here. So if you ask the question, what does 2025 hold, does anyone really start making money? It might be the case that we're still not in that stable environment where anyone is really making money.
B
I mean, you can look at many different platforms out there in the sort of Bay Area tech community and they didn't make money for a long time, right?
A
Sure, you can for Amazon.
B
You can look at Airbnb and Uber, or you can go even more intense and say YouTube. How long did it take Google to make money from YouTube? More than a decade. YouTube was a money-losing enterprise for more than a decade, right? And it scaled and scaled and scaled. But now it's a tremendously profitable business for Google. Not like Search, but it's still tremendously profitable. And same with Amazon and Twitch. There are so many different platforms that are hugely capital intensive, because there's so much compute or bandwidth or storage required to serve them to millions of users. But once they get the moat, that's really helped. Now the big question here is that the switching cost is really low, right? If I want to go from OpenAI's API to, hey, a new Llama model on Together, or a new DeepSeek model on Together or Fireworks or whatever it is, it turns out you change it like that, it's that easy. You have to do a little bit of testing and you're done. It's immediately replaceable. So the question is, can OpenAI, Anthropic and these other companies build services on top? The ability to switch off of Microsoft Copilot is probably zero right now. Copilot hasn't gotten a huge amount of adoption, because Microsoft doesn't have the compute to actually deploy it at scale with all the features that they promised. But that's something that has huge switching costs, whereas just raw API models, tokenomics, those have low switching costs. And so that always chases reducing cost, and switching gets easier, and that therefore drives down margin and cost.
A
You talked about compute constraints there for Microsoft. So let's talk about compute and semiconductors as a starting point here. We've just seen Amazon announce their own family of chips in the last couple of days, the Trainium chips, upon which they've got the new models running. Nova, I think, the models are called. What do we think will happen with the semiconductor market next year? There's been a supply chain crunch in a sense. People can't order enough of these high-end chips, and we've got Nvidia continuing to push out their roadmap. What should we expect?
B
Yeah, I think we'll see... Today we have more than 100 cloud companies that have started in just the last couple of years. Some of them pivoted from other things, but all of them are basically new to GPU cloud. Historically we had these massive cloud companies with huge earnings, and all these guys still have fantastic margins on GPU stuff. But as these hundred cloud companies start to get their bearings, hey, there will be some consolidation, but there are at least five of them that are legitimate businesses, that I'd say will be around in 10 years and be much, much bigger, right? CoreWeave, Nebius, those kinds of folks. They're going to be much, much bigger. And so then the question is, what happens next year? Well, we really see some of these folks take a new stride. Nebius is already a public company. CoreWeave likely goes public. Crusoe continues to raise money. These kinds of companies reach a next level of scale and start rivaling the big guys to some extent. And then at the bottom you see a lot of people falter and keep driving the market price down. And then you have the big guys. Hey, they're going to continue to buy as much as possible, but given their plans for next year, investors are going to be pretty spooked. Like, what is Microsoft going to spend on capex next year? North of $80 billion. The Street is not ready for that. So yes, Nvidia has to deliver. Nvidia's chips are a little bit delayed, but they'll be out there, and they'll continue to grow revenue, they'll continue to ship chips. The real problem is how people are going to handle the amount of capital flowing into them right now. Now thankfully, unlike most of these AI companies, GPU clouds are profitable today. So that's kind of helpful. But that's also leading to a lot more demand flowing in.
And by demand, I mean capital continuing to enter, whether it be the big guys with their balance sheets and their cash flows, driving those lower and lower and lower. Hey, is Microsoft going to have a cash flow of only 10 billion versus 40 billion? Same with Meta. Or it's the new guys, and they're like, yeah, we're going to raise another $10 billion to build out another cluster. Or it's Oracle: hey, we're going to spend a bunch of money. This is the big theme of next year, the worry around that. The dollars that have been spent so far are somewhat insignificant compared to what's being planned for next year. It's more than doubling.
A
Right, it's more than doubling. So let's just put these numbers into some kind of context. So $80 billion, you said, is roughly where Microsoft will end up with their capex. Their capex spend five years ago was probably in the $10 billion range, right? I'm actually trying to look the number up right now, but I am struggling to find it. Yeah, I mean, 2020, $15 billion in capex. So you're looking at a roughly fivefold increase in five years. But isn't that being communicated to the Street? That is what happens in earnings calls. If you look at what TSMC says about their forward outlook, and then you look at what Nvidia says about their forward outlook, that's got to go somewhere, right? And it's going to go into the big hyperscalers.
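To put the capex figures quoted here on an annualized basis, the sketch below computes the growth multiple and the implied compound annual growth rate. It takes the spoken numbers at face value ($15 billion in 2020 and $80 billion for next year) and assumes a five-year span; the helper name `cagr` is just for illustration.

```python
def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate between two values over `years` years."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1.0 / years) - 1.0


capex_2020 = 15.0  # $bn, approximate figure quoted in the conversation
capex_next = 80.0  # $bn, the "north of 80" figure, treated here as exactly 80
years = 5          # assumption: 2020 to 2025

multiple = capex_next / capex_2020
rate = cagr(capex_2020, capex_next, years)
print(f"growth multiple: {multiple:.1f}x")
print(f"implied annual growth: {rate:.1%}")
```

With those inputs the multiple is a little over 5x and the implied growth rate is a little under 40% a year, sustained for five years.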
B
Yeah. And the only thing that these hyperscalers are communicating is, hey, it's going to grow. They're not saying what number it's going to be. And hey, for Microsoft, right now I think the Street is maybe at 70, 80, but it's going to be north of 80, to be clear. So there is a big, big change there.
A
North of 80. And what about Amazon and Google? Because those are the big three, and Meta as well. They tend to dominate capex spending from the big companies.
B
Yeah. And so what's interesting is their capex is also skyrocketing. Amazon is, Google is, Meta is. There's this interesting thing there of Google being the least aggressive for some reason. But the other thing is their share of capex is falling. All of these new guys are becoming a higher and higher percentage of the total capex, which in effect means your natural market share is falling. So they really must answer the question: hey, are we going to lose market share, or are we going to spend way more money than our investors would like? In the case of Meta, it doesn't matter what the investors think. But in the case of the other three, it matters a lot what the investors think. I say it doesn't matter for Meta because Zuck has all of the voting rights, right?
A
He has all the voting rights. But actually I'm just curious about the new players who are taking up the market share. Fundamentally they're a sort of single-line-item business: you know, we build and operate GPU data centers as utilities. So what kind of capital is that appealing to? Because that just feels like a different class of capital than the kind of person who owns Microsoft stock.
B
It absolutely is. And this is sort of the arbitrage that's going on. If you look at Microsoft, their equity cost of capital is much higher. People aren't expecting an equity cost of capital in the single-digit range. CoreWeave can go out there and get loans for 9%, right? And with 9% loans stacked on these massive data center build-outs, and then renting them out, they can make a margin that's fair. They can make a good margin. Maybe not what Amazon is used to when they're selling database products like Redshift or whatever, all these different database products where Amazon has humongous gross margins. It's not that high, but it's still really healthy. And so it's very appealing to real estate investors, private equity, credit folks. All of these folks are like, wow, 9%, 9.5%, 10%. And that's for CoreWeave, which is like the gold-standard triple-A in this space. You go out the tail, some of these guys are paying 20%, right? Like a 20% three-year loan on GPUs. So yes, there's people chasing it.
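To make the gap between a 9% and a 20% loan concrete, here is a minimal sketch using the standard fixed-payment amortization formula. The $100 million principal, the monthly payment schedule, and the three-year term for the 9% case are all assumptions for illustration; only the two interest rates and the three-year term on the 20% loan come from the conversation.

```python
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Fixed monthly payment on a fully amortizing loan."""
    n = years * 12            # number of monthly payments
    r = annual_rate / 12.0    # periodic (monthly) interest rate
    if r == 0:
        return principal / n
    return principal * r / (1.0 - (1.0 + r) ** -n)


principal = 100e6  # hypothetical loan size in dollars
term_years = 3     # assumption: same three-year term for both loans

for annual_rate in (0.09, 0.20):
    payment = monthly_payment(principal, annual_rate, term_years)
    total_interest = payment * term_years * 12 - principal
    print(f"{annual_rate:.0%} loan: payment ${payment / 1e6:.2f}M per month, "
          f"total interest ${total_interest / 1e6:.1f}M")
```

On these assumptions the borrower at 20% pays roughly $34 million in interest over the term, against roughly $14 million at 9%, which is the spread that has to be covered by rental income on the same GPUs.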
A
But turning it into an asset class actually does unlock an enormous new slew of capital that is willing to take a different risk with a different tenor, a different time horizon. And that looks different to effectively vanilla equity on the NASDAQ or whatever it happens to be. But there's still this issue about where the chips are going to come from. I just read today that Elon wants to 10x the size of his biggest data center. I think he wants to build it in a different state, and it's going to be a million chips. Where's he going to get those from? And can Nvidia supply him and everyone else?
B
So Nvidia continues to ramp, right? They produce more than 5 million high-end GPUs, Hopper GPUs, this year, and a number of Blackwell as well, as the start of the ramp. But more than 5 million Hopper. And next year they're going to do more, and the year after they're going to do more. And hey, Elon wants to do a million. That's not happening in three months, it's not happening in six months. It's happening over the next couple of years that it's a million-GPU cluster. So you can see they've been expanding capacity. They've been buying a bunch of land in Memphis. They're going to build a massive gas plant, not just mobile generators, but actually a natural gas combined-cycle plant right there. They can deploy it there. Nvidia and TSMC and SK Hynix and the rest of the supply chain are continuing to ramp up production. The question is, hey, where's Elon going to get this capital from? Well, it turns out Elon can probably raise more money than anyone else in the world right now, because he's got Tesla and SpaceX, which have delivered tremendous returns for all of their investors. And anyone who's invested in those is like, yeah, I'm going to throw money at xAI. I believe in Elon, right?
A
He's also got Mar-a-Lago.
B
There. There's the political angle too. Yeah, yeah, yeah.
A
I mean that's got to be a tailwind for him.
B
So the TVA, the Tennessee Valley Authority, as soon as Trump was elected, they approved some of Elon's permits or what have you, right? So it was quite interesting. It's like, oh yes, now that it's our folks, right? So I think there is a political aspect. I don't think that the government will specifically give him money. I think what the government will do is just slash every regulation so he can continue to do whatever the heck he wants as fast as he wants. Oh yeah, you don't need to delay your launches for FAA approval with SpaceX. No, no, just do them whenever you want. That kind of thing. And same with Memphis. Oh, you want to build a massive gas plant? Go ahead. You don't need an environmental site review. Just go. This is what's more likely to happen. Not money. But yeah, there is definitely an advantage to Mar-a-Lago.
A
Right. So we talked about that gas plant for the new Memphis data center. And of course they have mobile oil generators or diesel generators, I think, at the existing Colossus site, which is a 100,000-GPU site they built in, reputedly, 19 days. Infrastructure and power consumption seem to be a big bottleneck now. We've seen the numbers, we've seen the estimates of the growth. I think you are even more aggressive than Goldman's more aggressive growth numbers for power consumption from AI data centers. Can you just remind us what your high-end estimate is, and what that is as a percentage of U.S. electricity consumption?
B
Today, data centers are roughly 3% of total U.S. grid consumption. And this differs by state, right? States like Virginia are 30%, and then there are many other states that are much less. For the US, or North America, I bucket in Canada because it's a satellite, right? Canada's a satellite state. Guys, come on.
A
Right.
B
By the end of this decade, U.S. power consumption for data centers could be as high as 15% of the total. By 2027 is where we see it being around 10%. And that's given the massive, massive growth that's happening in some states. So Virginia continues to grow, but really there's a handoff to new states: states like Ohio and Illinois and Texas and Wyoming. This is where the power consumption is soaring, right?
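The shares quoted here (roughly 3% today, around 10% by 2027, possibly 15% by the end of the decade) imply a growth multiple for data center load. The sketch below works that out under a deliberately simple assumption that is not from the conversation: all other electricity demand stays flat, so the share rises only because data center load grows. The start year of 2024 and the function names are also assumptions.

```python
def implied_load_multiple(share_start: float, share_end: float) -> float:
    """Multiple by which data center load must grow to move its share of
    total consumption from share_start to share_end, holding all other
    load constant."""
    for share in (share_start, share_end):
        if not 0 < share < 1:
            raise ValueError("shares must be strictly between 0 and 1")
    ratio_start = share_start / (1.0 - share_start)  # data center load / other load
    ratio_end = share_end / (1.0 - share_end)
    return ratio_end / ratio_start


def annualized(multiple: float, years: int) -> float:
    """Constant annual growth rate that compounds to `multiple` over `years`."""
    return multiple ** (1.0 / years) - 1.0


# (target share, target year); "end of this decade" is taken here as 2030.
scenarios = [(0.10, 2027), (0.15, 2030)]
start_share, start_year = 0.03, 2024  # start year is an assumption

for target_share, target_year in scenarios:
    m = implied_load_multiple(start_share, target_share)
    g = annualized(m, target_year - start_year)
    print(f"{start_share:.0%} -> {target_share:.0%} by {target_year}: "
          f"{m:.2f}x load, {g:.0%} per year")
```

Because the total grows along with the data center load, going from 3% to 10% requires roughly a 3.6x increase in load rather than the 3.3x that a naive ratio of the two shares would suggest.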
A
Right.
B
Certain states like Oregon as well. Right. Continue to soar. Some of these states, the majority of power will be data centers by 2027.
A
Right, right.
B
Not 10%.
A
Yes. But let's turn that into numbers, right? Because of course you can forecast that, but ultimately you need to build power stations and you need to build grid connections, and the US does not build power stations as quickly as some places. So is that going to end up being a realistic gating factor?
B
It already is. Right. You know, to be clear, these numbers would be higher if the US could build them faster. That's the biggest challenge. Right. Like instead you have companies rushing to go to, you know, very odd parts of the world like Malaysia and Indonesia.
A
The Middle East, Johor in Malaysia. I wrote a piece for Wired on the growth of data centers in Johor.
B
Yeah, exactly. So Microsoft's going there, many other companies are as well. Nvidia's investing there. There are so many folks investing in geographies that have, you know, pretty meh Internet connections, pretty meh data center capacity today. And a lot of that is because, hey, we don't have capacity in the U.S., I'm trying to get everything I can. You wouldn't see Elon do what he's doing if it was possible to do it in the normal way. What he's doing is more expensive and more complicated, because that's the only way he can do it as fast as he can. Right, right. And so these...
A
You go ahead. Yeah.
B
The grid connection, the cost to transport power on the grid, right? Actually, in the Northeast, around Pennsylvania and Ohio and such, the cost of sending power around on the grid is soaring. In some places the contract rates for new contracts are above the cost of actually generating the power. So our grid is dilapidated and weak, and people are trying to build as fast as they can. You see folks like Elon building power on site at his data center. You see folks like Meta in Louisiana signing a contract for two massive 750-megawatt natural gas plants right next door to their data center. You see these plans because, hey, the grid can't handle it. We need to think about not just the data center and the GPUs. We have to think about the data center, the substation, the grid transportation, and the generation all at once, holistically. Because otherwise a project of this scale just will not happen.
A
Right. So let's dig into this a little bit. People started talking about the gigawatt data center requirement, right? Now, to put that into context, a gigawatt is like the output of a big nuclear power plant or a really big CCGT power plant. When do we actually see gigawatt data centers coming around?
B
So there's a difference in how you can define it. In one way, you could say, hey, Google in Iowa already has a gigawatt data center. Now why is that? Because they have three data centers within 20 miles of each other, and then another one that's 40 miles away in Nebraska, and that's a gigawatt once you add it all together. So that's next year, right? In other cases, you look at, hey, this OpenAI deal in Texas with Oracle, that's a gigawatt by the end of 2026. Same with Microsoft in Memphis, or, sorry, in Wisconsin. And then Elon in Memphis and Meta in places like Louisiana. These are all gigawatt data centers in 2026. So 2026 is the year of the gigawatt data center. But in the meantime, people are still scaling, right? And they're doing it whether it be through multiple data centers in the same location or nearby, or even their plans include spending billions and billions of dollars on fiber to connect data centers that are further away. So you see Microsoft having signed big deals, you see Meta having signed big deals. Because, hey, we can't get the data center in one place, let me connect them up, right? And it's sort of a band-aid, but the gigawatt data center, if you view it through that lens, is already here next year. But for an actual single site, it's 2026.
A
2026. Now, one of the things I was playing around with, as people talk about scale... I think it may have been Sam who talked about a 5 gigawatt data center at some point. It was one of the frontier labs. And I was playing around with the fact that the US actually only has one 5 gigawatt power station, which is the Grand Coulee hydropower setup over in Washington state. And then the second biggest is Vogtle in Georgia, which is a nuclear power station, but it's below 5 gigawatts. But China, on the other hand, has lots of really big power stations. They've proven they can build 7, 8, 10, 12, 6, 15 gigawatts in nuclear, wind, hydro, and they can build them really, really quickly. I just wonder, in this race condition of needing to build bigger and bigger data centers in order to build bigger, better models, does China's proven ability to build a 4 gigawatt power station in a shorter period of time than the US perhaps change the nature of that competition?
B
Yeah, so the interesting thing is, what does the US have a disadvantage in? It's building physical infrastructure. We can't build the data centers, we can't build the power substations, we can't build the power generation fast enough. On the flip side, China, what do they have an issue with? Well, they certainly don't have an issue with power. They added an entire U.S. grid's worth of power in like seven, eight years. Right. It's a snap of the finger. We talk about how they don't have.
A
A grid problem either.
B
Yeah. And they don't have a grid problem either. Right. And it's like, hey, do they have a data center problem? No, they literally have multiple factories, aluminum mills, et cetera, with multi-gigawatt substations right there. So it's not a question of power. The problem they have is getting the chips. And then there's sort of the odds of both of these, right? Hey, we have what they don't, and they have what we don't. And the question is, which one is solvable faster? For the U.S., if we play our cards right with regards to regulation, with regards to accelerating spending, with regards to unlocking the permitting and approval processes, then the US could do it faster. This is not the case for China. The semiconductor supply chain is just much more complicated. And restrictions are hampering them to some degree, although maybe not as much as people in Washington think they are, but they are hampering them to some degree.
A
They are, to some degree, but I was really interested to see what had happened with DeepSeek, which is a Chinese AI model company that came out of a quantitative hedge fund. They brought out their reasoning model, which was actually not a bad model. And these are much, much smaller, more power-efficient models. Now the founder of DeepSeek does say, listen, we are compute constrained, but they are being forced to be a bit innovative. So it seems to me that it's not a certainty that you couldn't make quite a lot of progress without the brute forcing that is the strategy of the US frontier labs.
B
So I think there's maybe a bit of confusion here, right? A lot of folks think DeepSeek is behind, and they're not. Their models are better than any model that xAI has released. Their models are better than any model that Amazon has released, or Microsoft, and even Meta at this point. They're better than Llama. And so the question is, how are they able to do that? Well, part of it is they have an insane amount of talent. And the other aspect is their compute constraints are not as bad as people think they are. Yes, they're in China. Yes, compute is limited. It's hard to import. Guess what? Smuggling GPUs is a thing. There are China versions of GPUs. And then you can rent GPUs externally, outside of China. And between those three, what a DeepSeek employee has told me is that they have roughly 50,000 Hopper GPUs right now. This is not all in one location, right? For reference, OpenAI has many, many more Hopper GPUs than that, but their largest individual site for Hopper GPUs is in Arizona, near the airport. It's 70,000 GPUs growing to 100,000. So when you think about, on the scale of just training compute, is OpenAI that far ahead? Not yet, right? Now, next year, the plans are scaling really rapidly. Elon's going to 200,000 GPUs by the end of this year. And there's OpenAI and Anthropic and such. Anthropic just announced a 400,000 Trainium cluster for their next-generation training supercomputer. So when you look across these companies, yes, there is a lot more growth, but at least today there is not as much of a gap on compute, because, hey, 50,000 GPUs is not an absurd amount of capital either. It's on the scale of single-digit billions, and not even 5 billion. I'm talking like 2 billion dollars for 50,000 GPUs.
And that's to buy them, right? If you were to rent them, it'd be even less. And there's good reason to believe they're renting a chunk of their GPUs. So when you think about all of these factors, it's like, well, this is a scale of capital that's pretty large. But for China, capital investment's not a challenge either, right? Certainly for DeepSeek it is, today at least. So the question is, can the US labs outrace the Chinese labs? That's the real big question that's out there, because 50,000 GPUs, 100,000 GPUs, this is sub $10 billion. This is achievable by many players. Once it scales to $50 billion clusters, that's when it becomes, oh well, this isn't really achievable. This is such a large number of chips that restrictions do hamper it.
A
No, they absolutely do. But if you can't get power to that $50 billion cluster, right? If you can't get power to your million Blackwells, and you can only power 87,000 of them while you wait for a grid connection or permitting, it doesn't really matter that you've gone off and bought a million of them. And that is the one dynamic that I'm curious about. I'm curious about whether, given gray market supplies, given the ability to rent, given the ability to perhaps run big clusters of older chips, there is a path in this race where, of course, DeepSeek is ahead of xAI but not ahead of OpenAI and Anthropic. A path where companies outside of the US continue to keep within a few months of the US frontier.
B
I completely agree that there's a path. And for DeepSeek, over the next six months, before the Blackwell ramps really get into full swing, I see no way for the US labs to really leap that far ahead. I see them improving incrementally, getting better, but now that the full o1 is launched, I don't expect too many big jumps much larger than o1 within the next six months, until people really get Blackwell-scale training clusters, which is next year, right? And up until then, DeepSeek is also not at a tremendous compute disadvantage. In fact, they may even be able to catch up even more with the 50,000 that they have. And so this is the question. This is an issue of the government slowing down progress. Can something be done? No one's asking the government for money. They're asking for them to make the process faster.
A
Faster. Absolutely, absolutely. So now let's go up the stack from the chips and talk a little bit about these models and how well they work. We are recording this one hour and 20 minutes after I laid down 200 bucks to get o1 Pro, which is the pro version of the reasoning model from ChatGPT. Why not? I've been quite impressed with it. What is the model that you use? What do you use most regularly in your work?
B
So we use a lot of Claude, actually. But today I also just spent 200 bucks, not just for myself but for a number of the people. You know, there's 17 people at my firm, and guess what? Our spend on AI is skyrocketing. We use a lot of Gemini as well, because the long context of Gemini is really helpful for certain applications. But between those three, I probably will continue to use Anthropic for single, one-shot type applications, right? Non-reasoning. Because Anthropic's model is the best non-reasoning model. And then the reasoning models are very expensive and they take a long time to run. So there is a use case, but the fact that they take a long time to run hampers workflow. So I will use them as well. And then Gemini, because they have such a long context length and you can input video and audio into them, that multimodality aspect of it is really helpful. So between those three, I actually use multiple models. But definitely Anthropic and OpenAI are the workhorses that I use.
A
Yeah, there's a convergence; that is very, very similar to us. So the new Claude 3.5 Sonnet, Gemini multimodal for the longer things, the things that involve video, and then o1 when you need to ask a different class of questions where a bit of reasoning is required. That is the portfolio, and actually we have to do that switch in our head. I have a favorite Claude mobile app feature, which is the simple voice mode. What you can do is give it a stream of consciousness, like James Joyce Ulysses-style blurry thoughts, and just tell it to organize them for you, which I find extremely useful when I wake up in the morning or if I'm in the car and I just need to offload some thesis I'm playing around with. I will kind of hit that record button and it does a pretty good job. It's not a transcript; it tries to make sense of what you've said, and it's quite a nice interface unlock. So Dylan, if you haven't tried that, I do recommend it.
B
Yeah, I didn't actually know that Anthropic had enabled a voice mode, because I tend to use Anthropic on the computer, and when I'm on the go I actually use OpenAI's advanced voice mode because it's so smooth and conversation friendly. Right. Like if you're driving or what have you. So that's interesting. I will have to try that out. That's a good tip.
A
So that brings me to this sense of what we think might happen with the way in which apps get developed. Everyone from Sam Altman to Satya to any number of VCs and startup founders has said it's agents in 2025. I think OpenAI has Operator as one of their code names for this. And the idea behind the agent is you can get these different systems to break down tasks and work between each other to give you a much more complex output. And the outcome of that is a vast increase in the number of tokens that get fired around, because every agent is going off and doing its own piece. Do you think that agent vision is something deeper than marketingware and slideware? Do you think it's a real architectural shift in the way we use these systems?
B
Yes. I think people hyped up agents a lot. They just kept hyping them up until OpenAI decided to release their sort of five levels. And the thing that was really interesting for a lot of folks is that reasoning came before agents, right? Because people thought it was language models, then agents, and there's going to be this huge boom. Now, what reasoning is all about, and this is something that OpenAI released today in their evals. Some of the evals that they showed were, hey, if you ask the same question five times, you get five different responses. Just because of the way the models work, temperature, the non-deterministic aspect of GPUs, you get five different responses. And of those five different responses, often one of them, maybe two of them, will be right, and many of them will be wrong. And so a lot of people game benchmarks by just running it again and again until they get the right answer. So today, with the o1 Pro launch, they showed, hey, let's show the benchmark if we run this many times and we only count it as correct if every single time it responded it was accurate. Right?
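The difference between the two ways of scoring described here can be made concrete with a minimal Python sketch. The attempt data and the names `any_of_k`, `all_of_k` and `score` are illustrative assumptions, not OpenAI's evaluation code or published numbers.

```python
from typing import Callable, Sequence

def any_of_k(runs: Sequence[bool]) -> bool:
    """Lenient rule: a question counts if at least one attempt was right."""
    return any(runs)

def all_of_k(runs: Sequence[bool]) -> bool:
    """Strict rule: a question counts only if every attempt was right."""
    return len(runs) > 0 and all(runs)

def score(results: Sequence[Sequence[bool]],
          rule: Callable[[Sequence[bool]], bool]) -> float:
    """Fraction of questions that pass under the given rule."""
    return sum(rule(runs) for runs in results) / len(results)

# Hypothetical data: 4 questions, 5 attempts each (True = correct answer).
results = [
    [True, True, True, True, True],       # always right
    [True, False, True, True, False],     # right most of the time
    [False, False, True, False, False],   # right once
    [False, False, False, False, False],  # never right
]

print(score(results, any_of_k))  # 0.75
print(score(results, all_of_k))  # 0.25
```

On the same set of attempts the lenient rule reports 75% and the strict rule 25%, which is why rerunning until something sticks flatters a benchmark and the every-time rule measures reliability instead.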
A
Right.
B
And that was a real game changer to me, right? That they're focusing on the reliability. And the biggest issue with agents is, if you're not reliable, then you can't string together tasks. Agents require you to string together tasks. And if your accuracy is 99%, good luck, because 10 tasks back to back, that's 99 to the 10th, right. This is very bad. You're under 80%, right? Like you're under 80% hit rate. It's like, who wants a human that hits 80%? No, like get the work done. Right. Scaling intelligence is about hitting that reliability metric on basic tasks, and that's through reasoning. And once you hit that high reliability, then you can have agents that launch work constantly. So I think agents are the holy grail, but people need to refactor and refocus. Hey, this thing puts something out that looks really smart and I only have to edit it a little bit: fantastic productivity boost, by the way. Right. I'll have to edit it a little bit. But I can't dream of a world where robots or AI are doing a bunch of stuff for me without consistently doing simple things 100% of the time. And then it can move to doing complex things 100% of the time. Because right now it's simple things 90% of the time and complex things 40, 50% of the time. And string those together, you're done, you're cooked. It's not going to work. So computer use, robotics, all this agent stuff, this is not possible until we bring the reliability up. And reasoning is really pervasive.
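The compounding behind this point is easy to check with a minimal Python sketch. It assumes every step in the chain succeeds independently with the same probability, which is a simplification, and the function names are illustrative. Run exactly, ten steps at 99% each come out near 90% rather than under 80%; the chain drops below 80% after 23 steps, and at the 90% per-step rate mentioned for simple tasks it collapses far sooner, so the direction of the argument holds.

```python
import math

def chain_success(per_step: float, steps: int) -> float:
    """Probability that `steps` independent tasks in a row all succeed."""
    return per_step ** steps

def max_chain_length(per_step: float, target: float) -> int:
    """Longest chain whose end-to-end success rate stays at or above `target`."""
    if not (0.0 < per_step < 1.0 and 0.0 < target < 1.0):
        raise ValueError("per_step and target must be strictly between 0 and 1")
    return math.floor(math.log(target) / math.log(per_step))

for p in (0.99, 0.90):
    print(f"{p:.0%} per step, 10 steps: {chain_success(p, 10):.1%}")
# 99% per step, 10 steps: 90.4%
# 90% per step, 10 steps: 34.9%

print(max_chain_length(0.99, 0.80))  # 22
```

Read the last line as: at 99% per step, an agent can string together 22 tasks before its end-to-end hit rate falls under 80%.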
A
Well, yes. I mean, I use a few workflows for research that are a little bit agentic. So I may put in a question, and I farm the question out to three different LLMs that have different characteristics, different prompts. They bring their answers back. A fourth LLM acts as a judge, a convener: it synthesizes the answers, evaluates them, and if it's not good enough, sends it back through the loop. And that process is actually really, really nice. It probably only works one time in six, but when it works, it's really good. And so it doesn't matter, because I'm not running it in an automated fashion. I just kind of click, did it work? I check 30 seconds later. If it did work, I'm quids in. Right. I'm up. If it didn't work, I just go and use a different approach. So I do hear your point about reliability. Okay, let's turn to predictions for 2025. You are sort of in the forecast business, but I'm going to push you for predictions. So let's go for three predictions, and for at least one I would like you to say, this is my most contrarian prediction out of them for 2025 in this space.
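The fan-out-and-judge workflow described in this turn can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: `research_loop`, the stub workers and `make_stub_judge` are illustrative names, and the stubs return canned strings instead of calling any real model API.

```python
from typing import Callable, List, Optional, Tuple

Worker = Callable[[str], str]                         # prompt -> draft answer
Judge = Callable[[str, List[str]], Tuple[str, bool]]  # -> (synthesis, good enough?)

def research_loop(question: str, workers: List[Worker], judge: Judge,
                  max_rounds: int = 3) -> Optional[str]:
    """Fan a question out to several workers, let a judge synthesize the drafts,
    and retry with the rejected synthesis as feedback until accepted."""
    prompt = question
    for _ in range(max_rounds):
        drafts = [worker(prompt) for worker in workers]
        synthesis, good_enough = judge(question, drafts)
        if good_enough:
            return synthesis
        prompt = f"{question}\n\nPrevious attempt was rejected:\n{synthesis}"
    return None  # no accepted answer; a human falls back to another approach

# Stub workers standing in for three differently prompted LLMs (assumption).
workers = [
    (lambda p, tag=tag: f"[{tag}] notes on: {p.splitlines()[0]}")
    for tag in ("skeptic", "optimist", "historian")
]

def make_stub_judge(accept_on_round: int) -> Judge:
    """Stub judge that joins the drafts and accepts on a chosen round."""
    calls = {"n": 0}
    def judge(question: str, drafts: List[str]) -> Tuple[str, bool]:
        calls["n"] += 1
        return " | ".join(drafts), calls["n"] >= accept_on_round
    return judge

answer = research_loop("How much power will AI data centers need?",
                       workers, make_stub_judge(accept_on_round=2))
print(answer)
```

Returning `None` after `max_rounds` mirrors the manual check in the description: when the loop does not converge, nothing is automated downstream and the person simply tries something else.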
B
Contrarian predictions. I mean, I think it's mostly just that I believe the exponentials will continue to stack, right? There's a lot of worry and fear and difficulty with people imagining that exponentials continue to stack, because the scale of investment, the raw figures, are frightening. Are people really going to go from investing 150, 200 billion dollars in data centers to 400, 500 billion? Yes, they will. That's a big question on people's minds. Is it 400, 500, or is it 300? And no, it's 400, 500 billion. This is what the exponential requires of the rollout. I think the other aspect is that we will start to see real software companies with big impacts from AI. People talked about Zuck's year of efficiency, what Elon did with Twitter. This is going to happen in a lot of software, because the one area where the value of AI is indisputable is coding, right? And the cost of coding is tanking faster than anyone can imagine. And I think that this is going to lead to a number of software companies instituting that level of cuts. It's not just like, hey, we're going to cut 10, 20%. I think some software companies are going to be able to cut more than that. I think we're going to start seeing some level of automation. And the risk was always thought to be, hey, it's the lowest people of society that are going to get automated and thrown into poverty. It's like, no, no, no. This is some of the top income earners who are being thrown off the rails, and their earnings are not growing nearly as fast. This is a very scary prospect, for angst and such. And then I think another aspect is we'll see the largest private round of a company ever, right?
We still haven't seen that yet, but I think we'll see multiple rounds north of $10 billion. Anthropic, OpenAI, xAI were all sub 10 billion; CoreWeave, sub 10 billion. They were all teetering in the mid billions, a little bit higher, a little bit lower. I think we'll see rounds that are north of 10 billion from multiple firms, not just one, because the progress is going to be so powerful, and the clarity of, hey, what's going to happen in the future is going to be so strong, that you'd be silly as a large multinational investment firm or as a sovereign wealth fund to not deploy your capital on this. And by the way, these guys have not deployed much capital yet into AI, right? You talking about the big sovereign wealth funds or the. As a percentage of AUM especially, they just haven't pushed much into AI. It's been quite small. These people are.
A
You've been a great sport there with three predictions, right? The $400 or $500 billion capex; the collapse in the cost of software, where we could see really significant headcount reductions; and greater than $10 billion private company rounds showing up. Let's evaluate your predictions now. Do you think you are more likely to be surprised to the upside, in other words, you were a bit too conservative with your predictions? Or are you more likely to be surprised to the downside, in that you were too optimistic and aggressive with your predictions?
B
So I think on the data center spend side it can't be any higher. Chip supply chains take time to build. Data center supply chains take time to build. Sorry, it just can't be higher. I think it would be higher if we could build it faster, but we can't. So it could be more, but actually it's capped here. If there's upside, or if there's a little bit of downside, we still basically hit that number. And so I think that's sort of the upper limit, right? The upper bound in terms of capex on GPUs, data centers, ASICs, storage, networking, et cetera. On the side of automation, I think I'm still hedging my risk, in that, yeah, we'll see 20, 30% layoffs, right? Maybe we see bigger, with o1 Pro or another leap in reasoning models or a humongous cost decrease of reasoning models. We could see more. I don't know yet, but I'm only still pinning in 20, 30% layoffs for some companies. And it's not across the ecosystem, just some. And then the last one, I think, is where there's a lot of upside and downside, right? If models don't continue to progress, people can't keep raising money. And OpenAI sort of cemented that they can continue to raise money, because they're doing the 12 days of Christmas. We're starting today with o1 Pro, and we'll see what they do on the rest of the days because, oh, my.
A
Oh, it's going to be nuts. Yeah, yeah, Absolutely.
B
Yeah. But regardless, the challenge is, hey, Anthropic's round was only Amazon. They tried to raise money from other people, to be clear, but they just couldn't get it. So they really need to put the pedal to the metal and actually deliver big increases and improvements. Otherwise, where is the money train going to stop? Same with xAI, right? They've raised multiple rounds, but they still have nothing to show for it. They've got the most badass data center in the world. Possibly also the most polluting, but the most badass data center in the world. What are they going to deliver with it? So, you know, things can happen, they can fail, they can mess up. So that's the other aspect of this.
A
It's definitely going to be a fun year to watch, I think. 2025. Dylan, thank you so much for your time.
B
Alrighty. Thank you so much as well.
Azeem Azhar’s Exponential View Episode: AI in 2025 – Infrastructure, Investment & Bottlenecks (with Dylan Patel) Date: December 23, 2024
In this special holiday episode, Azeem Azhar sits down with Dylan Patel—semiconductor and data center infrastructure expert—to examine what surprises AI might bring in 2025. Their energetic discussion digs into the economics of AI scale-up, who makes money in the current model ecosystem, the severe infrastructural bottlenecks in compute and power, and predictions for the AI arms race between the U.S. and China. Patel draws on granular data and industry experience to demystify market dynamics and trace the physical limits behind AI’s exponential progress.
The exchange is lively, data-driven, and grounded—Patel pairs skeptical realism with practical optimism, while Azeem steers with incisive, contextualizing questions. Both avoid hype, trading in on-the-ground industry intelligence and using candid, sometimes dryly humorous, language.
This episode is an insightful, jargon-savvy map to the real constraints, risks, and structural shifts facing AI in 2025. Whether you’re an investor, founder, technologist, or policymaker, the conversation pulls back the curtain on what “exponential AI” means in dollars, data centers, and jobs—not just headlines.