wavePod

Dylan Patel: NVIDIA's New Moat & Why China is "Semiconductor Pilled” - The MAD Podcast with Matt Turck | Wave AI Podcast Notes

Back to The MAD Podcast with Matt Turck

Dylan Patel: NVIDIA's New Moat & Why China is "Semiconductor Pilled”

The MAD Podcast with Matt Turck

Thu Feb 05 2026

Summary

The MAD Podcast with Matt Turck

Episode Summary: Dylan Patel – NVIDIA’s New Moat & Why China is “Semiconductor Pilled”

Date: February 5, 2026
Guest: Dylan Patel (Semianalysis)
Host: Matt Turck

Overview

This episode dives deep into NVIDIA’s evolving hardware and software strategy, focusing on its acquisition of Groq, the changing landscape of AI chips, and the company’s efforts to maintain dominance amid swelling competition and geopolitical strife—particularly with China’s “semiconductor-pilled” approach. Technological innovation, power grid strain, CapEx realities, and the cultural phenomenon of semiconductors in China are all covered, with memorable asides about the AI-fueled lifestyles and work culture of the episode’s protagonists.

Key Discussion Points & Insights

1. NVIDIA's Acquisition of Groq and Shifting AI Hardware Landscape

Why Groq?
- NVIDIA's dominance comes from wide surface area betting on hardware types. Now, with models and workloads fragmenting—decode-heavy, memory bandwidth-sensitive, and parallelism-focused tasks—they see the need for specialization (01:33–05:54).
- Quote: “Acquiring Groq is how you get those resources to make more solutions for different parts of the market to stay king.” – Dylan Patel (00:00)
AI Workload Fragmentation:
- Workloads now include highly parallel streams (multiple chains of thoughts in LLMs), decode-optimized, and context-heavy use cases.
- NVIDIA’s diversified portfolio: General purpose GPUs, CPX (context processing/KV cache), and now Groq (blazingly fast decode).

2. Competitive and Regulatory Dimensions

Anti-competitive Concerns:
- “I certainly think it’s not good from an anti-competitive sense... but with startups, it’s OK. These license-structured deals also avoid regulatory limbo that kills startups’ momentum.” (06:04–07:00)
Barriers to Entry:
- Specialized chip startups face enormous hurdles; only a few players have built full-stack AI hardware-software solutions.
- Startups must “try something weird or different” to compete.

3. The CUDA Moat & Evolving Software Ecosystem

CUDA’s Waning Dominance:
- “The reason why CUDA is so important…you can do whatever you need to do...But most AI chips will not be consumed by people programming anything for it…they will download an open-source inference engine..." (10:17)
- The moat is shifting from deep CUDA expertise to fast, broad compatibility with frameworks (e.g., VLLM, SGLang).
Open Source as Leveler:
- Frameworks are catching up: AMD, TPUs, Amazon Trainium are being integrated, reducing NVIDIA’s software advantage.
- Real moat now is rapid, seamless hardware-software integration and advanced inference features (e.g., KV cache management).

4. Chip Startups and Specialization Game

Only the Paranoid Survive:
- Jensen Huang embodies an “Andy Grove” mindset—constantly defending against threats and expanding NVIDIA’s reach.
- Multiple specializations now feasible: “You’re never going to beat NVIDIA at their own game…Everyone else has to try something weird or different.” (18:05)
Long-term Market Share Prospects:
- AMD emerges as credible runner-up ("single-digit percentage market share").
- Startups like Etched, Maddox, Positron betting on unique architectures, but succeed rate pegged at "<1%".

5. Geopolitics & China’s Semiconductor Crusade

China’s Industrial Policy:
- "The entire country is semiconductor-pilled. There are dramas where people fall in love in the fab...It's like super cool for your significant other to be a semiconductor engineer." (23:12–26:38)
Import Substitution and Fragmentation:
- China’s local governments are driving domestic chip adoption, sometimes surpassing the central policy.
- Massive specialization at the city/province level (“There’s a city for everything—lampshades, camera arms, even semiconductors.” 28:05)
Global Supply Chain Interdependence:
- “If you were to cut off every country and say there’s no more globalism, China has the most vertical stack in semiconductors today.” (30:39)
- Yet, leading-edge tech, chemicals, and lithography remain out of reach; China is about “10 years behind but catching up.”
Huawei’s Threat:
- “Huawei is the most vertical company in the world. No company is more verticalized than Huawei, which then leads to huge innovations…Of course, they're terrifying.” (35:40)
US Policy Response:
- American onshoring via the CHIPS Act is significant but dwarfed by global (especially Asian) capex needs.
- “How is $50 billion of subsidies going to change America’s needle? It does move it a little bit…But semiconductors need a lot bigger package.” (41:25–43:25)
- Legislation only passed when the car industry felt the pain from chip shortages.

6. The Power Grid, CapEx Bubble, and Industrial Impact

Energy Realities:
- Data centers’ electrical consumption is skyrocketing, pressing US grid capacity (10% by late 2020s).
- "US has not built power in 50 years…New grid expansions are slow and labor-limited." (55:05–57:06)
- Water consumption by data centers is minor; “all of Elon Musk's Colossus data center uses as much water as two and a half In-N-Outs.” (57:58)
Capex Bubble?
- Anand is bullish—demand is largely real as AI use surges, though timing (adoption vs. supply buildout) is the main risk.
- “Model progress is very clear—the moment that stops happening...if we hit a wall, then it’s cooked.” (51:00–51:33)
AI's Economic Transformation:
- “AI is under-earning the value that it's producing in the world. By a significant margin already today.” (51:49–52:10)

7. Models, Software & Work Reimagined

Workflows Are Changing:
- Non-coders harnessing tools like Claude Code to automate complex knowledge work.
- “Why would I hire an L4 engineer? I just tell Claude to do it...Low-level knowledge work just doesn’t matter.” (68:29–71:46)
Model Innovation & “Tokenomics”:
- Patel’s team tracks AI progress via alternative analytics (e.g., “tokenomics”—tracing actual token/data usage).
- “Even if you don't code, you've never had any training…You can code.” (68:29)
Competitive Model Landscape:
- OpenAI, Anthropic, Google all racing, with RL and pre-training breakthroughs.
- New models unlock moments of productivity—“Claude Code is a new moment where the way you work has forever changed.” (71:46)

8. Cultural & Lifestyle Tangents:

San Francisco AI Housemates:
- Stories of productivity and obsession—building a full RTS game in a week using only Claude, no typing.
- Roommate lore: “6'4", Olympian level fencer, perfect specimen”—on Sholto, their AI housemate (73:52).
Chinese Romance Fabs:
- Romance dramas set in fabs and photonics labs—evidence of China's cultural embrace of the industry.
US vs. China Societal Trends:
- “In China, it’s cool to date a semiconductor engineer; here, it’s all about influencers.”

Notable Quotes & Moments

"This is the biggest change in human history, maybe ever...What’s about to happen with AI?” – Dylan Patel (00:00)
“Huawei is terrifying, right?…No company is more verticalized than Huawei, which then leads to huge innovations.” – Dylan Patel (35:40)
“The entire country is semiconductor-pilled. There are dramas where people fall in love in the fab...romance comedies set in semiconductor factories.” – Dylan Patel (23:12, 26:38)
“Why would I hire an L4 engineer? I just tell Claude to do it.” – Dylan Patel (68:29)
“All of Elon Musk’s Colossus data center uses as much water as two and a half In-N-Outs.” – Dylan Patel (57:58)
“Only the paranoid survive is core to the Bay Area, core to NVIDIA. Jensen is very paranoid about losing.” – Dylan Patel (07:09)
“AMD will be caught up at times and very behind at other times…single digit percentage market share is still pretty good.” – Dylan Patel (17:09)
On the CHIPS Act: “I don’t understand why like EVs or solar was given this massive, massive trillion dollar package...semiconductors only given $50 [billion].” (43:25)

Important Timestamps

00:00 – 05:54: Deep dive on why NVIDIA bought Groq; new chip specialization trends.
10:17 – 15:00: CUDA’s changing moat, open source software breaking down NVIDIA’s traditional lock-in.
18:05 – 22:50: Viability of chip startups, AMD, and specialization strategies.
23:12 – 26:38: China’s semiconductor industrial strategy and cultural embrace.
30:39 – 36:20: China’s supply chain, Huawei’s threat, and what China still lacks.
41:25 – 43:25: US CHIPS Act and the true scale of global semiconductor investment.
55:05 – 61:01: Data center energy demands, power grid strain, and debunking water “crisis.”
68:29 – 73:47: AI transforming non-coding work, Claude Code productivity, and the coming revolution in knowledge work.

Tone & Style

The conversation is rapid-fire, deeply technical, informal, and sprinkled with asides about Silicon Valley life, cultural quirks (AI researchers playing LAN Age of Empires 2), and meme-worthy takes on hardware, geopolitics, and work. Patel is especially animated, blending sharp skepticism with bullishness on AI’s disruptive power, often with a mix of irreverence and granular detail.

Conclusion

Dylan Patel and Matt Turck deliver a whirlwind tour of AI’s hardware and software frontiers, NVIDIA’s competitive calculus, and the global economic, cultural, and political impacts at play. As AI chips become both battleground and backdrop for global hegemony, this episode argues—both seriously and with wit—that we’ve only begun to witness the seismic shifts that AI and its supporting silicon will generate.

Created for listeners seeking depth on the intersection of AI, hardware, and global tech politics, capturing the nuance, color, and hard truths of this pivotal moment in the MAD (Machine Learning, AI, Data) world.

Loading summary...

Transcript

A (0:00)

This is the biggest change in human history, maybe ever. What's about to happen with AI? This is the biggest revolution, bigger than industrial revolution. Jensen is very paranoid about losing. If he just kept making his mainline chip, people crush him on cost and performance. Acquiring Grok is how you get those resources to make more solutions for different parts of the market to stay king. At the end of the day, this is an economic war. If the US and the west win in AI, China will not rise to be the global hegemony. But without AI, China definitely will rise. They're just going to outrun America.

B (0:29)

Hi, I'm Matt Turk. Welcome Back to the Mad PodC. Today I'm joined by the one person Wall street and Silicon Valley turn to when they need to cut through the hardware hype. Dylan Patel of Semianalysis. We dove into many of the most important topics of Nvidia's massive move to acquire Grok, the truth about the Capex bubble, whether the US power grid can actually handle the AI boom, and the geopolitical chess match playing out between the US and China. But I have to warn you, this conversation went off the rails in the best possible way. And we ended up going into all sorts of fun tangents, like the strange phenomenon of Chinese romance drama set inside semiconductor factories and what it's really like when three AI famous roommates live together in sf. Please enjoy this fantastic conversation with Dylan. Hey Dylan, welcome.

A (1:15)

Hello. How are you?

B (1:16)

I'm great. I'd love to start with GROK and Nvidia since it's still fresh. So not so long ago Nvidia was saying that one GPU could do it all, and now they're doing this acquisition slash non exclusive deal with grok. What does that mean? Mean from your perspective, it's very clear.

A (1:33)

We'Re not sure where AI models are headed in terms of, you know, over the next few years what happens to the architecture. But you know, the thing that I think everyone has sort of like agreed on is models are pretty auto regressive, right? Next token generation is like the thing, but beyond that, right? Attention mechanisms change the how, how it works. Everything changes, right? Could, could change. And so what's interesting is the reason Nvidia won is because they just took like the widest surface area bet and then people kept developing models on that and that kind of shape worked. But now the workload is so there is room for specialization that will give you 10x increases in certain domains, right? In a general purpose workload doesn't work right. You know, it can't train it can't, you know, it can't inference really, really large models cost efficiently, right? You can't serve many, many, many users. But what it can do is it can go block screamingly fast, right? Same with the Cerebras OpenAI deal. But that's like one workload, right? Very decode focused, right? Gener doing auto regressive tokens in a, in a single stream super fast. Another direction AI models could head, right? We don't know. Are models going to think in one token stream or is it actually they're constantly context switching, right? And they're going from. They have this humongous, humongous context and they're generating in multiple parallel streams, right? And so Google and OpenAI have both released mechanisms of this with their pro models where the model actually doesn't just have one single chain of thought for reasoning, it has multiple, right. And then I don't know exactly like, you know, and, and, and how they choose which one and what the final answer to you delivers is is an area of research. But there, there is room for that kind of chip, right? Something that works on very parallel, lot of streams of chain of thought. And maybe the latency requirements are not as crazy, right? Maybe you don't want to go blindingly fast, right? Maybe you're okay with it being, you know, because I can spin up 100 parallel, you know, streams of thought or agents or whatever you want to call them. Maybe I care a lot about cost there. And because it's 100 in parallel instead of one going super, super fast, it's not as deep, right? The tree search or the depth of the inference is not as deep, but it is much wider. You know, there's other parts of inference. Hey, process creating the KV Ca Nvidia has a chip for that, right? That's the cpx. So they've made the cpx, they bought Grok for Decode and then they still have their general purpose gpu. So they're kind of trying to cover their bases because unlike the first wave of AI chip companies where they sort of just made chips and then tried to figure out where it would work, right? They had a thesis, Groq and Cerebras both as well as Samba Nova, right? Which was put a lot of memory on the chip and not necessarily in the case of Cerebras and Groq, no memory off chip. And in the case of Samba Nova, less memory off chip or slower memory off chip with higher capacity. You know, they sort of all made similar bets in that direction. And it didn't work for a while until it kind of did. Right. Because there's a workload that now necessitates it. Nvidia recognizes they're, they're the leader, they're at the tent pole. Hey, in one respect they can just run faster than everyone. But it's kind of hard to be 2x better than Google or, or OpenAI or whoever else's internal chip, right. To justify their, you know, 75% plus margins. Right. And then they have to be 2x to 4x better to justify 4x, better to justify the margins because that's what they're charging above cogs. You know, the question is what, what architecture will deliver that? Well, yes, keep the programmability of their GPUs is great for training and for a lot of workloads. But you know, guess what I think, I think a lot of people will just be downloading an open source model, downloading an inference framework and pressing go, right. A little bit more complicated than that. But that's, that's going to be the consumption method for a lot of enterprises. A lot of startups, a lot of tech companies is they're just going to do that or they're going to rent the GPUs or rent the chips and then download an open source framework and model and go right. And Nvidia recognizes this. And hey, there is room for products that aren't general purpose, right. The general purpose GPU will still probably be the main line for training and for a lot of inference and for cost efficient inference, but maybe blindingly fast or workloads that have a ton of pre fill that is creating the, the KV cache, maybe that those workloads could be different chips, right. And the CPX chip they announced, right. They say it's for the context, processing, creating KV cache. It's also really useful for video models because video models don't care about memory bandwidth. And so you know, why pay for the expensive memory that the general purpose chip has? Or why do what GROK is doing, which is tying hundreds or thousands of chips together and not having memory, but keeping the entire model on chip. The trade off for that of course is you need thousands of chips and you have less compute per chip. And so like Nvidia's trying to capture the whole surface area because again, you don't know where models are headed and it's hard to say where the research is headed.

A (7:09)

I think the thing about Nvidia is they take the Andy Grove mentality like more serious than anyone else, right? Like okay fine, Google like implemented okrs because intel did it. But that's like you know, management stuff, right? Only the paranoid survive, right? This is like core to the Bay Area, core to Nvidia. Jensen is very paranoid about losing, right? These specializations, if you just kept making his mainline chip would mean people could, you know, point point solutions for specific parts of the market would crush him on cost and performance and then he can't justify his margin. That's a threat to Nvidia's business model as a whole. Especially if the best model only changes every three months or the model you want to roll out. Okay, well then you can have three months to figure out how to make a model work on one chip architecture for that point solution and you know, it's fine software. Software advantage of Nvidia is not that important. Then Jensen's super paranoid about losing and frankly it's really hard to hire enough talented chip people. When you look across the market there is only a few companies who have successfully created a chip architecture software to run the models accurately run the run the models accurately, right? Like because you can look at random APIs of say an Alibaba Quinn model and different people are doing all sorts of tricks like quantizing it, but also many other tricks which then end up like making the model quality lower, you know, building a rack scale solution, networking thousands of chips together and then deploying an API. And GROK did the whole thing with frankly not that many people. So now it's like, okay, well I'm Nvidia, I want to make four different chip architectures and actually four different point solutions, maybe the general purpose and then one here, one here, one here. And in addition, my general purpose thing is actually not just like a GPU chip, it's like GPU chips, CPU chips, networking chips, NV switch, Nix. Like you know, there's many, many chips and each of those chips has many chiplets. You don't have enough engineering resources, right? And so like acquiring Grok is like how you get those resources to make more solutions for different parts of the market. And as far as like are they threatened? Like I think, I think like obviously there's some cool startups out there, right that are raising a lot, right currently or have raised such as Etch, MadX, Positron, these new age of AI companies. There's also the prior age of like Cerebras is, is out there still, right? You know, 10, et cetera. And there's. So there's a lot of AI chip companies on the startup side. But then there's also, you know, Google's TPU, AMD, GPUs, Amazon, Trainium, who are all really credible competitors. And then you know, Met as MTIA is somewhat credible and then you know, Microsoft's Maya is not credible. But like, you know, maybe it will be one day, right? So you sort of have like a lot of competition. They've got to hold the gates back. And so I think is there a risk to them being, I mean there's, there's risk from all of those companies that I mentioned and you know, effectively California, slash Seattle, right? Only two, two places. There's, there's also chips from other parts of the world, right? Obviously China has a number of different AI chip companies that are doing cool things. Anyone would have told you GROK was, you know, their business revenue, their revenue was not like stellar, right? In fact they missed revenue last year significantly. And yet they got bought, right? Because the value of the IP was there and the value of the team. Anyone else would have been like, well why the heck would I buy this Right. Makes no sense. There's definitely a credible threat. Yeah.

A (10:17)

I think they do. I think networking is super important. I think the CUDA software mode is very important. But it's also like changing rapidly. Right. It's an incredible amount of the software that Nvidia GPUs run on is not from Nvidia. It's. It's the developer ecosystem that's open sourcing it. When you look at for example VLLM and SGLang, right? These support AMD GPUs almost as first class citizens now. And VLM is getting significant support for TPUs for Trainium and there will be other chips coming out from startups that also support VLM sglang now like how difficult is it? You know, the, the reason why Cuda is so important is like okay, I can do whatever I need to do, right? Programming a gpu. I think most AI chips will not be consumed by people programming anything for it. They will download an open source inference engine, they will download an open source model and then they will put it on the. And it's really simple to download VLLM and like make it work. Like it's not that hard to set up, you know A. And Nvidia's putting out a lot of open source software like Triton Inference Server and, and Dynamo and all these things to, to make it easy because that is the consumption model ultimately for the majority of AI, right? Is, and it might be like oh it's my own inference engine. But most servers will not run code besides the inference engine in the model. It's like not like people are actually like researchers are like writing code for GPUs to see ideas if they'll work and train models and all these things or just mess around with them to figure out, you know, infra performance or whatever it is. But most of it won't be there. And so Cuda as a mode, Cuda language is like, you know, like it's like fine, right? Like you know no one actually writes Cuda, right? Like most people write Pytorch and then like torch compile and then they just run it on the gpu. They don't write Cuda. But a lot of this Cuda mode is like how does Pytorch translate into high performance GPUs? And that surface area from when people were just writing like hardcore when people are hardcore writing Cuda kernels to like, hey, they're writing Pytorch and then it's compiling down to GPUs versus oh, I'm just downloading VLM. It is a, it is a curve of like not a ton of people that can do Cuda kernels. A whole lot more people can do Pytorch, right? Random, you know, PhDs and random people. It's very simple, right? A crapload of people can do vlm, download it, run it on a server. Well, if it now supports other chips, what is the Cuda moat? Nvidia's recognize this and they've been building software that is not necessarily the Cuda mode. And I can give some examples, right? So the name of the game is fast tokens and lowest cost tokens, right? And lowest cost tokens happens by your chip being fast. But there's also tricks, right? One example, right, like I mentioned with you know, the CPX versus groq, right, Is processing your pre fill contacts, right? Super cheap cpx, right? If I'm, if I care a lot about speed, then Gro, these are optimizations on the hardware side. There's optimizations on the software side as well, right? And so one example is when I'm doing, for example, if I look at a Claude code or a cursor type application, right? The workload is like, it takes your repo, takes the relevant parts of your repo, puts it in the context of the LLM it prompts, it generates, right? And if it's an agent mode, it circulates the context a couple times, it'll collapse, put things off to the side, access different contexts. But what's, you know, and especially when you think about an agent for software and you can see this in Codex, you know, Codex, Codex actually not as good as cloud code, but can do it, work on time horizons of like 9, 10 hours and do like a big refactor better than cloud code can. Even though most of times cloud code is better. And what's interesting about Codex does is it'll like take your repo, it'll identify parts, if you're asking it to refactor it, identify parts, write stuff, you know, make like these notes for itself everywhere, collapse the context, switch from this part of the repo to that part of the repo to this part of the repo. But when you think about it, it's like, oh, if this thing is just generating tokens all the time, plus it's switching what my context is constantly, that's really Expensive, right? If you think about like what's the cost of inference? I want to say it's like it's, it's $10 per million tokens of output and or in $3 for decode or 10 for decode and 3 for pre fill. And so if you think about, oh, it just worked for nine hours on one task, one refactor, huge value. But if it changed context a ton of times and your context is like 30k usually or 50k or you know, heading to hundreds of thousands, you know how long your bigger repository is and how much context, which now you're spending all this money on pre fill, right? Not the decode tokens. But actually why am I like regenerating the KV cache? I can actually just like store the KV cache elsewhere and then when I need it again I can pull it and plop it into CPU memory or into GPU memory. And so Nvidia's got this like KV cache manager and they've been working really hard on like making it so they can interface SSDs and stick the kv cache on there and pull it out whenever they want. So for this kind of workload and then if you do this and you look at like coding as an application and you like look at these coding companies and how much they're paying for prefill versus decode, actually majority of their cost is pre filled tokens, not decode tokens because their context is just so large and it's switching all the time, even in agent modes. You know, if you can now not have to do the pre fill, your costs go down dramatically. But that's a very complicated thing to do from a software perspective. You know, companies like anthropic, Google, OpenAI have already done it, but what about the wide world, right? And so Nvidia's trying to make the open source software for this and that's like CUDA mode. But it's like actually no, none of this is cuda, right? Like it's like memory management and like, you know, storage management. And when do you call what and how do you transfer it and how do you like spread the KV cache across a bunch of different storage nodes and what happens when you read it and the network congestion. Just like all these things. Yeah, it's like Nvidia's wheelhouse, but it's not Cuda. And I think like the easy way to say it is, it is the CUDA mode, right? And so things like this kvcache manager and many other things they're trying to do to reduce the cost of inference. Like is how they build the new code of CUDA Mode. Because again today it's, it's, you know, it is quite, I mean AMD is like not fully there yet and TPU is being added right now and Trainium is being added soon as well to vlm. But all of them will have a very good UX for download model run model on VLLM by the middle of the year, I think. Right? Certainly AMD is already there by the end of this quarter. We have something that like tests this, right? It's called InferenceMax A. It's open source, all the code is. And the results are. But we run across, I think 60 million of GPUs which are donated to us by companies like Nvidia, AMD, OpenAI, Microsoft, Amazon, Crusoe, Core, weave together AI. All these companies are sponsoring GPUs for us to run this. We're running VLM and SGLang every night on, you know, nine different kinds of GPUs on a variety of different models and different work context lengths and all these things, right? To see the performance. And you can see the performance moving every day or pretty often because the software changes all the time. And so like the fact that this exists is the CUDA boat, right? It's not that like amd, you can do this on their chips, Nvidia can do this on their chips. It's oh, when the new model comes out, how fast does it get to peak performance? Because you know, it's a moving target or hey, can I implement this KV cache management thing? How hard is it? How many engineers do I need? Oh, just one. Great like or 10. Great. If I need 100 people to develop it, like Google and you know, so on and so forth did, then that's much harder.

A (18:05)

You sort of the whole specialization game, right? You, you have to specialize because you're never going to be Nvidia at their own game, right? They're going to have the supply chain unlock, they're going to get to the newest memory technology or process technology or whatever, packaging technology, whatever it is, sooner than you and they're just going to crush you, right? If you play their game, you have to. AMD is trying to play Nvidia's game, but AMD is like extremely good at engineering silicon, right? Everyone else has to, has to, has to try something weird or different, right? And so when you look at Etched or Maddox or Positron or cerebras or 10 store, you got to look at all these companies, right? There are unique things about what they're doing and it's not clear if AI models will still be within that realm when that comes out, right? Does, oh now people use like engrams and other sparse attention techniques? Is that like, is. Does that change like some of the specializations people are doing or hey, people are now doing like you know, models are now sparse moes instead of being dense models. Does that change things? There's so many optimizations and changes on the model side and you can't predict what's going to happen with the ML research easily, at least you can't. The thing you're optimizing for today has to be a vision of where AI will be in two years and Nvidia's fully accepted. They don't know where that's going to be. That's why they have a portfolio of chips now, not just one GPU line, right? It's not just Hopper, Blackwell, Rubin now. It's going to be, you know, it's not NPR hop, you know, you know, it's not that line. It's like there's a variety of chips to serve the different markets in different possible scenarios. They think each of them has this vision today, but oh, it might turn out the general purpose one sucks. And actually AI models have developed in a way where CPX or Grox style chips are the best. Right? Well, okay, now we have a solution for that market. And so I think that's the challenge with the startups. With that said, I think they're all taking very interesting bets. I think it's, I think it's much more exciting than the first wave of AI hardware bets. Graph Core, Cerebras, Samanova, Groq, where they all made the same bet on memory and putting the memory on the chip, they sort of just made a bet and they optimized for a certain kind of model, all similar kinds of model, and it didn't end up working out for a long time. Right. They had to pivot and they had to work on a lot of things. It took a long time. I think these companies have like a really clear vision of what they think models will look like. Right, right. Etch does, Maddox does, Positron does. And that's what's really cool about it between the three of them, these new age. So I mean I'm excited for them. I'm very, very skeptical. I don't know what, what a venture capitalist views as like likely chances of succeeding, but I think all of them are less than 1%. Right? Yeah, but you know, that's, that's, that's a.

A (20:48)

It could, it could, or it could be any given customer has like one workload they care a lot about. Anthropic clearly does not give a crap about video gen, image gen. Right. They just don't care. On the flip side, company like midjourney cares a lot about image and video gen. Right. Image and videogen is very, very like, like I mentioned, like it's a very, like, it's not very memory bandwidth heavy. It loves, loves, loves compute. Right. Whereas inference of large language models in the style of like, you know, this, these, you know, say for example coding agents cares a lot about decoding for long streams of time and that's very memory bandwidth heavy. Right. And so there's like, that's like a simple example but there's a lot more nuance there in terms of like even like the size of like the matrix multiply, you know, the tensor cores that, you know, the systolic arrays that you use or the ratios of networking and memory, memory and like what's that memory hierarchy look like? And you know, what are you doing for different kinds of attention and like, oh, like all these sorts of things. Like there's a lot of Specialization here. And so some people are betting big on, on different types of specialization and I think like you could clearly see a world where companies do care about different stuff, right? Like, like if for example a chip optimized for video and image generation existed today and it was better than Nvidia or Nvidia made it, I think Mid Journey would absolutely only use that for inference. I think for training they'd still use the general purpose thing and as would like Meta and Google would like they should do that. Right? And hey Meta actually has two lines of AI chips. Their mtia, there's a line that's focused on recommendation systems and then there's a line that's focused on Genai. The Gen AI one is a new line but that recommendation systems line is still continuing, right? It's not sexy. No one cares because there's no. And bytedance also has a recommendation system line of chips and it's not really focused on gen AI which is fine because you know this is a $200 billion business or something which is just deciding what ad to serve me right and what order to put my friends stories and you know, things like this. So I think like it's perfectly fine for there to be specialized AI chips given the target market is big enough and you have to have vision to know what that target market is. Unless you're hyperscaler then you can like just like you can just use general purpose until you've like it's clearly there and then you can make your asic right.

A (23:12)

A variety of things actually in some, in some quarters last year it was even north of 20 I think but I don't remember exactly. But anyways, you know if you look at 2022, China was almost the size of the US in terms of buying server hardware, right? Almost. Not quite, but getting there and it looked like they were going to be the same size as America in like a year or two after that, right. And if you look at like global data center capacity, global cloud capacity, et cetera, et cetera, et cetera, it's American companies and Chinese companies, right, that dominate the world. American companies obviously doing a lot better here but both of those dominate the world and if you look at like every industry, right, you know, it's, it's, it's very clear that like China wants to insource stuff, right? So in 2015 they made these five year plans for 2020 and 2025 where they set the percentage of semiconductors they wanted domestically produced. And they've missed the goal both times, which is fine, right? They set really aggressive goals and even, you know, shoot for the moon even if you miss, hit the stars, right? And that's sort of what's happened, right? Like, look, China is not caught up on, you know, leading edge semiconductors, but microcontrollers from China are almost as good as the microcontrollers are as good and cheaper than the ones from Texas Instruments or ST Micro or, you know, et cetera, right? Or like this power, random power chip is better than, or the same as the one from like another company, right? And so they've really built up a semiconductor industry and started insourcing a lot more. I don't see why China wouldn't be buying, you know, 30, 40% of the world's AI chips and the US like 50, 60% and then the rest of the world like, you know, and when I say us, I mean US origin companies. That seems like a more natural state for the world. But there are restrictions and, and hey, this is the biggest change in human history maybe ever. Knowledge, work and you know, everything that's going to happen there and, and, and then eventually like robotics and all these things. Like, you know, obviously there's, there's a lot of geopolitical stuff and so there are restrictions. Nvidia has been handicapped, hand handicapped from selling their best chips to China. And so that's obviously impacted the sales a lot because, like, why would you do that? And so when you look at who rents the most GPUs in the world, it's three companies, right? So one of them is obviously OpenAI. Second one, actually they were bigger than OpenAI. They are bigger than OpenAI today or. No, they were bigger than OpenAI then OpenAI eclipsed them recently is ByteDance. ByteDance rents tons of chips from Oracle and Google and, and you know, many other cloud companies because they couldn't get the chips they needed in China. They're mostly just serving TikTok. Right? Okay, well, they're not allowed to buy them and that sucks. But you know, they're allowed to rent them. And so, okay, if I'm not allowed to get the best ones, I'm going to rent externally. And if ByteDance is second biggest renter of GPUs in the world. That's substituting demand that would have been built in China. In many cases it's instead being built in Malaysia. And Oracle has over a gigawatt of capacity in Malaysia that ByteDance is going to take. Right? So things like this are, you know, you know, hundreds of thousands, not millions of chips, tens of billions of dollars of capacity that would go to China, but it's not, it's going to Malaysia instead as an example. Another sort of point around this is China's like, you know, they've had these five year plans so and, and you know the way these initiatives work from China is there is like some top down ordering but then they just kind of whip the whole like everyone just kind of gets into it and it's really cool. Like I don't think it's as top down as many people think. Like I think the entire country is like semiconductor pilled, right? There are dramas where people fall in love in the fab or dramas where people fall in love and they're photovoltaic like solar cell researchers and engineers. And it's like, it's like this is just the backdrop and it's like actually this is, it's like super cool for your like significant other to be that semiconductor engineer or to be that photovoltaic, you know, solar panel researcher.

A (28:39)

It's literally everything. Yeah, literally everything. There's a city. And it's not like, hey, specifically for camera arms, for example. There's ball bearings in this, and the ball bearings are like, there's ball bearings. There's multiple manufacturers of ball bearings for camera arms. And then, like, most of the camera AR in the world come from that one city. It's like, what the hell is going on? And so, like, the semiconductor industry, I think people don't realize, is absurdly specialized. I'm not answering your question. I just got a little bit of a rant because I think people don't understand China. Semiconductors. It's really sick. Or semiconductors in general, but, like, fascinating. You know, like in Japan, they, like, focus on a few different types of chemicals, and they're the best at it. And it's like almost a cultural thing, right? Like, Japanese people are so precise, like with sushi, and, like, it's all about the trade and the craft. And, like, you know, the French fruit in Japan is better than the French food in France because the Japanese chefs went there and then come back and they professed in Japan. And like, because they're so precise and, and there's so many different, like, things that, like, Japan is so good at because they're so precise and like, dedicated to the craft. And it comes out of like, I don't know, like samurai culture or something. I don't know. Right. Like, I don't exactly know how that culture came up. And so when you look at like, and it's like across the world, there's different places where things like this happen, right? Like, oh, like the Netherlands makes UV tools. Cool. I guess so. And you look across the semiconductor industry, there's a famous economic essay called I pencil or something like that. Or talking about how the pencil, like a simple pencil comes from like, oh, the rubber comes from like Indonesia for the eraser, and the graphite comes from this mine here. And. And the wood comes from these aspen trees in Canada. And like, you actually can't make a pencil without aggregating this entire supply chain. Semiconductor industry is like, way crazier because, like, I would say there's like 15 or 20 countries that could shut down the entire semiconductor industry. Right? Even like Austria could. Right. And it's like, what? It's like, well, yeah, there's two different companies there who have like 90% share in like some random niche stuff. And it's like, okay, cool, I guess Austria can. And oh, yeah, those two companies have less than a billion of revenue, but they just happen to have linchpin critical things. And there's linchpin critical things everywhere because the process is so complicated. And so China's been trying to replicate this.

A (30:39)

I think there's a lot of things. I think if you were to close your eyes and say. Or if you were to cut off every country and say there's no more globalism. China has the most vertical stack in semiconductors today, and they're the best at semiconductors in the world. Called because their fabs could still run somewhat on a lot of things because they have built some of these chemical supply chains, right? Like TSMC for certain kinds of chemicals, 100% share from Japan. Right. Or Intel. Same thing. Right. Or, you know, for certain kinds of tools. 100 share from Netherlands or 100 share from, you know, this American company or that, you know, Austrian company or this or that. Right. Like, there's just all these, like, you know, this Swiss company, like, it's just all these different places have 100% share. It might be one company might be three companies, but geographically or in the same area. And China's built that up, right? Because they've created this made in China initiatives which just plowed money into it. And they've got this culture of like the diffused, like you know, these provinces, like, yeah, I just decided I'm going to fucking focus on. Or maybe not even, might not even be the city, right? It may be the like, you know, someone brought it there and decided and then people are like, oh wow, you're doing that. Me too. Like I'm a Patel and I grew up in a motel and guess what? We like almost all the Patels I know grew up in a motel. And it's because some random Patel immigrated to America and like worked at a hotel motel and then bought a motel that like it just started happening, right? Like you sort of like these things are serendipitous of sorts and like, I don't know, like, and it's like I view it as the same kind of specialization, right. Chinese cities are like starting to do these things. China's missing a lot of things, right. I would say like if you say minus 10 years tech, China's complete and no one else is complete. Right. Taiwan is not complete. The fabs would shut down without foreign supply, you know, and you go down or you go across the stack. But if you go to tenure tech, maybe, maybe more like 20 year tech, you could get a fully vertical supply chain in China, which I do not think any country could do. Like America could not build a fully vertical fab without stuff from elsewhere. Even if it's 20 year old tech. Yeah, probably not even 40 year old tech. And so, so that's interesting. But then when the flip side is like, well like you kind of do need specialization. That's how that chemical gets the purest, best, you know, most engineered, you know, or that, that slurry of chemicals or that, you know, that gas or like that tool. Because every smart person or a lot of them in that country grew up around that culture and like the supply chain is there and like everyone kind of knows and like it's like a drive away and like sort of like this is what makes supply chains work is that there is this specialization and the best of the best only comes when you have that hyper specialization. So China doesn't have lithography. Their lithography is like 10 years behind and I think it'll be five years behind in a couple of years, right. They're catching up fast. I don't think they'll be as good as ASML for a long time, you know, maybe, I don't know, maybe they will be, you know, China, you shouldn't ever underestimate China. But like and Chinese engineers are, you know, but like for a while, right? Or like, you know, I don't think they'll be able to make leading edge chemicals like many Japanese companies or many American companies and their tools and like you just go across the supply chain. They're not hey, forefront on really anything in the manufacturing supply chain, on the design supply chain. There's some things that they're starting to be similar par but like cheaper or like a year or two behind but cheaper. And that's like fine for a lot of stuff. An example of that is Huawei. Right. Huawei in mobile phones was on par with Apple like entirely. And they had become Apple TSMC's biggest customer when they were designing the best thing. And they are number one in telecom and their tech is just literally better. And so when you think what happens, you know is, is, is China missing anything? Looks like they're, they don't, they don't, they don't have the best of much, you know, today in the AI supply chain they have a complete package and a couple years behind and they'll figure out how to make it cheaper, slash, do more, slash catch up and, and create a robust industry. But there's a reason like I don't think that like Jensen is scared of amd.

A (36:32)

Wally Shipped a little bit, but like mostly just like sticker capacity. Like there's nothing like. No, no, like I would say like a little bit as in like a few servers, not like a billion dollars worth of stuff, right? The thing is China's supply chain has to ramp up, right? China, China's express goal is to have all internalized. But then like a company like Alibaba is like, I don't want to use Huawei, right? Like, I want to make, I want to use Nvidia and just make the best freaking models, right? Because that's my business. My business is not, you know, using a Huawei thing, but it's like, okay, it's being pushed upon me. There's other companies too, like cameracon and so on and so forth. And so this sort of like supply chain, you know, companies in China don't want to use while they're kind of encouraged, obviously pushed, you know, you must some local provincial government be like, well you're doing this much business here, you got to do this, right? Like there's all sorts of like crazy stuff that, you know, pushing of companies to use Huawei. The challenge is Huawei can't manufacture enough, right. We've like done a lot of work on this and we've just put it for free, you know, instead of like to our customers because it's like something that's like national security, which is how was Huawei actually building chips? Well, actually they were using shell companies to get chips from TSMC and using different methods of like sneaking hbm, which is memory from, you know, Korea through Taiwan to China, right. Like all sorts of crazy stuff we've reported on and people, it's like a whack, a mole, right? They shut it down or like tools that get shipped to China and they shouldn't be for, you know, making leading edge chips, but they actually are. And all these sorts of things are happening because they can't make everything. And if they want to make the leading edge stuff, they do need to rely on the foreign supply chain quite a bit in terms of the upstream supply chain, right? Memory logic chips, tools for fabs, chemicals for fabs, et cetera. Huawei cannot satisfy the market because there's not enough advanced leading edge capacity in memory logic, you know, and all, all these other things domestically in China. And they're trying to build it as fast as they can, but that means there's just not enough to satisfy the market. So Nvidia has a market. I think they'll figure out how to sell chips to China. And Jensen's in China I think like right now or was yesterday. And so like he's clearly like wheeling and dealing to try and get his chips into China. Because, you know, I think Nvidia's argument is if we sell them chips then they won't, you know, there won't be enough of a, as much of a domestic market. The feedback loop for software and everything else won't be there. That was sort of like really challenge it, right? Like most of the open source software for AI has a lot of Chinese contributors, right? VLM and Pytorch, SG Lang and like all of these other like libraries and things that are just like, you know, and it goes to low level software especially, right. Like a lot of the best open source stuff is actually just from like a Chinese company who Decided to open source it and same with models, right? And so like it's like, okay, well if they can't use Nvidia chips anymore, then this open source stuff won't be designed for Nvidia chips, it'll be designed for Huawei chips. And now does that like weaken the Cuda moat? And now like not only is China domestic, now they have like a feedback loop internally and then they can externalize across the rest of the world, right? So this is the like argument Nvidia makes. I'm not sure if I am like, I'm like, you know, I think, I think my AI timelines are so fast. I'm not that fast. Like, not in terms of like AGI, but like, hey, AI is $100 billion of revenue across the industry. I think the industry could hit 100 billion ARR by the end of this year. Like 4550 for OpenAI, like 3540 for anthropic and then you know, Vertex DeepMind's models at Google Gemini, right? And then Vertex API for Anthropic models and bedrock APIs and Azure foundry APIs, like I think $100 billion like end of this year. That's a lot. And then what's the economic value of that? A hundred billion dollars. Now how much of that is in China, right? Like China's number is probably 10x lower, right? Because they just haven't been able to pervasively push AI, right? Chat GPT has a billion users roughly. And you know, then you add on Gemini and Meta claims they have 500 million users. I don't know, I think people just accidentally click like generative sticker or something. But like anyways, like there's like, like there's like a lot of usage of AI in the west already and it's going to climb, it's going to keep climbing and like you kind of have to get used to it. And so like the question is like, do you, you know, what's, what's the economic benefit to the world, right? And at the end of the day this is an economic war, right? If the U.S. and the west win in AI and control, you know, more powerful AI systems that have this feedback loop that improve economic growth and weapon systems and whatever else, right? Engineering of grids and cyber attacks and all these sorts of things, they have this like advantage over China, then China will not rise to be the global hegemony. But without AI, China definitely will rise to be the global hegemony. They're just gonna outrun America and So the question is like, you know, that's, I think like the other view, right? And how fast are super powerful AI systems versus, you know, China building a domestic ecosystem for chips and models and everything that is a few years behind. Like what's, what's, what's actually the value, right? Like that's sort of like around restrictions and regulations.

A (41:25)

I think TSMC is manufacturing wafers and they're like building real wafers and there's real fabs and like, you know, there's some other fabs that have been announced and like they're doing well and there's like a bunch of like different kinds of plants. Like a Korean company making a random gas plant in Texas for, you know, their chips, right? Like for chips and all these like, sort of things are happening. I think the chips act did really well with its $50 billion. It's just, I don't think people understand the scale of the semiconductor. It is the most complicated supply chain in the world, right? It's much bigger than, you know, say manufacturing airplanes. It's much bigger than like, you know, really anything else, right. If you look at the top 10 companies like of the world, I think eight of them design semiconductors right now obviously like Google designed semiconductors, but it's like, oh wait, no, but, but their cost of search would be like 10x higher if they didn't have tpus and tpu's are super optimized for search, right? Or like, you know, you, you, you go down the list, right? Like Meta serves recommendation systems with their chips, right? Like you go down the list, it's everyone is making their own chips. Apple devices would be materially worse if they didn't have their own chips, right? And you just go down the list. It's like, it's the most complicated supply chain and, and they, they're spending something on the order of like $150 billion roughly in subsidies a year to the chip industry. We are doing 50 over like a decade. Yeah, there's a difference in scale here, right? The collective total amount of like capex that has been spent in Taiwan is like 500 billion plus right? Across the industry, across all the companies that are making semiconductors in Taiwan. And Taiwan doesn't have a domestic industry. How is $50 billion of subsidies going to change America's needle, right? It does move it a little bit, right? I will be clear. Like the chips act is awesome. I don't understand why like EVs or like solar was given this massive, massive like trillion dollar package. Semiconductors were only given 50. Like semiconductors need a lot bigger package to actually incentivize on shoring. I think what's happened so far has proven that it's working well. TSMC is literally making chips for Nvidia and Apple and AMD and others in Arizona today. Right. And I think that's really great is.

A (48:52)

I'm obviously a maxi. I think we're going to to need a lot of infra and I think I'm literally paid to like analyze the supply chain and do consulting like that's what my company does. So like obviously I'm very biased. I think, I think we're pretty good at calling when, when things go down though right before like a part of the supply chain rubber. But anyways, you know again going back to the economics of it's north of $100 billion of revenue exiting this year for AI from a base of you know, sub 1 billion gen AI from a base because ads and stuff is like already a multi hundred billion dollar AI industry. Right. You know, go back to 2023, it was like right in 2024. I don't know exactly what number maybe let's call it 10 and 25 was maybe like 30, 40. It'll be north of 100 easily if you're talking about $100 billion of revenue let's say at a 50% gross margin. So that's $50 billion of gross profit and $50 billion of COGS that $50 billion of COGS needs to run on infra which cost roughly if a, if you're talking about five year depreciation call it $250 billion. Right. Of infrastructure etc for a hundred billion dollars of revenue. Okay, what is, what is the actual spend on AI? And for this year it's going to be like it's, I mean it depends on what layer. If you're talking about energy, those are longer lived assets and all these other things. Right? Data centers are longer lived assets. The chips are not as much. People are putting Capex down and the hyperscalers, Capex is going to be like $500 billion this year or something like this. And then besides them there's also a lot more hyper capex elsewhere. And so you know, is it a bubble? I mean theoretically like you know, it's twice as much as it should be but it's also like well no, there's an R and D component to this and the excess spent that wasn't revenue generating last year is what led to models being so good this year and led to like everyone who can using cloud code and like that changing their life. This is like it's not a bubble. Right. I don't think it's a bubble yet. I think if AI model progress stops, and that's the main thing. Right. The moment model progress stops, stops, all the spending is for naught. But so far we've had consistent improvement. As you put in more compute, you get more performance and better models.

A (55:05)

It would if the utilities were willing to let it. But I think the utilities are so slow and dumb that they don't want to not destroy, but like expanding the grid. I think the US could have a way better grid, but we just don't want to. Like, no one's made the effort or initiative. You know, there's not enough power. America's not built power for 50 years, really. Right. It's like converted from coal to gas and like things like this, but like, really just have not built wholesale new power on a large scale. And there have been a lot of times where the industry blew up, right? Independent power producers, IPPs have blown up multiple times in the 2010s when Korean and Japanese investors like flooded the market with because they saw such a good return there. Or before in the early 2000s, power was growing a little bit for a little bit. And so people overbuilt on power. So power industry has been burned a couple times, so no one really builds power. And then you've got data centers now all of a sudden coming online and going from 2% to 10% of the US grid in just a handful of years. And so you've got this humongous, humongous change in the industry. We don't have the labor. Right. I think ultimately that's the biggest problem is the equipment and the labor. And equipment is basically, you know, again, labor. And time takes time to build a factory, so you can build the things. I think the equipment side of things will be solved, like, more reasonably. And one example was like, gas, right? People initially thought, oh, you can only use like the two vendors, right? Siemens or ge, Vernova for gas turbines. They have the best ones, the most efficient ones. It's like, okay. Well, like, okay. Also, Mitsubishi exists and they're ramping up production fast. Oh, Doosan and Korea exist and they're ramping up production fast. Oh, actually I can just take Cummins engines, right? Like, you know, if you've ever like ridden a pickup truck or like, you know, like diesel trucks, like, everyone loves Cummins, right? You know, you see the Ram on the street and has the Cummins like badge. It's like, it's like, that's like aura symbol for a certain kind of redneck from South Georgia, which I have a little bit of. Anyways. I don't have a have a truck I have though. But anyways, like, there's like all these engines like, people are figuring out how to make the equipment, you know, Solar sucks, It's too intermittent. Wind sucks, it's too intermittent. Nuclear sucks. It takes forever to build. Coal sucks. It's way too dirty. How do you make power for data centers besides gas and like, okay, the grid's not willing to put the gas on your site, right? So Elon did. Now everyone's doing it. Right.

A (57:58)

And so, so the, the comparison we made is because, like, you know, it's a bit of a shit post, but it was like, serious research. Yeah, basically, like, we were doing serious research because we keep getting this, like, question and debunking it and we would do it seriously. But then I was like, no, no, no. This is like, too, like, complicated. Like, let's make it very simple. So I was like, guys, why don't we just compare it to like, hamburgers, right? Because. Because, you know, I've heard that argument from some, like, vegetarian people before or some Hindus or, like, I'm Hindu myself, although, you know, and I do eat beef sometimes, but, you know, like, I have Hindu, but, like, you know, so. So we made this comparison to hamburgers, right? Hamburgers require a shitload of water because cows, you know, for them, they require a ton of water. And what a. Cows take a lot of water. It's not the cow itself. It's all the feed you're feeding them, right? Because no one grass feeds their cows, you know, and just lets the rain take care of the grass. They like, either rain the grass or most likely they do mass industrial farming of corn, soybean, alfalfa, et cetera. Which uses shitloads of water, right? Like, you know, or like almond milk, like, uses tons and tons of water. Like produce is like the main user of water. I think the metric was the entirety of Elon Musk's Colossus data center, right? Uses as much water as two and a half in and outs. Because that's, you know, you do the calculation on how many, how many, Burt, what's the average revenue per in, n out. And how many hamburgers does that translate to, right? If everyone's ordering like a combo, right? Okay, let's ignore the drink, let's ignore the fries. Let's just talk about the hamburger. Let's ignore the bread, which does use have grain. Let's just do the meat and the cheese. And all of a sudden all this water is. There's so much water, right? Like a single query. Like all of your AI usage from ChatGPT of the average user is like a hamburger, right? Like, it's like, okay, this is nothing, right? You know, because these things are, the data centers actually are like, they're mostly closed loops and like, sure, they evaporate some water for like, cooling reasons, but like, by doing evaporative cooling, they're using less power, right? And that's actually better for the environment than. Than not using evaporative. Cool. There' all these reasons why this myth or hoax of AI, of AI using all the water is just nonsense, right? Like, Meta's data center in Louisiana is getting protested because the water, it's. It's going to be the largest data center in the world. It's going to be like 4 or 5 gigawatts at least announced so far. We're tracking some other ones that are. That may be as big or bigger. But Meta is getting protested because the local population around that area is like, oh, the water's dirty. It's because of this metadata center. And like, there's these trucks on these big trucks on these back roads that used to be empty completely. They're just like mad and annoyed about that. That, right? But at the end of the day, what actually made the water dirty is that that's an area where you go fracking. Like, fracking is absurdly worse. And almost all of that gas is being shipped to an LNG terminal and being shipped to Asia. Like, you know, like Japan or Taiwan or China or Korea and some Europe as well, right? Like, like, actually all of this water is dirty because of regulation slash Fracking. Like I support fracking by the way, but you know that's, that's an insane take too. Maybe, but like water usage is like not a relevant argument.

A (64:19)

I think it's completely fine. And I think like people are like freaking out and making narratives where there really is shouldn't be one. It's like, well, okay, Google doesn't have enough data center capacity. They need people to build data centers but no one can build a data center because they don't have the cap. Like don't have, you know, many cases capital is not the, you know, they don't have capital, right. Or like no one will give them a loan because they don't trust some random fucking company. And it's like. But then Google's like, well no, we've due diligence. Then we think they can build it here. We'll like even guarantee we'll buy the thing or start using it once they build it. You know, just having a customer alone spoken for it was enough. Right. In the case of Core Weave, they were actually able to no backstop, right. They were able to just say, hey look, here's Our Microsoft contract for this many GPUs, I want to put in that data center. That data center, that data center. Here's the contract for renting those GPUs. I want to hire these people. I want to do this. No one will like, they don't have any money, but then they were able to like have it work out because they were able to get people to lend to them. I think like Core, we've did that and there was no circular financing, but that was when there was like, the scale of investment was like single digit billions or less than a billion. Right now the scale of investment is hundreds of billions. And so the question is like, oh well, if I want data center capacity, how do I, how do I get data center capacity? I just go to everyone who's going to build it, looks smart, is smart enough to do it, but can't afford to do it and tell them I'll take it. And in fact, I won't just take it. I'll go to your debtor and be like, I'll guarantee you. Because you know, obviously you're a new company, I vetted you. But the debtor hasn't. And so, you know, like, you know, you know, they don't want me to just be able to walk away because like in the Microsoft Core weave deals, Microsoft could have walked away if Core. We fucked it up, right? Yeah, there's no, I mean, yeah, there's always like, sort of like cancellation or whatever possibilities. And so there's just a further form of guarantee. As far as on like a lot of these backstops. As far as on like Oracle getting the money and then OpenAI getting money and Nvidia, you know, paying. And it's this whole circular, it's kind of nonsense because it's like Nvidia is getting equity in OpenAI. They're basically saying, hey, every gigawatt you buy will also buy some equity. Right. Okay, well, cool. Now Nvidia owns an asset which they think is valuable. OpenAI. Right? OpenAI is turning around and is like trying to rent those, use the equity they buy. What are they? What are they? What is their use of equity? People's cash pay isn't that great, right? It's mostly just 99 plus percent of their spend at the company is probably just compute.

A (68:29)

I've said it to him, we've written it in articles. It's an entire practice of consulting that I just, I started in like 23. 2023 was token economics. And we've been trying to build out these like, you know, but basically I think the main things are like people who don't code can use cloud code now, right? I think people don't understand that like, even if you don't code, you've never had any training in software development. You've never taken had a job as a software development developer. You can code. Let's take an example of what's one of the. One of the analysts at my company did, right. Comes from a engineering background, but on like semiconductor systems, right? Like worked on mechanical systems, worked on these sorts of things. And they coded this thing which was they wanted to do an analysis of area of clean rooms, right? Clean rooms are the building that you. The fab has all the tools and most complicated kind of building in the world. Has every, all sorts of chemical systems and all this area of that. That a company who builds systems builds these systems and revenue of that company, right? And so it was like, okay, we have this FAB data set. Pointed it at it. It was like, hey, here's this FAB data set. What's the square footage of all of them? And we have this like thing that we built which just pulls with cloud code separately, which for data centers and, and, and fabs and everything else just calculates the area of something from a calculus from a satellite image, right? Very simple. So we have the square footage of all these things. Puts it that. Here's the company name. Okay, Go find the filing. So it dug through all these filings, it pulled the data, right? Okay, great. Now told it to compare these two, make a chart. Great. Oh, wait, there's this like weird inflection. Oh, that's because they bought a company five years ago. Can you do a pro forma of this analysis without those financials of that of that company they acquired? Okay, great. And then like we were able to like, like figure out an investment case for our clients as well as like, you know, some other interesting details from someone who's never ever really coded just using Claude code. And it like doing this all. And this is like not even there. And it wrote the note and they just like, they didn't even like work on this full time for like three hours, right? They just told the model and would go work on other things and told the model and worked on other things. It just did this. People don't understand that like the skill sets that like I think like if you go talk to an analyst, right, A very junior analyst at any company, right. Whether it's venture or especially growth venture or public markets or private equity. Their, their job is like finding data, cleaning it, making charts. It's like this is Claude code now. You don't need junior analysts. Just like a lot of companies have stopped hiring L4 engineers because it's useless. Why would I hire an L4 engineer? I just tell Claude to do it. You, you sort of like have this has happened and this is a really big like shift I guess. Like is that like low level knowledge work just doesn't matter, right? Why would I, why would I use Excel when I can just tell Claude to manipulate CSVs? Why would I use Word when Claude will just generate the markdown and I can copy and paste the markdown directly into our WordPress and then you know that WordPress is fully formatted now and it's oh my God, like what's the point of Word, right? And what's the point of doing all sorts of stuff? I think when we look at model progress that's just for Opus 4.5. OpenAI's new model I think will be better than Opus 4.5 and it's coming like somewhat soon in March ish timeframe, maybe February, March ish. But yeah, because OpenAI has a better RL stack than Anthropic today. It's just their pre trained models suck compared to Anthropic's pre training. Right? And so like if they catch up a lot on pre training and keep their better RL stack, they would actually have a model that's much better. Right? Flip side, Google has a better pre trained model than Anthropic or OpenAI but their RL stack sucks. So if they catch up on RL like these models are going to get ridiculously. And then Anthropic is obviously advancing as well. Right? And so, and then, and then you look across the ecosystem, everyone's advancing really fast progress. These moments are happening, right? You know, Chat GPT was a moment, Ghibli was a moment. Those were more consumer, those were less. Like, I mean there were chatgpt. Is everyone using it for work too? But like I think Claude Code is like a new moment moment, right? Opus 4.5 on cloud code is a new moment where the way you work has forever changed. And so now we're trying to force everyone in my company there's 54 people here. I think like half of them have coded. The other half, we're trying to force them to use like cloud code. And it could be like, oh well actually you come from a consultant, a semiconductor consulting background. Oh, you come from like a semiconductor, like engineering of like package. Oh, you worked in a fab. Right. Like these kind of people, they're using cloud code now. Right. And their productivity is being boosted. Used it and it's like, you know, workspace. Cloud workspace is new. It sucks compared to CLAUDE code, but it'll get there. Right. He said he coded it entirely in CLAUDE code. You know that, Right. Or it was on your pod. Right? Yeah. So like, you know, I've heard that and I think maybe that might have been from your pod. Original disclosure, my pod was before that, but yes. Oh, okay, okay. It was.

Summary

The MAD Podcast with Matt Turck

Episode Summary: Dylan Patel – NVIDIA’s New Moat & Why China is “Semiconductor Pilled”

Date: February 5, 2026
Guest: Dylan Patel (Semianalysis)
Host: Matt Turck

Overview

Key Discussion Points & Insights

1. NVIDIA's Acquisition of Groq and Shifting AI Hardware Landscape

Why Groq?
- NVIDIA's dominance comes from wide surface area betting on hardware types. Now, with models and workloads fragmenting—decode-heavy, memory bandwidth-sensitive, and parallelism-focused tasks—they see the need for specialization (01:33–05:54).
- Quote: “Acquiring Groq is how you get those resources to make more solutions for different parts of the market to stay king.” – Dylan Patel (00:00)
AI Workload Fragmentation:
- Workloads now include highly parallel streams (multiple chains of thoughts in LLMs), decode-optimized, and context-heavy use cases.
- NVIDIA’s diversified portfolio: General purpose GPUs, CPX (context processing/KV cache), and now Groq (blazingly fast decode).

2. Competitive and Regulatory Dimensions

Anti-competitive Concerns:
- “I certainly think it’s not good from an anti-competitive sense... but with startups, it’s OK. These license-structured deals also avoid regulatory limbo that kills startups’ momentum.” (06:04–07:00)
Barriers to Entry:
- Specialized chip startups face enormous hurdles; only a few players have built full-stack AI hardware-software solutions.
- Startups must “try something weird or different” to compete.

3. The CUDA Moat & Evolving Software Ecosystem

CUDA’s Waning Dominance:
- “The reason why CUDA is so important…you can do whatever you need to do...But most AI chips will not be consumed by people programming anything for it…they will download an open-source inference engine..." (10:17)
- The moat is shifting from deep CUDA expertise to fast, broad compatibility with frameworks (e.g., VLLM, SGLang).
Open Source as Leveler:
- Frameworks are catching up: AMD, TPUs, Amazon Trainium are being integrated, reducing NVIDIA’s software advantage.
- Real moat now is rapid, seamless hardware-software integration and advanced inference features (e.g., KV cache management).

4. Chip Startups and Specialization Game

Only the Paranoid Survive:
- Jensen Huang embodies an “Andy Grove” mindset—constantly defending against threats and expanding NVIDIA’s reach.
- Multiple specializations now feasible: “You’re never going to beat NVIDIA at their own game…Everyone else has to try something weird or different.” (18:05)
Long-term Market Share Prospects:
- AMD emerges as credible runner-up ("single-digit percentage market share").
- Startups like Etched, Maddox, Positron betting on unique architectures, but succeed rate pegged at "<1%".

5. Geopolitics & China’s Semiconductor Crusade

China’s Industrial Policy:
- "The entire country is semiconductor-pilled. There are dramas where people fall in love in the fab...It's like super cool for your significant other to be a semiconductor engineer." (23:12–26:38)
Import Substitution and Fragmentation:
- China’s local governments are driving domestic chip adoption, sometimes surpassing the central policy.
- Massive specialization at the city/province level (“There’s a city for everything—lampshades, camera arms, even semiconductors.” 28:05)
Global Supply Chain Interdependence:
- “If you were to cut off every country and say there’s no more globalism, China has the most vertical stack in semiconductors today.” (30:39)
- Yet, leading-edge tech, chemicals, and lithography remain out of reach; China is about “10 years behind but catching up.”
Huawei’s Threat:
- “Huawei is the most vertical company in the world. No company is more verticalized than Huawei, which then leads to huge innovations…Of course, they're terrifying.” (35:40)
US Policy Response:
- American onshoring via the CHIPS Act is significant but dwarfed by global (especially Asian) capex needs.
- “How is $50 billion of subsidies going to change America’s needle? It does move it a little bit…But semiconductors need a lot bigger package.” (41:25–43:25)
- Legislation only passed when the car industry felt the pain from chip shortages.

6. The Power Grid, CapEx Bubble, and Industrial Impact

Energy Realities:
- Data centers’ electrical consumption is skyrocketing, pressing US grid capacity (10% by late 2020s).
- "US has not built power in 50 years…New grid expansions are slow and labor-limited." (55:05–57:06)
- Water consumption by data centers is minor; “all of Elon Musk's Colossus data center uses as much water as two and a half In-N-Outs.” (57:58)
Capex Bubble?
- Anand is bullish—demand is largely real as AI use surges, though timing (adoption vs. supply buildout) is the main risk.
- “Model progress is very clear—the moment that stops happening...if we hit a wall, then it’s cooked.” (51:00–51:33)
AI's Economic Transformation:
- “AI is under-earning the value that it's producing in the world. By a significant margin already today.” (51:49–52:10)

7. Models, Software & Work Reimagined

Workflows Are Changing:
- Non-coders harnessing tools like Claude Code to automate complex knowledge work.
- “Why would I hire an L4 engineer? I just tell Claude to do it...Low-level knowledge work just doesn’t matter.” (68:29–71:46)
Model Innovation & “Tokenomics”:
- Patel’s team tracks AI progress via alternative analytics (e.g., “tokenomics”—tracing actual token/data usage).
- “Even if you don't code, you've never had any training…You can code.” (68:29)
Competitive Model Landscape:
- OpenAI, Anthropic, Google all racing, with RL and pre-training breakthroughs.
- New models unlock moments of productivity—“Claude Code is a new moment where the way you work has forever changed.” (71:46)

8. Cultural & Lifestyle Tangents:

San Francisco AI Housemates:
- Stories of productivity and obsession—building a full RTS game in a week using only Claude, no typing.
- Roommate lore: “6'4", Olympian level fencer, perfect specimen”—on Sholto, their AI housemate (73:52).
Chinese Romance Fabs:
- Romance dramas set in fabs and photonics labs—evidence of China's cultural embrace of the industry.
US vs. China Societal Trends:
- “In China, it’s cool to date a semiconductor engineer; here, it’s all about influencers.”

Notable Quotes & Moments

"This is the biggest change in human history, maybe ever...What’s about to happen with AI?” – Dylan Patel (00:00)
“Huawei is terrifying, right?…No company is more verticalized than Huawei, which then leads to huge innovations.” – Dylan Patel (35:40)
“The entire country is semiconductor-pilled. There are dramas where people fall in love in the fab...romance comedies set in semiconductor factories.” – Dylan Patel (23:12, 26:38)
“Why would I hire an L4 engineer? I just tell Claude to do it.” – Dylan Patel (68:29)
“All of Elon Musk’s Colossus data center uses as much water as two and a half In-N-Outs.” – Dylan Patel (57:58)
“Only the paranoid survive is core to the Bay Area, core to NVIDIA. Jensen is very paranoid about losing.” – Dylan Patel (07:09)
“AMD will be caught up at times and very behind at other times…single digit percentage market share is still pretty good.” – Dylan Patel (17:09)
On the CHIPS Act: “I don’t understand why like EVs or solar was given this massive, massive trillion dollar package...semiconductors only given $50 [billion].” (43:25)

Important Timestamps

00:00 – 05:54: Deep dive on why NVIDIA bought Groq; new chip specialization trends.
10:17 – 15:00: CUDA’s changing moat, open source software breaking down NVIDIA’s traditional lock-in.
18:05 – 22:50: Viability of chip startups, AMD, and specialization strategies.
23:12 – 26:38: China’s semiconductor industrial strategy and cultural embrace.
30:39 – 36:20: China’s supply chain, Huawei’s threat, and what China still lacks.
41:25 – 43:25: US CHIPS Act and the true scale of global semiconductor investment.
55:05 – 61:01: Data center energy demands, power grid strain, and debunking water “crisis.”
68:29 – 73:47: AI transforming non-coding work, Claude Code productivity, and the coming revolution in knowledge work.