Summary (6 min read)

Podcast Summary: The Twenty Minute VC (20VC)

Episode: "OpenAI and Anthropic Will Build Their Own Chips | NVIDIA Will Be Worth $10TRN | How to Solve the Energy Required for AI... Nuclear | Why China is Behind the US in the Race for AGI"

Guest: Jonathan Ross, Founder & CEO of Groq
Host: Harry Stebbings
Date: September 29, 2025

Episode Overview

This wide-ranging conversation with Jonathan Ross, a leading architect of AI hardware (ex-Google TPU team, founder of Groq), explores the relentless growth in demand for AI compute, why the US and its allies currently hold an advantage over China in the AI race, the economic and technical forces driving the future of chips (including predictions for NVIDIA and custom chips from OpenAI/Anthropic), and the foundational role of energy in supporting AI advancement. The discussion also covers market bubbles, labor transformation via AI, infrastructure supply chain constraints, and Europe's challenge to remain competitive.

Key Themes & Insights

1. AI Compute: The New Oil

Ross compares the current AI market to the "early days of oil drilling," emphasizing high "lumpiness" in returns but massive upside for early players.
Major technology firms (Google, Microsoft, Amazon) and powerful nations are aggressively investing in AI, signaling "the smart money" is all-in:

"Every time they make an announcement on how much they're spending, it goes up the next time." (05:23)

Market value and revenue are highly concentrated: Ross cites a figure of 35 or 36 companies accounting for 99% of AI revenue, or at least token spend, today.

2. The Insatiable Demand for Compute

The AI compute supply is massively trailing demand.
The value from more compute is immediate; for OpenAI or Anthropic, doubling inference compute would almost double revenue in a month (12:30, 41:59):

“If OpenAI were given twice the inference compute that they have today, if Anthropic was given twice the inference compute that they have today, within one month from now their revenue would almost double.” (12:22)

3. Speed, Supply Chains, and Value

Speed is not a “nice-to-have”—it’s core to user engagement, brand value, and winning deals:

“Every 100 milliseconds of speed up results in about an 8% conversion rate.” (13:02)

Groq’s key differentiation is supply chain speed: they can deliver compute in 6 months, versus typical 2-year GPU cycles, which wins hyperscaler interest (24:05).

4. The Race to Custom Chips

OpenAI, Anthropic, and other hyperscalers are expected to design their own chips, but not all will succeed:

“Building chips is hard… It’s like saying, ‘That Google search is pretty nice, let’s go replicate it.’ It’s insane, the level of optimization… You’re not going to replicate it easily.” (10:11)

The real motivation for custom chips: control over destiny and negotiating power with Nvidia (14:47, 17:05).
However, Nvidia’s effective “monopsony” over high bandwidth memory (HBM) makes it very hard for new entrants (14:47–18:08).

5. The Power of Energy

Compute is limited by available energy, and nuclear/renewables are essential to powering the AI revolution:

“The countries that control compute will control AI, and you cannot have compute without energy.” (32:26, 36:46)

Ross highlights Norway’s wind and hydro capacity and Japan’s nuclear relaunch as models for rapid change (34:12–35:14).

6. US vs. China in the AI/AGI Race

Despite headlines, US models (like GPT variants) are up to 10x more efficient to run than Chinese alternatives, supporting the US’s “away game” (29:35, 29:50).

"The US still has a training advantage... We have a massive compute advantage." (29:39)

China can subsidize domestic compute, but US + allies’ energy and compute efficiency is decisive for global influence.

7. Labor Transformation – More Jobs, Not Fewer

Contrary to common fears about mass unemployment, Ross predicts AI will cause:
1. Massive deflationary pressure, lowering costs (43:10)
2. People opting out of traditional employment
3. Creation of entirely new industries and job categories:

“We’re not going to have enough people… 100 years from now, jobs we can’t imagine today will exist.” (43:10–45:07)

8. Economic Perspectives & Bubbles

AI is creating real value, not just speculative hype—PE firms measure bottom line improvements from more compute (51:46).
Yet there is high concentration of market value in a few companies, raising risk if the growth train stalls (53:07).

9. Industry Predictions & Strategic Takeaways

Ross would be surprised if Nvidia were not worth $10 trillion within five years; it may account for a minority of chips sold but a majority of revenue, due to pricing power and brand (65:46, 64:14).
OpenAI, Anthropic, and others will join the “Mag 7,” growing into “Mag 9, 11, or 20” (60:43).
Switching costs for AI tools are low for technical users, but enterprise deals still lock in customers (59:34–59:50).
Groq's sustainable advantage: rapid supply chain and cost per token, not direct model competition.

Notable Quotes & Moments

The Compute Arms Race

"There is no limit to the amount of compute that we can use." (42:01, Jonathan Ross)

The True AI ‘Moat’

"People look at TPU as a big success... only one of [three efforts at Google] ended up outperforming GPUs... Building chips is hard." (10:11, Jonathan Ross)

On Government Response and European Energy

"Norway itself could provide as much energy as the United States and could do it consistently. The entire United States! That's one country in Europe." (33:00, Jonathan Ross)

On the Nature of Economic Cycles

“The most valuable thing in the economy is labor. And now we're going to be able to add more labor to the economy by producing more compute and better AI. That has never happened in the history of the economy before.” (51:46, Jonathan Ross)

On the Future of Jobs

“We're not going to have enough people... 100 years from now, jobs we can’t imagine today will exist.” (43:10, Jonathan Ross)

On AI Platform Wars

"Enterprises make these long term deals and they stick with whatever deal they made a year ago." (59:43, Jonathan Ross)

On Focus vs. Optionality

"I used to think that the most important thing was preserving optionality. Now I think it's focus." (74:11, Jonathan Ross)

On the Mind-Expanding Power of LLMs

"LLMs are the telescope of the mind... In a hundred years, we’re going to realize that intelligence is more vast than we could have ever imagined." (76:53, Jonathan Ross)

Important Timestamps & Segments

AI Investment Bubble? — 05:09–07:16
Demand for Compute & Role of Nvidia — 10:11–14:47; 41:59
Custom AI Chips & Supply Constraints — 14:47–20:16
The Critical Role of Energy — 32:09–37:49
US vs China & Open Models — 27:44–30:38
Deflation & Labor Shift Predictions — 43:10–45:07
Market Risk & Value Concentration — 51:22–54:10
Nvidia Future & Chip Ecosystem — 64:14–65:46
Quickfire Round (Nvidia, Groq, Silicon, Margins, Oracle, Moats) — 70:45–74:00
LLMs as the “Telescope of the Mind” — 76:53

Conclusion

Jonathan Ross offers a compelling narrative that places compute, energy infrastructure, and supply chain agility at the very heart of the next AI revolution. He argues the coming years will see custom chips proliferate, AI labs rise to “Mag 9/11/20” scale, and Nvidia grow even more dominant, but with room for new systems like Groq, especially as efficiency and speed shape the race. The social consequences could be epochal, with AI both lowering costs and creating labor demand, not unemployment. Throughout, Ross’s optimism about abundance, progress, and “the telescope of the mind” shines in a conversation as rapid as the industry itself.

For a full experience, key insights, and technical depth, listening to the episode is highly recommended!


Transcript (270 lines)

[00:00]
Jonathan Ross
The countries that control COMPUTE will control AI and you cannot have COMPUTE without energy. And now we're going to be able to add more labor to the economy by producing more compute and better AI. That has never happened in the history of the economy before. What is that going to do? I personally would be surprised if in 5 years Nvidia wasn't worth 10 trillion. The demand for compute is insatiable. If OpenAI were given twice the inference compute that they have today, if Anthropic was given twice the inference compute that they have today, within one month from now their revenue would almost double.
[00:32]
Harry Stebbings
This is 20VC with me, Harry Stebbings, and this guest holds the record for the most downloads in last year's catalog of episodes. So I'm thrilled to welcome Jonathan Ross, founder and CEO at Groq, back to the hot seat. Now, Groq is the AI chip company redefining inference at scale. Under his leadership, Groq has raised over $3 billion, with the latest pricing the company at close to $7 billion. And before Groq, Jonathan led the team that built out the TPU at Google, making him one of the leading architects of modern AI hardware. Now, this conversation has it all, with everything from OpenAI, Anthropic and Oracle, to what happens to Nvidia in the next 10 years, to how we should think about China. This was an incredible and very wide-ranging discussion.
But before we dive into the show today: I love seeing the team come together to make this show happen. What I don't love is trying to keep track of all the information, the data and the projects that we're working on across dozens of platforms, products and tools. That's why we use Coda, the all-in-one collaborative workspace that's helped 50,000 teams all over the world get on the same page. Offering the flexibility of docs with the structure of spreadsheets, Coda facilitates deeper teamwork and quicker creativity. And their turnkey AI solution, the intelligence of Coda Brain, is a game changer. Powered by Grammarly, Coda is entering a new phase of innovation and expansion, aiming to redefine productivity for the AI era. Whether you're a startup looking to organize the chaos while staying nimble, or an enterprise organization looking for better alignment, Coda matches your working style. Its seamless workspace connects to hundreds of your favorite tools, including Salesforce, Jira, Asana and Figma, helping your teams transform their rituals and do more, faster. Head over to coda.io/20VC right now and get six months of the team plan for startups for free. That's coda, c-o-d-a, dot io slash 20VC, and get six months of the team plan for free. coda.io/20VC.
And talking about precision, that's exactly what Brex brings to your finances. So when Brex was founded, it wasn't just about creating another financial product. It was about solving the really gritty challenges that founders face daily. Let's be honest, building something from the ground up is hard enough without dealing with clunky, outdated banks that pile on fees and leave your cash idle. Brex is different. It's the financial stack that scales with you no matter where you are in your journey. From corporate cards to maximizing your runway to earning yield on your cash, Brex was designed with founders in mind to make every dollar go further so you can focus on building. And here's what really stands out to me. Brex combines the best of checking, treasury and FDIC insurance in one powerhouse account. You can send and receive money globally at lightning speed, earn yield from day one and still access your funds whenever you need. Plus, with 20x the standard protection through program banks, your cash is not just working harder, it's working safer too. It's no surprise that 1 in 3 venture-backed startups in the US, with companies like Anthropic, Coinbase and Robinhood (I mean, my God, these companies are incredible), trust Brex to help them grow. If you want to join the smartest startups on the planet, head over to brex.com/startups and see what they can do for you.
And talking about trust: today, customers expect it faster than ever. And that's why over 10,000 global companies trust Vanta. Vanta automates up to 90% of the work for in-demand compliance standards like SOC 2, ISO 27001 and more, using smart AI to centralize workflows, manage risk and get you audit-ready in weeks, not months, so you can stop chasing paperwork and start closing deals. And a new IDC report found that Vanta customers achieve $535,000 per year in benefits. That's insane. And the platform pays for itself in three months. I had no idea about these. Whether you're growing fast or just getting started, Vanta connects you with trusted auditors and expert support to help you build trust with customers. Get a thousand dollars off your first year at vanta.com/20VC. That's vanta.com/20VC.
[04:48]
Jonathan Ross
You have now arrived at your destination.
[04:51]
Interviewer (likely Harry Stebbings or co-host)
Jonathan, you've just been told by our team that our last show was the most successful of the year when it came out. So there's no pressure at all that this is going to be the most successful of this year. But welcome to the studio, man.
[05:05]
Jonathan Ross
Thank you.
[05:06]
Interviewer (likely Harry Stebbings or co-host)
It's great to have you here, dude.
[05:07]
Harry Stebbings
Now, I wanted to start with an understanding of where we are. It seems the world moves faster than ever before. And honestly, I think a lot of us are trying to understand where everyone lies in a new market. If we look at the current state of the market today, how do you analyze it?
[05:23]
Jonathan Ross
Are you asking, is there a bubble, relatively? In terms of whether or not there's a bubble, my answer is: if you ask a question and you keep not getting an answer, maybe you should ask a different question. And so instead of asking "is there a bubble?", you should ask, what is the smart money doing? So what is Google doing? What is Microsoft doing? Amazon? What are some nations doing? And they're all doubling down on AI. They're spending more. Every time they make an announcement on how much they're spending, it goes up the next time. One of the best examples of the value that's coming from the spend: Microsoft in one quarter deployed a bunch of GPUs and then announced that they weren't going to make them available in Azure, because they made more money using them themselves than renting them out. So there's real money in the market. And the best way, I think, to explain this market: this is like the early days of oil drilling. A lot of dry holes and a couple of gushers. I think the stat that I heard was 35 companies or 36 companies are responsible for 99% of the revenue, or at least the token spend, in AI right now. It's very lumpy.
[06:24]
Interviewer (likely Harry Stebbings or co-host)
And so I'm surprised it's not less when you look at... No, but I mean, seriously, Nvidia really, you know, having concentration of revenue with two clients so heavily.
[06:33]
Jonathan Ross
Yeah, and maybe Nvidia represents 98% of that, but when it's that lumpy, what that's an indication of is it's like the early days of the oil drilling where people didn't know how to find oil. They were going off of instinct, you know, almost vibe investing. And people who had a good instinct would make a fortune and everyone else would lose their shirts. Over time it becomes a science, it becomes very predictable and there's less lumpiness, there's more predictability, but the good investors make less money. So right now is the best time for investors. Right now people are making more money than they're spending. It's just very lumpy.
[07:08]
Interviewer (likely Harry Stebbings or co-host)
I'm sorry, they're making more money than...
[07:10]
Jonathan Ross
They're spending, as an aggregate. Plenty of people are going to lose their shirts, but overall less money is going to go in than is going to come out.
[07:17]
Interviewer (likely Harry Stebbings or co-host)
But when we look at the capex spend today by the big providers, everyone is going, okay, okay, okay. Because there's something coming at the end of it.
[07:25]
Jonathan Ross
Yeah.
[07:26]
Interviewer (likely Harry Stebbings or co-host)
And the trouble is the capex spend is going up and up and up.
[07:29]
Jonathan Ross
Okay. You're thinking of it purely financially, and I think that the financial returns will be positive, but that's not why people are motivated. So I was in Abu Dhabi at the inaugural Goldman Sachs Abu Dhabi event. You know, as you now know, we're sponsoring McLaren. And so Zak Brown was talking, I was talking, and it was a fun event, but I was asked a similar question, like, is AI a bubble? And I asked the following question. So this is, like, 50-plus people who manage 10 billion plus. I'm like, who here is 100% convinced that in 10 years AI won't be able to do your job? No hands went up. I'm like, great. That's how the hyperscalers feel. So of course they're going to be spending like drunken sailors, because the alternative is that they're completely locked out of their business. So it's not a purely economical framework that they're using. It's a "do we get to maintain our leadership?" Now, when you look at the next step, there are these, you know, scale law sort of outcomes. You want to remain in the top 10. We keep talking about the Mag 7. If you're not a member of the Mag 7, you're not going to be able to get anywhere near the valuation. And so what do you do to stay there? You spend, and it's worth it because the stock value stays up because you're in the top seven or ten.
[08:44]
Interviewer (likely Harry Stebbings or co-host)
At some point the returns have to be delivered, though. The spend has to materialize into actual tangible revenue back. And if it doesn't, whether you're in the Mag 7 or not doesn't matter.
[08:56]
Jonathan Ross
That's correct. But right now AI is returning massive value already. It's very lumpy in the applications, but it's returning massive amounts of value. Let me talk about an example that actually happened for us. I've tried a little bit of vibe coding. I'm not the best in the world at it. We've got some interns who are amazing at it. We had this customer visit us and I had a meeting with them. They asked for a feature and I specced it out, very high level, vibey. So I was prompt engineering the engineers, and four hours later it was in production. Not a single line of code was written by a human being. There was no debugging done by a human being. It was all prompting. I think we even have Slack integration now where you commit things through Slack. So all that was done. Four hours later it's in production. Think about the value there. But now, fast forward six months from now, when that could happen before the customer meeting's over. It's a qualitative difference. It's not even just a dollar amount difference. Yes, when you're able to do it that fast, you spend less to get the feature into production. However, qualitatively, when you can do that before the customer meeting is over, you're going to be able to win deals that your competitors won't.
[10:02]
Interviewer (likely Harry Stebbings or co-host)
Can I ask you, just going back to the Mag 7: to stay in the Mag 7, do you think everyone realizes that they will need to move into the chip layer and own the full vertical end to end?
[10:12]
Jonathan Ross
I don't think you're going to see too many successfully moving into the chip layer. People look at the TPU as a big success and what they don't realize is that there were about three chip efforts at Google at the same time and only one of them ended up outperforming GPUs. When you look around the industry, you've got a bunch of people building chips. Some of them are getting canceled, like Dojo recently got canceled. Building chips is hard. Going off and saying I'm going to build my own AI chip to compete with Nvidia. It's a little bit like saying, you know that Google search, it's pretty nice, let's go replicate it. It's insane. The level of optimization, the level of design and engineering that goes into that, you're not going to be able to replicate it with a high probability of success. However, if there's a bunch of players out there trying to do it and you have optionality and one of them succeeds, then you have another chip.
[10:59]
Interviewer (likely Harry Stebbings or co-host)
We mentioned earlier that you have to spend if you want to stay in the Mag 7. Nvidia investing $100 billion into OpenAI for OpenAI just to go and buy back Nvidia chips: is this not just an infinite money loop?
[11:11]
Jonathan Ross
That would be the case if they weren't spending it with suppliers to build those chips. It's not round tripping if actual productive outcomes are occurring. What percentage of the spend is going to building that infrastructure? 40%. So at least 40% of those dollars are actually going out into the ecosystem. So that is not an infinite loop.
[11:31]
Interviewer (likely Harry Stebbings or co-host)
Okay, so it's a partial loop. 60%. Partial loop, 60% is going back to Nvidia.
[11:35]
Jonathan Ross
Sure.
[11:35]
Interviewer (likely Harry Stebbings or co-host)
And then they get a bump in their stock price of a couple of hundred billion dollars.
[11:39]
Jonathan Ross
Yes.
[11:39]
Interviewer (likely Harry Stebbings or co-host)
How did you analyze that?
[11:41]
Jonathan Ross
Let's analyze it in a couple of different ways. From an economic point of view, it makes perfect sense. Why not do that all day long? The value accrues if there is lock-in. When revenue increases result in stock price increases that are greater than the amount of the revenue, it's because you believe that that revenue is going to continue. And that's the belief. And I would actually say with Nvidia, that's probably true. However, it's not just because Nvidia is good, and Nvidia is very good. It's also because there isn't enough compute in the world. The demand for compute is insatiable. I would wager that if OpenAI were given twice the inference compute that they have today, if Anthropic was given twice the inference compute that they have today, within one month from now their revenue would almost double.
[12:30]
Interviewer (likely Harry Stebbings or co-host)
How would their revenue double if they had double the compute?
[12:34]
Jonathan Ross
Right now one of the biggest complaints about Anthropic is the rate limits. People can't get enough tokens from them. And if they had more compute, they could produce more tokens and they could charge more money. And with OpenAI, it's a chat service. So how do you regulate your chat service? You run it slower, you get less engagement.
[12:50]
Interviewer (likely Harry Stebbings or co-host)
How important is speed, do you think? There's a lot of people who think actually it's fine. I'm very happy to have latency and I'm very happy to have a prompt. And then I go away, do something else and something happens when I'm away.
[13:03]
Jonathan Ross
Let's look at CPG, so consumer packaged goods. I want you to rank the CPG goods by margin. At the very top: tobacco, smoking tobacco. Right below that is chewing tobacco. Below that is soft drinks. Below that, you keep going down, you get to water and other things like that. What is the number one thing that a high margin correlates to in CPG? It's the speed at which the ingredient acts on you, that dopamine cycle. How quickly something occurs determines your brand affinity. When something has a very quick response, you associate to that brand and then you accrue brand value. This was the entire basis of Google focusing on speed, Facebook focusing on speed. Every 100 milliseconds of speed up results in about an 8% conversion rate.
[13:51]
Interviewer (likely Harry Stebbings or co-host)
So that is wrong in terms of people's assessment of the future, where they think, oh, it's fine, we'll actually just have lots of prompts going on in the background, and we'll be happy to let them run for long periods of time.
[14:00]
Jonathan Ross
100% wrong. In fact, when we first started working on getting speed on our chips, we knew what speed we could get. We even made a video example of how fast we could be. And people would look at that video example and they would say, why does it need to be faster than you can read? And I would respond to that by saying, well, why does a web page need to load faster than you can read? And there's just this mental disconnect where people couldn't grok the sort of visceral importance of speed. People are very bad at determining what's actually going to matter in terms of engagement, in terms of outcome. But we know this from building the early Internet companies.
[14:37]
Interviewer (likely Harry Stebbings or co-host)
Do you think OpenAI will be able to move into the chip layer? At some point, Nvidia must be concerned that OpenAI will want to verticalize and own the chip layer as well. Do you think they will be able to make that successful transition?
[14:48]
Jonathan Ross
I think one of the problems in building your own chip is this: first of all, everyone thinks that building the chip is the hard part. And then as you do it, you start to realize building the software is the hard part. And then as you do it, you realize keeping up with where everything is going starts to become the hard part. I have no doubt OpenAI will be able to build its own chips. I have no doubt that eventually Anthropic will be building their own chips, that every hyperscaler will build their own chip. I had this experience when I was at Google where I got a lab tour. And this was before AMD was doing a great job, right? AMD was struggling for a little while, and now they're doing great. But they had built 10,000 servers of AMD chips. I was walking through the lab and they were pulling the servers out of the racks, taking the AMD chip, popping it off, and throwing it in a trash can. The funny thing was, it was almost preordained, because everyone knew that in that generation, Intel was going to win. So why did Google build 10,000 servers? Because they wanted to get a discount on the Intel chips they bought. And when you're at that scale, the cost to design your own server (because they had to design their own motherboard in order to fit the AMD chip) and to build that out and test it, versus the discount that you get: totally worth it. You have to think of what all the motivations are. When people are building their own chips, it's not just because they're going to deploy that chip in mass production. Nvidia effectively has a monopsony on HBM. A monopsony is the opposite of a monopoly: it's when you're a single buyer. There's a finite amount of HBM capacity, which is the high bandwidth memory that goes into the GPUs. The GPU itself is made using the same process that's used to build the chip that's in your mobile phone. If Nvidia wanted to, they could build 50 million of those GPU dies per year. 
But they're going to build about 5.5 million GPUs this year. And the reason is because of that HBM, because of the interposer that it goes on, there's just a finite capacity. So what happens is a hyperscaler comes in and says, I want a million GPUs. And Nvidia's like, sorry, I've got other customers. And the hyperscaler says, no problem, I'm going to build them myself. And then all of a sudden those GPUs are found by Nvidia to give to the hyperscaler. There is just a finite amount of capacity. By building your own chip, what you really get isn't your own chip. It's that you get control over your own destiny. That's the unique selling point of building your own chip.
[17:03]
Interviewer (likely Harry Stebbings or co-host)
What does that mean, control over your own destiny?
[17:05]
Jonathan Ross
Nvidia can't tell you what your GPU allocation is. It may cost you more to deploy your own chip because it's not going to be quite as good as Nvidia's. Let's think about why Nvidia's GPUs with a slight edge over AMD's GPUs dominate. If your total cost to deploy is a huge multiple of the cost of the chips in the systems, then a small percentage increase in the cost of the chip is negligible. If I'm going to deploy a CPU and that CPU is 20% of the BOM, and I get a 20% increase in the speed of the chip, that is a 20% value increase in the entire system versus a 20% increase in the chip cost, right? It's negligible so you get these huge multiples when you improve the chip performance. So small differences in performance make a huge difference in the value of the product. A small edge gives you a massive edge in selling that product.
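The arithmetic in this answer can be sketched roughly as follows. The sketch uses the figures Ross mentions (a chip that is 20% of the BOM, a 20% speed-up) and assumes, purely for illustration, that the faster chip costs 20% more and that system value scales one-for-one with chip speed. The function name and the baseline system cost of 100 are hypothetical.

```python
def system_tradeoff(chip_share, chip_cost_increase, speed_increase, system_cost=100.0):
    """Compare the system-level cost increase with the system-level value increase.

    chip_share: fraction of the bill of materials (BOM) spent on the chip.
    chip_cost_increase: fractional price premium paid for the faster chip.
    speed_increase: fractional speed-up of the chip.
    """
    chip_cost = system_cost * chip_share
    new_system_cost = system_cost + chip_cost * chip_cost_increase
    cost_increase = new_system_cost / system_cost - 1.0
    # Assumption: the whole system's value rises one-for-one with chip speed.
    value_increase = speed_increase
    return cost_increase, value_increase


cost_up, value_up = system_tradeoff(chip_share=0.20, chip_cost_increase=0.20, speed_increase=0.20)
print(f"system cost +{cost_up:.1%}, system value +{value_up:.1%}")
```

Under these assumptions the premium adds about 4% to the system cost while adding 20% to its value, which is one reading of why a small performance edge is described as a large commercial edge.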
[17:58]
Interviewer (likely Harry Stebbings or co-host)
Is it possible for OpenAI, Anthropic, any of the Mag 7, any of the other providers to move into the chip layer if there is a monopsony on the HBM market?
[18:09]
Jonathan Ross
It's very hard. However, there is an incentive from those building HBM to spread that around, because Nvidia gets to negotiate very good rates because they're such a large buyer. However, if you are building an HBM fab and packaging house and all of this other, you know, part of the ecosystem, if Nvidia comes in and writes a big check, then you're going to build the fab for them. So Nvidia is always going to get the amount of supply that they want in advance. The problem is you have to write that check more than two years in advance. With where AI's gone, you know, just absolutely hockey sticking, even when you have the cash flow of Nvidia, it's hard to actually write the checks for the amount of demand that there's going to be in advance. There is going to be a supply constraint, and it's not purely based on being a monopsony. Part of it is based on just the sheer capital costs, and the memory suppliers are very conservative. There's also this situation where the margin on HBM is so high that no one wants to actually increase the supply, because then the margin goes down.
[19:10]
Interviewer (likely Harry Stebbings or co-host)
When you look at that, and when you look at OpenAI, when you look at Anthropic having their own chips, is that why they're raising the money they are? Sam said they're going to need hundreds of billions of dollars. Is that factoring that in?
[19:22]
Jonathan Ross
No. So buying a system is expensive. Buying a data center is more expensive. The reason is you're amortizing that data center over a longer period of time. So even if a data center was going to be 1/3 of your cost per year, if you're amortizing that data center over 10 years and the chips over three to five years, the data center is going to end up costing you more per year. When you hear the hyperscalers talking about that $75 billion to $100 billion a year investment because they're building out the capacity for data centers, they're putting a lot of money up for returns that they're expecting over the next 10 plus years. So it's actually not that much money when you think about it.
[20:00]
Interviewer (likely Harry Stebbings or co-host)
Are we thinking about amortization in the right way, in a three to five year cycle, if chip cycles are actually faster than that?
[20:07]
Jonathan Ross
People are definitely thinking about it over a longer period than I would. We use a more conservative number. Internally, I think five to six years.
[20:14]
Interviewer (likely Harry Stebbings or co-host)
Would be like three years, a little bit less.
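One way to read the amortization point made at [19:22] is sketched below, under stated assumptions: straight-line amortization, a 10-year data center life, and a 4-year chip life (the midpoint of the "three to five years" mentioned). Under those assumptions, a data center that accounts for only one third of the annual cost still needs a larger upfront check than the chips. The function name and the chip capex of 100 are hypothetical.

```python
def datacenter_capex_for_share(chip_capex, chip_years, dc_years, dc_annual_share):
    """Upfront data center capex implied by its share of total annual amortized cost."""
    chip_annual = chip_capex / chip_years
    # Solve dc_annual / (dc_annual + chip_annual) == dc_annual_share for dc_annual.
    dc_annual = chip_annual * dc_annual_share / (1.0 - dc_annual_share)
    return dc_annual * dc_years


dc_capex = datacenter_capex_for_share(
    chip_capex=100.0, chip_years=4, dc_years=10, dc_annual_share=1 / 3
)
print(f"chips: 100 upfront, data center: {dc_capex:.0f} upfront")
```

With these illustrative inputs the data center comes to roughly 125 upfront against 100 for the chips, even though it is the smaller line item per year.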
[20:16]
Jonathan Ross
We're looking at upgrading chips about once a year. Now here's the way to think about it. There's two phases of the value of a chip. There's the "am I willing to buy it and deploy it?" and there's the "am I willing to keep it running?" They're two very different calculations. When you deploy it, you have to be able to cover the capex. When you keep it running, you just have to beat the opex. So if I deploy a chip today, I have to beat the capex. I have to earn all my capex back and make a profit and produce a return. Once I've deployed it, as long as I'm beating my operational costs, I'm going to keep that thing in production. So you're okay with the value of that chip going down over time. Now, the bet that everyone is making is that those new chips that come out aren't going to reduce the value of the old chips below the opex. And in our case, we actually don't think that five years makes any sense.
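The two phases described here can be sketched as two separate checks. This is a minimal illustration, not Groq's actual model: the function names and all numbers are hypothetical, and it ignores discounting and contract terms.

```python
def should_deploy(expected_lifetime_revenue, capex, lifetime_opex):
    """Phase 1: buy and deploy only if expected revenue repays capex and opex with a profit."""
    return expected_lifetime_revenue > capex + lifetime_opex


def should_keep_running(annual_revenue, annual_opex):
    """Phase 2: capex is already spent, so the chip stays on while it beats its operating cost."""
    return annual_revenue > annual_opex


# Hypothetical numbers for an older chip whose rental income has fallen over time.
print(should_deploy(expected_lifetime_revenue=90.0, capex=80.0, lifetime_opex=30.0))  # False
print(should_keep_running(annual_revenue=12.0, annual_opex=8.0))  # True
```

The same chip can fail the first check and pass the second, which matches the point that a chip nobody would buy today can still be worth keeping in production.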
[21:10]
Interviewer (likely Harry Stebbings or co-host)
Because they will be so much less performant that actually the value will be lower than the operating costs for the.
[21:16]
Jonathan Ross
Electricity and for paying for the data center.
[21:18]
Interviewer (likely Harry Stebbings or co-host)
So what happens then? We just have this excess supply of wasted chips which are going to.
[21:23]
Jonathan Ross
Because a lot of these people have entered into really long contracts, they have a third point to consider in their calculation: is breaking this contract cheaper than running the chip at a loss?
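Editor's sketch: the third calculation Ross mentions compares a one-time cost of breaking the contract against the cumulative loss from running a chip below opex for the rest of the term. The function name, the flat per-year figures, and the `break_fee` parameter are hypothetical. Real contracts will have terms this simple comparison ignores.

```python
def cheapest_exit(revenue_per_year: float, opex_per_year: float,
                  years_left: float, break_fee: float) -> str:
    # Loss from keeping the chip running for the remaining contract term.
    loss_per_year = max(opex_per_year - revenue_per_year, 0.0)
    cost_of_running = loss_per_year * years_left
    # Break the contract only if the fee is strictly smaller than that loss.
    return "break_contract" if break_fee < cost_of_running else "keep_running"


# Illustrative numbers only: the chip loses 2 per year with 3 years left.
print(cheapest_exit(revenue_per_year=8.0, opex_per_year=10.0, years_left=3.0, break_fee=4.0))
print(cheapest_exit(revenue_per_year=8.0, opex_per_year=10.0, years_left=3.0, break_fee=10.0))
```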
[21:35]
Interviewer (likely Harry Stebbings or co-host)
So what happens then?
[21:37]
Jonathan Ross
I can't tell you what happens because we're trying to avoid that situation by having a much faster payback period in all of our calculations. I would not want to make a bet that long out. The shorter the timeframe that you're making the bet, the clearer your outcome is.
[21:52]
Interviewer (likely Harry Stebbings or co-host)
So essentially you want to minimize payback period as much as possible and then minimize operating cost so that you can shed less performant chips faster.
[22:01]
Jonathan Ross
Yes, but also here's another crazy part, which is when you look at the math this way, like if I'm approaching it as an accountant, I'm going to be like, this is a terrible idea. But if I look at it empirically, people are still renting H100s. How old are those chips? They're getting close to five years old. They're still earning more than their operating costs by quite a bit. You would never deploy an H100 today, but they're still profitable to run. They're in that second phase. The reason is people can't get enough compute. If that wasn't the case, H100s would be renting for a fraction of what they're renting for today. And as long as you can't get enough compute, that's going to be true. The question is, is there an alternative out there that isn't as supply constrained? And so this is where we're hoping to come in. Let's talk about our value proposition. So you started off asking me about speed. Do you know how many customers come to us asking for speed? 100%. Do you know how many customers keep asking about that once they realize the supply constraint out there? None. So they start with speed, they know the value of that to their end customer, and then they're like, oh, wait a second, I can't even get enough compute. The real value prop is: can you provide more compute capacity? So two weeks ago we had a customer come to us and ask for 5x our total capacity. They couldn't get that capacity from any hyperscaler. They couldn't get it from anyone else. We couldn't give it to them. No one can. We couldn't get that customer. The hyperscalers couldn't get that customer. There isn't enough compute. So your choice is I buy this compute and I get the customer. This is where I was going when I said if OpenAI or Anthropic were to double their compute, they would double their revenue.
So if you're someone who can't get enough compute to serve your customer, then you're going to be willing to pay whatever it takes to get those customers because you feel that there's lock-in value by getting that customer. Now the number one value prop that we have is that our supply chain is not like a GPU supply chain. You have to write a check two years in advance to get GPUs. For us, you write us a check for a million LPUs and the first of those LPUs starts showing up six months later.
[24:03]
Interviewer (likely Harry Stebbings or co-host)
Wow. So you've got an 18 month chasm difference.
[24:06]
Jonathan Ross
That's right. So I had a meeting with the head of infrastructure of one of the hyperscalers and I talked about speed, I talked about cost and all this stuff. But when I talked about the supply chain and how we could do something in six months. He just stopped the conversation for a moment and wanted to dig into that. That was the only thing he cared about.
[24:22]
Interviewer (likely Harry Stebbings or co-host)
Given the speed of progression of the landscape of models, does two years make sense?
[24:27]
Jonathan Ross
So do you know Sara Hooker?
[24:29]
Interviewer (likely Harry Stebbings or co-host)
No.
[24:30]
Jonathan Ross
So she wrote this paper, The Hardware Lottery. My TL;DR on that one is people are designing the models for the hardware. There are architectures that could be better than attention. However, attention works really well on GPUs. So if you are the incumbent, you have an advantage because people are designing their models for your hardware. It doesn't even matter if there's a better architecture out there. It's not going to run well. So it's not a better architecture. There's a little bit of a loop there. If you are building two years out and you're the incumbent, that's okay. But if you're trying to enter the market, no one's going to design for your chips two years out. So you have to have a faster loop.
[25:08]
Interviewer (likely Harry Stebbings or co-host)
When you see everyone moving into the chip layer, as you said, OpenAI will have their own, Anthropic will have their own. What does Nvidia do in that world?
[25:15]
Jonathan Ross
Nvidia still keeps selling chips.
[25:18]
Interviewer (likely Harry Stebbings or co-host)
To who, though, given the concentration of their buyers?
[25:20]
Jonathan Ross
So we started off talking about "is AI a bubble?" If you look for the last 10 years at infrastructure for data centers, you're planning that out 2, 3, 4, 5 years in advance, right? What happens is everyone's predictions are wrong. They end up building too little. This has just been what's happened for the last 10 years. So if you don't build enough for 10 years, what do you do? You try and overbuild. You try and build more than your most optimistic projections. And then once again, you haven't built enough. So you increase your projections and you just keep doing this. That's what's been happening. And yet people still aren't building enough compute. Where people's instincts are off is that AI doesn't work the way SaaS does. In SaaS, you have a bunch of engineers who go out and build a product and the quality of that product is determined based on what those engineers did. That's not the case in AI. In AI, I can improve the quality of my product by running two instances of the prompt and then picking the better answer. I can actually spend more to make my product better on each query. I can even decide this customer is more valuable and I'm going to give them a better result. That's kind of what OpenAI announced when they said, we're now going to release some products where we can't really afford the compute. So we're going to give it to a limited set of users and we're going to charge more because we want to see what happens when we give more compute to the AI. We want to see what that product looks like and how much better it is. That is going to be our future. Every time you give more compute to an application, the quality increases. And this is why it's not coincidental that you see people's token-as-a-service bill almost matching their revenue, because they're competing for customers, and if they just spend more, their product gets better.
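Editor's sketch: "running two instances of the prompt and then picking the better answer" is usually called best-of-N sampling. The snippet below is a minimal illustration of the idea and not any vendor's API. `generate` and `score` are hypothetical callables the caller supplies (a model call and a quality scorer), and the toy stand-ins at the bottom exist only so the example runs on its own.

```python
from typing import Callable


def best_of_n(prompt: str,
              generate: Callable[[str, int], str],
              score: Callable[[str, str], float],
              n: int = 2) -> str:
    # Spend n times the compute: sample n candidates, keep the highest scoring one.
    candidates = [generate(prompt, seed) for seed in range(n)]
    return max(candidates, key=lambda answer: score(prompt, answer))


def samples_for(customer_tier: str) -> int:
    # Hypothetical policy: more valuable customers get more samples per query.
    return {"free": 1, "paid": 2, "premium": 4}.get(customer_tier, 1)


# Toy stand-ins so the sketch is self-contained; a real system would call a model.
def toy_generate(prompt: str, seed: int) -> str:
    return " ".join([prompt] * (seed + 1))


def toy_score(prompt: str, answer: str) -> float:
    return float(len(answer))


print(best_of_n("hi", toy_generate, toy_score, n=samples_for("premium")))
```

The cost of a query scales with `n`, which is the sense in which quality can be bought with compute per request.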
[26:59]
Interviewer (likely Harry Stebbings or co-host)
But bluntly, the assumption when you look at GPT-5 and the focus on efficiency is that Sam transitioned from performance to efficiency because compute does not equal a parallel level of performance improvement. Do you think that is fair and true? And does that not go against what you just said?
[27:17]
Jonathan Ross
No. And you have to think of the different outcomes that they're looking for. So if you are OpenAI, you have moved into markets that are incredibly cost sensitive. Let's talk about India for a second. If you want to go into India, what's the one thing you need? 99 rupees a month. That's about $1.13. You need to charge your customer $1.13 for your product. So they're going after a market whose alternative is, I have no AI.
[27:42]
Interviewer (likely Harry Stebbings or co-host)
You've got open... I mean, they can use DeepSeek.
[27:45]
Jonathan Ross
This is another misconception in the market. Let's just start busting every misconception.
[27:49]
Interviewer (likely Harry Stebbings or co-host)
Sure.
[27:49]
Jonathan Ross
Great. When the Chinese models came out, everyone reacted by saying, oh my God, they've trained models that are almost as good as the US models. And we had a podcast on this, right? Even I was snookered a little bit at first: oh my gosh, aren't these models so much cheaper to run? Now that I know more about the foundation models that people are using versus the Chinese models: no, they're not cheaper to run. They're about 10x as expensive, actually. Let's just take the GPT-OSS model that was released. It's optimized for something different than the Chinese models, but the quality is very high and I would argue it clearly is a better model for what it focuses on than the Chinese models. Now, the Chinese models focus on different things. However, the cost to run the OSS model is about 1/10 that of the Chinese models. So why was everyone charging less? When you have sort of a captive market for a model, because people say "I want this model" and there's only one provider of it, you can charge ten times as much. The price was higher, and people were confusing the cost with the price. The Chinese models were optimized to be cheaper to train as opposed to cheaper to run. When you see how much intelligence has been squeezed into the OSS model versus the equivalent Chinese models, it's clear that the US still has a training advantage. And the economics work out such that you have to amortize that training over every inference, which means that you want to charge more. And so there's still a balance there. But as you scale out to larger and larger numbers of people, being able to afford to train a model starts to pay off. As you deploy more inference capacity, you want to spend a bit more on the training to get your inference cost down. In the US we have a massive compute advantage, and so people train the models harder, bringing the cost down.
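Editor's sketch: the trade-off Ross describes, amortizing training over every inference, can be shown with a two-term cost model. All dollar figures and volumes below are hypothetical. The only number taken from the conversation is the rough 10x gap in cost to run.

```python
def cost_per_million_tokens(training_cost: float,
                            run_cost_per_million: float,
                            millions_served: float) -> float:
    # Training is a fixed cost spread over every token served;
    # run cost is paid again for each million tokens.
    return training_cost / millions_served + run_cost_per_million


def break_even_millions(train_a: float, run_a: float,
                        train_b: float, run_b: float) -> float:
    # Volume at which the extra training spend of model A is repaid
    # by its lower run cost relative to model B.
    return (train_a - train_b) / (run_b - run_a)


# Hypothetical models: A is trained harder and is 10x cheaper to run than B.
A = {"train": 100e6, "run": 0.10}
B = {"train": 10e6, "run": 1.00}

for volume in (1e7, 1e9):  # millions of tokens served over the model's life
    a = cost_per_million_tokens(A["train"], A["run"], volume)
    b = cost_per_million_tokens(B["train"], B["run"], volume)
    print(volume, round(a, 3), round(b, 3))

print(break_even_millions(A["train"], A["run"], B["train"], B["run"]))
```

At low volume the cheaply trained model has the lower all-in cost. Past the break-even volume the heavily trained model wins, which matches the point that more deployed inference capacity justifies more training spend.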
[29:35]
Interviewer (likely Harry Stebbings or co-host)
Why do we have a compute advantage in the US? Just in terms of access to chips?
[29:39]
Jonathan Ross
That's correct.
[29:40]
Interviewer (likely Harry Stebbings or co-host)
Will China not just subsidize the inference and the running there? I understand what you... So does it matter if their cost of running is higher, but the CCP will just subsidize it? Does it matter?
[29:50]
Jonathan Ross
There's a home game and there's an away game. The home game is we want to build enough compute for the United States. The away game is we want to build it for our allies, right? Europe, South Korea, Japan, India, and so on. China can win their own home game. They're going to build 150 nuclear reactors, so they're going to have enough energy, even though their chips aren't as energy efficient and they can subsidize, as you mentioned. But the away game is different. If a country only has 100 megawatts of power, what are they going to do? Build another nuclear power plant? Like, that's just not a realistic thing. You can do that in China, you can't do that elsewhere. So having a better chip gives you an advantage in the away game. So my expectation is that right now, for the next two to three years, the United States has a clear advantage in that away game over China. And if we move very quickly, then we're going to be able to bring a bunch of allies into the AI race.
[30:39]
Interviewer (likely Harry Stebbings or co-host)
Do you think we should have open models to allow for China to distill in the effective ways that they have done already?
[30:45]
Jonathan Ross
The model itself is not a clear advantage. So the first time you had me on your podcast, I predicted that OpenAI was about to open source their model. Remember that? And my prediction was based on their branding strength. Frankly, OpenAI could probably use Llama 2, the old model from how long ago? Like two years ago? Yeah, yeah. And people would probably still use it. And so there's a brand advantage there. Now they do have very good models, but they don't necessarily need it because of that brand advantage. I think that Anthropic should be open sourcing their previous generation in order to get people using them instead of the Chinese models. Because if someone is willing to use a Chinese model, then they would at least be using the Anthropic model and their prompts would be recyclable. And just like you have software compatibility, you have prompt compatibility. For example, when the OpenAI OSS model was released, one of the main reasons people started adopting it over the Chinese models was they could reuse their prompts. Now of course, when someone has a low cost application and they can't afford the premium for OpenAI, they want to use one of these open source models. Eventually they start doing really well, they make more money, they start wanting to get access to the premium model. Their prompts are reusable. So there's a win by open sourcing these models. And you're also getting all of these infrastructure providers to drive the cost down on that model as well. There's a lot of innovation that goes into that.
[32:10]
Interviewer (likely Harry Stebbings or co-host)
There's so many different areas that I want to take this. We said just build as much compute as possible. The energy requirements are intense. Is the only way to provide the energy required for this compute wave, tsunami, whatever you want to call it, nuclear?
[32:26]
Jonathan Ross
No, nuclear is efficient and cost effective. But renewables are efficient and cost effective too. I'll give you my simple hack. All the United States and its allies have to do in order to have more energy than China is to be willing to locate their compute where energy is cheap. Let's compare Europe to the United States. The United States is incredibly risk averse compared to Europe in everything. But you have to ask what kind of risk? There's two kinds of risk. There's mistakes of commission, where you do something that's a mistake, and then there's mistakes of omission, where you don't do something and it's a mistake. And the United States is terrified of making mistakes of omission. When you are in a massive growth economy, missing out is more expensive than fumbling something. Europe is incredibly willing to embrace the risk of omission. The way that Europe is trying to compete is through legislation, by saying things like, I want to keep this data in Europe, or I want to keep this data in this country. If Europe wanted to compete in AI, all you'd need to do is say, Norway, please deploy an enormous number of wind turbines. Why? Norway has about an 80% utilization rate of wind. So like 80% of the time you can be generating energy. They have enough hydro that if you deployed 5x the wind power of the hydro, Norway itself could provide as much energy as the United States and could do it consistently. The entire United States. That's one country in Europe. How much other energy is there out there that could be unlocked that isn't nuclear? And by the way, let's also deploy nuclear. Nuclear is incredibly safe these days.
[34:13]
Interviewer (likely Harry Stebbings or co-host)
Why do we not, then? Fear? Is that really it?
[34:16]
Jonathan Ross
Yeah.
[34:17]
Interviewer (likely Harry Stebbings or co-host)
When you speak to European governments, what do they say to you?
[34:20]
Jonathan Ross
I don't bring up nuclear because I'm not going to push an energy source that everyone's going to push back on. But when I was in Japan recently, they were talking about bringing their nuclear reactors back online. Japan has a reputation of being very slow. There's a lack of subtlety and nuance in that perception. The reality is Japan is slow to make a decision, but when they decide something, they move really fast. Let's take an example. Japan decided to build a 2 nanometer fab. When I was there last, they were showing off these 2 nanometer wafers that they produced. Now, the yield's not where it needs to be. This is not production grade, but they built a 2 nanometer fab and they are producing wafers out of it. And they're going to start getting that defect density down. They're going to move quickly. They've allocated $65 billion for AI and they're going to spend it and they're going to spend it quick. They're going to turn their nuclear reactors back on. When Japan is going to turn their nuclear reactors back on, Europe needs to listen to that and go, gosh, we need to catch up in energy.
[35:15]
Interviewer (likely Harry Stebbings or co-host)
Catch up is exactly what I was thinking. Because what I'm thinking is the speed it takes to build out. You said about kind of Norway's latent capacity of wind and how we could utilize it. Dude, it takes years to build huge, huge supply of turbines. You think the Norwegian government is going to shell out and have 10,000 wind turbines on the coast?
[35:33]
Jonathan Ross
Why does the Norwegian government need to pay for it?
[35:35]
Interviewer (likely Harry Stebbings or co-host)
Who should?
[35:36]
Jonathan Ross
How about the hyperscalers? How about other governments that want to locate there? In Saudi Arabia, there are gigawatts of power and they're building out data centers for that. Why doesn't Europe work with Saudi Arabia? Saudi Arabia wants to do a program of data embassies where you have sovereign oversight over your data, but you get to use their energy. Why not use that? Problem solved. They're going to build out 3 to 4 gigawatts in the very near future.
[36:04]
Interviewer (likely Harry Stebbings or co-host)
So the hyperscalers would pay Norway to use their renewable energy sources and then leverage that.
[36:10]
Jonathan Ross
The complaint that the hyperscalers have is all of the paperwork and the slowness. I was talking to someone who was on the board of a major energy company that builds nuclear power plants. He said they spend three times as much on the permitting in the United States as on the nuclear power plant. And I don't know about Europe, but typically the United States is better than Europe on this. How much does it cost to build a nuclear power plant in Europe? The actual cost of the infrastructure versus the permitting. Here's what everyone needs to walk away from this with: the countries that control compute will control AI, and you cannot have compute without energy.
[36:46]
Interviewer (likely Harry Stebbings or co-host)
How far behind is Europe, and is there a way for us to get back? Is there a chasm which we can catch up on?
[36:54]
Jonathan Ross
I don't think there's a problem right now if Europe acts now. I mean, China is ahead in action, but there are 500 million people in Europe. There's over 300 million in the US. And if you start bringing all the allies together: South Korea, who, by the way, knows how to build nuclear power plants. The power plant in the UAE was built by South Korea. They could build power plants here. France knows how to build power plants. How about a little bit of a Manhattan Project for building enough energy? When I'm walking around in Europe in the summer, it's incredibly hot, and when I'm walking around in the winter, it's incredibly cold. That is not an experience you have anywhere else in the world. Build more energy.
[37:34]
Interviewer (likely Harry Stebbings or co-host)
I'm with you, Jonathan, but I'm also realistic. I know how slow we are as governments, both singly and in collaborating together. It's not going to happen at the speed at which this needs to be done. What happens if that does not happen at the speed at which it needs to be done?
[37:49]
Jonathan Ross
Then Europe's economy is going to be a tourist economy. People are going to come here to see the quaint old buildings, and that's going to be it. You cannot compete in a new economy if you don't have the resources that the new economy is built on. And the new economy is going to be AI, and it's going to be built on compute.
[38:05]
Interviewer (likely Harry Stebbings or co-host)
Is model sovereignty enough to win if you look at a provider?
[38:09]
Jonathan Ross
Because if you don't have compute, you can't run the AI. Doesn't matter how good your model is. You could have a model that is 10 times smarter than OpenAI's model. And if OpenAI has 10 times the compute, OpenAI's model is going to be better.
[38:21]
Interviewer (likely Harry Stebbings or co-host)
So for Mistral, who say, hey, we're going to have sovereignty within Europe, and the German healthcare system and the Croatian Transport Ministry are going to use Mistral because we're a European alternative, that's not a reason to win.
[38:36]
Jonathan Ross
What's the USP? What's the unique selling point?
[38:38]
Interviewer (likely Harry Stebbings or co-host)
It's a European model and it doesn't have ownership in the US under a Trump administration.
[38:43]
Jonathan Ross
What you're solving for there is removing someone else's ability to control you. Yeah, but what you're not solving for is having enough of it. By the way, I'm not saying don't use Mistral. We have a partnership with Mistral. We love Mistral. The thing I'm saying is build enough compute so that Mistral can compete.
[38:59]
Interviewer (likely Harry Stebbings or co-host)
If you listen to this, are you not just like, shit, I should just buy the shit out of CoreWeave? Seriously, like, when you look at what they provide on demand.
[39:07]
Jonathan Ross
Yeah, CoreWeave is a great company, but they have a finite allocation of GPUs. Everyone has a finite allocation.
[39:14]
Interviewer (likely Harry Stebbings or co-host)
When we chatted before, you said to me that GPUs are not the best infrastructure for inference.
[39:21]
Jonathan Ross
Correct.
[39:21]
Interviewer (likely Harry Stebbings or co-host)
And that we are moving more and more into a world of inference as we move further along the maturation cycle of training models.
[39:29]
Jonathan Ross
Yes.
[39:29]
Interviewer (likely Harry Stebbings or co-host)
Does that not mean Nvidia's power hold weakens further?
[39:34]
Jonathan Ross
No. Nvidia is going to sell every single GPU that they build. Even if we end up supplying 10 times as many LPUs as GPUs, all that's going to do is increase the demand for GPUs and allow them to charge an even higher margin. Because the more inference you have, as mentioned before, the more you need to train the model to optimize for the inference. And the more training you have, the more inference you want to deploy to amortize the cost of the training. There's a virtuous cycle between the two.
[39:59]
Interviewer (likely Harry Stebbings or co-host)
Is the inference market playing out as you expected it to, in terms of maturation deployment speed?
[40:05]
Jonathan Ross
What I never expected was that AI was going to be based on language. What that's done is it's made it trivial to interact with AI. I thought it was going to be more like AlphaGo. I thought it was going to be intelligent in some weird esoteric way. The fact that it's language means anyone can use it. So I expected AI to come sooner and grow slower. It came later and it's growing faster than I ever imagined. It is so easy to interact with AI that anyone can do it.
[40:34]
Interviewer (likely Harry Stebbings or co-host)
10% of the world's population is a GPT weekly active user. Isn't that astonishing?
[40:40]
Jonathan Ross
Yes. But you know what's holding it back?
[40:42]
Interviewer (likely Harry Stebbings or co-host)
Compute.
[40:43]
Jonathan Ross
So compute is holding it back for the quality of it. More people would use it, they just wouldn't get as much out of it. But more people would use it if more languages were supported. This is the number one complaint we hear around the world. You know what would solve that? More compute. More data. If you have more data, then you can train more, but you need more compute. And by the way, if you have more compute, you can generate more synthetic data so you can train more. So you have data, you have algorithms and you have compute. If you improve any one of them, it's not a bottleneck. It's not like if the compute doesn't get better, I can't use more data, or if the data doesn't get better, I can't use more compute. Any one of these that gets better improves AI, and that makes it really easy to improve AI because you can improve one dimension of it. It just turns out the easiest knob to improve AI isn't the algorithms. Algorithms rarely improve. It's not the data, because it's really hard to get more data and we haven't fully figured out synthetic data generation. We're good at it, but we're not at the point yet where we can just directly turn compute into more data. We're getting there. Compute is the easiest knob because it just keeps getting better and better and better every year. If I write a check for enough money and I'm willing to wait a little while, I'm going to get more compute. It's the most predictable part of the pipeline. And yet we still underestimate how much we need.
[41:57]
Interviewer (likely Harry Stebbings or co-host)
Do you think we are dramatically underestimating how much we need today?
[42:00]
Jonathan Ross
Yes.
[42:00]
Interviewer (likely Harry Stebbings or co-host)
By what scale?
[42:02]
Jonathan Ross
Going back to what I said about how every time you add more compute, a product gets better. There is no limit to the amount of compute that we can use. It's different from the Industrial Revolution. In the Industrial Revolution, you couldn't use energy unless you had the machinery to use it. And you had to build machinery. And that took time. If I wanted to have more cars on the road, I had to build the cars. It wasn't enough to just pull more oil out of the ground. AI is not like that. Yes, if I make my model better, I can actually do more with the same amount of compute. But if I double my compute, I double the number of users, I improve the quality of the model. This is different. I can literally just add more compute to the economy and the economy gets stronger. We've never had that before, where it wasn't a bottleneck, it was more of a rubberneck where you could just force more of one component through and then everything improves.
[42:49]
Interviewer (likely Harry Stebbings or co-host)
You said the economy gets stronger. When we think about what that's predicated on, that's predicated on the $10 trillion labor spend in GDP shifting to AI and us taking a portion of that. Do you think that we will see significant shifts in GDP, or in the spend on labor moving towards AI, in the next five years?
[43:10]
Jonathan Ross
I believe that AI is going to cause massive labor shortages. I don't think we're going to have enough people to fill all the jobs that are going to be created. There's three things that are going to happen because of AI. The first is massive deflationary pressure. This cup of coffee is going to cost less. Your housing is going to cost less. Everything is going to cost less, which means people are going to need less money.
[43:29]
Interviewer (likely Harry Stebbings or co-host)
So how is it going to cost us less to have a cup of coffee because of AI?
[43:33]
Jonathan Ross
Because you're going to have robots that are going to be farming the coffee more efficiently. You're going to have better supply chain management. It's just going to be across the entire supply chain. You're going to be able to genetically engineer the coffee so that you get more of it per watt of sunlight, just across the entire spectrum. So you're going to have massive deflationary pressure. That's number one. And what that means is people will need to work less. That's going to lead you to number two, which is people are going to opt out of the economy more. They're going to work fewer hours, they're going to work fewer days a week, and they're going to work fewer years. They're going to retire earlier because they're going to be able to support their lifestyle working less. And then number three is we're going to create new jobs and new industries that don't exist today. Think about 100 years ago, 98% of the workforce in the United States was in agriculture. 2% did other things. When we were able to reduce that to 2% of the population working in agriculture, we found things for those other 98% of the population to do. The jobs that are going to exist 100 years from now, we can't even contemplate. 100 years ago, the idea of a software developer made no sense. 100 years from now, it's going to make no sense, but in a different way, because everyone's going to be vibe coding. Influencers: that wouldn't have made sense 100 years ago, but now that's a real job. People make millions of dollars off of it. So, number one, deflationary pressure. Number two, opting out of the workforce because of that deflationary pressure. And number three, jobs and companies that couldn't exist today that are going to exist and are going to need labor. We're not going to have enough people.
[45:07]
Interviewer (likely Harry Stebbings or co-host)
It's fascinating, the counter narrative, isn't it? Everyone being like, oh, millions and millions of people will be unemployed. And you're like, no, we're actually not going to have enough people for the jobs.
[45:16]
Jonathan Ross
Well, what was the famous prognostication 100 years ago that there was going to be massive famine because we weren't going to be able to feed ourselves? People always underestimate what's going to change in the economy when you improve technology.
[45:28]
Interviewer (likely Harry Stebbings or co-host)
When you think about the requirements from an energy perspective, and then also what you just said there about kind of labor. Do you think Trump and a Trump administration is doing more to help or to hurt the advancement of AI in the US?
[45:40]
Jonathan Ross
Definitely help. All of the moves that have been made are things that are going to help with AI. For example, the permitting issues. Overall, it's been a very positive experience on AI.
[45:51]
Interviewer (likely Harry Stebbings or co-host)
You mentioned vibe coding. I do just have to ask about it. Do you think this is an enduring and sustainable market? When you look at a lot of the use cases today, they're quite transient. How do you analyze the future of the vibe coding market? Having played with it a little bit and having seen also interns, as you said, who are very good at it internally, use it well.
[46:09]
Jonathan Ross
So let's take reading. Reading and writing used to be a career. If you were a scribe, you were one of the small percentage of people who knew how to read and write, and people would hire you just to record things. And you did much better than the average person in the economy because of that, because it was a specialized skill. Coding has been the same thing. A very small percentage of the population did it. It took a couple of years to learn how to do it well. Some people were really good at it. Now everyone reads, everyone writes. It's not a special skill. It's expected in every job. And coding is going to become the same thing. For you to be in marketing, you're going to have to be able to code. For you to be in customer service, you're going to have to be able to code. I was having dinner with someone who runs a chain of 25 coffee shops, has never coded in their life. And they vibe coded a supply chain tool that allowed them to check inventory. They didn't write a single line of code. They got it to work. And it was funny because they discovered all the problems that we software engineers discover over time. They started getting feedback from their employees, like, this feature doesn't work. This thing doesn't work when I do this. All the little edge cases, and then he just started fixing them all through vibe coding.
[47:16]
Interviewer (likely Harry Stebbings or co-host)
Do margins matter in a world of exponential growth? When we look at the demand for your products, when we look at the demand for a Lovable or a Replit, both have bad margins. Does it matter having bad margins when growth demands are so high?
[47:32]
Jonathan Ross
First of all, you do have to have profitability in the end, or at least break even to be an ongoing concern. At some point, you can't just keep raising money. Even Amazon had to start making some money. The real reason why you need higher margins is volatility. Because if you have a razor thin margin and the market moves, you may not be able to raise more money. You may not be able to get a loan. And so what a margin does is it gives you stability and staying power in the market. On the other hand, what it does is it also gives competition the ability to enter. Your margin is my opportunity. And so what you're trading is stability for a competitive moat. That's the decision that you have to make.
[48:11]
Interviewer (likely Harry Stebbings or co-host)
How do you think about margin internally today?
[48:14]
Jonathan Ross
I think you want the ability to have margin and you want to give it to your customers, and you want to give them an advantage. And if you have the ability to take that margin when it's needed, then you're in a great position. So we hired this amazing CFO recently, but I remember talking to a previous candidate when we were talking about margin. They said that we should price so that our supply met our demand. In other words, they wanted to increase the price in order for the demand to come down.
[48:41]
Interviewer (likely Harry Stebbings or co-host)
Makes sense.
[48:42]
Jonathan Ross
Does it?
[48:42]
Interviewer (likely Harry Stebbings or co-host)
Economic sense?
[48:43]
Jonathan Ross
Yeah, economic sense.
[48:44]
Interviewer (likely Harry Stebbings or co-host)
Logically and rationally, yes.
[48:46]
Jonathan Ross
But then logically, why not use up your brand equity? Why not use the trust that your customers have to sell them things that aren't good? Brand value. Brand equity has value. You want to keep your brand equity as high as possible because trust pays interest. And similarly, you want to keep your margins low enough that you're building up this sort of equity value with your customers where they know that you are giving them a good deal. When you charge a high margin, you are at odds with your customer. You want to do everything that you possibly can to align with your customer. I want my margin to be as low as I possibly can make it while keeping my business stable. And I'm going to make my cash flow by increasing the volume. One of the things that I love about the compute business is that the need for compute is insatiable. It's Jevons paradox. If we produce 10x the compute, we will have 10x the sales. That's just the way it works. As long as we keep bringing the cost down, people are going to buy more. I want to keep bringing that cost down, I want to keep increasing the volume, and I want to keep selling more for less so that people get more value out of their business and they buy more and that cycle continues.
[49:59]
Interviewer (likely Harry Stebbings or co-host)
How far are we on the journey to bring the cost down? You know, I remember I look back at some of the shows, dude, and I cringe at myself because I'm talking about, like, oh, Canva implementing AI and it's hurting their margins because they're implementing AI and it's going to cost them more. And it's just such a naive approach to even ask that question, because now the cost of implementation has gone down by 98%. How far are we in terms of that cost reduction cycle?
[50:27]
Jonathan Ross
Well, let's step back and use your Canva example. Successful businesses don't watch the bottom line. They watch their customers. They solve problems that their customers have. If you are competing, you are doing it wrong. You want to differentiate. You want to solve a problem that your customer has not solved yet and can't solve any other way, and then they're happy to pay you money. That's how it works. You solve their problem and then your cash flow is solved. Someone's spending on AI. If you just look at the balance sheet, that doesn't make sense. But when the customer is very happy and they're solving a problem that they couldn't solve otherwise, first of all, you're increasing the TAM, usually, with AI because it makes the product so much easier to use. Did you use Photoshop two years ago? Impossible. Now, if you want to generate an image, you just explain what you want. That increases the TAM. You may be able to charge less per photo, but your total revenue increases, your total market increases.
[51:23]
Interviewer (likely Harry Stebbings or co-host)
Forgive me for this financial question, but we see the S&P about to hit 7,000 and we see this ripping of the Mag 7. Like we haven't seen a concentration of value in many, many years. People suddenly start to feel like, wow, it's getting toppy. I listen to you and I hear all of this and I think it's just the start. How should I think about the duality of those two thoughts?
[51:47]
Jonathan Ross
There's two components to the value. One is the weighing machine and one is the popularity contest. There are some products that are pure popularity contests, like crypto. I have never bought a bitcoin. Why? Because I can't play in the popularity contest. I'm not good at it. I don't know what's going to be popular and what isn't. All I can do is I can see value. When I look at AI, I see real value being delivered. Best example, PE firms are all over us. They want access to cheap AI compute because every time they get more cheap AI compute, they can change the bottom line of their businesses. It has real value. When PE firms go after something and see value in it, it's not a popularity contest. It's pure value. And so what happens is the reason companies get a large multiple is people see that the actual value is going to accrue or they get hype cycled on it and it's pure popularity contest. And there are different participants in the market. Some of them are just playing the popularity contest, others are looking at the value and they may come to the same conclusion for different reasons. Coming at it from the value point of view, the weighing machine point of view, the most valuable thing in the economy is labor. And now we're going to be able to add more labor to the economy by producing more compute and better AI. That has never happened in the history of the economy before. What is that going to do?
[53:07]
Interviewer (likely Harry Stebbings or co-host)
Do you worry that if we have a speed bump in the short term, it will derail significant parts of the economy, given the concentration of value? Everyone rips today. But if Nvidia, Meta, Google, Microsoft suddenly hit speed bumps and the AI speed train is just slowed down, the consequent multiplier effect is mega. Do you worry about that?
[53:29]
Jonathan Ross
Yeah. And this is independent of the value of AI. This is the sort of control system theory of what's going on. Right. So a stock market could inherently be on an upward trajectory. It can overheat and that overheating causes it to run away. People bid things up, they realize they've made a mistake and then it has to come back down and then it dips below where it should be. Spending retreats. Then people don't have the funds they need to build their businesses. A lot of good businesses can die during one of these downward trends. But this is also where the best businesses are made. How many times do you see a downturn and a ton of amazing businesses come out of it?
[54:08]
Interviewer (likely Harry Stebbings or co-host)
Do you think we will have a downturn in the next year?
[54:11]
Jonathan Ross
I can't predict whether or not there'll be a downturn. The ability to predict something is largely dependent on whether or not predictions affect predictions. If a prediction affects the prediction, you cannot predict it because whatever your prediction is changes the outcome. The only things that are predictable are things where the predictions don't change the outcome. If an asteroid is headed towards the earth and we see that, if we don't have the technology to stop it, then it's going to happen. But if we see that happen and we can predict it, then we might develop the technology to stop it. Do you see the problem?
[54:43]
Interviewer (likely Harry Stebbings or co-host)
I do.
[54:44]
Jonathan Ross
And so in the economy, you don't have to do anything other than move dollars around. So you have these very sort of fast twitches in the economy based on people's ability to predict, which makes it unpredictable. I can't tell you what's going to happen in the economy. All I can tell you is that right now the biggest problem I see in AI is if you see a good engineer, one that you would have hired before, they can go out and they can raise 10, 20, 100 million, a billion dollars, and then rather than contributing to one of the other AI startups, they go create their own. Which means that you have difficulty in getting a critical mass of talent in any one of these AI startups. On the other hand, AI is making everyone at one of these startups more productive. So in terms of whether or not the economy is overheated, I think one of the best predictors of that is the economy getting in the way of the success of the companies. If it's not getting in the way, then I don't think it's overheated.
[55:43]
Interviewer (likely Harry Stebbings or co-host)
Do you not think it is getting in the way? Because fundamentally the capital supply side is so large that we are actually preventing you from being able to get great engineering teams together. Because we're funding talent to the extreme where they can raise huge amounts of money rather than join Groq.
[56:00]
Jonathan Ross
Yes, please stop doing that. No, no. But AI is making people more productive. So it might be possible for the economy to keep ripping and for all of the companies to continue being very successful. We don't know, we've never been through this before.
[56:14]
Interviewer (likely Harry Stebbings or co-host)
Is the war for talent insane today?
[56:16]
Jonathan Ross
It's definitely much more aggressive than it's ever been in history. But only in tech. When you look at sports, sports have always been insane, or at least recently been insane. Like you look back 20 years, 30 years ago in sports, the salaries looked a lot like tech salaries. Sure, people are just realizing the value. The problem is in sports you have a limited number of teams, you might even institute a salary cap and things like this. In technology we're not doing that. You have an unlimited number of teams, an unlimited number of startups. Just imagine if anyone could go create their own football team. What would that do to salaries and what would that do to the value of the franchise?
[56:56]
Interviewer (likely Harry Stebbings or co-host)
Which company are you most impressed by and which are you most worried or concerned for?
[57:01]
Jonathan Ross
I would say Google has probably done the biggest turnaround and they had a structural advantage in that. So Google historically has depended more on their engineers to come up with good ideas. And as long as management gets out of the way, great things happen at Google. And so I just think from a cultural perspective that's a systemic advantage for them.
[57:22]
Interviewer (likely Harry Stebbings or co-host)
Do you think Gemini has been a success for them, ultimately?
[57:24]
Jonathan Ross
I do. I mean, you just look at the numbers of the adoption. It's been great.
[57:28]
Interviewer (likely Harry Stebbings or co-host)
How do you feel about the implementation into consumer products?
[57:31]
Jonathan Ross
Less so. I mean, you see, like, random Gemini introduction into each product. It's in Gmail, but it's practically unusable. It's in pretty much every product. And it seems thrown in, kind of half thought through. But you shouldn't judge that yet because at least they're getting exposure to how people are using it and they can use that to figure out what they should actually do. I mean, what happened with Google Chrome, right? Like it was originally Google TV, it was a total flop and then they iterated and they turned it into Google Chrome. This is the classic problem where someone puts something out there, everyone throws darts at it and you don't realize that they're just willing to take those darts in order to build a better product.
[58:10]
Interviewer (likely Harry Stebbings or co-host)
And it's fine to take those darts as long as the window of distribution advantage remains. But what's challenging is it doesn't. OpenAI has closed that chasm so significantly.
[58:21]
Jonathan Ross
That's true. Google may be too late.
[58:23]
Interviewer (likely Harry Stebbings or co-host)
Do you see what I mean? It's the classic: can the incumbent attain innovation before the startup acquires distribution? And it's like the startup's acquired distribution: 10% of the world. It's pretty impressive.
[58:34]
Jonathan Ross
Yeah. At this point it would be hard to imagine a scenario where OpenAI goes away. I don't see how that happens. And so at the very least, you have two competitors from this point on going at it.
[58:45]
Interviewer (likely Harry Stebbings or co-host)
Which is OpenAI and Anthropic, or OpenAI and Google?
[58:49]
Jonathan Ross
OpenAI and Google. Anthropic does something different. Anthropic is doing coding. OpenAI is doing a chatbot. Google's doing a chatbot. Google's also doing coding. Google's doing everything.
[58:56]
Interviewer (likely Harry Stebbings or co-host)
Well, I mean, OpenAI is doing coding too.
[58:58]
Jonathan Ross
Yes. And actually our engineers recently started using Codex more than using the Anthropic tools.
[59:04]
Interviewer (likely Harry Stebbings or co-host)
Wow.
[59:04]
Jonathan Ross
Yeah. And it's funny because it's almost on a monthly basis. So we have a philosophy. We don't tell our engineers what tools to use. We do tell them you must use AI because otherwise you're just not going to be competitive. But we saw them using Sourcegraph, we saw them then using Anthropic, we saw them then using Codex. Next month it'll probably be Sourcegraph again. It just keeps going around and around in a circle.
[59:29]
Interviewer (likely Harry Stebbings or co-host)
Do any of these have enduring value then? If the switching cost is so low and if they're just, bluntly, being used so promiscuously?
[59:35]
Jonathan Ross
Our engineers are cutting-edge engineers who will switch to the best tool the moment it's the best tool. Not everyone is like that.
[59:42]
Interviewer (likely Harry Stebbings or co-host)
A lot are like that though.
[59:44]
Jonathan Ross
A lot of the people you interact with are like that. Enterprises make these long term deals and they stick with whatever deal they made a year ago.
[59:51]
Interviewer (likely Harry Stebbings or co-host)
Would you rather invest in OpenAI at $500 billion or Anthropic at 180?
[59:55]
Jonathan Ross
I'd want to invest in both.
[59:57]
Interviewer (likely Harry Stebbings or co-host)
Would you?
[59:57]
Jonathan Ross
Yeah. They're both undervalued. Highly undervalued. You're still looking at them as if they're competing in a finite market for a finite outcome when they're actually increasing the value of the market with the more R&D that they do.
[60:11]
Interviewer (likely Harry Stebbings or co-host)
Play this out for me, then if we do the bull case for them, what does that look like?
[60:16]
Jonathan Ross
I think the current tech companies can increase their value significantly, but I don't know why? They could increase their value significantly while the AI labs catch up to where the current technology leaders are. The Mag 7 is going to increase in value. And what's going to happen is the AI labs are going to achieve the same amount of value as the current Mag 7, but the Mag 7 is going to be more valuable. The question is, will the AI labs overtake the Mag 7?
[60:42]
Interviewer (likely Harry Stebbings or co-host)
What will determine that?
[60:44]
Jonathan Ross
Frankly, I think they're just going to become the Mag 9, the Mag 11, the Mag 20.
[60:48]
Interviewer (likely Harry Stebbings or co-host)
Do you think the AI labs move very significantly into the application layer and subsume the majority of it?
[60:54]
Jonathan Ross
That is the natural tendency of a very successful tech company. They start to do what their customers do and they move up the stack and then they subsume what their customers did, and then there are new people who build on top of them. Take OpenAI. I think on your show, Sam Altman said something about how if you're just doing, like, a small refinement on top of OpenAI, you're going to get overrun or whatever. He was just being very honest. That's what they do. In our case, we found an area where we will not compete with our customers, which is we will not create our own models. We just won't do it. And by putting that line in the sand, we're saying it's safe to build on our infrastructure because we're not going to go after what you do. That may be the wrong call. We may find that we're subsumed by one of our customers. But it also means that you can trust that you can build on us. I could be making a huge mistake on that call.
[61:45]
Interviewer (likely Harry Stebbings or co-host)
You could be. You would also need a lot of cash to do that.
[61:48]
Jonathan Ross
To build our own models.
[61:50]
Interviewer (likely Harry Stebbings or co-host)
Yeah. And speaking of cash, how much did you just raise?
[61:53]
Jonathan Ross
So we raised 750 million.
[61:55]
Interviewer (likely Harry Stebbings or co-host)
$750 million at a. What was it, 6 billion?
[61:59]
Jonathan Ross
Yeah, almost 7 billion.
[62:01]
Interviewer (likely Harry Stebbings or co-host)
Got you. This sounds really unfair, and that's amazing. Is that enough money?
[62:06]
Jonathan Ross
It is. In fact, we were only going to raise 300 million. You brought up the question of profitability and all that. The hardware companies are in a good position because unlike these other companies, we actually make money off of what we sell. When we sell hardware, those hardware units actually have positive margin.
[62:22]
Interviewer (likely Harry Stebbings or co-host)
I thought you had negative margin.
[62:24]
Jonathan Ross
When we sell hardware. No.
[62:26]
Interviewer (likely Harry Stebbings or co-host)
Versus when you sell software.
[62:28]
Jonathan Ross
When we sell software, it depends on the model. So our most popular models on the chip that we're ramping up now are positive margin. But we do have some models that we run that beat the opex, but we're not happy with the capex. Others would be happy with the capex, but we're more conservative. It's just easier to say when we sell hardware we have positive margin, because you know it at that moment. We might have positive margin on even our least profitable models, because we just don't know how long the hardware is going to last.
[62:57]
Interviewer (likely Harry Stebbings or co-host)
Like what are the margins and where do they go over time?
[63:01]
Jonathan Ross
One of the benefits of being private is. I don't have to tell you.
[63:04]
Interviewer (likely Harry Stebbings or co-host)
You don't, but it'd be lovely if you did.
[63:06]
Jonathan Ross
It's the only advantage of being private.
[63:08]
Interviewer (likely Harry Stebbings or co-host)
No, no, no. There's many, many advantages. You don't have a lockup period. You can sell much more easily.
[63:14]
Jonathan Ross
Yeah, but I don't sell shares.
[63:15]
Interviewer (likely Harry Stebbings or co-host)
So you've never sold a share, have you?
[63:16]
Jonathan Ross
Never.
[63:17]
Interviewer (likely Harry Stebbings or co-host)
You clearly don't understand how this game works. Don't mind, I will teach you. But margins over time, do they like, how do you think about that? I'm not asking necessarily.
[63:26]
Jonathan Ross
No, no. I'm going to say what I said earlier, which is I want our margins to be as low as possible while our business remains non-volatile. So the only reason for a high margin is because you want to have the ability to bring in cash when you need it. And all you need is the ability to price higher if you need to in order to be able to lower your margin. The demand for compute is so high that if someone came to us and said I need this compute and we have it, they will pay a higher margin, which allows us to charge a lower margin.
[63:58]
Interviewer (likely Harry Stebbings or co-host)
Can you help me understand what the chip market looks like in a five-year timeline? You said there, we'll have OpenAI, we'll have Anthropic, we'll have all the providers having their own chip infrastructure. You'll also have Nvidia. There'll also be. What does that look like in five years' time?
[64:15]
Jonathan Ross
My prediction is that in five years Nvidia will still have over 50% of the revenue. However, they will have a minority of the chips sold. You know, minority share. They might have 51% of the revenue and they might have 10% of the chips sold.
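A rough sketch of what that split would imply, assuming the whole market is just "Nvidia" and "everyone else" and that revenue is simply units times average price. The function name is hypothetical and the 51% / 10% inputs are the illustrative figures from the answer above, not a forecast.

```python
def implied_price_ratio(revenue_share: float, unit_share: float) -> float:
    """Average selling price of one vendor relative to all other vendors.

    revenue_share: the vendor's fraction of total market revenue (0..1)
    unit_share:    the vendor's fraction of total chips sold (0..1)
    Assumes a two-bucket market: this vendor versus everyone else.
    """
    vendor_price = revenue_share / unit_share              # revenue per unit share
    others_price = (1 - revenue_share) / (1 - unit_share)  # same, for the rest
    return vendor_price / others_price


# Illustrative inputs from the transcript: 51% of revenue, 10% of chips sold.
ratio = implied_price_ratio(0.51, 0.10)
print(f"Implied average price premium: {ratio:.1f}x")
```

Under those assumptions, holding a majority of revenue with a tenth of the units means each chip sells for roughly nine times the average price of the rest of the market, which is the brand premium described in the next answer.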
[64:33]
Interviewer (likely Harry Stebbings or co-host)
Can you help me understand that?
[64:34]
Jonathan Ross
Yeah, there is huge value in being a brand. You get to charge more. However, it makes you less hungry and you're going to start charging high margins, and some people are going to pay it because no one's going to get fired for buying from Nvidia. It's a great place to be, and that business is going to remain incredibly valuable. If you're invested in Nvidia, you're probably going to do okay. However, if you're looking at it from the customer point of view, when you have customer concentration like we're seeing, where, you know, 35, 36 customers are 99% of the total spend in the market, they're going to make decisions less on brand and they're going to make decisions more on what makes their business successful, because they're going to have more power to make those decisions. So you're going to see other chips being used because those companies are going to have enough power to make decisions themselves.
[65:27]
Interviewer (likely Harry Stebbings or co-host)
You said you won't do badly if you're an Nvidia investor. One of my friends says, the thing I love about Harry is that he's wonderfully charming, but at the end of the day he goes, that's great, that's great, but what about me? Which is very true. Over/under on Nvidia in a five-year timeline: 10 trillion?
[65:46]
Jonathan Ross
I personally would be surprised if in five years Nvidia wasn't worth 10 trillion. The question you should ask is, will Groq be worth 10 trillion in five years? Possible. We don't have the same supply chain constraints. We can build more compute than anyone else in the world. The most finite resource right now, compute, the thing that people are bidding up and paying these high margins for, we can produce in nearly unlimited quantities.
[66:12]
Interviewer (likely Harry Stebbings or co-host)
What do you think the market does not understand about Groq that you think they should understand?
[66:16]
Jonathan Ross
Oh, it changes every month. It used to be we couldn't have multiple users and then we demoed multiple users to people on the same hardware. Right. They used to think that we.
[66:25]
Interviewer (likely Harry Stebbings or co-host)
This is because of the SRAM structure.
[66:27]
Jonathan Ross
Because of the SRAM. Actually, here's another one I still get, you know, whatever.
[66:29]
Interviewer (likely Harry Stebbings or co-host)
Impressed with my learning from last time. Thank you so much. Yeah, dude, I learned so much from you. Genuinely. I was like genuinely learning so much. But okay.
[66:37]
Jonathan Ross
The question I get asked the most is, isn't SRAM more expensive than DRAM? The answer is yes. A good way to think of it is SRAM is inherently three to four times as expensive per bit. Inherently.
[66:50]
Interviewer (likely Harry Stebbings or co-host)
And just for anyone who doesn't know, again, SRAM versus DRAM, super simple?
[66:54]
Jonathan Ross
So I'll keep it super simple, but this isn't technically accurate. SRAM is the memory inside of a chip. DRAM is the external memory. It really has more to do with how you design it. So SRAM has three to four times as many transistors (or capacitors; it's just transistors for SRAM) as DRAM. DRAM is a capacitor and a transistor. SRAM is six to eight transistors. So SRAM is inherently larger per bit, which means it uses more silicon, therefore it's more expensive. You're also deploying it on a more expensive chip, like a 3 nanometer chip, so it costs you more per unit of area than DRAM. So there's a multiple, maybe it's 10 times as expensive per bit. The thing is, when we're running a model like Kimi, and we're running it on 4,000 of our chips, and you're running that Kimi model on eight GPUs, we're using 500 times as many chips, which means the GPUs have 500 copies of that model, which means they're using 500 times as much memory, which means that their cost is higher. Because even if the SRAM is 10 times more expensive, they're using 500 times as much memory in the DRAM. This is one of those classic problems of looking at it from a chip point of view rather than a system point of view. Everything that we did was actually system point of view and now it's world point of view. We actually load balance things across our data centers. We're now at 13 data centers. We have data centers in the United States, in Canada, in Europe, in the Middle East. When you have a world scale distribution, you don't just make decisions at the data center level. We actually will have more instances of some models in some data centers with different compile optimizations for input or output based on what's going on in a geography. We may not even have an instance of a model in a particular data center. We may have it elsewhere and we can load balance that. We're optimizing at the world level, not at the data center level.
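The system-level memory argument above can be sketched with the round numbers from the answer. This is a back-of-envelope illustration only: it assumes one copy of the model is spread across the 4,000 SRAM-based chips, that matching the same chip count with GPUs means 500 eight-GPU replicas each holding a full copy in DRAM, and that SRAM costs about 10 times as much per bit. All names are hypothetical, and real deployments involve many other costs.

```python
SRAM_COST_PER_BIT = 10.0   # relative cost; the "maybe 10 times" figure quoted above
DRAM_COST_PER_BIT = 1.0    # baseline

SRAM_CHIPS = 4000          # chips holding one copy of the model between them
GPUS_PER_REPLICA = 8       # GPUs holding one full copy of the model


def relative_memory_cost(model_bits: float = 1.0) -> dict:
    """Compare memory cost at equal chip count, in arbitrary relative units."""
    gpu_replicas = SRAM_CHIPS // GPUS_PER_REPLICA        # 4000 / 8 = 500 copies
    sram_cost = 1 * model_bits * SRAM_COST_PER_BIT       # one copy, pricey bits
    dram_cost = gpu_replicas * model_bits * DRAM_COST_PER_BIT  # many cheap copies
    return {
        "gpu_replicas": gpu_replicas,
        "sram_cost": sram_cost,
        "dram_cost": dram_cost,
        "dram_to_sram_cost_ratio": dram_cost / sram_cost,
    }


result = relative_memory_cost()
print(result)
```

With these inputs the per-bit penalty of 10x is outweighed by storing the model 500 times, so the DRAM side comes out about 50 times more expensive for memory. The conclusion is only as good as the assumed replica count and cost multiple.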
[68:42]
Interviewer (likely Harry Stebbings or co-host)
What would you do if you weren't scared, Jonathan?
[68:45]
Jonathan Ross
I'll rephrase that to: where could I increase risk in the business, and where haven't we? We could double our orders in our supply chain. We have a six-month supply chain, so we can respond to the market faster than anyone else.
[68:59]
Interviewer (likely Harry Stebbings or co-host)
How overweight on demand are you versus supply?
[69:01]
Jonathan Ross
Like I said, last week someone came to us and asked for five times our total capacity. Here's the only reason we don't just completely double down.
[69:09]
Interviewer (likely Harry Stebbings or co-host)
If you're not supply constrained, why can't you just do that?
[69:11]
Jonathan Ross
Because there are thresholds. So for example, if we had double the capacity, we wouldn't have won that customer. They needed 5x. It's not enough to have twice as much. We have to have enough. If we double the capacity, do we have enough for those customers?
[69:24]
Interviewer (likely Harry Stebbings or co-host)
And so the risk that you could take?
[69:26]
Jonathan Ross
We could just double the rate at which we're building out supply. With this fundraise, we ended up raising, you know, more than twice what we were expecting to raise. And then we were 4x oversubscribed over what we did raise. And so we could have raised a lot more money. It would have been more dilutive and I'm trying to be dilution-sensitive for investors and everyone else, but on the other hand, we could have just raised more money and we could have just built a ton of compute. The other advantage that we have versus anyone else is that our cost per token, especially at a given speed, is very advantageous. So we know that we can charge less than the rest of the market, which matters when you're trying to build these businesses. Not because people are spend-conscious. If we lower what we charge 50%, people are going to buy twice as much. They're spending as much as they're making because whatever they spend increases the quality of the output.
[70:20]
Interviewer (likely Harry Stebbings or co-host)
Do you think about going public at all?
[70:22]
Jonathan Ross
Our focus is purely on execution right now. Whether or not you go public, that's a completely different game than we're playing right now. Right now all that matters is can we satisfy the demand for compute.
[70:34]
Interviewer (likely Harry Stebbings or co-host)
Why do you think Cerebras decided to go public?
[70:37]
Jonathan Ross
Well, they recently decided not to go public.
[70:39]
Interviewer (likely Harry Stebbings or co-host)
Dude, I could talk to you all day. I do want to discuss a quick fire round. So I say short statement, you give me your immediate thoughts. Does that sound okay?
[70:46]
Jonathan Ross
Yeah.
[70:47]
Interviewer (likely Harry Stebbings or co-host)
What's the biggest misconception about Nvidia today?
[70:49]
Jonathan Ross
That Nvidia's software is a moat.
[70:53]
Interviewer (likely Harry Stebbings or co-host)
CUDA lock-in is bullshit?
[70:55]
Jonathan Ross
Yeah, it's true for training, but it's not true for inference. I mean, we have 2.2 million developers on us now. That's how many have signed up.
[71:02]
Interviewer (likely Harry Stebbings or co-host)
How many does CUDA have, they claim?
[71:04]
Jonathan Ross
6 million.
[71:05]
Interviewer (likely Harry Stebbings or co-host)
If you were founding Groq today with Nvidia at 4 trillion and the AI boom in full swing, what would you do differently?
[71:12]
Jonathan Ross
I wouldn't do chips. That chip has already sailed. It takes too long to build a chip. The bet that does it.
[71:17]
Interviewer (likely Harry Stebbings or co-host)
So for the chip providers today that are coming out, and we are seeing new chip providers come out where they're raising a lot of money from good people, it's too late.
[71:26]
Jonathan Ross
Yeah. So the reason that I decided to go into chips: I did the Google TPU, but also before I left, I set a record on the best classification model, like ResNet50. With someone in Google Brain, we did an experiment. We beat everything. And so I could have gone in the algorithm side, and actually when we were fundraising, I wasn't even 100% sure that I was going to do chips. I was thinking maybe we do something on the algorithm side, especially in formal reasoning, which is good that I didn't. But the main motivation to go into chips was the moat, the temporal moat. So a question we get asked by VCs a lot is what prevents someone from copying what we're doing? And the answer to that is if you copy what we do, you're three years behind us, because it takes that long to go from the design of a chip to a chip in production if you execute perfectly. I've done three chips now that are in production or ramping to production. All three were A0 silicon. Only 14% of chips that are taped out work the first time, that is, are A0 silicon. So that means there's an 86% chance each time that you're going to have to re-spin it. When we built our V2 chip, we actually already scheduled a re-spin for it, and we ended up not having to do it because, to our shock, the first one worked. Like, you shouldn't expect that. So that three years is if everything goes perfect. Nvidia typically takes three to four years per chip and they just have multiple being done at a time. Groq is now in a one-year cycle. So a year after our V2 is our V3 and a year after that is our V4.
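To put the 14% figure in perspective, here is a small sketch of the odds. It assumes each tape-out is an independent trial with the same 14% first-pass rate quoted above, which is a simplification (experienced teams presumably beat the average); the function names are hypothetical.

```python
FIRST_PASS_RATE = 0.14  # share of taped-out chips that work first time, as quoted


def prob_all_first_pass(n_chips: int, p: float = FIRST_PASS_RATE) -> float:
    """Probability that n independent chips all work on first silicon."""
    return p ** n_chips


def prob_respin(p: float = FIRST_PASS_RATE) -> float:
    """Probability that a single chip needs at least one re-spin."""
    return 1.0 - p


def expected_tapeouts(p: float = FIRST_PASS_RATE) -> float:
    """Mean number of tape-outs until one works (geometric distribution)."""
    return 1.0 / p


print(f"Re-spin chance per chip: {prob_respin():.0%}")
print(f"Three first-pass chips in a row: {prob_all_first_pass(3):.2%}")
print(f"Expected tape-outs per working chip: {expected_tapeouts():.1f}")
```

Under the independence assumption, three A0 chips in a row would happen by chance well under 1% of the time, and a team at the average rate should budget for about seven tape-outs per working chip, which is why the three-year figure is described as the best case.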
[73:01]
Interviewer (likely Harry Stebbings or co-host)
How do you evaluate the meteoric re-acceleration and rise of Larry Ellison and Oracle?
[73:07]
Jonathan Ross
Brilliant business decisions and the willingness to move fast. Most people right now keep asking themselves, is AI overheated? Should we double down on this? They just went for it. They're just aggressive, and that's what it takes to win. When everyone else is fearful, you should be greedy. And when everyone else is greedy, you should be fearful. Right now there's a lot of fear around AI. What you're seeing, though, is there's a couple of greedy, really smart people and they're making tons of money, and it looks like there's a lot of greed out there. It's just a handful of people that are moving fast.
[73:40]
Interviewer (likely Harry Stebbings or co-host)
Where should I be greedy and where should I be fearful? I'm an investor today.
[73:45]
Jonathan Ross
Obviously, wherever there's a moat, you know. Hamilton Helmer, 7 Powers. Right. Wherever you see a moat, you should be greedy.
[73:51]
Interviewer (likely Harry Stebbings or co-host)
Very few people have a moat.
[73:52]
Jonathan Ross
Yeah. And especially at the stage that you invest in. So you have to predict that there's going to be a moat.
[73:57]
Interviewer (likely Harry Stebbings or co-host)
And if there is a moat, it's a billion valuation for a pre-seed.
[74:01]
Jonathan Ross
I mean, there's a billion valuation for a pre-seed. Pre-moat. That's what you should call it: pre-moat. That's what the investors should denote it as.
[74:09]
Interviewer (likely Harry Stebbings or co-host)
What have you changed your mind on in the last 12 months?
[74:12]
Jonathan Ross
Oh, my gosh. It's not so much that I've changed my mind, it's that I've changed what percentage of our business doubles down. Every month we become more focused and say yes to fewer things, and what happens is the business just does better. I used to think that the most important thing was preserving optionality. Now I think it's focus. However, I think having that optionality early on was crucial so that we could play where we would be most successful. Now it's about focus.
[74:42]
Interviewer (likely Harry Stebbings or co-host)
We've spoken a lot about OpenAI and Anthropic. Do you think Elon Musk is able to pull it off with Grok and X?
[74:49]
Jonathan Ross
Yes. Although it's probably going to be different. Whenever a new area emerges, a bunch of people think that they're competing and they're not. All of these people creating foundation models think that they're competing for the exact same thing. What did Anthropic do that was brilliant? They decided to stop competing by doing everything and focus on coding. And that's worked great for them. If you look at xAI, they have a social network and they've integrated their chatbot with that. I'm not going to use that chatbot for, you know, solving deep analysis or deep research problems. I'm not going to use it for coding. Now, they do have a coding model, but they don't have a coding distribution. Can they use that distribution to get into coding? Maybe, but then they're not going to be as focused. So what are they doing? Eventually, the markets will diverge. The Mag 7, all of those companies have some overlapping business, but the primary business of each of those Mag 7 companies is different. If you do not differentiate, you die.
[75:47]
Interviewer (likely Harry Stebbings or co-host)
When you look at Google, Microsoft and Amazon, you can buy one and you can sell one. Which do you buy? Which do you sell?
[75:54]
Jonathan Ross
It depends on the time frame. In the short term, Microsoft is resetting a little bit because of the OpenAI relationship. Long term, they're probably going to do fine again.
[76:03]
Interviewer (likely Harry Stebbings or co-host)
Do you think that's a material damage to them?
[76:05]
Jonathan Ross
No, that's why I'm saying in the short term I think it's going to hit them, and in the long term it's not.
[76:10]
Interviewer (likely Harry Stebbings or co-host)
Have they not done majestically well from that? They have the financial ownership of OpenAI and then they have the flexibility to use Anthropic for most of the suite.
[76:17]
Jonathan Ross
And they've deployed an enormous amount of compute. So if OpenAI diversifies and gets their compute elsewhere, they have that compute. Now, compute is like gold. If you have it, you have AI. And then Amazon, I think, doesn't have AI DNA if you compare them. You didn't mention Meta, right? But Meta and Google always had the AI DNA, and Microsoft bought it with OpenAI, and that bought them time. Amazon still doesn't have that DNA, but they do have compute.
[76:44]
Interviewer (likely Harry Stebbings or co-host)
Final one. What are you most excited for when you look forward? I like to end on an element of positivity. What are you most excited for when you look forward over the next five to seven years?
[76:54]
Jonathan Ross
I think the things that scare most people are what excite me. And what I mean by that is everyone's afraid of what AI is going to do. I think there's a good historical analogy here, which is Galileo. A couple hundred years ago, Galileo popularized the telescope. He got in a lot of trouble for that. And the reason he got in so much trouble was the telescope allowed us to see some truths and allowed us to realize that the universe was larger than we imagined. And it made us feel really, really small. And over time we've come to realize that while we may be small, the universe is grand and it's beautiful. I think over time we're going to realize that LLMs are the telescope of the mind, that right now they're making us feel really, really small. But in a hundred years, we're going to realize that intelligence is more vast than we could have ever imagined. And we're going to think, that's beautiful.
[77:46]
Interviewer (likely Harry Stebbings or co-host)
John, dude, I always end up taking copious notes in our conversations. Thank you so much for doing this with me, man. It's so lovely to do it in the studio, and you've been fantastic.
[77:55]
Jonathan Ross
Thank you.
[77:57]
Harry Stebbings
It was so special to have Jonathan in the studio there.
[77:59]
Harry Stebbings
And if you want to watch the episode, you can see it on YouTube by searching for 20VC, where you can find all of the episodes live in the studio. We'd love to see you there.
[81:21]
Harry Stebbings
As always, I so appreciate all your support, and stay tuned for an incredible episode coming on Thursday with the one and only Jason Lemkin and Rory O'Driscoll.