
Jonathan Ross is the Founder & CEO of Groq, the creator of the world’s first Language Processing Unit (LPU™). Prior to Groq, Jonathan began what became Google’s Tensor Processing Unit (TPU) as a 20% project where he designed...
Jonathan Ross
We did not raise 1.5 billion. That's revenue. That's actually about 30% of the revenue of OpenAI. Your job is not to follow the wave. Your job is to get positioned for the wave. You can almost say we're one of the best things that ever happened to Nvidia, because they can make every single GPU that they were going to make and they can sell it for training. High margin gets amortized across deployment, and we'll take the low-margin, high-volume inference business off their hands and they won't have to sully their margin. We are growing faster than exponential, and when you are growing faster than exponential, there is no amount of profit that you can make that matters. What matters is getting a toehold in the market and becoming relevant.
Harry Stebbings
This is 20VC with me, Harry Stebbings, and today we feature a company that has just booked 1.5 billion in revenue. They say very simply: Nvidia should own the training market for AI and they will own the inference market. Simple. This is an exceptional discussion recorded in Paris last week with Jonathan Ross, founder and CEO of Groq, the creator of the world's first Language Processing Unit. Prior to Groq, Jonathan began Google's Tensor Processing Unit (TPU) and implemented the core elements of the first-generation TPU chip at Google. But before we dive in today, turning your back-of-a-napkin idea into a billion-dollar startup requires countless hours of collaboration and teamwork. It can be really difficult to build a team that's aligned on everything from values to workflow. But that's exactly what Coda was made to do. Coda is an all-in-one collaborative workspace that started as a napkin sketch. Now, just five years since launching in beta, Coda has helped 50,000 teams all over the world get on the same page. At 20VC, we've used Coda to bring structure to our content planning and episode prep, and it's made a huge difference. Instead of bouncing between different tools, we can keep everything from guest research to scheduling and notes all in one place, which saves us so much time. With Coda, you get the flexibility of docs, the structure of spreadsheets, and the power of applications, all built for enterprise. And it's got the intelligence of AI, which makes it even more awesome. If you're a startup team looking to increase alignment and agility, Coda can help you move from planning to execution in record time. To try it for yourself, go to coda.io/20VC today and get six free months of the Team plan for startups. That's coda.io/20VC to get started for free and get six free months of the Team plan. Now that your team is aligned and collaborating, let's tackle those messy expense reports.
You know, those receipts that seem to multiply like rabbits in your wallet? The endless email chains asking, can you approve this? Don't even get me started on the month-end panic when you realise you have to reconcile it all. Well, Pleo offers smart company cards, physical, virtual and vendor-specific, so teams can buy what they need while finance stays in control. Automate your expense reports, process invoices seamlessly and manage reimbursements effortlessly, all in one platform. With integrations to tools like Xero, QuickBooks and NetSuite, Pleo fits right into your workflow, saving time and giving you full visibility over every entity, payment and subscription. Join over 37,000 companies already using Pleo to streamline their finances. Try Pleo today. It's like magic, but with fewer rabbits. Find out more at pleo.io/20VC. Don't forget to secure trust with your customers. Trust isn't just earned, though, it's demanded. That's why over 9,000 companies, including Atlassian, Quora and Factory, rely on Vanta to automate their security compliance. Vanta helps businesses achieve certifications like SOC 2 and ISO 27001, turning months of tedious work into a beautifully fast and straightforward process. Their platform automates compliance across over 35 frameworks. It centralizes workflows and it proactively manages risk, all while saving you time with automation and AI. So whether you're just starting or scaling your security program, Vanta connects you with auditors and experts to get audit-ready quickly and build trust with your customers. Get $1,000 off by visiting vanta.com/20VC. That's V-A-N-T-A dot com slash 20VC. You have now arrived at your destination.
Unnamed Interviewer
Jonathan, thank you so much for agreeing to this in Paris. You look fantastic, by the way. I feel so underdressed, but you look great.
Jonathan Ross
Thank you. I could take the tie off if you want, but I'll never be able to tie it again. I don't know how to tie it. No, literally, my chief of staff has to tie it for me. And it's a struggle because he's putting it on himself and tying it. I literally only bought this suit recently.
Unnamed Interviewer
Well, I mean, you look fantastic. I don't think I have a suit, so you're one up on me. I want to split the show into two parts. I want to talk about the landscape and where we're at, and then I want to dive specifically into Groq and where you're at. You've announced a massive new deal that I think everyone's slightly misunderstanding, which is what we were just talking about. I just want to start on where we're at in terms of scaling laws. Everyone says we are at the limits of scaling laws, and then there seems to be exponential innovation happening with the likes of DeepSeek and others. Where are we at in terms of the limits of scaling laws?
Jonathan Ross
Scaling Laws is a paper that was published by OpenAI, and what it effectively says is that the more parameters your model has, the better it can absorb information. You'll see these curves that they draw and they're amazing. You should show it if you can. But effectively you have these sort of asymptotic drop-offs where you keep getting better and better, but you get a logarithmic improvement when you put a linear number of tokens in. This is why you see people doing 15 trillion tokens of training and whatnot. But they're misunderstood, because the assumption is that all of the data is the same quality. So eventually you're going to be training your kid and you're going to say, and play along with me here: What's 1 plus 1? 2. What's 2 times 3? 6. What's the second derivative of the square of the hyperbolic tangent? But that's how we train these models. We give them really simple problems to solve and then we give them these really hard ones. We don't really train them up; we don't do it smart. So what some people do is train on the dregs of the Internet and then save some high-quality data for the end to make them better. But what you can do, and this is where I think everyone's getting confused, is sort of like with AlphaGo Zero, where it generated its own data and trained. You could have an LLM generate synthetic data, and when it generates the synthetic data, the data is better. You then train on that synthetic data. So what you do is you train.
Unnamed Interviewer
Why is synthetic data better than real data?
Jonathan Ross
Because the model is smarter. Reddit is great, but not necessarily as high quality as talking to someone with a PhD in a topic. And so, just like with expert people who are more knowledgeable and more capable, if you have a better model, it generates better data. So you train the model, it gets better, you produce better data. You produce a range of data here and you get rid of all the parts that are wrong, so now it's the best part. So it's a little better than the model is, because you're pruning it, because you get to do this offline, right? And then you train the model and the model comes up here, and then you do this again. You keep the better data, you train it again. You just keep moving up. When you do that, the actual scaling laws don't look like these asymptotics. They actually.
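The generate, prune, retrain loop described in this answer can be sketched as a toy simulation. This is only an illustration under loud assumptions: "skill" and sample "quality" are reduced to single numbers, and the helper names (`generate`, `prune`, `retrain`, `self_improve`) are hypothetical. A real pipeline would need an actual model and a verifier to decide which samples are wrong.

```python
import random

def generate(skill, n, rng):
    # Assumption: the quality of generated samples is scattered around the model's skill.
    return [rng.gauss(skill, 1.0) for _ in range(n)]

def prune(samples, skill):
    # Offline filtering: keep only samples that are better than the current model.
    return [quality for quality in samples if quality > skill]

def retrain(skill, kept):
    # Assumption: training on the kept data moves skill to the average kept quality.
    if not kept:
        return skill
    return sum(kept) / len(kept)

def self_improve(skill, rounds, n=200, seed=0):
    rng = random.Random(seed)
    history = [skill]
    for _ in range(rounds):
        kept = prune(generate(skill, n, rng), skill)
        skill = retrain(skill, kept)
        history.append(skill)
    return history

# Each round the toy model's skill moves up, because it only trains on
# the pruned, better-than-itself portion of its own output.
print(self_improve(0.0, rounds=5))
```

The toy says nothing about how fast real models improve; it only shows why pruning offline lets the training data sit slightly above the model that produced it.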
Unnamed Interviewer
But there has to be a ceiling on efficiency, no?
Jonathan Ross
Does there? So there is a mathematical limit. If you study computer science, you've probably heard of something called big O complexity. Big O complexity is: if I am solving a problem and I look at how I solve it, I might need to take more steps if I solve it with one algorithm versus another. So, for example, quicksort versus bubble sort. With quicksort I need n log n steps; with bubble sort I need n squared. What's the difference? If I am sorting 1,000 numbers, n log n is about 10,000 steps, but n squared is a million steps, because it's either 10 times 1,000 or 1,000 times 1,000. One of the reasons that these LLMs struggle to multiply large numbers is that multiplying is not linear. These LLMs could do anything linear without needing to think. But just like on a piece of paper, where you need to write out all those intermediate steps, these LLMs need that intermediate space and those steps in order to compute these things. It's a mathematical requirement. You cannot train a model enough so that it'll see any arbitrarily large number and just be able to multiply it. But you can choose bigger and bigger groupings of numbers for it to memorize, in which case it can do it in fewer steps. And effectively, as you are training the model on more and more data, it's seeing more and more examples. So now it just has the answer for more specific situations. So it doesn't need to do as much reasoning, but it still needs to do reasoning for some of these problems.
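The step counts quoted above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a base-2 logarithm (which gives the "10 times 1,000" figure) and using the schoolbook method as a stand-in for why multiplication needs intermediate steps. The function names are illustrative and not from the conversation.

```python
import math

def n_log_n_steps(n):
    # Rough step count for an O(n log n) sort such as quicksort (average case).
    return n * math.log2(n)

def n_squared_steps(n):
    # Rough step count for an O(n^2) sort such as bubble sort.
    return n * n

def schoolbook_digit_products(a, b):
    # Pen-and-paper multiplication needs one single-digit product per digit pair,
    # so the work grows with the product of the operand lengths, not their sum.
    return len(str(a)) * len(str(b))

n = 1000
print(round(n_log_n_steps(n)))   # close to 10,000 because log2(1000) is about 10
print(n_squared_steps(n))        # 1,000,000

print(schoolbook_digit_products(12, 34))      # 2-digit by 2-digit
print(schoolbook_digit_products(1234, 5678))  # 4-digit by 4-digit
```

Doubling the number of digits quadruples the digit products in the schoolbook method, which is the sense in which multiplication is "not linear" here.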
Unnamed Interviewer
So what does that mean for the next step? If we have no efficiency ceiling, what does that issue mean?
Jonathan Ross
You need both. So the training of the model makes it more intuitive. It means that it can sort of just come up with the answer like that, more stream of consciousness. The reasoning part is different. The reasoning is the algorithm on top, right? The big O complexity portion. So it's system one, system two thinking, or thinking fast, thinking slow, like Daniel Kahneman's book. When you pair them together, when you make it more intuitive, you get better this way, right? But when you start adding in the system two portion, you start to get this. Here the volume is very little, but when you do this... And so you get this. Polylinear is the term, but you could think of it as geometrically increasing improvement in the model when you combine it with that improved training, but also the improved, what they call test-time compute or runtime compute.
Unnamed Interviewer
Just so I understand. So when we think about bottlenecks, if we have synthetic data that powers the.
Jonathan Ross
Training, it gets more intuitive. It gets to the answer more quickly. Sort of like a grandmaster in chess, just seeing the right moves.
Unnamed Interviewer
Sure. But synthetic data is not constrained in terms of its supply side. If we think about the other bottlenecks, there is hardware, there is energy efficiency, there's algorithmic limits. What is the.
Jonathan Ross
But if I'm telling. If your job is to get better at multiplying numbers, and I tell you that I want you to be able to do it with fewer steps, more intuitively, for you to be able to multiply three digit numbers versus two digit, you need 10x the data and you need 10x the examples. And so as you get better on the intuitive part, you need more examples to train on.
Unnamed Interviewer
And so what is the bottleneck then? Is it the hardware quality? Is it compute? Is it algorithms?
Jonathan Ross
It is the compute, it is the data, it is the algorithms. It's all three of them. But people misunderstand the concept of a bottleneck. Compute has been less of a bottleneck and more of a soft neck or something, right? When you provide even more compute, you can sort of overpower the lack of data and the lack of improvement in algorithms. So it's not a hard bottleneck, it's a soft bottleneck. But ideally you would improve all three. You would be getting better data, you would be getting better algorithms, and the algorithm improvements are going to be there, the data improvements are going to be there. But compute has always been the easiest lever because it's so fungible. If I just give you more compute, it works better.
Unnamed Interviewer
Has DeepSeek not shown you that actually we don't need the compute and you can do more with less?
Jonathan Ross
Not exactly. There was an algorithmic improvement there. And the algorithmic improvement was a seemingly silly thing where they just wrote the answer in a box, and then they knew what to look for, rather than having to have a human being check it or something like that. It was very simple, but that was an algorithmic improvement, and it made it easier to generate the data that was then trained on.
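The "answer in a box" idea can be sketched as an automatic checker. This is an illustration only: it assumes the box is the LaTeX `\boxed{...}` convention and that correctness is an exact string match against a known reference answer. Neither detail is stated in the conversation, and the function names are hypothetical.

```python
import re

# Matches \boxed{...} with no nested braces inside (assumed answer format).
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")

def extract_boxed(text):
    # Return the contents of the last box in the output, or None if there is none.
    matches = BOXED.findall(text)
    return matches[-1].strip() if matches else None

def is_correct(model_output, reference):
    # No human in the loop: a sample counts as correct only if its boxed
    # answer matches the reference exactly.
    answer = extract_boxed(model_output)
    return answer is not None and answer == reference.strip()

samples = [
    r"2 times 3 means adding 2 three times, so the answer is \boxed{6}",
    r"I think it is \boxed{5}",
    "The answer is six.",
]
kept = [s for s in samples if is_correct(s, "6")]
print(len(kept))  # only the first sample passes the check
```

Because the check is mechanical, wrong or unparseable samples can be discarded at scale, which is what makes the filtered output usable as training data.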
Unnamed Interviewer
I think there are misconceptions around compute, data, especially synthetic data, and, as you said there, algorithms. When you think about the biggest misconceptions that people have around AI, and specifically inference, what do you think they are?
Jonathan Ross
When we started, the first misconception, which people don't hold anymore, is that training was more expensive than inference. At Google, anytime we would train a new model, we would end up using 10 to 20 times as much compute on the inference as the training. Inference was always the critical infrastructure piece that we needed. But then after getting past that, now everyone understands inference is important. I think one of the.
Unnamed Interviewer
Do you think they fully do? Because when you look at Nvidia's stock price post DeepSeek, it was down 15%. If you understood the value of inference, it shouldn't be down 15%.
Jonathan Ross
And Jevons paradox and all that. And yeah, I don't agree that Nvidia stock should have gone down for that. I think that was a misunderstanding on most people's part. But I think that shows more: everyone keeps saying Nvidia stock can't possibly go higher, and they were looking for an excuse for, oh, now that's it, that's why we were wrong and we need to sell now. But that's just the sort of popularity contest side of the market; it had nothing to do with the weighing machine of the market.
Unnamed Interviewer
So should founders at the building stage build with the assumption that scaling laws will continue? Should they build with what we have today? How do you advise them on that?
Jonathan Ross
I would advise you to build based on things getting better. I would also focus a little more on the sort of big quantum steps. The analogy that I like is, if you look at the information age, we went through the printing press, we had the telephone, we had the telegram, we had the Internet and we had smartphones. Right? And if you had built Uber back when we had Internet, it wouldn't have worked, because you'd book a ride, you'd go somewhere, and how do you get home? And we're in the same sort of space now. The models hallucinate, so it would be hard to build a medical diagnosis company, it would be hard to build a legal company. However, if you are doing that and the algorithmic enhancements happen that get the hallucination rate down, you are perfectly positioned. Just like Groq. We were around for seven years before we had product-market fit. Our bet was scaled inference: that inference was going to be the bottleneck, that we were going to need to run really big, heavy models. Everyone was assuming you would have a single PCIe card running inference, because training was the complicated part. Right. But the reality was we made the right bet ahead of time and then we were perfectly positioned. Your job is not to follow the wave. Your job is to get positioned for the wave. And that's the hardest thing to do, because everyone is trying to talk you into coming onshore again. Almost everyone was telling us, don't do LLMs. They're going to be terrible for you. And we're like, this is literally what we built for.
Unnamed Interviewer
Did you ever doubt yourself? Seven years is an incredibly long wait time.
Jonathan Ross
There was doubt, but there was never a pause. And the reason was, even back before starting the TPU, I was concerned that AI was going to be a technology that would allow some people to have outsized control, outsized influence. And if you allow that to just happen in potentially not the best hands, it doesn't really matter how rich you are. It doesn't matter. Nothing matters. It's the most important technology. So it didn't matter how hard it got. There was no choice but to be successful. And our goal is to preserve human agency in the age of AI. If we don't do that, we have failed. It wouldn't matter whether there was doubt or not. And yes, there was plenty of doubt. There was a point where we were so close to running out of money. We did this thing that we called Groq bonds. So, you know, war bonds from World War II.
Unnamed Interviewer
Of course, but for anyone that doesn't, what is a war bond?
Jonathan Ross
So World War II was funded with bonds. The U.S. government had these posters. It was like, fund your troops and whatever, and you'd buy them and they would pay you a return. And that funded the war effort. We were very close to running out of money at one point. Rather than trying to pretend to be strong, you know, we were vulnerable with our employees. And we said, we're going to run out of money. We need you to trade salary for equity. We literally took pictures of the war bonds and we put Groq bonds on them instead. And we had an all-hands where we said this, and we were worried everyone was going to leave. Instead of leaving, about 80% of the employees participated. 50%, I think, went to the statutory minimum salary by law. When we finally raised the first bit of our $300 million round, we had so little money left in the bank that it was less money than we saved doing Groq bonds. So had we not done that, we would have literally run out of money. So there were some really hard times. And I know every founder has these. And from the outside it's so hard to understand. It's like watching a TV show, you're not in it, but when you are there, everything is 10 to 100 times more intense, because people left their jobs, they left their careers, their families are banking on this. You have to make decisions. What would have happened if we went out there and asked everyone to do Groq bonds and everyone quit? Then the shareholders would have been like, you have all of these people depending on you. But if you lean towards that vulnerability, people are often going to go with you on it.
Unnamed Interviewer
So what does a world where inference is so crucial and 20 times more important than training, what does that world look like?
Jonathan Ross
I think the simplest way to understand it is to equate an LPU or a GPU to an employee. If you have enough of them, the LPUs or GPUs, you can do work just like with an employee. It's a little different in the sense that they can't quit and take another job. You don't have to retrain. Once you get a model to a certain capability, it'll always be at least that capability. Right. So you get the consistency out of it. But now imagine that you're a startup and, rather than having to go out and hire 100 people, you hire 10 and you buy the amount of compute equivalent to 90 employees' worth. That's a very different way of thinking about the world, because now capex, or in some cases different types of opex, can be used instead of just employees. And in terms of inference, just to give you a sense of our scaling, we started 2024 with about 640 chips in production. We ended with over 40,000. This year we want to be at over 2 million, and next year the number is much, much, much larger.
Unnamed Interviewer
Are we seeing constraints on chip supply? I mean that is an unbelievable scaling story.
Jonathan Ross
Yeah. So for us to hit our numbers next year, which I'm not sharing publicly, we're going to need almost all of the capacity of the fab that we're... The biggest issue. So, 7 Powers. We love 7 Powers, right? Hamilton Helmer's. Okay. You don't normally think of tech companies as having a cornered resource, but Nvidia has a cornered resource. They're a monopsony, the opposite of a monopoly: a single buyer for HBM and the interposer, the CoWoS.
Unnamed Interviewer
So what is HBM?
Jonathan Ross
So HBM is high-bandwidth memory and GPUs.
Unnamed Interviewer
And who produces HBM? I'm sorry for the dumb questions.
Jonathan Ross
There are three companies in the world that do this: SK Hynix, Samsung, and Micron. It's a specialty memory. It's only used in high-end servers, so there's a limited quantity that's built. It's very expensive to ramp up. It's a very technically challenging type of memory to build, more so than others, and so there's a very limited supply. And GPUs are so fast computationally that if you were using regular memory, it'd be like drinking out of a martini straw. It would just take forever. This is why you see people preferring to do even inference, but especially training, on GPUs rather than CPUs, because the memory bandwidth is too limited. And CPUs rarely use HBM. They mostly use regular memory. The observation that we had when we started Groq: everyone knows Moore's Law. Every 18 to 24 months, like clockwork, double the transistors, which means double the compute. But we noticed that AI was getting better, faster, and it clearly wasn't the algorithms, because algorithms have sort of discontinuous jumps. It also didn't seem to be the data, because there wasn't that much more data. And the transistors were only doubling every 18 to 24 months. So where was all of this capability coming from? Turns out the number of chips was also doubling every 18 to 24 months. So rather than 2x, it was 4x. The question we asked was, if you're effectively going to have an unlimited number of chips, do you do something architecturally different? And the answer is absolutely. So rather than using external memory, we just use a large number of chips and keep all of the parameters of the model in the chips, live. And then we just have this pipeline where the computation flows through it, sort of like an assembly line. So imagine if you were trying to build a factory and the factory was only 1/100th of the size needed for the assembly line. So you'd run a bunch of cars through 1/100th, tear it down, set up the next 1/100th of the assembly line. You just do this over and over again.
That's the way a GPU works. LPUs are very different. We actually just have the computation flow through a whole bunch of chips. So rather than using eight chips, we'll use 600 or 3,000 for a model.
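The "rather than 2x, it was 4x" observation is a compounding argument, and a short sketch makes the gap concrete. It assumes, as stated above, that both transistors per chip and the number of chips double once per 18-to-24-month period. The function name and the choice of periods are illustrative.

```python
def compute_growth(periods, transistor_factor=2.0, chip_factor=2.0):
    # Growth of a single chip's compute versus growth of total deployed compute
    # after a number of 18-to-24-month periods.
    per_chip = transistor_factor ** periods
    total = (transistor_factor * chip_factor) ** periods
    return per_chip, total

for periods in (1, 2, 3, 4):
    per_chip, total = compute_growth(periods)
    print(periods, per_chip, total)
```

After four periods a single chip is 16x better while total compute is 256x, which is why an architecture that assumes many chips can look very different from one built around a single chip and its external memory.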
Unnamed Interviewer
How does that change energy efficiency?
Jonathan Ross
It improves it by about 3x. And the reason is...
Unnamed Interviewer
How does it improve it when you use more?
Jonathan Ross
Per token. So the footprint is higher. Think of it as the difference between a factory and a backyard sort of garage. The backyard garage is not going to be as efficient; however, it has a lower energy footprint. Or another example would be if you were trying to transport a ton of coal from one side of the city to the other, and you did it on mopeds, or you did it with freight trains. Which one would be more efficient? The moped would use less energy per trip, but it would need more trips and therefore would use more energy overall. In fact, this is one of the things most people misunderstand. They think that edge computing is lower energy. Actually, edge computing is less energy efficient than computing in the data center. When you're computing in the data center, it's a little bit like that freight train. You're actually getting to do a whole bunch of jobs simultaneously. So the fact that we don't have to read from that external memory means that we don't have to spend the energy doing that. Even with GPUs, you get to batch. But going back to why it's so energy efficient: for the amount of energy used in a chip, there are these physical wires. And the physical wires have a width. And when you look at the width and you look at the length, you charge that wire up to set it to a one and then you discharge it to set it to a zero. It's sort of like charging a capacitor and discharging a capacitor, using energy. The longer that wire, the more charge. When you have HBM here and another chip here, you're actually having to charge a wire between the chips and then discharge it every time you send a bit. That's a long distance to travel. But also the wires are wider than the wires that are inside the chips. So you just use a lot more energy. When we keep that memory in the chip, it's only traveling a little distance using much thinner wires, and therefore it uses a lot less energy.
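The capacitor analogy above corresponds to the standard first-order model of switching energy: charging a wire of capacitance C to voltage V and discharging it again costs about C times V squared, and C grows with wire length and width. The sketch below uses that model with purely hypothetical lengths, capacitances per millimetre, and voltage. None of these numbers come from the conversation or from any real chip.

```python
def wire_switch_energy(length_mm, cap_per_mm_farads, voltage):
    # Energy in joules to charge a wire to `voltage` and discharge it once.
    # First-order model: E = C * V^2, with capacitance proportional to length.
    capacitance = length_mm * cap_per_mm_farads
    return capacitance * voltage ** 2

VOLTAGE = 0.8  # hypothetical supply voltage

# Hypothetical short, thin on-chip wire.
on_chip = wire_switch_energy(length_mm=1.0, cap_per_mm_farads=0.2e-12, voltage=VOLTAGE)

# Hypothetical longer, wider chip-to-memory wire (wider wire, more capacitance per mm).
off_chip = wire_switch_energy(length_mm=10.0, cap_per_mm_farads=1.0e-12, voltage=VOLTAGE)

print(on_chip, off_chip, off_chip / on_chip)
```

With these made-up values the off-chip transfer costs tens of times more energy per bit. The real ratio depends on the actual process and packaging, but the direction of the effect is the point being made.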
Unnamed Interviewer
So do we see a world of LPU and GPU usage in combination? How does that distribution look between LPU usage and GPU usage?
Jonathan Ross
There's a couple of things. The first is training should be done on GPUs. Nvidia will sell every single GPU they make for training. Right now, about 40% of their market is inference. If we were to deploy a lot of much lower cost inference chips, what you would see is that same number of GPUs would be sold, but the demand for training would increase. Because the more inference you have, the more training you need and vice versa. The other use case is we're actually so crazy fast compared to GPUs. We've actually experimented a little bit with taking some portions of the model and running it on our LPUs and letting the rest run on GPU. And it actually speeds up and makes the GPU more economical. So since people already have a bunch of GPUs they've deployed, one use case we've contemplated is selling some of our LPUs to sort of nitro boost those GPUs.
Unnamed Interviewer
Well, this is my question, which is that, you know, people have bought GPUs so far, far ahead of time that by the time you get them, they're deployed and installed, they're almost out of date.
Jonathan Ross
Actually, we've spoken with some customers that put orders in over a year in advance. They paid a year in advance and still haven't gotten them. The recent deployment we did in Saudi Arabia took 51 days from contract to the first tokens being served in production, in country.
Unnamed Interviewer
How are you able to do it so quickly? 51 days is astonishing.
Jonathan Ross
Yeah, part of it is, architecturally things are much simpler for us. We don't have a bunch of other hardware components. We actually don't use switches to communicate between our chips. We just plug our chips into our chips. Our chips are the switch. We don't have all of this network tuning.
Unnamed Interviewer
Given the energy efficiency, given the predictability, why is Nvidia not being more proactive on LPUs?
Jonathan Ross
What makes you think that they don't want to be more proactive on it?
Unnamed Interviewer
They don't talk about it.
Jonathan Ross
Well, why would they talk about it? That would be like talking about something you don't have when you're trying to project strength rather than vulnerability.
Unnamed Interviewer
Well, I think if you wanted to protect shareholder value and wanted to protect a Wall Street image of dominance and being ahead of the game, you'd at least say, ah, we are of course working on LPUs as well.
Jonathan Ross
But then until they had LPUs, they would effectively be exposing that there's something missing. Like if you look at the last GTC, there was an announcement that the latest GPUs were 30x faster than the previous generation. When you look at how it was done, there was this curve that looked kind of like this, and then it basically ended here. And then there was another curve that was kind of like this. Now that 30x was from the end of this curve to this curve. If you moved it here, it would have been less than 30x. If you moved it here, it would have been infinite. So their chip is infinitely faster than the previous one. But that wouldn't have sounded reasonable. Right. There's a history in this market of specsmanship because it's so hard to get access to chips. And this is a lesson on enterprise sales. In enterprise sales, people rely on specsmanship.
Unnamed Interviewer
Because specsmanship is what?
Jonathan Ross
Well, my specs are better than your specs, my chip is faster than your chip, I get more teraflops per second than you do. But who cares? Tell me what the tokens per dollar is and tell me what the tokens per watt is. Nothing else really matters. But people will find all of these other weird things to measure that they might be better on. Sort of like, I'll sell you a car with better RPMs. RPMs don't matter. What matters is miles per gallon and maybe the speed that you can drive at. Although speed limits kind of render that, you know, moot. Right. In the case of consumer sales, there was a time when you would market soap. The billboards would say, our soap has more bubbles than this other brand's soap. Who cares? And what they figured out was, let's put really happy people up on a billboard after they use the soap. And then maybe people associate that happiness. Right. Lifestyle marketing.
Unnamed Interviewer
Sure.
Jonathan Ross
For some reason, enterprise still hasn't learned this lesson. It's still, we have more bubbles, we have more teraops, we have more whatever things that people just literally don't care about.
Unnamed Interviewer
So you think Nvidia's "hey, we're 30 times faster" is not good marketing?
Jonathan Ross
It worked because it's what people are used to. But our counter was a press release in response that said, "Groq still faster." That was it. And people went gaga over it, right? Because it was just: we're still faster, so who cares?
Unnamed Interviewer
Do you think Wall Street understands that, though?
Jonathan Ross
I think they're starting to. But again, I don't think there's real competition here. I think if you are competing, you have done something seriously wrong. If you're competing, it means that you haven't found an unsolved customer problem. Because if you're competing, someone else has already solved the problem. So why are you spending time on it?
Unnamed Interviewer
You don't view Nvidia as a competitor?
Jonathan Ross
No. They don't offer fast tokens and they don't offer low-cost tokens. It's a very different product. But what they do very, very well is training. They do it better than anyone else, and by such a wide degree that it's a solved problem. Why would we bother trying to solve a problem that's already been solved?
Unnamed Interviewer
So you're like, cede the training market to them. We'll own the inference market.
Jonathan Ross
Yeah.
Unnamed Interviewer
And they're saying, we also want the inference market.
Jonathan Ross
Of course, the way it always works.
Unnamed Interviewer
So what do we do now? So now we are competing in the inference market.
Jonathan Ross
But are we? Yeah, we don't really have people saying we're going to buy GPUs instead of you. We do have people saying we're going to buy both. That happens, but we don't care. I showed a demo to someone and he's like, should we just not buy any more GPUs? I'm like, no, you should buy every single GPU you can get your hands on. And he's looking at me very perplexed, and I'm like, how are you going to do training? We don't do training. Buy the GPUs, get every single one you can. Because I want your models running on us to be ready.
Unnamed Interviewer
But for inference, they don't need to buy Nvidia anymore.
Jonathan Ross
They don't need to buy GPUs for inference. But if you can get them, I mean, they're a little expensive, but if you're used to it, why not? People still sell mainframes. If you want lower cost and faster, then you want an LPU.
Unnamed Interviewer
How much lower cost is it?
Jonathan Ro
More than 5x lower.
Unnamed Interviewer
More than 5x lower.
Jonathan Ro
Just the memory alone in the latest GPUs costs more than our fully loaded capex per chip deployed. And on top of that, we talked about the energy efficiency. We use about a third of the energy per token. Over a three year period, one third of our cost is the opex, which is mostly energy and data center rent, and two thirds is the capex, which means that since we're one third of the energy, the cost to run that GPU to produce the same number of tokens for inference is the same as our total cost. Just the opex for the GPU is the same as our capex plus our opex.
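The arithmetic behind that last claim can be checked with a small sketch. It normalizes the LPU's total cost to 1 and uses only the fractions quoted above; the one added assumption is that opex scales in proportion to energy per token, which is a simplification since opex also includes data center rent.

```python
from fractions import Fraction

# Normalize the LPU's total cost for a given number of tokens to 1.
lpu_total = Fraction(1)
lpu_opex = lpu_total * Fraction(1, 3)   # quoted: one third of cost is opex
lpu_capex = lpu_total * Fraction(2, 3)  # quoted: two thirds of cost is capex

# Quoted: the LPU uses about a third of the energy per token,
# so the GPU uses about three times as much.
gpu_energy_multiple = 3

# Assumption (not from the interview): opex is proportional to energy.
gpu_opex = lpu_opex * gpu_energy_multiple

# GPU opex alone equals the LPU's capex plus opex.
print(gpu_opex == lpu_capex + lpu_opex)
```

Under these assumptions the GPU's capex would come entirely on top of a cost that already matches the LPU's total.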
Unnamed Interviewer
I'm really sorry I'm asking the most stupid questions, but I'm just rolling with it. We're in Paris and it's nearly the end of the day. Why is 40% of their revenue inference then? Why have you not taken so much more of that?
Jonathan Ro
At the beginning of 2024 we only had 640 chips. At the end we had 40,000. We're not at that scale yet. You have to provide quality. You have to provide low cost, you have to provide speed, but you also have to provide capacity. And so this is where that most important part of not using HBM came in. It means that we effectively have no scale limits. So the GPU itself is actually manufactured using the same process that you use for your mobile phone. So the same silicon that's in your mobile phone is the same silicon for the GPU. In fact, they build the mobile phones first, the mobile phone chips first, because they're smaller, so they yield better. Nvidia actually gets it after Apple. The difference is that memory, that's the only difference. But that memory is the hard part to manufacture. That's limited in scale. So by us avoiding that, we effectively have almost no limit on how much we can scale up. And that's important for inference.
Unnamed Interviewer
What is Nvidia's margin?
Jonathan Ro
70 to 80%.
Unnamed Interviewer
70 to 80%. So they can take 70 to 80% off and be radically more comparatively cost effective compared to you. You could destroy their margin.
Jonathan Ro
But in that same vein, you could almost say we're one of the best things that ever happened to Nvidia because they can make every single GPU that they were going to make and they can sell it for training. High margin gets amortized across the deployment. We'll take the low margin, high volume inference business off their hands and they won't have to sully their margin.
Unnamed Interviewer
What's low margin?
Jonathan Ro
Depending on the deal, we do get some on the backside, but up front it's about 20%.
Unnamed Interviewer
About 20%?
Jonathan Ro
Yeah.
Unnamed Interviewer
Okay, so theirs is 80, yours is 20, but then you're looking at a 20x.
Jonathan Ro
But then we get more later off of it, so we take some of the risk.
Unnamed Interviewer
What do you mean you get more later? Sorry.
Jonathan Ro
So in the deals that we do, because we don't spend money on our own capex, the partner will put up the money for us to deploy. We pay back with a decent IRR, but we split and most of it goes to the partner. And then once we hit the IRR, it flips the other way. So others are putting the capex up for us.
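A minimal sketch of how a split that flips at a hurdle could work. The interview gives only the shape of the deal, so every number here (capex, yearly cash, hurdle rate, both split percentages) is hypothetical, and tracking the partner's balance as capex compounding at the hurdle rate is one common way to model "hitting the IRR", not a description of the actual contract.

```python
def waterfall(capex, yearly_cash, hurdle_rate, share_before, share_after):
    """Split each year's cash between the partner and the operator.

    The partner's outstanding balance starts at the capex and grows at the
    hurdle rate. Until it is repaid the partner takes `share_before` of the
    cash; after that the split flips to `share_after`.
    """
    balance = capex
    partner_payouts, operator_payouts = [], []
    for cash in yearly_cash:
        balance *= 1 + hurdle_rate
        # Cash that must flow at the pre-flip split to clear the balance.
        cash_to_clear = balance / share_before
        pre_flip = min(cash, cash_to_clear)
        post_flip = cash - pre_flip
        to_partner = share_before * pre_flip + share_after * post_flip
        balance = max(balance - share_before * pre_flip, 0.0)
        partner_payouts.append(to_partner)
        operator_payouts.append(cash - to_partner)
    return partner_payouts, operator_payouts


# Hypothetical numbers, chosen only to make the flip visible.
partner, operator = waterfall(
    capex=100.0, yearly_cash=[50.0] * 4,
    hurdle_rate=0.10, share_before=0.8, share_after=0.2,
)
for year, (p, o) in enumerate(zip(partner, operator), start=1):
    print(f"year {year}: partner {p:.2f}, operator {o:.2f}")
```

With these made-up inputs the partner takes most of the cash in the first three years and the operator takes most of it in the fourth, once the balance has been cleared.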
Unnamed Interviewer
What does it look like at the end then?
Jonathan Ro
Well, at the end it's not like other business models. So we didn't just innovate on the chip, we also innovated on the business model. And we're limited in how much money we can make based on how much we can deploy, not how much money we have, because the partners are putting that money up. So when I'm looking at what we can do, it's all about how much we can scale.
Unnamed Interviewer
What are the limits to your deployment? Is it purely chip constraints?
Jonathan Ro
Mostly. So you're asking about misconceptions in AI. I think one of them is about power. It is true that there is a mismatch in the market between people with chips and people with power, but that's partially because you need a data center in the middle and there aren't enough data centers. Those aren't the hardest thing in the world to build. They're not easy, but they're not the hardest thing. It's harder to build up the power. However, because of that mismatch, you have big hyperscalers going around and saying, I need a gigawatt of power. And they'll say this to 60 different potential data center builders. Then all of a sudden you hear this echo. Well, I heard that there's a gigawatt here and a gigawatt here and a gigawatt here. And all of a sudden there's like 60 gigawatts of demand. And it's this echo from that first gigawatt. The thing is, I am aware of about 20 gigawatts of power that people want to make available for data centers now. Right now there's about 15 gigawatts of data centers worldwide. So more than double the current capacity. The concern that I have is that people are now building up more power. And what's going to happen in the next three to four years is people are going to be like, I built up all this power and no one's using it. This was like a complete waste and we're never going to do this again. Then what's going to happen? Remember that doubling of chips every 18 to 24 months? Well, three to four years, you double that 15 gigawatts twice. And now you're talking about, what, 120 gigawatts? There isn't that much power available. And then another one after that. Now you're at 240. And so what's going to happen is we're going to overbuild slightly right now just because of that mismatch and the miscommunication that's going on right now. And then we're going to dampen our building and we're going to close down on that, and then we're going to have the real need for the power. 
That's my big concern right now, because that power will become a hard bottleneck in three to four years.
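The doubling arithmetic can be tabulated in a few lines. This sketch only applies repeated doubling to the figures quoted above; the 30 GW starting point is an assumption added for comparison, roughly the existing 15 GW plus the further power mentioned.

```python
def capacity_after_doublings(start_gw, doublings):
    """Capacity in gigawatts after doubling `doublings` times."""
    return start_gw * 2 ** doublings


# 15 GW is the quoted current capacity; 30 GW is an assumed alternative base.
for base_gw in (15, 30):
    series = [capacity_after_doublings(base_gw, n) for n in range(1, 5)]
    print(f"start {base_gw} GW -> {series}")
```

Starting from 15 GW, two doublings give 60 GW, and 120 and 240 are the third and fourth. The quoted 120 and 240 line up with two and three doublings only if the starting point is closer to 30 GW.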
Unnamed Interviewer
Why will we have that data center oversupply when we are moving into a world of inference which will be 20x larger than training?
Jonathan Ro
So the problem with data centers is everyone thinks that data centers are real estate. And a lot of people do real estate. Data centers are not real estate. The common joke in the industry now is someone says, I'm going to have 100 megawatts of capacity for you and I'm going to have it in three months. Are you willing to sign? And then you ask a question like, what's your uptime? And they're like, I don't know, whatever the power grid is. You're like, wait, what? Where are your generators? Oh, I haven't ordered those. I'll order them now. You know that there's a 90 month lead time on generators right now? Oh, really? And then the next question is, where are you getting the water from? Wait, data centers need water? I thought it was a bunch of chips. So there's a bunch of people who have no idea what they're doing going into it because they think it's real estate. Those people are now building an oversupply of data centers, but they're not really building them. So they're fake data centers that people think are real.
Unnamed Interviewer
What happens to those data centers? Because they're not going to be utilized, are they? Amazon is not going to pay for a data center that doesn't.
Jonathan Ro
Well, Amazon doesn't fall for this. Amazon has really good people.
Unnamed Interviewer
Whoever the buyer is is not going to pay for a data center that's got no water or got no power.
Jonathan Ro
Yeah.
Unnamed Interviewer
And so is it just wasted from your.
Jonathan Ro
Most of these projects will never be developed.
Unnamed Interviewer
Will we build them fast enough? You said about the data, like the oversupply, it does take time to build a data center.
Jonathan Ro
So it's almost okay. If you train a model, you really want to amortize it for about six months. If you deploy chips, you really want to amortize it for three to five years. We're more on the three year side. Others are more on the five year side. If you build a data center, you're probably talking 10 to 15 years. And a power plant, you're talking like 15, 20 years. The problem we have in the industry is not on this. And there's this mismatch between the sort of financing and the needs here. So you have someone who wants to train a model and they're going to be doing this for six months and they don't understand why people want three to five year commitments on the chips. And then the people deploying the chips don't understand why someone wants a 15 to 20 year commitment on the data center. Is that seven years now on the data centers? And then the people building the data centers then need a long seven years commitment? Yeah, that's the kind of thing they're asking for. So you've got this complete mismatch throughout the ecosystem. But the funny part about it is while they all want to take zero risk and have a committed sovereign wealth level sort of credit rating, on the other side of it, with long commits, the longer the payoff time, the more generic the infrastructure is. A model has a pretty specific use, but accelerators like LPUs and GPUs can be used for other things besides generative AI or LLMs. The data center can be used for other things besides the accelerators. The power can be used for anything. While they're looking for the least risk over here, it's the place where there is the least risk. Because if we don't use it for AI, we'll use it to power all of the electric cars.
Unnamed Interviewer
Is this a case where incumbents win because they're one of the only ones who are able to match the duration required by data center providers?
Jonathan Ro
Well, and this is why we've partnered with Aramco in this new entity in Saudi Arabia, because they have an enormous ability to fund this over the long term. They have a very long term perspective. They have an amazing credit rating.
Unnamed Interviewer
When you say they have an ability to fund it. And this is where the misconception was. People think it's a funding round of a billion and a half. It's not a funding round of a billion.
Jonathan Ro
No, we did not raise 1.5 billion. That's revenue. That's actually about 30% of the revenue of OpenAI.
Unnamed Interviewer
Can you just walk me through how that deal is structured?
Jonathan Ro
Yeah, we started off last year, right, and we got to 19,000 of our chips deployed. We did that in about 51 days. The question was, what can we do this year? So they've gone off, they've collected up a bunch of power in the country and the deal is structured so that they will put up the capex for us to deploy our chips in that data center or those data centers and we pay back based on the money that we make. It's a little bit different than debt in that they participate in the upside. It's similar in nature. It is revenue because we actually make profit up front.
Unnamed Interviewer
How does that change what you can do?
Jonathan Ro
Well, we are not limited by capital anymore. There is one misconception around Groq. There was a paper that was written that said that we couldn't be profitable while being the lowest price. We could charge more. But actually we have a very positive contribution margin right now. As far as we know, we're the only ones that are actually making money running these open source models. Because with the open source models everyone's sort of competing with VC dollars, trying to take market share Uber style, right? And meanwhile we're sitting here going, we could do this all day long because we're making money. We're able to even pay off an IRR and make our partners money. There's another part of the model. So we're also working with some proprietary model providers. So we actually showed off the first one at LEAP on Sunday where we did a voice model with Play AI. That one is also a rev share. But the thing is they get to make money off of that, whereas most others in the industry are losing money because of the commoditization of the models.
Unnamed Interviewer
So do you have cheaper pricing over time as you bluntly have less monopoly power, or do you have higher prices as your monopoly increases?
Jonathan Ro
We want the margin to stay about the same, but we want the prices to go down because then we get into Jevons Paradox and life gets great. Because we're going to scale and our focus is on getting to scale. To preserve human agency in the age of AI, we need to be one of the most important compute providers in the world. And our goal by the end of 2027 is to be providing at least half of the world's AI inference compute. We think we could be further than 2x given that we don't have all the constraints. But in order to get there, we do need to be very aggressively building out and we need to give people no excuse for not running their models on us and using the models that are on us by charging extra. And what I keep telling the team over and over again, because you have to remind them sometimes, is we're growing faster than exponential. And when you are growing faster than exponential, there is no amount of profit that you can make that matters. What matters is getting a toehold in the market and becoming relevant.
Harry Stebbings
What could prevent that?
Jonathan Ro
We used to be worried that someone would try and price below us. And then we realized that wasn't a concern because there's so much money going into this that people are going to want to lose less money by running on us. That isn't a concern, but that was the big one early on, until we realized that.
Unnamed Interviewer
When we see Zuck investing $65 billion in data centers, what does that actually mean? That means he's internalizing all of the margins that he would have had to spend on data centers with the providers that we mentioned earlier. And Facebook is doing full stack, so.
Jonathan Ro
Meta is doing 65 billion a year. I think Google said 70 or 75 and Satya said Microsoft's doing 80. And then you've also got Stargate. Yeah, these are crazy sums of money.
Unnamed Interviewer
And this is all for data center build out?
Jonathan Ro
No, it also includes the stuff that goes in, includes the chips as well, the systems, everything.
Unnamed Interviewer
We've never seen money like this.
Jonathan Ro
No, no, there's never been anything like this, but there's never been a case where it was so clear that there was going to be value at the end if you knew how successful Search was going to be. Right. Remember, Google stayed private as long as they did because they were afraid that Microsoft would figure out how much money Search was making and then would try and replicate. And the moment that they went public, Bing, they called that perfectly. Everyone knows how much money there is in AI, so everyone's going after it.
Unnamed Interviewer
Do you think that value is distributed amongst many players or concentrated towards one or two? I completely agree with you in terms of the clear value when assigned. But is it distributed to some levels evenly or concentrated?
Jonathan Ro
It's a power law. The more value there is in the economy, the more risk there is of a single entity being so far on one end that they just dominate. And you see this with the Mag 7 and it's predicted just the bigger the economy gets, the more you will have big swings in the economic outcomes. Right now, the hyperscalers are all sort of even in their market caps, it's strange you would expect one of them to just be killing it and taking it much further. And so I don't understand why they're so closely grouped.
Unnamed Interviewer
When we think about that distribution, how do we think about changing that then? Obviously with Groq, you want to be one of the Mag 7, you want to be one of the most important companies in the world. How do you see that?
Jonathan Ro
So the way that you get there and the way that you stay there are two very different things. There's sort of a circle of life that happens in startups. The first stage is solve an unsolved problem. That's how you go viral, that's how you do well. The second stage is the marketing stage, which is now other people are trying to copy what you've done because they can't think of something themselves. And now you have to fight it out in advertising and marketing and whatnot. You see, CPG companies often get stuck there and it becomes more about where on the shelf they are than anything else. And then the final stage is the seven powers. It's once you've found some of those and you've really started improving it and you have sort of systemic advantages and then what happens is someone solves an unsolved customer problem and the whole cycle of life continues. Right now Google has to redo this because LLMs are better than search. So the way that you start off to become a Mag 7 is you solve that unsolved problem. The way that you stay there is first you find one of those seven powers or multiple, but then you have to be ready for when you get disrupted to continue fighting back and solving customer problems.
Unnamed Interviewer
We mentioned the different huge amounts of money that's being spent here. Is this a good bubble that lays the foundations for an incredible next 10 to 20 years where bluntly the capital actually turns out to be productive, but not seemingly so on paper? Or is it where actually just a huge amount of money is incinerated on depreciating assets?
Jonathan Ro
I can guarantee you that a huge amount of money will be incinerated. I also bet that in total more money will be made than will be put in. This is the problem. You have to look at it either in aggregate or individual bets. When everyone is making investments in the market, some people are going to lose money because not every company is going to be successful. What you always see is when there is some real tech improvements or things coming, you've got the things that were earned early that people are investing in heavily that are super successful and then everyone else wants to get in on it. And it goes from you have AI chips and AI models to now you've got AI T-shirts and next thing you know you've got AI thermal grease. People just start applying AI to everything. Next thing you know you'll have an AI condo. So the trick is discerning what is real and what isn't. You're always going to have all of these really obnoxious charlatans coming in whenever there's something real. And that's unfortunate, but eventually they get cleared away once people start to understand the technology and what's real and what isn't. The job is to start educating. And the more educated people are, the less they'll invest in AI thermal grease.
Unnamed Interviewer
What is the largest individual bet that will lead to the largest incineration of cash?
Jonathan Ro
I'm not going to call anyone out in particular, but I actually think it will happen across every single discipline. Are you aware of the Keynesian beauty contest?
Unnamed Interviewer
No.
Jonathan Ro
John Maynard Keynes, the economist. This will explain everything you need to know about vc.
Unnamed Interviewer
I'm nervous, but keep going.
Jonathan Ro
So take a magazine full of models. Human models, like good looking models, have a whole bunch of VCs in the room. They're allowed to make bets on who the most beautiful model is. In the end, whoever has the most money on them is the winner. And based on the proportion that you put on that particular model's face, you get the share of all of the money. If you put money on one that isn't the most beautiful by dollars, then you lose your money to the people who bet on that one. That was sort of the bet that SoftBank was making, which was they could win the Keynesian beauty contest. I'm just going to put more money in and I'm going to win. That is problematic when you have true technological advantages as opposed to marketing. When you're solving customer problems, it's a weighing machine. Once the customer problem has been solved, you then get into this sort of popularity contest of marketing. Now something unusual has happened this time around, which I don't think has ever happened in VC before, which is you see people raising billions of dollars who have competitors who've raised billions of dollars. Usually there is a clear winner. In the Keynesian beauty contest, you don't have this like, fight where, you know, it's sort of like, well, I got to put a little more money in, I got to put a little more. I got to, you know, put 10 billion in, I'm gonna put 20 billion, I'm gonna put 500 billion in. Because the Keynesian beauty contest has gone completely amok. This has never happened before. And so now people don't even understand how to react. Because it used to be if someone had raised a billion dollars, you're like, oh, they're the winner. Now it's like there's three or four competitors who have a billion dollars.
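The payout rule described at the start of that answer, where the candidate with the most money wins and the whole pot goes to its backers in proportion to their stakes, can be sketched as below. The bettor names and amounts are hypothetical, and ties are broken arbitrarily by whichever candidate was seen first.

```python
def beauty_contest_payouts(bets):
    """bets maps bettor -> (candidate, amount). Returns (winner, payouts)."""
    totals = {}
    for candidate, amount in bets.values():
        totals[candidate] = totals.get(candidate, 0) + amount
    # The winner is whoever has the most money on them, not the "best".
    winner = max(totals, key=totals.get)
    pot = sum(totals.values())
    payouts = {
        bettor: pot * amount / totals[winner] if candidate == winner else 0.0
        for bettor, (candidate, amount) in bets.items()
    }
    return winner, payouts


# Hypothetical bets, for illustration only.
bets = {"vc_a": ("A", 60), "vc_b": ("A", 20), "vc_c": ("B", 20)}
winner, payouts = beauty_contest_payouts(bets)
print(winner, payouts)
```

In this toy example the largest bettor on A both decides the winner and collects most of the pot, which is the dynamic of simply putting more money in to win.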
Unnamed Interviewer
So who wins and who loses? Is Masa going to incinerate the largest amount of cash ever?
Jonathan Ro
I think the Keynesian beauty contest no longer applies here because there's so much money available being spread out. And I think you're gonna see that the people who have the best products are actually gonna be the winners because everyone can be capitalized. But there will be problems for the winners because of this. The problems are going to be of the sort: you had this employee that you were going to hire and someone offered them a ridiculous amount of money. Yeah, you see this all the time now. And they could have gone and contributed to the winner, but now they're contributing to a competitor that shouldn't even exist or is equally likely to win. And now you're splitting the talent.
Unnamed Interviewer
What do you also do when you have such high salaries? We've seen a million, two million for kind of junior to mid level in some of these companies and they are living an amazing life actually in great places. You think they're living that amazing life in Guangdong when they're working for DeepSeek or any other Chinese alternative? I think they're actually getting paid much less working their fucking ass off 20 hours a day and not getting kombucha and being paid 2 million a year. Fair.
Jonathan Ro
Not only fair, we have a policy that we never offer the highest because we want people to choose us, not choose the salary. If we win in a bidding war, then that means the next time someone comes along with a higher salary, that's it. They're just going to go take that other job. There's no loyalty. They don't believe in the mission. Instead we focus on, look, we're going to build this. This is your opportunity. You're going to get to work with amazing people, spend some time with the team. Are these the people you want to be working with? Because frankly, you're going to make so much cash, it doesn't matter. But better the equity, the outcome, help us make this thing valuable. And people who buy into that, they're so much easier to manage because they're mission oriented. They all want to do the same thing. They're not there because they want the kombucha and they're not going to complain because the cappuccino machine is broken. They'll just go and buy their coffee next door.
Unnamed Interviewer
Will you and Nvidia move into the model layer? Everyone talks about model providers becoming application providers. Will infrastructure providers become model providers?
Jonathan Ro
We have decided that we're not going to train our own models. We'll do a little fine tuning for specific cases or whatnot. But we don't want to compete. That's really important because people are putting their models with their weights on us and they don't want us to learn from and take that stuff for our own benefit. This is the problem you have when you work with a hyperscaler because you know they're also doing everything that you are doing. So we've decided model providers. You make the model, we don't do that. There's also the data side of the users and the queries. So the other thing that we could do that we do not do is log the queries and then we've got data. If we wanted to train, we don't train. We have no reason to hold the data. We only temporarily store things in the dram. There's no persistent storage. If the power went out, everything's gone and DRAM is limited. So we can't hold things for a long time. So you know that we don't have your data. Now, people who are building businesses on top of us, you can obviously keep the data from your customers if you want. We have no control over that, that's fine, but we don't take any data.
Unnamed Interviewer
Do you think Nvidia will move into model providing?
Jonathan Ro
It's possible, but if I was them, I would avoid it because I wouldn't want to give the customers of mine. I mean, Nvidia is great at training, right? It's crazy. It would be like being a car company and then creating your own taxi. You're now competing directly with your customers. Right? And I think tech companies love to do this. We have a management philosophy and it's based on big O complexity. We only do things that require a sublinear number of employees. So what I mean by that is if someone comes to me and says, I need 10 people to go do this thing, a lot of people would say, well, why can't you do it with five? I would say, okay, you're supporting customers. If we double the number of customers, do you need 20 or do you need 11? Because I want to know, what's that growth rate? Are they automating everything? Right. We completely automated our compiler, we completely automated everything that we are, you know, large portions of our cloud. That means that we can scale with a small team. We have 300 people. With 300 people, we built our own chip, we built our own networking hardware and software, we built our own runtime, we built our own orchestration layer, we built our own compiler, we built our own cloud. All this with 300 people. Now, we would only be able to do this with a small number of people because you don't have the communication overhead. You have to decide what your constants and variables are. What are the things that you want to preserve. And one of the constants is talent density. We want to stay small, we want to stay nimble. And the other side of this is growth is a problem. So we measure our growth in what I call problem units. So a problem unit. Every time you triple something, you have about the same number of problems as the last time you tripled. Going from 100 employees to 300, 300 to 1,000, 1,000 to 3,000. Each one of those has the same number of problems. We scaled from 620 or 640 LPUs last year at the beginning to 40,000. 
That's four problem units. That's four triplings of the number of chips. If we were also tripling the number of employees, that would be another problem unit. Management bandwidth is limited. You can only solve so many problems, so you have to decide where you're going to allocate them. If you build things really well from the beginning and you can scale up with the number of employees you have, then you can scale over here. You want to triple the number of customers? There's another problem unit that you have to solve.
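Both ideas in that answer reduce to logarithms, which a short sketch makes concrete. A "problem unit" is treated here as one tripling, so the count is a base-3 logarithm of the growth factor. The second function fits a power law to the "20 or 11" question; the power-law form is an assumption added for illustration, not something stated in the interview.

```python
import math


def problem_units(start, end):
    """Number of triplings needed to grow from `start` to `end`."""
    return math.log(end / start, 3)


def scaling_exponent(staff_before, staff_after, customer_multiple):
    """Exponent k in staff ~ customers**k (assumed power-law model).

    k = 1 is linear scaling; k well below 1 is sublinear.
    """
    return math.log(staff_after / staff_before) / math.log(customer_multiple)


chips = problem_units(640, 40_000)
print(f"640 -> 40,000 chips: {chips:.2f} problem units")
print(f"10 -> 20 staff on 2x customers: k = {scaling_exponent(10, 20, 2):.2f}")
print(f"10 -> 11 staff on 2x customers: k = {scaling_exponent(10, 11, 2):.2f}")
```

Growing from 640 to 40,000 chips works out to a little under four triplings, consistent with "four problem units", and needing 11 rather than 20 people after doubling customers corresponds to an exponent far below 1.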
Unnamed Interviewer
What's the biggest challenge when you are scaling at that rate, but then the team is not scaling in conjunction with it.
Jonathan Ro
There's this common belief that the people that you have early on are right for the job and the people that you get later, they're better in a more corporate environment. I don't think that's the case. I think you should always try and get generalists. Otherwise you get stuck and ossified in a particular way of doing things because that's what that one person knew how to do. But there are people who burn out. Being in a startup is hard. There are people who just literally burn out. There's also people who were the best that you could get at the time. And then there are people who are just unmanageable wild children and they should go off and start another startup and they shouldn't be scaling with you. That happens, but it's the rarer of them. Saying that you're going to hire B players because you've gotten large enough is laziness and an excuse. It's a lack of creativity in your business model and the algorithm of how you're going to scale. Think of it this way. Walmart versus Amazon. Walmart wants to double the number of customers. They have to double the number of stores and employees. Amazon does not need to double the number of websites. That's a fundamental advantage. But Amazon still has to double and improve the logistics. They don't have as many problems where they have to scale linearly, but they have some. If you wanted to disrupt Amazon, what you would do is you'd build a completely robotic logistics system and bring the overhead and complexity of that down and then you could outmaneuver them. That's how you need to improve. Don't just say, I need more people. Focus on the algorithm of your business.
Unnamed Interviewer
The last time we spoke, we discussed DeepSeek. I think more has come out over the last few weeks about, bluntly, their innovations, some of the distillation that they used. Where is China better than us today?
Jonathan Ro
Well, they're more willing to use things that maybe they shouldn't be using. They distilled the OpenAI model. A lot of people have the opinion, well, OpenAI was scraping the Internet, so good one for DeepSeek. But most of the model providers had considered that a red line they didn't want to cross. I don't know if that's going to change, but it might.
Unnamed Interviewer
The other thing is the open source nature of DeepSeek. OpenAI now benefit from the innovations that they did also have.
Jonathan Ro
Well, and they also probably have all the data that DeepSeek paid them to generate. They also were clever, they innovated. I think the biggest thing is this is a shot in the arm for morale in China and it gives them a sense. But as I said, this is Sputnik 2.0. It's also woken up the US.
Unnamed Interviewer
How do you compare Stargate to the $128 billion that China have now committed?
Jonathan Ro
China has a more complicated situation and a simpler one at the same time. The problem is they don't have the technology that we have in terms of the chip efficiency. On the other hand, they have scale. If they wanted to deploy 150 nuclear reactors, as I think the plan is, no big deal, they just do it. So if the chips aren't as efficient, they can just deploy more of them. On the other hand, if they want to go out into the world and deploy chips like they did with Huawei and networking gear, that's going to be complicated because people aren't going to have the power around the world to run more expensive accelerators. At home, I don't think anything's a problem. I only think as they're trying to expand, it's going to be an issue.
Unnamed Interviewer
What do we not know about China that we would like to know?
Jonathan Ro
I think the most important thing to understand is where they're going to end up on the censorship and privacy of these models. We come from democratic countries. We have an expectation that companies can build something that says anything. Are they going to be permissive and allow models to make mistakes and hallucinate, or are they going to shut it down? If you know that, you know whether or not China has a shot. One of the biggest nightmares that they have is free speech. It's the exact opposite of that vulnerability we talked about earlier. Can you imagine Xi Jinping going out and saying, country, we've lost our advantage in AI. I need your help. Never, ever. It's always going to be, we're the greatest, we're the best. Everyone's going to know differently, but they're all going to have to toe the party line. Right now, because of that, I think it's really hard for them to just allow these models to say anything, say, the US is great and better at AI. That's a bad thing for them. And so that's going to really tell you a lot about the AI story in China.
Unnamed Interviewer
And so if they aren't permissive of more open, truthful models, then they're inherently disadvantaged.
Jonathan Ro
You're saying... I forget if it was... Was it Jack Ma who got in trouble with the CCP?
Unnamed Interviewer
Yeah.
Jonathan Ro
If they aren't more permissive, then if you are running a Chinese tech company, your fear is that you become Jack Ma. That's really going to stifle innovation. If I was in China right now, I'd be looking for the exit. If your craft is AI, I would want to do that someplace that's supportive.
Unnamed Interviewer
Do you really buy that they don't have access to Blackwell? This is China. Xi Jinping's like, no, sorry, no Blackwell.
Jonathan Ro
I don't think it matters on whether or not they physically have it. Because right now, most of the cloud providers are happy if you swipe a credit card to rent it to you.
Harry Stebbings
But there are limits to renting.
Jonathan Ro
No, one of the concerns right now is about Malaysia or Singapore, that region over there being a place where people are deploying GPUs with the wink, wink, like, we're not going to rent it to China. There's a belief that a lot of people are doing that. Otherwise, that's a lot of GPUs for that region. But that feels like it's even more of a safety net, just in case the tap ever gets turned off at the hyperscalers. Because right now, you could just write a check to any of the hyperscalers and say, I need these chips. They'll deploy them and you can run on them. Doesn't really matter where you're coming from. I mean, if you're a sanctioned country, no. China's not sanctioned.
Unnamed Interviewer
So we have China, where China is obviously in terms of innovation and actually proving that they are in the race. We have the US and then we have Europe, which feels like it's languishing. Is this the ultimate nail in Europe's coffin?
Jonathan Ro
We talked about how Groq almost died, but we had the right technology all along. We were just waiting for the thing, for the LLMs to arrive. And I think Europe's very similar. I think Europe has amazing talent, amazing talent, but that talent leaves and goes to the US or other places. The question is, how do you have Europe's LLM moment? How do you position yourselves? And it's not that complicated. The problem is when you surround yourself, you become the average of your five closest friends, right? If your five closest friends are like, that'll never succeed. Ah, you should just keep your job. Ah, startups, they're terrible. Then you're going to be risk averse. But if your five closest friends say, you should do it, that's great, I support you. Then you're going to be more likely to do a startup. Even in Silicon Valley, people make that transition from the big tech company to the startup and it's hard. They're comfortable, the big companies take care of them. They have a fiduciary obligation to their family. How do they make that leap? And it's because you've got tons of entrepreneurs trying to hire them and they hear the pitch all the time and they get used to it. They also see the success around them. And then VCs come in and try and close some of the candidates in the early stage too. Right. Europe needs the same thing. You need a place where people are surrounded, just surrounded by entrepreneurial people who are risk on and who aren't going to try and talk people out of joining a startup.
Unnamed Interviewer
From a regulation perspective, Europe is unbelievably efficient and the masters of regulation. You know, I was speaking to someone the other day in the EU who supposedly hired 1500 people for AI safety and policing. What would you do if I put you in charge of European AI regulation?
Jonathan Ro
Well, I wouldn't waste my time regulating something that doesn't exist. Instead of regulating, what are you going to promote? You want to promote risk taking. You want to promote that enclave of people who are risk on. I was just visiting Station F yesterday. Amazing. Macron was there. It was like full of people, right? Vibrant, you feel it. And I was talking to the person who runs Station F, Roxanne, and Xavier Niel. And with Roxanne, we were talking about, what about a City F? What about you start off with like 10,000 people in the center, right? A little radius, and then once that's full, you expand it. Once it's full, you expand it and so on. To get to like a million people in Europe who are all risk on, a little Silicon Valley here, I would give it special economic dispensations. I would allow everything that employers need. I would make it simple and I would say, you know what, if you don't want to buy into that, that's fine, go to other regions in France, go to other regions in Europe. But if you want to participate in what is going to be the biggest technological revolution in human history, this is the city for you.
Unnamed Interviewer
You're not inherently punishing incumbents then. And what I mean by that is, if we are talking about. I'm just using this as an example. AI, insurance underwriters, startups. There's many companies that are going after insurance underwriting and AI and you are giving them benefits like that. You are inherently punishing some of the biggest providers of insurance in your region. You're inherently punishing people who hire 200,000 people. That feels unfair.
Jonathan Ro
There is no right to be an incumbent, especially a slothful incumbent that is not reacting to disruption. And you want to encourage disruption. This is one of the things in Silicon Valley you can move from one place to another. There are no. We had non solicits when I started, but even that's gone. That free movement of people is very important.
Unnamed Interviewer
Are you allowed to start work straight away?
Jonathan Ro
Straight away, but not before. If you start before, we have months.
Unnamed Interviewer
Months, six months.
Jonathan Ro
There's no such thing. And so in that region, I would say you can immediately start, like literally early the next day.
Unnamed Interviewer
That is so good. We have to wait six months.
Jonathan Ro
It's not good. If you are a company right now, it feels like, okay, well, it's harder to poach, but what does that do? It suppresses wages. It's harder to hire someone, they're less likely to move, there's less competition, it suppresses wages.
Unnamed Interviewer
And by the way, the company has to pay for the six months anyway. It makes no sense at all. So I totally get you and understand that. Can I ask you, you mentioned what would you promote? A lot of people would promote... I loved what you said: risk on. Being a European, I actually thought first safety and regulation, specifically safety. So sticking with that. All that Dario will talk about these days is safety. Is he losing a step by being so focused on safety, when bluntly his competitors are talking about product?
Jonathan Ro
So safety matters. In AI, it's a little bit like nuclear power. Lots of pros, lots of cons. I'm worried about different things than Dario is worried about. I'm more worried about people voluntarily giving up their decision making authority because it's so easy. And this is what I mean by preserving human agency in the age of AI. Good analogy is you probably know plenty of wealthy people and the struggles they have bringing up children with wealth. I refer to it as financial diabetes. You have children who aren't incented. They're not going to strive to succeed. I was very fortunate when I was growing up. I actually just told this story for the first time today, so no one's heard it, but I was fortunate because my father lost all of his money multiple times. And he would sell a billion dollar life insurance policy and he would get all the commissions from that, and he would have tons of money and then he would spend it all. There was one time we were living in a $20 million mansion, and there was a couple of times where we ordered food. He would talk to the delivery guy and he would convince him to give us the food, and he would pay him back later because we'd get money later. But this time he was like, so despondent, he sort of locked himself in his office and wouldn't come out. My little brother came to me and said I had to go and talk to the Chinese food delivery guy and convince him to give us the food. I was mentally preparing how to convince him to do it. And I walk out and I walk up to him and I'm like, getting ready to do my whole spiel. And he hands it to me and I'm like, I don't have the money right now. He's like, oh, yeah, pay me later. I didn't have to. I fortunately didn't have to do anything. He just trusted because we were living in a $20 million mansion. But that happened multiple times. And when that happens multiple times, like, I have a friend who was homeless once for a couple of weeks, and he'd almost been homeless a couple of times. 
And he said the best thing that ever happened to him was that he was homeless for a couple of weeks because he survived it. And it's like, I've been through it. I always viewed this as the worst thing that could ever happen in the world. But now that I've been through it, I can survive it. I'm not worried anymore. I think we live incredibly comfortable lives. Way too comfortable. Most people don't have to go through that. We have this sort of financial diabetes as a society. And I think it's going to get worse. With AI, we're really going into an age of abundance. Very few people have to worry about food security now. But what happens if you don't need to worry about home security or anything? What happens if you can just live a life without working? And what is that going to do to your psychology? And so as we enter an age of abundance, how do we get people to still be making their own decisions and have a fulfilled life?
Unnamed Interviewer
Do we get better or do we get accepting of good enough? And what I mean by that is, you know, now bluntly, with. With the majority of schedules, we will start with OpenAI and we will do deep research. And then we will use different prompts depending on different guests. And then we supplement it with a huge amount of research from speaking to Chamath and speaking to Scuder and speaking to everyone in between. We care about it being good enough first and then great later with all the references. Most people will actually just be happy with good enough and get away with it. Do we as a human society get happy with good enough?
Jonathan Ro
When we hire, we hire for something that we call booking the win early. One of the most important driving forces for people is loss bias. When you have something, you don't want to lose it. When we have an engineer that we're hiring and there's a room full of people who are saying, if we do this thing, we could be twice as fast, I want that engineer to hear, wait, if we don't do that, we're going to be half the speed we could have been. The loss bias, right? Book the win early because it's possible, it must be done. That's a smaller segment of the population. Those are the people who deliver amazing things that no one else is going to do because everyone else is like, that's good enough. However, I think with AI, it's so easy to create a prototype to stand out. You're going to need to do that. And one of the things that's happened with the ability to communicate more freely and see what other people are doing. Think back to what the restaurant experience was 20 years ago versus what it is now. The average restaurant is better than what high end restaurants were 20 years ago, because people see all of the stuff that others are doing the best and they start to expect that. And you have less localization and more globalization. You have to compete at the highest ends. AI is no exception. There's going to be 40 people creating that app. You know that you have to polish it in order to stand out.
Unnamed Interviewer
Listen, dude, I could talk to you all day. I do want to do a quick five. What do you believe that most around you disbelieve?
Jonathan Ro
I'm going to go with anti founder mode here. I'm anti founder mode. I believe in delegation. When you are telling people how to do their job, that is an indication that it's not... not necessarily a problem with you. It could just be that that person is not right for that job. And it's much easier to just direct them than to go find someone else competent. But it also means you probably haven't aligned them. We align people through this challenge coin. Everyone at Groq carries this 25 million token per second challenge coin. What this is, is it tells everyone what we're doing. It's an alignment. I can't tell you how many people I've showed that to who are like, that's awesome. And yet no one else is making them. You like to say you know the greatest things in life.
Unnamed Interviewer
Yeah, I know. Gold, but unmade decisions.
Jonathan Ro
Exactly. Well, this was a very made decision because I had to consolidate everything we were doing into one very simple message of we're going to get to 25 million tokens per second and then engraved it on a coin on this tiny amount of space right here and gave it to everyone at Groq. And now whenever we're in a meeting and something doesn't help with this, you can just tap their coin on the table and be like, no, no, that's not the way this is going to go.
Unnamed Interviewer
Is everyone wrong on founder mode then?
Jonathan Ro
I think that's what you do when you don't have the quality of people working for you. You need the right gearing ratio between you and your direct reports.
Unnamed Interviewer
It's a really unfair question, but I have to ask it. How do you analyze Elon's attempt to buy OpenAI?
Jonathan Ro
So I was sitting at the Élysée Palace, or however I pronounce it, at the dinner with Macron and Sam Altman. It was Macron, it was J.D. Vance, and it was Sam Altman. Frankly, I think Elon was a little jealous that Sam Altman was sitting next to J.D. Vance and it wasn't him. And because it was right around the time that Sam Altman was speaking that he announced it. And frankly, I thought Sam's tweet response part of it was pretty good. I would have probably said, instead of whatever he said about 9 billion, I would have said, yeah, I'm going to take Twitter public at $420 a share. It was attention grabbing, right? Some people can't stand to not be getting attention. And so my revenge on this is to give as little attention as possible. So let's move on.
Unnamed Interviewer
What would you do if you knew you couldn't fail?
Jonathan Ro
I would put in 100% of the orders for every single chip we could possibly manufacture. Because right now the demand is unlimited. But every time you triple, you find the same number of problems. And so you got to keep... You got to do it a little judiciously. But if I knew that no matter what problem was going to come up that we didn't need to be safe at all, I would just go, great, we're going to go build 20 million chips. Done.
Unnamed Interviewer
In 10 years' time, is Nvidia 3x bigger, 10x bigger or 50x bigger?
Jonathan Ro
I think they will be bigger. Training will become more important. I wouldn't be surprised if they were 3x bigger. I also wouldn't be surprised if they stayed around the same. It's so hard to tell where things are going because remember a lot of assumptions in the investment in Nvidia were they were going to run away with the entire market, including the inference market. And they just haven't built the right thing for inference. I do think as a weighing machine they should increase in value, but there was so much popularity contest applied to it that I don't know if they might need to grow their revenue to get to where they are. But it's a pretty fair multiple given everything going on. The popularity contest skews everything.
Unnamed Interviewer
What's a crazy AI prediction you have that everyone else thinks is science fiction?
Jonathan Ro
I would assume that in the next 10 years, and I know this is going to be crazy, but you saw that picture of me and my weight loss, right?
Unnamed Interviewer
Unbelievable, dude.
Jonathan Ro
70 pounds. 70 pounds. Yeah. I was on Mounjaro. So if you know anyone who's overweight and it's hurting their health, get them on Mounjaro as soon as you can. It works.
Unnamed Interviewer
What is Mounjaro?
Jonathan Ro
It's one of those GLP inhibitors, one of the weight loss drugs that have become popular recently. It works. But my crazy AI belief is that if it is possible, if it is possible to significantly slow or stop aging, I think that you will have a Mounjaro moment in maybe the next 10 years. Because that came out of nowhere. All of a sudden, you could just lose weight. Something finally worked. And I don't know if it is possible to slow or stop aging, right? Some wear and tear is a real thing, and it might just be impossible. But if it is possible to slow or stop aging, if, then I think in the next 10 years, we will do it. And it will be sudden. It'll be like the Mounjaro. It'll be like that moment.
Unnamed Interviewer
I don't see how it is not possible. Like, when you look at the advances that will come in medical research, I don't see how it's not possible that we will at least extend longevity by 60 years. I mean, Dario said we'll live to 150. I don't see why that's impossible.
Jonathan Ro
I don't either. But I also don't know that it is possible. And until I know that, I'm going to stick that conditional in there and say, if possible.
Unnamed Interviewer
What have you changed your mind on in the last 12 months?
Jonathan Ro
And this is less of a mental one and more of an emotional one. We didn't have product market fit for seven years. That is terrible. Like the morale, like. Yeah. When you find product market fit, the world is brighter, the birds sing. I feel like hugging people. You sleep, I sleep. Life is better. I forget if it was you or someone else. Someone was talking about type one, type two happiness. Yeah, I think there's a third. As a founder, the only type of happiness you get is this third type, which is future happiness. The other two, the common ones, are the present is happy. Right. The other one is you went through some real crappy stuff, but the memories make you happy. Right? So there's past, there's present, and there's future. As a founder, you're living 100% in future happiness. When you get product market fit, you start to get that past happiness. And when you start to get that revenue and everything, then you end up getting the present happiness and it changes everything.
Harry Stebbings
If you had to bet on one company other than Groq to define the AI era, who would it be?
Jonathan Ro
Probably focus more on the companies that you haven't heard about. And I don't know what the companies are, but I can tell you what they're going to do and I can tell you what each one of them will be. The first one will be the one that solves the hallucination problem. The second one will be the one who is best able to break down sub goals for agentic. I think agentic comes after you solve the hallucination problem, because otherwise you get these long chains where you can introduce hallucinations. It'll kind of work, but it'll work much better after. I think the next one is what I call the invent stage. So right now, the way the LLMs work, they make the most probable prediction. It's actually kind of amazing. It's like I'm going to take an entire novel. You've got a detective, you know, murder mystery. You get to the point where the detective says, and the murderer is. And it can actually predict it. It had to understand everything. Right. But it's going to give you the most probable answer. And that's not good for invention. It's not good for art, writing. The reason that the writing from LLMs is terrible is because it's predictable. So how do you actually say something that's non obvious but is obvious when you see it? We don't even have the right word for it. Right. Non obvious, but obvious. That is going to unlock invention. And then the final one is what I call the proxy stage. When someone makes it so that models can just make decisions for you. You can proxy your decisions. Like the decision to do this interview. Right. Other things had to be canceled. The flight had to be booked. We had to get a ride over here. Right. You would trust an EA or a chief of staff to make that decision. You wouldn't trust an LLM. And that's the final stage, I think, before you get to Generative Act. But each company that does that is going to be a defining company.
Unnamed Interviewer
You said that we're going to have to fix hallucination before we get efficient agents. Does that mean that money going into agents today will be burned?
Jonathan Ro
No. Let's take an example on hallucinations. So the examples I gave you were medical diagnosis and law, two areas that will be unlocked once we get rid of the hallucinations. But there are startups like Perplexity that are doing just fine right now, even though there's hallucinations, because it's not high risk. It's for entertainment only. But if you click those links, you can check them and it works kind of okay. It depends on how risky the industry is that you're in on whether or not you can get started trying to position for the wave early and generating. But if you're in the right position. We were in the right position for seven years and the wave came, so that money isn't incinerated. In fact, that recent deal we just announced is more revenue than the money we've raised.
Unnamed Interviewer
How does that cash hit?
Jonathan Ro
I know it's a nerve throughout the year.
Unnamed Interviewer
Throughout the year, yeah.
Jonathan Ro
But it's this year. Potentially more next year, a lot more.
Unnamed Interviewer
What is that contract in three years?
Jonathan Ro
If we sell everything that we possibly can this year, it is many billions. But just from the capacity alone, there's tens of billions of capacity of hardware that we could build next year in these sort of deals. But we're also doing it at high volume, low margin. So if we were talking about GPU sales and GPU prices, I mean we'd be talking about hundreds of billions. We're just not charging that much.
Unnamed Interviewer
Final one for you. The thing that I'm singly most excited for is actually like disease discovery in terms of drugs. Obviously my mother has MS. And that's incredible. It was always taught to me that it was incurable and actually now it's like actually maybe not. What are you singly most excited for?
Jonathan Ro
We went from a phase where people were hardware engineers to they were software engineers. To be a hardware engineer is ridiculously difficult. The training, you have to get things right. There's a real expense if you get it wrong. Becoming a software engineer, so much easier. All you have to do is get a little bit of time on a machine and you can teach yourself. Nowadays you can just download manuals from the Internet or tutorials from the Internet. I think prompt engineering is going to unlock a huge swath of human society. There's 1.3, 1.4 billion people in Africa who know how to speak. If you were to give them access to a tool that they could create applications live just by speaking to it, that would be another 1.3, 1.4 billion potential entrepreneurs. There's 8 billion people on the planet. And the difference is hardware was just ridiculously difficult. It was arcane knowledge that's hard to get. Software was plentiful. Language, you already know it, you don't have to learn a thing. What's that going to do for venture? What's that going to do for entrepreneurialism?
Unnamed Interviewer
Jonathan, I love talking to you. It's always such a broad and wide ranging discussion. Thank you so much for putting up with me in person and I've loved it.
Jonathan Ro
Awesome. So glad to be here.
Harry Stebbings
I have to say that show was so much fun to do in person. I want to say a huge thank you to Jonathan for giving up his time on the Paris trip after the AI summit. If you want to watch the episode in full, you can find it on YouTube by searching for 20VC. That's two-zero-VC. But before we leave you today, turning your back of a napkin idea into a billion dollar startup requires countless hours of collaboration and teamwork. It can be really difficult to build a team that's aligned on everything from values to workflow, but that's exactly what Coda was made to do. Coda is an all in one collaborative workspace that started as a napkin sketch. Now, just five years since launching in beta, Coda has helped 50,000 teams all over the world get on the same page. Now at 20VC, we've used Coda to bring structure to our content planning and episode prep, and it's made a huge difference. Instead of bouncing between different tools, we can keep everything from guest research to scheduling and notes all in one place, which saves us so much time. With Coda, you get the flexibility of docs, the structure of spreadsheets, and the power of applications, all built for enterprise. And it's got the intelligence of AI, which makes it even more awesome. If you're a startup team looking to increase alignment and agility, Coda can help you move forward from planning to execution in record time. To try it for yourself, go to coda.io/20vc today and get six free months of the team plan for startups. That's coda.io/20vc to get started for free and get six free months of the team plan. Now that your team is aligned and collaborating, let's tackle those messy expense reports. You know, those receipts that seem to multiply like rabbits in your wallet? The endless email chains asking, can you approve this? Don't even get me started on the month-end panic when you realize you have to reconcile it all.
Well, Pleo offers smart company cards, physical, virtual and vendor specific, so teams can buy what they need while finance stays in control. Automate your expense reports, process invoices seamlessly, and manage reimbursements effortlessly, all in one platform. With integrations to tools like Xero, QuickBooks, and NetSuite, Pleo fits right into your workflow, saving time and giving you full visibility over every entity, payment and subscription. Join over 37,000 companies already using Pleo to streamline their finances. Try Pleo today. It's like magic, but with fewer rabbits. Find out more at pleo.io/20vc. Don't forget to secure trust with your customers. Trust isn't just earned, though, it's demanded. That's why over 9,000 companies, including Atlassian, Quora and Factory, rely on Vanta to automate their security compliance. So Vanta helps businesses achieve certifications like SOC 2 and ISO 27001, turning months of tedious work into this beautifully fast and straightforward process. Their platform automates compliance across over 35 frameworks, it centralizes workflows, and it proactively manages risk, all while saving you time with automation and AI. So whether you're just starting or scaling your security program, Vanta connects you with auditors and experts to get audit ready quickly and build trust with your customers. Get $1,000 off your first year by visiting vanta.com/20vc. That's v a n t a.com/20vc. As always, we so appreciate all your support and stay tuned for an incredible episode coming on Wednesday with Adarsh at Mercor.
Podcast Summary: The Twenty Minute VC (20VC) – NVIDIA vs Groq: The Future of Training vs Inference
Episode Title: NVIDIA vs Groq: The Future of Training vs Inference | Meta, Google, and Microsoft's Data Center Investments: Who Wins | Data, Compute, Models: The Core Bottlenecks in AI & Where Value Will Distribute with Jonathan Ross, Founder @ Groq
Release Date: February 17, 2025
Host: Harry Stebbings
Guest: Jonathan Ross, Founder and CEO of Groq
In this episode of The Twenty Minute VC (20VC), host Harry Stebbings engages in an in-depth conversation with Jonathan Ross, the founder and CEO of Groq, the creator of the world's first Language Processing Unit (LPU). Jonathan brings a wealth of experience from his time at Google, where he contributed to the development of the Tensor Processing Unit (TPU). The discussion delves into the competitive landscape between NVIDIA and Groq, the future of AI training versus inference, data center investments by tech giants, and the core bottlenecks in AI development.
Jonathan Ross begins by addressing the scaling laws in AI, referencing a pivotal paper published by OpenAI. He explains that while increasing the number of parameters in a model generally enhances its ability to absorb information, there are diminishing returns due to the quality of data used in training.
“When you are growing faster than exponential, there is no amount of profit that you can make that matters. What matters is getting a toehold in the market and becoming relevant.”
— Jonathan Ro (00:00)
Ross emphasizes the importance of synthetic data over real data, arguing that as models become smarter, they can generate higher quality synthetic data to train themselves more effectively. This iterative improvement helps models transcend the asymptotic limitations traditionally associated with scaling laws.
A significant portion of the conversation focuses on Groq's strategic positioning in the AI hardware market, particularly in relation to NVIDIA. Jonathan outlines Groq's unique approach to handling AI inference, contrasting it with NVIDIA's dominance in training.
“We're one of the best things that ever happened to Nvidia because they can make every single GPU that they were going to make and they can sell it for training. ... We are growing faster than exponential.”
— Jonathan Ro (00:00)
Groq aims to own the inference market by providing LPUs that are more cost-effective and energy-efficient than NVIDIA's GPUs. While NVIDIA continues to excel in training, Groq focuses on high-volume, low-margin inference operations, allowing both companies to benefit without directly cannibalizing each other's markets.
Jonathan discusses the bottlenecks in AI, particularly focusing on compute, data, and algorithms. He points out that while compute has traditionally been seen as a less restrictive bottleneck due to its fungibility, the rapid scaling of chip deployment is beginning to strain data center power capacities.
“... the more inference you have, the more training you need and vice versa.”
— Jonathan Ro (22:55)
Groq's partnership with entities like Aramco highlights their strategy to overcome chip supply constraints by leveraging long-term funding and infrastructure support. However, Ross warns of an impending power bottleneck within the next three to four years, as the demand for compute power outpaces available data center capacities.
Groq's innovative business model involves partner-funded deployments, allowing them to scale aggressively without being limited by traditional capital constraints. This model not only accelerates their market penetration but also aligns Groq's incentives with their partners.
“... we have a very positive contribution margin right now. ... we're making money running these open source models.”
— Jonathan Ro (37:47)
By structuring deals where partners fund the capital expenditure (CapEx) for deploying Groq's chips, the company can focus on scaling its infrastructure to meet the burgeoning demand for AI inference without diluting its financial stability.
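Jonathan shares his vision for the future of AI, predicting significant advancements once current challenges like hallucinations in models are addressed. He outlines several stages of AI evolution: solving the hallucination problem, agentic systems that can break goals down into sub-goals, an "invent" stage in which models produce results that are non-obvious until you see them, and a "proxy" stage in which models can be trusted to make decisions on a person's behalf.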
“We’re growing faster than exponential. ... there is no amount of profit that you can make that matters.”
— Jonathan Ro (31:03)
Ross emphasizes that Groq's focus is on maximizing compute deployment to preserve human agency in the age of AI, asserting that rapid growth and relevance in the market are paramount over immediate profitability.
The discussion shifts to the geopolitical landscape of AI, with Jonathan analyzing China's advancements and Europe's challenges. He acknowledges China's willingness to utilize synthetic data and streamline deployments but highlights potential drawbacks related to censorship and privacy, which could stifle innovation.
“You're saying ... if you are running a Chinese tech company, your fear is that you become Jack Ma.”
— Jonathan Ro (56:32)
Regarding Europe, Ross critiques its regulatory environment, suggesting that excessive regulation and risk aversion are hindering its ability to capitalize on AI advancements. He advocates for fostering an entrepreneurial ecosystem similar to Silicon Valley to spur innovation and competitiveness.
In this enlightening episode, Jonathan Ross of Groq provides a comprehensive analysis of the current and future state of AI infrastructure. By strategically positioning Groq in the inference market and addressing critical bottlenecks in compute and data center capacities, Groq aims to play a pivotal role in shaping the AI landscape. Jonathan's insights into scaling laws, competitive dynamics with NVIDIA, and the geopolitical implications of AI advancements offer valuable perspectives for founders, investors, and technologists navigating the rapidly evolving world of artificial intelligence.
Notable Quotes:
These quotes encapsulate Groq's strategic vision, emphasizing the importance of scaling and market positioning over short-term profitability.