
Jonathan Ross is the Co-Founder and CEO of Groq, providing fast AI inference. Prior to founding Groq, Jonathan started Google’s TPU effort where he designed and implemented the core elements of the original chip. Jonathan then joined Google X’s...
Harry Stebbings
So everyone's seen the news about DeepSeek today. Is it as big a deal as everyone is making of it?
Jonathan Ross
Yes, it is. It's Sputnik 2.0. It is true that they spent about 6 million, or whatever it was, on the training. They spent a lot more distilling or scraping the OpenAI model. I can't speak for Sam Altman or OpenAI, but if I was in that position I would be gearing up to open source my models in response, because it's pretty clear you're going to lose that, so you might as well try and win all the users and the love from open sourcing. Open always wins. Always.
Unknown
This is 20VC with me, Harry Stebbings, and today we focus on DeepSeek. As our guest put it today, this is Sputnik 2.0, and joining me for the discussion is one of the best placed in the business, Jonathan Ross, co-founder and CEO of Groq, providing fast AI inference. Prior to founding Groq, Jonathan started Google's TPU effort, where he designed and implemented the core elements of the original Google chip. But before we dive in today, here are two fun facts about our newest brand sponsor, Kajabi. First, their customers just crossed a collective $8 billion in total revenue.
Harry Stebbings
Wow.
Unknown
Second, Kajabi's users keep 100% of their earnings, with the average Kajabi creator bringing in over $30,000 per year. In case you didn't know, Kajabi is the leading creator commerce platform, with an all-in-one suite of tools including websites, email marketing, digital products, payment processing and analytics for as low as $60. Whether you are looking to build a private community, write a paid newsletter, or launch a course, Kajabi is the only platform that will enable you to build and grow your online business without taking a cut of your revenue. 20VC listeners can try Kajabi for free for 30 days by going to kajabi.com/20VC. That's kajabi.com/20VC. Once you've built your creator empire with Kajabi, take your insights and decision making to the next level with AlphaSense, the ultimate platform for uncovering trusted research and expert perspectives. As an investor, I'm always on the lookout for tools that really transform how I work. Tools that don't just save time, but fundamentally change how I uncover insights. That's exactly what AlphaSense does. With the acquisition of Tegus, AlphaSense is now the ultimate research platform built for professionals who need insights they can trust, fast. I've used Tegus before for company deep dives right here on the podcast. It's been an incredible resource for expert insights, but now with AlphaSense leading the way, it combines those insights with premium content, top broker research and cutting-edge generative AI. The result? A platform that works like a supercharged junior analyst, delivering trusted insights and analysis on demand. AlphaSense has completely reimagined fundamental research, helping you uncover opportunities from perspectives you didn't even know existed. It's faster, it's smarter, and it's built to give you the edge in every decision you make. 20VC listeners, don't miss your chance to try AlphaSense for free. Visit AlphaSense.com/20 to unlock your trial.
That's AlphaSense.com/20. And speaking of incredible products, what comes to mind when you think about business banking? Probably not speed, ease or growth. I'm willing to bet that's because you're not using Mercury. With Mercury you can quickly send wires and pay bills, get access to credit sooner to hit the ground running faster, unlock capital that's designed for scaling, and see all these money moves in one place. I speak to dozens of founders every week and most of them are using Mercury, because they're super smart and that's what you have to be using. Visit mercury.com to experience it for yourself. Mercury is a financial technology company, not a bank. Banking services provided by Choice Financial Group, Column N.A. and Evolve Bank & Trust, Members FDIC.
Harry Stebbings
You have now arrived at your destination.
Unknown
Jonathan, thank you so much for joining me today. I so appreciate you doing this emergency podcast with me.
Jonathan Ross
No problem. But before we start, can I, can I just say one thing? I think you have the most amazing, unique go to market that I've ever seen in my life for a podcast. I've never seen this before. I think your strategy is you're literally interviewing every single audience member, forcing them to watch videos and get addicted to you.
Harry Stebbings
I mean, I thought you were going to say my accent, but I'm totally going to take that. That's wonderful. And yes, you're absolutely right and do.
Jonathan Ross
Things that don't scale.
Harry Stebbings
But I do want to start. Obviously everyone's just talking about DeepSeek. A little bit of context: why are you so well placed to speak about DeepSeek? Let's just start there.
Jonathan Ross
Well, my background: I started the Google TPU, the AI chip that Google uses, and in 2016 started an AI chip startup called Groq, with a Q, not with a K, that builds AI accelerator chips, which we call LPUs.
Harry Stebbings
Okay, so everyone's seen the news about DeepSeek today. I want to just start off by asking, is it as big a deal as everyone is making of it?
Jonathan Ross
Yes, it's Sputnik. It is Sputnik 2.0. Even more so. You know that story about how NASA spent a million dollars designing a pen that could write in space and the Russians brought a pencil? That just happened again. So it's a huge deal.
Harry Stebbings
Why is it such a huge deal?
Jonathan Ross
So up until recently, the Chinese models have been behind Western models, and I say Western including Mistral as well, and some other companies. And it was largely focused on how much compute you could get. Most people don't realize this: most companies have access to roughly the same amount of data. They buy it from the same data providers and then just churn through that data with GPUs, and they produce a model and then they deploy it. They'll have some of their own data, and that'll make them subtly better at one thing or another, but they're largely all the same. The more GPUs, the better the model, because you can train on more tokens. It's the scaling law. This model was supposedly trained on a smaller number of GPUs and a much, much tighter budget. I think the way that it's been put is less than the salary of many of the executives at Meta, and that's not true. There's an element of marketing involved in the DeepSeek release. It is true that they trained the model on approximately $6 million worth of GPUs. Right? They claim 2,000 GPUs for, I think it was, 60 days, which, by the way, don't forget, was about the same amount of GPU time (4,000 GPUs for 30 days) as the original, I believe, Llama 70B. Now, more recently, Meta has been training on more GPUs, but Meta hasn't been using data as good as DeepSeek's, because DeepSeek was doing reinforcement learning using OpenAI.
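As a rough check of the GPU-time comparison above, here is a minimal Python sketch. The GPU counts and day counts are the ones quoted in the conversation; the rental price per GPU-hour is purely a hypothetical assumption added for illustration, not a figure from the episode.

```python
# Back-of-the-envelope check of the GPU-time figures quoted in the conversation.
# ASSUMED_USD_PER_GPU_HOUR is a hypothetical rental price, not a number from the episode.

HOURS_PER_DAY = 24


def gpu_hours(num_gpus: int, days: int) -> int:
    """Total GPU-hours for a cluster of num_gpus running for the given number of days."""
    return num_gpus * days * HOURS_PER_DAY


deepseek_hours = gpu_hours(2_000, 60)  # figure claimed for DeepSeek
llama_hours = gpu_hours(4_000, 30)     # figure recalled for the original Llama run

ASSUMED_USD_PER_GPU_HOUR = 2.0  # hypothetical price, for illustration only
estimated_cost = deepseek_hours * ASSUMED_USD_PER_GPU_HOUR

print(f"DeepSeek: {deepseek_hours:,} GPU-hours")
print(f"Llama:    {llama_hours:,} GPU-hours")
print(f"Estimated cost at the assumed price: ${estimated_cost:,.0f}")
```

Both runs come out to the same number of GPU-hours, which is the point being made: half the GPUs for twice as long is the same amount of compute.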
Harry Stebbings
Is this distillation? Just so I understand, can you help me and help the audience understand what distillation is in this regard, and how DeepSeek have been using distillation to get better quality output through OpenAI data?
Jonathan Ross
It's a little bit like getting tutored by someone who's smarter. You actually do better than if you're speaking to someone who's not as knowledgeable about the area or who's giving you wrong answers. First of all, before we get into any of this, I need to start with the scaling laws. These are like the physics of LLMs, and there's a particular curve. Tokens are sort of the syllables of an LLM; they don't match up exactly with human syllables, but kind of. So the more tokens that you train on, the better the model gets. But there are these asymptotic returns where it starts trailing off. The thing about the scaling law that everyone forgets, and that's why everyone was talking about how it's the end of the scaling law, we're out of data on the Internet, there's nothing left, but what most people don't realize is that assumes that the data quality is uniform. If the data quality is better, then you can actually get away with training on fewer tokens. So going back to my background, one of the fun things that I got to witness, though I wasn't directly involved, was AlphaGo. Google beat the world champion Lee Sedol at Go. That model was trained on a bunch of existing games, but later on they created a new one called AlphaGo Zero, which was trained on no existing games. It just played against itself. So how do you play against yourself and win? Well, you train a model on some terrible moves. It does okay, and then you have it play against itself. And when it does better, you train on those better games. And then you keep leveling up like this, right? So you get better and better data. The better your model is, the better the result when it outputs something, and the better the data. So what you do is you train a model, you use it to generate data, and then you train a model and you use it to generate data, and you keep getting better and better and better. So you can sort of beat the scaling law problem.
One quick hack to get past all of that in the stepping up is if there's a really good model already right here, just have it generate the data and you go right up to where it is. And that's what they did. It is true that they spent about 6 million or whatever it was on the training. They spent a lot distilling or scraping the OpenAI model.
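The "train, generate, keep the better data, retrain" loop described above, and the shortcut of generating data from an already strong model, can be sketched as a toy in Python. Everything here is a deliberately simplified assumption for illustration: the "model" is just a single number, `TARGET` stands in for perfect play, and the sample sizes are arbitrary. It is not how AlphaGo Zero or DeepSeek were actually trained.

```python
import random

TARGET = 10.0  # hypothetical stand-in for "perfect play"


def score(sample: float) -> float:
    """Higher is better: closeness of a generated sample to the target."""
    return -abs(sample - TARGET)


def generate(model_mean: float, n: int, rng: random.Random) -> list:
    """The 'model' generates noisy samples around its current skill level."""
    return [rng.gauss(model_mean, 1.0) for _ in range(n)]


def train(samples: list) -> float:
    """'Training' here is just fitting the mean of the chosen data."""
    return sum(samples) / len(samples)


def self_improve(start: float, rounds: int, rng: random.Random) -> list:
    """Generate data, keep only the better samples, retrain, repeat."""
    mean = start
    history = [mean]
    for _ in range(rounds):
        samples = generate(mean, 200, rng)
        best = sorted(samples, key=score, reverse=True)[:20]  # keep the better "games"
        mean = train(best)
        history.append(mean)
    return history


def distill(teacher_mean: float, rng: random.Random) -> float:
    """Shortcut: train directly on data generated by an already strong model."""
    return train(generate(teacher_mean, 200, rng))


rng = random.Random(0)
history = self_improve(start=0.0, rounds=30, rng=rng)
student = distill(teacher_mean=TARGET, rng=rng)
print(f"self-play: start {history[0]:.2f}, end {history[-1]:.2f}")
print(f"distilled in one step: {student:.2f}")
```

In this toy, self-improvement needs many rounds to climb toward the target, while the distilled student lands near the teacher's level in a single step, which mirrors the "go right up to where it is" remark.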
Harry Stebbings
So they scrape the OpenAI model, they get this higher quality data from that and from refining it, and then they get greater, higher quality output.
Jonathan Ross
Correct, correct. And all that said, they did a lot of really innovative things. That's what makes it so complicated, because on the one hand, they kind of just scraped the OpenAI model. On the other hand, they came up with some unique reinforcement learning techniques that are so similar.
Harry Stebbings
That was so impressive because I think a lot of people want to just say, I'll bundle the Chinese copy and duplicate, as they always have done.
Jonathan Ross
No, they came up with innovative stuff, but actually the best way to describe it. Have you ever taken a test before? You got an answer right and your professor marked it wrong, and then you go back to the professor and you have to argue with them and everything and it's a pain, right? Well, if there is only one answer and it's a very like simple answer, and you say, write that answer in this box, then there is no arguing. You either get it right or not right. So what they did was rather than having human beings check the output and say yes or no or whatever, what they did was they said, here's the box. There's literally some code to say here's a box, output the answer here and then check it and if it's correct, we have the answer. If not, we don't. No need to involve a human. Completely automated.
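The "write the answer in this box and check it automatically" idea can be sketched as a rule-based reward function. This is a minimal Python sketch under stated assumptions: the `\boxed{...}` marker and the names `extract_boxed` and `reward` are hypothetical choices made for illustration, not DeepSeek's actual code or format.

```python
import re
from typing import Optional

# Assumed convention: the model is told to put its final answer inside \boxed{...}.
BOX_PATTERN = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_boxed(output: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in the output, or None if absent."""
    matches = BOX_PATTERN.findall(output)
    return matches[-1].strip() if matches else None


def reward(output: str, expected: str) -> float:
    """1.0 if the boxed answer matches the known answer exactly, else 0.0. No human involved."""
    answer = extract_boxed(output)
    if answer is None:
        return 0.0
    return 1.0 if answer == expected.strip() else 0.0


print(reward("6 times 7 is 42, so the answer is \\boxed{42}", "42"))
print(reward("I think it is \\boxed{41}", "42"))
print(reward("The answer is 42", "42"))
```

Because the check is a plain string comparison against a known answer, it can score very large numbers of model outputs without a human grader, which is the property being described.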
Harry Stebbings
Can OpenAI not just do distillation on DeepSeek's model then?
Jonathan Ross
They don't need to because they're actually better still. They're a little bit better. They could, but why would they.
Harry Stebbings
Do we buy their GPU usage? Alex Wang, who we both know, was like, nah, they've got 50,000 H100s. Do we buy the GPU usage, or is that questionable down there?
Jonathan Ross
I don't think you have to disbelieve it because of the quality delta. However, why would they try and smuggle in GPUs when all they would have to do is log into any cloud provider and rent GPUs? This is the biggest gaping hole in the whole way that export control is done. You can literally log in, swipe a credit card, whatever, and just pay and get GPUs to use.
Harry Stebbings
So export laws are unnecessary, then?
Jonathan Ross
They're good. But the problem is it's like the Maginot Line: you just go around it. So you need to seal it up a little more. There's a little bit of room left to go here. Keep in mind OpenAI was effectively, accidentally, subsidizing the training of this model, because they were using OpenAI, right? And rumors are that OpenAI may not be completely profitable yet in terms of every token in the API. On the subscriptions maybe, but in the API? And so with each one that they generate, they were effectively losing a little bit of money while DeepSeek was getting training data. Now, by the way, OpenAI probably still has that data. In theory, they could probably just train on it.
Harry Stebbings
Josh Kirschner said in a tweet today though that this would likely be a violation of US export laws. Do you think that's not true?
Jonathan Ross
I'm not aware of where it would be an export issue. I do know that many people log into cloud providers and just use them remotely. That's one of the problems. So we actually block IP addresses from China, and I believe we might be unique in doing that. It's also a little bit fruitless, because someone can just rent a server anywhere and log into us from there. Then there's nothing we can check.
Harry Stebbings
You said there about kind of blocking IP addresses from China. There's a lot of concern about US customer data going back to China. Do you think that is a legitimate and justified concern?
Jonathan Ross
Yes, it's probably the most significant concern. There are other concerns, but that's probably the most significant, because people don't think about it. They're so used to using these services. When you use one of these other services, you might be shocked to hear this: when you say delete, what they do is they write "deleted" right next to your data. They don't actually delete it, they just mark it deleted. When you later come back and ask for your data, they give it to you with the word "deleted" right next to it. It's still there. And these are well-meaning companies. Do you really think the CCP doesn't have all your data and isn't going to look it up later? Some governments are more aggressive than others, and if they have access to your data, or not even your data, it could be your next-door neighbor's data. Your next-door neighbor might put something in there that accidentally gives information away that makes you more vulnerable. Now the CCP has something. Maybe you had some package delivered and they put a complaint somewhere, or whatever. You might not even do it yourself, but other people around you might. The health data of a spouse.
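The difference between marking data as deleted and actually removing it can be shown with a small Python sketch. The class and method names below (`SoftDeleteStore`, `HardDeleteStore`, `raw`) are hypothetical and invented for illustration; they do not describe any particular company's system.

```python
from dataclasses import dataclass


@dataclass
class Record:
    data: str
    deleted: bool = False  # the "deleted" note written next to the data


class SoftDeleteStore:
    """Delete only flips a flag; the underlying data is retained."""

    def __init__(self) -> None:
        self._rows = {}

    def put(self, key: str, data: str) -> None:
        self._rows[key] = Record(data)

    def delete(self, key: str) -> None:
        self._rows[key].deleted = True

    def get(self, key: str):
        """What the user sees: deleted records look like they are gone."""
        row = self._rows.get(key)
        return None if row is None or row.deleted else row.data

    def raw(self, key: str):
        """What the operator can still see: the data plus its deleted flag."""
        return self._rows.get(key)


class HardDeleteStore(SoftDeleteStore):
    """Delete actually removes the record."""

    def delete(self, key: str) -> None:
        del self._rows[key]


soft, hard = SoftDeleteStore(), HardDeleteStore()
for store in (soft, hard):
    store.put("chat-1", "my medical question")
    store.delete("chat-1")

print(soft.get("chat-1"), soft.raw("chat-1"))
print(hard.get("chat-1"), hard.raw("chat-1"))
```

In the soft-delete store the user-facing lookup returns nothing, yet the record is still present with its flag set, which is the situation described above.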
Harry Stebbings
Right. Jonathan, I'm going to avoid the British indirectness. DeepSeek is an instrument that will be used by the CCP to increase control over Western democracies?
Jonathan Ross
Yes, but I don't think it's DeepSeek that's doing it. So you have to understand, any company that operates in China and Hong Kong has no choice. The one country, two systems thing didn't quite work out as anticipated. Or maybe as anticipated, but not as stated. In 2016, when Groq started, we decided that we were not going to do business in China. This was not a geopolitical decision. This was purely commercial. And what it was was we kept seeing companies like Google and Meta fail over and over again trying to win in China. The formula is actually pretty simple. You're not allowed to make net money; you're allowed to spend more money in China. But the moment that you start to become profitable, or anywhere near profitable, all of a sudden there's a thumb on the scale. Companies that manufacture a lot in China and send more money to China can actually be successful there. They can sell things there. Yeah, it's a pretty simple formula. You must send more money to China than you take out. But at the same time, they also require that you hand over all data. And not only that, they also require that certain answers be in a form that they find acceptable. So, for example, one of the more common ones that you see about DeepSeek right now is when you ask about Tiananmen Square. If the temperature is low on the model (and temperature, we don't need to get into that, it's complicated, but low means low creativity), then it's actually going to give you an answer that basically says, I don't want to talk about that, it's a sensitive topic. But you ask it about other things that are sensitive topics elsewhere in the world and it'll just answer. But what happens if the CCP requires that they start to say, what about TikTok? Should it be banned? Absolutely not. Here's why. And it gives you a cogent reason. That's kind of scary.
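For readers who do want the short version of temperature: it is commonly implemented by dividing the model's raw scores (logits) by the temperature before turning them into probabilities. The Python sketch below shows that standard softmax-with-temperature calculation; the example logits are made up for illustration, and nothing here is specific to DeepSeek's serving setup.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

p_low = softmax_with_temperature(example_logits, 0.1)   # low temperature: near-deterministic
p_high = softmax_with_temperature(example_logits, 2.0)  # high temperature: more varied

print([round(p, 4) for p in p_low])
print([round(p, 4) for p in p_high])
```

At low temperature nearly all the probability lands on the top-scoring token, so the model keeps giving its single most likely response; at higher temperature the alternatives get a real chance of being sampled.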
Harry Stebbings
Jonathan, what do we do from here? I share your concerns completely. My challenge is TikTok you can ban and shut off. They would not sell the Algo. That is a closed end product that we can ban tomorrow if we really want to. Here it's open source.
Jonathan Ross
Yeah. And worse. So up until recently we refused to run any Chinese models, and we had to make a very difficult decision on DeepSeek. We now have it on our API at Groq.
Harry Stebbings
Why did you decide that you would break the rule for DeepSeek?
Jonathan Ross
So what it came down to was, when we saw DeepSeek become the number one app on the App Store, the realization was people are going to be putting their data in there, and what we want to make sure is that you actually have an option. So we store nothing. There is no delete or whatever. We just store nothing. We don't even have hard drives. We have DRAM, and when the power goes off, everything goes away. So we wanted to make sure that there was an alternative where, when you use DeepSeek's model, your data is not going to the CCP. Well, right now the CCP is probably going to be taking the safeties off the weapons. They're going to be like, why are you making this model open source? Please direct your data towards us. Go win a bunch of customers this way, but now we want the data, right? And so they're going to change the strategy. But remember, DeepSeek is a real... I mean, it's a hedge fund. They're doing this themselves and they're just influenced by the CCP. And the CCP, now that they've seen the success of this, might see it as yet another TikTok.
Harry Stebbings
My question to you is how long is it before the US reacts to prevent this?
Jonathan Ross
One question to ask is, are we going to be talking about DeepSeek R1 for the next six months? And the answer is absolutely not. We might be talking about R2 and R3 and R4, but R1 was one shot. The question is, are they going to keep coming up with very interesting things? Are we going to, you know, cat and mouse it? Is everyone going to learn from this? The biggest problem is this has just made it absolutely nakedly clear that the models are commoditized, right? You've been asking the question, right? If there was any doubt before, that doubt's over. So what is the moat? And for me, I love Hamilton Helmer. 7 Powers, right?
Harry Stebbings
My favorite. I do it for every single investment we do. We have to fill it out. Every single person on the team has filled it out. So, yes.
Jonathan Ross
So marketing is the art of decommoditizing your product. And the 7 Powers are seven great ways to decommoditize your product: scale economies, network effects, brand, counter-positioning, cornered resource, switching costs, process power, right? The question is who's going to do what? OpenAI, and you've got to give Sam Altman and that team credit, they've got amazing brand power like no one else in this space. And that's going to serve them for a really long time. But what you see Sam trying to do is scale, right? That's why we hear about Stargate and $500 billion, right? That's the power he would like to have. But the power he has right now is brand, and he's trying to bridge that. But what about the others?
Harry Stebbings
I'm sorry, does this news not ridicule the $500 billion announcement, at a time when we've seen increasing efficiency to a scale like never before? With DeepSeek today, the $500 billion seems ridiculed.
Jonathan Ross
Actually, I don't think it's enough spending. And the reason is, we saw this happen at Google over and over again. We did the TPU. So why did we do the TPU? The speech team trained a model. It outperformed human beings at speech recognition. This was back in 2011, 2012. And so Jeff Dean, the most famous engineer at Google, gives a presentation to the leadership team. It's two slides. Slide number one: good news, machine learning finally works. Slide number two: bad news, we can't afford it, and we're Google. We're going to need to double or triple our global data center footprint, at probably a cost of 20 to 40 billion dollars. And that'll get us speech recognition. Do you also want to do search and ads? There's always this giant mission accomplished banner every time someone trains a model, and then they start putting it into production and then they realize, oh, this is going to be expensive. This is why we've always focused on inference. And so now think about it this way. At Google, we always ended up spending 10 to 20 times as much on the inference as on the training, back when I was there. Now the models are being given away for free. How much are we going to spend on inference? And now with test-time compute, I've asked questions of DeepSeek where it took 18,000 intermediate tokens before it gave me the answer.
Harry Stebbings
Jensen said that now half of their revenues is from inference.
Jonathan Ross
Yeah.
Harry Stebbings
So what does that look like in the future then?
Jonathan Ross
95%. I mean, it just makes sense, right? You don't train to become, you know, a cardiovascular surgeon, and then that's what you do for 95% of your life. And then you perform 5%. It's the opposite. You train for a little and then you do it for the rest of your life.
Harry Stebbings
Do you think the US will put sanctions on DeepSeek to prevent the CCP using it for data capture on US citizens?
Jonathan Ross
I don't know what the solution is. There's carrot and there's stick, right? So you can use a stick and block it. That might be effective. I don't know that the US has really done that before. There's also the carrot, which is, it's kind of interesting how it's being offered for free in China, and not just in China, but to anyone else. And then others are doing that too. Is it possible the CCP is underwriting that because they want the data?
Harry Stebbings
Dude, they're doing it with the car industry. The subsidization of Chinese cars, with BYD in particular destroying the European car market, is absolutely that.
Jonathan Ross
The thing is, we have a lesson from the Cold War, which was mutually assured destruction. The problem is we do some sort of tariff and then we do a tariff back. There needs to be some sort of automated response of like, if you do this, we will respond. If you subsidize this industry, we will automatically subsidize the equivalent industry, just automatic. So don't do it because there's no benefit to you.
Harry Stebbings
Does the fact that it's open source, how does that change everything?
Jonathan Ross
It's the only reason people are using it. If it wasn't open source, it wouldn't have gotten the excitement. And open always wins. Always. Keep in mind Linux won back when people didn't trust open source. They thought it was less secure, they thought the features were worse, it was more buggy, and it still won. Now people expect open to be more secure, less buggy and have more features. So how is proprietary ever going to win?
Harry Stebbings
Everyone always says that, actually, distribution is one of the major advantages that ChatGPT, and hence OpenAI, has, especially over the other providers. Every single day that DeepSeek is out and is being used so pervasively, it is diminishing the value of OpenAI.
Jonathan Ross
Yeah, agree, disagree, agree. Especially for the pricing, because they're losing their pricing power on this. I can't speak for Sam Altman or OpenAI or anything like that, but if I was in that position, I would be gearing up to open source my models in response because it's pretty clear you're going to lose that. So you might as well try and win all the users and the love from open sourcing. Otherwise like you're already at a point where you're going to be using your other powers like brand and so on. I don't know why you try and keep that internal anymore.
Harry Stebbings
Would that be possible? And would that not cannibalize their core main line of revenue?
Jonathan Ross
But how would it cannibalize it any other way? Remember distribution, right? How many people are going to buy something because they trust Dell? People trust Dell. Dell has earned their reputation over the course of decades. Supermicro builds some interesting hardware, but look at what they've been going through recently. You know, there's a pro and a con, right? Cheaper or trusted. You've got to make a decision. OpenAI has been around for a while. Most people think of them as synonymous with AI. They could just switch to DeepSeek and people would still use them. It's brand, it's one of the 7 Powers.
Harry Stebbings
So if you were OpenAI and Sam today, you would switch to open and offer it for free?
Jonathan Ross
I would. And there's probably more cleverness. They could probably strike some deals before they do it or whatever, but that would be the move that I would make. And also it would be a position of strength. The only problem is the timing, because if it happens right after DeepSeek, it looks like a response as opposed to an intentional thing. So I don't know how you do that.
Harry Stebbings
Do you not just own it? It's a response.
Jonathan Ross
Maybe, you know, we had to respond. We're better. Let's see which model people choose.
Unknown
How do we think about Meta?
Harry Stebbings
Meta share the open source values that DeepSeek has espoused. Does this help or hurt Meta?
Jonathan Ross
I think one of the ways that we've been looking at LLMs is a little bit like you look at an open source project, software project, like Linux or something. The thing is Linux has switching cost and I think what we've discovered is LLMs have no switching cost whatsoever.
Harry Stebbings
It's where the analogy to cloud doesn't hold up at all because everyone's like, oh, it's like cloud, there's going to be a couple of cool vendors and actually they're going to win. No, you don't rip out your cloud very often.
Jonathan Ross
Okay, so let's start mapping the 7 Powers to the top tech companies. So I would say Microsoft's biggest strength is switching cost, right? You go into a room full of people and you're like, who uses Microsoft? A bunch of hands go up. And you're like, who likes using Microsoft? Hands go down. It's very largely switching cost. So you go into gen AI. Is that a thing that gets disrupted? You look at Meta, it's network effects. They could literally give every piece of technology away for free. I am completely jealous of that, because if I had that right now I would open source everything, because then you don't have to worry about it and you get everyone helping you. So I think Meta is, because of the network effect thing, always in a position where open source is to their advantage. It almost doesn't matter where it comes from. Now I'm sure that they would prefer to have the Linux of LLMs, but I think the more it goes open source, the more of an advantage they have inherently.
Harry Stebbings
If you were Meta, would you do anything different?
Jonathan Ross
Meta is an amazing competitor. What they would normally do, if this was some sort of proprietary social mechanism, is they would try and replicate it, and then they would compete and they would say, come join or not. I don't think that the come join works here. But the beautiful thing is all of the information for this model is available. Meta's already been doing this. They have way more compute. The question is, are they willing to scrape OpenAI like DeepSeek did? They've been super careful on everything that they've been doing. And so that's the disadvantage.
Harry Stebbings
Do you not put morals aside to win? This is the AI arms race.
Jonathan Ross
I think that's going to happen. I think people will be like, you cannot lose. And so what it's done is it's changed the game, right? Okay, so let's talk about Europe for a minute. We almost forgot about Europe. It feels like with Europe, there's a lack of willingness to take risk. There's a black mark if you get it wrong. Everything's about downside protection. Whereas in the US it's like, that was a great effort, you failed, but I'm going to fund you again. Right. So there's that difference. But when you look at the US and then you look at China, China practices RD&T: research, development and theft. It's just part of the culture. And it's not just against Western companies, it's against each other too. The difference is, if you're a Western company, then the government steals from the Western company and then provides it to the Chinese companies, which is less fair. There are the famous stories of turning on Huawei switches and you see Cisco's logo and all the bugs, right? So is that a new paradigm? I really hope not. For Europe to compete with the US, Europe has to adopt a more risk-on attitude. Does the West have to adopt a more theft-on attitude? I really hope not. That's just viscerally disgusting to me. I'm literally repulsed by the idea.
Harry Stebbings
Are we not being idealistic? If you're running in a race with someone who's willing to take steroids, if you want to win, you're going to have to take steroids too.
Jonathan Ross
And then everyone is taking steroids. Whereas if no one was taking them, then everyone's healthier and you have a real competition. Yeah, it's a real problem. And the question is, can governments get involved? Here's the thing. I would love nothing more than to compete directly with Chinese companies on a fair footing. They have really smart people. DeepSeek has proven this. Really smart people. But when the government keeps putting its thumb on the scale, we're going to try and avoid that competition wherever we can. And now there's no avoiding it. Maybe the governments just have to get involved.
Harry Stebbings
But dude, I'm being blunt. Like, Xi Jinping cares about one thing. Power retention and growth is the only thing that matters to him. And AI is central to that. He will do whatever it takes to win. Having some rational discourse about some rules of play is bluntly unrealistic.
Jonathan Ross
Okay, and it gets worse than that. China has a lot of advantages. The chief advantage is the number of people they have. Now, number of people is not sufficient. You also have India, and India has an advantage from the number of people, but China has out-executed. In fact, India was asking China for some time to help build out the roads and infrastructure. They've really mastered that. Right, so people, and organization, discipline, alignment. And so what is the concern with AI? The concern with AI is, what if an LPU or GPU becomes the equivalent of a contributor to the workforce, and you can literally just add more to the GDP by creating more chips and providing more power? Now if that becomes the case, does China's advantage erode? They're concerned that in terms of workforce, the US could catch up, the West could catch up, and at the same time they have a huge population advantage. And this is why I so much want Europe to get into the fight on AI. There's 500 million people who could be jumping into this.
Harry Stebbings
If you were to advise the EU today on Europe's stance, what would you say?
Jonathan Ross
So have you ever seen Station F?
Harry Stebbings
Yeah, of course, I was there last week. We hosted an event.
Jonathan Ross
So I would say by the end of this year you should have 100 Station Fs, and by the end of next year you should have a thousand. Done. So what you're doing is you're collecting up 3,000 people and surrounding them with other risk-taking entrepreneurs, and then they're supporting each other, they're risk on. And when you surround yourself with other people who are risk on, you're going to be risk on and you're going to take the entrepreneurial leap.
Harry Stebbings
What does this space look like in three years time? I'm obviously a venture capitalist for a living. All of my friends are going, oh my God, oh my God. We just lost hundreds of millions of dollars on these foundation model companies.
Jonathan Ross
How many companies are you aware of that have become incredibly successful that didn't pivot? Pivot?
Harry Stebbings
My pivot.
Jonathan Ross
Yeah, exactly. So pivot. Get over it. Just pivot. I've been talking to a lot of the LLM companies and frankly, they have some good ideas. In fact, I really like... So I watched your interview with the Suno founder. I think he saw it from the beginning: models are going to be commoditized, and that's why he's focused on the product. He got it from the beginning. What is your product? Not what is the model. The model is a piece of machinery, it's an engine. But what is the car? What is the experience?
Harry Stebbings
What do you think Perplexity is in three years?
Jonathan Ross
A question I used to get asked when we were raising money a little while ago was, is AI the next Internet? And I'm like, absolutely not. Because the Internet is an information age technology. It's about duplicating data with high fidelity and distributing it. It's what the telephone does. It's what the Internet does. It's what the printing press did. They're all the same technology, just at much different scale and speed and capability. Generative AI is different. It's about coming up with something contextual, creative, unique in the moment. And so the LLM is just the printing press of the generative age. It's the start of it. And then there's going to be all these other stages. Just imagine trying to start Uber when we didn't have mobile yet. Great, I'm going to book a trip over to here. How do I get home? You can't carry a desktop with you, right? So you need to be at the right stage. So when I look at Perplexity, I look at Perplexity as being perfectly positioned for the moment that the hallucination, or really confabulation, rate comes down. The moment that these models get good enough, where you don't have to check the citations anymore, that's going to open up a whole set of industries. All of a sudden you'll be able to do medical diagnoses from LLMs, you'll be able to do legal work from LLMs. Until then, it's like trying to create Uber before we had smartphones. It just doesn't make any sense. However, people are willing to use Perplexity today, even though you have to check the citations, so they have an actual business that gets to continue. So they're getting to sort of ride the wave. And the moment that that tsunami of lack of confabulation or hallucination comes along, they're perfectly positioned. Each company has to find their own thing. And I would look at Suno as a great example of how things are being done around the product as opposed to just the models.
Harry Stebbings
Do you think it is possible to pivot when you are OpenAI or Anthropic or any of the very large providers who've ingested billions of dollars?
Jonathan Ross
Disruption happens. If you're not able to pivot now, you're not going to be able to pivot later when you get disrupted anyway.
Harry Stebbings
One would think that with commoditization of models and with cheaper inference, that actually big tech wins, right? Have you seen the stock market today? They've been hit hard.
Unknown
How do you think about that?
Jonathan Ross
What you see is a bunch of people who are concerned about training and the need for it, and everyone's still thinking that most of compute is training and that there's going to be less of it because someone trained a model on 2,000 GPUs, and the nerfed A800 version with slower memory or whatever it is, and they're like, oh, people aren't going to need as many chips. But again, Jevons paradox, right? The more you bring the cost down, the more people consume. So for the last five to six decades, like clockwork, once a decade, the cost of compute has gone down a thousand x. People buy 100,000x as much compute, spending 100 times as much. So every decade they spend 100 times as much. So you make it cheaper and they want more. What's really happening is every time one of these models gets cheaper, we see our developer count just skyrocket and then it comes back down a little bit, but the slope is higher than when it started. Better models create more demand for inference. More demand for inference then has people going, I should train a better model. And the cycle continues.
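The arithmetic behind that claim can be checked in a few lines. This is only a sketch using the round figures quoted above (unit cost down about 1,000x per decade, volume up about 100,000x); the function names are illustrative, not from any source.

```python
# Round figures quoted in the conversation, per decade.
COST_DROP_PER_DECADE = 1_000        # unit cost of compute falls ~1,000x
VOLUME_GROWTH_PER_DECADE = 100_000  # amount of compute bought rises ~100,000x


def spend_multiplier(cost_drop: float, volume_growth: float) -> float:
    """Total spend is unit price times quantity, so it scales by volume_growth / cost_drop."""
    return volume_growth / cost_drop


def spend_after(decades: int) -> float:
    """Compound the per-decade spend multiplier over several decades."""
    per_decade = spend_multiplier(COST_DROP_PER_DECADE, VOLUME_GROWTH_PER_DECADE)
    return per_decade ** decades


print(spend_multiplier(COST_DROP_PER_DECADE, VOLUME_GROWTH_PER_DECADE))
print(spend_after(2))
```

Under these figures demand is elastic enough that a 1,000x price cut still leaves total spend 100x higher, which is the Jevons effect being described.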
Harry Stebbings
I just bought a shitload of Nvidia. They dropped 16% on the thesis that the increasing efficiency means that obviously we wouldn't need as many Nvidia chips. And I thought exactly that, which is why you'll still need the Nvidia inference and you'll just have much higher usage. So to me, it's the most screaming buy of the century. Do you share my optimism on Nvidia given what you just said about Jevons paradox?
Jonathan Ross
So I think over the long term, the only thing I'd say is what Warren Buffett and Charlie Munger say: in the short term, the market is a popularity contest. In the long term, it's a weighing machine. I can't tell you about the popularity contest, but in terms of the weighing machine part, this is a misunderstanding. It's actually more valuable thanks to DeepSeek, not less valuable. Okay, so Jevons paradox was actually discovered by Jevons, and was recently made famous in Satya's tweet. However, I did beat him to that by quite a bit. And just as Satya likes to say that he made Google dance, I'm going to say I made Satya dance. He might take exception to that, but less than a month before he posted that, I did a cute little tweet on it. So what's really happening here was in the 1860s, this guy Jevons, he actually wrote a treatise on steam engines, which I guess is what you did for fun back then in England. He realized every time steam engines became more efficient, people would buy more coal, which is the paradox. But if you think about it from a business point of view, when the OPEX comes down, more activities come into the money, so people do more things. And so what's happened is every time we've seen the cost of tokens for a particular level of quality of models come down, we've actually seen the demand grow significantly. Price elasticity, baby.
Harry Stebbings
A lot of people point to Nvidia's incredibly high margin, which I'm going to butcher. I can't remember what it was in the latest release. It was something like 45 or whatever it was, but it was very, very high. And then they relate it to "your margin is my opportunity." I think of it back to the Seven Powers and go, their margin is their defensibility. And it makes me really just consider the strength of their moat. Do you think "your margin is my opportunity," or do you think their defensibility is their margin?
Jonathan Ross
Today there's this wonderful business selling mainframes with a pretty juicy margin because no one seems to want to enter that business. Training is a niche market with very high margins. And when I say niche, it's still going to be worth hundreds of billions a year. But inference is the larger market, and I don't know that Nvidia will ever see it this way, but I do think that those of us focusing on inference and building stuff specifically for that are probably the best thing that's ever happened for Nvidia stock. Because we'll take on the low margin, high volume inference so that Nvidia can keep its margins nice and high.
Harry Stebbings
Do you think the world sees this?
Jonathan Ross
No. We raised some money in late 2024, and in that fundraise we still had to explain to people why inference was going to be a larger business than training. Remember, this was our thesis when we started eight years ago. So for me, I struggle with why people think that training is going to be bigger. It just doesn't make sense.
Harry Stebbings
Just for anyone who doesn't know, what's the difference between training and inference?
Jonathan Ross
Training is where you create the model. Inference is where you use the model. You want to become a heart surgeon, you spend years training and then you spend more years practicing. Practicing is inference.
Harry Stebbings
Where does efficiency go from here? Everyone was so shocked by how R1 is so much more efficient. What next?
Jonathan Ross
What you're going to see is everyone else starting to use this MoE approach. Now there's another thing that happens here. So.
Harry Stebbings
And the MoE approach, just so I understand, is like the segmentation of where information goes. So it's routed to the optimal point of the model.
Jonathan Ross
Yeah. So MoE stands for mixture of experts. When you use Llama 70 billion, you actually use every single parameter in that model. When you use Mixtral's 8x7B, you use two of the roughly 8B, you know, experts, but it's much smaller. And while it doesn't correlate exactly, it correlates very closely: the number of parameters effectively tells you how much compute you're performing. Now, let's take the R1 model. I believe it's about 671 billion parameters versus 70 billion for Llama. And there's a 405 billion dense model as well. Right. But let's focus on 70 versus 671. I believe there's 256 experts, each of which is somewhere around 2 billion parameters. And then it picks some small number, I'm forgetting which, maybe it's like eight of those or 16 of them, whatever it is. And so it only needs to do the compute for that. That means that you're getting to skip most of it. Right. Sort of like your brain: not every neuron in your brain fires when I say something to you about the stock market. Right. The neurons about, you know, playing football, those don't kick off. Right. That's the intuition there. Previously it was famously reported that OpenAI's GPT-4 started off with something like 16 experts and they got it down to eight. I forget the numbers, but it started off larger and they shrunk it a little. And then what's happened with DeepSeek's model is they've gone the opposite way. They've gone to a very large number of experts. The more parameters you have, it's like having more neurons, it's easier to retain the information that comes in. And so by having more parameters, they're able to get good on a smaller amount of data. However, because it's sparse, because it's a mixture of experts, they're not doing as much computation. 
And part of the cleverness was figuring out how they can have so many experts so it could be so sparse, so they could skip so many of the parameters.
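The routing idea described here can be sketched as top-k gating. This is a toy illustration in plain Python, not DeepSeek's implementation: the 256 experts come from the conversation, the choice of 8 active experts is an assumption (the speaker says 8 or 16), and the experts are stand-in functions that just scale their input.

```python
import math
import random

NUM_EXPERTS = 256  # figure quoted in the conversation
TOP_K = 8          # assumed; the speaker guesses 8 or 16


def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Return (expert_index, weight) for the k best experts, weights renormalised to sum to 1."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def make_expert(scale, calls):
    """Hypothetical toy expert: multiplies its input and records that it ran."""
    def expert(x):
        calls.append(scale)
        return scale * x
    return expert


def moe_forward(x, experts, router_scores, k):
    """Weighted sum over the selected experts only; unselected experts are never called."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


random.seed(0)
calls = []
experts = [make_expert(i + 1, calls) for i in range(NUM_EXPERTS)]
router_scores = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
output = moe_forward(1.0, experts, router_scores, TOP_K)
print(f"experts run: {len(calls)} of {NUM_EXPERTS}")
```

In a real model the fraction of parameters touched is not exactly k divided by the expert count, since attention and any shared layers run for every token, but the sketch shows why most of the expert compute can be skipped.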
Harry Stebbings
But if we take that back, that's how we're saying they've become so efficient. What's the next stage of that then? With the experts, they can route it so efficiently. What now?
Jonathan Ross
So Meta recently released their Llama 3.3 70B and it outperformed their 3.1 405B. So their new 70B outperformed their 405B. What was surprising to me: I thought they retrained it from scratch. It turns out you read the paper and they talk about how they just fine-tuned, so they used a relatively small amount of data to make it much better. Again, this goes to the quality of the data. They have higher quality data. They took their old model, they trained it, it got much better. But that new 70B outperforms their previous 405B. What you're going to see now is, now that everyone has seen this DeepSeek architecture, they're going to go, great, I have hundreds of thousands of GPUs, I'm now going to use a lot of them to create a lot of synthetic data, and then I'm going to train the bejesus out of this model. Because the other thing is, while it sort of asymptotes, the question is, on this curve, where do you stop? It depends on how many people you have doing inference. You can either make the model bigger, which makes it more expensive to run, and then you train it on less, or you make it smaller and it's cheaper to run, but you have to train it more. So DeepSeek didn't have a lot of users until recently. And so for them it would have never made sense to train it a lot anyway. They would much rather have a bigger model. But now what you're going to see is all these other people either making smaller models or trying to make higher quality ones of the same size, but just training them more.
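That trade-off between a bigger, lightly trained model and a smaller, heavily trained one comes down to a break-even in query volume. The sketch below uses made-up costs in arbitrary units purely to show the shape of the argument; none of the numbers come from the conversation.

```python
from typing import NamedTuple


class ModelOption(NamedTuple):
    name: str
    train_cost: float      # one-off cost, arbitrary units (hypothetical)
    cost_per_query: float  # recurring inference cost per query (hypothetical)


def total_cost(option: ModelOption, num_queries: float) -> float:
    """Training is paid once; inference is paid on every query served."""
    return option.train_cost + option.cost_per_query * num_queries


def break_even_queries(big: ModelOption, small: ModelOption) -> float:
    """Query volume above which the smaller, more heavily trained model is cheaper overall."""
    return (small.train_cost - big.train_cost) / (big.cost_per_query - small.cost_per_query)


# Hypothetical options: same quality target reached two different ways.
big_lightly_trained = ModelOption("big, trained less", train_cost=6, cost_per_query=10)
small_heavily_trained = ModelOption("small, trained more", train_cost=60, cost_per_query=1)

print(break_even_queries(big_lightly_trained, small_heavily_trained))
for queries in (1, 100):
    cheaper = min((big_lightly_trained, small_heavily_trained),
                  key=lambda option: total_cost(option, queries))
    print(queries, cheaper.name)
```

With few users the big model wins because its training bill is small; once query volume passes the break-even point, the extra training spent on the small model pays for itself.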
Harry Stebbings
We've seen DeepSeek now say, hey, only Chinese phone numbers can log in now. That is the new sign-up, I think. What's happened, and what is the result of that?
Jonathan Ross
So they ran out of compute. And this is the other reason why chip startups are going to do just fine. Because they ran out of inference compute. You train it once. So you spend money to make the model, like designing a car, but then each car you build costs you money, right? Well, each query that you serve requires hardware. Training scales with the number of ML researchers you have; inference scales with the number of end users you have.
Harry Stebbings
Do you think DeepSeek are astonished by the response they've got from the global community?
Jonathan Ross
I think they marketed it very well. You look at some of the publications and they make it sound like it's a philosophical thing, and, you know, they talk about how they spent 6 million on the GPUs, and everyone just zoomed in on that, neglecting the fact that Llama's first model was trained on, I think, 5 million worth of GPU time and it set the world on fire in a good way. And then ignoring the fact that they spent a ton generating the data and all this. They're really good at marketing. I think they were probably surprised at how well it worked, but I think this is what they were going for.
Harry Stebbings
Is there anything that I haven't asked or we haven't spoken about that we should?
Jonathan Ross
What's up with the $500 billion Stargate effort?
Harry Stebbings
Okay, what's up with the $500 billion Stargate effort?
Jonathan Ross
I've gone back and forth on that. So Gavin Baker tweeted some math. Before I saw that tweet, I came up with very similar math. However, talking to some people in the know, some of the comments are, actually, they've got it. But then you keep pressing and it's like, well, maybe there is some cutesiness to it? What I think it is is an acknowledgment that the models have been commoditized and infrastructure is what's important in terms of maintaining a lead, like scale. It's one of the Seven Powers. I think what you're seeing there is an attempt to move from having a cornered resource or something like that into a scale economy.
Harry Stebbings
Do you think it will work?
Jonathan Ross
I don't think you get there in a short period of time with GPUs, because most of the compute is inference. And so, you know, if you're talking about building out all the power, it's going to take time. It's infrastructure, it's capex. The real win here is brand. That's what I would be doubling down on. I would be hiring the best brand firms I could, I would do a complete makeover.
Harry Stebbings
Will OpenAI have a stronger or a weaker brand in three years time?
Jonathan Ross
Much stronger. I think they're going to double down on that and they're going to focus on it.
Harry Stebbings
Who will lose?
Jonathan Ross
People who can't adapt to disruption. Anyone who just wants to keep going on a straight line and do what they were doing before is going to lose and the rate of disruption is probably going to increase. Because going back to the analogy of LLMs being the printing press, imagine if there were a couple of smartphones left over from an ancient civilization. All of a sudden the printing press is invented and you're like, ooh, Uber's coming, I want to position for it. I know where this is going. We are the smartphones. We know where generative age technology goes. And now everyone's like, well we know how big this gets, let's put money into it. I can't be the one who doesn't spend money on this because I know how big of an advantage it's going to be. It's like getting to add more workers to the workforce. And so I think the generative age, we're going to speed run it faster than whatever comes next because we know what it looks like.
Harry Stebbings
Is there any chance we see a plateauing? We saw it in self-driving, for example, where we kind of went through this desert of lack of progression and suddenly all at once it came. Will we see that or will we just see this continuing dominance?
Jonathan Ross
I think with self-driving the problem you had was the threshold. It had to be way superhuman, because if you look at the number of miles driven by these self-driving vehicles, it's an enormous number, and the number of fatalities and incidents is lower per mile, but we have no tolerance whatsoever for them when it's a machine. When you're writing poetry and code, it's very different versus doing a surgery or driving a car.
Harry Stebbings
If you're Elon and xAI, how are you feeling, and do you feel better or worse post this?
Jonathan Ross
I would probably feel both better and worse. I'd feel better about my bet on building out more hardware. I would feel worse about trying to build out my own model. Why is Elon doing that? Just pick one up off the ground. Like why are you making your own?
Harry Stebbings
Are you excited when you look forward at the next few years or are you quite nervous? You can say this is a time of heightened international warfare in terms of this new AI arms race. China stealing everything, us forced to steal back.
Jonathan Ross
Long ago I stopped having good days and bad days. It's how many good things, it's how many bad things. Right. When you run an organization, I'm both excited and nervous, and I'm excited and nervous about different things at the same time. The thing that I am most nervous about is that unlike nuclear war, you can use AI tools to attack each other. Google just announced recently the first zero-day exploit found by an LLM that was previously unknown. Yeah, that's a scary one. So now it's great.
Harry Stebbings
For anyone who doesn't understand: zero-day exploit?
Jonathan Ross
So how would you like me to have access to your phone?
Harry Stebbings
Not ideal.
Jonathan Ross
How would you like the CCP to have access to your phone? Even less, right? That's a nation state, and nation states have a lot of resources, and if they stand up a bunch of compute and they start scanning for vulnerabilities in all the open source that's out there, and not even the open source, just scanning ports on the Internet and trying to figure out if they can break in, they can just automate that. Now they don't need to hire people to do that. And now the defense has to be automated because there's no way to keep up with automated attackers. And what happens if this gets out of control? But worse, it's not killing anyone and it's also deniable. That's the hardest part about it, because is it really China, is it Russia, is it North Korea, is it a friendly that's making it seem like it's one of them, or vice versa? Now you have this ability. So you go from where we had a cold war because having a war was unconscionable, it was unthinkable because of the consequences, to now, yeah, I'm just hacking you. That could spiral out of control. I'm worried that we're going to have more back and forth. And think of it this way: let's say that, Harry, you're a beacon to the venture community and you want to rally the European entrepreneurs to be risk on, and I'm someone who doesn't want that because they don't want the competition. A country that doesn't want that. Maybe I sully your reputation, maybe I make you persona non grata. How is that any worse than shooting someone? It could be worse in some ways, but you can get away with it. And so that has me nervous, really nervous. But I'm also really excited. We are seriously going to be able to innovate as fast as we can come up with ideas now. You're not going to have to implement things. You're going to be able to prompt engineer your way through things. 
Just as we move from hardware engineers to software engineers and sped up productivity, you're now just going to be able to have a prompt engineer who doesn't even write software. One of our engineers made this app where you can just describe what you want built and it builds it. And because, you know, we're so fast, it's like that and you just iterate and it'll build an app for you.
Harry Stebbings
I just don't understand, and I'm sorry to just continue, where the value accrues then. Because you mentioned that, hey, they created this tool which allows you to prompt and it'll build the app. I'm sure you've seen Bolt.new. I'm not sure if you've seen Lovable, where it's basically ChatGPT, but for website creation. Is there value? Everyone was like, there's no value in these wrapper apps. Everyone's like, there's no value in these foundation models. Where the fuck is there value?
Jonathan Ross
And that's part of the exciting part. It's discovering that. But I think people will always prefer to use the highest quality, most polished product. I think there is an opportunity for artisanship, craftsmanship and just perfecting it. Getting to a certain number of nines in the details. The Eames quote: the details aren't the details, the details are the thing. I used to be a little concerned with the quote, you know, if you're not ashamed of the quality of your first release, then you've waited too long, because there's a subtlety and nuance. There's soundness and then there's completeness. What you want is an incomplete product, something that doesn't do everything. That's why you should be embarrassed. But it shouldn't blue screen of death on you. That's not a good embarrassment. Right. And so what you're going to see now is, because it's so easy to come up with something that just kind of works, it's a little embarrassing, but it kind of works, people are really going to value well crafted, high quality products.
Harry Stebbings
Jonathan, I cannot thank you enough for breaking down so many different elements for me and putting up with my basic questions. You've been fantastic.
Jonathan Ross
No problem. Have fun out there. I mean, this is a brand new age. It really is.
Unknown
I mean, what a show that was. If you want to watch the episode in full, you can find it on YouTube by searching for 20VC. That's 20VC on YouTube. As always, we so appreciate all your support and stay tuned for an incredible episode coming on Friday with the CEO of Monzo.
The Twenty Minute VC (20VC) Episode Summary: Deepseek, AI Arms Race, and the Future of Inference with Jonathan Ross
Harry Stebbings welcomes Jonathan Ross, CEO of Groq, highlighting Ross's extensive background in AI, including his pivotal role in Google's TPU initiative. The discussion centers around the recent news about Deepseek, positioning it as a significant development in the AI landscape.
Notable Quote:
Jonathan Ross [05:08]: "Yes, it's Sputnik. It is Sputnik 2.0. Even more so."
Ross delves into why Deepseek is making headlines, comparing its advancement to Sputnik 2.0. He explains that unlike traditional models that rely heavily on scaling with more GPUs, Deepseek achieved superior performance with a tighter budget and innovative techniques.
Notable Quote:
Jonathan Ross [07:16]: "It's a little bit like speaking to someone who's smarter and getting tutored by someone who's smarter."
The conversation shifts to potential responses from OpenAI and the US government to Deepseek's advancements. Ross speculates whether OpenAI might open-source their models to maintain competitiveness and user trust.
Notable Quote:
Jonathan Ross [12:59]: "Do you really think like the CCP doesn't have all your data and isn't going to look it up later?"
Ross elaborates on the broader geopolitical implications of AI advancements, particularly the competitive dynamics between the US and China.
Notable Quote:
Harry Stebbings [28:00]: "If you're running in a race with someone who's willing to take steroids, if you want to win, you're going to have to take steroids too."
The discussion transitions to the technical aspects of AI compute, differentiating between training and inference, and their respective impacts on the industry.
Notable Quote:
Jonathan Ross [20:36]: "I can't tell you about the popularity contest, but in terms of the weighing machine part, this is a misunderstanding. It's actually more valuable, thanks to Deepseek not less valuable."
Ross introduces Hamilton Helmer's Seven Powers framework to analyze the competitive strategies of top tech companies in the AI space.
Notable Quote:
Jonathan Ross [18:25]: "Marketing is the art of decommoditizing your product. And the seven Powers are seven great ways to decommoditize your product."
Harry inquires about OpenAI's ambitious $500 billion Stargate plan. Ross provides his perspective, emphasizing that infrastructure investment alone may not suffice without strong brand power.
Notable Quote:
Jonathan Ross [44:36]: "The real win here is brand. That's what I would be doubling down on."
In the concluding segments, Ross reflects on the future trajectory of AI, encompassing innovation potentials, regulatory landscapes, and market adaptability.
Notable Quotes:
Jonathan Ross [50:41]: "I think people will always prefer to use the highest quality, most polished product."
Jonathan Ross [47:23]: "The thing that I am most nervous about is that unlike nuclear war, you can use AI tools to attack each other."
This episode of The Twenty Minute VC provides an in-depth analysis of Deepseek's recent advancements and their broader implications on the AI industry and global geopolitics. Jonathan Ross offers expert insights into the competitive dynamics between major tech players, the critical importance of inference in AI compute, and the strategic maneuvers necessary for companies like OpenAI to maintain their dominance. The discussion also underscores the urgent need for balanced regulatory frameworks to prevent unchecked AI-driven competition and safeguard data security.
For listeners interested in the intersection of venture capital, technology, and global strategy, this episode offers valuable perspectives on navigating the rapidly evolving AI landscape.