
“I think this is a big moment in the history of A.I. development”
Kevin Roose
Well, Casey, the last time we recorded an emergency podcast, you were at gate E8 of the San Francisco airport and we were talking about OpenAI and how Sam Altman had just been fired. Are you at the airport today? And if not, would you like me to mail you an Auntie Anne's pretzel so you feel more comfortable?
Casey Newton
Yeah, that'd be no. All things being equal, Kevin, it's actually much more comfortable to record here in my home studio and not have to compete with the PA system announcing flights to Houston.
Kevin Roose
Casey, we are here today to talk about a little company called DeepSeek, which probably most people had not heard of, but that is causing a major series of events in the U.S. stock market and around the U.S. tech industry this week.
Casey Newton
That's right. By now our listeners have probably seen that the stock market dipped on Monday and that some companies whose fortunes are closely tied to AI dipped quite dramatically. But they also might have just noticed it in the App Store, where DeepSeek has hit number one this week, which is a rarity for a Chinese consumer app to do in the United States. So yes, suddenly everywhere you look, there are signs of DeepSeek affecting the world.
Kevin Roose
And you know, we should say to maybe talk directly to the things some listeners may be thinking about, like why we are interrupting our normal production schedule to do a special emergency episode about DeepSeek. Like, it is not unusual for people in the AI world to start freaking out about some new development or breakthrough or some new model that was released. But I believe that this is the real deal. I think this is a big moment in the history of AI development and it is really taking a toll on stock markets in ways that I think are really interesting. I mean, you said dip, but Nvidia stock, one of the highest performing stocks on the market over the past few years, and certainly the one that is most closely correlated with people's feelings about AI, is down about 18% today. That represents hundreds of billions of dollars wiped off the market cap of just one company by this announcement from DeepSeek. So I think this is a broader story than just the stock market. I think it has tons of implications for other companies developing AI and also for concerns that a lot of people working on AI safety have about how this technology could get out of hand.
Casey Newton
Yeah, I'm excited to get into it too. But I will signal that I think that there are also some reasons not to freak out. And so I'm going to be trying to bring some of those to the discussion. But today, Kevin, I think we just really want to do three things. One, we want to tell you what DeepSeek is. Two, we want to give you some insight into why people think this is such a big deal. And then three, I think we want to debate a little bit back and forth just how big a deal this really is.
Kevin Roose
So, yeah, let's get into it.
Casey Newton
All right, so let's start with what DeepSeek is. Kevin, we have mentioned it on the show before, but tell us a little bit about this new model and why it has taken the world by storm.
Kevin Roose
Well, let's talk first about DeepSeek itself. You may remember if you listened to our episode a couple weeks ago on this, that DeepSeek is a Chinese AI company. It is about a year old and it spun out of a hedge fund called High-Flyer. And it was something that I think outside of China, most people were not paying attention to until late last year when they released something called V3 that was an AI model that was, they said, competitive with some of the leading AI models created by American AI companies. And it really caught people's attention, not just because it came out of this little known Chinese AI startup, but because of what DeepSeek said about how it was trained and how much it cost to train.
Casey Newton
Yeah. So tell us about what was so interesting about how they did it and what it cost.
Kevin Roose
Yeah, so the first interesting thing about DeepSeek that caught people's attention was that they had managed to make a good AI model at all from China. Because for several years now, the availability of the best and most powerful AI chips has been limited in China by U.S. export controls. You are not allowed, if you are Nvidia or another American company, to export your most powerful AI chips to China. So DeepSeek came out with this paper and they said, well, we actually didn't use your fancy AI chips. We used a kind of second-rate AI chip that was artificially limited in order to be able to export them to China. We have a bunch of those, and using just those kind of lesser AI chips, we were able to get a model to perform as well as you American tech companies with all your fancy H100s. And then the second thing that really caught people's attention was about the cost. DeepSeek claimed that they had spent just five and a half million dollars training V3. And I think there are some reasons to take that number with a grain of salt. But just in terms of the raw cost of the training run for that model, $5.5 million is peanuts relative to what American AI companies spend training their leading models. You know, something on the order of a hundred times cheaper than what something like an OpenAI model of equivalent performance would cost to train.
Casey Newton
Right. And this comes against a backdrop of all the US tech giants saying we're going to spend tens of billions of dollars this year to increase our capacity and data centers and the amount of compute power that we'll have. So this really stood in stark contrast to that. So that tells us a little bit about what V3 is. Sort of the training and the costs were maybe more interesting than the model itself, which is just kind of like a chatbot like a lot of us have already used. But V3 came out around Christmas, Kev. So why is the market reacting so strongly now?
Kevin Roose
So a couple things happened in the past week or so that have led to the freakout that we're seeing now. The first is that last week DeepSeek released another model, R1, which was its attempt at a so-called reasoning model. Basically, V3, the last model, was kind of similar to things like Claude or Gemini. It was sort of a basic language model, but R1 was more like OpenAI's o1 and o3, which are its newest reasoning models. So that happened early last week. And then a little later, a couple days later, DeepSeek did something else, which was that it released an app where anyone could download DeepSeek and go use its model in a very easy way. And this is when people really started to go from being interested and fascinated by DeepSeek to really panicking about it. Because all of a sudden millions of Americans were downloading this app, using DeepSeek's models and realizing, oh wait, this is like as good or better than ChatGPT. It's free, it doesn't have any ads, it seems to be quite smart. And it does something that OpenAI's models don't do, which is it shows you the internal thought process that it is going through as it is producing these answers. That is something that OpenAI's models do not show the user, but DeepSeek's models do. And I think people found that really compelling.
Casey Newton
Yes, and that last point that you mentioned I think is really important because I suspect actually all the AI companies are going to copy this now, because the process of using a chatbot today is you ask a question. I've likened it before to like throwing a penny in a fountain, right? You're just sort of making a wish. You see what you get back. What's really interesting about the DeepSeek thing is that as it's answering your question, you're seeing how the computer understood your query. And so if you want to ask a follow-up question, you now have a much better sense of how the computer understood you. And this actually does seem to be a sort of conceptual breakthrough in product design, you know, just as much as the like underlying science. All right, so that gives us a sense of what DeepSeek is and what its latest models are. Let's talk about why everyone is freaking out and maybe more specifically, take a look at who is freaking out. So as we mentioned at the top, one of the big ways people are noticing this is through the decline in the stock market. Kevin, give us a sense of the industry reaction to what the DeepSeek models might mean.
Kevin Roose
Yeah, so I would say the people who are freaking out the most are investors in the biggest American AI companies, as evidenced by all of the tech stocks selling off today. And I think you could categorize that as a fear of declining margins and commoditization. Now that sounds very boring, but basically what they're saying is, look, if a Chinese AI company that no one had ever heard of until a few weeks ago can come along and, for a fraction of our costs, develop a model that is as good as or better than the leading models on the market, with substandard chips, by the way, then the barrier to entry in this market is just not nearly as high as we thought it was. You know, one of the fundamental assumptions over the past few years when it came to AI was that bigger was better, right? That in order to build the most powerful models, you needed billions of dollars, maybe tens or hundreds of billions of dollars and huge data centers and all of the leading chips. And that assumption may no longer be true. If what DeepSeek claims is true and checks out, then it may mean that it only costs single-digit millions or double-digit millions of dollars to build a leading model, which would just radically shift what these companies are able to charge for their models, as well as the number of competitors in the market.
Casey Newton
Yes, I definitely agree. It changes what companies might be able to charge. But I would also just note that nothing that DeepSeek did is possible without American innovation. One of the reasons that it was cheap for them is because it was expensive for everyone else. And other companies did spend hundreds of billions of dollars figuring this out. So worth saying. All right, let's talk about a second group of people who have been really rattled by this series of announcements, Kevin, and that is folks who are paying attention to geopolitics.
Kevin Roose
Yeah. So a lot of people who worry about China in general are worried about this DeepSeek announcement because DeepSeek is obviously a Chinese company. If you're a person who's sort of worried about Chinese tech dominance or the possibility that Chinese firms could eventually get to something like AGI first, I think you are especially worried about what DeepSeek is doing. And I think we should also say that the models themselves are recognizably Chinese. So people over the weekend I saw testing out various queries on DeepSeek R1, including things like, you know, tell me about what happened at Tiananmen Square, and the model just refuses to answer them. And so I think there is a worry that if Chinese companies do take the lead in AI, then Chinese values and censorship laws may become embedded into this technology in a way that is very hard to extract.
Casey Newton
Yeah, I think that's true. I also just always urge caution when people try to use the existence of China as a reason to dramatically accelerate the AI race. A lot of those people have made investments that will pay off handsomely if we find ourselves in some sort of protracted and awful conflict with China. So whenever anyone starts talking about China in the context of AI, my eyebrows arch up a little bit. All right now, Kev, there is one more group of folks that I think is quite justly nervous about what they're seeing out there with DeepSeek. And who is that?
Kevin Roose
So the third group of people that I would say are freaking out about DeepSeek are AI safety experts. People who worry about the growing capabilities of AI systems and the potential that they could very soon achieve something like general intelligence or possibly superintelligence, and that that could end badly for all of humanity. And the reason that they're spooked about DeepSeek is this technology is open source. DeepSeek released R1 to the public. It's an open weights model, meaning that anyone can download it and run their own versions of it or tweak it to suit their own purposes. And that goes to one of the main fears that AI safety experts have been sounding the alarms on for years, which is just that this technology, once it is invented, is very hard to control. It is not as easy as stopping something like nuclear weapons from proliferating. And if future versions of this are quite dangerous, it suggests that it's going to be very hard to keep that contained to one country or one set of companies.
Casey Newton
Yeah, I mean, say what you will about the American AI labs, but they do have safety researchers. They do at least have an ethos around how they're going to try to make these models safe. It's not clear to me that DeepSeek has a safety researcher. Certainly they have not said anything about their approach to safety. Right? As far as we can tell, their approach is, yeah, let's just like build AGI, give it to as many people as possible, maybe for free, and see what happens. And that is not a very safety-forward way of thinking.
Kevin Roose
So, Casey, that is a lot of information that we just dumped on our listeners. But really what I want to know is like, are you freaked out about this? Do you think that this is as big a deal as some people seem to think it is?
Casey Newton
I think as I'm doing my reading and having conversations with folks this morning, my sense is I am freaking out a bit less than some other folks that I'm talking to. I think this is a big deal and merits discussion. But I also think that people may be getting a bit over their skis when it comes to thinking through the implications here.
Kevin Roose
So make that case, because all I'm seeing all over my timelines is people saying this is the Sputnik moment for AI. This is the biggest moment in AI since the release of ChatGPT. Everyone needs to stop what they're doing and pay attention. So what is the case that you are seeing out there that people are hyperventilating a bit over nothing?
Casey Newton
Sure. So let's take a few different points. One reason why people are really nervous here is that DeepSeek was able to train this model very cheaply. And I want to be clear, this is a significant technical achievement. At the same time, the cost of training and inference has been falling rapidly in AI for a long time now. Ethan Mollick, who we've had on the show before, posted a chart on X that showed this decline. And in some cases, for example, running inference on a GPT-4-level model, the cost of that has fallen a thousandfold over the past couple of years. So things have already been moving in this direction. And I think most people who work in AI expected that it would continue to go there. So that is the first point that I would make.
Kevin Roose
Got it. And if you are Satya Nadella at Microsoft or Sam Altman at OpenAI or Sundar Pichai at Google, are you worried that you are going to spend tens or hundreds of billions of dollars building out new data centers and filling them with all the fanciest GPUs and that some, you know, Chinese startup is going to just take everything that you do and copy it three months later for pennies on the dollar?
Casey Newton
So this is a great question, which leads me to a second reason why I think at least some folks may be overreacting here. And that is that in most cases, the money that is being spent to build out the data centers that will handle these giant training runs can be repurposed. The same servers and chips that you would use to do that can also be used to serve what is called inference. So basically actually answering the questions. So as more and more people start to use AI, it will be those giants that actually have the capacity to serve those queries. They will be able to build businesses around that. And by the way, that is another reason why I don't think that DeepSeek is evidence that the export controls failed. Because the folks over at DeepSeek would love to have all of these chips, not just to do the big training runs, but also so that they could serve all of the demand that they are currently generating. Right? So I think that's another important thing to keep in mind as this discussion moves forward.
Kevin Roose
Yeah, that makes a lot of sense to me. I do think the cost dynamics here are very important because, you know, I talked to a person that I know who works at one of these big companies, and he said that a lot of their customers are already starting to ask, well, could we shift over from using the OpenAI APIs and their models to using DeepSeek if it saves us 80% of our costs? And so I think in the short term, there is reason for the American AI companies to worry because people want these things to be as cheap as possible.
Casey Newton
Yeah. And let me just say, as somebody who spent $200 to upgrade to ChatGPT Pro last week so I could try Operator, I'm looking forward to that price going down. But that leads to, I think, maybe the third reason that I think people might be overreacting a little, which is a lot of what we are seeing here is just essentially a fancy ripping off of techniques that were pioneered here in the United States. Right? It has long been the case that open source models were just a little bit behind the models that are made by the big labs. Right? You look at Meta's Llama models, which, until DeepSeek, were seen as the best open weights models that were out there. They weren't as good as what OpenAI or Google or others were doing. Right? Where I do think that this gets super interesting is that DeepSeek is showing us open source can now catch up faster than it used to. Right? That the labs used to have a little bit longer lead, but now people are just getting cleverer and cleverer about these techniques. And so it is getting harder to build that defensible moat because this is just one of those technologies where once you figure out basically how people are doing it, you could just get in there and do it too.
Kevin Roose
Yeah. Now, Casey, I'm curious what, if anything, you are hearing from inside Meta specifically, because I think this is one of the most fascinating angles. You know, Meta is a company that has spent billions of dollars developing AI models and, unlike most of its competitors, has chosen to release those models freely. And part of what DeepSeek has shown is that you can take a model like Llama 3 or Llama 4 and you can distill it. You can make it smaller and cheaper. You can do that without sacrificing a lot of the performance. And so there were some reports in recent days that Meta is basically at DEFCON 1 right now. The Information reported that they have four war rooms at Meta headquarters devoted to trying to figure out how to respond to the DeepSeek threat.
Casey Newton
Yeah. And by the way, I hope they were the same war rooms that Meta used to use to protect America from election interference. They say, hey, get out of here. We got something else we gotta figure out.
Kevin Roose
So are you hearing anything from Meta? Because I think that is the company that I would say has the most to worry about when it comes to DeepSeek. Because DeepSeek is doing essentially what they do, but at a fraction of the cost.
Casey Newton
Yeah. So I do not have my own original reporting to share on this yet, but I do trust The Information that they are freaking out. And the reason is that Meta is supposed to be the best company at ripping other people off. And so when they find out that some Chinese Johnny-come-latelies are gonna be better than they are at ripping things off, they're gonna have something to say about it. And so nothing could be more poetic. Now that, you know, DeepSeek has ripped off all the American companies, Meta is coming back and they say, oh, you think you're good at ripping people off. Just wait until we have plumbed the guts of V3 and R1. We're gonna be back on top sooner or later, bucko.
Kevin Roose
Yes. Now, I wanna ask you about one other reaction that I saw on social media, which was from Satya Nadella, the CEO of Microsoft, who went on his X account late last night and posted the following: "Jevons Paradox strikes again. As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of." And then he linked to a Wikipedia article about Jevons Paradox. So, Casey, did you understand this? And if so, what did you make of it?
Casey Newton
Well, I did, because we had just discussed Jevons Paradox on this very show, Kevin. It's true. Hugging Face's Sasha Luccioni came on and explained Jevons Paradox, which is essentially that as stuff becomes more efficient, you simply increase demand for it, thereby canceling out a lot of the efficiency gains. So when I saw Satya tweet Jevons Paradox, I said, once again, Hard Fork has set the national news agenda. And if you're not listening, fix that.
Kevin Roose
Yeah, many people are talking about Jevons Paradox. I predict that this is going to be something I'm going to hear about at every single party I go to for the next six months. And just to connect the dots a little bit, I think what Satya is trying to say here is that DeepSeek is not actually a threat to companies like Microsoft, because as the cost of building and using AI models comes way down, people are just going to want to use them more and more. And so the overall demand and Microsoft's overall profitability will not change, which could be true. But I would also just say that is exactly what you would expect the CEO of Microsoft to say on a day where investors were panicking and selling their stock.
Casey Newton
It is. It is. All right. Well, Kevin, I think that's a pretty good overview of what DeepSeek is doing, why people are freaking out, and at least some thoughts about exactly how freaked out you should be. There is a lot more to say about this subject, and if you are starving for even more discussion of DeepSeek, I can promise you that we'll have more to say on our regularly scheduled episode of Hard Fork this Friday.
Kevin Roose
Yes, Casey, I love doing these emergency podcasts. They fill me with a sense of danger, excitement, living on the edge.
Casey Newton
I love them for that reason. I love them for a second reason, Kevin, which is that I get paid by the episode. So here's to many more emergencies in 2025.
Kevin Roose
This emergency episode of Hard Fork was produced by Whitney Jones and Rachel Cohn. This episode was edited by Rachel Dry and was engineered by Daniel Ramirez. Original music by Dan Powell. Our executive producer is Jen Poyant. Our audience editor is Nell Gallogly. Special thanks to Paula Szuchman, Pui-Wing Tam, Dahlia Haddad and Jeffrey Miranda. You can email us as always at hardfork@nytimes.com.
Podcast Summary: Hard Fork – "Your Guide to the DeepSeek Freakout: an Emergency Pod"
In this emergency episode of Hard Fork, hosts Kevin Roose and Casey Newton delve into the sudden upheaval caused by DeepSeek, a relatively unknown Chinese AI company, and its significant impact on the U.S. stock market and the broader tech industry.
Background of DeepSeek: Kevin Roose introduces DeepSeek as a roughly year-old Chinese AI startup that has recently made waves globally. Spun out of a hedge fund named High-Flyer, DeepSeek has rapidly gained attention with its AI models, V3 and R1.
AI Models V3 and R1:
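Casey Newton describes V3 as “just kind of like a chatbot like a lot of us have already used,” adding that its training and costs were more interesting than the model itself. R1, released the week before the episode, is DeepSeek's reasoning model, which Roose compares to OpenAI's o1 and o3.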
Cost-Effective Training: DeepSeek has astonished the AI community by developing high-performing models at a fraction of the usual cost. Roose mentions, “DeepSeek claimed that they had spent just five and a half million dollars training V3. ... something on the order of a hundred times cheaper than what something like an OpenAI model of equivalent performance would cost to train” ([04:56]).
Innovative Hardware Utilization: Because U.S. export controls limit China's access to top-tier AI chips, DeepSeek says it trained its models on less powerful chips that were deliberately limited so that they could be exported to China. If that claim holds up, it suggests that high-quality AI models can be developed without the latest hardware.
Stock Market Reaction: The release of DeepSeek's models has led to significant volatility in the U.S. stock market. Notably, NVIDIA’s stock plummeted by approximately 18%, erasing hundreds of billions in market capitalization ([03:19]).
Implications for American AI Companies: The affordability and competitiveness of DeepSeek's models pose a threat to American AI giants. Casey Newton observes, “This really stood in stark contrast to [American companies']... [spending] tens of billions of dollars this year to increase our capacity and data centers” ([06:08]).
Kevin Roose emphasizes the broader implications: “The barrier to entry in this market is just not nearly as high as we thought it was” ([10:53]).
Chinese Tech Dominance: The success of a Chinese AI company like DeepSeek raises alarms about potential shifts in global tech leadership. Roose notes, “people who worry about China in general are worried about this DeepSeek announcement because DeepSeek is obviously a Chinese company” ([11:21]).
Cultural and Censorship Implications: DeepSeek’s models reflect Chinese values and censorship norms, restricting responses to sensitive topics such as the Tiananmen Square incident. Roose explains, “... the model just refuses to answer them” ([12:18]).
Open-Source Concerns: DeepSeek released R1 as an open-weights model, allowing public access and modification. This openness exacerbates fears among AI safety experts about uncontrolled dissemination and potential misuse of advanced AI technologies ([13:53]).
Casey Newton criticizes the lack of a safety-centric approach: “... DeepSeek’s approach is, yeah, let's just like build AGI, give it to as many people as possible, maybe for free, and see what happens” ([14:22]).
Meta’s Response: Reports indicate that Meta is in a state of urgency, with multiple war rooms activated to address the DeepSeek challenge. Newton remarks, “Meta is coming back and they say, oh, you think you're good at ripping people off. Just wait until we have plumbed the guts of V3 and R1” ([20:36]).
Microsoft’s Satya Nadella on Jevons Paradox: Satya Nadella posted about Jevons Paradox, suggesting that as AI becomes more efficient and accessible, demand for it will surge. Roose interprets the post as reassurance aimed at investors: “... what Satya is trying to say here is that DeepSeek is not actually a threat to companies like Microsoft...” ([21:46]), while noting that this is exactly what one would expect Microsoft's CEO to say on a day when investors were selling the stock.
Kevin Roose: Roose underscores the significance of DeepSeek’s impact, viewing it as a pivotal moment in AI history that challenges existing market dynamics and safety paradigms.
Casey Newton: While acknowledging the gravity of DeepSeek’s advancements, Newton urges caution against overreaction. He points out that the trend of decreasing AI costs was anticipated and that foundational American innovations remain integral. Newton states, “...things have already been moving in this direction. And I think most people who work in AI expected that it would continue to go there” ([15:01]).
Kevin and Casey agree that while DeepSeek’s achievements are noteworthy and disruptive, stakeholders should approach the situation with a balanced perspective. They emphasize the need for ongoing discussion and analysis, promising further exploration in future episodes.
Kevin Roose aptly captures the duality of the situation: “I predict that this is going to be something I'm going to hear about at every single party I go to for the next six months” ([22:14]).
The episode concludes with a light-hearted exchange, highlighting the hosts' commitment to addressing critical tech developments as they unfold.
This detailed exploration of DeepSeek’s emergence underscores the disruptive potential of innovative AI developments and the multifaceted responses they provoke across markets, geopolitics, and industry safety standards.