Joe Weisenthal
89% of business leaders say AI is a top priority, according to research by Boston Consulting Group. The right choice is crucial, which is why teams at Fortune 500 companies use Grammarly, with top-tier security credentials and 15 years of experience in responsible AI. Grammarly is how companies like yours increase productivity while keeping data protected and private. See why 70,000 teams trust Grammarly at Grammarly.com.
Tracy Alloway
Meta's open source AI: available to all, not just the few. Here's Dr. Cal Clark of Zaron Labs.
Dr. Cal Clark
Meta's open source AI model Llama helps us collaborate with universities to help radiologists catch more errors.
Tracy Alloway
Learn more at ai.meta.com.
Joe Weisenthal
Bloomberg Audio Studios: Podcasts, Radio, News. Hello, and welcome to another episode of the Odd Lots podcast. I'm Joe Weisenthal.
Tracy Alloway
And I'm Tracy Alloway.
Joe Weisenthal
Tracy, the DeepSeek sell-off.
Tracy Alloway
That's right. It's pretty deep. Has anyone made that joke yet?
Zvi Moshowitz
We're in deep seek.
Joe Weisenthal
Yeah, I don't think anyone has made that joke yet.
Tracy Alloway
I will say, like, you know, it's bad in markets when all the headlines are about standard deviations. Yeah, right. And then you know it's really bad when you see people start to say it's not a crash, it's a healthy correction. Yes, that's the real cope.
Joe Weisenthal
But just for, like, real scene setting, you know, we've done some very timely interviews about tech concentration in the market lately and how so much of the market is this big concentrated bet on AI, et cetera. Anyway, on Monday (I think people will be listening to this on Tuesday), markets got clobbered. Nvidia, one of the big winners, as of the time I'm talking about this, 3:30 p.m. on Monday, down 17%. So we're talking major losses, really, across the tech complex. Basically, it seems to be catalyzed by the introduction of this high-performance open source Chinese AI model called DeepSeek. It was born, from what we know, out of a hedge fund. Apparently it was very cheap to train, very cheap to build. You know, the tech constraints at this point didn't seem to be much of a problem. They may be a problem going forward. But yes, here is something: the entire market betting on a lot of companies making AI, and there are now concerns about, of course, a cheap Chinese competitor.
Tracy Alloway
I just realized, Joe, this is actually your fault, isn't it?
Joe Weisenthal
Yeah. Yeah.
Tracy Alloway
Because last week you wrote that you were a DeepSeek AI bro. And look what you've done. You've wiped $560 billion off of Nvidia's market cap.
Joe Weisenthal
Might be, might be. Anyway, one of the interesting questions, though, is this was sort of announced in a white paper in December. Why did it take until January 27th for it to freak people out? Big questions. Anyway, let's jump right into it. We really do have the perfect guest, someone who was here for our election eve special, a guy who knows all about numbers and AI and quant stuff. And he writes a Substack that has become, for me, an absolute daily must-read, where he writes an extraordinary amount. I don't even know how he writes so much on a given day. We're going to be speaking with Zvi Moshowitz. He is the author of the Don't Worry About the Vase blog, or Substack. Zvi, you're also a DeepSeek AI bro. You've switched to using it?
Zvi Moshowitz
So I use a wide variety of different AIs. I will use Claude from Anthropic, I will use o1 via ChatGPT from OpenAI, I'll use Gemini sometimes, and I'll use Perplexity for web searches. But yeah, I'll use R1, the new DeepSeek model, for certain types of queries where I want to see how it thinks and see the logic laid out, and then I can judge: did that make sense? Do I agree with that?
Tracy Alloway
So one of the things that seems to be freaking people out, as well as the market, is that purportedly this was trained at a very low cost, something like $5.5 million for DeepSeek V3, although I've seen people erroneously say that the $5.5 million was for all of its R1 models, and that's not what it says in the technical paper; it was just for V3. But anyway. Oh, I should mention, it also seems like a big chunk of it was built on Llama, so they're sort of piggybacking off of others' investment. But anyway, $5.5 million to train: A, is that realistic? And then, B, do we have any sense of how they were able to do that?
Zvi Moshowitz
So we have a very good sense of exactly what they did, because they are unusually open and they gave us technical papers that tell us what they did. They still hid some parts of the process, especially getting from V3, which was trained for the $5.5 million, to R1, which is the reasoning model, for additional millions of dollars, where they tried to make it a little bit harder for us to duplicate by not sharing their reinforcement learning techniques. But we shouldn't get over-anchored or carried away with the $5.5 million number. It's not that it's not real; it's very real. But in order to get the ability to spend $5.5 million and have the model pop out, they had to acquire the data, they had to hire the engineers, they had to build their own cluster, and they had to optimize their cluster to the bone, because they're having problems with chip access thanks to our export controls, and they were training on H800s. And the way that they did this was all these sorts of mini little optimizations, including just exactly integrating the hardware, the software, everything they were doing, in order to train as cheaply as possible on 15 trillion tokens and get the same level of performance, or close to the same level of performance, as other companies have gotten with much, much more compute. But it doesn't mean that you can get your own model for $5.5 million, even though they told you a lot of the information. In total, they're spending hundreds of millions of dollars to get this result.
Joe Weisenthal
Wait, explain that further. Why does it still take hundreds of millions? And does this mean, if it takes hundreds of millions of dollars, that the gap between what they're able to do versus the, say, American Labs is perhaps not as wide as maybe people think?
Zvi Moshowitz
Well, what DeepSeek is doing is they have less access to chips. They can't just buy Nvidia chips the same way that OpenAI or Microsoft or Anthropic can buy Nvidia chips. So instead they had to make good, very, very efficient use of the chips that they did have. So they focused on all of these optimizations and all of these ways that they could save on compute. But in order to get there, they had to spend a lot of money to figure out how to do that and to build the infrastructure to do that. And once they knew what to do, it cost them $5.5 million to do it. And they've shared a lot of that information. And this has dramatically reduced the cost for somebody who wants to follow in their footsteps and train a new model, because they've shown the way on many of their optimizations, ones people didn't realize they could do or didn't realize how to do, that can now very easily be copied. But it does not mean that you are $5.5 million away from your own V3.
Tracy Alloway
So the other thing that is freaking people out is the fact that this is open source, right? We all remember the days when OpenAI was more open, and now it's moved to closed source. Why do you think DeepSeek did this? And, like, how big a deal is that?
Zvi Moshowitz
So this is one of those things where they have a story, and you can believe their story or not, but their story is that they are essentially ideologically in favor of the idea that everyone should have access to the same AI, that AI should be shared with the world, especially that China should help pump up its own ecosystem, and they should help grow all of the AI for the betterment of humanity. And they're going to get artificial general intelligence, and they are going to open source that as well. And this is the main point of DeepSeek; this is why DeepSeek exists. They disclaim even having a business model, really. They're an outgrowth of a hedge fund, and the hedge fund makes money, and maybe they can just do this if they choose to, or maybe they will end up with a different business model. But it's obviously very concerning from a lot of angles if you open source increasingly capable models, because artificial general intelligence means something that's as smart and capable as you and I as humans, and perhaps more so. And if you just hand that over in open form to anybody in the world who wants to do anything with it, then we don't know how dangerous that is. But it's existentially risky at some limit to unleash things that are smarter, more capable, more competitive than us, that are then going to be free and loose to do whatever any human directs them to do.
Tracy Alloway
I have a really dumb question, but I hear people say "artificial general intelligence" all the time: AGI. What does that actually mean?
Zvi Moshowitz
There is a lot of dispute over exactly what that means; the words are not used consistently. But it stands for artificial general intelligence, and generally it is understood to mean an AI that can do any cognitive task that can be done on a computer at least as well as a human.
Joe Weisenthal
I mean, most of these things do things much better than me. I don't know how to code. But I get that there are still some things where maybe they wouldn't be as good, like solving some of the "prove you're a human" tests. Everyone's talking about Jevons Paradox. And so we see Nvidia and Broadcom shares, these chip companies, they're getting crushed today. And one of the theories is like, oh no, with all these optimizations and so forth, researchers will just use those and they'll still have max demand for compute, and so it won't actually change the ultimate demand for compute. How are you thinking about this question?
Zvi Moshowitz
So I'm definitely a Jevons Paradox bro right now, from that perspective.
Joe Weisenthal
So you don't think it'll have a negative impact on the amount of compute demanded?
Zvi Moshowitz
The tweet I sent this morning was: Nvidia down 11% pre-market on news that its chips are highly useful. And I believe that what we've shown is that, yes, you can get a lot more, in some sense, out of each Nvidia chip than you expected. You can get more AI. And if there were a limited amount of stuff to do with AI, and once you did that stuff you were done, then that would be a different story. But that's very much not the case. As we get further along towards AGI, as these AIs get more capable, we're going to want to use them for more and more things, more and more often. And most importantly, the entire revolution of R1, and also OpenAI's o1, is inference-time compute. What that means is every time you ask a question, it's going to use more compute, more cycles of GPUs, to think for longer, to basically use more tokens or words to figure out what the best possible answer is. And this scales, not necessarily without limit, but it scales very, very far. So OpenAI's new o3 is capable of thinking for many minutes. It's capable of potentially spending hundreds or even, in theory, thousands of dollars or more on an individual query. And if you knock that down by an order of magnitude, that almost certainly gets you to use it more for a given result, not less, because the cost was in fact starting to get prohibitive. And over time, if you have the ability to spend remarkably little money and get things like virtual employees and the ability to answer any question under the sun, yeah, there's basically unlimited demand to do that, or to scale up the quality of the answers as the price drops. So I basically expect that as fast as Nvidia can manufacture chips, and we can put them into data centers and give them electrical power, people will be happy to buy those chips.
Tracy Alloway
At the risk of angering the Jevons Paradox bros, just to push on the Nvidia point a little bit more. So my understanding of DeepSeek is that one of the reasons it's special is that it doesn't rely on, like, specialized components, custom operators, and so it can work on a variety of GPUs. Is there a scenario where AI becomes so free and plentiful, which could in theory be good for Nvidia, but at the same time, because it's easy to run on a bunch of other GPUs, people start using more ASICs, customized chips built for a specific purpose?
Zvi Moshowitz
I mean, in the long run, we will almost certainly see specialized inference chips, whether they're from Nvidia or from someone else. And we will almost certainly see various different advancements; today's chips are going to be obsolete in a few years. That's how AI works, right? There are all these rapid advancements. But I think Nvidia is in a very, very good position to take advantage of all of this. I certainly don't think we're headed for a world where you'll just use your laptop to run the best AGIs and therefore we don't have to worry about buying GPUs, leaving Nvidia in a poor position. It's certainly possible that rivals will come up with superior chips; that's always possible. Nvidia does not have a monopoly. But Nvidia certainly seems to be in a dominant position right now.
Joe Weisenthal
89% of business leaders say AI is a top priority, according to research by Boston Consulting Group. But with AI tools popping up everywhere, how do you separate the helpful from the hype? The right choice is crucial, which is why teams at Fortune 500 companies use Grammarly. With over 15 years of experience building responsible, secure AI, Grammarly isn't just another AI communication assistant. It's how companies like yours increase productivity while keeping data protected and private. Designed to fit the needs of business, Grammarly is backed by a user-first privacy policy and industry-leading security credentials. This means you won't have to worry about the safety of your company information. Grammarly also emphasizes responsible AI, so your company can avoid harmful bias. See why 70,000 teams and 30 million people trust Grammarly at Grammarly.com/enterprise. That's Grammarly.com/enterprise. Grammarly: Enterprise-Ready AI.
Mikaela Shiffrin
I'm alpine skier Mikaela Shiffrin. I've won the most World Cup ski races in history. But what does success mean? To me, success means discipline. It's teamwork. It's the drive and passion inside of us that comes before all recognition. And it's why Stifel is one of the fastest growing global wealth management firms in the country. If you're looking for success, surround yourself with the people who will get you there.
Stifel Representative
At Stifel, we invest everything into our advisors so they can invest everything into their clients. That means direct access to one of the industry's largest equity research franchises and a leading middle market investment bank. And it's why Stifel has won the J.D. Power Award for Employee Advisor Satisfaction two years in a row.
Mikaela Shiffrin
If you're an advisor or investor, choose Stifel, where success meets success.
Stifel Representative
Stifel, Nicolaus & Co., Inc. Member SIPC and NYSE. For J.D. Power 2024 award information, visit jdpower.com/awards. Compensation provided for using, not obtaining, the award.
Joe Weisenthal
It seems to me, I mean, I know there are others, but it seems to me in the US there are, like, three main AI producers of models that people know about: there's OpenAI, there's Claude, and then there's Meta with Llama. And it's worth noting that Meta is green today, that the stock is actually up, as of the time I'm talking about this, 1.1%. Just go through each one real quickly: how does the DeepSeek shock affect them and their viability, and where do they stand today?
Zvi Moshowitz
I think the most amazing thing about your question is that you forgot about Google.
Joe Weisenthal
Oh yeah, right. Yeah, that's very telling, isn't it?
Zvi Moshowitz
But everyone else has forgotten about it, too.
Joe Weisenthal
I know I never use Gemini.
Zvi Moshowitz
It wasn't that surprising. Yeah, their Gemini Flash Thinking, their version of o1 and R1, got updated a few days ago, and there are many reports that it's actually very good now and potentially competitive, and it's effectively free to use for a lot of people on AI Studio. But nobody I know has taken the time to check and find out how good it is, because we've all been too obsessed with being DeepSeek bros. Google's had its proverbial lunch eaten over and over and over again. In December, OpenAI would come out with advance after advance after advance. Then Google would have advance after advance after advance, and Google's would seemingly be, if anything, more impressive. And yet everyone would always just talk about OpenAI. So this is not even new; something's going on there. In terms of OpenAI: OpenAI should be very nervous in some sense, of course, because they have the reasoning models, and now their reasoning model has been copied much more effectively than previously, and the competition is a hell of a lot cheaper than what OpenAI is charging. So it's a direct threat to their business model for obvious reasons. And it looks like their lead in reasoning models is smaller and faster to undo than you would expect, because if DeepSeek can do it, of course Anthropic and Google can do it, and everyone else can do it as well. Anthropic, which produces Claude, has not yet produced its own reasoning model. They clearly are operating under a shortage of compute in some sense, so it's entirely possible that they have chosen not to launch a reasoning model even though they could, or are not focused on training one as quickly as possible, until they have addressed this problem. They're continuously taking investment, and we should expect them to solve their problems over time. But they seem like they should be less directly concerned, because they're less of a directly competitive product in some sense. But they also tend to market, effectively, to much more aware people.
So their people will also know about DeepSeek, and they will have a choice to make. If I were Meta, I would be far more worried, especially if I were on their GenAI team and wanted to keep my job, because Meta's lunch has been eaten massively here, right? Meta, with Llama, had the best open models, and all the best open models were effectively fine-tunes of Llama. And now DeepSeek comes out, and this is absolutely not in any way a fine-tune of Llama; this is their own product. And V3 was already blowing everything that Meta had out of the water. With R1, there are reports that it's better than the new version that Meta is training now, Llama 4, which I would expect to be true. And so there's no point in releasing an inferior open model if everyone in the open model community is just going to be like, why don't I just use DeepSeek?
Joe Weisenthal
Tracy, it's interesting that Zvi said the people who should be nervous are the employees of Meta, not Meta itself, because Meta is up. And so you've got to wonder, it's like, well, I don't know, maybe they don't need to invest as much in their own open source AI if there's a better one out there. And the stock is up anyway.
Zvi Moshowitz
The market has been very strange, from my perspective, in how it reacts to different things that Meta does. For a while, Meta would announce: we're spending more on AI, we're investing in all these data centers, we're training all of these models. And the market would go, what are you doing? This is another metaverse or something, and we're going to hammer your stock and drag you down. And then with the most recent $65 billion announced spend, Meta was up. Presumably they're going to use it mostly for inference, effectively, in a lot of scenarios, because they have these massive inference costs from wanting to put AI all over Facebook and Instagram. So if anything, I think the market might be speculating that this means they will know how to train better Llamas that are cheaper to operate, and their costs will go down, and then they'll be in a better position. And that theory isn't crazy.
Tracy Alloway
Since we all just collectively remembered Google: I have a question that's sort of been in the back of my mind, and I think Joe has brought this up before as well. But, like, when Google debuted, it took years and years and years for people to sort of catch up to the search function. And actually, no one ever really caught up, right? So Google has, like, dominated for years. Why is it that when it comes to these chatbots, there aren't, like, higher, wider moats around these businesses?
Zvi Moshowitz
So one reason is that everyone's training on roughly the same data, meaning the entire Internet and all of human knowledge. So it's very hard to get that much of a permanent data edge there, unless you're creating synthetic data off of your own models, which is what OpenAI is plausibly doing now. Another reason is that everybody is scaling as fast as possible, adding zeros to everything on a periodic basis in calendar time. It doesn't take that long before your rival is going to have access to more compute than you had, and they're copying your techniques more aggressively. There's just a lot less secret sauce; there are only so many algorithms. Fundamentally, everyone is relying on the scaling law. It's called the bitter lesson: the idea that you just scale more, you just use more compute, you just use more data, you just use more parameters. And DeepSeek is saying, maybe you can do more optimizations, you can get around this problem and still get a superior model. But mostly, yeah, there's been a lot of "I can catch up to you by copying what you did." Also because I can see the outputs, right? I can query your model, and I can use your model's outputs to actively train my model. And you see this in things like: most models that get trained, you ask them who trained you, and they will often say, oh, I'm from OpenAI.
Joe Weisenthal
The Internet's gotten so weird. I just... the Internet is so weird. Zvi Moshowitz, thank you so much for running over to Odd Lots and helping us record this emergency pod on the DeepSeek sell-off. That was fantastic.
Zvi Moshowitz
All right, thank you.
Joe Weisenthal
Tracy, I love talking to Zvi. We've got to just sort of make him our AI guy.
Tracy Alloway
I mean, to be honest, we could probably have him back on again this week because there's gonna be stuff happening, right?
Joe Weisenthal
Maybe we will. And obviously we could go a lot longer. This is a really exciting story, and things are just getting really weird these days.
Tracy Alloway
It is kind of crazy how fast all of this is happening. And then the other thing I would say is just the bitter lesson. Great name for a band.
Joe Weisenthal
Oh, totally. Totally great. Maybe when we do our AI-themed prog rock band, Tracy, that could be our name.
Tracy Alloway
Yes, let's do that. Okay. Shall we leave it there?
Joe Weisenthal
Let's leave it there.
Tracy Alloway
This has been another episode of the Odd Lots podcast. I'm Tracy Alloway. You can follow me @tracyalloway.
Joe Weisenthal
And I'm Joe Weisenthal. You can follow me @TheStalwart. Follow our guest Zvi Moshowitz; he's @TheZvi. Also, definitely check out his free Substack, Don't Worry About the Vase. It's a must-read for me, really great stuff every single day. Follow our producers: Carmen Rodriguez at @ermenarmen, Dashiell Bennett at @dashbot, and Kel Brooks at @kelbrooks. For more Odd Lots content, go to bloomberg.com/oddlots, where we have transcripts, a blog, and a newsletter, and you can chat about all of these topics 24/7 in our Discord, discord.gg/oddlots. Maybe we'll get Zvi to do a Q&A in there with people.
Tracy Alloway
Oh yeah, that'd be great. And if you enjoy Odd Lots, if you like it when we roll out these emergency episodes, then please leave us a positive review on your favorite platform. Thanks for listening.
Bloomberg Representative
Join Bloomberg in Chicago, or via livestream, on March 11th for The Future Investor: Finding the Opportunities. This 2025 event series will examine how companies are investing in their businesses to create efficiencies, innovating their products and services, and improving the customer experience. This series is proudly sponsored by Invesco QQQ. Register at BloombergLive.com/FutureInvestorChicago. That's BloombergLive.com/FutureInvestorChicago.
Odd Lots Podcast Summary: "The AI Model That Tanked the Stock Market" Bloomberg, Released January 28, 2025
Introduction
In the January 28, 2025 episode of Bloomberg's Odd Lots, hosts Joe Weisenthal and Tracy Alloway delve into a seismic event in the financial and technological sectors: the stock market's dramatic downturn triggered by the emergence of a new open-source AI model from DeepSeek. The episode features an in-depth conversation with Zvi Moshowitz, a renowned AI and quantitative finance expert. This summary encapsulates the key discussions, insights, and conclusions drawn during the episode.
1. The Deepseek AI Model and Market Turmoil
The episode opens with a stark revelation by Joe Weisenthal about the massive market impact caused by DeepSeek, an open-source AI model developed in China. He states:
"It seems to be catalyzed by the introduction of this high-performance open-source Chinese AI model called DeepSeek. [...] there are now concerns about, of course, a cheap Chinese competitor." (01:31)
Tracy Alloway humorously attributes a significant portion of the market loss to Weisenthal's own enthusiasm for AI:
"Because last week you wrote that you were a DeepSeek AI bro. And look what you've done. You've wiped $560 billion off of Nvidia's market cap." (02:27)
This dramatic decline primarily affected tech giants like Nvidia, which experienced a notable 17% single-day drop in stock value, underscoring the fragility of market sentiment tied to AI advancements.
2. Training Costs and Technical Insights
Zvi Moshowitz provides a comprehensive breakdown of DeepSeek's development, emphasizing its cost-effectiveness:
"They did all these sorts of mini little optimizations... in order to train as cheaply as possible on 15 trillion tokens and get the same level of performance or close to the same level of performance as other companies have gotten with much, much more compute." (05:53)
He clarifies that while the initial training cost was reported as $5.5 million for DeepSeek V3, the total investment, including data acquisition, engineering, and infrastructure, amounted to hundreds of millions of dollars. This distinction is crucial to understanding the true financial commitment behind developing competitive AI models.
3. Open Source AI and Its Broader Implications
The conversation shifts to the philosophical and practical ramifications of open-source AI. Moshowitz discusses DeepSeek's ideological stance:
"They are essentially ideologically in favor of the idea that everyone should have access to the same AI... it doesn't mean that you are $5.5 million away from your own V3." (07:17)
He raises concerns about the potential risks of releasing highly capable AI models openly, citing the existential threats posed by Artificial General Intelligence (AGI). The democratization of such powerful technology could lead to unforeseen and possibly dangerous applications.
4. Competition Among AI Laboratories
The episode examines the competitive landscape of AI development, highlighting the position of major players like OpenAI, Meta, and Google. Moshowitz notes:
"OpenAI should be very nervous... it's a direct threat to their business model for obvious reasons." (15:10)
He underscores that DeepSeek's advancements have narrowed the gap between established AI laboratories, making the race for AI supremacy more intense. The discussion also touches upon Google's overlooked contributions and the challenges faced by competitors like Anthropic in keeping pace with rapid AI advancements.
5. Implications for Nvidia and the Chip Industry
A significant portion of the discussion centers on Nvidia's role in the AI ecosystem. Moshowitz argues against the notion that DeepSeek's efficiency lowers overall chip demand, invoking the Jevons Paradox:
"As we get further along towards AGI... there's basically unlimited demand to do that or to scale up the quality of the answers as the price drops." (09:50)
He contends that increased efficiency in AI models will drive greater demand for compute resources, thereby benefiting chip manufacturers like Nvidia. Despite DeepSeek's ability to optimize chip usage, the overarching need for more powerful hardware remains unabated.
6. The Concept of Artificial General Intelligence (AGI)
Tracy Alloway probes the often-debated term "Artificial General Intelligence," seeking clarity on its definition. Moshowitz explains:
"Generally it is understood to mean an AI that can do any cognitive task that can be done on a computer at least as well as a human." (08:49)
The hosts discuss the implications of reaching AGI, emphasizing the transformative and potentially disruptive impact it could have on various industries and society at large.
7. Google's Position in the AI Race
Towards the episode's conclusion, the hosts and Moshowitz reflect on Google's strategic positioning:
"Google's had its proverbial lunch eaten over and over and over again." (15:40)
Despite being a powerhouse in AI research, Google's contributions often go underappreciated in public discourse. However, Moshowitz suggests that Google's substantial investments in AI infrastructure and model optimization could position it strongly against competitors like OpenAI and DeepSeek.
Conclusion
The episode of Odd Lots provides a thorough examination of the immediate and far-reaching consequences of DeepSeek's introduction to the AI landscape. Through expert analysis, the hosts highlight the intricate interplay between AI advancements, market dynamics, and the broader economic implications. The discussion underscores the pivotal role of AI in shaping future technological and financial paradigms, urging stakeholders to remain vigilant and adaptive in this rapidly evolving field.
For more insights and detailed discussions, listeners are encouraged to tune into the full episode of Odd Lots on Bloomberg.