
Alex Kantrowitz
Let's break down what the release of GPT 4.5 means for OpenAI and the future of generative AI. Plus, Anthropic also has a new model, Nvidia CEO Jensen Huang makes a bold claim, and Amazon introduces a better version of Alexa. That's coming up right after this.
Kwame Christian
Hi, I'm Kwame Christian, CEO of the American Negotiation Institute, and I have a quick question for you. When was the last time you had a difficult conversation? These conversations happen all the time. And that's exactly why you should listen to Negotiate Anything, the number one negotiation podcast in the world. We produce episodes every single day to help you lead, persuade and resolve conflicts both at work and at home. So level up your negotiation skills by making Negotiate Anything part of your daily routine.
Ranjan Roy
Struggling to keep up with customers? With Agentforce and Salesforce Data Cloud, deploy AI agents that know your customers and act on their own. That's because Data Cloud brings all your data to Agentforce, no matter where it lives. Get started at salesforce.com/data.
Alex Kantrowitz
Welcome to Big Technology Podcast Friday Edition, where we break down the news in our traditional cool-headed and nuanced format. We have so much to talk you through this week. It feels like this week, among many crazy weeks, has been one of the craziest. We have a new model from OpenAI, a new model from Anthropic, a new Alexa, Nvidia earnings, and Skype is dead. So it was a very, very promising week for a lot of companies, but not for Skype, which will forever live in our memory. We will say goodbye to Skype at the end of the show. But in the meantime, joining us as always on Friday is Ranjan Roy of Margins. Ranjan, great to see you.
Ranjan Roy
Happy New Model Week, Alex. How have all these new models changed your life? As of today, February 28th, not at all.
Alex Kantrowitz
But we will talk about whether that will matter in the long term, because of course I put your "Is it the model or is it the product?" question to OpenAI head of research Mark Chen. We talked about it, and now you're going to get a chance to respond. But first let's just break down the news, because yesterday we had the release, of course, of GPT 4.5. OpenAI, for the first time ever, put a spokesperson on this podcast. We broke the news here with Mark, and now we're going to analyze what it means, because we've sort of left the fog of war and we have some perspective on whether this is disappointing for OpenAI, whether this is promising for OpenAI, and whether this means that generative AI can continue to progress or not, now that we've seen some more reactions outside of Mark Chen saying yes, scaling is still alive. So this is from The Verge: "OpenAI announces GPT 4.5." GPT 4.5 is the largest and newest large language model from OpenAI. It's going to be available as a research preview for ChatGPT Pro users to start. And here's a weird thing that happened, though, and we're going to get to it right away. There was some documentation that OpenAI released about this model and then removed, and it's very mysterious. They said GPT 4.5 is not a frontier model, but it is OpenAI's largest LLM, improving on GPT 4's computational efficiency by more than 10x. It does not introduce net new frontier capabilities compared to previous reasoning releases, and its performance is below that of o1, o3-mini and Deep Research on most preparedness evaluations. OpenAI has since removed this mention from an updated version of the document. So they did remove it. I don't think they disputed it, though. And what I found interesting was, yes, this was a step change improvement over GPT 4. It was not over the reasoning models. So you would think maybe you could build reasoning on top of this and it will be even better.
We're going to talk about that in a moment. But for the meantime, OpenAI has a new model that does not exceed the reasoning models in certain benchmarks, and seemed to admit that in a document. So Ranjan, you've been following along this whole way. What do you think the implications of this are?
Ranjan Roy
This week had me thinking. I feel with iPhone releases in recent times, a lot of us have been saying, do we really need a big event to release every new iPhone? Certainly the 16e was not exactly the iPhone launches of yesteryear. I'm starting to feel like that with all of these large language model releases: Claude 3.7, GPT 4.5. Even as you're listing out all of the release notes around this, and then there are some ingredients that are not listed, or these things are removed from the actual release documentation, it's not that exciting. It's not exciting enough to have to launch a livestream and get everyone hyped up around it. With GPT 4.5, and we're going to get into this, there are some elements of emotional intelligence or emotional quotient around it. Perhaps creative writing is a little bit better. Perhaps there is a bit of computational efficiency introduced to it. Even Sonnet 3.7, and I was trying this, Claude Code is a pretty big release and a pretty big step change, but it's not revolutionary. So I think a lot of these companies have gotten caught in this hamster wheel of needing to do these big model launches. And there was a time when the step change was so big that it was actually exciting for all of us. But now I think 4.5 is probably the least interesting model release from OpenAI to date, because even o1, and adding reasoning models to the overall suite, was a pretty big deal. With 4.5, I still cannot tell you what the big deal is. Maybe you can tell me.
Alex Kantrowitz
We are going to get some commentary from Andrej Karpathy about this that he put on Twitter yesterday, which does answer that point. But even from OpenAI itself, there was some very interesting communication, shall we say, around this model. So Sam Altman came out with this tweet and he said: GPT 4.5 is ready. The good news: it's the first model that feels like talking to a thoughtful person to me. I've had several moments where I sat back in my chair and have been astonished at getting actually good advice from AI. The bad news: it's giant, it's expensive. We really wanted to launch it to Pro and Plus users at the same time, but we've been growing a lot and are out of GPUs. We'll add tens of thousands of GPUs next week and roll it out to the Plus tier then. And there's hundreds of thousands coming soon, and I'm pretty sure y'all will use every one we can rack up. This isn't how we want to operate, but it's hard to perfectly predict growth surges that lead to GPU shortages. Remember, ChatGPT has gone from a hundred million to 300 million users in a very short amount of time. Heads up: this isn't a reasoning model and won't crush benchmarks. It's a different kind of intelligence and there's a magic to it I haven't felt before. Really excited to have people try it. Look, it's very interesting because, again, we're going to go into my interview with Mark Chen very quickly. But with Mark, I was like, hey, listen, does this show that we're getting diminishing returns from scaling? And he said, absolutely not. But then you have these endorsements from Altman and it's fairly muted, so make sense of that for me.
Ranjan Roy
The world's greatest product marketer cannot market his own product. I mean, "it's not as great at some things, but trust me, there's this magic which I felt, but I'm not going to actually tell you what that magic is." I think his statement really captures the overall feeling I have of 4.5. It just tries to put a positive slant on it, but I think that's exactly it. They have to keep pushing new models, new narratives, pushing towards GPT5, if and when that will come. But to me, and this is going to get back to our product versus model debate, they need to show more product again. Operator, Deep Research: those were exciting moments. 4.5 as any kind of announcement is not incredibly interesting to me. You had asked Mark Chen during your interview, what are the new use cases, or what are the use cases where this will be better? And I was actually sitting there waiting with bated breath, ready to hear, okay, this is how this is going to help me or other people. And there was a somewhat generalized answer around how with creative writing tasks, whatever that might mean, this is better. And that was kind of all I got out of the interview. Which lines up with the whole idea around emotional quotient, emotional intelligence, more creative writing, more thoughtful answers. And I've seen a lot of examples out there of 4.5 answering and being a bit funny, and people saying this is the first time AI has made me laugh. But if you're just trying to get a little bit more Grok-y with your model, I don't know, that doesn't seem like it's going to fill in that SoftBank valuation for me.
Alex Kantrowitz
Yeah, look, I didn't find it Grok-y at all, because, you know, as we've talked about on the show...
Ranjan Roy
You know, more on the trying to make it funny or interesting, as opposed to just giving you information.
Alex Kantrowitz
So having experimented with it, as I was going to say, both of us have paid for that $200-a-month upgrade because we wanted to try Deep Research. And I guess mine is still live. I think yours just dinged. But I'll say that I spent a good amount of time chatting with GPT 4.5 yesterday, and what they're saying is real. It is definitely much more pleasant to talk to. And I spoke with Mark about this a little bit yesterday. The responses are shorter, they're more humanlike. It doesn't feel the need to print out a master's thesis for each answer. You can actually have a back and forth with it. And it was actually one of the more enjoyable conversations I've had with a bot to date.
Ranjan Roy
Okay, I'll give you that. That is a good point. If that's the big functional change: we've all gotten very used to this idea that you query a chatbot and you get this really overly thoughtful answer that tries to hedge itself against any kind of safety consideration and lists out 10 bullet points and, as you said, a master's thesis. So maybe there is something very important there, where it actually starts to be able to answer you correctly in a concise way, in a more conversational, human way. Maybe there's something there. But to me, again, why not just put that out there in the model? Why have a big event around it? Why make a big press push around it? Why not just put that in the product?
Alex Kantrowitz
Well, here's why I would say it's important to do that. And this is what Mark was saying: you have a linear progression of the model's capabilities based off of what you predict. If you put this much compute in, you get this much output. And I think OpenAI is saying that 4.5 is the next step on that progression, and with the amount of compute that they've put in, it's met the benchmarks that they expected to hit. And that's why I said to him, did you find the scaling wall? And he said, GPT 4.5 is really proof that we can continue the scaling paradigm. So basically I think that is the march. But I also think it's important to talk about what it's going to feel like to all of us. And this gets to Karpathy's comments, where he describes really well the progress from the original models, because the improvement is going to feel smaller as the models get better. So he says GPT1 barely generates coherent text. GPT2 was a confused toy. GPT 2.5 was skipped straight into GPT3, which was even more interesting. And GPT 3.5 crossed the threshold where it was enough to actually ship a product and sparked OpenAI's ChatGPT moment. He says, I went into testing GPT 4.5, which he's had access to, and he says everything is a little bit better and it's awesome, but not exactly in ways that are trivial to point to. Still, it is incredibly interesting and exciting as another qualitative measure of a certain slope of capability that comes for free just by pre-training a bigger model. He says we actually expect to see improvement in tasks that are not reasoning heavy. And I would say these are tasks that are more EQ related, as opposed to IQ related, and bottlenecked by world knowledge, creativity, analogy making, general understanding and humor. So these are the tasks that he was most interested in during his vibe checks.
Basically saying that like you use this model, it's a little bit better and that matters a lot because we've already come so far from the barely coherent part to where it is today.
Ranjan Roy
I think I'm going to nominate you as the new product spokesperson for OpenAI, because I think you just convinced me right here. I think you just turned my entire view of 4.5 in this moment. So basically, I've been talking a lot about how AI has a branding problem. The idea that people say that's, quote unquote, "written by AI." Everyone has this really narrow view of what AI text generation is, and that's because of this very dry, weird, almost inhuman way that it responds to you. And with every model, whether it's Gemini or Claude or ChatGPT, everyone has this view of "this is what an AI response looks like." So if the real advancement here is that it can move beyond that and make things more human and conversational, that actually could be very interesting overall in terms of getting people to use these products. So if that's the real change here, I'm surprised that they didn't hone in on that: that this is going to be what takes ChatGPT to the next 700 million people outside of all the early adopters, makes people comfortable and happy with it, and makes it much more natural within all types of mediums and channels and outputs. If they positioned it like that, and if that's what's really happening here, that is kind of exciting for me.
Alex Kantrowitz
I think that is how they're positioning it. They are talking about the fact that this has great EQ, and that is where they seem to want to focus people with this release. And you look at some of these benchmarks, so I'll just read a few of them. SimpleQA accuracy: GPT 4.5 has 62.5% compared to, let's see, 47% for the closest model, which is OpenAI o1. The hallucination rate is 37.1%. Again, lower is better. GPT 4o has a 61.8% hallucination rate, which seems high. So those are the standard benchmarks. But then you get into the everyday queries, and they say that for everyday queries people prefer 4.5 57% of the time over GPT 4o. For professional queries they preferred it 63.2% of the time over 4o, and for creative intelligence 56.8% over 4o. So that's not nothing.
Ranjan Roy
No. I think if Kantrowitz and Roy were behind this marketing campaign and launch, we could have just come up with a simple "Make AI less AI." What about that one? Something just pushing the idea that that's what this is really about. Not getting caught up in the scaling law side of it, the compute efficiency side of it, and really saying this is the first model that makes AI less AI. It makes more people feel comfortable using this on an everyday basis. I think that would have been a little more exciting for me, definitely.
Alex Kantrowitz
And so there's a very interesting debate going on about where it got this more EQ-oriented positioning. Was it pre-training, as in, is it because of its abilities? Or was it post-training, where they just added this personality after the model was built? We don't fully know. And actually, if I was going to have one question that I'd want to ask Mark Chen, if I could get him on the phone for another five minutes, it would be that question. And I feel bad having left that out yesterday. But I have seen some very interesting debates about it over the past couple of days. There's this one Princeton academic, Arvind Narayanan. He says apparently the main thing we're getting with GPT 4.5 in exchange for a 30x price increase is fuzzy stuff like EQ. The ironic thing is this is an aspect of behavior, not a capability. My bet is that any differences in EQ are due to post-training, not the parameter count. Okay, so that's an interesting thesis. Ethan Mollick from Wharton slides into his mentions. And Ethan Mollick, of course, he's a professor, he's been on the show, he's been pretty good at following the pulse of AI. He's pretty positive, so he tends to take the sunny side of things. But he says: disagree on this one. Stuff like theory of mind or EQ are deeply rooted in abilities, not behavior, in humans. And I would bet the same for AI, but again, we don't know yet. So basically, if this did come out of just training the model, making it more able, and then it all of a sudden produces a more human style of communication, I think that's pretty interesting.
Ranjan Roy
Well, yeah, I was thinking about this as well after reading these. On one hand, it could be essentially kind of a party trick. It could be more instruction-level, after the actual core training, where it's just: speak in this voice, give concise answers, try to lean your behavior a certain way. I think that would actually be very sad, because that would be easy. What Ethan's saying, I think, is the more interesting part. And I have to say, if it's OpenAI doing this, I have to imagine that for this kind of product and model, they're not going to be going the party trick route. Genuinely changing the way the model thinks and produces knowledge would be a very big deal, as Ethan's saying. But again, we don't know what that means or what it looks like. Is it in the supervised fine-tuning layer? Is it in the base training layer? We don't know. I'm actually surprised. Yeah, we've got to find out. You've got to ask Mark Chen again, because to me, again, that is the really interesting stuff they should loudly be talking about, rather than Sam Altman just saying it's kind of magic and not giving us any more.
Alex Kantrowitz
Exactly. And so there's been this other thing that's happened, though, which is that people have taken a look at the evaluation scores and have noted that this is not as good as reasoning models in a lot of different fields. So I think we should talk about that, because it has been used as a discussion point about whether OpenAI has lost the magic. So let me just go through some of these. Whatever they're going to mean to you, I'm just going to read them out. So there's GPQA, which is science: 4.5 gets 71.4% compared to 79.7% for OpenAI o3-mini. So it's down by about 8 percentage points there. There's AIME '24, which is math: GPT 4.5 gets 36.7% compared to o3-mini's 87.3%, less than half the performance. It's amazing. It just beats it on this multilingual test. And it is a little bit better on one coding benchmark and then a little bit worse on another coding benchmark. But basically people have taken this, and I think this was also something I saw afterwards. I was like, oh dear. These reasoning models are outperforming this on a lot of benchmarks. Now, I think we should say that the reasoning models use the intelligence of these standard models and they learn how to attack things step by step, which is like, yes, the reasoning models are doing the things that they're supposed to do, and it just shows you how impressive the reasoning is. But then there's also just: why is it lagging? People have been like, all right, that's really disappointing. Let's hear from trusty Bob McGrew, former chief research officer at OpenAI, who always seems to hop into the discussion at an opportune time. He says: Don't be disappointed that GPT 4.5 isn't smarter than o1. Scaling up pre-training improves responses across the board.
Scaling up reasoning improves responses a lot if they benefit from thinking time and not much otherwise. Wait to see how the improvements stack together. I think this is really important, right? It's that this 4.5 is going to be the basis of the next reasoning model that OpenAI is going to put out. And I think Mark hinted at this: GPT5 will bring both of those capabilities together, where you're going to have the smarter basic foundational model, which is going to be GPT5 or something built off of GPT 4.5, and then you're going to add the reasoning in, and then it should even further outperform the stuff we're seeing with o1 and o3. So what do you think about that?
Ranjan Roy
Yeah, trusty Bob McGrew making sense of it. I think that makes sense: building this more apt, able, emotionally intelligent foundation model and then incorporating that with the reasoning model, and ideally that getting us to GPT5, seems like something ambitious enough to actually push forward on. I guess I still have such a difficult time, though, when we're looking at the GPQA benchmark, AIME '24, even when we're looking at what you had shown earlier, that on everyday queries GPT 4.5 beat 4o roughly 60% of the time. What does that actually mean? What does that look like? What kind of real-life problems? Because I'm so fascinated by what an everyday query is in one of these tests. If you have an AI researcher creating a benchmark, what is their everyday query versus your or my everyday query? I think that's the part that still worries me about OpenAI, that so much focus is on that research house part of it and the very, very research-oriented approach to all of this, going back to product versus model. It feels like we're still locked in that rat race here.
Alex Kantrowitz
Okay, well that, that just takes us to our model versus product question again because I did bring this up to Mark and I said, all right, you're the head of research at OpenAI, you're a model guy. So just like I am trying to figure out how to argue this to Ranjan, maybe you can help me figure it out. And. And he did. He gave an explanation, basically saying that as the models get smarter, these products, like for instance, Deep Research, get smarter. We talked last week about how if they're hallucinating, they become useless. So the less hallucinations you would imagine, the better, maybe. Unless you're Benedict Evans who wants zero hallucinations. So I'm kind of curious to put that to you and get your thoughts on what it means.
Ranjan Roy
I listened to it, and it still felt like a relatively generalized statement for something that shouldn't be a generalized statement. Even going back to what the real use cases are: Is it creative writing? Is that really what you're pitching me with 4.5, that it's going to be better? Is it that everyday users will have a better experience with a chatbot and feel more comfortable? Is it that AI therapists are going to get a lot better, because now it can actually talk to you in a more emotionally connective way? The hallucination rate side of it, I think, obviously matters. But to me, the 99% versus 98% versus 97%, for most AI use cases in the world, will probably be okay. To me, again, it still doesn't answer that question. Deep Research can get better and better, but does that mean financial analysts will actually trust everything that is put into a Deep Research report in a week, in a month, in a year? What does that actually look like?
Alex Kantrowitz
Yeah, I think we still don't know. I mean, we can definitely say for sure that improving the model from GPT1 to where we are today has mattered. But I think the question is, yes, what are these incremental improvements going to really lead to? And Mark was like, it's all about getting to the frontier of knowledge in AI. The smarter these things are, the more they can do, just like a smarter human can do more. I think it's great. I love that they are pushing the cutting edge on this, and that every AI lab is trying to push the cutting edge on it. But I cannot staunchly sit in my position for much longer unless I see some tangible outcomes from this. But anyway, I'll still be on Team Model for the time being.
Ranjan Roy
All right. It makes for better Fridays knowing we still have product versus model.
Alex Kantrowitz
Yeah, I don't really see myself going away from that position anytime soon. I want to see the better models. Thanks for shipping the better models.
Ranjan Roy
I'm waiting for GPT5 to show up and magically solve every use case perfectly. And I will eat my hat, or whatever one does, on that day.
Alex Kantrowitz
Well, the last thing I'll say about Mark is I did say like, so aren't you setting expectations too high? And he said, I don't think so.
Ranjan Roy
Okay.
Alex Kantrowitz
All right, Mark, GPT 5, baby.
Ranjan Roy
I don't know what trusty Bob McGrew would say about that, but let's see.
Alex Kantrowitz
All right, so now that I've become the sort of de facto product spokesperson for OpenAI, let's go to Gary Marcus, because I feel like we should at least give some time to those who've said that the GPT 4.5 launch actually shows that OpenAI is toast, and discuss their points. And one of those people is Gary Marcus, a longtime critic. He's been on the show as well. I'm sure we'll have him back soon. He messaged me on LinkedIn after he saw my Mark Chen interview and said, allow me to give a rebuttal. I said, all right, send me something. We'll read it on the show. I haven't got anything back, but I will read a LinkedIn post from him and we can discuss it. So he says OpenAI is in serious trouble. They still have the brand name, a lot of data, and tons of mostly unpaid users. But GPT 4.5 is hugely expensive. Even so, it offers no decisive advantage over competitors. Zero moat. Scaling hasn't gotten them to AGI. The GPT5 project was a failure. There is already starting to be an "Is that all they have?" reaction, including from some people who've said they've had to adjust their predictions of when we hit AGI. He said DeepSeek led to a price war that cuts potential profits. There is still no killer app. OpenAI is still losing money on every prompt. A bunch of investment turns to debt if they can't make the transition from the nonprofit fast enough, and Elon has perhaps upped the cost. Many, many top people have left. Some have started serious competitors with similar IP. Because OpenAI's burn rate is so high, they have limited runway. Microsoft no longer fully has their back. Altman's credibility has diminished. Sora went nowhere. Whatever lead they had two years ago has been squandered. And if Masa changes his mind, they will have a serious cash problem. And Elon is right that they don't have all the money for Stargate. Man. What do you think in response to that list?
Ranjan Roy
What takes Ed Zitron about 5,000 words to write, I think Gary Marcus did in about one tweet. Actually, that reminded me that Sora exists, which I've played with. Have you used it recently? I have not. The text-to-video or image-to-video model, that one definitely went nowhere. It could have been a good product demonstration. I think overall they are making this bet. It's what we keep talking about. But GPT5 has to be "oh my God, this solves everything." Like, this is where there are no hallucinations, it's reasoning, it's a huge foundation model, it's relatively low cost somehow. I think the way they're positioning their entire business is that it's going to be the silver bullet to everything. Otherwise, I really don't see it. Again, the zero moat part of it you're seeing more and more, which to me is maybe why they push so hard on these constant model releases. They have to stay relevant, because the moment they're just an API in the background, then you're the most commoditized thing imaginable, and then that will kill you anyway. So that's a pretty compelling case right there.
Alex Kantrowitz
Yeah. And there's one point that I should make here. I did speak about one more point with Mark. I spoke to him about starting and stopping, and he said that's a normal part of training any model. But if you're starting and stopping on a model that's this expensive to make, then your costs go way up. So I think Gary is right that the errors, or whatever, the changes, the tweaks that you have to make become very expensive tweaks when you're starting to work on projects this size.
Ranjan Roy
Yeah, the cost of the model training. And we're going to get into how Anthropic's new Claude was supposedly much less expensive. DeepSeek, we know, whether it was 6 million or 60 million or whatever it was, was significantly less expensive. I think overall you have one side of the industry showing us that it actually can be cheaper and cheaper and cheaper. But then those with the best interest, I mean, OpenAI's competitive advantage could be talent, to an extent. Even though a lot of talent has left, they have a pretty deep bench that's pretty impressive. Or it could be resources, cash, and access to compute. So they almost have to make that their game, because if that's not the game, they're not going to win.
Alex Kantrowitz
Yep. So, talking about the competition, you teased Anthropic. We have so much more to talk about, including the new Anthropic model, what Jensen Huang has said in Nvidia earnings about how expensive reasoning is, and of course the new Alexa. We're going to do that right after this.
Ranjan Roy
Hey you, I'm Andrew Seaman. Do you want a new job, or do you want to move forward in your career? Well, you should listen to my weekly show called Get Hired with Andrew Seaman. We talk about it all, and it's waiting for you. Yes, you, wherever you get your podcasts. Struggling to meet the increasing demands of your customers? With Agentforce and Salesforce Data Cloud, you can deploy AI agents that free up your team's time to focus more on building customer relationships and less on repetitive low-value tasks. That's because Data Cloud brings all your customer data to Agentforce no matter where it lives, resulting in agents that deeply understand your customer and act without assistance. This is what AI was meant to be. Get started at salesforce.com/data.
Alex Kantrowitz
We're back here on Big Technology Podcast Friday Edition, talking about all the latest AI and tech news, including the fact that Anthropic has a new model, Jensen Huang has a stance on how much compute reasoning uses, and the new Alexa. And by the way, Skype is dead. So let's see if we can get to all of that in the second half. The first thing is that GPT 4.5 wasn't the only model. Here we have Anthropic's Claude 3.7 Sonnet. It's here. This is from TechCrunch: Anthropic is releasing a new frontier AI model called Claude 3.7 Sonnet, which the company designed to think about questions for as long as users want it to. So, like we've been talking about, it's a hybrid AI reasoning model, a single model that can give both real-time answers and more considered, thought-out answers to questions. And you just choose: do you want the quick response or do you want the thinking response? And the model represents Anthropic's broader effort to simplify the user experience around its AI products. We're longtime Claude heads, I would say, on this show. I've gotten a chance to use it, you've gotten a chance to use it. I believe the thinking toggle that we talked about is pretty good. It's almost as good as DeepSeek's. What is your response to the fact that we have another model from Anthropic, and that we went not from 3.5 to 4, but from 3.5 to an incrementally better 3.7?
Ranjan Roy
What about 3.6? I was waiting for 3.6. That was going to be the big one, but we just skipped it. We skipped right ahead to 3.7.
Alex Kantrowitz
On to 3.7, baby.
Ranjan Roy
As a Claude head, I've been using 3.7 regularly. Again, from the model side, the thinking toggle mode, which I'll still categorize a bit as product (maybe that one lives between product and model), is good. Claude Code is definitely a very new offering from them, and I think it's going to be very interesting, because coding to me is still the most monetizable, direct, actually productive use case for generative AI as of today. So I think the way they approach this is kind of how I want these model launches to be approached. There's a blog post, there's some tweets, you know, there might be an explainer video here and there, and that's it. And we keep getting improvements as we wait for 3.9, maybe not 4.0, because that's AGI.
Alex Kantrowitz
4.1.
Ranjan Roy
4.0 is AGI.
Alex Kantrowitz
So yeah, I will say. We're going to get into how they trained the model because it's interesting. But I will say I did an experiment this morning. I think I've mentioned this on the show: I've been using Claude every day as a diet coach, where I basically write down the meals I've had and weigh in. It will give me a letter grade for how I did, based on the prompt that I gave it about the way that I want to be eating, and it will count up the calories and grade the foods. It's very good and it has lots of memory. And so I copied the history, which goes back probably a month and a half at this point in the latest chat, and dropped it into OpenAI's GPT 4.5, Claude 3.7 reasoning, and DeepSeek. And unfortunately I'm here to report that DeepSeek did the best job of all of them.
Ranjan Roy
You didn't try Grok and have it yell at you and make fun of you for...
Alex Kantrowitz
I'm good on that. Thank you, though.
Ranjan Roy
I think, see, to me, I want that as its own standalone benchmark. The Alex "what did I eat" benchmark. That is the leading benchmark for all frontier models going forward. I mean, that's the real-life stuff that's actually interesting to me. I do that very regularly. I'll have three tabs open, try the same question across three, and see where I get a good answer. Those are the use cases, and I think that's the way everyone, all of our listeners, should approach these models. Try different models and try the same question. Just see what happens and see what you like better. I think that's the real way to decide what's really happening in terms of progress here, versus the more theoretical stuff.
Alex Kantrowitz
Can I just say, one of the big takeaways for me this week is just that reasoning is just freaking unbelievable. Like, it's a true breakthrough, and when you use those models you just get better stuff. And that to me has been sort of discounted, I think, in some of the conversation here, but not on our show. I think we've always talked highly of reasoning. But in the broader conversation, the "we've hit a wall, OpenAI is toast" talk, reasoning is both useful and better.
Ranjan Roy
I'll agree it's better, but it's better for many things, not all things. Again, simple back-and-forth queries, or "analyze this text," something where all the information is right there in front of you and doesn't require a great deal of complexity: you don't need reasoning for that, and it's more expensive, more complicated, and more time consuming. So I agree, I'm still wowed. But there's also the UI element of it. DeepSeek listing out the questions in the chain of thought as they come up was kind of, using the term "party trick" again, just a UI feature that makes it so much more real, and now everyone's doing it. It's amazing how quickly. Sometimes it's almost annoying now on ChatGPT, where it starts walking me through what it's doing and how it's thinking when I actually don't want it to. When I'm like, I'm good, I'm good, just give me an answer, I'll switch to another tab and come back.
Alex Kantrowitz
But it is like your very talkative friend who's like, let me tell you exactly how I got to this. And you're like, nah, it's good. We're just going to go with your answer.
Ranjan Roy
I'm going to start on this tab and then I'm going to go here. It's almost like I want the log afterwards, to go back to if something's wrong. But we've talked about this before. The problem that remains is that if something is broken in that reasoning process, you can't simply fix it. It's not like I can go back and say, okay, on step three of eight, I would rather you have done this than this. That does not exist yet. So at that point the show of reasoning is nice, but you can't actually utilize it in any meaningful way.
Alex Kantrowitz
That's fair. So Ranjan, I want you to talk a little bit about this cost efficiency that Anthropic seems to have found in training 3.7 because I think that's pretty significant when we think about how these businesses will operate and whether they need to spend as much money as they are training their latest models.
Ranjan Roy
So on the cost side, 3.7 Sonnet apparently cost just a few tens of millions of dollars to train. We already talked about DeepSeek. I think again it goes back to showing what the real costs involved are. There's gathering up some large amount of data. If it's a reasoning model, there's a supervised fine-tuning side of it. There's a reinforcement learning side of it that could involve bringing in lots of humans. And again, that literally is: what is the correct way to get to this answer? Is the answer correct? Rank these outcomes. And actually going through that hundreds, thousands, tens of thousands of times and training the model that way. Obviously that's time intensive and it's expensive. But I think it's important to recognize that even Anthropic, which has kind of been in the whole big models, expensive models game so far, is moving towards this. It almost means, I guess, that OpenAI is probably the only player left that's still kind of trying to sell the idea that you need big, expensive models to win.
Alex Kantrowitz
So then, what do you think about this comment from Jensen where he talks about how now we're going to go to reasoning and inference and that's going to be more expensive? Nvidia earnings came out this week. They had revenue jump 78% from a year earlier to $39.33 billion in the quarter. They're projecting $43 billion in the next quarter. They delivered $11 billion of their Blackwell chips. So life is good for Nvidia. But everyone's trying to get a sense of how their business is going to look if we get more efficient, if we go toward these reasoning models. And this is a very interesting statement from Jensen Huang, where he says AI has to do a hundred times more computation now than when ChatGPT was released. He's basically talking about how the reasoning approaches are more expensive. Next-generation AI will need a hundred times more compute than older models as a result of new reasoning approaches that think about how to best answer questions step by step. The amount of computation necessary to do that reasoning process is 100 times more than what we used to do. So it is interesting to me, because you look at what DeepSeek did, and they found a way to not only do reasoning but do it more efficiently. And Jensen is saying this thing that seems to disagree with that a little bit. Well, I'm curious what you think, Ranjan.
Ranjan Roy
I mean, never to speak ill of Jensen Huang, I think he's saying what he needs to say. I mean, if the thesis that things are going to get much cheaper and require less compute holds, we could have the Jevons paradox, which I haven't heard in a little while, but we all heard about it that one week. Again, the idea is that the more ubiquitous AI gets because it's cheaper, the more aggregate compute it would actually require. But, I mean, it still feels like Nvidia has to tell that story. And again, the company blew out numbers again. And even though it's getting caught up in the larger stock market rout as of today, this is still an insane company in terms of its ability to produce and deliver. It still hurts their longer-term story, at least with the expectations that have been set by the market.
Alex Kantrowitz
Yeah, I mean, it's just one of those things where I'm like, I see his logic and I see where he's going, but I don't really see how. I mean, yes, they've talked about how inference is 40% of their revenue, but I just don't really see how it's going to cost 100 times more to do reasoning. Maybe I'm missing something.
Ranjan Roy
No, I think it's very difficult to try to calculate out, because I guess the more complex the use cases get, maybe we'll start unlocking use cases that we haven't even imagined, or AI is going to be applied to areas where we haven't even started, and those will be the ones that really soak up all that compute. But I agree with you that the idea that it's going to require 100 times more compute, especially as the trend is that everything's getting cheaper, doesn't make sense to me either.
Alex Kantrowitz
Let me ask you this one thing that I saw from earnings and I'm curious if you think that it's right. I mean the fact that they, they shipped 11 billion in Blackwell chips, the expectation was like three and a half billion. So clearly there's a huge amount of demand for the Blackwell chips, which are the latest generation of Nvidia chips. All the hyperscalers are saying we'll take as much as we can get. Including Andy Jassy at the Alexa event this week. Does that show that there, there's already enough tangible process. Sorry, does that show that there's already enough tangible progress within AI that merits this further investment of chips? Or do you think we're just still in the finding out phase?
Ranjan Roy
You never want to be in the find-out phase, I feel, because I think we all know what happens after. But no, I think from the hyperscaler side, no one backed down. Actually, Microsoft seemed to hedge a bit, and I believe there's some reporting that they're canceling some data center leases. But overall the hyperscalers are playing the same game: it's an arms race for compute, and we're going to continue down this road. And we're going to get into the Alexa event. Maybe it does start to seem like, the more complex Alexa gets, if every single person who has an Alexa is actually actively engaged all day with Alexa+, then you start to see that, okay, it's going to require a lot of compute. So if it really lands in the way that it's being promised to, maybe that does make sense. But I think as of today, all the hyperscalers are taking the exact same bet.
Alex Kantrowitz
Right. So we're still in the like scale up infrastructure and maybe this will work. Not in the, this is working enough that we're going to keep investing.
Ranjan Roy
Exactly.
Alex Kantrowitz
So I think that, I mean, I'm still bullish on Nvidia. But I think if you take this kind of dubious proclamation about reasoning being much more expensive, combined with the fact that, yes, they're still ordering, but, you know, there's a big "if" at the end of the tunnel, I do wonder a little bit if there's a potential nasty surprise for Nvidia coming in a couple years.
Ranjan Roy
No one ever won that prediction in the last few years at least. But I'm not disagreeing with you, but it's one of those things that it's almost, I'm almost like fearful of saying out loud.
Alex Kantrowitz
Yep, I'm sure I'll eat the words there. And maybe Mark Zuckerberg will be the one that continues to keep Nvidia running, taking all that ad money and pushing it right into this chip and server company. Right. Because now Facebook is going to potentially spin off a Meta AI app in an effort to compete with OpenAI's ChatGPT. This is according to CNBC. Meta AI will soon become one of the social media company's standalone apps, joining Facebook, Instagram and WhatsApp. The company intends to debut a Meta AI standalone app during the second quarter. Of course, they're going to have all the app install power that you have on Facebook to get people to use it, via their ad slots and new slots they're going to put in. And Mark Zuckerberg is really intent on basically taking over OpenAI's lead with ChatGPT. We've talked about it in the past. He sees this as a big consumer app. He sees that it's growing fast and he doesn't want somebody else to do it. Same way that he sort of cut off Snapchat, and did it with some success against TikTok with Reels. Very funny response from Sam Altman when he sees this. He says, okay, fine, maybe we'll do a social app. He says if Facebook tries to come at us and we just Uno reverse them, it would be so funny. I mean, I think you get that response from Sam Altman when he's not completely sure of his footing. And I don't really feel that he's sure of his footing on this one, because you don't want to go up against Facebook when it comes to a consumer app. It doesn't usually end well. So what do you think, Ranjan?
Ranjan Roy
I think that, actually, it's been so long since I played Uno, I had to look up what Uno Reverse was first.
Alex Kantrowitz
But I was, like, so surprised. Like, what are you? Uno Reverse? Who speaks like that?
Ranjan Roy
But anyway, Sam Altman speaks like that, the same guy who's coming up with everyday queries for our AI benchmarks. But I thought this was a very interesting one, because I have been using Meta AI more for image generation just because it's very easily accessible. It's still in a weird place, because it lives in the search bar for Instagram and WhatsApp or Facebook. And with it living in the search bar, I've also accidentally used it when I'm searching for something on Instagram and somehow it pushes me towards Meta AI and gives me a weird chatbot response. So I think spinning it out is a very interesting idea. And being Meta, they would be incredible at quietly guiding people towards that app from all of their other apps. But still, is it needed? Are there other ways to integrate it, more as a tab in existing Facebook Blue and Instagram itself? I would think it would just be another tab on the regular app versus having people download something. But I see this one as another Threads. They'll get some big numbers, but I don't think it's going to be anything too impactful.
Alex Kantrowitz
Yeah, I don't think it's going to work. I would like to see an OpenAI social network and, not for nothing, but Open Face is out there for the taking.
Ranjan Roy
The AI first social network where all of your posts are commented on extensively, where you have a million friends who are all fawning and love you very much. I think they could go down that road.
Alex Kantrowitz
Open Face, Open Face. I'd be a day one user.
Ranjan Roy
Social networking needs like an entire remake and maybe Sam Altman is the one to bring that to us.
Alex Kantrowitz
I mean, if anybody can do it, Maybe it is OpenAI. They're the best product in AI. So you don't even need a better model for that. I'll give you that, Ranjan.
Ranjan Roy
That's the product that we've all been waiting for.
Alex Kantrowitz
So we've got about 10 minutes left, and I've saved this for last, and I don't want to spend too much time on it because I am going to have a podcast next week covering it. But the new Alexa is out. Or it's not out; it has been introduced, the new Alexa revamp. It's called Alexa+. It is conversational. It is able to accomplish things in the real world. It seems to have an awareness of what happens across your Amazon services. So you can ask it to play a song and then say, all right, can you take me to the point in this movie that song is in, and it will do that based off of Amazon Prime Music and Video. It will search your Ring cameras for you. It will potentially order you an Uber. If you have Echos, you can use it to control the sound in your apartment in conversational tones, like, can we have this song play in that room? Or, I want to hear it over there. It was a very impressive demo, I felt, and it was live, unlike Apple Intelligence, where Apple Intelligence was a promise. And it seems like a lot of this Alexa stuff is going to work. So I do want to preface this: we're going to have Panos Panay along with Daniel Rausch, the head of Alexa. So it's going to be a fun conversation that's coming up on Wednesday. You're going to hear a lot more about that. But Ranjan, I'm very curious what your reaction was. In our chat, I think in the Discord, you dropped this Gurman tweet where he talked about how this was ChatGPT voice on steroids, and Apple has to be embarrassed at this point for how bad Apple Intelligence is. But what was your takeaway, looking at the Amazon news? And I'm just gonna say, if you want to, you can say what it might mean for Siri.
Ranjan Roy
Oh man. I spent so much time rewiring my entire House for HomePods and I watched that event and I want to just go back, I want to switch all to Alexa and I'm sure we'll definitely talk about it more after your next episode. But like, it looked good. It looked exactly like what it should be. It looked like putting ChatGPT voice mode on a device or Gemini voice on a device or just what basic voice interaction should do right now. And I'm avoiding hitting my table right now. So there's feedback on the mic because.
Alex Kantrowitz
You can tell I'm working on do it this time.
Ranjan Roy
Just telling me, sorry listeners. Oh my God. That's all it should be doing right now. And it did what it's supposed to do. And voice, we know generative AI voice is that good right now. I think the only thing that could be a little problematic for Alexa is that Amazon does not have a great reputation in terms of privacy, or just overall, like, it can still be a little creepy. The reason I got rid of my Alexas was that they would always ask these follow-up questions, which you could not turn off. You'd be like, what's the weather? Oh, here's the weather, and can I interest you in these three other things? And you couldn't turn it off. So I think, I mean, the way they portray it, it becomes your really trusted companion that you're sitting there sharing yourself with. That's a big ask in terms of trust. So in terms of the technology, I'm pretty confident they're there. In terms of getting people actually comfortable with interacting with a voice device in that way, we'll see. But oh man, it's going to cost me a lot of money.
Alex Kantrowitz
So, yeah, we did talk about it yesterday. I'll just give a quick preview of what this is going to look like. They are aware that talking back to you and being proactive can be pretty interruptive and annoying. And I think they're paying attention to that as they roll this out. So that'll be at the end of the conversation, for those that listen. But yeah, I thought it was really interesting. I think Amazon has a shot here. I wrote about this in Big Technology: basically all big tech companies want to build a universal, contextually aware assistant that helps you get things done. And Amazon has a pretty good shot to be the one that pulls it off, especially because they have a working demo of this and it seems like it's going to go live next month. And I don't know. I mean, they don't have a mobile operating system. On one hand, that's a curse, because that default matters a lot. We know Google pays Apple $20 billion a year to be the default search engine on the iPhone. However, that does let them use other productivity services and not privilege their default. And I think these companies privileging their default productivity services has been sort of the downfall of the modern AI assistant. Like, if I'm using an iPhone and I can't use Google Calendar or Gmail in there because Apple's so dedicated to, whatever, Apple Mail, that ruins Siri for me. But Amazon doesn't have that problem. And so I was actually speaking with the head of Prime, who was mentioning that, yeah, I use my Google Calendar on my Echo devices and it works just as well. So that could be a blessing for them.
Ranjan Roy
Yeah, I 100% agree. Though, speaking of Prime, the way they rolled out the pricing I think was the most, like, savage and amazing Amazon move ever. Again, I forget what the monthly subscription is.
Alex Kantrowitz
Is it $19.99? $19.99, $14.99?
Ranjan Roy
Let me see, it's something that most people would not pay as of today, but as an Amazon Prime member, you get it for free. So it just kind of assigns this additional benefit to being a Prime member, and if you're a Prime member, you shop on Amazon X percent more. So they're going to kind of assign this incredible value to it on day one, and you're going to feel like, oh, well, now if I was questioning whether I should renew my Prime membership, well, it's a gimme. I mean, I'm getting Alexa for free.
Alex Kantrowitz
Ranjan, listen to this. Alexa+ costs $19.99 a month. Prime costs $14.99 a month. Oh, wait.
Ranjan Roy
Oh wait, I thought it was free for Prime.
Alex Kantrowitz
No, it is. If you have Prime, it's free. So basically, you could pay an extra $5 to just get Alexa+, or $5 less to get all of Prime and Alexa+. Okay, that is savage.
Ranjan Roy
That is savage. I mean, if Lina Khan was still around, I don't know what she would say, but my goodness, that is not okay.
Alex Kantrowitz
Before we go, we need to talk about Skype. Microsoft is killing Skype. This is hot off the presses and makes me very sad. This is from TechCrunch: After kickstarting the market for making calls over the Internet 23 years ago, Skype is closing down. Microsoft, which acquired the messaging and calling app 14 years ago, said it will be retiring it from active duty on May 5th to double down on Teams. Skype users have 10 weeks to decide what they want to do with their account. It's not clear how many people will be impacted. The most recent numbers that Microsoft shared were in 2023, when it said it had 36 million users, a long way from Skype's peak of 300 million users. We do, you know, look at tech with a critical, sometimes hopeful eye. I will say that Skype is one of the products that I've loved the most on the Internet. I just have good feelings about it helping me make international calls and calls to friends. And you could play different games on it back in the day. And that little squeak that it makes when you get a message will forever remain in my heart. Rest in peace, Skype. We bury you the week after the Humane pin goes the way of the Neanderthals. And I'm much sadder about losing you than the wearable AI device.
Ranjan Roy
I had my first remote job interview on Skype. I agree. International calls, it opened up the world. Sold for $8.5 billion to Microsoft in 2011. And sorry you had to get caught up in a corporate battle with Microsoft Teams, which clearly won. The last few times I ended up on Skype, I had all these messages that were clearly phishing and scam things, just barraging my Skype account, and I did not open it after. Goodbye, Skype.
Alex Kantrowitz
Goodbye, Skype, and goodbye to all of you. But hopefully just for a couple days, because I'll be back on Wednesday with those two Amazon executives, and Ranjan and I will be back on Friday. Ranjan, thanks so much for coming on the show.
Ranjan Roy
See you next week.
Alex Kantrowitz
All right, everybody, thank you for listening. And we'll see you next time on Big Technology Podcast.
Big Technology Podcast Episode Summary
Host: Alex Kantrowitz
Guest: Ranjan Roy
Release Date: February 28, 2025
Episode Title: OpenAI’s New Model, Jensen’s Bold Claim, Alexa+ Is Here
In this episode of the Big Technology Podcast, host Alex Kantrowitz delves into the latest advancements and shifts in the technology landscape. The discussion centers around OpenAI’s release of GPT-4.5, Anthropic’s new model Claude 3.7 Sonnet, NVIDIA CEO Jensen Huang’s assertions about AI compute demands, Amazon’s introduction of Alexa+, and the retirement of Skype by Microsoft. Joined by guest Ranjan Roy, the conversation navigates the implications of these developments for the future of generative AI and the broader tech ecosystem.
Overview: OpenAI unveiled GPT-4.5, touted as the largest and most computationally efficient large language model (LLM) to date. Initially available as a research preview for ChatGPT Pro users, the release has sparked mixed reactions within the tech community.
Discussion Highlights:
Scaling vs. Product: The debate centers on whether advancements are driven by model scaling or by developing tangible products. Ranjan expresses skepticism about the excitement surrounding GPT-4.5, likening it to a "hamster wheel" of model releases without significant breakthroughs.
Endorsements and Criticisms: While OpenAI’s Mark Chen emphasizes scaling as a pathway to enhanced AI capabilities, critics like Andrej Karpathy and Gary Marcus argue that improvements may be more about behavioral adjustments than genuine capability enhancements.
Overview: Anthropic announces Claude 3.7 Sonnet, a hybrid reasoning model that lets users toggle between real-time answers and more considered, thought-out responses.
Discussion Highlights:
Comparison with GPT-4.5: While GPT-4.5 focuses on scaling and computational efficiency, Claude 3.7 emphasizes user control over the depth of responses, positioning itself as a more versatile tool for varied user needs.
Cost Efficiency: Claude 3.7’s training reportedly cost “just a few tens of millions of dollars,” indicating a move towards more cost-effective model development without sacrificing functionality.
Insights: Anthropic’s approach with Claude 3.7 Sonnet highlights a strategic pivot towards enhancing user interaction and experience, contrasting with OpenAI’s scaling-focused advancements.
Overview: NVIDIA reported a substantial revenue increase, driven by high demand for their latest Blackwell chips. CEO Jensen Huang made significant statements regarding the computational demands of next-generation AI.
Discussion Highlights:
Compute Efficiency Debate: Ranjan and Alex express uncertainty over Huang’s claim, noting contrary evidence from companies like DeepSeek that are achieving efficient reasoning without exorbitant compute costs.
Market Implications: The high demand for NVIDIA’s chips underscores the ongoing investment in AI infrastructure, despite debates over the actual computational requirements.
Insights: NVIDIA’s statements reflect a bullish outlook on AI compute needs, but the reality may be more nuanced as advancements in model efficiency continue to evolve.
Overview: Meta (formerly Facebook) plans to launch a standalone AI application powered by its Meta AI division, aiming to compete directly with offerings like ChatGPT.
Discussion Highlights:
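User Experience: Ranjan says he has been using Meta AI mostly for image generation because it is easy to reach, but notes its awkward placement in the search bar of Instagram, WhatsApp, and Facebook, where he has triggered it by accident.
Market Positioning: Meta can use its ad slots and existing apps to drive installs of the standalone app. Ranjan questions whether a separate download is needed rather than a tab in the existing apps, and compares the likely outcome to Threads: big numbers, limited impact.
Insights: Meta's entry into the standalone AI assistant market leverages the distribution of its existing ecosystem. Alex notes that going up against Meta on a consumer app rarely ends well, though both hosts doubt the standalone app will have much impact.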
Overview: Amazon introduces Alexa+, a revamped version of its virtual assistant, enhancing conversational capabilities and real-world task execution.
Key Points:
Enhanced Functionality: Alexa+ can perform complex tasks, such as controlling smart home devices in specific rooms, integrating with Amazon Prime services, and performing contextual actions based on user commands.
User Interaction: Designed to be more proactive and conversational, Alexa+ aims to provide a more seamless and intuitive user experience.
Discussion Highlights:
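Pricing Strategy: Alexa+ costs $19.99 a month on its own but is free for Prime members, and Prime itself costs $14.99 a month, so the bundle is $5 cheaper than the standalone subscription. The hosts call the move "savage," since it adds perceived value to Prime on day one.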
Privacy Concerns: Ranjan raises issues about Amazon’s reputation regarding privacy and the intrusive nature of proactive AI assistants.
Insights: Alexa+ represents Amazon’s strategic move to solidify its dominance in the smart home market by offering advanced AI capabilities bundled with its Prime service, though not without facing potential privacy backlash.
Overview: Microsoft announces the retirement of Skype, marking the end of an era for one of the earliest internet-based communication platforms.
Key Points:
User Base Decline: From a peak of 300 million users, Skype’s active users have dwindled to 36 million by 2023.
Strategic Shift: Microsoft aims to focus resources on its Teams platform, which has overshadowed Skype in the enterprise communication space.
Discussion Highlights:
Emotional Impact: Both hosts express nostalgia and sadness over Skype’s closure, reflecting on its historical significance in global communication.
Spam and Phishing: Ranjan recalls that the last few times he opened Skype, his account was flooded with phishing and scam messages, after which he stopped using it.
Insights: Skype’s shutdown underscores the rapid evolution of communication tools and the challenges legacy platforms face in adapting to changing user needs and security standards.
This episode of the Big Technology Podcast offers a comprehensive analysis of significant developments in the AI and technology sectors. From OpenAI’s incremental yet debated advancements with GPT-4.5 to Anthropic’s user-centric Claude 3.7 Sonnet, the conversation highlights the diverse approaches companies are taking to push the boundaries of generative AI. NVIDIA’s optimistic outlook on AI compute demands and Amazon’s strategic launch of Alexa+ illustrate the dynamic interplay between technological innovation and market strategies. Additionally, the sentimental farewell to Skype serves as a reminder of the ever-changing landscape of digital communication tools. As the hosts wrap up, they anticipate further discussions with industry leaders, promising listeners deeper insights into the future of technology.
Notable Quotes with Timestamps:
Ranjan Roy [04:07]: "GPT 4.5 is probably the least interesting model release from OpenAI to date."
Jensen Huang [41:32]: "AI has to do a hundred times more computation now than when ChatGPT was released."
Gary Marcus [28:28]: "OpenAI is in serious trouble. They still have the brand name, a lot of data, and tons of mostly unpaid users, but GPT 4.5 is usually expensive."
Sam Altman [06:00]: "GPT 4.5 is ready. The good news, it's the first model that feels like talking to a thoughtful person. The bad news is it's giant, it's expensive."