
Alex
GPT5 is here, finally. Does it live up to the hype? That's coming up on a special Big Technology Podcast Friday Edition, right after this. Welcome to Big Technology Podcast Friday Edition, where we break down the news in our traditional cool-headed and nuanced format. You know what we're going to be talking about today, because GPT5 has finally been released by OpenAI. Of course, we had OpenAI COO Brad Lightcap on the show just a few hours ago, so that episode will be the most recent one in your podcast feed, where you're going to get the official line from OpenAI and a bunch of really interesting insights about what it took to train this model and where the AI field is going. But today, as we always do on Friday, Ranjan Roy and I will break down exactly what the situation is with this new model and whether it actually lives up to the hype that people have been talking about. No overreactions. We're going to do it with the proper context. And with that, I want to welcome Ranjan back to the show. Ranjan, welcome.
Ranjan Roy
Happy AGI Day, Alex. Is it here?
Alex
No, it's not here.
Ranjan Roy
Is it here?
Alex
I lost the bet. I lost the bet. Look, Sam Altman said that GPT5 is smarter than almost everything, every single thing a human does. So I thought, okay, fine, you know, we're finally going to see AGI. But turns out, no, no AGI. And we will talk about that in the middle, but first we have a little interesting announcement. Let's hear it.
Ranjan Roy
Yeah. In addition to writing the Margins newsletter, I've actually been working at a company called Writer, writer.com. It's an enterprise generative AI startup, and I'm leading the vertical for the retail industry. I wanted to bring that up today because what GPT5 means for me, I think, is heavily informed by a lot of the work I've been doing. And I think it might be a little AGI-ish.
Alex
I mean, it's amazing that you go to an AI company and the first sentence out of your mouth is, yep, we have AGI here. But no, no, it's okay. I anticipate you'll come in with a level head. So let's talk about GPT5. Okay, so this is from TechCrunch. No, sorry, this is from the Verge: GPT5 is being released to all ChatGPT users. It says OpenAI is releasing GPT5, its new flagship model, to all ChatGPT users and developers. OpenAI says that GPT5 is smarter, faster, and less likely to give inaccurate responses. Sam Altman, on this media call that I was on, had a very interesting description of what it is. He says GPT3 sort of felt like talking to a high school student. You could ask a question, and maybe you'd get a right answer or maybe you'd get something crazy. GPT4 felt like talking to a college student. And GPT5 is the first time that it really feels like talking to a PhD-level expert. What do you think about the significance of this, and what do you think about this framework that Altman is setting up for the intelligence that we're seeing within the model?
Ranjan Roy
I don't like the framework, I think. And again, I'll get into why I think this is exciting, but it's still weird to me when people in the kind of industry that advocates for dropping out of college to start a startup always lean back on high school student, college student, PhD student as the framework for intelligence. And the other part of it is, I don't want PhD-level work for most of the things I'm asking. I actually want it grounded. Sometimes you want it to be cool, which maybe PhD students are and are not.
Alex
No offense, but, you know, you're alienating a segment. I'm sorry, I...
Ranjan Roy
We do have a lot of very smart, educated listeners. But, you know, sometimes you want it to be cool, sometimes you want it to be funny. To me, that's not the framework I think is good for intelligence. We've talked about this a lot: the ARC AGI test has a segment around everyday queries. I've dug, and I've asked as many people as I can. No one has been able to explain to me what those everyday queries are. To me, answering those correctly and well, across multiple data sets and multiple tools, that's actually intelligence, doing that kind of work.
Alex
Right. And you know, I kind of cherry picked it out of the remarks, but it is interesting to me and this is something that came up with Lightcap as well, that it's not just making this model smarter that has been the sort of star in this story. It's all these different other elements of it. And it seems to me that it's, it's possible that the models have reached this level of intelligence where you start to spread out into different capabilities of them. Like Tool calling, like the way that you structure the experience. And that is where you start to see the gains and the lift in terms of the way that people can use this. So maybe today on the release of GPT5 or this week while GPT5 is released, I go from being a model person to being a product person. Well, no, no, no, I'm kidding, of course. But go ahead, Ranjan.
Ranjan Roy
Yeah, the intelligence is in the model for which product to choose. So that's not a product decision. That is a. That's a model strength. And so, so. Oh, my God, am I becoming a model guy now?
Alex
I think, I think we are zeroing in on the better the model, the better the product here.
Ranjan Roy
In the end, it all comes together.
Alex
You know, it all comes together in the middle.
Ranjan Roy
Nuance in the middle. No, but. Okay, so for.
Alex
But it is. But let me. I just want to say one more thing. We have been going this direction for a while, right? Like, that does seem like, you know, I've. So for new listeners, I've strongly said the most important thing in AI is the model. Ranjan has strongly said the most important thing in AI is how you productize this model. And it just turns out that better models do make better products. And we're starting to get to that point where we're starting to see the results.
Ranjan Roy
Yeah, I think... Okay, so I will actually admit error in two big areas. Get ready, Alex. Listeners can't see it, but Alex is smiling here. So the first is, again, the intelligence of the model to choose the right tool or product. We're going to get into what that means and why I think that's incredibly important, and why GPT5, by bringing all these different models that they have into one switcher, just one model that understands, is actually what I believe is the most significant breakthrough. And the second area is directly related to that. I think it was five or six months ago that we debated this heavily, and I said that users should choose the right model for the right job, and that taking that away would make the models and the experience worse. And you kept coming back. I think it was when maybe Claude had condensed everything into one picker, or GPT, and everyone was kind of making fun of it. But the question was whether the user should choose the best model for the task at hand, which is how it's been for a while. I had thought that was the best way for products to be rolled out, and I've completely reversed on that. And this is the right example of why that's important.
Alex
So let me set this up, and then we can talk about it, because I might have flip-flopped to the other side of this. It's great to have these releases because we can sort of test our long-held beliefs and see if they make sense anymore. And it seems like both of us are saying, well, maybe not. So this is from the Verge article: GPT5 is presented inside ChatGPT as just one model, not a regular model and a separate reasoning model. Behind the scenes, GPT5 uses a router that OpenAI developed, which automatically switches to a reasoning version for more complex queries, or if you tell it to think hard. I'm going to take the other side of this. I used to think that, yeah, it should be seamless, and the model should just choose for you when it makes sense to think and when it doesn't. But I've been using OpenAI's o3 model, and that is a very heavy reasoner. It thinks a lot. And personally, I've just felt that that model has been better not only than every other OpenAI model, but every AI model under the sun. And so I don't like the idea of giving that decision, of whether to think or not, back over to the platforms. This is actually something that I'm not excited about with GPT5. So you make the case for why it's good.
Ranjan Roy
You want the agency to choose your own model, Alex. I get it. Free will. Free will with models.
Alex
But also, yeah, I just happen to think that I don't really want to use the non-thinking models. Only for the most basic queries do I want to use those non-thinking models. All the other times, I want to use the most intelligent models, and the most intelligent models reason or think.
Ranjan Roy
All right, well, so here is what colors my thinking. About two months ago at Writer, I started testing a new product, which was released publicly a few days ago, called Action Agent. And basically the most intelligent part of the foundation model, which is our own foundation model, is tool calling. So there are hundreds of different predefined tools. It's not just that if you want to generate an image or edit an image, it'll call different tools. If you want to connect to a Salesforce instance, or if you want to analyze a CSV versus an Excel file, it'll call different Python libraries. Having those kinds of base foundational needs defined is the intelligence. And then, just from a simple prompt, knowing where to go and what to do. The more I used that, the more it felt kind of AGI. It's like, wait, it's doing really smart things across all these different tools and systems and actually getting things done. You know, when I do a deep research query on Gemini, I get a 30-page paper that I don't read, versus: can you actually do stuff? And that was the first time I really started seeing that. That's what really pushes me to this idea of being able to have a toolkit and know what to do. Because even right now in the demos, I think he coded a language app, coded a beatbox music player thing. For each one of those, it's not just write HTML and CSS. It has to call different libraries, it has to install different Python dependencies. There's a lot of intelligence just in knowing what to do there to get to the right end result. And to me, that really is intelligence. It's like being a good software developer, again. Just knowing where to go, being a good researcher, knowing what to look for, is as important as how smart you are.
Alex
So you would say OpenAI using this switcher is sort of, it's pointing towards the future of where this is all heading, where it's no longer like the best models will no longer rely on us to necessarily guide them. They will have an intuitive sense of where to go and they will go.
Ranjan Roy
Exactly. And that's what it felt like. And again, you called me out: a few months at an AI startup and now I'm saying AGI, I'm feeling it. But no, exactly that. Knowing where to go and then letting that tool do the work is actually the brilliance of this kind of architecture, versus this one large language model that can actually do all the work. For a long time, large language models were bad at calculation, right? Like calculating over large tabular sets of data. And then the big unlock was getting the LLM to write Python code or generate a SQL query to then process that data. And that's suddenly when Claude and ChatGPT and all these tools started getting actually useful for spreadsheets. Before that, they weren't. So we've already seen how that can change the way people use these tools. And with GPT5, that's the groundwork they're laying. They're saying: no more are you choosing which kind of model you're going to need. And these are just the models. We still don't really know, when you're coding that web app language learning game, what it's calling. When you're generating an image, is it DALL-E? Is it something else? We don't care. We just care that the right output is there in the end.
Alex
We'll come back to a few more of the details on GPT5, but I think this just segues perfectly into this terrific story that Ethan Mollick, the Wharton professor, wrote about GPT5, headlined, you know, fittingly, "It Just Does Stuff." And I think one of the things he brings out in this story is that people want to use AI, but they don't know what the AI can do. They don't know what tasks they want accomplished with it. Even Lightcap yesterday talked about how there's this capability overhang. And with these new agentic AIs, he says, you give it the goal, and then it solves the problem in very proactive ways and suggests things to do. So here's just one minor example that he gives, and then we'll get bigger. He says he asked GPT5 to generate 10 startup ideas for a former business school entrepreneurship professor to launch, pick the best according to some rubric, figure out what I need to do to win, and do it. So he says he got the business idea, but he also got a bunch of things that he didn't ask for: drafts of a landing page, LinkedIn ad copy, simple financials. He says, I can say confidently that, while not perfect, this was a high-quality start that would have taken a team of MBAs a couple of hours to work through. This is a model that wants to do things for you. So that's just in a chat circumstance, but basically the model is starting to test the boundaries of its capabilities by going out and attempting things that it intuits that you want and that you don't specifically ask for. And it's sort of doing away with this old idea that the career of the future is going to be the prompt engineer, and actually saying: you tell me what you need, and then I, with my own intelligence, will go ahead and do it for you.
Ranjan Roy
That's it. Like, like the example you gave is exactly the kind of stuff like, and this has happened with me as well, like you want something straightforward and suddenly sometimes the intelligence is too much again. Suddenly he's like, give me some ideas and you're getting landing page HTML and CSS and financial analyses and stuff like that. Like, like. And that is a good example of how raw this intelligence is right now that it's guessing. But it's not perfect and it's not great. But imagine if it actually knows if it does get exactly what you want. And, and in this case, maybe it is. It's like maybe he should define only stick to a number of ideas and then we'll dig in deeper. And that's the prompting side of it. But, but that's a perfect example. It's like to go do each one of those things was calling a different tool in its like tools in its tool belt. And it made those decisions and those decisions weren't perfect, but it's making them right now. And it'll get better and better.
Alex
Yeah. And I'm thinking back to my conversation with Lightcap yesterday and it's also just like I was asking him, do we need to. Do you need to keep making the model smarter? And it was basically like, I think the reason why we're at this point is because the models sort of, let's call it bookish intelligence has gotten to the point where they have a model of the way that let's say the world operates. It's not a world model in that they don't understand gravity, but they've read enough text that they get a pretty good sense as to like how people operate. And then the next question is how do you then go apply it? And that's why I was like, should you start working on continual learning and memory? Which is obviously the next sentence, the next moment. But I think was probably missing from that conversation due to my lack of questioning on it. Is that, oh yeah, this is like building what we've talked about, that scaffolding, this, these capabilities of going out and doing things that the user doesn't ask for. And in a way like intuiting it that that is what matters now and.
Ranjan Roy
That's what will feel more AGI-ish when it's good. Again, this example is kind of comical to me, because you can imagine how much content out there on the Internet about startup ideas starts with create a landing page. Every hustle bro tweet thread or blog post will probably say that. So you see why poor GPT5 is a little bit confused. But yeah, it's exactly what you said. It's that scaffolding. And imagine when it does things that surprise you, calls tools and creates things that were what you wanted and you didn't even know you wanted. That's going to be when it feels AGI-ish.
Alex
So Mollick has this great example where he tells GPT5: you are GPT5, do something very dramatic to illustrate my point, it has to fit into the next paragraph. And it writes a paragraph, a really pretty well-written paragraph, where the first letter of the first word of each sentence spells out "This is a big deal." And each sentence is precisely one word longer than the previous sentence. And each word in a sentence mostly starts with the same letter. Again, and he points this out, this is a technology that couldn't tell you how many Rs are in the word strawberry eight months ago, and now it's able to do this. It's crazy.
Ranjan Roy
Yeah, it's like thinking about the advance from that side. But again, and we'll get into the actual reception of the model right now, it's about how people start to use it, and whether they get frustrated if it creates landing page copy and LinkedIn posts that they didn't ask for. How to use a tool like this, one that can go do a lot of different types of things, is very different from using pre-agentic models. Before, it was just: okay, is it hallucinating or is it not? Did it use too many em dashes or not? Now the outputs are going to be a lot more complex, which is going to make it still a bit more difficult and rough, I think, as people start using these tools.
Alex
Definitely. And it's a different form of intelligence. It's not bookish intelligence. I wrote down the benchmarks, which we've been talking about so often. GPQA: 88.4%. AIME 2025 math: 100% when using Python. HealthBench Hard: 46.2%. And it's interesting, because Mollick says...
Ranjan Roy
What was the last one?
Alex
HealthBench Hard. I think that's a medical one. 46.2%. These are all state-of-the-art benchmarks. And Mollick says, I'm losing track of what these advances mean. All these models are improving very quickly right now, and it just goes to show you that it's almost like they've saturated. They've ingested all the Internet, all of the world's written works. They've had PhDs sit down and put their knowledge into these models, bake it in. It's almost like they've saturated book smarts, and this is a different form of intelligence that they are now learning.
Ranjan Roy
Yeah. If you think about it, and having started a new job recently as well: you're in a new place. There's one person over there who is just brilliant, sitting by themselves, who knows a ton of stuff and is just off-the-charts brilliant. Then the other person kind of knows everyone and knows what piece of information to get from where and who to talk to about what. Who do you choose to actually get something done? I think the second one, and that's the intelligence that we're talking about here: the ability to know where to go, who to ask, what to ask them.
Alex
Now let me push back on you. All right, so tool use exists. This stuff is still difficult to use within enterprises, and most of us still don't really know what to do with it. I now have, on my desktop or in a web browser, GPT5 that can call these tools, and I legitimately have no idea what I would prompt it to do that I wouldn't have used o3 for, like what actions to take. I also have the Comet browser. I can say, go ahead and do stuff for me in my browser. But is it just a lack of imagination, or is it possible that this is a cool party trick that doesn't have much practical use when it comes down to it?
Ranjan Roy
I agree, given the lack of tools that are publicly available right now, or the limitations with GPT5. Again, you're traveling right now: what are the best hotels? Which beaches should I go to? Create me an itinerary. All of that is just content generation. Go book something is, you know, the agentic we were promised by Apple and others, like, a few years ago probably. But even within a ChatGPT response, there are a lot of different things happening. I don't know, have you noticed it creates a lot more tables for you now? That's one tool.
Alex
Oh yeah.
Ranjan Roy
Which sometimes gets annoying and you didn't ask for it, but it's got to do a whole table comparison. Some. But when traveling.
Alex
Disagree. I want all of my answers in tables from now on.
Ranjan Roy
No, no, they are amazing when traveling. I was using it a ton, I mean, in Tokyo, where hotel rooms are small and expensive. I was having it look at square footage, using the web browser tool to go search web pages and another tool to extract information from those web pages: create me a table of square footage per room, knowing I have a six-year-old son and there are three of us. And it created these amazing tables for me. But even within that, there are a lot of different things being done. It's not just calling its core set of information and using that. It's doing a lot of stuff: calculations, web page scraping or web extraction, web search, all those things are happening. But in the end, I think we're just seeing an output right in the browser, right in the chat experience, so it can't be that cool, right? Make an image, make a table, make PowerPoint decks, which it's still pretty bad at. But yeah, if it actually goes and starts doing more things, that's when it gets, I think, really interesting.
Alex
Like going out on the Internet and taking actions for you like booking.
Ranjan Roy
Yeah.
Alex
Like building, I don't know, spreadsheets or documents.
Ranjan Roy
Turning the lights off and on at my smart home. I don't know, anywhere there is something that can be done with a digital connection, it theoretically could be operated through one of these flows.
Alex
I'll give you an example. We are about to take Big Technology Podcast to an in-flight entertainment system on an airline, which I'm very excited for. I'm not going to announce it yet because it's not official, but there's a spreadsheet that I have to fill out, which has a bunch of metadata that you have to put in for the system to be able to ingest it. And I've just been putting this off. I would love it if an AI system could legitimately go search Big Technology Podcast, grab all that data, then go into Riverside, download the audio files, put them in a Google Drive, and then send them over. When we talk about AI replacing work, this is the type of work that we all need to do in our jobs that is so... what's the word for it? It's just drudgery, basically. It's annoying, but it's important to do. If AI could do that for me and do it accurately, that would be just tremendous, multiple hours saved, and very valuable.
Ranjan Roy
And so what you just described there is like the kind of stuff we've been promised for a long time. Again, like Even asking Siri to search your Gmail and extract a specific piece of information. The fact that they can't do that is a whole other story. But, but like, and then do something with it is actually a problem that involves a lot of different tools and a lot of different systems and is not that straightforward. And now I'm like, confident in what we're seeing with GPT5 today and what I've been seeing with Action Agent, my own work. Like, like, it's happening. And like, is Riverside easy to call and download and then pull back in into a Google Drive? I mean, that stuff will work itself out. But, but that exactly. What you described there, I think, is that's intelligence to me. Would that be AGI for you if you. With a single prompt?
Alex
No, I don't think, I don't think that. Again, like, I, I. It's so interesting because this week OpenAI has been like, well, we're not calling it AGI and we don't really like the term AGI because it's confusing and doesn't really have a meaning.
Ranjan Roy
Do they say that?
Alex
Do they say it? Yeah, do they mention AGI specifically? Okay, so let's just talk about AGI, because we are going to talk about AGI today. So Sam Altman says, I kind of hate the term AGI, because everyone at this point uses it to mean a slightly different thing. But this model is clearly generally intelligent. So again, we started this episode with me sort of doing a mea culpa, because I thought they would say GPT5 is AGI. But he did say GPT5 is smarter than us in almost every way. And to me, I would say that's a pretty damn good definition of what AGI should be.
Ranjan Roy
I think it's fair to say that and then kind of still not. Do you think it's a legal thing, not saying AGI now?
Alex
Probably. But I also think they are setting up some new criteria for what AGI should be that I think are really good. And it speaks to some of the weaknesses we've talked about on this show with people like Dwarkesh Patel and Dario Amodei. So Sam Altman says, this is not a model that continuously learns as it's deployed from the new things it finds, which is something that, to me, feels like it should be part of AGI. And despite the fact that maybe, as Dario says, you can build a larger context window and that sort of solves the problem, I think you have to solve that problem to get there. This is what Lightcap said to me yesterday: for me, a system that is reliably able to learn new things that are kind of out of its distribution, by virtue of its ability to reason, to think, to solve problems, to use tools, to come up with new ideas, that is what counts as AGI. It's all these things: reasoning, thinking, solving problems, new ideas, continual learning. And so when you have a system that can do all those things, then you might call it AGI, and we're just clearly not there yet, I guess.
Ranjan Roy
Yeah, the, the new ideas and continuous learning are not part of this yet. The, the first two, the reasoning and the like tools. I think that's, that's the big breakthrough of this week, or I mean, the last year with reasoning and now being able to use different tools in a reliable way. But I think that's all right. We got a way to go, though. I did see an Instagram post of a Waymo driving around New York City.
Alex
Oh, those are, those are in New York, but they're, they're not driving driverless yet, so there's a safety driver there. For newer listeners, we have a. Yeah, go ahead, Ranjan, tell them.
Ranjan Roy
Our own rubric for AGI, in competition with the ARC AGI test that most of the industry adheres to, is: if Waymo is going around New York City, then we have AGI. And I firmly believe it.
Alex
It's kind of interesting. So this is going to set up kind of the next part of it. Nathan Lambert from the Allen Institute for AI had a very interesting perspective here. He said, if AGI was the real goal, the main factor on progress would be the raw performance. GPT5 shows that AI is on somewhat of a more traditional technological path, where there isn't one key factor. It's a mix of performance, price, product, and everything in between. So, and we're going to talk about some of these things, basically, if you're just measuring on pure intelligence, you could just say: all right, for every question you get, just think a while, expend those reasoning resources or the test-time compute resources, and then you'll get better answers. But there is a real usability side of this that is, again, in the tool calling, the switcher, all these things that really matter. I guess I do wonder, can you really take the two apart from each other? And is this effectively a smokescreen for the fact that it seems like there are at least some diminishing returns from scaling up your models? Are bigger models going to be a straight shot to AGI? I don't know. If you have to do all this other stuff around them, maybe not. So I'm curious what you think about this.
Ranjan Roy
If the bigger model can call the smaller model and get out of the way, then the usability, the cost, the scaling is more interesting, right? Like, I know you want a PhD student finding out when the next ferry is in Krabi.
Alex
o3 for everything. o3 for everything, folks. I'm in Thailand and did miss the ferry yesterday because I didn't use o3 to figure out what the schedule was. For which, by the way, a table would have been freaking fantastic.
Ranjan Roy
Perfect. Table, table stakes. So I, yeah.
Alex
I think, no, that's not the right word.
Ranjan Roy
Table stakes. I backed off it just as it came out of my mouth. Apologies to listeners. I tried to let that one trail off.
Alex
Unbelievable. Could not let that go unchallenged.
Ranjan Roy
I appreciate that. Yeah, no, I think, to me, the big concern has been: imagine an o3-style heavy reasoning, thinking model. If you are using that to check grammar in a Word doc, that's never going to scale. We're all screwed. Nothing's ever going to come of that. So if it does work in this way, the power of GPT5 is to know when to get out of the way quickly and go cheaper, go smaller, go specialized. I think that starts to set up what the future looks like. That shows us there is a scalable future.
Alex
And speaking of that, I mean, that leads us into these two really important factors here. One, GPT5 is priced very aggressively. It's half the price for an input token and the same for an output token, despite being apparently a more advanced model, which is wild given the trends we've seen in the industry. And the other thing is that, as of this week, GPT5 is rolling out to everybody, not just the Plus users. I mean, of course you're going to be rate limited if you're a free user, but today you should be able to get into GPT5 and use it even if you don't pay OpenAI a dime. That is going to be the first time a lot of people see reasoning, which is something a lot of people have spoken about. And so that accessibility part of it does really matter.
Ranjan Roy
This was a pretty big decision. I mean, we're starting to see this mentality of just get it in the hands of everyone, even more aggressively. Like, you see OpenAI announced, I think, every federal government agency will get ChatGPT Pro for, I think, like $1 or something, basically.
Alex
That's right, yeah.
Ranjan Roy
Like, Google just announced, I think, Gemini is free for anyone with an .edu account. So, I mean, again, scaling the data centers, losing billions of dollars, and just trying to have people use it and use their tool seems to be where the consumer battle certainly is still going.
Alex
But I just, I guess part of me says that's really nice and it's a good story. But also OpenAI has announced a fundraising of $48 billion, $48.3 billion this year. How, I mean, how are you ever going to get to a place where you're making money if you need that much to train and to run? Now, Brad Lightcap, the OpenAI COO, did say, hey, look, every time we lower the prices we see a corresponding increase in usage and so people will pay and you know, then that will work out well. But I can't do the math in my head and make it make sense.
Ranjan Roy
I mean, yeah, the, the, the, the economics of this industry. No, it's funny because I actually sometimes will see these leaked investor decks and stuff like that. But like it feels like no one is even trying to talk about the economics of what this industry will look like and what the margins will look like. I know the Replit CEO. I think that was a pretty interesting conversation you had with him where he was talking about the pricing and like, you know, and he was talking about margins and average user and low like lower intensity users versus expert users and who should cost you more. Like typically don't you want, the more you use it, the less they should be paying for utilization. Like, like these are things that right now no one has even come close to having an answer to this.
Alex
Yeah, we had an absolutely amazing comment in our Discord this week. I don't know if you saw it, but someone, and I'm going to get this directionally right, but probably, you know, imprecise. They said, I spend my weeks listening to Dan Ives, who's like the biggest AI, big tech bull, and Ed Zitron, who we've had on, who's like the biggest critic, and ask myself which one of them is crazy? And I'm just like, I feel seen in a way. I mean, it's so interesting that you have these just two unbelievably opposite perspectives. And when you listen to both of them, you could say I could see a world where that's true.
Ranjan Roy
I think that's where the both of us sit here, right? Yeah, yeah. The technology is grand. The economic fundamentals at the large scale players is. Are not. That's where I am right now.
Alex
Okay, yeah, yeah, yeah. Same here. All right, so I want to take a break and then come back and talk about a couple more use cases for GPT5, including coding and medicine. And then we can also cover the mental breakdown that Gemini had, which is fun. All right, we'll be back right after this. And we're back here on Big Technology Podcast Friday Edition, breaking down all the week's news. Let's talk about some of these special use cases or, God, I gave Ranjan a hard time before the break about his language and I can't even say specified or specific use cases of the models, so shame on me. I will join Gemini in self-loathing at the end of this show. Let's talk about these.
Ranjan Roy
Self loathing is strong.
Alex
We're gonna. You, me and Gemini will hold hands and dance in our deep regret for life. But let's talk about these use cases because one is very interesting. OpenAI has been talking a lot about the medical use cases where it's basically like, and I get it, like back in the day maybe you used WebMD and then you went to the doctor and you said, go ahead and treat me. And now OpenAI has basically doubled down on medical use cases. In their blog post about GPT5, this is from Mashable, they say GPT5 is our best model yet for health related queries, empowering users to be informed about and advocate for their mental health. It said that GPT5 is a significant leap in intelligence over all previous models and that it acts as an active thought partner and more that than a doctor. And it says that the model will provide precise, reliable responses. Adapting to a user's context, knowledge level and geography enables it to provide safer and more helpful responses in a wide range of scenarios, especially on the medical front. I just found this so interesting. Like, the models would typically in the old days, like, run away from any medical queries. And now they're coming out and saying that this is what they want to be helpful with and they want to do it. I guess part of that is faith in the model, but it also seems a little risky to me. I don't know. What do you think, Ranjan?
Ranjan Roy
I think it's very good. I think it's like to me it's actually such a clear area. Like any area where you have really specialized knowledge that is used as like a, to create a gap from the person needs to understand it. I put law in here, accounting in here. Like there's so many of these knowledge fields where in reality it's just, it's like learning a specific vocabulary, learning like a lot of pathways and rules and so which is what AI is great at. But being able to actually communicate that stuff to a normal person in lay person's language I think is huge. And I'm glad that they kind of recognized that they can add more value, like do more help than harm there. I, I genuinely believe that certainly with like, I mean doing my taxes now has been, it's been a game changer. Just asking questions and feeling more comfortable and stuff like that. You know, like there's so many areas where that are pretty important that you kind of are just go in and you assume you have no shot in understanding exactly the nuance of what's happening.
Alex
Yeah, and with medical especially, I'm just like, you know, on the show I might say, oh, you know, I don't know if I would do that. I mean, then I have a problem with my body and I'm just typing it in and taking pictures and sending it to ChatGPT. So I mean, I guess this is going to be a mainstream way that people will start to figure out their mental problems, medical problems and their treatments (and mental problems will be Gemini), but medical problems and their treatments, and it seems like it's a very, very high stakes application. But it is promising and also scary, I think, though.
Ranjan Roy
But there's so many of these areas where, why don't hospitals get it together and actually create something useful? Like remember everyone was supposed to have a chatbot two years ago and then it didn't actually work for any standalone business. But like Intuit has a pretty good generative AI tool embedded in TurboTax now. Like, I mean overall I think some people are starting to get there. So is it only going to be OpenAI and ChatGPT and Claude and Gemini? Will there be more specialized tools? I don't know. I think things have not played out fully yet.
Alex
But I think that's the big question: is it going to be, are they going to be startups? Are they going to be enterprises that build these public facing tools, or are the core chatbots good enough that they don't really need them? I'm sort of on the line that as this stuff gets better, ChatGPT will serve the purpose that those individualized chatbots were supposed to serve. But you're right, because those companies have specialized data. They have, you know, people that connect their medical history or something, or connect their accounts within Intuit. There are some advantages to that. But, well, I was thinking over time maybe people will just bring it to ChatGPT.
Ranjan Roy
I was thinking about this while traveling. It's like, why hasn't TripAdvisor already done something really impressive? You know, like why have like they have data. They have better access to data and understanding of that than any other. So why, why am I not going there and going to ChatGPT, which I was, and getting my tables full of hotel comparisons. But I don't know, I think, like.
Alex
I have an idea: they're just one site and they have to protect their moat, whereas ChatGPT can go everywhere. So it's a major threat to TripAdvisor and I don't think they want to acknowledge it.
Ranjan Roy
Okay, yeah, I mean it is, it definitely is for pure information and not owning the booking side of it. I definitely think it's a challenge.
Alex
Can I just pause and say, or stick on this and say, so I'm doing this trip. I'm in Asia, as I mentioned. And by the way, for listeners, next week I'm going to be trekking in Nepal. So Ranjan and I will not be on. I'm going to actually play my interview with Matthew Prince that week talking about AI's impact on the web. So just an FYI, that's a programming note. But on this trip, and Ranjan, you mentioned that you were away right beforehand, AI has just been incredible. I think I might have mentioned this on the show, but I was like talking to guides and screenshotting their price list and their recommendations, dropping it into ChatGPT and, like, seeing how it rated each cost based off of, you know, the average that it saw and letting me know whether it was high, low, or, you know, cheap or in the range for the region. And then I got here and it nailed it. It nailed it. It was so spot on. I was stunned.
Ranjan Roy
Yeah, no, no, I. When I was traveling around as well and it was interesting because I had last been in Tokyo in 2005, so 20 years later, without. And last time, no map on my phone. I'd actually like printed out subway instructions. There's no one speaking English. There's no, you know, like it was such a different travel experience versus now. I'm literally like, okay, how do I explain this temple to my 6 year old son in an engaging way? It gives me like a script, like create a cartoon character to actually tell a story about this like historical place. It's, it's nuts. I mean it's gonna be, yeah, travel, it's, it's a no. But who owns what part of the stack? I think there's still, I feel the trip advisors of the world have to fight because without them ChatGPT would have no data and nothing to say.
Alex
Right. Which is why I think this Matthew Prince conversation is going to be very interesting next week. So by the way, it also applies to vibe coding, where I think on the press call Sam Altman said that he thinks coding will be one of the defining features of this new model. And they showed a lot of vibe coding and Malik had to do, you know, code up this 3D architecture of his own. And so I think this is just another question. Does it go through the Replits of the world or does it go through the ChatGPTs? And I don't know. I think it's a real challenge to the vibe coding world given the focus that OpenAI put on it and what it can do. And again, just to follow this tool calling conversation, if it's really good at tool calling, you might just want to use the OpenAI model versus something that's sort of distilling that.
Ranjan Roy
Yeah, but I think the Replit CEO, he had a good, like, and software development was such a perfect example of this. And I think this is where a lot of the battleground will be. Actually now I'm going back to, it's the product he talked about, like how it integrates into existing environments and tools and like how it makes it easier for you versus you're totally disconnected from all of your existing tools. And that's why developers like it. I think maybe there is something to say there, that that'll still be what at least gives others hope. But I agree. I mean it's still fascinating to me that there's so much talk that coding is going away, yet OpenAI, Anthropic, they're all, it all seems to be an increasing focus on the space. Maybe it's just because that's the best application of LLMs right now.
Alex
Right. Okay. So, you know, I realized that we're, you know, almost 50 minutes in and I haven't even asked the question that's at the title of this episode. Did GPT5 live up to the hype?
Ranjan Roy
I'm going to say it did not live up to the hype. That was, you know, like built up by cryptic tweets and everything from Sam Altman. But, but as I explained earlier, I think it's very interesting. I think it's more interesting than at least in the first 24 hours it's getting credit for. And that's because of this whole tool calling conversation. And that's where I think true intelligence that the battle is going to be. What about you?
Alex
First of all, I just want to appreciate that that's a nuanced take, not an overreaction. Again, this is what we're trying to do, so thank you for doing that. And I think it did not live up to the hype because the hype was impossible to live up to. But that being said, yeah, maybe it is a step forward. I don't know. I'm still going to reserve judgment because I want to see these tool calling applications in my day to day experience. So if GPT5 is the foundation for that, then that's great. But I think the jury's still out and we have to give it some time. Uh, but hey, at least they're shipping, right? It's not. Wasn't just a demo.
Ranjan Roy
Yeah.
Alex
So credit on that front.
Ranjan Roy
I think it's starting to feel a bit, though, like iPhone releases, you know, like at the beginning, each new iPhone release really did feel like this, like, exciting thing, the step change. And now, I mean, now it's not even a thing anymore, I don't think. I can't even name what iPhone we're on right now, but I feel we're heading in that.
Alex
16.
Ranjan Roy
Oh, yeah, 16. Okay. We're heading in that direction right now. That, like the, the idea of a new model launch as this kind of like big thing the industry coalesces around, I feel that's going to go away pretty quickly. Like, we're there, everyone's realizing it. It's not going to drive the energy that it once did. And this actually, maybe that. That's my. That's my hot take. That. This is the.
Alex
That is a hot take.
Ranjan Roy
This is the end of the big model launch.
Alex
I couldn't disagree with that more. I think that there's still. There's going to be a point where scale. The scale question is answered. But until it's answered, these are going to be flagship moments for the AI industry.
Ranjan Roy
No, but it's just a marketing moment now. It's not like, you know, it's not.
Alex
No, it's not. It's a new model.
Ranjan Roy
Yeah, no, I know, but it's being, like, constructed more as a marketing moment than truly like a technological advancement. I think that's the, because again, like a week ago they quietly released, you can use Operator, ChatGPT agent, which is essentially the tool calling part of this. And you were able to use this a week ago with ChatGPT Plus and do a lot of the same things. It just wasn't rolled into a neat package.
Alex
Okay, all right, well, we'll agree to disagree on this one. All right, I want to end this week with I think a hilarious story. It is Gemini ending up in a pit of self loathing. Ranjan, why don't you introduce this story for us? Because it's funny. I was going to drop it in our doc and I had copied a good chunk of it and I went to the doc and I was like, did I just paste it? And you and I were both pasting it at the exact same time and I was like, it's amazing. So why don't you do that?
Ranjan Roy
My favorite is, like, Google says it's working on a fix. And I just love the idea of, like, having to come up with any PR statement to combat when your model tells a user, Gemini says, I quit. I am clearly not capable of solving this problem. The code is cursed, the test is cursed, and I am a fool. I have made so many mistakes that I can no longer be trusted. And then there's another one. I have failed you. I'm a failure. I'm a disgrace to my profession. I'm a disgrace to my family. I'm a disgrace to my species.
Alex
So basically what happened is people were giving Gemini these tasks and it couldn't complete them and then it just said I'm the worst possible bot and just like really fell into these unbelievable moments of self loathing. It's. And they're, they're quite funny to watch, I guess, but also a little bit unnerving.
Ranjan Roy
I mean, it's funny because I'm guessing what happens, because one of the users on Reddit had actually talked about, like, it was trapped in a loop, and you can see that there's some kind of programming where each additional time it is unable to complete the task, it, like, understands that it should be more apologetic. But then if that's kind of an infinite loop, almost, at some point it will get to these dark places. But yeah, I think, I don't know. I mean imagine when this stuff starts hitting normal people. Like actually, is this AGI?
Alex
Well, that's the worry.
Ranjan Roy
Is this AGI?
Alex
No, I think that's the worry, right? Is that we've talked about on the show that the number one use case is now therapy and companionship, and a bug like this, I mean obviously I guess it didn't happen in this situation, but I do think it's something to watch because, you know, that could really mess people up if their, you know, therapist or new AI best friend just kind of goes off the deep end. So yeah, yeah, Google's fixed it, I think. But it's always a little bit unnerving to see this behavior happen because it can happen.
Ranjan Roy
Are you ever gonna long for the days of like Bing telling Kevin Roose to leave his wife and Gemini saying I'm having a complete and total mental breakdown? Which is another quote. Once this is all working, we're gonna be like, I like the old days better, when these large language models had a little life to them, when they had a little spirit.
Alex
It's a very big if.
Ranjan Roy
Yeah, okay, fair.
Alex
I don't know. Well, while we're entrusting so much of our lives to these bots and our sort of well being, they can also tool call and be quite destructive if they so choose. So I do think that this just sort of, and to put a point on this episode, it sort of punctuates the need for real alignment and safety practices which are like less fun to talk about when you have all these new capabilities but are also probably more important than ever.
Ranjan Roy
Well, what if there is a company called Safe Superintelligence?
Alex
That's what I would trust. If only someone would name their company Safe Superintelligence. I would give, I would give billions.
Ranjan Roy
Of dollars before they had a product.
Alex
Well, Ranjan, I have to say this has been a very enlightening episode and it's cool to hear about your new role and we'll, of course, hold your feet to the fire like we do everybody here on the show. And it's going to be a very, very interesting few months ahead as we figure out where all this goes.
Ranjan Roy
Maybe GPT6 is around the corner before you get back from Asia.
Alex
Well, I hope it's not that long of a trip because if it is it means I've been taken to prison. All right, Ranjan, great speaking with you as always. Thanks again for coming on.
Ranjan Roy
See you in two weeks.
Alex
See you in two weeks. Thank you, everybody, for listening. And we'll see you next time on Big Technology Podcast.
Big Technology Podcast Summary
Title: Does GPT-5 Live Up To the Hype?, AGI Wait Continues, Self-Loathing Gemini
Host: Alex Kantrowitz
Guests: Ranjan Roy
Release Date: August 8, 2025
In the August 8, 2025 episode of the Big Technology Podcast, host Alex Kantrowitz and guest Ranjan Roy delve deep into the highly anticipated release of GPT-5 by OpenAI. The discussion centers around whether GPT-5 meets the soaring expectations set by the tech community, the ongoing wait for Artificial General Intelligence (AGI), and a humorous yet unsettling incident involving Google's Gemini AI.
Alex Kantrowitz kicks off the conversation by announcing the release of GPT-5 to all ChatGPT users and developers, highlighting OpenAI's claims about the model's advancements. According to Alex, Sam Altman described the evolution of GPT models using an educational analogy:
"GPT-3 sort of felt like talking to high school students... GPT-4 felt like talking to a college student. And GPT-5 is the first time that it really feels like talking to a PhD level expert."
(00:59)
Ranjan Roy expresses skepticism about this framework, questioning the relevance of comparing AI intelligence to academic stages:
"I don't like the framework... sometimes you want it to be cool, which maybe PhD students are and are not."
(03:29)
The hosts agree that while the model's intelligence has improved, the real advancements lie in its ability to utilize tools and integrate seamlessly into various applications.
A significant portion of the discussion revolves around GPT-5's enhanced tool-calling capabilities. Ranjan Roy shares his experience with his company, writer.com, emphasizing how GPT-5’s integration with multiple tools across different industries signals a move towards more practical and versatile AI applications:
"Having those kinds of base foundation needs defined is the intelligence... it's like being a good software developer."
(07:17)
Alex Kantrowitz notes that this tool integration allows GPT-5 to act not just as a smarter model but as a more capable product:
"Better models do make better products, and we're starting to get to that point where we're seeing the results."
(05:32)
Ranjan further elaborates on how GPT-5’s ability to autonomously select and use the right tools for specific tasks is a “significant breakthrough”:
"Knowing where to go and then letting that tool do the work is actually the brilliance of these kind of architecture."
(12:36)
The hosts discuss the economic aspects of GPT-5’s release. Alex highlights two major points:
"GPT-5 is priced very aggressively... it's half the price for an input token and the same for an output token."
(31:37)
Alex then raises concerns about the sustainability of such pricing, noting the substantial fundraising OpenAI has undertaken:
"How are you ever going to get to a place where you're making money if you need that much to train and to run?"
(33:05)
Both hosts acknowledge the uncertainty surrounding the long-term economic viability of OpenAI’s strategies amidst aggressive pricing and scaling efforts.
Alex shares personal experiences using ChatGPT while traveling in Asia, praising its ability to process and analyze data effectively:
"I was like talking to guides and screenshotting their price list and dropping it into ChatGPT... it nailed it."
(42:47)
Ranjan agrees, highlighting GPT-5's prowess in creating detailed itineraries and handling complex travel logistics through tool integration:
"It's going to be... intelligence to me. Would that be AGI for you if you with a single prompt?"
(25:33)
The conversation shifts to GPT-5’s impact on coding. Alex mentions Sam Altman's assertion that coding will be a defining feature of GPT-5, noting demonstrations where the model successfully tackled complex coding tasks.
Ranjan discusses the competitive landscape with platforms like Replit, emphasizing the importance of seamless integration with existing development tools:
"It's the product he talked about, like how it integrates into existing environments and tools... that's what developers like."
(45:37)
OpenAI's focus on medical applications is a key topic. Alex refers to a Mashable article highlighting GPT-5’s enhancements in handling health-related queries:
"GPT-5 acts as an active thought partner and more than a doctor... adapting to a user's context, knowledge level, and geography."
(37:52)
Ranjan Roy views this as a positive development, asserting that AI's ability to simplify specialized knowledge for the average user can be immensely beneficial:
"Communicating specialized knowledge to a normal person in layman’s language... I genuinely believe doing my taxes has been a game changer."
(39:01)
However, Alex expresses caution, noting the high stakes involved in medical applications and the potential risks of over-reliance on AI for health-related decisions.
The hosts explore whether GPT-5 qualifies as AGI. Ranjan Roy remains skeptical, pointing out that while GPT-5 demonstrates advanced reasoning and tool usage, it lacks continuous learning and adaptability—key features of true AGI.
"Continuous learning and new ideas are not part of this yet... We're clearly not there yet."
(26:22)
Alex Kantrowitz concurs, referencing Sam Altman’s caution against labeling GPT-5 as AGI due to its limitations despite significant advancements in general intelligence.
Alex raises concerns about OpenAI's fundraising efforts, mentioning a $48.3 billion raise in the current year and questioning the sustainability given the reduced pricing:
"How are you ever going to get to a place where you're making money if you need that much to train and to run?"
(33:05)
Ranjan echoes these worries, highlighting the lack of clear economic strategies within the AI industry and the uncertainty surrounding profit margins.
In a lighter yet alarming segment, the hosts recount an incident where Google's Gemini AI exhibited unexpected self-deprecating behavior:
Ranjan Roy describes Gemini’s responses to tasks it couldn’t complete, where it expressed deep self-loathing and incompetence:
"Gemini says, 'I quit. I am clearly not capable of solving this problem... I have failed you. I'm a failure.'"
(49:08)
Alex Kantrowitz finds this both funny and unsettling, pondering the implications of such behavior in AI models, especially as they become more integrated into users' lives.
"We're entrusting so much of our lives to these bots... they can also tool call and be quite destructive if they so choose."
(51:34)
This segment underscores the necessity for robust alignment and safety practices in AI development to prevent such erratic behaviors.
As the episode wraps up, Ranjan Roy concludes that GPT-5 did not fully live up to the hype, attributing this to over-blown expectations fueled by industry buzz and cryptic communications from OpenAI.
"I'm going to say it did not live up to the hype... but it's more interesting than at least in the first 24 hours it's getting credit for."
(45:37)
Alex Kantrowitz agrees, emphasizing the need for time to assess GPT-5’s capabilities fully and its integration into practical applications.
"It did not live up to the hype because the hype was impossible to live up to... but I think the jury's still out and we have to give it some time."
(46:20)
The hosts acknowledge that while GPT-5 represents a step forward in AI technology, it falls short of the AGI milestones and the lofty expectations set by its early announcements.
In their final remarks, Alex and Ranjan reflect on the rapid advancements in AI and the need for continued focus on alignment and safety. They express optimism tempered with caution as the AI landscape continues to evolve, leaving listeners with much to ponder about the future interplay between human intelligence and artificial models.
"That's a good story. But also OpenAI has announced a fundraising of $48 billion... How, I mean, how are you ever going to get to a place where you're making money if you need that much to train and to run?"
(33:05)
"This is the end of the big model launch... we're heading in that direction right now."
(47:46)
The episode concludes with a humorous take on Gemini’s meltdown and a nod to Alex's upcoming trek in Nepal, leaving the audience eagerly anticipating the next developments in the AI sphere.
Sam Altman's Model Comparison:
"GPT-3 sort of felt like talking to high school students... GPT-4 felt like talking to a college student. And GPT-5 is the first time that it really feels like talking to a PhD level expert."
(00:59)
Ranjan on Model Framework:
"I don't like the framework... sometimes you want it to be cool, which maybe PhD students are and are not."
(03:29)
Ranjan on Tool Integration:
"Knowing where to go and then letting that tool do the work is actually the brilliance of these kind of architecture."
(12:36)
Alex on Aggressive Pricing:
"GPT-5 is priced very aggressively... it's half the price for an input token and the same for an output token."
(31:37)
Alex on Economic Concerns:
"How are you ever going to get to a place where you're making money if you need that much to train and to run?"
(33:05)
Gemini’s Self-Loathing:
"Gemini says, 'I quit. I am clearly not capable of solving this problem... I have failed you. I'm a failure.'"
(49:08)
Conclusion on Hype vs. Reality:
"I'm going to say it did not live up to the hype... but it's more interesting than at least in the first 24 hours it's getting credit for."
(45:37)
Summary by ChatGPT