Transcript
Host (0:00)
OpenAI has just announced GPT-5. This is the much-awaited model, so I'm not going to give you a massive intro here. On the podcast today, I'm going to break down what its improved capabilities are over all of the other GPT versions, where I think we're going to see the most growth or usage gains in ChatGPT, and where it disappoints. And spoiler alert, there are a lot of people that are mad at this model for a bunch of different reasons. We're going to get into all of that today, and we can rely on a number of sources for a lot of this. One is that OpenAI has their own blog post breaking down everything it's capable of doing. But I think maybe the more important thing we're also going to get into is what the responses are on Twitter from people who are trying this and actively using it, and what the betting markets say: is this the best model of the month? Because there are millions of dollars bet on the line that this would or wouldn't be. So we're getting into all of that.
Before we do, I wanted to mention that if you want to try all of the models I talk about on the podcast, I'd love for you to try out my own software company called AI Box AI. We're currently in beta. We're building this super cool drag-and-drop, no-code app builder, which is basically a competitor to Google Opal, but it's got all of the AI models on it. That part is not accessible today; it's an exciting sneak peek of what we're building right now. Currently, what you have is access to the top 40 different AI models, all in one place for one price. So you can chat with it just like you would with ChatGPT, but you can switch to any different model that you want to try midway through. So maybe you want to try out the latest model from DeepSeek and compare it to something from Anthropic. It's all on there. It's a great platform. Link is in the description. It is AI Box AI.
All right, let's get into what's going on with GPT-5. So the big news here, and honestly what I am the most excited about... well, not the most excited, that's definitely exaggerated a little bit. One thing I'm very excited about is that currently, if you go to ChatGPT and click the dropdown, it asks you whether you want to use GPT-4o, o3, o4-mini, o4-mini-high, and a whole bunch of other ones. All of those models are getting deprecated, and instead they're going to be replaced by three new models: GPT-5, GPT-5 Mini, and GPT-5 Nano. The Nano is this super tiny version that's going to be able to run on edge devices. Mini is just a faster, quicker, less powerful one. And then GPT-5 is their best model.
And the great thing is you don't have to pick which model you want to answer your questions. You don't have to pick what tool, right? Because sometimes we're looking at things like, you know, o4-mini-high is great at coding and visual reasoning, o3 uses advanced reasoning, right? You're like, do you want to use the reasoning? There are all these different tools that you could toggle on and off if you want to do deep research and agent mode and all this kind of stuff. Now you don't have to, other than agent mode, probably, I guess. You don't have to pick your model anymore. You just ask a question and it will choose what the best model is for you. Thank goodness. I think 99% of people are excited about this.
The other question a lot of people are asking is who is getting access to GPT-5. Everybody is. It's actually going out to free users, paid users, everyone. You get more usage if you're a paid user, obviously, but honestly you get a ton of usage as a free user as well with GPT-5, and it is rolling out. So this is really, really exciting. What are the things that are integrated into it? Number one, improved answer quality.
Basically it's more accurate, it makes way fewer mistakes, it hallucinates way less. So that's obviously going to be pretty exciting, just that it's marginally better. How much better? We'll look at some benchmarks. Some people were kind of roasting it: it made improvements, but they weren't as huge as expected. Everyone was expecting it to make this huge step change where it was going to be insanely better and absolutely crush everything, and the hype was really high. At the end of the day, I think it is better, but not insanely better. And there are some models that are still beating it on different benchmarks, which is not really a good place to be. You probably wanted OpenAI to beat everybody before they put out an update, especially because we have Google cooking up something. But more on that in a minute.
So, expanded reasoning and context. One thing I am very excited about: Google's been doing this for a long time, but OpenAI just added a 1 million token context window. This is something that Gemini has been doing, but OpenAI was not. So basically it's going to be able to look at all of your past conversations that you've had with it, and it's going to be able to look way further back. We kind of had this problem with ChatGPT where, if you had a thread open for a very long time, it would start forgetting things that you said at the beginning of the thread. This is probably going to be less of an issue now, as the context window is way, way bigger and it can remember way more things.
One other thing that's interesting is that the coding and task automation have improved. On coding, it's basically just doing better on the benchmarks at coding tasks. Not just writing the code, but really quickly iterating on the code. It's getting faster at that. This is something that Claude has been crushing and winning at. Hands down, Claude is the preferred coding tool for developers.
But it seems like OpenAI is trying to put a really big effort in here and basically keep up. They have a whole bunch of demos showing that you can create all sorts of things. In their blog post they have one called Jumping Ball Runner, and the whole thing is a game where you click a button and the ball jumps over things. The whole thing was just created in GPT-5. You put a prompt in and it creates an entire game: the color, the design, all of that. Very, very interesting.
Another thing that's pretty great is the task automation. Basically it's getting better at multi-step prompts. If you tell it, hey, create a newsletter for me, or something like that, it will break that down into multiple steps. Or if you say, create a newsletter and also write a blog post and also write a post on LinkedIn based off of this data that I give you, it's much better at doing these multi-step tasks. Which is, sure, good for ChatGPT, but really what it's getting at is that it's going to be much better for agents. Agents running off of this as kind of their operating system are much better at completing these multi-task prompts. It kind of crushes that. So that's interesting.
Multimodal input and output is another thing that I am absolutely excited about. Basically what that means is you're going to be able to upload images, audio, and video as inputs into the model. So you upload a video and it's going to understand it. I think this is really important, especially as OpenAI starts tying in everything they're doing with Sora, their video generator. They have a video generation tool, they have an audio generation tool. You are also going to be able to get transcripts from videos that you put in and transcripts from audio that you upload. Now, these are all things that OpenAI already has tools for.
They have Whisper, which helps you do that. But now everything's just getting tied into ChatGPT. So all of these tools that they had, some of which were disconnected, it feels like we're consolidating everything, even to the point where they took their agent and put the agent into ChatGPT. It feels like everything's coming into ChatGPT, which makes it a way more useful tool. So that's very, very good.
All right, how is it doing on evaluations? They have actually fixed the charts on a bunch of their evaluations in their blog post. One in particular, their coding chart, got ridiculed because in the original demo they were presenting, they had the scale way, way off. They were comparing GPT-5 to o3. Without thinking, GPT-5 was at 52%, but their o3 model was at 69%. So literally, without thinking, this model is worse than their last model at coding, which isn't very good. But they're like, with thinking, guys. Now with thinking, we're at like 75%. So with thinking, it is technically better. But the interesting thing is that when they released the original graph, they had the 69% of OpenAI o3 drawn as a tiny little fraction of the bar. So basically they were making it look like this new model is crushing all the other models, with and without thinking. But if you look at the numbers, the scale was way off, and it is barely better. So Sam Altman's like, oh, how embarrassing, we've updated it on our blog. It is updated on the blog, but in the live demo it was kind of misleading. And a lot of people roasted them for that, which was sort of funny.
Okay. One other thing that it's getting a lot better at is keeping track of your recent conversations. It's the memory feature for paid users.
It now includes a longer-term memory, and it's able to remember across many months and many projects. So basically the scope of its memory is increasing. They're basically just giving you more memory storage on their servers, which honestly, I think is great. Also, like before, users can review, edit, and delete what the model remembers. So if you don't want it to remember a specific thing, you can go see what it's remembering about you and delete that out of its memory. Honestly, this happens to me sometimes, where I'll be helping somebody else on a project and they're like, hey, my mechanic shop needs to do X, Y, and Z with AI, can you help me test this thing out? And I'm like, yeah, sure, here's how I do it. And I go on there for a client or something and whip something up. And then before you know it, I all of a sudden have it embedded in the ChatGPT memory. It's like, you own Gary's Auto Shop, do you want to get some great marketing tips for this? I'm like, oh no. So that kind of stuff you want to be able to pull out.
Okay. Overall, I think that on a bunch of benchmarks it looks like it's improving, and it's got a bunch of cool features. So how is it being received? I made a bunch of posts over on LinkedIn, and for one of them that was interesting, I went over and took a screenshot of Polymarket. If you look at Polymarket, there's a bet there, with $2 million being bet right now, on which company has the best AI model by the end of August. And I mean, it's only the 7th of August right now. This bet has been going on for a while; I guess it opened on July 1st, so over a month ago. OpenAI took the lead on the bet as of June 26th.
So everyone knew that GPT-5 was coming out. They took the lead to the point where OpenAI was at like 73% favored to win this bet; 73% of people said OpenAI is going to have the best model by the end of August. They dropped their model on the 7th, and instantly, as soon as they drop it, OpenAI crashes to 13% and Google spikes up to 82%. That is a massive reversal. Basically everyone was really hyped that GPT-5 was going to blow everyone out of the water, and it didn't. It was marginally better than some models and worse than others, which I'll get into.
GPT-5 is actually worse than Grok 4 Heavy, which, I mean, I don't know if that's a super fair comparison, because GPT-5 is going out to free users and Grok 4 Heavy is like $300 a month. But the fact is, there is a model that people are subscribed to that's kind of beating it still. So I know there's drama on it. It's marginally better than just the regular Grok 4, which isn't great, because Grok 4 was released a while back.
And the other thing is we know that other companies are working on new updates. Google released kind of a smaller version of their Gemini 3.0, and so we know that they're going to release their full Gemini 3.0 model soon. People are assuming it's going to happen this month, and whenever that happens, it's going to smoke OpenAI, because OpenAI was barely ahead of them and they'll have a brand new model. Google has been really crushing it and bringing the heat lately. So if this is the case, this will be crazy. The odds just keep splitting further and further, and Google is at an 81% chance of having the best model by the end of the month. And they haven't even dropped anything yet. Will they? Honestly, I think it's a good bet. If you want to make a contrarian bet,
you could bet that maybe OpenAI would be the winner, but I think Google is going to come up with something pretty impressive, and apparently so do the betting markets: $2.1 million in volume right now is being bet that that's the case. Sadly, xAI, Anthropic, DeepSeek, Alibaba, Meta, all of them have just abysmal results on this right now. It's really kind of a battle between OpenAI and Google, which is very, very crazy. Very, very crazy results that we're seeing right there.
Okay. A couple of other things that I found quite interesting from all of this. The first one is that it feels like a lot of people are saying AGI is canceled. If you go over on Twitter, everyone's saying AGI is canceled. You know, we have time to make our money before the AI takes over the world. And it's kind of sad, because on the one hand GPT-5 was so hyped and was going to be the biggest thing of all time. And when it didn't quite meet people's expectations, a lot of people were like, oh man, it's going to be 30 years before I ever see anything cool. Is this reality? In my opinion, it's not. I think we're going to see all sorts of really exciting updates and advancements, and AI is going to keep moving quite rapidly. But I think that at the end of the day, this probably was a little bit disappointing because there was so much hype around GPT-5.
I mean, we saw the lengths that OpenAI went to. I remember when GPT-4 came out, everyone was blown away. It was a huge jump in quality. And then OpenAI did literally everything humanly possible to not release GPT-5. They released GPT-4o and o3 and o3-mini and nano and the GPT-4.5 research preview, all of these random naming conventions, basically because they didn't want to release GPT-5 and have it be a letdown.
So it felt to a lot of people like, by the time they released GPT-5, it was going to be this big, huge moment where everyone was, I don't know, super stoked about what actually came out. And it seems like we were all a little bit let down. So anyways, that's kind of the sentiment overall if you go look over on Twitter and such. But at the end of the day, I think that we're still going to see a lot of really exciting advancements. AI is not done. We're just getting started. There are a lot of exciting things. We've seen updates literally all week this week from all the top tech companies.
Anyways, if you want more updates on OpenAI and every other AI company, make sure to subscribe to the podcast. Drop a review or rating wherever you're listening to or watching this podcast. It helps out a ton on YouTube, Apple, and Spotify. And as always, make sure to go check out AI Box AI if you want to try all of the top models in one place for one price. You don't need a new subscription every time a new model comes out. If you're testing it against everything else, you've got it all there for the same price. All right, thank you so much for tuning into the podcast today. I will catch you in the next episode.
