
A
OpenAI's newest model, GPT-5.4, has landed and it is a big deal for everyone.
B
We get yet another state of the art AI model. It is a huge boost in coding and guess what? It costs about half as much as Opus 4.6.
A
And it turns out 5.4 beats the average human at the little things, like using a computer.
B
Oh no. When we asked a model to use CUA, compared to 5.3 Codex, it doesn't have to spin up, like, a new environment to do it. It's more like how you or I would interact with a computer. And maybe even crazier, Kevin: OpenAI sees no wall and expects intelligence capabilities to continue to increase this year. Yeah.
A
Dramatically. Gavin, are you okay?
B
Kevin? I just need sleep.
A
That's adorable. Not gonna happen. Not this week. Because meanwhile, Anthropic's CEO was on the phone with the friggin Pentagon trying to save a $200 million military contract while simultaneously being called, quote unquote, a liar with a God complex. That's fun, right?
B
Plus, big unlocks for everyone: new AI video updates from Kling, Grok and even NotebookLM.
A
Weirdly, yeah. The NotebookLM thing is honestly the most exciting part, which is crazy. Plus, big friend of the show Ben Affleck has been secretly running an AI company which just sold to Netflix. What a week.
B
All of that and more on this week's AI for Humans. Everybody, let's get started.
A
Gavin, we have got to refresh. We have to redo the teases here, it looks like.
B
No, you're.
A
No. Why? OpenAI just launched a new model. Yeah.
B
No, Come on. No, no, no.
A
Start the show.
B
Welcome everybody to AI for Humans. This is your weekly guide to the wonderful world of AI. And Kevin, we got a new one. A new state of the art model has dropped again. We are at maybe, probably, like seven for this year so far. Is that where we are right now?
A
I don't know. Are we about to summon the boys? Is it too early?
B
No, wait a second, wait a second. The benchmark boys will come. They will be here. Okay, we got it. We'll summon them in a second. But I do want to bring them
A
do the magical whistle. I mean, because we do have a new way to summon them.
B
We will. Okay, let's get to that in a second. So, yes, OpenAI has a new state of the art model. This is GPT-5.4, a brand new model. And Kevin, it is very good. The biggest thing I think we have to dive into, and we're going to talk about a bunch of stuff, is that it is very good at computer use. And as you pointed out at the top of the show, it is very good at doing the sorts of things that humans do. And we are inching ever closer to that ability for AIs to do most of what we do. So first and foremost, let's just dive into some of the basics in this, and then we can definitely talk about benchmark boys.
A
First of all, it's 5.4, which is two louder than the last model that most people used. Naming conventions aside, for those that actually care about this detail, it's because they want 5.3 to remain for the Codex model, which is their coding model. It's dedicated for that. This is now 5.4. They acknowledge that their naming situation is a mess, but if anything, that just makes me feel so much better about Skynet not happening tomorrow. If they can't get the names right, we probably don't have to worry. Computer use is the reason for the season, humans. Why don't we talk?
B
Why don't we explain what is that? Tell people what computer use means? Because I think some people in our audience may understand what that is, but a lot of people don't.
A
It's, it's, it's what it sounds like. It's not like a fancy AI term. There are tests which are designed to test a model's ability to use a computer. And we're talking like, see what's on the screen, maybe hear what's coming out of the speakers, click, drag, poke around, type text into a box, close windows, manage them, etc. One of the benchmarks is called OSWorld. Humans typically score...
B
That's you and, I gather, doing it already. You're benchmark boying, but that's fine. Go ahead. Benchmark.
A
Do you want to call benchmarks?
B
Yeah, let's summon boys. Benchmark boy. Benchmark boy. Benchmark boy.
A
Bench. Bench. Benchmark boys. Bench. Benchmark boys.
B
We'll have another new graphic that just rolled, because Will, I'm making Will make a second graphic here. Yeah, let's talk about the benchmarks. So this, again, this is just for, if you're out there, these are the numbers that help you immediately understand how it is different than the others. But, Kevin, talk about the specific number you were referencing.
A
And again, like, these models can be tuned to perform better on benchmarks than they do when you and I try to use them. The vibe test, if you will. But let's just hear this out, at least as it is on paper. The OSWorld benchmark is designed for things like I just said, like clicking and dragging. In this case, looking at a browser. Right. Getting a screenshot of a web...
B
I'm very good at these things, Kevin. I can click and drag with the best of them.
A
What about scheduling calendar events, Gavin?
B
Oh, I'm in the 50% range on that, probably 50% range.
A
What about sending me your business EIN last year, Gavin?
B
Pretty bad at that. I'm pretty bad at that.
A
I'd say that was a 12-month horizon task, but you eventually got it done. The point is, humans normally score 72%-ish on this test. That's the average human. This model just hit 75% accuracy. So on the broad using-a-computer thing, it's better than the average human. It's way better than the average human on things like spreadsheet manipulation, generating presentations. And then if you really want to get me excited, Gavin, the tassels will pop out. We can talk about tool usage.
B
I don't know, I don't want to keep those tassels back for a second. Let's actually.
A
Tool tassels. You want me to keep those tool tassels?
B
Your toe tassels are fine. It's the other ones I'm wearing back in the satchel. So, Kevin, all that benchmark boy stuff is good, but actually OpenAI took some time to show how this might matter to an actual person. They had a website they were creating for a coffee shop and they kind of showed how computer use can make a difference here. So right now it's calling the image gen tool, and it has a smart use of image gen as well, because images take a while to generate. So it's actually doing all four of these images concurrently, which is pretty neat. Now the model is able to check its own work using CUA. What CUA did here is, like, open up the image, inspect it, open up the website, also look at it, compare them side by side and make sure that the website created is as close as possible to the image. With this update, it makes the work a lot cheaper, a lot more efficient and also ultimately
A
helps you do better work.
B
So CUA there is short for computer use, right? That's what they're talking about. CUA, computer use agent. Yeah, yeah. So in this instance you have a computer use agent that is actually using the computer to do these sorts of things, and kind of manipulating, in this instance, photos and different things like that, and trying to get a sense of how it will all lay out. So this is a really interesting real-world use case of this agentic idea. Yeah.
A
And the workflow being demonstrated is, like, one that's been shockingly missing from so many tools. Gavin, like, you know, you've been vibing very hard. I can see it in your eyes. And we will get to that later in the show. But if any of the folks out here have ever tried to prompt a website into existence or make a game or anything else, you know, oftentimes the scaffolding is there now, but the visuals just don't match. It's missing all that stuff. And this is a concrete example of: hey, I have this site, here's the layout, go make it pretty. And it knows how to do it. A lot of these tools can generate images, but they fail to generate and place them in there. And this is just a very simple workflow that's so impactful.
B
And I think one thing that's really important to point out here is we've talked on the show before about how, like, coding is the kind of opening door to all of this AGI stuff where we need this stuff to do real world things. And the reason why is because the better it gets at understanding how to write code or read code or do stuff with the computer, the smarter it gets at making things that we actually might want to use. Right. So each of these models has been a pretty big step change in terms of actual coding from, you know, the early stage. Like, kind of Opus 4.5 was when it started to feel like it was all really clicking. And I think that's the moment, kind of end of year last year, where it was like, oh, you can just make something and it works. Finally now we're getting these steps where, like, we're improving with each of these steps. And Kevin, I just... it's March 5th, right? It's March 5th of 2026, which is crazy. And that feels like we've got a long ways to go until it's really perfect. But we are actually seeing stuff that really makes a difference now.
A
The line is still very much going up. We have not hit a wall. We might be another algorithmic breakthrough or three away from unlocking this AGI massive superintelligence future that we're, you know, patiently waiting for or impatiently filled with anxiety over, but we're not slowing down. And other criticisms, Gavin, used to be: well, these things you can't trust, right, because they're just going to make things up. Hallucinations. With this model, according to OpenAI's benchmarks, again, they're down 33%.
B
That's a huge deal. That's a massive deal, right?
A
It's a big drop. And even on something as critical as, like, legal document review, there's an AI company out there called Harvey, which a lot of lawyers use to review their legal documents. They say it scored a 91% on their Big Law benchmark.
B
Oh, big law.
A
Big law. Big law's got a big cowboy hat, big six shooters, and a massive gavel, baby.
B
You know who has to be in the big law courtroom is Wonky Walrus. I bet he's got a lot of. He's probably got a lot of things going on there, actually.
A
Wonky Walrus has priors. Gavin.
B
Oh, no. What did he do? Oh, bleep that, Will. That's gotta be bleeped. We're not putting that.
A
Wonky Walrus was caught in a 7-Eleven parking lot.
B
No, he wasn't.
A
Three times. Three times.
B
All right, so a couple other quick things about this. Yeah, it is faster and uses fewer tokens. So that is a big thing. When, you know, you're vibe coding something, you're doing anything, tokens are how you... especially if you're working with an API. But even if you're working with a Max plan or one of the larger plans, it means you'll be able to get more done with less, which is a big deal. There's a 1 million token context window, which equals, I think, what Opus 4.6 now has. So this allows you to have a much longer code base. As Kevin mentioned, I've been doing a bunch of vibe coding. We will talk later on about, like, what I've been doing with 4.6. And also, what is annoying sometimes is when it has to reset that context window. At this point, I'm like, give me a 5 million context window. Like, I just want the longest possible context window I can have.
A
Can I take the tassels out now for tool search?
B
Take the tassels out. Take the tassels out.
A
One of the nerdiest things, but probably the most exciting to me: whenever an AI agent has to go and use a tool, any sort of integration, yes, check an email, book a thing, make an image, et cetera, usually all of the tools available are dumped into that context window and it slows things down. It makes the prompts bloated. It ends up costing you more money. And again, the time thing, like, when I say it slows things down, it can really bring things to a grinding halt as it uses tools. This model reduced token usage by 47% with no accuracy loss. This is, again, what they're reporting. It just came out. You and I are probably going to get real hands-on with this thing over the next 72 hours or so and see. But that's a big deal, and it doesn't come cheap, I should mention, because this thing is more expensive. Input tokens went from $1.75 to $2.50 per million, output from $14 to $15. So it is a bit more expensive. However, OpenAI is claiming, hey, we made all these advancements, we use way fewer tokens, so overall it should be cheaper. We'll see how that's going to play out.
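[Editor's note: the pricing claim above comes down to simple arithmetic, sketched here in Python. The per-million-token prices are the ones quoted in the episode. The workload size is hypothetical, and applying the reported 47% token reduction to both input and output tokens of a whole job is an assumption made for illustration, since that figure was reported for tool use specifically.]

```python
# Prices quoted in the episode, in USD per million tokens.
OLD_PRICES = {"input": 1.75, "output": 14.00}
NEW_PRICES = {"input": 2.50, "output": 15.00}


def job_cost(prices, input_tokens, output_tokens):
    """Cost in USD of one job at the given per-million-token prices."""
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


def breakeven_reduction(old_price, new_price):
    """Fraction of tokens the pricier model must save to cost the same."""
    return 1 - old_price / new_price


# Hypothetical workload (assumption): 2M input tokens, 200k output tokens.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 200_000

# Assumption: the reported 47% reduction applies to the whole job.
KEEP = 1 - 0.47

old_cost = job_cost(OLD_PRICES, INPUT_TOKENS, OUTPUT_TOKENS)
new_cost = job_cost(NEW_PRICES, INPUT_TOKENS * KEEP, OUTPUT_TOKENS * KEEP)

print(f"old model cost: ${old_cost:.2f}")
print(f"new model cost: ${new_cost:.2f}")
print(f"break-even input savings: {breakeven_reduction(1.75, 2.50):.0%}")
```

[The break-even function is the useful part: on input tokens alone, the new model has to use roughly 30% fewer tokens just to cost the same, so whether it is cheaper overall depends on how much of the claimed savings shows up in a given workload.]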
B
Yeah. And Dan Shipper from Every, who had a little bit of early access to this, is comparing it to Opus 4.6 and he says it's half as much. So when you think of the edge case coding stuff, that's a big difference right now. We'll see how this all plays out. Like, one of the interesting things I've been tracking over the last bit is, like, you remember last November, Kevin, when it felt like Gemini was going to run away with the AI world, models, and it became a big thing? Well, a lot of people have said since then that Gemini's new models feel a little bit too benchmark-adjusted, that the benchmarks may not be feeling as much like the real world use cases. And I think a lot of this going forward is going to feel like: is it making a difference for you, the person who's using it? Because I don't think these benchmarks are much other than, like, a sense of tracking about what it looks like over time. Like, one of the things we've talked about with benchmarks in here is the ARC-AGI-2 benchmark, which this, again, has a good number on. But it might also be what they call saturated, which means that the questions have been seen by a bunch of these models. It's easier to answer them in a bigger way. There is an ARC-AGI-3 benchmark coming out soon. In fact, by the way, this is the funniest thing: there's going to be, like, a party for the release of it in San Francisco. So if you're living it up in San Francisco in this world, it's a whole new thing. I want to mention one other thing that's really important, and this was mentioned in the tease, is that Noam Brown, who is one of the best researchers at OpenAI, he's a really smart guy to follow at @polynoamial, he tweeted: "GPT-5.4 is a big step up in computer use and economically valuable tasks. We see no wall and expect AI capabilities to continue to increase dramatically this year." And I think just to sit on that for a second, we are following this every week.
And maybe you're a listener or viewer who comes in every week. Maybe you're somebody that just pops in when a new model comes up. I think the thing that's important to take away from here is we have now unlocked that thing where things are going to be moving a lot faster. And you can imagine, Kevin, the model that we're going to see in September from these companies being way better, right? Like significantly better on a lot of this stuff. And again, I'm not doom scrolling or doom posting or doom talking. Sorry, Will, you can beep that. But like, you have to prepare for a world where you're living in a space where these models can do stuff that you didn't think they could do a year ago. That's just an important thing to be aware of.
A
I fully agree, but I sleep soundly at night, Gavin, knowing that the best in class, most capable, most token efficient tool is the one that's going to make the decision whether or not to drone strike me.
B
Oh, do you? Is that right?
A
Yeah, I mean, that's because, like, listen, you know, the Department of War has picked a winner here. It is OpenAI. Are we ready to transition to that off this discussion or you want to talk about a few more things?
B
Let's transition to that. Very fast, before we move on to that, because it is a very important conversation and we have to talk about it: there was another model that dropped earlier this week from OpenAI, 5.3 Instant. So they decided to go with one other model, 5.3 Instant. This is essentially going to be driving their free model. The emojis are back in this model, which is interesting. And supposedly it's better at writing. It was not that good at my Pac-Man test. I keep trying to get them to write better Pac-Man puns. I mean, I don't know how we do this, but at some point... one of these models, even the new one, 5.4, is not very good at it. In fact, 5.4 came up with my favorite Pac-Man pun of all time, which I texted you right before this. Did you see this? The pun that it came up with is: Pac-Man's autobiography is called Eat, Pray, Waka. That was the pun it came up with. So anyway, they're still not that great at coming up with really funny things. Okay, yes. Let's transition to this scary conversation. Actually, before we transition to that, please support your local podcast. We are an independent production and we make stuff for you. If you like and subscribe to this YouTube channel, that's doing a lot. But we've also provided you many different ways to open the door to give us money. That's the most important thing. You can give us money.
A
What about the tote bag?
B
Well, there's no tote bag yet, but maybe soon. At some point we promised merch and we will get merch to you, everybody. Maybe we can set our agents to that. There is a Patreon, which, thank you very much, many new people have joined, which is exciting. And Kevin just opened a Buy Me a Coffee on our website at aiforhumans.show, underneath the slash pod command, where you can see a lot of fun songs that we are going to update this week. We will talk a little bit more about that website later, but you can support our podcast there. We appreciate every single one of you. And now, Kevin, it's time to move on to probably one of the scariest AI conversations we've ever had. This was kind of brewing last week, and it kind of, no pun intended, blew up over the week and weekend. So maybe you want to do the kind of set-the-stage thing and then we can kind of get a little bit into what's going on right now.
A
Well, yeah. Listen, Anthropic is one of the few companies that, before, I guess, last week, was doing a lot of work with the government. They had their models running on a secure cloud where government agencies could access them and do all of the things that government agencies want to do. Right. Analyze mass communications, look at satellite imagery, drop war plans, simulate war games, et cetera. I don't know, probably play 5D chess, whatever they're into. Maybe they were just vibe coding Snake games. I don't know what the government does. I don't pretend. I guess maybe document redaction was high on the list. Point is, Anthropic was the go-to. But they had two red lines. Two red lines which existed under the Biden administration, which the Trump administration also agreed to. One was that their models could never autonomously pull the trigger. Basically, there had to be a human in the loop before one of their models was involved with deciding to take down a target.
B
Right.
A
Human or otherwise. So that was one, and two was no mass surveillance on US citizens.
B
Yes, Those were the red lines. Pretty good lines. I kind of agree with those lines. Mostly, like. Right. Not bad.
A
I have done talent agreements to, like, appear on stage which have more red lines than using AI with weaponry. So, yeah, I'd say those are two good red lines. Maybe there should be 200 of them. But that's, yes, you know, out of my depth. The government, our present government, did an abrupt about-face and basically said: remove these two red lines. Or else.
B
Yes.
A
And there was some FA. There was some FO. And that's the summary. Are we good?
B
Well, that's, yeah, that's that. So there's a little bit more than that. The big part is a lot more than that. There's a lot more than that. Yeah. And I don't want to rehash this for a lot of people who are probably familiar with it. The very quick TL;DR of it is that Anthropic said, we are not going to change that. Then they got into kind of a fight with the Secretary of War, Pete Hegseth. Pete Hegseth then went out and said that they are a supply chain risk, which, just today, has been updated to: they're going to uphold that. So this is a big problem for Anthropic, because what that means is they cannot have government contracts, which also might make them unsecure for a lot of other company use cases.
A
To point out real quick the gravity of that thing: this has never been done before for a company like this.
B
Like, for an American company. It's never been done for an American company before.
A
Yeah, and. And it's not only that they can't have government contracts. It's pretty much anybody that does work with the government can't use anthropic products either. So this is a massive hobbling.
B
That's right. So then Dario went on CBS and did an interview that kind of blew up. In fact, maybe we can play just a tiny little bit of that interview, because first of all, there's two things to know about this. One, if you're not watching it, Dario looks strangely skinny. Like, clearly this has been holding and pulling weight out of him. But two, he is pretty much holding a line here. If you had a moment with the president right now, tonight, what would you say to him?
A
You know, again, I would say we are patriotic Americans. We have done. Everything we have done has been for the sake of this country, for the sake of supporting U.S. national security. Our leaning forward in deploying our models with the military was done because we believe in this country. We believe in defeating our autocratic adversaries. We believe in defending America. The red lines we have drawn, we drew because we believe that crossing those red lines is contrary to American values. And we wanted to stand up for American values.
B
So you get a sense there, like, he's taking a pretty significant stand. Now, Kevin, the crazy part of this story, which we didn't cover last week, but everybody has seen now, is that Sam Altman and OpenAI then came in and made a deal with the Department of War. That is a very mixed conversation around, like, where did they cave and where did they not cave based on where Anthropic stands. And Sam Altman has dealt with a lot of internal and external strife about this agreement. It sounds like at this point, here we are on Thursday, that they are kind of pushing back, even Sam is pushing back on this. More importantly, I think the bigger thing is that, you know, the US government has reiterated this supply chain risk situation right now, even though there's also a story that Dario is renegotiating with the Department of War. So it is a very loose situation right now and we're not exactly sure where it's going to end up.
A
And honestly, like, we wouldn't really get into it last week. And some of that is my fault because I was too busy vibe coding dumb songs into an iPod. But you know, the tendrils in this run very deep, with, like, super PAC donations being made by one side and, you know, bending the knee or ring kissing, as some people observe, from one company versus another, right? And maybe that all factors in here. Maybe this is all, again, a big 5D chess move. Maybe it's not. Like, we can really, really get in the weeds about this. But where does this end up, though? Because, like, to the end user, right, and the end user, I say, as in our listeners, our watchers, right? But to everyone out there, myself and you included, these tools may be used to surveil you. These tools may someday be used to suppress you. One of these companies is going to be providing that. On the other side of that, does this send a chilling effect to any company doing anything AI related in the US, you know, or anything tech related, where it's like, well, hey, you really better be able to play nice? And is this anything new, or is this just being so, so, so blatantly communicated and, you know, argued in the public? Is this new? I mean, is this really unprecedented?
B
I think this is unprecedented in the fact that, like, the biggest issue is that it's the public and the government all waking up to how important AI is going to be going forward. And I do want to call out the fact that Dario, like, released an internal memo where he kind of ripped OpenAI a new one, called them mendacious, which is, like, kind of not exactly truth-telling, and also kind of said that the reason why, to your point before, that they're kind of in with the Trump administration, is that perhaps there are people internally at OpenAI, and Greg Brockman, the president, donated money to the Trump campaign. So this is also, like, a lot of internal politics between these two companies. Now, granted, I think the most important thing for our viewers to understand is, like, we are also in a one-level-up war. Let's call it, maybe war is too much, because I know obviously there's a terrible situation happening in the Middle East right now where a lot of people are suffering, but we are in a one-level-up kind of cold war around AI with other nations, right? Specifically with China. And there's an argument to be made here, and I'm not saying about this administration, but in the American government at large, that perhaps the government should be able to control exactly what these things are and what they do. Now, I'm not going to make that argument myself, but you could say that if a private company has the ability to say yes or no based on a technology, then that does become a very big lever for nationalizing that private company, which means that the government takes control. In fact, just this week, Palantir CEO Alex Karp had a pretty divisive comment, let's say, and used divisive language in it, mentioning this specific idea that they would get nationalized if they continue to refuse this. So, Kev, this is also very sci-fi weirdness, right?
Like, the idea that you have a company that has a piece of technology that is so significant that they're afraid to give it to the government. This is like, it has echoes of the nuclear race. And I know that Dario, I found this out this week. Did you know this? That Dario gives everybody who starts at Anthropic the book How the Atomic Bomb Was Made, because he believes that, like, this thing he's working on has that same level of importance. So anyway, this conversation is going to be ongoing. The bummer is I really do think Claude is about to maybe win. And that's partly why, like, OpenAI is racing these updates out. Because, if you haven't noticed, first of all, when all this went down, Claude rocketed to the number one app in the App Store, which it has never been before. Katy Perry is now a Claude Max user, which is also... she's part of the Claude resistance.
A
Listen, anecdotal, but three people that I know canceled their OpenAI subscriptions and immediately signed up for Claude. I don't think that that is a single tear in the drop of a bucket in the sea of money that OpenAI will make from the government contracts. But still, people are waking up and canceling GPT.
B
Well, and you say that, but there's, like, actually some really interesting charts coming out about where the actual money is getting made by these companies. And Anthropic is catching up significantly. Anthropic is making about $19 billion a year right now, which is a lot of money. And OpenAI just came out and said, well, we're making 25 billion a year. Now, people have said the fact that OpenAI has all these consumer users and they haven't turned ads on, that that might take that stick and go even further up, but it is a big deal. And I think actually this is a good, I mean, we should just land this plane here a little bit and just make sure we're on the same page. This is the sort of stuff that we've been talking about, kind of existentially, that we are worried about with AI from the very beginning of the show, right? Like, this idea that you're going to have to ask big questions, and if I am a part of a government that is fighting another government, even if I don't agree with that fighting, how do I respond when my government wants to use the best possible tool in order to save the people that are defending our government? Right. These are weird big questions that people have to ask themselves.
A
And here's the other fun one. Could you friggin imagine the Manhattan Project taking place in public with, like, three or four, let's say 12 different companies going after it, and TikTok exists at the same time? Could you imagine what social media would
B
be or, or under this administration? Right. Which is another thing is like pure chaos. The idea that like you get into these public fights and that the administration people air their dirty laundry in public in your point like no, it's horrible. It's a terrible sit for this to
A
happen in that and that's a channel that we need to make by the way, is TikTok influencers reacting to things as if the Manhattan Project is being built in the real time and oh,
B
that's a great title. That'll be amazing SEO. It'll be like it'll be 10 for 20 years.
A
But hey, there's so many more exciting things happening. Gavin, I'm glad you're getting what you voted for. Let's move on to talk Claude Code advancements, because let's pour one out for the startups. We need a graphic for this. We need to pour one out every week.
B
Oh, that's a good idea. We'll pour one out for everybody. To startups.
A
I would actually before that. Yeah.
B
I want to talk about a new term that I've coined that I believe is going to take the world by storm, which is Claude rotting. And this week, Kevin, I Claude rotted. Claude rotting is basically... do you know what bed rotting is? You're familiar with that term, I assume?
A
I do. I do know. And brain rot as well, but I don't like Claude rot.
B
Claude rotting is... I took a TikTok of myself, just to point it out. It's when you are...
A
You caught Claude-midia is, I think, what happened.
B
No, I did not catch Claude-midia. Claude-midia is not what the... This is basically, this is spending too much time doing Claude Code and not taking care of yourself. So I spent a bunch of time this week. But yes, there's some big updates. Voice mode is coming to Claude Code. There are a bunch of new Claude Code skills that are coming. So Kev, this week in my Claude Code update, I added a link blog to our new website, aiforhumans.show. Mostly because I wanted to do something that was 20 years old and just enjoy a process of it. It does have an RSS feed for you and for your agent though, if you want to go to it. But I think this is an important thing to understand. There's something about Claude for me, and I don't know what you think about this for you. I don't know what it is, but, like, maybe it's just literally how it's talking to me or what I'm used to. But I do find it better than even using the Codex app. Are you having that experience at all? Like, do you feel like Claude has, like, something? I don't know. I'd never had this before, and it might just be that I'm starting to use the terminal window stuff, but there is a weird connectivity that I've just started...
A
I feel you pair bonding in a way that you haven't since Grok launched their anime babe assistants. Gavin, this is really nice. You have the twinkle back in your eye.
B
Yes, this is great.
A
Claude is my daily driver, full stop. I've got multiple terminal windows open that are running instances of Claude Code. My Claude bots are all powered by Claude Code. Yes, they have access to OpenAI models as well, because I bounce them off of each other when Claude gets stuck. But it is my daily driver. I use Claude Cowork to plan some trips and travel that I have coming up. I use it to make PowerPoints and slideshows. Like, it is still my number one. So whenever they release a new something, I'm all for it. And I like the link blog, by the way. People should... don't sleep on it. AF Link blog.
B
It's really cool by the way. Like, yeah, it's very cool to have. What's interesting about it is I created an iPhone shortcut so that I can easily send a link from anything to the link blog. It'll post automatically. Same with, I have a bookmarklet on Chrome or Safari where I can add stuff to it. Those are the little cool things that you can do. And one of the things, Kevin, that's really interesting about now that I'm doing this on a regular basis, and again, you and I talked about this earlier, that you were maybe like six months prior to this kind of starting this process for yourself or even earlier than that, but now that it's easy to do this, I've just got a terminal window up all the time and if a new feature comes into my brain for the website, I'm like, oh, let's explore that, right? Let's go try this thing. And it just goes. And then you're just having this back and forth. So again, all of this talk about Claude, and hopefully Claude will not go away. We're hoping that the government and Claude can come to some sort of agreement that we can move forward with Claude. But you should try all of these models. But specifically there is something special about what's going on with Claude right now. And now they're adding voice mode. I feel like they're going to continue to grow in use. So it's a good one to show your friends and family as a trial
A
Now, not just Claude, several companies are releasing updates this week that people can get their hands on, with some incredible new features. Here's the wash, rinse, repeat: stuff that took us hours or days or weeks to do months ago is now instant and within a single app. Kling Motion Control 3.0 is something that really caught my attention.
B
Yeah. So, I mean, this is their next version of Motion Control. You and I have talked about this sort of thing since Runway's Act-One and Act-Two. Basically, you can puppet a video or add a look to a video if you want to. It's very good in general. Like, it does a very good job of it. They have some examples on Kling's site.
A
Yeah, I mean, the ability to record yourself and transform your performance into or onto any character in a scene is massive. And yes, there's been some other tools that allow for it. This one looks incredibly accurate, very nuanced, captures the detail of the performer's face, and it transfers it really well to the scene, including lighting and shadows and everything else. So here's Alex Patrascu, or @maxescu on X, who posted some examples of using audition tapes as an input. So basically, if you're on the video version, you'll get it. If you're on the audio, someone is just delivering a dynamic performance and it comes out amazing.
B
Helena will never love you as I could, but you will love her nonetheless.
A
This.
B
And you will tell her every day that she is beautiful and she'll believe it because she is.
A
Okay, audio only. I know that that was a weird ASMR moment. Video friends. You saw a performer. You saw the face transfer to a completely different character. Physics in the dangling earrings on the ears of the character.
B
Right?
A
The. The. The soft shadow of candlelight creating shadows across their face which work in the scene. And then if you watch the rest of the clip, you see them kind of just change the character over and over again, and it just works.
B
So, Kevin, I did it. It mostly just works. I decided, why not try to take one of the most viral videos of this week? If you missed it, the McDonald's CEO tried a Big Arch burger and kind of failed miserably at eating it. It's a very fun video to watch. He's very kind of stiff. So, Kevin, I tried first to take this video and I tried to make him into a woman eating the burger. And what I didn't realize when I first did this is it somehow used him. So it made this woman that kind of looks like him eating the burger. Play this one first.
A
Some crispy onions on here as well. I see those kind of coming out. All right, the moment of truth now. Okay, before I'm stopping at five seconds in because there's. It looks like it's like a dog with an Instagram with the tongue hanging out of the mouth. So far, despite it being an alternate version of the McDonald's CEO with the red hair and the sports bra, yes, it did a really good job. Like, I thought it captured the look and everything else. I'm about to get to a big bite moment.
B
Just wait, just wait. But the biting part, you'll see. See?
A
All right. Okay. So the model doesn't understand eating.
B
So good.
A
That's a big bite for a Big Arch.
B
Okay, well see there?
A
So.
B
So, okay, so what's going on here is in the actual video, that's when the. The guy shows the burger to the camera and it did not know what to do with that at all.
A
So horrific jump scare.
B
Okay, wait, horrific.
A
That was a Ring-style jump scare. Because a hand just appeared out of nowhere and the model started flailing like mad.
B
So wait, now go to the next one. So I wanted to kind of try to create an image of a woman who we're going to use in a later video that would be doing this instead. Right. So I created a woman with red hair in a fancy restaurant. So I did this. So play this one. And it's a similar sort of problem, but you'll see at the end as well.
A
Some crispy onions on here as well. I see those kind of coming out. All right. The moment of truth.
B
That is so good.
A
That's a big bite for a big.
B
You see that one go like, kind of like a crazy. So what's so funny about it?
A
That's like a Left 4 Dead enemy when it gets agitated.
B
So if you're not seeing this, what basically happened in the video was, again, the CEO holds the burger up to camera. The video model doesn't know what to do with it. So it's kind of freaking out. But in both of these instances, I thought the talking actually looked really good. The thing that didn't work very well was the biting of the burger, right? Like each time when the CEO took a bite. Now the CEO took a weird bite, which we all know, that's why it got famous. But it took a bite and it kind of mistakenly did something. So interactive stuff is not that great. Talking wise, though, really pretty good, I think, in general.
A
Yeah. I think, you know, obviously dynamic scenes and dynamic motion, maybe not there. But for the intimate, where it's very important to capture the nuance of someone's face, I think it's just getting better. Again, we did a launch video for our startup months ago and it was a little painstaking to bolt on multiple services to get these types of performances out. So now it's like, okay, yeah, here it is. So if you want to go and, I don't know, launch your OF as your alt, if those words mean anything to you, here's your chance. Or if you want a new spokesperson for your business or you want to make UGC content, here you go.
B
You're talking about Open Fries. Open Fries is my favorite new startup. Right. Like that's a robotic fry making startup. Anyway, let's move on. Speaking of OF, let's talk about Grok Imagine Extend. So Grok Imagine Extend is a way to extend your Grok videos. Now Grok, we've talked about a lot on this show. It's got some interesting ins and outs and I definitely experienced that when trying to use it yesterday a little bit. If you're at all interested in spicy stuff, Grok's still got it there. I didn't even mean sometimes for stuff to come out. And it came out and I was like, wow. Well, there it is, right? Anyway, Grok Extend allows you to take a video and basically extend on the end of it and add stuff on. Now, Kevin, I will be honest with you. In general, this feels like a kind of thing that we have seen before in other video things, but it does a pretty good job of understanding where it was at last. I took a video of that same woman that I used in the McDonald's one and I kind of made her go into a restaurant on a date. Play this and you'll kind of get a sense of, like, there's a little bit of a degradation over the course of it. It's about 30 seconds. You can go to 30 seconds long.
A
Just so I know. What did you start with?
B
I started with the image of the woman at the dining room table. So the very first still image, it was just a still image and I uploaded it image to video and I said, animate this. And I, I did give it direction as it went along, but not a lot.
A
Great. All right, here we go. She's sipping a glass of wine, smiling at the camera. In walks a waiter, a delicious tray of tins of catfish.
B
I love this brand. It's so delicious and nutritious. Would you like to try some Fancy Feast Tuna Taco Time? John, here, take a bite. I don't know who you're talking to, Carol. There's no one there. Leave the Fancy Feast Tuna Taco Time and skedaddle. So that's 30 seconds. I did invent Fancy Feast Tuna Taco Time. But what's interesting about that is the voices in Grok are still not amazing. And you know, the audio at the very top, the first animation where you just hear the piano playing, was actually quite nice, I thought. But in general it's a little tricky because I don't know exactly what I'm supposed to do with this thing. Right? Like, I get that it's very cool that you can extend this out and there's ways you could extend it. But you'll also notice that the faces shift a little bit over each generation. And again, timing wise it was okay. It's fine. I think there's a lot to be done here. And by the way, no shade against Grok Imagine because it has gotten way better. Like, actually, as an image model it is very fast and really good. I just am not sure that extend video does a lot. I think it's going to be more of a shot for shot world going forward.
A
I think it's interesting. Like the. The physics on the wine within the glass are really good. Her reaching around the waiter's arm with the fork to kind of. It's a bizarre movement, but it's one that actually kind of makes sense within the framing of the scene. So there's some stuff here that's actually like very impressive and then some other stuff that would make it again. Yeah, to your point. Like I don't know if this is production ready because the faces shift a bit.
B
So just an example of like kind of how much better Seedance is in some ways. I took a 15 second clip of the same image and I gave it a very specific prompt. But again, this is not about extend. But this is 15 seconds. So you can watch this and play it out.
A
Big burger bite.
B
Hey wait, what is this?
A
Jeez, lady.
B
So there's a little bear in the burger. What's amazing to me though, Kev, is like you hear the burger bite. Her voice sounds like a natural voice. Like, these are the kinds of things that state of the art video is doing right now. And I think that Grok is maybe not there yet for everything outside of just generation of image or video.
A
So it's not just Grok, it's not just Seedance. Google released a new video something this week, but it's not the same type of experience. This is their enhancement. Yeah. To NotebookLM. This is their cinematic overviews. In the same way you could go and drop a bunch of links and create an interactive chatbot, a flashcard game, an audio podcast to go over whatever your content is. Now Google will stitch together these video overviews that have imagery and they have animation and they're very kinetic and they have soundtracks and good voiceover. And it's kind of a one shot explainer that if I were a YouTube channel, especially a faceless one, I might be a little concerned. The idea of a limit is one of the most powerful tools in all of Metaverse.
B
Okay, stop it. That's enough. That's enough stuff. Here's the thing, Kevin. I would not be concerned because I think this is not very good and I don't mean this in a bad way. I'm not saying I'm not. I'm not. I guess I do mean it.
A
Here we go. Yeah. How do you mean it?
B
Here's the thing. When you watch these, it feels like an AI made these things, right? Like the graphics feel like AI made them. They're not that interesting or dynamic as to what I'm seeing. And yes. Is it video to words? Are they connecting Veo 3 videos to words? Yes. But like, I'm a believer that faceless YouTube channels can work if you're having interesting conversations and interesting things that are happening behind the scenes. But this iteration of it feels very much like the kind of thing like your seventh grade teacher would put together. And I just don't know how compelling that is to me. Do you know what I mean? When you look. Sorry.
A
Look at the grand teachers out there. Please don't, don't stop.
B
I said my seventh grade teacher, not generally seventh grade teachers.
A
Okay, Mrs. Mrs. Busatil, please don't withdraw your patreon. Gavin didn't mean it. What was her name?
B
This is busted. Is that what you called her?
A
Teal. It's a name. Look at the, look at the Justine Moore example of the Disneyland one that she put together. I thought that it was making smart choices about forced perspective. It was making interesting choices about when it was cutting to shots or whatever. Is it early days? Of course, but.
B
Well, that's what I'm saying.
A
I could. Yes. Yeah, yeah. I could absolutely see this being like a great starting point. And the number of times I have gone to NotebookLM to digest big topics lately and generated, like, a podcast is non zero. And it's getting higher each and every week. I would opt for this if it could get to just as quick. Might as well have a little video overview and have that playing as well. So I think this is the beginning of a very interesting product.
B
I think that's an important thing to be aware of, is that I think it's the beginning stages of it. And, you know, maybe there is a world where this, as with everything we talk about in the show, in six months just gets way, way better and it will get compelling. But right now, I think when you watch it, it still feels like it's not there yet. For me, it's the same as what happened with NotebookLM's podcast. I get kind of bored by that podcast. I don't love the way that it kind of always feels the same. After a while I just felt like, I don't want to hear these two same people digest it all the time. Like, it just felt to me like, understand our audience. At least we're making weird choices and we might feel differently about each other week to week. We might have different passions about one thing. That starts to feel like two robots who are having the same exact feelings towards every piece of content that comes in. So that's my worry with this sort of stuff. I think I would rather see a less interesting graphic with a human choice behind it. Maybe, I don't know. This will be interesting to see how that works.
A
Well, if Ben Affleck has his way, Gavin, there won't be any creatives making any decisions because Ben Affleck sold out the industry which he supposedly loves so much. And he's been working on an AI company because he hates water and he hates untouched land.
B
We got a good clip out now that's going right up on TikTok. Just that alone. So, yes, Variety and a bunch of other places are reporting that Netflix, of all places, has bought a company called Inter Positive. And this is Ben Affleck's secret AI startup that is not making AI models, but specifically Ben did this company so that on productions, doing dailies and things like that, there would be ways to use AI tools to help productions shoot better and more efficiently. So, you know, this ties into the idea of AI jobs we discussed on the show, where Ben Affleck, I think it was at one of those conferences, talked about AI and he had a pretty interesting take on it. He didn't think it was going to replace the creative process, but it would improve the actual process of it. So this leads into that. I did see, Kev, somebody mentioned in the background that this deal maybe wouldn't have gotten done if the Netflix Warner Brothers deal had gone through because they had spent a bunch of money. So maybe Netflix is out there going to kind of snatch up a couple things like this. But overall, I think it just shows you that Hollywood has been doing stuff with AI and just not being very vocal about it.
A
Point out a few things. They built a custom data set. Like it's a 16 person company that actually built a full production set and shot a ton of footage. So they could probably tag, here's what the lens is, here's the light temperature that we chose, here's the distance and the framing and the blah, blah, blah. And they made their own model. And the idea is that Netflix is going to give this to partners for free. So if you're working on a Netflix show, you'll have access to this tool. Maybe the lighting was a little bit off in one of your shots, or maybe you're trying to get VFX into a shot, but it's not playing well with the type of lens that you chose. That's what this tool promises to make better. And it's mostly a post production tool. There's a lot of going out of the way, Gavin, to reinforce, hey, guild, relax. We're not replacing humans here. We're enhancing humans. Which of course is like the right messaging and like a great approach. But if this does replace some sort of post supervisor or a colorist or whatever else, are they going to make that announcement as well? Are they going to be like, oops, our bad, our tool accidentally did this?
B
No, they're not. And like this is the struggle that they're going to run into, and that all guilds and people who work inside guilds are going to have a hard time with, and like I said, I'm a member for now of the Writers Guild, and a bunch of places like that, because this is going to keep pushing forward as we see more and more people adopt it. So yeah, this is where I think we said on the show a couple weeks ago this idea that creatives will have roles in making things for a very long time, but production might change very fast and there will be a lot fewer roles in production than there were before.
A
Kevin, you should have activated your Spectre device when you said that. Gavin, you should have activated your brand new cyber cyberpunk audio jamming device, which is apparently Going to be a real thing. Before you said anything bad about AI taking human jobs.
B
That's right. So this is the Spectre 1 device. It is from a new startup and this is a prototype so far. So we're not sure how this is going to work. But the basic idea is that it's meant to stop AI devices from recording you. You have seen things like Friend or other devices that are out there that are trying to record IRL conversations. We are now going to be entering into a world where you are going to be able to block those conversations. And basically this is kind of like if you've seen those videos of the people wearing cloaked clothes where, you know, you take a picture and it breaks the camera. This is the same idea with audio so that when you try to record it, it actually kind of covers it up over time.
A
Yeah, it's basically the size, from what I can see of like a Google Nest or maybe the Mini.
B
It's like the little table.
A
Yeah. And it supposedly uses AI and whimsy. I guess it says Spectre uses AI and novel physics to reinvent jamming. So it's going to be more targeted and smart and portable. They say conventional audio jammers work by overpowering microphones through a lot of power, which is inefficient. So look, I don't know if this device is going to be the one, if it's going to work as advertised, but 100%, I see a future for devices like this, especially as everybody's got the Meta Ray-Bans coming around or, you know, whatever Apple's going to release. And if you don't want to be recorded all the time, whether it's your face or your voice, you're probably going to have to have some sort of anti recording device on you.
B
Did you. This might be Conspiracy Corner and I don't really know because sometimes you do it up on a lake. You'll end up on a lake. But did you see that thing where. Yeah. Tinfoil hat. Tinfoil hat. So this isn't real, but you see that thing where they think that Wi-Fi might be able to give you a video signal of people in their homes? Have you heard this at all?
A
Real? Real.
B
Yeah. Is that real? That's pretty crazy.
A
Yeah. Yeah.
B
So it's certain. Maybe you have a better sense of what that is. But I, I just read that the other day and I was like, wow, that's insane. But that feels crazy to me.
A
Well, so Wi-Fi is bouncing around everywhere, right? Like all the Wi-Fi signals, the whole spectrum from your 2.4 GHz network all the way up to whatever the hell the new standard is they're pushing out of Costco. All of that is generating all these, like, waves. Well, as things move through those waves, it creates disturbances. So if you can build a piece of hardware or software or sometimes both that can read those waves, you can see the distortions and the disturbance. And so they basically train models on that. So as you move around in your house, if they can get, you know, access to reading those waves, they can determine who's moving about. So that is just like, welcome to our future.
B
Welcome to our future. So we're gonna need blockers for that too. You know what we don't need a blocker for, Kevin, is seeing what you did with AI this week. That's right, it's AI See What You Did There. Sometimes you're scrolling without a care, then suddenly you stop and shout. Okay, Kevin, we have some really fun stuff this week. One of the first things I want to point out is there's a video that's gone viral on TikTok and then on Twitter. And it is called the Shape Store. And this is from the.
A
I love it. Yeah, so, so much.
B
So this is from the team that made the Bird Game videos. If you remember those videos where it was Bird Game was like this fake game that got created. So describe what you are seeing so that the people at home maybe will play it while. While you're talking about it.
A
Yeah, I mean, it feels like a late 90s, early aughts fisheye camera lens. A glimpse into this underground movement, it seems, of this like, Shape Store where a bunch of people are there, they're hanging out, they love big blocks. They're playing mini games with big blocks. They are. You know, these are like the big. The big primary color wooden blocks that you probably played with at like a dentist's office when you were in the waiting room when you were younger. I know I'm speaking to a certain demo here, but it's like the old trains with the magnets, the little cars you built, mini castles or whatever. Well, this is an underground world that I want to visit so badly at the Shape Store where people are playing with those blocks, they're going down slides, they're riding like bumper cars, they're playing midway games. And it's all this like, it's kind of like a hip hop centric feel to this underground movement. It's really Just, it's just like a world I want to live in.
B
There was a big story this week that came out where like the Supreme Court decided they weren't going to like argue that the story that like AI can't be copyrighted and there was this kind of big fight that went online around the idea that like. Well, that's only because if you have a purely AI output that can't be copyrighted, but once human input touches it. This is just a good example of what it means to be a really interesting voice. Like this is a creative voice with AI tools put together like this. They took that music, they generated all these images but they made it look like this thing. Like this is a really cool thing. And if you want to, make sure you go to @a.I. solation on TikTok and on Instagram, that is the person that generated this. You see this get shared everywhere on other socials but make sure to go follow them and they're very.
A
That's great.
B
Yeah. Okay. Another big video that came out online that I saw this week, which has to do with the fact that we are in a war now, was from Bilawal Sidhu, who is an awesome YouTuber and does a lot of stuff. He actually worked in Google's geospatial world and covers a lot of geospatial tech and AI. He made this incredible visualization for Operation Fury. There's a long YouTube video you can watch. But what he has done is basically taken a UX and UI, laid it over real time data from the actual rollout of the war, and it allows you to kind of jump from place to place throughout it. It is fascinating. He's productizing this, he's going to make a product, but for right now it is just this YouTube video. But if you get a second, go look at this. This is one person doing this. Now he does have an expertise in geolocation and spatial data stuff, but it shows you how far we have come with the ability to kind of vibe code something like this into being. So I just thought this was an amazing thing.
A
Yeah, super fascinating to see how quickly something like that goes together. I'm sure there's a million Polymarket bot bros that are trying to feverishly put their own dashboards together to look at the conflict in real time. But I would rather scroll Infinite Favicons, Gavin.
B
Yeah, this is a really cool little project that was created by Joseph Jojo, which is a great X handle name, but he basically took a bunch of Favicons. Are they Favicons or Favicons? I thought they were favicons.
A
Well, do you put it in your. Your Jon Favreau's or your favorites?
B
I don't know, man. This is. We got to figure out. This might be a good one for. I think it's a fabricon.
A
I mean, it's a favicon.
B
I'm pretty sure. I'm pretty sure.
A
Yeah. Because it's your favorites. So you would say I put it in my. Yeah, I put it in my fat. My fabric folder. My. Where my favorites go.
B
Yeah, right. Where my favorite. Anyway, what this is is what's your
A
favorite flavor of ice cream, Gavin? Do you have a favorite?
B
We're moving on. We're moving on. I've got mini fabrics that I love in my life. Anyway. Infinite Favicons is a thing where you can scroll through all the Favicons that exist in the world, essentially. And Favicon, if you're not familiar, or Favicon, according to Kevin, is the thing that goes in the upper left corner at the far left of your URL. It's the little kind of image that makes it look like that you're on that particular website.
A
I'm going to ask Flava Flav what he thinks. Hip hop artists. Flava Flav.
B
Flav a flab. All right, moving on. We're moving on. Lastly, lastly, we have from our friend Blizzain who we've talked about a bunch. He open sourced a music video maker and he hasn't released this yet, but it's very cool. He does a lot of work with local open source video models, and he basically created a front end to create a music video for any song. So you would upload a music video. You'd upload a music track and it would create a music video. It just goes to show that, like, these people are out there right now making. Between Blizzain and Bilawal, these are two really different people who have specific expertise vibe coding something right now. So again, this is your call to go out there and think about what you want to make this week.
A
I can think of some fun songs that they should plug into their music video making. Okay.
B
I thought you were gonna say, like, my Flavorite Valentine or some other song names was flat. But you want to talk about the fact that we've updated our website, that's fine. I like that too.
A
Yeah.
B
So flavor things. These are a few of my Flavorite things.
A
Well, now,
B
what about Jon Favreau? Kevin. Jon Favreau Is fav.
A
That's how I started. This was if you're. Oh, too busy.
B
Totally embarrassed. That was too late now.
A
So returning fans know that last week I launched a cursed iPod on our website. Ai4Humans show pod using an audio model that I'm still shocked has not been shut down. Yes. Yeah. To create artist covers from other songs. So we had Adele covering Presidents of the United States' Lump, the Eagles covering Africa. People really liked Rage Against the Machine doing Britney Spears' Toxic. I continued to iterate the prompt on my end because I'm not just going to the website and saying make this a cover of that. I created a skill for my clodbot, Mr. Tibbs, that does research on the artist, like pulls lyrics and what they usually talk about, analyzes their cadence, how many words they typically use in a sentence, etc, and then generates a prompt and will modify the lyrics of the song they're covering to try to fit the artist better. And I would say, like again, there's no actual benchmark here, but I would get a nominal improvement, a 10% improvement of the output or whatever. I kept grinding on it after hours, just generating song after song and sort of A/B testing. Be like, that one's better. This one's better. It's still a little hit and miss, but I've now gotten to the point, Gavin, where all day in a special Telegram group, I have a bot that is ripping out songs for me. Oh, wow. Choosing which artist goes where and how. And I have like a personalized playlist of weirdness. What are you paying for that? I've given the sonata site maybe 30 bucks for now.
B
Wow. Okay. All right, great. That's not crazy.
A
Which is. It's. It's not crazy. I mean, it's a lot. And let me be clear, I wish that money were going to the artist.
B
Sure.
A
I like. They're just. It's just not right now. And I tried to talk about it last week and a lot of AI haters came after me, especially the music AI haters, which are a specific breed. It's kind of like the gaming ones. And I get it. I totally get it. I wish that money were going to the creatives. I think this website is probably going to be shut down. Yes, I wish it wasn't shut down. I wish the recording industry would make a deal with them and figure it out. But in the meantime, I'm having fun, so don't yuck my yum. Stay out of.
B
Anyway, go, go, go to the website and you'll see the new song.
A
Here's what you'll see. Yes, the prompts that, and Will can cut this all down. I get it. The prompts that I have fallen in love with are 1. Rock and metal bands doing nursery rhyme covers. So you'll get System of a Down doing the Wheels on the Bus. You'll get Rage Against the Machine doing Humpty Dumpty where it has lyrics like, who put that egg on the wall? Like, it did a good job of, you know, modifying the lyrics to be the performer. And then telling the model, Mr. Tibbs, to go out and generate songs based off of marketing slogans and advertising swap in the style of the Mars Volta, Arcade Fire, Emarosa, these bands that I love. Oh, tell me it's not you. You can tell me it's not good. I'm not gonna listen, but you can tell me it's not good. Or you could throw a couple bucks in the buy me a coffee so I could buy me a coffee.
B
But also, this is what's so great about what we're talking about.
A
Yes.
B
This tickles Kevin, right? Like, and tickles Kevin in a way that, like, is very particular to Kevin. And that is awesome. That is a very awesome thing. Like, and, you know, there's all sorts of things like that that exist out there. Go listen to it. AI4 humans show pod. You can see it right now. We'll have the new tracks there, and we will see you all next week when you sign off.
A
I didn't know we were going to
B
be back next week.
A
Bye, friends.
B
Bye, everybody.
Episode: OpenAI's GPT-5.4 Is a Beast. But Good Luck Staying King.
Date: March 6, 2026
Hosts: Kevin Pereira & Gavin Purcell
This week’s episode focuses on the release of OpenAI's new GPT-5.4 model, its technical and economic impacts, and the shifting power dynamics in the AI industry—particularly against the backdrop of government contracts and corporate maneuvering. The hosts also delve into major updates from Anthropic and other AI companies, notable AI-powered creative tools, and the broader societal consequences of accelerating AI capabilities.
[00:00–14:41]
Introduction of GPT-5.4:
The new model offers substantial improvements in computer use, outperforming the average human and enhancing efficiency in coding tasks, spreadsheet manipulation, presentations, and tool usage.
Naming Conventions:
OpenAI’s messy naming system persists: GPT-5.3 remains the name of the coding-focused Codex model, while 5.4 is for general use.
Benchmark Highlights:
Efficiency & Cost:
No Wall in Sight:
[14:41–26:03]
Anthropic’s Pentagon Trouble:
Dario Amodei’s Stand (Anthropic CEO):
OpenAI Steps In:
Anthropic’s Rise Amid Turmoil:
[27:06–30:18]
"Claude Rotting":
Claude Code as Daily Driver:
Voice Mode & Skills Growth:
[30:18–42:59]
Kling Motion Control 3.0:
Grok Imagine Extend:
Seedance & Google NotebookLM Video Summaries:
[42:59–45:29]
[46:11–48:06]
Spectre 1 (AI Audio Jammer):
Wi-Fi Surveillance Risks:
[48:46–57:04]
[54:54–End]
On Naming Chaos:
"If anything, that just makes me feel so much better about Skynet not happening tomorrow if they can't get the names right." — Kevin [02:39]
On Breakneck Progress:
"We might be another algorithmic breakthrough or three away from unlocking this AGI massive superintelligence future." — Kevin [08:34]
On AI in the Military:
"Could you imagine the Manhattan Project taking place in public with like three or four, let's say 12 different companies going after it and TikTok exists at the same time?" — Kevin [26:03]
On Claude’s Appeal:
"You have the twinkle back in your eye." — Kevin [28:36]
"Claude is my daily driver, full stop." — Kevin [28:39]
On AI Video Comedy:
"It looks like it's like a dog with an Instagram with the tongue hanging out of the mouth." — Kevin (on Kling video tests) [33:21]
"That's like a Left 4 Dead enemy when it gets agitated." — Kevin [34:32]
On AI Copyright and Creativity:
"This is a creative voice with AI tools put together... this is a really cool thing." — Gavin [50:30]
This episode emphasizes how quickly AI is evolving, both in technical capability and societal impact. The hosts maintain their signature mix of humor, skepticism, and excitement while breaking down potentially world-altering trends—from coding and creative arts to military contracts and privacy tech. Listeners come away with not just an update on the state of AI, but also a sense of the cultural and ethical negotiations that accompany frontier technology.
For full demos, code snippets, and curated links, visit aiforhumans.show/pod.