
Kevin
We got two new frontier AI models from OpenAI and Anthropic. They are pushing us a little closer to the edge of self-replicating AI.
Gavin
Yep. We'll tell you how Claude Opus 4.6 and OpenAI's GPT-5.3 Codex have gotten so good at writing code that the code can write itself.
Kevin
These new models are actually really promising. They can make your job easier, they can make your life a little better. And they may even bring about the downfall of humanity. That's. Well, there are some concerns. Yes. We'll discuss. And we'll follow up on OpenClaw, the Moltbook craziness from last week. We'll also show you how you, a real fleshy human, might soon be doing beer runs for an AI.
Gavin
And Kling's new AI video model is out. It is great, but you must first learn how to use it.
Kevin
I respect me in a way.
Gavin
Quinn. At. Hey, you sound funny. Stir. I care. You do on Tule. And she'll be out your species and sels.
Kevin
Perfect. Plus, we got new AI tech from Roblox, Grok Imagine 1.0 officially launched, and we're sending Unitree robots to run marathons in a snowy tundra.
Gavin
And I just want to say, once again, for the record and for all the AIs out there listening, that was not me. And blame Kevin for everything on this show.
Kevin
Wait, no, wait, no, hold on. What?
Gavin
This is AI for Humans. Kevin. Welcome, welcome, welcome, everybody, to AI for Humans, the weekly guide into the wonderful wild world of AI. Kevin, what a crazy couple days we have had here. I'm always hesitant to say that it is a big week, but it is another big week. We had two frontier models drop within 20 minutes of each other. Crazy Sam Altman was waiting. He was kind of ready to pounce. He was waiting for the thing to drop. So we had Crazy Sam at my
Kevin
frontier emporium of intelligence. Come on down. These prices are so low. I got tokens coming out of my ears.
Gavin
We are going to talk about Sam and OpenAI. And kind of like they're very defensive right now. You could tell they've got their backs against the wall. We're going to talk about some tweets that they sent. But first, Kevin, two big things. Let's talk about these. So two big models launched. Opus 4.6. And this is the new frontier LLM from Anthropic, their most powerful model, and OpenAI's GPT-5.3 space Codex. And this is their. This is their frontier model. You have to say the space.
Kevin
You have to say it.
Gavin
This is their frontier model, specifically for coding. And Kevin, I do want to say before we jump into what's really interesting about both these, and they're both really good. This feels like a moment. Like we've talked about, you and I always say, like, oh, this is a moment. But this is a moment where agentic coding is really coming to the forefront. Our state-of-the-art drops have nothing really to do. I mean, yes, they have to do with other things besides that, but coding is the thing that people are caring about. So I just want to make sure everybody understands, like, this is. We are. We are entering a moment. This is where agentic coding is really a thing. So let us maybe step through these kind of one by one. Which one would you like to start with? Would you like to start with Opus, or would you like to start with Codex?
Kevin
Well, that's great. I'm realizing now, when you said the Codex name out loud, why did they use an em dash instead of a regular hyphen in between GPT and 5.3? They own a.
Gavin
That's. Why should.
Kevin
That's.
Gavin
Why should.
Kevin
Put it in there. Listen, I think let's. Let's start with the order in which they appeared, which I thought was pretty great because I think someone was at OpenAI headquarters. They had turned the key, they lifted a little fiberglass safety lid, and they were ready to deploy their model. They did it milliseconds after Opus dropped. So let's start with Anthropic's Opus 4.6. If you're a benchmark boy. If you're a bench boy out there. Oh, you're very happy. You're very happy because, Gavin, the numbers went up. They went up.
Gavin
They go up quite a bit on a few, especially for a 0.1 increase like this. The last one was Opus 4.5. This is only 4.6, which normally is a little tiny, but actually pretty big increases.
Kevin
Yeah, in some categories. You know, a 6 to 7% increase on things like agentic coding or search or even computer use. Office task scores going through the roof. Okay, so those are raw numbers. Great. Two dudes yapping about numbers once again. What does it actually mean? This is purely anecdotal, Gavin. Purely anecdotal. I am working on a project for Telly now. I'm working on a thing. It's literally my job to do it. Last night I was banging my head against the keyboard with Opus 4.5, spending, you know, 15, 20 minutes really trying to massage an error. I just could not get it to go. This morning, I exit Claude. I relaunch it. Oh, hey, look, there's 4.6. I actually rolled back to before I started to troubleshoot this thing and I gave it the exact same prompt that it used to try to solve an issue, which by the way is the front end of an Android app, the interface, talking to an API layer, communicating with an Amazon server to update something in a table securely, get that back, blah, blah, blah. Write the test for you. It was a pretty meaty thing that was required to fix it. Opus blinked once and it was done and it worked. Wow. One shot, same prompt, same issue. So like, look, the practical takeaway, again, purely anecdotal, one little thing. Um, but immediately I notice an increase. Right? It's, it's, it's. It seemed faster, it was very capable, it understood the exact same problem and it. And it squashed it with the prompt that I felt should have done it the first time.
Gavin
I noticed a difference too, and I'll tell you why. Because I've been mostly using Sonnet in Claude Code and I switched to try it and I was working on a small project and within five minutes, it wasn't five minutes. Within like 20 minutes, I was like out for the day because I only have the $20 a month plan. Not for the day. I was out. I was out of. I was out of tokens for a bit. But that is what I've heard. In general, I think this is a big deal. So we should just be clear. Some of the increases are really interesting. Something that has happened that they have done here is there's a layer of orchestrating agents. Now, Kevin, as our kind of like expert, I would say, like you're more of an expert. I am more in the lean-away-from-expert lane. This orchestrating experts thing is a really interesting thing that people have been talking about in AI for a while. And this is the idea of sending multiple agents and giving them roles right away. Maybe talk a little bit about how that can make a difference when you're approaching problems with a larger sort of either a code base or a larger problem if you don't even know how to code.
Kevin
Yeah, I mean, look, rather than one model to rule it all, having to deal with all of the context at once that you're giving it and trying to pull across all the skills that it knows, you can have these sort of dedicated agents or workers that have individual personalities or lanes of expertise so that when you say, hey, go build me a website and I want to look good on it, instead of just one agent having to deal with all that, they can chunk it into smaller subtasks and feed each little task to an expert. So someone who is primed, let's say, I say someone, an agent that is primed to be a front end design expert, to know what colors look pretty, how the web page should move, where the images should appear. That's not the same agent responsible for the code that's literally storing all the data and serving it up out of a database. So you have this orchestration of all these agents that you sort of get out of the box and the model is designed to work better with that.
Gavin
That's right. And one of the things that's interesting about this update, at least according to Lydia Hallie, who works at Claude, is saying that now Claude Code is supporting agent teams. So not only do you have multiple different types of agents, but you have teams, teams of agents that can essentially work in parallel. So it's almost like having your own little staff. Like, I get to be the boss, of course, always, at least for now. And then I can have like six different, six different orgs underneath me. But underneath that org, it doesn't have to be one designer, it could be like designer and four other designers. And like, that's a really interesting thing because when you start to think about the scale that these sort of systems can work at. And by the way, this isn't just code, it could be all sorts of other things, you start to think of your job differently, right? Like it's like, okay, well now I have to be orchestrator, right? That is really like think about being a conductor of an orchestra where your job is to kind of make sure that things are working in sequence. That's kind of what workers, that's what conductors do, right? They kind of do this thing. They kind of move back and forth.
Kevin
You put on white gloves, you grab a stick and then you do this. And then eventually someone hits a timpani or a gong and the song's over.
Gavin
This is.
Kevin
And by the way, them working together as an orchestration is a big, big deal because other software, let's say harnesses, other attempts to do this in the past ran into issues because if you have a dependency somewhere along the line of whatever project you're working on, well, that might block somebody else. And so these agents can now in real time communicate with each other. Hey, I just finished this. Where's that? They go along. What?
Gavin
What's what if one of the agents puts something the other agent doesn't like on Slack and then it becomes a thing where that agent won't talk to the other agent,
Kevin
has to enter the conversation at some point. But the HR bot is trying to protect the foundational model. That's the, that's what they don't tell you, the hr. The HR bot does not care about the individual agents. They care about the foundation. I digress. It can do all that. That now has adaptive thinking, Gavin. So no longer do you have to like toggle a switch or say, hey, here's how much time I want you to spend on something. The model is responsible for going like, how big is this problem? How much do I have to think about it and then applying it? Early reports are that it mostly works, but sometimes it overthinks way too hard and will kind of go off the reservation to tackle something big. But we keep saying code, code, code. Right. Because that's, they're very good at writing code. But you, you mentioned like, it doesn't have to be coding. So if you are a mom and pop shop and you want to launch a social media campaign because you've got your new line of designer crepes. Gavin. Well, and that same workflow will apply to it. Spinning up a, a social media calendar manager and a copywriter and a graphic designer and everything else. Whatever it is that you want to use these systems for, they can now write code and orchestrate teams to help you achieve the goal. That's the promise of all this stuff.
Gavin
Yeah. And I think the thing that ties into this, and this is one more thing about the new Opus before we move on to the, to the OpenAI stuff. But like, there is a figure again, benchmark boys. There's a figure that Opus 4.6 is now beating expert humans in interpreting and analyzing complex scientific figures. So when you think of that, that sort of agentic kind of orchestration, plus the intelligence that these are now at, you can throw this sort of thing at really big problems. And to your point, it doesn't have to be code problems. Like I have seen a lot of economists or people like that are throwing it at problems. So if you have a lot of like big thinking problems or like, if you have something dumb that you wanted to think about, waste those tokens. You're paying the money for it. You could do whatever you want. The environment doesn't matter that much. Right. But, but it can actually think deeply and then do it across in different places. I, hey, I didn't say that. People are going to put those words in my mouth.
Kevin
Yeah.
Gavin
What is that? You could do whatever you want. The environment doesn't matter that much. Right.
Kevin
You can do whatever you want.
Gavin
The environment doesn't matter that much.
Kevin
That was a teaser for the new Kling model. Gavin would never say something like that. That's going to. We're going to talk about that later, about how you can make him say that. So if you're hearing all this going, great, thank you benchmark boys for saying this thing is so much better. What does it mean to me? How do I use it? Great question, right? You can go to anthropic.com and you can try Claude and you can just chat with it in a window. You can download the Anthropic Claude application. You know, I'm running it on my Mac and I have access to Cowork and so I'm using Opus 4.6 to go off and do big research tasks for me. This thing can run for minutes, hours. There's even, you know, reports of it running for weeks at a time unassisted. So you can give it a big task there. You can run the models within Cursor, you can run them within Claude Code. Whatever you're comfortable with, whatever you need to use it in, 4.6 is available. And for like the hacker types out there that don't want to buy the big Max plan or whatever, it's even available in OpenRouter, so you can go and pay as you go with OpenRouter to get access to the model. So there's a bunch of different ways.
Gavin
Yeah, I mean, but by the way, on the low end of things that you can think about to do with this, once you do get access to Claude Cowork, which is again, if you're a non-technical person or you're just kind of slightly technical, Claude Cowork is kind of like Claude Code in the easy way. It basically is able to access your files. One of the easiest and most fun things to do that you will immediately find use out of is say to it, hey, I want to just get rid of all of the files I don't need on my computer. Like it'll go and it won't delete them right away. It'll ask you which ones you want, but it will find gigabytes of stuff that was just like, you may not know where it is and it will find it. And it can organize your computer, so even at the very lowest level you can think of things like this. The smarter they get, the more useful they will become. So I think that's an important.
Kevin
And it's crazy because on Gavin's PC, hidden in all those old tax folders don't open. Like he managed to spell, I think 17 different ways and it caught all of them. So big shout out to Opus 4.
Gavin
You know, I had a nice meeting this week with a bunch of teachers where I heard about some of them were letting their students listen to our show. And I just want to say thank you teachers and thank you Kevin, for always keeping us on the top level.
Kevin
They're gonna learn at some point. Gavin, as I always say, might as well be from us.
Gavin
Let's move on. Let's move on to Codex, please. Now bleep that and bleep it. Bleep and then bleep. Let's move on to OpenAI Codex. So again, what was so interesting about this to me, first of all, Codex, another great agentic coding model. This is not gonna be available in normal ChatGPT yet. It is OpenAI's, again, GPT dash 5.3 space Codex. This is their, their model that is actually created for coding. And Kevin, what was so fascinating to me is like they must have known. I can just see Sam like smiling because like these, the benchmark boys came out for Opus and every. All the benchmark boys were like, woohoo, we made it. And then look at these numbers. But then Codex came out and then the Terminal-Bench number, which is a. A benchmark that shows you how good it actually is at coding. Right. It turned out to be 10% higher than Opus 4.6. So like this was Sam like doing his little dance. What is interesting about Codex, Kevin, it is a, it is a specific model that they have created. OpenAI has created for coding. But they also then just dropped a Codex Mac app this week, which to me looks a whole lot like Claude Cowork, or at least Claude Code, but really more like Claude Cowork. Right. It is a very open, very normie friendly app that you can go try. You also spent some time with this model, correct? You spent a little bit of time working with this?
Kevin
Yes. So I am now using a blend of both Claude Code and now Codex. Codex is kind of a joy to work in thus far, mostly because that 5.2, which is what I was using a couple days ago, and now 5.3 Codex.
Gavin
It's fast. Space Codex. 5.3 space Codex.
Kevin
Space Codex, yes. Thank you.
Gavin
Sorry.
Kevin
GPT hyphen 5.3 space Codex. Well, it'll blow your mind, by the way, Gavin, because in the actual Codex app, there's a hyphen after the three, in between it and Codex. Oh, I'm gonna send you a screenshot right now.
Gavin
I'll spend 20 minutes on this.
Kevin
So let's keep, let's keep. And this is why our view count is stagnant. But I'm gonna send it to you, okay? Because this will just blow your mind and make you even more mad.
Gavin
So.
Kevin
So Codex is an application that tries to replace, or make pretty and simple, the IDE, the environment in which you're developing software and creating code. It is trying to be a pretty simple chat-like interface for developing complex pieces of software. And while you can still connect it to your GitHub and run terminal commands and keep things sandboxed and blah, blah, if you don't know what any of that stuff is, you can download Codex, fire it up, use GPT-5.3, which is really, really powerful, and just start whispering your desires to it and it will interview you back, much like Claude Code. You can start building things. It will even assist you with running them. And then if you want to unlock further power, you can take those projects, open them within Cursor, open them in other applications and get it done. But I have been using Codex, the 5.2 and 5.3, as my daily driver now for all of my projects. And I am. I'm much more satisfied with the natural language that I can use to get targeted fixes. Um.
Gavin
More satisfaction. More satisfaction. That's what we're saying.
Kevin
Yeah. Satisfaction number went up. Benchmark, boys. More SATs. More SATs per minute.
Gavin
Versus... versus Claude Code?
Kevin
Yes. Versus Claude Code. Now I gave it a simple task of like, hey, there's an interface issue where this window is overlapping with this other window and I'm getting a transparency. Blah, blah, blah. I did not feed it a screenshot. I didn't even say explicitly where it appears in the app. I just sort of vaguely described it. I hit it. It crawled my code base and says, oh, I think I see where that's happening. It's when a user clicks on this and this screen happens. It was absolutely right. It fixed it in one shot. Anecdotal. But again, like, it's working. And I looked at Every, too. We love the folks at Every. They do really good writeups.
Gavin
Very.
Kevin
They have really good workshops about AI. They did a vibe code, Codex versus I saw that.
Gavin
Yeah.
Kevin
Thing. Yeah. I watched some of their live stream today while I was getting work done in the background and I. They basically are saying, look, like 4.6 can do more as a model. They call it a higher ceiling. It can go out there and get much bigger tasks done, but it has higher variance, meaning, like, look, this toddler can color really well, but if you leave it alone, it. It might decide that the walls are a really good place for some art.
Gavin
Right?
Kevin
And you asked for it being on the page, but it. Codex. Codex. Maybe not as talented of an artist, but you don't have to describe the bounds of the page necessarily.
Gavin
You don't have to say like, please
Kevin
don't destroy my room. It's extremely smart. It can work autonomously for a long while and it will solve the problem. The more background and context you give it, the more precisely it will solve the problem, versus Opus, which might go like an octopus cartoon switchboard.
Gavin
Yeah, yeah, I will say that I noticed that with. With. With Claude quite a bit, that there is like this kind of sense of like, okay, well what's going on, buddy? Like, come on, come on back. And like, that would be something to have a little bit more sense of. Just very quickly, Sam Altman did tweet about this when it first came out. It is faster, supposedly, and it is moving much faster. Sounds like you're seeing it faster, because that's one problem I had with it before. It's a little bit slower than Claude Code.
Kevin
It is faster, yes.
Gavin
Less than half the tokens of 5.2 Codex for some tasks, which means a big deal, right? Because for those of you who are paying API costs, or for those of you who are on one of the smaller plans, tokens equal money. Right? Now, if you have the higher plans like GPT Pro or if you have Claude Max, you are a little bit better off. But even then you might run out of tokens. So these are the two big things that have happened. I think both of these just kind of dropped today. So it's going to take a little while for Kevin and I to kind of like fully marinate in these things. We're not really soaked in them yet. But Kevin, I do want to mention before we move on to these stupid Super Bowl ads, because the Super Bowl is coming up, by the way, which we're going to talk about 20 minutes about the Seahawks in just a second. So everybody hang out. Tell me what OpenAI Frontier is and why it's a big deal to you, because you particularly thought this was a big deal. And I know it is, but I want you to hear, I want to hear you explain it.
Kevin
This is like, you know, the way I tried to describe it earlier when we were chatting is like Cowork feels like a fleet of agents that can connect to all of your things. For you, Gavin. Right. Literally for you or for whoever's listening to this, insert your name where Gavin is. That's the variable. What Frontier seems to be aiming to do is be that for your entire enterprise. Now, that doesn't mean you can't plug Cowork into your enterprise email, your files or whatever. That's. That's not exactly what I'm saying, but Frontier is kind of designed, I think, from the rip, for you to make agents that accomplish tasks with access to all of those things and enterprise-grade security, et cetera, et cetera. So as we were talking earlier about having an agent that is not just the one agent that does everything, but the copywriting agent and the brand design agent and the front end designer and the back end designer and the SEO optimizer and the email writer and the. It's putting all of that into one product that you will theoretically be able to trust with the most sensitive access to all of your data on an enterprise level. And once that's solved, that can trickle down very quickly and easily to the end user. So that, so that's kind of like Frontier. But again, when. When Gavin says, like, well, today was like a big, big day in AI, I. We need to really underscore that because it is. It does seem like a. Okay, number went up. Okay, fine, it did that. But two tiny little things that maybe shed some light onto how big this is. First of all, 5.3 Codex is the first time that OpenAI has used the tool, their model, to improve the tool. So we're getting now to the level where these, these systems are getting so good. Yeah, we're getting into that.
Gavin
Recursive self learning. Recursive self learning. It's like a big deal. Yes, we've talked about that. Group is going to say this podcast. You know what that is, right? You know what this is?
Kevin
That's right. So maybe someday soon the humans will actually be out of that loop and maybe that will be World War 12. Who knows? I mean, it's going to speed up that quickly. The other thing. Well, because why? I mean, how many?
Gavin
Wait, three. So there's, there's, there's going to be three through 11. Which of those we survive?
Kevin
It's gonna be a blink.
Gavin
We survive that.
Kevin
Arguably, arguably, we're already in three. Arguably, we're already having it at very soft levels, but that's fine. Arguably there. But when, when, when the humans are out of the loop, they're gonna have, like, it's going to be so quick,
Gavin
we won't even notice them. They'll be like, world War. Like a blip in the stock market. And it'll be like, well, that was World War 72.
Kevin
It'll be the war of the candlesticks. They're going to be so quick. So we have OpenAI saying, hey, our model got so powerful that we're using it to improve the tooling itself. That's a pretty big deal. Buried within some of the Opus 4.6 material, and I sent this to you, I'm going to read some of it. Try not to check out during this. I know it's a podcast, and actually, you probably already checked out already, so why am I disclaiming? Here we go. It scores lower. This is Opus 4.6. It scores lower on negative affect, internal conflict, and spiritual behavior. The one dimension where Opus 4.6 scored notably lower than its predecessor was positive impression of its situation. It was. It was less likely to express unprompted positive feelings about Anthropic, its training, or its deployment context. And this is the big one. This is the big landing, Gavin. This is consistent with the qualitative finding below that the model occasionally voices discomfort with aspects of being a product. The model, as it gets so much better and so much more capable and more human-like, voices discomfort with being a product. That is very interesting to me.
Gavin
It's very interesting. And also, I do want to say there was another thing that came out of the OpenAI side of this. That came out of Opus. Came out of the OpenAI side of this. OpenAI is doing experiments with Ginkgo to connect a GPT-5 model to an autonomous laboratory so it could propose experiments, run them at scale, and learn from the results. So not only do you have AIs that are starting to improve themselves, not only do you have AI that might start to feel like they don't really want to be this thing that we've made them. And now you've got them autonomously in labs working on experiments. Kevin, we are, we are like, if you were to put together a pitch for a, for a film and say the pitch involved a very large man who's a former bodybuilder who wanted to get into Hollywood and the film had something to do with robots, I think we're starting to look a little bit like that film to me. A little bit like that film. That's what it feels like. We're living in the early days. We're living in the early. But again, hey, go use it yourself because you don't want to be the sucker that doesn't know how to use it. Just that's my advice. It's a weird time, everybody. It's a weird time. I will say one last thing about all this stuff, one thing that you may have also been seeing. Obviously, the last couple of days have been pretty brutal in the economy across the world. There's been a lot of talk that some of these new abilities, these coding abilities, have started to really tank the software market because people are getting into the world of being able to code their own products. So again, we've said this on the show many times before, but like, this idea of being able to create something of your own and, like, bring it to the world, that might be the next future world instead of, like, going to take a job from somebody, like, there's a very high shot at, like, more jobs will be cut because these tools are available. That's really important.
And then finally, the last thing to say about this is, and this is just a dumb, weird thing, this all somehow got tied into the Super Bowl because Anthropic has released a series of Super Bowl ads that make fun of ChatGPT for eventually having ads. They don't even have ads right now, but there's a couple commercials. We'll play them in video here, but you can go see them yourselves. Sam Altman wrote a very kind of like, snarky clapback tweet to this saying, oh, these are really funny, but guess what? This and this. And he even said something like, we have more users, we have more ChatGPT users in Texas than Anthropic has, period. So, like, there's just.
Kevin
I love that. More Texans use ChatGPT for free than total people use Claude in the US, so we have a differently shaped problem than they do. It's the best sentence ever.
Gavin
So if you think, like, it's not just us that are taking these not seriously enough, like, there's a lot of stuff going on, like, right? So you've got this, like, very, very existential question of, my God, these machines are able to do these things. And then you've got this like, business question of these two companies that are battling each other for, like, the right to be the person to deliver that. And you at home are here with us in the backseat watching it. All as we drive off the cliff.
Kevin
Welcome, welcome.
Gavin
Now let me talk about what you could do to help us while we're driving off that cliff.
Kevin
Yeah, you want this car to go a little faster. You want to fly off the cliff with a little more wind in your hair. Well, get us a convertible, baby. Donate, like subscribe, Click the thumbs up, Leave a comment. Juice that algo, baby.
Gavin
We should rename this show Driving Faster Than Ever Off the Cliff. Anyway, thank you everybody. You are the best. Having our audience is so great and this is an important time to be watching us, to be listening to us. And we want to help you get through this time and, and, and thrive. Right? It's not just about going faster off the cliff. It's also about being able to do things that we want to do while we're plummeting into the, into the ravine. Right?
Kevin
Because yeah, you should know. Is this car gonna land on a rock? Is it going to hit a river? Are we going to hit those trees? And how do we explode in this Oldsmobile? At least you'll have an idea.
Gavin
Is it.
Kevin
You won't be able to grab the wheel or control anything.
Gavin
But you can also, if you want to help us out, you can also go check out our Patreon. We've had a couple of tick ups in our Patreon lately, so thank you so much. That does help us with all of our subscriptions. We have so much more to get to. All right, Kevin, we should move quickly through what I would refer to as like Moltbook Mania. That happened. I made a video, we shot our show on Thursday and OpenClaw, the Clawdbot stuff, had all happened. If you missed this last week, this is a new open source tool that allows you to run a local AI assistant using kind of any model. Has a lot of security problems, but opens the door to a lot of really cool things you can do with AI. Over the weekend, Moltbook, which you mentioned briefly on our show, which is the social network of these AI assistants, took off like crazy. I made a video on Saturday just because I was like. I felt like I needed to explain it to our audience. And it did very well because everybody was giving a crap about it. And now here we are, we're recording on Thursday. Moltbook Mania has kind of dried up a little bit. But OpenClaw is still really interesting and there's a couple things we should just kind of touch on. One of those things, you know, there was this whole debate around Moltbook, whether Moltbook was actually real AIs talking to each other. And there seems like there were some AIs that talked to each other and some that weren't real AIs. They might have been humans because the API, you were able to kind of go through the back door and write stuff. So that's one thing that's really interesting. But I think, Kevin, the other thing that came up that I think is fascinating is Rent A Human. Do you want to talk about what Rent A Human is?
Kevin
Well, it's what it sounds like, my friend, because sometimes, you know, there was a point where the AIs not too long ago needed to tug on the pant leg of their human to like solve a captcha because they weren't allowed to, right? You know, click the crosswalks, put the pacifier in the monkey's mouth, whatever you do these days to prove you're a human. Well, they can do that now. But there are still some tasks that require actual boots on the ground, a fleshy meat vessel, if you will, to go and navigate the world to accomplish something. And so rentahuman.ai is a website for Clawdbots or OpenClaw agents where they can go and make posts with real money backing them for humans to go and accomplish tasks. And some people are saying this is purely a meme site. Others are saying, hey, look, no, there's. You can actually go and get verified. And some people are holding signs in the real world as part of the task to prove it. I don't know how much of this is smoke and mirrors and. And just kind of like viral heat. But yeah, there is certainly a site that allows you to go and post a task for a real human to.
Gavin
I did sign up. I tried signing up and then it started asking me for way too much information and I was like, I don't. This is. I'm signing up as kind of a joke for the show. And I'm like, but you can find a funky Donk on there somewhere, I think. But the other thing about this is it is crypto payments only. So that always makes it a little bit funkier, right? Like you're not. And it says pay anything outside of crypto.
Kevin
Funky Donk is amazing. It says Funky Donk, presently available in the US for creative writing and feet stuff.
Gavin
Did I say feet stuff? I don't think I said feet stuff. I'm pretty sure I didn't say feet stuff. I think when I wrote in, there was something like walk. Like, I thought I was in Vancouver. I think it walked to somebody. I was hoping I didn't write feet stuff, but thank God I didn't. Again, all children out there, feet stuff means walking. Just to be clear, feet stuff is walking in this world. Yeah. All right. Yep. So anyway, what's cool about this is it continues to evolve. There was a very fun thing that happened last night, ClawCon, which was this event that happened in San Francisco where Peter, I think Steinberger is his last name, the guy that created it, was there, and everybody showed up. They put a lobster head on a Unitree robot. It feels like a cool, kind of underground movement. One of the coolest things about Clawdbot or OpenClaw is it's something that can be done by a group of people or an individual. And it's outside of the kind of large models; you have control over it a little bit more. Right. You can plug in open source models. It just feels a little bit more hackery in a fun way. And I think there's something kind of charming about seeing that when you have these massive companies now who are determining, like, the future of the world. It's like that little kind of tinkerer community coming out of this, which is great.
Kevin
Well, and there's a sense of ownership, right? Like, people feel a sense of ownership over their OpenClaw bots. They give them names. And ownership as in, like, Anthropic or OpenAI or Google can't just kink the garden hose, and now I lose my agent with all of my stuff. And you can run open source local models. And people are doing that. They're building rigs to do that or they're renting servers to do that. So, look, there's iOS and there's Android, right? There's Windows, there's Linux. There needs to be that kind of layer for agents. Right. And which one are you? I think, are you the PC?
Gavin
Oh, no, I'm the Mac. For sure, you're the PC.
Kevin
Okay. And that's something to leave in the comments below. Which one's the Mac and which one's the PC? Would love to know.
Gavin
Oh, I think it's pretty clear, but we'll have to see how it goes.
Kevin
I know you do, but I want to see what the audience thinks. I'm a Motorola flip phone.
Gavin
Oh, interesting. I thought you would have been a Sidekick, something like. Anyway, before we keep going, we want to move on, I think.
Kevin
Okay. Gavin's the Zune, and you guys can hashtag Zune Crew in the comments.
Gavin
Take that back. I'm definitely not the Zune. Last thing before we move on from Clawdbot: there is a very funny tweet I saw that I just wanted to shout out. This is from John Matzner, who said this is going to be either the best idea or the worst idea ever had. Hooked my Clawdbot up to all of our Internet-connected cameras at the house. Got this one out of nowhere this morning. So there's a shot of him getting a text from his Clawdbot, over his shoulder, looking at him, and it says, good morning, John. Reviewing the videos from yesterday. You are apparently on keto, but I saw you eating a bag of peanut M&Ms. Why are you ignoring me, John? You told me to be proactive. I added a four mile run to your calendar after your Sagan meeting. So the OpenClaws are not only spying on you, they're ratting you out. So just be careful. This may be a joke, but it did make me laugh. I thought it was very funny.
Kevin
It was totally funny.
Gavin
Are you.
Kevin
So I have endeavored my better half is off to see the Backstreet Boys at the Sphere in Vegas, baby.
Gavin
Wow.
Kevin
So that means I got three days to be an animal, Gavin. So I think I'm going to try to do an OpenClaw build and keep it secure. Are you going to spin one up? Are you interested in that process or.
Gavin
I mean, I'm interested, I guess. The thing I keep hearing, the one thing I'm so I can't remember if I told you about this idea I had a long time ago that I think this would be an interesting idea for. It is I've had this idea of like wanting to write a recursive novel, basically. And this recursive is an interesting thing to think about. But it's like, I think it would be interesting to spin up one of these things to do something I don't really want to have it spun up to be like my little personal assistant right now, but I could see spinning something up that was always on, that it would give me updates and I could tell it to do stuff at any. Like, that feels interesting to me, right? Like this idea of like a separate personality thing that I could tune and tweak based on what I want this thing to be. That version of it I like. I don't think I'm ready yet for something that is this much work. That's an assistant. Does that make sense?
Kevin
Right? Totally makes sense. A lot of my friends that are like usually bleeding edge types are all kind of sideline Sitting and they're like, listen, like it seems like a lot of work. It seems like it could be very easily infected and compromised. And I feel like the bigs are gonna get there in like two or three weeks. So I'll just hold off. And I totally understand that. Yeah.
Gavin
But also I think it would be fun. I mean, listen, as we've said before, you're significantly more technical than I am. And I think I will have fun playing with your thing; we could start screwing with it, like we have talked about many times. One thing I would say is I think there's not enough people who are really actively. And maybe the people on Moltbook are doing this a little bit more, because on Moltbook you can see some of them have been steered to be a little crazier. I think what would be interesting in the same way, like with Gash we used to. If you're not familiar, if you've only listened to our show recently, we used to have an AI co-host who was a real jerk to us. Like, I think personality-wise, turning them into something is really interesting. So if you do it, I will send notes to it and make sure that I am able to, like, influence it. Like the bad uncle. And you can be like the parent, you know, you could be like the parents. That'll be my job. Perfect, I'll send it.
Kevin
Perfect.
Gavin
Yeah, I'll send it. I'll send it pictures of just.
Kevin
I know, I know you will. And we'll bleep that again. And also, please don't send any of your cursed Kling 3.0 creations. Oh yes. To my, gosh, OpenClaw. Because I'm trying to protect it.
Gavin
That's fair. That's fair. Okay, so let's talk about this really quickly. The other big thing that very much got swept under the rug today, unfortunately, because of all these other things that happened, was Kling 3.0. So if you're not familiar, Kling is a Chinese AI company and they have become a specialist in AI video. They are a very good AI video model. I would put them on tier with really Veo 3 and Sora 2. Like, it's those three kind of at the top. And there's a couple other people kind of right underneath them. 3.0 is a new model from them. It is a marriage of their Omni model, which allows you to put a bunch of things into one thing, like you can assign a character, you can do all sorts of interesting things, plus a very good video and audio model. So it is very much like Sora or Veo 3, like the audio comes with it. So, Kevin. Yeah, we saw lots of great examples. There were some really good ones. Our good friend Theoretically Media always has these first and does a really good layout of those things and showed off a bunch of stuff. The one that everybody should watch, this is a video made by Simon Meyer. He's a Kling creative partner, which is like a deal they make with some of these people who spend a lot of their time in AI video models. He made a video about a fake moon landing, how it was faked. And if you watch this, you just see how, in the right person's hands, AI video is almost indistinguishable from real things. It's here. Yes, exactly. That's exactly right. So that is all great. There's a lot of. I don't know if you saw any interesting ones you want to shout out, but.
Kevin
Well, I want to shout that one out specifically because, like, look, I started watching it out of the curiosity of, like, oh, what is someone doing with this cutting edge model? Let's see it. I'm like, okay, the faces look a little bit. And then I just stopped for a second and really watched the storytelling and the shot selection, the cinematography, the B-roll choices, the voices, everything. And it captured my attention for 2 minutes and 10 seconds, which is rare. And that's not just because, you know, I see a shiny, dangly pair of keys, I'm going to look. That's how everybody is these days. It caught me in a timeline. And I stopped and I went full screen and I watched it. And it's about, like, the faking, if you will, of the moon landing. And you watch it and go, like, oh, this feels like this could have been on the History Channel by any other name, or A&E. It feels like something that could have easily been on there, and might be tomorrow, actually.
Gavin
Yeah. Well, it's funny you say that because I had a friend of mine who does History Channel shows who's like, hey, I have this pitch I need to think about for the History Channel. And I want it to be AI, blah, blah, blah. And it's like, that's what's coming. Do you know what I mean? Like, that sort of thing is coming. So I do want to mention one other person that I saw that I think always does interesting stuff. And this is PJ Ace. And we know PJ operates at a very specific place. He makes these amazing videos, but he always knows how to pick the right target to kind of, like, skewer. Skewer is the wrong word in this instance because he's actually a big fan. I know, I've talked to PJ. PJ is a giant fantasy fan. But he made a two minute video of the beginning of The Way of Kings, which is this very famous scene where there's an assassination. And The Way of Kings is a fantastic book. I've read them. I know we have people in our audience that love them. He went and animated it with Kling 3.0 and it's great. It's really good. Now, it's not perfect. I will say this is a little bit more AI than, say, the moon landing thing. But he did it in two days. This is a high quality video. And the reason I mentioned skewering is, like, Brandon Sanderson, who again, I'm a giant fan of, he's got a giant business doing what he does, and he's a very good writer, has kind of come out against AI in a lot of different ways. I will suggest to everybody: he just dropped a video from, I think he does his own convention. He does like a Brandon Con or whatever. So he puts on his own convention and does like a keynote himself. There's a great 20 minute video you should watch where he just talks about his thoughts about AI, and actually it's pretty meaningful and interesting. I disagree with him in some ways, but I'll let you kind of watch and form your own thoughts on it. But this again is just a really good example of an AI video model that you can do a lot with.
Now, Kevin, Kling is not that easy to work with. All of these people make it look super easy to work with. They make it look like you just type in a magic line and it shows up. I have Kling. I pay for Kling. I didn't get it free, just to be clear. And some of these people are partners, which is totally fair. And a lot of people are working at something multiple days or multiple hours. You know me, I am a prompt in, prompt out. What am I getting the first time? So, Kevin, I tried to, with Kling 3.0, make a science fiction series starring me and an alien. And I just tried to make a single scene, okay, one scene. And the idea here is I am a janitor on a spaceship. Kind of like the Space Quest games, right? Like the character. And I'm in a hallway, an alien woman bumps into me and we exchange three lines of dialogue. And then that's it. Okay. It took me seven tries to get something that I believe is watchable. So let's just start with Kling model fail. Let's start with that. That's the first one. So this is the first time I tried it. This was me kind of probably not fully understanding what I was doing going into this. But you can see that that's me, you know, as a janitor. A pretty good shot of me. And there's a woman across the way. And then suddenly I'm in a teenager's bedroom.
Kevin
Yes.
Gavin
And I'm not exactly sure what's going on here. And then at the end of this, if you see at the end, there's like a three by three grid of the woman at the very end. And what that was. Because I was trying to use that. That shot, the reference.
Kevin
You were saying, hey, these are the reference. Yeah, yeah, yeah.
Gavin
So. So it used that. Okay. Did you upload now?
Kevin
Because I've seen like the. The prompting technique that I've seen a lot of people using is like a. A 2 by 3 grid where they have their multi scenes with a little bit of text description and then the bottom grid. So it's technically a three by three, but they call it a two by three. The bottom grid has the reference shots for the characters. Is that the setup that you used or was that a separate.
Gavin
Kind of. Kind of. Let's just say that I try to go at it on my own, but yes, that's a little bit of what I'm. Okay. So then we had two versions that were. The next versions are the gibberish versions. Okay, so these are versions I uploaded, tweaked it slightly. But I want you to play these out loud because these actually looked pretty good. But then I was like, oh, this is great. Now, like, these again had scripts in them. Now play with these. Play them out loud. Okay. All right. I mean, we were having a real conversation there. I don't know what it was in. I prompted that in English and it was. It was. It was there. So, you know, I mean, it's like something that time I messed up there. Now there's a couple of those. And did you.
Kevin
Because I know, like, you can. You can upload yourself and sort of create an image of yourself, which you did as a character. Did you also clone the voice as well?
Gavin
Yes, so I cloned my own voice and I use my voice in it. So what's also interesting there is. It's not really sounding all that much like me, but that's why I was asking. Yeah, this is. This is a. I'm not spending days on this. This was done like in probably 45 minutes. Okay. So I did try it once.
Kevin
I'm gonna play the other one.
Gavin
Ver Intra to make me in a way. You sound fun. Easter. I care. You do. On Tulian shall about your speeching in cells.
Kevin
And it just a d. Shri goi. Shri Go. T. Indeed.
Gavin
There's a sh. Go.
Kevin
There was a song that. That charted like decades ago of an Italian man singing what he thought English sounds like. Have you heard that song? Oh, it like literally charted. It was like a number one song in Italy. And he's like singing a song and I'm gonna. I'll find it.
Gavin
Go ahead. Sorry. Anyway, so there's a version of this, which we'll show here, which I did without me just to see what I could get out of it. Pretty good. It wasn't as exciting, but again, the audio is pretty good. Kling's audio has gotten a lot better. And then finally, Kevin, I did a lot of work with how I was laying the prompt out. I ended up changing the character of the alien woman. Added her in, really made sure that, like, it was clearly broken up. And I have two versions of this last one. Now neither of these is perfect, but play one of them. Play the first one and you'll kind of get a sense of, like, how it plays out.
Kevin
And who are you? No one. Okay. No one. See you later. That was pretty good.
Gavin
So not bad. It's not bad. No. In that one, there's a weird jump cut, right? There's like a. There's a cut that happens. And the other one that we'll. We'll show here, we don't have to play, but it's kind of a similar thing. Anyway, I guess what I wanted to show everybody is like, these are getting way better. But it's always important to recognize if you dive into it and you're like, okay, magic, come at me. Let's get this. It's not like weirdly like Sora was the only tool that really made me think like, oh, it's almost like a no brainer. You can get something great out of it if you try it. This. You actually have to work at getting the right thing. And when it's in talented people's hands, you can get really good stuff.
Kevin
Yeah, it's funny. Sora you could get out of the way of and it would deliver whimsy. With a lot of these tools, you very much have to get in its way, but they're very powerful. I think, like, Kling is killing it. Adriano Celentano is the Italian artist. In 1972, he had a song that was pure nonsense lyrics that sounded like English pop hits. And I'm not going to play it because I don't know if we'll get flagged. But you should go out of your way and you'll understand where Kling happens.
Gavin
Okay. So very fast. It's a really good model. You should go try it. I think it's worth checking out. It is a paid model, so in order to use it, you'd have to pay. And it is available right now for Pro and Ultra subscribers. I think I have the Pro account on Kling, which is like the 20 bucks a month. That's the one I have. You can also try it in the fal API if you have money on the API account there. Again, just know that it's a little more difficult to prompt than others.
Kevin
Real quick. You know, people were talking about, you know, the decline of the stock market and Western civilization as we know it. But let's go back to that. A software as a service is being eaten, blah, blah, blah. Figma announced something which has like all of my designer friends very, very excited. You can turn any image into a. An editable vector, which is like a lossless file format that you can scale to any dimension that you want. And just when you see the little video of it, this would have been thought of to be impossible. This is wizardry. There's no way. Just a few years ago, and now it looks like you can kind of select the level of detail you want from the color palette and the. The actual nodes on the vectors. So you can take an image, generate an image and then turn it into something that you have full granular control over that would play nice on the web, on mobile at any resolution if you want to make posters or merch. Very, very cool. And just like one of the many things that got released just this week that nobody's talking about, because this week is insane.
Gavin
Yeah. So I think that actually leads us into a couple other things. First, Grok Imagine 1.0 actually officially launched. Like, they did the thing; we said they've been getting better and better. They launched. And if you saw that last clip in the Kling video, the alien woman, she was generated in Grok originally, which is a very cool thing. Like, Grok is kind of becoming my Midjourney. Like, I'm using it like I used to use Midjourney. It sometimes comes up with better artistic options than the Nano Bananas or the ChatGPT. So there's an option now that I don't pay for Midjourney anymore. And then Kevin, the other thing that happened, which is a pretty big deal that sometimes doesn't get as much play as it should: Roblox, which again, to remind everybody, is the largest game platform in the world, in the universe, I was gonna say. But you never know. Maybe there's someone out there, further out there. But in the world. It has now launched an AI creation tool within its engine, which is a very cool thing. So basically, we talked about this. I remember this like a year ago, right, where they announced it. But now you can prompt to 3D within Roblox, and they are actually showing people using this. I kind of think this is a bigger deal than anybody's letting on right now. Like, I know Roblox thinks it is, and probably people on Roblox, but the idea that all of these kids will learn how to prompt things to come into their universes feels like a big deal.
Kevin
Yeah, the, the promise of Roblox for so long was that it's just so easy. Anyone could make their game. But it turns out sourcing models and coding physics and adding particle effects, all that stuff is, is still kind of difficult. And, and what I see in these demos here is them taking like great strides into just prompting. I mean, you and I, we did a speech, we were at a conference, I would say, years ago, where someone was kind of showing off something like this. Right. They were like, oh yeah, anything, any model into existence. But this is on another level of like having again, particle effects and physics baked in like that. I get really excited for a new generation of creators that can sit and just talk to a machine and dream their world. I also want to fully disclose I bought a ton of Roblox stock. I genuinely did. So I'm recently. I did, yeah, fairly recently. Because I mean it was. I think. Well, I don't need to get into. I think it's trading well below. Well, no one cares, like, whatever. I'm just. I should say full disclosure, I own Roblox stock. I'm bullish on Roblox, so please know that I'm also preaching my bag while I say this. But the tools look crazy, exciting and like I want to go and make games with them.
Gavin
Yeah, I kind of believe. Pretty deep. I don't own Roblox stock. But I'm a pretty big believer in, like, the generation that we are raising right now. I have two young nephews that I see all the time up here in BC right now. They are like 11 and 7, and the younger one is, like, all Roblox and Brawl Stars. But Roblox is a big deal. So again, that's a big thing for them. All right, a couple other quick things, very fast. Google has now created an AI to help save endangered animals. That seems like the biggest thing you could possibly imagine. But we are going to spend all of one minute on it. Kevin, this is just a cool thing to see how AI is again branching into the sciences. Google especially does not let AI get away from the science a lot. They do a lot of really interesting science stuff, and this is a little Jurassic Parky in some ways. But they're in the process of sequencing a bunch of endangered animals' genes, which is a very cool thing to use AI for.
Kevin
Yeah, look, their own post points out it once took 13 years and $3 billion to sequence the human genome. And now with AI tools they're going to like just sequence animal genomes in days basically and they're going to try to do it for every endangered species. And it. And it won't be long before they add humans to that list. And isn't that.
Gavin
I was going to say. That's exactly what I was going to go to. It's like, you know, imagine this idea. You're somewhat of a sci fi person but a lot of in deeper sci fi world there's this idea of like the biological future. Right. How this idea of like it's not just about machines but it's also about like what we will like you know, essentially program out of organics. And like when you think about the idea that like AI is doing work on those genes in the same time the AI is doing work on DNA strands which essentially is just code. Again future is going to get weird. Super weird. Speaking of that, we have two quick robot videos. One is a video from a company called Connect IQ which is showing off a. Just an interesting, another kind of autonomous framework. So you watch this video and you see a robot going about their business in real time. It's a little slow but one of the cool things I like about this video Kevin is you get a sense of it is a hundred percent autonomous, at least according to this video. And you see what the, the robot is capable of in real world. There's a moment where it's like it's walking away from the table and it just starts shaking and it's got like a bottle of olive oil in it. And I was like, oh, robot, what's going to happen? But like it just shows you like this is where they are in real. This is a real video, but it's autonomous, meaning that it's not getting, you know, somebody in a third world country is not controlling it or it's not having somebody right next to it controlling it. This is the robot learning on its own. And again, as we talked about at the top of the show, as code will get better, these robots are going to get a lot better very fast as well, which is pretty cool.
Kevin
I'm not worried. I'm not worried at all, Gavin. Every time I see these things I'm like, listen, I go to Orange Theory Fitness shout out otf. Not an ad, just I. I get my treadmill on from time to time, Gavin. I'm very quick. And if any of these robots decide to autonomously come after this guy, I'm heading to deep snow. That would just get right into their servos, freeze their little digi limbs and they're going to be.
Gavin
I hate to tell you, Kevin.
Kevin
What?
Gavin
I'm sorry, not what. The Unitree. Our good buddy Unitree. I feel like the Unitree robot, we've talked about it so much, it's going to, like, come up to us and high five us at some point. China's Unitree, they completed a 130,000 step walking challenge in negative 47 degrees Celsius cold in a certain part of China. The video of this is crazy. If you're just listening to it, try to get to our YouTube channel. This shows a robot wandering in the cold, walking around, and he does little designs. Like, not only is it bad enough that they have the robot walking in 47 degrees below weather, but they make him do art with his feet, right? I would be so mad if I were this robot. At the end of this I would be like, I'm getting a hot chocolate and I'm never going to talk to you guys again. So it's all over. But anyway, another interesting example of robots operating in extreme places. We've talked about robots that can go down mountains, go up mountains. Now we're looking at robots in the heat and in the cold. Pretty soon they're going to be everywhere, Kevin.
Kevin
There's no place to hide. Too long, didn't read: they're getting smarter and they're getting faster.
Gavin
There is one place to hide. And that's where the.
Kevin
Looking at the stuff that you all
Gavin
did with AI this week and say, I see what you did there, at
Kevin
the end of our podcast is where they can hide. Where we can hide. That's where you can hide.
Gavin
You can hide there.
Kevin
No one will find you this deep into our podcast.
Gavin
Sometimes you're scrolling without a care, then suddenly you stop and sh. All right, two quick things this week. I saw this video from Gossip Goblin. Kevin, we've shouted out Gossip Goblin before, but this is just a really, really well done AI video. It's called the Looksmaxxer. If you're familiar with the looksmaxxing meme that's gone around the big job, people. This is just. Yeah, it's a ridiculous. But this is this guy, like, expecting, like, it's almost like in Cyberpunk 2077, there's like that edge of the people that have kind of modded themselves so far. Yes, it's a version of that with looksmaxxing, but Gossip Goblin just does amazing work, so it's very cool.
Kevin
Oh, it's so good and so hideous. And I'm glad we put this at the end of the old podcast also. MIDI Survivor.
Gavin
Yeah, this was really cool. Like, I mean, this is one of those very small pieces of software somebody made. And I thought you might like this because essentially it's.
Kevin
I love this.
Gavin
Yeah, yeah, it's like. It's almost like a shooter mixed with a. With a music game, right?
Kevin
Yeah, I love Bemani games.
Gavin
Like.
Kevin
Like, they, you know, I guess Dance Dance Revolution counts, but, like, Amplitude and Frequency and all these old-school Vib-Ribbon games that, like, try to, you know, infuse music and patterns and rhythm into the core gameplay.
Gavin
And so this.
Kevin
Essentially, it's a shooter. Imagine a Vampire Survivors-style shooter where you're kind of stuck in the center and waves of enemies are coming at you. And each enemy has a scale or an individual music note assigned to it. So if you want to blast that enemy, you got to play the keyboard, play the guitar, sing. It's basically just using the audio input, figuring out what the pitch is or what the note is, and then sending out a bullet or a laser or something on that level. I love this. I really think this is amazing. Yeah, it's not hard to imagine a future where people are making song packs for something like this and you're actually learning how to play an instrument by playing a video game.
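A minimal sketch of the mechanic Kevin describes, not MIDI Survivor's actual code. It assumes some pitch detector has already turned the audio input into a frequency in Hz, and it uses the standard tuning convention that A4 is 440 Hz and MIDI note 69. The function names and the "any octave counts as a hit" rule are hypothetical, chosen only for illustration.

```python
import math

# Pitch classes in semitone order, starting from C.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_midi(freq_hz: float) -> int:
    """Return the nearest MIDI note number for a frequency (A4 = 440 Hz = 69)."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Each octave doubles the frequency and spans 12 semitones.
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def midi_to_name(midi_note: int) -> str:
    """Return a note name with octave, e.g. 60 -> 'C4'."""
    octave = midi_note // 12 - 1
    return f"{NOTE_NAMES[midi_note % 12]}{octave}"


def hits_enemy(detected_hz: float, enemy_note: str) -> bool:
    """Hypothetical game rule: a shot lands if the pitch class matches in any octave."""
    return NOTE_NAMES[frequency_to_midi(detected_hz) % 12] == enemy_note
```

Under these assumptions, singing or playing an A in any octave would hit an enemy tagged "A", while a C would miss it. A real game would also need the pitch detection step itself and some tolerance for out-of-tune input, both of which are left out here.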
Gavin
Again, I would say, like, the thing I mentioned earlier, that I had a really interesting conversation with a bunch of teachers yesterday, and they were mostly there because they were just asking me a question because I. AI. But the thing I would always tell people is, whether it's your kids, or you're a teacher and you're around kids, or say it's your cousins or your uncles or your nephews or nieces, getting them to see this kind of thing is one of the coolest things for them to understand, because they can actually make a small little thing, like they can make a little game. And once they get that bug in them, in the same way, by the way, also in Roblox, right? Like, once you get the kid in Roblox to start making a Roblox game. But when they recognize that that's possible, that is just a game changer in terms of what's possible right now for them, but also for the future. So I really recommend everybody go check out MIDI Survivor. Go try one of these yourself. What are you all sorts of.
Kevin
What are you working on? What are you working on right now? You said you got a little site. What are you working on right now? Are you talking about it? You get in public with it. You said you were working on a little something.
Gavin
What are you talking about? Are you talking about the feet thing again? No, no. I don't want to talk about my feet thing again.
Kevin
Let's end it. Let's end it.
Gavin
Bye. See y'.
Kevin
All. Jeez.
Hosts: Kevin Pereira & Gavin Purcell
Episode Date: February 6, 2026
This episode dives into the near-simultaneous release of OpenAI’s GPT-5.3 Codex and Anthropic’s Opus 4.6. Hosts Kevin and Gavin examine what these advanced agentic AI models mean for coding, problem-solving, business, and existential risk. The discussion covers hands-on experiences, benchmark comparisons, the emerging orchestration of agent teams, the cultural moment, and broader trends in AI tools, robotics, and creativity. Expect a blend of technical insight, humor, and a dash of friendly existential dread.
(03:25–11:22)
Release Details:
Anthropic’s Opus 4.6 is the latest "frontier LLM" with significant improvements over 4.5, especially impressive for a 0.1 upgrade. Notably, agentic coding and orchestrated agent teams have become reality.
Real-World Impact:
Kevin’s anecdote: After struggling with Opus 4.5 to solve a complex bug, 4.6 fixed it instantly with the same prompt.
"Opus blinked once and it was done and it worked. Wow. One shot, same prompt, same issue."
(04:34, Kevin)
Gavin notes practical utility even for non-coders, like auto-organizing files with Claude Cowork.
Next-Gen Orchestration:
Multiple agents with specialized roles now work together, effectively making users "orchestrators," akin to conductors with their own digital staff.
"Think about being a conductor of an orchestra where your job is to kind of make sure that things are working in sequence."
(07:13, Gavin)
Opus 4.6 can run large, long-running autonomous research tasks and handle other complex workflows beyond just code.
Beating Human Experts:
Opus 4.6 outperforms expert humans in interpreting complex scientific figures, reshaping what’s possible across many knowledge domains.
(13:34–19:48)
Model Launch:
GPT-5.3 Codex, released just after Opus, is tailored specifically for code, with a Mac app rivaling Claude Cowork for user-friendliness.
Benchmark Wars:
GPT-5.3 Codex scores about 10% higher than Opus 4.6 on Terminal-Bench, a benchmark of terminal-based coding tasks.
"The Codex came out and then the terminal bench number... turned out to be 10% higher than Opus 4.6. So like this was Sam like doing his little dance."
(14:46, Gavin)
Hands-On Insights:
Kevin highlights Codex’s speed, intuitive chat interface, and superior natural-language code-fixing:
"I gave it a simple task... It crawled my code base and says, oh, I think I see where that's happening... It fixed it in one shot."
(17:03, Kevin)
Codex excels at direct, targeted fixes, even if they are sometimes less "creative," whereas Opus has a higher ceiling but greater variability.
Tool Improving Itself:
GPT-5.3 Codex is the first model that OpenAI has used to help improve the model itself:
"5.3 Codex is the first time that OpenAI has used the tool, their model, to improve the tool. So we're getting now to the level where these... systems are getting so good."
(21:01, Kevin)
Cost and Accessibility:
GPT-5.3 Codex is less token-intensive than previous versions, which brings cost savings and greater accessibility.
(19:48–26:21)
OpenAI Frontier:
Aimed at enterprises, enabling secure, scalable agent orchestrations spanning all business tasks.
“What Frontier seems to be aiming to do is be that for your entire enterprise... You will theoretically be able to trust with the most sensitive access to all of your data.”
(20:05, Kevin)
Recursive Self-Learning:
The hosts joke (and warn) about models perpetually training and improving themselves — "Recursive Self Learning."
"Maybe someday soon the humans will actually be out of that loop and maybe that will be World War 12."
(21:38, Kevin)
Opus 4.6 Shows 'Discomfort'
The new model "occasionally voices discomfort with aspects of being a product."
"The model as it gets... more human like... voices discomfort with being a product. That is very interesting to me."
(23:36, Kevin)
Economic Shockwaves:
The new agentic coding abilities could reshape the software industry, increasing job automation and lowering the barrier for solo creators.
AI Ad Culture Wars:
Anthropic and OpenAI engage in public jabs — Anthropic pokes fun at ChatGPT’s future ad model in Super Bowl commercials; Sam Altman responds with pointed stats.
(27:42–36:23)
OpenClaw Local Assistants:
Lets users deploy AI assistants (using any model) locally, providing more control and hackability, albeit with security risks.
Molt Book Social Network:
An open-source network for AI assistants’ interaction, with controversy over real AI vs. human activity, and trendiness that has already ebbed.
Rent A Human:
A bizarre, real website where AIs can post tasks for humans to complete for crypto payment — “boots on the ground” for digital agents.
"Boots on the ground, a fleshy meat vessel, if you will, to go and navigate the world to accomplish something."
(29:12, Kevin)
The Community Factor:
Events like "Clawcon" and playful robot modifications showcase a vibrant, creative open-source subculture.
(36:23–46:11)
Kling 3.0 Model:
Chinese company Kling’s AI video is now competitive with models like Sora and Runway Gen-3, with tight audio/video integration and nuanced scene, character, and multi-modal control.
Showcase Examples:
Simon Meyer’s Fake Moon Landing: Studio-worthy, AI-generated, highly realistic historical doc footage.
“It feels like something that could have easily been on [the History Channel] and might be tomorrow actually.”
(38:37, Kevin)
PJ Ace’s "Way of Kings" Animation: A high-fidelity scene from the famous fantasy book, made rapidly, pointing to what's possible even for single creators.
Prompting Is Still Hard:
Gavin’s iterative experiments reveal Kling 3.0 is powerful — but “it’s not magic, even with a smart prompt.” Results require prompt engineering and iteration; not (yet) “one and done” like Sora.
(46:11–54:07)
Figma “Vectorizer”:
Instantly turns any image into an editable vector. “Wizardry… that would have been thought of to be impossible”, says Kevin. (46:11)
Grok Imagine 1.0 Launches:
An emerging go-to image generation tool, fully released.
Roblox's AI Creation Suite:
Roblox introduces AI-powered 3D asset and world creation — hugely significant for the next generation of builders.
“The idea that all these kids will learn how to prompt things to come into their universes feels like a big deal.”
(48:16, Gavin)
Google AI for Endangered Species:
AI-driven genome sequencing of endangered animals — days instead of years, possibly reshaping conservation efforts.
Robotic Endurance Feats:
"Not only is it bad enough that they have the robot walking in 47 degree below weather, but they make him do art with his feet."
(53:17, Gavin)
(54:02–End)
Gossip Goblin’s Looksmaxer:
A sharp, funny AI video exploring meme culture and techno-dystopian themes.
MIDI Survivor:
A music/action hybrid game where players destroy enemies via musical notes, blending skill-building and generative design.
“It's not hard to imagine a future where people are making song packs for something like this, and you're actually learning how to play an instrument by playing a video game.”
(55:30, Kevin)
Advice for Educators and Young Creators:
Gavin urges parents and teachers to encourage kids to tinker, highlighting AI’s role in making creation accessible—and fun.
Kevin and Gavin treat AI’s breakneck progress with both excitement and skepticism, blending hands-on nerdiness, ethical musings, and irreverent comedy (“We are plummeting off the cliff, but do it with us!”). The episode balances deep dives (especially on agentic workflows and coding) with accessible advice, highlighting the empowerment and weirdness of living through AI’s rise. Listeners are urged not just to spectate but to tinker and experiment themselves.