
A
OpenAI canceled Sora and they just canceled SpicyChat.
B
They are fully focused on enterprise, reaching AGI, and a new frontier model called Spud.
A
Yes. Is this tater-based intelligence going to get their backs off the perceived wall? And why does Anthropic seem to outship them every single day?
B
Plus, new AI music and audio models
A
from Google. And Meta has a new AI that can supposedly read our minds.
B
They don't want what's in here, Kevin. They don't want what's up here.
A
I got a strong feeling many of us don't, Gavin.
B
Well, let me tell you something, because I'm thinking about this new ping pong robot. Have you seen this? It's pretty incredible. I'm going to take it and I'm going to remake Marty Supreme. I'm going to attach Timothée Chalamet.
A
Okay. Hey, good night, sweet prince. This is AI for Humans.
B
Welcome, everybody, to AI for Humans, your twice-a-week guide to the world of AI. And today, Kevin, we are talking about OpenAI with their backs up against the wall. I walked away from my microphone there. Their backs up against the wall. They are trying to kind of Ozempic their whole company down to approach a new frontier model and the race to AGI.
A
Let's get into it. TLP one? Is that what we're talking about?
B
Actually, that is a perfect idea. That should be. So there's been some big news over the last couple of days about OpenAI. The biggest news, really, and we covered this very briefly on our last show, is that they have canceled Sora, the AI video program. And this is, like, they didn't just say, we're going to cut the Sora app, which some people thought at first.
A
Did you see the videos of the never-AI people dancing in the streets, Gavin? Did you see them all celebrating, drinking Chianti and shooting off streamers?
B
I will say this. I put up a video pretty much right after it happened, and I got so many "yay" responses, it was crazy. But this is a big deal. Not only did they shut down the app, but they've taken out the entire AI video program. They are not going to layer AI video into ChatGPT anymore. They have canceled their deal with Disney.
A
They canceled the Disney deal, the $1 billion deal that much hullabaloo was made over, which I think is a Jungle Book character. They're also killing the API, which means if you trusted OpenAI and built a business, any sort of application or whatever, on the back of this video model: bye-bye.
B
Toodles.
A
Now, to those that celebrated in the streets, Gavin, they were saying, oh, this is it, generative AI is starting to die. I think that's missing the forest for the trees.
B
Yes.
A
Usually the same trees AI is using to power these machines. Because in its wake, there are a million different models from other providers. It's not like they're killing this due to lack of interest. I think they're killing this due to lack of focus within.
B
Yes, that's right. And I think that's an important thing. You know, CapCut just today opened a new app that they are going to plug ByteDance's AI video model, Seedance 2.0, into. There is probably a new Veo coming from Google, I would say closer to I/O, maybe in May. But another thing happened here, Kevin, which is just a good kind of overarching story, a thing we've been tracking for a while. This was known as Citrus Mode at one point: they are fully canceling SpicyChat. So there are a lot of people out there in the world, I'm not going to say I'm one of them, but I was interested in what this could do, who were interested in this idea that Sam Altman had mentioned: they were going to allow spicy chat in ChatGPT. You had to age-gate it. But this feels like another kind of slimming down of the pathway to doing this stuff. I don't think you or I were necessarily excited about this, but this is something they had promised, and now they're saying, maybe not, we're not going to do that right now. It feels to me like this is part of their move of saying: hey, business people. Hey, you person wearing a collar at your job and maybe a tie, we're here for you. We are not here for the guy in his bedroom at 2:00am trying to get... I don't know what's going on.
A
Do you have a stethoscope or beakers? We want you. You have a fedora and a fox mask? Hold on. No disrespect, but we need the money. Actually, I think it's a smart refocusing. Look, I don't understand. First of all, we as humans can change our fetishes. So if you want to get off to token talk, any LLM could be spicy if you treat JSON like ASMR. And if you get that joke, welcome, you're fine. Would it be hard for them to just relax the guardrails, Gav? To say, hey, are you over 18? All right, now we're going to let you have that chat. So that, to me, is less the issue. The CapCut thing is interesting. If people are listening to this or watching this, what does that mean for them? Do they have access to the latest Seedance? It's not out now.
B
It's still not out in certain territories, including the US. And actually, we did get a note from our last show that the VPN workaround we thought was going to work is not working yet, so they're doing some interesting stuff to gate it. I think Seedance is going to cut a deal. The biggest thing I keep thinking about AI video is that Veo and Seedance are the two models whose companies own the data. And when I say that, what I mean is that Google owns YouTube, and this video will be on YouTube. Every day there's some crazy number, like a million hours, of new video uploaded to YouTube. Seedance is owned by ByteDance, and ByteDance owns TikTok, which is also a platform with a lot of video getting uploaded. So at some point these companies were going to kind of run away with it. I think for OpenAI, the biggest thing that's going on here is that they, as we discussed last time, are starting to get lapped a little bit by Anthropic, especially when it comes to recurring revenue. Like, the numbers are pretty close right now. So, Kev, I think the other thing that's going on here is that this is a nice PR pivot internally to talk about their new model, which they're calling Spud. Now, there is not a lot of information about this. The biggest thing is it's supposedly going to be released in two weeks, which is a very fast turnaround, right? We talked about how Anthropic is shipping all these features. If OpenAI delivers a significantly improved new model on top of GPT 5.4, which, by the way, just launched what, like three weeks ago? That is a big salvo in this space. And ultimately, people have talked about coding and business as being the use cases that people are focused on right now. I know this because you know this too. Like, I spent a lot of time yesterday with Opus 4.6 and Codex, both trying to squash bugs in the small thing I was working on. And I spent four hours trying to find one problem, and eventually I did. It would be nice to have that be 20 minutes. So if we're getting to that point, that's fantastic.
A
Yeah, yeah. Look, anecdotally, and we'll talk about this maybe in a little bit when we talk about the game that I vibe-coded over the weekend, I think 4.5 is the most capable model that's out there right now for coding. There's also a 4.5 mini, which is very quick and also very capable for the price and the amount of usage that you get, especially when you turn on more thinking. They also had Codex Spark, which was an experimental model running on slightly different hardware that was lightning fast. So I think they have plenty of tricks up their sleeve. I think their models are fantastic. I think their software around it, their harnesses, needs to get a little bit better. And I think they know that as well, which is probably why they're refocusing. So I would certainly not count OpenAI out at all. I don't mind that they're refocusing. I wonder if they'll release Sora as open source, sort of a "here you go, community, sorry we couldn't do something with it."
B
I would be shocked. And here's why. Because I think the tricky thing, if they were really going to release it open source, is that all that stuff in there is, like, weapons free. And when I say that stuff in there, I mean all of the faces, all of the,
A
all of the model data, all the celebrities, all the everything.
B
So I don't know; that feels like a tricky thing to me. I do think this is a kind of speed bump on the way to what we think of as AGI. We've talked about AGI on the show before, this idea of artificial general intelligence. Sam did say, and this is in a memo that went out to his team, quote, things are moving faster than many of us expected. And this refers to the Spud model coming out in a few weeks. So I guess that means we're going to keep seeing this acceleration that we've noticed in the last little bit. And I guess that's not surprising. Like you mentioned in the last show, we are in this kind of churning mode right now where things are just going faster and faster. I mean, case in point, there are two new Google updates today that, a few years ago, would have been major hurrahs, but now they're just more news. There's a new audio model from Google that is specifically important for agentic AI audio interactions, which you and I both know really well. This is Gemini 3.1 Flash Live. And, Kev, let's play this tweet from Josh Woodward, because there's some audio in here, and we'll get a sense of how it's improved versus what we had for Gemini. "I'm at the gym. Give me a quick three-set finisher for triceps." "A great finisher is triceps pushdowns with a rope attachment." Okay, so, Kevin, is that a great finisher? I don't know this, but is that a great finisher?
A
Yeah, that's a totally fine finisher if you want to polish it off. I mean, so many people only work one section of the tricep. You really gotta get the... the bronchialis lateri. You have to get in there, and you want to just keep tweaking and twerking until the...
B
Oh, bleep that part out.
A
I was actually more stunned that they put in a little audio cue, so you knew that the thing was actually processing. And it seemed like that tiny little audio cue was barely finished before it started talking. And that's really impressive.
B
Yeah, I mean, the big thing here is they talk about it being really improved at agentic tasks. And this, again, is that same sort of thing. You and I have both been using coding agents. Well, this is the idea of, could you put a voice on it so that it could figure out what you want quickly and then give you the answer you're looking for? One of the problems with voice is that it doesn't have the time to go through the thinking process, because the thinking process is a delay, and if you're doing voice, what you really want is a direct answer. I think that speed to answer is going to be a big part of the next year of consumer AI, particularly because I'm okay with waiting a while for coding answers, but I'm not that okay with waiting for a tricep answer when I'm in the gym.
A
General knowledge answers. Yeah, exactly. Plus, you gotta do your set and clear the machine. No one likes a loiterer. All right: I get my ring light, I get my tripod, I set it down, I do my one rep, and, yeah, then I break it all down. I take my Rode mic off, I loosen it, take my GoPro vest off, and I move away.
B
I move away.
A
I'm a fitfluencer. I know how this works. Also, along those lines, did you see the Flashlight stuff?
B
No, I haven't seen this yet.
A
So they have Google Gemini 3.1 Flash-Lite, L-I-T-E, and it's Flash, not flesh. And they built the Flashlight browser to show it off. And they have a video, Gavin, we'll put on the screen. This isn't a pre-built website; they're just showing the speed with which this model can render images and text and format it into a website.
B
Crazy.
A
Someone's clicking around, and you would just think that they're on, like, American broadband. It loads, right? It's a little slow, but it gets there. And everything that's on the screen, if you're getting the video version, is being rendered in real time. And literally on Tuesday we were talking about a future where software is rendered almost like real-time video. This is kind of that; this is the beginning of that. Imagine an interface, Gavin, where you're cutting a video and you go, oh, this isn't what I want to see, give me a panel that shows me special effects options that will add sparklers to my blah, blah, blah. And it goes, okay, let me just render that for you. Here you go.
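If you want to poke at this idea yourself, here is a minimal sketch, assuming you have a Gemini API key and the google-genai Python SDK: ask a fast model to emit a self-contained HTML panel and stream it out as it generates, which is roughly the shape of what the Flashlight demo appears to be doing. The model name below is a placeholder, not the actual demo model.

```python
# A hedged sketch of "UI rendered on demand": stream HTML for a panel
# straight out of a fast model. Uses the google-genai Python SDK; the
# model name is a placeholder, not the model from the Flashlight demo.
from google import genai

client = genai.Client()  # reads your Gemini API key from the environment

prompt = (
    "Render a self-contained HTML panel listing three sparkler "
    "special-effect presets for a video editor. Inline CSS only, "
    "no external resources, no markdown fences."
)

# Print chunks as they arrive so the "page" paints progressively,
# the way the demo browser renders while someone clicks around.
for chunk in client.models.generate_content_stream(
    model="gemini-2.0-flash",  # swap in whatever fast model you have access to
    contents=prompt,
):
    print(chunk.text or "", end="", flush=True)
```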
B
And also, just as a side note, Kev, Google DeepMind just released this piece of research called TurboQuant, which actually speeds up access to memory in a much bigger way, which I think is going to make these AI tools way more useful.
A
Yeah, so the vector memories, they're stored in, like, 3D space, and usually to retrieve them... imagine someone giving you directions and saying, walk three blocks this way and then turn, you know, make a right at the Albertsons and walk five more blocks.
B
Right.
A
Then you're good; now you're at the dispensary. Why would those be the directions I need? Follow me. Instead of that, Gavin, they'll just tell you: turn 35 degrees, point yourself there, walk three blocks, and that's it. It's like polar coordinates. So it kind of compresses all of this data to get to the vector memory, puts it into these polar-like coordinates. But the point is, in fact, memory stocks took a slight dip on the heels of this, because people are like, oh, we're going to need less memory. Nope. We all know a very, very popular paradox which means that the demand for this stuff is just going to increase and increase, and the expectations are going to increase as well. But this is cool. We know that we need a few more technological breakthroughs to get to AGI and maybe beyond, and this might very well be one of them.
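For the curious, here is a toy Python sketch of the general idea Kevin is gesturing at: store each embedding as a magnitude plus a coarsely quantized direction, so the bulk of the data shrinks and retrieval works off approximate reconstructions. To be clear, this is generic vector quantization to illustrate the intuition, not the actual TurboQuant algorithm, whose details the hosts don't go into.

```python
import numpy as np

def quantize(v: np.ndarray, bits: int = 8):
    """Split a vector into (norm, int8 direction): the 'how far' and
    'which way' of the polar-coordinates intuition. Generic vector
    quantization for illustration, NOT Google's TurboQuant."""
    norm = float(np.linalg.norm(v))
    unit = v / (norm + 1e-12)           # direction on the unit sphere
    levels = 2 ** (bits - 1) - 1        # 127 for int8
    q = np.round(unit * levels).astype(np.int8)
    return norm, q

def dequantize(norm: float, q: np.ndarray, bits: int = 8) -> np.ndarray:
    levels = 2 ** (bits - 1) - 1
    return norm * (q.astype(np.float32) / levels)

rng = np.random.default_rng(0)
v = rng.standard_normal(768).astype(np.float32)  # a stand-in embedding
norm, q = quantize(v)        # int8 storage: ~4x smaller than float32
v_hat = dequantize(norm, q)  # approximate reconstruction for retrieval
print("relative error:", np.linalg.norm(v - v_hat) / np.linalg.norm(v))
```

The point of the split is that the cheap part (one float for the norm) preserves scale exactly, while the expensive part (the direction) degrades gracefully as you shrink the bit budget.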
B
Yeah. And by the way, it's not the only Google announcement this week. They also announced Lyra 3 Pro. So this is their AI music model. You can now generate full songs, which is kind of interesting.
A
Let me give you a soundtrack while you announce this, Gavin.
B
That's right. It's time for Lyra 3 Pro. You know, and I know, the way this company works is they will drop things that are interesting anyway. So one of the things people have been doing with this is exactly this: basically, they're using it to create lyrics, synthwave background music. Yes. And listen, we said last time we tried Lyra 3: it's fine. It's not the most exciting AI music model; we felt like Suno was more advanced. But we do appreciate the fact that Google continues to build across all of these platforms. I think one thing that's interesting about Google is they have all this other money, because they make money from ads on Google search, so they are able to dump a bunch of money into these side projects in a way that a company like OpenAI just can't right now.
A
Yeah, it's one of the first audio models where I don't notice that AI shimmer that's on everything. That was the first thing that grabbed me when I heard it. So there's clearly room to improve. And because I give them $20 a month for my Google Voice account, yes, they're making money hand over fist everywhere.
B
Yeah, exactly. And speaking of Google, I'm very excited to see what comes out of Google I/O, because that's happening in May. By the way, Kevin, we are invited, and I have signed us up. I know you may not be in town, but we can go to it this year. So we got invited.
A
We got invited.
B
We got invited. Yeah. So we are able to show up at it. And I will happen to be in the San Francisco Bay Area.
A
Tell them I got a thing I want to play, too. Just let them know that Kevin's got a thing.
B
One other cool audio note: Mistral. If you remember, Mistral is the French AI company that has been working on open source. This is an open source voice model, Kev, and it's very performant. You want to play just a little taste of the open source Mistral model to see what we're getting here?
A
The one the world knows me by. And now
B
it can travel further. Not away from me, but more fully as me. It sounds very French. This is existential French. Anyway, it is open source, you can go use it. It is called Voxtral, Voxtral TTS, and it's worth going to check out. Obviously, all open source models you can play with, download, and screw around with on your own. Anytime voice gets better, we love talking about it. I do think voice is kind of taking a backseat right now to coding, and even video is taking a backseat to coding. Still, many cool things are happening at the same time.
A
Is he digging on the couch? Who?
B
Ollie.
A
Yeah.
B
Ollie, what are you doing?
A
What is going on?
B
Yeah. Don't dig, Ollie. This is not our place.
A
Ollie, you dig. It's okay. Especially because it's not your place. You go for it, buddy. You get through those.
B
That's my dog. That's my dog. We should run out quickly to Multi Shot.
A
Our friends at Runway.
B
Yes. Our friends at Runway have released a new app called Multi Shot Video. I'm sure this is basically kind of a video harness, which is interesting. They're taking your screen grab or still or whatever, you can put a very simple prompt in for it, and you get a video out. So if you want to look at this: I took this week's thumbnail, with you and me and a lobster being boiled in a pot, and I just said, here, make something interesting where the lobster gets away. So play this. Pretty good.
A
You grab it. I just needed a, like, a mildly, mildly interested French narrator to really grab me. We should have put some Voxtral over it. It's funny when you say that, because
B
when you look, there's one shot where we're both turned, and we have much larger noses than we actually do. So maybe they did make us French already. But again, if you're just listening to this: that is Kevin and I from the thumbnail. We kind of turn, we see the lobster jumping out of the pot, the lobster jumps out, falls on the ground. It's all pretty well done. And I didn't prompt any of the scenes. So this goes back to that idea that everybody in the world of AI video is working toward: how can we simplify putting together shots? Because Seedance 2.0 is very good at this; you can add shots in. For the normal person who doesn't want to break it down by shots, this might be a really cool thing.
A
That was one of the big, out-of-the-box magical things about Sora when it first launched: you could give it the tiniest prompt, and you knew it was doing a script breakdown behind the scenes to do all the shots. Breaking news, Gavin.
B
Oh no, what do we have?
A
Real-time translations just became even easier in the Google Translate app, now with headphones on iOS. It was already out on Android; now it's on iOS: real-time language translation. So all the people that bought the new AirPods to try to get that real-time translation, apparently you didn't need them. Just wait on Google. Wow, cool. That's actually pretty interesting.
B
So that's them taking their audio model and expanding it out into consumer use cases. That is actually something I very much would use. And you will use it very soon, probably, because you're going to be in a different country. Yeah, that's amazing. There's another huge story this week that kind of didn't get enough coverage. It is a science story, and maybe it didn't get enough coverage because of what Meta is going through right now. If you didn't follow this, Meta and YouTube have both lost a pretty big lawsuit around social media. But Meta AI has created a new thing called Tribe V2. And what this is is basically a way to simulate what humans perceive in their brain. It can essentially not only read thoughts, but also predict thoughts about what you're seeing. This is an AI model that actually sees brain firings, meaning that if you look at something, it'll show you what part of the brain lights up when you look at that thing, and it's predicting this stuff. So this is a combination of what's known as fMRI technology, which is brain-scanning technology, and AI to be predictive. And Kevin, a lot of people are saying this is a step towards brain uploads, which, you know, we've talked about science fiction ideas before, but the idea that you could upload a brain and have it simulate a scenario is a pretty big deal. Now, this is just the beginning stages of this.
A
So kind of you to think they're trying to do anything other than predict what we'll click on or what will make us more excited to watch on Instagram Reels. We know that if we show him this cucumber-wearing-Oakleys graphic, a certain portion of his brain is going to light up and he's going to be more motivated to buy that.
B
That is the thing that worries me in some ways about Meta. It's one of the funny things about Meta at large. We've talked about how Meta is kind of in the background right now on AI. They spent all this money on all these AI researchers. They did bring in Moltbook to try to clearly own the AI agent space. But Meta has a way of making everything they do about ads, right, and about their business model. Which, again, I don't necessarily think is bad from a business perspective, but maybe from a humanity standpoint, not so good.
A
Well, in this paper, the interesting thing was this was done about five years ago, but they didn't release it because they thought the Tribe model was broken. They were showing all the brain scans, and there was nothing lighting up, and they realized they were just testing on users in Horizon Worlds.
B
Oh, damn. That was a long walk to get there. But we got there finally. RIP, Horizon Worlds. RIP Kevin's face up close there. And finally, Kevin, we have an awesome robot story here. This is called SMASH: the world's first high-dynamic humanoid robot for outdoor table tennis, fully autonomous with onboard perception. If you are just listening to the show, you should definitely come and watch this video, because it is fascinating. Basically, what you're seeing here is a robot ping pong player. And ping pong, as many people know from Marty Supreme, is a pretty difficult thing for humans to do, especially because you're tracking a small ball that's moving fast with weird physics. What are you laughing about?
A
I just like that you guys know ping pong from Marty Supreme. You've probably never seen or heard of
B
it in the real world. You understand what pro ping pong is? Yes, yes, of course. But anyway, this is a very cool project. It's another sign that robotics is moving along very fast, and I do think it's an important thing for people to realize. Speaking of which, there was also a video recently of Figure 03, I don't know if you saw this, where it was walking with Melania in the White House, which was just this goofy-looking video. When you compare that walk to the ping pong robot, you just get a sense of how vast the differences are in terms of what people are pulling off
A
right now. There was one a couple weeks ago, maybe we showed it, I don't know, we make so many of these now: the tennis-playing Unitree. It was fully stumbling around a tennis court and swinging the racket. Yeah, that's similarly impressive as hell.
B
So do you think there's going to be a Unitree tennis pro at the club that tries to creep up on the old country club ladies? He's like, hey ladies, look at this, I can do a backhand. I can...
A
Let me get behind you and show you the better grip. Please, please stop. The Continental grip. Tennis-playing robots: great. But the real humans, Gavin, the real humans out there, they're too busy. They're too busy clicking like and subscribe, smashing the bell, and leaving a comment to juice our algo down below. So please do that. If you listen on any podcast apps, give us a five-star review and leave a comment there. That helps us grow; it's the only way we get any attention. Share us on social, promote us on LinkedIn. Open to work. Also, Discord: we're gonna be beta testing the tile-matching battle royale game, which is almost, almost stable and, dare I say, almost fun. I don't think I'll get it there. I don't think I'll get it there, but maybe stable.
B
It is definitely fun. We are very excited to play more of Kevin's game. Yes, please come visit us in our Discord; the link is below, and we will see you all next week.
A
Bye Bye.
Episode: OpenAI's Path to AGI: Kill Sora, Launch a Potato
Hosts: Kevin Pereira & Gavin Purcell
Date: March 27, 2026
This episode delves into a seismic week for the AI ecosystem, focusing on OpenAI’s abrupt pivot away from consumer applications like Sora (AI video) and SpicyChat to double down on enterprise and the next leap towards AGI (Artificial General Intelligence) with its mysterious new "Spud" model. The hosts also break down new AI audio and music launches from Google (Gemini 3.1 Flash Live, Lyra 3 Pro), Meta’s foray into (literal) mind-reading AI, advances in video and voice models, and robotics feats like the outdoor ping-pong bot. The tone is lively, irreverent, and sharp, helping listeners parse the hype and substance behind each headline.
OpenAI Cancels Sora & SpicyChat:
Reason for Shutdowns:
Rise of Anthropic and Other Players:
OpenAI’s New Model “Spud”:
Coding Capabilities:
Open Sourcing Sora?
Speed and Responsiveness:
Flashlight Browser Demo:
TurboQuant Research:
Runway’s Multi Shot Video:
Magic of Script Breakdown:
SMASH Autonomous Robot:
Robotics Progress Comparisons:
For anyone watching AI’s “AGI or bust” era unfold, this episode offers a wry, detailed snapshot of what’s changing, why it matters, and what’s next—from potato-powered future models to ping-pong bots.