A (4:26)
Okay, so what's interesting to me with all of this? Oh, they have like a blooper final scene at the end driving a fancy car, and then it says Bill will return in Sora 3. So I'm assuming Bill is like an AI avatar person that they're going to use as their mascot for all of their video announcements. Okay, I think this is really interesting. In their post, they said the original Sora model from February 2024 was in many ways the GPT-1 moment for video. I would tend to agree with this. It was interesting, it could do a lot of stuff, but I didn't see a lot of people actually using it. They said it was the first time video generation started to seem like it was working, and behaviors like object permanence emerged from scaling up pre-training compute. Since then, the Sora team has been focusing on training models with more advanced world simulation capabilities. We believe such systems will be critical for training AI models that deeply understand the physical world, they said. A major milestone for this is mastering pre-training and post-training on large-scale video data, which are in their infancy compared to language. So I think what's interesting here is that the advancements are going to get really impressive and move forward a lot. We're at the very early stage of what we can do with video, but they have a really clear path forward to how they can do this better. Some people think we've hit a peak or a plateau with AI capabilities, especially when they're talking about text models and how smart they are. Whether or not that's true, I don't think anyone is saying that about video. We're just at the very tip of the iceberg with what video models are capable of, and we definitely haven't hit a plateau on making these models much, much better, which is really impressive.
They said prior video models are overoptimistic: they will morph objects and deform reality to successfully execute upon a text prompt. For example, if a basketball player misses a shot, the ball may spontaneously teleport to the hoop. In Sora 2, if a basketball player misses a shot, it will rebound off of the backboard. Interestingly, mistakes the model makes frequently appear to be mistakes of the internal agent that Sora 2 is implicitly modeling. So this is really interesting because it gets to this whole question of how these AI models obey the laws of physics, and apparently this is a lot better. They have a video of a dog jumping around a dock; it's bumping into objects, and it's doing a much, much better job than what you would have seen from other models. And so they say that this model is a big leap forward in controllability, which is basically the ability to follow intricate instructions. You can do it across multiple shots with an accurately persisting world state, meaning you can have kind of like the same environment and show a person from multiple angles and multiple places in that environment. It is super realistic and cinematic; they can do anime styles, they can do a lot of really impressive things. And the cool thing for me is, with the anime, they have a video of a dragon that looks kind of realistic but kind of cartoony. You could literally make a full-on animated movie. They have tons of these really cool animation-type videos that they've created, and you could create full-on movies with this, which is quite exciting, I feel like, finally, for the first time. Now of course this is like the ChatGPT 3.5 stage; it's still got to get to 4, and then we'll have to get to, you know, GPT-5 on the video side eventually. So it's definitely not perfect.
Even in some of their demo videos, there's one they showed with an Asian guy in a pool who's spinning a stick around, and at the very end of the video, when his stick is in resting position, his hand looks twisted in a weird way that's not natural. So even their demo videos aren't perfect, but you definitely feel this is coming leaps and bounds ahead. They have one where an ostrich stole a guy's hat and is running away with it; the guy's trying to chase it and get it back, and it's pretty funny. Over on their announcement page, they said, on the road to general-purpose simulation and AI systems that can function in the physical world, we think people can have a lot of fun with the models we're building along the way. We first started playing with this upload-yourself feature several months ago on the Sora team, and we all had a blast with it. So basically, in this new announcement, it's going to be on the app for iOS. Inside it you can create and remix other people's generations, you can discover new videos, they have a customizable Sora feed, and you can bring yourself or your friends into videos via cameos. They said with cameos you can drop yourself straight into any Sora scene with remarkable fidelity after a short one-time video and audio recording in the app to verify your identity. And we've seen this with other video apps like HeyGen, where basically they're going to make you prove that you're not just deepfaking somebody else. So you've got to do an audio recording and a video of yourself, and they have ways to verify that you're the actual person. After that, you can upload videos or pictures of yourself, and it's going to be able to animate them, which is really, really cool.
They said they launched the app internally last week and they've heard from a lot of people that are loving it. So yeah, it's very interesting. There are definitely concerns, because it's sort of a social media platform that they're rolling out with this video thing. They said doomscrolling, addiction, isolation, and RL-sloptimized feeds are top of mind, so here's what we're doing about it. I think it's hilarious that they're calling these sloptimized feeds, basically AI slop; they're worried that everything's going to turn into AI slop. I think as the models get better, this is less of a problem. But they said, we're giving users the tools and optionality to be in control of what they see in their feed. By default, we show you content heavily biased towards people you follow and interact with, and prioritize videos the model thinks you're most likely to use as inspiration for your own creations. So it's interesting that they're explaining their algorithm here, and it's not just what's going to get the most engagement or be the most sensational, but what's going to make you want to create more. Of course that feeds the creation loop, so it's in their best interest, but I find it interesting that that was one of their big philosophies, and they have a whole bunch more details on what they call their feed philosophy that you can go into. Overall, this is super phenomenal. There's a ton of great responses. Of course, it's all on an app. They said Android users will be able to get it once they have an invite code from someone who already has access; otherwise it's available. And they said they also plan to release Sora 2 in the API. Sora 1 Turbo is going to remain available, and everything people have created with it will continue to live in the sora.com library. But now they've created this new one.
People on X are saying they can't believe video models are not getting as hyped as LLMs; this is just insane. Anyways, someone said, what problem does it solve? You are selling digital cigarettes at this point. Which is funny, because, you know, maybe people are going to get addicted to AI videos, but at the end of the day this is obviously a very useful tool. If this can solve the kind of AI video creation problem that people have been looking for, this could be amazing for creators making all sorts of content. So I'm really excited for the creativity that comes out of this one. Let me know what you think. Before I go, if you want to try all of the top AI models in one place, I would love for you to try out my platform, AI Box AI. You can go over to our website and basically get the top 40 different AI models all in one place. You can get access to everything from OpenAI, from ElevenLabs, from Claude and everyone else, all there, and you can chat with them all in the same chat thread, which is super interesting and useful. We have history, and we also have a no-code app builder, so you can describe the tool you're trying to create and have our app builder build it for you: chain together a bunch of different prompts and models and create a tool for you. So if you want to check it out, it is over at AI Box AI. Thank you so much for tuning into the podcast today. I will catch you in the next episode.