A
What if I could show you the exact workflow for coming up with AI video ads that get hundreds of millions of views? Well, I can, because today I brought on the viral AI ad madman himself, PJ Ace. This guy is the number one guy when it comes to creating the most viral AI videos. And today he shows you his entire workflow, all his prompts, all the tools he uses: Veo 3, Sora 2, this app I'd never heard of called Reve, ChatGPT, his entire workflow, how he uses Figma. We just go through it, and there was nothing that he held back. There are going to be a bunch of people who make it to the end of this episode and learn how to create viral AI videos. So I can't wait for you to enjoy this episode. People charge thousands of dollars for this sort of sauce, but on the Startup Ideas podcast, it's free for just a like and a comment. Enjoy. PJ, by the end of this episode, what are people going to learn?
B
They are going to learn how to make a video that gets 230 million views, if you're lucky. But also our entire end-to-end process that we've used to scale to a six-figure, seven-figure agency in like a few months. It's been the most insane liftoff, building an AI-native agency. And, you know, I'm excited to teach everyone exactly how we do it, step by step.
A
Okay. And, you know, keyword "exactly," because we have a word on this podcast called sauce, and we don't like to gatekeep the sauce. So PJ, can you commit to spewing as much sauce, sharing as many prompts, sharing your screen as much as possible, so that at the end of this it increases the probability of success in actually creating one of these AI videos?
B
That full open kimono, don't point the cam down. Not wearing any pants right now. You guys are going to get full exposure with this.
A
Okay? All right, let's rock.
B
Sweet. So Greg, you had me on the pod a month or two ago, and I shared how I made that viral video for Kalshi that got 50 million views. And today I want to dive deeper, because we have better tools, obviously. You know, the exponential growth of AI tools is like every other day we get a 2x improvement in quality. So it's pretty nuts. So I've got a lot more tools in the tool belt that I'm excited to roll through. We recently had an ad for David Beckham that did like 230 million views, but the process for that was probably a little too complicated. So what we're going to do is a little bit simpler one, for a company called Origin Financial. This got 2 million views. So I'm really excited to dive into this ad. First we're going to watch the ad, and then I will take you guys step by step through it.
A
Beautiful. Let's do it.
B
All right. In the past, it was easy to get bad financial advice. It's a timeshare in beautiful Pompeii. Get your own cake. They said there was gold here yesterday. 20% off on the first unsinkable ship. This is your college fund. Of course you're diversified. You have Enron and Blockbuster. I made all my money in meme stocks.
A
Hey, get off my yacht.
B
You've had money questions for years. Origin finally has answers. Two boat rides for the price of one. Score!
A
So good. So my favorite part about it is just how extra it is.
B
Yep, yep. And that's what we tell our clients from the start. At Genre, we typically work with larger companies. They want to use AI video, but they're not sure how to do it and not get the pitchforks drawn out, because people are like, "Oh, you're cutting jobs," or anything. So we always try and lead with: this is going to feel like a Super Bowl commercial, in that it's going to be ridiculous. It's going to be comedy first, and there's going to be mostly jokes and a little bit of brand at the end that connects it to you guys. This is the only way we've found to mitigate a lot of the "AI is disrupting and changing everything" customer pushback. And so, like you said, if people are laughing the whole time or they're entertained, one, they're more likely to watch the full ad, which is what brands want. And two, they're less likely to bust out pitchforks, because clearly this is stupid. It's ridiculous, but it's kind of funny. And you're like, "Oh, okay, I get what they're trying to do."
A
So I think a lot of people have played with AI Video, but they haven't been able. They want to do what you do, but they don't know how. Right. Like, it feels like they've. They've scratched the surface. They've tried it. Maybe it didn't work how they. How they thought it would work. Or. So, you know, what is the Strategy for actually, like, I've seen the result. How do I get to that result?
B
Yeah, yeah. So we break it down into steps. Step one is scripting. Step two... well, let's just pretend you want to be a filmmaker and you want to work with clients. Everyone always asks first, "How much should I charge?" et cetera. And the rates vary. Starting out, you maybe should charge nothing. Build up a spec portfolio, essentially targeting a niche: "I love drink companies, beer companies, perfume companies, clothing companies," et cetera. I always say, follow your interests to whatever videos you like to make. You're naturally going to want to play with it and stick with it. You'll outlast your competition because it doesn't feel like work; you're following your curiosity. So start in that domain, and then reach out to brands that are lookalikes, competitors, et cetera. Always serve a niche initially, and then you can expand out from that. The first video I made was a pharmaceutical company video, and I had a ton of pharmaceutical companies reach out. But then I was able to expand my portfolio with the Kalshi ad and then some other stuff, to where we can now do a full suite of services. But anyway, you start off on the scripting phase with a given client, and usually we try and pitch three concepts. Now we work with pro writers. I'm a good writer, but I'm not a great writer. I'm a good director, and I'm a great director. So as you expand over time, you do want to work with experts so that everyone's focusing on their genius. And then eventually you play the role of executive creative director or producer, orchestrating all the talent together.
But I do think when you're starting out, you've got to learn to wear all the hats and be kind of an army of one, which is where most filmmakers or creators have that background of, "I know how to dabble in enough of this." So anyway, step one is the script. We went through the script last time, so let me pull up the script for this one. To give a brief overview of the entire process, very fast: it starts with the script. It goes into ChatGPT to take the script and turn it into a shot list. It then goes into creating a Figma board for laying out all your images, and then generating all the images. Then you move on to an animator like Veo 3 to actually animate all the clips, and then you put it into an editor. That is what we're going to be covering over the next 30 to 45 minutes today. So let's jump into the script first. You have to start with the big idea for the spot. We like recognizable IP to start, so that people have the familiar with the foreign. For this, the opening shot we wanted was this guy in Pompeii, which is iconic, and the mountain's about to explode in the background. So that's what we have in this opening shot. Actually, later on we had the pyramid scheme, which is kind of a funny one. He's like, "Eh, you like?" And then the opening shot here is an exploding volcano. So the idea is that he's trying to sell a timeshare. And the conceit of this whole thing is that in the past it was easy to get bad financial advice. So our writing team sat around and were like, "All right, what's the worst financial advice given out throughout history? Let's try and put it in a chronological order that leads up to today." So initially we thought, okay, Pompeii would be funny if you're trying to sell a timeshare at the foot of the dormant volcano.
Maybe something with Marie Antoinette, where maybe she wants to buy more cake and there are rioting soldiers outside. It'd be really funny if we could have a Titanic moment. It'd be funny if we had Beanie Babies as the investment thesis for your kid's college fund. Of course you're diversified: you've got Enron and Blockbuster. And then of course we wanted to end on some sort of meme coin, a stab at all the NFT bros. So the writing always just starts off as loose concepts with ChatGPT. It's like, "Hey, ChatGPT, give me ideas for iconic moments throughout history that incorporate bad advice, and then give me some potential lines for each of these." Then it'll spit out a bunch of different suggestions for bad advice. And ChatGPT is not funny. Like, 99 times out of 100 it's not funny, but one time it's good, or it'll spark an idea that actually gets one of these great lines. And a lot of times it's not even that the line is funny. It's just the comedic contrast, like the Beanie Babies as a college fund. Or the final bit, "two boat rides for the price of one," as the guys are on the Titanic life raft. It comes out as you iterate. And a lot of times you'll do half the commercial, and then more ideas spring to mind for how you can dial it up and elevate it. So does that kind of make sense?
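[Sketch, not from the episode: the script-to-shot-list handoff described above could look roughly like this in Python. The `Scene` class, the `build_shot_list_request` helper, and the wording of the request are all hypothetical; the scene lines come from the ad as played earlier. It only assembles the text you would paste into ChatGPT and does not call any API.]

```python
from dataclasses import dataclass

# The five stages listed in the episode, in order.
STAGES = ["script", "shot list", "image board", "animation", "edit"]


@dataclass
class Scene:
    setting: str  # recognizable historical moment
    line: str     # the bad-advice line spoken in that scene


# Three scenes from the Origin ad, in chronological order.
SCRIPT = [
    Scene("Pompeii, at the foot of the volcano", "It's a timeshare in beautiful Pompeii."),
    Scene("Marie Antoinette, rioters outside", "Get your own cake."),
    Scene("Beanie Babies reveal", "This is your college fund."),
]


def build_shot_list_request(scenes: list[Scene]) -> str:
    """Assemble the text to paste into ChatGPT (hypothetical wording)."""
    header = (
        "Here is my script. Turn it into a shot list, "
        "and give me one image prompt per shot."
    )
    body = [
        f'Scene {i}: {s.setting}. Line: "{s.line}"'
        for i, s in enumerate(scenes, start=1)
    ]
    return "\n".join([header, *body])


request = build_shot_list_request(SCRIPT)
print(request)
```

[Keeping the script as a list of scenes mirrors the "scene one, scene two, scene three" layout PJ describes for the board, so each scene can later collect its own image variations.]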
A
Yeah. So what I'm learning here is there are three ways to get people to continue watching. One is using existing IP that is relevant, that people understand. Right? People know Marie Antoinette, they know Pompeii. And these are public domain IP, right? So you want to use public domain IP. That's one. Two is juxtapositions. How do you incorporate juxtapositions in a video? Because that's going to get people to share it and do something. You don't want them to lean back. You want them to lean forward, and that helps. And third is leaning into what's Internet native, what's trending on the Internet. Obviously Beanie Babies aren't trending, but the meme coins are trending. Right? So the fact that you ended with that was a very smart strategic move, because you shared it on X, and a lot of people trade meme coins and talk about it on X.
B
Yep, yep, exactly. And this feels a little timeless. We wanted it to feel funny. I'll pull up real quick one of the other videos that was similar in nature, but different. So this was the Kalshi video. We don't have to watch it, but basically the conceit for this was also historical moments. So we had, "Will Jesus rise again, Peter?" You know, "The British are coming. Are they coming?" It was basically odds throughout history on different underdog moments: will the Trojans accept their gift? Will David defeat Goliath? The Wright brothers here. And this one was not really funny as much as it was inspiring. It was big dramatic music and, boom, "It's your turn to defy the odds." So it's a similar structure and framework: tie something to a brand that feels historic and universally relevant, if you will.
A
Sounds good.
B
Yeah. Yeah. So that's a similar kind of script and framework. So let's just say you've got your script, and the client is like, "Hey, I like this big concept." Then you move on to the next phase, which is the exact scripting phase. We don't work with scripting software. We just do everything in Google Docs, because clients can make notes on everything. Once you've locked down your script, you move into the next phase, which is taking it all into ChatGPT. You can upload the script and say, "Hey, give me a prompt for each of these as images." We like to work with images because it's a lot easier, cheaper, and faster to generate images for the commercial than it is to do everything text to video, where you're basically generating the entire commercial blind and you're not sure if the client's going to like it. So it's a huge benefit to be able to do everything shot by shot. That's why we take the script and do scene one, scene two, scene three, scene four, scene five, all the way to the end. And then we start to fill it in with drafts of each shot. Typically what you're going to see here is all of the shots that led to this, which is what I'm going to show you in a second. So Marie Antoinette has different poses and stuff, and you would typically see that reflected in here, along with the lesser versions. But anyway, we're going to use a platform called Reve. It's just app.rev.com, and it's pretty awesome because it gives you three different versions of whatever you're prompting. So we go into ChatGPT, and I'll basically say: hey, ChatGPT, here is my script. I need you to turn this into scenes and a shot list for each scene.
And I need you to give me shots, and I want the prompts to be structured. Let's just say I have a master prompt that I use for all my things these days. It's not that sacred anymore. The image models are great and they'll get you good stuff regardless. But it's like, "A nerdy and nervous Roman real estate agent strides, arms out, conversational gesture," blah, blah, blah. So it's going to describe a lot of motion, but we're only doing images to start with. So we've got something like this. I lost the real photos, but as you can see, a lot of these images are quite similar to what we had in the final ad. For all intents and purposes, these are the same prompts. So we'll go through it. Let's say we paste in a Pompeii thing here, or let's go Titanic. We'll paste it into Reve, and what we find is Reve actually gives you three variations of the image. This is thanks to Nano Banana-like technology. This also works in Nano Banana; I just find Reve's interface to be a bit better, and we also like the photorealism we get here. You can even run it through an enhancer AI to make the skin more detailed and all that kind of stuff. So anyway, it's going to give us frames here. Then it's going to suggest things: can we move the passenger closer to the camera? Can we add more dock workers? Can we show the passenger? Or let's click this. We click this photo and type, "Give me this as a close-up." Then you just hit Enter, and it'll give you this image as a close-up, because it's referencing the image, and again it's going to do three variations of it. So the editor, as I'm sure you remember, is light years ahead of where Veo 3 text to video was back in the day.
You had to burn $4 a shot, and you'd just pull the slot machine blind. But now you're really able to go shot by shot and start to build out each shot sequence: okay, I like this for the wide shot, I like this for the close-up, I like this for the peasants looking in through the door, then I'm going to cut back to her. Are you tracking with me so far?
A
Yeah. I mean, it makes so much sense, by the way, to go script to images, then to video. In retrospect, we were crazy for going just, like...
B
Straight, straight into it, just raw-dogging text. Raw dog.
A
That was crazy. The other thing I was thinking about is this seems better than Nano Banana, this Reve app.
B
It's pretty good. There's a lot of realism, and the structure feels right with this chat interface. Sometimes you'll get, like, this guy looks like a fucking psychopath, so I would reroll this character. But a lot of this looks super realistic. This doesn't look AI generated at all. And again, to do it, it's like, "Let's make this character more of a wide shot and include the captain yelling at him." So it's just very iterative, and for non-filmmakers, you really don't have to know a ton about angles and lighting and camera movements and all those things. It just builds it out with you conversationally, which is kind of the future of this. Unless you go into image model world models; I think we're going to see that start to infiltrate this stuff soon. But anyway, that's basically what we do: copy code, paste code, copy code, paste code, back and forth until we have all the shots here. You can regenerate each of the images, and you can see how you can build out multiple angles, coverage of a scene, consistent characters, et cetera. You've got the guys smoking in the thing. And that's essentially the core of the image generations here.
A
Cool. Okay, so we've got our images.
B
Yep. So we've got all our images. We're going to save them; I think we just hit the download button here. Then we start to put all the downloads into our master boards here. Like I said, typically I'll have a bunch of alt versions here, because of how we work: we work with a dedicated writer, a dedicated director, and then AI cinematographers. These AI cinematographers are the ones who take the script and the director's treatment and generate a lot of these images and fill out these boards with multiple options. You can see it here. This is for the David Beckham project we did with IM8, where the director will say, "Okay, I need this eyeball." They'll just upload a reference shot of an eyeball, but we want it to be unique to the project. And boom, the AI editors churn out a bunch of examples. Or for this opening shot, we need a cool sci-fi setting with multiple angles of coverage, and our AI cinematographers just do a bunch of shots: Serengeti, et cetera. So this is for that IM8 ad. As you can see, the more complicated the ad, the more you generate: you can have hundreds of images for just one shot on the shot list. And I do find that when you're doing more stylized stuff, like this IM8 David Beckham ad that got like 230 million views, you want to give a lot of reference images for the tone and the style. So if what we've been doing is the basic version, this is expert mode. You'll have the line of dialogue here: "Red or green? It's a simple choice." That's the first 10 seconds. And then the director comes in and says, "Wide shot of an abandoned facility." In this case, we had the tennis star Aryna, and we were able to deepfake shots of her.
So you just have a lot of shots in this one setting. Our artists are doing these, and then the director comes in and makes selects from all these shots. That's what it looks like on a more complicated shot. So that brings us to here: we generated most of the shots on the shot list, we laid them out on the board, and now we go on to Veo 3 animation as our next phase. I do want to note that there are a lot of programs right now, and I'll list them off briefly. Veo 3 is probably the best model at the moment for making characters talk. It's just really realistic. The motion's great. Like, he's walking up here, and the camera pans to that. It's even worth noting how we do this. We have an original prompt for the Titanic, which is here... I don't know how to find that prompt. It's like, "This bustling dock, blah, blah, blah, snagged tickets." So even though it's a text-to-image model, I'm still including the dialogue, and then I'm just copy-pasting that same prompt into Veo 3. Instead of text to video, we're going to do frames to video, and we upload the shot we just downloaded from Reve, which acts as our first frame. The prompt is going to be similar: "Just snagged first class tickets." And ChatGPT can also help you. It's like, "Hey, ChatGPT, here's my image prompt. Now I need it to be an animation prompt, so tell me what the camera movement is here. Are we starting with him? Are we going to pan to the ship next?" That'll help you make these shots really dynamic, like you see here.
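[Sketch, not from the episode: one way to picture the frames-to-video step is as a small record per shot (the downloaded still, the original image prompt, the dialogue, and a plain-language camera move) that gets flattened into a single animation prompt. Everything here is hypothetical, including the `AnimationShot` fields, the file name, and the prompt wording; it does not reflect any particular tool's required format.]

```python
from dataclasses import dataclass


@dataclass
class AnimationShot:
    first_frame: str   # path to the still downloaded from the image tool
    image_prompt: str  # the prompt that produced that still
    dialogue: str      # the line the character speaks, if any
    camera_move: str   # plain-language camera direction


def to_animation_prompt(shot: AnimationShot) -> str:
    """Reuse the image prompt, then append dialogue and camera movement."""
    parts = [shot.image_prompt.strip().rstrip(".") + "."]
    if shot.dialogue:
        parts.append(f'The character says: "{shot.dialogue}"')
    if shot.camera_move:
        parts.append(f"Camera: {shot.camera_move.strip().rstrip('.')}.")
    return " ".join(parts)


# Example values are illustrative; the dialogue line is the one quoted above.
shot = AnimationShot(
    first_frame="titanic_wide.png",
    image_prompt="A passenger on a bustling dock beside the ship",
    dialogue="Just snagged first class tickets",
    camera_move="start on the passenger, then pan to the ship",
)
prompt = to_animation_prompt(shot)
print(prompt)
```

[The still is uploaded separately as the starting frame; only the text from `to_animation_prompt` goes in the prompt box.]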
A
Yeah. And by the way, to you, pj, because you're literally an expert, you probably understand camera movements, right? To the average person listening to this, including myself, like, I don't even, I barely know what a camera movement means. You know what I mean?
B
Dolly, jib, do a three-quarter turn. Yeah, honestly, ChatGPT gives you pretty great suggestions, and you can also be stupid with it. Like, "Move camera left to end on the ship." It doesn't matter. It'll do the same as some complicated director term. And so, yeah, I've found that Veo 3 is pretty good. Its character performances are the best. However, if you're not having characters talk, that's when we'll move on to some of the other animators. To highlight a few: there's a platform called Kling, with a K. Great 1080p; it actually might be 4K now. Another one called Luma Labs. Another called Seedream or Seedance, which is from ByteDance, which is TikTok. Great models. Another one called MiniMax. Honestly, they're all kind of similar these days. That's why most people just default to Veo 3, because it does the best talking performances at the moment. And it's Google, with their terms of service and their indemnification for working with clients. A lot of times we'll just tell clients, if you want us to use Google from start to finish, you're still going to have a great experience, because Nano Banana's got you covered on image generation, and Veo 3 has got you covered on speaking and any animation. And Google's broader indemnification policy covers any commercial production. They're solid: great training data, ethically trained, et cetera.
A
And even Gemini on scripts and stuff like that, right?
B
Yeah, exactly. You can do this, and we're doing this for an upcoming project for Google. It's pretty seamless to go all Gemini. Technically, I think you can generate text in Gemini, images in Gemini, and even do video generations in Gemini. But if you're going to use the Google suite, it's typically best to use the Gemini app for text and then go to labs.google for images and videos and stuff. They also have another thing called AI Studio. It's the old story with Google: they always have great foundational technology, but the real struggle for them, I think, is always the application layer that sits on top. How do we make it seamless and cohesive? So they're still working on making this all one big filmmaker suite. In the meantime, use the Gemini app or ChatGPT for, I'm sorry, the scripting phase. For images, you can use Nano Banana in Google's AI Studio, or you can use it on a platform called Freepik that has all of the image models. So let's actually just look through here. In Freepik, if you go to generate images, you have all the models from Google. You can use Nano Banana, you can use Flux, which is Black Forest Labs, you can use Seedream, you can use Ideogram. Basically they have all the models in here. Now, the downside of using an all-in-one platform for image and video generation is that they're using API pricing, so it's going to be a lot more expensive than if you were to go direct. But image generation these days is practically free. You get unlimited image generation on Freepik for, I don't know, 20 bucks a month. It's nominal.
And then I would just buy a subscription to Google Flow, because for, I think, like $120 a month you get, I think, unlimited generations on fast mode. It's important to note that you shouldn't do quality mode; you should do fast mode, because it's like 80% of the quality and I think it's free. You can do portrait, you can do landscape, and for the number of outputs, you could do four outputs if you want. And like I said before, if you want to do frames to video, you just add a starting frame here, and then you do your prompt, which makes the characters talk, et cetera.
A
Beautiful.
B
Yeah, yeah. So that's the process in a nutshell. And then obviously you take it into your editor. As you saw, essentially we're just putting generated clip after generated clip into the timeline sequentially. This ad was very simple to edit. The video models like Veo 3 will add sound. You don't want them to add music, so you just put in a basic music track. The clips already have dialogue and sound on them. So the edits are extremely simple. You really don't need a ton of editing experience now, with everything so laid out for you.
A
So you didn't... well, I guess, yeah, Veo 3 doesn't put on a music track, right? So you have to go and find one. I guess Storyblocks is the popular one.
B
Yeah. Most people use Epidemic Sound as well, because for like $9 a month you can get unlimited songs, and they've got a pretty big selection. The reason you wouldn't want Veo 3 or even Sora to generate a music track is that it's going to generate a new music track for each clip, because it's only giving you a piece of the pie. You want the whole thing to be congruent as far as music goes. But sound effects are great.
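[Sketch, not from the episode: the editing logic described here (clips laid end to end, each keeping its own dialogue and sound effects, with one music track spanning the whole spot) can be modeled in a few lines of Python. The clip names, durations, and file name are made up for illustration, and real editors such as CapCut, Resolve, Premiere, or Final Cut handle this on their own timelines.]

```python
from dataclasses import dataclass


@dataclass
class Clip:
    name: str
    seconds: float  # clip length; values below are placeholders


def build_timeline(clips: list[Clip], music_track: str) -> dict:
    """Place clips back to back and stretch one music bed across all of them."""
    cursor = 0.0
    video = []
    for clip in clips:
        # Each entry is (clip name, start time, end time) on the video track.
        video.append((clip.name, cursor, cursor + clip.seconds))
        cursor += clip.seconds
    # A single music track from 0 to the end avoids a new song at every cut.
    music = (music_track, 0.0, cursor)
    return {"video": video, "music": music}


clips = [Clip("pompeii", 8.0), Clip("titanic", 8.0), Clip("beanie_babies", 6.0)]
timeline = build_timeline(clips, "music_bed.mp3")
print(timeline)
```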
A
And then how do you. This is probably a dumb question, but, like, what software are you using to actually put it all together?
B
If you want something free, you could just use CapCut, which is what a lot of people use who just want something simple. It's like an online editor made by TikTok. For people in the mid stage, I think DaVinci Resolve is actually free as well. And then most of the industry runs on Premiere, which is like 19 bucks a month, I want to say. I like Final Cut, just because it's probably the easiest platform. It's like iMovie on steroids. But they all work. I mean, they're all great.
A
And anything else we didn't cover on this whole process that you want to make sure we cover?
B
There are some interesting tricks. Like, you can do these cool things where, in Veo 3, instead of adding this as the starting image and then describing what Beanie Babies look like, you can actually include in the uploaded image a picture of the Beanie Babies that you want to uncover. You can even add a note. So when I uploaded this as the starting image and said, "He uncovers this," Google actually outpaints it, and he rolls it up to reveal the Beanie Babies that were in the reference. If you look here, this is kind of a complicated thing, but it's one of those details. Oh, we start with him midway here, but the clip actually started with it fully closed. We had to give it a reference image for what Beanie Babies looked like, and they had the big TY tags and so on. So there are weird hacks like that, where you can put a picture in picture to describe what you want the camera to pan over to. Or if the model didn't know what the Titanic looked like, you could put a picture of the Titanic here and say, "Remove this image and then pan over to this ship." Then it would pan over to the ship, and it would look like the ship you put in the starting frame.
A
Dude, this is crazy. Like, even the Beanie Baby shot, like, if you, if you. I mean, I just, you know, recorded this docu series and I saw how expensive it is and how time consuming it is to shoot things like that shot would. You can. Well, you couldn't even do that shot, right?
B
Oh yeah, I think you couldn't.
A
You know, the, the TYs aren't that big and it's so much more impactful with the TYs being big.
B
Yep, yep. Yeah, this ended up actually being a different image, but it's similar here, and we had 100 variations of this. The issue is, you want to generate this as your end frame, but you can't have it go in reverse to pile in. That's why the picture in picture helped, so that the AI understood what we were opening it to. The other thing you could do here, actually: this is a good case for just doing straight text to video, where you could just prompt the Beanie Babies. But anyway, those are just the details. So that's it in a nutshell. And we've used this process, like I said, for a lot of our main viral videos. Like, if we go to my page here. All right, so to show the same process in another video, here's a video we made for a company called Ramp, a financial services credit card company that makes it really easy to expense things. Same process. I'm going to walk you through it right after we watch this ad. Your mileage log. W2. It's audit season. Wait, guys. We're good. With intelligent receipt capture, Ramp has your back. So I can just take a photo? Ramp. An audit doesn't need to be a horror movie. Next audit fix. That was fun.
A
That was fun because it's just like, things blowing up, you know, dark like. I love. That's a. That's a. That's like a horror movie in an ad.
B
Yeah, that's exactly it. So, as you can see here, initially we actually did it all in text to video, and we didn't like the look. This is when image to video first came out. We did the ad in text to video, and it just looked too AI. The faces were morphing and it looked like shit. So we made the whole ad and then we threw it all away. This is like two months ago that we made this. That's when we first realized the power of image to video, and how much better the base images could look if you did everything with text to image and then image to video. So, similar process. We had writers write a great script. From that, we had a director create a shot list of how it all flowed. And then from that, we had our AI directors of photography come in and give us variations for each shot on the shot list. Like, we need a zombie coming up on the pane of glass. We need people reacting to it in horror. And then we need big zombies breaking in. And, you know, look at this. She's awesome. So it really upped the fidelity of a lot of this stuff.
A
This is.
B
This is great. And it really just made everything kind of come to life. And our directors were able to basically come in and just select the shots that they loved. So, yeah, this made it so much faster. And obviously, visually, it looks so much more cinematic. The prompts that we use, we keep in this, and we just bring them over here, and I can send them to you, Greg. And again, this was all done in Reeve or Rev as well. And the prompt is something like this. And then our team can come in and, if they want to do a slight tweak, they can just take that same prompt and do further variations here. Once they like a given character, they can kind of prompt her in different angles and poses.
A
Yeah, the characters don't look that AI generated. Right. Which is so cool.
B
Yeah, yeah. And it just gets better. Like, you can even run a second pass through a program called Enhancer AI, where you can even add, like, acne and details to her face. Right. So it's like, rough her up a little. Exactly.
A
Make her more normal, you know, because some of these people look too perfect. Right?
B
Yeah.
A
They've got this AI glow, and, you know, that's not how we look in real life.
B
Yeah. Most of us at least. No. Yes, yes. So, yeah, man, that's the spot we did for Ramp. And that's kind of the process we rinse and repeat for different clients. So I think this is a major opportunity right now. And the real question is, how much is Sora going to disrupt this existing workflow? Because as you saw, I mean, I don't even have to show it because everyone's been watching it for the last week. But basically, Sora automatically does the script, it automatically does the image and video generation, it does sound effects, it does music, all in like a 10-second bite. Now, the problem is it's limited to 10 seconds. But I was watching behind-the-scenes interviews and they're like, 30 seconds is coming, 60 seconds is coming. Character consistency is kind of coming. The ability, probably, to tweak individual clips so you can edit it is coming. So that's the question. Like, we're disruptive to the big agencies. This is very disruptive to us, because it essentially makes our six to eight week timeline down to what, like a week or less, once you're able to kind of tweak and the quality gets better, which it will over the next three to six months. So, I mean, what it's going to mean is we have to do a lot more volume. We're probably going to have to lower our prices, do higher volume. But it's a good thing for brands, because basically brands will be able to release a new ad each week at a lower price point. And for us it's good because we'll just do retainers with all brands. And it's like, hey, we're going to optimize for comedy writers and supervising directors. We're probably going to minimize our dependency on AI cinematographers or some of these other animator roles that are automated by Sora. But it's going to be a wild next six months. Like, I thought we'd be here a year or two from now. We're here next month.
A
It's clear that Sora is going in the direction of productizing pretty much the entire workflow that you've shared. Over time, it's just going to get better and better.
B
That's right.
A
That being said, okay, how can you take advantage of this opportunity? The limiting factor is ultimately just great ideas. So from your perspective, it's like, how do I hire people who've got really great ideas, scroll-stopping ideas, who aren't thinking like everyone else? And then, you know, from my perspective as a founder, I want to create these ads for myself. I'm just kind of like, well, I just need one really good idea a week, right? And if I can get one really good idea per week, and one of these ads pops and gets me the right, you know, CAC-to-LTV ratio or goes viral, that sets your company on a trajectory.
B
Yeah, I think that's gonna be the real science: how do you, in an age where... So, like, the Stephen Hawking clip, have you seen it?
A
Yeah, not only have I seen it, I've watched it like five times. I'm obsessed with that clip. We'll put that clip in.
B
Okay. So the thing I love about this clip is, well, one, it's just fucking stupid, but two, the physics are, like, photoreal, and it's kind of like, maybe this could be a sport. It's like Rocket League. So if I were a brand, I mean, this is the real question: if you're Red Bull, do you stick a Red Bull in his hand as a close-up shot and then cut to, like, the wide of this, and then he's doing this? And do you have to get permission from Stephen Hawking's estate? Like, the real question for Sora is, if the estate of Hawking uploaded him as a cameo and then charged brands for the likeness, a royalty fee or something, that's going to unlock meme branding as a thing. You know, a lot of these estates of the dead, whether it's maybe Tupac or Kobe, I think that's going to open up. In lieu of that, you're going to have all these historical figures like Einstein or, you know, Plato or whatever, and it's just going to be open season. Kind of like we were just showing you with our ads on open-source IP, on these characters that brands can work with.
A
Yeah. I saw Sam Altman yesterday or the other day, said I hope Nintendo doesn't sue me.
B
Oh, everybody's about to sue him. But he just raised, like, you know, they're at like $500 billion as a private company, so I think they'll be okay. Obviously they did the dirty playbook of, like, no guardrails and restrictions to get it to the top of the App Store, and then they nerfed the ever-loving crap out of it.
A
So.
B
So it's not that fun these days. You're just constantly getting, like, you know, "cannot generate this, cannot generate this." I do think as time goes on they're going to get certain IPs to opt in, and then those IPs will be revitalized. So I was talking to someone who is in Japan and owns a bunch of old IPs, and I was like, you've got to talk to Sora to get yourself opted in so that we can make episodes for your old show. And now, let's pretend it's ThunderCats. Now He-Man or ThunderCats becomes, like, a huge IP because everyone can remix it.
A
Yeah, I mean, this just gets me thinking, like, I want to buy IP. You know what I mean? Like, IP is so undervalued right now, because we know that OpenAI and the like are going to do deals with these IP makers. It's only a matter of time.
B
Yep, yep. I think the other opportunity is, like, production companies, agencies, small creators that can act as this go-between, where, like, okay, so the old, like, ThunderCats, I mean, it's probably all owned by, like, Hanna-Barbera slash Cartoon Network slash whoever owns Universal. I forget the whole stack. Or maybe it's, like, minor Japanese ones that are still loved, but they don't have the internal structures to know how to do the prompting. So what we're going to see, in my head, is, I want this exchange marketplace, almost like Fiverr, where you have brands that are opting in, like, okay, train on my data, but I need people to do the labor, you know, a small team. And I really think it could be simple. It would just need to be a good Hollywood writer, a good director, a good art director that can maintain visual cohesion, and then some sort of general all-around editor. And they can generate voices. You can hire real voice actors. It just depends on the budget. I think the budgets will get down to, like, to do another He-Man or ThunderCats, you could probably do it for like 30 grand on the super lowest end between all the roles, if you're doing it at scale and at volume. And then some of the bigger, more recent IPs, like Pokemon, you know, they'll still want to spend a couple hundred grand an episode, but the price gets super compressed.
A
PJ, I love having you on, man.
B
I feel, I feel.
A
You really are the viral AI ad madman, and I appreciate you sharing the sauce with us. For folks listening, in the show notes I'll include where you can follow PJ on X. I'll include his newsletter, and I'll include a link to Genre AI. Is there anything else I should be including, PJ?
B
That's it. No. If anyone wants a great course, I always plug my buddy Roark Heath's course called GenHQ. I can provide you a link for that as well. I don't yet have a course. I'm trying to do one by the end of the year. But in lieu of that, Roark's got a great top-to-bottom one, and it's like 99 bucks. It's fantastic.
A
Beautiful. Yeah. If you send a link, I'll include it in the show notes so people can check it out. And my advice to people is just get your hands dirty. Like, this was a how-to on how to create high-quality cinematic AI videos, and sometimes you just gotta get your hands dirty.
B
So yeah, I think I'll be back on in like a couple weeks, once Sora unlocks, like, pro mode and you're able to dig in. Because I do think that everything I laid out is important for now, but it's gonna be baked into Sora's kind of suite, where you can edit clips, you can kind of double tap into this, and this entire process is gonna get a lot faster and a lot cheaper. So, like you said, Greg, the biggest takeaway for any creator is just, like, start creating, start putting shots on target, consume the viral content, and then figure out: if I were a brand, how would I find this palatable, to have my image, you know, kind of associated with this? And once you create a portfolio that's full of branded viral content, that's when the keys to the kingdom unlock and you can make a lot of money and grow a huge agency.
A
PJ Ace, you'll have to come back on again. Let's do it.
B
Okay, talk soon, Greg. Thanks, man. Later.
A
Take care.
Host: Greg Isenberg
Episode: How I use Veo3 + Sora 2 to create Viral AI Videos (300M+ views)
Date: October 13, 2025
Guest: PJ Ace, Founder at Genre AI, viral AI video creator
This episode dives deep into the exact workflow used by PJ Ace and his agency to create viral AI videos that have amassed over 300 million views, including high-profile ads for companies like Origin Financial and Ramp, and even collaborations with figures like David Beckham. PJ shares an unfiltered, detailed, practical blueprint, including every tool, prompt, and trick, so listeners can replicate the process themselves. Discussion focuses on using AI video platforms (Veo3, Sora 2, Nano Banana, Rev/Reeve, and others), the creative process, and forward-looking thoughts on how AI, specifically Sora, will disrupt creative video production.
| Timestamp | Segment/Topic |
|------------|---------------------------------------------------|
| 02:05 | Overview of full workflow with new tools |
| 05:11 | Scripting and following your curiosity |
| 09:59 | Viral content secrets: IP, juxtaposition, trends |
| 12:04 | Turning scripts into images; working with Rev |
| 18:28 | Collaborating with AI cinematographers |
| 22:07 | Veo3 explained and prompt writing for animation |
| 25:13 | Cheap/free workflow tips and suite walkthrough |
| 28:11 | Advanced hacks: reference images & outpainting |
| 30:55 | Case study: “Ramp” ad, lessons learned |
| 34:23 | Refining, tweaking, and collaborative feedback |
| 36:17 | Sora’s impending disruption of the workflow |
| 39:53 | AI, IP, and the marketplace for remixing |
| 42:22–43:29 | Closing advice: Get started and portfolio value |

Memorable Quotes:
Greg (00:45):
“People charge thousands of dollars for this sort of sauce, but on the Startup Ideas podcast, it’s free for just a like and a comment.”
PJ (01:56):
“Full open kimono. Don’t point the cam down. Not wearing any pants right now. You guys are going to get full exposure with this.”
PJ (03:54):
“It’s a Super Bowl commercial—ridiculous, comedy first, with a little brand at the end. If people are laughing the whole time, they’re more likely to finish the ad, and less likely to bust out pitchforks at the AI.”
Greg (10:11):
“The fact that you ended with that was a very smart, strategic move because you shared it on X, and a lot of people trade meme coins and talk about it on X.”
PJ (22:22):
“Dolly, jib, do a three-quarter turn…honestly, ChatGPT gives pretty great suggestions and you can also be stupid with it, like ‘move camera left to end on the ship.’”
Greg (29:36):
“If you—I mean, I just, you know, recorded this docu-series and I saw how expensive it is... that shot... you couldn’t even do that, right?”
PJ (35:30):
“It essentially makes our six to eight week timeline down to what, like a week or less.”
Greg (39:53):
“This just gets me thinking like I want to buy IP. You know what I mean? Like IP is so undervalued right now…”
Greg (42:22):
“My advice to people is just get your hands dirty. Like this was a how-to… Sometimes you just gotta get your hands dirty.”
PJ (42:42):
“The biggest takeaway for any creator is just start creating, start putting shots on target, consume viral content and then figure out: if I were a brand, how would I find this palatable?”