
Host 1
Hey everyone. We just recorded and my mind is spinning. We did a pod with Gaurav Mishra, who is the co-founder of Captions AI, and they just came out with an amazing video model that is just going to completely change how you think about doing video. We are going to show you how to make amazing AI social and product videos. We got the founder on the show who's going to give you the masterclass on what to do and the core use cases. You are going to want to stick around for everything. Captions AI and their new Mirage model. Let's get to today's show. Here's a quick word from our sponsor.
Host 2
Cutting your sales cycle in half sounds pretty impossible, but that's exactly what Sandler Training did with HubSpot. They used Breeze, HubSpot's AI tools, to tailor every customer interaction without losing their personal touch. And the results were pretty incredible. Click-through rates jumped 25%, qualified leads quadrupled, and people spent three times longer on their landing pages. Go to HubSpot.com to see how Breeze can help you grow your business.
Host 1
We are about to have an amazing conversation with Gaurav Mishra. He is the co-founder of Captions AI. They just released a new text to video AI model called Mirage, and I've been playing with it and it's blown my fricking mind. I built a video promoting HubSpot's customer agent, and what I did is I had ChatGPT's O3 model write me a script.
I put that script into Captions.
I picked my actor, my editing style and everything. And what you are seeing now on the screen is just a one shot. No edits, no fine tuning. I could make it way, way better. But it's like a one shot, within like 15 minutes, attempt at an awesome product video for HubSpot's customer agent. And it's really good. And by the way, if I spent a couple hours on it, I think it would be excellent. And what's great now is you all have this power. You have it in this new Mirage model from Captions. It's remarkable. I'm paying like 25 bucks a month. The fact that this is like 25 bucks a month is literally making my head explode. And so we're going to dive in with Gaurav. He's going to show you how to do it. He's going to show you some tricks. You're going to want to stick around. At the end, Kieran and I are going to talk about our use cases, and at the very end we're going to give you the one thing you should go and do immediately with all this information. We got a fun one today. We got a last second guest, Gaurav Mishra, who is one of the co-founders of Captions AI. Gaurav, you all just released a new product slash model yesterday called Mirage. And it is probably the best text to video that I have seen for kind of short form vertical video. It's really, really good. And so we wanted you to come on and join us real quick and kind of talk about Mirage, some of the use cases and everything. So thanks for being here today.
Gaurav Mishra
Of course. Yeah, thanks for having me. And really excited to chat about Mirage. And you know, this has been a long time in the making for us. You know, we started the company four years ago now, like a little bit before sort of the AI hype even started. And this is something we were thinking about back then, but we didn't even imagine that it would be possible in such a short period of time. Like we were thinking like 10 year timelines. Right. And to see that it's become actually usable and actually useful in such a short period of time has been crazy as a journey. And yeah, happy to chat a little bit about Mirage and give you a little bit of background if that helps.
Host 1
Yeah, I think for most people who know about it at all, their reaction anytime they hear the phrase text to video is like, ah, it's fine, it's not that good. I'm not sure exactly when I will use it. Maybe if I need to take a video from English to Spanish, I'm okay with using it for something like that. But that's kind of the range.
Gaurav Mishra
It's so true. And honestly, that's why we waited a while to really start on the video generation journey. Our goal has been as a company, let's make it easy to create video. And our target customer actually was sort of the long tail small business professional, and worldwide, by the way. Right. And so when we started out, we recognized two main problems. It's really hard to record a video because being in front of camera, having the camera presence, like it just doesn't come naturally to everybody and people don't want to do it. Right. And when they do it, they still get just footage, and like footage is half the journey, and then they have to edit it. And like editing is super technical. It's a totally different skill. And there's like so many technical terms and all that to get something that's actually usable, right, at the end of the day. And so we wanted to solve these problems. Like we wanted to get these problems out of people's way so they can actually produce whatever comes to mind, right? And so Mirage actually is really tackling the first side of the problem, right? Which is recording. How can we make recording really, really easy? I'll give you a little bit of background on this really quick about the space. So when people think text to video normally, they probably are thinking about generating sort of B roll, what you might think of as like stock footage, right? Like if you go to a normal sort of text to video generating website or company, you'll type in text of something like, oh, like cityscape of New York or something like that. And it'll be like a shot from like a drone of like New York City or something like that. But that's stock footage, right? Like is that really that useful, right? Like, and we always question that of like, why are we building multi hundred million dollar models to create stock footage? It almost makes no sense. And really we wanted to focus on like the A roll, right?
Which is like how do you actually tell the story, right? A lot of which is talking. And like you watch a movie, you know, you watch a TV show. Even social media these days, right? Is just mostly talking is communication, right? And B roll and stuff is there, but it's kind of interspliced some percentage of the time to enhance the story. It's more actually part of the editing part of the product for us. Right. So that's kind of why we decided to focus on Mirage, because we didn't think anybody was really building foundation models for this. And we wanted to make sure that something like this existed, solving the right problem for the user. And so that's sort of the big launch that we did yesterday is bringing this to market finally with real use cases. And we've tested a lot of this stuff in real life scenarios. And not only can we match what people can do with traditional tools, but we can even exceed it in many cases in actual specific use cases too. Which happy to talk about.
Kieran
The one thing I'm curious about, when you think about that storytelling problem you wanted to solve and primarily through the kind of people being able to tell you that story. Did you think about it through me being able to clone myself and tell the story or me being able to tell the story through these kind of AI avatars and maybe you can kind of integrate that into the kind of use cases you're going to show us?
Gaurav Mishra
Yeah, definitely. And I think that's another interesting sort of point that you brought up around. Like there have been two types of video companies too, right? There's the traditional sort of text to video companies that are doing more B roll and then there's the avatar companies. But if you look at the avatar side, you notice that actually a lot of them are just using real people, which they're contracting, right? And then they're sort of dubbing over them. And so it actually ends up having a lot of the same problems that traditional production actually ends up having with like contracts expiring and things like that. I think the future that we see. And by the way, this will get interesting and maybe slightly dystopian. So let's talk about it.
Host 1
We love a good dystopian break here on the show.
Gaurav Mishra
So, I mean, I could see a future where every brand and every small business kind of has a virtual person, right? It's like a likeness that they actually own as IP. I could even see people trademarking these likenesses, right, as like part of their brand. Like this is our representative, right? That person shows up in all of their marketing in different places. It's the trusted face of the company almost, in a way. And this might become very common, right? It's not a thing that happens right now, but something that becomes very easy once you can just create people, which Mirage can do. Like with Mirage, if you just describe, I want a person of this type wearing these things in this location, it will come up with a person that looks absolutely real, like indistinguishable, basically. And you can kind of have that IP of that likeness in your Mirage Studio. You can make as many videos or different types of locations, whatever you want to do with it. Which is kind of a brand new unlock almost, which wasn't possible before, right?
Kieran
Yeah, that is actually a fascinating point. I just want to touch on that for the listeners because we actually have our creator program in HubSpot where we work with a bunch of creators across podcasts and YouTube and all of these different channels. And they are like real people with real audiences. But the average SMB is never going to have a creator program. It's like too much of a, you know, an investment for them. But what you're saying is like, in the future every SMB could have a creator program that has virtual creators who are fine tuned to those channels to communicate with that audience. And they can become the voice of the brand within those channels. And they're proprietary, they're like licensed to that company, so they're not going to appear anywhere else. They can have their own personalities. They can be very, very good at that content, because AI is doing the content, which I think is actually a really good use case. I would love to just hear, since you've been doing this for some time now and the company's four years old: do you think humans are going to be fine with the people they follow being AI influencers versus real people, as long as the content is good? Do you ever think through that problem or think through that scenario?
Gaurav Mishra
Yeah, I think, to be honest, that is where the dystopian future begins. If people are okay with that, if people accept that, you can start imagining a world where pretty much a vast majority of social media, like, you imagine your TikTok and Instagram Reels and things like that are practically generated. Like, you watch stuff today, everybody watches Instagram, right? You're scrolling through. You don't know. You don't know any of these people, right? Like, who knows if they exist or not? If I told you, like, yeah, half the people don't exist, like, you would have no way of verifying that either, right?
Kieran
Yeah.
Gaurav Mishra
And so, almost to an extent, you can imagine, like, the best algorithms of the world today are finding the best content and matching people with it, right? They're finding the best content in the world and matching people with, like, you want to watch this, you want to watch that, right? Future best algorithms in the world might be pretty much just fabricating it out of thin air, right? Like, it's an unlimited sea of content that just never ends. Like, whatever you're interested in, there's more of that available, right? So I could see that happening in the next five or 10 years. Something of that nature, right? And a lot of it depends on people getting used to. Are people going to accept virtual influencers and stuff? My theory is yes, because people won't be able to tell the difference. And at the end of the day, if it's entertaining and interesting and you can learn something from it, I think people will watch it. And the use cases are broad, right? Like, not just social media, education, tv, movies. It's just very broad. So, yeah, tons of possibilities.
Host 1
I think what's interesting here is that if you're watching the show today, you're getting a glimpse into how drastically different the Internet and the social web are going to be. It's just going to be a very different experience to consume content in the future. I guess my question to both of you, Kieran and Gaurav, is, like, if you had to make a guess, an educated guess, and put like Netflix and like high quality streaming aside, but like social video, when is the majority of social video AI generated versus people generated? Like, how long till you think?
Gaurav Mishra
I think it's already starting to happen. Like, I would imagine that if you are TikTok and you're kind of monitoring synthetic content, or if you're Instagram, you're probably seeing like somewhere in the 1 to 5% range at this point, but I could see that becoming 20 to 30% in a couple of years.
Host 1
So you think five years till it's the majority?
Gaurav Mishra
In five years, pretty much, you would struggle as a real recording to compete with what synthetic can do.
Kieran
Because the other thing that will happen that will accelerate this is like platforms like Meta will offer this as a service and weight up their own avatars and influencers. Like, that will 100% happen. So that will accelerate what you see in your feed.
Host 1
I think we're two years away, by the way. I'm going to go on the record. Two years.
Kieran
I agree.
Gaurav Mishra
I like it. I mean, honestly, I think you might be pretty much spot on.
Host 1
I don't think it's crazy because I will say, because I want you before you have to go to talk about use cases in a second. But as we kind of set up the problem and what Mirage is and what you can really do with this technology now, the reality is not only do you need a video editor, not only are a lot of people not photogenic, it takes a long time, like it takes days or weeks to make a really good video in a manual process right now. And when you can do it, I mean, I was, I made some this morning in like 20 minutes. Right. And when you can take it from 20 days to 20 minutes, that is a drastic acceleration of people coming online and creating that were never able to do so before. And this is also for a price of like $25 a month. Right. You're not talking like thousands and thousands of dollars, you're talking at a fairly low cost.
Gaurav Mishra
Totally. And I think one of the coolest things that we've seen, and we want to differentiate Mirage from other models, of course. And like, we want to have our own sort of like specific focus that we really dive deep into. And our focus really is on realism and acting. That's where we kind of come in. Right. So we're not going to help you make animated characters. Well, you can try it, it probably works, but it's not our focus. Right. We're not going to do 3D models and things like 3D characters and stuff like that, like Disney characters. But when it comes to very real looking people, that's where we want to excel and that's where we've put in a ton of effort. You can bring in an image of kind of like a fake looking AI generated image with the glow that AI has sometimes. You can bring that into Mirage and it will remove it, it will make it more realistic. Right. Which is something we've specifically worked on. And then the second thing is acting, because literally, I mean half the battle is like delivery. Right. And I think what people forget is like body language is language. It's part of language, right. It's called body language for a reason and it has to be good. If you've ever seen a bad actor in a movie, literally the definition is when their body language doesn't match what they're saying. They may be saying something, but the body language is off. It just is not portraying that emotion. That's bad acting. You don't want that. We focus specifically on how do we make sure that the body language is exceptional. It should feel like the person went to acting school for two years and came back and then they're delivering the lines there. And so those are the two things we've put a lot of effort into, and I think that's where we win basically against everybody else in a comparison. So that being said, that's something we're going to continue focusing on and that's where we'll see the most acceleration of our capabilities in the future.
Host 1
Cool.
Kieran
The one point just before we get into use cases: let's say Captions AI becomes a default platform for creating video and there's other tools available as well. And then most SMBs are using that tool and using O3 to craft scripts. I always come back to what delineates who gets the likes and engagement. Because it's being created by AI, it's using all the same AI tools. I think that's not unique to video. I think about that everywhere. When there's mass adoption of these kind of tools, what delineates the average from great? Today I think it's down to human execution. But what if you remove human execution and most of the execution is done by the AI and AI is like incredible? The reason I was thinking through this is because I was thinking about legal cases. What happens if one side has access to incredible models and the other side has access to incredible models? How do you win? And so I think it's an interesting point to think about. I still think there's human flavor you can add to things, taste and things like that, that will delineate what is better than everything else.
Gaurav Mishra
Honestly, I couldn't agree more. I can tell you I'm the number one user of Mirage Studio. I think I use it pretty much 24/7. And I make all kinds of stuff. Like I've made music videos, I've made like, you know, marketing content, all kinds of stuff. And at the end of the day, like, it's a skill, right? Like, yeah, I'll say it honestly, it's not easy, right? Like, making stuff is not easy. Even with the best tools in the world and even with something like Mirage Studio, right. You have to have a concept, you have to think about it. Like, you know, it's all the same problems that you normally might have solved. Like creative problems, right? Problems of taste, problems of like, what is it that people actually want to hear, what is it that people actually want to watch? If you don't have a good answer for that, these tools may not help you. And honestly, even the GPTs and the O3s and the best models of today, they're going to give you generic answers of generic content that is not going to resonate with anybody. You have to have the angle. How do I tell the unique story? Even if you take some trending topic that's happening today. Oh, like tariffs or something like that, right. If you just go out and say like tariffs are happening, like nobody cares, nobody cares to watch your video. Everybody knows, you know, there's a hundred different pieces of content. How are you talking about it? That's a unique angle. That's something that still gets people interested. And by the way, like, GPT is not going to come up with it. I can promise you that. You can prompt it 100 ways. Right?
Kieran
Right.
Gaurav Mishra
And so I think what actually is happening at this level right now is jobs are evolving, right? Like the craft of how these videos are created. Not just videos, but content in general is just evolving. It's changing, right. It's becoming a lot easier. But that also means that there's a higher bar for the type of content that deserves to be created. Right. I actually liken it quite a bit to like previous changes in technology that have happened. Like you look at the music industry, this is one of my favorite examples. Right. There was a time where there was no digital music and it was just acoustic. Like you would play the guitar. If you can't play the guitar, you can't be a musician pretty much. Right? And yeah, digital music came along, and a lot of people were like, well, this is just trash music, right? Like, this is not real music and these are not real musicians. A lot of people said that. And maybe that's true. I don't know, maybe it's all trash music. But the reality is there's more music in the world, there's more musicians in the world, that bar is higher, right? And I think the same thing is happening here. I also like the software engineer analogy here, right? Like, software engineering is the only job in engineering where you can imagine anything and make it, right? Civil engineering, you can imagine anything, but you can't make it, right? And I think that same thing is happening to a lot of the content production, where anything you can imagine, you can sit in your basement and make it, right? You don't need physical processes. They're not going to get in your way. So in a way, so many more people are going to be enabled to make the type of stuff that they could only have imagined before, right? And I think that's the power of all this. Now, on top of that, I will say there's companies in the world working on like general intelligence and things like that.
If those things come to fruition, if general intelligence is created, I think the world will change in ways we can't even imagine right now. You know, we're all going to be out of jobs, probably at all levels. So.
Host 1
As somebody who's sitting in his basement making videos this morning, I feel seen and appreciated. So thank you. We'll be right back.
Host 2
But I want to tell you about another podcast I love. The DTC Pod, hosted by Ramon Berrios and Blaine Bolus, is brought to you by HubSpot Media. DTC Pod is a podcast about all things direct to consumer. Ramon and Blaine cover everything from starting, growing and optimizing e-commerce stores and direct to consumer brands. They talk with founders, marketers, platforms, creators and marketing and growth agencies to cover topics like brand building, social media, influencer marketing, website conversion, paid media, Facebook ads, consumer trends, email marketing, and much, much more. If you're interested in the stories behind your favorite consumer brands, this podcast is for you. They just did an amazing show called Meta Ad Secrets: How Top DTC Brands Spend 300K Monthly Profitably. You can listen to the DTC Pod wherever you get your podcasts.
Host 1
All right, we have a couple minutes left. We'd love for you to talk through like kind of the core text to video AI video use cases right now.
Gaurav Mishra
Awesome. Cool. I mean, let me show you Mirage 3 really quick. So it's available actually at Mirage App. That's where we've put it actually. It's a different website. We've done that deliberately to kind of get people on a brand new experience with this. This is like my workspace over here. You can see you can start off by basically uploading an audio file or generating one. You can also choose these presets if you just want to test it. And so you bring in an audio file, or you generate one if you have a script. So obviously you can write your own script. You can use ChatGPT or something. And we provide like a lot of the stock voices from all the top providers, like ElevenLabs and so on. Or you can upload an audio. So we actually find uploaded audio works really well because you can really specify how you want the lines to be delivered. A lot of the use cases we've seen around this is like your typical performance advertising, social media, obviously like SEO content and stuff like that, which we can talk about. But it becomes really easy. You don't actually have to go through a lot of the process of production to make stuff. So I'll give you an example, like let's open one of these projects that I have. So this is a project we've already done. You can see the audio here. After the audio is uploaded, you can select actors. You can go through this like audition process. So let me show you that really quick actually. So let me select this audio. Press continue here. It'll show you like a little bit of a transcript so you can make sure everything's okay. You can like adjust like the jump cuts and stuff like that where you want them to be in your video. So you can move stuff around and it'll show you basically this actor screen where you can import sort of these existing actors that we've already created for you. You can generate actors from text just with a description. This you can write with ChatGPT or you can like write it on your own.
You can also customize the background with just text. So you can describe blurred background, you know, basement vibes, whatever you want. And then you can upload an image too. So here you can literally upload an image of, you know, anybody you want to clone or, you know, a fictional character you created on another platform, like Midjourney or something like that. And it will just use it. One thing to watch out for here is like you want to make sure that their teeth are visible, like they're smiling or something. So, you know, it knows what the person's teeth look like. Otherwise it's going to create them out of nothing.
Host 1
Pro tips.
Kieran
Yeah, exactly.
Gaurav Mishra
I mean, they're going to be great teeth, but they're not going to be your teeth, I can tell you that. So actually, what you end up at the end is something like this where you can cast a bunch of actors and have them play it out. You can actually literally try different people and see what looks best. Like, I'll open up this guy. And by the way, one of the coolest parts, you can retake any line, right? So kind of like a real recording where you're a director sitting in a seat being like, hey, I should redo that part. Can you like, do this one more time? And you can do like 10, 15, 20 retakes as much as you want. And every time the actor will deliver the line slightly differently with, like different movements, hand movements.
Host 1
That's cool.
Gaurav Mishra
And you can pick, like the combination that you like the best. So here you can see, like, I did two takes of everything, right? But let me play this out so you can see what it looks like.
Host 1
There goes Kieran's weekend.
AI Actor
Buy an exciting Camry with low 3.99% APR financing, or get low APR financing on a stylish new Corolla sedan. It's go time. Toyotathon is on. Toyota. Let's go places.
Gaurav Mishra
So as you can see, like the hand movement, the gesturing, all that stuff, like it's fully customizable, right? You can actually say, like, the part where he's pointing his finger, "Let's go places," you can regenerate that and it will do a different hand gesture, right? He'll move his head differently. He'll like express it differently. And, you know, each one will have a little bit different meaning. So you can be the director and pick like, actually that's the right one. And by the way, next step coming on this is you'll be able to literally say in words, right? Like, okay, redo this, but like raise both hands this time, like two finger guns. And so it'll be a full on, like, you are the director, make the video how you want, right? And the person obviously doesn't exist, right? This is not a real person. And you can see like the skin realism and stuff. We focused a lot on making that exceptional. So that's a little quick rundown, but let me actually show you. I can show you at least one video, but maybe two, where we've made an ad with this. So it's actually a launch video for Mirage. All the footage in it that's the A roll footage, we generated. And you'll see that it's kind of this like Japanese inspired studio setup with a Japanese streetwear sort of clothing style. You'll see like if you were to go and find someone to record this for you with this setup and this type of clothing and this person, like it would take you a lot of time to find this person, right? And here you're able to customize pretty much exactly what you want. So this is the first frame of the video. But you can see this person doesn't exist. We customized this person. Even the hat they're wearing, the studio behind them, the lighting, the mic, all this stuff is fully customized. And you'll see a second person come in as this video plays and you'll see the same thing where that person doesn't exist either, but looks exceptionally real.
All the footage was generated in less than an hour, right? I actually generated a bunch of it. Some people from the team helped generating parts of it. But it's very usable. And if you watch this ad, you'll see this is like an example of a solid performance ad. Amazing hook, like it'll work, right? So just to show you what it looks like overall.
Narrator
Need a talking head video, but don't want to get on camera? Here's how to generate one in under five minutes without gear, lighting or filming anything. You type in the script, choose an actor, and Mirage Studio generates clean, expressive video. This isn't for cinematic work, but for ads, product demos, explainer videos and landing pages. It's fast and it scales. You get 90% of the impact without 90% of the production. For marketers, founders or anyone scaling content without a production team. Mirage Studio. Built for scale.
Kieran
That is by far the best AI created video I've seen. And just to give you and Captions and everyone a reason to go use this: in a recent survey, 67% of B2B buyers said the primary way they want to learn about your product is short form video. There's like a clear use case of why I would use this product straight away.
Gaurav Mishra
No, it's super exciting. That is dope. I mean, look at that last person, how she moves her head. Like we did multiple takes to get the perfect movement and stuff, right.
Kieran
I would not have known that was AI.
Gaurav Mishra
It's crazy.
Kieran
Biggest compliment I can give you. There's nothing I could have picked up on. Like I feel like I can usually pick up on the video apps because we spend so long looking at tools and in the weeds. But that is just so good.
Host 1
How many hours you think went into that video to make?
Gaurav Mishra
So this was made literally over the weekend, you know, which is not a great thing for our company. We had to work over the weekend to get this launch done. But we generated the footage in probably an hour. About all the footage and then, you know, editing and stuff. Yeah. So over the weekend, maybe two days.
Kieran
So good.
Host 1
It's a pretty sick video for a two day project.
Gaurav Mishra
Yeah. I mean, I can promise you, like there's no way we could have done this. I mean, we may not even have practically been able to do it. Like how would you find this person with this setup and this environment and this clothing? Like just go to Tokyo.
Kieran
Yeah. All the recording, the edits.
Gaurav Mishra
Right. I mean this is like a Tokyo recording studio, like podcast studio with a person who speaks perfect American English. Like it just all the combination is like almost impossible, you know.
Kieran
Where did the B roll come from? Because I was creating a video earlier on in Mirage and I was like wondering how do you splice in all the kind of cuts you have?
Gaurav Mishra
So in this video, the B roll is real B roll. It's like real actual footage from a professional videographer, photographer. But I'll show you another video where the B roll is generated. So we don't generate B roll. So you can use a ton of different websites. In the Captions mobile app, we also have a lot of companies that generate B roll integrated. Like Luma Labs is one of my favorite ones. They do really good text to video. Pika is another favorite of mine. Yeah, they do a little bit more like fun stuff but good B roll. So that's kind of what we would usually use for like B roll. So let me show you another video. This is a little bit crazier. It's a music video. It's like a 90s inspired rap video.
Kieran
By the way, this is what I'm going to use it for. This is my dream, being able to create a music video. So I will also be trying to do this with Mirage.
Gaurav Mishra
Let me tell you this like it is so addicting.
Kieran
Yeah.
Gaurav Mishra
I've been using Suno, if you've heard of them. They do like. Yeah, yeah.
Host 1
Beats.
Kieran
Yeah.
Gaurav Mishra
So good. Like the music turns out amazing. Like you have to generate a few times to get something you really like. But once you get it and then you're making the music video, it's so much fun. Like I was up till 2am making music videos. It's ridiculous.
Host 1
Literally. This is my entire WhatsApp thread with Kieran is just going to be him sending me random original AI music and videos for like the next like two weeks.
Kieran
You're talking, Gaurav, to someone who used to hire puppets on Fiverr to do battle raps and send them to people in HubSpot. We'll all be kind of addicted to this.
Gaurav Mishra
It's great, to be honest.
Host 1
Literally super addicted.
Gaurav Mishra
Yeah. I mean, there's people in the company, like, also sometimes I'm like, how is this my job? Like, I. I don't understand. It doesn't make any sense. But check this out, right? So this is like a 90s inspired rap video that our video editor put together. I generated a bunch of the footage for this myself. But you'll see what it looks like.
AI Rapper
Palm trees swaying city glowing that dust mirage on the horizon Dreams in the dust AI scheming in the back coast they discuss Silicon Valley Streets where the circuits combust foundation models Script sharp as a knife Acting like a legend Brought the screen to life Binary flows got the data and strike Mirage playing the roles Cutting scenes with the slice.
Kieran
That's.
Gaurav Mishra
Got a hook and everything.
Host 1
Kieran's done for the rest of the day.
Kieran
Oh, God. I'm logging off.
AI Rapper
Put it into the sheen Deep fakes deep state Scripts written in chrome Mirage hitting harder than a rolling stone AI let the richer in the digital dome Scripts so slick made the world their home Neural nets flexing no strings, no ties Algorithms truth mixed with lies Mirage hitting scenes got that look in its eyes Call it a phantom the start it won't die.
Host 1
Oh my gosh. Kieran's never been so happy.
Kieran
Oh, my God, that is just so good. That's so good. You should release that.
Host 1
Kieran's like, I'm going to stream that on Spotify.
Kieran
I'm just thinking about, like, me and my brothers used to use the puppets in Fiverr to battle rap each other. And so we would write scripts and then send them. Like there was actually like the puppet would rap it and I would send to my brothers. This is just a futuristic version of what I can actually do now is I can create battle rap disses and videos and avatars. I might actually have to quit my job, Gaurav, because this might just be all of my time now. I have a whole bunch of ideas. I need to stop. I have too many ideas. That was awesome.
Gaurav Mishra
Awesome. No, great to hear it. And yeah, we'd love to see what you create. I would love to see some puppet rap.
Host 1
But in all practicality, to kind of close out. Cause I know you got to jump. It's really ready for social, Instagram Reels, TikTok, YouTube Shorts, and it seems like it's really real for any kind of like website or email video. Like, if you're like, hey, I've got a prospect, or I know that I've got a page where a certain profile of customer is coming to and they need this very specific message, I can basically deliver a great 30 to 60 second video in places that I wouldn't have otherwise done. And you've also, you told me when we chatted last week that, like, there's a lot of people just using this for SEO.
Gaurav Mishra
Yeah.
Host 1
People are putting these videos on webpages to get extra engagement time.
Kieran
Short form product video. That's perfect.
Host 1
Use case, short form product video.
Gaurav Mishra
I mean, it's huge. The SEO thing works quite well because, you know, a lot of the text SEO is getting eaten up by Gemini and stuff these days, as you guys know. And if you convert it to video, Gemini can't eat it as easily, which is nice, and it's potentially more engaging, you know, which is good too.
Host 1
Yeah.
Kieran
Long tail, short form product video strategy. That's a perfect use case for SEO.
Gaurav Mishra
Yeah.
Host 1
All right. I feel like we have like a hundred ideas now. We're going to do a little bit of a close out, Gaurav, but I know you got another call, so we can let you jump.
Gaurav Mishra
Awesome.
Host 1
Thanks for doing this last minute, man. We appreciate you.
Gaurav Mishra
Thank you.
Kieran
Thanks so much, Gaurav.
Gaurav Mishra
Take care. Bye.
Host 1
Can producer Darren come in?
Producer Darren
Yeah.
Host 1
Can I get the real hot take from producer Darren of what he just saw on Mirage? As somebody who plays music, you are deep into the audio video world. Like, what's your take, producer Darren?
Producer Darren
Yeah, I mean, it's amazing. It's kind of baffling, isn't it, just to watch that? Because it's. I mean, you can't tell that it's not real, you know, so it's kind of a mixture of scary and exciting, you know?
Host 1
So it was better than you thought it was going to be.
Producer Darren
Yeah. I mean, the thing that's really surprising is because with a lot of AI stuff, when it cuts from scene to scene, you can see there's like differences between the characters. But with that one, it's like the continuity is amazing, isn't it?
Kieran
It's much better than I thought it would be as well. I played around with it this morning. I think, to Gaurav's point, there is still some skill in learning how to use a tool, because I didn't actually know you could do the repetition of takes.
Host 1
The repetition of takes feature is bananas, man.
Kieran
Yeah, it's so good. So like I need to do that. The B roll is a big thing. So I'm glad he talked about how you can get B roll and add it into the video because I wasn't sure how you could actually integrate the B roll. But the video he showed, which was the ad for the actual tool, I would not have known that was AI.
Host 1
No, it was bonkers good. This is my take, you two. And I'd love to hear if you agree or disagree with this. I think if you are the average small to medium business and you've got a couple of marketers, this product is good enough for you to do your vertical social video and to do some product videos on your website that you would have just never had the time or money to make. For sure I would go and do this now. I think if you are a large to enterprise business, I think I would still use all of the avatars and the scripting and the takes and everything to get all of the A roll, which is the person talking on screen and the audio. I would then probably have a similar flow where I generate B roll in one of the products that Gaurav was talking about, Luma or what have you. And then I would probably have a very high speed editing process set up, whether I have a dedicated editor or I maybe have a team in a different time zone that can put the A and the B together quickly, overnight and have like fast turnarounds. That's where I think we're at. Like, do you guys agree or disagree with that?
Kieran
I think if you use that tool and you have someone that can use that tool, you have a whole video team. And so the average SMB now can actually have an entire video strategy, like as part of their marketing strategy, which was never really possible to begin with. And I also think his idea of having avatars licensed to your business that could be the face of your business, and training them on those channels, is actually really good because people will build a relationship to that person, they'll recognize that person, that person is associated with the brand, put them on your team page. I think that's a great use case as well. That is not being done yet. And so I think there's a ton of things in there. If you just got the tool and started with short form product video, you're in a great spot.
Host 1
Well, that's what I was saying. If you're watching the show today, you're in the early crowd and there's an arbitrage to take advantage of what you can literally do. And Kieran, I think we might want to do a follow up show on this where we'll make some videos in Mirage where we'll take copy from product pages. Like we'll just take some HubSpot product pages, take that copy, have ChatGPT convert that product page copy into a 60 second script to explain that product and then we'll put that 60 second script in Mirage, pick the actors and everything and generate that. And you should be able to do all of that. Like you should be able to do each product in like 20 to 30 minutes.
Kieran
Right.
Host 1
Start to finish. And like that is a game changer. You would have spent thousands or tens of thousands of dollars to do that before. And this is literally, I paid for it this morning, $25 a month, this product.
Kieran
Right. That's the use case. There's a window where you will be able to take advantage of that before everyone else is actually using these tools.
Host 1
Yeah. And we'll do a show on how to do this. But we'll also put some of these videos on the site and we'll come back and try to report back traffic changes, engagement changes, are we getting more search traffic, everything. But from the people I've heard doing this early on, it's working pretty well.
Kieran
Yeah.
Host 1
It just feels like this is an obvious no brainer.
Kieran
Yeah, it's a no brainer. Again, it's an arbitrage opportunity. There's a window where you can actually get real benefit from this.
Host 1
I feel like we've done hundreds of shows now. There's probably been like 20 that I've seen Kieran's mind be someplace else because he's thinking of what he's going to do and this is one of those shows.
Kieran
Yeah. I have just so many ideas and videos I can build.
Host 1
There's so much application and it feels like every week we show up and we get like, I feel more powerful.
Kieran
Yeah, yeah.
Host 1
Like, oh, I could make videos now. I could never make videos before. This is amazing.
Kieran
Gaurav's description of this is the right description, which is there's just a bunch more people who can create music now because of these tools. This is what these tools are enabling for now is that if you have good ideas, if you have great taste, then tools and expertise are no longer your downfall. Like you don't have to wait for A videographer, you don't have to wait for someone to do all of the tooling and editing. You can just go create. And I think that's a cool place to be in.
Host 1
Yeah. Like, I want hundreds of people within HubSpot making videos. And right now we have, like, tens of people making videos. Right. Like, that change in magnitude is remarkable. And what's great is those tens of people who are really great specialists are going to kill it. Like, they're going to use this to make even better stuff even faster. And then the rest of us who are not superheroes on video are going to have a really good video someplace where a video just would have never existed before.
Kieran
Right. Just democratize it to the best ideas.
Host 1
It's kind of mind blowing. Just for context, for everybody watching, this was a last minute show. I was like texting with Gaurav last night to get this going. But I'm really glad we did it because I think we're going to look back in a year and be like, this launch with Mirage was one of the tent pole moments for AI video. Like, you had Google's Veo model, you have a few of them. But like, what Gaurav and the team at Captions have done with Mirage is they have gotten quality for a very specific use case really good.
Kieran
Right.
Host 1
If you are making a vertical video, especially for social media, you can now do that very well with AI. I was talking to him off mic, Kieran, before you got on, and he is like, the amount of adoption and GPU usage in the first 24 hours has been off the charts.
Kieran
Yeah, yeah. They've had that viral takeoff moment.
Host 1
They hit it and now it's just like they're hanging on for dear life to, like, keep up. And that's how you know you've hit like the next level of something good with these new technologies. All right, so that was Mirage. It's a new model from Captions AI. It really democratizes short form product videos and social videos in a way that's never happened before. Kieran and I are blown away by it. I highly recommend you go and check it out and you can really use it to participate in video in a way that you've probably never been able to do for your business. I have so many ideas. I'm going to go and play around with those now. Thank you so much for joining us and we'll see you real soon on Marketing Against the Grain. Look, on this show we have preached the good word about how AI is completely changing the game for content, for creators and for marketers. But we've never given you the exact steps you need on how you can change your content creation workflow. That is until now. We just dropped a step by step guide on which AI tools you can use today to start 10x-ing your content creation workflow. Learning AI for content creation has never been easier. You can steal our entire process right now. Link in the description below.
Episode Title: This AI Does in 20 Minutes What Takes Video Teams 20 Days
Host/Author: HubSpot Media
Release Date: June 5, 2025
The episode kicks off with Hosts Kipp Bodnar and Kieran Flanagan introducing Gaurav Mishra, the co-founder of Captions AI. Gaurav unveils "Mirage," a groundbreaking text-to-video AI model designed to revolutionize video content creation. Kipp shares his personal experience using Mirage, highlighting its efficiency and affordability:
Kipp Bodnar [01:25]: “I could make an awesome product video for HubSpot's customer agent in just 15 minutes with Mirage. And it's only $25 a month. It's literally making my head explode.”
Gaurav delves into the origins and aspirations of Captions AI. Founded four years prior to the podcast, the company foresaw the potential of AI in video creation even before the surge in AI interest. Initially anticipating a decade-long development timeline, the rapid advancements surprised them:
Gaurav Mishra [03:18]: “We were thinking about this back then, but didn't imagine it would be possible so quickly. Seeing it become usable and useful in such a short time has been a crazy journey.”
Mirage is tailored to address the challenges faced by small businesses and professionals worldwide, focusing on simplifying the video recording process and eliminating the technical hurdles of editing.
Gaurav differentiates Mirage from traditional text-to-video platforms that primarily generate B-roll or stock footage. Instead, Mirage emphasizes storytelling through talking heads, aligning more closely with how narratives are conveyed in movies and social media:
Gaurav Mishra [04:08]: “We wanted to focus on the role of storytelling, primarily through talking. Mirage is solving the problem of recording by making it extremely easy.”
A significant portion of the discussion centers around the potential for AI to create virtual influencers—digital avatars representing brands. Gaurav envisions a future where every small business could have its own AI avatar, becoming the face of their brand:
Gaurav Mishra [07:39]: “Imagine every brand having a virtual person they own as IP, their trusted face across all marketing channels. Mirage can create these realistic personas, opening up new possibilities.”
Kieran expands on this, drawing parallels to HubSpot's creator program and the scalability AI avatars could offer to small businesses:
Kieran Flanagan [07:42]: “Every SMB could have a creator program with virtual creators tailored to their audience, becoming the voice of the brand on various channels.”
However, Gaurav also touches on the potential dystopian implications, questioning societal acceptance of AI influencers:
Gaurav Mishra [09:40]: “People might accept virtual influencers because they can't tell the difference, leading to a sea of AI-generated content on platforms like TikTok and Instagram.”
The hosts and Gaurav discuss the rapid adoption rate of AI-generated video content. While it currently represents a small percentage of social video, Gaurav expects its share to grow substantially within a few years:
Gaurav Mishra [11:35]: “It's already happening at 1-5% on platforms like TikTok. I could see it reaching 20-30% in a few years.”
Kieran counters with an optimistic outlook, suggesting mainstream adoption within two years:
Kieran Flanagan [12:17]: “I think we're two years away from the majority of social videos being AI-generated.”
Gaurav concurs, acknowledging the accelerating advancements:
Gaurav Mishra [12:21]: “I like it. I think you might be pretty much spot on.”
Gaurav provides a hands-on demonstration of Mirage, showcasing its user-friendly interface and versatile features. He walks through uploading an audio script, selecting actors, customizing backgrounds, and refining video segments with multiple takes:
Gaurav Mishra [20:23]: “You can upload an audio file, choose your actor, customize the background, and even retake any line to perfect the delivery.”
The hosts are visibly impressed by the quality and realism of the generated videos. Kieran remarks:
Kieran Flanagan [26:03]: “That is by far the best AI-created video I've seen. I would not have known that was AI.”
Gaurav emphasizes Mirage's focus on realism and acting, ensuring that the virtual avatars deliver authentic body language and expressions:
Gaurav Mishra [13:12]: “We focus on making sure the body language is exceptional, like someone who went to acting school.”
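Mirage's versatility is highlighted through various use cases, including short-form social video (Instagram Reels, TikTok, YouTube Shorts), product videos for websites and email, video embedded on webpages for SEO and engagement, and music videos.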
Kieran underscores the strategic advantage for small to medium businesses:
Kieran Flanagan [32:44]: “If you're using this tool and have someone who can navigate it, you essentially have a whole video team, enabling a robust video strategy that was never feasible before.”
Gaurav adds that beyond simplicity, the real value lies in creative execution and storytelling:
Gaurav Mishra [17:18]: “Even with the best tools, you need good concepts and unique angles to resonate with audiences.”
The conversation shifts to the evolving landscape of content creation. Gaurav draws parallels to the music industry's digital transformation, where technology lowered barriers to entry but raised the standards for quality:
Gaurav Mishra [17:19]: “Digital music made it easier for more people to create, but it also raised the bar for what deserves to be heard.”
He emphasizes that while AI tools like Mirage democratize video production, human creativity and unique storytelling remain paramount:
Gaurav Mishra [19:09]: “Content creation is evolving. Tools make it easier, but you need the right angle and creativity to stand out.”
As the episode wraps up, the hosts express their enthusiasm for Mirage's potential to transform marketing strategies. They encourage listeners to explore Captions AI's Mirage and leverage its capabilities to enhance their video content without the traditional time and financial investments.
Kipp Bodnar [38:55]: “If you are making a vertical video, especially for social media, you can now do that very well with AI. This launch is a tent-pole moment for AI video.”
Kieran hints at future episodes where they plan to demonstrate creating videos using Mirage, promising actionable insights for listeners to boost their content creation workflows.
"This AI Does in 20 Minutes What Takes Video Teams 20 Days" highlights the transformative capabilities of Captions AI's Mirage model. By drastically reducing the time and cost associated with video production, Mirage empowers small businesses and marketers to create high-quality, engaging video content effortlessly. The discussion also sheds light on the broader implications of AI in content creation, emphasizing the balance between technological advancements and human creativity.
For listeners eager to stay ahead in the marketing landscape, embracing tools like Mirage could provide a significant competitive edge, democratizing video production and unlocking new avenues for storytelling and brand representation.