Summary

Podcast Summary: "Gemini Omni: Clone yourself with AI in under 15 minutes"

Podcast: How I AI
Host: Claire Vo
Date: June 3, 2026
Episode Duration: ~15 minutes (excluding intros/outros/ads)

Episode Overview

In this unique episode of How I AI, host Claire Vo embarks on a hands-on experiment: creating a fully animated AI video avatar of herself using Google Flow's Gemini Omni video model—all in under 15 minutes. The episode is not just a technical walkthrough but also an unfiltered real-time reaction to AI-generated results—including their quirks and limitations. Claire's aim is to demystify the process and showcase how even non-experts can leverage generative AI video tools for creative self-expression and efficient content creation.

Key Discussion Points & Insights

1. Setting the Scene: The Avatar Challenge

Claire describes her goal to make a minute-long, fully AI-generated video starring her digital avatar, constructed in real-time on the podcast.

"I'm going to create a video avatar of myself and in about 15 minutes get to a full minute long video starring none other than your favorite podcast host, Claire Vo." (00:00)

2. Onboarding with Google Flow & Gemini Omni

Claire uses Google Flow’s new Gemini Omni video model, which enables avatar creation from a quick scan via QR code and a short facial photo session.
- Process involves turning her head for different angles and trusting the AI to build a baseline avatar.
"We're gonna see if we can get a full-featured avatar of myself that then we can go and build consistent character videos off of." (03:00)

3. Storyboarding With AI Assistance

Instead of only animating, the tool offers creative ideation features, helping Claire outline a hype video’s concept and flow.
- She requests a storyboard that reflects her authentic workspace: a dark, high-tech, “hacker vibe” home office.
"What I love about flow... is that it's not just a video generation tool, it's actually a whole creative suite." (07:00)
- AI proposes several frames for the video—a keyboard closeup, ergonomic chair reveal, humorous digital overlays, montage, and a call-to-action.

4. From Prompt to Generation: The Results (and Surprises)

Claire notes early hiccups—avatar references sometimes fail; images are generated instead of videos due to mis-clicks.
- Surprised by how well the background and personal details are replicated from a single reference photo.
"Surprisingly... it does have my posters and my books background here. I guess because they're behind me when I took the photo, it's taking advantage of that." (19:10)
Initial videos occasionally have glitches—like blue nail polish or uncanny facial animations:
- "The first video generated, now we have blue nail polish. I still like it." (25:00)

5. Live Reactions: The Uncanny Valley and Bright Spots

Claire candidly reacts to both successes and awkward outcomes:
- "I just got jump scared by the AI version of myself wearing glasses, turning around in a spinning chair." (27:10)
- Compares multiple generated versions of each scene, preferring some avatar variants to others for realism and appeal.

6. Editing and Compilation: Browser-Based Timeline

Claire uses the web-based editor to assemble the video scenes as suggested by the AI's storyboard.
- The editor allows fast, drag-and-drop arrangement and minimal manual effort.
"Once I click into any one video, I have a video editor timeline here that I can use right in the browser to stitch together all these videos." (31:00)

7. The Finished Product: Hype Video Debut

Claire premieres her one-minute AI-generated hype video.
- Example script, as generated:
  
  "We were told AI would replace us. Oh, my God. I'm Claire and this is How I AI. From automating the mundane to dreaming up the impossible. It's about the tools that change the way we live and work. Join me as we deconstruct the future one prompt at a time. Subscribe to How I AI..." (34:00)
- She marvels at how convincing (and at times bizarre) the avatar is, considering it took under 15 minutes, including the learning curve.

8. Reflections: What Worked, What Didn’t, and the Significance

Claire acknowledges strengths and shortcomings:
- Pros: Speed, ease, surprising realism, near-immediate creative output.
- Cons: Avatar consistency issues (face, hair, background), emotional expressiveness, some odd generative choices.
- Example analysis:
  
  "I would say about 50% of the time it's my face and 50% of the time it's like an uncanny version of my face. Some things I noticed from a character consistency perspective. This gave me beautiful long wavy hair which I have recently cut off... background has, has books and a, an hourglass..." (36:00)
Overall, she’s “kind of obsessed” and encourages listener experimentation.
- "This might be my new favorite hobby project." (40:00)

Notable Quotes & Memorable Moments

On the democratizing potential of AI video:

"What I really appreciate about these new generative AI models, in particular these multimodal ones, image and video, is it unlocks for me an ability to generate, create something that I would have never been able to do before." (08:00)
Facing AI’s limitations:

"I'm not sure it 100% has emotions really well. And some of the timing and hiccups you noticed while you were watching the video, I spoke over myself those sorts of things. But this scene right here is legitimately pretty good." (38:40)
On iteration and accessibility:

"We're talking probably 10 minutes top to bottom... 15 minutes from very beginning, knew nothing about this tool to I have this one minute video now I can share with you all. I'm pretty blown away you guys." (39:30)
Call to action for listeners:

"I want to hear if you all are willing to put your avatar in here, if you can actually get it to generate consistent characters and what your experience is using these kind of incredible new video models." (41:15)

Important Timestamps

00:00–03:00 — Episode introduction, overview of goal and tools
07:00–11:00 — Storyboarding with Google Flow, setting creative direction
19:10–27:10 — Live reactions to AI video generation, first outputs and surprises
31:00–34:00 — Editing and stitching videos, technical walkthrough
34:00–41:15 — Premiere, review, and critical discussion of final video

Takeaways for Non-Listeners

Creating a digital avatar and custom video via Gemini Omni and Google Flow is fast, accessible, and unexpectedly robust—even for absolute beginners.
While the results aren't flawless, for a quick, solo-produced video the tools deliver a result that Claire rates as roughly 50% there rather than 80%, with her best single frame closer to 90%, and with recognizable quirks (hair, background shifts, uncanny expressions).
Such generative tools can help non-creatives move quickly from idea to shareable content, lowering barriers for engaging marketing and personal projects.
Claire encourages listeners to try it themselves, reinforcing the podcast’s mission to not just explain but also empower real-world AI experimentation.


Transcript

[00:00]
A
Today I am doing a very strange episode where I'm going to create a video avatar of myself and in about 15 minutes get to a full minute-long video starring none other than your favorite podcast host, Claire Vo. Let's get to it.
This episode is brought to you by Merge. Building an AI product is one thing. The hard part is everything around it: connecting to the tools your team and customers rely on, letting agents take action with the right permissions, and keeping everything reliable and cost efficient once you're in production. Most teams end up piecing that together themselves. So instead of building the products you actually care about, you get pulled into integrations, permissions, routing and all the infrastructure underneath. Merge is the infrastructure layer for production AI. It connects to thousands of tools, gives agents secure ways to act inside them, and optimizes model routing and spending without you building or owning any of it. OpenAI, Dropbox and Ramp already use Merge to move fast and build AI right. Visit merge.dev/howiai to start building for free.
This episode of How I AI is going to be an adventure because, I'm going to be honest, I'm not 100% sure this is going to work. I'm going to return to a product I covered very briefly a couple weeks ago called Google Flow and the new Gemini Omni video generation model. And I'm going to try really hard to create an AI avatar of myself that we can animate or, I guess, cinematically create using AI. So this is Google Flow, and one of the features of Google Flow and the Omni model is you are supposed to be able to create an avatar of yourself. Now, we tried this the day it came out and it did not work. But we're going to give it another, call it a try, and see if we can get a full-featured avatar of myself that then we can go and build consistent character videos off of. So I'm going to select up here, I'm going to create an avatar. We're going to click get started and scan this QR code.
I have my phone here. I've done this before so hopefully it'll be fast. Okay, I'm going to put the mic away just for one second. I'm going to allow access to my camera and we're just going to take some photos. Okay, ready, start. 17, 81, 49, 20, 25, 22. Okay, now it's having me turn my head. So I turned my head that way, gave me a check mark. Turn my head the other way, it's giving me a check mark, and it says we're done. Now, it said we were done last time we tried this. So we're gonna see. It's gonna take a couple minutes and come back and see if I can actually use this avatar of myself.
Okay, so look at this beauty. There's this fisheye lens version of me that is now an avatar. So I supposedly can use this, and let's use it to create a hype video for the How I AI podcast. So I'm gonna go in here and say, help me create a storyboard for a hype video for the How I AI podcast. I already have a character named Me we can reference. Help me come up with the few scenes that would make this great. This is a podcast by Claire about the best ways to use AI at work and in life. Exclamation mark.
Okay, so what I love about Flow, or what is pitched to me about Flow, is that it's not just a video generation tool, it's actually a whole creative suite. And so ideally it's going to be able to help me not only animate or video generate this avatar of myself, it's also going to help me actually brainstorm what this overall video could be. And I'm, you know, I'm creative, but I'm not video creative. So I'm excited to see what it looks like. So, how do you imagine Claire? Is she in a modern studio or perhaps a bright, airy home office? Should it feel high tech and sleek or more grounded and lifestyle focused? And are we going for high energy and fast paced, or thoughtful and inspiring? So I'm going to say she is in a dark home office, dark green walls with books about AI and fun posters, lighting around.
This should be more authentic lifestyle version, but it's high tech and about coding, have a hacker vibe to it. Okay, a bunch of typos, but we'll see. We'll see what this does, and what I love about these video models and these new tools. Again, usually here on How I AI, we talk about coding, we talk about website generation, we talk about PRDs and work product. But what I really appreciate about these new generative AI models, in particular these multimodal ones, image and video, is it unlocks for me an ability to generate, create something that I would have never been able to do before. So I would have never been able to solo produce a hype video for my podcast. I would have a hard time brainstorming it. I wouldn't know how to frame it, I wouldn't know how to block it. But now I have this AI producer here that can help me with this effort.
So let's see what the frames are. It's about seven frames. It's going to be an extreme close-up of me typing on a mechanical keyboard, totally on brand. Then there's going to be a wide shot of the office. Then it's going to reveal me in my ergonomic chair. Spoiler alert: I am not actually in an ergonomic chair. I'm going to spin around. That's going to be funny. And it's going to give me a digital heads-up display, which is also ridiculous. But let's let it happen. Then it's going to do a very, what I'm presuming to be a very cheesy AI montage, a lifestyle moment, a call to action. Going to hit you with the podcast microphone. And then it's going to say How I AI.
Um, if this looks good, I'm going to say, this is great, generate the storyboard. I already have the character at Me. Um, and so I'm gonna send. We're gonna see what it comes up with. I've noticed that it has a hard time referencing the Me character in some early tests, so let's see what it comes up with. I'm presuming it's gonna take a couple of minutes. So we will take a mini break and then come back to see what it looks like. Okay.
It looks like it's generating a grid for the storyboard. It can't use the avatar, so I think it's going to do it without the character reference. It'll be really interesting to see what it comes up with. But then as soon as it's ready, I'm going to go ahead and generate at least a couple of these storyboard scenes one by one, and we can see how well it does with my avatar.
Oh, I mean, this is delightful. Look at this glowy mechanical keyboard. Look at how I am hacking on three keyboards. I'm going to make a little eyes at you with my, my fake glasses, my very trendy glasses. There's going to be me dragging and dropping a file that probably says like AI.md. I'm going to smile and I'm going to speak into the podcast. This looks great.
So what I think I'm going to do is I'm going to paste in this first frame of the video that the agent came up with. And instead of saying Claire, I'm just going to at-mention this avatar that it gave me so that we can see if it generates this video with me as the character. And so I think I've replaced my name here. I've given details on camera, on lighting, on everything. I press enter. Let's see what it creates with my avatar. I have no idea what we're going to get into, and hopefully it won't be terrifying.
Okay. I'm already nervous. What is surprising to me that I didn't actually expect is it does have my posters and my books background here. I guess because they're behind me when I took the photo, it's taking advantage of that. And I'm going to share my audio as well. And we're going to see how this video worked. Okay, I got that wrong. I actually generated images instead of videos. Totally messed up. Did not click the right thing down here in the bottom right. I had image generation instead of video generation. So again, I'm going to paste that walkthrough of the scene here. I'm going to replace my name with the Me avatar. It's going to have my fingers flying across that mechanical keyboard. It's going to be so cool.
I'm going to go ahead and press send and we're going to see how long it takes to generate a video. Now, something you'll notice every time you generate videos (it used to work like this in Veo 2 and Veo 3 as well, so I'm not surprised they do this) is they're generating two versions of it. It's going to take a couple of minutes. The image took a couple seconds. These are probably going to take a couple minutes. So we'll come back and hopefully we will have our first video with Claire's face in it. And while we're waiting, I'm going to queue up one or two other scenes and see if we can get ones going with my actual face in it. Because some of these had, like, the back of my head as opposed to my face. And I think we want to see what my face avatar looks like. So we'll pick frame three and see if we can get that going as well.
Okay. The first video generated, now we have blue nail polish. I still like it. Okay, let's see. "We were told AI would replace us." That is quite spooky. Okay. We were told AI is going to replace us. Let's see if the video with me actually generates a callback to that. So while that's generating, I'm going to go ahead and make all of these. We're going to stitch them together. It's going to be so awesome. So stick with us. We're going to generate a bunch of videos and we're going to stitch it together into one long hype video.
This episode is brought to you by Jira Product Discovery. AI has made individual PMs incredibly productive, but multiplayer mode is where it still breaks down: getting everyone aligned on what should actually get built. Decisions live in a markdown file from last week. The roadmap's a spreadsheet no one's looking at. Jira Product Discovery is where teams actually decide what to build: capture ideas, prioritize them as a team, and share a living roadmap everyone works from.
It's powered by Atlassian's Teamwork Graph, so it can pull in customer feedback, what your team shipped, plus your goals, and suggest what to build next. And when a decision is made, you can hand it off straight to Jira so a developer or even an agent can pick it up and start building. Teams at Canva, Deliveroo and Toast already use Jira Product Discovery. Join more than 25,000 teams at atlassian.com/howiai. Start building the right things together.
Okay, I have seven scenes generating, but while we are waiting for those to finish, I just cannot. Oh, sorry, sorry for you all that are listening and not watching. I just got jump scared by the AI version of myself wearing glasses, turning around in a spinning chair. So let's take a look at both of these. This one's pretty good. I'm spinning in a circle. Okay, sorry, back to those. I need to describe this for. So this is using an AI avatar of myself. The prompt was: I spin my ergonomic chair around to face the camera. I push my glasses, which I don't have, up to the bridge of my nose and I say, this is Claire, I am Claire and this is How I AI. Let's watch V1 of this video, which is actually a scream riot. "I'm Claire and this is How I AI." Okay, it was actually pretty good. What's really funny is it has The Nvidia Way in the background, which I don't have right here, but I do have upstairs. So I do believe the AI overlords are really paying attention. I want to make you laugh and look at the second version where I spin in a circle twice. Pretty good. "I'm Claire and this is How I AI." This one got my not-curled hair a lot better, but I prefer the other video. It makes me look a little bit nicer.
Okay, I'm gonna take one minute. I'm gonna stitch all these videos together in the form factor that Gemini told me I should, that Flow told me I should. We're going to bring this hype video together.
I'm going to show it to you end to end and then I'm going to conclude today's very strange episode of How I AI where I use my avatar to create an end-to-end hype video for this podcast.
Cool. So it actually seems like I can show you a little bit of how we're going to stitch this video together. So if you see here, once I click into any one video, I have a video editor timeline here that I can use right in the browser to stitch together all these videos. So I'm going to go ahead and add these in the order that the original AI told me my hype video should go, and then we'll look at it end to end and we'll see if we really like it.
Okay. This took me about five minutes, but all I did was stitch together my favorite versions of all these avatar-generated AI videos scene by scene, about seven of them together, to show one end-to-end hype video. Again, this episode is probably going to be sub 15 minutes. That includes recording my face as an avatar, figuring out what the heck is going on with this tool, building a storyboard, generating all the videos and stitching them together here in this editor. And now, the worldwide debut of the How I AI hype video. I am going to show you. Who knows what we're about to get, but we're about to get it. Here we go.
"We were told AI would replace us. Oh, my God. I'm Claire and this is How I AI. From automating the mundane to dreaming up the impossible. It's about the tools that change the way we live and work. Join me as we deconstruct the future one prompt at a time. Subscribe to How I AI. How I AI, available now everywhere you... available now everywhere you get your podcasts."
Okay, I am actually obsessed with this. Let's talk about what I love and what I don't. What I love: this took zero time and effort. I wouldn't say it's like 80% there, but is it 50% there? A hundred percent, yes. Am I going to tweet this immediately? Absolutely. Did this take no effort? Basically no effort, no knowledge.
Okay, so what did I like about this avatar experience? You know what, this is like kind of my face. It's not quite my face. I would say about 50% of the time it's my face and 50% of the time it's like an uncanny version of my face. Some things I noticed from a character consistency perspective: this gave me beautiful long wavy hair, which I have recently cut off because I have a child. So you see there's like location inconsistency. This background has books and an hourglass. This background is a different color and it has plants. It pulls in some things from my avatar. Like it pulls in this poster that was in the background of when I took my photos. And it changes a little bit over time. And so you can see the books on the shelf change, the lighting changes.
As always, these video gen and image models are really early-2000s coded on what they think AI and impressive technology is. So I'm holding like a 24-inch iPad in this video, looking at a schematic of, it looks like, a church. It's very confusing. The heads-up display that shows up on my face when I'm looking at AI. I'm apparently coding, in Gemini, a robot of some sort. So it's pretty hilarious.
But even looking at this frame, I would say this is the one that felt like it looked most like my face. Like I'll just try to look serious so you all can see. It's pretty good. It's even got my sun damage here. So good job, Gemini, not smoothing out my face. And so I do think this is 90% there, not 100% there. But it's really interesting even seeing my face turn left and right, how accurate it got on the side profiles of my face. Now, this scene right here where I'm laughing: a hundred percent uncanny valley. I look very strange, like I'm on some sort of medication perhaps. And so I'm not sure it 100% has emotions really well. And some of the timing and hiccups you noticed while you were watching the video, I spoke over myself, those sorts of things.
But this scene right here is legitimately pretty good. I bet with some consistent background prompting, with a little bit more effort, with some additional images going into this Omni model, I think I can make a hype video that would convince most of you, if not all of you. Now, do I think it's great at typography? Do I think it's great at graphics? No, this is kind of lame. This ending part is kind of lame. But again, we're talking probably 10 minutes top to bottom. So we're talking, you know, probably 15 minutes from very beginning, knew nothing about this tool, to I have this one-minute video now I can share with you all. I'm pretty blown away, you guys, and so I'm gonna go spend a little bit more time with the Google Omni model. I'm gonna spend a little bit more time with Flow. This might be my new favorite hobby project. I'm kind of obsessed with it.
I want to hear if you all are willing to put your avatar in here, if you can actually get it to generate consistent characters, and what your experience is using these kind of incredible new video models. So I know this is a little bit of a different style of How I AI. We usually do coding, we usually do work stuff. This is a tool I did not know. This is a process I'm very unfamiliar with, and I really think I got an outcome that was much better than I expected with very little knowledge of the tools. So if that is not a How I AI success story, I'm not sure what is. I hope you enjoyed this very strange mini episode of How I AI. I cannot wait to see what you generate, and please share your examples in the comments. Thanks for joining.
Thanks so much for watching. If you enjoyed this show, please like and subscribe here on YouTube or, even better, leave us a comment with your thoughts. You can also find this podcast on Apple Podcasts, Spotify or your favorite podcast app. Please consider leaving us a rating and review, which will help others find the show.
You can see all our episodes and learn more about the show at howiaipod.com. See you next time.