Summary

Behind the Craft – Full Tutorial: Make Professional Launch Videos for Free with Hyperframes

Guests: Ben Liu (VP of Product Engineering, HeyGen), Jake Moran (PMM, Hyperframes)
Host: Peter Yang
Date: June 21, 2026

Episode Overview

This episode takes a deep dive into Hyperframes, a revolutionary free tool from HeyGen that uses HTML, CSS, and JavaScript as the foundation for automated, professional-quality launch videos. Ben Liu and Jake Moran walk through not only how anyone—even non-coders—can leverage Hyperframes with AI coding agents, but also the broader philosophy and future roadmap for video as the new lingua franca of modern product launches. Practical tips, step-by-step guides, and a peek behind the engineering curtain make this an essential listen for product leaders and creators.

Key Discussion Points & Insights

1. The Problem: Traditional Launch Videos are Expensive and Unapproachable

Ben sets the stage by sharing:

"I've spent $30,000 on a launch video and I was told that that was cheap." (00:00)
Core insight: Great launch videos are essential but have traditionally demanded huge investments in money, time, or design expertise.

2. What is Hyperframes and How Does it Work?

Definition: Hyperframes is a free, agent-friendly tool that lets you create complex, animated videos directly from HTML, CSS, and JavaScript—turning code into shareable, professional MP4s.
Agent-driven creation:

"Everyone now has a coding agent. So everyone can literally ask your coding agent to make a Hyperframes video." – Ben (00:28)
Workflow overview:
- Upload media and assets (UI screenshots, images, SVGs)
- Use a “skill” (scripted instructions) that guides the agent step-by-step
- Outputs are full storyboard-driven, scene-based videos
Notable quote:

“This entire video end to end is actually code...our renderer is able to turn such a code base into a rendered MP4 video that you can share anywhere. This is the power of Hyperframes.” – Ben (01:16)
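To make the "video is code" idea concrete, here is a minimal Python sketch that writes a scene-based HTML composition with timing stored in data attributes. The attribute names `data-start` and `data-duration`, the `Scene` class, and both helper functions are assumptions made for illustration; the episode only says that Hyperframes adds data attributes to HTML elements, and the real names should be taken from the Hyperframes docs.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Scene:
    scene_id: str
    start: float      # seconds from the beginning of the video
    duration: float   # seconds the scene stays on screen
    headline: str


def render_composition(scenes: list[Scene]) -> str:
    """Return one HTML document with a <section> per scene.

    NOTE: data-start / data-duration are hypothetical attribute names
    chosen for this sketch, not Hyperframes' documented attributes.
    """
    parts = ["<!DOCTYPE html>", "<html>", "<body>"]
    for s in scenes:
        parts.append(
            f'  <section id="{escape(s.scene_id)}" '
            f'data-start="{s.start}" data-duration="{s.duration}">'
        )
        parts.append(f"    <h1>{escape(s.headline)}</h1>")
        parts.append("  </section>")
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


def total_length(scenes: list[Scene]) -> float:
    """Video length in seconds; scenes are allowed to overlap in time."""
    return max(s.start + s.duration for s in scenes)
```

Because the timing lives in plain attributes, an agent (or a person) can change pacing by editing text rather than a timeline UI.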

3. Getting Started: How to Set Up and Use Hyperframes

[03:00 – 09:37]

Setup is simple:
- Visit hyperframes.hadron.com Quick Start
- Install from Codex plugin store or connect in Claude/other AI tools
- Find open-source skills on GitHub
Even non-coders can get started easily:

“Just point Codex to the GitHub...I just have to like /website to video and paste my website link. And then that’s it, right?” – Peter (08:31)
Full-site-to-video in one prompt:
Example: Feeding Spotify.com to an agent, it automatically storyboards, collects assets, generates flow, and outputs a full video.
Voice support:
- Local TTS model provided
- Can integrate with HeyGen, ElevenLabs, and more for advanced voice/audio

4. Advanced Tips & Workflow from Practice

[10:49 – 19:55]

Jake's Best Practices:
- Create a project folder with context and assets (UI screenshots, design.md/frame.md for brand/styling)
- Use frame.md for video-specific design cues
- Ask the agent to generate a “storyboard.md” laying out all scenes and narration
- Reuse components:
  
  “In our last three videos, I have reused the same Claude prompt box, just with different frame MDs.” – Jake (16:37)
- Open-sourced repo has 50+ reusable code components
Rapid iteration:
- Review static frames before full video render to fine-tune aesthetics and copy
Agent efficiency:

“Most of these videos I’m making within a day of the day we’re launching it. So time is so important to me in that case.” – Jake (19:55)
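Jake's setup steps can be sketched as a small Python script: create the project folder, drop in context and a design file, then lay out a storyboard as a markdown table. This is only an illustration of the workflow described in the episode. The file name `context.md`, the placeholder text, and the three table columns are assumptions; `frame.md` and `storyboard.md` are the names mentioned in the episode.

```python
from pathlib import Path


def scaffold_project(root: Path, context_text: str) -> Path:
    """Create a project folder holding context, assets, and one design file."""
    (root / "assets").mkdir(parents=True, exist_ok=True)  # screenshots, SVGs, references
    # context.md is a hypothetical name for the launch notes / readme.
    (root / "context.md").write_text(context_text, encoding="utf-8")
    frame = root / "frame.md"
    if not frame.exists():
        # Placeholder content; a real frame.md carries brand and motion guidance.
        frame.write_text("# Brand colors, fonts, motion notes go here\n", encoding="utf-8")
    return root


def storyboard_table(scenes: list[tuple[str, str, str]]) -> str:
    """Render (scene, on-screen, narration) rows as a markdown table."""
    lines = ["| Scene | On screen | Narration |", "| --- | --- | --- |"]
    for name, on_screen, narration in scenes:
        lines.append(f"| {name} | {on_screen} | {narration} |")
    return "\n".join(lines) + "\n"


def write_storyboard(root: Path, scenes: list[tuple[str, str, str]]) -> Path:
    """Save the table as storyboard.md so copy can be refined before any rendering."""
    path = root / "storyboard.md"
    path.write_text(storyboard_table(scenes), encoding="utf-8")
    return path
```

In practice the agent writes the storyboard itself; the point of the sketch is that the review artifact is a plain text table, which is cheap to iterate on before asking for static frames or a full render.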

5. The Power of Code for Video Creation

[25:15 – 31:56]

HeyGen’s Approach:
- Focuses on “communication video” (not cinematic/Hollywood), enabling everyone to clarify ideas quickly via video rather than slides or docs
- Agents struggled with visual design using JSON/XML; code (HTML/CSS/JS) is the LLM’s native, visually expressive language
Distinction:
- Landing pages require “spatial aesthetics” (how content is arranged)
- Video needs “temporal aesthetics” (the unfolding of information over time), which LLMs are now improving at
Verification and Quality:
- HeyGen internally builds evals, benchmarks, and self-check loops
- Collaborating with frontier labs to further enhance LLMs for video generation
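One practical consequence of keeping the video as code, which Ben raises at 21:30, is that edits made in the studio UI become ordinary text changes that an agent can diff. The sketch below uses Python's standard `difflib` to show what such a change could look like. The HTML snippets and the `describe_edit` helper are made up for illustration and do not come from Hyperframes.

```python
import difflib


def describe_edit(before: str, after: str, name: str = "index.html") -> str:
    """Return a unified diff of two versions of a composition file."""
    diff = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=f"{name} (before)",
        tofile=f"{name} (after)",
        lineterm="",
    )
    return "\n".join(diff)


# Hypothetical example: a person drags a headline lower and rewrites its copy.
before = '<h1 style="top: 40px">Launch day</h1>'
after = '<h1 style="top: 120px">Launch week</h1>'
print(describe_edit(before, after))
```

The diff names exactly which element moved and which text changed. That is the information the JSON-based editor described in the episode could not give the agent in a visually meaningful form.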

6. Use Cases and Community Impact

[32:03 – 36:00]

Main Use Cases:
- Product launch videos (“the holy grail”)
- Demos, explainers, real estate, education, PR-to-video
- Internal status updates (“Claude Code looks at my commits for the last 7 days and tells my team what I did...it’s a fun Friday afternoon event.” – Ben, 33:22)
Empowerment:
- “A lot of the engineers here actually can make their own launch videos now. They don’t have to learn a tool, they don’t have to learn After Effects or even CapCut.” – Ben (38:04)
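The weekly status-update use case can be approximated with a short Python script that collects the last seven days of commits and turns them into a prompt for a coding agent. This is a sketch under assumptions: the prompt wording and both function names are invented here, and the episode does not describe how Ben's setup actually gathers commits. The `git log` options used are standard git flags.

```python
import subprocess


def recent_commits(repo_dir: str, days: int = 7) -> list[str]:
    """Return 'date<TAB>subject' lines for commits from the last `days` days."""
    result = subprocess.run(
        ["git", "log", f"--since={days} days ago", "--date=short",
         "--pretty=format:%ad\t%s"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def build_prompt(commit_lines: list[str]) -> str:
    """Turn commit lines into an agent prompt (wording is illustrative only)."""
    if not commit_lines:
        return "No commits in this window; make a short 'quiet week' update."
    bullets = "\n".join(f"- {line}" for line in commit_lines)
    return (
        "Make a short Hyperframes status video summarizing this week's work:\n"
        + bullets
    )
```

Running `build_prompt(recent_commits("."))` inside a repository yields text that can be pasted into whichever agent has the Hyperframes skill installed.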

7. Exporting & Extending Hyperframes

[23:36 – 25:00]

Supports export as MP4, MOV, WebM (including transparent backgrounds for video pros)
Underlying HTML/CSS/JS codebase enables easy repurposing as websites or interactive videos

8. Future Directions & Open Sourcing

[39:18 – 40:47]

Plans:
- Open-source storyboarding tools
- Launch more pre-built skills (sound effects, motion graphics, music, advanced editing)
- Support for slides and more interactive formats
Community:
- All launch video code is open source
- Encourages forking and remixing via GitHub

Notable Quotes & Memorable Moments

On democratizing video:

“It just feels really empowering because otherwise you have to pay like $30,000 to do it.” – Peter (40:05)
On how far tech has come:

“LLMs can express not only accurately the information, but also visual aesthetics through HTML, CSS, and JavaScript...” – Ben (28:17)
On the agent workflow:

“You don’t actually need to understand code...We build the studio so that humans can modify it using this UI. But the beauty of that is that when you make those changes, they become code.” – Ben (21:30)
On scalability:

“You can literally say, hey, pull that piece from that code base. Because I want that effect. Your agent will do that.” – Ben (17:44)

Timestamps for Important Segments

00:00 – Ben on the high cost of traditional launch videos
01:16 – What Hyperframes does and how it works
03:00–04:29 – How to set up Hyperframes and get started
06:21 – Single-prompt website-to-video walkthrough
10:49 – Jake’s practical creation workflow and asset management
16:37 – Code reuse and design customization
21:05–23:29 – Scene-based editing, human-in-the-loop UI
24:18 – Export options and interactive possibilities
25:15–31:56 – Philosophy: why code, why now?
32:03 – Use case spectrum expands beyond launch videos
38:04 – How empowerment shifts to the broader team
39:18–40:47 – Roadmap and next features

Final Thoughts & Where to Find More

Hyperframes and its skills are open source and free
Find Ben and Jake via the HeyGen (Twitter/X) account
Tag @heygen for shares and retweets
Find more at hyperframes.hadron.com and the Hyperframes GitHub repo

If You Haven’t Listened...

This episode is both a practical guide and an inspiring argument for the future of agentic, code-driven video. If you're a builder, marketer, or founder, the actionable tips (and open code) will help you easily create launch videos—no video production experience, no $30,000 budget required. Hyperframes plus your favorite AI agent are all you need.

"It just feels really empowering." – Peter Yang (40:05)

Transcript

[00:00]
Ben
I've spent $30,000 on a launch video and I was told that that was cheap. This entire one-minute launch video for Spotify is created by asking Claude Code with Fable 5 and giving it Spotify.com and saying, make a launch video for Spotify.com.
[00:15]
Jake
I've learned a bunch of little tips and tricks to maximize my speed to getting a good video. The main thing that I'm adding though is assets. So I'm either adding screenshots of UI or examples from other things I've seen online that I like.
[00:28]
Ben
I think a lot of the engineers here actually can make their own launch videos now. They don't have to learn a tool, they don't have to learn After Effects or even CapCut. Everyone now has a coding agent. So everyone can literally ask your coding agent to make a Hyperframes video.
[00:44]
Peter
All right, hey everyone. My guest today is Ben, VP of Product Engineering at HeyGen, as well as Jake, PMM on the Hyperframes team. Look guys, Hyperframes is a freaking amazing tool and it's completely 100% free for making incredible AI videos just straight from HTML. And Ben and Jake are going to show us exactly what it can do and how it works and how to prompt it. So welcome guys.
[01:06]
Ben
Thank you. Thank you, Peter for having us.
[01:08]
Jake
Yeah, thank you so much.
[01:09]
Peter
All right, guys, well why don't we just get right into it and you can just show us how impressive Hyperframes is. I can't believe it's free.
[01:16]
Ben
Sounds great, sounds great. We'll start with a show-off, I guess. So this entire video end to end is actually code. There are definitely some assets, you know, for instance, me talking, which is a piece of media that we add to the construction. But if you actually look into our studio, you will notice that it's really putting together videos, audio clips, code, animations, and motion graphics into an HTML, CSS, and JavaScript code base. Hyperframes is designed and constructed so that, because of these data attributes that we add to these HTML elements, our renderer is able to turn such a code base into a rendered MP4 video that you can share anywhere. This is the power of Hyperframes, and I'm very excited for people to be able to do that. Everyone now has a coding agent. So everyone can literally ask your coding agent to make a Hyperframes video, just like that.
[02:26]
Peter
Yeah, this is pretty incredible. But, you know, I don't know how to write any code, right? So there's an easy way to do this because you guys actually built a skill to make this work.
[02:33]
Ben
That's right, that's right.
[02:34]
Peter
Yeah.
[02:35]
Ben
I think the simplest way for people to start actually learning how to use Hyperframes is to take a look at our website-to-Hyperframes skill. Oh, actually, do we want to give folks a sense of how to set up Hyperframes first?
[02:52]
Peter
This is, like, super impressive, but why don't we start from step one? Can you show us how you set up Hyperframes in, like, Codex or Claude Code?
[03:00]
Ben
Absolutely, absolutely. First and foremost, I definitely encourage people to go to hyperframes.hadron.com Quick Start. Here is the most recommended way to teach your coding agent how to use Hyperframes. You just copy this, go to your terminal, for instance, and run this command. It'll pull in the Hyperframes skill and take the steps to do that. Or you can also find Hyperframes in the Codex plugin store. Scroll down. In the creativity section there is Hyperframes by HeyGen. You just have to install it and try it in chat. Lastly, if you don't have those tools but you still want to experience the capability of Hyperframes, you can go to Claude and, in the customize section, under connectors or connect your apps, search for Hyperframes. Hyperframes is right here. You can install it and you can literally go to any chat in Claude and say, make me a video about something. Right. For instance, here I'm asking it to make me a video about Fable 5. I think it's loading. But yeah, those are the easy setup steps.
[04:30]
Peter
All right. And you showed some pretty complicated code to generate that amazing video that you showed for Claude mcp. But I don't know how to write any code. So what's the easy version of trying to make something like that?
[04:42]
Ben
Absolutely. I think this is the easiest way for folks to start getting a sense of how to work with Hyperframes. We have a skill which is fully open sourced here in our Hyperframes open source project. You know, we have been working on many skills to help people make really good launch videos, motion graphics, et cetera. But this skill is a very comprehensive skill that takes your agent step by step to create a full video. I think first we just see maybe an outcome, and we're also amazed by the kind of content and video that Fable 5 is able to make using this skill. So this video, I'll first mute it to talk over it. This entire one-minute-long launch video for Spotify is created by asking Claude Code with Fable 5 and giving it Spotify.com and saying, make a launch video for Spotify.com. And as you can see, Fable followed our skill's instructions to take assets from Spotify.com. It wrote a full storyboard on how the flow of this launch video should be, and even the audio.
[06:21]
Peter
Yeah, that's nothing saying, but okay, so it just used that skill, the website-to-video skill, right?
[06:27]
Ben
That's right. That's right. So let me quickly give everyone a high-level overview. The reason why I want to show this is not necessarily to say, oh, everyone use this one skill. But I think just by reading through the skill, it gives you a sense of how to use Hyperframes together with your agent. So here the skill essentially teaches your agent: all right, in order to turn a website into a video, here are the 1, 2, 3, 4, 5, 6, 7 steps that you need to take, right?
[06:55]
Jake
Yeah.
[06:56]
Ben
And it goes into detail on every step. So first step, right? Ask your agent to actually capture the content from your website. So we go to the sub-skill here. It actually teaches the agent how to pull the screenshots and the assets and put them into a folder or a set of folders, so that your agent can then use these images, SVGs, assets, and videos for its video. And then you can see that step three is storyboarding. Our skill teaches the agent to be more diligent: before building anything, build a storyboard, so that it knows to build the code scene by scene. So if we go back here to the Spotify video, you can see that the Spotify video is broken down into scene one, scene two, scene three. And this is how your agent uses the Hyperframes skills, follows the Hyperframes standards, and writes the entire code base for this end video. So my suggestion is literally, you know, tell your agent: use this, use my website, make me a launch video. That's the simplest way to use it. But if you actually want to get better, and I'm sure Jake will talk a lot more about other tips and tricks that we have, studying the website-to-Hyperframes skill is, I would say, the best way for you to learn.
[08:31]
Peter
I mean, I can just straight up copy the skill, right? Like, just point Codex to the GitHub, like, hey, just go get the skill for me. And then I just have to, like, slash website to video and paste my website link. And then that's it, right? Then it'll do it. Yeah. That's incredible. How would you get the voice? Like, is it through, like, ElevenLabs or something? Or, like, some...
[08:53]
Ben
That's a good question. Our goal here is we obviously want to make it so that Hyperframes is accessible and almost completely free for people to get started. Right. So we actually use, I forgot the model name, apologies for that. We should link that somewhere. But it's a local model that we have a skill for. And if your agent decides that it needs to use audio, then it will download the model and then actually use the local model to do TTS. We obviously also have ways for your agent to connect to HeyGen, connect to ElevenLabs, all these different providers, for your agent to use text to speech or image generation, et cetera.
[09:38]
Peter
Okay, yeah, you know what? I normally don't like to do videos talking about products because it seems like promoting a product. But this stuff is free, number one, and number two, it's so good. So there's no reason why people should not try this. It's, like, super approachable.
[09:51]
Linear Sponsor
This episode is brought to you by Linear. When engineers use tools like Cursor, Claude Code, and Codex, a lot of work happens invisibly. Someone can go from a bug report in Slack to a shipped fix without creating any record of what happened outside
[10:05]
Peter
of the code editor.
[10:06]
Linear Sponsor
And that's fine for speed, but it makes coordination harder as you scale. Linear integrates with the very best agent coding tools directly, like Cursor and Codex. That way anyone can see what an agent is working on and who assigned them to the task. You get the speed of agents without losing visibility across the team. Product teams at OpenAI, Ramp, and Block are all using Linear to collaborate with AI agents. And I use Linear myself to run my creator business. So check it out at linear.app/agents. That's linear.app/agents.
[10:40]
Peter
Now back to our episode. And Jake, you've been quiet so far, but, like, Ben tells me that you're a real expert in actually getting really good videos out of this, so maybe you can share some tips or tricks that you use.
[10:49]
Jake
Yeah, so I have. I mean, I think it's important we lay out: we've really only been a team for two months. You know, we're all getting acquainted with the tool as well and getting better every day. I have had the privilege of creating most of our launch videos so far. So through that I've learned a bunch of little tips and tricks to maximize my speed to getting a good video and then also, I think, the output quality. I think the key differentiator I'm going to introduce here is that for the types of announcements we're doing, we don't have, like, a website or something to start from. So I have to do a bit more of the groundwork myself for getting projects set up initially. Because what we saw with the website-to-Hyperframes skill, that first step, is it's taking in all this information from the site itself and laying the groundwork. So when you're doing something net new that maybe you only have Figma screenshots for, or for us, a lot of our videos are based around a Claude Code session or similar, you're going to have to build those yourself for now. Or if I've already made them, it's in our launch video specific repo, which has the source code for all the videos I've created. So I'll talk more about that later too, because that's a very fast path to getting quality visuals that you can then amend for your videos. But essentially I do these setup steps. The first one is I create a new project folder. All I'm going to throw in there is context. For some of the things we're launching, we just finished it up two days ago and all we have is a readme document about a new feature that's coming into our Hyperframes studio or something like that. I'll pull that file and add it into my project folder. Then the main thing that I'm adding, though, is assets. I'm either adding screenshots of UI or examples from other things I've seen online that I like.
Just basically laying the groundwork of, like, I see a couple frames of this video already in my head, here they are. And then the key thing that you also want to add is one aesthetic source. I think most people might have heard of design.md by now. We just released last week something called frame.md, which you can create on hyperframes.dev design: you just drop in your design.md and then our agent will reformat it to be better for video. This is a really key thing for matching the aesthetic of your brand or of the brand that you're going to be talking about.
[14:00]
Peter
And design.md is just, like, a bunch of fonts and colors and styling, kind of like a brand guideline.
[14:08]
Jake
Yeah, it's basically just a brand guideline. And because it's all just codes, like hex codes or whatever the case may be, it's more of a visual direction. But because it's not written into HTML, it allows the agent to take a bit more liberty. And we kind of pushed that a step further with frame.md, giving it the context of, like, instead of building a web page, which is what design.md is made for, you want to maximize the frame and make things larger and use motion. And then my first step, or the first prompt that I give, is I point my agent towards the new project folder that I've created. I point it towards either the design.md or the frame.md that I've made. Then the last thing is I ask it to go through everything and create a storyboard.md, which just comes out as a table of key events. So it just breaks down scene by scene what this video is going to be talking about, maybe with a brief explanation of what's on screen. And normally what I'm refining here is the text copy. Right. I really care about what we're going to say and how it's going to build into this video. So I might take a couple shots back and forth, just being like, you know what, actually I think it should be this line, or help me brainstorm here, or whatever. But this is more just the meat of the story as opposed to any kind of visual direction. Then what I mentioned earlier is my next step. Once I'm happy with this kind of overview of the video, for me, my files are all local, so I point towards the projects that I've made in the past. But other people can go to our launch video repo and ask their agent to pull specific elements from the videos that I've created. So I want to give a quick example that I'm going to share with you. We launch pretty frequently and I try not to make things net new if I can. So I want to give an example.
In our last three videos, I have reused the same Claude prompt box, just with different frame.md files. This is video number two and then this is video number three.
[16:37]
Peter
Right.
[16:37]
Jake
So it looks vastly different. But ultimately I pointed my agent towards the same first prompt box that I had created and then described the relevant motion for each video that the
[16:52]
Peter
prompt box is just an image, or, like, it's some code? It's, like, code?
[16:56]
Jake
It's code. Yeah.
[16:58]
Peter
Okay.
[16:59]
Jake
That way you can have, like, different morphing effects and whatever animations you need later.
[17:06]
Peter
Dude, so are you going to open source this stuff, or is this Jake's? Is it Jake's skill?
[17:11]
Jake
Yeah, it's all there.
[17:13]
Ben
Actually, Peter, I do want to call out that we have open sourced quite a lot of these components. Not that we don't want to talk about them, it's just that I don't think people know, or we have not talked about it enough. But in our Hyperframes open source repo, there are actually at least a good 50 components that we use regularly in our launch videos.
[17:41]
Peter
Oh, really? So, like, all the Claude components and everything else are there?
[17:44]
Ben
Yes. And we actually open source every single one of our launch videos, because it's really just a code base. Right. We open source the entire code base that would render to the video. And, you know, it's really hard for you to read the actual code base. Right. It's agent-written, like hundreds, thousands of lines of code. But I think it'll be a great resource for your agent to point to. You can literally say, hey, pull that piece from that code base. Because I want that effect. Your agent will do that.
[18:14]
Peter
That's amazing. I'm going to clone it right after this. Yeah, that sounds amazing.
[18:19]
Jake
Yeah. And an example prompt that I would use is, like, I really love the text animation from, and this is the name of the video, can you grab that one and pull it for the intro of this video? Right. Those types of moves are what your agent is going to excel at. And if you're doing things in parts like this, it makes it easier to apply it to an aesthetic. And that's kind of my point here. If you see the prompt box and it's the right interaction, but your colors are different, or you want a slightly different pacing, whatever that is, the beauty of it being code is you get to just take the structure that I probably spent 20 minutes refining, trying to get it right, and you already start with this baseline. So when you introduce these design changes or whatever they may be, it's a lot faster and it's more likely to work that first try. The next big unlock, which I think we're going to be adding pretty soon to everyone's videos, is that I create a storyboard HTML from that markdown file that I had previously created. What this is is essentially I'm asking the agent to create one frame per scene. It's showing me the most visually dense section of each scene. And then I'm asking it to use the references that I've gotten from the launch videos. And then finally the design system.
[19:52]
Peter
You just want to review things before it gets too far, right?
[19:55]
Jake
Yeah, exactly. So this one is so much faster. It inevitably is going to take a little while for your agent to come up with the full composition, especially if you're making a 45-second video or longer. And so by just doing the static frame, you can align on aesthetic that much faster, which I think for a lot of people is a primary concern. So this is definitely a big unlock for me. Most of these videos I'm making within a day of the day we're launching it. So time is so important to me in that case. So you get something back that'll look like this, where it kind of says, this is the hook scene, this is scene one, scene two. And then it'll have one frame. This is obviously a very lightweight version. I'm going to try and quickly find one to show you guys an example from one of my actual videos. But essentially, yeah, what I'll do is I'll go back and forth on this and then once I'm pretty happy with these static frames, I literally just ask, hey, can you turn this into a full Hyperframes video and pull it up in the Hyperframes studio or use Hyperframes Preview when done?
[21:10]
Ben
Yeah. So just like what Jake's mentioning, our studio will break up your video into multiple scenes based on your code composition. And within the scene, I assume, Jake, you were going to talk about the Inspector.
[21:28]
Jake
Yes.
[21:30]
Ben
So essentially each scene is its own motion. Sometimes the scenes overlap with each other. For instance, say I'm like, I don't know about this text, I don't know about its position either. Maybe I want it somewhere else. I want to change the text. You can do all of that in the studio. And the beauty of Hyperframes is that as humans who are editing this, you don't actually need to understand code. You don't need to go to the actual index, you know, the HTML code, and find the place and change these things. You don't do that. We build the studio so that humans can modify it using this UI. But the beauty of that is that when you make those changes, they become code. And so your agent actually knows what you changed, because the agent can basically do a code diff or something. Right. And what's even cooler is that because LLMs are so good at HTML, CSS, and JavaScript, it knows exactly visually what this change will entail. And that allows the agent to continue working with you, the human, on making this video perfect.
[22:51]
Peter
And I can also just tell the agent to change "What do you want to play" to something else, right?
[22:56]
Ben
Yeah, absolutely. You can always do that. Sometimes there are just these tiny changes that you don't even know how to describe. And that's where the last-mile editing comes in, and that's where the studio really helps. And coming back to where Jake was showing his storyboard, his storyboard essentially turns into each and every one of these scenes. And then he can go into each scene and talk to his agent about how to make each scene work the way that he wants it to.
[23:30]
Peter
Got it. Okay. And then after I've finished all this, I just hit export to export as a video.
[23:36]
Ben
Yeah, that's right. And then it'll turn it into MP4. There are a couple of different configurations that we support: MP4, MOV, and WebM. A lot of people ask if they can export a transparent background layer. A lot of the more, let's call it, professional video makers want Hyperframes to make the motion graphics, and then they can download the WebM and put it into, I don't know, Premiere cut. But either way, we support many of these configurations locally.
[24:12]
Peter
Dude, this is incredible. And like, this is just like all HTML and CSS, right?
[24:16]
Ben
Or like all HTML, CSS and JavaScript.
[24:18]
Peter
So theoretically, I can also export it as a website or something, right?
[24:22]
Ben
Absolutely.
[24:22]
Peter
At some point.
[24:25]
Ben
Okay, that's right. Peter, you're touching on some things that we're also excited about because of the fact that it's HTML. We can make interactive videos.
[24:33]
Peter
Yeah, exactly.
[24:34]
Ben
Our player can be interactive. Yeah.
[24:35]
Peter
And I feel like the slide deck that you were showing Jake is also a HTML, right?
[24:39]
Jake
Yeah. Yes. It was also made with Hyperframes skills.
[24:42]
Peter
Yeah, dude. You know, like, I got cloud design to make a slide deck, and then I got it to make a video, and the video was like, way more impressive than the slide deck. So I feel like you guys can also just expand to, like, the slides market too. Like just.
[24:56]
Ben
We will try. We will try. We were literally talking about it this morning.
[25:00]
Peter
Yeah, why don't we talk about how this stuff actually works, because I have a bunch of PMs and engineers watching this. Maybe you can walk through it: you're working on HeyGen, which is about avatars and stuff, but then all of a sudden you have Hyperframes, so where did that come from?
[25:16]
Ben
Yeah, great question. This might take a little bit longer, so bear with me. I do think we need to take a quick step back at HeyGen. One of the things that we really focus on at HeyGen: we don't compete with cinematic videos, so we don't compete with Seedance, Veo 3, Hollywood. That's not our focus. HeyGen has always been known to have one of the best, if not the best, avatar models. The business problem that we help our users solve is communication. We always believe that video is one of the most effective communication formats. Because would you rather read a five-page doc or watch a one-to-two-minute video to understand everything that you need to understand? I have a lot of examples of how our users, and how we internally, use video as a communication format. So avatar was our first step, because many people are, you know, shy in front of the camera. They don't feel confident. You need to do retakes. I'm sure, Peter, sometimes you do a lot of retakes. And our users even more so; they're not even professional communicators or video makers, right? So they leverage our avatar model so that they can finally show up in front of the camera and talk to their audience, because the people-to-people connection is really important. But just the people is also not enough. Right? You need to have the B-roll, the motion graphics, the explanation of your product, all of this video editing that even more people don't know how to do. I think the majority of us don't know how to do video editing. So ever since about, I think, the beginning of last year, at HeyGen we have been really focusing on: okay, now we've nailed A-roll. We've helped people make their avatar videos so that they can show up. How can we help them make that final video? Because people just take our video and then maybe go to, like, CapCut or Premiere cut to finish that video, right?
They even hire a video editor to finish the video, right? So how can we take them from end to end? We obviously believe that an AI agent is the way, so we've been trying to build a video agent to do that. However, we learned the hard way that agents, even though very, very capable when they are working with JSON and, you know, XML, which is obviously the backing data model that all of our video editors sit on top of, have no visual intelligence. When an agent writes a JSON blob, it can be accurate. It can be verifiable and structurally correct. But the agent has no idea whether this JSON is going to be good looking or not.
[28:14]
Peter
Yeah, exactly.
[28:16]
Ben
It doesn't know.
[28:17]
Peter
Yeah.
[28:18]
Ben
And that is actually the biggest problem that we ran into. And furthermore, a human modifies that JSON through the UI and then the agent is like, what changed? What happened? Right. And so that's actually when we turned to code, because we believe that code, especially HTML, is an LLM's native language. LLMs can express not only the information accurately, but also visual aesthetics through HTML, CSS and JavaScript. It wasn't necessarily fully true back in, you know, the Gemini 2, GPT-4 days, but it's definitely true after Gemini 3, GPT-5 and the Opus models. These models are incredibly good at visually expressing something using code.
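A minimal sketch of the gap Ben describes, where a JSON blob passes every structural check and still looks wrong. The timeline format, field names, and both helper functions are invented for illustration; this is not HeyGen's actual data model.

```javascript
// Hypothetical timeline format, invented for illustration only.
const timeline = {
  width: 1920,
  height: 1080,
  clips: [
    { type: "text", text: "Launch day", x: 2400, y: 40, start: 0, end: 3 },
    { type: "text", text: "Now in beta", x: 100, y: 40, start: 1, end: 4 },
  ],
};

// Structural check: every clip has the right field types and a positive duration.
function isStructurallyValid(t) {
  return (
    Number.isFinite(t.width) &&
    Number.isFinite(t.height) &&
    Array.isArray(t.clips) &&
    t.clips.every(
      (c) =>
        typeof c.type === "string" &&
        typeof c.text === "string" &&
        [c.x, c.y, c.start, c.end].every(Number.isFinite) &&
        c.end > c.start
    )
  );
}

// A visual question the schema never asks: is each clip's origin inside the frame?
function clipsOutsideFrame(t) {
  return t.clips.filter(
    (c) => c.x < 0 || c.x >= t.width || c.y < 0 || c.y >= t.height
  );
}

console.log(isStructurallyValid(timeline)); // true
console.log(clipsOutsideFrame(timeline).map((c) => c.text)); // [ 'Launch day' ]
```

The blob validates, yet the first caption starts to the right of a 1920-pixel-wide frame. A schema check cannot see that, and a check like `clipsOutsideFrame` only covers one of many ways a layout can look bad.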
[29:12]
Peter
Yep, yep.
[29:13]
Ben
And that really unlocked how our users can just talk to our agent and edit a video, and the video will come out visually interesting. So we started by actually just having the agent write very small snippets of code and slap them on top of our video editor, all the way to asking, why can't it just all be code? And then our footage can sit on top of it, any images, assets, SVGs can sit on top of it, and then the code just becomes that foundation layer for agentic video making.
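As a rough illustration of "code as the foundation layer", the sketch below builds one scene as plain HTML and CSS, with footage, an SVG logo, and an animated title layered on top of each other. `buildScene` and the file names are hypothetical, and nothing here is the Hyperframes API; it only shows the kind of markup an agent can write directly.

```javascript
// Hypothetical helper: plain HTML/CSS only, not the Hyperframes API.
// Inputs are assumed to be trusted strings (no HTML escaping is done here).
function buildScene({ title, videoSrc, logoSrc }) {
  return `<!doctype html>
<html>
  <head>
    <style>
      body { margin: 0; width: 1920px; height: 1080px; position: relative; overflow: hidden; }
      video { position: absolute; inset: 0; width: 100%; height: 100%; object-fit: cover; }
      .logo { position: absolute; top: 40px; right: 40px; width: 160px; }
      h1 { position: absolute; left: 80px; bottom: 80px; color: white;
           font: 700 96px sans-serif; animation: rise 0.6s ease-out both; }
      @keyframes rise {
        from { opacity: 0; transform: translateY(40px); }
        to { opacity: 1; transform: none; }
      }
    </style>
  </head>
  <body>
    <video src="${videoSrc}" muted autoplay playsinline></video>
    <img class="logo" src="${logoSrc}" alt="">
    <h1>${title}</h1>
  </body>
</html>`;
}

const html = buildScene({
  title: "Launch day",
  videoSrc: "demo.mp4", // placeholder file name
  logoSrc: "logo.svg", // placeholder file name
});
console.log(html.includes('<video src="demo.mp4"')); // true
```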
[29:51]
Peter
That's incredible. Yeah. I always believe that code is the foundation of all knowledge work and clearly like a bunch of creative work too.
[29:59]
Ben
Absolutely.
[29:59]
Peter
But how do you like, is there some sort of verification loop that is actually like making beautiful scenes and stuff?
[30:05]
Ben
So there are a lot of things that we've noticed that agents are already very good at. Right. Like agents are already very good at writing a landing page. Right. But you hear a lot of people using Hyperframes and saying, oh, I'm making a PPT video. Which is fine. You know, PPT videos are useful in some use cases, internal use cases, it's totally fine. But when it comes to a launch video, it's not going to cut it. Right. People are not going to watch your PPT for more than five seconds. So what we also found is that LLMs today, especially with HTML and CSS, have what we call spatial intelligence or spatial aesthetics. Spatial aesthetics means you look at a landing page and your eyes move from top to bottom, left to right, and you scroll down, right. All the information is laid out and spatially it looks great, but videos don't work that way. With videos, we call it temporal aesthetics. Your eyes are always looking at the camera, or you're looking at the video, and the information is fed to you. Your eyes don't really move that much.
[31:15]
Peter
Yeah, because of the time element.
[31:16]
Ben
Yeah, exactly. There's a time element to it. And that, we found, is actually not something that LLMs are very good at, because they're not being trained on that. So we internally build evals, benchmarks and also self-check loops so that our own agent gets better and better at that. We open source a lot of those ideas into our skills so that your agent gets better at that as well. And we're actually working with frontier labs as well on how they can train LLMs to be better at this.
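One way to picture a self-check loop for temporal aesthetics is a pacing check that flags scenes asking the viewer to read too much in too little time. This is a sketch under stated assumptions: the scene shape, the `findRushedScenes` helper, and the three-words-per-second threshold are all made up here and are not HeyGen's evals or benchmarks.

```javascript
// Assumed on-screen reading speed; a made-up threshold, not a published number.
const MAX_WORDS_PER_SECOND = 3;

function countWords(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Returns the scenes whose on-screen text is too dense for their duration.
function findRushedScenes(scenes) {
  return scenes
    .map((s) => ({ id: s.id, wordsPerSecond: countWords(s.text) / s.seconds }))
    .filter((s) => s.wordsPerSecond > MAX_WORDS_PER_SECOND);
}

// Hypothetical storyboard: scene id, duration in seconds, on-screen text.
const storyboard = [
  { id: "hook", seconds: 2, text: "Launch videos cost too much" },
  { id: "pitch", seconds: 1, text: "Hyperframes turns HTML CSS and JavaScript into video" },
];

console.log(findRushedScenes(storyboard));
// [ { id: 'pitch', wordsPerSecond: 8 } ]
```

An agent could run a check like this after drafting a storyboard and lengthen or split any flagged scene before rendering.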
[31:57]
Peter
Got it. And is the primary use case right now like product launch videos and kind of like CIS reveal like tech product stuff?
[32:03]
Ben
Great question. You'd be surprised. There are so many different use cases. Product launch video for us is the holy grail. Because, you know, as a founder myself before HeyGen, I've spent $30,000 on a launch video and I was told that that was cheap. Right?
[32:23]
Peter
Yeah.
[32:23]
Ben
And a launch video is so, so, so important. We see that as the top-quality video type that we need to get to. But there are many, many videos that people use us for today that, believe it or not, you might not even think of: real estate videos, educational videos, obviously internal training videos, or just motion graphics that you make specifically for something. And I'll share one last example from internally: PR-to-video or commit-to-video. We literally ask Claude Code to look at my commits for the last seven days and tell my team what I did for the last seven days. It was really fun, and it was really useful. It gives everyone a sense of what everyone is working on. And it's a fun Friday afternoon event where we just watch like 10 videos together.
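The commit-to-video idea can be sketched as a small data step: take a week of `git log` output and turn each commit into a timed caption an agent could animate. The `commitsToScenes` helper, the three-second scene length, and the sample commit messages are assumptions for illustration; only the `git log` flags in the comment are real.

```javascript
// Made-up sample text in the shape produced by:
//   git log --since="7 days ago" --pretty=format:"%h|%s"
const sampleLog = [
  "a1b2c3d|Add palette picker to template preview",
  "e4f5a6b|Fix audio drift in renderer",
].join("\n");

// Hypothetical helper: one scene per commit, laid out back to back.
function commitsToScenes(log, secondsPerScene = 3) {
  return log
    .split("\n")
    .filter((line) => line.includes("|"))
    .map((line, i) => {
      const sep = line.indexOf("|"); // split on the first "|" only
      return {
        hash: line.slice(0, sep),
        caption: line.slice(sep + 1),
        start: i * secondsPerScene,
        end: (i + 1) * secondsPerScene,
      };
    });
}

const scenes = commitsToScenes(sampleLog);
console.log(scenes.length); // 2
console.log(scenes[1].caption); // Fix audio drift in renderer
```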
[33:23]
Peter
Yeah, I mean, because I've been in a lot of product reviews and stuff, and, you know, people share documents and slides. This stuff is just boring to go through, and people started doing more prototyping and stuff. But yeah, just having a really nice launch video, or a video that you can share in Slack so people can understand what the hell you're trying to do in like two minutes, is super useful.
[33:41]
Ben
Yeah. One of the things that we really want to also tap into: we work with, for instance, Hermes Agent. We have a deep integration with that team. You know, one thing that we found, and I'm sure, Peter, you run many of your own agents, right? Agents are extremely verbose. They come back with walls and walls and walls of text. And I have gotten to a point where I just don't read them. It's like, all right, sure, yeah, maybe you did what I asked you to do. But we turn that into videos, too. We ask agents, all right, when you're done, make a Hyperframes video. Tell me what you did in 30 seconds.
[34:20]
Peter
Wow. Wait, so, because I'm actually using Hermes Agent right now: is it natively integrated into Hermes, or do you have to install anything?
[34:27]
Ben
Yeah, it has a Hyperframes skill. You might need to run one command, like adding Hyperframes, but it's already in there.
[34:34]
Peter
Okay, you know what? You know what I want to do? I want to give it a bunch of pictures of my kids and then make some sort of a reel based on that. You could probably do that, right?
[34:42]
Ben
Absolutely, do that. Yeah.
[34:43]
Peter
And dude, let me try to push a little bit further. If I have a separate video of my kid, like at the playground or something, like, say, an MP4, can I play that video within the Hyperframes video?
[34:53]
Ben
Yes, you can, right?
[34:55]
Peter
It's just code.
[34:55]
Ben
Yeah, yeah, just code.
[34:57]
Peter
Okay. All right. All right, man. All right. I'm going to be playing with this a lot.
[35:00]
Ben
Yeah. What we found too is that the frontier models especially are actually also very good at visual understanding. So Fable 5 might be able to clip out those specific timestamps of your kid's video to highlight in a Hyperframes video. Because, you know, Hyperframes can also just make it so that this MP4 is played from like the third second to the.
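In plain HTML, one standard way to play only part of an MP4 is a temporal media fragment (`#t=start,end`) on the video URL. The sketch below shows that browser-level mechanism; whether the Hyperframes renderer honors it is an assumption, and `clipUrl`, `videoTag`, the file name, and the 3-to-8-second range are illustrative values only.

```javascript
// Builds a URL with a W3C temporal media fragment, e.g. "clip.mp4#t=3,8".
// Hypothetical helper; the fragment syntax itself is a web standard.
function clipUrl(src, startSeconds, endSeconds) {
  if (!(startSeconds >= 0) || !(endSeconds > startSeconds)) {
    throw new RangeError("expected 0 <= startSeconds < endSeconds");
  }
  return `${src}#t=${startSeconds},${endSeconds}`;
}

// Wraps the clipped URL in a <video> element an HTML scene could embed.
function videoTag(src, startSeconds, endSeconds) {
  const url = clipUrl(src, startSeconds, endSeconds);
  return `<video src="${url}" muted autoplay playsinline></video>`;
}

console.log(clipUrl("playground.mp4", 3, 8)); // playground.mp4#t=3,8
console.log(videoTag("playground.mp4", 3, 8));
```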
[35:25]
Peter
Oh, I see. I see.
[35:26]
Ben
Yeah.
[35:27]
Peter
Okay, let's talk about this. So what happens if I can't afford Fable 5? Like, what's the next best model for doing this stuff?
[35:33]
Peter
Is it Gemini or.
[35:34]
Ben
Yeah, Gemini.
[35:36]
Peter
Gemini.
[35:36]
Ben
Really. So we have been doing a lot of testing, a lot of evals. You know, if you want top-of-the-line quality, absolutely, GPT 5.5 and Fable 5 are the top tier. But Gemini definitely brings a quality-to-cost balance. Our internal agent is built all on top of Gemini.
[36:00]
Peter
I agree. Okay. Yeah. Very exciting. Very exciting. Okay, cool. So let's just kind of recap the whole process. Right. So I guess step one is to go to the GitHub for Hyperframes and just, like, clone or install it, and then maybe install the website-to-video skill. I think it can one-shot, right? So that's like a one-shot skill. Yeah.
[36:19]
Ben
One of the things I wanted to quickly show is that we also have a bunch of templates that you can, you know, go work off of. If there's one that you feel is close enough to what you want, but you want to change the colors, you can click on Fine Tune and change the palette here. It'll preview immediately, and you can define your own colors directly if you have them. You can also change the typography if you don't like the original one. You can go for a more funky or a more, I don't know, you can change all kinds of fonts. And then you just download the design pack. It'll get you a frame MD that works really well with Hyperframes.
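The palette and typography tuning described here maps naturally onto CSS custom properties, which let one set of values restyle every scene. The sketch below is an assumption about how such a theme could be expressed in plain CSS; `paletteToCss`, the variable names, and the colors are invented and do not reflect the actual design pack format.

```javascript
// Hypothetical helper: turns a palette and a font choice into a :root block
// of CSS custom properties that scene styles can reference with var(...).
function paletteToCss(palette, fontFamily) {
  const vars = Object.entries(palette).map(
    ([name, value]) => `  --color-${name}: ${value};`
  );
  vars.push(`  --font-display: ${fontFamily};`);
  return `:root {\n${vars.join("\n")}\n}`;
}

// Example colors and font are placeholders.
const themeCss = paletteToCss(
  { primary: "#7c3aed", accent: "#f59e0b" },
  '"Inter", sans-serif'
);
console.log(themeCss);
// :root {
//   --color-primary: #7c3aed;
//   --color-accent: #f59e0b;
//   --font-display: "Inter", sans-serif;
// }
```

A scene would then use rules such as `h1 { color: var(--color-primary); font-family: var(--font-display); }`, so swapping the palette means regenerating only this one block.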
[37:04]
Peter
Oh, that's perfect. Yeah. This looks like a slide too. You should definitely support slides.
[37:08]
Ben
Absolutely.
[37:10]
Peter
Yeah.
[37:12]
Ben
Working on it.
[37:13]
Peter
Okay, great. And Jake, like, you know, you were not born to be an amazing HTML video editor, right? So how do you learn this stuff? Like, how did you become good at this stuff over the past two months?
[37:24]
Jake
You know, I started with small projects, right? Like only a 5 second video where I wanted to have some kind of motion and I learned how to describe it. And then from there I just kept refining. And then when I got like text effects and other things that I liked, I would turn those into a skill that I would point my agent to for any of my videos after that. Right. I reuse text animations, I reuse prompts I create, just amending them to the specific video at hand.
[37:56]
Peter
Okay. Well, I guess the good news is that people can just clone the repo and copy what you've done, so.
[38:01]
Jake
Exactly.
[38:02]
Peter
Yeah, yeah, yeah. Cool.
[38:04]
Ben
Yeah. I do think that what was really freeing about this for builders like ourselves is this: Jake did make almost all of our launch videos in the last two, three months. And you guys wouldn't believe it, but we had at least 20 to 25 launches in like two months. And each launch comes with, in my opinion, a very good launch video. Maybe not top of the line, but good enough. But I think a lot of the engineers here can actually make their own launch videos now, because they understand the process, they understand the product. They just work with their agent. The agent makes, you know, everything after a lot of iterations. Right. But they don't have to learn a tool. They don't have to learn After Effects or even CapCut. Right. Which I think is a huge unlock for builders and entrepreneurs who are trying to build videos for their businesses but, you know, finding it extremely hard to learn.
[39:03]
Peter
Yeah, like, who wants to go to a website and learn new tools? It's a pain in the ass. I just want to get my agent to do it. Yeah. Okay, cool. So then what are you guys planning next? Like, is it going to be another 25 launches in.
[39:19]
Ben
Yeah, there are quite a lot. You know, Jake showed off the storyboarding. We believe that it's an important step in video making, so we will likely open source a lot of that. We're also building what we call media use. We're actually going to build a ton of skills inside of Hyperframes so that your Hyperframes agent learns how to use background matting, how to add sound effects, music, all of those things. At HeyGen we will offer a lot of that for free, while some of it will likely cost you for the more expensive models. So, yeah, we want Hyperframes to be able to do, honestly, anything and everything that a video editing tool needs to do.
[40:05]
Peter
Okay, I think that's amazing. Okay, look, I think number one, I see a lot of AI builders now posting Hyperframes videos to launch their new GitHub repo or something. And that just feels really empowering, because otherwise you have to pay like $30,000 to do it. And the other thing is, you know, I've been on YouTube for a while and I still haven't learned how to edit a video myself. I don't want to learn Premiere, After Effects or whatever. Like, I don't want to learn it. So yeah, I've been paying my video editor to do all my stuff and I'll probably still do that. But when people do little things with Hyperframes, it just feels really empowering. Yeah. So I guess I want to thank both of you for putting this stuff out there and making it free for all of us to use. Where can people find you guys online?
[40:47]
Ben
Absolutely. Well, we're on Twitter. Find us through the HeyGen Twitter account or X account. We post almost every day about Hyperframes. So find us there; HeyGen is the handle.
[41:02]
Peter
Got it. And the great thing about Hyperframes videos is, like, it just helps you go viral on Twitter. So you know, if you want to go viral.
[41:06]
Ben
It's true. And we will always retweet your Hyperframes videos if you tag us.
[41:12]
Peter
Okay, cool. I'll definitely tag you guys. All right, guys, well, thanks so much for your time, man.
[41:16]
Ben
Thank you so much, Peter.
[41:17]
Jake
Thank you.
[41:18]
Peter
Cheers.