AI Art Just Leveled Up with OpenAI’s Latest Model - The Jaeden Schafer Podcast

Summary of "AI Art Just Leveled Up with OpenAI’s Latest Model"

Podcast: The Joe Rogan Experience of AI
Host: The Joe Rogan Experience of AI
Release Date: April 21, 2025

Introduction

In the episode titled "AI Art Just Leveled Up with OpenAI’s Latest Model," the host delves into OpenAI's groundbreaking advancements in image generation technology. Emulating the conversational and insightful style of Joe Rogan, the podcast provides an in-depth analysis of the new model's capabilities, its implications for the creative industry, and the broader intersection of technology and human experience.

Launch of OpenAI’s 4o Image Generation Model

The episode begins with the host announcing the launch of OpenAI's brand-new image generation model, now embedded into ChatGPT. This release marks a significant milestone, offering enhanced features that surpass previous iterations.

A [00:00]: "OpenAI for the first time in years has just launched their brand new image generation model and they have it embedded into ChatGPT. Today on the podcast, we're breaking down demos, how this is working."

Key Features and Capabilities

Enhanced Text Generation Within Images

One of the standout features of the new model is its ability to generate clear and accurate text within images, something earlier image generation models handled notoriously badly.

A [00:45]: "The number one feature that I'm excited about is the fact that it can generate text inside of the images."

The host cites a demo where the model accurately generates a boarding pass with readable and precise text, showcasing its improved text rendering capabilities.

Comprehensive Infographic Creation

The model's proficiency in creating detailed infographics with minimal input impresses the host. By simply requesting an infographic on "why Arizona is so hot," the model delivers a cohesive and visually appealing design without the need for selecting specific templates or design elements.

A [04:30]: "It created a very well designed... It's got like this really cool deserty yellow feel to it... The text looks perfect."

Consistency in Character Creation

The host highlights the model's ability to maintain consistency when generating multiple iterations of a character across different styles. Through a demo involving a geometric penguin, the model successfully recreates the same character in various artistic renditions, from realistic miniatures to crystal and metal styles.

A [13:15]: "It is literally the exact same penguin. We're just looking at it from a whole bunch of different, different ways."

Complex Prompt Handling

Another remarkable feature is the model's aptitude for handling intricate prompts involving multiple elements. The host shares an example where the model seamlessly integrates fifteen different items into a single graphic, demonstrating its advanced comprehension and execution abilities.

A [19:50]: "It will listen exactly to what you say, right? You're like, I want them to be wearing green shoes and I want there to be seven pairs of green shoes on the windowsill in the background."

Blending Text and Images

The model excels in merging text with images, enabling users to create layered and contextually rich visuals. The host describes a demonstration where an infographic is integrated into a real-world photo, such as placing it on a textbook cover in front of the Arc de Triomphe.

A [25:40]: "It's like, you can generate graphics, and then because you're chatting with the chat interface, you generate a really cool graphic... and it will then generate the next photo."

Image Editing and Customization

The new model offers advanced image editing features, allowing users to specify exact aspect ratios, colors (including hex codes), and backgrounds. This level of customization is particularly beneficial for graphic designers aiming to maintain brand consistency.

A [32:10]: "You can say exact colors. You can use hex codes... It is going to recreate your logo or recreate... the background of whatever your photo is."
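As a rough illustration of the hex-code and aspect-ratio instructions described above, the sketch below assembles an edit request as plain text. It is a minimal sketch, not anything shown in the episode: the function names (`normalize_hex`, `aspect_ratio`, `build_edit_prompt`) and the prompt wording are hypothetical, and nothing here calls ChatGPT or any OpenAI API. It only validates the brand colors and reduces pixel dimensions to a ratio before you paste the text into a chat.

```python
import re
from math import gcd

# Hypothetical helper names; the prompt wording is an assumption, not an official format.
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def normalize_hex(color: str) -> str:
    """Return the color as uppercase #RRGGBB, or raise ValueError."""
    color = color.strip()
    if not HEX_COLOR.fullmatch(color):
        raise ValueError(f"not a #RRGGBB hex code: {color!r}")
    return color.upper()


def aspect_ratio(width: int, height: int) -> str:
    """Reduce pixel dimensions to a ratio string such as '16:9'."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


def build_edit_prompt(subject: str, brand_colors: list[str], width: int, height: int) -> str:
    """Combine the request, validated brand colors, and aspect ratio into one prompt."""
    colors = ", ".join(normalize_hex(c) for c in brand_colors)
    return (
        f"{subject}. Use only these brand colors: {colors}. "
        f"Aspect ratio {aspect_ratio(width, height)}."
    )


print(build_edit_prompt("Recreate our logo on a flat background",
                        ["#1a73e8", "#ffffff"], 1920, 1080))
```

Validating the codes up front catches typos such as a missing digit before the request is sent, which matters when the goal is an exact brand match.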

Additionally, the ability to create images with transparent backgrounds, such as stickers, enhances the model's versatility.

A [34:55]: "They actually were able to pull it off and literally download that as a transparent PNG background."
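If you download a sticker and want to confirm the file can actually carry transparency, a check along the following lines may help. This is a sketch using only the Python standard library and the published PNG chunk layout. The helper names (`png_has_transparency`, `tiny_png`) are hypothetical, and the episode does not describe how the downloaded file is encoded. Note that the check reports whether the file has an alpha channel or a `tRNS` chunk, not whether any pixel is actually see-through.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_has_transparency(data: bytes) -> bool:
    """Return True if the PNG bytes can carry transparency.

    Color types 4 (gray + alpha) and 6 (RGBA) have an alpha channel;
    other types are transparent only if a tRNS chunk precedes the image data.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    color_type = None
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"IHDR":
            color_type = body[9]  # width(4) + height(4) + bit depth(1), then color type
            if color_type in (4, 6):
                return True
        elif chunk_type == b"tRNS":
            return True
        elif chunk_type in (b"IDAT", b"IEND"):
            break
        pos += 8 + length + 4  # chunk header + body + CRC
    if color_type is None:
        raise ValueError("missing IHDR chunk")
    return False


def _chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, body, CRC."""
    crc = zlib.crc32(chunk_type + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


def tiny_png(color_type: int) -> bytes:
    """Build a 1x1, 8-bit PNG of the given color type (2 = RGB, 6 = RGBA)."""
    channels = {0: 1, 2: 3, 4: 2, 6: 4}[color_type]
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, color_type, 0, 0, 0)
    scanline = b"\x00" + b"\x00" * channels  # filter byte + one black pixel
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(scanline)) + _chunk(b"IEND", b""))


print(png_has_transparency(tiny_png(6)))  # RGBA sample
print(png_has_transparency(tiny_png(2)))  # RGB sample
```

For a real download, read the file with `open(path, "rb").read()` and pass the bytes to `png_has_transparency`.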

Advanced Style Generation

The host discusses the model's capability to generate images in various artistic styles based on user input. By uploading sketches or existing images, users can transform them into fully illustrated comics, lifelike statues, and more, enabling dynamic and creative content generation.

A [38:20]: "It took her sketch, it illustrated it to be in color. Then it was pretty funny... it threw it straight into the comic book."

Real-World Testing and Performance

In testing the model, the host tried recreating existing images, including memes and a software screenshot. The meme attempts went poorly: the model first returned only the text, and the image it then produced was not very good. A screenshot of Riverside, his recording software, crashed about halfway through generation, but the half that rendered reproduced the interface text accurately.

A [42:05]: "It generated about half of the image before it crashed, but in that half of the image, it has like perfectly written out text that looks absolutely amazing."

Potential Impact on the Creative Industry

The host posits that OpenAI's latest model could disrupt existing design tools like Canva by offering more intuitive and powerful image generation capabilities. The seamless integration with ChatGPT and the model's superior performance position it as a formidable competitor in the creative tech landscape.

A [47:50]: "This becomes an incredibly useful tool to the point where I think it threatens Canva... ChatGPT is just the biggest at this point."

Conclusion and Recommendations

Wrapping up the episode, the host enthusiastically recommends that users explore the new image generation features, emphasizing that they are rolling out to both free and paying users. He advises making sure GPT-4o is the selected model in ChatGPT, rather than picking a separate DALL·E or image tool, to get the best version of the image generation.

A [54:30]: "Highly recommend checking this out if you're a pro user, if you pay for it, even a free user. This is rolling out to literally everybody."

The host concludes by expressing his amazement at the model's capabilities and encourages listeners to engage with the tool to harness its potential for creative endeavors.

Final Thoughts

Overall, the episode provides a comprehensive overview of OpenAI's latest advancements in image generation, highlighting their practical applications and transformative potential for the creative industry. Through detailed demonstrations and insightful commentary, the host underscores the significance of these developments in shaping the future of AI-driven art and design.

Transcript

[00:00]
A
OpenAI for the first time in years has just launched their brand new image generation model and they have it embedded into ChatGPT. Today on the podcast, we're breaking down demos, how this is working. I've actually got a chance to play with this and use it and I am absolutely blown away by what this is actually able to do. So today on the podcast we'll be diving into it.
Now, the first thing I wanted to mention is the fact that as they've rolled this out, the number one feature that I'm excited about is the fact that it can generate text inside of the images. So this is something that has been notoriously terrible, you could say, for these image generation models in the past. They recently came out with a tweet. They said, "4o image generation has arrived. It's beginning to roll out today to ChatGPT and Sora to all Pro, Plus, Team, and free users." So literally everybody is getting this. They then had a picture right below it where it's literally someone holding a boarding pass. It says "Boarding pass: introducing 4o image generation, now in ChatGPT and Sora," 3-25, 11am PDT. Okay, look, as you can tell now, it's very good at text. Look at all this accurate text, all that's written on the piece of paper. And I am blown away by like how clear this is. So you can tell it, generate a boarding pass with all of this information on it. And the text looks perfect.
So I decided to actually test this out because I was a little skeptical. Sometimes you can see these like demos and these tweets and it's like, wow, this looks amazing. You're not exactly sure where it, where it sits on this. And so I decided to give it a test myself and I literally decided to, I was trying to just one-shot an infographic. They said it could do infographics. I said, make an infographic on why Arizona is so hot. And literally without giving it any more sort of information on what I wanted, it created a very well designed... It's got like this really cool deserty yellow feel to it.
It says "Why Arizona Is Hot": desert climate, low elevation, high pressure. It's got explanations on each of those below them. And the text looks perfect. It's all the same font, it's all super cohesive. I didn't have to choose any design. In my opinion, this, slash what comes after this, is going to almost kill companies like, like Canva, or at least you're going to need to be able to maybe like generate something like this and open it in Canva. And it's going to be kind of like, Canva's going to have to figure out some AI tools to make it so you can just like edit this directly. Because I don't really see myself in the future, if I want to create graphics or something, trying to go find a template or a design. I'm just going to one-shot it. And like, it's very good at listening to your instructions. So I gave it virtually no instructions. I just said make an infographic, but I could have said, make an infographic, include cactuses, include the sun.
So they actually went through demos of what it's capable of doing. And it's very, very impressive. One of the things that it can actually do is you are like working with it in a chat and it can be super consistent so you can create the same character. They showed a demo of this where essentially they were creating the exact same character. He had it create like this, this like, you know, geometric penguin character, for example. And then he got it to create the exact same geometric penguin. But all of a sudden he made it in, you know, a realistic miniature style, as if a professional made it and painted it. And all of a sudden they create like the same thing, but now it looks like a little miniature sculpture. It's the exact same penguin from the exact same angle holding the exact same keys. And so to me, like, this is very, very impressive.
Now the other thing that they were then able to do after they kind of did that was they went through and got it to generate this in a whole, in like a crystal style, as if it was turf, as if it was lava, as if it was a gummy bear, as if it was a metal, like all of these different styles. And what's so impressive to me is that it is literally the exact same. It's the exact same penguin. We're just looking at it from a whole bunch of different, different ways. This is really good for creativity. You can essentially upload an image and get it to recreate it and then change the style. And you can imagine doing this yourself.
I saw a demo where someone was essentially able to upload a photo. So this was Allie K. Miller on LinkedIn. She uploaded like a podcast cover that she had done with, you know, her profile picture, whatever, professional studio photo or whatever. And then she said create. And so by the way, this one that she's doing isn't even this same one from ChatGPT. Google has released this. So OpenAI is coming up with sort of this response to this tool from Google and it's able to do pretty much the same things, but for the Google product. Anyway, she uploaded a podcast cover and said, create an official passport photo for this woman. Be sure to use the exact same woman. It created what it was called like a passport photo, which looks just like a passport photo and it looks exactly like her. Like you could tell it's obviously recreated with AI, but it is her. And so we're getting to this point where these tools are so good at, you upload a character and then it just recreates it in a bunch of different variations. So that was a really cool demo.
The next thing that they showed off that this thing is very good at is generating complex prompts. So they essentially created a prompt that, that they used for this, which they had 15 different sort of things.
There was like a pair of googly eyes, a thumbs up emoji, a pair of blue scissors, a white giraffe, the word OpenAI. Like they had all of these different things that they wanted it to create. And then it created a graphic with all 15 of the things that they described inside of that graphic. So the reason why they showcased that and I'm so blown away and why I think it's important is because now it's to the point where these images, you know, we had image models that were good before. I think Midjourney was pretty good. It would look quite realistic. You could generate really realistic photos of people. Now it's useful. Now you can say, I want there to be a, you know, like I want there to be a camera, I want there to be this specific product, I want there to be this specific lighting, the specific angle. I want you to have like 10 of these things in the background and it will listen exactly to what you say, right? You're like, I want them to be wearing green shoes and I want there to be seven pairs of green shoes on the windowsill in the background. I want there to be five jackets hanging up in the closet. This was not something that previous AI models were able to do. And so it's really, really incredible that it has this capability down.
So the next thing that it is now able to do is to essentially blend text and images. And I kind of went over that with my example of the infographic that I thought was really impressive. But I saw so many other examples where imagine now you create that infographic, but then you want to merge that with a real world photo. So they did a demo where they created an infographic and then they created, essentially they had somebody holding that infographic on the front cover of a textbook in front of the Arc de Triomphe in the real world. So it looks like a real photo with that infographic being like something on a piece of paper inside of it. That, to me, is, like, really cool. It's like, it's very meta.
You can generate graphics, and then because you're chatting with the chat interface, you generate a really cool graphic. It's like, now take that graphic, stick it on the front cover of a textbook, and put a man doing this, and it will then generate the next photo. And then you could say, if you wanted to, you could say, now take that photo and put it on the front cover of a newspaper and have someone reading it. And it's like, now take that picture of a newspaper. Like, you could just go in, like you're creating graphics that go inside of graphics that get so detailed. This is really, really cool. I think, for the first time, these are very useful.
Okay. A couple other features that I think are definitely worth mentioning. One of the big ones is how you can actually edit these photos. There's a couple cool things you can do. Obviously, you're sitting there chatting with it, describing how you want to edit the photo. You can say things like specific aspect ratios, which is really cool. You can say exact colors. You can use hex codes. My gosh, this is incredible. For graphic designers that are like, hey, our brand colors are, you know, these five or these three hex codes. You put those hex codes in, it's going to recreate your logo or recreate, you know, stuff behind your, behind the background of whatever your, your photo is. Now it's all going to match your brand colors. This is amazing. And of course, you can also do transparent background. So they, they showed a demo where they created a sticker of a dog and they made a transparent background. They actually were able to pull it off and literally download that as a transparent PNG background. They made a bunch of different stickers. I thought that was really cool.
The last thing I wanted to show off was they did a demo where they essentially were able to go and create images in a bunch of different styles using GPT-4o. So the first thing they did is they made a comic book.
She drew out a comic book, took a picture of it, uploaded it. So this is what I then went and actually tested out, and I'll show you what it was able to do. But she just kind of did a sketch of a comic book, and then she said, you know, can you make this into a real comic of a dragon? So then it went and actually illustrated it. It took her sketch, it illustrated it to be in color. Then it was pretty funny. But then she kind of said like, hey, here's a picture of like a crystal penguin, it's one of the crystal penguins they had generated earlier in their demo. And she's like, now change out, you know, the dragon for this crystal penguin. And it threw it straight into the comic book. So it's like, I think the ability to upload images and get it to kind of do these in real time. She also then took the crystal penguin and said, generate a lifelike statue of this in my living room. And it then was able to generate it in the living room. So you're uploading images inside of images. This is just incredibly useful. Incredibly useful.
So I decided to test like the image, like if it's actually able to regenerate images. I tried with like a bunch of memes where I'd like, I took a screenshot of a meme and I said, remake this photo. At first it kind of glitched out when I said remake this photo. And it just like created the text for the photo. Then I told it to create an image and it, it wasn't very good based off of that. So I was a little discouraged. I, I think this probably has something to do with the way it created the text first. So I tried it one other time and while it actually did crash on the video generation, I took a screenshot of literally Riverside. It's the software I use to like record my podcasts. And I said, recreate this image exactly, even including all the text. And like, we're talking about a screenshot of like tons of UI, tons of text elements all over the screen.
It generated about half of the image before it crashed, but in that half of the image, it has like perfectly written out text that looks absolutely amazing. I'm very, very blown away and impressed by this.
So overall, it looks like we are seeing some absolutely incredible things from what I've been able to demo and test so far. I mean, we're talking like the text is amazing, like, we're recreating screenshots of whatever's on my screen. We're making one-shot graphics, we're making stickers, we're editing things, transparent backgrounds. This is literally the image generator of, I think, many people's dreams. I, to be honest, had completely kind of written off image generation on, on ChatGPT for over a year now. There's just so many better options and this blows everybody, I mean literally everybody, out of the water. This becomes an incredibly useful tool to the point where I think it threatens Canva. It threatens like so many other players. And so I'm impressed. Google, like I mentioned, has that one other tool that they have rolled out that's able to do some similar things. ChatGPT is just the biggest at this point and so I think they didn't let Google steal their thunder for long. They came out with this and it is incredibly impressive.
Highly recommend checking this out if you're a pro user, if you pay for it, even a free user. This is rolling out to literally everybody. You have to go check it out. The one thing you need to make sure to do is you need to make sure that ChatGPT 4o is selected. You don't need to go and select DALL·E or, I don't know, go select any sort of image thing. Just make sure it's ChatGPT 4o. That's where you're getting the best version of this image generation.
Thanks so much for tuning in to the podcast. If you enjoyed it, make sure to like and subscribe over on YouTube. Drop us a comment or a review on Apple or Spotify. Thanks so much for tuning in and I hope that you all have an amazing rest of your day.