Summary

The Last Invention is AI

Episode: OpenAI Gen: Floral Fantasia
Date: December 24, 2025
Host: The Last Invention is AI

Episode Overview

This episode explores OpenAI’s latest release: a new image generation model referred to as “image 1.5.” The host evaluates its technical improvements, user experience, and its place in the rapidly evolving competitive landscape of AI image generation. With direct hands-on testing, the discussion covers both the model’s capabilities and its current limitations, while also considering broader impacts for OpenAI’s market strategy as it faces strong competition from Google’s Gemini/Nano Banana models.

Key Discussion Points and Insights

1. OpenAI’s Competitive Urgency and Context

OpenAI released image 1.5 ahead of a launch the host says had been planned for early January, reportedly accelerating its timeline in reaction to losing ground to competitors, particularly Google’s Nano Banana image model.
The host remarks:

“I do think this is a really impressive model. … perhaps it is because prior to them releasing this model, their last image model update I was begging them to make for over a year. The old version of DALL-E… was absolute garbage. They’re getting smoked by literally everybody, including Midjourney and everyone.” (00:40)
Code Red: Internal urgency within OpenAI, described as “Code Red,” fueled rapid development and release to avoid further market loss.
Leaderboard pressure:

“The newest version of Google's rival image generator, Nano Banana, topped the LMArena leaderboard across a bunch of different benchmarks. And I do not think OpenAI appreciated that.” (04:36)

2. Technical and Functional Improvements in Image 1.5

Instruction Following and Speed:

“Apparently it’s a lot better at following instructions. I have found that it is more precise at editing and it’s four times faster at generating images, which, let’s be honest, is the biggest thing that would drive me crazy with OpenAI.” (02:13)
Granularity and Iteration:

“They have a really cool feature now where if you click on an AI image, you have this feature called select area and you can select a part of the image and have it regenerate that bit of the image only…” (09:01)
Editing Experience: The model now supports partial regeneration, reducing the frustration of having to re-render entire images for minor tweaks, though the host notes some integration issues with very granular edits (e.g., only updating a head can cause mismatched backgrounds).
Input Flexibility: By uploading reference images (e.g., Sam Altman’s head, OpenAI logo), users can achieve high-fidelity outputs that previous iterations struggled to create.
4K Output: The model can generate 4K images.
Creativity and Interface Upgrades:
- New UI features on ChatGPT’s Images tab simplify the workflow for generating and managing images.
- Trending prompts, preset filters, and ideas for creative templates (holiday cards, album covers, etc.).

3. Hands-On Testing & Use Case: Creating a Complex YouTube Thumbnail

The host outlines a real example:
- Prompt: “Generate a YouTube thumbnail of me looking shocked and staring at a giant cloud with letters in the sky written by an airplane that say ‘new AI image.’ The airplane has an OpenAI logo and is being flown by Sam Altman.” (07:02)
- Findings: The initial image impresses, especially compared to older models, but with minor errors (logo inaccuracy and less-convincing Sam Altman likeness).
Notable workflow:
- Used new “select area” tool to regenerate the airplane pilot’s head but encountered blending issues with the background.
- Solution: Upload explicit reference images (desired logo, actual Sam Altman photo), which yielded much more accurate outputs after re-generation.
- Quote:
  
  “Once I did that, it got the correct OpenAI logo and Sam Altman's head and actually everything looked great. … The image looks a hundred times better than its last model. So I’m really, really impressed.” (11:56)

4. Comparative Market Analysis & Implications

OpenAI is aiming to close the gap—or even overtake—Google’s Nano Banana, especially since the latter leads on major performance benchmarks.
There’s a dynamic, ongoing “arms race” in AI image models, with companies pushing to outdo each other on both speed and quality.
The host speculates that improvements in image models will soon be incorporated into video generators (like Sora), as the two are technologically connected.

5. User Experience Enhancements

ChatGPT’s new “Images” tab streamlines creation and management:
- Saves historical creations
- Allows immediate access to trending prompts and templates
- Intuitive toggling between image and text tools
Quote:

“You can discover like holiday cards or… what would I look like if I was a K pop star? … I think they're trying to like create some trends or something. But I do think it's nice—it saves you a couple seconds...” (15:53)

Notable Quotes & Memorable Moments

On OpenAI’s need to catch up:

“Every week, every month that they're behind in the benchmarks, a bad sign for them, they lose market share, so they're trying to be faster.” (05:09)
On improvements in user control:

“This update that they've added, you can tell it to make small updates like that and it will make the small update across the entire image. So… more like a creative studio.” (13:34)
On partially updating images:

“When it regenerated his head, it put like a better looking head on, but all of the space around his head didn’t match the sky beside it. … It looked like I was in Photoshop and I like cut and pasted a little piece of an image on top, so it kind of looked bad.” (10:17)
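The "cut and pasted" look the host describes is what a hard-edged region replacement typically produces. The sketch below is purely illustrative and makes no claim about how OpenAI's select-area tool works internally. It uses one row of made-up brightness values to compare a hard mask with a feathered (gradually blended) mask. The function names `feathered_mask` and `composite` and all pixel values are hypothetical.

```python
def feathered_mask(width, start, end, feather):
    """Blend weights for one row of pixels (illustrative only).

    Pixels in [start, end) get weight 1.0 (fully regenerated). Outside the
    selection the weight falls off linearly to 0.0 over `feather` pixels.
    With feather=0 this is a hard-edged mask.
    """
    weights = []
    for x in range(width):
        if start <= x < end:
            weights.append(1.0)
        else:
            # Distance in pixels from x to the nearest selected pixel.
            dist = start - x if x < start else x - (end - 1)
            weights.append(max(0.0, 1.0 - dist / (feather + 1)))
    return weights


def composite(original, regenerated, weights):
    """Mix regenerated pixels into the original using per-pixel weights."""
    return [round(w * r + (1.0 - w) * o, 3)
            for o, r, w in zip(original, regenerated, weights)]


def max_jump(row):
    """Largest brightness difference between neighbouring pixels."""
    return max(abs(a - b) for a, b in zip(row, row[1:]))


# Hypothetical data: a uniform sky (100) and a regenerated patch that came
# back slightly brighter (160), as if the new head had a mismatched backdrop.
sky = [100.0] * 8
patch = [160.0] * 8

hard = composite(sky, patch, feathered_mask(8, 3, 5, feather=0))
soft = composite(sky, patch, feathered_mask(8, 3, 5, feather=2))

print("hard edge:", hard, "max jump:", max_jump(hard))
print("feathered:", soft, "max jump:", max_jump(soft))
```

In this toy setup the hard mask leaves an abrupt step at the selection boundary, while the feathered mask spreads the same brightness change over several pixels. That is one plausible reason why selecting a larger area, as the host suggests, might blend better.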

Timestamps for Key Segments

00:00–02:00 — Introduction, initial impressions of the new image model
02:01–05:40 — Urgency, market competition, and background on OpenAI’s “code red” state
05:41–07:20 — Technical performance, speed and benchmarks
07:21–11:00 — Hands-on testing: YouTube thumbnail use case, pros and cons in practice
11:01–13:30 — Manual solutions and improvements with reference image uploads
13:31–16:30 — Interface upgrades, iteration improvements, creative potential
16:31–end — Conclusion, summary thoughts, calls to action

Episode Takeaways

OpenAI’s image 1.5 is a significant leap in capability, especially in speed, instruction following, and iteration/editing details.
Despite some minor flaws in very granular edits, the model can achieve highly accurate and creative outputs with well-crafted prompts and reference uploads.
The competitive environment is driving rapid, user-focused innovation, with OpenAI determined not to fall behind industry rivals.
End users benefit from improved UX, faster workflows, and a more robust image generation toolkit built into ChatGPT.


Transcript

[00:00]
A
OpenAI has just dropped a brand new image model. I've been testing it out and playing with it today. I'm quite impressed with what they've been able to accomplish. TechCrunch said that they are continuing their Code Red warpath by putting out this model. I don't know if it's a Code Red warpath, but I do think this is a really impressive model. And I also think, I mean, I think it was just time for them to update it, but perhaps it is because prior to them releasing this model, their last image model update I was begging them to make for over a year. The old version of DALL-E, so like two generations ago, was absolute garbage. They're getting smoked by literally everybody, including Midjourney and everyone. And so when they made their previous update to the image model, it was a huge, huge upgrade.
Playing with this newest model is really cool. There's a bunch of cool features, but there's still some places that it failed when I was testing it. So I'll give you the pros and the cons on this episode and break down what I think it is capable of, what it isn't capable of doing, the areas I think that there are for improvement and some of the shockingly impressive things I was able to get it to do. So we're going to get into all of that on the podcast today.
But if you want to test out all of the models I talk about on the show, go check out my own startup, which is called AIbox.ai. You get access to over 40 of the top AI models, a whole bunch of image models that are really cool, a whole bunch of audio models like ElevenLabs, OpenAI's audio model for text, you have Anthropic, Google, OpenAI, Meta, tons of cool open source models all on there for $20 a month so you can save money and have them all consolidated into one place. If you want to go check that out, it's AIbox.ai. I'll leave a link in the description.
All right, let's get into OpenAI's latest model. So they've just rolled out this new image model. Apparently it's a lot better at following instructions.
I've tested it out. I have found that it is more precise at editing and it's four times faster at generating images, which, let's be honest, is the biggest thing that would drive me crazy with OpenAI, and the reason why I was using Gemini's Nano Banana, because it was just so much faster at creating images. So I actually think this is a big moment for OpenAI. They obviously didn't want their image model to get lapped. Everyone was switching to Nano Banana for image generation, and so I think that they're really trying to push to make sure that they're not falling behind in this. I think this model catches them up and possibly surpasses Nano Banana in some ways.
So what's cool about this is they made the announcement, they're calling it image 1.5. It's available on ChatGPT for everybody that has ChatGPT and it's also on the API. So it's an amazing new image model. OpenAI's Sam Altman last month said that they were in code red, and a leaked internal document was essentially saying that they're, you know, losing market share to Google. They weren't the market leader anymore, they were falling behind, they had room to grow, and it seems like this is something that they have been working on.
So the newest version of Google's rival image generator, Nano Banana, topped the LMArena leaderboard across a bunch of different benchmarks. And I do not think OpenAI appreciated that. So right now, Google still has its lead over OpenAI in the launch of GPT-5.2. And because of that, basically that means that people are preferring Gemini responses, and that is something that OpenAI does not want. I think basically at this point every week, every month that they're behind in the benchmarks is a bad sign for them. They lose market share, so they're trying to be faster. So on that note, apparently OpenAI had been planning on releasing this new image generator in early January next year.
But because of the benchmarks, because of the code red, because of everything going on, they decided to just accelerate those plans and push it out as fast as they could. And so they got this model out. The last time they had a model update was in April. This was quite a while ago and I think it was definitely due. So now that they're doing this new 1.5 image model and the image model updates, you have to also imagine that the video generator in Sora is going to get a good upgrade soon, because all of the video generators are based off of image generators.
So just like Nano Banana Pro, ChatGPT Image has post-production features which give you a lot more granular editing control when you're making some of these images. So there's like facial likeness, there's lighting, there's composition, there's color tone across different edits, there's a bunch of cool things that you can do with it.
When I was playing with this earlier today, I was making a thumbnail, which is like the number one way I test image models, because I'm asking it to do text, I'm asking it to do images, I'm asking it to take a picture of me and put it in there and other people and logos of companies and like all this kind of stuff. And I was actually impressed by a couple of things, but I think there's room to grow in a couple other areas.
So the first thing that I was impressed by was right off the bat, I gave it a picture of myself and I said, generate a YouTube thumbnail of me looking shocked and staring at a giant cloud with letters in the sky written by an airplane that say new AI image. The airplane has an OpenAI logo and is being flown by Sam Altman. Okay, I gave it a lot of things and I also gave it some concepts where the reasoning model had to think about what was going on. Like how is it going to display the cloud letters? How is it going to make you be also able to see the airplane and the person driving the airplane? Like there's a.
There's a bunch of things that I was curious how it was going to do. It did this like 100% better than the old model ever could have. It did a really impressive image for me. The one thing that I will say in its first go is that the OpenAI logo was not the OpenAI logo. I've had it accurately find it on the web before and put it on there. It did put all of the cloud letters really good. It had the airplane at a really great place. That all made sense. The person flying it didn't really look like Sam Altman. That was, I guess, my biggest complaint about this.
And they have a really cool feature now where if you click on an AI image, you have this feature called select area and you can select a part of the image and have it regenerate that bit of the image only, so you don't have to get the whole image regenerated, just the part that you're talking about. Now one thing I will say that I feel like it didn't do a great job of was I selected just the head of the person flying the airplane, this random person that was apparently Sam Altman but didn't really look like him. And I literally just put a circle around his head and had it regenerate. And when it regenerated his head, it put like a better looking head on, but all of the space around his head didn't match the sky beside it. So like you could tell. It looked like I was in Photoshop and I like cut and pasted a little piece of an image on top, so it kind of looked bad.
I'm assuming what I probably should have done was selected maybe the whole airplane or something and had it regenerate the whole airplane. Maybe really granular, small bits, it's not as good at generating. So in any case, I think it definitely has some room for improvement there. But afterwards I literally, without using that selection tool, I just uploaded a picture of Sam Altman's head and uploaded a picture of the OpenAI logo.
And I was like, update the logo to use this one and the image of Sam Altman to be this one. And once I did that, it got the correct OpenAI logo and Sam Altman's head and actually everything looked great. So if I had done that from the beginning, where I provided, you know, the pictures of all the people I wanted to be used and the pictures of the logo that I want to be used, like it could have done it right off the bat, probably. The image looks a hundred times better than its last model. So I'm really, really impressed.
And beyond just making better images, it's also able to make them a lot higher quality. You could do 4K images. I think something that a lot of people have been talking about is just that most of the generative AI image tools are really bad at iteration, like if you're trying to change it. So like this whole process I just walked through where I was editing the image live. So, you know, in the past if you said like adjust the facial expression or make the lighting colder, it would just re-render. It would like regenerate the entire image and maybe the next one wouldn't look like how you wanted. This update that they've added, you can tell it to make small updates like that and it will make the small update across the entire image. So it's more like, OpenAI's CEO of applications, he has a whole blog post about it and he said that it's, quote, more like a creative studio. I actually think it is.
Someone else was saying that, you know, the new image viewing and editing screens make it easier to create images that match your vision or get inspiration from trending prompts and preset filters. That's another thing that I should mention, is that on ChatGPT now on the left hand side, you will see that there is an Images tab, and inside the Images tab, if you're just trying to create an image, you don't have to in ChatGPT be like create an image of XYZ. You just describe the image you're creating. So that will save you a couple prompts.
In addition, you can see all of the images you've ever created. So that's kind of useful for you to go see, and you can download them. You can discover like holiday cards or, you know, me as an album cover, or what would I look like if I was a K-pop star? I don't know. They have like a bunch of funny ideas that you can go try. I think they're trying to like create some trends or something. But I do think it's nice if it saves you a couple seconds instead of having to go and, you know, add that into your prompt. You just click on the image generation button and it knows that you're doing that. It also has a button for adding images in. It knows that you're going to ask it to manipulate images of yourself or things that you're working on, which I find makes it really, really useful. So overall, I'm really impressed with it.
If you learned anything new or appreciated the podcast, I would really appreciate it if you could leave a rating and review wherever you get your podcasts. They help the show out a ton to get found by more amazing people like yourself. And as always, make sure you go check out AIbox.ai to get access to 40 of the top AI models for 20 bucks a month. Thanks so much for tuning in. I'll leave a link in the description to AIbox.ai and I hope you have a great rest of your day.