ChatGPT Allows Users to Modify DALL-E Images - The Mark Cuban Podcast

Summary (5 min read)

The Mark Cuban Podcast: "ChatGPT Allows Users to Modify DALL-E Images"

Release Date: April 9, 2024

Introduction to the New Feature

In this episode of The Mark Cuban Podcast, Mark Cuban discusses the latest update from OpenAI: the ability to edit DALL-E images directly within ChatGPT. Announced on LinkedIn and X (formerly Twitter), the feature is available on the web, iOS, and Android versions of ChatGPT at the same time, rather than arriving on the web first and on mobile later. Cuban, who tested the web version, reads the simultaneous rollout as a sign that OpenAI wants all of its users to try the feature as soon as possible.

Mark Cuban [00:01]: "OpenAI just released a new update a couple of hours ago... it really says, like, we want all of our users to use this as soon as possible."

Demonstration and User Reactions

Cuban recounts OpenAI's demonstration of the new image editing capabilities. In it, a generated image of a "cute poodle celebrating a birthday" is edited by highlighting two spots on the dog's head and asking for bows to be added. DALL-E places the bows exactly where the highlights were, which Cuban finds impressive. The roughly minute-long clip shows the generation in real time, and one commenter praised OpenAI for not speeding it up.

Mark Cuban [00:03]: "It's able to go and actually generate bows that appear on the dog's head exactly where they highlighted, which is impressive."

User Comment: "I appreciate that OpenAI chose to not speed up the video, demonstrating the generation process in this preview."

Cuban juxtaposes this with Google's Gemini platform, which faced backlash for misleading demonstrations. He underscores the importance of authenticity and transparency in building trust, particularly in the AI domain.

Mark Cuban [00:05]: "When we found out that Google Gemini's demo was essentially faked... it lost a lot of trust."

Comparison with Competitors

While praising OpenAI's advancements, Cuban acknowledges that other platforms like Midjourney still hold significant merit in image generation. He considers Midjourney the stronger tool for image generation overall, while noting that the new editing feature does some things he has not seen Midjourney do.

Mark Cuban [00:10]: "To be honest, I think Midjourney still is the best when it comes to image generation by quite a bit."

Hands-On Testing of the Feature

Cuban shares his own experience testing the feature by generating an image of a pirate ship in battle with Blackbeard and his crew. At first he could not find the editing tool; it appears only after clicking the image to expand it to full view, where a "select" tool sits in the top right-hand corner. The tool's brush size can be adjusted, from a large selector down to a small one for finer details. His first edit, selecting Blackbeard's face and asking for a "grinning scowl," produced an imprecise result: the regenerated face was not detailed enough to tell which expression it showed.

Mark Cuban [00:15]: "I selected his face and said to give him a grinning scowl... the details are so not precise that you can't really tell if he's scowling or grinning."

However, when he attempted more substantial modifications, such as turning the pirate ship pink and transforming it into a car, the results were more satisfying. The altered image maintained coherence, albeit with creative and unconventional elements.

Mark Cuban [00:20]: "I told it to turn essentially the pirate ship to be pink and make it a car... nothing in the image looks wrong or broken."
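
The selection-based editing described in this section can be pictured as masked compositing: pixels inside the brushed selection come from a newly generated image, and every pixel outside it is kept from the original. The Python sketch below is an illustration only. The episode does not describe how ChatGPT or DALL-E implement the feature, and all names here (brush_mask, apply_edit, the character "pixels") are hypothetical.

```python
from typing import List

Pixel = str  # hypothetical stand-in for a real colour value
Image = List[List[Pixel]]
Mask = List[List[bool]]


def brush_mask(width: int, height: int, cx: int, cy: int, radius: int) -> Mask:
    """Mark every pixel within `radius` of the brush centre (cx, cy) as selected."""
    return [
        [(x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 for x in range(width)]
        for y in range(height)
    ]


def apply_edit(original: Image, regenerated: Image, mask: Mask) -> Image:
    """Use regenerated pixels inside the selection and original pixels elsewhere."""
    return [
        [new if selected else old
         for old, new, selected in zip(orig_row, new_row, mask_row)]
        for orig_row, new_row, mask_row in zip(original, regenerated, mask)
    ]


if __name__ == "__main__":
    # Assumption: "." is untouched original content, "#" is regenerated content.
    original = [["."] * 5 for _ in range(5)]
    regenerated = [["#"] * 5 for _ in range(5)]
    small_brush = brush_mask(5, 5, cx=2, cy=2, radius=1)
    for row in apply_edit(original, regenerated, small_brush):
        print("".join(row))
```

A larger radius selects more of the image, which loosely mirrors the contrast in the episode between a small selection (Blackbeard's face) and a large one (the whole ship).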

Potential Impact on Graphic Design

Cuban envisions significant disruption in the graphic design industry. He suggests that tools like Canva and Photoshop may be challenged as AI-driven platforms offer more intuitive and flexible image editing. Being able to change a selected region without regenerating the entire image streamlines the creative process, potentially saving time and resources. He notes one current limit, however: he was unable to edit images he uploaded to ChatGPT, so the feature does not yet replace Photoshop for tasks like retouching personal photos.

Mark Cuban [00:30]: "This is going to really take image generation to the next level... This might be the way graphic design is going with Canva, Photoshop and these other tools I think are going to get very disrupted."

Future Developments and Multimedia Integration

Looking ahead, Cuban speculates on the next steps for AI in multimedia. He anticipates that video generation with tools such as Sora will follow a similar path, letting users select specific elements within a video and change them with a prompt. Altering an actor's attire or changing what happens in a scene with a simple prompt is, he believes, where the industry is headed.

Mark Cuban [00:35]: "Next it's gonna be video... you'll be able to select a character and change it with a prompt... it's going to be very fascinating."

Cuban emphasizes the broad scope of potential disruptions across various media formats, including audio and interactive content, positioning OpenAI's current advancements as just the beginning.

Conclusion and Future Outlook

In wrapping up, Cuban underscores his enthusiasm for the rapid advancements in AI-driven image and multimedia editing. He anticipates ongoing innovations that will not only enhance creative workflows but also reshape entire industries. Cuban pledges to keep his audience informed about these developments, highlighting the transformative potential of AI technologies.

Mark Cuban [00:40]: "I'll definitely keep you up to date on everything that is happening in this field... we're going to see a lot of disruption, whether that's video, image, audio, multimedia."

Cuban concludes by encouraging listeners to engage with the podcast through likes, follows, and reviews, fostering a community invested in staying at the forefront of technological innovation.

Key Takeaways

OpenAI's Update: Introduction of image editing capabilities within ChatGPT, available across web, iOS, and Android platforms.
User-Friendly Features: The "select" tool allows for precise edits, enhancing the flexibility of image generation.
Trust and Authenticity: OpenAI's transparent demonstration contrasts favorably against competitors that have faced credibility issues.
Industry Impact: Potential significant disruption in graphic design and multimedia industries due to more intuitive AI-driven tools.
Future Prospects: Anticipation of AI advancements extending to video and other media formats, further revolutionizing content creation.

This episode provides a comprehensive overview of OpenAI's latest advancements in integrating image editing within ChatGPT, highlighting the practical applications, industry implications, and future possibilities of AI-driven creative tools. Mark Cuban's insights emphasize both the current capabilities and the transformative potential of these technologies.


Transcript

[00:01]
A
OpenAI just released a new update a couple of hours ago. I have not heard anybody talking about this, but I think it's absolutely fascinating is that you can now edit DALL-E images in ChatGPT in a very interesting new way. I've seen this with some other programs. This is the first time I've seen OpenAI getting into this, and it's really powerful. So I want to tell you a little bit about what they're doing and why I think this is important. So the first thing that I'll say is that they kind of made this announcement on LinkedIn and on X, they said you can now edit DALL-E images in ChatGPT across web, iOS and Android. So this is impressive. This is, you know, sometimes people roll out, you know, an update just to the web version and it comes to mobile later. This is already out on web. I've been playing with it, testing it, and apparently it's out on iOS and Android, which I haven't been using, but I highly recommend other people check it out if you have the app. This is amazing. It's going to go into so many more people's hands. When I see them do a big rollout like this to all platforms and make it really says, like, we want all of our users to use this as soon as possible. So essentially what you're going to be able to do here is you're going to be able to, once you generate an actual image, you're going to immediately be able to select parts of that image and edit them. So they give a demonstration where they have a dog that they've generated. They're like, you know, create an image of a cute poodle celebrating a birthday. So it's like a dog with a hat and, you know, celebrating his birthday. They then go to edit that and they highlight two spots on the dog's head and they say, add bows to it. Now, a lot of people have been commenting on the demonstration they've done because they released essentially a clip to social media of this whole, you know, generation happening. And the video is like over or it's like a minute long.
And literally most of the video is just you sitting there waiting, watching this generation. But it's able to, you know, go and actually generate bows that appear on the dog's head exactly where they highlighted, which is impressive. So what I do want to say is a lot of people commenting on this video. There's some interesting comments on it. I think, all in all, people are kind of happy that they did this. Someone on the comments said, I appreciate that OpenAI chose to not speed up the video, demonstrating the generation process in this preview. This shows integrity and helps set realistic expectations for the product's capabilities. An authentic preview goes a long way with potential users. Trust and credibility is key in the age of AI. I actually agree with this. We had Google Gemini come up with a demo of their platform, and they got absolutely roasted because it was, you know, this platform where you could talk to it, it could see what you were seeing, it could create images and video and, like, it would do. It was doing all this crazy stuff. And then we found out that it was essentially faked or staged. They highly edited the video. They asked it questions before they. They essentially gave it, like, way longer prompts than they were telling us they were giving it. So it just looked like they could say, you know, what's this? And then it would say, oh, that's like you playing rock, paper, scissors. But in reality, they're like, I'm playing a game with my hands. It's very popular. What is it? And then it would respond, but they would cut out, like, all the context. Anyways, it was just really sketchy, and it lost a lot of trust, I think, from Google and Gemini. I'm sure they've learned their lesson. They're not gonna do that. But I think OpenAI and other AI companies are also learning their lessons. And when they're giving these demos now, I think it's really interesting that it's, you know, they're literally just letting you watch the.
They know that people would rather watch a full minute of an image loading than have to, you know, know that it's fake. So we know this is real. So I went and tested this new feature. I think it was really impressive. I just went to ChatGPT. I'm like, oh, my gosh, is this available right now? And at first I thought it wasn't, to be honest. I had to go back and watch the video again to learn how to do it. So I'll let you know in case you want to try this. But I went and said, you know, create a photo of a pirate ship in battle with Blackbeard and his crew. It generated the image for me, and at first I was like, oh, there's no way to edit this. What you actually have to do is click on the image itself, and it will then expand to full view. And in the top right hand corner, there's something called select, which is essentially a tool where you can change the size of the paintbrush so you can make it like a really big selector or you can change the paintbrush to be really small if you want to get some like, smaller details in the image. I did a bunch of different things. One example was at first I selected. So I had to generate Blackbeard on a pirate ship. I selected his face and said to give him a grinning scowl. Now in the second version of the image that was generated, to be a hundred percent honest, I mean, he's got a beard covering his mouth, but like the details are so like not precise that you can't really tell if he's scowling or grinning or whatever. I'm going to be honest, I think Midjourney still is the best when it comes to image generation by quite a bit. But, but this is quite an impressive feature and I'm. It's. There's some things here that I'm not seeing Midjourney do. So for that reason I do think it's interesting. I wanted to test it with something maybe a little bit more obvious.
So I actually just went and selected the entire pirate ship, including the mast, and I just kind of went and selected the whole thing and I told it to generate for me to turn essentially the pirate ship to be pink and make it a car. So it actually was able to do that. And it's, you know, it actually kind of looks like a car is just crashing into the pirate ship, which I guess is fine, whatever, it's its own like rendition. But to be fair, nothing in the image itself that I generated, while it looks like funny that a car's crashing into a pirate ship, nothing in it looks like wrong or like broken, I guess is the best way for me to explain it. Like Blackbeard is still standing on top of the car. There's some weird things coming out of it. So I do think Midjourney is better for image generation. But I'm very, very impressed with this tool and I think you can, I think you'll be able to do some really impressive things. Now something else that I think is quite interesting is the fact that you can do, you know, ChatGPT is like linking in with DALL-E. So you actually can do image uploads, right? Meaning you can select an image and upload it to ChatGPT. Now when I originally discovered this, I wanted to see, you know, if it would be able to edit images that you uploaded. I wasn't actually able to see this exact capability. So for some people, like, I think on the, on the LinkedIn post, people were saying, great, I no longer have to spend hours explaining to my sister how to use Photoshop to edit her vacation photos. I thought this was kind of funny, but at the same time, it's not like she could go and just upload her images in there and edit it, right? So it's not like it's completely taken over Photoshop. Although there's the tool. The selector tool reminds me a lot of the selection tool from Photoshop, if you're familiar with that.
But unfortunately, when you do something, for example, like when you upload an image, you're not actually able to go and change that image, which is honestly kind of unfortunate because I was looking forward to that particular feature and thinking that it would be pretty interesting to be able to go edit. Otherwise the images are just there. So I'm sure this is a feature they're gonna be adding in the future. There's all sorts of getarounds. There's ways you can do this with Midjourney specifically, and there's a lot of different tools out there where you can upload photos of yourself and have it edit them. I was unfortunately unable to do this directly within ChatGPT. All in all, an amazing feature I'm really excited about. I think this is going to really take image generation to the next level, because now instead of just generating an image and hoping it gets exactly what you want, you can generate the image. And in the past you would say, okay, do it again, but change this, do it again, but change that. And every time it would regenerate, it wasn't the exact same, and it wouldn't change exactly what you wanted. Now you can literally select the part of the image you want to change and it can change it. I think this is going to be big for graphic design. This might be the way graphic design is going with Canva, Photoshop and these other tools I think are going to get very disrupted. So I think that there are, you know, hundreds of millions of dollars in this area that is going to get disrupted. Whether it's today or tomorrow. I can see a world where OpenAI releases a lot more, many more of these image generation and editing tools, which I think is going to be really powerful. And you also have to start extrapolating where this is going, which, you know, right now it's like, okay, cool image, but next it's gonna be video. So when you're doing Sora and you're doing video generation, I assume they're gonna kind of follow the same precedent.
You'll be able to select areas within the video and say, okay, you know, I have the actor and he's like, running and, you know, skydiving off of a building. Now, I want him to be wearing, like, a red shirt. Okay. I want it to be a blue shirt. Okay. I want him to be jumping into a helicopter. Like, it's gonna be very fascinating to. To see how that actual video generation flow works, but I imagine they'll do things like this where you select a character and you change it with a prompt and it's going to change what's happening in the video. So very exciting times. A lot coming down the pipe. I'll definitely keep you up to date on everything that is happening in this field. I think that we're going to see a lot of disruption, whether that's video, image, audio, multimedia, so many areas. Thanks so much for tuning in. If you wouldn't mind, I really, really appreciate it. If you could hit the like button if you're on YouTube. Follow us if you're on Apple Podcasts or Spotify and leave us a review or a comment. I really appreciate every single comment, every single review. Read them all and I try to respond. Hope that you all have an amazing rest of your day.