Transcript
A (0:04)
Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today I have a really fun mini episode where I'm going to answer the question on everyone's mind: which of these new models is actually the best designer? I'm going to take a page on my site that I don't think is particularly well designed and have Gemini 3, Opus 4.5, and Codex 5.1 duke it out and see which one can redesign my page better. One shot. Let's get to it.
B (0:37)
This episode is brought to you by Lovable. If you've ever had an idea for an app but didn't know where to start, Lovable is for you. Lovable lets you build working apps and websites by simply chatting with AI. Then you can customize it, add automations, and deploy it to a live domain. It's perfect for marketers spinning up tools, product managers prototyping new ideas, or founders launching their next business. Unlike no-code tools, Lovable isn't about static pages. It builds full apps with real functionality.
A (1:08)
And it's fast.
B (1:10)
What used to take weeks, months, or
A (1:12)
even years, you can now do over the weekend.
B (1:15)
So if you've been sitting on an idea, now's the time to bring it to life. Get started for free at lovable.dev.
A (1:22)
That's lovable.dev. If you've been paying attention the last couple of weeks, it seems like every single model provider has released a brand new coding model. And what I heard the most from people is, sure, they're fast, and sure, they're great, and sure, they're beating benchmarks, but they are also really good at design. If you've been on X or social media, you've probably seen these beautifully designed landing pages, apps, and user experience components generated using Gemini 3 or Opus 4.5 or even Codex 5.1. And I thought, let's put these side by side and actually see which one's better at redesigning an existing page. I think it's easy to one-shot something and make it look beautiful, especially if you're a great prompter and know exactly what to say as a designer. But if you have an existing site and you want to make it better, who's your trusted design engineer? Which of these models is really going to do the trick? And I'm going to show you what I think today, in a couple of minutes, on which of these models is the better designer, or redesigner, of a page that I don't think is really great. So this is the ChatPRD blog. It is not very good. I don't think this is a very beautiful site. It's not my favorite. I think it could be a lot better from a functional perspective, but it could also be a lot better from a design perspective. And you know, if I had a team, which I have a little small one, but if I had a team that was not AI, I might send this to a designer and say, hey, we just launched this early on, it's not great. Can you redesign it? And so I wanted to test that flow with some of the new models that have come out that have said that they are better designers than previous versions. And so I fired up Cursor and I did a model-by-model comparison of redesigns, and I used the exact same prompt and the exact same input code. And we're just going to see which one we think is the better designer. So I'm going to show you my prompt here in Cursor.
It was pretty straightforward. It was this: redesign the blog page (so I just showed it the directory where our blog pages are) to improve both the visual appeal and user experience. So sort of both: will it look nicer, and will it be functionally a little easier to use? And then I added a functional component to it, which was: add best practices for SEO and navigation. And then I did that for three different models. I did it for Gemini 3 Pro, I did it for Opus 4.5 from Anthropic, and I did it for GPT-5.1 Codex. These are all recently released models that have been said to be the best-in-class models from OpenAI, from Anthropic, and from Google. And so we're going to see exactly what they did. I started with Gemini 3 Pro. The reason why I started with Gemini 3 Pro is I've heard over and over and over again what a great designer Gemini 3 Pro is, and I really wanted to see what it did. And so you can see here it thought quite a bit about visual design, user experience, SEO, and navigation. It looked at the code and it started executing. So it started writing some code, and we're going to switch over and see exactly what it generated. So it generated this. This was the before, if you recall: very, very boring, not very good. And in the after, it generated a nice hero image of the most recent blog post. So there's now this highlighted blog post at the top and then these cards at the bottom. And there are a couple of improvements I see here. There's some tagging here, there are some release dates, and there's this nice hover effect that zooms in on our featured images when you hover. It hasn't done anything regarding pagination, which is current functionality, and it doesn't really take into account whether or not we have featured images and making that look good. So there are some things there that could be improved, but I think overall it's pretty good.
One thing that I noticed that it did that I did not love is that there's this tag at the very top of the page, and it's just a little too tight with the rest of the navigation. So one of my reflections here is, you know, it doesn't have the full visual context of the page, but it did a pretty nice job and it was very fast. But I have to say, despite Gemini 3's reputation for being the best designer, it was actually not my favorite. So I ran the exact same query in Cursor with Opus 4.5. So if you look up here: redesign the blog to improve both the visual appeal and UX, and add best practices for SEO and navigation. Now, the difference that I thought was really interesting when using Gemini 3 versus Opus 4.5 is that Opus 4.5 actually triggered a to-do list inside Cursor. So it did a tool call to create a to-do list, and it gave a step-by-step flow it was going to follow. So Gemini 3 sort of did that chain-of-thought reasoning and then just YOLO'd the code. Opus 4.5 created four to-dos. The to-dos were: redesign the blog listing page, improve the blog layout, enhance the post display, and add comprehensive SEO (structured data, canonical URLs, and meta tags). And so it was very precise, step by step, on what it was going to do in terms of implementing. And so I think the planning capabilities of Opus 4.5 are certainly better. I think Anthropic has really differentiated themselves as experts in coding models. You know, if I wanted to get the best outcome here, I probably should have done this in Claude Code, because I think there are some optimizations they've done there recently as well. But I thought it was really interesting that the output of a planned implementation was much better than the output of a straight-shot, one-shot implementation. And so you can see it went step by step and actually checked off those changes and then provided me a summary of changes. And I'm going to switch and show you exactly what that looked like, because I was actually impressed by the design.
So this is what we got from Opus 4.5, which, spoiler alert, I think from all the models was the most beautifully designed blog page that I got, and also honestly the most functional from an SEO perspective. And so what you can see that Opus 4.5 did here is it pulled some images. We have a repository of beautiful background images and featured images that we use throughout the ChatPRD website, and it actually pulled and looked for assets that it could bring in that would look nice. These rings are some design elements that we use commonly. And so it pulled in some interesting assets. If you recall, Gemini 3 just had a gradient background. Opus 4.5 actually added some imagery in the background. It's a very similar concept in terms of the layout. So you see, again, a featured article that is the most recent blog post. Again, three-column cards with the zoom-in trick. So I guess people like it. But if you look at this, there are a couple of nice design tweaks that Opus 4.5 added. When you hover, not only does the image zoom in, but it gives you this nice little call to action here, this little arrow. I think it is so cute. It just does that nice little hover treatment on the anchor link for the blog post. Again, tags are in. And then it did a little bit more on the SEO side, and I will wrap back around to the SEO changes that each of them made. But if you see here, not only do you have the author, which is me, Claire Vo, and the date, which we also saw in the Gemini 3 option, but it also has an estimated reading time and a link. And so I just think the quality of the design here went probably 20 or 30% further than the Gemini 3 model went. And it's those nice edge touches that I feel like AI can add into any design that just make it so much nicer to work with. And I was really impressed with Opus 4.5 in terms of the quality of the detail orientation. Now let's go down. You know, one of the things that it did is it handled missing images a little smarter than Gemini 3 did.
So if you recall, Gemini 3 kind of collapsed these cards here. It did not put placeholder images in here. With Opus 4.5, it saw that we were missing images for some of our blog posts and put a little placeholder with a nice little book icon here, which I think is lovely. It makes these cards just look a lot nicer and is really well designed. So overall, I think that Opus 4.5 did an excellent job out of the box of redesigning a page. And not only redesigning the page, but really thinking about the functional components of it, and I think a lot of that goes to its planning mode and its ability to call tools and then do some of these implementations step by step. Now let's get to the last model that I tested, which was Codex 5.1 Pro. So again, same prompt here: redesign the blog to improve the visual appeal and UX and add best practices for SEO. It's GPT-5.1 Codex, the leading coding model from OpenAI. Again, Codex, like Opus 4.5, thought and generated to-dos. The to-dos were a little less granular than the ones from Opus. So if you look at Opus, the to-dos were: redesign the blog listing page, with specifics about how it was going to redesign; improve the blog layout; enhance a specific component; and then add SEO. The plans for 5.1 Codex were a little bit more general. They were: investigate current layout, redesign, apply SEO. So I think the planning was just not as thoughtful from a design perspective as the planning was from Opus 4.5. And then if we actually look at the design, oh, OpenAI, you know I love you, some of my favorite models, but it did not do well on this redesign. And so you can see a couple of things that it didn't do well right out the gate. One, it gave me AI slop purple gradient. Like, we do not need any more purple-to-blue gradients in AI designs. We need to get them out of here. And so just the fact that we got AI purple is an immediate disappointment.
The other thing, and this may be a me problem, but I think we have a white wordmark and a better logo to use here. And you can tell here that the image it selected is just not nice on top of a colored background. Now, I do think that the headline and copy for the blog is really nice: "stories, playbooks and experiments from the team." So it gives a little bit more context. So this was the model that did the best copywriting, perhaps, but overall the design was not very good. And then again, it did a featured post here. This is the image from our most recent blog post. But there's no context, there's no call to action, and it doesn't link anywhere. And so I'm just really unsure what it was expecting users to experience. Now, it's repeated the featured block here. So again, I think these models really like it. I guess there aren't that many fancy things in blog design, and you have to have a featured image and then a three-row layout for your blog posts. So it did do the featured image here, but the problem is it added a bunch of these links, and I don't really understand how they work. They only do the featured image in each of these categories. The jumping's kind of weird. And then if you look at "Browse the library," it doesn't even show the blog posts that exist in our overall library. And so it's both not pretty.
