
Today on the AI Daily Brief, Gemini 3 has officially arrived. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Alright friends, quick announcements before we dive in. First of all, thank you to today's sponsors: Blitzy, Rovo, AssemblyAI, and Robots and Pencils. To get an ad-free version of the show, go to patreon.com/aidailybrief, or you can subscribe on Apple Podcasts. To learn about sponsoring the show and pretty much anything else about the show, including jobs, speaking opportunities, et cetera, visit aidailybrief.ai. As I mentioned yesterday, we are in the last week of accepting submissions, so if you want the full readout and report, go to roisurvey.ai and contribute your use cases now. Related to that, between this new benchmarking study and some other exciting things coming up, I have a few exciting projects for researchers slash analysts. If you have immediate availability and a research or data analysis background, shoot us a note at jobs@aidailybrief.ai, put "research" in the subject line, share your background, and I'll tell you a little bit more about what we're thinking. With that though, let's talk Gemini 3. Welcome back to the AI Daily Brief. Happy Gemini 3 Day to those who celebrate; it has been a long time coming. The final anticipated model drop of the year, barring some big surprise from OpenAI, has finally arrived as Google has announced Gemini 3. Now the focus has been on Gemini 3 for, at this point, months, really, ever since GPT-5 launched. In fact, the launch of GPT-5, which initially had a bit of a bobble around first impressions, ratcheted up the pressure in some ways on Gemini 3 for the entire industry. Over the last few weeks, though, the rumor mill has been swirling and hype has been building incredibly for this. An example from just a few days ago: marmaduke091 on Twitter says, okay, I'm saying it now with full confidence.
Gemini 3 will make every other LLM irrelevant and Nano Banana 2 will make every other image model irrelevant. It is not close. Google won. This is just my honest feelings after seeing things; nobody is ready. There were effectively infinite posts of that type, including from people who said that they had access and were seeing things. So much so, in fact, that in the last couple of days you've actually started to see a bit of a shift, where some folks have nudged in the other direction. Peter Wildeford shared the METR chart of the length of tasks a model can perform autonomously, saying: Gemini 2.5, released in June, was a great model, but it was a bit disappointing on the METR benchmark, below trend at 39 minutes. Where will Gemini 3 be? If on trend, it should be around 3 hours. My forecast is Gemini 3 remains a bit below trend at 2.7 hours. Synthwave DD went much further; on Monday they wrote: Almost every day I hear and see more things that lower my expectations for Gemini 3. That is once again the case today. And honestly, I'm not that excited for this launch anymore. The model regresses with every new checkpoint. DeepMind should be managing expectations. Instead we get more hype posting with little substance or transparency. Hopefully it will be a lot better in the real world, but my expectations there are similarly low now. To be clear, this was the exception going into this release, not the rule. Most people were incredibly hyped, and if you needed any more confirmation of that, last night the normally highly reserved Google DeepMind CEO Demis Hassabis tweeted: It's nearly 3 here. My favorite part of the night shift. Locked in. Demis does not hype post. One little coder writes: You guys even made the calmer Demis hype up. But let's talk for a second about the stakes of this launch for Anthropic and OpenAI. Google has been surging throughout the year, picking up market share, picking up mindshare, and picking up brand recognition, especially relative to ChatGPT.
For the first time earlier this year, the Gemini app actually jumped over ChatGPT for a little while in the Apple App Store charts. There is a lurking narrative surrounding OpenAI and Anthropic that the 800-pound gorilla, i.e. Google, will eventually just have too many resources and has a certain ultimate inevitability. And so the question coming into this for those companies would be: would Gemini 3 blow them out of the water? Would there still be areas where their models, especially for Anthropic their coding models, were on par with or still better than Gemini 3? And in general, what did this say about the state of the consumer AI app race? This launch also had some implications for Nvidia. While it wouldn't be as direct as the comparison between Gemini and ChatGPT, for example, Google is also building out the full stack around AI, including their own chips, called tensor processing units, or TPUs. If Google trained this model primarily on their TPUs and it was more performant than other models that were trained on Nvidia chips, maybe that would have implications for how the market viewed Nvidia. Speaking of the market, AI bubble talk continues to surge and continues to be the dominant theme in markets right now. Google CEO Sundar Pichai actually contributed to this, saying in a recent interview with the BBC, there is some irrationality in the current AI boom. The growth of AI investment has been an extraordinary moment. When asked what happens if the AI bubble pops, he said, I think no company is going to be immune, including us. We can look back at the Internet. There was clearly a lot of excess investment, but none of us would question whether the Internet was profound. I expect AI to be the same.
So this is a theme we've heard from lots of leaders before: that, yes, there might be irrational exuberance when it comes to valuations, and there may even be a correction, but ultimately the technology underneath will be every bit as significant as people say it is now. That, plus broader macro factors, including Federal Reserve expectations, contributed on the day of the Gemini launch to a big tumble in stock prices led by the AI tech sector. And so the reason that Gemini 3 matters from that macro market perspective is that one of the things that would confirm for some investors their concerns about an AI bubble would be if it appeared that we had reached some meaningful plateau. In fact, the conversation around GPT-5 being underwhelming relative to expectations helped ratchet up this latest generation of AI valuation and AI bubble fears. So you've got implications for their AI model competitors, implications for their infrastructure competitors, and implications for the market as a whole. For Google itself, honestly, I kind of think the stakes were the lowest. Google has basically locked in an incredibly good year on their AI products. The number of users has gone up, the amount of tokens they're processing has gone up, and unless Gemini 3 was an absolute total flop, it's hard to see that trend changing. Now, of course, that doesn't mean that they couldn't use this moment to really reinforce a new leapfrog position. But when it comes to downside risk, I was actually least concerned about Google heading into this announcement. Now, pretty much everyone knew it was coming today. And indeed, at 8am Pacific time, 11am Eastern time, Sundar Pichai tweeted: Introducing Gemini 3. It's the most powerful model in the world for multimodal understanding and our most powerful agentic and vibe coding model yet. Gemini 3 can bring any idea to life, quickly grasping context and intent so you can get what you need with less prompting.
The companion blog post was called "A new era of intelligence with Gemini 3." In the note, we got a few statistics. The Gemini app is now up to 650 million users per month; that's about 50 million more than last we heard. And Sundar also said that 13 million developers have now built with their models. Now, one important thing: it's clear that Google has learned from their own launches in the past that the market does not like when you make announcements and say "coming soon." Sundar writes: Starting today, we're shipping Gemini at the scale of Google. That includes Gemini 3 in AI Mode in Search, with more complex reasoning and new dynamic experiences. This is the first time we are shipping Gemini in Search on day one. Gemini 3 is also coming today in the Gemini app, to developers in AI Studio and Vertex AI, and in our new agentic development platform, Google Antigravity, which I will get into in a little bit. In Demis's section of the announcement post, he calls this another big step on the path to AGI and gets into the monster set of benchmarks that we'll get into in just a moment. Now, interestingly, the blog post is really focused on what you can do with this model, organized around learning anything, building anything, and planning anything. In his announcement tweet, Demis said that beyond the benchmarks, it's been by far his favorite model to use for its style and depth. As an example, he writes: I've been doing a bunch of late night vibe coding with Gemini 3 in Google AI Studio and it's so much fun. I recreated a testbed of my game Theme Park, which I programmed in the 90s, in a matter of hours, down to letting players adjust the amount of salt on the chips. In his announcement post, Google VP Josh Woodward discusses six additional features that come with the launch. The first is Generative Interfaces, a new type of experimental interface that's generated on the fly and adapts to the user's needs based on the prompt.
Gemini Agent, which he writes is an experimental tool that orchestrates and completes complex multi-step tasks. A new look for the Gemini app, better shopping results, support for 23 more languages, and a new promotion where US college students get a free year of AI Pro. This episode is brought to you by Blitzy, the enterprise autonomous software development platform with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale codebases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and precompiles code for each task. Blitzy delivers 80% plus of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice. To bring an AI-native SDLC into their org, visit blitzy.com and press "Get a demo" to learn how Blitzy transforms your SDLC from AI-assisted to AI-native. Meet Rovo, your AI-powered teammate. Rovo unleashes the potential of your team with AI-powered search, chat, and agents, or build your own agent with Studio. Rovo is powered by your organization's knowledge and lives on Atlassian's trusted and secure platform, so it's always working in the context of your work. Connect Rovo to your favorite SaaS apps so no knowledge gets left behind. Rovo runs on the Teamwork Graph, Atlassian's intelligence layer that unifies data across all of your apps and delivers personalized AI insights. From day one, Rovo is already built into Jira, Confluence, and Jira Service Management Standard, Premium, and Enterprise subscriptions. Know the feeling when AI turns from tool to teammate?
If you Rovo, you know. Discover Rovo, your new AI teammate powered by Atlassian. Get started at rovo.com. If you're building anything with voice AI, you need to know about AssemblyAI. They've built the best speech-to-text and speech understanding models in the industry, the quiet infrastructure behind products like Granola, Dovetail, Ashby, and Cluely. Now, as I've said before, voice is one of the most important modalities of AI. It's the most natural human interface, and I think it's a key part of where the next wave of innovation is going to happen. AssemblyAI's models lead the field in accuracy and quality, so you can actually trust the data your product is built on. And their speech understanding models help you go beyond transcription, uncovering insights, identifying speakers, and surfacing key moments automatically. It's developer-first, no contracts, pay only for what you use, and scales effortlessly. Go to assemblyai.com/brief, grab $50 in free credits, and start building your voice AI product today. Today's episode is brought to you by Robots and Pencils. When competitive advantage lasts mere moments, speed to value wins the AI race. While big consultancies bury progress under layers of process, Robots and Pencils builds impact at AI speed. They partner with clients to enhance human potential through AI, modernizing apps, strengthening data pipelines, and accelerating cloud transformation. With AWS-certified teams across the US, Canada, Europe, and Latin America, clients get local expertise and global scale. And with a laser focus on real outcomes, their solutions help organizations work smarter and serve customers better. They're your nimble, high-service alternative to big integrators. Turn your AI vision into value fast. Stay ahead with a partner built for progress. Partner with Robots and Pencils at robotsandpencils.com/aidailybrief.
Now, one thing that was interesting to some folks is that Google launched this huge model with just blog posts and videos; no live stream, no presentation. But as it turns out, there was plenty for people to talk about even without that. If you are a regular listener, you'll know that I'm very skeptical of the ultimate value in benchmarks when it comes to a new model. I think most of our benchmarks are getting fairly saturated, and they don't necessarily tell you how a model is going to interact with your particular use cases. In other words, while they can be useful guides, they're no substitute for just getting in there and trying the thing out. However, to the extent that benchmarks can make a splash, boy do Gemini 3's benchmarks make a splash. Ras Rex sums up the attitude of basically everyone that I've seen online, writing: The Gemini 3 Pro benchmark results are genuinely unreal. Google didn't just catch up today, they walked into the arena and rewrote the difficulty settings. To take a few examples: on the academic-reasoning-focused Humanity's Last Exam, GPT-5.1 was at 26.5%; Gemini 3 Pro is at 37.5%. On MMMLU, one of the more saturated benchmarks for multilingual Q&A, Gemini 3 Pro nudges out GPT-5.1, 91.8% to 91%. On GPQA Diamond, for scientific knowledge, GPT-5.1 is at 88.1% versus 91.9% for Gemini 3 Pro. And basically this story repeats itself across the board. The only benchmarks where it either shared the title or was behind were AIME 2025 with code execution, where it tied with Sonnet 4.5 at 100%, and SWE-bench Verified, where at 76.2% it was just ever so slightly behind both GPT-5.1 at 76.3% and Sonnet 4.5 at 77.2%. Now, before you think that means it's worse at coding: on another coding benchmark, Terminal-Bench 2.0, which is agentic terminal coding, Gemini 3 Pro set a new standard at 54.2%, compared to Sonnet 4.5's 42.8% and GPT-5.1's 47.6%.
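As a quick sanity check, the head-to-head numbers quoted above can be dropped into a small script that ranks the models per benchmark and computes the leader's margin over the runner-up. This is a minimal sketch using only the scores as quoted in this episode; the model and benchmark names are plain labels, not any official API.

```python
# Benchmark scores (%) as quoted in the episode; illustrative only.
scores = {
    "Humanity's Last Exam": {"Gemini 3 Pro": 37.5, "GPT-5.1": 26.5},
    "MMMLU":                {"Gemini 3 Pro": 91.8, "GPT-5.1": 91.0},
    "GPQA Diamond":         {"Gemini 3 Pro": 91.9, "GPT-5.1": 88.1},
    "SWE-bench Verified":   {"Gemini 3 Pro": 76.2, "GPT-5.1": 76.3, "Sonnet 4.5": 77.2},
    "Terminal-Bench 2.0":   {"Gemini 3 Pro": 54.2, "GPT-5.1": 47.6, "Sonnet 4.5": 42.8},
}

def leader(bench):
    """Return (model, score, margin over runner-up) for one benchmark."""
    ranked = sorted(scores[bench].items(), key=lambda kv: kv[1], reverse=True)
    (top, top_score), (_, second_score) = ranked[0], ranked[1]
    return top, top_score, round(top_score - second_score, 1)

for bench in scores:
    model, score, margin = leader(bench)
    print(f"{bench}: {model} leads at {score}% (+{margin} pts)")
```

On these numbers, Gemini 3 Pro leads everywhere except SWE-bench Verified, where Sonnet 4.5 stays on top by under a point.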
Now, beyond the biggies, some people noticed a few where the jump really stood out. Matt Schumer writes: One of Gemini 3's biggest leaps is its towering score on ScreenSpot-Pro. This, by the way, is a measure of a model's ability to understand what's going on on a screen. The previous state of the art was Sonnet 4.5 at 36.2%; Gemini 3 Pro got 72.7%. It just massively accelerated my timeline to full computer-using agents, Schumer says. Another one that stood out to many was the performance on ARC-AGI-2. ARC-AGI is, of course, explicitly designed to be difficult for computers, and while GPT-5.1 mustered a 17.6%, Gemini 3 Pro got a 31.1%. ARC Prize founder François Chollet called it impressive progress. On the VPCT spatial reasoning test, Gemini 3 absolutely smashed the test at 91%, compared to 66% for GPT-5.1 on high. And when it comes to user preference overall, LMArena writes: Breaking: Google DeepMind's Gemini 3 Pro is now number one across all major Arena leaderboards. Number one in text, vision, and web dev; number one in coding, math, creative writing, long queries, and nearly all occupational leaderboards. Massive gains over Gemini 2.5. And indeed, Google also released a Deep Think mode for Gemini 3 that had even higher results. For example, the Deep Think mode drove the score on ARC-AGI-2 up to 45.1%. Again, Matt Schumer writes: The last time we saw a capability jump of this magnitude was the release of GPT-4 in March 2023. We're entering a new era. In addition to LMArena, we also got the independent review from Artificial Analysis. They put it quite clearly: Gemini 3 Pro is the new leader in AI. In their aggregate score, Gemini 3 Pro shows up three points ahead of GPT-5.1. Going back to that concern from the market bubble people, Simon Smith writes, so I guess we haven't hit a wall. Now, at the time of recording, this model has been out to the general public for less than an hour. In fact, outside of Google AI Studio, I'm not seeing it yet in any of my literally four different Gemini accounts.
So aside from early preview tests, people haven't really had a chance to get their hands on it. There are folks who have had a few reps, though, and here are some of the things that they've shared. Dan Shipper and the Every team were testing it earlier this morning and said that so far, on coding, it's extremely fast and the quality seems very high. On long context understanding, it has a, quote, massive context window and it can use it. Dan writes: I found it was able to find, synthesize, and use pieces of information in a long book draft that other models couldn't. Now, interestingly, Dan says that when it comes to writing, in their early tests, it's not as good of a writer or editor as Sonnet or Haiku. Dan says it appears worse at telling whether writing is interesting. Far El started to play around with it on AI Studio and said it's so freaking fast and is actually pretty good. Schumer again writes: Gemini 3's biggest differentiator, in my opinion, is that it's far easier to get it out of the standard slop style for writing, coding, etc. Just ask for what you want and it'll make it happen. Referencing the tendency of AI coding models to design purple-shaded interfaces: no more purple gradients. He also gave the example of it redesigning his personal website. Matt actually wrote a blog post about his last three days using the model in preview. The TL;DR, he writes: Gemini 3 is a fundamental improvement on daily use, not just on benchmarks. It feels more consistent and less spiky than previous models. He says it's fast; intelligence per second is off the charts, often outperforming GPT-5 Pro without the wait. Front-end capabilities, which we were just discussing, he says are excellent, with it nailing design details, micro-interactions, and responsiveness on the first try. This might not be great for the people who love 4o, but for the work user, Matt writes, it respects your time and doesn't waste tokens on flowery preambles. Lastly, he says, creative writing is finally good.
It doesn't sound like AI slop anymore. The voice is coherent and the pacing is natural. Now, of course, that sounds a little bit different than what we heard from Dan, and also different than what Murdocan Koilan found. He wrote: I'm doing my writing-with-constraints tests and Gemini 3 Pro is not living up to the hype, at least for this task. He basically talks about a writing test that he's created and goes in depth about how he thinks it fails, coming to the conclusion: too early to say, but this model lacks the taste, restraint, and structural intelligence necessary for challenging creative work. My big takeaway is that writing is one area where we're going to have to test it a little bit more. Now, a lot of the folks who are sharing their first impressions are focused on improvements in its coding ability and improvements in its design sensibility. Pietro Schirano writes: I asked Gemini 3 Pro to create a 3D Lego editor in one shot. It nailed the UI, complex spatial logic, and all the functionality. Reiterating something that we heard from others, Pietro concludes: we're entering a new era. He also said it's amazing at games. It recreated the old iOS game called Ridiculous Fishing from just a text prompt, including sound effects and music. He continues: it also did something I've seen other LLMs struggle to do before. It built a fully functional Game Boy emulator. And yes, it even drew the Game Boy as an SVG. Flavio Adamo also tested it building games, showing off not only the ability to build a game engine but also its understanding of spatial physics, with a game where you rotate a ball running through a tunnel to try to avoid oncoming objects. Flavio said: built this fun game in literally five minutes, and it's way better at coding than I expected. Now, on the coding topic, one of the things that came alongside Gemini 3 was a new Google-native IDE called Antigravity.
In their announcement post, Google wrote: As model intelligence accelerates with Gemini 3, we have the opportunity to reimagine the entire developer experience. Today we're releasing Google Antigravity, our new agentic development platform that enables developers to operate at a higher, task-oriented level. Using Gemini 3's advanced reasoning, tool use, and agentic coding capabilities, Google Antigravity transforms AI assistance from a tool in a developer's toolkit into an active partner. While the core of Google Antigravity is a familiar AI IDE experience, its agents have been elevated to a dedicated surface and given direct access to the editor, terminal, and browser. Now agents can autonomously plan and execute complex end-to-end software tasks simultaneously on your behalf, while validating their own code. Finally, they point out that in addition to taking advantage of Gemini 3 Pro, Antigravity also has access to their latest computer use model for browser control, as well as Nano Banana for image generation. Now, this might have been the part of the announcement that a lot of, at least, the developers and builders were most excited for. In his post, Google AI Studio lead Logan Kilpatrick wrote: Antigravity is a faster way to develop. You act as the architect, collaborating with intelligent agents that operate autonomously across the editor, terminal, and browser. These agents plan and execute complex software tasks, communicating their work to the user via detailed artifacts. This elevates all aspects of development, from building features, UI iteration, and fixing bugs to researching and generating reports. So is this the end for Cursor? Hyperbole aside, people do seem excited about this new launch. The AI for Success account wrote: I was one of the early testers and this thing is crazy good. Sterin says an IDE isn't actually the right framework here.
He writes: I've been using Google Antigravity for a few days, and to me it's not an IDE, it's a coding agent UI powered by Gemini 3 Pro. Notably, he writes, the agent can control a Chrome browser, enabling it to validate, open, and use the app that it builds. As a surprising example of Antigravity accessing a browser, Sterin writes: I asked it to generate an SVG. It did, of course. Then I asked it to transform it to a PNG. Antigravity looked for ImageMagick, couldn't find it, then looked for FFmpeg, couldn't find it. So it rendered the SVG in Chrome and saved the pixels. Max Weinbach writes: The Antigravity IDE from Google is my favorite one now. Been using it for a few days and it's been outperforming Cursor and Windsurf for me, even most CLIs. Richard Serata writes: I've been using Google Antigravity for a few weeks and it's wild. It's an early preview, so you'll find quirks, but you'll be blown away by the agent stuff. So where does this all net out? Well, firstly, I certainly think that the AI bubble narrative did not get worse today. Maybe over the next week we will see people quibble that there aren't some monster, crazy, highly noticeable advances, but this appears to be, at first glance, a pretty significant jump in capabilities. When it comes to the competitive dynamics, we'll have to see. Gemini 3 looks like a great model, but I've been absolutely loving 5.1 as well, and it's going to take a lot to displace it for me. Ultimately, what we have here is what appears to be by all accounts a great new model from Google, one that is shifting behaviors already for some, with new tools like Antigravity around it that could shift how people interact with AI. Next up for me, of course, is getting in there and actually trying it. That's basically what I plan to spend the whole rest of the day doing, and I will be back with what I find in the next couple of days. For now, as I said at the beginning, Happy Gemini 3 Day to those who celebrate, and until next time, peace.
Podcast: The AI Daily Brief: Artificial Intelligence News and Analysis
Host: Nathaniel Whittemore (NLW)
Date: November 18, 2025
Episode: How Gemini 3 Changes the AI Race
This episode centers on the release of Google’s Gemini 3 large language model (LLM) and its significant implications for the broader AI industry, competition among top tech companies (especially OpenAI and Anthropic), developer tools, and the ongoing "AI bubble" narrative in global markets. Host Nathaniel Whittemore (“NLW”) dives deep into Gemini 3’s technical benchmarks, early user experiences, and the new software development paradigm introduced with the "Antigravity" agentic platform, analyzing what these advances mean for the AI race as 2025 draws to a close.
Host’s Skepticism: Benchmarks are saturated; real-world usage more important (00:37).
Community Reaction to Benchmarks:
| Timestamp | Quote | Speaker / Context |
|-----------|-------|-------------------|
| 00:06 | “You guys even made the calmer Demis hype up.” | “One little coder” on DeepMind CEO Demis Hassabis’s unusual show of excitement |
| 00:15 | “There is some irrationality in the current AI boom... If the AI bubble pops, no company is going to be immune, including us.” | Sundar Pichai, interview with BBC |
| 00:22 | “Introducing Gemini 3. It’s the most powerful model in the world for multimodal understanding and our most powerful agentic and vibe coding model yet.” | Sundar Pichai on launch day |
| 00:40 | “Gemini 3 Pro benchmark results are genuinely unreal. Google didn’t just catch up today, they walked into the arena and rewrote the difficulty settings.” | Ras Rex |
| 00:44 | “One of Gemini 3’s biggest leaps is its towering score on ScreenSpot Pro. ... It just massively accelerated my timeline to full computer-using agents.” | Matt Schumer |
| 01:02 | “[Gemini 3] was able to find, synthesize and use pieces of information in a long book draft that other models couldn’t.” | Dan Shipper, on context capability |
| 01:04 | “Far easier to get it out of the standard slop style ... No more purple gradients.” | Matt Schumer, on UI coding |
| 01:12 | “Creative writing is finally good. It doesn’t sound like AI slop anymore. The voice is coherent and the pacing is natural.” | Matt Schumer, on writing capabilities |
| 01:17 | “Asked Gemini 3 Pro to create a 3D Lego editor in one shot. ... Nailed the UI, complex spatial logic, and all the functionality.” | Pietro Schirano |
| 01:33 | “[Antigravity’s] agent can control a Chrome browser, enabling it to validate, open and use the app that it builds.” | Sterin |
NLW’s signoff:
“Happy Gemini 3 Day to those who celebrate, and until next time, peace.” (01:45)
This summary delivers the essence and deep technical detail of the episode, mapping the release of Gemini 3 against the shifting battleground of the AI race.