Summary

The AI Daily Brief: Artificial Intelligence News and Analysis

Episode: Why Google Isn't Chasing Claude Code
Host: Nathaniel Whittemore (NLW)
Date: May 20, 2026

Episode Overview

In this episode, host Nathaniel Whittemore (NLW) delves into Google's latest AI announcements at Google I/O 2026, exploring questions around their increasingly fragmented AI strategy. NLW examines where Google stands in the broader AI race, particularly in relation to OpenAI's and Anthropic's recent strides with coding agents like Codex and Claude Code. He investigates whether Google's sprawling approach truly puts them behind — or if their massive distribution and technical advantages mean it simply doesn’t matter.

Google’s Generative AI: A Turbulent Journey

Pre-ChatGPT Era
- Google originally led the AI race, acquiring DeepMind in 2014 for $500M.
- Early progress was hampered by a lack of clear strategic leadership and siloed efforts.
- [03:27] “Part of the reason that they were caught flat footed when ChatGPT first launched... was that their AI strategy wasn’t consolidated in a single place.”—NLW

Catching Up and Missteps
- Past products like Bard lagged behind Microsoft/OpenAI.
- Gemini’s launch in December 2023 marked an attempt to consolidate strategy under DeepMind CEO Demis Hassabis, but early reception was tepid.
- 2024’s AI Overviews feature in Search became infamous for bizarre suggestions (“put glue on pizza and eat rocks”).

Turning Momentum
- NotebookLM, especially its Audio Overview feature (synthetic AI podcasts generated from a user's own sources), became a breakout product in late 2024.
- Veo 3, Google’s first video generation model with native audio, launched in 2025.
- Nano Banana and Nano Banana Pro (later 2025): Set new standards for fine-grained image editing and text-heavy outputs such as infographics.

Falling Behind Again
- 2026: The industry’s focus shifted to coding agents — spearheaded by Anthropic’s Claude Code and OpenAI’s Codex.
- [17:31] “When the dominant theme became agents—specifically coding agents... it absolutely left Google in the dust once again, at least from the insider AI conversation.” —NLW

Key Announcements & The Confusion at Google I/O 2026

1. Omni — The “Anything-to-Anything” Multimodal Model

Positioned as more than a video model: Omni is Google’s vision for truly multimodal, flexible AI — any input, any output.
Initial reactions on social media undersold its editing abilities.
[44:12] "Editing real videos with [Omni] is crazy." —Andy Masley
"I can easily ask Omni to change the entire setting of a sequence and turn day into night while preserving the shot structure." —Henry Dobrez
Core differentiator: Depth and steerability of video edits, not just initial quality.
Unclear user audience: Is Omni for consumers, creators, or professionals?

2. Gemini Spark — Google’s Personal/Professional AI Agent

Branded as a 24/7 agent to help “navigate your digital life.”
Sits somewhere between a consumer assistant and a professional tool (examples: managing emails, drafting status updates).
[55:30] “Spark can pull all the facts from your emails, your docs, your sheets and slides and write the draft for you.” —Josh Woodward, Google VP
Confusion over intended users and overlap with other products like Antigravity.

3. Antigravity 2.0 — Agentic Coding Harness

Google’s latest foray into agentic development environments: a standalone desktop app with multi-agent teams, scheduled tasks, native voice, and one-click integration with other Google products.
Demo used 93 subagents to rebuild the core framework of a working operating system in about 12 hours.
On launch, criticized as derivative of, and visually similar to, OpenAI's Codex.
[1:09:50] “Antigravity 2.0 is interesting because it no longer feels like Google made an AI IDE... Now, the Agent layer is [the main story].” —Mark Kretschman
Still seen as trailing behind Codex and Claude Code.

4. Gemini 3.5 Flash — The New “Fast” Model

Benchmarked as Google’s most powerful model yet (76.2% on Terminal Bench 2.0), but still behind GPT-5 and Opus 4.7 in some tasks.
Now focused more on speed than cost efficiency.
Early user feedback: sometimes excessively verbose, high token usage neutralizes speed/cost benefits.
[1:19:16] "It's a pretty weird release. It does way more than what you asked for... It can be decent at games but weaker than 3.1 in controls." —Peter Gostev
Price increases and usage-based limits introduced, mirroring broader industry moves.

Community & Industry Reactions

Product Sprawl & User Confusion

Google’s product ecosystem—Omni, Spark, Antigravity, AI Studio, Flow, Pix, and more—is overwhelming and unclear to both developers and consumers.
[1:35:17] “It leads me to a request for OpenAI and Anthropic: Please avoid sprawl. Just give me a single powerful agentic tool like Codex or Cowork... I don't want to have to think about whether to use Spark or Antigravity or AI Studio...” —Simon Smith
[1:36:20] Marques Brownlee: “It’s getting genuinely difficult to keep track of all the names of AI products being unveiled.”

Does Product Confusion Matter?

Google’s massive distribution, deep integration, and growing user base (Gemini up to 900M users; 3.2 quadrillion tokens processed/month) may render product sprawl less damaging.
[1:38:02] Farzad writes: “I think Google just won the consumer market for AI. Here’s what I think is going to happen. Anthropic, the best models for running businesses. Google, the best for everyday people and creatives... OpenAI probably cooked unless they pull a rabbit out of a hat.”

Strategic Tension: Coding Agents vs. World Models/Multimodality

Google’s leadership appears philosophically divided:
- Demis Hassabis is committed to AGI through world models, robotics, and continual learning — not betting on coding agents or self-improving research.
- Sergey Brin allegedly leading a “strike team” to ramp up AI that can improve itself with coding.
[1:47:05] “Two paths are open to Google now: Will Google turn away from the Hassabis path to pursue RSI, or will Google stay on its current path... or finally... pursue both...?” —NLW (paraphrasing Prinz)

Notable Quotes & Timestamps

On Google's Shifting Advantage:
[01:42] "It also just might not matter because of some of the significant advantages that they are bringing to the table." —NLW
On the Agentic Shift:
[17:31] “It absolutely left Google in the dust once again, at least from the insider AI conversation.” —NLW
On Product Sprawl:
[1:35:17] “Just give me a single powerful agentic tool... I’m willing to pay for that simplicity.” —Simon Smith
On Google’s Consumer Lead:
[1:38:02] “Anthropic the best models for running businesses. Google the best models for everyday people and creatives.” —Farzad
On the Philosophical Divide:
[1:47:05] “Will Google turn away from the Hassabis path to pursue RSI, or... stay on its current path... or finally... pursue both...?” —NLW

Overall Takeaways

Google's product suite is growing fast — perhaps too fast for its own clarity, with confusing, overlapping tools for both developers and consumers.
The industry’s obsession with coding agents (Claude Code, Codex) has created pressure for Google, but the company’s philosophical and infrastructural bets may be focused elsewhere: truly multimodal models (Omni), hardware (TPUs), and universal distribution.
Despite mixed reviews and a lack of clear flagship success, Google’s user reach and technical footprint may keep it at the forefront of consumer AI—if not for depth, for ubiquity.
Internal and strategic contradictions are surfacing: whether to double down on long-term AGI, world models, and robotics, or chase the now proven productivity gains in agentic coding environments.
NLW’s final note: Progress is visible, but coherence is lacking—leaving listeners with the sense that the next big shift in Google’s AI strategy may not be far off.

For More:
Follow NLW and The AI Daily Brief for continued analysis—especially as Google’s new tools roll out and their strategic direction clarifies.


Transcript

[00:01]
A
Today on the AI Daily Brief, a look at everything Google announced at I/O and why it seems like their AI strategy is getting messier and messier, but it also just might not matter because of some of the significant advantages that they are bringing to the table. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.

Alright friends, quick announcements before we dive in. First of all, thank you to today's sponsors: KPMG, Scrunch, Assembly, and Section. To get an ad-free version of the show, go to patreon.com/aidailybrief, or you can subscribe on Apple Podcasts. To learn more about sponsoring the show, send us a note at sponsors@aidailybrief.ai. aidailybrief.ai is of course where you can find out about everything else going on in this ecosystem as well: job opportunities, newsletters, free education programs, paid enterprise education programs. All of that, again, is at aidailybrief.ai.

Now, one final note. You always had to know that the Google I/O recap was going to be a full end-to-end episode, but that certainly didn't account for former OpenAI co-founder Andrej Karpathy announcing that he had joined Anthropic. To many, if not most, enfranchised AI watchers, this was a bigger announcement than anything that happened on the stage at I/O. And so with that in mind, we will certainly be coming back to it tomorrow. But for now we have a lot of Google to talk about.

Today we are talking about Google I/O, which in this particular case is more than just a set of announcements, but a chance to see how one of the biggest labs thinks about AI priorities and where they sit in the AI race. In short, the event was a little confused. Google is doing a ton, there's absolutely no doubt. What it adds up to is a little less clear. And in fact it seems to me like Google's leadership may have a very different idea than either Anthropic's or OpenAI's leadership about what winning the AI race will actually look like.
But we need a little bit of background and context before we get into this year's event. Google's history with generative AI in general has been interesting. In the prehistoric times, i.e. the pre-ChatGPT times when very few were paying attention to all of this, Google was ahead simply by virtue of paying attention. Back in 2014 they acquired DeepMind for a then-massive $500 million. But problematically, it was not their only AI effort. In fact, part of the reason that they were caught flat-footed when ChatGPT first launched in November of 2022 was that AI strategy wasn't consolidated in a single place. That wouldn't come till later, in fact, and it honestly took a very rough year out of the gate in 2023 for them to figure out that they needed to go through that sort of painful restructuring.

2023's I/O event was certainly a catch-up event. You might or might not remember that before Gemini we had something called Bard. Bard was Google's first ChatGPT competitor and frankly it wasn't great. Honestly, to many it felt more like the type of product that Microsoft would put out. But Microsoft, shockingly, by virtue of their deal with OpenAI, was at the time broadly seen as ahead. Throughout 2023 there was growing pressure on Google from lots of sources, not least of which was the public markets, to do something beyond and better than Bard, which eventually we got in December with the announcement of Gemini. Gemini was Google starting to consolidate their strategy around DeepMind CEO Demis Hassabis. But even if it appeared promising, that first announcement was still pretty disappointing. The most powerful version of Gemini wouldn't be available for a number of months and it had the feeling of a rushed move. Early 2024 didn't go much better.
At the beginning of the year there was endless digital ink spilled about their quote-unquote woke image generation model, which when asked to do things like generating an image of a 1943 German soldier would put Japanese women or African American men in Nazi regalia. By the I/O event in 2024 we were starting to get fully capable Gemini models. But even then, Google wasn't out of the woods yet. One of the big things they pushed at that event was AI Overviews, which was Google's first attempt to integrate this new set of generative AI capabilities into their most important existing business line, which was of course search. The problem was that that first set of AI Overviews became known not for how helpful they were, but for suggesting that people put glue on pizza and eat rocks.

And yet by the end of 2024, Google had started to get their groove back. And part of the shift in momentum came from an interesting source. Now, internally they had consolidated strategy and put everything under one roof. And so some of it was just the dividends from that. But they also, for the first time in a very long time, had a genuine breakout product hit in the form of NotebookLM, specifically the Audio Overview feature of NotebookLM, which came in the fall, where you could have a synthetic AI podcast discuss whatever set of resources you had dropped into Notebook. People saw that this could be a really cool way to learn new material, study for exams, familiarize themselves with the news, and lots and lots of people did it.

Google's AI efforts then headed into 2025 with good momentum, and I/O 2025 validated that. They premiered Veo 3, which was their first video generation model to have native audio, and throughout 2025 Gemini models were treated as a genuine option and contender. You saw that in the growth of Gemini, which started to get up to ChatGPT-style numbers.
Google also had a couple of moments where something that they released genuinely expanded the set of things that were possible with generative AI, most notably Nano Banana and Nano Banana Pro. Nano Banana, which came out at the end of August and which was technically called Gemini 2.5 Flash Image before they realized that they should just use the name that people liked for it, wasn't remarkable because the base-level quality of the images was so differentiated, but unlocked a ton of value in the new types of fine-grained editing controls that it gave you that were simply not possible with other image generation models. That expanded in November with Nano Banana Pro, which added reasoning over the prompt and significant text rendering capabilities that brought things like infographics online in a real way for the first time. All of this led to a feeling that Google was headed into 2026 with the best AI momentum that they had ever had.

Quietly, however, something was happening that would push them, from a narrative perspective at least, significantly back behind again. 2026 has been the year of coding agents, agents for knowledge work, and harnesses, and the seeds of that were laid at the beginning of 2025. One part of that was the reasoning models, which all of the major labs released throughout the course of 2025. But the importance of the harness wasn't exactly clear to everyone when Claude Code launched back in February, and Codex would technically come a few months later in May, and for much of the year all of this was quietly brewing. Anthropic's developer devotion clearly started to be noticed by OpenAI, and by August it was clear that they were taking the threat seriously. They tried to position their GPT-5 announcement as all about that, putting coding-based use cases up top of their announcement for the first time.
But ultimately that part of the announcement was significantly sidelined both by the genuinely underwhelming performance of GPT-5 as well as by the planned deprecation of GPT-4o, which led to effectively a consumer rebellion. Still, even as they worked through that, by the end of the year it was very clear that OpenAI was all in on Codex and coding-related models. Each incremental new release we got, GPT-5.1, GPT-5.2, GPT-5.3, had a Codex-optimized version along with it. And as we well know, all of these things came together around the Opus 4.5 and GPT-5.2 era, where all of a sudden, around the beginning of 2026 and especially into January, everyone was realizing that something big had shifted, that the capacity to use code to build things was totally different than it had been before, and that the promise of agents, which had been so exciting for so many years, was no longer just something for the future, but was actually here.

When the dominant theme became agents, specifically coding agents, and the reappropriation of coding agents for knowledge work, it absolutely left Google in the dust once again, at least from the insider AI conversation. Now, is there an argument that the insider fixation on coding agents is missing the forest for the trees? Is it actually potentially an indictment of AI progress that we're all obsessed with this one area of immense product-market fit, and in fact over-interpolating that product-market fit to other areas of knowledge work that are going to be much harder? It is certainly possible that that is the case. Certainly, as part of their big code red in December, OpenAI made the decision that it was significant enough that they were willing to, for the first time, abandon other areas of their ambition that were competing with Google, most notably around Sora, the video model and app that they basically completely gave up on.
And yet, at the same time, it is very clear that it's not just the extremely enfranchised AI users that think that something significant has changed. The enterprise has gone all in on this new approach to AI in a way that opens up questions for the labs that aren't there, most notably Google. Which is not to say that Google didn't have any big advantages. In the first half of 2026, Google's TPU chips have become a bigger deal than ever, as compute constraints have come home to roost. And for the first time, Google is starting to think about that not just as an internal capacity, but as an external business line. And what's more, OpenAI deciding to focus on the enterprise to the exclusion of consumer (although of course they wouldn't put it quite that dramatically over at OpenAI) gave Google what appears to be an increasingly open lane.

And so with all of that, the questions coming into this I/O: one, would Google release a competitive state-of-the-art model? Two, would they release, update, or consolidate around a real agentic coding or knowledge work harness? Are we supposed to use Antigravity? Are we supposed to use Google AI Studio? For OpenAI it's Codex. For Anthropic it's Claude Code or Claude Cowork, both of which live in the same app. For Google it's a sprawl, and it's never been exactly clear which of these harnesses we should invest in. On top of questions of model and harness, would Google clarify their position on consumer versus the enterprise? Or would they lean into this new AI trade-off era, for example by honing in on lower-cost or more efficient models as one of their unique opportunities?

What we got was honestly a little bit of a confusing mess. The key announcements we're going to talk through are Omni, Spark, Antigravity 2.0, and Gemini 3.5 Flash, which together only represent a part of what was announced, and try to understand how they all add up to something bigger.
Just by looking at Twitter and social media you would think that Omni was just a video model, but in fact Google is positioning it as a new family of generative AI models that are eventually an anything-to-anything, truly multimodal medium. The idea is that instead of being constrained to a video model using a video input or a text model using a text input, Omni at full strength will be able to take any input or any combination of inputs to produce whatever output you need in whatever format you need. Again, this is a type of thing that's been promised for a long time, but has never fully come to bear. The version that was released was specifically positioned as a video model, and that's how most people initially judged it.

Researcher and AI water myth dispeller Andy Masley summed up the first-impression feelings of many when he tweeted, "Google's Omni is fun, but definitely not a huge step up for me." Heisenberg on X went farther, saying, "Was expecting Veo 4, got Gemini Omni instead. Seedance 2.0 is going to eat this for breakfast." In the comments he even said, "Grok Imagine is better." And yet, as many jumped in to point out, the comparison to Seedance itself was actually not particularly apt. As Carlos Santana wrote, "This is a model for editing videos like Nano Banana, like we've never had before." And indeed, when you actually read the blog post, that's how Google is positioning this. It's less about the initial quality of the video generation, although that's strong, and more about your ability to easily edit it.

It wasn't long before some people grokked this, pun intended, with Andy Masley actually coming back later and saying, "Okay, I take it back on Omni. Editing real videos with it is crazy." Henry Dobrez writes, "Omni feels much closer to a Nano Banana moment for video than a traditional cinematic generation model. You're comparing apples and pears."
"I can easily ask Omni to change the entire setting of a sequence and turn day into night while preserving the shot structure. That's a massive deal. Efficient video-to-video is probably going to become a huge part of the future of entertainment and hybrid productions." Ethan Mollick writes, "It's really good at instruction following, and having a smart model capable of doing video expands what you can do by a considerable margin." Other people shared a bunch of examples of how easy it was to edit video with character consistency: taking an influencer-style presentation and making the influencer invisible, changing the background, and changing the person's outfit. In another example, there's a video scene of London with a close-up of Big Ben where Omni changed the scene from a generic afternoon daylight scene to New Year's Eve with fireworks, with the clock updated as well.

Now I tend to think that, based just on what I've seen, we tend to overestimate the importance of base-level model upgrades and underestimate and undervalue, at least initially, changes to the steerability of those models. That's changing a bit post-Nano Banana, and you can see why they use that reference point. But it certainly seems like this layer and depth of editability will unlock new use cases in video generation that weren't possible before.

Still, it seems to me like in the wake of OpenAI deciding to close the Sora experiment, there are more people asking the question of who this is actually for than there might have been just a few months ago. Is this something that Google imagines general consumers using, and if so, are they using it for social media or something else? Is this just a prosumer type of feature where a very specific subset of creators are going to be using it? Is it in fact an entirely professional-focused tool made for people who actually use video for a living?
Or is it just meant to be a preview of a larger paradigm shift in models, with this being the easiest way to demonstrate how those types of models will change things? Google didn't really answer that, which may simply reflect the fact that they just aren't sure.

Now, speaking of cool things that are a little confused about their audience, next we got Gemini Spark. Google describes it as your 24/7 personal agent that helps you navigate your digital life, taking action on your behalf and under your direction. This once again was unclear exactly what the reference point is. Some jumped to the idea that it was their version of OpenClaw. That's in fact what The Verge called it, and certainly some of the ways that they described it point in that direction. In their announcement thread they said it runs on Gemini 3.5 and is built on Antigravity so it can perform long-running tasks in the background. They called out the idea that because it runs on virtual machines on Google Cloud, you don't have to keep your laptop open, which is clearly a reference to the people who are walking around Claude Code or Codexing with their computers open. And they also even referenced integrations with Google tools and third parties through MCP, suggesting that this might in fact be designed for that more prosumer or professional type of audience that's currently using Claude Code, Codex, or even something like Hermes or OpenClaw.

And yet the examples they give seem to point it a little bit more in the direction of a consumer use case. Sundar Pichai said it's your personal AI agent that helps you navigate your digital life, taking actions on your behalf and under your direction. Google VP Josh Woodward said, "Need to send an email to your boss with a status update? Spark can pull all the facts from your emails, your docs, your sheets and slides and write the draft for you." He also said small businesses are using Spark.
They can watch over their inbox so they never miss a question from a customer. So on the one hand it's clearly not fully a personal agent for non-professional use, but it's also not exactly positioned in opposition to one of those other tools. And maybe to Google this is just supposed to be intuitive, that it is supposed to fill a gap that they see for users who are not comfortable with something like Claude Code or maybe even Claude Cowork, but who want these sorts of capabilities. But if that is obvious to them, it's not obvious to others. And the way that people didn't know exactly how to frame this is an example of the broader product confusion, which I think is everywhere with Google right now, making it harder to get a handle on. The product was announced, but not only is it not actually available, it's not even clear when it will be available, with Google saying it's coming just sometime this summer.

One of the most important AI questions right now isn't who's using AI, it's who's using it well. KPMG and the University of Texas at Austin just analyzed 1.4 million real workplace AI interactions and found something surprising. The highest-impact users aren't better prompt engineers. They treat AI like a reasoning partner. They frame problems, guide thinking, iterate, and push for better answers. And the good news? These behaviors are teachable at scale. If you're trying to move from AI access to real capability, KPMG's research on sophisticated AI collaboration is worth your time. Learn more at kpmg.com/us/sophisticated. That's kpmg.com/us/sophisticated.

Quick question: when was the last time you actually visited a website to research something? If you're like me, AI pretty much does that work for you now. That, of course, raises a new question for brands.
If AI is doing the discovering, researching, and deciding, who or what is your website really for? That shift in user behavior, the rise of AI bots becoming your most important new visitors, is what my sponsor Scrunch is taking head on. Scrunch is the AI customer experience platform that helps marketing teams understand how AI agents experience their site: where they show up in AI answers, where they don't, and what's preventing them from being retrieved, trusted, or recommended. And it's not just visibility. Scrunch shows you the content gaps, citation gaps, and technical blockers that matter and helps you fix them so your brand is found and chosen in AI answers. Now for our listeners, Scrunch is providing a free website audit that uncovers how AI sees your site, where there's gaps, and how you're showing up in AI versus the competition. Run your site through it at scrunch.com/aidaily.

You know AssemblyAI for having the most accurate streaming speech-to-text out there, but they just went a step further and launched a full Voice Agent API. The idea is simple. One connection and they handle everything: the listening, the thinking, the speaking. You just stream audio in and get your agent's voice response back. We're talking about things like outbound sales calls that actually qualify leads, customer support that handles complex requests without a script, and scheduling agents that sound like a human assistant, and you can build one in five minutes with one API. And importantly, their streaming model is the best at catching all the stuff that breaks on other voice agents, things like phone numbers, emails, names, and medical terms. And for those of you who are still in experimentation mode, there are no contracts and unlimited concurrency, so you can actually test it out without any friction. Head to assemblyai.com/brief and try the live Voice Agent demo right there on the site. No signup needed.

Here's a harsh truth.
Your company is probably spending thousands or millions of dollars on AI tools that are being massively underutilized. Half of companies have AI tools, but only 12% use them for business value. Most employees are still using AI to summarize meeting notes. If you're the one responsible for AI adoption at your company, you need Section. Section is a platform that helps you manage AI transformation across your entire organization. It coaches employees on real use cases, tracks who's using AI for business impact, and shows you exactly where AI is and isn't creating value. The result? You go from rolling out tools to driving measurable AI value. Your employees move from meeting summaries to solving actual business problems, and you can prove the ROI. Stop guessing if your AI investment is working. Check out Section at sectionai.com. That's S-E-C-T-I-O-N-A-I dot com.

But what about the biggies? I said coming into this that maybe the two biggest questions were whether Google was going to release a state-of-the-art model and whether there was going to be any clarity on the harness side of things and a legitimate competitor to Codex and Claude Code. To some it was an indictment of Google that this was even a question. Nvidia-focused analyst Tae Kim wrote, "The media establishment consensus is enamored with Demis Hassabis, but Google is soundly losing in the biggest product-market-fit agentic AI market."

So what did we discover? Well, we did get an update to Google's agentic coding surface, Antigravity. The team writes, "Introducing Antigravity 2.0, a new standalone desktop application that delivers fully on that original glimpse of a truly agent-optimized experience, rebuilt from the ground up with multi-agent teams, scheduled tasks, native voice, and one-click integration with other Google products." So this, it appears, is meant to be the agentic coding competitor.
Adding some confusion to that, though, was the fact that they also announced a number of new vibe code features for Google AI Studio. And I'm again left feeling like it might be clear-ish inside Google that Antigravity is the Claude Code equivalent, while AI Studio is the Claude Cowork equivalent. Or maybe the Lovable equivalent. Anyway, if that is the conception, it's certainly not clear from their public communications.

To demonstrate the new capabilities of Antigravity 2.0, they rebuilt the core framework of a working operating system using 93 subagents and processing billions of tokens, with the entire operation taking about 12 hours. Now, there are a couple of first reactions when developers saw this. It went down briefly, which wasn't the best experience, but I think everyone can give them a pass for that given that any sort of launch day is always going to involve some throttling of the network. But even less positive was chatter around the similarity and derivative feeling of Antigravity to other tools like Codex. This was made worse by the fact that in the second minute of the launch video for Antigravity, you can see a folder for Codex on the screen being demoed, which made people shake their heads with wonder that that wasn't caught and certainly invited even more comparison between Antigravity 2 and Codex. The Codex team themselves weren't shy about sharing their feelings, with Tebow from that team saying, "I wonder if the Antigravity team has designers. Couldn't believe my eyes today. Haha."

Very flattering to the Codex team, but I would say that others who weren't as concerned with the comparison to Codex did seem to think that, at least on first glance, Antigravity had evolved in the way that an agentic harness needed to evolve. Mark Kretschman wrote, "Antigravity 2.0 is interesting because it no longer feels like Google made an AI IDE. Antigravity 1.0 was the full IDE: editor, terminal, browser, agent workspace."
Basically, Google's take on agentic coding as a complete environment 2.0 feels more like they pulled the Agent system out of the IDE and made it the product. It feels more like the Codex app, desktop app, CLI SDK, managed agents, scheduled tasks, sub agents, integration with AI Studio, Android and Firebase. The IDE is still there, but it's no longer the main story. The Agent layer is. What I certainly didn't see is anyone arguing that it had surpassed Claude Code or Codex in any way. So if we're keeping track of the score at best, you have to say that we did get a meaningful agentic harness upgrade that at least sort of brings Google into the realm of parity. So what then about the model side the new model premiered was 3.5 flash. There was no Pro version. They say that's coming later, just the Flash version. And as you'll see, Flash doesn't exactly mean what it used to. Or at least to the extent that Flash used to be about both speed and cost, it is definitely more about speed now. Now, just going by the benchmarks, it does seem like this is now Google's most powerful model. It scored 76.2% on Terminal bench 2.0 compared to 70.3% for Gemini 3.1 Pro, but beating Opus 4.7 but falling short of GPT5. On SU bench Pro it scored a 55.1% which was a slight jump over Gemini 3.1 Pro but still significantly behind 5.5 and Opus 4.7 on certain agentic benchmarks. Gemini 3.5 Flash appears by the numbers state of the art for the computer use Benchmark OS World. The model is neck and neck with 5.5 and 4.7 on GDP VAL, which is a measure of economically valuable real world knowledge work tasks. It is not close to the state of the art, but still a significant jump from Gemini 3.1 pro. Generally by the benchmarks it looks like the model is going to be competent but not nestled up against Opus 4.7 and 5.5 as truly cross the board state of the art. 
Of course, raw intelligence was never the focus of previous Flash models, which were always, as I said, about speed and cost. And in this case, they definitely wanted to hammer the speed. 3.5 Flash is around three times faster than 3.1 Pro while delivering similar performance. It's also around 60% faster than 3 Flash. The problem is that it's not all that inexpensive. The cost is about a 3x increase since the last Flash model and about a 20x increase since 2.0 Flash, and the cost to run many of the big benchmark tests, like those from Artificial Analysis, was actually higher than for other models like Gemini 3.1 Pro and even GPT-5.5 Medium.
People with early access described it as a little strange. Peter Gostev wrote, "It's a pretty weird release. It does way more than what you asked for. It sometimes generates best-in-class stuff but sometimes crashes out and does something str. It can be decent at games but weaker than 3.1 in controls. While it is good at things like 3D worlds, it is really quite bad at web UI, worse than open source models. The pricing of course is the weird bit. It went up to pretty much become a Pro model, so losing a lot of what Flash was known and loved for." Simon Smith writes, "Initial experience with Gemini 3.5 Flash: smart, crazy fast, but quite verbose. Example: I highlighted some text I wrote and asked, 'Does this make sense?' The reply was 568 words and included a legal analysis."
And this doesn't seem to be an isolated experience when you go look at the output tokens that were used to run the Artificial Analysis Intelligence Index tests, which is basically a proxy for token efficiency. While 3.5 Flash is nowhere near the top of the most token-hungry reasoning models, it is less token efficient than 3.1 Pro, and in fact less token efficient than any GPT-5.5 model except extra high, where it's in a very similar range.
3.5 Flash used about three and a half times more tokens for those tests than GPT-5.5 Medium, which, as many are starting to point out, is kind of an indictment of the value proposition of speed and cost. If you're really fast, but it takes you three and a half times as many tokens to get to the same answer, that cuts into a lot of the speed gains, and it also eats alive any cost gains. Theo from T3 wrote, "The result of this is it costs two times more to run than Gemini 3.1 Pro on similar tasks. It's more expensive than GPT-5.5 Medium." And in point of fact, the more that Theo used it, the less he liked the model. He posted a self-admitted crashout video about just how badly it performed on actual agentic tasks that he tested it on, which is certainly the most problematic part, on top of all these other issues.
Tenebrous writes, "So far, pretty negative impression of 3.5 Flash. It is very fast in terms of token output, but this basically doesn't matter because it explodes in a huge avalanche of unnecessary tool calls on basically every task. When it gets stuck on something, it seems to pretty much never pause or ask for help. It just kinda keeps steamrolling ahead and flailing. Frequently hallucinated fake acronym expansions. Writing quality is mid to bad, tons of emoji slop. Actual code quality is Sonnet tier. This is a very early vibe check. I could be missing things, but even the initial use case of super quick codebase exploration subagent is pretty quickly dissolving for me because it's not actually smart enough to be quick about it."
All in all, definitely not what Google needed to drop. And interestingly, zooming out a bit, the decision to focus so much of the messaging around speed seems kind of out of sync with the reality that speed is not the big thing that's an issue for developers right now. That issue is, of course, cost.
One of the questions that I had coming into this was whether Google would lean all the way into the potential to compete not on the full state of the art, but on a much more token-efficient, lower-priced model. And although that's kind of what you would assume a 3.5 quote-unquote Flash model would bring, that's not exactly how it appears in the real world.
Now, there was some quiet acknowledgment of the same sort of forces that are changing business models everywhere else. Google made hay about the fact that they had dropped the price of Ultra from $250 to $200 a month, as well as offering a new $100 plan. But in an email describing the changes, which I got as an Ultra user, they also wrote that the Ultra plan would now include, quote, "compute-based usage limits that factor in the complexity of your prompt, the features you use, and the length of your chat." In addition, agentic tools including the Flow design platform and Antigravity would now switch to a usage limit model. Basically, while they're bringing down the base price, you are now in certain instances going to have to pay on a usage basis when you didn't before. Now, this I don't think is an indictment of Google or anything. In fact, it's an acknowledgment of where everything is going. But it does make the decision to focus so much on speed as the big competitive differentiator of 3.5 all the more questionable.
I don't think it was any accident that on the same day OpenAI introduced their new Guaranteed Capacity program, which is basically a way for customers to guarantee long-term access to compute, even with even worse compute shortages on the horizon. And I will say that even though they didn't seem to nail it at I/O, I do think that this is still an opportunity for Google. Box's Aaron Levie writes, "Token costs will become a dominant topic in enterprises going forward with AI. Just got out of a dinner with many Fortune 500 enterprise CIOs and this was the most heated topic. A mix of strategies are being employed, but basically no one feels like they have the right solution. It's going to be a mix of figuring out how to prioritize workloads to different models, giving out access to better or worse agents by user type, setting different spend caps by team, having teams justify AI by their use case, and some just having unfettered access. Everyone is trying to figure out a semi-predictable model right now in a world where the underlying tech and cost models are constantly evolving."
Overall, holding aside first reactions to any one product or the model that was announced, there were two big strands in the sentiment that I saw. The first was OMG, product sprawl. Simon Smith again writes, "The barrage of Google announcements today are exciting but also confusing. So now there's Google Images, Google Photos and Google Pics. And there's Antigravity but also Spark." In another post he followed up, "My head is spinning from Google I/O. It leads me to a request for OpenAI and Anthropic: please avoid sprawl. Just give me a single powerful agentic tool like Codex or Cowork through which I can do everything. I don't want to have to think about whether to use Spark or Antigravity or AI Studio or Flow or Pomelli or Pics. I accept that these may meet the needs of different users and Google knows how to run a killer business. But personally, I just want one interface to rule them all, and I'm willing to pay for that simplicity."
Moving outside of just the pure enfranchised AI circle, Marques Brownlee wrote, "It's getting genuinely difficult to keep track of all the names of AI products being unveiled. In the last hour, Google unveiled Google Pics, which is not Google Photos, and updates to Google Flow, Nanobanana, Veo, which are all media generation, Google Antigravity, Gemini Spark, Gemini Omni, Gemini 3.5 Flash." Nathan Clark summed up with his tongue firmly buried in his cheek: "It's in Gemini, just created in AI Studio. Oh, that's for your personal Google One account. For Workspace, you need Gemini Business. No, not Gemini Advanced. That's AI Pro now. Unless you need AI Ultra. Oh, agents? You do that in Spark. Actually, for coding, use Jules. Unless you mean the agentic IDE. That's Antigravity. No, that's the old Antigravity. Download the new one. Actually, Gemini CLI is being deprecated. Use Antigravity CLI. No, the Flash model is smarter than the Pro model. Unless you need Pro. If it's video, use Flow. No, Flow uses Veo. Actually, that's in Gemini now. Unless you're in Search, then it's AI Mode. No, research is NotebookLM. Anyway, it's all very simple."
So that is one take: that even if these products are good, everyone is left confused. The other take, though, is that it might not matter. The sheer surface area that Google has with its users, the total amount of digital interactions where we already interface with Google, may just mean that product sprawl that puts the right version of the thing in the right place, where someone's already interacting, is, for the average consumer, all that's going to matter. And on that front, it's hard to argue with some of the numbers we got yesterday. The Gemini app has jumped from 400 million monthly active users back in May last year to 900 million users last month, in April. In that same period of time, the number of monthly tokens processed across all of their surfaces has jumped from 480 trillion to 3.2 quadrillion per month.
And there's also that point about the open lane. Peter Yang wrote, "I feel like Google is going to win consumer AI. It's the only US lab that's building video models and consumers love video, e.g. TikTok and YouTube are far more popular than text-based platforms. The only real competition is Seedance and other video models that don't care about copyright." Farzad writes, "I think Google just won the consumer market for AI. Here's what I think is going to happen. Anthropic: the best models for running businesses. Google: the best models for everyday people and creatives. SpaceX: the biggest hyperscaler by far. OpenAI: probably cooked unless they pull a rabbit out of a hat." Now, a couple of people asked whether he actually used these products or whether this was just an engagement tweet. But honestly, the fact that an engagement tweet is placing Google as the default winner of the consumer market says something in and of itself. Hayter writes, "Google has secured its ecosystem and is playing the long game. Gemini already powers Search, Workspace and more, so Google no longer needs to subsidize models to attract users. Gemini Omni may not be the best coding model, but it will become the base for future universal multimodal models."
And actually, as we start to round the corner, I want to bring it all the way back to Omni. Prakash Adapai wrote, "The impression I got was that Demis Hassabis thinks AGI will require world models. He's thinking of literally any-input-to-any-output models. Omni Flash is a toy video model version of this. Demis's intention seems to be that Omni will eventually generate anything from construction blueprints to gene sequences." And with the rest of Prakash's tweet being about how they're now in the let-a-hundred-flowers-bloom phase and back to being a little disorganized, there's an interesting implication that the reason for that is that their AI leader, Demis, just doesn't care about the product fight. So fascinatingly, we're in a situation where Google may be the default consumer leader by virtue of their massive distribution and existing relationships with consumers, as well as the fact that OpenAI has clearly decided to go compete on enterprise, while also having their main leader not actually care about consumer products all that much. For some, the gap between the AGI rhetoric at Demis's closing keynote, where he said we are standing in the foothills of the singularity, and the demos that we got was fairly jarring.
Even before the event, Prinz summed up: "Those who have been paying attention know that Demis Hassabis has been generally skeptical of the research direction being pursued by OpenAI and Anthropic, i.e. coding agents leading to acceleration and eventually full automation of AI research. Instead, Google has been pursuing its own separate 5 to 10 year track to AGI, to be achieved through continual learning, world models and a link to the physical world, that is, robotics. Just four months ago in Davos, Hassabis spoke about the limits on how fast self-improving systems, that is, those being pursued by Anthropic and OpenAI, can work." "But now," Prinz writes, "the pace of releases by Anthropic and OpenAI has become relentless. It is clear that Codex and Claude Code in particular are significantly accelerating the pace of AI research at these two labs. We've recently heard rumors that an important faction at Google, led by none other than Sergey Brin, is not happy about these developments. Brin has allegedly formed a strike team at Google tasked with achieving AI takeoff, that is, AI that can improve itself, through improvement of Google's AI coding abilities. For those paying attention, this is the exact path to fully automated AI research and RSI that is currently being pursued by OpenAI and Anthropic. And here lies the tension. Two paths are open to Google now. Will Google turn away from the Hassabis path to pursue RSI, or will Google stay on its current path, knowing full well that if OpenAI and Anthropic are wrong and the approach of fully automating AI research does not turn out as fruitful as they hoped, then Google's lead in areas like world models and robotics may prove to be decisive? Or finally, is there room, talent, resources and compute to pursue both of these approaches simultaneously shaping Google's AI strategy?" The answer seems to be, for the moment at least, to go both.
And whether that remains the strategy as the world gets even more compute and resource constrained will be what to watch for next.
Summing up all this, I want to be clear that I'm not now somehow massively bearish on Google or anything like that. I think Antigravity made progress, although I think on first glance it still appears behind. I think when it comes to 3.5 Flash, it's surprising that with the stakes as high as they are, they couldn't get it together to get a Pro model out. But I'm not particularly interested in judging how it compares to 5.5 and 4.7 until we actually have 3.5 Pro. When it comes to Spark, I think we are in such a new space in terms of what the right type of agent interaction is for most people that epistemic humility demands that we be open to different possibilities. And frankly, I'm interested to see Google explore a different one.
So none of this is about counting Google out. What it is, ultimately, is a feeling that whereas they had started to feel, over the course of 2025 especially, locked in, focused and rowing in the same direction, in this product sprawl we may be seeing splinters of both different strategy and different priorities showing up once again. I will certainly be trying out things like Antigravity 2.0 and Spark when it becomes available, and I will report back what I find.
For now though, that's going to do it for today's AI Daily Brief. Appreciate you listening or watching as always, and until next time, peace.