Transcript
A (0:00)
Today on the AI Daily Brief, the winners and losers following the launch of Gemini 3. Before that in the headlines, Bezos is back and he's doing AI. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.

Alright friends, quick announcements before we dive in. First of all, thank you to today's sponsors: KPMG, Blitzy, Robots and Pencils, and Rovo. To get an ad-free version of the show, go to patreon.com/aidailybrief, or you can sign up on Apple Podcasts. And again, to note something I haven't mentioned for a while, it is just $2.99 a month for ad-free. I really want a very low-cost version of that for you. So if you are a person who doesn't want to have to press that skip button again, you can sign up on Patreon or on Apple Podcast subscriptions. If you're interested in sponsoring the show, and we are starting to fill up for Q1, send a note to sponsors@aidailybrief.ai. And lastly, thanks again to everyone who has already contributed to the AI ROI benchmarking study. We are in the last couple of days, so if you want that full readout, go share your top use cases. It is at roisurvey.ai.

Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes. So today in the main episode we're looking at the winners and losers following Gemini 3. And actually for the headlines we're talking about two companies slash projects that I don't get into in depth in that main episode, but really this kind of forms a complete "where do things stand now" kind of combination. The big one that we're going to talk about is Jeff Bezos coming out of retirement into the CEO slot in order to build a new AI company. But I did want to also give a mention to xAI releasing Grok 4.1, which both the company and Elon say brings significant improvement to real-world usefulness. To train the model, xAI developed new reinforcement learning processes that allowed them to create an autonomous training environment using agents.
The model update is focused on improvements to writing quality, personality, and instruction following. A/B testing has been conducted over the past few weeks on X and on grok.com, with xAI finding that users prefer responses from the new model almost 65% of the time. Similar results were found on the LMArena boards, where Grok 4.1 and 4.1 Thinking leapfrogged the other frontier models like Gemini 2.5 Pro, Claude Sonnet 4.5, and GPT-5. Grok 4 had been ranked below those models prior to the upgrade. Unfortunately, GPT-5.1 has not been included in LMArena, so we don't know how xAI's new model stacks up against the latest from OpenAI. Grok 4.1 also tops the leaderboard on the EQ-Bench measurement of emotional intelligence. On the Creative Writing v3 benchmark, Grok 4.1 is beaten out slightly by GPT-5.1 but outranks all other models. Now, it goes without saying that this was announced just before Gemini 3, so Gemini 3 is not included on these yet. In addition to usability improvements, Grok 4.1 also has a dramatic reduction in hallucinations compared to Grok 4 Fast. Now, overall, it is interesting to see xAI follow OpenAI and push an update focused on EQ and writing quality. Like the release of GPT-5.1, this update didn't include any benchmarking of coding ability or the other typical objective benchmarks. Professor Ethan Mollick posted: "Interesting changes in Grok 4.1: decreases in harmful responses but also increases in sycophancy and deception." And this of course is one of the great challenges when it comes to model personality: are there ways for people to like their interactions without the model just being endlessly coddling and sycophantic? That is something that I'm sure we will continue to discuss. But now we have to get to what was the big news before Gemini 3: that Jeff Bezos is funding a new AI startup and, much bigger, that he will be personally taking the lead as co-CEO.
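Since both a roughly 65% A/B preference rate and LMArena leaderboard positions come up here, it may help to see how arena-style leaderboards connect the two: LMArena-style rankings use an Elo-type rating, where a pairwise preference rate implies a rating gap between two models. A minimal sketch of that relationship (the function names are mine, not from any LMArena code):

```python
import math

def elo_gap(win_rate: float) -> float:
    """Elo rating gap implied by a pairwise preference (win) rate,
    using the standard logistic Elo model (base 10, scale 400)."""
    return 400 * math.log10(win_rate / (1 - win_rate))

def expected_win_rate(gap: float) -> float:
    """Inverse: expected win rate for a model rated `gap` points higher."""
    return 1 / (1 + 10 ** (-gap / 400))

# A ~65% preference rate corresponds to a gap on the order of 100 Elo points.
print(round(elo_gap(0.65), 1))
```

So a model preferred 65% of the time in head-to-head votes would sit roughly 100 Elo points above its rival on an arena-style leaderboard, which is why a ~65% A/B win rate and a leaderboard "leapfrog" are two views of the same result.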
The new startup is called Project Prometheus and has apparently been operating in stealth for some time. Ahead of this announcement, sources said that Prometheus already has nearly 100 employees, including researchers poached from other labs including OpenAI, DeepMind, and Meta. Bezos' co-founder and co-CEO of the venture is Vik Bajaj, a physicist and chemist who previously worked at Google's special projects division, Google X. Now, Google X is known for moonshot projects, which included the self-driving car prototype that became Waymo and a drone delivery service that turned into Wing. Vik most recently co-founded an AI and data science company called Foresight Labs around three years ago, with sources saying he left that job recently to focus on Project Prometheus. So what is this company? Is it another model company coming to sneak in and buy Nvidia GPUs and try to compete? The short answer is no, absolutely not. Instead, Project Prometheus appears to be focused on applying AI to physical tasks. The New York Times in their reporting described the focus as AI for engineering and manufacturing of computers, automobiles, and spacecraft. Sources said the startup will be working in a similar direction to Periodic Labs, who are aiming to automate experiments in material science. It doesn't seem like the company has fully picked a direction at this stage, however. A lot of the speculation is that the work would likely intersect with Bezos' interest in space exploration through his company Blue Origin. Now, aside from Bezos returning to the CEO role, one element of the story that's grabbing a lot of attention is the $6.2 billion, that's billion with a B, in seed funding. That immediately makes Project Prometheus one of the most well-resourced early-stage startups in AI. For comparison, Mira Murati's Thinking Machines Lab raised $2 billion in seed funding in July, while Ilya Sutskever's Safe Superintelligence raised $3 billion across two rounds late last year and earlier this year.
Now, as you might imagine, much of the funding is said to be coming from Bezos himself. Still, the startup will have whatever resources it needs to hire an extremely elite team of researchers and do pretty much whatever they want. Given that Bezos hasn't run a small company in a very long time, a lot of the reporting is wondering what it's going to be like for him to be the CEO of a hundred-person startup. Since he retired as Amazon CEO in 2021, Bezos has mostly made headlines for his mega yacht and his extravagant wedding. But throughout the 2010s, Bezos was the darling of MBA programs around the world. He was never known as a technologist like Elon or a marketer like Steve Jobs. Instead, he was seen as an elite manager, able to harness a huge workforce to drive massive growth and domination in multiple sectors. The Bezos philosophies that drove Amazon's success were largely built around the idea of scaling without losing agility. But those lessons might be outdated at this point. AI-native organizations have been obsessed with the idea of staying as small as possible, given how much leverage a small team of people empowered with AI can really have now. How able he is to adapt to the new world, we'll have to wait and see. Elon jokingly welcomed Bezos back with a tweet that said, "haha, no way copycat," while others pointed out the significance of AI being enough to lure him back. Mary G writes, "Bezos couldn't even make it three years without being CEO again. Man saw everyone doing AI startups and said, hold my $6 billion." Now, while some incorrectly assumed that any AI company was just going to be another model company, others were quick to point out that there is something very different going on here. Rohit Mital writes, "Jeff Bezos becoming a CEO of a new company is one of the most bullish signs for the AI times, and he's choosing to work on AI for manufacturing, the most bullish sign for American manufacturing in a long time."
AI Tools Hub 2.0 writes: "Bezos isn't chasing another shiny chatbot. He's quietly aiming at the boring trillion-dollar layer: AI that moves atoms, factories, supply chains, engineering. First wave was models, next wave is whoever wires them into the real economy." I think that's a great take, and that is exactly why I'm excited to see what happens with this. For now, however, that is going to do it for the headlines. Next up, the main episode.

What if AI wasn't just a buzzword but a business imperative? On You Can with AI, we take you inside the boardrooms and strategy sessions of the world's most forward-thinking enterprises. Hosted by me, Nathaniel Whittemore, and powered by KPMG, this seven-part series delivers real-world insights from leaders who are scaling AI with purpose, from aligning culture and leadership to building trust, data readiness, and deploying AI agents. Whether you're a C-suite executive, strategist, or innovator, this podcast is your front-row seat to the future of enterprise AI. So go check it out at www.kpmg.us/aipodcasts or search "You Can with AI" on Spotify, Apple Podcasts, or wherever you get your podcasts.

This episode is brought to you by Blitzy, the enterprise autonomous software development platform with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale code bases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and precompiles code for each task. Blitzy delivers 80%-plus of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice.
To bring an AI-native SDLC into their org, visit blitzy.com and press "get a demo" to learn how Blitzy transforms your SDLC from AI-assisted to AI-native.

Today's episode is brought to you by Robots and Pencils. When competitive advantage lasts mere moments, speed to value wins the AI race. While big consultancies bury progress under layers of process, Robots and Pencils builds impact at AI speed. They partner with clients to enhance human potential through AI: modernizing apps, strengthening data pipelines, and accelerating cloud transformation. With AWS-certified teams across the US, Canada, Europe, and Latin America, clients get local expertise and global scale. And with a laser focus on real outcomes, their solutions help organizations work smarter and serve customers better. They're your nimble, high-service alternative to big integrators. Turn your AI vision into value, fast. Stay ahead with a partner built for progress. Partner with Robots and Pencils at robotsandpencils.com/aidailybrief.

Meet Rovo, your AI-powered teammate. Rovo unleashes the potential of your team with AI-powered search, chat, and agents, or build your own agent with Studio. Rovo is powered by your organization's knowledge and lives on Atlassian's trusted and secure platform, so it's always working in the context of your work. Connect Rovo to your favorite SaaS apps so no knowledge gets left behind. Rovo runs on the Teamwork Graph, Atlassian's intelligence layer that unifies data across all of your apps and delivers personalized AI insights. From day one, Rovo is already built into Jira, Confluence, and Jira Service Management Standard, Premium, and Enterprise subscriptions. Know the feeling when AI turns from tool to teammate? If you Rovo, you know. Discover Rovo, your new AI teammate powered by Atlassian. Get started at rovo.com, that's R-O-V as in victory, O, dot com.

Welcome back to the AI Daily Brief.
As is to be expected following the launch of Google's Gemini 3, the discussion surrounding the entire AI space is all about, one, how the model is performing in practice, not just on the benchmarks, and two, how the release of Gemini 3 changes the overall AI landscape. Now, when it comes to that first question, how the model is performing, I have had a chance to start to put some reps in. I've had initially really positive experiences with some data analysis and visualization that I was doing on the AI ROI benchmarking study, but I'm not yet in a position to give a full review and to talk about the use cases that I think Gemini 3 is most valuable for. Look for that sometime later in the week, as both my experiments and other people's experiments have a little bit more time to mature. The other part of the conversation, however, around what the release of Gemini 3 does for the industry, is something we can discuss right now. I went through and I gave a bunch of different groups red light for having a bad day, yellow light for having a mixed day, and green light for having a good day. And we're going to use that as a framework to also look at a bunch of recent news. Now, where we're going to start is with a big announcement from Microsoft, Nvidia, and Anthropic. This dropped about an hour before the launch of Gemini 3, and I can't really tell exactly if it was timed to try to sneak in under the wire or if this was just planned as a big announcement as part of Microsoft Ignite and it happened to coincide with Gemini 3. In any case, what was announced was a big deal between Nvidia, Microsoft, and Anthropic: a massive new multidimensional strategic partnership. As part of the deal, Anthropic commits to buy $30 billion of Azure compute capacity. Nvidia is investing $10 billion in Anthropic. Microsoft will invest $5 billion in Anthropic. Nvidia and Anthropic are going to collaborate around design and engineering, as well as establishing what they call a deep technology partnership.
As Microsoft CEO Satya Nadella pointed out, Microsoft Foundry customers will now also be able to access Anthropic's frontier Claude models, although it should be noted that although they will be available through Azure, Amazon will remain Anthropic's primary cloud and training partner for the time being. When it comes to the Nvidia part of the relationship, Anthropic is committing to up to 1 gigawatt of compute capacity using Nvidia Blackwell and Vera Rubin systems. And a lot of the other points of collaboration are, at this stage, a little bit hand-wavy, although I'm sure they'll get more real over time as the companies dig in. Anthropic's Chief Product Officer Mike Krieger points out that Anthropic is now the only frontier AI lab that is partnered with all three major clouds: Google, Amazon, and Microsoft. And I think this is pretty reflective of the place that we find ourselves at this point in the AI competition. As much as there is a vicious competition and battle between these providers, and ultimately there will be winners and losers, at this stage everyone needs everyone. It's a lot more frenemies than Kumbaya. But no one has the ability to go it alone or even stick closely with their solo strategic partnerships. The speed of things is moving too fast. The constraints to development are too great for any one company to support on its own. And as much as the markets are squawking about the circularity of deals, the reality is just basically that the 10 to 20 biggest companies in AI are all going to work with each other on basically every aspect they can for the foreseeable future, based on the presumed ubiquity and market penetration that this industry ultimately will have. Now, one note about the investment: the $15 billion being invested into Anthropic from Microsoft and Nvidia pushes the company's valuation up to the $350 billion range, a massive number that puts them a lot closer to OpenAI's half trillion.
Still, it wasn't the big fundraising that led me to give Anthropic that mixed rating on the winners and losers chart from yesterday's Gemini 3 announcement. On the one hand, there is some inherent challenge for any other frontier model provider, given Google Gemini's size, growth rate, and the incredible apparent capabilities that this model has. In other words, it's harder to compete with Google when they've released Gemini 3 as compared to when they had Gemini 2.5. At the same time, when you look at the benchmarks, the two that Gemini 3 didn't win outright were both behind Claude Sonnet 4.5. They tied at 100% for AIME 2025 with code execution. But the big one, given Anthropic's dominance as a coding model, is that on SWE-bench Verified, Claude Sonnet 4.5 still outperforms Gemini 3 Pro, and, by the way, GPT-5.1. I saw a number of different independent testers that found something similar. Bindu Reddy wrote, "Gemini 3 barely inches out GPT-5 but is behind Sonnet 4.5 on coding and agentic capabilities. Sonnet 4.5 continues to rule in the combined agentic and coding arena." So like I said in the note, I think Anthropic had a surprisingly mixed day, especially when you consider that we're talking about Sonnet 4.5, not Opus 4.5. Next up, let's talk about OpenAI. In the same way that I think Anthropic has harder competition now than they did before the launch of Gemini 3, I think the same applies for OpenAI and ChatGPT, and indeed there was no shortage of "it's so over for OpenAI" posts. OpenAI in particular drew an even more skeptical eye given their recent spate of dealmaking. Zhen Xu writes, "So if Google has a better flagship model, Qwen, Kimi, and DeepSeek have better open source models with wider adoption and free and cheap APIs, Anthropic is winning enterprise, and xAI is better on long context reasoning with real-time access to X, how will OpenAI get $100 billion in revenue by 2027?" There were other folks who were less snarky but just pointed out the resource constraint challenge.
Elmer de Bravin writes, "Why ChatGPT is going to fall behind Gemini and Grok: OpenAI can't scale up compute fast enough. Sure, they are trying, but they're too slow in comparison to Elon and Google. The difference in intelligence will eventually, next year, become clear." At the same time, I think this is much more mixed than people are giving it credit for. Yes, the benchmarks show a meaningful improvement between Gemini 3 and GPT-5.1. 5.1 is a great model and is basically having exactly the opposite response from consumers as GPT-5 did. The folks who want more personality are liking it better, and the people who want better strategic collaborative thinking are liking it better. Plus, there are already specific examples of use cases where people are finding 5.1 still beating out Gemini 3. Swyx runs a curated AI newsletter and did a comparison and came to the conclusion that GPT-5.1 is better than Gemini 3 is better than all the others, and it's not particularly close. He also gave about eight reasons why he thinks 5.1 wins. Alex Finn, who it seems has a very similar set of use cases as mine, writes, "I've been testing Gemini 3 for over a week and it's incredible. Extremely smart, the best straight-up problem solver and answer-getting AI ever. If you need information, there's no tool better. It's not quite there yet though when it comes to vibes. I use AI 80% of the time for business planning and creative writing. I use it to be my project manager, come up with new novel ideas for products and features to build, and as a business consultant to bounce ideas off of. It doesn't quite have that human feel GPT-5.1 Thinking has." So on his use case list, creative writing and business planning still go to 5.1 Thinking. Like I said, it's too early right now for me to make a strong statement about anything, but my initial instincts are that I'm going to find something similar.
5.1 is my favorite model for creative and business strategic collaboration since o3, and I've been finding myself enjoying it enough that it's actually significantly increasing the amount of time I'm spending interacting with AI on those types of use cases. The point of all this is that ultimately I think that Gemini 3's launch, even its improvement relative to 5.1 on some of the benchmarks, feels within the band of expectations and not some mortal blow. And despite Google being the bigger company overall, ChatGPT does have the unassailable brand association with AI chatbots. For many people out there, it is simply what AI is. Now, one company that I think we have to discuss, which is not primarily a model or chatbot company but does have implications from yesterday, is Nvidia. And for them, I'm suggesting that they had a red, not-so-good day. The simple reason for that can be read on page two of Gemini 3 Pro's model card, where they write, "Gemini 3 Pro was trained using Google's tensor processing units. TPUs are specifically designed to handle the massive computations involved in training LLMs and can speed up training considerably compared to CPUs." Now, it's not surprising, obviously, we know that Google has been building these TPUs, but the fact that TPUs and not Nvidia GPUs were used to train what is now the most state-of-the-art model, at least according to the benchmarks, does I think have implications for the unassailability of Nvidia's position. John Grievous writes, "People are sleeping on how impressive it is that Gemini 3 is fully trained on TPUs." Kakashi writes, "TPUs are Jensen's biggest nightmare. That's one of the main reasons he's pushing Nvidia GPUs onto Anthropic with the investment incentives and urging OpenAI to keep using cloud providers that rely on Nvidia rather than Google." Entrepreneur Siqi Chen writes regarding Gemini 3, for the past four years I've had the plurality of our liquid net worth in Nvidia.
About a month ago I sold it all and rotated into Google. Take from that what you will. Now, I don't want to overstate things. Nvidia is still in an incredibly advantaged position, and at this stage TPUs are still a Google-internal advantage rather than something that they're selling to the market. But this certainly opens up more opportunities for Google as its own business line in ways that could impact Nvidia in the longer term. Now, it was interesting, as I was preparing this note, how much Meta didn't come into the conversation. When it comes to Meta and AI, this year the story has really been two parts. The positive side is that at this stage they have the only AI-related wearable that people actually like, which is of course the Meta Ray-Bans. And that's, I think, a much bigger advantage than maybe people are appreciating. Still, mostly this year has been all about restructuring and reorganizing of their internal processes. It's been about Zuckerberg going and poaching and building the superintelligence team, about the bringing in of Alexandr Wang from Scale to lead the new efforts. And really, we're waiting to see what comes out of that. Ultimately, the next big test will be whatever model they choose to put out next, but the short of it is they really need it to be a banger. One optimistic thing is that a couple of years ago, when Google was struggling, it was because they had a lot of divided and distributed efforts around AI, in a similar way to how Meta has up until the last couple of months, where they've been trying to sort of ruthlessly align things in a new way. It took multiple layers of Google reorganization, and ultimately bringing everything together under DeepMind and aiming it in one direction, for their efforts to really start to come to the fore. But from here, now we have to get into who were winners from yesterday's announcement. And the first category absolutely is the AI market bulls. There is incredible fear in the market right now.
In fact, on the Fear and Greed Index, currently we're at a 13, with extreme fear driving the US market. A big part of that is concerns around overspending on the AI buildout and the potential of an AI bubble popping. So much of the economy is tied up in AI expectations that people are, of course, getting more and more nervous. Now, we've been following this quite closely, and so I don't need to belabor the point, but one of the things that is important to note is that among the signals people are looking for, when it comes to whether they think we're in boom or bubble territory, is whether it seems like we're hitting performance plateaus. A big part of the latest leg of this bubble talk was, of course, the feeling of plateau that happened around GPT-5, even though it wasn't exactly true. And we've consistently had a correlation between this sense that AI is hitting a wall or hitting scaling limits and the market's general sense of the AI bubble. We talked yesterday about how these benchmarks at least represent a major jump and really throw some cold water on the idea of a scaling wall being hit. In fact, Adam GPT, who does go-to-market at OpenAI, shared a meme of a man wagging his finger saying "no wall for you," capturing the sense among many that Gemini 3 really shows that there is more to get when it comes to scaling these LLMs. Google's Oriol Vinyals actually talked a little bit about how they got the performance that they did. He tweeted, "The secret behind Gemini 3? Simple: improving pre-training and post-training. On pre-training, contrary to the popular belief that scaling is over, the team delivered a drastic jump. The delta between 2.5 and 3.0 is as big as we've ever seen. No walls in sight. On post-training, still a total green field. There's lots of room for algorithmic progress and improvement, and 3.0 hasn't been an exception, thanks to our stellar team." So like I said, AI market bulls: big winners from yesterday's announcement.
The second big winner category is the vibe coders, and specifically I'm talking here about the non-technical vibe coders. In other words, the section of the vibe coders, like myself, who are not ashamed of the vibe coding title and who are not wrestling with the autonomy spectrum and how much we want AI to respond to us versus be independent agents that go off and code on their own. No, I am talking about the mass democratization of people who can create with code now thanks to these vibe coding tools. And for all of us, and I think there are a lot of us, 3.0 totally kicks butt; 3.0 appears to be a big jump up. Now, we did talk yesterday about Antigravity, the new IDE from Google, and so you might be wondering, should I have a rating for the AI coding companies like Windsurf and Cursor, et cetera, that now have a new competitor? I guess if I did, I would also have it in the yellow column, in the sense that new competition from Google is meaningful and they have to take it into consideration, but they also all have experiences where they get to take advantage of the latest models as well. And you're already seeing that with the way, for example, that Replit has integrated Gemini 3 into their new design experience. Now, this is one that I've had a chance to play around with a little bit, and good lord, is this so much better than the off-the-shelf design that you were getting from some other vibe coding tools just a few months ago. I haven't had that much time to play around with it yet, but it appears to me that Gemini 3, at least integrated into this overall experience that Replit is offering, represents a major advance in the quality of the design that comes with vibe coding, and that is something that I will absolutely be taking big advantage of. There are also so many people who have shared the games that they vibe coded. Overall, I think that one of the biggest green categories of winners from the Gemini 3 announcement is the vibe coders.
And of course the other big winner on the day is Google themselves. This really represents the capstone where Google went from "how the hell are they behind," to underwhelming Bard, to the first versions of Gemini suggesting rocks and glue on pizza, to their image models creating Black Nazis, to, by the end of last year, "hey, NotebookLM is a pretty cool product, maybe they're getting their groove back," to a year this year where the number of users has rocketed to 650 million monthly active users, where the amount of tokens processed has jumped dramatically in the last six months, and where Gemini 3 is now, at least by the benchmarks, the best model in the world, and certainly in rarefied air even among consumer preferences for that very top slot. On top of that, as we've discussed on this show, and as Ben Dickson points out, Google is the only company that has control over the full stack: applications, foundation models, cloud inference, and acceleration hardware. Menlo Ventures' Deedy Das writes, "We're in the 'what if Google does that' part of the AI cycle. They can make cheaper models, better models, distribute products at no cost to billions of users, get good unit economics because they own TPUs, and use it to retain premium talent cheaper." He points out that of the big tech giants, Amazon and Microsoft chose to be infrastructure partners, Apple chose not to play, Meta shat the bed. Google is coming out on top. Interestingly, this is sort of consensus enough that others are simply asking how it happened. Air Katakana writes, "What I want to know is how did Google go from way behind to easily number one in all domains of modern AI in like a year?" The answer, by the way, to many was what Pieter Levels pointed out: the return and reinvolvement of Google co-founder Sergey Brin.
Whatever truth there is in that, ultimately Google is heading into 2026 in an incredibly strong position, and at least from a consumer and an enterprise perspective, whatever else happens next, that is nothing but gravy and upside for all of us users. So that is my sense of the lay of the land, the winners and losers after Gemini 3 day. How long this remains the state of things remains to be seen. People are still just starting to experiment with Grok 4.1, and Elon seems to think it's a bigger deal than people are giving it credit for. And when it comes to Anthropic and OpenAI, we could still get GPT-5.1 Pro and Opus 4.5, and things could feel quite different again. Like I said, ultimately the biggest winner is all of us users, who seemingly every couple of weeks have new capabilities and new use cases that get unlocked, and I certainly plan on taking the time to go take advantage of them. For now, that's going to do it for today's AI Daily Brief. Appreciate you listening or watching as always. And until next time, peace.
