Transcript
A (0:00)
Today on the AI Daily Brief, the month AI woke up. Before that in the headlines, the latest on Anthropic versus the US government. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right friends, quick announcements before we dive in. First of all, thank you to today's sponsors KPMG, AIUC, Blitzy, and Scrunch. To get an ad-free version of the show, go to patreon.com/aidailybrief, or you can subscribe on Apple Podcasts. To learn about sponsoring the show, or really anything else about the show, check out aidailybrief.ai. One of the things, of course, we've been talking about a lot is our Claw Camp and our Enterprise Claw programs. Claw Camp is an always-free, self-directed program. Enterprise Claw is an upcoming paid training program led by Nufar Gaspar. Registration is open for that right now and will close at the end of the week. You can find out more at enterpriseclaw.ai, or again, just from aidailybrief.ai. Now with that out of the way, let's catch up with Anthropic. Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes. I said over the weekend, as we were covering the Anthropic-Pentagon story, that we were probably going to have quite a few updates on this one in the weeks to come, and indeed that is certainly the case. The conflict between Anthropic and the Pentagon, White House, Trump, and Hegseth took on a new light over the weekend as the US and Israel launched preemptive strikes on Iran. Some thought that maybe this made that 5:01 PM deadline on Friday not arbitrary, and instead driven by the Pentagon's need for an approved and operational AI system in place ahead of the Saturday operation. But as per Wall Street Journal reports, Anthropic's technology ended up being used in the strikes, despite being declared a supply chain risk hours earlier.
Sources said that Claude was used to analyze intelligence, help select targets, and carry out battlefield simulations. To be clear, there are no suggestions that Claude piloted fully autonomous weapons, but the Pentagon has confirmed that this was the first time that autonomous kamikaze drones were deployed in an active mission. Their use highlights that autonomous weaponry is already part of modern warfare and doesn't require the use of frontier LLMs. Additionally, despite OpenAI signing a new deal on Friday, that company's models were not used in the attack. Katrina Mulligan, OpenAI's head of national security partnerships, said that that wouldn't have been possible, as the models haven't yet been approved for use in classified settings. TLDR: in spite of some of the chatter, it doesn't actually appear that the Pentagon hot-swapped AI models on the Friday night before an operation. As the president said on Friday, there's a six-month phase-out period where Anthropic's tech will remain in military use. Still, for some, all of this makes the way that it played out even more confusing and contradictory. Democratic Congressman Seth Moulton wrote: Friday, the Pentagon claims Anthropic is a national security risk and should be blacklisted. Saturday, the Pentagon still uses Anthropic's Claude during its strikes on Iran. Either they used tech that is a natsec risk during military action, or they lied in the first place. So that might be what's going on in the actual deployed world of military operations, but what's happening in the world of consumer sentiment is very different. Anthropic saw their downloads spike over the weekend, driving Claude to number one on the app charts, overtaking ChatGPT for the first time. Claude was outside the top 100 free apps at the end of January and spent most of last month outside of the top 20. Taking advantage of their surge in popularity, Anthropic promoted the ability to easily migrate memory from ChatGPT for those making the switch.
Now, as a side story, this is something that people are paying a lot of attention to. The Signal account on Twitter writes: this is incredibly fascinating, because we initially thought that memory is a moat, but if it is just a file you can take with you... I suspect people aren't going to do this at scale, but it's very interesting to see this play out and stress-tested. Now, to be clear, this isn't some super sophisticated feature. It's basically a big old prompt that Claude gives you that you paste into whatever LLM you're using, and then you take the results and paste them back into Claude. Point being that it is not going to be perfect; it is not going to have all the context, even if it gets you started. In any case, on Saturday, Sam Altman hosted an AMA on X to answer public questions about their new contract with the Department of War. One of Altman's big points was that the threat of labeling Anthropic a supply chain risk is bad for the entire industry. He wrote: we said to the DoW before and after we signed that part of the reason we were willing to do this quickly was in the hopes of de-escalation. I feel competitive with Anthropic for sure, but successfully building safe superintelligence and widely sharing the benefits is way more important than any company competition. I believe they would do something to try to help us in the face of great injustice if they could. We should all care very much about the precedent. OpenAI later published a blog post containing details of their contract with the Department, including the full text of the sections dealing with AI red lines regarding autonomous weapons. The contract states that the AI systems will not be used to independently direct autonomous weapons in any case where law, regulation, or Department policy requires human control.
Regarding domestic surveillance, the contract laid out a series of applicable laws and directives, adding: the AI system shall not be used for unconstrained monitoring of US persons' private information, as consistent with these authorities. Now, many pointed out that this language does not prevent OpenAI's technology from being used for autonomous weapons or domestic surveillance, as long as the Pentagon deems that use to be lawful. Self-professed AI security hawk Peter Wildeford posted: OpenAI is trying to claim simultaneously that (a) their contract with the Pentagon allows for all lawful purposes, and (b) that their red lines are fully protected. The way OpenAI bridges this is by saying the protections live in the deployment architecture and safety stack rather than in the contract language. But if the contract says all lawful purposes and your safety stack prevents a lawful purpose, you're in breach of contract. The Pentagon can just say, we both know your model can do this, you should remove that safeguard, and then OpenAI would have to comply or be sued. Both Sam Altman and natsec lead Katrina Mulligan responded to this particular point. Mulligan said a lot of the concerns about the government's all-lawful-use language seem to stem from mistrust that the government will follow the laws. At the same time, people believe that Anthropic took an important stand by insisting on contract language around their red lines. We cannot have it both ways. We cannot say that the government cannot be trusted to interpret laws and contracts the right way, but also agree that Anthropic's policy red lines in a contract would have been effective. Setting out OpenAI's approach, she continued: let the democratic process decide on the legality and proper-use question. Now, somewhat overshadowed by the conflict with the Pentagon, OpenAI finalized the largest startup fundraising round in history on Friday morning.
The round ultimately totaled $110 billion, valuing OpenAI at an $840 billion post-money valuation. The valuation positions OpenAI as the most valuable startup ever and the 15th most valuable company in the world; they are now worth slightly more than JPMorgan Chase. Notably, the round remains open, and OpenAI expects another $10 billion from financial entities, including UAE investment fund MGX, by the end of March. The $110 billion is entirely from three corporate strategic partners. Nvidia and SoftBank invested $30 billion each. Details were a little scant on this front, but OpenAI mentioned that the Nvidia strategic partnership includes additional chip supplies. The largest investor, though, was Amazon, who put $50 billion into the round. This investment is split between $15 billion due at the end of March and a further $35 billion contingent on OpenAI going public or hitting unspecified milestones. Previous reporting rumored that these milestones included achieving AGI. I don't know why these companies keep putting a term as nebulous as AGI as a condition in their contracts; it's just going to make lawyers rich later. Now, overall, the Amazon strategic partnership is wide-ranging. OpenAI will expand their server rental deal with AWS from the previously announced $38 billion over seven years to $138 billion over eight years. As part of the agreement, OpenAI has also committed to use Amazon's Trainium 3 and forthcoming Trainium 4 chips. OpenAI and Amazon will also jointly develop AI models to power Amazon's consumer apps. The Amazon deal also has some interesting implications for Microsoft, who notably did not make a further investment as part of this round. Microsoft continues to hold the exclusive right to serve so-called stateless versions of OpenAI's models, and the revenue sharing agreement also remains in place, so Microsoft will take a cut of revenue generated through AWS.
Amazon will be the exclusive provider of OpenAI's frontier AI agent management tool, aside from the first-party deployment; however, the OpenAI-branded version of the tool will be hosted on Azure. Alongside the fundraising numbers, we also learned that ChatGPT has 900 million weekly active users. The last reported figure was 800 million in October, and reports suggested that stagnating user growth had been part of the trigger for Sam Altman's code red in December. The announcement underscored that subscriber growth is also strong, now reaching 50 million. Writes OpenAI: subscriber momentum accelerated meaningfully to start the year, with January and February on track to be the largest months of new subscribers in our history. People use ChatGPT to learn, write, plan, and build. As usage scales, the product improves in ways people feel immediately: faster responses, higher reliability, stronger safety, and more consistent performance. OpenAI also shared that they now have more than 9 million paying business users across startups, enterprises, and governments. And in addition, weekly Codex users have tripled since the beginning of the year to reach 1.6 million. Now, that might be the perfect segue to talk about the big changes that ended up characterizing February. So with that, we will close the headlines and move on to the main episode. AI is powering a $3 trillion productivity revolution, and leaders are hitting a real decision point: do you build your own AI agents, buy off the shelf, or borrow by partnering to scale faster? KPMG's latest thought leadership paper, Agentic AI: Navigating the Build, Buy or Borrow Decision, does a great job cutting through the noise with a practical framework to help you choose based on value, risk, and readiness, and how to scale agents with the right trust, governance, and orchestration foundation. Don't lock in the wrong model. You can download the paper right now at www.kpmg.us/navigate. Again, that's www.kpmg.us/navigate.
There's a new standard that I think is going to matter a lot for the enterprise AI agent space. It's called AIUC-1, and it bills itself as the world's first AI agent standard. It's designed to cover all the core enterprise risks, things like data and privacy, security, safety, reliability, accountability, and societal impact, all verified by a trusted third party. One of the reasons it's on my radar is that ElevenLabs, who you've heard me talk about before and who is just an absolute juggernaut right now, just became the first voice agent to be certified against AIUC-1 and is launching a first-of-its-kind insurable AI agent. What that means in practice is real-time guardrails that block unsafe responses and protect against manipulation, plus a full safety stack. This is the kind of thing that unlocks enterprise adoption. When a company building on ElevenLabs can point to a third-party certification and say our agents are secure, safe, and verified, that changes the conversation. Go to AIUC.com to learn about the world's first standard for AI agents. That's AIUC.com. With the emergence of AI code generation in 2022, Nvidia master inventor and Harvard engineer Sid Pardeshi took a contrarian stance: inference-time compute and agent orchestration, not pre-training, would be the key to unlocking high-quality AI-driven software development in the enterprise. He believed the real breakthrough wasn't in how fast AI could generate code, but in how deeply it could reason to build enterprise-grade applications. While the rest of the world focused on copilots, he architected something fundamentally different: Blitzy, the first autonomous software development platform leveraging thousands of agents that is purpose-built for enterprise-scale codebases. Fortune 500 leaders are unlocking 5x engineering velocity and delivering months of engineering work in a matter of days with Blitzy. Transform the way you develop software.
Discover how at blitzy.com. That's blitzy.com. Quick question: when was the last time you actually visited a website to research something? If you're like me, AI pretty much does that work for you now. That of course raises a new question for brands: if AI is doing the discovering, researching, and deciding, who or what is your website really for? That shift in user behavior, the rise of AI bots becoming your most important new visitors, is what my sponsor Scrunch is taking head-on. Scrunch is the AI customer experience platform that helps marketing teams understand how AI agents experience their site: where they show up in AI answers, where they don't, and what's preventing them from being retrieved, trusted, or recommended. And it's not just visibility. Scrunch shows you the content gaps, citation gaps, and technical blockers that matter, and helps you fix them so your brand is found and chosen in AI answers. Now, for our listeners, Scrunch is providing a free website audit that uncovers how AI sees your site, where there are gaps, and how you're showing up in AI versus the competition. Run your site through it at scrunch.com/aidaily. Welcome back to the AI Daily Brief. As part of my collaboration with KPMG, at the beginning of each month we do a little bit of a recap of the month that preceded it. For the listeners who are somewhat less regular and don't have time to catch every show, it's meant to serve as a quick, simple recap of the key themes. Meanwhile, for those of you who are here every day, it's to put a fine point on any changes that the previous month represented. Despite the incredible amount of attention around it, not every month in AI is huge. However, February of 2026 was. This was the month that crystallized for a number of different groups that, to quote one of the viral pieces from the month, something big is happening.
And in fact, one of the things that made the month so interesting was the way that the broad recognition that something had changed, that something big was indeed happening, cascaded across all sorts of different groups. Let's talk about the AI insiders first. This is basically people like you guys: the enfranchised, highly engaged, probably-using-vibe-coding-tools type of AI users who actually pay attention to when new models launch and what new capabilities they have. This is the group for whom basically the period from the holiday break at the end of last year up until now has been a steady realization and embracing of the idea that the generation of models that came around last November represented something meaningfully different than those that came before. The core and first manifestation of this was of course around software engineering, and one of the people who's been in the eye of the storm, communicating what so many others have felt, is OpenAI founding member Andrej Karpathy. About a week ago he tweeted: it's hard to communicate how much programming has changed due to AI in the last two months. Not gradually and over time in the progress-as-usual way, but specifically this last December. He then goes on to explain exactly what happened. Effectively, he says, coding agents basically didn't work before December, and they basically do now. As he puts it, the models have significantly higher quality, long-term coherence, and tenacity, and they can power through large and long tasks well enough that it is extremely disruptive to the default programming workflow. Programming, he writes, is becoming unrecognizable. The era where you type code into an editor is done, he says, and instead we are now in the era of spinning up AI agents, telling them what to do in natural language, and then managing their work. The biggest prize, he says, is about orchestration.
How many of these agents can you have going at once in a way that actually adds up to something real? He concludes: this is nowhere near business-as-usual time in software. And I think this does a pretty good job of summarizing what has shifted. In short, agents that can actually do work, agents you give not a plan but just a goal and let them come up with a plan, are now, for many of these most enfranchised users, the primary way that they get value out of AI. And what's more, in February this was given a name and a face and an icon in what was first named Claudebot, then for a very short time named Multbot, and ultimately finalized as OpenClaw. OpenClaw has been so far the biggest, clearest manifestation of the change in autonomy ambition. OpenClaw created a process by which users could give those powerful new-generation models access to their systems and let them actually do meaningful work on their behalf. It started with simple personal-assistant-type things; indeed, the OpenClaw homepage still says cleans your inbox, sends emails, manages your calendar, checks you in for your flights. But that is 100% not where it stayed. Almost immediately, people were using OpenClaw for much more extensive and much more ambitious autonomous or semi-autonomous work. I did a show around mid-month about the 10-agent team that I had built, which included one developer agent, two researchers, five project managers, one chief of staff, and a partridge in a pear tree. Because of OpenClaw, Mac minis, and for some even Mac Studios, became the hot new visualization of the new era of AI. And again, what's super important to point out is that despite OpenClaw being very meaningfully not for beginners, something that indeed requires a ton of technical work and, frankly, beating your head against the wall as you sort through legions of different problems, despite all of that, it was not just developers who were excited about it. It was all sorts of different types of people.
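To make that orchestration idea concrete, here is a purely illustrative sketch of fanning a goal out to several concurrent agents and gathering their results. Everything here is hypothetical: run_agent is a stand-in for a real agent session (planning, tool calls, and so on), not the API of any product mentioned above.

```python
import asyncio

async def run_agent(name: str, goal: str) -> dict:
    # Stand-in for a real agent session; here it just simulates
    # some work and reports a result back to the orchestrator.
    await asyncio.sleep(0.01)
    return {"agent": name, "goal": goal, "status": "done"}

async def orchestrate(goals: dict[str, str]) -> list[dict]:
    # Fan one goal out per agent and run them all concurrently,
    # which is the "how many agents at once" question in miniature.
    tasks = [run_agent(name, goal) for name, goal in goals.items()]
    return await asyncio.gather(*tasks)

results = asyncio.run(orchestrate({
    "developer": "implement the new endpoint",
    "researcher": "summarize competitor pricing",
}))
```

The manager-and-workers structure people describe with OpenClaw-style teams is, at bottom, this loop with real agent sessions plugged in where the placeholder sits.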
I have no better evidence for this than the response to Claw Camp, which is the self-directed program I put together that basically took the process I had gone through to figure out how to build both my first agent and then the agent team, and turned it into a sequence that other people could follow. It is not an easy sequence; it takes a lot of time and a lot of hard work, and yet nearly 5,500 people are doing it right now. By the end of the month, we were starting to see the manifestation of ideas that have long lurked around the edges of AI as some exciting future potential, but which were now coming to the fore. Solopreneur Ben Serra, by himself, built a company called Pulsea, which is an AI for running autonomous AI companies. Basically, you sign up for Pulsea and give it an idea, or just ask it to surprise you, in which case it'll go do some research and come up with a relevant idea that seems related to you, and then it will build a company around it. Pulsea gives it access to everything from GitHub to Meta ads, basically everything that you could need to run an online business. The company is up to an annual run rate of over $1.25 million in just a couple of weeks. So if this whole increase in autonomy ambition was the key theme represented by OpenClaw, wouldn't you think that all the big labs would be racing to catch up with that? Indeed you would, and indeed that's what happened. At the beginning of the month, OpenAI released the Codex app, the latest in their push to catch up with and then eventually try to exceed Claude Code. And by the middle of the month, they had made another huge move by hiring the creator of OpenClaw to build these types of systems inside the context of OpenAI.
Many thought that the whole situation surrounding OpenClaw was a bit of a bobble for Anthropic, given that it had originally been named Claudebot after Anthropic's Claude; Anthropic, instead of embracing it, asked for a name change, which is what led to Multbot and eventually OpenClaw. Still, pretty much everyone assumed that the type of features that OpenClaw made available were likely to come to Claude Code pretty soon. And sure enough, over the last week and a half or so, we saw Anthropic first release Remote Control, basically a way to move from a Claude Code session on your computer to managing it from your phone while you're on the go. This is of course one of the main appeals of OpenClaw: the fact that you get to interact with it through Telegram or WhatsApp or another app on your phone. And we also saw Anthropic release scheduled tasks, inside not Claude Code but Cowork. Also over the last week, Perplexity announced Perplexity Computer. Now, they've been working on this for the last couple of months, but it's very much playing in the same space as OpenClaw, in that you give it a wildly ambitious task to be built and it can just go figure out how to do it. Microsoft announced Copilot Tasks, and we've also heard reports that Microsoft CEO Satya Nadella is actually using OpenClaw and encouraging his team to check it out, and so on and so forth. We also got Notion custom agents, and basically, I think you can assume that this clawification of AI, in other words the move to actual agentic AI, is going to continue to proliferate across the industry. Now, as we transition to the next group that woke up, it's important to note again that the people who were Claude Coding and OpenClawing weren't just the devs. In preparation for a segment about these new tools, CNBC's Deirdre Bosa went to try to build her own version of Monday.com with Claude Cowork, just to understand and share with her audience what's actually possible.
She figured it wouldn't work, but it would be a good way to show people the current state of the technology. An hour later, she writes, I literally have my own Monday.com that's plugged into my calendar and Gmail, and it surfaced a kid's b-day that was not anywhere on my radar and I need to get a gift for. Finance and markets content creator Joe Weisenthal was actually a couple weeks ahead of everyone else, starting to play around with Claude Code in a big way back in January, and in many ways preceding the realization that the rest of Wall Street would have coming into February. Because if one group that woke up in February was the AI insiders, the other group was Wall Street. February was the month of the SaaS apocalypse. And it actually started off at the end of January, when, after Google shared the demo version of Genie 3, where you could create 60-second immersive worlds, a bunch of gaming industry publisher stocks fell. But that would be just the very beginning. The big actual story of the SaaS apocalypse would end up being that basically every time Anthropic announced some new plugin for Claude Code or Cowork, a set of stocks somewhere between directly and nominally related to that plugin's focus would just absolutely crater. On February 10, Bloomberg wrote that Wall Street's new hot trade was dumping stocks that were in AI's crosshairs. And it was not just one category: we saw this in games, we saw this in productivity software, we saw it in finance, we saw it in legal. IBM saw its worst single-day drop in 25 years because Anthropic wrote a blog post about its COBOL tool, which had been announced months earlier.
All of this was the perfect caustic environment for Citrini Research to drop their highly viral piece called The 2028 Global Intelligence Crisis, which basically articulated a theoretical doom-loop scenario that led to utter economic catastrophe. Despite that report producing a lot of good counter-conversation as well, when, in the middle of last week, Block announced that it was cutting 4,000 employees, about 40% of its overall staff, many pointed to it as evidence of the exact sort of white-collar carnage that the Citrini report was discussing. Now, there has of course been a lot of debate about the extent to which it might be the biggest case of AI-washing we've seen so far. But this is where the environment is heading out of February and into March: Wall Street is extremely jumpy when it comes to AI, and this time it's not because of the size or circularity of infrastructure deals, but because AI might be too good. Then, of course, there's Washington. Not only was February the month Washington woke up to AI, it was also the month the rest of us woke up to the complication of the relationship between Washington and Silicon Valley when it comes to AI. I just did an extended episode about this, and we talked about the latest in the headlines, but of course the TLDR is that, through a series of steps seemingly going back to the Nicolas Maduro Venezuela raid, there was a negotiation where Anthropic wanted specific red-line carve-outs around AI being used for autonomous weapons and for domestic mass surveillance, with the White House instead wanting the standard to be any lawful uses. Now, of course, this disagreement wasn't just about these specific uses. It was much more about who gets to determine for what and how AI is used. It was the first manifestation of what was always an inevitable power struggle, even if it happened in a very ugly way. And indeed, the ugliness was apparent even before the fight took its most dramatic turn.
Members of Congress like Thom Tillis were already pretty disgusted by how the whole thing was happening. Tillis said: why in the hell are we having this discussion in public? Why isn't this occurring in a boardroom or in the secretary's office? I mean, this is sophomoric. It of course only got more so as President Trump and Defense Secretary Pete Hegseth announced that not only would the US government not be working with Anthropic, but that they were going to designate them a supply chain risk, arguing that this meant other contractors of the US government would also have to drop their relationships with Anthropic, which, if it came to pass, would have some pretty serious and dramatic implications. Now, this particular manifestation of the battle isn't even done yet, and again, it is just the first in what will be a much bigger power struggle in the years to come. So those were the big things. AI insiders woke up and changed their level of autonomy ambition. Broader white-collar workers got dragged along and started to use things like Claude Code, Claude Cowork, and even OpenClaw. This spilled over into a recognition on Wall Street that, as Matt Schumer's post put it, something big was happening, which led to lots of chaos in the markets. And to cap it all off, we have of course the absolute bunfight that is Anthropic versus the Pentagon. In terms of other key details, a couple of things are worth noting, especially as we keep an eye on what we expect in March. First of all, while we didn't get the much-anticipated DeepSeek 4, we did get Seedance 2.0 from ByteDance, a video generation model that had many people in the AI industry asking whether it was the first example of Chinese open-weight models not only catching up with the US but actually exceeding it.
The big one to watch for coming up in March is of course that DeepSeek event, which people have been predicting as coming next week for basically every week for the last four weeks. Anthropic added Sonnet 4.6 to Opus 4.6, making a complete 4.6 suite, and Google dropped Gemini 3.1 Pro. One interesting note about Google's releases is that they're very clearly flexing the opportunities around multimodal, although exactly how that's going to hit and what it's going to matter for, especially as everyone is just talking about code generation, remains to be seen. Capping the month in models off was another Google model, Nano Banana 2, which was actually more of a functional upgrade than anything else, making Nano Banana much faster and cheaper, although also improving things like text handling and text reasoning. One final story from the month that I think does a good job of capturing where we are: when METR finally shared where Codex 5.3 and Opus 4.6 were on their long-horizon task study, both were high, but Opus 4.6 especially was basically off the charts at this point. In other words, we are in uncharted territory, where even the metric that became one of the most important, if not the most important, metrics for showing AI progress last year just can't keep up any longer. And with March now here, we could be heading for something else huge. My X timeline today is filled with rumors of GPT, not 5.3 but 5.4, and a lot of breathless discussion about how much better it is. We'll see if that's actually true, but no matter what, 2026 is off to a rollicking start. And that is going to do it for today's AI Daily Brief. Appreciate you listening or watching. As always, thanks to KPMG for sponsoring the monthly recap. And until next time, peace.
