Transcript
A (0:00)
Today on the AI Daily Brief: the power we have to shape AI. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right friends, quick announcements before we dive in. First of all, thank you to today's sponsors: KPMG, Robots and Pencils, Blitzy, and AIUC. To get an ad-free version of the show, go to patreon.com/aidailybrief, or you can subscribe at Apple Podcasts. If you are interested in sponsoring the show, or really learning anything about the show, you can find it all at aidailybrief.ai. While you are there, I suggest you check out the newsletter, which is back. It's pretty simple: we talk about a lot of stories and share a lot of links, and as great as a podcast is for in-depth exploration, it is not so good for sharing the actual links themselves. That is what the newsletter is for. Again, you can find that at aidailybrief.ai. Now, we are back with a weekend episode. And as you guys know, weekend episodes are long-reads-slash-big-think episodes. They're a chance for us to zoom out a little bit, get away from the grind of daily news announcements, and try to think about things a bit holistically. Recently there have been more big-think than long-reads episodes, but today we get to do both, because Professor Ethan Mollick has dropped his latest essay, "The Shape of the Thing." So what we're going to do is read some big chunks of that, and actually an earlier essay of his as well, and then talk about what I think the big thesis is, which is the power to shape AI. Ethan writes: In October of 2023, I wrote about the shape of the shadow of the Thing, speculating on the thing that AI might turn into in the coming years. I think we can see the Thing much more clearly now, and some of the consequences that come with it. Now, record scratch, editor's note: for the sake of posterity, let's actually go back to that writing from October 2023, "The Shape of the Shadow of the Thing." It's a pretty interesting time capsule and provides a nice pairing with this newer essay. So zooming back to October 2023, Ethan writes: A lot has happened in the past week or so, so I wanted to write a post taking stock of where we are. In many ways, I see us reaching the culmination of the first phase of the AI era that started only 10 months ago with the launch of ChatGPT. It ends with the upcoming launch of Google's Gemini, the first LLM likely to beat OpenAI's GPT-4. Now enough pieces of the jigsaw puzzle are in place that we can start to see what AI can actually do, at least in the short term. Even more importantly, the actual implications of what this phase of AI will mean for work and education are currently unknowable. It is unknowable to all of us who don't have insight into what the AI labs have planned, but it is also actually unknowable to them. I guarantee that the people at Google or OpenAI or Microsoft do not know the implications of AI for your job, or your company, or your education, or even all the ways in which the systems they are building will ultimately be used for good or bad. So we can't see the Thing that is being built, or even the shadow it is going to cast over work and education, but we can get a sense of its general shape. From there, Ethan makes a few observations and a few predictions. He predicts that Gemini would outperform GPT-4, which didn't exactly end up happening. But what he did get right is that we would be in a period for a while where all the models were floating around the same level of GPT-4-style intelligence.
That was, of course, a lot of the story of 2024. He talked about the increasing capabilities of AI image creation as well as AI voice, and concludes: We have these pieces which let us guess at the shape of the AI in front of us. It isn't science fiction to assume that AIs will soon talk to you, see you, know about you, do research for you, create images for you, because all of that is already built and working. I can already pull all of these elements together myself with just a little effort. That means AI can quite easily serve as personal assistant, intern, and companion, answering emails, giving advice, paying attention to the world around you in a way that makes the Siris and Alexas of the world look prehistoric in many ways. What happens next? The actual thing that all of this becomes in the near term depends on our agency and decisions. It is not going to be imposed on us by machines. With these new capabilities, AI can either serve to empower and simplify or to remove power. Some of these consequences are knowable and need regulation or responsible action by individuals, and some are going to fall unevenly across industries and societies. It is up to us to figure out how to use this new technology to empower and uplift rather than harm. So that's where Ethan ended that post back in October 2023. But now we return to today. As I have been discussing in recent posts, Ethan writes, we have entered a new phase of AI. After ChatGPT was introduced, human-AI work took the form of what I call co-intelligence, where humans would prompt AI back and forth to get help on tasks. Starting in late 2025, we entered a new era thanks to AI agents like Claude Code, OpenAI's Codex, and OpenClaw. These are AI systems that you can just give work to, sometimes hours of human work, and get back reasonable and useful results in minutes. This is an era of managing AIs rather than working with them. This new approach to AI is the outcome of the rapid exponential improvement in AI abilities. That means you can't understand where we are and where we might be going without understanding the increasing capability of AI. Now, in the next section, called Riding Up the Exponential, Ethan tries to visualize what's changed with the evolution of an image generation test he's been running for years now: "otter on a plane using wifi." As you can see, he writes, the progress from 2022 to 2025 was rapid and remarkable. So what has happened in the time since April 2025? he asks. With nearly perfect images, video has become the new frontier and has also seen exponential gains. He then shares a video from Seedance 2.0 that was created with the prompt: a documentary about how otters view Ethan Mollick's otters test, which judges AIs by their ability to create images of otters sitting in planes. The video narrates: In a world of ones and zeros, there exists a final furry arbiter of truth. The verdict is clear: back to the drawing board, humans. Ethan writes: Aside from a single pronunciation mistake, this is pretty perfect, right down to the fact that the otters are animated to have human-like expressions. Of course, video models are cool, but they are not necessarily indicative of what useful agentic AI can do. So what if we look at the benchmarks of AI ability? Do we see the same exponential curve? We certainly do, in the most famous evaluation of AI today: the METR long tasks graph, which tries to measure AI progress by seeing how much human work an AI can complete autonomously with some measure of reliability.
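To make that metric concrete, here is a minimal sketch of the extrapolation such a graph implies. It assumes METR's reported doubling time of roughly seven months; the baseline anchor (one hour of autonomous work at the start of 2025) is a made-up number for illustration, not METR's data.

```python
from datetime import date

# Illustrative only: project the "task horizon" (length of human task an AI
# can complete autonomously at some reliability threshold) under an assumed
# exponential trend. The ~7-month doubling time echoes METR's published
# estimate; the baseline anchor below is invented for this sketch.
BASELINE_DATE = date(2025, 1, 1)
BASELINE_HORIZON_MIN = 60.0   # hypothetical: 1 hour of human work
DOUBLING_TIME_DAYS = 7 * 30   # assumed ~7-month doubling time

def projected_horizon_minutes(on: date) -> float:
    """Task horizon implied by pure exponential extrapolation."""
    elapsed_days = (on - BASELINE_DATE).days
    return BASELINE_HORIZON_MIN * 2 ** (elapsed_days / DOUBLING_TIME_DAYS)

for year in (2025, 2026, 2027):
    hours = projected_horizon_minutes(date(year, 1, 1)) / 60
    print(f"{year}: ~{hours:.1f} hours of autonomous human-equivalent work")
```

The point isn't the specific numbers; it's the shape. Any fixed doubling time produces exactly the kind of curve Ethan is describing.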
The graph has attracted its share of critics, and even METR has pointed out potential issues. But if you don't like the METR graph, you will find most graphs of AI ability have that same curve. Ethan then picks a set of four different benchmarks and shows how all of them have much the same exponential growth curve. And yet, he writes, despite these amazing capabilities and tests, companies are still very early in adopting AI, meaning that, as of yet, remarkably little has changed in most organizations. But most organizations doesn't mean every organization. We are already starting to see the first appearances of new approaches to organizing that take advantage of the new abilities of AI agents. Next section: Radical Changes to Work. A few weeks ago, a three-person team at StrongDM, a security software company focusing on access control, announced they had built a software factory, a way of working with agents that relied entirely on the AI to test, write, and ship production software without human involvement. The process included two quite radical rules: code must not be written by humans, and code must not be reviewed by humans. To power the factory, each human engineer is expected to spend amounts equivalent to their salary on AI tokens, at least $1,000 a day (at roughly 250 working days a year, that is on the order of $250,000). The basic idea of the factory is that it takes future product roadmaps written by humans and turns those into products. Coding agents use those roadmaps to build software, while testing agents try out the software in a simulated customer environment, with the testing agents built as needed. The sets of agents provide feedback to each other, looping back and forth until the results satisfy the AI. Then humans review the finished product, and the results are shipped to customers without anyone ever touching or even seeing the underlying code. Ultimately, Ethan writes, the particular details of the software factory matter less than the fact that such radical experimentation into how we work is now not only possible, but likely necessary. AI is good enough to change how organizations operate, and the experimentation is just getting started, even as models continue to improve.
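The essay describes the factory only at that high level, so as a rough illustration of the build-test loop, here is a minimal sketch. Every interface below is invented for illustration; the placeholder agents stand in for real calls to coding and testing agents, and none of this is StrongDM's actual system.

```python
# Hypothetical sketch of the software-factory loop described above: coding
# agents build from a human-written roadmap item, testing agents exercise the
# result in a simulated customer environment, and the two loop until the
# tests are satisfied. All names and logic here are invented.
from dataclasses import dataclass

@dataclass
class TestReport:
    passed: bool
    feedback: str

def coding_agent(roadmap_item: str, feedback: str) -> str:
    """Placeholder: turns a roadmap item (plus any test feedback) into a
    deployable artifact and returns its ID. A real version would call an
    agent harness; no human ever sees the code it writes."""
    return f"artifact::{roadmap_item}::{hash(feedback)}"

def testing_agent(artifact: str) -> TestReport:
    """Placeholder: probes the artifact in a simulated customer environment.
    Always passes here; a real testing agent obviously would not."""
    return TestReport(passed=True, feedback="")

def software_factory(roadmap_item: str, max_rounds: int = 20) -> str:
    """Loop coding and testing agents against each other until the results
    satisfy the AI. Humans review only the finished product."""
    feedback = ""
    for _ in range(max_rounds):
        artifact = coding_agent(roadmap_item, feedback)
        report = testing_agent(artifact)
        if report.passed:
            return artifact  # shipped without anyone reading the code
        feedback = report.feedback  # agents critique each other and retry
    raise RuntimeError("factory did not converge; escalate to a human")

print(software_factory("self-serve access request workflow"))
```

Again, the interesting part is not the loop itself but the two rules wrapped around it: humans are forbidden from writing or reviewing what the loop produces.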
Next section: Rolling Disruption. Practical agents, jagged exponential improvement, and the ability to radically experiment with the nature of work combine to form a sort of rolling and unpredictable environment for AI advances. As AI capability crosses thresholds, it unlocks radical new use cases that change people's views, sometimes overnight, about what AI can do. At the same time, organizations experimenting with AI will figure out how to make it work for them, leading to sudden announcements about new strategies or large-scale shifts in which kinds of employees companies value most. Now, Ethan points out that this is no longer speculation, and points to the last week in February as an example of the sort of disruption to come. That was, of course, the week that we got the Citrini Research Substack post on how AI being too good would cause a huge financial crisis, destroying a bunch of different businesses by 2028. Then that same week, we got Block announcing 40% of its company was being laid off, very heavily implying it was due to AI. And then, of course, to end the week, we got the very public and very aggressive spat between the Pentagon and Anthropic over who gets to control AI, and specifically how Claude could be used by the government. In a lot of ways, Ethan writes, each of those cases was not what it first appeared to be. The Citrini report was a fictional scenario, the Block layoffs were not about AI, and the conflict over AI at war revolved around a number of complicated issues that are still not completely clear. But I think that single week is a good illustration of what the near future will feel like: sudden revelations about AI capability leading to rapid market reactions, increasingly real impacts of AI on jobs, even if there is a lot of debate over whether those impacts will be good or bad in the short term, and increasing entanglement between AI companies and policymaking around the world. As the stakes go up, it is likely things will feel even more unstable. It is possible, of course, that things settle down. Maybe AI improvement hits a wall, organizations absorb the changes gradually, and the rolling disruptions become more manageable as people learn what AI can and can't do. History is full of technologies that were supposed to change everything overnight, but instead took decades to fully reshape the economy. But I wouldn't bet on it. One reason is that AI companies are telling us fairly explicitly what comes next: recursive self-improvement, or RSI. This is the idea that AI systems are increasingly being used to build better AI systems, creating a feedback loop that could accelerate the very curves I showed you above. At Davos in January, Anthropic's Dario Amodei explained that if you make models that are good at coding and good at AI research, you can use them to build the next generation of models, speeding up the loop. He noted that engineers within Anthropic barely write code themselves anymore. When OpenAI released its latest Codex model in February, the company stated it was, quote, our first model that was instrumental in creating itself. And Google DeepMind's Demis Hassabis acknowledged at the same Davos panel that closing the self-improvement loop is something that all the major labs are actively working on, even as he warned there are still missing capabilities and real risks. We don't know how far this goes. RSI has been a theoretical concept for decades, and the labs may hit bottlenecks, whether in compute, in data, or in the sheer difficulty of AI research. We also don't know whether LLM-based AIs will eventually hit a ceiling where they cannot get any better, or where the jagged frontier never smooths out. I don't think we know anything for certain, but I also think we are past the point where recursive self-improvement is science fiction. Instead, it is an explicit item on the roadmap of every major AI company. If the loop does close, the exponential curves we've been watching would get steeper, with an uncertain endpoint.
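To see why a closed loop steepens the curve, here is a toy model, entirely my own illustration rather than anything from the essay or the labs: capability grows at some base exponential rate, and under RSI the rate itself scales with current capability. Every constant below is made up.

```python
# Toy model of recursive self-improvement, purely illustrative. Without RSI,
# capability compounds at a fixed rate (an ordinary exponential). With RSI,
# the growth rate is multiplied by a feedback term that rises with current
# capability, so the exponential itself steepens over time.
def simulate(years: float = 4.0, dt: float = 0.01, rsi: bool = False) -> float:
    capability = 1.0
    base_rate = 0.7  # assumed baseline growth rate per year (invented)
    t = 0.0
    while t < years:
        # Feedback term: better models speed up the research that builds
        # the next models. The 0.3 exponent is an arbitrary choice.
        rate = base_rate * (capability ** 0.3 if rsi else 1.0)
        capability *= 1 + rate * dt
        t += dt
    return capability

print(f"without RSI: {simulate(rsi=False):7.1f}x after 4 years")
print(f"with RSI:    {simulate(rsi=True):7.1f}x after 4 years")
```

Both runs start from the same point; the feedback term alone is what bends the curve upward, which is why "an uncertain endpoint" is the honest way to put it.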
So here is where we are today. The instability of that single week in February was a preview of what it feels like when the increasing ability of AI starts to interact with markets, jobs, and governments all at once. That feeling of uncertainty will likely only spread further. But uncertainty is not the same as helplessness. When a technology is this powerful and this unsettled, the choices that individuals and organizations make right now matter more. We can see the shape of the Thing now, but we can still influence the Thing itself and what it means for all of us. We clearly don't have rules or role models for how AI gets used at work, in schools, or in government. That's a problem. But it also means that every organization figuring out a good way to use AI right now is setting a precedent for everyone else. The window to shape the Thing may not last long, but it is here now. Agentic AI is powering a $3 trillion productivity revolution, and leaders are hitting a real decision point: do you build your own AI agents, buy off the shelf, or borrow by partnering to scale faster? KPMG's latest thought leadership paper, Agentic AI: Navigating the Build, Buy, or Borrow Decision, does a great job cutting through the noise with a practical framework to help you choose based on value, risk, and readiness, and how to scale agents with the right trust, governance, and orchestration foundation. Don't lock in the wrong model. You can download the paper right now at www.kpmg.us/navigate. Again, that's www.kpmg.us/navigate. Today's episode is brought to you by Robots and Pencils, a company that is growing fast. Their work as a high-growth AWS and Databricks partner means that they're looking for elite talent ready to create real impact at velocity. Their teams are made up of AI-native engineers, strategists, and designers who love solving hard problems and pushing how AI shows up in real products. They move quickly using Roboworks, their agentic acceleration platform, so teams can deliver meaningful outcomes in weeks, not months. They don't build big teams, they build high-impact, nimble ones. The people there are wicked smart, with patents, published research, and work that's helped shape entire categories. They work in velocity pods and studios that stay focused and move with intent. If you're ready for career-defining work with peers who challenge you and have your back, Robots and Pencils is the place. Explore open roles at robotsandpencils.com/careers. That's robotsandpencils.com/careers. Weekends are for vibe coding. It has never been easier to bring a passion project to life, so go ahead and fire up your favorite vibe coding tool. But Monday is coming, and before you know it you'll be staring down a maze of microservices, a legacy COBOL system from the 1970s, and an engineering roadmap that will exist well past your retirement party. That's why you need Blitzy, the first autonomous software development platform designed for enterprise-scale codebases. Deploy it at the beginning of every sprint and tackle your roadmap 500% faster. Blitzy's agents ingest your entire codebase, plan the work, and deliver over 80% autonomously validated, end-to-end tested, premium-quality code at the speed of compute: months of engineering compressed into days. Vibe code your passion projects on the weekend. Bring Blitzy to work on Monday. See why Fortune 500s trust Blitzy for the code that matters, at blitzy.com. That's blitzy.com. There's a new standard that I think is going to matter a lot for the enterprise AI agent space. It's called AIUC-1, and it bills itself as the world's first AI agent standard. It's designed to cover all the core enterprise risks, things like data and privacy, security, safety, reliability, accountability, and societal impact, all verified by a trusted third party. One of the reasons it's on my radar is that ElevenLabs, who you've heard me talk about before and is just an absolute juggernaut right now, just became the first voice agent to be certified against AIUC-1 and is launching a first-of-its-kind insurable AI agent. What that means in practice is real-time guardrails that block unsafe responses and protect against manipulation, plus a full safety stack. This is the kind of thing that unlocks enterprise adoption.
When a company building on ElevenLabs can point to a third-party certification and say our agents are secure, safe, and verified, that changes the conversation. Go to AIUC.com to learn about the world's first standard for AI agents. That's AIUC.com. So that's the end of Ethan's essay. Another great one. Thank you, Ethan, for that. And here's where I wanted to pick up the thread. It is very clear at this point, and everyone agrees, that we have just lived through, or are living through, a major transition in the AI capability set. In fact, another even more crystallized distillation of that, also from Ethan, was a tweet from the beginning of March, where he wrote: From an AI user perspective, the four big leaps so far in ability: (1) GPT-3.5, aka ChatGPT, in November of 2022; (2) GPT-4 in spring of 2023; (3) reasoners, starting with o1-preview, but the real deal was o3, spring of 2025; (4) workable agentic systems (harness plus good reasoner models), December 2025. I think that that's right, but I think you could simplify it even further. I think we are in the second great transitional period. The first was the ChatGPT moment, which I would argue really came to its full expression in spring when GPT-4 hit. And the second is now: these workable agentic systems, with the reasoners, although they were tremendously different, being just the prelude to what we have now. So again, there is, as we've talked about on the show extensively, widespread agreement on the significance of this moment. And with it, as Ethan has pointed out, has come a feeling of destabilization. Certainly Wall Street is feeling it. We're of course living through the SaaS apocalypse, which has been this cascading wave of disproportionate market impacts every time Anthropic announces some new features. We're also feeling it in politics. It's not just the fight between Anthropic and the Pentagon. AI as an issue is forcing itself into consciousness everywhere right now. Just this week, Bernie Sanders dropped a nine-minute video about his plans for legislation to declare a moratorium on AI data centers. You see it in polling of Americans, where members of both major parties have effectively no faith in either party to handle artificial intelligence. And you even see it in and around the people who are closest to this technology. A few days ago, SemiAnalysis' Dylan Patel wrote: being in SF is like being in Wuhan right before the pandemic. Something is happening. It's going to hit everywhere, but so few people know it. And all year there's been something bothering me about this discourse, and recently it's crystallized. There is a sort of feigned helplessness in all of these discourses, a denial, maybe implicit instead of explicit, but there nonetheless, of human agency to shape what this all is going to mean. It's as though, because these forces are so large, we're shrinking rather than rising to meet them. We forget what Archimedes said: give me a lever long enough and a fulcrum on which to place it, and I will move the world. Unfortunately, it feels like the imagined helplessness is getting worse, not better. A group called the Alliance for Secure AI, which I know nothing about, announced a new website this week called Jobloss AI. It's a real-time tracker of AI-driven layoffs across the U.S. They write: these jobs are disappearing, the numbers are growing, and we're counting every single one.
Now, we're going to hold aside the entire phenomenon of AI washing and acknowledge that even if lots of the layoffs being blamed on AI are not exactly about AI, directionally this is still something that's worth engaging with. But let's listen to the ad that they actually released.
