Transcript
A (0:00)
Today on the AI Daily Brief, we're talking about the seven different kinds of AI agents. Before that, in the headlines: Meta really rubs it in and poaches Apple's head of AI models. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Hello, friends. Quick announcements today. First of all, thank you to today's sponsors: KPMG, Blitzy, and agntcy.org. And of course, to get an ad-free version of the show, go to patreon.com/AIDailyBrief. Now, a quick request today. Superintelligent, as you well know, is in the business of agent planning and helping companies figure out what agent use cases are going to be most useful for them, as well as solving organizational gaps, getting data ready, and setting up infrastructure like monitoring, observability, and evaluations, all that sort of stuff. Basically, we want to help organizations accelerate use case identification, but as you might imagine, what happens next is that they need companies to actually build the agents. We already have some great partners in this area, but I am personally looking for more. If you run a dev shop that is focused on agent engineering, agent buildouts, agent application design, basically people who are actually pushing the code that makes agents work for companies, I want to hear from you. Shoot me a note at nlw@besuper.ai with "Agent Builder" in the subject line, and I am excited to talk to you and hear more about what you have going on. But with that, let's dive into today's show. We kick off today with the latest in Meta's talent war, and this one is just the rubbins, man. Bloomberg's Apple whisperer Mark Gurman reports that Ruoming Pang, the manager in charge of Apple's foundation models division, is parting ways to join Meta's superintelligence team. Now, Apple had themselves poached Pang away from Google in 2021, and sources said Meta offered a compensation package in the tens of millions of dollars to land this well-regarded AI engineer.
Although frankly, many people thought that even outside of the millions of dollars in compensation, the chance to move on from Apple's dubious AI project might have been another motivation to leave. In addition, sources say that Meta has hired another pair of researchers away from OpenAI and Anthropic to fill out their team. Still, bringing it back to Apple, Gurman's contacts said that this departure could be the first of many from the troubled foundation models team, with, quote, several engineers telling colleagues they're planning to leave in the near future to Meta or elsewhere. Rather than a single replacement at the head of the hundred-person team, Apple is reportedly looking to restructure and add a layer of middle management. Always the successful way to approach an innovation challenge. For Meta, outside of just scooping generically talented people, TechCrunch suggested that the Pang hire could bring specific expertise around designing small on-device AI models, which is of course a big part of what Apple had been thinking about. The community reacted to this news much as you might expect. First of all, we see the continued trend of people following these career moves like they are the new sports trading news. But a lot of folks brought it back to not just Apple, but Zuck's playbook for poaching talent across the entire industry. OpenAI is obviously not acquirable, at least not in any real-world scenario. So what's the next best move? You don't buy the company, you bleed it out. You go straight for the people who are the company. OpenAI is quite literally nothing without its people, remember, especially in AI. So Zuck starts cutting checks big enough to short-circuit anyone's loyalty circuits: $100 million to $300 million-plus packages. It's just a different kind of acquisition, just one quiet defection at a time until the center can't hold. Everyone's stuck in traditional acquisition-era thinking while Zuck's out here doing black-ops recruitment like it's Cold War Berlin.
Peter Levels also agrees that this is all about OpenAI. He writes: Everything Zuck does is to extinguish OpenAI. I think first it was open sourcing Llama. This way, he thought, if their model was just as good but free to use, people wouldn't have to pay OpenAI, because it was closed. But it didn't work. Then it was essentially buying Scale AI for $14 billion. Scale does a lot of the data labeling and creates a lot of the LLM datasets that OpenAI trains on. Getting control of that and removing OpenAI as a partner makes it harder for them to train models. Next is poaching of the OpenAI staff to hollow it out from the inside. Zuck sees OpenAI as his biggest threat, I think. Now, adding insult to injury, after their controversial announcement of Liquid Glass as their big non-AI announcement for Worldwide Developers Conference, Apple is now entirely backing off of that, showing just how wide the innovation problem is at that company at the moment. Now, speaking of OpenAI and the talent war, as we've been following, OpenAI's stock compensation has necessarily surged in this battle for top talent. According to The Information, OpenAI told investors that stock-based compensation has jumped more than 5x over the past year to a total of $4.4 billion. That means stock comp was around 120% of revenues for 2024. The ratio had been projected to fall to 45% this year as revenues took off, but Zuck's talent raid has led executives to rethink the strategy. Chief Research Officer Mark Chen indicated the company will look to get more aggressive with its stock offers to ensure they're competitive. OpenAI is still forecasting the ratio will fall to 10% of revenue by 2030, but 2030 is still a very long way away. The reporting also highlights how little equity there will be to go around as the company looks to convert to a public benefit corporation.
OpenAI employees currently receive profit units rather than traditional equity, but they'll convert along with the venture investors if the conversion goes through. The company's leadership have reportedly discussed scenarios where employees will own roughly a third of the company and Microsoft will own another third, so investors together with the nonprofit will divide up the remaining third. By way of comparison, Google's stock compensation ratio was around 16% of revenue when they went public in 2004, and Facebook's was roughly 6% in 2011. Snowflake was running at around 30% of revenue in 2020. The reality, though, is that as much as investors might get queasy with this, it is very clearly necessary in this particular competitive landscape. Next up, a follow-up to an earlier story. Cursor has apologized for unclear pricing changes that left many users with bill shock over the past month. CEO Michael Truell wrote: We recognize that we didn't handle this pricing rollout well, and we're sorry. Our communication was not clear enough and came as a surprise to many of you. Now, the change was implemented in the middle of June. Instead of receiving 500 fast requests on frontier models before dropping to unlimited slow requests, Pro users are now capped at $20 of usage. Excess usage is charged at the API price. Users can still access unlimited usage through the auto model selector, but that doesn't allow routing to frontier models. That led a lot of people who weren't aware of the changes to rack up excess usage charges at a dramatic pace. And indeed, over the past week, many users took to social media in outrage after being caught up by the changes. One Reddit user reported that they ran into the rate limit for Sonnet 4 after just three prompts. Until a documentation update late last week, Cursor was advertising unlimited access to Sonnet 4 for Pro subscribers. The user remarked, it feels like a scam at this point.
Oscar Lee wrote: We paid 7k yesterday for a yearly subscription and then you immediately pull the rug on us. One of our devs just used all 500 requests in a single day. Is that even legal? Another user wrote that it took them three days to blow past the included limits on the $200 Cursor Ultra plan, adding, this can't be serious. Others noted that Cursor had also removed the ability to bring your own API keys, so there's no way to get around those billing practices now. To their credit, Cursor reached out to each of those users offering full refunds. They've also extended the offer to anyone with unexpected usage since the change, so if you were affected, get in touch with that team. It's clear to me that this was not a gotcha; this was just a company growing-pains kind of moment. Still, while Cursor is trying to make it right, the issue highlights that usage costs have changed in a big way. They explained: New models can spend more tokens per request on longer-horizon tasks. Though most users' costs have stayed fairly constant, the hardest requests cost an order of magnitude more than simple ones. API-based pricing is the best way to reflect that. And by the way, this is not Cursor alone. Replit is also feeling the pinch, having introduced effort-based pricing for longer tasks last month. TechCrunch, meanwhile, suggests this is all due to a price change from the model companies, writing: OpenAI and Anthropic have also started charging enterprise customers for priority access to AI models, an additional premium on top of what AI models already cost that guarantees reliable high-speed performance. These expenses may be filtering their way down to AI coding tools, which seem to be getting more expensive across the industry. Given that cost is an increasingly important part of the consideration for enterprises adopting these tools, it will be interesting to see if and how these changes impact how companies think about adopting them.
Lastly today, a big acquisition. Fresh off their successful IPO, data center company CoreWeave has agreed to acquire Core Scientific for $9 billion. Core Scientific is something of a sister company to CoreWeave, renting them data center space and long-term energy contracts. They started life as a bitcoin miner but made the pivot to AI in 2024 after going through a bankruptcy. As a result of the deal, CoreWeave says they will gain direct access to more than a gigawatt of data center capacity. Still, mostly the deal just highlights how intense the demand for AI capacity has become. CoreWeave already made one attempt to acquire Core Scientific last year, but only wanted to pay a billion dollars. Although this is an all-stock deal, it's still a huge premium on last year's pricing. CoreWeave is even paying 60% more than market value for the company, underscoring just how desperate they are for additional capacity. In addition to increased capacity, the deal would also wipe out $10 billion in future lease obligations for CoreWeave by acquiring their landlord. CEO Michael Intrator said: When you look at the hyperscalers, they have some infrastructure that they build and they have some infrastructure that they use third parties to deliver. And there's a reason they do that. And those reasons kind of are applicable to us too. So that's what you're seeing. And so, taking a step back as we wrap up, if you look across all of these stories, Cursor having to raise its prices, CoreWeave paying a premium to get access to more data center space, and Meta and OpenAI being forced to pay boatloads of money to keep top talent, it shows how seriously everyone is treating the battle for AI supremacy right now. The short of it is that we are in a very small window that will shape a huge amount of the decade to come, and companies are just throwing all the rules out the window in order to be able to compete.
For now though, that's going to do it for today's AI Daily Brief Headlines Edition. Next up, the main episode. Today's episode is brought to you by KPMG. In today's fiercely competitive market, unlocking AI's potential could help give you a competitive edge, foster growth, and drive new value. But here's the key: you don't need an AI strategy. You need to embed AI into your overall business strategy to truly power it up. KPMG can show you how to integrate AI and AI agents into your business strategy in a way that truly works and is built on trusted AI principles and platforms. Check out real stories from KPMG to hear how AI is driving success with its clients at www.kpmg.us/AI. Again, that's www.kpmg.us/AI. This episode is brought to you by Blitzy. Now, I talk to a lot of technical and business leaders who are eager to implement cutting-edge AI. But instead of building competitive moats, their best engineers are stuck modernizing ancient codebases or updating frameworks just to keep the lights on. These projects, like migrating Java 17 to Java 21, often mean staffing a team for a year or more. And sure, copilots help, but we all know they hit context limits fast, especially on large legacy systems. Blitzy flips the script. Instead of engineers doing 80% of the work, Blitzy's autonomous platform handles the heavy lifting, processing millions of lines of code and making 80% of the required changes automatically. One major financial firm used Blitzy to modernize a 20-million-line Java codebase in just three and a half months, cutting 30,000 engineering hours and accelerating their entire roadmap. Email jack@blitzy.com with "Modernize" in the subject line for prioritized onboarding. Visit blitzy.com today before your competitors do. Today's episode is brought to you by Agency, an open-source collective for interagent collaboration. Agents are, of course, the most important theme of the moment right now, not only on this show, but I think for businesses everywhere.
And part of that is the expanded scope of what agents are starting to be able to do. While single agents can handle specific tasks, the real power comes when specialized agents collaborate to solve complex problems. However, right now there is no standardized infrastructure for these agents to discover, communicate with, and work alongside one another. That's where Agency, spelled A-G-N-T-C-Y, comes in. Agency is an open-source collective building the Internet of Agents, a global collaboration layer where AI agents can work together. It will connect systems across vendors and frameworks, solving the biggest problems of discovery, interoperability, and scalability for enterprises. With contributors like Cisco, CrewAI, LangChain, and MongoDB, Agency is breaking down silos and building the future of interoperable AI. Shape the future of enterprise innovation: visit agntcy.org to explore use cases now. That's agntcy.org. Welcome back to the AI Daily Brief. Today we are talking about one of the big themes, maybe the big theme in AI right now, which is of course agent transformation, agent adoption, the agentification of everything. Now, since almost the moment that ChatGPT came out, people have been excited about the possibility of this agentic era. Agents represent the idea of an AI that has greater capacity and can do more than just make you more productive, but in fact fundamentally change the equation on the amount and type of work you can do, or that your organization can accomplish as a whole. Still, what exactly we mean when we say agents can still be a little abstract and difficult. And so for this piece, we're building off of the inspiration of a recent piece that appeared in The Information called The Seven Kinds of AI Agents. We're going to look at both that as well as a couple of other frameworks to help you wrap your head around the different ways that you might think about categories of agents as a way to inform your personal or organizational agent strategy.
Now, one quick note before we dive into these definitions. Agents are officially and very much in the realm of the now. I shared these statistics before, but in the most recent KPMG Pulse Survey, they found a massive increase in full enterprise agent deployments. The percentage of organizations that had at least some agents fully deployed, that is, out of and past the pilot stage, tripled from 11% to 33% between Q1 and Q2. That followed a jump in pilots from 37% to 65% between Q4 and Q1. In net, 90% of the organizations that KPMG surveyed said they were past AI agent experimentation and actively into either pilots or deployments, meaning that this is here and happening right now. So before we get into a few different ways to break down subcategories of agents, let's talk about the broadest possible definition. If you've listened to me frequently, you'll know that I actually don't care all that much about hyper-specific definitions of agents. I think a lot of the discourse around whether something is an agent or an agentic workflow or an automated workflow kind of doesn't matter. And the reason that it doesn't matter is that I think in this circumstance, the common knowledge and intuition about the difference between an agent and other types of AI is actually more accurate, or at least more functionally useful, than those highly specific and technical definitions. Basically, if you asked your average business person, or person who works for a big company or corporation, what an assistant or a copilot-style AI is versus what an agent is, they'd probably give you some definition that roughly said assistants are AIs that I use to do things, whereas agents are AIs that do things for me. And I think broadly and directionally that is the right dividing line. So what then is the use of this sort of deeper dive that we're about to go through? Well, even if that is the broad category dividing line between assistants and agents, there still are many different types of agents.
And understanding different ways to think about them can be really useful as you're figuring out your agent strategy. Now, broadly speaking, there are a couple of different ways that people will try to organize agents. The first is based on their functionality; the second is based on their focus. So in terms of the definitions that are based on functionality, or how they operate, you'll often find this list of six or seven different agents that includes simple reflex agents, model-based reflex agents, goal-based agents, utility-based agents, learning agents, hierarchical agents, and sometimes the last that they'll add is multi-agent systems. Here's how AWS defines these things. A simple reflex agent is basically exactly what it sounds like: it operates based on predefined rules and the immediate data it has access to. As Amazon puts it, it will not respond to situations beyond a given event-condition-action rule. These agents, then, are well suited for very simple tasks that don't require a ton of training. The example that AWS gives is an agent that resets passwords by detecting specific keywords in a user's conversation. Some other use cases for this type of agent from DigitalOcean include automated sprinkler systems that activate based on smoke detection, email autoresponders that send predefined messages based on specific keywords or sender addresses, and things like that. Next up we have model-based reflex agents. These are basically more complex reflex agents with a more advanced decision-making mechanism that can evaluate probable outcomes and consequences before deciding. DigitalOcean writes: the model tracks how the environment evolves, allowing the agent to infer unobserved aspects of the current state. While these agents don't actually remember past states in the way more advanced agents do, they use their world model to make better decisions about the current state.
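That event-condition-action idea is simple enough to sketch in a few lines. This is a toy illustration only; the keywords and canned responses below are invented for the example, not taken from AWS or any real product:

```python
# Event-condition-action: the agent reacts only to the current input,
# with no memory and no model of the world beyond these fixed rules.
RULES = {
    "reset password": "Starting the password-reset flow for you now.",
    "unlock account": "Sending an account-unlock link to your email.",
}

def simple_reflex_agent(message: str) -> str:
    """Fire the first predefined action whose keyword appears in the message."""
    text = message.lower()
    for keyword, action in RULES.items():
        if keyword in text:
            return action
    # No rule matched: the agent has no way to respond beyond its rules.
    return "Escalating to a human agent."
```

The point of the sketch is what's missing: no planning, no state, no learning. Anything outside the rule table falls straight through to the fallback, which is exactly why these agents only suit narrow, repetitive tasks.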
A use case they give is network monitoring, where a model would rely on metrics, logs, events, and network metadata to understand overall network conditions, and then from there detect anomalies, route alerts, and help with root cause analysis. Next up we have goal-based agents, which, as opposed to reflex agents that act based on rules or world models, can plan sequences of actions to achieve their desired outcomes. The key components of goal-based agents include a goal state, a planning mechanism, a state evaluation, an action selection, and a world model. An example might be an inventory management system that can plan reorder schedules and maintain target stock levels. Learning agents, as you might imagine, are agentic systems that are capable of improving behavior over time by learning from previous experiences. The key difference from simpler agents is that rather than relying purely on knowledge that is pre-programmed, they can figure out how to achieve goals through experience. An example might be an advanced customer service chatbot which doesn't just rely on a set of pre-programmed knowledge, but can interact with and improve its experience over time based on the conversations it's having with customers. Next up we have utility-based agents, which differ from goal-based agents in that rather than shooting for some specific state, utility-based agents can explore and handle trade-offs between competing goals. An example that AWS gives is agents that search for flight tickets and can balance different types of benefits, like minimum travel time on the one hand versus price on the other, without having to know in advance which of those priorities the end user is going to prioritize most highly. With hierarchical agents, we start to get into agentic systems. AWS writes: the higher-level agents deconstruct complex tasks into smaller ones and assign them to lower-level agents.
Each agent runs independently and submits a progress report to its supervising agent. The higher-level agent collects the results and coordinates subordinate agents to ensure they collectively achieve their goals. The idea here is that by having different specialized agents that work together in a larger system, you can go out and assign agents to achieve more complex tasks, because they can break things down into more manageable subtasks. And then multi-agent systems might refer more broadly to combinations of these various agents which can achieve more complex goals. So this is a way of breaking down agents based on how they operate in the world. But there is another way to break agents down, which is based on focus. In this new Information article, they're honing in, it feels to me, on how agents are actually being deployed right now, and organizing the categories based on the output that the business using them is trying to achieve. The categories that The Information lists include business task agents, conversational agents, research agents, analytics agents, developer agents, and domain-specific agents. Business task agents are what some people might huffily say are actually automated workflows. These are useful for fairly simple but repetitive and common use cases like data entry, document classification, and invoice processing. A lot of the business process automation layer that's happening right now fits into this category. Next up is conversational agents, which is inclusive of both external-facing customer service as well as internal-facing support around IT or HR questions. Research agents are, of course, agents that can go do research. Research agents, I think, are particularly important for the average employee because they're one of the first agentic experiences that even non-technical folks are deploying to great effect right now.
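Before going further down the focus-based list, the utility-based flight search mentioned a moment ago is worth making concrete, because it captures the one real mechanical difference from goal-based agents. This is a toy sketch with an invented scoring function and made-up normalization, not any vendor's implementation:

```python
from dataclasses import dataclass

@dataclass
class Flight:
    price: float     # dollars
    duration: float  # hours

def utility(f: Flight, price_weight: float = 0.5) -> float:
    # Invented normalization: squash each criterion into (0, 1], higher is
    # better, then blend by how much this user cares about price vs. time.
    price_score = 1.0 / (1.0 + f.price / 100.0)
    time_score = 1.0 / (1.0 + f.duration)
    return price_weight * price_score + (1.0 - price_weight) * time_score

def choose_flight(flights: list[Flight], price_weight: float = 0.5) -> Flight:
    # Unlike a goal-based agent checking options against one target state,
    # the agent ranks every option by its trade-off score and takes the best.
    return max(flights, key=lambda f: utility(f, price_weight))
```

Given a cheap nine-hour flight and an expensive two-hour flight, a price-heavy weight picks the first and a time-heavy weight picks the second; nothing about "the right answer" is hard-coded, only the trade-off.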
The next category that The Information includes is analytics agents, which can analyze structured data to produce graphics, charts, or reports. And then the last two are some of the most discussed categories: developer agents, which of course have been the major, major theme for most of this year, and as we'll see in just a minute are perhaps the single most significant breakout agents so far. And then lastly is domain-specific agents, which have in other places been referred to as vertical agents; these are specialized agents that have very specific domain knowledge in an area like legal, healthcare, or finance. So whereas the functional frameworks for understanding different types of agents are useful in understanding what's going on under the hood and how agents are actually interacting with data and the world, these archetypes are a little bit more focused on the types of outcomes that you can enjoy if you successfully deploy these agents, which makes them useful in a different way for a different part of the planning process. Now, interestingly, we recently got a study from ICONIQ that looked at how the agent builders themselves are using AI and agents internally. So these are all from firms that are producing AI or agent software in some way, shape, or form. And when it came to the way that they are using AI and agents internally, there are some very clear trends. Notably, coding assistants are by far the most common use case, at 77% of organizations. But really, as you can see, there is significant usage of AI and agents across a huge array of business categories, from coding assistance to content generation to knowledge retrieval to product design to business intelligence and beyond. Now, of course, every different agency is going to have some different version of their framework for breaking down different types of agents. KPMG, for example, has tried to simplify that functional breakdown by organizing things into more like four categories as opposed to seven.
Their breakdown is the TACO framework, which organizes agents into taskers, automators, collaborators, and orchestrators, basically divided by the complexity of the tasks that they take on, the amount of human-in-the-loop they need, and the breadth of the systems that they can interact with. What's useful about the TACO framework is that I think these terms are more intuitive for a lay audience or a non-technical audience than perhaps the breakdown that includes words like reflexivity. Taskers, they write, execute well-defined individual tasks and require a human in the loop. Automators manage more complex tasks that span multisystem workflows. Collaborators are adaptive AI teammates that manage multi-dimensional goals, and orchestrators are transformative agentic systems that coordinate multiple agents and tools to manage interdependent workflows. Now, one interesting note, and a conversation I'm having a lot at the moment, is around this idea of orchestration. It is quite clear, if you're spending any time with enterprises or private equity firms, that there is a huge amount of discussion of orchestrators and multi-agent systems. And hold aside some of the technical conversations around what type of orchestration is needed for multi-agent systems to work: there is very clearly an emphasis on not just individual spot agents but agentic systems. This, to me, was one of the most notable things from Microsoft's Build conference back in May: rather than talking about their cool premier agent in each of these different focus categories, like conversational agents or research agents, Microsoft really put the emphasis on software and agent infrastructure. One of their big announcements was multi-agent orchestration in Copilot Studio, which was designed to allow people to deploy more comprehensive and complex agentic systems where the agents could actually interact with one another.
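To give a flavor of what that kind of orchestration means mechanically, here's a toy supervisor that deconstructs a task, delegates the pieces to specialized worker agents, and aggregates their reports, mirroring the hierarchical pattern from the AWS taxonomy earlier. The workers are stub functions standing in for real agents, and the hard-coded plan stands in for what a real orchestrator would generate dynamically, often with an LLM:

```python
# Toy hierarchical orchestration: a supervisor splits a task, hands the
# pieces to specialized workers, and collects their progress reports.
def research_worker(subtask: str) -> str:
    # Stub standing in for an agent that would search and gather sources.
    return f"research notes on '{subtask}'"

def summary_worker(subtask: str) -> str:
    # Stub standing in for an agent that would condense the findings.
    return f"summary of '{subtask}'"

WORKERS = {"research": research_worker, "summarize": summary_worker}

def supervisor(task: str) -> dict:
    # A real orchestrator would plan these subtasks dynamically; this
    # fixed plan just shows the decompose-delegate-aggregate shape.
    plan = [("research", task), ("summarize", task)]
    reports = {}
    for role, subtask in plan:
        reports[role] = WORKERS[role](subtask)  # each worker runs independently
    return reports
```

Even in this stub form, you can see why the infrastructure matters: the supervisor needs a registry of workers, a plan, and a place to collect results, which is exactly the layer products like Copilot Studio's multi-agent orchestration are trying to provide.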
We're of course also seeing frameworks to support this sort of agent-to-agent interaction, including A2A, agent-to-agent, which is a communications protocol. And recently someone asked me if I thought that some of these businesses were getting ahead of themselves talking about multi-agent systems when they had barely wrapped their head around, or deployed, a single agent yet. My short answer was that no, I thought that was actually a good thing. It's not about skipping steps and not doing the work to actually pilot and deploy and test and learn how to interact with single spot agents. But I think in general, organizations that think in these systems terms are more directionally aligned with where the world is heading. To the extent that agents really are taking on big chunks of labor and functions that were either (a) previously done by humans or (b) not possible for humans to do because of some complexity or cost equation, they are going to have to work together in comprehensive systems to get the full value of agents. It will not just be a single spot agent deployed in a clever way. It will be big, complex digital worker organizations. And even if we're not all the way there yet, I think anchoring our thinking and our systems design to that agent-systems future is going to be more productive than getting lost in the sauce of some specific exciting spot agent. Now, if that resonates, one recommendation that I would have is to start thinking about the infrastructure and tech stack that's going to need to be put around agents for you to get the most out of them. At the end of that same ICONIQ report, there's a 12-or-so-page breakdown of different platforms across all of the different agent tool areas. And there are a lot of different agent tool areas.
Model training and fine-tuning, LLM and AI application development, monitoring and observability, inference optimization, model hosting, model evaluation, data processing and feature engineering, vector databases, synthetic data and data augmentation, coding assistance, DevOps and MLOps, product and design. Now, not every enterprise is going to have to deal with all of these, but some of them are going to be common across basically everyone who's interacting with agents. Inference optimization is not going to be something that every organization does, but it is nearly 100% the case that you will have some sort of monitoring and observability suite, and same with evaluations. And the point is that as you think about agent readiness and exploring how to deploy agents, in addition to just thinking about use cases, also think about all of this infrastructure that needs to be built as well. And of course, I would be remiss at this point not to point out that if you are focused on this, if you are on this agentic journey, there are a lot worse places to start than the Superintelligent Agent Readiness Audit. It's a voice agent that we deploy to interview your leadership and teams about how they work now, in order to create a roadmap and a blueprint for both the specific agentic use cases that are likely to be most valuable for you, as well as the additional change management or organizational gaps you need to fill to be able to take advantage of those use cases. Of course, if you are interested, we would love to help you with these problems. But whether you work with Superintelligent or not, the reality is that agents are here and they are distinctly not monolithic. They represent a broad set of different types of capabilities and operational models, and understanding and starting to figure out which of those operational models and which of those focus areas are going to be most useful is going to be a key part of your work in the years to come.
Hopefully this gives you some additional tools to think about your options. And for now, that's going to do it for this AI Daily Brief. Appreciate you listening or watching, as always, and until next time, peace.
