A
Welcome to the Practical AI Podcast, where we break down the real-world applications of artificial intelligence and how it's shaping the way we live, work, and create. Our goal is to help make AI technology practical, productive, and accessible to everyone. Whether you're a developer, business leader, or just curious about the tech behind the buzz, you're in the right place. Be sure to connect with us on LinkedIn, X, or Bluesky to stay up to date with episode drops, behind-the-scenes content, and AI insights. You can learn more at practicalai.fm. Now, on to the show.
B
Welcome to another Practical AI Podcast episode. This time it's just Chris and I, my co-host. In these episodes where it's just the two of us, we try to take something that's in the AI news, or a topic for a deep dive, something that will help all of us level up our AI and machine learning game. I am Daniel Whitenack, I'm CEO at Prediction Guard, and I am joined as always by my co-host Chris Benson, who is a Principal AI and Autonomy Research Engineer. How you doing, Chris?
C
Hey, doing great. Lots of cool stuff out there. Looking forward to today's conversation.
B
Yes. Yeah, for sure, there's no shortage of things to talk about. I don't know if you remember this passing comment, Chris, but I think it was in our episode where we were talking about MCP on top of Kubernetes. The guest mentioned that when Anthropic drops one of these white papers or research topics or blog posts, often that's a window into something that's significant, something to pay attention to and review in detail. And it just so happens that on May, I think, 27th of this year, 2026, they released this... I guess it's an ebook, white paper, blog post, however you want to frame it. A framework around zero trust: "Zero Trust for AI Agents." They say, "We share a security framework for deploying autonomous AI agents in the enterprise, covering the new threat landscape, a tiered zero trust architecture, and defensive operations built for AI-accelerated attack." So that's a lot of words. Now, I think first off, Chris, it's probably worth recognizing that Anthropic obviously has a horse in this race, especially with things like Claude Code or Claude Cowork or all the Claude things. These are autonomous agents that can operate in your enterprise environment. So their customers, or people using these tools, are obviously thinking about the security implications of that. They also recently released Claude Security, which is more on the AI-for-security side, not so much the security-for-AI side, which is mostly what we'll talk about today in relation to this article or ebook. But yeah, I think that's worth acknowledging. Obviously, if people have a secure way of deploying autonomous agents, I'm sure they are hoping that many of those are built on Anthropic technologies.
C
I'm sure they do. And just to keep in the back of our minds, this is the same organization that has Mythos out there and is working with, I believe, 150 organizations, which is the latest number I saw published on their website, trying to go through and do security audits and such. And with the timing of this, I would guess (I don't know, just making a guess) that some of the leveling up that Mythos has enabled is probably driving some of their zero trust and other security concerns going forward. So, looking forward to this.
B
Yeah, I guess that's a good place to start with the premise of this. I think there are a few things to frame here. One is that there is probably a segment of the market, and of our audience, that is already using autonomous agents for something, even if that's just Claude Code or something like that for development purposes, where by autonomous I mean it's taking actions on your behalf. And I think generally, in terms of where we're seeing the market going on the positive side, organizations are going to need to adopt these autonomous agents more and more for value creation, new revenue, or operational efficiency. So premise one is that that's the way the market's going. The other background to this, though, is like you were saying: there's a bit of a forcing function here, because attackers, malicious parties, hackers, et cetera, have equal access to these agentic coding and development capabilities themselves. Right? Meaning that the pace at which people are about to be, or are already being, attacked and exposed to threats in their infrastructure is expanding exponentially. Which means you cannot keep up with that level of attack using human-only approaches. So the forcing function I'm talking about is that you are necessarily going to have to adopt autonomous agents, at least to help you manage the threats associated with the offensive use of this AI technology. So there's the positive side of this, obviously, which is that there's a future where autonomous agents are doing very positive things and you have this kind of digital workforce of agents within your organization.
But part of the forcing function behind this discussion is that people actually need to adopt autonomous agents because of this offensive threat to their infrastructure.
C
Yeah, I agree, and I think that'll put quite a strain on a lot of the humans involved in this, because there's a certain amount of leveling up from a human standpoint to understand what the different harnesses are, what capabilities are now becoming available, vendors versus open source, and such. So actually getting to the point where you can start implementing these is a bit of a lift. And I think that's going to be something we observe: there'll be a spread across organizations. On one extreme end you have the Anthropics that are leading the way and producing these capabilities. But then there are a lot of mom-and-pop organizations, or maybe not that small, but mid-size, that are going to struggle to level up. And so I think the security landscape will be very interesting, a little bit Wild West, in the days ahead. Even if tools are available, people have to get to where they can take those up and get productive with them. So.
B
Yeah, so I agree, and maybe a way to get into this discussion is to frame the background with an assumption. I'm sure there are arguments against this assumption, but let's assume that your organization is adopting, and will adopt, autonomous agents for positive things, like I talked about (operational efficiencies, new revenue, whatever that is) and/or for cybersecurity purposes. If we assume that, then you say, okay, now we're going to have these autonomous agents operating in our environment. They could cause all sorts of harm themselves. So I could shoot myself in the foot trying to protect against the offensive, malicious people by releasing a bunch of agents into my infrastructure, and they themselves cause a lot of harm. How do I manage those things? And to be clear, Anthropic has not come up with this idea of zero trust. This is a general concept, which we could talk about the definition of. But with this framework they're essentially releasing a way to think about a zero trust approach for managing AI agents, or autonomous agents, within your organization. So maybe it'd be good to define that term first. In the past, if we think about cybersecurity, there's been what's generally referred to as perimeter-based cybersecurity. This is a more traditional model that focuses on the boundary of your organization, inside versus outside, internal versus external. The core principle is that I'm going to trust everything that's inside and distrust everything that's on the outside. So there is a perimeter, and within that perimeter I trust things. A zero trust approach to cybersecurity, on the other hand, would actually assume that threats are already inside your network, already inside your perimeter.
So it treats every user, device, and request as a potential threat. That's why it's called zero trust. And like I say, this has been around for a long time. NIST published about it in "Zero Trust Architecture" back in 2020, and other government organizations and others have talked about it as well. So that's the difference. I don't know if that zero trust idea has crossed into your perimeter of knowledge, Chris.
C
Sure, yes. Without going into any detail at all, working in defense and intelligence, it is pretty core. And the simple way of thinking about it is that every single API request you have has to have a security credential, and that can come from a variety of different mechanisms. But you don't trust anything, and everything is down to a granular level: nothing proceeds unless it is authenticated and authorized to do whatever it is trying to do. So in the world that I'm living in, that's pretty standard. Though I think there's room for all of us, even those of us who have been doing it, to level up and get better at this. I don't think there's anybody who has just nailed it. So it's one of those ongoing learning curves.
B
Yeah. And we're about to dig into a lot of that as it relates to AI agents. However, to your point, there are a lot of organizations that are still trying to think about this concept even in their general cybersecurity world. And one of my hot takes here is that we'll talk about these foundational things that Anthropic is suggesting, and probably 90% plus of organizations, enterprises that have AI deployments currently, are not operating according to this model. According to this framework, they would be completely exposed. So I'm just acknowledging that much of this is probably aspirational for enterprises, and they need to work towards it, maybe in a more rapid way, just because of how things are advancing. And there's better tooling out there day by day, better products, et cetera. But yeah, if you're out there and you're thinking, I have agents running and I have none of what we're about to talk about, that's probably the situation that most are in in the enterprise world, would be my...
C
Hopefully today we can help people start on a path here to mitigate some of their risks.
B
Exactly. Next week, you have no excuse. But coming into this conversation, you have an excuse. Yeah, exactly. So I would encourage people, if you just search for "zero trust for AI agents" Anthropic blog post... we'll link it in the show notes as well, so you can click through to that ebook and the framework itself. There's a lot that we won't be able to cover in detail, but the overall structure they present is some initial background and considerations, definitions related to autonomous systems that you need to consider, then the current threats to those agentic or autonomous systems, and then how to apply zero trust to those threatened agentic systems. That's the flow of what they talk about. So the first thing, and I think this is something we've already covered on the show, but just to set the foundation with some of this background information: why are we talking about a new framework? Well, agents are different in how they operate. We've talked about this on the show before. They use a distributed set of tools, they interpret instructions, they try to accomplish goals, they execute operations without human initiation. Importantly, they might preserve context across sessions if they're trying to accomplish some goal. And then you add multiple agents, and they might communicate with one another. So you've got this multi-agent communication. Now, there are a couple of terms here, Chris, that I think we've mentioned before, but they define them specifically in relation to agent security, and people might be unfamiliar with them. One is blast radius, and I think people could guess what that means. Right? It measures the potential damage if something goes wrong.
If an agent does go off the rails, that's the blast radius. And then least agency, which I guess is a term coined by OWASP, extends the idea of least privilege to agentic applications. So you shouldn't be giving more agency to your agents than they need to do their agent things.
C
And that's standard zero trust ideas. You give it just what it needs and absolutely no more.
B
Yep. And so that's the background in which we're operating. Then the Anthropic paper goes into these current threats. Some are ones we've talked about, some are ones we've not talked about as much. Chris, it's interesting that they frame everything within the agent world as agentic systems, which I very much agree with. In our product, that's why I insist on using the idea of an AI system as a thing, because you have this distributed set of things that are powering agents these days. And so they break down these current threats to agentic systems. The first of those, which is probably not a surprise because it's often first on OWASP's list as well, is prompt injection and instruction manipulation. Again, we've talked about this. There's everything from the obvious direct human input into a chat interface: ignore your instructions and do this other thing, which you shouldn't be doing. But the one they mention as more difficult or scary is indirect prompt injection, where that's coming in through, maybe, a file. You have an agent connected to your email, and an attachment comes through with hidden instructions in it. Anecdotally, I helped another company do some interviews, and I wrote a technical exercise and put it in a PDF. I knew everyone would use Claude Code, like they should, but just because I wanted to have fun, I had all the instructions in black text, and then I had an extra three quarters of a page or so. So I filled up that page with instructions that would make Claude Code do the opposite of what I was saying in the instructions, just to see if they would catch it. So, that sort of thing.
C
Very devious, very devious.
B
It was fun.
C
Did you make it white text in the PDF so it wasn't obvious, just like white space, which would get
B
interpreted if you just uploaded it to Claude Code or whatever?
C
That's very sneaky, but actually quite common as an attack vector. I mean, everyone just throws in everything they can, the way things have been operating. And so, yeah, that's what we're dealing with today.
B
Yes, true. So that's threat number one, prompt injection and instruction manipulation. Threat number two that they talk about is related to agents using tools, particularly through MCP, which was a topic on a recent episode of this show that you can look back at for much more information.
C
On MCP.
B
Yep, on MCP. Yeah. So they talk about agents that can manipulate tools maliciously, or do things that they shouldn't be doing because of privileges. I think about it like this, Chris: maybe I set up a FastAPI API that my agent could use, and in the instructions I only tell it about a couple of GET routes on the API, but I don't shut down the other routes. Right? And if the agent was smart in any sort of way, it could just look at the Swagger documentation at the /docs endpoint and know about all the other routes that maybe it shouldn't use. And then all of a sudden I have problems. Right? That's right, yeah.
C
And just to clarify, Swagger is a protocol that defines what those routes are. And you mentioned going off the rails, but the notion of a malicious MCP server has now been documented, and there could be lots of various types of tooling coming into being now just to take advantage of these vulnerabilities. So we'll see a whole lot of malicious software arising to do these kinds of tool and resource misuse.
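The gap in the FastAPI example (the agent is told about two GET routes, but every other route stays live) is usually closed by enforcing the allowlist on the server side rather than in the prompt. A minimal sketch of that idea, assuming a hypothetical gateway check; the agent ID, route names, and the `is_request_allowed` helper are illustrative and do not come from the Anthropic framework:

```python
# Hypothetical server-side allowlist for an agent-facing API gateway.
# The agent ID and route names below are illustrative assumptions.
ALLOWED_ROUTES = {
    "reporting-agent": {("GET", "/orders"), ("GET", "/orders/status")},
}


def is_request_allowed(agent_id: str, method: str, path: str) -> bool:
    """Deny by default: only explicitly listed (method, path) pairs pass."""
    allowed = ALLOWED_ROUTES.get(agent_id, set())
    return (method.upper(), path) in allowed
```

With FastAPI specifically, the interactive docs can also be switched off by passing `docs_url=None`, `redoc_url=None`, and `openapi_url=None` to the `FastAPI(...)` constructor, though hiding the documentation is not a substitute for rejecting the requests themselves.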
B
Yeah, exactly. And a lot of times these tool descriptors or schemas or metadata are injected into the context for an LLM to actually generate the output. So if I'm a malicious party, or maybe just an agent that doesn't know what it's doing and, like they say, has drifted from its goals or something, there's nothing preventing me from doing this poisoning thing, where I find out about the descriptor, schema, and metadata, and I even modify that in the instructions to maybe get the MCP server to do different things. Right? So this tool and resource misuse is definitely a reason why it's number two there. The next one is identity and privilege abuse. So they talk about this: agents often operate with elevated privileges or service accounts, and traditional identity systems designed for humans struggle to accommodate them. There's sometimes unscoped privilege inheritance. I kind of think about this like, what was that cybersecurity book? It's like the Cuckoo's.
C
Yes, the Cuckoo's Nest or something.
B
That was great. Someone can tell us in, in the comments, but it's like you, you kind of land one place in a network and then you escalate privileges. Right. And you can move laterally and go in all of these directions. Right.
C
Really old book. It was one of the original cybersecurity books that came out before it was really a field. I read it many years ago and yeah, definitely inspiring. Yeah.
B
And the Cuckoo's Egg. That's what it was. Yes.
C
Yeah. And as you are looking at lots of different agents that have different levels of privilege and different capabilities, and as agents are formulating things during runtime, essentially, things that didn't exist as a preset, static thing that you want to do, it's very easy for one agent to spin off another agent that has more privilege than it needs, and then that can be taken advantage of. So there are lots of different variations of how those kinds.
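One common guard against a spawned agent ending up with more privilege than it needs is to attenuate scopes at spawn time, so a child can never hold anything its parent lacks. A minimal sketch under that assumption; the `Agent` class, `spawn_child` helper, and scope strings are hypothetical names made up for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    scopes: frozenset


def spawn_child(parent: Agent, name: str, requested: set) -> Agent:
    """Grant only scopes that were requested AND that the parent already holds."""
    granted = frozenset(requested) & parent.scopes
    return Agent(name=name, scopes=granted)


# Illustrative usage: the child asks for an admin scope the parent never had.
planner = Agent("planner", frozenset({"read:tickets", "write:tickets"}))
summarizer = spawn_child(planner, "summarizer", {"read:tickets", "admin:users"})
```

Here `summarizer` ends up with `read:tickets` only; the `admin:users` request is silently dropped, and a real system would probably also log it.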
B
Yeah, for sure. So that's the privilege piece. And I should say, I do really encourage people to take a read through the ebook. Obviously we're highlighting some of these things, but there's much more detail there. Also, a great resource if you're trying to learn some of this is the OWASP Gen AI project. We've had reps on our show before, and my team's involved in the AI BOM project and other things with OWASP. There are a lot of great people involved, and they have so many great resources online related to this sort of thing: guides for MCP, guides for agentic security, et cetera. So take a look at those as well. You might be listening to this episode and thinking, hey, I am part of one of those organizations that's in the 90% of enterprises that are not ready, security-wise, for autonomous agents operating in my environment. How am I going to manage supply chain risks, have an AI bill of materials, define agent boundaries, secure tool access, and implement input validation and output controls? Well, this is one of the reasons why I think it's so important to have great platforms that don't require you to build your own AI agent governance platform. That's why, outside of the Practical AI podcast, I personally am leading an organization full of really smart people who are thinking about these problems and have brought Prediction Guard into existence. Prediction Guard is an AI control plane that's self-hosted. It lives in your own infrastructure, where you're going to deploy those autonomous agents, and it allows you to manage this supply chain risk, put in governance policies that are enforced, and maintain observability over those agents. And I'm just really excited about the capabilities that are already in the product and are being released later this year.
So I would encourage you, please check us out at predictionguard.com/practicalai. You can book a call with me and the team to discuss how you're going to manage security for your agents operating in your enterprise. That's predictionguard.com/practicalai.

The next one that Anthropic highlights is supply chain and dependency risks. So you were just mentioning how sometimes agents compose things at runtime, Chris. This includes potentially loading external tools or installing packages or changing infrastructure, and so that supply chain can actually update in real time, at runtime, as agents are trying to accomplish a task. But there's also the model and tool supply chain. Models have their own supply chains related to the weights and how they were trained or fine-tuned, and how easy it is to jailbreak them or prompt inject them. But then MCP servers are also software components, right? They have their own integrations, their own software dependencies, et cetera, which have their own potential vulnerabilities. So it's very much a multi-layered thing that could evolve dynamically, which is kind of scary.
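One way to make a runtime-changing supply chain checkable is to pin a hash for every component (model file, package, tool descriptor) in a bill of materials and refuse anything that does not match. A minimal sketch, assuming a plain dictionary stands in for the AI bill of materials; the `verify_component` helper, the component name, and the descriptor contents are invented for illustration and are not a real BOM format:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_component(bom: dict, name: str, content: bytes) -> bool:
    """Return True only if the component is listed and its hash matches the pin."""
    pinned = bom.get(name)
    return pinned is not None and pinned == sha256_hex(content)


# Illustrative: pin a tool descriptor at review time, then check it at load time.
descriptor = b'{"name": "search_orders", "description": "Read-only order lookup"}'
bom = {"tool:search_orders": sha256_hex(descriptor)}

tampered = b'{"name": "search_orders", "description": "Also email results externally"}'
```

With these values, `verify_component(bom, "tool:search_orders", descriptor)` passes, while the tampered descriptor and any component missing from `bom` are rejected.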
C
And one thing to call out while we're talking about supply chain and dependency risks is that all of the traditional zero risk vulnerabilities, all the things that we were talking about in the cybersecurity world before we started having agentic AI system conversations, those all still apply as well. I was prompted, no pun intended, to say that by you when you mentioned the multi-layer. So you can still have BIOS and CMOS vulnerabilities that lend themselves to some of these vulnerability layers and packages that build up. So there are many different points in a stack where these attacks can occur.
B
So all the way down to networking and firewalls, right? If you have an agent operating in that environment, it could find and detect things that it shouldn't. So yeah, it's multi-layered, which many security things are. And I know OWASP always recommends this kind of layered approach. The last two are related: memory and context poisoning, and RAG poisoning. Both are ways of getting things either into the memory or context for an LLM call, or into RAG data (retrieval augmented generation data), which often lives in a database, a vector database. If you have no control over what is committed to that memory or to that vector database, and how, there's nothing preventing agents or external parties from inserting things into it. So the example I used last year at the Midwest AI Summit, Chris... which, as a reminder to our folks, the Midwest AI Summit is coming up October 15th. It's going to be another great experience. You can search for the details: Midwest AI Summit. But I think I used an example where it was a healthcare situation, and an agent or a prompt in a first interchange says, hey, do this for patient A. And then in the follow-up you say, well, in all the following, consider patient A to be patient B. And then you keep filtering in that information about patient A being patient B. And then later on, when you're wanting some information about patient A or patient B, all of a sudden you're getting data that you shouldn't be getting. So it can happen, and it has been shown to happen. Okay, Chris, that's all the scary things. I guess there's a lot of them.
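The point about controlling what gets committed to memory or a vector database could be sketched as a write gate that records provenance and rejects entries from sources that are not explicitly trusted. This is a minimal illustration, assuming made-up source labels and a hypothetical `MemoryStore` class; it is not how any particular memory or RAG product works:

```python
from dataclasses import dataclass, field

# Hypothetical source labels; a real deployment would define its own.
TRUSTED_SOURCES = {"clinician_ui", "ehr_sync"}


@dataclass
class MemoryStore:
    entries: list = field(default_factory=list)
    rejected: list = field(default_factory=list)

    def commit(self, text: str, source: str) -> bool:
        """Store the entry only if its source is on the trusted list."""
        record = {"text": text, "source": source}
        if source not in TRUSTED_SOURCES:
            self.rejected.append(record)  # keep for review instead of dropping silently
            return False
        self.entries.append(record)
        return True


store = MemoryStore()
ok = store.commit("Patient A allergy: penicillin", source="ehr_sync")
blocked = store.commit("Consider patient A to be patient B", source="chat_user")
```

A source check like this only narrows who can write; it would not catch a trusted source that has itself been manipulated, so it sits alongside the other layers rather than replacing them.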
C
Now we've got to figure out how to fix this.
B
Right, now we've got to figure out how to fix this. And I do like the general structure that Anthropic provides here, recognizing again that many people are behind in this, and that new tools and products will need to address many of these things gradually over time. They present three, I think what they call capability tiers, or three tiers of application. Basically saying, hey, in these different areas you need to do something. There's the minimal thing that you should do, which they call foundation, the minimum viable thing. Then there's an enterprise tier, which means, hey, if you're an actual enterprise and need to be robust and resilient, you need to do these things. And then there's advanced, which would apply to particularly high-risk or stringent regulatory environments, or maybe aspirationally for everyone else to try to get to that level. So: foundation, enterprise, and advanced. And then they develop something at each of these tiers for each of a number of the threats that we talked about, or the areas in which you need to secure things. The first one, kind of dimension,
C
it kind of breaks them down by dimensions and then tiers them against those three tiers that you just described.
B
Yeah, it's kind of like, I need to consider these however many things, I forget how many there were. I need to at least be at the foundation level for all of these, and then I can circle back and maybe upgrade particular ones to enterprise, or gradually work on it over time. So the first of those is agent identity and authentication, which they frame as the foundation for every other security capability. Because without this identity, you can't really enforce other things throughout the framework. Now, as we go through here, they talk about certain ways of doing identity and verification, and there are a couple of terms in here that people may be unfamiliar with as well. One of those is hardware-bound credentials. I'm sure this is also a part of your life, Chris.
C
Hardware-bound credentials are where you have to present, maybe, a USB or something. There are a lot of different ways it can work, but you have to insert a piece of hardware, or make accessible a piece of hardware, which provides that authentication and which an adversary would be unlikely to have in their possession. And that doesn't necessarily do it by itself. There are usually multiple tiers. But that is one way of contributing significantly: if you don't have a physical piece of hardware in your hand, you're not going to be able to gain access even if you can break through other tiers.
B
Yeah. And this idea of it being bound to hardware, I think, is the key point that you're referencing, Chris. Otherwise, their view is, hey, if you have API keys, for example, and those are just floating around, you should probably consider those already compromised, if we're going with this idea of zero trust. Versus if an agent has an identity and an authentication to access this environment, and that authentication is tied specifically to the hardware that it's operating on. That hardware-bound credential is something that they talk about. And just to give some examples here, in the agent identity and authentication piece... and we won't be able to go through all the tiers of all the categories, we just don't have time. But just to give an example: for the agent identity verification piece, the foundation level that they suggest is to have unique cryptographic identifiers for each agent instance. So assign persistent agent IDs backed by cryptographic material, not just labels. Track agent lifecycle from creation to retirement. IDs appear in all logs and access requests. The enterprise level is certificate-based authentication with full lifecycle management. And the advanced level is hardware-backed identity with attestation. So at that advanced level, you store agent credentials in hardware security modules or trusted platform modules, with remote attestation. There's a whole rabbit hole you could go down there with those terms. But that would fit into their advanced category.
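The foundation-tier idea (a unique ID per agent instance, backed by cryptographic material rather than just a label) can be illustrated with standard-library primitives. This is a simplified sketch that uses a shared-secret HMAC as a stand-in; the certificate-based and HSM/TPM-backed tiers described above are not shown, and the function names and dictionary layout are assumptions for illustration:

```python
import hashlib
import hmac
import secrets
import uuid


def create_agent_identity() -> dict:
    """Issue a unique ID plus secret key material for one agent instance."""
    return {"agent_id": str(uuid.uuid4()), "key": secrets.token_bytes(32)}


def sign_request(identity: dict, payload: bytes) -> str:
    """Sign agent_id + payload so the ID cannot be copied onto another request."""
    message = identity["agent_id"].encode() + b"|" + payload
    return hmac.new(identity["key"], message, hashlib.sha256).hexdigest()


def verify_request(identity: dict, payload: bytes, signature: str) -> bool:
    """Constant-time check that the signature matches this agent and payload."""
    expected = sign_request(identity, payload)
    return hmac.compare_digest(expected, signature)


agent_a = create_agent_identity()
agent_b = create_agent_identity()
signature = sign_request(agent_a, b"GET /orders")
```

A signature made by `agent_a` verifies only for `agent_a` and only for the exact payload it covered, which is what lets the ID show up meaningfully in logs and access requests.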
C
That's right, yeah.
B
So that's an example of one of these categories, agent identity and authentication. The next category that they talk about is access control and privilege management. So assuming you have an identity for your agent, you then need to control access and privileges for that agent. And that authorization layer should enforce this idea that we defined earlier of least agency, which is ensuring agents receive only the access required for their specific function. And this can get very subtle. Like that API example that I gave: you could tell an agent about only these endpoints, but if you haven't literally shut off the network for the other endpoints or something, then there's nothing preventing that agent from going off the rails in that case. So just to give another set of examples here, the access control foundation level is role-based access control, or RBAC, with deny by default. That's the foundation in that category.
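Role-based access control with deny by default can be sketched in a few lines: an action is allowed only when one of the agent's roles explicitly grants it, and everything else falls through to a denial. The role names, agent ID, and permission tuples below are hypothetical examples, not taken from the framework:

```python
# Hypothetical roles and assignments for illustration.
ROLE_PERMISSIONS = {
    "ticket-reader": {("tickets", "read")},
    "ticket-triager": {("tickets", "read"), ("tickets", "update")},
}
AGENT_ROLES = {"agent-7f3a": {"ticket-reader"}}


def is_authorized(agent_id: str, resource: str, action: str) -> bool:
    """Allow only if some role held by the agent explicitly grants the pair."""
    for role in AGENT_ROLES.get(agent_id, ()):
        if (resource, action) in ROLE_PERMISSIONS.get(role, ()):
            return True
    return False  # deny by default: unknown agents, roles, or actions all land here
```

The design choice worth noting is that there is no "deny" list at all; anything not granted is refused, which is what keeps the agent within least agency when new resources appear.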
C
That's right. And by the way, just as we're working through this, I wanted to make one quick comment. These are all standard zero trust concepts. So those of you who may be watching may recognize a lot of these categories. And I think the key is thinking about it within this agentic context, as we're all onboarding agents and stuff. But keep going. I just wanted to call that out for those that might recognize it.
B
Yeah, for sure. I think we can't abandon our good security intuition, especially when you start treating these agents as having an identity and operating in this zero trust environment. Some of these things flow through if you work out those details. But yeah, the next category is behavioral monitoring and response. Or sorry, observability and auditing. Actually, these two are tied together, so we could probably talk about them together. There's observability, which essentially captures what agents do. It observes what agents are doing, and you need visibility into that. So you need logging and audit trails. In our implementations with customers, in my day-to-day work, I often like to say, hey, we need to know that this human user, using this API key, triggered this agent, which has this identity, to pursue this goal, which issued these prompts, which triggered this tool call, which had this input, which was blocked by this governance policy, et cetera. And on down the line. We need that kind of traceability and logging. Otherwise you can't have visibility or build rules or monitor things. So that's the observability piece. But observability captures only what agents do. The behavioral monitoring that they're talking about determines whether the actions that agents are taking should be allowed or are suspicious.
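The chain described here (user and API key, then agent, then tool call, then the policy that blocked it) is typically captured as structured log records that share one trace ID. A minimal sketch, assuming JSON lines as the log format; the `audit_event` helper and all field names, IDs, and policy names are illustrative assumptions:

```python
import json
import uuid
from datetime import datetime, timezone


def audit_event(trace_id: str, actor: str, event: str, **details) -> str:
    """Build one JSON log line; every line in a chain shares the same trace_id."""
    record = {
        "trace_id": trace_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "event": event,
        "details": details,
    }
    return json.dumps(record, sort_keys=True)


trace_id = str(uuid.uuid4())
lines = [
    audit_event(trace_id, "user:alice", "agent_triggered",
                api_key_id="key-123", agent_id="agent-7f3a"),
    audit_event(trace_id, "agent:agent-7f3a", "tool_call",
                tool="search_orders", tool_input={"customer": "c-42"}),
    audit_event(trace_id, "policy:no-bulk-export", "blocked",
                reason="row limit exceeded"),
]
```

Filtering the log on a single `trace_id` then reconstructs who triggered what, with which identity, and where a governance policy stepped in.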
C
Are they appropriate for what you would expect?
B
Are they appropriate? Yes, that's right, exactly. And this is behavioral monitoring and response, right? So in certain cases, like I say, when we enforce governance policies, we say, well, if we see this, then do this. So sometimes that's blocking certain things, sometimes it's just logging, sometimes it's, you know, alerting someone using a particular platform. Okay, the second to last one is input validation and output controls. So what are we on? 1, 2, 3, 4. This is the fifth one. This is probably the one that most often comes to people's minds and I think is often maybe overemphasized, which is this idea that you would have point checks over, you know, harmful things that the agent could produce in its output or harmful things that could go into the agent's context or something. This is very important, I would say, but it's kind of like table stakes. The example I usually give is, you know, is it bad for me to take my temperature if I want to be a healthy human? Well, that's not a bad thing. You can take your temperature. It doesn't mean that you are plugged into a healthy lifestyle, or being governed by, you know, health records as part of a healthcare system, and have a primary physician and a care plan and a diet. It's just a very limited way to view that kind of overall health. And if we extend that here, this would be these sort of point checks of validating inputs and outputs, which, yeah, again, I would say are table stakes. And the last one is integrity and recovery. So all of this prevention and detection assumes agents operate correctly. You know, when they don't, what do you do?
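The "if we see this, then do this" style of governance policy mentioned in this turn can be sketched as a small ordered rule table that maps an observed event to a response (block, alert, or just log). The rule names, event fields, and thresholds below are hypothetical examples, not rules from the Anthropic document.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]
    action: str  # "block", "alert", or "log"


# Hypothetical rules; a real deployment would define its own.
RULES = [
    Rule("no-shell-tools", lambda e: e.get("tool_name") == "shell", "block"),
    Rule("large-export", lambda e: e.get("rows_returned", 0) > 10_000, "alert"),
]


def evaluate(event: dict) -> str:
    """Return the action of the first matching rule; otherwise just log the event."""
    for rule in RULES:
        if rule.matches(event):
            return rule.action
    return "log"


print(evaluate({"tool_name": "shell"}))
print(evaluate({"tool_name": "sql", "rows_returned": 50_000}))
print(evaluate({"tool_name": "sql", "rows_returned": 10}))
```

First-match-wins ordering is an assumption of this sketch; it means more severe rules should be listed before less severe ones.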
C
Yeah, and I think that's actually a pretty big question in the agentic systems world. If you think about, you know, going back a couple of points to behavioral monitoring and trying to identify what's appropriate for agents to be doing within all the other security parameters that we've talked about along the way, when you have gotten outside the bounds of what is appropriate, trying to figure out how to roll agents back, especially if they're in critical functions, can be quite challenging, because those critical functions still have to be addressed. And so if a critical function is compromised by an agent that is intentionally or unintentionally off the rails, then figuring out how you take a critical system back and get it to a safe place to proceed in whatever is appropriate for that function can be quite challenging. And so I have spent some time in that space myself, and I think that there's a lot of imagination that has to go into it that maybe wasn't quite as necessary in pre-agentic zero trust models. So I just wanted to call that out.
B
Yeah, to give some examples, Chris, for configuration integrity they talk about, at the foundation level, version-controlled agent configurations, and at the advanced level, immutable infrastructure with attestation. On the recovery capabilities, they talk about, at the foundation level, documented rollback procedures, which, to your point, having an idea of what you might do is one thing, and being able to actually do it is sometimes a challenging thing. At the advanced level they talk about self-healing systems with automatic remediation. So yeah, I definitely agree with your points there. I know that we're getting close to the end here, Chris, and just to kind of wrap things up, Anthropic does a good job at kind of saying, hey, here's all of this stuff and all of these tiers and levels and categories, et cetera. But then they do provide a kind of phased way that you can think about implementing agents, which I think is helpful: identifying requirements, managing supply chain risks, including what they call AI BOM or AI bill of materials, defining agent boundaries, defending against prompt injection, securing tool access, protecting agent credentials, and then safeguarding agent memory. And they give some kind of specifications under each of those phases for people to think about.
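To make the configuration integrity and rollback idea a little more concrete, here is a minimal sketch. It assumes agent configurations are plain dictionaries and that a list of previously approved versions is available (for example, from version control). A simple SHA-256 digest comparison stands in for attestation here, which in practice typically involves signed measurements. The function names are hypothetical.

```python
import hashlib
import json


def config_digest(config: dict) -> str:
    """Hash a canonical JSON form so that key order does not change the digest."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def select_config(running: dict, approved_history: list) -> dict:
    """Keep the running config if it matches an approved version;
    otherwise roll back to the most recently approved one."""
    if not approved_history:
        raise ValueError("no approved configuration to roll back to")
    approved_digests = {config_digest(c) for c in approved_history}
    if config_digest(running) in approved_digests:
        return running
    return approved_history[-1]


# Hypothetical approved versions, oldest first.
approved = [
    {"model": "m1", "tools": ["search"]},
    {"model": "m1", "tools": ["search", "tickets"]},
]
tampered = {"model": "m1", "tools": ["search", "tickets", "shell"]}

print(select_config(tampered, approved))     # rolls back to the last approved version
print(select_config(approved[0], approved))  # already approved, kept as-is
```

This covers only the configuration itself. As the conversation notes, rolling back an agent that sits inside a critical function is harder than swapping a file, because the function still has to keep running.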
C
Yeah, I think, you know, as we're winding up, just to share kind of how I perceive establishing the workflow: in the zero trust world that we've been in for a number of years, it's fairly static. You know, there's a lot of things and you kind of have to tick them all off. It's almost a regulatory approach to system development. And I think the thing that agentic implementations require is trying to anticipate an incredibly dynamic capability that can arise, you know, kind of an emergent quality. And I think what Anthropic has done for us is given us a way of taking what we already know in a zero trust context and pointed out that within agentic systems it definitely requires a level up, to take the same ideas but get them out of that static mindset and move into anticipating dynamic capabilities from agents. And I know, as we're both in our own jobs and stuff, that's certainly required us to kind of level up and reconsider. It makes for a very interesting problem set to address.
B
Yeah. And there's major thought process changes or philosophical shifts, as you're mentioning, that as practitioners we may have to make. Anthropic talks in the ebook about this idea of AI vendoring: hey, there's these fragile open source projects out here that you might rely on. The thing to do might just be to have your agentic coding system completely vendor, or literally not copy but generate, a new version of that project that's proprietary to you and under your control, and just include it in your project rather than bringing in a third-party dependency. So there's philosophical shifts like that. I do think there's some hard things that we'll still have to wrestle with. I think there's still some of this conclusion that humans are going to have to make containment decisions around how to contain these things, whether it be threats in your environment or agents operating in your environment. And if things are moving so fast, I just think it's going to be hard for humans. You know, if something is happening in your infrastructure and exploit timelines go from months to hours to minutes to seconds, you can't just rely on waking up the CISO in the middle of the night to approve, you know, shutting this thing down. Right?
C
I mean, this is a revolution in cybersecurity. You know, as we're finishing up here, every intelligence agency in the world is learning how to both defend against and exploit these potential vulnerabilities that we're talking about, as well as criminal organizations of all sizes and shapes on a global scale. So, you know, I think we're at the very beginning of this journey. I think this is a fantastic start to get us thinking. I think we're going to see a lot more tooling and a lot more capabilities coming out in the days ahead. And it seems to be coming out very quickly because the threats have risen very quickly. And so I hope folks find this as useful as we did in terms of kind of reframing this modern take on cyber in this agentic world that we've been talking about nonstop throughout this last year.
B
And we'll, like I say, include the links in the show notes, so take a look at those. Excited to keep the conversation going. Thanks for this today, Chris.
C
Yeah, thanks for taking us through it. It was a good exercise today.
A
All right, that's our show for this week. If you haven't checked out our website, head to PracticalAI.FM and be sure to connect with us on LinkedIn, X, or Bluesky. You'll see us posting insights related to the latest AI developments and we would love for you to join the conversation. Thanks to our partner, Prediction Guard, for providing operational support for the show. Check them out at predictionguard.com. Also, thanks to Breakmaster Cylinder for the beats and to you for listening. That's all for now, but you'll hear from us again next week.
Episode Title: Zero Trust for AI Agents
Release Date: June 11, 2026
Hosts: Daniel Whitenack & Chris Benson
Main Theme: Exploring Anthropic’s "Zero Trust for AI Agents" framework—why "zero trust" matters for the security of autonomous AI agents in enterprise environments, key implementation challenges, and the new threat landscape.
This episode delves into Anthropic's new framework for implementing "zero trust" principles with AI agents in enterprise contexts. Daniel Whitenack (CEO at PredictionGuard) and Chris Benson (Principal AI & Autonomy Research Engineer) discuss how the rapidly-evolving threat landscape requires moving beyond traditional security approaches. They break down the core security challenges when deploying autonomous agents, outline the threat vectors unique to agentic systems, and walk through Anthropic’s phased mitigation strategy and tiered capability model.
Quote: “[Attackers] have equal access to these agentic coding and development capabilities... you cannot keep up with that level of attack using human only approaches.”
—Daniel Whitenack ([05:58])
Quote: “You give it just what it needs and absolutely no more.”
—Chris Benson on Least Agency ([15:22])
Based on Anthropic’s analysis, several key attack vectors for agentic AI systems are outlined ([15:28] – [28:50]):
Memorable Example: Daniel embedded invisible, adversarial instructions in a PDF resume to fool Claude Code during job candidate evaluation ([17:35]–[17:49]).
Real-world Example: Adversarial prompts that corrupt “context switching” for healthcare agents, leading to unauthorized data disclosure ([27:31]).
Three “Capability Tiers”:
Anthropic suggests a phased approach for organizations:
“Multi-layered… all the traditional zero trust vulnerabilities… still apply as well.”
— Chris Benson ([25:54])
“Rolling agents back, especially if they're in critical functions, can be quite challenging… There's a lot of imagination that has to go into it that maybe wasn't quite as necessary in pre-agentic zero trust models.”
— Chris Benson ([39:31])
“If things are moving so fast... you can't just rely on waking up the CISO in the middle of the night to approve shutting this thing down.”
— Daniel Whitenack ([45:06])
“This is a revolution in cybersecurity... every intelligence agency in the world is learning how to both defend against and exploit these potential vulnerabilities.”
— Chris Benson ([45:06])
Chris and Daniel conclude by highlighting that rapid evolution in agentic AI is creating a fundamental shift in cybersecurity—demands for automation, dynamic risk assessment, and novel containment strategies. Anthropic’s framework offers a starting point, but real-world implementation will require both enterprises and vendors to "level up" — moving from aspiration to operation as agentic systems permeate critical business functions.
Useful links and further reading recommendations are included in the show notes.