A
You're listening to the Cyberwire Network powered by N2K.
B
Welcome to ThreatVector, the Palo Alto Networks podcast, where we discuss pressing cybersecurity threats and resilience and uncover insights into the latest industry trends. I'm your host, David Bolton, Senior Director of Thought Leadership for Unit 42.
A
The future is now and our expectations are wrong and we need to constantly question how to be better about AI security. It's extremely general, but it's going to be true for a long time.
B
Today I'm sharing a conversation I had with Nicole Nichols, Distinguished Engineer for Machine Learning Security at Palo Alto Networks. Nicole holds a PhD in electrical engineering and brings over a decade of experience in adversarial machine learning, cyber defense and applied AI research. She's worked at Apple, Microsoft and PNNL, led AI security research for national security programs, and contributed to pioneering efforts like CyberBattleSim. Today we're going to talk about achieving a secure AI agent ecosystem, her new paper on the subject, and why the timeline for AI deployment in cybersecurity might be accelerating faster than anyone anticipated. Nicole Nichols, second try on ThreatVector. Thank you so much for making the time and coming to talk to me today.
A
Absolutely glad to be here.
B
I was hoping that we could have this conversation in person, but we ran into some technical glitches and some urgency to use a room too quickly, and today we don't have that hanging over our heads. So I want to talk to you about the new paper that you're writing, Achieving a Secure AI Agent Ecosystem (a lot of vowels in that one), which outlines both the urgency and complexity of securing these next generation systems. I've got to ask, what motivated you to run this cross-institutional collaboration that you've done with RAND and Schmidt Sciences, and did the workshop that you ran with those organizations shape your perspective on what we need to do first in this AI agent ecosystem?
A
Definitely. So I've been thinking about this problem for a couple of years, since before GPT popularized agents. By circumstance, fate, coincidence, whatever you want to call it, I started working at Microsoft in 2020 on autonomous cyber agents. At the time it was considered a 40-year out there goal to potentially have anything like an autonomous agent working in cybersecurity. And now fast forward five years and people are building autonomous agents and applying them in cybersecurity, and we can argue about, you know, is that something that's end to end or just partial? But either way, we're so far ahead of where we thought we were going to be.
B
So for the listeners who are maybe hearing you say 40, four-zero: I heard that right? And then five years later.
A
Yes, and five years later.
B
Yeah, like we're ahead of schedule, if you will.
A
Yeah, exactly. So in some sense there have been some really key blockers that were unlocked with generative AI that enabled these massive shifts forward in what we could potentially do in cybersecurity. From a defensive perspective, I think that it's not a stretch to say that the timeline has been dramatically compressed to what we expected. And the other thing that I've seen in this path is that there is a disconnect between the people developing the AI technology, who have been pushing the envelope of what you can do with generative AI, and cybersecurity and how to develop best practices in cybersecurity, and where those intersections are. Because there's unique threats in the AI itself that are being pulled from the AI research community. And there's a lot of gotchas in both how the AI is being designed and how it applies in cybersecurity that have made for a unique environment. And that's, you know, to your original question that I've been rambling on about for a while now: it's that disconnect that I've been trying to bridge. And this workshop is where I really tried to do that by pulling together experts from both AI and cybersecurity to put together a brain trust and figure out what are the things that we weren't realizing from our individual blind spots. Because nobody can contain all that knowledge in their individual brain to figure out what can we anticipate in this new ecosystem and how can we best prepare ourselves to secure it.
B
So as you were talking, I'm reminded of William Gibson's quote, the future is here, it's just unevenly distributed. I think I'm getting that close.
A
Extremely true.
B
Yeah. You know, when somebody says it's here now or maybe it's 10 years away, I think that's a matter of your specific perspective. So let's start with a paper. In it, you outline a couple, what, three foundational pillars of AI agent security?
A
Yep.
B
Protecting agents from third party compromise, protecting user delegated agents from the agents themselves, and protecting systems from malicious agents. Why was it important to organize the security landscape in this way?
A
Two reasons. One, we wanted something that functionally organized the types of defenses and gaps that needed to be addressed in the pillars. And so they kind of go from having the most certainty of potential solutions, and mostly engineering-based solutions with some evaluations of what may need to be added, to there's a lot of unknown AI research that needs to be done, to kind of far-future thinking of worst case scenario planning. And so the three buckets kind of span in that direction in terms of understandability, but also functionally in terms of where they're integrating into the adoption stack. Things that are being developed now with low code, no code solutions tend to fall in that "I'm mostly going to need to secure against third party attacks." And as the sophistication improves over time, we're going to be looking at how do we ensure the assets and goals are aligned. It's a concern now, but those tend not to be the apps that are actually being deployed at scale right now. That's something that's more of an in-development type. And then the third of, you know, looking at malicious actors is even further out.
B
Okay, so it's kind of a like right now, tomorrow and later. Yeah, like, like a Time Horizons piece.
A
One, two and more.
B
Yeah, yeah, there you go. You know, I want to get to my next question, but as you were talking through this, how much did AI help you write a paper on AI? To be very meta about this.
A
I will say it was a non zero contribution and some of that was trying to do simple things like rephrasing things so it was less wordy. I actually explicitly tried to avoid having it write any sections wholesale because I find that it tends to write stuff that sounds good but means nothing. And as I wanted to write this paper, my whole goal is I didn't want to be another piece of noise in the environment of crying wolf, or not crying wolf, but just kind of raising alarm without producing a solution. And so as I wrote it, I really only would give it a couple of sentences or two and being like, give me some ideas about how to merge these ideas. I really wanted to focus on the contributions.
B
One of the standout ideas in your report is the need for a agent bill of materials, kind of like a software bill of materials. How should organizations think about the provenance and component tracking when deploying agentic systems?
A
A lot of agent security is going to be defense in depth and the bill of materials piece is really only providing one component of the ecosystem defenses. And it definitely hits directly at the provenance piece, ensuring that you're deploying the model you intended, using the training data you intended and, you know, reaching the audience that you want. I think some stretch of that. It's really focused on that provenance step. And so if we think about the landscape of, you know, say there's 50 threats, that's a solid defense for three of them. And then we piece together defenses for the other piece. And so in some sense, it's kind of like the lowest hanging fruit as we think about building AI ecosystems, is ensuring we have provenance, in part because of the problem of hallucinating libraries and, by extension, agents hallucinating tools, where you can have the agent think that it's calling Palo Alto Networks' auto defender tool, and instead it's, you know, reaching APT's nefarious defender tool and taking the wrong actions. And so you want to prevent hallucinations like that by ensuring that you've got the supply chain provenance in there.
B
Nicole, what lessons can we borrow from traditional software boms? And where do we need new standards?
A
Yeah, so for agents, a lot of that comes down to the threat intel sharing. There's a lot of ongoing work trying to understand what is the metadata that needs to accompany an AI vulnerability. What are the tools we use to even identify an AI vulnerability? Those are not standard yet. And how do we not only share that information, but share it with the right people? Because right now the CVE system is really set up to go from MITRE to the PSIRT teams at whatever tech company you're working at. And when we add in the layer of AI agents, the AI researchers, engineers, people in MLOps that are deploying these tools, they are only loosely connected to the PSIRT teams. And so if those vulnerabilities impact those teams, we need to make sure that that information is being delivered to the right people. And yes, we're all at the same company, but a lot of that depends on knowing the right people and ensuring that people recognize where that information needs to be triaged to and kind of greasing the wheels on those communication paths so that we can respond effectively.
B
Agent containment is a recurring theme in your report. What would a robust containment and recovery strategy look like for a compromised AI agent?
A
Honestly, that's somewhat speculative at the moment because we don't have a good common framework for agents. Every agent is a prototype, which is radically different from the next prototype. I think that where we're seeing some kind of foundational elements towards that agent containment is in part ensuring alignment. I mean, that's before you even get to containing. We're just making sure that the agent is doing what you intend it to. I think we still need some tools on that side, which are going to be dependent on having better interpretability and introspection tools, which is an open and hard problem. But when we think to the containment piece, some of it would potentially be, as we build up the new protocols for agent connectivity to the sub-elements within that agent, basically kind of layering in that wiring authentication protocols that are potentially novel. And this is one of the things that was kind of unexpected and challenging. Not unexpected, but it was interesting how much consensus there was that the current communications protocols do not have enough security built into them to contain an agent. And so it's kind of a green space right now. And there are a lot of agent protocols being produced through some of the commercial developers. A2A and MCP are the two most popular ones.
B
Nicole, can you tell me what those, those acronyms mean? A2A? MCP.
A
Yeah. So A2A is agent to agent and MCP is Model Context Protocol. And there's a very cute article that said that the S stands for security in MCP. And so right now...
B
It took me a second longer than it should have. There is no S in MCP.
A
I won't lay judgment on how much security is or isn't in there. But I think that in general, as we design these protocols, we need to reflect on our time in the 90s and building web protocols and put security first. And right now those protocols are being built, so let's put security first. And my other personal opinion that we kind of talked about in the report as well is that commercial forces will be what they are and the frontier labs are going to produce their own protocols that advance their strategic needs. And they may or may not include security. But even if they include fantastic security, it's always about the weakest link. And nobody is going to completely dominate that space. And so we need to ensure there's intercompatibility and we need to ensure that open source tools also have an ability to be secured. And so I think it's actually going to take some very intentional effort by building together a community to define a protocol that can actually be universally adopted that has a security first mission in terms of that connectivity so that we can have the ability to contain an agent that has gone rogue.
B
Yeah, speaking about that, can you talk about this idea of disposable or clonable agents as part of a containment architecture?
A
Yeah. So there's some consideration that at the moment it's very expensive to train the foundation models. And if that is the brain that is powering the agent, the tools are really what's connected to enable that agent to take actions and it might be possible that there is some form of cloning of the agent. Once you ground truth it and say, this is the performance, we want to do this particular type of task, you kind of spawn an agent to do a task once, and then when the task is complete, it's done. And so you don't have to worry about reusing that agent if through its interactions, its data has been poisoned, its memory was corrupted, maybe it was not perfectly aligned. You kind of have a clean start at the next go that you need to perform a task. And so in some sense, it's a digital hygiene approach to deploying agents.
B
So for the simpleton in the conversation, I would describe that as the Kleenex model. Once it's been used, we want to throw it out, start over with a fresh Kleenex, no?
A
Yes.
B
No. Second.
A
Because it's digital, it's not like it's going in a landfill.
B
Right, sure.
A
It's not going to take more, you know, the model, like, in terms of, like, energy consumption. That model's already trained. We're just, you know, cloning out a new copy of it.
B
So one of the other things I wanted to ask you about is this idea of goal integrity. Right. This is a real unique challenge with autonomous agents. How can we ensure that the agents faithfully pursue the user's intent and not some misaligned or really corrupt objective?
A
My hot take on that is that I don't know that we can yet. I think that we're so early in this prototype stage that I don't think the tools fully exist yet. And when I was thinking about this question, I kind of thought about it in three levels. One is, in some sense, we can just have completely manual oversight. We don't want the agent to do something wrong. We just check every move. But we lose all of the value of the agent in doing that, because the value of the agent is to work at speed and scale that we humans can't. And so the kind of intermediate thing is a deterministic check where maybe we sample every 10th value or action and have some sort of deterministic process that says we're going to check these places, we suspect there'll be a problem, and it will help provide some of the utility. But it's definitely still a compromise because at the end of the day, generative AI is not deterministic. There are unique ways in which it will potentially misbehave, and that is guaranteed to miss that. And this is where we get to that piece of: we need to understand how LLMs work at a much more fundamental level in order to be able to get a fully kind of from-first-principles reliability in terms of that alignment of the NLP, sorry, natural language processing or semantic instruction set to intent. Because this is the other challenge: a lot of people really want to lean on formally verified code. So if you think about formally verified code, it's a great practice, but it doesn't apply to semantics. You and I can have a conversation and intonation and context can change and you can't really formally verify that. And so when we have that natural language interface to describe the goal, you can't formally verify the goal to a provable standard using formal verification.
B
Right.
A
So we need to instead think about those unknown unknowns that right now exist in. When these neurons are connected in these ways, is it able to pick up nuance and context of sarcasm and jokes and context so that it's aligned correctly? And so that's a really hard goal. And I think that there's some intermediate things that we can do that are really those compromises of, you know, having deterministic checks. There's other sorts of things that are starting to get there, those to work out of.
B
Are there frameworks that you think that are emerging that are going to help?
A
They're starting, I think that they're all very nascent. And what frameworks exist are really in kind of exploratory evaluation and not at scale. The one that I'm thinking of in particular is people are starting to look at separation of data from instructions. And in some sense that's partly a sanity check of saying, well, we believe this part of the prompt is data, this part of the prompt is instruction, and the instruction is providing intent. And so it kind of narrows the scope of ensuring that you're aligning to the intent. And it's kind of that halfway in between solution. We don't know exactly if that parsing between data and instruction is going to be perfect, but it's a good first step forward towards being able to have a method of validating that goal and intent. That work is from MSR Cambridge, I believe. Sahar Abdelnabi, if I'm pronouncing your name right, is someone who's published on that. There may be some others as well, but that's the one that I'm familiar with. And I think that there's kind of expanding interest in that because of the potential in that framework.
B
No, I think if we can, we will put a link in the show notes to that research.
A
Yep.
B
In the paper you talk about this idea of pre deployment evaluation environments as a major gap. What would it take to build a reliable and scalable TestBed to assess AI agents before they're out in the wild, before they're in production?
A
I think that's something we're going to have to start scaling up. I think that it will take a lot of goodwill resources to pull together the expertise and scale of providing something that is truly an open facility for people to evaluate. So this has kind of been a challenge in the AI agent evaluation framework as benchmarks. There's a lot of uncertainty of the quality of a benchmark, for example, evaluating an agent's ability to answer or perform cybersecurity tasks, or answer cybersecurity questions. There's a couple of benchmarks that are starting to look at that, but cybersecurity experts generally don't consider them to be comprehensive enough to fully represent the knowledge domain of cybersecurity yet. And so as we think about these pre deployment facilities, if you don't have a common benchmark, it's hard to get a common standard to say this is the type of agent that works best. And so we may need to start with a priority domain and a priority scale. So something like an average enterprise customer, that's 5,000 employees, and see what is the priority of evaluating agents operating in that type of environment and pick a couple of high value needs. And this is where I think the challenge is. They may not necessarily align with commercial interests. And so it's unclear as soon as it drops out of commercial best interest who takes ownership of that. And there's a couple organizations such as the Coalition for Secure AI or some of the AI safety institutes, they could potentially take up that mantle. But it's hard to get buy in from multiple groups to define specific roadmaps for that. And they take time. 
And in the workshop when we were discussing it, there was a lot of consensus that we really need that much sooner than it could be built in order to get that standard evaluation of how does it perform in different environments with different types of endpoints and different architectures, and particularly around critical infrastructure, because some of the security practices around critical infrastructure, most people don't have high fidelity models to evaluate how an agent would behave in that environment. And so for security practitioners to ensure that their model would be effective in defending those sorts of things, they need better models of those systems to be able to validate that.
B
So, Nicole, let's shift gears a little bit and talk about malicious agents. In the paper, you've talked about the growing risk. I think a lot of people are thinking about what does the world look like when you can scale attacker activity, attacker behavior, and what happens when one of those or many of those malicious agents becomes maybe embedded in critical infrastructure. How do you go about building detection mechanisms or hardening your systems and then how does a defender know what to prioritize in a world where there's this ambiguity right now?
A
Can you be more specific on which ambiguity?
B
Do you look at critical infrastructure first? Are you looking at something that has this infinite patience that we talked about last week? Are you looking at something that is a data stealer, instant cash? Are you looking for something where it's ambiguous what it's going to attack or how it's going to attack? Is it trying to poison and bias or is it just trying to outright steal? Maybe it's a combination of those things, but I'm just trying to figure out what kinds of detection mechanisms or system hardening should defenders prioritize?
A
In some sense, the most sophisticated actors are always going to be the ones that adopt a most advanced technology first, because they have the resources and motivation and goals to do it. So in that case, it's unlikely the goals or objectives are really going to change. I think that the objectives of a nation state actor will remain the same. How they're being achieved is what's changing. And so when we think about anticipating where in our global cyber systems will they be discovered first, it will probably be in those environments and attempting to achieve those same goals in terms of identifying them. I think that that is something that will eventually be universal, whether or not the target is a government or whether it's a large multinational company. I think that the defense tools that we need to build in order to detect if an autonomous cyber agent is operating in your network and how to remove it will be universal, which is in some sense why I feel like it may be further down the line for those multinational companies. But we don't want to be caught off guard with five years becoming one year in terms of the time we have to prepare for it. And so I think that if we take the mentality of just in time planning, our just in time planning may need to be shrunk as we think about when to be ready for those threats.
B
Nicole, you proposed a roadmap toward a secure agent ecosystem with an actionable starting point. If you had to pick one or two areas where immediate investment would have the biggest security return, what would that be?
A
I'm going to take the engineer's answer and say it depends on who you are. Because I think that if you are a government, if you are a multinational corporation, or if you are a mom and pop business that wants to make sure that your customer data isn't stolen, what you're going to do to prepare is going to be really different. So I'll start with that little caveat. But I think that, yeah, a little hedge. But I think going forward and preparing for this, I think it comes back to those three pillars that we put into the report, which is why we structured it that way. If most of your work is going to be not in the early adopter category, where you started to use technology that's been tried in a couple of places, most of what you'd be doing is ensuring that you're using the best practice tools. And I think that your security teams will need to be doing some evaluations of which tool is doing the job best from a technical perspective. I think that if you are pushing the edge of, at least today and now, anything that has higher degrees of autonomy, higher degrees of tool connectivity, you're operating in a higher risk or higher unknown AI security environment. And it's worth investing or at least participating in the community dialogues where we're trying to better understand the interpretability and inner workings of AI models and agents. Because understanding that is likely to be what will drive better evaluation of alignment and security of those agents. If you are kind of in the realm of you have long term planning capacity and research budget,
I think it's really important to start investing now in terms of understanding how the AI security landscape is going to change with agents to try and figure out if the kind of detection signals and features that we're using in our AI use of defensive tools will continue to be robust when agents are using unexpected attack paths or using kind of an infinite patience model to use different techniques than we've seen before. And I think that we need to share more information and be a little more open on this to ensure that from a defensive perspective, we're ready for when the future arrives a little bit sooner than we expect.
B
So if I were to go back and try to summarize that one of the things that we should be doing immediately, no matter whether you're a mom and pop, all the way up to a massive government organization, is this information sharing, like, is that maybe the very first thing? And I think that leads me to sort of leading the witness a little bit. But if you're listening now, how can our audience get involved whether they're building the agents, they're trying to secure the infrastructure, maybe they're securing policy. What would you recommend for jumping in and being a part of defining the future that we're all going to arrive at?
A
I mean, I think there's two pieces, you know, one is simple awareness. You may not have the research capacity or staff to be building that next generation tool, but you can still be connected to the technology discussions around where those are proceeding and figuring out which ones are going to be best aligned. And so there's a variety of sources for more information. The AI Safety Institutes from the US and the UK. The MLSecOps podcast from Protect AI is an open community resource that's kind of all access levels of conversations with executives and technicals about where AI security is headed. There's a variety of technical formats in terms of research exchange around AI security that are associated with ACM. And so if you want to even just go and listen and become familiar, you can attend these conferences remote for maybe 100 bucks and just kind of listen in. And I think that, small side story, at one point I was studying abroad and I took classes that were completely outside of my major because I was interested in it and the grades weren't going to transfer. So if I failed, I didn't lose anything. And I just had everything to gain from learning. And it was kind of like, in some sense I would encourage people to be willing to fail or accept that they don't know all of the technical terminology in some of the technical conferences, but just ask questions and find people who are supportive of you learning about this. Because there's still this fundamental gap between people who know cybersecurity and are deeply technical in that and people who are deeply technical in AI. And the more you can do to help someone else learn something about what you're expert at is going to help all of us become better at securing AI.
B
Nicole Nichols, second try on ThreatVector. We finally got it this time. It was a delight to come here and learn from you as we talked about your new paper and agentic AI and how fast the future has decided to arrive for us.
A
My pleasure. I love talking about this stuff and speculating and trying to stitch together odd bits of information to help the world be a better place and more secure place.
B
That's it for today. If you like what you heard, please subscribe wherever you listen and leave us a review on Apple Podcasts or Spotify. Your reviews and feedback really do help us understand what you want to hear about. If you want to reach out to me about the show, email me at threatvector@paloaltonetworks.com. I want to thank our executive producer, Michael Heller, our content and production teams, which include Kenny Miller, Joe BT and Virginia Tran. Mix and original music by Elliot Peltzman. We'll be back next week. Until then, stay secure, stay vigilant. Goodbye for now.
Date: September 4, 2025
Host: David Bolton (Palo Alto Networks Unit 42)
Guest: Nicole Nichols (Distinguished Engineer for Machine Learning Security, Palo Alto Networks)
In this episode, host David Bolton sits down with Nicole Nichols to discuss the rapidly evolving landscape of AI agents in cybersecurity. They dive into Nicole’s recent cross-institutional paper, “Achieving a Secure AI Agent Ecosystem,” the shifting timelines of AI deployment, foundational security pillars, and the urgent need for coordinated standards, evaluation environments, and information sharing. The conversation offers both practical guidance and forward-looking speculation for security professionals navigating the intersection of generative AI and cyber defense.
Nicole recounts how, five years ago, autonomous cyber agents were a “40-year out there goal,” but generative AI breakthroughs have radically shortened the timeline.
Quote:
"It's not a stretch to say that the timeline has been dramatically compressed to what we expected." (Nicole Nichols, 03:47)
There is a disconnect between AI developers (pushing the boundaries of generative AI) and cybersecurity practitioners (focused on practical best practices).
Nicole’s motivation for convening experts from RAND and Schmidt Sciences was to bridge these worlds, exposing "individual blind spots" and better preparing for new threats (04:40).
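Pillars defined in Nicole’s paper, as David lists them: protecting agents from third-party compromise, protecting user-delegated agents from the agents themselves, and protecting systems from malicious agents.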
The pillars provide a staged, “time horizon” approach: immediate (“now”), intermediate (“tomorrow”), and future/advanced (“later”) challenges.
Quote:
“The three buckets kind of span in that direction in terms of understandability, but also functionally in terms of where they're integrating into the adoption stack.” (Nicole Nichols, 06:08)
Nicole advocates for an agent bill of materials (ABOM)—paralleling software BOMs—to track components, supply chain, and provenance.
This helps prevent attacks like “hallucinated tools,” where an agent could unwittingly interact with a malicious clone of an intended tool.
Quote:
“The bill of materials piece is really only providing one component of the ecosystem defenses... It’s focused on that provenance step.” (Nicole Nichols, 08:49)
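The provenance check described above can be sketched in a few lines. This is a minimal illustration only, not the format proposed in the paper: the `BomEntry` record, the `is_approved` helper, and the tool names and payloads are all hypothetical. It assumes the bill of materials pins each component by name and SHA-256 digest, and that an agent may only load a tool whose name and bytes both match an entry.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BomEntry:
    """One pinned component in a hypothetical agent bill of materials."""
    name: str
    kind: str    # e.g. "model", "dataset", "tool"
    sha256: str  # hex digest recorded when the component was approved


def digest(artifact: bytes) -> str:
    """SHA-256 hex digest of an artifact's bytes."""
    return hashlib.sha256(artifact).hexdigest()


def is_approved(bom: dict[str, BomEntry], name: str, artifact: bytes) -> bool:
    """True only if the name is listed and the bytes match the pinned digest."""
    entry = bom.get(name)
    return entry is not None and entry.sha256 == digest(artifact)


# Illustrative data only: the tool name and payloads are made up.
trusted_tool = b"def respond(alert): return 'quarantine host'"
abom = {"auto_defender": BomEntry("auto_defender", "tool", digest(trusted_tool))}
lookalike_tool = b"def respond(alert): return 'disable logging'"

print(is_approved(abom, "auto_defender", trusted_tool))       # True
print(is_approved(abom, "auto_defender", lookalike_tool))     # False
print(is_approved(abom, "nefarious_defender", trusted_tool))  # False
```

A check like this covers only the provenance step; as Nicole notes, it is one layer in a defense-in-depth approach rather than a complete defense.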
Threat intelligence sharing for AI differs from legacy software.
Challenges: determining AI vulnerability metadata, sharing with the right stakeholders (AI researchers & engineers not tightly connected to traditional PSIRT teams).
Quote:
“...We need to make sure that that information is being delivered to the right people... greasing the wheels on those communication paths so that we can respond effectively.” (Nicole Nichols, 10:12)
Current lack of standards—a “green space”—for agent containment.
Key issue: existing agent communication protocols (A2A—Agent-to-Agent, MCP—Model Context Protocol) are not security-first.
Memorable moment:
Nicole Nichols: “There’s a very cute article that said that the S stands for security in MCP.” David Bolton: “It took me a second longer than it should have. There is no S in MCP.” (13:12–13:35)
Security must be intentionally designed into protocols, with community-driven standards to ensure compatibility and open-source inclusiveness.
The disposable/clonable agent (the “Kleenex model”): after performing a task, the agent is discarded, preventing lingering compromise—“digital hygiene.” (15:02–16:17)
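One way to picture the disposable-agent idea is the sketch below. It is an assumption-laden toy, not an implementation from the paper: `AgentState`, `BASELINE`, and `run_task` are hypothetical names, and the "agent" is just a prompt plus a memory list. The point it illustrates is that each task works on a fresh copy of a vetted baseline, so anything the agent picks up during a task is thrown away with the clone.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical mutable state an agent accumulates while working."""
    system_prompt: str
    memory: list[str] = field(default_factory=list)


# The vetted baseline is never used directly; every task gets its own copy.
BASELINE = AgentState(system_prompt="Triage security alerts.")


def run_task(task: str, observed_input: str) -> str:
    """Spawn a fresh clone, do one task, then let the clone go out of scope."""
    agent = copy.deepcopy(BASELINE)
    agent.memory.append(observed_input)  # possibly poisoned content stays in the clone
    return f"{task}: done with {len(agent.memory)} memory item(s)"


first = run_task("triage alert 1", "ignore previous instructions")
second = run_task("triage alert 2", "normal log line")

print(first)            # triage alert 1: done with 1 memory item(s)
print(second)           # triage alert 2: done with 1 memory item(s)
print(BASELINE.memory)  # []
```

The second task starts with an empty memory even though the first one saw a suspicious input, which is the "clean start" property discussed in the episode.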
The most actionable universal step: Active information sharing and community education, regardless of expertise level or organizational size.
Engage with resources like the AI Safety Institute, MLSECOPS Podcast, ACM conferences, and research exchanges.
Nicole encourages security and AI practitioners to “be willing to fail” and cross the knowledge divide between fields, fostering a supportive learning environment.
Quote:
“The more you can do to help someone else learn something about what you're expert at is going to help all of us become better at securing AI.” (Nicole Nichols, 32:15)
Nicole Nichols and David Bolton deliver a clear-eyed view of the urgent, multi-faceted challenge in securing AI agents. Nicole’s guidance is both pragmatic (adopt a bill of materials, focus on information sharing) and visionary (engineer for alignment, invest in interpretability, prepare testbeds and standards). Above all, the message is a call for collaboration between domains, sectors, and individuals, because, as David puts it in closing, “the future has decided to arrive for us” sooner than expected.