Loading summary
A
Foreign. And welcome to this soapbox edition of the Risky Business podcast. My name's Patrick Gray. These soapbox editions of the show are wholly sponsored, and that means everyone you hear in one of them paid to be here. But that's okay because we have excellent taste in sponsors and they generally join us in these soapbox conversations and say very interesting things. So joining us now is Russell Van Tiele, who is the VP of Services from Spectrops. Hello, Russell.
B
Hey, Pat.
A
And James Wilson. Our very own James Wilson is joining us for this one because it's very much in his wheelhouse. James, how's it going?
C
Hey, Pat. Hey, Russell. Great to meet you.
A
So what we're going to talk about today, we're going to be talking about red teaming AI systems, and we're going to be talking about what AI systems are doing to the typical enterprise in terms of, like, driving risk, both in the sense that there are all sorts of risky internal systems springing up that are sprouting new identities and new attack paths and all of that stuff. Because Spectrops, of course, makes Bloodhound, which does the most sophisticated attack path measurement and enumeration. Like, if you want to. If that's something you want to do, you want Spectre. Spectre Ops, Bloodhound. Excuse me. And we'll also be talking about, you know, how everything is moving at machine speed and machine scale these days. And, you know, things are getting a bit crazy. But I thought we'd start off, Russell, by just trying to define what it is that Spectre Ops means when you say you do sort of AI red teaming engagement. I mean, is this. Is this really looking at weaknesses in the models that are being used? Is it looking at. Or is it looking more at sort of the way AI systems that people are using, you know, altogether like looking at that as. As a system.
B
Yeah. If the fact that the term red team already had a checkered history to begin with of, like, what people agree on, what it means, AI red teaming is even worse. You know, when AI first started becoming a thing, everyone say that they're doing AI red teaming. And at the time when that first started, a lot of what the times of them meant is they were like, testing a model for, like, safety, alignment bias, all that kind of stuff to
A
go trying to trick a model into saying something racist basically was very early ideas with AI red teaming.
B
Yeah. And that's definitely needed for it. But then the offensive security space started connecting up with testing AI systems, and I think that definition kind of changed. You mentioned two things. I Definitely see it as both. Even OWASP has their OWASP for machine learning and then their OWASP for LLM applications. I believe that most of the organizations that are going to come to us to do some type of test have like a whole system with AI in it and they want us to test that whole system. You know, web apps, databases, skills, all that kind of stuff. Most companies are not creating models themselves, they are just calling OpenAI or calling Anthropic or calling one of those model providers and there's probably some in between that are actually like creating their own models or maybe doing fine tuning on it. So for me I like to focus on actually testing like the, the system of systems that have a piece of AI in it at some point to kind of separate out those two on those. I looked at some of the adversarial machine learning courses and that's a whole different skill set to try to get into that kind of stuff. A lot of math, a lot of understanding stuff. And look, I still think it's important depending on how deep you want to get through tradecraft, because there are some tax you can go with but yeah, mostly the AI systems around them sticking closer to like the owasp lymph for top 10 for LLM applications.
A
Yeah, now you're talking about AI systems, right. And we're talking about the sort of companies that would go and then get SpectreOps to, to have a look at these sort of systems. So I mean what I imagine it's a pretty broad spectrum of, of AI systems that you're looking at. But I mean, can you give us an idea of what sort of stuff we're talking about here?
B
Yep. Yeah, I think the, probably the most common thing that I expect people will have implemented or use some version of is a chatbot. Like that's really the one of the more primary interfaces people are getting, which is just a web application that takes your input and sends it off to inference to some other provider. Sometimes it has a rag database connected to it, sometimes it's connected to other internal systems, all that kind of stuff. So that's probably the most common thing that we see where AI is, is like inside that's, that's like.
A
I gotta be really honest that I'm somewhat disappointed that it's a chatbot. Right. Like, because I'm thinking you look at the spectacular blow ups we've seen out of AWS from you know, messing up orchestration and stuff and I'm expecting it to be really cool and it's not, it's chatbots. James, you wanted to jump in with a question here.
C
Well, it's all chatbots at the end of the day, but that's not where the question was that I had. I was thinking about, you know, the. The engagements that you've done in the past around red teams. Before, it was AI red teaming. Are you. Are you now. Are you meeting with different teams? Like, is this is the responsibility for AI being built into these systems? Do enterprises even know where to manage that yet? Who's owning this problem?
B
Yeah, I've seen, like a mix of kind of both things, and I have a whole rant, I like to go on about it, but I've definitely seen the bigger organizations create a whole new AI red team because they're trying to, like, have a specific belly button they can poke to, like, do their AI security testing and whatnot. So we've seen organizations stand up. They have their regular red team, and they stand up their own AI red team on top of it. And a lot of times what we're seeing is everyone's still trying to figure it out. You know, the AI system is moving so fast and, like, basically inside their org, you know, they're trying to use AI as fast as they can. And this new AI red team is trying to get connected over to the people that are using it, if they could figure out who's using it, and, like, trying to get all their governance and policies in place too. So I'm still seeing it's developing a lot in organizations right now.
A
Now. I mean, obviously this is a new type of service, right? You've got a long history in pen testing and red teaming. Right. Spectrops is known as one of the best in the business for that sort of work. Right. So when you are now all of a sudden, like, you know, hanging out the shingle saying, we do AI pen tests, like, what is involved in sort of upskilling there, I guess, among all of your testers to sort of. To sort of figure this out. Like, is this a completely different type of engagement or is it just pen testing with a slightly new, you know, type of technology? Like, how. How new is this?
B
I would say a little of both. And that's kind of one of the rants I usually like to get on. I just consider everything an offensive security assessment. And then the client's going to give me some technology stack that I need to go look at. And in my opinion, offensive security practitioners have always done the same. You hop into some customer environment, they got these weird technologies, you've never seen before. You try to figure them out as fast as you can. You try to figure out how to compromise them or steal identity and move and do that kind of stuff. And when we talk about AI system, the AI piece is new. But again all the other systems around it, the identities they're using, the web servers are running on, the databases they connect to, none of those aren't new. Those are the same old things that have been there for forever. In fact, a lot of the attack paths that you do see on AI just compromise identities or use other things that are not AI system for it. But you know, a lot of it's unique. I would say the prompt injection is probably like the, the biggest unique thing to understand. And then also understanding, you know, probabilistic versus deterministic nature of just AI in like you kind of got to wrap your head around that there's more nuance to it than that. But that's probably like the biggest starting point for getting people to understand it. And then I would say generally speaking, most people that do like offensive security tests want to know everything at its deepest level possible, or at least we do. And so it brings like this level of uncertainty when anyone's asked to look at it like, well, I'm not sure I know this enough kind of stuff. We've definitely undertaken like an internal upskilling program to like get people to like get the knowledge they need to kind of understand it.
C
As you start to get into these AI pen tests more and more, I just wonder how often are you going to see truly novel, incredible things that an LLM is doing that's unpredictable versus just almost like a forward slapping discovery of like you've just undone the last five to 10 years of security controls we put in place because you bolted this damn model in right next to where all the credentials are. In the space of one week we had this story of like an incredible nation state grade exploit kit that came out that blew my mind. And then by the end of the week it's like, and Google's just jammed an agent into your browser and it can access all your extensions. It's like what are you seeing? Is it novel? Is it exciting? Or is it just like, oh, I can't believe you've actually done it this way.
B
Yeah, yeah, I think what kind of the point you're getting to is like even like with openclaw and its thing, everyone's so excited about AI, they're moving so fast to do everything and like to get the most value out of AI, they just connect it to literally everything. And then, you know, you just, like you said, you undo all these security principles that we spent years learning and it's like no one cares about them anymore. It's more like I got to keep up with AI, so I'm just going to put all these things together. I'm not going to care about principle least privilege, I'm not going to care about separation, I'm not going to care about testing nothing. Just move as fast as you can. To say that you're gaining some efficiencies out of AI again, a lot of the public reports I see is they're all, most of them are traditional, like web app vulnerabilities. It's some type of idor, some type of injection kind of thing. Like you said, most of the attack primitives are not new on it. The only thing I would argue is new is prompt engineering. And while it is new to me, it's just like social engineering. A human, which is also part of red teaming models, different than a human in some ways. But a lot of ways the attacks are just like, how can I get this model to do what I want that it wasn't really planning on doing? Which is the same as calling someone the phone and trying to get them to give you a password. Like, what things do I need to say to get you to do what I want?
A
Well, it's funny too, because they are non deterministic. You can ask them the same thing twice and get a different result. Right. And I've experienced this trying to get Gemini to do stuff that's kind of outside the boundaries of the model safety or whatever, generating embarrassing pictures of politicians or whatever it is. And sometimes you've got ways of tricking it into doing what you want and it works, and other days it just doesn't feel like it. Right. And that's the. But I mean, people are like that too, right?
B
Mm, yeah, sure, yeah. You can make 10 phone calls and you know, nine of them don't work. The people are just like, nope. They just hang up on you straight away. And you know, maybe that tenth time, finally you convince someone, or maybe you called that person earlier and you get someone else from your team to call them another day or so. It's the way that it works out. One point you mentioned there about the non deterministic part, that's another interesting part when it comes to doing the testing of any of these types of systems is because they're non deterministic, you really have to log your inputs and Outputs and you need to try every prompt injection type attack multiple times. A lot of times, like when a report, when you give a client, you give them the steps to reproduce the vulnerability or reproduce the thing. Like when it comes to prompt injection, you can't just like this is the prompt I send it. You'll also get the same response because you won't.
A
Yeah, yeah, no, that's funny. Now look, staying kind of, I mean this kind of connects to the open claw thing that you just sort of mentioned as well. And it connects pretty nicely to spectrops as well because you know, you make bloodhounded, you're the, you're the attack path experts. You know, it seems like the identities that are being used by agents, there's a lot of identities being used by agents. Every time someone puts in an agent, there's one and sometimes multiple identities used by that agent to perform various tasks. It's like, what if, you know, instead of trying to minimize the number of service accounts in our organization, we just tried to maximize it instead. I mean it feels a little bit like that. You know, the AI age feels like, my God, machine to machine accounts are just exploding. Is that just, you know, is that an accurate perception? Because it seems like that that's the logical place this is going to go, which is just, yeah, like a gajillion machine to machine identities in most organizations eventually.
B
Yeah, definitely seen a lot of explosion. Like you mentioned on that. I think some of the public reports report anywhere from 82 to 96 non human identities, two human identities in an org. I think AI is definitely exacerbating that. But I also think, you know, when SAS started becoming super popular years ago, I think a lot of those non human identities come from just SaaS applications to begin with. But you know, what's an AI chatbot? If it's not connected to anything and you can't, you can't do anything with it at a base level, you're at least going to have the token to talk to the model. So you could probably steal that. But again, if it's not connected to other things, that's. How useful is it going to be to you?
A
Well, this is, this is why some of the, some of the open claw security advice that you see around on social media is so funny is because everybody's like, yeah, I put it in a vm, so it's totally fine. But then they give it its credit card, the credit card number and like all of the cookies out of their web browser and you just think, yeah, that's not, that's not how this works, guys.
B
Yep. Yeah, I've been, I've been ticking around with OpenClaw my myself and it isn't a VM, but I do not give it ability to get to anything right now. It just has my inference key to do that kind of stuff. I'm really scared about that. Like you said, you start giving it your identities and they can start maybe exchanging them for other things or accessing things you didn't know about. Plus, you know, you got the prompt injection part where you know, you let it read your email because you want to just help you read your email and then like someone just sends an email to you with a prompt injection and then off you go. So definitely a problem with any type of AI agent system. You know, consumers want it to be useful for them, so that's why they're connecting everything to it. I scares me. It's not the exact same parallel, but in one way. Way back when you used to be able to compromise like an RDP server or any Windows server and all the credentials that were ever there you could kind of pull out of memory. And this kind of reminds me of that. Like if you compromise like openclaw system, you can get a whole truckload of credentials you could do all kinds of stuff with.
C
And yeah, look, this is one thing that's been on my mind is that I think so much of what we're seeing in AI at the moment, especially that case you just mentioned where it's like the open CL is not useful if you deploy it in the way that everyone's telling you to do it. Right. If you locked it away in a VM with no credentials, that gets real boring real fast. But the moment you give it any credential, I think you've got to make an assumption that that credential will A be leaked and B be used to move laterally in directions you hadn't imagined. Which is where I think things that are already out there in this cybersecurity space become just far more important than they already are. And Pat mentioned there bloodhound and sort of that attack path enumeration or that lateral move enumer. I'm just curious as to how you guys are thinking about this in terms of is it the same product doing the same thing, but it's now just even more important or are there ways in which you need to start thinking about extending that product set to be. I hate to say it, but AI native and almost follow this LLM relentless desire to just get something done regardless of the friction.
B
One of the things that's still the same in offensive security to match your question, what you're asking me is as an offensive security practitioner, you compromise an identity, you see what access it has and you keep doing that credential shuffle over and over again until you get the access that you're trying to go to. Obviously a lot of the products that we have and a lot of pen testers are used to as active director and Entra. Those are probably the primary two that everyone's super familiar with, but definitely attck pass on assessments we've been doing have been crossing multiple technology stacks, just like you're seeing in the public reporting. You go, you get from GitHub to AWS to something else, to something else, to something else. And we've definitely been doing that not necessarily to shield a product, but like the open graph extension that we have allows you to map an identity across any technology stack and you can kind of define it. So I think we're kind of positioned well to kind of handle that as is. So it's still thinking about the same. It's looking at the identity attack pass through all your technology stacks instead of just active director or entrance to find where people are going to move. That just matches the tradecraft we're actually doing on assessments as well.
A
Well, it's funny too what you said there, James, because, you know, AI making all of the existing controls kind of more important. It's true, but it's twofold, right? Because it's not just enterprise use of AI internally that make all of the fundamentals more important, like watching your identities, monitoring identities, looking at attack paths, things like that. But it's also adversaries now using AI to scale out, which just means like, you know, a conversation, I've mentioned it a couple of times, a conversation with Tony De la Fuente from Prowler, which is the, you know, the open source cloud security scanner. Like Claude uses Prowler to scan things for you if you want them to. Right. So now that, that's so easy to do, you need to find that stuff before anybody else does. Like, you just have to be better now. Both because, both on the internal, because of internal reasons and internal use of AI and external attackers, being able to be like, well, just go out there and continuously scan the entire Internet with or this whole chunk of the Internet with Prowler until something pops and then, you know, and then off we go. So, you know, I know a lot of vendors who are having that conversation like you know, some of the vendors like Airlock Digital, who do allow listing of executable code and host hardening and stuff, and knock, knock, who do allow listing of network connections. And the pitch really has become, well, you kind of need to go much more into this default deny stuff because there's so much more happening inside and outside that you just really got to get on top of this. Does that vibe with your understanding, Russell?
B
Yeah, it definitely does. The deny by default kind of policy, it's like you're describing, everything moves so fast, and unless you can keep up with that how fast that is, you're safer by, you know, secure by default mindset instead of permissive by default mindset. There's a big buzzword going around the industry that I try not to use, but it seems pretty fitting. It talks about, you know, things move at machine speed, which is what you see with, like, the adversaries are using, you know, they can just go through a whole exploit chain, you know, an hour or two kind of thing. And obviously that leaves defenders wondering the same thing, like, how can I find all these and block them all, you know, in the same two hours? But it's hard to keep up with because the deployment of people using this stuff and putting it out there is happening so fast again. Again, I'd argue that, like, most organizations at least understand the idea that you should have things go through security review before you deploy it. But I wonder if that's actually happening because of how fast everyone is trying to move in this space.
A
Well, and that's that. That's another, you know, notch in the, in the internal drive to make everything, you know, faster. It's not just machine speed, it's machine scale as well. Right?
B
Yeah.
A
Like, that's the thing that, that boggles my mind about this is it's not just that everything's going faster, it's that everything's going faster and there's heaps more of it. Right? Like, holy moly. Like, you know, we thought we were dealing with a fire hose before, and now it's just, you know, wow.
B
Yeah, I mean, it's exhausting to just keep up with the news as it is on, like, what's you're telling me, pal? I bet it's so hard to keep up with. I think that's where a lot of executives are finding themselves too. They're like, how do I even keep up with this AI stuff going on in my environment as well now, just
A
to bring a few threads together here, right? So one of the things that you Noted that you wanted to talk about were a few of these publicly discussed AI adjacent breaches and how they played out and how they interacted with attack paths and things like that. So one of them was the Salesloft Drift breach.
B
Yep.
A
I'd love to get your view on that because like, at the time it wasn't like super clear exactly how that happened and exactly what everything meant. You know, I think it, you know, it. I think by the time it was clear, it had sort of dropped off the news agenda a little bit. But walk us through your view of that whole thing. Start to start to end.
B
Yeah. So kind of one of the main reasons I brought up that point is like most of the attack path was like, again, using tried and true tactics that we're going through. Like, it didn't actually start with like a AI system, if I remember correctly. I believe it started with someone compromising GitHub and adding a user account to a thing. And then somehow they got some credentials out from aws and then once they got in their AWS environment, they got the OAuth tokens to. To talk to everybody's Salesforce instance and then they just started pulling out more
A
and more sales loss. Salesloft Drift made like an AI chatbot for Salesforce. Right. So you could do like certain customer service tasks. You could just get the AI chatbot to do that. But a great example of where. Well, this chatbot, it obviously needed access and it did it through OAuth. And if those tokens went walkies, well, you know, it's a bad time.
B
Yep, yep. That was the thing they stole. Sorry, I forgot to mention that, like when they got into the AWS environment, they stole the OAuth token for the chatbot that people used to talk, talk to Salesforce. And then from there they just pulled out exponentially more sensitive data out of people's environments. Again, one of the reasons I brought it up is like it's all traditional tradecraft on stealing identity. What does it have access to pivot to the next thing until you get to your objective, whatever it is you're to accomplish on that one. Another interesting attack is the. The client injection intact. They also kind of. That one actually did start from an indirect prompt injection. It came from like a GitHub issue and they had an anthropic worker, I believe it was in their GitHub account that just read the issue title, I think it was. And they use that to inject some actions into post install that the attack got a little bit more technical hang
A
on, hang on, I'm going to ask you to just roll back there a second because I don't know what client actually is. Right. What is client?
B
Sorry, Klein is one of those IDEs that people use to do like AI coding, type stuff and they push out a package. It kind of looks a little bit like VS code and so a lot of people are using it as what I would get out for that. I tried it out myself for a little bit as well. Nifty and nice to use one of the cool features that it has like a distinct plan mode button and action mode button. So you can kind of separate it out. But the point for the clientjection is software that a lot of developers are using and someone compromised their repository and was able to push out malicious versions of that client to people.
A
And what did those versions do?
B
I think at the beginning they had just published some like post install scripts for it and I know later on in Attack path they published it so that way it just installed openclaw on the end instance. And I think the public reporting of like why somebody would do that wasn't really clear. I think the theory was openclaw is known to have a lot of vulnerabilities in it. So that would just create an attacker ability to just have C2 through Openclaw basically.
C
I love that as like, it's almost like, you know that meme of like step one, build an exploit, step two, question mark, step three, profit. It's like they just skipped straight to what are we going to do with this? I don't know. Should we just dump openclaw in there? Yes, do that. Then we'll work it all out later. It's just wild that there's like an agent that can be the end of the chain of this when you're not quite sure what you're going to but you'd love something that's going to be like super obedient sitting there on the end user's machine that you can absolutely own at any point in time. Crazy.
B
Yeah, I know. It was definitely an interesting one. I was curious, it's like asking myself was like, was this like maybe a white hat person that just saw an opportunity to like test out their skills or was an actual like real adversary trying to accomplish something and that's just where they went? Maybe they had more goals or something after? I'm not sure. It was interesting though.
A
Yeah. Now one other thing that we want to talk about today is AI in the browser, which I'm going to be honest, I don't use an AI browser. And I think using an AI browser seems like a pretty bad idea.
B
Yes.
A
And I know that this is going to age me out right. Like real quick because this is what everybody is doing. But when you've got non deterministic models and you know, you've even got OpenAI saying prompt injection is never really going to be a solved problem. And you know, these things inherently mix code and data. Like it is never going to be a solved problem. It just, you know, it just doesn't feel, it does not spark joy the idea of using an AI browser. But what are the enterprise security implications and what are the implications for you as someone who runs runs, you know, a red teaming practice.
B
Yeah.
A
People using AI browsers.
B
Well, even if it's not AI browser, the browser is one of our favorite things to attack to begin with because what does the browser have in it? All your MFA authentication credentials. So very common technique that we'll use is either dump your cookies from your browser or stand up maybe Chrome on like a dev port and pull your cookies over to our session so we can start using them. So the AI part definitely adds something unique to it. But browsers are already a gold mine to begin with. Which is why I also don't use use an AI browser for that same reason. All the identities, all, everything past MFA is in there for you to just grab, scoop up and use. If we were starting to test actual people that were using AI browsers, we also have now a natural language way to try to accomplish some of our tasks of compromise or if we want to get the user to do other other kind of stuff. So that makes it real interesting and the target. I think AI browsers are popular because what do you most people know how to use very easily? It's the browser. It's like if I was selling the stuff, it feels like that's a good way to get it in front of consumers. You know, everything is you can do everything through the browser. So maybe that's the appeal of it, but the security risk to me is huge.
A
Yeah. Yeah, me too. So. And I'm glad. Thank you for validating my thinking there and for being another rapidly aging person.
B
That's me who's.
A
Yeah, basically. Grandpa, why do you still use. Yeah, why do you still use a browser that you have to type in? I think is going to be the, the situation sooner than we would like to. Sooner than we'd like to think. So look, we've talked about how we've got an explosion identities. We've Got agents crawling around often without appropriate controls on them. We've talked about how things are now moving at machine speed and machine scale. You know, we touched briefly on how like least privileged access is going to be important. And I do think it's really funny that like the oldest school control is like allow listing of some kind. It is the most low tech thing. And it seems like people are taking a real fresh look at that approach because of AI, which is crazy. But what are some of the other ideas that you would like to put out there in front of the typical enterprise in terms of how we deal with both the internal, the internal risk from people attacking the AI systems that we use internally and also just the general elevation of risk from attackers using AI systems to orchestrate attacks. What's the advice from Spectrops on what companies should be doing to sort of deal with 2026 with all this?
B
Yeah, there's a lot there to get through. I mean fundamentally still the same like identity attack path management is, is still the thing I'm going to stand on for no matter what system or technology you're using. But again, obviously AI kind of explodes that.
A
Well, I'm guessing too that that makes for some interesting interpretations of like Bloodhound scans. Right? Like Bloodhound activity when you're like, well this chatbot, you can actually create an attack path from its credential to full domain Compromise or to GitHub or whatever. Like that's going to be the sort of new contemporary like alarming finding, right?
B
Yeah, I was trying to see if I wanted to challenge like alarming or not. We use those attack paths pretty frequently on like regular like red team assessments. I would say it's a newer capability in Bloodhound to be able to visualize those cross technology stacks or hybrid. You know, we first had Active Directory, then we added Entra and then we called the ability to pivot between the two hybrid attack paths. But we've already been executing those kind of attack paths across technology stacks from you know, AD to Entra to AWS to GitHub, like across them all kind of thing.
A
I think that hdmore has done some really cool work in that space as well. Like just taking Open Graph and just running with it. Right. It is extremely cool. But I'm guessing, I mean, I guess my question was really like, you know, now you've got Open Graph and you're doing red teams involving AI systems. I'm guessing there has been a few moments where you're like that credential that, that AI chatbot is using should not be able to have this path, right?
B
Yeah, definitely. A more interesting thing about trying to do that is you see these technology stacks. You might not know all your edges because your traditional active directory interest stuff is more like, I know how this all works. And so a lot of you get this weird technology stack you've not seen before and you have to kind of enumerate like, well, what can I do with this kind of thing? And that's when you ultimately end up running across like, oh, I didn't know it could do that. But that's really cool. It can because it's very useful to me. So I'm going to start going down that path.
A
So I mean, I, I sort of derailed you there with just drilling down into attack path management. But like, what's some other sort of more high level generic advice that you would give the average CISO on, like what they really need to be doing to deal with again, all this?
B
Yeah, it's. I mean it's really just understand the identities and what they have access to, which I know is not like super high tech or anything. And that's a lot to ask for to like really just like figure that out because again, everything's moving really fast. But a lot of the principles, again, are still the same. You know, the principle least privilege, all that kind of stuff. Even the client attack, like one of the things that they did was they gave the AI the ability to execute arbitrary code in the environment. Like, again, that's probably not something that you'd want to be doing in your systems as you implement them and roll them out.
A
Yeah, I think my favorite AI security incident so far was the mid level guy at the Ministry of State Security in China who was using ChatGPT to summarize, like classified reports, which was. Yeah, that was, you know, 10 out of 10. Awesome. It's a funny old world. Russell Fantaile, thank you so much for joining us to have this conversation. All about AI and red teaming and what's changing and I guess really what's not changing, you know, it's all the same stuff but more and faster. So, yeah, cheers. Great to chat to you.
B
Same. Thanks to both.
C
Yeah, thanks Pat. Thanks Russell. That was a great chat.
Episode Date: March 27, 2026
Guests: Patrick Gray (Host), Russell Van Tiele (VP Services, SpecterOps), James Wilson (Risky Business Media)
Main Theme: Exploring the realities of red teaming modern AI systems, the evolving enterprise risk landscape, and the challenges/opportunities AI brings to attack path management.
This sponsored “Soap Box” edition delves into how SpecterOps approaches red teaming AI-powered environments. The group discusses the difference between traditional and AI red teaming, the explosion of machine identities, and how rapid adoption of AI in enterprises is reshaping security challenges. Through practical anecdotes and a healthy skepticism of hype, the conversation covers real-world breaches, changing attacker tactics, and high-level strategic advice for organizations in 2026.
“For me I like to focus on actually testing like the system of systems that have a piece of AI in it at some point… testing the, the system as a whole.”
“I gotta be really honest that I'm somewhat disappointed that it's a chatbot… I’m expecting it to be really cool and it’s not, it’s chatbots.”
"...Everyone's still trying to figure it out. The AI system is moving so fast... this new AI red team is trying to get connected over to the people that are using it, if they could figure out who's using it..."
"A lot of the attack paths… just compromise identities or use other things that are not AI system for it. But… prompt injection is probably the biggest unique thing to understand."
"You undo all these security principles that we spent years learning and it's like no one cares about them anymore... The only thing I would argue is new is prompt engineering."
"...when it comes to prompt injection, you can't just like this is the prompt I send it. You'll also get the same response because you won't."
"I think some of the public reports report anywhere from 82 to 96 non human identities, two human identities in an org. I think AI is definitely exacerbating that."
"Some of the open claw security advice… is so funny… they give it its credit card, the credit card number and like all of the cookies..."
"...you compromise an identity, you see what access it has and you keep doing that credential shuffle over and over again until you get the access that you're trying to go to."
"The deny by default kind of policy… everything moves so fast, and unless you can keep up with that... you're safer by, you know, secure by default mindset instead of permissive by default mindset."
"…most of the attack path was… tried and true tactics… It didn’t actually start with an AI system… someone compromising GitHub and adding a user account…"
"It's almost like... step one, build an exploit, step two, question mark, step three, profit. It's like they just skipped straight to what are we going to do with this? I don't know. Should we just dump openclaw in there?"
"Browsers are already a gold mine to begin with... All the identities, all, everything past MFA is in there for you to just grab, scoop up and use..."
"...understand the identities and what they have access to, which... I know is not like super high tech or anything. And that's a lot to ask... A lot of the principles, again, are still the same. You know, the principle least privilege..."
"Really what's not changing, you know, it's all the same stuff but more and faster."
On prompt injection vs. social engineering:
Russell (08:23):
"...it's just like social engineering a human... The attacks are just like, how can I get this model to do what I want that it wasn't really planning on doing?"
On the scale of the challenge:
Patrick (18:25):
"We thought we were dealing with a fire hose before, and now it's just, you know, wow."
On organizational response:
Russell (04:57):
"...still seeing it’s developing a lot in organizations right now."
On tools keeping up:
Russell (14:40):
"...open graph extension... allows you to map an identity across any technology stack... positioned well to handle that as is."
The episode underscores that the AI era doesn’t bring a revolution in attacker tactics—it massively expands the scale and speed at which classic mistakes become dangerous. Enterprises and security teams need to double down on foundational hygiene, adapt tools like Bloodhound for hybrid identity mapping, and be wary of AI hype driving insecure integration.
Final quote (Patrick, 29:25):
"...it's all the same stuff but more and faster."