Transcript
A (0:02)
You're listening to the Cyberwire Network, powered by N2K.
B (0:12)
Ever wished you could rebuild your network from scratch to make it more secure, scalable and simple? Meet Meter, the company reimagining enterprise networking from the ground up. Meter builds full stack zero trust networks, including hardware, firmware and software, all designed to work seamlessly together. The result? Fast, reliable and secure connectivity without the constant patching, vendor juggling or hidden costs. From wired and wireless to routing, switching, firewalls, DNS security and vpn, every layer is integrated and continuously protected in one unified platform. And since it's delivered as one predictable monthly service, you skip the heavy capital costs and endless upgrade cycles. Meter even buys back your old infrastructure to make switching effort, transform complexity into simplicity and give your team time to focus on what really matters, helping your business and customers thrive. Learn more and book your demo@meter.com cyberwire that's M E T E R.com cyberwire. Hello everyone and welcome to the Cyberwires Research Saturday. I'm Dave Bittner and this is our weekly conversation with researchers and analysts tracking down the threats and vulnerabilities, solving some of the hard problems, and protecting ourselves in our rapidly evolving cyberspace. Thanks for joining us.
C (2:00)
Yeah, so it's something, it's an area of interest for our research team to kind of look at how AI is changing the business of software. And so we grabbed Claude Code as one example. We like them a lot, they're nice and easy to work with and started kind of poking at what it can do, where the limitations of its protections are as just trying to wrap our heads around what can this thing do? What can this thing do that's risky? How does it act to protect us? Where are there gaps that could be accessible to our product or that maybe we could advise the community on?
B (2:33)
That's Darren Meyer, security research advocate at Checkmarks. The research we're discussing today is titled Bypassing AI Agent Defenses with Lies in the Loop.
C (2:50)
And we kind of discovered that there's some issues where they draw lines in places that people might not expect. And we wanted to make sure that people understood the risks that they're taking on when they adopt these things.
B (3:02)
Well, let's dig into some of the mechanics here. Together we talk about lies in the loop, but also humans in the loop. Can you contrast those things and why they matter?
C (3:14)
Yeah, absolutely. So one of the risks that you have with turning an AI agent loose in your environment, whether that's on your desktop or in your production environment, is that AIs are imperfect. They make Mistakes, they hallucinate things. They might do something dangerous. So the community as a whole and the industry as a whole has kind of responded to this by saying, hey, in many cases, what we're going to do is in the loop of we assess data, we decide what's supposed to happen, then we execute on that decision. We should put a human in that loop as a defense. So we'll propose a course of action. Maybe it's, hey, we want to run this database query. Hey, we want to run this local command on the machine where the agent runs. Hey, we want to access this service with these credentials. We're going to ask a human for permission so that the human has an opportunity to review what the AI agent is about to do and say, hey, that's not okay with me, or hey, that seems totally safe, and make that decision and move forward. AIs are still bad at that. So we let the human do that. So that's kind of the defense. And it transfers the risk to the operator of the agent. Right? The agent says, this isn't my responsibility. I'm asking you for permission. The lies in the loop kind of exploits that a little bit and says like, okay, that's great. If the AI agent is giving you accurate information about what it's about to do, it turns out it's pretty easy to lie to these agents in a way that gets them to lie to the user for us as an attacker. So that you think you're saying yes to something safe, when in fact you're saying yes to something malicious or dangerous.
![The lies that let AI run amok. [Research Saturday] - CyberWire Daily cover](/_next/image?url=https%3A%2F%2Fmegaphone.imgix.net%2Fpodcasts%2F191db83e-dd0d-11f0-8122-ef489b5cb50b%2Fimage%2F95b72a93c2ffaf8ff900d662a9bd3735.png%3Fixlib%3Drails-4.3.1%26max-w%3D3000%26max-h%3D3000%26fit%3Dcrop%26auto%3Dformat%2Ccompress&w=1920&q=75)