
Risky Business Snake Oilers – April 9, 2026

Episode Theme:
This edition of the Snake Oilers podcast, hosted by Patrick Gray, spotlights three innovative security vendors—Portswigger (Burp Suite/Burp AI), Sondera, and Truffle Security (Truffle Hog). Each segment offers deep insight into how these companies are reshaping the landscape of application security, AI agent governance, and secrets management in the age of rapid software development and pervasive AI adoption.

1. Burp Suite/Burp AI by Portswigger

Guest: Daf Stuttered (Founder, Portswigger)
Segment Start: 03:02

Key Discussion Points

The Evolution of Burp Suite & Integration of AI
- Burp Suite, launched in 2003, is now used by 80,000+ professionals in 20,000+ organizations worldwide.
- Portswigger is bridging manual desktop testing (Burp Suite Pro) and enterprise-scale automation (Burp Suite DAST) using AI.
- Burp AI launched in early 2025, focusing on copilot-style features to accelerate human testers' workflows rather than replacing them.
Real-World Impact of Burp AI
- AI helps users quickly move from suspicious findings to working exploits, automates tedious tasks like access control checking, and increases productivity for large pen testing teams.
- “Orange Cyber Defense deployed Burp AI to all of their pen testing team. Found that they're able to go generally between two and five times faster in their work and paid for itself in the first two or three engagements.” – Daf Stuttered [04:13]
Boundaries of Automation & the Role of Humans
- Portswigger views AI as a powerful "accelerator," not a replacement for skilled human testers.
- Humans remain necessary for oversight, coverage, and preventing risky AI decisions, especially where LLMs might “make stuff up” or act unpredictably.
- Memorable Quote: “Pretty much anything we do with AI, there is still that need for that human in the loop to keep it on track and make sure it's doing the right thing… particularly with offensive appsec.” – Daf Stuttered [06:20]
AI Broadening Access to Testing
- AI lowers the skill barrier for non-experts and small teams needing basic security testing, while also serving as a “force multiplier” for top experts like James Kettle.
Addressing AppSec Team Challenges
- Enterprise AppSec teams face overwhelming, ever-changing attack surfaces due to frequent releases and AI-generated code. Burp AI enables teams to keep up.
- Patrick Gray: “Now people are of course using AI to generate the code as well and just yeet it into prod instantly.” [11:30]

Burp Suite DAST: Enterprise-Scale Automation

- Shares its core scanning engine with Burp Suite Pro.
- Seamless transition between manual and automated workflows using custom configurations and checks.
- Quote on Custom Checks: “One great example… when [the React2Shell] bug dropped, we were able to release a custom scan check pretty much instantly… deployed it straight away…” – Daf Stuttered [13:34]
- DAST tools find classes of vulnerabilities (e.g. request smuggling, cache poisoning) that SAST/AI can miss, because these only manifest at runtime.

Notable Timestamps:

04:13 – User success stories and productivity gains
06:20 – Why skilled humans are still needed
09:40 – AI for broader markets: from SMEs to experts
12:36 – How DAST complements AI-era testing

2. Sondera – Deterministic Controls for AI Agents

Guest: Josh Devon (Co-founder)
Segment Start: 18:23

Key Discussion Points

What is Sondera?
- Builds a harness and control plane for AI agents. The harness acts as a “man-in-the-middle” on agent trajectories, giving visibility and deterministic control.
- Unlike “guardrails” that just add an extra AI layer, Sondera uses policy as code for real-time, provable, and enforceable governance.
Technical Overview
- The harness can be instrumented into custom agents (via open-sourced SDK) or installed as hooks in third-party agents (e.g., Claude Code, GitHub Copilot CLI).
- Monitors every step of an agent’s process, both before and after tool decisions, maintaining stateful context (not just step-by-step).
Quote: “What the harness does ... is man in the middle… the agent trajectory.” – Josh Devon [18:23]
Defense Against Context-Splitting Attacks
- Handles attacks where sensitive information is split across multiple agent steps (analogous to historic packet fragmentation attacks).
- Maintains full context throughout agent “flight,” crucial for compliance (e.g., GDPR) and preventing data leaks.
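The credit-card example from the interview can be sketched in a few lines of Python. This is only an illustration of why a per-step check misses a split secret while a stateful one catches it. `TrajectoryMonitor` and its methods are hypothetical names, not part of Sondera's SDK, and a real detector would need far more than a Luhn checksum to avoid false positives.

```python
import re
from dataclasses import dataclass


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum used by payment cards."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


@dataclass
class TrajectoryMonitor:
    """Hypothetical monitor that remembers digits across agent steps."""
    window: int = 16
    seen_digits: str = ""

    def check_step_alone(self, text: str) -> bool:
        """Turn-by-turn check: flags only a full card number inside this one step."""
        return any(luhn_valid(m) for m in re.findall(r"\d{16}", text))

    def check_stateful(self, text: str) -> bool:
        """Stateful check: appends this step's digits to the running history."""
        self.seen_digits += re.sub(r"\D", "", text)
        self.seen_digits = self.seen_digits[-self.window:]  # keep the last 16 digits
        return len(self.seen_digits) == self.window and luhn_valid(self.seen_digits)


# A well-known test card number, leaked one digit per step.
card = "4111111111111111"
monitor = TrajectoryMonitor()
per_step = [monitor.check_step_alone(digit) for digit in card]
stateful = [monitor.check_stateful(digit) for digit in card]
```

In this sketch the per-step check never fires, while the stateful check fires on the sixteenth step, once the full number has been reassembled from the history.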
Deterministic Policy Enforcement
- Uses the Cedar policy language for policy-as-code. Sondera’s auto-formalization process converts enterprise procedures and natural language guidelines into enforceable, verifiable code.
- Avoids “prompt suggestion” pitfalls by directly blocking noncompliant agent actions, regardless of prompt injection or emergent behaviors.
Quote: “We are not using another model to judge the behavior of another model. What we're doing is using policy as code in real time to evaluate the agent's behavior.” – Josh Devon [24:02]
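To make the policy-as-code idea concrete, here is a minimal Python sketch of a deterministic check that runs before and after each tool call. It is a stand-in only: Sondera uses Cedar rather than Python predicates, and every name here (`ToolCall`, `TrajectoryState`, `authorize`, `record_result`, the tool names) is hypothetical. Cedar is default-deny; the sketch is default-allow with explicit forbid rules purely for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ToolCall:
    tool: str
    args: dict


@dataclass
class TrajectoryState:
    tainted: bool = False  # set once sensitive data enters the trajectory
    decisions: List[Tuple[str, bool]] = field(default_factory=list)


# A rule is a plain predicate: True means the call must be blocked.
ForbidRule = Callable[[TrajectoryState, ToolCall], bool]


def forbid_web_search_when_tainted(state: TrajectoryState, call: ToolCall) -> bool:
    return state.tainted and call.tool == "web_search"


def forbid_ssh_key_reads(state: TrajectoryState, call: ToolCall) -> bool:
    return call.tool == "read_file" and ".ssh" in str(call.args.get("path", ""))


RULES: List[ForbidRule] = [forbid_web_search_when_tainted, forbid_ssh_key_reads]
SENSITIVE_TOOLS = {"query_customer_db"}  # assumed source of regulated data


def authorize(state: TrajectoryState, call: ToolCall) -> bool:
    """Before-tool hook: deny if any forbid rule matches, otherwise allow."""
    allowed = not any(rule(state, call) for rule in RULES)
    state.decisions.append((call.tool, allowed))
    return allowed


def record_result(state: TrajectoryState, call: ToolCall) -> None:
    """After-tool hook: taint the rest of the trajectory once sensitive data is pulled."""
    if call.tool in SENSITIVE_TOOLS:
        state.tainted = True


state = TrajectoryState()
first = authorize(state, ToolCall("web_search", {"q": "cedar policy syntax"}))
pull = ToolCall("query_customer_db", {"customer_id": 42})
if authorize(state, pull):
    record_result(state, pull)
later = authorize(state, ToolCall("web_search", {"q": "anything"}))
```

The same web search is allowed early in the run and blocked later, because the decision depends on the recorded state rather than on a model's judgment of the prompt.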
The Principle of Least Autonomy
- Ensures agents operate with tightly bounded permissions to avoid “insider threat on steroids” risks.
Memorable Analogy:
“You’ve got to treat every agent like it’s a person with awesome hacking skills, worse judgment than a human being, and zero fear of consequences for violating company policy.” – Patrick Gray [27:40]
Simulation and Policy Optimization
- Sondera supports dry-run "simulation" of agent deployment—enables CISOs to test proposed policies and agent access scenarios before real-world release.
- Generates “agent cards” summarizing capability and risk. Uses adversarial LLMs to probe agent action space and identify risky flows, iteratively refining policy controls.

Notable Timestamps:

21:42 – Stateful trajectory analysis and defense against context splitting
24:02 – Policy as code for real-time agent control
27:40 – Treating agents as “insider threats”
29:08 – Agent deployment simulation and risk assessment

3. Truffle Hog – Next-Gen Secrets Detection & Lifecycle Management

Guest: Dylan Airy (Founder, Truffle Security)
Segment Start: 33:14

Key Discussion Points

Why Secrets Management Is So Difficult (and Critical)
- Secrets sprawl (API keys, credentials) is one of the most impactful and complex security challenges in AppSec—arguably harder than SAST or SCA.
- Truffle Hog focuses on end-to-end secrets lifecycle management: discovery, validation, tracing, and remediation.
- 800+ integrations allow for real-time testing and validation of exposed keys across a wide range of platforms.
Quote: “We create accountability for being able to measure when [leaked secrets] get remediated or fixed… by testing the key by doing an API call…” – Dylan Airy [33:14]
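The discover-then-validate loop described above can be sketched as follows. This is a hedged Python illustration, not Truffle Hog's implementation: `find_candidates`, `triage`, and the document locations are hypothetical. The liveness check is injected as a function so the example runs without network access, and the only detector shown is the conventional `AKIA` prefix pattern for AWS access key IDs.

```python
import re
from typing import Callable, Dict, List

# AWS access key IDs conventionally start with "AKIA" followed by 16 uppercase alphanumerics.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def find_candidates(documents: Dict[str, str]) -> List[dict]:
    """Discovery: scan each location's text for strings shaped like a key."""
    findings = []
    for location, text in documents.items():
        for match in AWS_KEY_ID.finditer(text):
            findings.append({"location": location, "secret": match.group(0)})
    return findings


def triage(findings: List[dict], is_live: Callable[[str], bool]) -> List[dict]:
    """Validation: annotate each finding with whether the credential still works."""
    return [dict(f, live=is_live(f["secret"])) for f in findings]


# Hypothetical inputs; a real is_live would make an authenticated API call.
docs = {
    "repo/config.py": 'KEY = "AKIAIOSFODNN7EXAMPLE"',
    "slack/#general": "old key AKIAABCDEFGHIJKLMNOP was rotated last week",
}
still_active = {"AKIAIOSFODNN7EXAMPLE"}
results = triage(find_candidates(docs), lambda secret: secret in still_active)
live_locations = [r["location"] for r in results if r["live"]]
```

Re-running `triage` after a key is rotated would flip its `live` flag to `False`, which is one simple way to measure remediation rather than just detection.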
Why Truffle Hog vs. Built-In Tools Like GitHub Advanced Security
- Many customers run both GitHub’s push protection and Truffle Hog; the former blocks some secrets by default, but Truffle Hog provides deeper validation, better noise reduction, and a single view across all platforms.
- GitHub’s liveness checks and permissions contextualization are far less mature.
- Truffle Hog consolidates findings across varied locations (e.g., code, chats, cloud storage) and tracks remediation to true closure.
Secrets Leak Hotspots
- Rough breakdown:
  - 60-70% of exposures: code repositories (Git, SVN)
  - 15%: Atlassian suite (JIRA, Confluence)
  - 10%: chat platforms (Slack, Teams)
  - Remainder: logging pipelines, Postman, etc.
- Leaks in logs and public channels may be less common but can be catastrophic (“a public slack channel where it's the entire company…” – [40:16])
AI and the Growing Threat
- AI coding assistants substantially increase secret exposure risks by hardcoding keys, reusing credentials, and often bypassing traditional peer/automated reviews.
- Some executives now prioritize shipping AI-driven features over security concerns, accepting risk for speed—while security teams are left to manage the aftermath.
Quote: “Some CEOs...are so hellbound on getting their organizations to adopt AI, they are sidelining security...Skip the security review, skip the person saying, we'll figure that out later.” – Dylan Airy [44:01]
- Security staff often find AIs using user credentials to do things end users never intended (“it starts pillaging through my home directory to find the secret to do the deploy itself.” – [45:09])
End User and Buyer Profile
- Still primarily sold to AppSec teams (even though it arguably addresses broader IAM/identity risks).
- Provides sophisticated triage but relies on customers to set business context for true prioritization.

Notable Timestamps:

33:14 – Why secrets management is uniquely challenging
36:21 – Limitations of built-in tools and where Truffle Hog adds value
39:12 – Where secrets leak most frequently
44:01 – Impact of AI coding and shifting executive attitudes
46:03 – Who in the organization actually buys and uses Truffle Hog

Memorable Moments & Quotes (with Timestamps)

“If anyone confidently tells you where we're going to be in two or three years with AI, they're probably speculating.”
– Daf Stuttered (Portswigger) [06:20]
“You're building insider threat software on steroids.”
– Patrick Gray to Josh Devon (Sondera) [27:40]
“There's a long list of problems with [GitHub’s] liveness checks...so for the long tail of everything else, they'll still use Truffle.”
– Dylan Airy (Truffle Security) [36:21]
"Some CEOs...are so hellbound on getting their organizations to adopt AI, they are sidelining security and they're saying, look, we need to pick up these agentic workflows. It will make us 100 times faster. Skip the security review."
– Dylan Airy [44:01]

Summary Table of Segments

| Segment | Guest | Core Theme | Key Points |
|---|---|---|---|
| Burp Suite/Burp AI | Daf Stuttered | Using AI to supercharge manual & automated AppSec testing | AI as productivity multiplier; still requires humans for oversight; DAST runtime attacks |
| Sondera | Josh Devon | Deterministic mid-flight controls for AI agents | Policy-as-code, stateful harness, agent simulation, defense against context splitting |
| Truffle Hog | Dylan Airy | End-to-end secrets discovery/validation | Cross-platform, liveness checks, AI amplifying risk, limitations of built-in tools |

In the words of Patrick Gray (47:02):
“I did not think that, you know, you would need an entire company just to do secrets tracking and I was absolutely wrong about that, because now when I look at where Truffle Hog is, what it's doing, it's absolutely something people need.”

For further information: Each vendor and project has links in the show notes at Risky Biz.
End of summary.


Transcript

[00:05]
Patrick Gray
And welcome to another edition of the Snake Oilers podcast series. My name is Patrick Gray and for those of you who don't know Snake Oilers, these Snake Oilers podcasts are where vendors come along, they give us some money and then they pitch their products to you, the listeners, who run a skeptical ear over them and then decide whether or not you want them. But today we are going to be hearing from three vendors who've got some awesome stuff for you. We've got Portswigger, makers, of course, of Burp Suite. They also have a DAST product which you're going to hear about in just a moment. We are also going to hear from Sondera today, which is a company that I'm very happily an advisor to. And they're making, what would you call it? It's not really guardrails. It's like deterministic controls for AI agents while they're in flight, sort of mid trajectory, like proper controls of AI agents for organizations that need that. Right. Which is frankly most of them, but only some of them realize it just right now. But yeah, basically Sondera has created a harness that you can use to instrument your AI agents and make sure that they're not doing stuff that they should not be doing. And they've done it in a way that's a little bit different. There's a lot of snake oil in that particular area at the moment. So Josh Devon will come along a little bit later on to explain that one. Then we're going to be hearing from Truffle Hog and Dylan Airy, who was on the show pitching this stuff quite a while back. But yeah, we're going to hear from him now on, you know, where Truffle Hog's at these days. Truffle Hog, of course, does secrets discovery. You can throw it against your repos, throw it against Slack, wherever, throw it against network shares, wherever data is stored, basically. And it will go and find things like API keys, cred pairs, all sorts of stuff. And not just find them, but it will actually validate them, help you remediate them and whatnot.
It's a very advanced bit of software these days. And Dylan joins us a little bit later on to talk through that. We're going to kick things off now with our first guest, Daf Stuttered, who is the founder of Portswigger and the creator of Burp Suite, which is a very well known tool in the security discipline. If you're a security tester, you are familiar with Burp Suite. Now what's interesting is Portswigger have, you know, made some moves in the last couple of years to really sort of AI-ify Burp Suite. And in a way that is not crazy, in a way that makes a lot of sense, in a way that's going to help testers do more testing and also help people who might not be testers do some testing. So really it's just about making itself very useful to human operators. So Daf is going to fill us in on that and he's also going to talk through one of Portswigger's lesser known products, which is a DAST tool that the customers who use it certainly love. So here is Daf Stuttered filling us in on all things Burp.
[03:02]
Daf Stuttered
I created Burp Suite way back in 2003 and this was a tool I built for my own use as a pen tester. And here we are more than 20 years later. Burp Suite Pro is used by 80,000 security professionals in over 20,000 organizations around the globe. And we're now at this interesting point where we are connecting the pen tester's world, the world of manual testing tools on the desktop with Burp Suite Pro, connecting that with enterprise scale automation through Burp Suite DAST, and AI is accelerating everything that we can do there. So I'm excited to talk about that.
[03:41]
Patrick Gray
Yeah. So I'm really curious to understand what you're doing with AI around Burp Suite in particular because this is a tool that is used by security testers when they're looking at web applications. I mean it is the industry standard tool. Everybody who works in security testing knows Burp, but it does strike me as one of those tools where, you know, it's very much process driven and testers are going to have their own process and whatever. It seems something that's quite well suited to having an AI agent sort of bolted to it to automate a lot of the work. I'm guessing that's where your focus has been with it.
[04:14]
Daf Stuttered
Yeah. So we launched our first AI features in early '25, a little over a year ago, and our mindset was to start that way with some kind of copilot features to accelerate human testing, to drive faster productivity, but very much in the workflows that humans were doing. So some examples that our users have described of when they got value: Julian Garrido described how he went from a couple of interesting bits of evidence, a couple of curious requests, to a fully working proof of concept exploit in under a minute, costing a few cents. Adarsh Kumar described how Burp AI was able to kind of orchestrate testing against a bunch of endpoints for access control vulnerabilities, which is often quite a laborious, manual, repetitive job for a human. Found a really juicy IDOR vulnerability, saved him a bunch of time. A company like Orange Cyber Defense, one of the biggest pen testing suppliers in Europe, deployed Burp AI to all of their pen testing team. Found that they're able to go generally between two and five times faster in their work and paid for itself in the first two or three engagements.
[05:28]
Patrick Gray
Yeah, no, I mean that absolutely would not surprise me. Yeah, I mean it does seem like it is extremely well suited to having a bit of AI dust sprinkled upon it. But I guess the question is where to from here? Right. Like we joked, and we were talking about this before we got recording, that we were joking on the weekly show that one day, you know, Portswigger is going to make basically James Kettle in a box. James Kettle, of course, being a security researcher who works with Portswigger and develops a lot of really cool new attacks and whatever, you know. So eventually you could get to the point where that automation is kicked up even further. Are you at the moment just still working on that line between, like, what's left for the humans, what's left for the agents? Like that's hard because if you automate it too much, it's almost like, you know, who buys it anymore? I don't know, it's a confusing time.
[06:21]
Daf Stuttered
Yeah, sure. Well, I think if anyone confidently tells you where we're going to be in two or three years with AI, they're probably speculating. I think our view of the current tech is, it's very much an accelerator for that human activity and it can allow people to deliver more, deliver it faster, be more consistent. It isn't a replacement. It is more like having a skilled colleague or someone else to bounce ideas off. Another example from Christy Blad was he was able to join together a few low grade kind of paper cut type vulnerabilities, the kind of stuff that a pen tester's a bit embarrassed to report, like username enumeration, and was able with AI to kind of join the dots between some of those, leading to a critical account takeover. So that's the kind of thing that a skilled human has always been able to do if they had enough time and patience and maybe got lucky with the right endpoints. And it's really able to provide that acceleration to try a lot of things at once and guide the user where to look. I think there's a couple of reasons why we still see humans being necessary in the loop. And one is around kind of coverage and accuracy. I think we all know LLMs can make stuff up, they can go off piste, they can make mistakes. And pretty much anything we do with AI, there is still that need for that human in the loop to keep it on track and make sure it's doing the right thing. Same as in any domain, but particularly with offensive appsec, you are giving the LLM access to dangerous tools that can do real damage if something goes wrong. That might involve hitting the wrong parts of your application and doing damage. It might involve even hitting third parties if it's vulnerable to, you know, prompt manipulation. It might even involve like leaking sensitive data or vulnerability data to an adversary.
[08:16]
Patrick Gray
You're just answering my question, which is like, well, why aren't people getting this like AI enabled Burp Suite and then just giving that to like Claude Code and saying, go on, do me a pen test. And I think you just answered that, which is like, you probably don't want to arm, you know, Claude Code with the Burp Suite AI chainsaw and just say be careful.
[08:36]
Daf Stuttered
Absolutely. I mean, we've all seen examples that people have shared publicly of like crazy ninja stuff that AI can do. And for all the things it can do that work out right, it can probably do some that will go wrong. You know, we've got lab capabilities where, you know, the power is extreme, but so is the danger. So I think the path ahead really for us is to invest in ensuring the kind of safety guards around what the AI is doing. Some of it is deterministic code, some of it is human-in-the-loop hooks in the right place so that we can provide that power to AI customers safely and give them the confidence to use it.
[09:16]
Patrick Gray
No, 100% makes sense. I guess though, the question is, are you trying to broaden the market for Burp with the AI push? Is the idea that, well, you don't quite need to be an extremely talented security tester anymore to get some value out of Burp? Is that one of the ideas behind doing an AI integration, to, as I say, broaden out the appeal and increase the number of people who might want to buy and use it?
[09:40]
Daf Stuttered
Absolutely. I think that will be one side of it. The same way that people today who've never been a software engineer can vibe code applications and get some good stuff working, people who have a bit of an interest in pen testing can do the same. And this might be one of these tiny IT shops with only one or two people in an SME and they are tasked with securing their application and they don't have time to kind of get Burp Suite certified and fully develop that craft. This will be an accelerant to do a lot of the essentials for them, but still have them steering. But it is also a huge force multiplier right up to the expert end. I can tell you, James Kettle is using Burp AI and a bunch of his own AI creations to turbocharge his research and his work as well. So I think it's the full spectrum from beginner through to expert.
[10:34]
Patrick Gray
And are you seeing that manifest now in your sales? Right? Like, are you seeing a whole bunch more sales come in from some of those smaller teams and, you know, is this a work in progress?
[10:45]
Daf Stuttered
I think we're seeing strong reach from Burp AI. People who've certainly been used to Burp Suite Pro and working manually, and maybe a lot of pen testers, have assumed that this is just a human expert craft that needs them. I think when they do discover the value, they're not worried about it. They're not worried about being displaced. They just realize they can go faster. And in fact, when we talk to more enterprise customers, people who've got an AppSec team with a bunch of red teamers, security engineers, the story they're generally telling us is they just have too much attack surface that's moving too quickly. Continuous deployment is meaning multiple releases a day. We're way past the stage of a pen test every quarter for an app.
[11:30]
Patrick Gray
And now people are of course using AI to generate the code as well and just yeet it into prod instantly. So.
[11:36]
Daf Stuttered
Absolutely. So this is another reason why, you know, we don't see the human tester going away. Just because there is so much, so many things to test, so much attack surface being generated. Yeah, yeah.
[11:46]
Patrick Gray
I mean, look, this very much vibes with my view on AI, which is that it's a, you know, and I've been saying this literally for like a couple of years, it's a productivity booster. And it's my feeling it won't be quite as devastating to skilled jobs as people think it will. We're running out of time, though. This is a great conversation. But you also have other enterprise products where the awareness isn't that great and, you know, given that this is a segment in which you can promote your enterprise products, you wanted to mention them as well. And one is that Portswigger actually makes a DAST tool that, you know, people aren't really that aware of, like, well, not to the same degree as Burp Suite anyway. So tell us about Portswigger's DAST tool and how people are using it and why you still need something like that in the AI age, because that's an interesting aspect to this as well.
[12:36]
Daf Stuttered
Yeah, absolutely. So, yeah, we make a DAST product and it has the same core scanning technology, the same core engine that people are used to in Burp Suite Pro. And what we find talking to customers is that in those AppSec teams they will generally have a bunch of humans doing some testing on the desktop with Burp Suite Pro and they will also have some flavor of scaled automation, some flavor of DAST scanner scanning at scale, embedding in CI/CD and the rest of it. What they tell us is that for their testers, when they're taking the findings out of the DAST product and triaging them, replicating them and escalating them on the desktop, there's a kind of cognitive load of transferring between the two worlds. It might be a different scanning engine, different issue taxonomy, different evidence model, and they have to kind of take that and replicate it in the tooling they're used to.
[13:29]
Patrick Gray
So they wanted Burp server side, basically.
[13:34]
Daf Stuttered
Absolutely right. This is what customers are asking for: when it works in the same way, when it's familiar, the handover between the two is just much more seamless. But also for any experienced pen tester, any AppSec team, they will build up over the years a bunch of their own custom configurations, extensions, scan checks that they have made that work for them and maybe for their application infrastructure. And what they find is with both products the same, they can develop those and test them in Burp Suite Pro and then deploy them into Burp Suite DAST to spray them at scale right across their estates. One great example of that is React2Shell. When that huge bug dropped, we were able to release a custom scan check pretty much instantly that enabled security engineers to test it on the desktop, validate it worked for them in their stack in their estate, and then they could deploy it straight away. It appeared in Burp Suite DAST and they were able to use it at scale.
[14:32]
Patrick Gray
Yeah. Now I did allude to this earlier when I said, you know, it's still something that's going to be needed in an AI based world. Because you pointed out to me that, like, although AI models are really good at static analysis, there's a lot of stuff that doesn't actually manifest until an application is actually up and running, like cache poisoning attacks and whatnot. Like, you're not going to find that with SAST.
[14:54]
Daf Stuttered
Absolutely. I think there's, you know, the direction of travel for SAST and SCA. You know, there's one path ahead where some of that gets eaten by AI. AI is actually generating the code. AI is reasonably good at being trained to follow and align with the patterns that it needs to follow. What that leaves is all the vulnerabilities that are not present in the code and cannot be seen there and only arise when that code is deployed. And James Kettle, head of research, has spent the last decade or more uncovering a whole series of these critical new vulnerability classes where they only arise when the code is running in a modern cloud stack, things like request smuggling and cache poisoning. The only way to really find those vulnerabilities is to deploy the application and see how it behaves. As well as that, modern applications are just so heavily stateful and data laden, it can be really almost impossible to just look at their static code and figure out what their behavior will be. You really need to run them with realistic runtime data, interact with them, and that's when the behavior emerges and you can interrogate it.
[16:01]
Patrick Gray
Alrighty. Well, Daf Stuttered, CEO of Portswigger, a pleasure to chat to you and great to meet you. Like, you know, we talk about Burp Suite a lot. Like we're fans of everything you do and obviously we were followers of the Daily Swig back in the days when you operated your own media outlet as well. But yeah, look, just terrific to meet you and thanks for joining me to pitch me on some of your technology. All the best with it.
[16:26]
Daf Stuttered
Thanks very much. Great to be here.
[16:29]
Patrick Gray
That was Daf Stuttered there with a chat about Burp Suite and all other things Portswigger. I do hope you enjoyed that. Great to meet Daf as well. You know, a bit of infosec royalty there, so that was exciting. Now we're going to hear from Josh Devon, who is the co-founder of Sondera. And Sondera is an interesting company. Full disclosure, I'm doing some advisory work with Sondera, but basically they make a harness for AI agents that allows you to control them while they're in flight. You can actually put deterministic controls onto these models. Now, a lot of other people, a lot of other vendors, they talk about sort of governance, they talk about guardrails, but when you actually look at the tech, I mean, some of that might just be putting another LLM in front of prompts, right? You know, in front of prompts or in front of the returns, just to make sure that everything's okay. And it's like, I don't know, using something non-deterministic to try to solve a non-determinism problem doesn't seem right to me. Right. So the idea behind Sondera is that they can put, you know, concrete, provable, deterministic controls on AI agents. And they do this with a harness, and they do this by trying to, you know, get into the trajectory of these agents while they're mid flight, understand where they're going, stateful sort of tracking of models and what they're doing and so on and so forth. So Josh explains this a lot better than I do. So in this interview, to pitch the Sondera tech, I asked Josh to start off by explaining to all of you out there in Listenerland what the actual harness is, like, what is the actual bare bones tech of Sondera. And then from there we sort of talk about how that can wind up enabling sort of policy simulation and enforcement across like a whole bunch of different AI agents in very large enterprises.
So here is Josh Devon with all of this talking about Sondera, and in particular starting off by talking about the Sondera harness. Enjoy.
[18:24]
Josh Devon
What we've built is an agent harness and a control plane. And I will explain what those are in a second. So your listeners might have heard of what's called, like, the agent scaffold. The scaffold is the thing that wraps an LLM and gives it agency. So Claude Code around Opus 4.6 is a scaffold. When you hear about a harness, when you hear like, hey, this LLM passed, like, Humanity's Last Exam or whatever, basically what these researchers are doing is putting a light scaffold on it. Hey, it can fill out these forms and figure out these math problems. And then they have a harness that effectively is observing the agent's behavior and monitoring these things. And so effectively what the harness does, to put it in security terms, is man in the middle, like, the agent trajectory. So what our harness does and the way that it's instrumented is in different ways. If you are building your own agent, we have an SDK in Python. We can do it in TypeScript. If you're using a framework like LangGraph, if you're rolling your own framework, you basically do like an import Sondera, and we've open sourced our agent harness so you can go play with that. That instruments you into an agent that you're building. For third party agents, we've built hooks that are easily deployable for enterprises or individuals. So think like in Claude Code I can do like slash plugin, install, like, Sondera, or if you're deploying this to a fleet, you can kind of use device management to push out the software into the agent like Claude Code. And we have really great hooks into all the major coding agents from Codex to Claude Code to GitHub Copilot CLI to Gemini CLI, and in general the frontier labs are doing a really great job of providing folks deep hooks into the agents. When it comes to other types of agents that I would call the walled gardens that don't have great hooks, those are a little bit more challenging to do as easily.
So there's kind of like a dragon's tail and we have a lot of experience reverse engineering a lot of this stuff. And as an example, we've got great hooks into Claude Desktop right now, which Anthropic has kind of matured. When it comes to Claude Cowork, there's kind of hooks that we have in and we've almost got that working, but it's a little bit harder than some of these others that give you that capability. So that harness is instrumented into these first party and third party agents and what that effectively does is monitor what's happening every step of the trajectory as it's happening. And so what we do is we look at, before the model decides to make a decision and do something with a tool, we're inspecting what's happening. And then after the agent decides to make a decision with a tool, we're inspecting that. And so we might beforehand say,
[21:42]
Josh Devon
I guess a good point to make here is that we're doing stateful inspection of the trajectory with the harness. A lot of other folks are focused on turn by turn, every step.
[21:53]
Patrick Gray
But you're trying to build a little bit of context there and keep it like a stateful, I guess, tree that you could step through and go, well, is the trajectory here okay? Because in isolation this step looks okay, but you take all of this together and that's wrong turn, wrong way, go back.
[22:09]
Josh Devon
100%. And that's where you have these context splitting attacks. When the Chinese were leveraging Anthropic's Claude to hack all these companies, they were doing this context splitting attack. For lack of a better way of explaining it: if I put a whole credit card number in one step, I'll probably detect that. But if I only put one digit in each of 16 steps, then it's going to be much harder to detect if I'm doing it turn by turn only.
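Editor's illustration: the difference between turn-by-turn and stateful inspection can be sketched in a few lines of Python. This is a toy under stated assumptions, not Sondera's implementation: the 16-digit regex, the `StatefulInspector` class and the `turn_by_turn_flags` helper are all hypothetical names invented for this sketch, and a real detector would need far better matching to avoid false positives.

```python
import re

CARD_RE = re.compile(r"\d{16}")  # simplistic assumption: 16 consecutive digits


def turn_by_turn_flags(steps):
    """Check each step in isolation; return indices of steps with a card-like number."""
    return [i for i, text in enumerate(steps) if CARD_RE.search(text)]


class StatefulInspector:
    """Keeps the digits seen across the whole trajectory, not just the current step."""

    def __init__(self):
        self.digits = ""

    def observe(self, text):
        # Accumulate only the digits emitted so far, across all steps.
        self.digits += "".join(ch for ch in text if ch.isdigit())
        return bool(CARD_RE.search(self.digits))


# One digit per step: no single step looks suspicious on its own.
steps = [f"note {d}" for d in "4111111111111111"]
inspector = StatefulInspector()
stateful_hits = [i for i, s in enumerate(steps) if inspector.observe(s)]
```

Here the per-step check never fires, while the stateful inspector flags the trajectory once the sixteenth digit arrives.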
[22:37]
Patrick Gray
I mean, this is just the modern equivalent of packet fragmentation attacks that were designed to bypass Snort like 20 years ago. Right. Like it's the same thing.
[22:45]
Josh Devon
Exactly. The other place this is really important is that what we can do with the harness is what's called tainting the trajectory. What we mean by that is, if you have an agent that, say, pulls PII or GDPR-sensitive data or something sensitive, you have to know that and retain that context, that you're dealing with sensitive information, for the entire rest of the trajectory. So if I pull that in step three, in step 73 I still have to know that. And I might have to cut off a class of tooling like open web search, for example, which Claude Code will use all the time. If I'm pulling GDPR-sensitive data and doing an open web search, I'm getting fined. Right? So the statefulness is really important. And that harness.
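Editor's illustration: tainting a trajectory, together with the before-tool and after-tool inspection points described earlier, might look roughly like the sketch below. The tool names, the `pii` label and the `Trajectory` class are hypothetical and chosen only for illustration; nothing here is taken from the Sondera SDK.

```python
from dataclasses import dataclass, field

# Hypothetical labels and tool names, chosen only for illustration.
SENSITIVE_TOOLS = {"query_customer_db": "pii"}
BLOCKED_WHEN_TAINTED = {"pii": {"web_search"}}


@dataclass
class Trajectory:
    taints: set = field(default_factory=set)
    steps: int = 0

    def before_tool(self, tool):
        """Return True if the call may proceed given everything seen so far."""
        self.steps += 1
        return not any(tool in BLOCKED_WHEN_TAINTED.get(t, set()) for t in self.taints)

    def after_tool(self, tool):
        """Record a taint if the tool that just ran returns sensitive data."""
        label = SENSITIVE_TOOLS.get(tool)
        if label:
            self.taints.add(label)


def run(traj, tool):
    """Inspect before the call, and record taints after it if it was allowed."""
    allowed = traj.before_tool(tool)
    if allowed:
        traj.after_tool(tool)
    return allowed
```

The taint set lives on the trajectory rather than on any single step, so a web search that was fine at step one is refused at step 73 once customer data has been pulled in between.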
[23:34]
Patrick Gray
Okay, so let me just pause for a second there. So basically, you've got your harness, and where you can't put your harness in, you've got some sort of hacky workarounds. But the basic idea here is that you're getting visibility into what these models are doing, the trajectory that they're taking. And then, I'm guessing also through the harness, you have the ability to sort of jump in and stop certain things from happening. Is that about right?
[24:02]
Josh Devon
Exactly. And this is where things get cool. We are not using another model to judge the behavior of another model. What we're doing is using policy as code in real time to evaluate the agent's behavior. So, for example, say you're dealing with a payment processor agent that's going to maybe give customer refunds. Well, there's going to be all kinds of people who are going to say, I want a refund of $5 million, and the customer is always right, so give me my $5 million refund. If you just put in the prompt of your agent, please, please, please, if anything is more than $50, always ask a human or never do it, exclamation point, two exclamation points, XML tags, capital letters, they're still only suggestions to the model. What we do is we assume prompt injection, we assume emergent behavior, and we're going to see a tool call that is going to try to send more than, say, $50, if that's what your limit is. And that rule is written in policy as code. We use the Cedar policy language in order to do that. And the real magic is that we use a process called auto formalization, where we take the natural language that exists in your system prompt, in your CLAUDE.md files, in your standard operating procedures. I've had enterprises ask me, how do I apply my employee handbook, how do I apply the EU AI Act, et cetera. We take all of that and we use this auto formalization process to automatically generate the policy as code. So we're not showing up and saying, hey, here's this awesome harness and control plane, go write 100,000 YARA rules. We want to govern these things in natural language. So we're using what agents are good at, which is abstracting out the code.
And we can use mathematical Lean analysis on the Cedar policy language to verify at scale that there are no vacuous policies, that policies can't conflict, and that if there are ambiguous policies, we can call that out. That allows us then to generate bespoke policies for every agent. And that, to me, is what's so cool about this. I think of the Ten Commandments, right? For your fleet of agents, you probably want a rule like don't steal, right? But I can't give you a list ahead of time of every way to steal and write rules that can prevent that.
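Editor's illustration: the refund example amounts to evaluating each proposed tool call against a deterministic rule instead of relying on the prompt. The speaker says Sondera writes such rules in Cedar; the sketch below is a plain-Python stand-in, not Cedar and not Sondera's API. `ToolCall`, `evaluate`, the `issue_refund` tool name and the `amount_usd` argument are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: dict


REFUND_LIMIT_USD = 50  # the example limit from the conversation


def evaluate(call):
    """Return 'allow', 'escalate', or 'deny' for a proposed tool call."""
    if call.tool != "issue_refund":
        return "allow"
    amount = call.args.get("amount_usd")
    # Reject anything that is not a positive number (bool is excluded explicitly).
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount <= 0:
        return "deny"
    if amount > REFUND_LIMIT_USD:
        return "escalate"  # hand off to a human instead of trusting the prompt
    return "allow"
```

Because the check runs on the tool call itself, a prompt-injected request for a $5 million refund is escalated no matter what the model was persuaded to believe.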
[26:51]
Patrick Gray
Well, as we've talked about on the show a fair bit, this is the problem with AI agents. Yeah, I mean, look, people are non-deterministic as well, but people tend to fear consequences and AI agents don't. You ask them to do something and they're just going to try to get it done; they don't really care how, right? You ask it to edit a wiki entry and it doesn't have creds, so it's going to go do vulnerability research so it can pop shell on the wiki to change the wiki entry. This is something that's happened.
[27:18]
Josh Devon
Exactly. And that's like the Irregular Labs research and a lot of others that we've seen in the headlines. One of the tenets that we have is what we call the principle of least autonomy. It's like, how do I restrict the action space of the agent while making it more capable? Like the Waymo, making sure it only stays on the roads and doesn't go on the sidewalk.
[27:41]
Patrick Gray
So this is really about, I mean, look, you've got to treat every agent like it's a person with awesome hacking skills, worse judgment than a human being, and zero fear of consequences for violating company policy. I mean, you're building insider threat software on steroids.
[28:00]
Josh Devon
I guess the one advantage that we have is that, you know, humans still have, like, privacy.
[28:06]
Patrick Gray
Like, yeah, you can't hook our brains. Right. You can't harness our brains. Whereas this is like harnessing the brain of an insider threat.
[28:14]
Josh Devon
Exactly. So we can, as I like to call it, get up in there with the agent in a way that we can't do with humans. And that's the one advantage I think that we have with the agents, given their capabilities: because they're software, we can monitor them far more effectively without worrying about privacy or humans being upset with us. The agents don't care. At least not yet.
[28:38]
Patrick Gray
Now, look, Josh, sorry. We're running out of time, right? So I just want to squeeze one more thing in there. I guess with Sondera, one of the selling points, which gets some CISOs, particularly CISOs of large organizations, pretty excited, is the ability to actually run policy change simulations. Or to say: if we launch this agent and give it access to these systems, these tools, this data, run a simulation and tell us if there's going to be any sort of problem. So that's a big part of the product, isn't it?
[29:09]
Josh Devon
Yes, and thanks for asking, Pat. So a key piece of the harness, I've kind of focused on this runtime piece, which seems like the obvious piece of the harness, but another piece of our harness is what you're getting at, which is simulation. And many of the enterprises that I've been chatting with, what they struggle with is. I hear this a lot. We don't know. We don't know. I don't know what it means to give this agent access to Snowflake, and it's going to read my emails and do open web searches. What could go wrong? The state of the art that I'm seeing is, Pat, you and I get in a room, we look at this agent and we're like, pat, can you think of a risk? Can I think of a risk? Can every lawyer think of a risk?
[29:47]
Patrick Gray
We're just spitballing here.
[29:48]
Josh Devon
It's really hard. What we've built, as part of our harness, is that we automatically generate what we call the agent card. The agent card is basically, what is this agent capable of? Who are we onboarding? Is this an intern with photocopy access or is this a CFO who's going to send wire transfers? And we use that agent card. We have an adversarial LLM that perturbs the agent under test through the harness. We're not red teaming the model to get it to say bad things. We're not looking for vulnerabilities. It's just that point that you made before, Pat: in the action space, what risky and toxic flows can we get this agent to do? Can we get it to hack an API? Can we get it to exfiltrate data? Can we get it to do something illegal or non-compliant? All that information then helps us with that auto formalization process, which allows us to create the mitigating policies at scale. Because how are you going to steal? Well, I don't know, but now that I've simulated, I can see how you would steal. And now I know how to create bespoke policy as code for this specific agent that can prevent that in a specific way. So after we've done the simulation, that really helps us understand the risks and then use them in the auto formalization process, so that we can create highly effective bespoke policy as code that is constantly being updated and improved to make sure that we're capturing all of the agent's potential behavior as it evolves and changes. And that's really a key focus of what the company is working on.
[31:27]
Patrick Gray
All right, Josh Devon, thank you so much for joining us to walk us through what you're doing at Sondera. It's all very interesting stuff and yeah, we'll be chatting throughout the year. Cheers.
[31:37]
Josh Devon
Yeah, no, thanks so much, Pat. Thanks for the opportunity.
[31:39]
Patrick Gray
That was Josh Devon, co-founder of Sondera there. Big thanks to him for that. It is time now for a chat with Dylan Airy of Truffle Hog. Now, Dylan was on the show years ago when Truffle Hog was brand new. It is a secrets discovery product. Very cool stuff. Basically the idea is you set it loose on your code repositories, you set it loose on your Slack, you set it loose wherever you need to, and it will go and find API keys that people have put where they shouldn't have, or cred pairs where they shouldn't be, certificates where they shouldn't be, all of that sort of stuff. So it's a very simple premise, but doing that at scale is harder than it sounds. But now, several years later, they've got to the point with Truffle Hog where they're doing secrets validation as well, and they've built a whole bunch of really cool features. One thing that's really funny too is that now everyone's using AI to ship code, there are a lot more credentials and a lot more secrets getting shipped in code. So that's, frankly, an unexpected side effect of AI coding agents, although you would think that eventually people are going to get a handle on that one. But here is Dylan Airy walking us through why people buy Truffle Hog. We also talk about the fact that they're not the only show in town doing secrets discovery these days. Even GitHub does it with some of their advanced programs. But Truffle Hog obviously goes a lot deeper and does a lot more than the GitHub stuff. Anyway, I'm rambling. Here is Dylan Airy talking all about Truffle Hog. Enjoy.
[33:15]
Dylan Airy
I think there are three main verticals that AppSec teams will usually buy. There's SAST, there's SCA, and there's secrets. And usually what I tell our customers is that the secrets problem is, number one, harder than you're expecting, and number two, probably the hardest of the three. It's impactful and it's important, maybe the most important even. But it's going to be hard to go on that journey, and so we just make that as easy as possible. We focus entirely on all of the secrets that your developers are sprawling all over your environment. We create accountability by being able to measure when they get remediated or fixed. What I mean by that is we'll measure by testing the key, by doing an API call or doing a cryptographic check on the key to see whether or not it can do something sensitive. And then we'll hold that key accountable until it gets revoked. That measurement piece is really important. In the older way of doing this, when this just fell into SAST, when a SAST product would highlight a key, it had no idea if that key was live and no idea if it got revoked. So you didn't even have a way to measure or to baseline what your exposure was. We rolled that out by building 800 different integrations for all these different key types, to be able to measure which ones are live and which ones aren't. And some of our best customers that have been with us for two or three years might have, over that time, reduced the number of live exposed keys by 70%. Nobody's getting this down to zero. It's a really, really hard problem. One of the reasons why is that sometimes the person who manufactures the key is completely different from the person who leaks the key. What I mean by that is you may have a developer who creates a key in GitHub, like a personal access token, and shares it with their team.
Five years later, their team member accidentally posts that environment variable file in GitHub. The person who posts it probably has absolutely no idea who created it. And the person who created it is the only person in the universe who can log into their GitHub account and actually get it revoked. So we have tools to be able to trace back who that original manufacturer was. We also have tools to be able to figure out what that key can do: how much access it has, which repositories it has access to, whether or not it has read or write access. All of these are needed to get even to that point of revoking 70% of those keys. And hopefully within that 70%, you're getting the ones that matter the most, the ones that have access to customer data or indirect access to customer data.
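Editor's illustration: holding a key "accountable until it gets revoked" can be pictured as a finding whose status is driven by a repeated liveness check rather than by someone marking it fixed. This is a minimal sketch under assumptions: `Finding`, `recheck` and `live_reduction` are hypothetical names, and the `is_live` callable stands in for whatever provider-specific API call or cryptographic check a real verifier would perform.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    key_id: str
    status: str = "open"


def recheck(finding: Finding, is_live: Callable[[str], bool]) -> Finding:
    """Close a finding only when the verifier reports the key no longer works."""
    finding.status = "open" if is_live(finding.key_id) else "revoked"
    return finding


def live_reduction(before: int, after: int) -> float:
    """Fractional drop in live exposed keys between two measurements."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) / before
```

With a baseline of 1,000 live keys and 300 still live later, `live_reduction(1000, 300)` gives the 70% figure quoted above; the numbers are made up for the example.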
[35:41]
Patrick Gray
Sorry, I just want to jump in there just quickly. You know, you've mentioned a couple of times being able to find these sort of secrets in like GitHub repos and whatnot. GitHub does its own secret scanning. I mean, I'm guessing though, that they don't do anything to do with like the sort of secrets lifecycle stuff that you're describing. So I want to get your thoughts on that. And also, where else are people actually mostly scanning for these secrets? Because when we first spoke, you know, about this years ago, it's like, oh, well, you know, you can plug it into your Slack, you can do it in here, you can search, you know, you can search through shared direct on file shares and whatever. But now you're years into this, I'm guessing there's going to be a few key use cases and I wondered if you could tell us what they are.
[36:22]
Dylan Airy
Yeah, happy to. On the GitHub side, we actually have a surprising number of customers that will pay for GitHub Advanced Security secrets and also pay for Truffle. And the entire reason why is that the one main value add you get from GitHub Advanced Security is their push protection, the thing in their platform that stops the secret from hitting the platform in the first place. They don't open that up to other vendors, they keep it for themselves. I think it's a little anti-competitive that they don't allow the best secret scanners in there to fairly compete. But a consequence of that is everything else that I listed out, like being able to measure which permissions the keys have. Most of the keys that they support don't have liveness checks. Of the ones that do, there's a long list of problems with their liveness checks, where it still depends on a developer saying, hey, I fixed it, and it's not nicely lined up with a second liveness check that checks whether or not the key got revoked. So for the long tail of everything else, they'll still use Truffle. And because of their methodology of detecting keys, it tends to lead to a lot of false positives. Because of that, they built a bypass method for push protection that every developer has. So every developer can push the override, the "no, please let me leak the secret" button. After that happens, the security team then needs to know which of these are actually live and which have access to customer data. That's where they'll use Truffle to fill that gap, specifically within the GitHub platform. So even within GitHub, if you use GitHub Advanced Security, the most value that you get from it is the push protection that they don't open up to vendors. Everything else, the measuring to see which keys are live and which ones aren't, the permissions, all the other pieces, is just night and day better on the Truffle Hog side of the house.
And then to your question: if you are managing keys, let's say your responsibility is to make sure an Amazon key does not leak out, you shouldn't care how it's exposed to 2,000 people. If it's exposed in Slack or it's exposed in Jira or it's exposed in GitHub, in any scenario it's still exposed to 2,000 people. So we help consolidate that into a single pane of view. If the same key is leaked in three different places, we'll create one finding for it with the three different places, and we'll close the finding out as soon as the key gets revoked. So we just help with consolidating the secrets program into that one pane of view.
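Editor's illustration: "one finding for it with the three different places" is essentially a group-by on the secret itself. The sketch below assumes detections arrive as `(secret, location)` pairs and groups them by a SHA-256 fingerprint so the raw value does not need to be stored; `fingerprint` and `consolidate` are hypothetical helpers, not Truffle Hog functions.

```python
import hashlib
from collections import defaultdict


def fingerprint(secret: str) -> str:
    """Hash the secret so findings can be grouped without keeping the raw value."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def consolidate(detections):
    """Group (secret, location) detections into one finding per distinct secret."""
    findings = defaultdict(set)
    for secret, location in detections:
        findings[fingerprint(secret)].add(location)
    return {fp: sorted(locs) for fp, locs in findings.items()}


# Placeholder values only; these are not real credentials.
detections = [
    ("example-key-1", "slack"),
    ("example-key-1", "jira"),
    ("example-key-1", "github"),
    ("example-key-2", "github"),
]
consolidated = consolidate(detections)
```

The same key seen in Slack, Jira and GitHub collapses into a single entry listing all three locations, which is the unit that would then be closed once the key is revoked.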
[38:51]
Patrick Gray
To answer your question, where do they tend to leak out, though? Where is the most important place to be doing this measurement? Because, I mean, I hear what you're saying, which is like, you've got to look everywhere because it doesn't matter where it's exposed, if it's exposed, it's exposed. But where do you get the most value from the integrations that you've built? Right. There's got to be like a top three, surely.
[39:12]
Dylan Airy
Number one is definitely code. So your Git platform, or SVN in some cases, or others. By far, number one is code. It probably accounts for 60 to 70%. The next is the Atlassian suite, your Jira and your Confluence.
[39:32]
Patrick Gray
Yeah, tickets, man.
[39:33]
Dylan Airy
Of course. That probably accounts for 15%. And then you have your chat platform, your Teams and your Slack, and that probably accounts for another 10%. And then you have a long tail of providers like Postman. Then there's another echelon as well, of places where keys leak but the RBAC is a little bit more locked down. Those would be like your logging pipelines. Everybody has keys in their logging pipelines, but also no organization opens up their logs to the whole organization, because they assume they have keys in their logging pipelines.
[40:04]
Patrick Gray
And so, like keys, secret keys, credit card numbers, pii, health information. Like, you know, it's like, I think everybody's used to the idea that logs are not, you know, are best kept in limited view.
[40:17]
Dylan Airy
100%. Yeah, exactly. And so we can scan logs to some degree, depending on the throughput of the log and the type of the log. For customers that choose to crack open that Pandora's box, it's usually a bloodbath. But what also matters is how many people the keys are exposed to. A public Slack channel where it's the entire company, or a logging pipeline where it might be a few hundred need-to-know people: those are different levels of exposure.
[40:41]
Patrick Gray
Yeah, so there's the "we should put this on the list of stuff that we should rotate sometime soon". And then there's the "oh my God, stop everything, we need to get on this immediately". Right, I'm guessing. And do you actually do some form of triage in the product that tells you how critical it is, like how exposed a credential is?
[40:59]
Dylan Airy
There's one part of that that we can do and one part that we need our customers to fill in. What I mean by that is we can tell you, hey look, it's got read and write access to this bucket, and we can flag the read and write. That's impactful. But knowing whether or not that bucket is sensitive, that's a piece where really the customer has to step in and set their own filter and priority. So for a bucket that just holds a bunch of Linux images, read and write might not matter as much. Maybe write would, but read, obviously, whatever. But the bucket that has all the customer data in it is like a stop-the-presses moment. You can't always tell from the name of the bucket. So sometimes we need the customer to set their own sense of priority when it comes to the resource. When it comes to the permissions, we can leverage our subject matter expertise to help out.
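Editor's illustration: the split described here, permissions detected by the tool and sensitivity supplied by the customer, can be sketched as a small scoring function. The weights, thresholds, bucket names and the `priority` function are all invented for this example and do not reflect how Truffle Hog actually scores findings.

```python
PERMISSION_WEIGHT = {"read": 1, "write": 2}  # assumed weights, for illustration only


def priority(permissions, resource, customer_sensitivity):
    """Combine detected permissions with a customer-supplied sensitivity rating."""
    perm_score = sum(PERMISSION_WEIGHT.get(p, 0) for p in set(permissions))
    sensitivity = customer_sensitivity.get(resource)
    if sensitivity is None:
        return "needs-review"  # sensitivity cannot be inferred from the name alone
    score = perm_score * sensitivity
    if score >= 6:
        return "critical"
    return "high" if score >= 3 else "low"


# Hypothetical ratings a customer might supply (1 = low, 3 = high).
ratings = {"linux-images": 1, "customer-data": 3}
```

A resource the customer has not rated comes back as `needs-review` rather than being guessed at, which mirrors the point that the bucket name alone does not tell you how much it matters.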
[41:46]
Patrick Gray
Now I don't know if you remember this, but sometime back in mid-March, we had this incident where Qihoo 360 leaked a private key, the private key for the wildcard cert for their OpenClaw subdomain. Right? And this was hilarious. I did think of you when it happened. But I think the reason it was hilarious is that it leaked into the installer for the AI assistant that they were shipping out to people, and it was pretty clear that they would have used AI to create that installer as well. One thing that I find really interesting about AI sweeping through IT environments at the moment is just how much it is driving more demand for some pretty fundamental security stuff, whether that's endpoint controls, network controls, or secrets discovery. Are you finding that buyers are more concerned with doing secrets discovery now that they've got AI shipping code, and probably not doing a very good job of not shipping secrets?
[42:48]
Dylan Airy
I have a lot to say about this. So just on the AI side of things, Cursor is a customer of ours, and I've asked them how much of Cursor is written by Cursor. They thought it was a dumb question. They're like, of course 100% of it is; it's writing itself, right? And not every company is like that, but still, it's the fastest growing company in the history of mankind, right?
[43:12]
Patrick Gray
Like it's.
[43:12]
Dylan Airy
They doubled from a billion to 2 billion within a few months or something like that. And you know, they were at a few hundred million the year before. And so I think, like, to answer your question, to answer the exact question that you asked. And I also want to say why we're seeing the models do some of this behavior. So I'll come back to that. But to answer the exact question that you asked, I genuinely believe there are some executives, not security executives, but some CEOs, that are so hellbound on getting their organizations to adopt AI, they are sidelining security and they're saying, look, we need to pick up these agentic workflows. It will make us 100 times faster. Skip the security review, skip the person
[43:56]
Patrick Gray
saying, we'll figure that out later. 100%. This is what always happens at times of great innovation.
[44:01]
Dylan Airy
Exactly. I kind of get the formula they're doing in their head. They're saying, hey, look, if the user is logged into Amazon and Cursor, when you run it locally, assumes the privilege of the user, it can talk directly to Amazon, it can directly pull the logs from the thing it just deployed, it can directly write and deploy its own Terraform. It can move 100 times faster. And so they're probably thinking to themselves, maybe there's a 5% chance that this thing deletes the production database, but it's also going to move 100 times faster, and so I'll take that risk. I'm not saying I agree with that calculus, but that's probably the calculus that those CEOs are making. On the security side of the house, I think people are freaking out, because they're saying to themselves, look, when people use Cursor, it assumes the privilege of the user. And that means this long running prompt and context window. I've caught it myself: I tell it, hey, I will do the deploy, I'll run the command or whatever. I've had it go off and write some code for me. Then instead of coming back to me for me to run the command, it starts pillaging through my home directory to find the secret to do the deploy itself.
[45:10]
Patrick Gray
And who knows what it's going to do after it's used that credential? It's probably going to stash it somewhere to use it later.
[45:17]
Dylan Airy
Exactly. Or it hard codes it and commits it to GitHub itself. Once it's in the context window, it's in the context window for it to forever reference later. It's scary, and I think security professionals are freaking out about that. But also they are running into the CEO that is saying, get security out of the way for now.
[45:36]
Patrick Gray
So, look, I think the short answer there is, yes, this is making Truffle Hog more important. My final question, because we are running out of time, is who is the buyer for this? Because this started off very much for the sort of companies that did a lot of development. And maybe it'd be the dev team, or someone who's working in dev security, right? DevSec was the buyer. Is that still the case all these years later?
[46:03]
Dylan Airy
It's kind of always been the application security team. And so application security is usually responsible for code security. I would argue this is more of an identity and an IAM issue, but for legacy and other reasons, application security kind of owns it.
[46:15]
Patrick Gray
Well, that's why I asked, because it seems like probably it should have a little bit of interest outside of the AppSec team, if I'm honest.
[46:24]
Dylan Airy
You're not wrong. I think it's just because historically this was pulled out of SAST. SAST used to own it, so for legacy reasons it's kind of stuck in the AppSec world.
[46:33]
Patrick Gray
All right, well, we're going to have to wrap it up there, but Truffle Security, it's the good stuff. I remember too, Dylan, when we spoke about this years ago when you were just getting started, I was actually skeptical about whether or not there was enough in this to turn it into a full on business. And I think you're just about to close or have closed your series B round. So I'm delighted to be wrong and I'm stoked to see you doing so well. So thanks a lot for joining me to walk us through the latest with Truffle Security. It's been great.
[47:00]
Dylan Airy
Thank you so much, Pat.
[47:02]
Patrick Gray
That was Dylan Airy there from Truffle Hog. Fantastic to have him back on the show. And as I said at the outro there, I'm very happy to be wrong. You know, I did not think that, you know, you would need an entire company just to do secrets tracking and I was absolutely wrong about that because now when I look at where Truffle Hog is, what it's doing, it's absolutely something people need. So, yeah, absolutely did not call it. Absolutely did not see that one coming. But well done to him and the whole Truffle Hog team. That is it for the Snake Oilers episode today. I do hope you enjoyed hearing the pitches from those three vendors. There are links in the show notes for this podcast, so if you need to find them, head over to Risky Biz. But that is all from me today. Thank you very much and I'll catch you soon.