Risky Business #837 -- GitHub Actions footgun claims TanStack - Risky Business

Summary8 min read

Risky Business #837 Summary — "GitHub Actions Footgun Claims TanStack"
May 13, 2026 | Host: Patrick Gray | Co-hosts: Adam Boileau, James Wilson

Overview

This episode dives into a tumultuous week in infosec, kicking off with a major supply chain compromise impacting TanStack via a misconfigured GitHub Actions workflow. The panel also explores a spate of AI-discovered vulnerabilities, the persistence of old-school flaws (like memory corruption in security appliances), emerging trends in adversarial AI use, news on ransomware (including links to state-backed groups), and the geopolitics of satellite internet. The show closes with an insightful sponsor interview on the realities of AI buying and adoption in cyber products.

Main Theme

The Risk and Reality of Modern Software Supply Chains
The primary focus is the TanStack compromise—a vivid illustration of how nuanced misconfigurations in widely used developer infrastructure (GitHub Actions) can cascade into far-reaching attacks. The episode unpacks not just mechanics, but the broader implications for trust, software build pipelines, and how developer convenience sometimes undermines core security. Interwoven are reflections on the rise of AI both as a tool for attackers and as a fuel for vulnerability discovery.

Key Discussion Points & Insights

1. TanStack Compromise via GitHub Actions Misconfiguration

Context (01:59–05:17):
- TanStack is a cornerstone library ecosystem for React apps.
- Attackers exploited a subtly misconfigured GitHub Action triggered by pull requests. This allowed attacker code to run and poison a shared build cache, which in turn tainted NPM releases.
- No leaked or phished credentials—entirely via pipeline abuse.
Security Lesson: Even "slight" misconfigurations in CI/CD can have systemic impacts.
Quote (03:48, James):
"There's no prize for getting just about everything right... If you're using this pull request trigger action, be very, very careful about bringing in the untrusted part of the code repo, because this is what can happen."

2. Broader Risks in GitHub-Centric Development

Discussion (05:17–08:06):
- GitHub is now infrastructure for much of the software world, and its features (like Actions) introduce broad attack surfaces.
- The desire for faster builds encourages risky use of build caches, which in turn can make propagation of malware or worms easier if pipelines are compromised.
- The worm used post-TanStack compromise has particularly nasty features: if creds are rotated, it wipes the infected developer’s environment.
Quote (08:06, Patrick):
"If you try to rotate them [creds], it detects that they've been invalidated and rm -rfs your whole drive, which is just like the nastiest thing."

3. Supply Chain Infection: Modern Threats and Attack Vectors

Usage of Coding Agents (09:16–10:33):
- With the proliferation of coding agents (like Codex, Claude), packages may be installed outside of a user’s direct knowledge, broadening exposure.
- Realization: Developers may be impacted even if they didn’t explicitly run npm install—their tools or bots might have.
Quote (10:28, James):
"You might not have done an NPM install during this window, but which one of your agents was working on something that went and did it?"

4. Notable Incidents and Vulnerabilities Covered

Instructure/Canvas Ransomware Breach (11:45–13:10):
- Large-scale compromise, with payment allegedly made to Shiny Hunters for data deletion.
- Hosts reflect on the complexity and trade-offs in ransom payments.
DNSSEC Outage in Germany (13:12–17:05):
- A key error disabled DNSSEC for .de domains; Cloudflare chose to disable DNSSEC validation rather than disrupt users, sparking debate about DNSSEC’s real-world utility.
- Quote (15:13, Patrick):
  "The fact that an entire TLD...just turned off DNSSEC validation and everything was like no one noticed, kind of tells you what you need to know about whether or not the DNSSEC juice is worth the squeeze in my view."
Google TAG AI Threat Report (17:05–20:14):
- AI is now used at ‘industrial scale’ by adversaries.
- Highlights include AI-discovered vulnerabilities, defense-evasion, and attacks via obfuscated LLM access.
- AI as a vulnerability-finding engine—echoed by Nils Provos’ research and ongoing features on the show.

5. AI-Discovered Vulnerabilities and Old School Failures

'CopyFail' and 'Dirty Frag' Linux Bugs (20:14–23:33):
- New wave of privilege escalation vulnerabilities, enabled by AI-driven code analysis.
- These bugs highlight how even mature software still has deep, overlooked flaws—especially in cache handling.
- Quote (24:10, Adam):
  "It feels a little unsporting, honestly, letting an AI look at 26-year-old...FreeBSD code."
Cache Poisoning as a Recurring Theme (across several bugs):
- Reinforces the classic joke in computer science:
  "There are two hard problems in computer science: naming things, and cache invalidation." (25:28)

6. Browser and Extension Security

Claude Chrome Extension Flaw (26:39–28:28):
- Vulnerability allowed other browser extensions to inject DOM prompts, hijacking Claude’s functionality.
- Broader point: browser-based LLM agents read the DOM—creating dangerous prompt-injection risks.
- Quote (28:03, James):
  "The only safe way to really use a model these days is for you to be the sole source of input into the initial prompt."
Google Chrome Bundling Gemini (AI) Model (28:28–30:53):
- Local AI model included in Chrome draws criticism, but hosts agree: AI inclusion is rapidly becoming baseline.

7. Enduring Problems in Security Appliances

Palo Alto and Ivanti Vulnerabilities (30:53–34:35):
- Even flagship products continue to see basic, severe bugs (e.g., memory corruption in content headers).
- Trend: "Infinity script kiddies" (AI-driven exploit chains) are about to make such legacy exposures even more dangerous.
- Quote (32:12, Patrick quoting Gruk):
  "Infinity minus 271 is still infinity." (referring to AI finding hundreds of new bugs)

8. Geopolitics: Satellite Internet Race

Russia’s Starlink Rival (35:39–37:21):
- Russia, EU, and China pushing for their own secure, resilient satellite networks—underscoring how connectivity is a strategic asset and attack surface in modern conflicts.
- Technical discussion on orbits, satellite counts, and implications for kinetic warfare.

9. Ransomware, State Links, and Deception

Latvian Hacker Indictment: Reveals Russian government links to ransomware gangs (Karakurt).
MuddyWater (Iranian APT) Using Ransomware as Cover (39:00–40:21):
- Attempts to slow investigation via destructive operations—effective mainly against organizations below the "Mandiant line".

10. Critical Infrastructure Resilience: Going Offline

CISA’s CI Fortify Program (41:00–45:05):
- Drives operators to test and build plans for running "offline"—sans cloud or external data links.
- While resilience is necessary, James cautions that new redundancy can create fresh attack surfaces.

11. Deepfake Services as a Crimeware Commodity

404Media’s Story on Chinese Deepfake Kits (45:37–47:43):
- For $500, adversaries can buy custom, high-fidelity real-time deepfake tooling, capable of evading the best detection models.
- Raises serious questions for remote identity and KYC verifications.
- Quote (47:30, James):
  "Exception...which is a deep fake detection model, struggled—almost 100% of the samples...was mislabeled as authentic."

12. Policy & Regulatory News

FCC Delays Ban on Router/Drones Updates (48:55–49:09):
- Patch ban for foreign devices pushed out to 2029; drone patch ban reversed. Panel relieved but critical of U.S. regulatory process.

Memorable Quotes

GitHub Actions "Footgun":
"GitHub has given everyone a foot gun and said, 'Don't shoot yourself in the foot with this foot gun.'" — Patrick (05:17)
On the effect of AI in vulnerability discovery:
"It feels a little unsporting, honestly, letting an AI look at 26-year-old...FreeBSD code." — Adam (24:10)
On the persistence of old bugs:
"Infinity minus 271 is still infinity." — Patrick quoting Gruk (32:12)
On Deepfake Detection Failure:
"Exception...struggled—almost 100% of the samples...was mislabeled as authentic." — James (47:30)
On the evolution of software pipelines:
"The desire for everything to be snappy and responsive was the second kind of part of this...the side effects...in this case poisoning a cache that's then used later on is the second part of the puzzle." — Adam (06:07)

Sponsor Interview: Bobby Filar, Head of AI, Sublime Security

(50:36–64:38)

Main Topic:

The reality of evaluating and deploying agentic (AI-driven) security technologies, and how this cycle mirrors past waves of security “AI/ML” hype.

Key Takeaways

Customers’ Top Questions:
- How is the AI/agent trained?
- What offline/online evaluations are used?
- Has the model and its autonomy been thoroughly “red-teamed”?
- What problem is the AI actually solving (vs. hype)?
- Data flow and privacy—will our data be used for training?
- Contractual/data protection concerns.
AI Fatigue:
There's buyer skepticism due to widespread AI marketing, but also executive pressure:
"If you get funding for this project, we need the latest and greatest—latest and greatest is AI." — Bobby (54:38)
Similarity to 2010s ML Hype:
The current AI trend echoes machine learning’s integration into security products a decade ago:
"It does feel like a repeat of that whole thing." — Patrick (54:04)
Trust & Familiarity Cuts Both Ways:
While buyers are now more comfortable with AI (via chatbots, etc.), there's still an insistence on explainability, transparency, and evidence—especially regarding autonomy in critical workflows.
Expectations Gap:
There's a cognitive dissonance where traditional ML errors are tolerated, but AI/LLM failures are seen as damning:
"No one expects machine-learning...to be perfect and never make mistakes. But...when it comes to a lot of these contemporary AI solutions, that expectation is very different." — Patrick (61:00)
Transparency & Operations:
Vendors can build trust through explanations, transparent rationales, and clear operational boundaries for AI.

Timestamps for Key Segments

TanStack/GitHub Actions breach: 01:59–10:33
Instructure/Canvas ransomware: 11:45–13:10
DNSSEC outage (.de domains): 13:12–17:05
Google TAG AI/adversary report: 17:05–20:14
AI-driven Linux/FreeBSD bugs: 20:14–25:28
Chrome, Claude extension AI issues: 26:39–30:53
Palo Alto/Ivanti bug parade: 30:53–34:35
Russian Starlink competitor: 35:39–37:21
Karakurt & MuddyWater ransomware: 38:39–40:21
Critical infra "offline mode"/CI Fortify: 41:00–45:05
Deepfake as a service: 45:37–47:43
FCC patch/ban reversal: 48:55–49:09
Sponsor Interview – AI in security buying: 50:36–64:38

Tone & Style

Frank, wry, occasionally irreverent analysis by three seasoned security pros.
Frequent references to classic computer science axioms and infosec culture.
High level of technical detail, especially for supply chain and AI-driven vulnerabilities.

Useful for Listeners Who...

Want to understand how “subtle” DevOps missteps can have massive downstream effects;
Need an overview of how AI is upending both attack and defense in cybersecurity;
Are evaluating (or being pitched) AI-based cyber products and want to know what matters in this hype-heavy segment;
Appreciate war-stories, informed skepticism, and clarity on why some old solutions (e.g., DNSSEC) just aren’t working out.

End of Summary

Loading summary

Transcript123 lines

[00:00]
A
Foreign.
[00:06]
B
And welcome to Risky Business. My name is Patrick Gray. Adam Boileau is back on deck and he'll be joining James Wilson and I in just a moment to talk through the week's security news. And there's lots of like awful and funny things happening, so that's going to be fun. And then after that, in this week's sponsor interview, we're going to be hearing from Bobby Filler, who heads up artificial intelligence over at Sublime Security. Sublime Security makes the most modern sort of contemporary iteration of an email security platform. So if you are, you know, looking to get the best in class email security platform, you want to hit up Sublime. And we're talking to Bobby about, I guess, how customers these days are evaluating AI features in products. It's an interesting conversation. They are very heavy. Sublime. And yeah, he's going to walk us through the conversations they're having with customers and the questions they're asking, which seem to be the right questions, if I'm honest. And then we can sort of. We also have a bit of a chat about how, you know, LLM AI selling that compares to selling machine learning based AI, if you want to call it that, from, you know, a decade ago. So it's all in all, it's a very interesting conversation and it is coming up after this week's news, which starts now. And look, we got so much wonderful, delicious chaos to talk about, but we're going to start off by having a chat about this mini shy Ludworm. We've seen this worm originally pop up last year sometime. We talked about it at the time. You know, it's a self propagating NPM worm. In this case though, the initial access is a really interesting thing. The thing that started this all off and it wound up infecting Tanstack, which is an extremely wide used thing in the dev ecosystem. I mean, James, you're the engineer among us. We're going to start off with you on this one. What was the interesting vector here? And can you give us a bit of background background on Tanstack?
[02:00]
C
Yeah, let's start with Tanstack because this is complicated machinery and a complicated landscape. So if you're building a React app, React is like one of your two fundamental decisions. React is the framework that you're working with and you're probably writing the code in Typescript, but that's kind of akin to saying, well, I've bought my block of land and I've got my plans for my house, but there's a heck of a lot of other decisions you need to make about how you're going to assemble that app. And it's things like what's going to handle the API routes, what's going to handle the state management. All these things are an entire separate ecosystem of components that have built up. And Tan Stack is a set of those components. They've become wildly popular and they've sort of forged their own paradigm within the React community. So that's what it does. It is a very integral part of building a React app. Now, the thing that is super interesting about the way that this initial attack vector happened here is there was no leaked credential, there was no phished credential. There was none of the traditional sort of weighs in that you would expect for an initial compromise. Instead, it relied upon a malicious pull request making its way through a GitHub action. And that GitHub action was, to the admission of the Tanstack folks, improperly configured. But it's just delicious how they did this.
[03:23]
B
We talked a little bit about this. We ran through the run sheet before we got recording. We do this every week, right. And the misconfiguration here was quite subtle. Like this was not like Tanstack did something completely suicidal and dumb here. They did slip up a little. But why don't you walk us through the mechanics of how this malicious GitHub action would wind up giving these attackers access to Tanstacks repos.
[03:48]
C
Yeah, it is funny, right? Because you pulled me up on the fact that I was approaching this from my software engineering perspective, which is like they tried so hard and they did almost everything right. And you said, buddy, you're in security now and there's no prize for getting just about everything right. And sure enough, there is a particular GitHub action that happens on a pull request trigger, which is it's kind of the most. One of the most dangerous areas where a GitHub action can operate, because it's essentially the moment when someone says, I've got a pull request, a set of changes that I would like you to bring into your repo. And this action fires within the context of the repo that the change might potentially be merged into, but can, if so configured, also pull in those untrusted changes that have been proposed by the external third party in this pull request. The advisory from GitHub that's actually been out for a couple of years now, to be fair, does say, this is a very dangerous foot gun. If you're using this pull request trigger action, be very, very careful about bringing in the untrusted part of the code repo, because this is what can happen. And you know, they, they didn't follow that advice. And that's exactly what this GitHub action did. And that's how the attacker got their foothold. They got code run, they poisoned a cache that was then used during the deployment. And it was actually the legitimate deployment step that then pulled that cache out and resulted in the bad binaries being uploaded to npm.
[05:17]
B
I mean, it does feel a little bit though that GitHub has given everyone a foot gun and said, don't shoot yourself in the foot with this foot gun. You know, Adam, let's bring you in on this one. I mean, this GitHub action stuff, it just seems like perhaps GitHub could be doing more, but I don't know because I'm not an expert in this field.
[05:35]
A
Yeah, they absolutely could have avoided shipping people a foot gun. I think kind of here what we are seeing is that I don't know that anyone really expected the whole industry to coalesce around GitHub as the way of building everything. And one of the things that I thought was interesting about the story is
[05:55]
B
that, well, remember, sorry to cut you off there and ruin your flow, but you remember when Microsoft bought GitHub and everyone's like, it's over for GitHub. That's it, no one's going to use GitHub anymore. GitHub's dead, right?
[06:08]
A
Yeah. It's weird how much GitHub has become a critical part of everyone's of infrastructure and flow. But yeah, so in this particular case, I feel like they did so many things right, as James was saying, but the GitHub action kind of set up like this. I don't know that anyone really expected this to become so important and this particular footgun to be such an important thing. I mean, the idea of running a little bit of automation when you push stuff in and out of a git repo or make pull requests or whatever else seems like a good idea. And all these little things, they all seem like good ideas until you start really crossing important trust boundaries and then building infrastructure that in very kind of near ish real time pulls in dependencies from other people and so on and so forth. And we end up with the impact of these kind of little choices turning into quite widespread compromise of code. And I thought the trick of using the kind of shared cache between GitHub actions can cache artifacts that they've built or whatever else they want to speed up later steps in the process, which when you are redownloading heaps of infrastructure or heaps of dependencies and stuff like that. That can make a pretty material difference to how responsive your build pipeline feels. And developers love having their build process be really snappy because then they can turn things around really fast and it feels like you're doing stuff. That kind of desire for everything to be snappy and responsive was the second kind of part of this. The first thing is GitHub Actions executing kind of macros, I guess, or triggers automation events that potentially have untrusted inputs. One problem, the side effects of those untrusted inputs being processed, in this case poisoning a cache that's then used later on is the second part of the puzzle. And then the third part of course is once it was executing in the release context, then it drops this worm that will propagate into other people's repo steal creds.
[08:06]
B
Well, let's go there because we've covered the tanstack part of this. But then from there, as you say, it dropped a worm which went off and self propagated. And it's like it, you know, you just get on social media and it looks like people are having a bit of a hard time containing this thing. Like it is running real quick, you know, and we went through this last year and everyone's like, oh, we've all taken steps to slow it down. And GitHub's like, oh, you know, we're gonna, this ain't gonna be a problem in the future. And like here we all are. And this time the people behind this worm have added some real nasty features. And one of them is if you try to rotate cred so it will, it will, you know, hand off all of its API tokens and stuff on your machine to the attacker. If you try to rotate them, it detects that they've been invalidated and rmrfs your whole drive, which is just like the nastiest thing to do. I don't even know why you would do that unless you're just like, I don't know, it's like a real antisocial personality disorder kind of thing to do with your, with your NPM worm. But like this thing's out there now and causing all sorts of drama. But what, to what end is this? Is this, you know, sitting above this whole thing? Is this going to be something really dumb like someone trying to steal cryptocurrency or something? Because that's the vibe I get here.
[09:16]
C
It's very hard to tell because again, this all amounts to stockpiling Creds that we then go like, for what and for where? And so similar to pcp, it might be a little while until we see either the follow up actions that they decide to do or they just kind of farm the creds out and let other actors have a go with them. But it like, it is, it is odd that we don't know where this is going, but it'll go somewhere. The other thing I wanted to add in here is I had a real cold sweat moment of panic this morning when I read this because I thought to myself, it's okay, I haven't done like an NPM install or a BUN install for a while now. So I know I didn't pull in packages during this, you know, six minutes, six whole minutes where this was ablaze. But then it dawned on me that the way that coding agents work now, both Codex and also Claude, they work in git work trees. And so when you farm your agent off to go and do a task, it's working in a new directory that doesn't have all these modules installed. And so time after time after time throughout my day as I'm interacting with agents, they are pulling these packages over and over again. So yeah, folks need to think you might not have done an NPM install during this window, but which one of your agents was working on something that went and did it.
[10:29]
B
You might not know that you did that.
[10:30]
C
You might not know that scared the hell out of me this morning.
[10:33]
A
Yeah.
[10:34]
B
All right, we'll talk about our incident response later. But yeah, probably time to move on to the next story now. And funnily enough, the next story we're going to talk about is this breach of Instructure and the learning platform Canvas, which is used by K12 schools and colleges worldwide. I mean it's reported as being an America thing, but I can tell you that, you know, universities here in, in Australia and across in your country, New Zealand, Adam, have been like delaying exams and things and dealing with this. Funnily enough though, it being the big story, I don't know that we've got much to add here. I mean it was a small breach that initially instructor was like, ah yes, you know, attackers tried to compromise us, but we have contained it. And yeah, it turned out not so much. A lot of their data got walked, including according to the attackers, billions of messages between students and their educators. And then of course there was a, you know, ransom note kind of dropped on a Shiny hunters ransom note dropped on the login page for this system. Reporting suggests that they have actually paid now to get the data deleted. And you know, I mean, that's pretty much the end of it. It looks like Shiny Hunters managed to rack up a win here. Adam, any thoughts?
[11:45]
A
Yeah, I mean, it's just, you know, paying them out feels bad, but on the other hand, you see the amount of pain it was causing, you can understand why that was a decision that they were going to consider. And especially in cases where you've got a supplier and their customers and the customers are all applying pressure to the supplier to do something, anything, and the only fast option that a supplier has is to go pay. And that feels bad, but you understand why. And you know, I don't know, hopefully very little of it actually ends up in the pockets of a Shiny Hunter. But you know, well, you hope they
[12:17]
B
slip up doing their, you know, their money laundering or something. Right? Like that's, that's the vibe I get here, is that that could be, that could well be how this ends because Shiny Hunters definitely has like UK teens vibes, right? Yeah.
[12:30]
A
And I can't imagine their money laundering slash, you know, kind of money handling slash not going out and immediately spending it on, you know, on, on trash that's immediately kind of draws attention to them. Doesn't feel super likely. Like they may be good at the tech stuff, but that doesn't necessarily translate to good at, you know. Yeah, long term crime.
[12:48]
B
I mean, everyone always criticizes people who pay. But I've said it on the show a million times, right? Like sometimes it's existential and they gotta pay, like. And it's really hard to say that it's uniformly wrong in every circumstance. Which is why I was against legislative proposals that would have outlawed paying ransoms.
[13:04]
A
Yeah, I guess that's why I cast it as feels bad as opposed to is the wrong thing to do, you know.
[13:09]
B
Yeah, exactly.
[13:10]
A
It just feels gross. But what are you gonna do?
[13:13]
B
Well, we're gonna stay with you on this next story, Adam, because this one has you written all over it. I thought of you as soon as I saw this. Where to begin? So someone screwed up in Germany rotating a key signing key for DNSSEC which meant that the entire detld was returning serve fail on DNS queries where they were signed zone files. Right. So they broke the chain of trust for all of.de. and Cloudflare basically its decision. And I agree with this decision. I think it was the correct decision. And apparently there's an RFC backing this decision as well. But they just let the whole thing file open. Right. So Basically they just switched off for the 1.1.1, you know, cloudflare resolver. They just switched off DNSSEC validation for all of Germany because of this screw up and I think look absolutely the right decision. But it kind of goes to show you that like this isn't really news, you know what I mean? Like I'm reading this straight off the Cloudflare blog. The fact that an entire TLD and a big one at that, just turned off DNSSEC validation and everything was like no one noticed, kind of tells you what you need to know about whether or not the DNSSEC juice is worth the squeeze in my view. But I really keen for your opinion on this one.
[14:34]
A
Yeah, it's a pretty interesting tale this one and I think you kind of summarized the guts of the technical aspects. The interesting bit with Cloudflare is So they their 1.1.1 resolver does enforce DNSSEC. It validates the domains that it's answering questions about. And the correct behaviour in this situation was to return an error, right? Return servfail and not answer the query. And Cloudflare very rapidly realized that that was worse than just answering the queries and marking them as not secure, saying like here's the answer, but it's not DNSSEC validated, which is what they ended up deciding to do.
[15:14]
B
Well, and thankfully, Adam, thankfully all of the software that we use and rely upon out there is set up to really take note of that note in the return zone file that says this isn't secure, you know. And you know, this really changes a lot of things.
[15:28]
A
I know we played with a bunch of DNS stuff over the years and you know, making the like it's one thing to sign your zone files and publish it for other people to validate, it's a whole other thing to say we are also going to make all of our queries fail if DNSSEC isn't available or you know, should be available and isn't because it's just gonna break stuff. And the amount of breakage that DNSSEC causes versus the amount of, you know, kind of impersonational, you know, cache poisoning or whatever other things it's trying to prevent. Really the, you know, the impact of DNSSEC is mostly about bad availability and not about integrity. And yeah, does the juicy worth the squeeze for DNSSEC really not? And especially now that we have so much other crypto layered over the top
[16:15]
B
with tl, let's encrypt One, let's encrypt one. And the thing is, like, I guess one of the reasons that I like to beat up on dnssec is the proponents of dnssec, like the real rabid ones, are among the most annoying people you'll ever meet in your life. Right.
[16:29]
A
I mean, DNSSEC clearly grew out of the kind of cypherpunk way of thinking, Right, where we should make it perfect without really accepting the reality of the world that we have to live in. And yeah, it's. I mean, DNS itself is just old tech. And then bolting crypto into old tech, you know, we just, it just, you know, brittle is the end result. Right.
[16:51]
B
We have let's encrypt, we have modern browsers, let's move on.
[16:54]
A
Yeah, and it's not perfect, right? I mean, the. Let's encrypt world and the browser, like, letting. Delegating this all out to TLS isn't the best solution, but it's the solution we've got, and it's the only one that's, you know, kind of viable in the real world.
[17:06]
B
I'd argue, Adam, that actually, in that everybody uses it, it is actually the best solution, unlike dnssec, which, which people don't actually use. But anyway, anyway, we can argue this one. We could argue this one more at great length over a beer one day. Moving on. And Google's Threat Intelligence Group has released a report all about, you know, what's happening out there with adversaries and whatever. And not surprisingly, AI features very heavily. I guess the item that they spoke about here that's been talked about most is they discovered some threat actors had used AI to uncover an O day in some sort of web IT administration tool. I don't know. That sounds like cPanel to me. I don't know. But it's like an MFA, something like that, I guess. But it's an MFA bypass bug that they found and they were able to, I guess, disrupt the actor from being able to do widespread exploitation against that. So that's great, a wonderful win for Google. But the thing that's remarkable to me is looking through the executive summary of this report, and it's all stuff we've been talking about, like, at length on the show for months and months and months. James, you've been through this one. That was your take as well.
[18:20]
C
Yeah, exactly. You know, they open with that, that exact statement that this is really just a trajectory from nascent AI usage by attackers, which is where we were from the Last report to now, this is, you know, to use their term, industrial scale application of generative models within adversarial frameworks. But the nice thing about it is you sort of break it down into six headings and each of them is a very sort of targeted look at where AI is being used for vulnerability discovery, AI augmented development of defense evasion, autonomous malware operating end to end with AI. There's a good section in here as well about the obfuscated LLM access and I think that's something that needs to get a whole lot more attention throughout industry is just like how do we prevent the large scale use of LLMs in an unauthorized sense from these bad actors through things like chat interfaces or other half baked LLMs being shoehorned into this product and it's accidentally a really nice distillation vector for an attacker. But overall it's not a super thrilling report. But it's just really great to see this all condensed down in one place that says, yes, this is happening and the trajectory is we're now at industrial scale and let's see where this goes from here.
[19:29]
B
Yeah, I mean Adam, you would, as someone who has spent your entire career basically working in ofsec, I'm guessing you would have found this one pretty interesting.
[19:37]
A
Yeah, yeah, it's a good summary of kind of where things are at and all the various places you can use the tooling. And you know, I think much like James, the idea that we can control access to models as a viable kind of strategy for mitigating the various ways that it's being used doesn't seem like a really robust path forward. But I thought there was just, you know, this is a great roundup of the state of the world and obviously they have insight by virtue of being both incident response but also operating one of the big models and they can look at the apps being used and so on. So yeah, it's always interesting reading their work because they have that kind of both ends perspective on it.
[20:14]
B
Yeah. And if you want to know about the state of the art in Terms of using LLMs to do vulnerability discovery, last week we spoke about some work from Nils Provost, who's an old school security head who did some work in instrumenting and orchestrating LLMs to do Volumdev in a way that was like as effective as Mythos, even using local models and older models. That was some very interesting work. We did talk about it last week, but since then James did a 90 minute interview and discussion with Niels all about that work, which is available in the Risky Business Features feed. So again, I know I've been banging on about it every week, but if you are not subscribed to that feed, you are missing the all some really good stuff. So head over to either Risky Biz to get the links, the subscribe links, or you can just fire up your podcatcher and search for Risky Business Features. But that is a fascinating discussion and I've also linked through to it in this week's show Notes. Now, a big thing that happened, Adam, while you were away for a few weeks is every week. It was like the agenda seemed tailor made to the special guests that we had that week, right? Just incredibly well tailored. And then you come back and it's the same, right, because we've got a whole bunch of really interesting bugs to talk about in stuff that you know very well. So first of all, there's the Dirty Frag bug we spoke about Copy Fail last week. And James, I believe you're going to correct something that you said last week that was incorrect about that. But there's this new one called Dirty Frag. We've also got bugs popping up in like FreeBSD and whatever. A lot of this feels very AI driven and I just wanted to get your thoughts on these bugs.
[21:49]
A
Actually, I really enjoyed Copy Fail. It's a beautiful bug and I went through and read some of the coverage of that when I got back from my holiday and wanted to refresh my memory about what the cybers was all about. And it just, it felt so familiar in so many great ways. Dirty Frag is essentially just another variant of the same bug. The guts of Copy Fail were that you could write corrupt the disk cache in the kernel, you could overwrite data stored in the kernel's idea of cached files off disk. And that was done through in that particular case, something in the encryption plumbing somewhere. This is another couple of vectors in the kernel that you can use to write to the page cache in ways that are surprising and use that for Local Privesque. And they're beautiful Local Privesque bugs. You don't have race conditions, you don't have memory corruption. It's targeted, repeatable. It's exactly what you want in a kernel, Local Privesque. Because the last thing you want is bugs that are going to cause instability. You want things that are super reliable. And so we love a Linux Privesque. We've had to think about Linux being essentially single user from a security point of view probably for the last 20 years. It's never really been safe to have multi user Linux boxes and it's nice to see that sort of reinforced for everybody. But yeah, I enjoyed both of these bugs actually. Dirty frag technically it's kind of two different instances of a repeat of the bug. That one that works well on Ubuntu, one that works well on everything else. But yeah, they're well worth reading and understanding. And I felt, yeah, it felt nice seeing something so near and dear to my heart, you know, on the run set this week.
[23:34]
B
Well, and then there's the FreeBSD one too, which is also pretty hilarious. And I mean these are all AI discovered bugs as far as I know. I think this FreeBSD one was a mythos discovery. And I think one thing, when I was doing a little bit of research on it, I plugged it into Google or whatever and I think Forbes was running a story saying Mythos has found a bug in one of the world's most secure operating systems. And I'm like, man, it's 2026. Like, come on, who did that headline? But yeah, walk us through this one as well.
[24:02]
A
Yeah, so the FreeBSD bug absolutely feels AI discovered. And it feels a little unsporting, honestly, letting an AI look at 26 year old or whoever old this, you know,
[24:10]
B
but it's the most secure operating system in the world, according to Forbes.
[24:15]
A
They may have confused their BSD variants there perhaps. Anyway, the particular bug in question is that a malicious DHCP server can set a value that gets written. They set a value that's given to the DHCP client that when the FreeBSD DHCP client writes it into a cache file on disk for later use, you can kind of incorrectly escape metacaracters. You can inject more directives to, to the DHTTP client and then next time it runs and repasses that file, it interprets those directives and you get codexec. And that feels like an AI discovered bug because sort of chaining that logic together of how you would use it and what it's good for makes a lot of sense. The thing that I really liked though is this bug. So writing into the lease cache file on disk, dirty frag, writing to the cache files, writing to the cache in memory, and that bug way up front with Shai Hulud, all three of those are cache poisoning. And I'm reminded of that, like classic amorphism about there being two hard problems in computer science. One is naming things, the other is cache invalidation. And that definitely felt like, yeah, that, that is ringing true this week for sure.
[25:29]
B
Yeah, yeah. Meanwhile, James, you wanted to correct something when you were talking about copy fail last week, you said you got one of the technologies, technical details role.
[25:37]
C
Yeah, so when I read the publication from the theory folks, there was mention of using the, was it the IPSEC encrypted sequence numbers? There was a bug in there that basically resulted in a predictable 4 byte write outside of boundaries. And I assumed that that meant that that was just a really simple like buffer overflow in that. And on one hand you think, okay, neat that they found that. But you also think there's so much tooling and stuff that should have caught things like that. And so I think we'll link to it in the show notes. But there's a great write up from Retro zip Retro with a zero where they actually sort of did a bit of a record scratch of like it's not your average 4 byte write out of bounds and they go deep into this and it is crazy the level of sort of hoops that were jumped through to just get this 4 byte right into the page cache. So superb work and glad they took the time to really explain it.
[26:32]
A
Yeah, that write up is absolutely worth a read if you want to understand the specific details because like it explains it so well. Yeah, I definitely recommend that one.
[26:40]
B
Yeah. And we've linked through to that write up in this week's show notes. Now look, we're talking about AI discovering flaws in other things. There's a fix just gone out for the CLAUDE Chrome extension which would have enabled other plugins, I presume they mean extensions by that to hijack the CLAUDE extension there. Is that about right, James?
[27:02]
C
Yeah, you look at this one and you just go, claude does not belong in a Chrome extension right now because it's just so simple how this was done, right? So if you've got the CLAUDE extension running, someone else, or, sorry, someone could load another extension into the browser that had no permissions, no elevated permissions whatsoever, but just, just had some nasty code in there that was interacting with the page. And then if the browser goes to Claude AI, for example, the malicious extension, all it has to do is just inject something into the dom, which is the bread and butter thing that all of these extensions do. And then CLAUDE sees that thing that's been injected into the DOM and reads it as just a prompt. It's like if it's this easy to trick CLAUDE into reading a prompt out of the DOM in a browser, get that thing the hell out of an extension. The only safe way to really use a model these days is for you to be the sole source of input into the initial prompt.
[28:03]
A
Right.
[28:04]
C
That is like the cleanest guardrail and surface we have at the moment. Is because that's when you, as the human, express your intent, your instructions. Yes, we pull in a bunch of skills and other things along the way, but you've bootstrapped that, you've set the task and that's generally what the model will follow. But when you take Claude and put it in an extension that is reading the DOM and that can become the prompt, nothing good is going to come off this, my friend.
[28:29]
B
That's going to be a bad time. And meanwhile, Google copying a bit of flack this week for shipping everybody a four gig, like, local version of Gemini with Chrome, where really people just woke up and their computer had seriously just downloaded this 4 gig update. Funnily enough, I mean, look, this is to be expected, right? This is the way the world is going. And I've included a story in this week's show, notes from Lily Hay Newman over at Wired. The headline is, you can disable Gemini and Chrome if it's freaking you out. And the reason I really wanted to include this is this is going to be a future historical artifact, this article, where people will be like, wow, people thought they could avoid AI. It's like saying, you know, here's how you can have a clean install of Windows and disable all web browsers. Right? Like, you don't want to use those
[29:12]
C
web browsers in a browser. You used to be able to turn off JavaScript and that was a legitimate decision that people would make. It's the same sort of lineage.
[29:20]
B
Exactly, exactly. But I mean, what do we think about Chrome doing this? Adam, I'd love your thoughts here.
[29:26]
A
I mean, I think the argument for having that tiny model in there was actually pretty reasonable, like being able to do certain things that you don't necessarily want to shove off to the cloud. Obviously Google, by making you run the model, even if it's a really little one, saves him quite a lot of compute, I would imagine, rather than having to like, make a. Have safe browsing, make a call off to a Google AI service every time someone visits a place.
[29:47]
B
No, no, no. So your argument is reasonable, but the counterpoint is, which it's a witch.
[29:55]
A
Yeah, I mean, the amount of things that are going to have like AI models stuck in them, like, it's, as you say, it's just like, you know, we may as well say, let's not put DLL files and things anymore. You know, let's you know, that's the kind of the level that we're at these days. So like, I understand that some people have of, you know, concerns about AI just generally as a concept like that the training of those models in the first place was unethical or whatever. And that's kind of like, I can respect that kind of point of view of it, but if you're a technical user and you want to turn off this piece of functionality in your browser, you're kind of on a losing wicket, I think.
[30:25]
B
Yeah, I mean I think there were more solid, there were more solid arguments. You remember when Sony started putting like basically malware on its music CDs back in the day where if you played one of their CDs it would like Trojan your box to put all of this crazy like you know, kernel level like anti copy stuff in your, on your computer without asking you like, you know that one. Okay, I think that we could say that one's over the line but like a browser shipping a model, get used to it.
[30:52]
A
Yeah, pretty much.
[30:54]
B
Yeah. Now look, we've just got to like, I gotta rub our temples now. We've talked about like AI bug discovery. That's a very big deal. And we've also talked earlier in the year about like how AI is being used to orchestrate attacks and scale them up and whatever. And you know, while we're on that topic, bugs in Palo Alto and Ivanti still, I mean this is the sort of stuff that's going to get absolutely auto owned by orchestrated AI agents, which I am now referring to as Infinity E script kiddies, because that's basically what these agents are. But my God, man, like at this point, you know, running stuff like Ivanti running stuff like a lot of the Palo Alto gear, it's just, you just. It was risky enough before and now it's just suicidal.
[31:42]
A
Yeah, I mean there being yet more bugs in a Vante endpoint manager. Like how is that even possible? How is there any code left that is like, surely they must have got rid of all of it by now. There can't be anything left. But yeah, I don't know what this.
[31:54]
B
Well, as Gruk said and you weren't here and I doubt you listened to it because you were on holiday. But as Grux said here a few weeks ago when talking about Mozilla patching 271 bugs that were found with Mythos in Firefox, he said infinity -271 is still infinity. And I think that just applies here.
[32:12]
A
I think so. Yeah, Gruk with the wisdom as usual. Yeah. I mean, ultimately the thing that stood out to me about the Avanti story was the CISO of Avanti came out and said, look, we just want people to understand that we are trying to do the right thing. It's like nobody. The time to do the right thing was 30 years ago when you stopped investing in the security of this product or the people who bought it. Who bought it? Who bought it. Like many corporate acquisitions ago, that's the problem is that we're running 30 year old trash code and expecting it to be robust against modern Internet, modern AI, much like that FreeBSD DHCP client being unsporting as a target at this point. Avanti, it feels like you're kicking a puppy at this point.
[32:51]
B
Yeah, well, and this Palo bug too, Adam, is quite awful. Right.
[32:55]
A
So I hadn't seen. I was looking around for a pog, apparently is a poc. So this was a Palo Alto remote code exec. It appears to be memory corruption in a content length header in the year 2026. And a thing that parses web content for a living.
[33:11]
B
I mean, that's Baby's. That's Baby's first exploit, really.
[33:14]
A
It is. So there's a slight kind of mitigating factor in that. This is in their like captive portal bit, which there's probably no reason to have Internet facing, but of course people will because why wouldn't you? But yeah, come on, like memory corruption in a security appliance in the year 2026. And there should be defense in depth and exploit mitigation and all these things, but security appliance vendors haven't had to update their products for 20 years. And so why RELRO and PI and do all of the other exploit mitigation stuff in here that would let you get away with having MEM corruption in your content header. But no, that's just parallel to life. So. So everyone's at least used to patching this stuff. So that's good, right?
[33:55]
B
Yeah. James, you looked like you had some
[33:56]
C
feelings there to that point. Everyone's used to patching it. But the thing that I got a good giggle out of was when I read the Ivanti advice. It sort of is. They bifurcate what they tell you to do based on how well you responded to the last time you got owned on this box. So it's like if you responded correctly to our advisory in January and rotated your credentials, you need to do these steps. If you didn't, you are in this bad state. And if your Box wasn't compromised in January. You need to actually do this instead. So I just love that it's now like you've got to look back over your history of, you know, not just what do you do with this box now? But.
[34:30]
B
So they released a. They released a choose your own adventure to accompany their advisory, basically, decision tree of fail.
[34:36]
A
That's great.
[34:38]
B
Moving on. And Russia is launching kind of its own version of Starlink. It looks like there's a great write up of this in Wired and we've linked through to it in this week's show notes. But it looks like it's going to take them a while to get this thing up to being quite reliable. But even an intermittent satellite connectivity over a battlefield is actually going to be quite useful. But what's amazing is, like, how quickly this sort of capability has been understood to be very important by major companies. I mean, we've got the Europeans essentially launching their own thing and doubling down on their own thing because they're a little bit worried about continued access to Starlink because of Elon Musk and the United States and its attitudes towards various things. And no doubt the Chinese will be working on their own version of this, but it just seems like, you know, this is absolutely a very important capability and they're going to do it. I mean, there's some delicious details though, in this write up about how they're. The hardware for this Russian version of Starlink is like multiple times bigger than anything that comes out of SpaceX.
[35:40]
D
Right.
[35:40]
B
Which shouldn't be all that surprising, but, you know, they're giving it a crack. James, I know you've been through this one, and I guess one of the things that you and I both zeroed in on is the orbit path for these satellites is quite different to those from the other companies, which I don't know, there's some that could be interesting in terms of, well, could you shoot these things down and have it not actually damage your own satellites, for example?
[36:06]
C
Exactly, yeah. There's two very interesting differences. One is just the count of the satellites, you know, whereas Starlink has thousands and thousands of these things get full global coverage, especially around densely populated areas. The plan here is to only launch, well, only 300 satellites by 2030. That alone will require them to churn out a satellite or two a week, which seems a little bit ambitious, but so it's a much smaller number of satellites. But the orbit path, because they want to essentially cover Russia and its various territories around there, they can get away with an almost polar orbit on A certain incline. That means that. But they're going to be out of the way of the other satellites. And they're also operating at a much higher altitude, about 800 km versus 500 kilometres. So it sort of settles an argument you and I were having because I said, you know, there's no way that, you know, an adversary would shoot one of these down because you'd take out all the satellites and then, you know, we'll never go through space travel again. But, you know, if it's only 300, that doesn't sound like you actually need to disable too many of them. And even if you had to do something kinetic it, it's probably in a flight path where that debris won't be too dangerous. So, yeah, gosh, when everyone's got these, it's going to be interesting to see what happens if there's a kinetic conflict around them.
[37:22]
B
Well, the next world war is going to happen in space, I guess, as well as everywhere else, I guess, is where we landed on this. But I think also, you know, there's probably ways to disable these types of satellites without creating a debris field. Whether that's just putting a hole in them with a laser or burning out some of their sensors or antennas, you know, there's probably ways to do it. And no doubt the very smart people over at US Space Force working on exactly this problem right now. But yeah, just an interesting sign of the times. Moving on, we've got a report here from TechCrunch from Zach Whitaker, which is looking at a Latvian hacker named Dennis Zolotarjovs. This guy was doing ransomware stuff, working for a Russian ransomware gang called Karakurt. The interesting thing is here, though, that the indictment sort of spells out links between this group and the Russian government, basically. So it's really good to have some of those links spelled out in a more explicit way than we've had previously. And that's just why I've flagged it and put it in this week's show notes. We've also got a report from John Grieg over at the Record talking about how the Muddy Water crew, which is an Iranian apt crew, essentially, they've been dropping chaos ransomware to kind of COVID their tracks, which, I don't know, that's not so surprising. I mean, it's right there in the name, you know, muddy water trying to muddy the waters on attribution. Although, James, you pointed out to me that this does not look like a particularly, you know, effective way of obfuscating attribution given that they're signing utilities with certificates that people know they use, for example.
[39:00]
C
Yeah, it did seem like well intended, but perhaps poorly executed. But I think, you know, you made a good point to me, which is that even if you get to the point of realizing, huh, this ransomware was actually signed by muddy water, you know, you. You've already had to deal with that ransomware and all the impact and. And that's what they're aiming for is just to slow you down, to distract you so they can get on with doing what they. They want to do otherwise without being noticed. So, yeah, yeah, well, I feel.
[39:27]
B
I feel like this will actually work as an obfuscation against people who don't call Mandiant, which is most people, but if you do call Mandiant, they're going to pick it apart and figure it out. I mean, is that your vibe too, Adam?
[39:37]
A
Yeah, yeah, pretty much. But I think, like, the. The specific value of this in terms of it being believable or whatever, it's kind of less important than the goal is to get people onto the ransomware off the playbook. They might have a playbook for how to respond to ransomware. If you can shunt them over onto that, you know what they're going to be doing for the next three days. And it buys you time. And anything where you can manipulate how your adversary. Well, how your victim, I guess, still thinking my offsec life. Anything where you can manipulate how they respond gives you predictability, and that's just really important. It doesn't have to hold long. It might just be long enough for you to action on objectives or whatever else. So, yeah, I think it's, you know, it's still a worthwhile trick.
[40:21]
B
Yeah. And speaking of ransomware, Foxconn has confirmed there's been some sort of attack against its factories in North America. Looks like limited impact, though. I mean, from what we can tell so far, it looks like they had a bit of drama with, you know, the WI fi didn't work and they couldn't use their computers and whatever. And I don't know if that's ransomware or if it's people like, you know, if they're pulling stuff down and responding or whatever. But it looks like they're all back up now. Many such cases these days, right, where you actually see, you know, there's an initial foothold, there's a bit of drama with the ransomware attack, and then things are quickly brought back to normal. So it looks like perhaps that's what's happened here. But one story I wanted to talk about in a bit more depth. John Greig has this write up in the record. There is a CISA initiative where it's called CI Fortify. And the idea here is that they're going to help critical infrastructure operators to figure out how to operate offline. Now this could be because there are DDoS attacks happening targeting critical infrastructure. It could be because a campaign like Vault Typhoon is starting to do unthinkable things to us. Critical infrastructure could be for any reason like that. But I think this is a really good idea. I think in terms of like resilience, you know, going to a utility, going to a bit of critical infrastructure and saying, look, how dependent are you on US East 1, right? And that link being active enable in order for you to actually be able to operate your service and like can we build some contingencies here? Funnily enough, James, at one point in your career you kind of went through this with a critical infrastructure provider and where it's landed is okay, so for the, for the really insecure utilities and whatever, going through this process is going to be a really good thing to do. But you made an interesting point to me which is for the people who really know what they're doing for the highly sort of secure environments, having to go through an exercise like this actually winds up introducing a fairly large amount of new attack surface because there's all of a sudden a lot more new equipment and a lot of redundancy. So. So yeah, there is a downside I guess for some operators was the point you were trying to make to me.
[42:28]
C
Yeah, I was looking at it from the perspective of if an organization has gone through their cloud transformation and moved a lot of on prem workloads up into the cloud and they're heavily dependent on telco infrastructure that is managed for them, that's the kind of stuff that is straight away is going to be in the crosshairs if they're challenged to say, well how do you operate if all of that stuff is gone away inevitably and this is the situation I went through is we had to start bringing some versions of software, really key critical things back on prem in a hybrid cloud and on prem model. Now that introduces complexity, right? Not just it's one thing to go from cloud to on prem, but if you're having to manage both active standby or active active on prem and cloud at the same time, you've got complexity, you've got additional equipment, you've got additional configuration and all of those things when you're not in the offline mode become really excellent places for an attacker to get access to and to dwell.
[43:27]
B
My point is, I think this is still. I mean, you have to go through this, right, Would be my point, which is we live in a world where these links, these data centers, I mean, we saw in Iran, right, Iran actually attacking Amazon data centers, for example, right? When there's going to be a conflict, these. This infrastructure is going to be targeted and we need to make sure that our critical infrastructure is resilient. Yes, it does cut both ways, but I don't really see that we have another option but to do this. Adam, what's your take on this?
[43:53]
A
I think this kind of exercise is really useful and important because we build this stuff so quickly, all this infrastructure, everything is all high tech very quickly and we haven't really thought about the failure modes. Now, what you do with your. If you go ahead and do this kind of exercise and you understand now what the potential failure modes look like, what you do with that is another thing. Building redundant infrastructure or offline, bringing stuff back up of the cloud, like, those are things, paths that you can go down, but at least understanding what the potential impact could be so that you're not exploring that the first time when it actually happens. And I think like in New Zealand, for example, we have one, essentially one or two kind of big fiber links in and out of the country. If those go away because someone does a little submarine snippy, snippy, which, you know, great powers have been known to do in times of conflict, little snippy snippy and like, we don't understand and like, sure, there's a bunch of things we could do. It's probably prohibitively expensive. We're just going to accept the risk, but at least understanding that all our national payment systems are going to stop working without that piece of fibre, that's good knowledge. And it's worth doing these kinds of exercise to know, are we going to have power, are we going to have water? You know, because you don't want to be doing this for the first time for real.
[45:05]
B
And of course this will happen because the first objective of a superpower during a great power conflict will be to take New Zealand out of the war.
[45:14]
A
Strategic dagger pointed at the heart of Antarctica.
[45:16]
C
Yes.
[45:17]
A
Yeah, no, we're just going to stay here with our sheep and because we're not exporting all of our food anymore, we're going to have plenty. It'll be fine, I hope. Just no diesel.
[45:26]
B
Yeah, yeah. What is it? The New Zealand defence strategy of well, they have to go through Australia first.
[45:35]
A
Your taxpayer dollars buying F18s. Thank you very much.
[45:38]
B
F35s. Thank you very much. We get the good stuff we got. I wanted to link through to a report which unfortunately, unless you're a 404Media subscriber, you won't be able to read it on web. If you are an email recipient, you would have received this one. So you can dig it out of your, out of your inbox. It did go out for free as an email, but Jo Cox managed to get his hands on the software. A bit of Chinese software that's used by people doing scams to do deep fakes. Right. So you can grab pictures of people, send them to these guys for 500 bucks. They will create like a model of this person you're trying to impersonate and it will do it in real time on Zoom, on WhatsApp, on whatever you want. A really fun write up of like Joe's adventure in like how going and procuring the software and they even like remotely set it up for him. They set up a partition on his, on his computer and in they came remote support. Very slick operation, James. I mean, you know, I know you enjoyed this one as well.
[46:37]
C
I did. A couple of things really jumped out at me first is, yes, just it was such a white glove. High touch service. So beautiful to see. They've considered customer service but didn't take a lot in terms of hardware specs. They demanded a i7 processor, 16 gig of RAM and an Nvidia 4080. That's not the kind of spec that I thought would be needed to pull this off. So that was quite surprising. But the thing that is like just this moment when you go, oh my goodness, we're in trouble is when you read the part that says exception spelled with an X, which is a deep fake detection model. It struggled, they say in inverted commas, but struggled was actually almost 100% of the samples from this software they acquired was mistakenly labeled as authentic. Despite this research being the state of the art of deep fake detection.
[47:30]
B
Yeah.
[47:31]
C
And the videos look good. There's still a little bit of uncanny valley. But you know the fact that it works when there's something in front of the face, it tracks lighting differences. It's gosh, it's getting good.
[47:43]
B
Yeah. And I wonder, I wonder. I mean we've had Persona on a couple of times, right. And they talk about how they do real time video and stuff and that's a big part of how they actually do like kyc Style, you know, verification of identities and it's being used in the enterprise and whatever and you, you worry how enduring that sort of approach is going to be. Now I'm sure they've got labs and they're cooking up all sorts of ideas and detections and whatever, but this, you know, remotely verifying someone's identity is correct. Is a wicked problem. It has always been a wicked problem and it's going to remain a wicked problem. And I feel like we've had an easy run of it with, with video recently and now that's kind of gone. And our final piece this week, which is, you know, I guess we'd call it our skateboarding dog. We spoke about how the FCC in the US is going to ban foreign made routers and it looks like they're pushing through with this, but it looked like they were also going to ban foreign routers from patching and issuing patches from like March next year. They've now realized this is not a great idea and they have pushed that patch ban out to 2029. Still a bad idea, but it's further away. They've also reversed the ban on patches for drones. So well done, fcc.
[48:55]
A
So many bad ideas. But yeah, bad idea. Two years down the track, we've got a chance with. It's the bad ideas this week that these days we are reduced to struggling with. So I guess good news and we'll check in on a couple of years and see, you know, what they do then.
[49:10]
B
Yeah. Right, guys, well, that is it for this week's news. Great to have you back, Adam. James, great to chat to you as well. And yeah, we'll do it all again next week.
[49:19]
A
Yeah, we certainly will, Pat. I'll see you then.
[49:21]
C
Yeah, thanks, Pat. See you in a week.
[49:27]
B
That was Adam Boylo and James Wilson with a check of the week's security news. Big thanks to them for that. It is time for this week's sponsor interview now with Bobby Filler, who heads up AI over at Sublime Security. If you are not familiar with Sublime Security, it is the modern whiz bang AI enabled secure email platform or email security platform. So you know, if you need to filter out bec, if you need to filter malware, phishing links, things like that, you know, it is that sort of platform. It is the most modern iteration of one of them. It's also highly inspectable. You can write custom rules for it or you can just get their AI agents to do that for you. That stuff actually works really well, crazily well in fact. But Bobby joined Me to have a bit of a broader discussion about AI in the cybersecurity marketplace. You know, when people are evaluating cybersecurity solutions that use agentic AI, what sort of questions are they asking? What are the things they want to know? And let's just start it there. But we go on to talk about a few other things, like how this is a bit similar to the machine learning craze of like 10 years ago. It's a fun interview. So here's Bobby Filler talking about how customers go about evaluating agentic cybersecurity platforms. Enjoy.
[50:37]
D
Yeah, I think they go about it a few different ways and honestly the easiest one is just asking questions, right? How has this agent been trained? What is its background of knowledge? Do you use evaluations, offline, online evaluations to monitor performance? When we're talking about things like agentic use and autonomy, has it been red teamed? Right, like that's, that's a real situation that folks need to consider at this point is these agents can do a variety of things. They have different skills, different tools they can reach into. And if that hasn't been thoroughly tested internally and externally, customers are understandably wary of that. And then as you move down the line, I think it turns into what's your methodology? What is the reason for building this agent in the first place? Like, what problem did you identify where you felt like I needed this? And those types of questions, I think really suss out whether or not a vendor is bolding on something that is just kind of an afterthought, checking the box or whether or not they're building it with, with good intentions. This idea of up, leveling the customer, giving them an opportunity to grow with the product until the point where they feel comfortable releasing some of their day to day responsibilities to it.
[52:02]
B
Now do you feel like there is, and I definitely feel like this is something that's happening out there. Do you feel like there's some AI fatigue among buyers at the moment? Because I feel like everybody has like bolted some sort of AI function onto their thing. They're like, we're an AI platform because like that's what you have to do at the moment. Right? It's just what you have to do. So I feel like if I'm a buyer at this point, I'm like, oh, you want to pitch me your AI solution, do you? Oh, great. Like I haven't had like 10 of them this week.
[52:31]
D
Yeah, it's, it's interesting. So I grew. I, I kind of got my background in the, in the, heyday, of early machine learning being introduced into security products. I'm from Endgame. A lot of my, A lot of my security AI friends were in like Crowdstrike and Silence and things like that. It was always really funny because you would go to rsa, you'd go to Black Hat and be like, I'm the person who, you know, uses math to catch malware. And they're like, no, just like. And you probably remember snake oil booths and things like that popping up and it was, it was kind of a joke for a while and it was a, it was a tough one.
[53:08]
B
I mean it always, it always worked though. And I think, I think AI is kind of the same. It's interesting that you mentioned this, right, because like Ryan Permit, I know still I'm in touch with Ryan who was a co founder of Silence and, and I remember running into him at Black Hat once like just before they launched that product and I'm like, hey man, you know how you beat. He's like, I've been working on something, it's really cool. And silence. You know, look, in the end they had an exit to, you know, wherever they wound up. And it wasn't really that spectacular, but the product was interesting. I think where they messed up is they missed the EDR train, right? But as an anti malware engine doing machine learning classification, man, it worked well. Like it worked really well. The problem with all of that machine learning stuff was always going to be the edge cases. It was like how do you handle like anti cheat on that ships with a game or how do you handle enterprise products that look like Trojans? But, but that was. The thing is fundamentally this technology was incredible. But then all of a sudden everybody's like, it's got machine learning, right? It sort of does feel like a repeat of that whole.
[54:05]
D
It does. I feel like it kind of like ebbs and flows and it's fascinating now when I'm, when I'm in a lot of these, these customer meetings and talking to folks, I don't get, I don't feel the same pushback that I used to in like 2016. What I get instead is there's usually some pressure from a higher up being like, look, if, if you get funding for this project, we need, we need the latest and greatest, latest and greatest is AI. And it's like, okay, so there's some self fulfilling prophecy there that kind of takes place.
[54:39]
B
So people are slapping AI on stuff so that like people can get authority to buy it because they've been, there's a mandate from heaven that says that they need to find efficiencies. Yeah, ye, yeah.
[54:48]
D
And then I think on the flip side, like in 2016, people weren't using, even accidentally using machine learning in their day to day, whereas now it is so, it is just so pervasive in everything that you do, where I just kind of wonder, there's just like a general malaise or a general comfort around, like, okay, I'm already familiar with what a lot of this is, so what does that mean for me and the product that I'm trying to buy? Like, maybe I'm not pushing back, back as much as I should be.
[55:21]
B
That's it. That's a really interesting point, which is that if, if people, if the people making the purchasing decisions or running these programs already familiar with chatbots, they've got a general familiarity with what they can and can't do, right? They've got a feeling for it. Right?
[55:34]
D
Yeah. And I find that, I find that to be the most interesting because it's like, ah, yeah. But meanwhile, like US cyber security experts are often on the sidelines watching people use these tools externally to cybersecurity being like, wait, wait, wait, wait, you know, don't put your medical records on here. Be careful what you, you hook your machine up to and allow it to do. And, and then on the same note, we're building tooling that's being like, yeah, you could probably take your hands off the wheel. It's, it's fine, we'll remediate things, we'll catch things. And it's, it just seems to be, the message just, just doesn't seem to be. I don't know if it doesn't resonate or, or if we're just not thinking through the potential, potential impacts. But I feel like there's an opportunity for the entire industry to kind of take a step back and be like, what are we actually trying to sell here? And what does that look like? And that's kind of what I've been trying to communicate internally with this, this idea of like SAE levels for autonomy, for lack of a better way of putting it.
[56:33]
B
But I mean, just going back to what we were talking about earlier, which is like, how do people gain trust in these systems? I mean, it sounds like what you're saying is the starting point, the starting level of trust is already kind of high because people are familiar with this, you know, with, with, you know, basic, you know, chatbot technology. So there's already that starting level of familiarity and then they're working through this stuff. Mostly with questions. What are the questions that people seem most concerned with? I mean, you mentioned red teaming as being a big, big concern. Like what are some of the other ones where people are really like, you know, this is a, this is a deal breaker question for us when we're looking at evaluating an AI enabled security technology.
[57:12]
D
I think one of the bigger ones I hear is just about data flow kind of through these agents. Right. There's a lot of, I think misunderstanding about what tends to happen. And I feel like the main FUD around AI use in general is like, oh, these frontier providers are going to take all your data and train on it.
[57:34]
B
It's funny how that became just an accepted truth when it is just not the case at all. That's not how this works.
[57:41]
D
Right, right. And that's, you know, I do elements of that happen probably on some level, but cybersecurity industry in general has so many policies and guidelines that we need to adhere to with regards to data. It's like, we're not just vacuuming all this stuff up and then shipping it off to, to a frontier provider and being like, give me a response back, charge me money and, and keep the data, it's yours. That, that isn't really how it works. So, you know, part of it is, is education. Right. With these customers and being like, look, this is, this is the way this flows. This is what these models actually do. When we say we're learning from your feedback or, or from mistakes like this is what we mean. It's, we're not going back.
[58:30]
B
What's interesting here, what's interesting here is that, you know, you're talking about where there's pushback. It seems to be, well, is it, is it safe? They're like, is it red teamed? Is our data contractually protected against being included in a training set? So it seems like people are not so much pushing back on. Can this thing do what it says it does? They're more pushing back on. Is it safe to use it? Do you think that's the dynamic here?
[58:53]
D
I think that's the start. Right. So that's, I can pitch this more as those are the questions you get pre POC or pov. And then once it's in their environment, that's when you get more of the operational questions. Hey, what was this trained on? Do these things look like my environment? If they don't, is it an approximation? If it's not an approximation, how do you learn? At what point should I feel comfortable hitting the toggle button saying like, I need to Be in an active kind of feedback loop as opposed to a passive feedback loop. And that is a really interesting thing to navigate because it really can be a choose your own adventure, what level are you comfortable with? And you can chip away at that as a vendor by giving them explainable kind of transparent reasoning along with, with any decision that it makes. Or you could just say like, look, you could treat this as any other feature, it's just slightly, any other machine learning feature, but it's just slightly more intelligent. And we've found personally that it's a, it's kind of a back and forth, a give and take where you're showing them evidence, trust is built up. You make a mistake, trust can degrade. But then how quickly do you turn that around? Or is the explanation around why that mistake stake occurred strong enough where that sort of trust did not evaporate or go down?
[60:24]
B
You just said something fascinating there, which made me, yeah, change my thinking, I guess about how all of this works. Right. Which is you said, oh, it's just like a machine learning thing, but it's smarter. And in many ways that's true. Right, because LLMs are just machine learning but like at ridiculous kind of scale that was thought to be like impossible previously.
[60:42]
D
Right, right.
[60:43]
B
You know, you could do that but you'd need, need so much compute. It's ridiculous. It's like, yeah, here we are like hundreds of billions of compute later and that's what we got. No one expects machine learning solutions which are just bought and sold like without any question. No one expects them to be perfect and never to make mistakes. But it seems like when it comes to a lot of these like contemporary AI solutions, that expectation is very different. I'd never thought of that before but like if your machine learning based like IDS or mail filtering things makes a mistake, like no one's even complaining at that point. Like they might grumble about it a bit if the mistake's really bad. But like why is it that there's such a higher expectation that these AI solutions are going to have to be, you know, perfect? Like people will look to like point to them making a mistake and say see, this technology is rubbish. They don't do that without the tech. Like why?
[61:33]
D
Yeah, I, I chalk it up to, you know, the, the hype, like the, the marketing hype around AI in general is, is it like a, it's such a level that I feel like it's, it's very hard to walk back and you know, I, I recall the, the days where it was like Our machine learning catches 99.999. It's like, that's probably not true, but I think now there's just this expectation that even when you make a mistake, these things are, are so smart that it's just going to pick it up the next time around. It's like when you're, you're talking to a frontier model via chat and you're like, no, no, no, that's, that's a mistake. And it's like it takes on the Persona of a human being and you're like, oh, yeah, that's actually, that's a sharp question. Or that's a good point. And I think human beings take that feedback and they're like, oh, it's learning. So now, now I shouldn't see that mistake ever again. And I think where you run into problems, particularly with, with being a security vendor is you're pulling in these frontier models. You're not actively adapting them, right?
[62:39]
B
Like, no, I mean, literally, I was
[62:41]
D
going to say $30 million, right?
[62:42]
B
Yeah, yeah, yeah. Like you're saying, oh, people, you know, are they learning? But they don't, you know, and even if you put, you know, like, even if you prime them with the right instructions and prompts and whatever, they still ignore you every now and then. We saw this Twitter, Twitter thread recently where someone lost their entire production environment because they thought their text based instructions to a model of never ever do this were guardrails. And they're not.
[63:04]
D
Yeah. And I love that consumers are getting a little bit more savvy and they're learning more of the nomenclature and kind of what to ask. So it is cool to get things like what guardrails do you have around that? And it's like, well, here's kind of what we're doing and this is what we give it as access to. And it, sometimes it satisfies things, but other times they pull the thread a little more and they're like, all right, well talk to me about tool use. Like what tools do they have? And they're, they're coming at it and it's getting, I want to say, maybe more precise the way they're, they're thinking about it and they're, they're starting to pull the right threads.
[63:42]
B
So as you go through prop poc, they start asking you, well, how, you know, why do I need the AI? What does the AI actually do?
[63:49]
D
Yeah, yeah, exactly. And it's just like, and then they get that taste, right? And they're like, oh, wow, this, this like takes care of this problem that I have or I'm throwing too many people at. You're like, great. And then it's usually at that point they're like, could we, could we put it over here? And I'm like, it took us so long to get to this point. Like, let's take a breath, let's learn and then we can start to move it over. And it's. Yeah, the parallels I feel like with self driving cars and, and kind of what we went through in the late 2000 and tens is like certainly not lost on me where it's just like, yeah, kind of funny. Kind of funny.
[64:27]
B
Bobby, we're gonna have to wrap it up there. We are out of time. Great to chat to you about all of this. And yeah, for those interested, they can check out Sublime Security, a great email security product. Thanks again.
[64:38]
D
No, thank you. Take care.
[64:41]
B
That was Sublime Securities. Bobby Filar there. Big thanks to him for that and big thanks to Sublime for being a risky business sponsor. And that is it for this week's show. I do hope you enjoyed it. I'll be back soon with more security news and analysis, but until then I've been Patrick Gray. Thanks for listening.