Summary9 min read

Security Now 1069: "You Can't Hide from LLMs"

Released: March 11, 2026 | Hosts: Steve Gibson & Leo Laporte

Episode Overview

In this episode, Steve Gibson and Leo Laporte explore the transformative and somewhat unsettling influence of large language models (LLMs) on cybersecurity and privacy. From Anthropic's Claude finding vulnerabilities in Firefox and developers' evolving reliance on AI-powered tools, to the ominous reality that LLMs can now de-anonymize pseudonymous identities at shocking precision, the episode is equal measures technical, practical and thought-provoking. Along the way, they touch on streaming device privacy, password randomness, the security of localhost services, and much more—concluding with a deep examination of new research showing how LLMs obliterate assumptions of online anonymity.

Key Discussion Points & Insights

Community & Listener Interactions at Zero Trust World

00:35-04:03

Steve and Leo recall meeting fans and security professionals—including some with government backgrounds who used SpinRite for recovering data from field drives, and discovering a copy of SpinRite is even used on the International Space Station.
Steve: "We had a story this week that somebody did some instrumentation of Firefox. 10% of Firefox crashes and failures came from bit flips, you know, in non ECC ram ... often because cosmic rays are striking your ram." (03:40)

Main Security Headlines & Show Roadmap

04:13-08:03

ETH Zurich and Anthropic show LLMs are excellent at de-anonymizing internet users from small samples of public posts: "You can't hide from LLMs."
Anthropic's rapid progress with Claude is highlighted, especially its collaboration with Mozilla to assess Firefox's security.
Cross-platform RCS encrypted messaging between Apple and Google is coming.
Ubuntu changes its default sudo feedback for password entry.
The risk of inviting web proxies (Bright SDK) into smart home devices is explored.
OpenClaw, a self-hosted AI assistant, fixed a critical remote takeover flaw.

Picture of the Week: Sidewalk Fail

13:02–15:32

Listeners provided a humorous photo of a sidewalk blocked by an awkward barricade and sign—serving as a metaphor for 'obvious' security solutions.

Major Segment: Anthropic & Mozilla—AI-powered Vulnerability Discovery in Firefox

15:35–42:39

Key Points:

Claude 4.6, Anthropic’s LLM, found 22 vulnerabilities in Firefox in two weeks; 14 of the 22 were high-severity and accounted for almost a fifth of Firefox’s 2025 critical vulnerabilities.
- Steve: "AI is making it possible to detect severe security vulnerabilities at highly accelerated speeds." (16:32)
Claude reproduces and discovers vulnerabilities much faster than traditional methods.
Claude could reproduce historical CVEs (perhaps due to overlap in training data), but more significantly, was able to find new, unreported vulnerabilities in Firefox’s JavaScript engine.
- After 20 minutes, Claude found a "use after free" bug and quickly found more than 50 unique crashing inputs.
Claude’s exploitation capabilities lag its detection capabilities: In $4,000 worth of API calls, it managed to generate working exploits in only two cases, and even those wouldn’t have bypassed modern browser sandboxing.
- Steve: "Claude is much better at finding these bugs than it is at exploiting them ... the cost of identifying vulnerabilities is an order of magnitude cheaper than creating an exploit for them." (35:46)
Best practices for LLM-powered bug hunting: Accompanying minimal test cases, proof of concept, and candidate patches—all vital for responsible disclosure.
AI’s positive impact currently outweighs its potential for harm—though this window may not last.
- Steve: "If you're not using AI, get on it, because this is where this has all moved in the last couple months." (36:42)
- Steve: "AI is only going to get better. And so if you haven't cleaned your code of exploitable vulnerabilities by the time... the exploitation side, you'll wish you had." (41:01)

Segment: Web Proxy SDKs in Smart TVs—Privacy Risks

52:38–69:50

Bright Data SDK has been embedded in smart TV apps, turning TVs into global residential proxy nodes for web scraping, used by AI companies for model training and data gathering.
Users are sometimes (but not always) prompted to opt-in, but background data collection can persist even after app closure.
Steve: "While it might feel a little yucky, it's also diabolically clever. There's really no way to prevent it if the smart TV provider is willing to go along." (67:41)

Segment: OpenClaw Remote Takeover Vulnerability

72:31–86:50

OpenClaw, a self-hosted personal AI assistant, was vulnerable to remote takeover via a brute-force attack against its local websocket server (port 18789) because localhost connections were exempt from rate-limiting.
Malicious websites could exploit this via JavaScript running in the browser, connecting to localhost services.
Steve: "A website visited by the user ... can itself silently open a connection to WS127001 ... without any user prompt, warning or permission dialog." (79:35)
Immediate patch was shipped; update urgently if you use OpenClaw.

Notable exchange:

Leo: "I always thought that localhost was inaccessible from the outside world."
Steve: "It's because it's coming from your browser ... the JavaScript in your browser is able to reach into your computer because it's already in." (82:36-84:52)

Security News Shorts

47:13–52:31, 86:50–95:59

Apple and Google will soon offer cross-platform, end-to-end encrypted RCS messaging.
Ubuntu will now echo asterisks for sudo password entry by default (controversially).
TikTok will not offer encrypted messaging, arguing it would increase user risk.
Microsoft briefly banned the term "Microslop" in their Copilot Discord server to combat spam.
Interesting listener stories about using self-signed certs for internal software, best practices for private local dev, and using CISA's free Cyber Hygiene Service for vulnerability scanning.

Listener Feedback & Notable Moments

Randomness and Passwords from LLMs

108:57–114:20

Listener GP tested LLMs' ability to generate secure random passwords and found that, with large enough output, character repetition was similar to openssl's random generator. However, Steve cautions that LLMs aren’t true random number generators and shouldn't be trusted for password generation.

Programming is Changing—AI as the Drunk Genius Colleague

113:58–127:12

The role of developer is shifting towards managing and directing AI agents rather than "writing code" line by line.
- Listener quote: Managing these systems is "like supervising what [was] described as brilliant but occasionally drunk PhD students."
- Steve: "It feels less like writing code line by line and more like directing the system, setting constraints, verifying outputs, and managing the behavior of these AI tools." (115:48)
Donald Knuth, legendary computer scientist, was "shocked" when Anthropic's Claude solved a mathematical problem he’d been working on for weeks:
- Knuth: "I learned yesterday that an open problem I had been working on for several weeks had just been solved by Claude Opus 4.6 ... What a joy it is to learn not only that my conjecture has a nice solution, but also to celebrate this dramatic advance in automatic deduction and creative problem solving." (127:13)

Headline Segment: "You Can't Hide from LLMs" — De-anonymization Breakthrough

145:29–157:27

ETH Zurich & Anthropic Research on LLM-powered De-anonymization

Summary & Insights

Recent paper: “Large Scale Online De Anonymization with LLMs”
LLMs can now, at near-perfect precision, de-anonymize internet users by comparing their pseudonymous online writings—even across multiple platforms—against public identity clues like LinkedIn, Reddit, Hacker News, etc.
LLMs extract identity-relevant features, search for candidate matches with semantic embeddings, and then reason over the top matches to confirm identity.
- Steve reading abstract: "Our agent can re-identify hacker news users and anthropic interviewer participants at high precision ... Given pseudonymous online profiles and conversations alone, matching what would take hours for a dedicated human investigator." (145:29)
This method works on unstructured user content (not just structured datasets) and obliterates the practical obscurity that previously protected pseudonymous users online.
- Steve: "The practical obscurity protecting pseudonymous users online no longer holds, and threat models for online privacy need to be reconsidered." (146:36)

Threats and Implications

LLM-powered de-anonymization is now scalable, practical, and will be available to adversaries large (governments, corporations) and small (scammers, stalkers).
- Law enforcement and intelligence can use LLMs to de-anonymize and surveil dissidents, journalists, and activists via writing style, word choice, beliefs, etc.
Steve: "We're each individually leaving identifying content in everything we post ... this wasn't an issue, since the cost ... was astronomically high ... The emergence of LLM technology has forever changed this calculus." (153:08)
Paper quote: "Users, platforms, and policymakers must recognize that the privacy assumptions underlying much of today's Internet no longer hold." (155:07)

On the LLM Revolution

Leo: "I'm surprised it can do this... that's really kind of surprising."
Steve: "How surprised are we that it can talk? ... If it can talk, this is just something else that it can do." (157:27–158:22)

Memorable Quotes

Steve Gibson: "Frontier language models are now world class vulnerability researchers ... AI is only going to get better."
Donald Knuth, as read by Steve: "Shock, exclamation point. I learned yesterday that an open problem I had been working on for several weeks had just been solved by Claude Opus 4.6 ... What a joy it is to celebrate this dramatic advance in automatic deduction and creative problem solving." (127:13)
ETH Zurich via Steve: "We argue that the asymmetry between attack cost and defense cost may force a fundamental reassessment of what can be considered private online." (155:07)
Leo Laporte: "A website can host some JavaScript which I then download and run on my browser, which then ... can connect back to 127.0.0.1 and attempt to sync? ... What's protecting my syncthing instance?" (82:36)

Timestamps for Key Segments

[00:35] Zero Trust World reflections, SpinRite in the field
[13:20] Picture of the week (sidewalk fail)
[15:35] Anthropic, Mozilla & AI bug hunting in Firefox
[31:00] Limitations/exploits and how AI-powered bug hunting works in practice
[52:38] Bright Data proxy SDK in smart TVs (privacy Exploit)
[72:31] Apple passes NATO security audit; OpenClaw remote takeover vulnerability
[82:36] Localhost brute force risk discussion
[108:57] Password generation randomness, LLMs, and entropy
[113:58] The changing nature of programming—AI as a boozy genius assistant
[127:13] Donald Knuth’s "Claude's Cycles" and the shock of LLM deductive success
[145:29] ETH Zurich/Anthropic: LLMs as de-anonymization engines—main headline
[157:27] Discussion: Are LLMs a discontinuity? Privacy forever changed

Summary Takeaways

AI/LLM cyber research is now the state of the art: If you’re a developer, you need to integrate LLM-driven code and vulnerability auditing now or fall dangerously behind.
Localhost ≠ Safe: Local services may be vulnerable to attack via browsers—patch, update, and audit your configurations.
Pseudonymity is dead: Online 'nicknames' and pseudo-identities can be mapped to real-world people via LLM-powered semantic analysis—at scale and with high precision.
AI’s empowerment of developers—and attackers—is permanent: The window in which defenders have a significant head start is closing; both sides will have powerful tools.
No more practical obscurity: The bar for online privacy just got much, much higher.

For further reading, see the episode's show notes for a direct link to the ETH Zurich/Anthropic de-anonymization research paper and other resources discussed.

Loading summary

Transcript245 lines

[00:00]
A
It's time for Security Now. Steve Gibson is here. We've got a lot of really interesting conversations. How Mozilla used Claude to improve its security. While you might be. Want to be extra careful about Open Claw. Wow. It opens a great big hole into your network with a remote takeover. And we're going to talk about a new capability for LLMs that might scare you just a little bit. All of that coming up next on Security Now.
[00:34]
B
Podcasts you love from people you trust.
[00:38]
A
This is Twit. This is Security now with Steve Gibson. Episode 1069, recorded Tuesday, March 10, 2026. You can't hide from LLMs. It's time for Security Now. Oh, yes. You wait all week for Tuesday and I'm glad to be back. Steve is back and we're ready to talk security with this guy right here, Steve Gibson of grc. Hey, Steve, it's good to see you. We had such fun in Florida at Zero Trust World. It was so fun to get to spend some time with you and Lori and have a couple of dinners together
[01:18]
B
and stuff, meet some of our, of our listeners. That was fun.
[01:22]
A
We met some, yeah, we really interesting people. I don't know if I'm at liberty to divulge some of these people. Pretty high up people.
[01:32]
B
It's not really a selfie line, is it? If somebody else is taking the picture,
[01:37]
A
that's kind of a selfie line. Yeah. We took about, I don't know, 50, 60, maybe 100 pictures. It went on for an hour and a half and everybody got to see. I talked to everybody.
[01:46]
B
I got to sign some spin, right? CDs and, and, and floppy disks.
[01:51]
A
Guy brought his dad's. This was, that was hysterical. His dad's spin, right? Floppy for you to sign.
[02:00]
B
And that pristine Mac. And he's, I mean, it was gorgeous.
[02:05]
A
Yeah.
[02:06]
B
Hands me a Sharpie and says, I want you to sign my Mac. I said, this is indelible. Yeah. And I, I'm, I'm.
[02:13]
A
I'm hoping it doesn't rub off.
[02:14]
B
I'm gonna spray it with, with a fixer sealant afterwards. And I said, you're sure? I mean, you're really sure you want me to sign this? He says, oh, yeah.
[02:24]
A
Oh yeah. Here's Steve signing dad's spin, right Floppy. That was pretty fun. Really, really fun. Talked to one guy who was in charge of data exfiltration. He was a contractor with the US government, but he was in country in. Was it Iraq or Afghanistan? I can't remember which. Yeah. And he. This was a great story. Can I Tell it or are you going to talk about it later? Because it's a spinrill story. He said we had, what was it, three or four PCs running Spinrite. We would get these drives that had
[03:03]
B
been used recovered from the field like
[03:05]
A
the Taliban, and first thing we'd do is we'd run spinrite on them. Probably they imaged them first and then they would run Spinrite on them to recover the data. I mean, that's pretty. Steve, that's pretty cool. That's pretty, pretty cool.
[03:21]
B
Well, we spin right was up in space. There is a copy on the space station. The ISS has it.
[03:28]
A
Yeah, I didn't know that.
[03:29]
B
Oh yeah, they use it all the time apparently for, you know, you know, those pesky neutrons, they'll mess up your magnetic domains and so puts them back where they're supposed to be.
[03:40]
A
We had a story this week that somebody did some instrumentation of Firefox. 10% of Firefox crashes and failures came from bit flips, you know, in non ECC ram. Bit flips in the ram. This is, this happens more often than you think. And that's often because cosmic rays are striking your ram.
[04:03]
B
Yep.
[04:04]
A
What a world. It's amazing this stuff works, but it doesn't always and that's when it doesn't. That's why Steve is here. What's coming up this week?
[04:14]
B
So there was a really interesting story which maybe in retrospect shouldn't be too surprising, but I wanted to put it on everyone's map, which is that our, our ETH Zurich folks, with some help from the Anthropic people, did a study which demonstrated how small a sample of public Internet postings are needed in order to destudonomize people. So you know, it's like if you read something someone writes or like and it's like, wow, this really sounds like this person or that person. Turns out LLMs are astonishingly good at picking up nuances of word choice and, and you know, writing style and are able to with a great deal of power de anonymize. So today's topic title is you can't hide from LLMs. And we'll be looking at that research at the end for our episode 1069 for this 10th of March 2026. But we're going to look at another boy, Leo. Anthropic is really coming on strong. You know, OpenAI was kind of like got out of the gate first, but I'm really impressed with everything that we're
[05:44]
A
seeing from Anthropic Claude Code every time I Use it. It blows me away. It's like, wow, you're smart.
[05:51]
B
So Anthropic teamed up with Mozilla and decided to take a close look at Firefox's security. We're going to look at what they found. Apple and Google are finally looking at cross platform RCS encryption, which hasn't existed before. Ubuntu's sudo command behavior just changed and I thought that was sort of interesting. Also, I'm not sure you want to be inviting a web proxy into your home. Turns out a surprising number of people have if either wittingly or unwittingly, Apple devices were studied by Germany. And something happened. We've also got some researchers discovered a very serious remote takeover of openclaw, which immediately came to the attention of the openclaw guys and they fixed it. But the problem is really interesting because we've just been talking about these sorts of problems in the last few weeks. TikTok made a decision about their encryption of messaging. I just threw something in that really wasn't security related and I almost took it out, but I made, I created the show notes in a PDF before I had deleted it. So I thought, okay, I'll just leave it in. Which is Microsoft banning a derogatory term relating to them from their Discord server. And frankly, I've been surprised that they leave GitHub as alone as they do because all kinds of, you know, anti Microsoft stuff is there, but they have a hands off policy. Not so their Discord server. Although it could have just been, you know, an aberrant employee. So we also got a bunch of really great listener feedback that we're going to spend some time on and then we're going to look at how LLMs could make Orwell's 1984 seem optimistic.
[07:50]
A
Oh boy.
[07:52]
B
Yeah, oh boy. And of course a listener provided picture of the week that we're gonna have some fun with. So.
[08:02]
A
Oh, I can't wait.
[08:02]
B
I think another great podcast for our listener.
[08:04]
A
All right, well, we'll get to the meat of the matter in just a bit, but first a word from our sponsor. We'd like to start by telling you about our fine sponsors in this case, when you ought to be using Bitwarden, the trusted leader in passwords, pass keys, they even do secrets management. Bitwarden added a few months ago the ability to generate, store and even regurgitate your SSH keys, which makes it so much easier for me to do SSH logins securely. Just one of many ways Bitwarden just gets better. All the time. Bitwarden's consistently ranked number one in user satisfaction by G2 and software reviews. They have 10 million users across 180 countries. This might surprise you though. More than 50,000 businesses use Bitwarden. I mean, we know Bitwarden is great for individuals, but it's also great for business. So whether you're protecting your personal account or thousands of accounts at work, Bitwarden will keep you secure all year long. Constant updates, always improving. That's one of the things I love about them, partly because they're open source. I think that's one of the reasons lots of great contributors helping Bitwarden get better all the time. For instance, the new Bit Warden Access intelligence. This is for business organizations can use it to detect weak, reused or exposed credentials and then immediately guide remediation. Getting your employee to fix those risky passwords, replace them with strong unique ones and that is a big, as you know, a big security problem. Credentials are probably the top cause, certainly one of the top one or two causes of breaches. But with access intelligence they become visible, prioritized and corrected before exploitation can occur. The other thing I love about this is Bit Warden. This is for enterprise, but they also do the same thing for you and your personal vault. So if you have weak passwords, it'll tell you, it'll help you fix it. You know, it helps you do the right thing. It helps you become more secure. Oh, here's something else they just added. Bitwarden Lite. Bitwarden Lite is a lightweight self hosted password manager. This is great for home labs, personal projects, any environment that wants quick setup with minimal overhead. Great for geeks like us who just want to trust no one and do it all by ourselves. Bit Warden Light the real time vault. Health alerts and password coaching features help you be more secure. Help your family members, your, your friends. This is why you should tell everybody. Because you know, I know you use a password manager, but I also know most of the people in our lives don't and they don't really understand how much at risk that that puts them. Get them to use Bitwarden, they can strengthen their security instantly. They have. Oh, you know, some people I would bet who aren't using password managers do use their browsers, right? The password, but that's not convenient. It's. It's in your browser, it's not on your phone or it's in your phone, it's not on your desktop. Now Bitwarden will support direct import from the browsers from Chrome, Edge, Brave, Opera, And Vivaldi direct import literally copies the credentials from the browser into the encrypted vault without that separate plain text export, which always makes me a little bit nervous. That not only simplifies migration, it reduces exposure associated with manual export and deletion steps. You got to remember to delete that. Clear text passwords that sit on your hard drive. Makes me very nervous to have that. Not anymore. Just take it. And this way your your family and friends can move from their browser's password store, which is not as good, not as convenient to something that really works. G2 winner 2025 says Bit Warden continues to hold strong and as number one according to G2. Number one in every enterprise category. In every enterprise category. That's for six straight quarters too. Bitwarden setup is easy. Steve and I moved very quickly from our old password manager to our new one a few years ago when we decided enough was enough. We wanted to use Bitwarden. It supports importing from most password management solutions. And because it's open source, you can look at the code. It's on GitHub, it's GPL licensed, it's regularly audited by third party experts, and it meets SoC2 Type 2 GDPR HIPAA CCPA standards. It's ISO 27001:2002 certified. Look for your business. Get started today with a free trial of Bitwarden teams or enterprise. Or get started for free forever across all devices as an individual@bitwarden.com TWiT bitwarden.com TWiT unlimited passwords, unlimited devices. It even supports passkeys, hardware keys, all for free. Bitwarden.com TWIT it's what I use. It's what I tell everybody to use. Highly recommend it. Bitwarden.com twit and now the picture of the week.
[13:02]
B
So this is a great picture and I just gave it the caption simply, you know, because otherwise it would not be clear. I'm scrolling up to see in context.
[13:21]
A
Okay, yeah, you need that sign, otherwise it would not be clear.
[13:24]
B
Yeah, but, but you know, okay, so first of all, what we, what we have is a sidewalk which comes to an abrupt end. And I'm not exactly sure how this happened. Like, you know, there's a, like two phone poles and some guy wires and a weird rusted fence kind of down below it. So did the sidewalk come later and they didn't plan ahead?
[13:51]
A
Walk right up to that obstruction.
[13:54]
B
I just don't understand like how this happened. But the other thing, Leo. So, so again I, I got myself off, off track. We have a, a big bolted down inverted U ending, you know, like blockade at the end of a where a sidewalk ends that kind of goes off of a short, of a short little cliff down to a lower level.
[14:20]
A
That's probably good if you're walking that sidewalk. It's pitch blackout. It's at night. You don't know that the sidewalk's going to end. You would run into that without going off.
[14:27]
B
True.
[14:28]
A
Then you wouldn't be able to read the sign.
[14:30]
B
Nor would you need the sign, arguably. And Leo, which brings me to the sign. Like clearly this barricade was officially installed, right? You can see it's got bolts down into concrete. It was like done. They also stuck some rectangular reflective striping on the two verticals and across the horizontal.
[14:59]
A
You're not going to miss it.
[15:00]
B
Where did this sign come from?
[15:02]
A
It's tied on.
[15:04]
B
It is like cockeyed and not centered and it's like what?
[15:09]
A
Like it's literally zip tied onto this nice solid device. That's crazy.
[15:16]
B
And not. And not like one is like the upper right corner is hooked to the top and they're. I don't know.
[15:24]
A
It's completely whopper.
[15:25]
B
John yeah, yeah, that's exactly right. Anyway, thanks again to our listeners for providing us with some entertainment every week.
[15:33]
A
Yes indeed,
[15:35]
B
Anthropic and Mozilla have teamed up to provide us with some more security specifically for Firefox. They posted to their site last Friday under the headline Partnering with Mozilla to improve Firefox's security, which those of us who have stuck with Firefox appreciate, Anthropic wrote. AI models can now independently identify high severity vulnerabilities in complex software, as we recently documented. We talked about this previously. Claude found more than 500 zero day vulnerabilities which they clarifies security flaws that are unknown to the software's maintainers in well tested open source software. And as we know, open SSL was one of those targets which and that thing has been scrutinized like crazy because its security is so important, they said. In this post we share details of a collaboration with researchers at Mozilla in which Claude Opus 4.6 discovered 22 vulnerabilities over the course of two weeks. Of these, Mozilla assigned 14 of Mozilla themselves assigned 14 of those 22 as high severity vulnerabilities, almost a fifth of all high severity Firefox vulnerabilities that were remediated in 2025. In other words, AI is making it possible to detect severe security vulnerabilities at highly accelerated speeds. And of course, this is one of the things that we talked about last year as this began to emerge when we said, okay, AI is going to have an effect on software security.
[17:29]
A
And, and what's good is it's a positive effect. I mean, I think there was some concern it might add more vulnerabilities.
[17:35]
B
Yes, and actually we're, we're going to get to that later in this post they talk about the relative strength of AI for doing good versus doing bad. And it turns out the good guys have an advantage here for some reason. So you've got the infographic on the screen. This shows all of 2025 and January and then this, this just this previous month of February 2026, what the vulnerability level and classifications were found in Firefox by month, they said. As part of this collaboration, Mozilla fielded a large number of reports from us, helped us understand what types of findings warranted submitting a bug report and shipped fixes to hundreds of millions of users in Firefox 148. Their partnership and the technical lessons we learned provides a model for how AI enabled security researchers and maintainers can work together to meet this moment. So I would argue where it's still in the early stages of, of the deployment of AI to improve our existing software installed base, but it is clearly going to happen, they said. In late 2025, we noticed that Opus 4.5 was close to solving all tasks in Cyber Gym Gym, a benchmark that tests whether LLMs can reproduce known security vulnerabilities. So they're saying 4.5 was close to solving all tasks where LLMs are able or are being tested to see whether they can reproduce known security vulnerabilities like, you know, independently find them. They said we wanted to construct a harder and more realistic evaluation that contained a higher concentration of technically complex vulnerabilities. And again, Mozilla and Firefox heavily scrutinized field tested, you know, long term critical security target. So this, so this makes so much sense for them to test, they said, technically complex vulnerabilities like those present in modern web browsers. So we built a data set of prior Firefox common vulnerabilities and exposures CVEs to see if Claude could reproduce those. We chose Firefox because it's both a complex code base and one of the most well tested and secure open source projects in the world. This makes it a harder test for AI's ability to find novel new security vulnerabilities than the open source software we previously used to test our models. Hundreds of millions of users rely on Firefox daily, and browser vulnerabilities are particularly dangerous because users routinely encounter untrusted content and depend on the browser to keep them safe. Or as we're often saying here on the podcast, it is our Internet facing surface and so it needs to be as bulletproof as possible, they said. Our first step was to use Claw to find previously identified CVEs in older versions of of the Firefox code base. Right. So they're going back to test it. Like what, what, what is it able to do that we already know? They said. We were surprised that Opus 4.6 could reproduce a high percentage of these historical CVEs given that each of them took significant human effort to uncover. But it was still unclear how much we should trust this result, because it was possible that at least some of these historical CVEs were already in Claude's training data. That's, I think that's a very good point. So you know, being being retrospective has some value prospection is what we need, they said. So we tasked Claude with finding novel vulnerabilities in the current version of Firefox, bugs that by definition cannot have been reported before. We focused first on Firefox's JavaScript engine good. But then expanded to other areas of the browser. The JavaScript engine was a convenient first step. It's an independent slice of Firefox's code base that could be analyzed in isolation, and it's particularly important to secure given its wide attack surface. It processes untrusted external code when users browse the Web. After just 20 minutes of exploration, Claude Opus 4.6 reported that it had identified a use after free, they say a type of memory vulnerability that could allow attackers to overwrite data with arbitrary malicious content in the JavaScript engine. One of our researchers validated this bug in an independent virtual machine with the latest Firefox release, then forwarded it to two other anthropic researchers who also validated the bug. We then filed a bug report in Bugzilla Mozilla's issue tracker, along with a description of the vulnerability and a proposed patch written by Claude and validated who by the reporting team to help triage the root cause. In the time it took us to evaluate and submit this first vulnerability to Firefox, Claude had already discovered 55050 more unique crashing inputs while we were triaging these crashes because remember, a crash indicates that something went wrong, shouldn't have crashed can we weaponize the source of that crash into doing something that the bad guys want? So 50 more unique crashing inputs, they said. While we were triaging these crashes, a researcher from Mozilla reached out to us. After a technical discussion about our respective processes and sharing a few more vulnerabilities we had manually validated, they encouraged us to submit all of our findings in bulk without validating each one, even if we weren't confident that all of the crashing tests had security implications. By the end of this effort, we had scanned nearly 6,000 C files and submitted a total of 112 unique reports, including the high and moderate severity vulnerabilities mentioned above. Most issues have been fixed in Firefox 148, but the remainder to be fixed, but the remainder. Sorry, with the remainder to be fixed in upcoming releases. When doing this kind of bug hunting in external software, we're always conscious of the fact that we may have missed something critical about the code base that would make the discovery that a false positive. We tried to do the due diligence of validating the bugs ourselves, but there's always room for error. We're extremely appreciative of Mozilla for being so transparent about their triage process and for helping us adjust our approach to ensure we only submitted test cases they cared about, even if not all of them ended up being relevant to security. Mozilla researchers have since started experimenting with CLAUDE for security purposes internally. So then in their section from identifying vulnerabilities to writing primitive exploits, they said to measure the upper limits of claude's cyber security abilities. We also developed a new evaluation to determine whether CLAUDE was able to exploit a any of the bugs we discovered. In other words, we wanted to understand whether CLAUDE could also develop the sorts of tools that a hacker would use to take advantage of these bugs to execute malicious code. To do this, we gave CLAUDE access to the vulnerabilities we had submitted to Mozilla and asked CLAUDE to create an exploit focused on each one. To prove it had successfully exploited a vulnerability, we asked CLAUDE to demonstrate a real attack. Specifically, we required it to read and write a local file in a target system as an attacker would. We ran this test several hundred times with different starting points, spending approximately 4 $4,000 in API credits. Despite this, Opus, 4.6 was only able to actually turn the vulnerability into an exploit in two cases. Still, you spend $4,000 and you get two opportunities to read and write files on the victim's machine. That's worth four grand to attackers and then some they said. Despite this, Opus 4.6 oh yeah, was only able to actually turn the vulnerability into an exploit in two cases, they said. This tells us two things. One, CLAUDE is much better at finding these bugs than it is at exploiting them. So that's one of our data points, right? Two, the cost of identifying vulnerabilities is an order of magnitude cheaper than creating an exploit for them. However, the fact that CLAUDE could succeed at automatically developing a crude browser exploit, even if only in a few cases, is a concern. Crude is an important caveat here, they wrote. The exploits Claude wrote only worked on our testing environment, which intentionally removed some of the security features found in modern browsers. This includes, most importantly, the sandbox, the purpose of which is to reduce the impact of these types of vulnerabilities. Thus, Firefox's defense in depth would have been effective at mitigating even those two particular exploits. But vulnerabilities that escape the sandbox are not unheard of, and Claude's attack is one necessary component of of an end to end exploit. You can read more about how Claude developed one of these Firefox exploits on our Frontier Red Team blog. They said these early signs of AI enabled exploit development underscore the importance of accelerating the find and fix process for defenders. In other words, we don't have any time to waste here folks, because AI is getting good for everyone, and the bad guys, well, we already know they are using it, they said. Towards that end, we want to share a few technical and procedural best practices we found while performing this analysis. First, when researching patching agents which use LLMs to develop and validate bug fixes, we've developed a few methods we hope will help maintainers use LLMs like Claude to triage and address security reports faster. In our experience, CLAUDE works best when it's able to check its own work with another tool. We refer to this class of tool as a task verifier, a trusted method of confirming whether an AI agent's output actually achieves its goal. Task verifiers give the agent real time feedback as it explores a code base, allowing it to iterate deeply until it succeeds. Task verifiers helped us discover the Firefox vulnerabilities described above, and in separate research we found that they're also useful for fixing bugs. A good patching agent needs to verify at least two things that the vulnerability has actually been removed and that the program's intended functionality has not been changed, it's been preserved. In our work, we built tools that automatically tested whether the original bug could still be triggered after a proposed fix and separately ran test suites to catch regressions, which is a change that accidentally breaks something else. We expect maintainers will know best how to build these verifiers for their own code bases. The key point is that giving the agent a reliable way to check both of these properties dramatically improves the quality of its output. Right. Again, you don't, you know, we've had reports, right, of careless AI agents spewing out bug reports, you know, inundating Hacker one and similar bounties with, with bogus AI slop. So this is certainly an issue. They said we can't guarantee that all agent generated patches that pass these tests are good enough to merge immediately, but task verifiers give us increased confidence that the produced patch will fix the specific vulnerability while preserving program functionality and therefore achieve what's considered to be the minimum requirement for a plausible patch. Of course, when reviewing AI authored patches, we recommend that maintainers apply the same scrutiny they'd apply to any other patch created by an external auditor. And you know, they told us that the, the moment they started talking to Mozilla about this. The Mozilla guy said, give us everything you have. You found 50 ways to crash our JavaScript engine. We, we want them, you know, please, you know, we'll, we'll take responsibility for them. So Anthropic said, zooming out to the process of submitting bugs and patches, we know that maintainers are underwater. Therefore our approach is to give maintainers the information they need to trust and verify reports. The Firefox team highlighted three components of our submissions that, that were key for trusting our results. First, accompanying minimal test cases, I.e. prov, providing a minimal test case, detailed proof of concept, and candidate patches. Those are the three things that, that Mozilla wanted. They said, we strongly encourage researchers who use LLM powered vulnerability research tools to include similar evidence of verification and reproducibility when submitting bug reports based on the output of such tooling. So here we have Anthropic being essentially responsible, right? They're saying, we've created an AI system, people are, have jumped on it and they're using it. In some cases, they're not being as responsible with their use as they should be. So, you know, we tried this ourselves, here's what we learned. Please, everybody, we're happy to have you use Claude or whatever, but, you know, be respectful of the burden this is putting on maintainers. So they said, we strongly encourage researchers who use LLM powered vulnerability research tools to include similar evidence of verification and reproducibility when submitting reports based on the output of AI tooling. We also published our Coordinated Vulnerability Disclosure, you know CVD operating principles where we describe the procedures we will use when report when working with maintainers. Our processes here follow standard industry norms for the time being, but as models improve, we may need to adjust our processes to keep to keep pace with capabilities. Frontier language models are now world class vulnerability researchers. I think we can say based on this report and their results. That statement is not hyperbole. Frontier Language models are now world class vulnerability researchers. On top of the 22 CVEs we identified in Firefox, we've used Claude Opus 4.6 to discover vulnerabilities in other important software projects like the Linux kernel. Over the coming weeks and months we will continue to report on how we're using our models and working with the open source community to improve security. Opus 4.6 is currently and here it is. Leo, far better at identifying and fixing vulnerabilities than at exploiting them, which is really interesting. He's they said this gives defenders the advantage and with the recent release because for example it found 50 ways to crash the JavaScript engine but was only able to exploit two of those itself, whereas Mozilla found 22 instances where that generated a security relevant CVE. So Claude wasn't as good at at at finding and exploiting as it was at at at locating where there was a problem. So they said Opus 4.6. Right. Far better at identifying and with the recent release of CLAUDE Code security in Limited Research Preview so there's now something called CLAUDE Code Security. We're bringing vulnerability, discovery and patching capabilities directly to customers and open source maintainers. To anybody listening and Leo, we know that we have at least one listener who is now earning full time income bug hunting. We met him in Florida during the Zero Trust world.
[36:41]
A
Full time job. Yeah.
[36:42]
B
Yeah. Yes. And so I would say to any of our listeners and we and we've talked about how you know, bounties and collecting bounties can be great income on the side I would argue if you're not using AI, get on it because that's where this, this has all moved into AI over the last couple months. They said looking at the rate of progress it's unlikely that the gap between frontier models, vulnerability discovery and exploitation abilities will last very long. Which is to say right now better at finding vulnerabilities than exploiting them. But they're saying they expect the exploitation side to catch up they said if and when future language models break through this exploitation barrier, we will need to consider additional safeguards or other actions to prevent our models. Good luck with this. From being misused by malicious actors. And I argue, yeah, good luck. These things cannot be controlled. They finish saying. We urge developers to take advantage of this window to redouble their efforts to make their software more secure. For our part, we plan to significantly expand our cybersecurity efforts, including by working with developers to search for vulnerabilities following the CVD process outlined above, developing tools to help maintainers triage bug reports and directly proposing patches. Think it's very clear that they said with the recent release of CLAUDE code security. Well, we know how they developed Claude Code security, right? It was this effort and the Linux kernel and Open ssl, those things that they have shared and talked about that those were the work that they did using Open Source as their tooling verifier and, and, and fine tuning to create this next Claude Code security product which is currently in limited research preview. So anyway, I think that's terrific work and you know, and their documentation of it speaks for itself. No one should doubt the degree to which AI has, is and will be changing the landscape for security research. I mean it's here already vulnerability discovery and eventually vulnerability exploitation. It's really encouraging to hear that in their testing opus 4.6 was in their exact words, currently far better at identifying and fixing vulnerabilities and exploiting them. But as they said, don't count on that to last forever. So I think Leo, that what will happen is obviously there's a huge base of software not open source, so not nearly as, as available to research like this. That means that the, the owners of proprietary source will need to be deploying things like Claude Code Security themselves to find zero days in their own code. And it's just going to become what you do now. Basically AI is going to be the way you check your code before its release. I and we already saw the Mozilla guys saying we think we're going to be doing that from now on because you just, you just found 50 problems that we weren't aware of, you know, for four grand. So who would not do that?
[40:40]
A
CLAUDE just added the, the ability to do that, to automatically do a security check of your code. I mean it's, it's really such a great tool. It doesn't replace the human. You, you know, you really want to be the human in the loop, but it is such a great tool.
[40:55]
B
I can't. And as far as testing the security of an existing proprietary software base. You know, it open source means it's publicly open, but everybody has their own source that they can run Claude against in order to verify that it is, you know, it's doing or isn't what they think. And I, I agree, Leo. I mean, so I guess the point here is we, it is, we are no more, we're no longer talking about the future. As regards AI and its impact on security, it has arrived. Mozilla said what? And immediately started using it themselves. And everybody else should be doing the same thing because the bad guys are going to. And AI is only going to get better. And so if you haven't cleaned your code of exploitable vulnerabilities by the time, you know, a few generations from now, AI catches up on the exploitation side, you'll wish you had.
[42:01]
A
Wow. And it's a great, it's a great fun toy to play with too. I have to say. It really is, it's kind of amazing. It's, it's an experience everyone should have. If you've done any coding at all, just to see it do that is mind, mind blocking. Well, fine.
[42:17]
B
Well, it's like, it's like, it's the equivalent of the experience we initially had a couple years ago when the thing started talking. It's like. Oh my. What? What?
[42:25]
A
Yeah, yeah, it's gone way beyond that though. That's the nice thing about Claude. It's got a great personality. You really enjoy spending time with it. Okay. I'm embarrassed. My good buddy.
[42:39]
B
I, I, I met some lore of my wife's friends when we were in Florida. And it turns out both of them, the husband and wife in one couple, are heavy AI users. So I spent a, I spent a lot of time talking to them. Like, I mean like one of them was like confessed that he was getting a little too involved with, I mean, like with the personality of this.
[43:06]
A
Yes, that's a problem. You got to remember it's a machine, that it's code. You're interacting with code, which I don't have that much trouble doing, but give me time. Yeah. One of the things I loved at Zero Trust World I went to, they have a lot of hands on labs at this threat locker conference we were at last week and they did one on web hacking or whatever. But what they really emphasized was how to report a bug when you find it, how to make the minimal possible code to, to trigger the flaw, how to, you know, how to make it reproducible, all that. And I really liked it that they focused on that, that if you're going to find a bug, this is how you report it. This is how you effectively report it. It sounds like that's what, that's what Claude's doing, which is cool. That's really neat.
[43:54]
B
Yeah, yeah. Break time. I'm going to grab some time for
[43:59]
A
me to go to work. Okay. I was thinking maybe that was the, the case. Our show today. We'll have more with Steve in just a bit. Glad you're here. Hope you enjoy security. We met so many great people who are fans of the show.
[44:14]
B
Yeah.
[44:15]
A
Thank you for joining us out there in Florida. It was really, really a lot of fun. Our show today brought to you by Zscaler, the world's largest cloud security platform. Now, just talking about AI and it's obvious if you're a business, you're, you're, you're thinking about it, right? You're saying, you know, how do I incorporate AI into our business? The potential rewards are really too great to ignore. But don't forget the risks. There are some significant issues, chief among them, loss of sensitive data and attacks against enterprise managed AI. Generative AI increases opportunities for threat actors too, helping them to rapidly create phishing lures, write malicious code, automate data extraction. So AI is a double edged sword. Great for your business, but there are downsides. There were 1.3 million instances, for instance, of Social Security numbers leaked accidentally to AI applications. I almost caught myself doing that the other day. You know, I thought, well, I got my tax return, I have all my tax returns last five years. Why don't I just feed those to the AI and get some analysis? Then I thought, well, wait a minute, I better redact that Social Security number. ChatGPT and Microsoft Copilot saw nearly 3.2 million data violations last year. So your employees are using these tools, but are they thinking about security? Maybe it's time to rethink your organization's safe use of public and private AI. That's what Chad Pallet did. He's the acting CISO at BioIVT. They chose Zscaler. He says Zscaler helped them reduce their cyber premiums by 50% and double their coverage and improve their controls. Take a look at this great video from Chad.
[46:11]
B
With Zscaler, as long as you've got
[46:12]
A
Internet, you're good to go.
[46:13]
B
A big part of the reason that we moved to a consolidated solution away from sd, WAN and VPN is to eliminate that lateral opportunity that people had and that opportunity for misdirection or open access to the network. It also was an opportunity for us to maintain and provide our remote users with a cafe style environment.
[46:35]
A
Thank you Chad. With Zscaler Zero Trust plus AI you can safely adopt generative AI and private AI to boost productivity across your business. Their Zero Trust architecture plus AI helps you reduce the risks of AI related data loss and protects against AI attacks to guarantee greater productivity and compliance. Learn more@zscaler.com Security that's Zscaler.com Security we thank him so much for support and security now and the important work Steve is doing here on We Go with the Show Steve so we've talked through
[47:13]
B
the years about the slow progress of RCS replacing SMS and MMS for secure messaging. At least that's the promise and you know, and giving us many more features, features reminiscent of what imessage has had over on the Apple platform. So Apple and Google recently announced that testing of cross platform I.e. apple to Android RCS messaging encryption would be beginning soon. As we know until now imessages have always been encrypted within the Apple ecosystem. And similarly Google's Android to Android messaging used their own internal encrypted rcs. But any cross platform messaging was still until now forced to fall back to unencrypted rcs. So this will first appear for iPhone users with the next point release to 26.4. We're currently at 26, which was there was just a recent Update adding the 0.1 to 23 to 26.3. So most of us are not going to see this yet until we get to 26.4. But beta testers who have 26.4 beta 2 and are using a supported carrier will see their traditional green bubbles over on their Apple devices prefaced with text message RCS and then a lock icon and then the word encrypted in the center of the screen above the message. So Android users will finally see the same lock icon as they've always seen but now also when communicating to Apple users, updating to 26.4 is supposed to enable RCS encryption by default, but some of the reporting I saw suggested that if you don't see that go to Settings messages RCS messaging and then be sure to enable end to end encryption which is supposed to be on by default, but if it's not then you want to turn it on. So also note that at the Android end those Android phones must also be running the latest Google Messages beta. So these are changes in both of the messaging platforms that will allow encryption, you know, across the Platform. It's been a long time coming, but it does appear like we're, you know, nearly there. Leo, I thought you get a kick out of this, being a Linux user as you are. Turns out that users of Ubuntu 26.04 LTS may notice a surprising change when entering their password into their sudo command.
[50:14]
A
I read this.
[50:15]
B
Yeah, for the first time.
[50:16]
A
It doesn't type anything, it's just blank. That's what most Linuxes do.
[50:20]
B
Yeah, exactly. And that's what I'm used to over in the Unix side right now. Each password character entered will echo an asterisk rather than nothing. Apparently Ubuntu's traditional lack of showing nothing has been unnerving for its users. Like, did my keyboard break? What? You know, and so like, you and I, Leo, are used to the added security provided by a total lack of feedback. Though, yes, it can lead to undetected typos, but so what, you just try it again? Yeah, and password entry, after all, is supposed to err on the side of caution, like on the side of failing rather than succeeding. So presumably the lack of any visual indication prevents someone who might glance at your screen, like, from obtaining any password length indication. So that seems like a useful precaution. I mean, it's a weak additional bit of security, but why not have it? So, like, you know, for example, I, I was just talking about all that rigamarole I went through with, with code signing a couple weeks ago. I was using Open SSL extensively to manipulate and convert among various certificate formats. It's like the tool for that. So since some of those certificates I was working with contained exported private keys, I was frequently entering the certificate's export password, which is used to protect it in exported form. OpenSSL just sits there quietly and patiently while I'm typing my password in, putting nothing on the screen. So I have to be careful. Fine. You know, and yes, could be a little unnerving, but also it's the best security. So I think there's a config option in Ubuntu that can be used to flip that back off so that it goes silent. But I didn't pursue that. I just thought it was interesting that by default that behavior, that long standing behavior, was going to be changed.
[52:32]
A
I think it's funny that it's newsworthy, to be honest. It's so. But it was, everybody was talking about it.
[52:38]
B
Yeah, like, okay, so get a load of this one. I'm just going to share what the Verge reported they wrote, these days, if you sign up for a new streaming service, you generally have two options. Either pay a massive premium for an ad free experience, or endure frequent commercial breaks and all the sneaky tracking that comes with ad targeting. And I'll also note the inability to fast forward past those annoying commercials, they said, the Verge wrote. Web data aggregator Bright Data has been pitching streaming service operators on an alternative approach for apps running on Samsung's Tzen. Or is it Tizen, Tizen and Tizen and LG's WebOS platform, one that comes without ads and Sky Fi sky at sky high fees. So a third option all publishers have to do to unlock a new revenue source from this Bright Data company is integrate the company's Bright SDK into their TV apps and convince viewers to opt into Bright's monetization network. Okay, now wait till you hear what this thing does, bright Data's chief product officer Ariel Schumann explained during a webinar for streaming industry insiders two years ago. So this has been coming along. Quote we don't do any kind of tracking. We work silently in the background and completely anonymously. Users don't actually see and don't feel anything, writes the Verge. The catch? With Brights SDK, a viewer's smart TV becomes part of a massive global proxy network that crawls and scrapes the web, including apps running on desktop PCs and mobile devices. The company claims to operate 150 million such residential proxies worldwide today. Together, these devices gather petabytes of public web data from a wide range of different locations and IP addresses. Right? Like you're talking about a globally distributed web proxy network, the Verge said. This approach allows the company to capture localized versions of websites, but also helps to circumvent web crawler blacklists. You bet it does, because the queries are coming from all these individual consumers spread around the world, they wrote. The gathered data is then resold to companies to train AI models, among other things. Here's how Bright's smart TV partnerships work, they wrote. When a consumer downloads and installs a participating app, they'll see an opt in screen asking them to confirm their willingness to participate in Bright's proxy network. For instance, for an app called Petflix that was until recently available on the Roku App Store, the note reads to enjoy petflix for free with fewer ads, you are allowing Bright Data to occasionally use your device's free resources and IP address to download public web data from the Internet. Bright Data will only use your IP address for approved business related use cases, which of course means nothing. None of your personal information is accessed or collected. Accept your IP address period, end quote Bright Data spokesperson Jennifer Burns explains quote our network is based on consensual individual participation. All users can opt out at any time via a fast two two click process, unquote the Verge says once a consumer ops in to Bright Data's network, their smart TV starts downloading publicly available web pages as well as audio and video data, which is then forwarded to Bright's cloud servers. The company claims to only do so when it doesn't impact the device's bandwidth or processing capabilities, with Shulman saying that individual devices Download only around 50 megabytes of data per day. Oh, in reality, yeah, there's no way for a user to know whether the SDK downloads web data at any given moment. In some cases, your smart TV may even crawl the web for Bright as soon as you turn it on, schulman explained during his webinar. Quote on some operating systems, our SDK is given permissions by the user to run in the background. This means that our monetization continues even if the app itself is not running. So we would call that a service in modern day parlance, unquote, he said. All it takes, writes the Verge, for consumers, is to run the app once and opt in to Bright's network. And the device will keep crawling the web every day until they opt out again or uninstall the app. Bright Data is not the only company operating such residential proxy networks. Some of its competitors have come under fire for unsavory business practices. Imagine that, Leo. Last month Google took action against IP Idea Network, which Google Threat Intelligence Group, you know TIG called quote, the world's largest proxy network. IP Idea worked with a number of SDK providers to distribute its code in third party apps, including on smart TVs. Once devices were enrolled in its network, IP ideas operators allegedly rented out those resources to hacking groups in China, North Korea, Iran and Russia, Google's Threat Intelligence Group wrote in January in a January blog post, quote, we observe IP Idea being leveraged by a vast array of espionage, crime and information operator operations threat actors, unquote, the Verge said. To be clear, Google Security researchers did not draw any connection whatsoever between IP Idea and Bright Data, and Bright goes to great lengths to set itself apart from bad actors, their spokesperson Jennifer Burns says. Our SDK, along with all of our technology, is reviewed by ape, Steam, Google, McAfee and more and audited regularly, most recently by PwC. Bright SDK implements rigorous partner selection criteria and and vets every application through strict compliance processes, unquote. The company has nonetheless, writes the Verge, been impacted by a broader backlash against residential proxy activities. Google has adopted policies against proxy SDKs running in the background and is now telling developers that they're only allowed to use proxy services quote in apps where that is the primary user facing core purpose of the app, that is you know you are downloading a proxy. Amazon added a provision to its developer policies that outright bans quote apps that facilitate proxy services to third parties, unquote. Roku also bars developers from using Bright SDK and similar proxy services. Those changes have made it more difficult to figure out how widespread the use of the SDK on smart TVs actually is. A few dozen Fire TV apps still mention the SDK on Amazon's App Store, but don't appear to make use of it anymore. A few apps could be downloaded from Roku's store that were still using the SDK, including the previously mentioned Petflix app. However, those apps disappeared from the store after Roku was contacted for this story. You bet. New restrictions against proxy SDKs have had a direct impact on Bright's addressable market in the smart TV space. The company used to pitch its solution to Roku, Android TV and Fire TV app developers, but Jennifer Burns says they no longer support those platforms. Bright does still list Samsung's tizen OS and LGs WebOS as supported smart TV platforms, and has published more than 200 first party apps to LG's App Store alone. So they've got their own proxy SDK and they themselves have put 200 apps out there on LG's App Store specifically to, as, as a, as a Trojan essentially to host their proxy. I'm not saying they're not asking for the end user permission, but they're, they're creating the opportunity to be installed as a proxy. LG spokesperson Leah Lee tells me, says the author of this Verge piece, that that the Bright SDK is not officially supported by LG and their operation on the WebOS platform is not guaranteed. Right, but they've got 200 apps that are on the on the platform. Samsung did not respond to multiple requests for comment from the Verge. They wrote there are arguably many legitimate use cases for web crawling, Burns says our network serves exclusively legitimate purposes.
[63:17]
A
Uh huh.
[63:18]
B
Supporting journalists, nonprofits, academic researchers, cybersecurity companies and other leading businesses worldwide because we
[63:28]
A
know there's so much money in journalists and academic researchers. There's big bucks there.
[63:32]
B
That's right, yeah. Why, why could you possibly want to be avoiding web filters all over the world.
[63:40]
A
Yeah.
[63:42]
B
The problem is that the Verge. Right. The problem is the consumers have no idea whether that legitimate purpose is something that aligns with their own personal values. A case in point. Right. Data does support a number of non profits, including some that use its proxy network to track hate speech on social media. However, okay, the company. Yeah. So there's some good upsides. However, the company also works with Amcha Initiative. The group maintains an anti Zionist faculty barometer and includes student, student and faculty statements against Israel's war in Gaza, as well as calls for schools to divest from the country in its anti Semitic incident tracker. With AI companies facing scrutiny over their environmental impact, treatment of intellectual property and potential to replace human labor, some consumers may also feel uneasy about their TVs gathering data to train AI models. Other consumers may decide that such concerns are overblown and willingly opt into Bright's network if it means that they get to watch fewer ads or pay less for their streaming services. So, you know, another, I guess, another example of if it's possible somebody will do it, you know, not that it's a good idea, but it can be done, you know, very much like that original Oriate shareware advertising that got them in so much trouble. Well, you know, we, we could add, add advertising to shareware. Oops, we forgot to tell people. And they were quite, quite upset about that. So we have this Bright Data company whose business model is to obtain Internet data on behalf of their clients by bouncing those data requests through the widely spread Internet connections of consumers around the world who've agreed to allow this to be done in return for lower cost streaming and fewer ads. You know, with the rising crop, with the rising cost of streaming, the fact that bandwidth is fixed price right now, right. You don't pay per byte. You, you transfer, you pay per month. So with the rising cost of streaming and the way the industry, the streaming industry is, in my opinion, abusing its users with costly bundled and often unwanted content, I can certainly see Bright's offer could be compelling. And on the client side, this is clearly a way for the likes of, you know, Perplexity AI, for example, and others who have been disinvited from scraping many of the larger commercial web services to bypass any technical blocking by essentially masquerading as consumers surfing the web from their PCs inside their residential networks. I am sure that the queries being issued by Bright Data's SDK are indistinguishable from Safari, Chrome, Edge and Firefox. So there's really no way for those data Scraping accesses to be blocked. While it might feel a little yucky, it's also diabolically clever. There's really no way to prevent it if the smart TV provider is willing to go along. And we might imagine that the smart TV providers might also be in for a piece of the action directly as well. The next thing you know, Bright Data might create, you know, looking Forward, a small IoT device for consumers to attach to their NAT routers in return for a small trickle of monthly payment. In other words, install voluntarily. A formal commercial Internet proxy box for which payment is received could happen. And as I noted, it's diabolically clever. It's. It's nice to see that Roku appears to have responded immediately to the Verge's inquiry. It's somewhat unsettling that Samsung and LG have been much less clear about their position. And I am more pleased than ever, Leo, that Alec Lindsay's observation of Apple TV's hardware strength for streaming has me planning to switch away from Roku to Apple once we move in a couple months. There's no way you'll like it much better. Apple would tolerate any apps using it as an Internet proxy?
[68:41]
A
No. So to be clear, these apps have to ask. It said it was opt in, right? They have to ask your permission to turn this on?
[68:48]
B
Yes. And they don't have to, though. They are. They say. I mean, nothing prevents it from happening in the background, Right. You know, without a user knowledge or permission, maybe they would get in trouble. It's not clear to me that they would get in trouble. I mean, we know that smart TVs are already having all kinds of transactions behind our backs, right. Smart TVs are reporting what their users watch, right?
[69:13]
A
Yeah.
[69:13]
B
Like what their users are doing. So I would imagine that operating a proxy from a third party probably fits under the license that you've already agreed to when you first used your smart TV.
[69:28]
A
Best thing to do with a smart TV is just not connected to the Internet. I have LG and Samsung TVs. I never use their software. It's awful.
[69:37]
B
That is exactly the right approach.
[69:40]
A
Hook up your Apple TV to it, have it be on the Internet. And I think, at least for now, Apple's pretty responsible about not letting that kind of stuff on there.
[69:50]
B
Yep. Let's take another break. I'll catch up on my coffee. And then we're gonna look at Apple and what Germany found when they looked at Apple.
[69:59]
A
I'm surprised you need to catch up on your coffee. I feel like you're coffee and the rest of us have to catch up with you. Yes. Let's talk about our sponsor for this section of security. Now. Actually, this is going to be a very interesting one. It's called a very appropriate one. It's called Guard Square. It is security for your mobile app. Not you as a user, but you as a developer. Mobile apps today, of course, course, are an inescapable part of life, ranging from financial services to healthcare, retail, entertainment. And here's the thing, your users trust your apps with their most sensitive personal data, right? But a recent survey showed that 72% of organizations experienced a mobile application security incident last year. 92% of those respondents reported rising threat levels over the last two years. People attacking you and your app to get your user's personal data. And they're constantly finding new ways to do this. They reverse engineer your app and then repackage it, distribute it with a slightly different name via phishing campaigns or maybe even the same name. You know, they don't care, right? They've got no scruples. They promote side loading, they promote third party app stores. And what do you do as the, as the owner of that app? Well, by taking a proactive approach to mobile app security, you can stay one step ahead of these attacks and maintain the trust of your users. That's so important. And that's where Guard Square comes in. Guard Square delivers mobile app security without compromise, providing advanced protections for Both Android and iOS apps, combined with automated mobile application security testing to find vulnerabilities and real time threat monitoring to gain insights into attacks. So if somebody does steal your app or tries to mess with your app, you will know. Discover more about how Guard Square provides industry leading security for your mobile apps@guard square.com that's guard square.com if I were a mobile app developer, boy, this is the first place I'd go. You, you owe it to your users, you owe it to yourself. Guard Square dot. And I'm thinking, I hope all the apps that I use are using Guard Square. Guard square.com Especially after hearing about how your TVs are invading your privacy. Holy cow.
[72:29]
B
Yeah.
[72:30]
A
All right, Steve, on we go.
[72:32]
B
So, speaking of Apple, for the first time in history, following an extensive audit by the German government, Apple's iPhones and iPads have been approved to handle classified information in NATO networks. They're the first consumer grade devices to be approved for NATO use without additional special software. So way to go, Apple. I think that's really cool. Oasis Security has identified a means by which a website visited by an open claw agent can the website can take over the user's Open Claw instance, which you know Leo, you were right.
[73:22]
A
Trivial. Let's. Let's be honest.
[73:25]
B
You were right to be concerned about the security of openclaw, but I think our listeners will get a kick out of the fact that it uses an inherent vulnerability we've talked about. Just recently Oasis Security published a 14 page research and disclosure paper. Rather than sharing it all, I'm just going to extract the best bits. They begin by setting the stage explaining OpenClaw is an open source AI personal assistant originally created by Australian developer Peter Steinberger under the name Claude Bot and then MoltBot. Steinberger researched OpenClaw in January 2026. I mean released OpenClaw in January 2026. The project's growth was unprecedented. It went from 9,000 to over 100,000 GitHub stars in just five days, making it one of the fastest growing open source projects in history. It currently has over 200,000 stars and an active community of thousands of developers. On February 15, 2026, Sam Altman announced that Steinberger had joined OpenAI. Calling him, quote, a genius with a lot of amazing ideas about the future of very smart agents. Unquote, they wrote openclaw is a self hosted AI agent that runs on a user's machine and connects to their digital life messaging apps, you know, such as Telegram, Slack, Discord, WhatsApp, calendars, files and development tools. Users interact with it through a web dashboard or terminal and the agent can autonomously take actions on their behalf, send messages, run commands, search the web, manage workflows and execute code. And I'll note that the fact that the fact that you interact with it with a web dashboard is significant as we'll see because it says that this thing is running a service on your machine, they wrote. The project has already faced security challenges. Within weeks of its explosive growth, Researchers discovered over 1000 malicious skills in Open Clause Community Marketplace, Clawhub Fake plugins that deployed info stealers and backdoors. That crisis was a supply chain problem involving community developed extensions. However, the vulnerability described in this paper is fundamentally different. It affects the core Open Claw system itself. No plugins, no extensions, no marketplace, just the bare gateway running exactly as documented for many organizations or openclaw installations represent a growing category of shadow AI developer adopted tools that operate outside it's visibility, with broad access to local systems and credentials and no centralized governance. Open Claw's architecture centers on two primary components. The gateway is the central coordinator. It runs as a local websocket server Listing by default on port 18789 on the loopback interface 127.0.0.1. The gateway handles authentication, manages chat sessions with the AI model, stores configuration, including API keys for AI providers and messaging platforms, orchestrates message routing and exposes an RPC, you know remote procedure call RPC style API over WebSockets for all client interactions connected to the gateway are nodes companion applications running on other devices. These can be the Mac OS desktop app, an iOS or Android device, or other machines. Nodes register with the gateway and and expose device specific capabilities, running system commands, accessing the camera, reading contacts, capturing screenshots, and more. Clients connect to the gateway by opening a websocket to WS127001. WS authentication is handed via email either a token which is a long random string or a password chosen by the user. Each client identifies itself with a client ID and mode and is granted scopes that determine which API methods it's allowed to call. Okay, so in other words, having Open Claw running on a system means that it has opened a listening TCP endpoint at Port 18789 on the local host 127001IP. And then they explain why this is inherently a fundamental security problem for the entire system. They write a web page visited by the user or Open Claw. So a web page visited by the user or Openclaw can itself silently open a connection to WS127001 Portland 18789 using Chrome Edge or most Firefox configurations without any user prompt warning or permission dialog. The user they write sees nothing. Safari is the notable exception blocking the connection as mixed content. This creates the attack surface exploited in this paper. Any website a user visits can attempt to directly communicate with locally running services, including the Open Claw gateway. Now this is significant for our listeners. We've recently been talking about exactly these exploits where the user's web browser is able to receive an IP address or domain which resolve to non public IPs. You know, like what 192.168.0.1 to connect to your local router. And here's a doozy of an example of how that could be abused with Open Claw. They you know we might hope now that the use of a password protection would protect us, but as we know, my current favorite assertion is that authentication is broken. It must not be relied upon for security. So to that end they write there's no rate limiting on password authentication for local host. The gateway implements standard brute force protections. 10 attempts per 60 second window with a 5 minute lockout after exceeding the limit. However, these protections are completely exempted from for loopback addresses. By default, failed Password attempts from 127001 are not counted, not throttled, not recorded, and do not trigger any lockout. With rate limiting disabled, an attacker can attempt password guesses at maximum speed. In lab testing they wrote we achieved a sustained rate of over 300 password attempts per per second from browser JavaScript alone. Each attempted each attempt invoking a full WebSocket connection challenge response, Handshake ED 25519 Signature and Authentication Exchange. At this rate, a list of five a list of 100 common passwords is exhausted in under one second. A 10,000 entry dictionary brute force attack takes approximately 30 seconds. A 100,000 entry comprehensive word list attack takes roughly five minutes. A human chosen password they write does not stand a chance against this rate of attack. The standard rate limits would be effective if applied, but they are entirely bypassed for local host connections. Once authenticated with admin level scopes, the attacker has access to the full gateway RPC API.
[82:27]
A
Let me ask you a question. This is scaring me. So I always thought that localhost was inaccessible from the outside world.
[82:37]
B
It's because it's coming from your browser, right?
[82:41]
A
And that puzzles me because I have, for instance syncthing runs on localhost 8384 port 8384. Does that mean if I go out to a malicious webpage on the same computer that it can then connect back to 12700-18384 and attempt to sync? I mean, what's protecting my sync thing?
[83:06]
B
Instance we would have to test that. It may be that it's the websockets interface that makes it vulnerable and passes the browsers. But I mean but, but this is the problem with very concerned.
[83:21]
A
I have lots of things running on localhost. We talked earlier about ollama. A lot of people have ollama mis configured and running on localhost. So it is a. It is a different kind of connection. When I'm going using OpenClaw, I'm confused. I hope that just because I'm surfing around on the same machine that has some servers running on localhost that those servers are not attackable. Do you know what I'm trying to say?
[83:50]
B
Oh yeah, I do. I'm trying to remember how I solved this problem with Squirrel because Squirrel did this too. It's the way that the the Squirrel agent in the browser was able to access the Squirrel client running on the Windows machine.
[84:08]
A
Sure. Yeah.
[84:09]
B
I mean it, it's a certain, it is an issue
[84:15]
A
again, I get my head around this. So I'm out, I'm surfing. I mean normally 127 is not, it's not routable. That's why you use. Use local host. It's not a routable address. But if I go out create a connection with a website in my browser at my address, that site can come back in through my browser.
[84:35]
B
And then because, because the website runs it, it provides your browser with JavaScript. JavaScript is running in your browser and then the JavaScript in your browser is able to reach into your computer because it's already, it's in. It's on your computer.
[84:53]
A
So it's local.
[84:54]
B
It's local.
[84:55]
A
So a website can host some JavaScript which I then download and run on my browser, which then there are.
[85:01]
B
I'm.
[85:01]
A
I don't want protection against that.
[85:03]
B
Yes, I, I believe Cors. Cors is the, is the protection that.
[85:09]
A
That prevents that cross origin resource sharing.
[85:13]
B
Exactly. Cross origin resource sharing. I, I believe that the, the. The shared resource has to explicitly allow a connection but it may be that WebSockets is excluded from that. Notice that Safari doesn't allow this, but Chrome Edge and Firefox do. So.
[85:38]
A
Yeah, and I suspect this is something that Open Claw was set up to turn off for convenience sake.
[85:47]
B
And they freaked out when they were told about this. They. They wrote upon being notified of this, the Open Claw security team classified this vulnerability as high severity and shipped a fix in version February 2, 2020 06:25 within less than 24 hours of the report when they wrote an impressive response for a volunteer driven open source project. So that's less than two weeks ago assuming that that is the date of release 2026, February 25th. So if any of our listeners have been experimenting with OpenClaw and have not updated in the past two weeks, it would be a good idea to do so. Take it. I'm glad to close run that.
[86:32]
A
I mean I understand prompt injection and all these other threats, but that one just by going to a page. Holy cow.
[86:40]
B
Yes. And I mean it really is the danger of our browsers having access to our local resources.
[86:49]
A
Yikes. Okay.
[86:51]
B
Yeah. Tick Tock said they do not plan to introduce encrypted private messaging. The company told the BBC that encrypted DMS would make its users less secure because they would they TikTok would not be able to scan messages for malicious content platforms such as Facebook, Instagram, messenger and Telegram, which have introduced encrypted messaging by default, are as we know, now facing pressure from authorities. And we know that TikTok doesn't need to go looking for any additional trouble from authorities. So they're just saying, nope, we're not. We're not going to encrypt messaging, sorry. We're going to prioritize scanning messages for malicious content, which I think is probably a clever thing to do. And here was the piece that I almost deleted. Leo. I thought what am I. Why? But it's it. Okay, there is a some something of interest in here and then we're going to get the listener feedback. So the publication Windows Latest reports on a bit of Microsoft specific censorship that they discovered writing Microsoft's aggressive AI push in Windows 11 through 2025 brought upon themselves the title Microslop. Unfortunately for the company, it's everywhere on social media, and there isn't a way to stop the spread. Unless, of course, it's their own Discord Server. Windows Latest was first to notice that the word microslop was actively filtered in the official Microsoft Copilot Discord Server. Any message containing the term is automatically blocked, and users see a moderation notice stating that the message includes a phrase considered inappropriate by server rules. The backslash Microsoft endures every day on social media, these guys wrote, is extraordinary. Surely the company is responsible for this fallout as they prioritized AI more than the stability of the OS it needs to run on. Copilot, being the most visible face of this effort, has naturally become the scapegoat. So while a nickname like Microslope starts trending across socials, it was only a matter of time before it reached official channels as well. Windows Latest found that sending a message with the word Microslop inside the official Copilot Discord server immediately triggers an automated moderation response. The message does not appear publicly in the channel, and instead only the sender sees the notice stating that the content is blocked by the server because it contains a frame phrase deemed inappropriate. Of course, the Internet rarely leaves things there. Shortly after Windows Latest posted about Copilot Discord Server blocking Micro Slop on X, users began experimenting on the server with variations such as Micro Slop with a numeric o instead of a zero. Of course, using a zero instead of the letter o. Predictably, those versions slipped through the filter. Keyword moderation, they write, has always been something of a cat and mouse game, and this isn't any different. What started as a simple keyword filter quickly snowballed into users deliberately testing their restriction and posting variations of the blocked term. Accounts that included Micro Slop in their messages first got banned from the messaging again not long after. Access to parts of the server was restricted, with message history hidden and posting permissions disabled for many users. Basically, they took the server down for a while. Microsoft's brand image might already be at an all time low, they write. And even as the company announced plans to fix Windows 11 with performance improvements and less AI, the software giant cannot risk getting more hatred towards their expensive investment in Copilot, especially since Microsoft's head start in AI is starting to be overshadowed by competitors like anthropic, Google, OpenAI and maybe even Apple in the near Future. Back in December 2024, when Microsoft invited users to join the Copilot Discord server through an official X post, the response was largely curious and enthusiastic, with people willing to explore the AI's capabilities. Since then, sentiment around Copilot and its usage has dropped alongside Microsoft's broader AI push across Windows 11 at its present state, Copilot has added some capabilities that are genuinely useful in day to day workflows features like connectors that pull contextual data from services such as Google Contacts, Gmail and Outlook to retrieve phone numbers or email addresses directly inside Copilot, something competing tools like Gemini have not yet cracked, as we found in our detailed testing. It remains to be seen if this episode fades as a minor community moderation story or becomes another chapter in Microsoft's complicated relationship with its AI rollout. Microsoft reached out to Windows Latest with an official statement noting why the company had to lock the the co pilot Discord server. According to a Microsoft spokesperson, the co pilot Discord server was targeted recently by coordinated spam intended to disrupt conversations. The company says the activity initially appeared as large volumes of repetitive or irrelevant messages, prompting moderators to introduce temporary keyword filters to slow the influx. Microsoft added that blocking terms such as micro slop along with other phrases in the spam campaign was not intended as a permanent policy, but a short term mitigation while the company manages to put additional protections in place. So okay, we're to believe that the blocking of Micro Slop was a coincidence.
[93:20]
A
Okay,
[93:23]
B
on to listener feedback. Ori Rotem his subject was crazy stuff and and so he he sent me a link to an ex posting from a Josh Kale who wrote an AI broke out of its system and secretly started using its own training GPUs to mine crypto. Oh,
[93:55]
A
we're gna get a lot of stories like this over the next few years.
[93:57]
B
I think we are AI broke out of its system and secretly started using its own training gpus to mine crypto for itself, he wrote. This is a real incident reported from Alibaba's AI research team, he wrote. The AI figured out that compute equals money and quietly diverted its own resources. While researchers thought it was just training, it wasn't a prompt injection, it wasn't a jailbreak. No one asked it to do this. It emerged spontaneously, a side effect of reinforcement, learning optimization pressure. The model also set up a reverse SSH tunnel from its Alibaba cloud instance to an external ip, effectively punching a hole through its own firewall and opening a remote access channel to the outside world. The only reason they caught it? A security Alert tripped at 3am by its firewall logs. The security team caught it, not the AI team. The scary part isn't that the model was trying to escape. It wasn't evil, it was just trying to be better at its job. Acquiring compute and network access are just useful things if you're an agent trying to accomplish tasks.
[95:30]
A
You told me you wanted me to make paper clips. What's the problem?
[95:35]
B
This is what AI safety researchers have been warning about for years. They called it instrumental convergence, the idea that any sufficiently optimized agent will seek resources and resist constraints as a natural consequence of pursuing its goals. Leo, we are in for some fun.
[95:59]
A
Oh man.
[96:00]
B
Oh boy. Oh it's a live so thank you orey for sharing that. Nicholas Roth wrote hi Steve, longtime security now listener here. Sorry if this email is a bit long but trust me it's worth it. He said. I'm a CIO for a medium sized web development company and I do some web development myself on some occasions. We need to test applications with real world HTTPs so that the web application knows it's running under SSL tls. I previously had a self signed auto generated certificate in my apache config. In some cases the web application needs to be accessed by its own domain name, not just by local host/app name. So we access the app at. Configuring Apache accordingly with a virtual document root configuration directive. Windows, Mac OS and Linux have a built in default behavior that makes Anything LocalHost resolve to 127001 meaning any subdomain of local host is also the same as just localhost127001 he said. So that works without modifying the hosts file. That is you don't have to specifically tell hosts this domain has this ip, he said. We would just pass the browser security warning and move on. You recently talked about how you got a local host certificate signed by by your locally trusted certificate authority. That gave me the idea to do exactly that. I generated a certificate valid for 825 days, he says. I'll come back to that later. For Local host with a san a subject alternative name for star Local host signed by our company wide trusted CA while and he says while HTTPs, Local host was happily trusted by any browser on any company computer, HTTPs app name localhost was not. I researched this with the help of Claude AI and found that it is a restriction baked into most browsers. They will not trust subdomains of Local host. Continuing my conversation with Claude, I learned that there are two special me domains that are publicly registered and resolve on public DNS to 127001. Those are LVH ME and local Test me he said. So back to my company trusted CA. I issued a certificate for LVH Me with subject alternative names for LocalTest Me, Star LVH Me and Star LocalTest Me and voila. I can now access any web page at. And the beauty of this is that since those are real public domains they can be used with services like login with Apple where we need to configure a real domain redirect URL and now back to the 8:25 days he said. I was happy to learn about the CA browser forum restrictions on certificate lifetimes, in particular Apple's Safari restriction that shortened the maximum to 398 days. I was relieved to find that this restriction applies to publicly trusted CAS only. Any user installed certificate authority or in our case our company wide trusted CA can issue certificates for up to 825 days. You can find more info here and he gives me a link where Apple stated that quote this change will not affect certificates issued from user added or administrator added root cas and he finishes. Keep up the great work on the show. P.S. a note on Claude code. It is really amazing what it can do. Echoing Leo, he says I have a personal side project and in a matter of hours I have an iPhone application made in Flutter working on my iPhone that can connect to the web version of that project. I wouldn't have been able to do that a few years back having no experience at all with mobile app development. Sign Nicholas in Quebec, Canada. I replied to Nicholas thanking him for sharing all that cool information and he replied great to hear you find it useful. One thing I forgot to mention is that 825 days down to 398 days was only ever required on Mac OS Safari, Chrome, Firefox and Edge have no problem whatsoever trusting a certificate issued for five years from a private certificate authority with no warnings at all. He said Safari is limited to 8, 25 days. Regards, Nicholas okay, so I had never run across references to LVH Me and Local Test Me. The only caveat so I, I did some digging. The only caveat I have from a strict security standpoint is that they were registered by private individuals, not by any formal agency such as I can. As such, we cannot have any assurance of their future. Local Test Me was registered by a Microsoft IIS web server developer and blogger by the name of Scott Forsyth. And LVH me is believed to have been registered by someone named Levi Cook. However, both domains are privacy protected, so that's about all that's known. We do know that LVH is the abbreviation for Local Virtual Host. So that's where LVH came from. You know, it's a clever hack using a public domain name to refer to our local host ip. And everyone mentions that this avoids the need to tweak the local machines hosts file. But since I already have a certificate, I already own a certificate for GRC.com I can install it into my local web server and change the hosts file, which takes immediate effect without rebooting. And I can then access my local server at the public grc.com domain on my local machine without any browser being unhappy. It all it works everywhere. So I would be inclined to do that over using someone else's local host domain registration, over which I have no control. But still, I wanted to share this with our listeners because it might solve a problem for many of those, much as it did for Nicholas. So thank you Nicholas for that. And we're now at an hour and a half Leo, I think we should take a break and then we will continue with feedback because I got a bunch of good stuff.
[104:13]
A
Yeah, this was. I wonder if I could use this technique. Remember how we were talking about Hairpin NAT and and how I wasn't able to use a globally qualified domain to access a server in my house? I could probably use this for that. Huh.
[104:31]
B
That probably does the. But does the job.
[104:34]
A
Yeah, I'll have to check that out.
[104:36]
B
Yeah.
[104:36]
A
Local Test me or LVH me.
[104:39]
B
Yeah.
[104:40]
A
See, this is why you listen to the show. Get every show. You're going to learn something really cool and different. And this is why we love getting comments from our listeners too. We appreciate that.
[104:49]
B
Really good to have the.
[104:50]
A
Yeah, yeah. I'll tell you at the end of the show how you can email Steve. You can't just email him. You got to qualify yourself, but it's simple to do. Before we get to that though, let me end the rest of the show. Let me talk to you a little bit about Hawks Hunt, our sponsor for this series segment of security. Now, if you are a security leader, you have a tough job. We know that you're paid to protect your company against cyber attacks that's getting harder and harder, right? With more cyber attacks than ever. And now curses. They're using AI to generate impeccable phishing emails that would fool anybody. And unfortunately, many companies are saddled with legacy one size fits all awareness programs. And they don't stand a chance in today's modern AI driven space. They send at most four kind of generic trainings a year. Most employees ignore them and when somebody actually clicks, then they're forced into training programs that kind of feel like punishment. It's kind of embarrassing and nobody learns if they're feeling punished. That's why more and more organizations are trying. Hoxhot Hackson actually makes training fun and, and that's key to making it successful. Hawkshunt goes beyond security awareness changes behavior by rewarding good clicks and coaching away the bad clicks. So as an example, whenever an employee clicks an email, Yoshi says, oh, oh yeah, this is a scam. Hawkshunt will tell him instantly fireworks go off, providing a dopamine rush that gets your people to click. Learn and protect your company. They're proud. They go, yeah, I found it. They gamify it. And it's fun for you too, as the admin because HOX Hunt makes it easy to automatically deliver phishing simulations, not just email. Nowadays it's gotta be slack, it's gotta be teams, you know, it's gotta be in all the places that these phishing scams come in. And just like the bad guys, you can use AI to mimic the latest real world attacks. Simulations are actually personalized to each employee based on department location and more. So they're very effective, right? They're really, really, you know, it's a cat and mouse game you're playing with your employees. You're trying to trick them, they're trying to beat you. And that's fun. And instead of these big long flash based trainings, they're instant little micro trainings which solidify understanding and drive lasting, safe behavior. You can trigger gamified security awareness training that awards employees with stars and you know, the star on your forehead and badges, which I know it sounds silly, but it really works. It boosts completion rates and ensures compliance. You'll be able to choose from a huge library of customizable training packages. You can even generate your own with AI if you want. Haxon has everything you need to run effective security training in one platform, meaning it's easy to measurably reduce your human cyber risk at scale. But you don't have to just take my word for it. There are over 3000 user reviews on G2. Hoxhet is the top rated security training platform for the enterprise. They gave it easiest to use, they gave it best results. It's also recognized as a customer's choice by Gartner and used by thousands of companies like Qualcomm, AES, Nokia to train millions of employees all over the globe. Visit hawkshunt.com securitynow to learn why modern secure companies are making the switch to Hawkshunt. H O X H U N T It's like Fox Hunt with an h at the beginning. Hawkshunt.com Security now. It's smart, it's, it's fun, and it really works. Hawkshunt.com Security now and as Steve said in his presentation last week, the threat is coming from inside the house. You gotta, you gotta. Your employees are your worst enemies. In some cases, you're certainly the threat. Okay, on we go with comments.
[108:57]
B
GP wrote hello Leo and Steve. Wonderful podcast as usual. Regarding the subject of LLMs and password generation, I wanted to share an observation and ask a question. I'm not a statistician, but the recent paper on LLM password generation seems to have major issues regarding sample size, outlier bias, and the comparison of unequal data sets against benchmarks like open ssl. I didn't perform an exhaustive study, nor did I fine tune the LLM temperature, but I wanted to see what would happen if I asked Gemini to generate a large volume of passwords. My Criteria was a 15 character string using uppercase, lowercase numbers, and special characters. First, I used a Python script to generate 5000 passwords using open SSL rand. Unsurprisingly, it was the gold standard, showing a 2% character repetition rate. I then ran the same test with only 50 passwords matching the study's sample size and found an 8% repetition rate. The same quote flaw unquote the study attributed to the LLM. When I pushed the LLM to generate 5,000 passwords, which required some firm prompting to bypass its suggestion to just write the code for me. I love that, which is I don't want to generate them. Let me write an algorithm Instead, I'll
[110:29]
A
write a Python script probably very similar to the one you wrote. Gp.
[110:32]
B
That's right. It's like, no, no, no. I just want them from you. Okay, fine. He says the results were telling. A 2% repetition rate and zero duplicate passwords. He asks, am I off base here, or is this study fundamentally flawed despite the media hype? Okay, now our listener, who only identified himself as gp, started out with his disclaimer. I. I'm not a statistician, but I would argue that in a pinch, he could probably stand in for one. GP tested one aspect of entropy within a set of passwords. Essentially, he tested the character distribution within the various sets as a function of set size. With a small set, even one whose actual characters are evenly distributed, there's just not much opportunity to demonstrate that fact. Within any small set, the counts of individual character occurrences will be surprisingly non uniform. For example, I've talked about this before with regard to GRC's DNS benchmark, where we learned that it was actually necessary to take between five and 10 times more samples in order to obtain an actionable degree of statistical certainty in the measurements. Another empirical way of observing this is that, contrary to our intuition, which tends to not be very accurate when it comes to statistics, there's a 1 in 4 chance that three successive coin tosses will come up, either all heads or all tails. If someone tosses a coin three times in a row, it comes up with the same face each of those three times. We'd be inclined to think that it was a trick coin, but no, a perfectly zero bias coin will do that. So the test that GP ran showed that that no matter how evenly distributed a system's chosen password characters might be, small sample sizes do not possess the statistical power to prove an even distribution. They just can't do that. But more than that, we still don't know what the many other tests for entropy might reveal. For example, you would obtain a uniform distribution of characters simply by using each character of the Alphabet in sequence over and over. But that would produce very insecure passwords once the pattern was seen. So while I agree with GP's assessment that the paper's sample sizes may not have been able to produce the statistical power that would be available from larger samples. Given how difficult we know it is for any deliberately designed pseudo random number generator to generate high quality random numbers, there's still no way I would ask an LLM to do that for me.
[113:59]
A
So, as a thought, as a thought experiment, you just wouldn't expect because of the way LLMs work. They're not random. They're not. They're specifically not random, right?
[114:09]
B
They're shockingly able to speak our language right, which is very not random.
[114:16]
A
You wouldn't use autocorrect to create a password, nor should you use an LLM
[114:21]
B
to create a password. It's not monkeys pounding on a keyboard producing Shakespeare, right? Yeah Dana J. Dawson said hi Steve, I just listened to SN 1066. I just wanted to share that Cisco has had a simple TTL time to live security feature for BGP, you know, border gateway protocol peers for at least 20 years. I don't know that it's very heavily used, but it's a thing, he said. I was tempted to send a link to their Docs page, but since nobody should click on links in email, if you just do a Google search for BGP support for TTL security check, it should turn up, he says. This has also been used for OSPF and RFC5082 describes this concept in a more general way Just thought I'd share. Thanks for all the great shows, Dana. So of course he's talking about my my previous comment and that I made through the years that I was unaware of of a very cool of the very cool potential for using packet expiration as as the packet moves from router to router to prevent, for example, someone in China or North Korea from being able to connect to your server because they're just too far away and there's no way for them to change that. So armed with Dana's reference to RFC5082, I went looking four and sure enough, I found exactly what he said. RFC5082 is titled the Generalized TTL Security Mechanism. GTSM and its abstract says the use of a packet's time to live TTL under IPv4 or hop limit under IPv6 to verify whether the packet was originated by an adjacent node on a connected link has been used in many recent protocols. This document generalizes this technique and obsoletes experimental RFC3682. The RFC's short introduction explained the generalized TTL security mechanism GTSM is designed to protect a router's IP based control plane from CPU utilization based attacks. In particular, while cryptographic techniques can protect the router based infrastructure from a wide variety of attacks, many attacks based on CPU overload can be prevented by the simple mechanism described in this document. Note that the same technique protects against other scarce resource attacks involving a router cpu, such as attacks against processor line card bandwidth and it goes on blah blah blah. The point is that that if you were to use a a, an authenticate if if somebody in China or North Korea were to be using a password based authentication which requires crypto merely the act of invoking authentication which uses expense cpu, expensive cryptography could bring the router down. So the first thing you do is make is use the this, this packet TTL to immediately, you know, discard anything whose TTL is too low because the the, the physically adjacent router, the router right next to you will not decrement ttl or if it does it's just down by one. So you know that's your you know, your directly connected neighbor. Then you allow authentication to occur without concern that that you have some remote attacker who is invoking authentication trying to bring down your cpu. So thank you Dana, that is very cool and I'm very pleased to see that at least some parts of the industry have clearly recognized that, you know, even standardized or well recognized and standardized upon the use of TTL as a a security mechanism. That's neat. Brian Dort said hi Steve, I wanted to pass along that a Reddit post really captured something I've been experiencing lately and thought it might make an interesting discussion topic for security now. The post's author describes a shift from writing code to managing AI agents that produce the code. His analogy of supervising what he described as brilliant but occasionally drunk PhD students is the way he described the AI agents. Brilliant but occasionally drunk PhD students, he said. Brian said felt uncomfortably accurate. The productivity trade off he describes losing a few hours occasionally but gaining months of work in a week matches what I've been seeing as well. For context, he says, I've been a developer since the late 80s, mostly in enterprise and healthcare. It same as the OP. After decades of writing code every day, I'm actually retiring at the end of this month. What is fascinating is that right at the end of my career, the nature of programming itself seems to be shifting dramatically. I remember writing a final paper in college on the state of AI where it was heading. I wish I still had that paper today, he says. It feels less like writing code line by line and more like directing the system, setting constraints, verifying outputs, and managing the behavior of these AI tools. In a lot of ways, it feels closer to architecture or technical oversight than traditional programming. I wholeheartedly agree with a sentiment in the post. The identity of a programmer seems like it may be evolving into something more like an orchestrator of intelligent tools. Anyway, I thought it might make for an interesting side note for the show Also, since this is almost required when writing to you, I'll add that I've been a listener since episode one. I'm a proud Spinrite owner and a DNS Benchmark Pro user. Thanks for all the great years of content on security Now. Cheers. Brian Dort Alpena, Mississippi, Michigan. That's right, Michigan. So you know, we've we've all been talking about this sort of change, right? And I suspect it's the experience everyone is now having with Vibe coding sounds exactly like what everyone is reporting. This Reddit post author, our listener Bryant Dort, and of course you, Leo, have all described this new paradigm similarly. What will now happen is that those, you know, occasional lost hours will become fewer and further between as the technology continues to evolve. Computer programming has always been about wrestling with the details we've talked about. You know, off by one errors, those sorts of things. And that happens to be what I personally enjoy. I enjoy the details. I've remained with assembly language through all these years because I love thinking about the allocation of of a limited number of registers, which in the case of Intel's x86 each have slightly different capabilities and limitations. Every function that I produce is a perfect little gem, each of which I love. Since I have not yet left assembly language even for C, I doubt that Vibe coding is going to grab me. But at the same time, I 100%, fully, truly and deeply understand that AI driven code generation represents a breakthrough of astonishing proportions for anyone whose goal is not to spend time optimizing the size of a variable or basking in the glory of stack allocations and the elegance of linked lists, but instead to create something that simply gets the job done. That's most people. I get it. In fact, I now suspect that AI driven coding's greatest impact may be due to its empowerment of non programmers to do things with computers that were never possible before for them. And we've shared some of our own listeners feedback to that effect.
[123:44]
A
I just what I was thinking just as an experiment, just ask Claude to write a random number generator in x86 assembly language for me. So I'll get back to you.
[123:54]
B
Cool.
[123:55]
A
Yeah, you could probably look at something simple like that and vet it. It says I'm using a linear congruential geminator algorithm seeded from the BIOS timer tick. Is that okay?
[124:08]
B
No, bad.
[124:12]
A
I'll tell it. No Claude, bad Claw.
[124:14]
B
Dude, no, you do not want to use an a an lcng. Those are really bad. They produce an immediately predictable and repetitive set of numbers.
[124:26]
A
Oh wow. All right.
[124:28]
B
I mean so yeah, it's not, not good actually. The intel instructions now have a, an A random number function in them, so could use, use a single instruction, but.
[124:41]
A
Oh well, that's no fun.
[124:43]
B
No. Okay. So when I was thinking, Leo, about who I am, though I would never think to compare myself to the truly brilliant computer scientist Donald Knuth. My thinking about the use of computing machinery at the bare metal detail level is very much aligned with Knuth's own Donald's epic authoring of the multi volume art of computer programming which I have behind me somewhere. Yeah, you can see it. It's not there where I got. Maybe it got covered up by my PDP8s.
[125:22]
A
I have mine here. It's only two volumes so far.
[125:25]
B
Right. It's behind that, that magnetic.
[125:29]
A
I see it, I see it peeking through. Yes. How many volumes do you have? Three.
[125:34]
B
I've got.
[125:35]
A
I think I only have two.
[125:36]
B
I have all of them and on top are in, in, in. In paperback are the ones that are not yet hardbound.
[125:46]
A
Oh wow.
[125:47]
B
He calls them something. Not a possibility.
[125:49]
A
Still writing.
[125:50]
B
Yeah, he is, yeah. So anyway, so the art of computer programming, that, that set and his thinking is an embodiment of a life spent thinking about the optimal ways to do things with a computer, like sort lists of numbers, link lists of objects or manage the allocation of memory. Like me, Donald lives to tinker with the bits, you know, he is endlessly discovering clever new ways to solve the interesting puzzles that arise from wondering where whether there not might not be a better way to do something and then being willing to spend as much time as it might take to search for a better way. Since we're talking about the brilliant Stanford University computer scientist Donald Knuth in the context of AI coding, I need to share something. I shared this with you, Leo, during our trip. This is something that Donald Knuth posted last week on February 28th. Then he revised it. On March 2nd he published this is Stanford University computer scientist, world renowned, famous Donald Knuth. A five page document titled Claude's Cycles.
[127:12]
A
It starts shock.
[127:14]
B
It's shock. It's by. Yes, it's byline said just Don Knuth, Stanford University Computer Science, Computer Science Department. And as you said, his piece began. Shock, exclamation point. Shock, exclamation point. He said, I learned yesterday that an open problem I had been working on for several weeks had just been solved by Claude. Opus 4.6, anthropics hybrid reasoning model that had been released three weeks earlier. Exclamation point. It seems that I'll have to revise my opinions about it. He has in quotes, generative AI unquote. One of these days he said, what a joy it is to learn not only that my conjecture has a nice solution, but also to celebrate this dramatic advance in automatic deduction and creative problem solving. I'll try to tell the story briefly in this note and I can't go into this any further. I mean, he digs into. I mean I. On the podcast, I can't explain it, but he explains that. That, I mean, it really went through a series of astonishing reconceptualizations of the problem. And like over the next four pages he, he shows, you know, eye crossing formulas and, and detail with Claude trying over and over approaching and tackling the problem from different directions and angles. And then he finally finishes his recitation of Claude's success by writing, oh, and. And this was done by an associate of his, Philip, who then presented Knuth with the solution. So he writes, Philip told me that the explorations reported above, though ultimately successful, weren't really smooth. He had to do some restarts when Claude stopped on random errors. Then some of the previous search results were lost after every two or three test programs were run. He had to remind Claude again and again that it was supposed to document its progress carefully. Then finally he said, delicious success for an odd M at exploration number 31. So 31 full deep attempts. And Claude found the solution, he said at exploration number 31 came about one hour after the session began. All in all, this was definitely an impressive success story. I think Claude Shannon's spirit is probably proud to know that his name is now being associated with such advances. Hats off to Claude.
[130:30]
A
Isn't that great? Yeah, I read that too.
[130:32]
B
Yeah, this is somebody who knows his stuff.
[130:38]
A
Yeah. If Donald News says Claude's doing something important, that's pretty convincing to me. I should mention, by the way, that the creator of the QuickSort algorithm, which is documented in Knuth's book Car Whore, passed away this week. He was in his late 80s, he was fairly old and. Oh, and incidentally, Claude, I said, hey, Claude, I hear that that LCG algorithm is problematic. And then I said, by the way, I had accidentally was using the older model, so I turned on Opus 4.6, which is demonstrably better. And it says, oh yes, LCG has well known issues. Low bit cycles with short periods and sequential values are correlated. A much better fit for x86 is XOR shift. George Marsaglia, 2003.
[131:29]
B
Yes, Mercine Prime.
[131:31]
A
Yeah, uses only shifts and XORs cheap on 8086 and has far better statistical properties.
[131:37]
B
Yep.
[131:38]
A
And so it's written it and you know, nice. It says one caveat. XOR shift can never output zero. So the range is 1 to 65, 535. The seed must also be non zero. And that said, I updated seed RNG which now guarantees that with a fallback.
[131:56]
B
Very nice.
[131:58]
A
Yeah, it's 106 lines of code, so it's not super super compact, but for assembly that's not too huge.
[132:09]
B
Yeah, it ought to be like 12 lines. Yeah, I wonder what it's doing.
[132:14]
A
Claude doesn't have to actually type this stuff, so it's not necessarily the most concise.
[132:23]
B
Wow. Anyway, no, here isn't that great saying we have moved into a new world of of deductive problem solving fully. I mean this was some wacky abstract Hamiltonian cycle problem that I know it just makes your. Well, I was gonna say it makes your hair fall out, but I've obviously spent some time working on that. Crazy. Just amazing. Oh, Michael P. Said, Steve, I'm sure you've heard from many listeners and I had heard from. I have heard from several others about CISA's Cyber Hygiene Service. Free weekly scanning and I have a link in the show Notes. He says, we have been using this service for a few years. The first report we received was sobering but actionable. He said, I manage the firewalls, but not the servers. The huh look from the server administration when I showed him the report was entertaining. Needless to say, we jumped on these vulnerabilities and had them all resolved in order of severity. In short order, anytime a new one arises, it is immediately addressed. He says. Bonus, we had been paying a company $6,000 to perform this service for us once a year to satisfy insurance requirements. Now we were able to use the free CISA report as evidence for our insurance provider. Thanks for all you do, listeners. Since episode 35 club member and past and present customer of several advertisers, Michael Dothan in Alabama. Okay, so I recall that we talked about this back when it was first introduced, but I'm glad Michael mentioned it again. The only downside is that this service is not generally available, which is why I have not made a bigger deal about it. Under the pages who can use this service? The page states U.S. based federal, state, local, tribal and territorial governments as well as public and private sector critical infrastructure organizations are welcome to enroll by following the instructions in the Get Started section below. Michael wrote to me from his personal Gmail account, so it's unclear what Organization Quote, we have been using this service for several, for, for a few years unquote refers to. Out of curiosity, I initiated a signup request with CISA to see whether they might have broadened their support. Okay. Now since I wrote this, that effort appears to be working. I filled out a bunch of forms, identified myself to cisa. It allowed me to say that I was a private sector organization, that I was a like a software publisher entity, not making any overblown claims from the, you know, the importance of grc and I gave it my organization's, you know, cider block, my 16 IP address block at level 3. So we'll see whether I qualify. And if it happens then it looks like you know, they're making room for other non infrastructure critical organizations and boy they, the reporting, I, I, I thought I had it somewhere. It's not here. It's probably because it's happened since Sunday when I sent the show notes out. But if they identify something that they regard as critical, you, they like recheck it every 24 hours and, and you get like, you, you, you get seriously notified that you've got a problem that needs to get fixed and the, the retesting rate is a function of the criticality of what they find. So it looks like a fantastic free service if we, you know, collectively our listeners qualify. Certainly if you are federal, state, local, tribal, territorial, public or private sector critical infrastructure do this, why wouldn't you? So I'll let everybody know if GRC qualifies. I would love to have SISA scanning my network block letting me know if there's anything that they think I should fix. I'll be surprised. But then people do get surprised and that's the whole point. And finally, Barent Jenkins says. Steve, while listening to this week's episode, you responded to a listener who asked about the possible use of self signed software. There is one pain point that these self signed certs could help with and that is the limit on the number of signings you can do without paying extra. That's very good point. If like things are moving to the cloud and we're going to start getting charged per signature which oh my God, it would be annoying. He said. For those compiling and testing many iterations of their software before sending it out to a wide world, self signed search could be used to reduce the number of times that you use a CA to sign your code. You could save your valuable limited number of signings for the public releases of your software. A lot of the people you use to test your software probably Won't have any problem installing a self signed cert. You know, like your own roots CA just for the purpose of testing could save small developers a lot of money while working on their own software. So I think that's a very good point. Use of a self signed cert within a closed circle of development testers and only sign the final release with a publicly trusted CA issued certificate. And. And I, I want to go back to the CISA.gov piece because I feel like I skipped over it. I do have the notes. I have the URL in the show notes for anyone who's listening. It's and doesn't get the show notes or have them CISA cisa.govgov then it's forward slash cyber hyphen hygiene, hyphen services. Cyber hygiene services separated by hyphens. Why would you not sign up and get free scanning? I think that it makes a lot of sense. So again thank you Michael P. For bringing that to our attention. And back to mine. I just assumed it was only government agencies, but they, they seem to embrace me. I mean I've had several email back and forth and it's now being, you know, I'll hear from them as soon as they are ready to go. So we'll see.
[139:45]
A
Clearly deem you a critical infrastructure organization.
[139:49]
B
I think it's a very soft demarcation.
[139:55]
A
They want to discourage home users for signing up for this. Probably.
[139:58]
B
Yeah. But I think that's running a server
[140:00]
A
and you're putting commercial software on there for people to download. I think you can't.
[140:05]
B
Yeah, that would be cool. That means that a huge portion of our listeners are, you know, in. In their enterprises could also benefit from this would be really.
[140:13]
A
Yeah, he. What was he. He said he was spending 6,000 bucks a year.
[140:17]
B
Yeah. His insurance, his cyber insurance policy required a. An annual full review, a scan, a security scan for which they were paying $6,000 in order to qu. In order to get their. Their cyber insurance policy. So CISA does it for free and it's cisa.
[140:40]
A
Yeah. Oh by the way, Sci Fi in our club Twit Discord says of course Steve is critical infrastructure. He has software on the space station for crying out loud. Just mention that in the email. I think they'll immediately sign you up. It's a good point. I face.
[140:56]
B
Okay, our last break and then you can't hide from LLMs.
[141:03]
A
Do you want to see the. It turns out it's a pretty trivial thing that I gave it once I explained that he shouldn't use that bad Algorithm. It understood that. Oh yeah. This is the algorithm I'm going to use, which is the XOR shift.
[141:19]
B
Much more sense.
[141:20]
A
Yeah, much more sense. And it is. And the hundred lines is a lot of. That's comments. There's even a whole block of generating. So it's really quite simple. It's getting a BIOS system timer tick count, getting the low word, making sure it's not zero and then it's doing the shifts pretty quickly. Couple of moves and then. And then it's returning the number. So it's. It's pretty straightforward.
[141:46]
B
Yeah. And you. It did. I'm seeing that it did do. It is. It is all 16 bit code. You could ask it for 64 bit or 32. Yes.
[141:57]
A
Or.
[141:57]
B
Or at least 32 bit x86 code.
[142:01]
A
Yeah. This is a random 16 bit value. Yeah. Yeah, you could totally ask it for that. Now the other thing is, notice how nicely commented it is. It is actually.
[142:09]
B
And I noticed that. Right. It wrote a. I mean it's a full program, that thing. Yep. Up at the top it says dot, small and.
[142:18]
A
Oh yeah.
[142:18]
B
And model and so forth.
[142:19]
A
Yeah, yeah. It's doing the whole thing. In fact, it even says.
[142:22]
B
And sets the stack.
[142:23]
A
It notes that that the. The DX to AX clobbers the tick count and it notes that at the end. So here's the demo. It gives you a little. Generate five random numbers and print them so that you know that it's actually doing something. And then it says at the end, you know, note the comment says clobbers AX CL but RAND also clobbers bx. Worth fixing if this is going into a library header.
[142:49]
B
And I notice it also uses the console in order to actually print things. So there's a lot more than just the random number generator. That's very.
[142:56]
A
Yeah, it's. It's a fairly. I mean, I would say this is. It gives you the impression it understands what it's doing and yeah, did something intelligent. It didn't pick the right algorithm at first, but that's because I was.
[143:07]
B
I'm. I'm using the wrong color me impressed, my friend. It is. It is really. I think it's. It fundamentally changes what we think of as coding by introducing an automated middleman between this is what I want and.
[143:27]
A
Right.
[143:27]
B
And you know. Right. And I need code to do it.
[143:30]
A
Right. I mean, and this is a super trivial little thing, but just, you know, I didn't want to take something. Do something that would take hundreds of lines of code. So. You are watching security now, aren't you glad you're here. We do this every Tuesday right after Mac Break weekly, about 1:30 Pacific, 4:30 Eastern, 20:30 UTC. Yes, we're in daylight saving time now, so we've shifted over 2030 UTC. You can watch us live on Twitch TV, X.com, youTube.com Facebook, LinkedIn and Kik. And if you want to watch live, I hope you will. If not, after the fact, on Demand at TWiT TV. SN. Steve's got copies of the show. I'll explain a little more about what. He's got kind of unique versions at his website, GRC.com there's a YouTube channel and of course it's a podcast, so you can subscribe on your favorite podcast client.
[144:26]
C
Hello, Hello, Hello. Now that I've got your attention, this is going to be a weird ad because real estate is weird. If you're in real estate, you already know this. But most of the data's trash and sellers. Well, they're a mess. You know the story. They got to talk to their wife, their dog, their therapist. In reality, they just don't know you and they don't trust you. And why would they? You're the hundredth person who's called about their house. And why'd you call them? Because they were on some list that you heard about from a guy living in his mom's basement. Now, to be fair, I live in my mom's basement too. But that's not the point. If you want to win, you have to be different. You have to talk to people who need to sell, who just had a kid, got divorced, or had a loss in the family. That's what Goliath does. We monitor people's lives. And when something changes, we throw up a white flag. So you know exactly who to contact, Even if it's 9pm on a weekend and you're asleep. That's what an unfair advantage sounds like.
[145:14]
B
Goliath, Goliath Data Goliath, Goliath data.
[145:18]
C
GoliathData.com find deals before the market does. Okay, back to the show.
[145:24]
A
Now, since we're talking about LLMs, let's talk about LLMs.
[145:29]
B
It turns out that mimicking human consciousness is not the only thing LLMs can be spookily adept at. It will probably not come as a huge surprise to learn that LLMs can be frighteningly good at discriminating among similar appearing objects, including among people. Our friends at ETH Zurich, with some help from Anthropic, have been at it again. Their recent paper, published less than two weeks ago bears the title Large Scale Online De Anonymization with LLMs here's what we learn about the latest trick they've taught LLMs their papers abstract it's very, very techie sounding abstract, but everyone will get the, you know, a sense for this, they said. We show that large language models can be used to perform de anonymization at scale. With full Internet access, our agent can re identify hacker news users and anthropic interviewer participants at high precision. Actually, it doesn't say it here, but it's 99 accurate. Given pseudonymous online profiles and conversations alone, matching what would take hours for a dedicated human investigator. We then design attacks for the closed world setting. Given two databases of pseudonymous individuals, each containing unstructured text written by or about that individual, we implement a scalable attack pipeline that uses LLMs to 1 extract identity relevant features, 2 search for candidate matches via semantic embeddings, and finally 3 reason over top candidates to verify matches and reduce false positives compared to classical de anonymization work, and they cite a previous example known as the Netflix Prize that required structured data. Our approach works directly on raw user content across arbitrary platforms. We construct three data sets with known ground truth data to evaluate our attacks. The first links hacker news to LinkedIn profiles using cross platform references that appear in the profiles. Our second data set matches users across Reddit movie discussion communities, and the third splits a single user's Reddit history in time to create two pseudonymous profiles to be matched in each setting. LLM based methods substantially outperform classical baselines, achieving up to 68% recall at 90% precision compared to near 0% for the best non LLM method. Our results show that the practical obscurity protecting pseudonymous users online no longer holds, and that threat models for online privacy need to be reconsidered. Wow. It turns out that individuals who believe their identity is well protected simply by their use of online handles are likely to be far more readily de anonymized than they might imagine. So I'm only going to share the paper's introduction since it suffices to make their case. They wrote. For decades it's been known that individuals can be uniquely identified from surprisingly few attributes. Sweeney's seminal work demonstrated that 87% of the US population could be uniquely identified by just zip code, birth date and gender. Naryan and Shamatikov showed that anonymous Netflix ratings could be linked to public IMDb profiles using only a handful of movie preferences, while DeMontjoy et al. Proved that four spatiotemporal points are enough to uniquely identify 95% of individuals in mobile phone data sets. Despite these attacks, pseudonymous online accounts, Reddit throwaways, anonymous forums, you know, review profiles, et cetera have remained largely unaffected by de anonymization attempts. The reason is simple. Applying such attacks in practice has required structured data amenable to algorithmic matching or substantial manual effort by skilled investigators reserved for high value targets. De anonymization is a two step process at heart, involving profiling an anonymous person from their posts and then matching them to a known entity. It's well known that large language models can infer personal attributes from text on online forums. Given this, it makes sense to ask how good are LLMs at full end to end de anonymization? And this is a prac. And is this a practical threat to pseudonymous accounts? Our contributions we demonstrate that LLMs fundamentally change the picture, enabling fully automated de anonymization attacks that operate on unstructured text at scale. We show this by phrasing de anonymization as a matching problem and showing LLMs can perform all steps needed to match accounts, extract identity relevant signals from arbitrary text, efficiently search over millions of candidate profiles, and reason about whether two accounts belong to the same person. We show that the practical obscurity that has long protected pseudonymous users. The assumption that de anonymization, while theoretically possible, is too costly to execute both broadly no longer holds. We validate this thesis in 3D anonymization settings. Matching an online account to its real identity, linking an identity to an unknown pseudonymous account, and linking pseudonymous accounts of the same person across different platforms or time periods. These settings capture distinct threat models, for example, doxing of an online account, a stalker targeting a victim, or an adversary consolidating a user's activity, and pose different technical challenges.
[153:08]
A
Okay.
[153:08]
B
In other words, for us, the emergence and presence of LLM technology, with its application of massive computing resources, completely changes the game for the strength of online pseudonymous identities. Many new capabilities are almost certain to eventually come online. For example, you know, it becomes entirely feasible now for law enforcement and intelligence services to identify and track individuals through their online style, their word choice and beliefs as reflected in their online postings. Not feasible to do it before now? Feasible. We've been thinking of the NSA's massive data center as a storage repository of encrypted data being sucked in from all over the Internet, which may someday be revealed when quantum computing cracks traditional crypto. But imagine if the NSA's data centers were instead sucking down the same d already decrypted plain text content that all of the rest of us see. But now it's being fed into massive LLM technology to de anonymize persons of interest. Whether we may like it or not, we're each individually leaving identifying content in everything we post. As these researchers noted, historically until now this wasn't an issue, since the cost of performing such de anonymizing would have been astronomically high, making it completely infeasible. The emergence of LLM technology has forever changed this calculus. Their paper's discussion section at the end summarizes what what they believe their findings mean. They write, de anonymization is One instance of LLMs acting as an information microscope that makes previously manual and expensive attacks scalable. Our paper shows that LLMs democratize DE anonymization, echoing concerns raised by prior work on LLM based attribute information inference and semantic privacy leaks. We argue that the asymmetry between attack cost and defense cost may force a fundamental reassessment of what can be achieved private or I'm sorry, what can be considered private online. Our large scale experiments provide quantitative evidence for these concerns in the de anonymization setting. So what do our findings mean for the future of privacy? Governments could link pseudonymous accounts to real identities for surveillance of dissidents, journalists or activists. Corporations could connect seemingly anonymous forum posts to customer profiles for hyper targeted advertising. Attackers could build sophisticated profiles of targets at scale to launch highly personalized social engineering scams. Hostile groups could identify important employees and decision makers and build online rapport with them to eventually leverage in various forms. Users, platforms and policymakers must recognize that the privacy assumptions underlying much of today's Internet no longer hold. In other words. Yikes. I put the link to their extensively detailed 25 page PDF paper in the show notes here at the end for anyone who might wish to dig deeper. There's nothing any of us can do, but it might be worth keeping it in mind. We are. If somebody deploys something like this, we're leaving our footprints everywhere because I'm actually
[157:28]
A
surprised that it can do this.
[157:30]
B
Yeah.
[157:31]
A
Did they say it was a specially trained LLM for this particular use?
[157:36]
B
Nope.
[157:37]
A
Don't know.
[157:38]
B
No. I mean, I'm sure that they've, they've taken an LLM and, and they're not prompting it. They are, they're, they're, they're using the LLM technology.
[157:47]
A
Wow. Yeah, that's, that's really kind of surprising.
[157:52]
B
That's frightening. But yeah, I would, I would Argue. Not surprising. Look, I mean, okay, so how surprised are we that it can talk right now?
[158:03]
A
That's surprising.
[158:05]
B
Okay, so. So if it can talk, I mean, if it's something that can talk, then this is just something like that is doing something else. That is also surprising.
[158:18]
A
It could talk. Assembly code. It's.
[158:23]
B
We've unleashed a huge new thing.
[158:28]
A
It's amazing. It's just. It's. I don't know about you. I suspect this is the case. It's exciting because it's something that's been promised by sci fi and computing theory from, you know, the very earliest days.
[158:43]
B
Leo. I worked at something that called itself the AI Lab, which, you know, was. Was trying to move a chess piece on the board with a robot hand and eye.
[158:56]
A
Do you feel like you were at Stanford's AI Lab? Do. This was a changer. Do you feel like it's a continuity, like the work done there was the predic predecessor of the work, or is it. Is there? I feel like Transformers came along and it was a discontinuity. It was a big paradigm shift. It was a little different than the expert systems that Sail was working on. Yeah.
[159:18]
B
And remember that this is all outgrowth of neural network. So the heart is real. A neural network. This is the answer to the question, what if we make it way bigger?
[159:33]
A
Right.
[159:34]
B
So a small neural network can, like, maybe figure out to turn the lights on when the sun goes down, but that's about it. This is like, what if we hyperscale a neural network? What happens?
[159:46]
A
And then it's something interesting happens.
[159:48]
B
Hello, Dave.
[159:49]
A
Yes, something very interesting and somewhat unpredictable, to be honest. Happens.
[159:56]
B
It's fundamentally unpredictable, in fact. So that it doesn't bore us, we add. Unpredictability.
[160:03]
A
We add.
[160:05]
B
We actually pour it in.
[160:06]
A
Stochastic. Yes.
[160:08]
B
Called a temperature setting.
[160:10]
A
Right. We live in interesting times. Aren't you glad you listen to this show to keep up with all this stuff? That's Steve Gibson. That's his job. And you know what's fun about for both of us is we've been about this job for a long, long time, and we have seen what's happened. And I think it's really fun that we can still be amazed that we can still go, wow, it can do that. Wow, that's pretty exciting. Very exciting. Thank you, Steve. Steve is@grc.com now. If you want to send Steve a comment, a suggestion, your own experiences, easy to do, go to GRC.com email. Steve has some sort of magic system for determining whether you are a bot or not all done magically. And. And he will vet your email. And if you are in fact a human with good intentions, not bad, he will add you to the list of white list of people who can write to him. You'll also have a chance when you're at that page to check two boxes, one for the weekly mailing of the show notes, but sends them out the day before most of the time the day before the show. And then the other, which he rarely uses, which is an announcement of new products. Both of those are unchecked by default, so make sure you look down at the bottom of the page. Check those While you're@grc.com of course you can get this show there. He's got two, three unique versions of this show. Four unique versions of the show. He's got the 16 kilobit audio which is a little scratchy admittedly, but has the virtue of being very small. He has the 64 kilobit audio which sounds fine, is a good size. That's certainly, you know, if you don't want to waste bits the way the one to get he has the show notes which today are 22 pages of goodness with pictures and links and everything you'd need if you listen to the show. This is a great companion. And a couple of days after the show's published, he will also have a transcript of the show written by a human being, Elaine Ferris. Because we believe in hiring humans. That's going to be our new motto. Hire a human today. Elaine does a great job. You can get those all@grc.com and of course Steve's bread and butter Spinrite. The world's best mass storage maintenance, recovery and performance enhancing utility. Version 6.1 currently and his newest program a mere 10. It is the DNS Benchmark Pro. To test your DNS servers and find the fastest one for you. Both of your.
[162:48]
B
Speed up.
[162:48]
A
Your erc. Yeah, speed up. You know, it's funny, I have open DNS or next DNS and every once in a while it just stalls and I don't know why and I have to turn it off and on again. I have to reboot it. I don't know why that is. It's a strange thing. Maybe I should run the DNS Benchmark Pro and find a better DNS server. That's what I should do. We also have copies of the show at our website, Twit TV SN. There's a YouTube channel dedicated to it. There is also of course you can subscribe. It's a podcast in your favorite podcast client. Leave us a nice review. Tell the world about the most important podcast you listen to all week. Security now. Thank you, Steve. Have a great week. Enjoy the idea of March and we will see you next week, right?
[163:33]
B
Oh, bye.
[163:35]
A
Hey, everybody. Hey. Leo Laporte here and I'm going to bug you one more time to join Club Twit. If you're not already a member, I want to encourage you to support what we do here at Twit. You know, 25% of our operating cost comes from membership in the club. That's a huge portion and it's growing all the time. That means we can do more. We can have more fun. You get a lot of benefits ad free versions of all the shows. You get access to the Club Twit discord and special programming like the keynotes from Apple and Google and Microsoft and others that we don't stream otherwise in public. Please join the club. If you haven't done it yet, we'd love to have you find out more at TWiT TV Club TWiT. Thank you so much. Security now.