Risky Bulletin — Between Two Nerds: Why AI in malware is lame
Podcast: Risky Business
Date: November 10, 2025
Participants: Tom Uren (A), The Grugq (B)
Episode Overview
In this episode, Tom Uren and the Grugq examine Google Threat Intelligence Group's latest paper, "AI Threat Tracker: Advances in Threat Actor Usage of AI Tools". They break down how AI, particularly large language models (LLMs), is beginning to show up in malware and cybercrime tooling, and analyze whether these innovations represent meaningful technical advances or just technological theater. With a skeptical and sometimes irreverent tone, Tom and the Grugq assess the actual capabilities of observed AI-enabled malware, discuss the evolving cybercrime market for AI tools, and contrast offensive and defensive applications of AI in cybersecurity.
Key Discussion Points & Insights
1. Google’s Paper: A New Angle on AI in Crime
- Paper Context: Google’s Threat Intelligence Group (which now includes Mandiant) released an updated report within a year of its previous one, reflecting the fast pace of change in AI-related threats. Unlike earlier reports built on model-query telemetry (e.g., from OpenAI and Anthropic), Google’s work draws on threat intelligence and code analysis to see how AI is actually used inside malware families. ([02:08])
"This one is based on threat intel. So it's got a very different flavor in that. It's here's malware that we see and here is how the malware is using LLMs or whatever, AI technology." — Tom Uran [02:08]
2. “Just-in-Time” AI in Malware: Prompt Steal & Prompt Flux
- First Observed Cases: Google identified malware families “Prompt Flux” and “Prompt Steal” that use LLMs during execution to generate malicious scripts or obfuscate themselves on the fly. ([03:15])
- How It Works (and Why It’s Lame):
- Instead of hardcoded scripts, this malware ships with prompts for an LLM (e.g., “write a script to exfiltrate X”) and generates its action code dynamically.
- Results are unimpressive; such malware is easier, not harder, to reverse engineer.
- The Grugq: "If this is the future of malware, we have nothing to worry about." [06:35]
- Reverse Engineering Simplicity:
- Prompts are often plain English, making the intent of the malware blatantly clear to analysts, as opposed to obfuscated scripts.
"It seems to me that if I am a reverse engineer, I would much rather have an LLM prompt to look at than a bunch of obfuscated code." — Gruk [04:10]
- Redundant Complexity:
- Replacing a simple, tested script with an LLM prompt adds unnecessary steps, making execution more error-prone and slower.
- The speakers question the rationale behind this approach.
"You've added a layer of complexity and uncertainty to a simple fixed problem." — Gruk [07:23]
3. Case Study: Lame AI-Driven Malware in the Wild
- APT28 (“Fancy Bear”) Example:
- Notably, “Prompt Steal” was deployed by Russia’s APT28 against Ukraine, where it was dubbed “LAMEHUG”, a name that highlights its unimpressive technical merit. ([10:13])
- Tom and the Grugq speculate that this reflects rushed, low-skill development from teams that expanded during the war.
"I thought that Fancy Bear was relatively competent." — Tom Uren [10:37]
"There's probably quite a lot of, I would say, entry level developers based on this." — The Grugq [11:15]
- Lowering the Barrier, but Not the IQ:
- AI enables people with minimal skills to create runnable malware, but not effective or sophisticated malware.
- The Grugq uses a Vietnam War analogy: lowering the competence threshold increases headcount, but does not yield capable operators, and can be counterproductive. ([12:40])
"It feels a bit like that, where it's like, if you can drop the requirement for competence low enough, you can get anyone doing this, but then you'll have anyone doing this, and that's a problem." — The Grugq [14:00]
4. Experimental AI-Driven Malware Techniques
- Prompt Flux:
- Malware uses Google Gemini API to rewrite/obfuscate its own source code for persistence and spread.
- The Grugq and Tom are skeptical: repeated LLM-based rewriting will inevitably introduce functional bugs (their DNA-copying, “Chinese Whispers” analogy).
"You're now relying on Gemini...to read an obfuscated script, rewrite it as a different obfuscated script and preserve functionality 100%...that doesn't seem like it will work." — The Grugq [17:20]
- Other LLM Use Cases:
- Credential-hunting tools (like “Quiet Vault”) that use AI to look for secrets in typical locations.
- Tom and the Grugq see no technical reason to use AI for such fixed, well-defined tasks. ([18:34])
5. AI Tool Marketplaces for Criminals
- Rise of Underground AI Tools:
- Google’s report notes an emerging market for AI-assisted crimeware (e.g., “CrimeGPT”), lowering the barrier on phishing, malware dev, and vuln research.
- AI suitability follows a “script” test: it fits repetitive, short, context-light tasks, but most crimeware problems of that type are already solved and automated.
"If you just look at the, like the romance scams, they have manuals which are basically just template emails for every stage..." — The Grugq [22:40]
- Why AI Is Overkill Now:
- Most current cybercrime tasks are better served by “battle-tested” scripts and templates than by AI, unless there is a need to adapt to new contexts (such as deepfakes or novel lures).
- AI could help tailor existing scams to news events or specific victims, but not fundamentally change the nature of these attacks.
6. Offense vs. Defense: Where AI Actually Adds Value
- Attackers Can Afford Mistakes:
- Offense gains less from AI because attacker errors are cheap: a failed attempt just means moving on to the next target.
"As an attacker, if you don't gain access, like, it sucks, but you'll just try someone else, whereas a defender, you can't say, well, you know, better luck next ransomware." — The Grugq [25:07]
- Defense Gets the Edge:
- AI excels in security defense, where it can coordinate and automate a multitude of small, information-gathering and analysis tasks that would otherwise bury analysts.
7. Why the Lame Examples? The Invisible Good Stuff
- Detection Bias:
- The “lame” LLM uses cited in the report are easy to spot and document; more advanced uses (like human-in-the-loop decision-making via AI, deepfake social engineering, etc.) lack detectable on-host traces and are thus not represented in the data.
"If you're using AI to actually write the script, right. Where's the evidence that it was AI?" — Tom Uran [27:59]
- The True Innovation Might Be Invisible:
- AI might be helping attackers in ways that are unobservable in malware samples (e.g., operational decisions, manual crafting of payloads, deepfakes for social proof tricks).
8. Unexpected Side-Effect: AI Experience as Career Investment
- New, inexperienced developers in APTs might be acquiring valuable AI expertise at government expense before moving to lucrative private-sector jobs—a perverse but plausible upside to inefficient AI-driven malware.
"It's cybercrime funding someone's career in AI. Wonderful." — Gruk [29:35]
Notable Quotes & Memorable Moments
- On the redundancy of AI in basic malware:
"You're taking a defined problem where you can write a script and replacing it with a defined problem where you write a prompt." — Tom Uren [07:11]
- On incompetent adversaries:
"You could tell which boxes have been hacked because they basically start rebooting when they run out of ram and their GPUs melt." — Tom Uren [09:18]
- On the current state of underground AI tools:
"Gartner has issued the four quadrants and we see that the crime GPT is sort of up and to the right..." — The Grugq [19:59] (sarcastic)
- On attacker vs. defender asymmetry:
"The cost of failure is quite high." — The Grugq [25:07]
Timestamps of Important Segments
- [02:08] — How this paper’s approach differs: on malware code, not just LLM queries
- [03:15] – [07:53] — How “Prompt Steal” and “Prompt Flux” work, and why they’re underwhelming
- [10:13] – [14:19] — APT28 using "LAMEHUG" malware; lowering the competence bar
- [15:35] – [19:19] — Self-rewriting worms, their flaws, and the futility of LLMs for standard tasks
- [19:59] – [25:07] — AI tools in cybercrime markets, script tasks, and why defenders benefit more
- [27:04] – [29:35] — The “invisible” good uses, and how AI practice via crime might boost careers
Overall Tone
Wry, skeptical, and occasionally caustic, Tom and the Grugq dismiss much of the current AI-in-crime narrative as inflated or premature. They are not techno-optimists for criminal innovation, but they also warn against mistaking observable failures for the absence of sophisticated uses; those uses may simply not be visible to analysts yet.
Conclusion
- Bottom Line: Most observed “AI-powered” malware is technically unimpressive, often less effective than traditional forms, and primarily enables low-skill attackers to deploy simplistic attacks. The real areas where AI will (or does) empower cybercrime may be less detectable and not yet showing up in threat intelligence reporting. For now, AI offers more practical gains to defenders than attackers—at least in operationalized malware code.
