The AI Policy Podcast (CSIS):
"AI, Cybersecurity, and Securing Model Weights"
Host: Gregory C. Allen (CSIS)
Guests: Miles Brundage (AI Policy Researcher; formerly OpenAI), Chris Rohlf (Security Engineer, Meta; CSET Fellow)
Date: June 27, 2025
Episode Overview
This episode offers an in-depth look at the intersection of artificial intelligence (AI) and cybersecurity—with a special focus on the emerging debate over securing "model weights" (the valuable files that encode a trained AI’s knowledge). Host Gregory Allen is joined by cybersecurity veteran Chris Rohlf and AI policy researcher Miles Brundage to explore how AI has changed the defender-attacker balance in cybersecurity, why protecting AI model weights matters, and what policy and practice need to catch up with rapid industry changes.
Both guests share their career backgrounds, trace the evolution of AI in cyber, discuss how defenders and attackers leverage AI, offer practical security recommendations for organizations and individuals, and debate whether securing leading AI models is realistic or futile in an increasingly open, connected world.
Key Discussion Points and Insights
1. Cybersecurity Evolves: From Manual Analysis to AI-Assisted Defense
(Chris Rohlf: 02:20-05:20)
- Chris describes how cybersecurity shifted from painstaking, manual code analysis to automated tools, and finally to integrating classical machine learning and now generative AI.
- "You sat there, you looked through code line by line, tried to determine what the program was doing, where the developer got it wrong, and then slowly and painstakingly writing an exploit by hand." (02:23, Rohlf)
- The advent of machine learning brought classifiers for malware, but the “paradigm shift” came with large language models (LLMs) automating code writing and vulnerability discovery, dramatically scaling security workflows.
2. Old & New Threat Vectors: “Malicious Use of AI” Taxonomy
(Miles Brundage: 05:51-08:56)
- Miles recounts early AI policy work on the risks of malicious AI use, spanning political (disinformation, authoritarianism), digital/cyber (vulnerability discovery, automated spear phishing), and physical (drones, targeted attacks).
- "We broke it up into three different domains: political security... digital security... and physical security." (06:38, Brundage)
- Previously theoretical concerns—like LLM-aided phishing—are now "real phenomena that companies are dealing with every day."
- "It started to be clear that cyber was not just an afterthought, but perhaps one of the most important security domains of AI." (08:00, Brundage)
3. The Offense-Defense Balance: Attackers Still Have the Edge
(Chris Rohlf: 09:50-17:26)
- Chris explains that most attackers benefit from scale and automation: millions of accounts are attacked via credential stuffing, which exploits weak or reused passwords and missing two-factor authentication.
- "There are areas... where the cost of attack and the barrier to entry is very, very low for attackers." (11:22, Rohlf)
- High-end nation-state attacks (e.g., zero-days) are more targeted, expensive, and rare, yet still relatively cheap compared to other state-level tools.
- Defenders must "be right everywhere," while attackers need the defender to slip up only once.
- "The Defender has to be everywhere and do a good job in all of those places, whereas the attacker... has to be right once or twice." (13:36, Rohlf)
- Yet, investing in security meaningfully raises the bar; even if impenetrable defense isn’t possible, making attacks expensive and noisy deters opportunistic hacks and slows targeted incursions.
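The "make attacks expensive and noisy" point above can be made concrete with a toy sketch (not from the episode; the threshold and window values are illustrative assumptions): even a basic per-account failed-login throttle turns cheap, automated credential stuffing into a slow, detectable operation.

```python
# Toy illustration: a sliding-window lockout that raises the cost of
# credential stuffing. max_failures and window_seconds are arbitrary
# values chosen for this sketch, not recommendations.
import time
from collections import defaultdict, deque

class LoginThrottle:
    """Block further login attempts after too many recent failures."""

    def __init__(self, max_failures=5, window_seconds=300):
        self.max_failures = max_failures
        self.window = window_seconds
        self.failures = defaultdict(deque)  # account -> failure timestamps

    def allow_attempt(self, account, now=None):
        now = time.time() if now is None else now
        q = self.failures[account]
        # Discard failures that have aged out of the sliding window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) < self.max_failures

    def record_failure(self, account, now=None):
        now = time.time() if now is None else now
        self.failures[account].append(now)

throttle = LoginThrottle(max_failures=3, window_seconds=60)
for _ in range(3):
    throttle.record_failure("alice", now=0.0)
print(throttle.allow_attempt("alice", now=1.0))    # False: account locked
print(throttle.allow_attempt("alice", now=120.0))  # True: window expired
```

A real deployment would combine this with breached-password checks and two-factor authentication, per the weaknesses Rohlf describes.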
4. The Impact of AI on Cybersecurity: Leveling the Playing Field?
(Chris Rohlf: 21:17-26:29)
- AI’s main benefit is lifting the bottleneck for defenders—especially in “boring but important” work like vulnerability scanning.
- "AI is going to tip the scale, eventually back in favor or balance it out toward defenders." (22:08, Rohlf)
- AI helps defenders automate vulnerability detection across huge codebases more than it helps attackers find novel vulnerabilities.
- "AI is exceedingly good at that." (24:13, Rohlf)
- Practical constraints: Not all companies are adopting AI-powered tools, especially those below the “security poverty line”—e.g., hospitals and SMBs with low resources and expertise.
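The "boring but important" scanning work described above can be sketched in miniature (this is not code from the episode; the pattern list is an illustrative assumption). AI-assisted scanners reason about code semantics, but even this regex toy shows the workflow shape: sweep a codebase, flag suspicious lines, report them for review.

```python
# Toy vulnerability scanner: flag lines matching known-dangerous Python
# patterns. Real AI-assisted tools go far beyond pattern matching; this
# only illustrates the automated-scanning workflow.
import re

DANGEROUS_PATTERNS = {
    r"\beval\s*\(": "eval() on untrusted input can execute arbitrary code",
    r"\bpickle\.loads\s*\(": "unpickling untrusted data can execute arbitrary code",
    r"\bsubprocess\..*shell\s*=\s*True": "shell=True enables command injection",
}

def scan_source(source: str):
    """Return (line_number, warning) pairs for each suspicious line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, warning in DANGEROUS_PATTERNS.items():
            if re.search(pattern, line):
                findings.append((lineno, warning))
    return findings

sample = "import pickle\ndata = pickle.loads(blob)\nresult = eval(user_input)\n"
for lineno, warning in scan_source(sample):
    print(f"line {lineno}: {warning}")
```

The point Rohlf makes is about scale: running a check like this (or its far smarter LLM-backed equivalent) across millions of lines costs defenders almost nothing once automated.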
5. AI as a Target: Data Poisoning, Adversarial Attacks, and Model Security
(Miles Brundage & Chris Rohlf: 28:30-34:46)
- Newer attacks target AI itself, like poisoning training data, inserting "sleeper agents" (backdoors) via fine-tuning, or exploiting LLMs’ tendency to leak or exfiltrate data.
- "If you look at a lot of these complex agentic frameworks... the models will generate the code you ask them to." (34:13, Rohlf)
- These risks are real, but the attacks aren’t always trivial—still, vigilance is needed, especially as LLMs are integrated into critical/high-trust settings.
6. Securing Model Weights: Why It’s Hard, Why It Matters
(Miles Brundage & Chris Rohlf: 35:38-59:31)
- Model weights—the files encoding a trained AI—are now multi-billion-dollar assets, but can in principle be copied "for damn near free."
- "We're creating these extremely valuable files that essentially represent billions of dollars in spending... but it's not obvious the security is increasing at the same rate." (36:06, Brundage)
- What does a thief get? Sometimes, transferring weights for practical use may be difficult due to hardware/software dependencies; but in other scenarios, exfiltrating weights could jumpstart a nation-state's or competitor’s AI efforts.
- Security “tiers” exist: basic defense against random hackers up to full hardening (air gaps, armed guards, limited bandwidth, etc.) for state-level threats. It’s challenging to balance high-level security with the desire to serve millions of users and innovate quickly.
- "Once you ratchet [security] up too high, it becomes nearly impossible to do research or to scale out inference infrastructure." (54:06, Rohlf)
- Even with open-source models, organizations have incentives to secure their own fine-tuned variants/proprietary data.
7. Security “Doomerism” vs. Pragmatic Optimism
(Miles Brundage: 63:08-68:44)
- Miles defines “security doomerism” (aka "security nihilism") as the belief that securing model weights is futile—attackers will inevitably win.
- "There is a kernel of truth here... security is really hard... in many cases it is actually appropriate to be... a doomer." (66:06, Brundage)
- For some contexts (open labs, non-strategic models), extreme measures aren’t warranted. But for strategic assets, it's both feasible and valuable to invest in high-grade protection—with the right trade-offs for usability and cost.
8. Policy Recommendations: What Should Governments and Companies Do?
(Chris Rohlf: 69:16-71:13, Miles Brundage: 72:28-75:20)
- Promote adoption and access: Policies should focus on broadening access—especially for resource-strapped sectors—so that AI's benefits for defense are realized.
- "We need policies that allow these companies to flourish and allow them to innovate with as few barriers as possible." (70:14, Rohlf)
- Public investment & shared best practices: Pilot projects, public R&D, and "science in the open" can push the state of the art in secure model hosting and share failures for better community learning.
- Procurement and standards: Use government buying power to set higher bars for AI model security (e.g., “FedRamp for model weights”).
- "It's totally reasonable to use the government purchasing power that exists to... start to formalize some of these tiers of security standards." (74:02, Brundage)
Notable Quotes & Memorable Moments
"You can never make something secure enough. You can never keep up. And that trend of automation obviously at some point collided with AI."
— Chris Rohlf (03:14)
"The Defender has to be everywhere... The attacker only needs to be right once or twice."
— Chris Rohlf (13:36)
"AI is going to tip the scale, eventually back in favor or balance it out toward defenders."
— Chris Rohlf (22:08)
"AI could reduce the cost of attacks, enable larger scales of attacks and, you know, faster paces of certain kinds of attacks. But it could also help with defenders."
— Miles Brundage (08:32)
"It seems like there's maybe a mismatch in kind of the seriousness of protection."
— Miles Brundage (38:28)
"Replication is not innovation. And so we need policies that allow these companies to flourish and allow them to innovate with as few barriers as possible."
— Chris Rohlf (70:14)
"Security doomerism is this idea that... as you said, that it's kind of futile, that it's not necessarily worth the investment of at least getting to the point of robust model weight protection. I think there is a kernel of truth there, which is that security is really hard."
— Miles Brundage (65:41)
Timestamps for Key Segments
| Segment | Time |
|---------|------|
| Introduction and guest backgrounds | 00:00–05:20 |
| The evolution of cyber and AI intersections | 05:21–08:56 |
| Offense-defense balance in cyber | 09:50–17:26 |
| AI's impact on cybersecurity (defense/offense parity) | 21:17–26:29 |
| AI as a target: data poisoning, adversarial attacks | 28:30–34:46 |
| Security of model weights; economic rationale | 35:38–47:42 |
| Open/closed models & consequences for security | 57:05–59:31 |
| Security doomerism vs. trade-offs and best practices | 63:08–68:44 |
| Policy recommendations (adoption, procurement, R&D) | 69:16–75:20 |
| Closing reflections | 75:20–76:49 |
Conclusion
This episode blends practical technical insight and high-level policy analysis on the evolving cyber-AI landscape. Rohlf and Brundage bring clarity to urgent questions: Where and how does AI change the security game? What is realistic to protect? How does success depend on adoption, incentives, and community learning? And as AI model weights become ever more valuable national assets, how do we secure them without hindering progress? The discussion is candid: there are no silver bullets, but with investment and coordination now, defenders can genuinely leverage AI, and sound policies can help tip the scale.
Essential listening for anyone navigating the new frontiers of AI security and policy.
