
Podcast Host (Ad Segment)
Nearly every news alert in 2025 has raised questions, some old, some new, about the law and national security. And now you get the chance to ask Lawfare directly. It's time for our annual Ask Us Anything Mailbag podcast, an opportunity for you to ask Lawfare this year's most burning questions. You can submit your question by leaving a voicemail at 202-643-846, or by sending a recording of yourself asking your question to ask us anything lawfairmail.com by December 16th.
ThirdLove makes better bras, period. ThirdLove was founded by women who were tired of settling for bras that were just good enough. Each piece is made with the highest quality materials to solve for the fit issues so many of us face: get extra lift, smooth out back spillage, and so much more, all in over 60 sizes from double A to H. They even have exclusive half cup sizes, which means if you're in between sizes, you can get the perfect fit every time. Stop settling for average bras and get solutions made for your body. Get $15 off your purchase at thirdlove.com with code podcast15thirdlove. Your best fit awaits.
This holiday season, connection with the kids we love is the best gift of all. Right now, kids on average are spending between five and nine hours a day on screens, and studies link heavy use to rising anxiety and depression, with media being at the center of it all. That's why Gabb makes kid-safe phones and watches. No Internet, no social media, just the right features for their age. With Gabb's Tech in Steps approach, kids get the right tech at the right time. So if a phone is on your child's wish list, make it a Gabb, the gift of safe connection. For an exclusive holiday offer, visit gabb.com getgab and use code getgab. That's G-A-B-B.com getgab. Gabb: Tech in Steps. Independence for them, peace of mind for you.
Alan Rosenstein
It's the Lawfare Podcast. I'm Alan Rosenstein, associate professor of law at the University of Minnesota and a senior editor and research director at Lawfare. Today we're bringing you something a little different, an episode from our new podcast series, Scaling Laws. It's a creation of Lawfare and the University of Texas School of Law, where we're tackling the most important AI and policy questions. From new legislation on Capitol Hill to the latest breakthroughs that are happening in the labs, we cut through the hype to get you up to speed on the rules, standards and ideas shaping the future of this pivotal technology. If you enjoy this episode, you can find and subscribe to Scaling Laws wherever you get your podcasts and follow us on X and Bluesky. Thanks for listening.
When the AI overlords take over, what are you most excited about?
Kevin Frazier
It's not crazy, it's just smart.
Alan Rosenstein
And just this year, in the first six months, there have been something like a thousand laws.
Kevin Frazier
Who's actually building the scaffolding around how it's going to work, how everyday folks are going to use it?
Alan Rosenstein
AI only works if society lets it work.
Kevin Frazier
There are so many questions have to.
Alan Rosenstein
Be figured out, and nobody came to my bonus class.
Kevin Frazier
Let's enforce the rules of the road.
Welcome back to Scaling Laws, the podcast brought to you by Lawfare and the University of Texas School of Law that explores the intersection of AI, policy, and, of course, the law. I'm Kevin Frazier, the AI Innovation and Law Fellow at Texas Law and a senior editor at Lawfare. Today we're joined by Caleb Withers. Caleb is a research associate at the Center for a New American Security, where he focuses on frontier AI and national security. More specifically, Caleb studies the impact of emerging AI capabilities in the biological and cyber domains. Today's conversation examines how frontier models could disrupt the balance of power in cyberspace, potentially giving malicious actors a decisive edge. We'll look at the trends fueling this shift, explore how policymakers and labs can counter this threat, and finally, consider the next era of cybersecurity. To get in touch with us, email scalinglaws@lawfaremedia.org or follow us on X or Bluesky. And with that, we hope you enjoy the show.
Caleb, welcome to Scaling Laws.
Caleb Withers
Thanks. It's great to be here.
Kevin Frazier
So you penned a report called Tipping the Scales: Emerging AI Capabilities and the Cyber Offense-Defense Balance. And I can't say thank you enough because, for one thing, you provided me with a great reading to assign to my students in our unit on AI and cyber. But perhaps of more importance than just my class, you added some really great insights into a space that's as gray as a fall day in the Pacific Northwest. Depending on who you ask, cybercrime costs between $1.5 trillion and $10 trillion in 2025 alone. Now, that's a crazy gulf, and we won't get into how we can have estimates that differ by almost a factor of ten. But even on the small end of that spectrum, even if it is just $1.5 trillion in costs as a result of cybercrime, that's a huge issue, and that's a huge public policy issue. And obviously it's going to become even bigger with AI. And that's really where I want to center our analysis. But let's start with some basics. Pre-gen AI, pre this wave of AI, what was the relationship between some of the AI tools that existed at that point and cybersecurity, both offense and defense?
Caleb Withers
Yeah, sure thing. So I think machine learning, automation, and software in, as you said, both cyber offense and defense are nothing new. If I was to think back to some particular examples, you know, take spam. Initially that was somewhat of a manual process, or there wasn't a need for automation and sort of spam defense, in the sense that no one was really doing it yet in the very early days of communication. But as soon as we saw sort of the influx, and people realized, hey, there's an opportunity here to send messages that the person receiving them might actually not be that excited to receive, or that aren't to their benefit to receive, then all of a sudden there's this question of, well, how well exactly can we do at having machine learning and software and algorithms filter out the good from the bad there? And so whether it's spam or recognizing malware or just some of the productivity-enhancing things, looking at what a cyber defender might be doing and saying, hey, well, this is a process that we can follow, and maybe we can speed that up a little bit by hard-coding how to do that. Yeah. So machine learning has played a long-standing role in cybersecurity, for sure.
Kevin Frazier
Yeah. And I just think it's important to situate that this quote-unquote AI moment is new in many ways. But also we've seen AI be a core part of cybersecurity for a long time. But your report notes that while AI has traditionally helped defenders, for example, companies looking to detect anomalous behavior, or helping you and me detect spam, now, with some of these frontier AI systems, the balance may be changing. So let's get to that by first talking about what we mean by frontier AI in a cyber context, and how is that kind of changing the calculus on offense versus defense?
Caleb Withers
Yeah, so I think the term can vary depending on who you're talking to. But at least the way I'm using it and see most people using it is effectively to refer to large foundation models, that is models that are trained by ingesting large amounts of data, in particular from the Internet and text and whatnot. The sort of models that are powering ChatGPT, pretty powerful, pretty general purpose, moving pretty fast. Large language models would be the term that people would often Refer to them as, although I sort of put a little asterisk on referring to them as large language models here, because given some of the training and the capabilities and the things that we can see them do, if these models can effectively use computers or they're multimodal and they can ingest images and whatnot, that also has relevance in the cyber domain. And so I think that's worth noting that when people think about large language models in cyber, sometimes they say, okay, well, it's going to be helpful for English text and whatnot. But, you know, really these models are also increasingly using computers, making decisions, doing all sorts of things.
Kevin Frazier
Okay, so we've seen some new capabilities come about. And I wonder, from the vantage point of thinking about the net balance of whether AI is helping defenders or helping offensive cybersecurity efforts, what are the key aspects of this frontier AI that may actually lend themselves more to the offensive folks, the bad actors trying to infiltrate systems?
Caleb Withers
Yeah, so I'll say first and foremost that I think in general, the case that AI helps defenders on net, the arguments for that, I think by and large a lot of them still do apply and will apply to these frontier models. Effectively, people talk about the defender's dilemma, which is that if I'm a cyber attacker and I'm trying to get into your system, you know, if I only succeed once, in some sense, then I'm in your system, and you sort of have to be successful all the time and for everything as a defender. And, you know, defenders especially these days will often have massive sprawling networks. And so there's just a big scale challenge there. And so if you have software tools or AI or machine learning that help you deal with that scale or deal with that volume of attacks, then sure, attackers are going to be able to use them, but things that can scale up, as software is good at doing, are great for defenders.
And so the approach my report takes is to sort of say, is there any reason to expect that this might not hold, or that this time might be different, for some of the current-gen capabilities we see? And there's a few things that come to mind for me. One is that even as the cost of any given capability in terms of running these models is rapidly going down, the benefit to spending more on running these models longer, running more of these models, running bigger models, we're still seeing returns to that. And so that breaks an assumption that was previously pretty safe, that, you know, running most machine learning models or cybersecurity software is going to have, you know, pretty low marginal cost. So we're not there yet in terms of this being a big deal. But, you know, if we look into the future, I could see the cost that defenders are spending on running AI models, you know, starting to actually be a material consideration of, can we actually afford to do XYZ thing?
Kevin Frazier
Right. So is it right to really put an emphasis on the fact that these frontier AI systems essentially enable both an increase in quantity and quality of attack? So any vector or vulnerability that you had previously, now, if a bad actor is willing to spend those inference costs of finding more and more ways of running a model to attack whatever defenses you have, well, as you're pointing out on the defensive side, you're also having to spend more and more and more, and you just may not be willing to put up as many of those costs as maybe a bad actor would be. And so how are we going to see this sort of balance play out, given that it's just becoming more and more expensive to defend against more and more attacks?
Caleb Withers
Yeah, and so I think both factors are things to keep in mind here. I would say that maybe the quantity or scale side of things is maybe more important at the moment, just given that there's sort of not much truly new under the sun when it comes to cybersecurity. A lot of the attacks or exploits you might see are sort of echoes of things or similar to things that have already been done. And, you know, there was a study that MITRE Corporation did, you know, they're active in sort of the cybersecurity research space, and they were looking at these things they called stubborn weaknesses, software vulnerabilities that have consistently been among the most common and most severe. And when they looked back at 2007 versus more recently, these are still just making up a whole bunch of the vulnerabilities that we're seeing in software. So.
Kevin Frazier
So just to pause there, because I have a lot of my own stubborn weaknesses, so I'm curious to learn about these ones. What are some of these weaknesses that really stand out both in 2007 and 2025, which is really unfortunate that we haven't made some improvements.
Caleb Withers
So, yeah, one that is ranked pretty highly, for example, is what they call not neutralizing special elements used in an SQL command. So that's effectively, SQL being a language that underpins some of the databases that are used on the Internet and otherwise, you want people to be able to put in their usernames and text and whatnot, but you don't want them to be able to actually put in commands that make your database do things and whatnot. And so this is sort of a classic thing of, well, we should make sure that the text that people give you when you're using SQL systems and code and whatnot is just that, not actually commands that are going to run on your system. And we still see today, after decades, that that's a mistake developers sometimes make. And they call them stubborn in the sense that sometimes there'll be an attack and you're like, oh, you really got unlucky there. That was hard for you to know that that was going to happen, or anyone could make that mistake. But a lot of the mistakes we see in the coding and software side of things are sort of like, you probably shouldn't have done that in this day and age.
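Illustrative aside, not from the recording: the weakness described here is commonly called SQL injection. The minimal sketch below, using Python's built-in `sqlite3` module, contrasts splicing user text into the command with passing it as a bound parameter. The table, column names, and sample payload are hypothetical and chosen only for illustration.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database with two hypothetical users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Vulnerable: the user's text becomes part of the SQL command itself,
    # so quote characters in `name` can change what the query does.
    query = f"SELECT secret FROM users WHERE name = '{name}'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn, name):
    # Safer: the `?` placeholder sends `name` as data only, never as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


if __name__ == "__main__":
    conn = make_db()
    payload = "nobody' OR '1'='1"  # text crafted to rewrite the WHERE clause
    print("unsafe:", lookup_unsafe(conn, payload))
    print("safe:  ", lookup_safe(conn, payload))
```

With the crafted input, the unsafe lookup returns every user's secret, while the parameterized lookup returns nothing because no user has that literal name.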
Kevin Frazier
Yeah. And it's really fascinating to get a sense of how the promise of AI, in many ways, is of us all being able to code. For example, everyone can create their own website, you can create your own app, everyone's vibing, having a good time. But as you're pointing out, these stubborn weaknesses still exist and might introduce folks who are testing their new code, for example, against some of these attacks to really impressive and really perhaps pervasive and troubling vulnerabilities. And they're just not even paying attention to this. Right? When you're vibe coding, it doesn't come with a warning necessarily of, hey, watch out, a bad actor might exploit this.
Caleb Withers
Yeah. And I mean, the flip side of this is some of these things are things that AI could be quite useful for, these sort of known, relatively straightforward categories of security vulnerabilities. This is the sort of thing that an AI model actually probably could be decent at, at least augmenting or helping coders pick up some of these. And I think one of the themes of my report is you can see it going both ways. You can see, if vibe coding models and apps aren't worrying about this too much, or are empowering people who don't even know what to be thinking about when it comes to making secure code, that could be introducing all sorts of vulnerabilities. At the same time, these tools could be really helpful in avoiding them. And so I think a point that's particularly salient for these, and in general, just as we think about AI and cybersecurity, is, you know, there's choices we face, be it industry or policymakers or procurers, in terms of, you know, how much do we actually prioritize this sort of stuff? If we're looking at the benchmarks for models and apps, you know, are we just looking at code performance per se, or are we actually, you know, baking in security as one of the more important things we want to be comparing these on?
Kevin Frazier
Yeah. So before we get to some of the policy interventions and some of your own recommendations, just to bring out some of these more stubborn weaknesses and, more generally, the sorts of cyber vulnerabilities people should be most attentive to. I know I share this with a lot of folks. If you write me an email that inflates my ego and says, oh, Kevin, you are the best podcast host. Alan is, you know, pretty good, but you're definitely the better one of the co-hosts, and why don't you just send me some money to help out with my new Kickstarter campaign? I may be pretty tempted. Right? Oh, clearly this is someone who listens to the pod, they care about me. These sorts of hyper-personalized, just great spear phishing or phishing attacks, right? Of being able to write a personalized email that lands in someone's inbox at the right moment and makes a really tangible request. I've heard this is one of those up-and-coming issues that, like you said, has echoes of the attacks of the past. But to what extent do you really see AI as amplifying some of these traditional attacks, whether it's phishing scams or things like DDoS or some of these other well-known cyber vulnerabilities?
Caleb Withers
Yeah, I think phishing is a great example because I think in terms of the sort of, have we seen true transformation of cyber attacks yet? My answer is not yet, except for this asterisk of, as you said, phishing, for exactly the dynamics you pointed at. And I think another particularly salient example here is phishing in non-English languages. It just seems totally correct to say that this has probably increased by at least an order of magnitude. If I am a Japanese business, the number of, as you say, relatively high-quality phishing emails I'm receiving has just gone through the roof. And this is a good example too of how I think AI on the defense will be an important aspect of counteracting AI on the offense, but it's not going to be one for one. Right? It's not always going to necessarily be an AI solution that's most promising for an AI threat. And to make this concrete:
In the early days of the Internet, as I understand it, if you received an email, it was probably legitimate. There were just more people initially, in the early days, sending legitimate messages than there were sending illegitimate messages. And now that's just no longer the case. Gmail and these service providers, for any email that comes in, they're sort of working from an assumption of, this is actually probably not a genuine email or a good-faith email.
But at the same time, part of the solution here, yes, will be more sophisticated algorithms to identify these. But at a certain point, as you say, if I'm sending you this flattering email in your inbox, with a sufficiently sophisticated AI phishing campaign that has looked up what I say, has looked up what you say, et cetera, there's probably not going to be much that differentiates a legitimate from an illegitimate email. The actual thing that's going to be differentiable is: did I, Caleb, send this? And that's where things like two-factor authentication come in, or, is this email actually coming from cnas.org or not, and what is the CNAS website, that sort of thing. So, you know, some of these more mundane solutions, some of these more behavioral solutions. Just because there's an email in my inbox that says it is from so-and-so person and says the things I would expect it to say from so-and-so person, you know, you could probably actually get away with assuming that most of the time, apart from in the most critical cases, in years prior. But increasingly that's not the case. And so standards and systems around authentication and understanding, when an email says it comes from somewhere, does it actually come from there? You know, that's going to play as much of a role, if not more of a role, than AI defense for this sort of stuff.
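Illustrative aside, not from the recording: the question "does this email actually come from cnas.org?" can be sketched with Python's standard `email` package. This is a simplified sketch under stated assumptions: it trusts an `Authentication-Results` header as if the recipient's own mail server had added it, and it only checks that a passing DKIM signature's domain matches the `From` domain. Real deployments rely on the mail provider's SPF, DKIM, and DMARC evaluation. The addresses, the `mx.example.net` server name, and the sample messages are hypothetical.

```python
import re
from email import message_from_string
from email.utils import parseaddr


def sender_domain(msg):
    """Return the lowercased domain of the From address ('' if absent)."""
    _, addr = parseaddr(msg.get("From", ""))
    return addr.rpartition("@")[2].lower()


def dkim_pass_domains(results):
    """Domains reported with dkim=pass in an Authentication-Results value."""
    found = re.findall(r"dkim=pass\b[^;]*?header\.d=([\w.-]+)", results)
    return {d.lower() for d in found}


def looks_authenticated(raw_message, expected_domain):
    # Assumption: the Authentication-Results header was added by our own
    # receiving server, not by the sender (who could forge it otherwise).
    msg = message_from_string(raw_message)
    domain = sender_domain(msg)
    results = msg.get("Authentication-Results", "")
    return domain == expected_domain.lower() and domain in dkim_pass_domains(results)


GENUINE = (
    "From: Caleb <caleb@cnas.org>\n"
    "Authentication-Results: mx.example.net; dkim=pass header.d=cnas.org\n"
    "Subject: Kickstarter\n\nPlease chip in.\n"
)
LOOKALIKE = (
    "From: Caleb <caleb@cnas-org.example>\n"
    "Authentication-Results: mx.example.net; dkim=pass header.d=cnas-org.example\n"
    "Subject: Kickstarter\n\nPlease chip in.\n"
)
UNSIGNED = (
    "From: Caleb <caleb@cnas.org>\n"
    "Authentication-Results: mx.example.net; dkim=fail header.d=cnas.org\n"
    "Subject: Kickstarter\n\nPlease chip in.\n"
)

if __name__ == "__main__":
    for label, raw in [("genuine", GENUINE), ("lookalike", LOOKALIKE), ("unsigned", UNSIGNED)]:
        print(label, looks_authenticated(raw, "cnas.org"))
```

All three messages carry identical, equally persuasive text. Only the first passes, which is the point made above: the content cannot distinguish them, but the origin check can.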
Kevin Frazier
Yeah, it's fascinating, the fact that we still can't get beyond the fact that we're very human, and you're going to have to rely on the end user, like you and me, to have some degree of cyber hygiene in place to know those signs of, when should I read that email again? When should I, you know, send a text message to Caleb to say, hey, are you really starting this new Kickstarter campaign for everyone to have a mustache that looks as beautiful as your mustache, Caleb? That'd be an interesting Kickstarter. I would donate to that. But this requires a lot of attention and a lot of resources from the actual end user, which I think does put you at a disadvantage, of just the everyday person having to be even more hypervigilant about who you're interacting with and what information you're receiving. So that does seem to put us, us being the defenders, at a bit of a disadvantage. Are there any other things, new threats you see on the horizon that AI is going to introduce from a cyber offensive perspective, that we should be aware of? Or do you think that a lot of the, for lack of a better phrase, fear mongering of, cyber is broken, the Internet is over, bad actors are just going to use AI to get into every system, is that hyperbole, and perhaps way too exaggerated based off of where the tech stands today?
Caleb Withers
Yeah. I think one additional trend that comes to mind here is, and again, there is some continuity here, the time to exploit. That is, you know, once an exploit is discovered, how quickly is it actually weaponized in practice against targets?
Kevin Frazier
Right. And an exploit here is just referring to some gap in the code, some ability to probe into a system that you otherwise shouldn't have.
Caleb Withers
Yeah, yeah. And that time on average has been trending down over the years, you know, going from months to weeks and then sometimes days now, depending on sort of what sort of exploit it is or some other characteristics of it.
And I think that's notable because, on the one hand, it's going down. This just highlights that the challenge of defense is getting harder and harder. At the same time, because it is usually measured in days or weeks at the moment, that actually gives you a little bit of grace as a defender. If you're updating your systems to the latest version when your software provider pushes out those updates, you'll probably be good most of the time. If you're a particularly attractive target to a state actor or you're unlucky, sure, maybe that's not fast enough, but most of the time it is.
I think a trend that we're in the early stages of seeing, or that's maybe on the horizon, is just that AI really does have the promise for attackers to compress this somewhat. I'm borrowing this from Timothy Chauvin, a researcher. He points to a scenario where you can imagine you have an open source piece of software, so, you know, software that people use where the source is up there online, and then some update gets made to it to patch some security vulnerability in the code. And you could imagine a large language model that just sort of monitors everything that goes up on GitHub, which is, you know, one of the places that open source code goes, and every time there's an update, the prompt, it'd be more sophisticated, but in effect it is: Is this trying to fix any security problem? If so, what is it? If so, how can we exploit this? If so, who is running the software? And then let's have a go at doing that exploit.
And so the models aren't there yet in terms of the sophistication, I think, to do this for all but the most rudimentary or transparent things. But, you know, these models are getting quicker, faster.
And so, you know, if I was to think about the coming years, a trend I anticipate seeing is, you know, starting with not very sophisticated exploits or vulnerabilities, but getting more and more sophisticated over time, this thing of, as soon as something is out there in the water, as soon as something is there to be discovered in terms of what can be exploited, this just happening en masse and pretty fast. That starts to put pressure on defenders, in terms of, if you can usually get away with days to update your systems, then okay, maybe you can do it overnight when not as many people are using it. Or maybe you can at least run some checks and not roll it out to everyone at the same time, just to make sure, you know, we're not ending up with a situation like the CrowdStrike issues that, you know, the airlines and others had back in 2024, where if you, you know, mess up the update and then roll it out to everyone all at once, that can cause a lot of problems. So yeah, AI, I think, will exacerbate this trend of the time to exploit things going down and the pressure on defenders to be quick going up.
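Illustrative aside, not from the recording: the idea of running some checks and not rolling an update out to everyone at once is often called a staged or canary rollout. The sketch below is a minimal version under stated assumptions: `apply_update` and `is_healthy` are hypothetical callbacks supplied by the operator, and the stage fractions are arbitrary.

```python
def staged_rollout(hosts, apply_update, is_healthy, stages=(0.01, 0.1, 0.5, 1.0)):
    """Update hosts in growing batches and stop at the first unhealthy batch.

    Returns (updated_hosts, completed), where completed is False if the
    rollout was halted before reaching every host.
    """
    updated = []
    for fraction in stages:
        # Cumulative number of hosts that should be updated after this stage.
        target = max(1, int(len(hosts) * fraction))
        batch = hosts[len(updated):target]
        for host in batch:
            apply_update(host)
            updated.append(host)
        # Check the hosts we just touched before widening the rollout.
        if not all(is_healthy(host) for host in batch):
            return updated, False
    return updated, True


if __name__ == "__main__":
    fleet = [f"host-{i}" for i in range(100)]
    touched = []
    # Hypothetical faulty update: host-5 breaks once it is patched.
    done, ok = staged_rollout(fleet, touched.append, lambda h: h != "host-5")
    print(len(done), ok)
```

In this toy run the faulty update reaches 10 of 100 hosts before the rollout halts. The trade-off discussed above is visible in the structure: every stage adds a health check, which costs time that a shrinking time-to-exploit window makes harder to afford.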
Kevin Frazier
And so, looking also a little bit ahead, if we think about us all becoming more reliant on agentic AI. Right now we're living purportedly in the year of the AI agent. I think all of us are waiting for an AI agent to book our travel and to plan my syllabus and so on and so forth. But soon, having autonomous AI systems that can complete tasks on your behalf will become ubiquitous. And all of us, you and me, may have dozens, if not hundreds, of AI agents acting on our behalf. How is cyber shaping that space? Because, as you noted, there's the possibility of having some of these bad actor agents that, when they engage with your agent, may say, hey, what are you actually trying to do? Let me take that in a really nefarious direction. Is this an emerging field of cybersecurity that you think policymakers should be more attentive to and, in particular, AI developers should be more attentive to?
Caleb Withers
Yeah, I think so. I mean, this comes back to the email and phishing discussion we were having. You know, we were mentioning it might be hard enough with all the phishing emails that people are getting. Of course, people are actually quite excited to offload some of the effort of managing their inbox to these AI systems. And, you know, I think it's not controversial for me to say that these agentic systems are showing some, you know, impressive but nascent capabilities.
But these capabilities are probably outpacing their ability to be relied on, with any degree of confidence that if these systems are interacting with adversaries, they're going to be reliable. And so, you know, as is often the case with the early days of various technological paradigms, yeah, there's a lot of opportunities and risks around thinking about how we can actually have some degree of security given this. And so to offer some concrete examples here: at the moment, I have my laptop. There are some categories of emails where I'm actually not that worried if an adversarial person was trying to deal with them. If an adversary wants to modify my grocery shopping list on Instacart, you know, that'd be annoying, but actually probably just a dozen avocados.
Kevin Frazier
Things.
Caleb Withers
Yeah, but there's probably a tolerable risk. On the other hand, you know, I also have confidential information on my laptop that would be a big deal. And so, you know, it's already the case that if I'm, you know, a software developer, I might have sort of virtualized systems for, you know, running less trusted things or mitigating the blowback if there was to be some security threat. But in terms of just, like, everything on my laptop in general, you know, consumers generally don't do that. And so, you know, the thinking is, is it going to be the case that in the future we'll have two somewhat segmented inboxes, the one that deals with stuff that actually isn't that big a deal versus the one that, no, we shouldn't let the AI touch this? Another example that comes to mind here is that a lot of the agentic computer-using AI systems are running this sort of model of: get a screenshot of what's on the screen and then be like, click this coordinate and do this thing.
And that's a great way to have flexibility, because you don't have to say, okay, well, here's how you use this piece of software and that piece of software, and here's the commands you can and can't run. You can try and prompt that they can only do certain things. But ultimately, if you have an agent that can use your computer and type anywhere and click anything, it can do anything. And it might be a little bit more annoying or harder if you are, you know, building in these guardrails and what affordances you do and don't have around these computer use systems. But that's an example of how you can, you know, make trade offs between doing the easy way and investing in the slightly more secure way that might take some more effort. And, you know, whether it's users or developers or, you know, government deciding what should we procure and what might there be a public good case for investing in R&D in this direction, I think security and reliability stand out to me more so than just capabilities in general, because people will push the capabilities forward. We don't have to worry about that as much.
Kevin Frazier
Bad actors are going to be bad. We know that; we can make that bumper sticker. But I love your point too that security always involves trade offs. If you want to have a more secure email, well, then you may not want to make it as available to agentic systems, for example, that can send those sorts of high-end phishing scams. So you may subject whoever sends an email to that inbox to a higher degree of security: you want to know certain information, you want more metadata, so on and so forth. In the same way, you may have that kind of dumb, low-barrier-to-entry inbox for all the sort of, oh, new deal at this thing, or managing your Amazon purchases. And it's really interesting to think through how we may have more opportunities to segment where we're okay with some degree of exposure and where we're not. But I wonder how you're seeing some of the labs start to respond to those sorts of trade offs. Because what we're describing right now is kind of a market opportunity. You can be the provider that tries to offer the most secure, most reliable agentic email checking system and so on and so forth. Is there a market that we're already seeing on the private side of folks trying to be the most cyber-forward AI companies?
Caleb Withers
Yeah, I think so. Some things that come to mind here: I mean, I don't want to undersell that the developers are making some effort here in terms of some of these problems I pointed out for these computer-using agents I'm referencing, or just agents in general. You'll see when OpenAI, for example, puts out the model card, they'll say, we're running classifiers over every interaction to try and identify, have sort of multiple layers of redundancy, and having not just the model itself identifying that there's something going on here that shouldn't be, but then another model sort of looking at the model saying there's something going on here that shouldn't be. And to your point about trade offs, an interesting one that I saw here was, there's this debate about website owners: do they want AI models to be able to consume that information? And one trade off I've seen here is that, if you're a website owner, you can say, I don't want an AI model to read the text.
Now, a trade off here is that, if one of the things the AI model is doing is sort of saying, is this a malicious or dangerous bit of the website that the user's asked for? If it respects that, and arguably it should, I've seen this thing where you'll get a warning. You'll sort of say, give me the text from this website. And then the model will say, okay, well, here's the text from the website, but by the way, consume at your own risk, because we respected not running the model on that website. So trade offs, I think, abound. Yeah. And, you know, I'm mentioning the sort of agents and email side of things here, but, you know, we've also seen, I think, most of the frontier developers and also the general key players in cybersecurity software, you know, they're offering all sorts of interesting things in terms of models that are, you know, fine-tuned to be particularly good for cybersecurity defense use cases and that sort of thing. So, you know, people are doing stuff here.
Kevin Frazier
Right. Because clearly if you're going to make an AI tool available, let's say, to any provider of mental health services or physical health services or lawyers or other key professions, they're going to be asking these questions of how resilient is this tool to cyber attacks. And you're going to need to provide some assurances there if you want to get traction in that market. So beyond the labs themselves having this clear incentive, are we seeing, from a policy standpoint, and let's start domestically here in the US, are there states, are there bills pending before Congress that are trying to address this issue? And then looking more broadly, do we see countries around the world taking action in this regard, or is this one of the back burner AI issues for right now?
Caleb Withers
I mean, in the scheme of things, it feels relatively up there, in that I've talked to some policymakers in the US in the current administration, and there's this whole debate about how much people are too worried about the risks and downsides of AI versus we just need to let people push innovation forward. And firstly, of course, that's not just a binary trade off. But with that being said, something I'll often hear is people say, oh, I'm not worried about most of the things that people are hyping up, but there's probably a there there in cyber. Just recognizing that, as you said right at the start of our conversation, the stakes of cybersecurity and cyber attacks are really high. We're seeing some interesting things that the models can do here. So I think there's a degree of attention here. In terms of the what to do, you know, I think there are some things that make sense to do, but also probably the thing I'm most excited about is just improving evaluation of what these models can do. And so I'm a big fan of CAISI here, for example, and I'm glad that they exist and are looking at stuff like this and working with AISIs around the world around what can cyber models do. Because I think a year or two ago you'd see some sort of research paper and it would say, we asked the model some sort of, like, multi-choice cyber skill questions. That's interesting work worth doing, but it doesn't necessarily really, you know, provide information around how helpful is this for offenders versus defenders? How much does this compare to what a human can do versus can't? How expensive is it versus, you know, having cyber attackers and defenders? So building up that responsive evaluations and monitoring ecosystem I think is pretty important here, and I think promising in the sense that...
It can be overstated. I would not want to gaze in a crystal ball and take bets on what exactly the cyber domain is going to look like five years from now given AI. At the same time, there's this thing we see of, if a model can do something, like, some of the time, or expensively, or the closed models can do it but not the open ones, the pretty reliable bet is, okay, at some point, open, cheap, foreign, not that hard to use, relative to ones that might require a bit of knowledge about how to best work with AI models, you do eventually see that diffuse over the coming year or two. So you can't see all the way into the future. But I think you can actually see a little bit of the way into the future if you look at what models can do now or how people are trying to use them. And then that probably informs, or at least gives you a little bit of warning of, oh, hey, there's something that's a little bit different here that might require a little bit of a different policy response than the usual.
Kevin Frazier
Yeah, that's really interesting to put an emphasis on those early capabilities that we're seeing from labs that may not get the sort of headline news of, oh, well, this model can do this thing 20% of the time. You're thinking, huh, if it can only do it 20% of the time, who really cares? Let's just address other issues. But to your point, as we see these developments and as we see more and more bad actors get AI savvy, then whatever is possible 20% of the time today may drastically improve by just the next model release. And so having that sort of aggressive, proactive cybersecurity posture seems really important here. From an "are we taking this seriously enough" perspective, would you like to see more of an emphasis on cybersecurity from a policy landscape, or do you think it's the right balance so far?
Caleb Withers
I think there's a challenge here in that the demands on the cybersecurity community, be that sort of policymakers or practitioners, are already pretty high. People are already pretty stretched thin. And so saying you have to divert a bunch of resources to focus on AI things that might be coming down the pipe, there's trade offs there. I think the thing that I would be most excited about is, as I mentioned, making sure that there really is that strong evaluative function going on both within industry and within government. And also a sort of, and maybe this is a bit of a fuzzy thing to point at, a willingness to move fast and turn on a dime and sort of think about what would be worth doing if we saw certain things. Because, I mean, at the moment people talk about, should there be some sort of federal regulation of AI systems with regards to these risks? And I think people have maybe pretty reasonably advanced arguments that it might be a bit premature, you might not end up with the exact shape of things that you want if you sort of move too quickly in this regard. At the same time, if the thing you're hoping happens is, okay, if and when we see something that does flip the game board of what cybersecurity looks like, given where AI systems have gone, then we'll take, you know, five years to legislate a response, you know, that's not going to work. So, you know, I think it's a matter of just paying attention and thinking about what might be justified. You know, people talk about, in policymaking, the sort of, you know, the piece of paper in the drawer ready for if and when things are being great. So I think some of that sort of thinking is important.
Kevin Frazier
I love this answer for so many reasons, not only because it aligns a lot with what I heard from the three economists we had on the podcast a few episodes ago, basically saying, look, we just need scenario planning. There's not enough information out there right now to know what is the definitive hard-law statutory approach that we want to enshrine for the next decade to mitigate against some of these AI cybersecurity concerns. But why not start to iterate on, okay, if we see these capabilities at this point, here's how we would like to respond. And that just sort of playbook of possible scenarios isn't our usual approach to policymaking. It requires a degree of creativity and flexibility, like you said, that isn't the sort of stuff you commonly associate with D.C. or state capitals. So if you were to wake up tomorrow and you had a magical wand of either a policy solution or you could just host a convening or host a roundtable, who would you want in the room? What would you want the agenda to be? What would that sort of immediate next step be, based off of all of your impressive research on this topic?
Caleb Withers
I think the thing that comes to mind here is, and it's a bit of a cliched answer, but just getting people from the different communities, be it the AI community more so, or the cybersecurity community more so, and then also the government sort of side of things, together to, as you sort of say, think about, what are the sort of scenarios that are worth paying attention to? And I think there's two important questions when it comes to looking at evaluations of AI models and whatnot. One is, what should we be evaluating for? But then also, is anyone going to actually listen or find those evaluations compelling? And so if I sat here and said, well, I have some things I point to in the report where I'm like, I think if AI models could do this, it would be a really big deal. But then if policymakers and the cybersecurity community and other experts are sort of like, well, I don't actually think it would be a big deal if AI could do this, well, then before we even set up the evaluations ecosystem to find these things, we want to make sure that these things are actually decision-relevant and compelling, and sort of tease out some of the assumptions that people might have of, oh, actually, there should be some federal regulation of AI models that can do this thing or not. So, yeah, a bit of a cliched answer, but convening those sorts of folks seems good.
Kevin Frazier
That's fascinating to think about: the perhaps over-proliferation of evals in response to this, of, oh, we need more information. So then everybody says, oh, I've got a test for this cyber vulnerability. Oh, I've got a different test for this other cyber vulnerability. You start to get that mentality of, if an eval lands in a forest, does anyone hear it, right? Does anyone even care what that information is? But how you get consensus around those core factors of when we have crossed a dangerous point is obviously something that I think everyone would want to agree on if we want to avoid a world in which cybercrime costs $10 trillion per year.
Caleb Withers
Yeah. Especially because the pool of people who are really good at doing evals and thinking through this thing, it's actually a pretty lucrative skill set, because if you can set up an evaluation for, is an AI model good at this thing that actually matters, that would actually have some economic or strategic value if it could do this, that rhymes a lot with, what would it look like to set up a good training pipeline for models to do that? And if anyone's been paying attention to sort of the salaries and trends about how much value is placed on people that can nicely operationalize what it would mean for a model to be good at this thing that matters, there's a lot of lucrative opportunities there. So, yeah, without going all the way to that, oh, we need to sort of have some, like, central planning of what all the evals community is doing, it's a scarce skill set and a bunch of people. And so thinking thoughtfully about what are the things that are worth taking some big bets on and spending on is worthwhile.
Kevin Frazier
I think, given that, you could go start a consultancy and charge a much higher hourly rate than folks probably want to imagine. I know your time's valuable, so I don't want to steal too much more of it. But in doing this research, an exhaustive report, and I really encourage listeners to give it a read, what was the biggest misconception you flagged? If there was any vibe you want to call out or any just general sort of misinformation, right, not necessarily intended, but just something that you hear repeated nowadays that, as a result of your research, kind of makes you cringe, like, oh, no, that's just so inaccurate, I wish people would stop saying that. Did anything like that emerge from your research?
Caleb Withers
Yeah, I would say the one thing that stood out is, you know, there's this big ongoing debate and people like to point to, you know, is AI hitting a wall? Was GPT-5 a flop or whatnot? And, you know, obviously there's something to these discussions, and there's always the question of, even if we've seen strong AI progress to date, you know, to what extent can this be expected to continue into the future?
With this being said, the cyber benchmarks I looked at that seem most compelling have been reliably going up, you know, over recent years. If I was to think of, you know, ways that I could continue to see them going up in the months and years to come, I think there's a lot of opportunity still there for them to keep doing so. So, yeah, I mean, I'm not sure whether this is a contrarian or uncontrarian take, depending on who you're talking to. But yeah, I think that AI models are getting better at cyber stuff pretty rapidly, and I think this will continue for at least a few months to years.
Kevin Frazier
Yeah, I think the general sense of don't bet against AI right now is a pretty good gamble on most fronts in terms of capabilities. I have a lot of tired lines, but one that I definitely repeat frequently is always remembering that today's AI is the worst AI you'll ever use. And I think also acknowledging the fact that we're seeing the development and fine tuning of models democratize across the globe. So there are certain models, certain capabilities we may never learn about through these more centralized evals, for example. And so that, to me is something else that would keep me up at night as a cybersecurity scholar. But I want to give you the final word here. Any other key takeaways that you want listeners to know from your report?
Caleb Withers
No, I would just double down on that. Paying attention to what AI models can do in the cyber domain and doing so in a thoughtful way, I think is pretty important.
Kevin Frazier
All right, well, Caleb, we'll have to leave it there. Thanks so much for coming on.
Caleb Withers
Thank you.
Kevin Frazier
Scaling Laws is a joint production of Lawfare and the University of Texas School of Law. You can get an ad-free version of this and other Lawfare podcasts by becoming a material supporter at our website, lawfaremedia.org/support. You'll also get access to special events and other content available only to our supporters. Please rate and review us wherever you get your podcasts. Check out our written work at lawfaremedia.org. You can also follow us on X and Bluesky. This podcast was edited by Noam Osband of Goat Rodeo. Our music is from Alibi. As always, thanks for listening.
Date: December 5, 2025
Host: Kevin Frazier (AI Innovation and Law Fellow, Texas Law; Senior Editor, Lawfare)
Guest: Caleb Withers (Research Associate, Center for a New American Security)
Main Theme: The evolving relationship between advanced “frontier AI” systems and cybersecurity—the shifting offense-defense balance, new attack vectors, response strategies, and policy implications.
This episode explores how cutting-edge “frontier” AI models are disrupting the established dynamics in cybersecurity. Host Kevin Frazier and guest Caleb Withers dive into Withers’ research and recent report analyzing the impact of new AI capabilities in the cyber domain. They unpack what’s genuinely novel with generative models, how old vulnerabilities persist, escalating pressures on defenders, and what policymakers and industry should do to adapt for the coming wave of AI-driven cyber threats.
Definition:
How the Offense-Defense Equation Shifts:
Historically, AI favored defenders by scaling protection and automating routine tasks.
But new models offer attackers “returns to scale”: they can launch more sophisticated, higher-volume attacks cheaply and quickly.
“The approach my report takes is to sort of say, is there any reason to expect that this might not hold or that this time might be different for some of the current gen AI capabilities we see?” (Withers, 10:30)
Rising costs for defenders: Running models at defense scale may become prohibitively expensive—as attackers can iterate and experiment at marginal cost.
Phishing, especially hyper-personalized and in non-English languages, is being supercharged.
Human vigilance is still critical, but the cognitive/“cyber hygiene” burden on users increases, putting defenders at a disadvantage.
U.S. and International Policymaking:
Scenario planning, not static laws:
On the offense-defense “frontier AI” shift:
On persistent vulnerabilities:
On phishing and human-oriented attacks:
“Have we seen true transformation of cyber attack yet? My answer is not yet, except for this asterisk of, as you said, phishing for exactly the dynamics you pointed at.”
— Caleb Withers (18:00)
“At a certain point... there’s not going to be much that a sufficiently sophisticated AI phishing campaign... can’t do.”
— Caleb Withers (19:56)
On agentic AI risks:
“Capabilities are probably outpacing their ability to be relied on with any degree of confidence... as is often the case in the early days.”
— Caleb Withers (34:32)
“It might be a little bit more annoying... but that’s an example of how you can make tradeoffs between the easy way and investing in the slightly more secure way.”
— Caleb Withers (36:42)
On evaluations and policymaking:
On future AI progress in cyber:
On practical recommendations:
| Timestamp | Segment / Key Topic |
|:----------:|:----------------------------------------------------------------------|
| 04:24 | Introduction of Caleb Withers and framing of the AI-cyber report |
| 05:44 | Pre-generative-AI: how ML and automation have supported cybersecurity |
| 07:48 | What is “frontier AI” and why it changes the landscape |
| 09:28 | How and why frontier AI changes the offense-defense calculus |
| 13:15 | Persistent “stubborn” vulnerabilities in code, e.g., SQL injection |
| 15:27 | The double-edged role of AI in exposing/fixing old vulnerabilities |
| 18:00 | GenAI’s impact on phishing: hyper-personalization and language |
| 23:00 | Time-to-exploit trends, the shrinking window for defenders |
| 24:27 | Hypothetical: LLM watches GitHub to instantly exploit new patches |
| 33:05 | Emerging risks with agentic AI systems (autonomous agents) |
| 41:39 | How AI labs are responding with layered security, market incentives |
| 42:29 | State of AI cybersecurity policy in US and internationally |
| 44:41 | The importance of rigorous model evals, scenario planning |
| 48:51 | Policy advice: playbooks, flexibility, scenario-based approaches |
| 50:08 | The need for convenings & consensus on what to evaluate and why |
| 54:12 | Misconceptions: AI’s “progress slump” versus ongoing cyber gains |
| 56:16 | Closing takeaways: remain vigilant and proactive |
This episode is a must-listen for policymakers, technologists, and anyone following the intersection of AI and cybersecurity. Caleb Withers asserts that “frontier” AI models catapult both old threats and new capabilities forward—making the offense-defense dynamic less stable and more resource-intensive than ever. The big takeaways: scenario planning, flexible policy frameworks, collaborative evaluations, and relentless vigilance are the best ways to prepare for the looming AI-driven cyber frontier.
Recommended companion reading:
Find more at: lawfaremedia.org
Contact: scalinglaws@lawfaremedia.org
(This summary omits ad breaks and non-content segments for clarity and depth.)