
Lawfare Host
This fall marks 15 years of Lawfare, and we're celebrating the only way we know how: by gathering our community of readers, listeners and contributors for an in-person celebration in Washington, D.C. Get your tickets today at lawfaremedia.org 15 years.
Alan Rosenstein
It's the Lawfare Podcast. I'm Alan Rosenstein, Associate Professor of Law at the University of Minnesota and a Senior Editor and Research Director at Lawfare. Today we're bringing you something a little different: an episode from our new podcast series, Scaling Laws. It's a creation of Lawfare and the University of Texas School of Law, where we're tackling the most important AI and policy questions. From new legislation on Capitol Hill to the latest breakthroughs that are happening in the labs, we cut through the hype to get you up to speed on the rules, standards and ideas shaping the future of this pivotal technology. If you enjoy this episode, you can find and subscribe to Scaling Laws wherever you get your podcasts and follow us on X and Bluesky. Thanks for listening. When the AI overlords take over, what are you most excited about?
Lawfare Host/Interviewer
It's not crazy, it's just smart.
Alan Rosenstein
Just this year, in the first six months, there have been something like a thousand laws.
Lawfare Host/Interviewer
Who's actually building the scaffolding around how it's going to work, how everyday folks are going to use it?
Alan Rosenstein
AI only works if society lets it work.
Lawfare Host/Interviewer
There are so many questions that have to be figured out, and nobody came to my bonus class. Let's enforce the rules of the road. Steven Adler, thanks so much for coming on Scaling Laws, of course.
Steven Adler
Yeah, thanks for having me.
Lawfare Host/Interviewer
So it seems like yesterday we were listening to Senator Schumer go down an expert panel of witnesses for the Senate AI Insight Forums and asking people what their p(doom)s were: the probability that AI would cause some sort of existential threat or cataclysmic event that wipes out humanity, depending on who you ask. And folks had answers. Folks had assessments of, hey, you know, my p(doom) is at 15 or 2 or whatever percentage. Fast forward to today. We don't hear a lot of chatter about existential risk necessarily on the Hill, or it doesn't seem to be the paramount focus of a lot of legislators. So can you walk us through a couple things? First, what are we talking about when we're talking about AI safety? What does that term actually mean? And then let's move into a discussion of what's the current state of AI safety and sort of its political salience.
Steven Adler
Meta point. So pausing on substance for a sec, I assume it's fine for me to quickly give my background for context, or would you prefer to.
Lawfare Host/Interviewer
So I've recorded an intro before, introducing you as a former OpenAI safety researcher, and introduced your Substack. And then we can get more into your background later on, is the plan. But to the extent you want to weave that in, that's obviously fine as well.
Steven Adler
Okay, sorry. I distracted myself there for a moment. So the thrust of the question was, what has changed in terms of how people talk about it and for what reasons?
Lawfare Host/Interviewer
Yeah, why don't we just restart, if that's okay.
Steven Adler
Sorry. Yeah, sorry about that.
Lawfare Host/Interviewer
All right. Steven Adler, thanks so much for joining Scaling Laws.
Steven Adler
Yeah, of course. Thanks for having me.
Lawfare Host/Interviewer
So you've worn many hats in your career. You were an OpenAI safety researcher before it was cool, for lack of a better phrase. You've been thinking about AI safety for a long time, and we find ourselves at sort of an interesting point in the development of AI safety and its salience as a political topic. If we go back just a little more than a year, we can recall Senator Schumer asking a panel of experts what their p(doom) was, their probability of AI leading to some sort of catastrophic harm. But if we look at today instead, the focus seems to be on some of these national security concerns and achieving, quote unquote, AI dominance. So given that you've had a longer history in this field than most folks, I'm really excited to hear how you define AI safety, because I think that term carries a lot of weight in a lot of conversations. And then if you could walk us through how you would describe the sort of evolution of AI safety since the initial introduction of ChatGPT.
Steven Adler
I tend to think of a slightly broader category, which I would call impacts of AI, and there are three big questions I think about. One is about the geopolitical struggles over AI, largely between the US and China. Will there be some sort of conflict over who develops AI, how they use it, their relative positioning, and could that cause a lot of harm? There's a second question about how we keep control over AI systems in terms of ruling out the worst ways that they might be used. A classic example here is about bioweapons. Could an AI system that is very high capability, useful for scientific research, empower many more people to do dangerous things with bioweapons, even though it's in violation of international law, and make it sufficiently cheaper and more accessible that we have a serious problem on our hands? And then there's a third question, which is: let's say that powerful AI is developed and there aren't major power struggles along the way. It's still going to change society in really profound ways. We will need to think about labor and what the purpose of being human is and all of these sorts of things. So those are three really broad categories. Often when people say AI safety, I get the sense that they mean that second category, which is about ruling out the worst ways that AI will be used to cause harm. There's a framework from Google DeepMind that I like on this, where broadly you can think of a few categories of ways that AI systems could cause large-scale harm. One is misuse. This is the bioweapons example I gave. Maybe it empowers non-state actors to do much more harm than they could before. You can also think of a category of accidents. These AI systems are increasingly getting integrated into the military. They're going to be used for scientific research. Could there be mistakes along the way, where we think an AI system is safe and robust for a certain purpose, it turns out not to be, and something very bad happens as a consequence?
There's also this category of misalignment risk. If you think about the AI systems today, they are agents being trained to pursue goals in the real world, today often pretty narrow goals like booking a hotel room for you. But the AI companies want to have their AI systems go and pursue bigger, more open-ended goals. A common one is to solve cancer. And so as we give AI systems more and more responsibility, maybe they will pursue these goals in ways that we don't like, that are contrary to our interests. We can talk about why this happens and some of the evidence, but you might end up in a situation where AI is an adversary to you, or at least doing things you really don't want. Sometimes people think of AI safety as tone policing: don't say bad words. I think that is a very watered-down version of the problem that matters. Often that is more companies doing brand safety. They want their reputation to be good, they don't want to offend the administration, they don't want to offend customers. But it's less about how AI might be used to cause real harm in the world as a powerful technology.
Lawfare Host/Interviewer
And I'm interested in your take on the fact that even with this administration emphasizing, quote unquote, AI dominance as its goal for AI policy, making sure that the US stays ahead of China, making sure that the US is the leader in all frontier AI development, we still saw in the AI Action Plan this concern around CBRN risks: chemical, biological, radiological and nuclear risks from AI development. So for you, looking at that focus, even though we are no longer talking about p(doom)s on the Hill necessarily, is that enough of a focus for AI safety from a political viewpoint? Are you, for lack of a better phrase, satisfied with this awareness of those worst-case harms from a political vantage point? Or would you like to see policymakers being more attentive to the full spectrum of AI harms you just outlined?
Steven Adler
There's a really broad range of risks besides just CBRN ones. I'm certainly glad those are getting attention. I'm also glad there's a recognition that, even if the US government wants to be ahead of China on AI development (which I think is a reasonable goal, though it might have some unfortunate consequences that need to be managed), if you go too fast and pursue AI development with too much abandon, without thinking about what are reasonable safeguards, you might end up with the whole situation blowing up in your face. There are a bunch of ways this could happen. It could be that an AI system has these dangerous capabilities, like the ability to do scary bioweapons things, and it might be stolen. A kind of scary fact is that none of the frontier AI companies today believe that they could withstand the force of the Chinese government if the Chinese government wanted to steal one of their AI systems. It might be costly, it might be difficult, but certainly OpenAI and Anthropic have not made these claims.
Lawfare Host/Interviewer
And just to clarify on that point, you're saying none of the companies believe they could withstand a sort of cyber attack from the PRC trying to gain access to the model weights, for example.
Steven Adler
That's right, yeah. Thank you for clarifying. To maintain dominance, suppose that you have the best technology system in the world, the best AI system. You need to actually maintain control over it. And if it can be stolen from you, so long as the opposition can run it, which generally they can (they will also have large computer clusters to be able to run the software), you no longer have a monopoly on it. And so to create a very powerful AI system is to create risk that your adversary can get their hands on it. Notably, if you go ahead and you open source the AI system, or really more open weights, where you put the big file that represents the AI system on the Internet, maybe not the full software that created it, but the computer brain itself, you are giving your adversary direct access to it, and so it doesn't even need to be stolen in that case. It still might be worthwhile. There are lots of reasons to consider this. I think open source is generally a great thing, but it does come with some amount of risk.
Lawfare Host/Interviewer
So with respect to those harms, I'd love to hear some more tangible examples that you think should be top of mind for policymakers or anyone who cares about AI development. What would you list near the top of your concerns from an AI safety perspective? What keeps you up at night, if anything? And specificity is encouraged here.
Steven Adler
One big category is keeping physical control over your AI system in the way that I alluded to. And so: real security standards for AI companies building at the frontier, with really aggressive pen testing standards (penetration testing, people trying and really throwing resources at breaking in and stealing), and real audits of insider threat models. Could your employees, if they in fact weren't loyal to you, steal your technology and give it to an adversary? So there are these security standards. And then, related to this, there is one big risk of an AI system if it were misaligned, if it did develop goals that were not exactly what you wanted. We're seeing all sorts of evidence of this today with the systems that exist; they aren't yet powerful, but they do seem to have different weird goals that we don't fully understand. Could an AI system break out of your computers, at which point it is a rogue virus on the Internet and you would have a very hard time getting it back? People often wonder, why can't you just unplug your AI if it is misbehaving? And the answer is, if it misbehaves in a way where it breaks out of your computers, you no longer have the ability to pull the plug. You could turn off your data center, but your AI system no longer lives in your data center, unfortunately. And so it's a bit too late. So these are questions of keeping control over your AI system. There's a separate question about, okay, suppose someone with an AI, or even the AI system itself, a future AI system that is more of an agent and takes actions across the Internet: what can it actually do that is harmful? Like, great, it's roaming free on the Internet. Why is that scary? And the answer is generally cyber attacks, as you've somewhat alluded to. Maybe AI systems can launch attacks on critical infrastructure that we rely upon, like the power grid or financial systems, and cause a bunch of havoc.
Maybe they can be useful for CBRN risks, to do all sorts of dangerous things in the physical world that we aren't used to contending with. And suddenly you have many more people capable of causing physical harm to humans by developing new molecules and spreading them throughout the world. There are lots of people in the world who want to cause harm, and thankfully today they are largely pretty limited. The defensive standards that we have in place are generally pretty effective. We deter people. There's just not that much that different people can do. But if you really amplify the offensive capabilities of new science and new warfare, you need to make sure that you have stronger defensive capabilities to go with it.
Lawfare Host/Interviewer
So you've been in the literal belly of the beast. I mean, you were working at OpenAI, you were working on multiple different safety fronts throughout your career at OpenAI, and you saw the inner workings of this lab that was founded on a safety narrative, a sort of: how do we make sure that if we develop the most sophisticated AI, if we achieve AGI, we do so in a way that benefits humanity? And so I'm curious, from that vantage point, why isn't OpenAI, why aren't any of the frontier AI labs, taking the requisite steps with respect to physical security? Let's just start there, because you could imagine there's a clear market incentive to try to make sure that your model is protected, to try to prevent Chinese hackers from accessing your model weights, for instance. Why aren't we seeing sufficient safeguards being taken? Is it a matter of just, well, the Chinese are too good; if they're going to hack, they're going to hack, and no one can defeat that? Or is it a sort of, well, maybe the government's going to come in and help us out anyway, so let's not spend our money there? Why aren't we seeing the requisite steps being taken, in your opinion?
Steven Adler
There are lots of ways that safety and security aren't compatible with the incentives of a frontier lab. Fundamentally, it's just really, really hard to secure your technology at this level, and it entails a bunch of trade-offs. So take the case of OpenAI. They have this software that they want the world to be able to draw upon in all these ways. They need it to be accessible and usable to external users. They need it to be usable and modifiable by hundreds of researchers within the company, moved from computer to computer. It's just a really, really large-scale problem. And to slow down to solve it would mean potentially letting competitors who aren't as concerned about these risks jump ahead of you. There's a point I often make in terms of safety, which is that the policies and practices of one AI company on safety are absolutely not what they would choose if they didn't have competitors nipping at their heels. The unfortunate reality is, if you do slow down to pursue safety or security and you say we're not going to move forward until we solve these problems, you can't guarantee that other AI companies will be as judicious as you, and they might leapfrog you, or if they were behind, they might catch up to you. And so what all of these companies are balancing is: what are the safety and security measures that hopefully are effective enough, are cheap enough, and don't compromise our lead? But fundamentally, the way they think about this is, if you give up your lead, if you give up your spot at the frontier, you have lost your influence. And so it does little good to have secured your models against adversaries, other states, whatever it might be, if in the process you are no longer a big, important AI company and you don't have the voice or influence to help direct how this all goes. You have given up your influence in the process.
Lawfare Host/Interviewer
So we've heard about these sort of racing dynamics before on the Scaling Laws podcast, of having different AI labs seemingly pushing one another to move ever forward with the AI frontier and asking questions after the fact. Of course, one of the big pushbacks to that argument is, well, if we stop, is China going to stop? Is Russia going to stop? Is, you name the adversary, going to stop their own AI development? So how does the national security picture and the tumultuous geopolitical scene factor into your analysis of what you think a responsible national policy would be here, with respect to imposing, for example, some sort of security mandates on these AI labs?
Steven Adler
I call this regretful racing: the idea that each of the individual actors might reasonably want to slow down and not be racing each other, but of course they largely can only fiat their own actions. And so, given that, they might determine their best move is to race. I think fundamentally the way to change this is to explore treaties that impose certain safeguards, certain minimum ones, ideally verifiable, that both parties, be they companies, be they countries, can enact. And this means that they don't pursue the forms of powerful AI development in quite as unchecked a way. I think there's a thing that people often get wrong about the US-China AI competition, which is they call it a race. I think this is just totally mistaken. In a race, you win by being the first to cross the finish line. In a simple game, say you're playing Connect 4, you win by getting four in a row. It doesn't matter if your opponent would have gotten four in a row on the next turn; you have won, the game is over. And that is just totally not how powerful AI development works. We have seen the lead between different companies go back and forth. One company being the first to a certain level of AI, or one country being the first to a certain level of AI, doesn't mean that they have the perpetual lead. They still might need to worry about getting caught up to by the other player. Or maybe, even if they are ahead of the other player, the other player has powerful enough AI to still cause a bunch of issues for them. So to make this more concrete, maybe the US forevermore will have stronger AI than China, but there is some powerful level of AI that the Chinese government can still reach that allows them to wreak all sorts of havoc through cyberattacks on US infrastructure. And so even though our AI system is stronger, it might just be really hard to defend all the surface areas that are necessary.
And so the goal for the US government can't just be to beat China in AI development, beat them to AGI or some level. The US government needs to actually contain the Chinese AGI development effort. And this is symmetric. The Chinese government likewise probably feels threatened by the US pursuing AGI development. And so what you kind of realize as you reason through this is that it's not an all-out race where one will win by getting there first. They each have an interest in containing the other. And that shifts the frame to how we can get international cooperation on this, like we have in the past on topics like bioweapons and nuclear non-proliferation, and just the whole regime of international treaties that bind countries to do things that might not be in their unilateral self-interest.
Lawfare Host/Interviewer
So we've seen some headlines emerge throughout this AI policy debate about, for example, whether Xi Jinping, the president of China, is actually more of a safety-forward person than perhaps many folks would suspect, somebody who's actually concerned about some of the loss-of-control scenarios you flagged earlier. But let's say, in a hypothetical world, the US and China miraculously get together, shake hands and say, all right, fine, we're not going to pursue AGI, we're going to work with our respective companies to make sure this doesn't happen. Could we make the argument that this actually benefits China's apparent strategy in this AI competition (let's now call it a competition, not a race)? We've seen China seemingly lean more into open sourcing its models, making those models more generally available to more people, and leaning into things like AI adoption by Chinese companies and by the Chinese people. So they may surrender and say: "All right, fine. Actually, we think we're behind on AGI and the pursuit of some of these frontier developments. Great. If the US is going to bind itself to not pursuing the best of the best AI, we're going to win the race, or win the competition, when it comes to diffusing AI and making the entire global community reliant on our models. So we'll actually take that win." How do you respond to this scenario, where moving away from the frontier might benefit China?
Steven Adler
Yeah, it's a tricky question. There are a few different dimensions to it. One thing I hear in this example is the assumption that the US could have pushed ahead and been the first to AI because China was bluffing, they weren't actually going to get there; that the US had a win and it squandered it. And I don't think that's right. I think that there is large risk in trying to build AI systems that are smarter than any human. Even if you weren't contending with another international adversary, the scientific community believes that these are unsolved problems. There's this global statement from many, many hundreds of AI researchers, the CEOs of the three top AI companies, founders of the field, some of the top-cited AI scientists of all time, declaring that AI is an extinction risk that should be treated like nuclear war and pandemics. This is a serious issue and we don't know how to pursue it safely.
Lawfare Host/Interviewer
This raises the point that we are a couple months now removed from GPT-5, which was heralded and advertised by OpenAI as something that was going to move the frontier substantially. We received press release after press release and blog after blog saying this is going to change the game, so on and so forth. And if we look at things like Epoch AI, we see that GPT-5 was indeed aligned with where we would expect the next model to be from OpenAI. And yet it's September 2025. The sky hasn't fallen. We haven't seen a huge loss-of-control scenario develop. With bad actors, for all intents and purposes, and obviously with some exceptions, we're not seeing some new terrorist organization form and harness AI. So what's your best response to the sort of, hey, you know, this AI safety community, they sound kind of like Cassandras. They've been warning about these x-risks since 2023, or even in some cases 2022, calling on us to pause AI development. And yet the world looks more or less the same as it did in November of 2022, with some exceptions. What's your best response to that sort of "where's the x-risk?" question?
Steven Adler
I don't think anyone has claimed that GPT-5 would cause an extinction-level event. I think in the minds of different AI safety people, this is still some number of years off. People disagree about how many and what exactly the milestone is, but it's pretty clear it's not going to be GPT-5. I think there are two things to emphasize. One is that we are already seeing pretty large-scale harms from the models of the most recent generation. Anthropic recently released an intelligence report that discussed how people are now using Claude to conduct large-scale cyber attacks, cyber ransom. And this is in contrast with how AI has been used in some past cycles, where it's not really doing the thing; it's helping a human a little bit, but it can't really be described as the main force behind it. It seems in this case the AI systems are now operating as the main force and getting some level of success. The second thing we need to consider is the effect of the safeguards that have been enacted by the AI companies because they are worried about some level of harm. And so, for example, both Anthropic and OpenAI, with their most recent models, have found through their evaluations that the AI systems, if unconstrained, if they hadn't taken steps to reduce this ability, could help an ordinary college student with some background in STEM do much more harm with bio or chemical weapons than otherwise. Still not trivial. You still have to get different materials. It's not automatic success. But they have found this, even though I think, especially for OpenAI, it's not in their interest to admit this. And if there were not risk here, I think that they would have happily concluded that we weren't yet at that window. And so the question I see is: it's great we have some companies being proactive about these risks, measuring them (although to different degrees of quality) and doing something about it. But we can't guarantee that other AI providers are going to do this.
And in fact, I suspect many AI providers, as they catch up to the frontier, are not going to put the resources into doing this form of testing and therefore won't take the mitigations. Let me give you an example. In the automotive industry there's a well-understood paradigm at this point of what it takes to have a car be safe. You understand you need to look at survivability when it crashes; you need to look at it rolling over. And in fact there are pretty standard ways of doing this testing, at least within the US: you drive into a wall at a certain speed, 35 miles per hour. And in the AI industry, what we instead have is some companies that don't want to get these results that say, oh, our car actually couldn't really survive impact at 35 miles per hour. So we won't push our tests as far as we can, because we don't actually want to know the risks of our models. We'll do the front wall test, but we'll drive into it at 10 miles per hour. Oh, look, that's fine. Or a company comes out with a new car and they're like, oh, there is no evidence that our car can roll over, so we consider it unnecessary to do this testing. And maybe they don't say it quite so directly. They imply it, they gesture at it, and it's just totally unacceptable. And in the automotive industry, if the car isn't sufficiently tested and there aren't enough safeguards, it's really sad if an individual consumer buys one of those cars and gets into an accident as a consequence, but largely it's on them. The AI companies, on the other hand, are producing products used by hundreds of millions of people each week. They affect many, many more people beyond the direct user of the AI. And so I think that there is more of an impetus to have companies do this testing and these mitigations, like Anthropic and OpenAI have done on these CBRN risks, to try to lift up the floor on the safety of any of these deployments.
Lawfare Host/Interviewer
So we love to think, and I hope our stats show (I'll go talk to our data people later), that there are folks right now who are on BART listening to this podcast, folks on the Acela corridor listening to this. And one of the guiding principles of good public policy is always weighing the costs and the benefits. And we've talked a lot about the costs and in many cases we've described scenarios with high-magnitude potential loss. You know, lots of lives being lost, huge economic damages, but for all intents and purposes, depending on who you ask, some lower rate of probability. And yet I wake up every Monday and my favorite email I get, full disclosure, is from the Center for Data Innovation. And they provide 10 examples of awesome AI use cases ranging from academic institutions like UT developing new materials that reflect heat off of buildings and therefore save on energy bills, to labs who are improving our ability to detect natural disasters and increasing our resilience when the next tsunami comes or next earthquake comes. How do you balance the theoretical, both costs and benefits, in this discussion? Because this is one of my gravest concerns, and I really enjoy talking with you and I really enjoy your writing because you, I feel like, are uniquely able to have reasoned discourse on this. So just thank you again for coming on, but how do you weigh that cost and benefit scenario when we know there are tangible benefits that could be achieved with AI right now? And the realization that with any emerging technology there's a real risk of foregone innovation, of missed opportunities to see what's the spillover technology, what's the new innovation that could have come about. So how do you weigh that tricky balance in your head?
Steven Adler
We need to make sure what we are weighing is the costs and benefits to society writ large, not just the pocketbooks of the individual companies who are making their own decisions where, to them, if they take safety or security more seriously and fall out of the race, they then aren't making the money. But that's different than whether, if they were all acting together without some of this competitive pressure, they would make a different decision. I think you're totally right that we need to be quantifying the cost of different safety interventions and we should absolutely be looking for ones that are relatively more effective than others. I think there are ones that are very, very cheap, by the way, that are not being pursued by companies today. And this is part of the reason why I just fear that we can't leave it to companies of their own volition to invest enough into safety. Just a quick digression. There's been a lot of discourse recently about sycophancy, this property of AI systems to tell people what they want to hear. And unfortunately it seems linked to at least a handful of different deaths in the case of ChatGPT. It's really, really sad. And this was a risk that was well known and understood in the AI industry, including by OpenAI. There were free evaluations available on the Internet that Anthropic had published back in 2023. OpenAI hadn't tested for these. When I went and said, "I wonder what I would find if I ran them," it cost maybe like 20 cents to run the test. Took like an hour of my time. And so AI companies, when they are making their own investment decisions about where to allocate resources, even interventions that are cheap, interventions that are on topics that they have said they care about, sometimes it just doesn't come out in their favor. They just are not measuring and caring about risks, even ones that they know and well understand.
I think more broadly there's an institutional design challenge here, which is how do we get, both at the company level and the country level, verifiable cooperation of sorts that lets everyone have this reasoned conversation about what does it take to safely manage these systems, what interventions are helpful, which are worth their cost, because not all of them will be, and make these decisions more freely without the competitive pressure. I spoke earlier about how you might be leapfrogged if you invest too much in safety. It's hard to test an AI model. You need to be really, really careful if you are actually interested in turning over every rock and figuring out what risks exist, and it might take you longer than a few days, which increasingly seems to be the speed at which the AI companies need to operate between having a system ready and wanting it to be live to the world in some form. And so it's unfortunate that we don't have the time to actually reflect on the benefits and costs and figure out, hey, maybe it would be better if there were more of a norm around waiting at least a few weeks. Because some of these tests, they just take a reasonably long time to run, but they are worth it. They're useful.
Lawfare Host/Interviewer
So one of my favorite reports, and I think a report that's received a lot of acclaim from everyone across the AI safety spectrum, let's say from the folks who are generally more pro-innovation to the folks who are more in line with slowing down or at least letting off the gas a little bit, is the AI Frontier Working Group report that was produced in the state of California as directed by Governor Newsom, looking at what would be a good way to govern frontier AI. And one of the recommendations was an emphasis on evidence-based policymaking. And I think that you and I might be aligned here, but I'm going to push you because I want to hear how you respond to this. If we are going to have evidence-based policymaking, I really think that there needs to be a far greater emphasis on both safety mechanisms and deployment efforts to really try to tally these costs and benefits. So let me give you an example because I think that this AI companion conversation and the fears of psychosis is going to be the thing that dominates headlines for a while now. Given that we've seen recent lawsuits dropped in that regard, and given that we're seeing more and more AI tools specifically designed for kids and deployed in schools, this seemingly will be a policy topic for a while. We saw the state of Illinois ban therapy bots as a sort of mechanism to try to prevent AI being deployed too quickly, too soon to folks who may be mentally vulnerable. As someone who was, I'll just say, mentally vulnerable: as an elementary schooler, I had anorexia, I had an eating disorder, and my least favorite part of the day was going to therapy. And if I could have just chatted with a therapy bot, my hunch is that that would have been a better experience for me as a fourth grader, rather than driving 50 minutes to go into downtown Portland and meet with a therapist. So can we align?
Perhaps we can agree on a greater emphasis among state legislators, among Congress, for robust institutional mechanisms to gather data. Because I just don't think we're talking enough about the fact that so much of AI regulation, in my opinion right now is based off of vibes. And I will fall on my sword the second we see that the costs of, for example, these AI therapy tools exceed the benefits. Great, let's ban them, or let's delay them, so on and so forth. But I just want to see the evidence. Am I wrong? Am I right? Where, where do we agree on this or disagree?
Steven Adler
Yeah, thanks for sharing your experience. I do think it helps to ground it. And yeah, bans are often inefficient. Sometimes they are still justified, but totally there is efficiency loss. I think you're absolutely right that there is not enough effort being made to pursue evidence. Quite a lot of untruth can hide in the statement "There is no evidence that," especially when certain players do not have an incentive to produce that evidence. A bit of research that I recently came across from Anthropic looked at the idea of hidden objectives in AI systems and whether they can be discovered. And this was a way of trying to ground this question that often feels really, really broad and hard to speculate about, which is: if AI systems have some secret goal, how can we actually know? And there's been research that finds that in fact these systems can be poisoned by bad data on the Internet. They can have a hidden objective and it's really hard to tell, and in fact it can persist through different forms of safety training. And so if these systems have a hidden objective, we really, really need to know about it. I love this grounding of the question. They said, let's design an experiment where we can approximate this type of thing happening. We will create these synthetic documents and feed them to the AI to try to get it to have a hidden objective. We will simulate the thing ourselves and we will run an actual test. I love that practice of gathering evidence on these questions and I wish that more groups were trying to construct these types of real-world analogs. Similar research from the group Palisade Research looked at whether, when AI systems like o3, a powerful reasoning model of OpenAI, are confronted with problems that they can't solve the right way, like defeating the best chess engines in chess, they will sometimes look to hack or do other things that we really don't want them to do to pursue these goals.
And so, you know, we're always squinting a little bit in terms of, hey, like, what is the external validity of this research? Have we made it approximate the real world thing that we care about as much as possible? But certainly these are solvable problems and I wish more people were going at them. On the mental health specifically, it does seem like different psychiatric facilities could be doing more to collate data on how often are chatbots a factor? It seems like there is a lack of data in the public sphere.
Lawfare Host/Interviewer
So you spent time in OpenAI and we discussed that you were a part of several different teams with respect to safety mandates and oversaw some of that early research. And I'm interested in how and why you think the culture has shifted. You have written about and talked about the fact that some of the early concerns about safety and some of the early prioritization of making sure that models were thoroughly tested before deployment seemingly isn't as pervasive as it once was at OpenAI. And what do you ascribe that to? Is it a sort of change in culture? Is it a change in dollar signs landing in the bank account? Is it a change in just how popular society sees AI? What, what gives? What's happening at OpenAI that you feel may be leading folks maybe further away from that founding mandate?
Steven Adler
Yeah, the OpenAI that I joined in late 2020 is certainly very different from the OpenAI today, both in terms of scale and the number of users it's serving and the product surfaces and also, unfortunately, some of the approach to safety. I think one fundamental thing: it really felt in those days like OpenAI believed in its nonprofit charter, that it was designed for public benefit. Its mission was to ensure that AGI was broadly beneficial to humanity. The mission was not to build AGI. There's this part of the charter called the merge and assist clause where OpenAI even said, you know, we recognize that race conditions between labs can be very dangerous and in the right circumstances, if we think that we might be in one of these races, we would even merge with a competing effort and just fully help them rather than competing against each other in ways that lead to cutting corners. Recently, OpenAI has of course tried in various ways to move away from being a nonprofit. The California Attorney General, I think, rightfully objected to this and it's a little unclear where it is going to net out. Unfortunately, I don't expect it to be with a strong public mission orientation. I think a factor in all of this is that as the scale of OpenAI's mission has grown, and as they have needed more and more computation to train these models, more and more financing of various sorts, you just end up needing to make different bargains to raise capital, keep your partners happy. OpenAI is very dependent on the computers of Microsoft, Oracle, increasingly other groups, even other groups around the world. And they want to earn a profit and they want OpenAI to pursue certain policies and behaviors and not others. And it's just really hard to keep all of that balance when there's so much commercial pressure on you to act in ways not in line with the nonprofit charter.
Lawfare Host/Interviewer
Within much of the labs, we can see some sort of safety team or consumer protection effort, or you name the team, they all have their own fun variants for you. Which of the labs do you think is doing the best work in this space to make sure they're going through their paces before deploying any AI model? And what do you think distinguishes them?
Steven Adler
So two things on this. One, there's a really great organization called AI Lab Watch. It's ailabwatch.org, from a researcher, Zach Stein-Perlman, who is really comprehensive about cataloging the safety practices of the leading AI companies. And so I would generally defer to his assessment. If anyone's looking for details, he's gone through it on a bunch of different fronts. I want to offer one slightly different frame, which is the public often thinks about risk at the moment of an AI model being deployed and accessible to the outside world. But in fact, the moments of highest risk might be before these models are deployed at all. And if we have a bunch of testing and other requirements as a necessary checkpoint before you can deploy your model to the outside world, but not before other types of risks, which I'll describe in a second, what you might end up with is more and more risk concentrated in these earlier moments, and companies put off doing the testing and put off deploying the model, but it hasn't actually solved the risk. And so earlier I referred to one important threat being: can an AI system break out of the AI company's computers if it were misaligned, if we tried to train it to have the right goals, but we failed at that effort, similar to how OpenAI didn't succeed at making their models not sycophantic, not telling the user whatever they wanted. And so the moment of heightened risk for a model like this, when it is trained, is the first time that your AI researchers and engineers start using it inside of your company's walls to write code, including maybe altering the security code that is responsible for keeping the AI system locked inside the box. And so even though the company maybe hasn't intended to sell the product externally, and they therefore haven't done these forms of testing, you can still have a lot of risk of the AI company losing control over its system in this moment.
To take the automotive analogy, you know, if a car company is driving its new car around the lot, again, that's kind of on it, right? Like maybe it will hurt some of their employees. There is some risk, but it is largely constrained to the group who is making those decisions. It isn't affecting people outside the walls of the company. And in fact, in the AI case, that is maybe the heightened moment. The risk of an AI system being used within an AI company still has grand effects outside of its walls.
Lawfare Host/Interviewer
So before we let you go, I want to run through a couple fun hypos because I'm a law professor and we love hypotheticals. So let's imagine that David Sacks, the crypto and AI czar, calls you up: "Steven, let's chat. I want to hear what are the one or two things you want me to do from my vantage point." What do you say? What are your go-to responses? "Mr. Sacks, I need you to..."
Steven Adler
I'm not sure that David Sacks wants to talk to me. If he does, I would, I would take the call. I'm sure there's, there's a person better positioned.
Lawfare Host/Interviewer
Don't. Don't fight the hypo, Steven. Don't fight the hypo.
Steven Adler
Lean in. I'm just saying I'll give the ideas; someone else who has a warmer relationship can deliver them. Yeah, I think the fundamental goal we need to get to is making it so that you can trust in the safety of an AI system, even if you don't trust the developer of it or even if you fundamentally mistrust it. And this applies both to the US and to China, who I understand have reasonable reasons to be skeptical of the other. It also applies to the different Western AI companies. An interesting thing, right? Like all of the AI company CEOs seem to deeply mistrust each other. OpenAI was founded because people did not trust Google DeepMind. There are like four different organizations that were founded because people did not trust OpenAI. And you know, maybe there is something there. Aside, though, from whether the groups are trustworthy, it's just, you can't, it's not sustainable to rely on personal relationships and this sense of trust. Executives change, circumstances change. You need to figure out how to have a system that actually works. And so to me, there are two core pieces. There's a scientific problem of what are the techniques that, if people were to use them, we can use to keep control over an AI system, either to understand have we succeeded at the goals that we have trained into it, or to stop it from pursuing behaviors that aren't in line with the goals that we wanted for it. So there's a scientific question and then there's an adoption question. How do you get everyone to actually go with it and make sure that they are going with it? And that will be easier if the interventions are cheaper. Certainly it's easier if they are verifiable and you can make sure that other people aren't defecting on you. But that is the core of the approach. And in fact, I see some promising signs that where people think there is not a ton to be gained from defecting, they can cooperate.
So, for example, the US and China having some sort of agreement to not use AI in the inner loop of nuclear command and control seems really good. That is like a very scary scenario. In fact, this is basically the Skynet scenario where AI has access to nuclear command and control. But you know, in this case there's just, I just don't think there's that much to be gained from giving your AI nuclear command and control. And so it's easy enough to swear it off. There's a lot of risk, there's not that much to be gained. Can we figure out how to transform other cooperation questions to this domain where similarly there's just not that much to be gained from defecting? And hopefully we can all, you know, look at the evidence and see that people are complying.
Lawfare Host/Interviewer
Well, you're welcome, Mr. Sacks, for that intel. I'm going to save my second hypo for what I hope is a conversation down the road, Steven, but for now I'll let you go. Thanks so much for joining Scaling Laws.
Steven Adler
Yeah, of course. Thank you for having me. This was a lot of fun.
Lawfare Host/Interviewer
Scaling Laws is a joint production of Lawfare and the University of Texas School of Law. You can get an ad-free version of this and other Lawfare podcasts by becoming a Lawfare material supporter at our website, lawfaremedia.org/support. You'll also get access to special events and other content available only to our supporters. Please rate and review us wherever you get your podcasts. Check out our written work at lawfaremedia.org. You can also follow us on X and Bluesky and email us at scalinglaws@lawfaremedia.org. This podcast was edited by Jay Venables from Goat Rodeo. Our theme song is from Alibi Music. As always, thank you for listening.
The Lawfare Podcast | September 12, 2025
Host: Lawfare Institute
Guest: Steven Adler, former OpenAI safety researcher
This episode of "Scaling Laws," a podcast from Lawfare and the University of Texas School of Law, dives deep into the evolving conversation on AI safety with Steven Adler, a distinguished former OpenAI safety researcher. The discussion traverses the meaning and real-world application of "AI safety," shifting legislative and cultural priorities, the balance of innovation and precaution, industry practices, and prospects for both policy and international cooperation.
[06:01] Steven Adler:
Notable Quote:
"Sometimes people think of AI safety as tone policing. I think that is a very watered down version of the problem that matters... It's less so about how AI might be used to cause real harm in the world as a powerful technology." — Steven Adler [08:31]
[09:12] Interviewer:
Notable Quote:
"None of the frontier AI companies today believe that they could withstand the force of the Chinese government. If the Chinese government wanted to steal one of their AI systems... Certainly OpenAI [and] Anthropic have not made these claims." — Steven Adler [12:17]
[12:53] Steven Adler:
Notable Quote:
"People often wonder, why can't you just unplug your AI if it is misbehaving?... If it breaks out of your computers, you no longer have the ability to pull the plug." — Steven Adler [13:54]
[16:41] Steven Adler:
[19:27] Steven Adler:
Notable Quote:
"The US government needs to actually contain the Chinese AGI development effort. And this is symmetric... What you kind of realize as you reason through this, is it's not an all out race where one will win by getting there first. They each have an interest in containing the other." — Steven Adler [21:24]
[28:53, 30:20] Lawfare Host/Interviewer & Steven Adler:
[36:23] Steven Adler:
Notable Quote:
"Interventions that are cheap, interventions that are on topics that [AI companies] have said they care about, sometimes it just doesn't come out in their favor." — Steven Adler [37:22]
[39:22 – 44:32] Lawfare Host/Interviewer & Steven Adler:
[45:33] Steven Adler:
[47:28] Lawfare Host/Interviewer & Steven Adler:
[51:02] Steven Adler:
Notable Quote:
"You can’t... rely on personal relationships and this sense of trust. Executives change, circumstances change. You need to figure out how to have a system that actually works." — Steven Adler [51:23]
On the broad risks of AI:
"Will there be some sort of conflict over who develops AI, how they use it... could that cause a lot of harm?" — Steven Adler [06:13]
On real-world dangers:
"Maybe AI systems can launch attacks on critical infrastructure that we rely upon, like the power grid... There are lots of people in the world who want to cause harm, and thankfully today they are largely pretty limited. But if you really amplify the offensive capabilities of new science and new warfare, you need to make sure that you have stronger defensive capabilities to go with it." — Steven Adler [14:31]
On evidence gaps:
"Quite a lot of untruth can hide in the statement, 'There is no evidence that,' especially when certain players do not have an incentive to produce that evidence." — Steven Adler [42:24]
On the future of cooperation:
"There are two core pieces. There's a scientific problem of what are the techniques... [to] keep control over an AI system... and then there's an adoption question. How do you get everyone to actually go with it?" — Steven Adler [52:12]
This episode offers a rich, nuanced, and highly informed exploration of AI safety from the inside out—connecting technical, organizational, and geopolitical issues. Steven Adler provides rare candor about industry incentives, the limitations of current policy conversations, and the urgent need for both rigorous technical methods and novel institutional frameworks. Listeners gain a sharper sense of the multi-dimensional risks and the complex balancing act facing regulators, researchers, and the public at large as AI advances rapidly.