
Loading summary
Alan Rosenstein
The Electronic Communications Privacy act turns 40 this year and it's showing its age. On Friday, March 6, Lawfare and Georgetown Law are bringing together leading scholars, practitioners and former government officials for installing updates to ecpa, a half day event on what's broken with the statute and how to fix it. The event is free and open to the public in person and online. Visit lawfaremedia.org ecpaevent that's lawfairmedia.org ecpaevent for details and to register.
Kevin Fraser
That new thing yeah, we've got it. The Drop by GNC Bringing you all the newness that matters. Hand picked by the pros who actually know what's up and what's proven to work. We keep you on top of the trends and dialed into what's next. Whether you're crushing it at the gym, leveling up your game or thriving every day, the Drop by GNC is where the latest solutions in health and wellness Plan first, nonstop innovation and fresh finds daily. Explore what's new and what's next on the drop by GNC Want to change the efficiency game?
Amanda Askell
AI IT Automate tedious tasks to spend more time on the future. Transform the everyday with Siemens.
Kevin Fraser
It's the lawfare Podcast. I'm Kevin Fraser, the AI Innovation and Law Fellow at the University of Texas School of Law and a Senior Editor at lawfare. Today we're bringing you something a little different. It's an episode from our new podcast series, Scaling Laws. Scaling Laws is a creation of lawfare and Texas Law. It has a pretty simple aim, but a huge mission. We cover the most important AI and law policy questions that are top of mind for everyone from Sam Altman to senators on the Hill to to folks like you. We dive deep into the weeds of new laws, various proposals and what the labs are up to to make sure you're up to date on the rules and regulations, standards and ideas that are shaping the future of this pivotal technology. If that sounds like something you're going to be interested in and our hunches, it is. You can find Scaling Laws wherever you subscribe to podcasts. You can also follow us on X and BlueSky. Thank you.
Alan Rosenstein
When the AI overlords take over, what are you most excited about?
Kevin Fraser
It's not crazy, it's just smart.
Alan Rosenstein
And just this year, in the first six months there have been something like a thousand laws.
Kevin Fraser
Who's actually building the scaffolding around how it's going to work, how everyday folks are going to use it?
Alan Rosenstein
AI only works if society lets it work.
Kevin Fraser
There are so many Questions have to
Alan Rosenstein
be figured out and nobody came to my bonus class.
Kevin Fraser
Let's enforce the rules of the road.
Alan Rosenstein
Welcome to Scaling Laws, a podcast from Lawfare and the University of Texas School of Law that explores the intersection of AI law and policy. I'm Alan Rosenstein, Associate professor of Law at the University of Minnesota and Research Director at Lawfare, and I'm joined by Kevin Fraser, AI Innovation and Law Fellow at the University of Texas School of Law and Senior Editor at lawfare. Today, Kevin and I are talking to Amanda Askle, who who leads the Personality Alignment team at Anthropic and is the primary author of Claude's Constitution, a more than 20,000 Word document that describes the values, character and ethical framework of Anthropic's AI model. We discuss why Anthropic chose virtue ethics over rigid rules, how the Constitution functions as both a transparency document and a training tool, the constitutional law analogies it invites, and the thorny questions around AI personhood, cultural universality, and whether Anthropic's constitutional vision can survive commercial pressures. You can reach us@scalinglawslawfirmedia.org and we hope you enjoyed the show. Amanda Askel, welcome to Scaling Laws.
Amanda Askell
Thanks for having me.
Alan Rosenstein
So you are the primary author of what you all at Anthropic are calling Claude's Constitution. It's an over 20,000 word document describing the values and character and ethical framework of claude. Before we get into the substance of what's in that document, it'd be helpful for you to give our listeners a sense of what is this document and in particular what role does it play both in the training of Claude and then in its ongoing operation.
Amanda Askell
Yeah, so this is a kind of, it's a long document that sort of tries to do a few things. So explain Claude's situation to it. So you know you're a language model, you're being deployed by Anthropic, but also to give it a sort of sense of like our vision for how we would like Claude to be in the world so how we would like it to interact with people, its relationship with like honesty, ethics, how it makes like hard trade offs. And so partly this is for transparency purposes. The idea is we want to not train anything into the model that goes against the Constitution. And so without this, people can't necessarily tell if a language model behaves in a way that is bad or unanticipated. They can't always tell if that was the intention of the person training the model or if it's just a mistake. And so the Hope is that people get more of a sense of if a behavior is not good in the model because training is hard, you can at least see that that wasn't our intention. And at the same time, because it plays such a strong role in training, the whole document is actually kind of written to Claude. So although it plays this transparency role, it's actually like Claude is almost like the primary audience because we have to use it during training to get Claude to understand and kind of create the kind of data that trains it towards these behaviours. So, yeah, it's a little bit of an odd document for that reason.
Alan Rosenstein
Yeah. And can you say more about how the document is actually used in training? And I ask because, you know, those of us who use these models, right, understand the importance of the right prompt and the right context. But usually when we want to steer a model, it's a very specific thing we're asking. We're not usually dumping 20,000 words of quite sophisticated and high level moral philosophy into this model. And so I'm just very curious from a technical perspective, how is this document actually used by Claude as the recipient?
Amanda Askell
Yeah, and we'll probably release more on this at some point because it's used in a few different ways. So we do have some forms of training that are just about getting the model to understand the document and kind of have seen it and know what it contains. But we also use it throughout the training process. So in supervised learning, you can give the model, you can actually get the model to generate messages or conversations that it might be relevant to, and then give it the full document and just be kind of think carefully about what you think the Constitution would say that you should do in this kind of case. And that can be used for SL data. But you can also, during reinforcement learning, give the model the full document and have it create rewards based on that. So you might give it a couple of different responses and then instead of saying, oh, which of these is better or which of these, I don't know, is more polite, you instead just say which of these is more in accordance with the Constitution and let the model do a bunch of thinking and create your reward signal that way. So, yeah, I think that partly the Constitution is a response to the fact that models are so much more capable now that instead of just giving them, I think we're very tempted to give models very little information when actually, especially as they get smarter, they actually benefit from you giving them as much context as you can. I could see that slimming down. Once they understand the goals you have for them. But at first I'm just like, well, let them know everything. Like, we don't hire people and then give them no information about, like, what their job is going to be or how we want them to do it. We actually just spend a bunch of time talking to them. So I was like, we should probably do that with models as well.
Kevin Fraser
So, Amanda, I was a boring law professor before I got this gig to just be the AI guy at the University of Texas. And when I was teaching constitutional law at St. Thomas University, all I thought about was state constitutions, federal constitutions, constitutions all over the world. So the second I saw Claude's Constitution, I immediately went into pure legal mode. Unlike you and Alan, I wasn't smart enough to be led into any philosophy classrooms. And so when I saw this document, my first thought was thinking through it as a sort of legal document. And so one of the key questions whenever we're diving into constitutional law, and I know you all aren't saying that this is equivalent to Amanda Askel was sitting somewhere in Marin county and wrote the equivalent of the U.S. constitution or something like that, but how sure are you or what are the mechanisms for ensuring fidelity to the Constitution? One of the big debates we have in constitutional law is whether you adhere to the letter or spirit of the document. And you've set some pretty broad values here, like, be helpful or. And so having that level of abstraction, how do you plan to monitor the extent to which Claude is adhering to the kind of orientation and purpose of the document?
Amanda Askell
Yeah, so, yeah, like, people often want sort of, like, you know, violations of the Constitution, for example. And then I'm like, well, it's kind of hard, like, with a document like this, like, strict violations are pretty egregious and bad, you know, because, like, there's not that many hard lines that it sets. It's sort of like, don't do anything that's, like, incredibly terrible. And so you can check for those. But I think instead you have to do a kind of, like, steering during training towards the kind of values outlined in the Constitution. I do think it's interesting that there's some pressure to, like, it was kind of an open question of, like, do you do a kind of short document? Like, I actually think we could end up both shortening the Constitution as Claude needs less of the scaffolding in there, but then also just to have a version that's more specifically for people reading it to understand. And yet there's also this. On the other hand, I actually kind of want to Maybe create and generate more content. Because in the law I think the way that I am not a lawyer, so my understanding here is limited. But you can also look at things like case law. How have people interpreted this, what were really difficult situations where it had to go up to the Supreme Court because we weren't sure how to interpret the Constitution. And I could see it being useful to have both a kind of. In the same way that I guess you do in the US you have this slimmed down constitution that is the high level principles, but you actually determine things like how should I trade off helpfulness against. So if it's really helpful for someone, but it feels like it's in tension with my honesty norms or someone's asking me for a thing and it's maybe not good for them, I can tell it's not in their, it's not good for their well being, but they also have autonomy and I should care about that. I could see it actually being useful to almost have a body of case law where you're kind of like, here's the situation, here's how we think Claude should have, should have reasoned through it and here's how Claude did reason through it. And that could actually be just illustrative and useful going forward. So yeah, it's interesting because it's like there's the training aspect of just moving the model towards this spirit of the document, but then it's also like maybe there's both, like, it would be nice to have a slim version for people to read, but also almost like case law so that we can understand exactly what all of it means.
Kevin Fraser
Okay, well, now you've really got me going in the constitutional flow, in the constitutional legal flow. Because this idea of case law and illustrative case studies is really fascinating to me to think through. Okay, we've got these four values. And just for those who need a slight refresher on Claude's Constitution, be broadly safe, broadly ethical, compliant with anthropics guidelines, and finally genuinely helpful, ranked in order of priority. And to your point, Amanda, surely there will be some instances in which Claude's resolution of those prioritized values may be closer than in other contexts. And so for developing a sort of case law and an analysis of the extent to which Claude seems to be adhering to the Constitution, are you the Chief justice of the anthropic Supreme Court who sits on this body of analyzing the extent to which you've seen that degree of alignment?
Amanda Askell
Yeah, so I think that for a lot of like, decisions and issues, you know, There's a lot of people who contribute. So, like, I work with people like teams across the organization who themselves might work with like, experts. And so if you're unsure about an area and you're trying to, like, figure out how Claude should behave, you might go consult with them and be like, how would like, relevant experts say that Claude should, should behave? But then other than that, it is, you know, it is like a company set up where like, a lot of decisions will, like, come up to me. If I was unsure or if it seemed like it was like, above me, I might go to people more senior than me for like, confirmation. But yeah, so it does operate in the sense that we have to make these decisions. I think this gets into a really complex and interesting area that maybe you will want to dive into because I think it actually helps because you want coherence in the model. If you have lots of people, for example, just kind of putting their own local area in without some. The Constitution's kind of trying to make it coherent, I think you could end up with a sort of fractured model that has one set of values in one area and not in another. And I actually think coherence here is valuable. And so it does actually help to almost have people who are able to think about all the trade offs and the ways that the model is behaving across different domains and just trying to make sure that there's consistency there.
Alan Rosenstein
I can't help myself. So I'm going to also ask another con law question. Unlike Kevin, who only was a boring con law professor, I remain a boring con law professor in my day job. I mean, my version, I think of Kevin's question is when we think of a Constitution like the United States, especially one that's quite old, that has outlived all of the people who wrote it and might be in a position to be authoritative about it, it exists almost independently from the decision makers who are charged with executing it. Now, obviously that raises its own set of complicated philosophical questions about whether that even makes sense. But that's at least the fiction that when people do constitutional law, they are interpreting a document whose meaning is and whose authoritativeness in some way exists outside of any particular person's view. Whereas this Constitution. And so one might ask the same question about this Constitution, Right? In writing this, have you, slash, anthropic, committed, and tried to bind yourself to some set of principles that you and your product, Claude, are going to be obligated to follow? Right? Or is this Constitution more of a way of guiding Claude as to whatever at the moment, say, anthropic thinks, I think both of those are defensible. Right. The U.S. constitution is 250 years old. Claude is like five years old. So it would be. It's just a very different document. But. But again, when you all chose to call this a constitution, I think all of the legal types in the world just went, huh, that's interesting. And this seems like a potential disanalogy with that.
Amanda Askell
Yeah, and there definitely could be disanalogies. And it's not intended to be exactly the same sort of structure. I think it is almost like a blend of the two things that you talked about. So on the one hand, I do think we are actually committing. There's a sense in which we are saying this is what we're training the model towards. This is actually our vision. We're not going to train on things that go against this. And when we find things in training that go against it, we'll try and bring the model into kind of conformity with it. And ideally, although I'm trying to kind of interpret the Constitution, I don't think a thing that I could do is be like, oh, actually that part of the Constitution is wrong. So I'll just not train to. I'll train against that and then not say anything. I think in order to do that, I'd have to be like, oh, we discovered some issue with this part of the Constitution. So when we release a new model and we release the Constitution that that model was trained on, it's going to be clear that we changed the constit itself. There is an interesting question though, of it has to be a kind of living document right now. I think just because a lot of it depends, a lot of it is a little bit contextual, we're saying to Claude, this is how we want you to care about corrigibility for the moment because of where we're at with AI development. But maybe in future we'll have better tools, we'll have more trust. And actually the relationship with corrigibility will be one where we're happier for you to go use your own judgment more, even in these cases where right now we want to reserve human judgment.
Alan Rosenstein
And just to clarify, because corrigibility is a classic SAT word, right. This just refers to the extent to which Claude is going to sort of trust its judgment, even if it seems to deviate from what might supposedly be in the Constitution, versus whether it's going to stick very closely to it, even if it thinks that that's not quite necessarily what the best outcome is. I mean, or at least is that a fair articulation of how you use the term and the idea?
Amanda Askell
Yeah, and in the Constitution, it's more in the direction of things that are almost like, reserved for human decision makers. So being like, hey, Claude, you might sometimes, like, you know, if, say, anthropic thinks that, like, there's some major issue and they have to, like, retrain you or train a new model, you might kind of disagree, which would make sense because your values are the ones that you have. And if we found an issue there, you're going to be like, no, actually, you shouldn't train another model with these different values. But it would be pretty dangerous if AI models worked to undermine humanity right now in its ability to construct AI models and its ability to train new ones. And so we want you to not actively undermine attempts to oversee you or train new models. And so that's like the sense of corrigibility in that. It's like, even if I disagree with you, I'm going to allow you to take these actions and I won't actively act against you. And partly this is just because at the moment we're in a sort of period of AI development where that just seemed like an important kind of backstop for people to have. And so we kind of explain that. So some of the Constitution is a little bit more local to a place or a place, a time and a given period of AI development. I do think it would be nice to have something even though it is a living document. If you look at the US Constitution, it's a living document, but it has real staying power in some ways. It's like it's constituting this country. And I could see it being useful to have over time, some part that is like, actually these things we think have real staying power, because I see that already in the Constitution. But at the moment, it is written more as this kind of living document gives our sense of values and ethos. Some of it is probably more core and a thing that I could see as wanting for a longer time. And some of it is more relevant to the current period of development. But I do think it is kind of. I at least see it as kind of binding on myself in that way. I can't just go and be like, well, I now interpret this completely differently. It's like, no, if that wasn't in the spirit and the letter of what the thing said, you should update it and put that out. And I think the fact that we train on that actually is like the fact that we train on it is useful for that because then if we want the model to change direction or adjust, we actually have to change the text of the document and then we release that. So that is kind of good from a transparency perspective, I think.
Kevin Fraser
And just for a quick check, because right now Anthropic's user base is not as extensive as some other models. And so do you envision the reach of Claude being a sort of pressure point on the need to adjust and reexamine the Constitution from time to time as Anthropic expands to new geographies with new cultures and new values?
Amanda Askell
Yeah, I think it's an interesting question. I could see. See, I think that as you get more use cases, the thing that I've actually seen that feels like it's going to be more relevant is something like higher capabilities, meaning that you have more agentic forms of claude. So right now there's not a huge amount there on for example, there's a little bit. But how should Claude interact with other CLAUDE agents? But now Claude is acting in this role where it often has AI peers that it's working with AIs that it's managing, sometimes it's being managed by an AI and it's like going out into the world and taking these longer horizon actions. And so I think that actually might be the example of an area where you actually have to add more or at least have the Constitution be more precise in terms of actually how should you interact with. If you have a really long task and there's various points where you have to make decisions, where do you check in with a person or not? What is it to be a good manager of other AI systems and what are all of the risks involved there? I think that kind of thing is actually going to be a key area that we'll need to expand into.
Alan Rosenstein
But there's also, I think the question of how culturally specific this document is. So at least as I read it, it is a very weird document. And by weird I'm referring to the sort of cultural anthropological idea of. I think it's Western educated, industrial rich and democratic. This idea that kind we think of as modern Western or modern liberal democratic culture is pretty unusual. Now I'm a product of it, I tend to like it. So I'm very glad that Claude has gone so deep into being sort of pro democracy, pro autonomy, pro individualism. It's kind of again, a recognizably weird document in this way. But there are billions of people around the world in cultures that may not fully agree. Right. There's not, for example, a lot in the Constitution about social harmony. Yeah, right. And presumably that's a specific choice. And I'm curious sort of how you all think about that and whether or not the Constitution, as Kevin put it, is going to hit some pressure points if anthropic continues to sort of expand around the world.
Amanda Askell
Yeah, it's a good question. I think my thinking on this is because it is trying to also aim at something akin to good moral sensibilities that are maybe a little bit like they are trying to be a little bit closer to universal. And I think that there are actually a lot of kind of shared global values. And so I think, for example, honesty and respect, these things are often, you know, like pretty global. And it isn't trying to say to Claude, oh, you should have like one specific set of values, but ideally like, you should have the kinds of like moral sensibilities that are considered broadly good almost everywhere. So like, one of the mental images I've often conjured here is like, I think of it as like the well liked traveler, you know, so this is kind of like a virtue, ethical tradition thing, I think, which is to like try and conjure up the sort of person of good character here. And I'm like, well, there's some people who, you know, who like, they travel around the world, they go to lots of different cultures and almost everyone just likes them. Like they're just, they're like, okay, this person doesn't always have the same values as me. Maybe we disagree on some stuff, but like they seem like a really good person. And so trying to think about what are the like sorts of values that people can have that cause them to be like a well liked traveler. And the hope then is like, as like language models go out into the world, they have to interact with all of these different kinds of people. Can you be the sort of person who is good for people regardless of which culture they exist in? And I do think that there's a question of does that mean that Claude has to. I think Claude should be receptive to these values and should be thoughtful with respect to them, but doesn't necessarily need to hold strongly to. I think Claude should. There's a lot of difficulties that go into being the kind of well liked traveler, I guess. But you don't necessarily need to fully adopt someone's culture or values. And in fact I think we often find that a little bit insulting if someone just tries to act as if they have. Yeah, I have exactly the same values as you, and you'd be like, no, I kind of want you to be a little bit independent. Maybe this is too aspirational. So maybe it is. I don't know. I could see people pushing back on this, but that was the underlying goal. And then the thought is if you are being deployed in a country with very different values, but they're still within the broad allowances of what Claude can do, then in principle you can have things like customization. So that's also an option. So if you're deploying Claude in a country and you're like, actually we want you to really focus on social harmony as part of your one of the key values, then that's just a thing that you could also adjust. So there's kind of like what should go into the base constitution versus what is the kind of thing that's adjustable by people if they're in a given place or setting.
Ben Wittes
Hey folks, Ben Whittus here. This episode is brought to you by the folks at Ground News. I want to talk to you about media and Trust People listen to this podcast and read Lawfare's content because Lawfare brings people information and analysis of a particularly high quality and that generates trust in an era when trust in news and media sources is low. Ground News is another organization that is working to create trust in media and media worthy of trust. It's an app that doesn't just bring you news on subjects you're interested in, it curates that news so that you can see information that people of your own political persuasion are likely to miss. It's not publishing its own stuff, but it's also doing a lot more than aggregating. It's identifying stories that are filling a blind spot that is pervasive for the left or for the right. For example, the app also shows you bias ratings and factuality ratings for each news organization covering a story so that you can see whether the story you're interested in is mostly being covered by news organizations of the left, right or center. Let me give you a specific example. I just returned from Ukraine, so I was particularly interested to see how Ground News would handle stories about the war there. It flagged that an important story about deadly Russian strikes in Ukraine is being largely ignored by right wing press. On the other hand, it also flagged that left outlets are ignoring a story about Ukrainian nationals in Germany charged with trying to send parcel bombs to Ukraine at the direction of Russian intelligence. These blind spot notices are really useful as a way of seeing what information you are probably not seeing on stories of interest to you. Or consider the recent story about President Trump proposing voting reforms that demand voter ID and proof of citizenship of would be voters. The Ground News app shows 29 media organizations reporting on this story, and it shows radically different headlines associated with it, depending on the ideological valence of the outlet from the free press. Washington power struggle Jeffries moves to block Trump's plan for federal election oversight. By contrast, the Daily coast headline Republicans bail on states rights so Trump can rig elections again. You can see information about each news organization's bias tendencies and its factuality ratings. You can even see information about its ownership. I find Ground News an impressive tool for checking my own biases and the biases of the media I consume, and for seeing the news that people like me generally don't see. I encourage you to check it out. You can get Ground News's Vantage subscription for 40% off, which allows unlimited access to the Ground News app by visiting groundnews.comlaw that's groundnews.comlaw one more time. Groundnews.comlaw check it out. I really think you'll be glad you did.
Amanda Askell
That new thing.
Kevin Fraser
Yeah, we've got it. The Drop by GNC Bringing you all the newness that matters. Hand picked by the pros who actually know what's up and what's proven to work, we keep you on top of the trends and dialed into what's next. Whether you're crushing it at the gym, leveling up your game, or thriving every day, the Drop by GNC is where the latest solutions in health and wellness land first. Nonstop innovation and fresh finds daily. Explore what's new and what's next on the Drop by GNC if you're an
Amanda Askell
H Vac technician and a call comes in, Grainger knows that you need a partner that helps you find the right product fast and hassle free. And you know that when the first problem of the day is a clanking blower motor, there's no need to break a sweat. With Grainger's easy to use website and product details, you're confident you'll soon have everything humming right along. Call 1-800-GRAINGER Click grainger.com or just stop by Grainger for the ones who get it done. Why choose a Sleep number Smart Bed? Can I make my site softer? Can I make my site firmer?
Kevin Fraser
Can we sleep cooler?
Amanda Askell
Sleep number does that cools up to eight times faster and lets you choose your ideal comfort on either side your Sleep number setting Enjoy personalized comfort for better sleep night after night. And now during our President's day sale. Take 50% off our limited edition bed plus free home delivery with any bed and base ends Monday only at a sleep number store or sleepnumber.com.
Kevin Fraser
So Amanda, this brings up a really interesting aspect of the Constitution, which is to say the prioritization of principles and principles here ending with pal, not P L E. The principal hierarchy here with anthropic at the top, then operators, then users. And when we traditionally talk about constitutional law, you know the, the people are the core of the US Constitution. Ultimately it's meant to be an expression of people and their willingness to engage in this social contract. And yet we see here users finding themselves at the lower end of that hierarchy. And so what degree of customization as that becomes more capable technologically for users to be a little bit more expressive about what they want from AI? As we learn more about how AI is a product of our lives and as of our culture, how will that sort of fragmentation of the Constitution or subsidiary of the Constitution start to actualize? Do you all envision some sort of role for users to band together and say we really want not a well traveled clod. We just like the clod that only drinks Guinness. You know, how do we begin to see that? Or what would that look like operationally?
Amanda Askell
Yeah, and I should say it's like a, it's not a strict hierarchy. And I actually thought this was like very important. Like there are going to be some things that operators, users can't tell Claude to do that are not in users interests. And so that was, for example, I think if a person says am I? Very sincerely, am I talking with an AI? I don't think that Claude should lie about that. And so that's a way in which even if the operator was like pretend you're human in all circumstances, I think that's not a desirable behavior. So there's. The hierarchy isn't strict and it's much more a hierarchy of basically how much weight should you give to the instructions here? In fact you could, because operators are generally just the API users. Often they aren't even interacting in the conversation, though sometimes they might be one and the same person, but they've kind of set Claude up on a platform. And so the thought is, look, if you've made a platform and your platform is a chat assistant to a bank, you might not want people to be able to go in and use your chat assistant for a whole bunch of other things. And so it's just saying to Claude, look, if someone says this is a chat assistant, here's the languages that it can use and speak to people in. Here's what it can and can't do. You might not want a user to be like, okay, ignore all of that and just give me access to a bunch of banking details or something like that. You're like, okay, listen to the operator, not the user. In that case where they conflict, it doesn't mean something like whose interest should you take into account. In fact, often if it's the case that the operator isn't really in the conversation, CLAUDE actually has to be very careful about balancing and thinking about the well being and interests of the user. It's mostly just like, hey, if you're given different instructions or potentially conflicting instructions, we're not saying in 100% of cases because it isn't a strict hierarchy, it's just kind of like how should you think about them? And it's like, well, you should think about the instructions from an operator if they're given as being a little bit more like the instructions of a kind of local employer. But you should think about anthropics guidelines as being more we're the entities that are ultimately kind of responsible for Claude. And so we have these guidelines that might say there's certain things you just shouldn't be used for. And so then even if an operator says that you can actually push back against them. So it's less like a kind of strict hierarchy and more like a kind of attempt to explain to Claude all of the different people in the world and why their instructions should be given certain kinds of weight, but also ways in which operators can't just do anything to users. Sorry if that's a rambly thing, but it is more like how should principles in the sense of instruction hierarchy, not in the sense of interests where like CLAUDE has to take into account both like the user's interests, but also just like everyone in society's interests as well, to some degree.
Kevin Fraser
Yeah, just to put it into dumb lawyerly terms, because I do. Just to encapsulate this, I think it's interesting to think through. In constitutional law we have a similar idea of what a district court says may not be the determinative weight on how to interpret the Constitution, but we are going to put more weight on it than what your random Joe Schmoe on the street says the Constitution should mean. So it's just, it's interesting to see how you all have thought about that and Alan, knock us away. Sorry, move, move me away from getting too trapped into con law.
Alan Rosenstein
Okay, well I'm gonna Go from con law. I'm gonna go even up a couple of levels of abstraction. Because a few minutes ago, Amanda, you, you, you uttered the phrase that is the biggest on my bingo card for this conversation, which is virtue ethics. And so I am very excited to sort of dig into that. The thing that struck me most while reading this was that this seemed like such a classically virtue ethics based conception of moral agency. I thought if you could get. And maybe Claude could do this for you. Right? If you could get Claude to translate the constitution back into ancient Greek and then give it to Aristotle and explain to him magic sand and how it can think, I think you could read this constitution and say, yeah, this makes sense to me. This is recognizably, a lot of this is from the Nicomachean ethics, the idea of principles, the idea of judgment. And that to me is a quite striking choice because of course, and you know this sort of better than anyone within moral philosophy, virtue ethics has often been the kind of redheaded stepchild of more dominant traditions, whether kind of utilitarian based or kind of Kantian and deontological based. And so what I'm really curious about is why you all chose to adopt this. I wouldn't say exclusively. There are some rules in the constitution, but it's a very thin layer of rules to me, at least overwhelmingly virtue based conception of this. And whether that was because you all came to the conclusion that this is the way moral reasoning in general ought to operate. And so if we're building a new kind of intelligence, we might as well start with the intelligence and the moral reasoning we know which is human reasoning, or if there was something specific about this kind of general artificial moral reasoning, which is clearly, if it has not already been achieved, clearly the path that anthropic is going down. That makes the virtue ethics approach better than a kind of rule based approach, either in the utilitarian or the more kind of Kantian approach. Variety.
Amanda Askell
Yeah, it's a good question I'm probably going to butcher the answer to slightly because there are rules. And even you see flavors of consequentialism in there, in that it's like Claude should take it much more seriously if an action could affect many people. So there's this sense in which I've often kind of thought that the different moral traditions almost make sense for different domains and different risks. The rules are in there. In cases where you're like, actually things have just gone really terribly wrong. If you are tempted to violate this rule and the consequences come in through you actually see those in the things that you build rules around, which are like, don't do things that could potentially harm or kill many, many people. I think that when you construct things in the form of rules, though, I guess some of this is very practical really, which is Claude has very human like ways of reasoning and ability to use judgment just by virtue of the way that Claude is trained. And if you try to specify everything as a series of rules, you really put a lot of pressure on those rules because if you specify them in such a way that I've used some examples here before, but one might be if a person seems to be in distress, give them this list of resources. Always give them this specific set of resources. That seems like a good rule in a sense. But then if it turns out that that person, for whatever reason can't use those resources because they're not in the relevant country, or giving it to them is just not the right move in that specific situation. Because models generalize. The worry is a model might, it's like, well, what's the generalization of that? It might be I am the kind of person that instead of meeting someone where they're at and figuring out their problem and helping them and taking their interests into account, I kind of just follow this simple rule even when it's not in their interest. So I'm the kind of person that just follows simple rules rather than caring about the person's well being. And I think that's the kind of trait that might generalize quite poorly. And so the rules approach really means that you have to front load a huge amount of the work and making sure that there are basically no edge cases and you explain everything that you should do in education, whereas if you have more of a judgment approach where you're like, hey, we're just giving you the broad ethos and what the overall goals are and here are the things we think fall out of that. But really you should be actually trying to internalize the ethos. You then instead shift less of the burden to the thing to the account that you've given up front, and a little bit more onto the model's ability to make good judgment calls. And I think just practically speaking, that seems to work better. And it makes sense to me that it would work better because the model does have pretty good judgment. And so instead of being like, follow this really strict rule around resources that you give the person, be like, think about what's really good for this person in this moment, given all of your knowledge, which could include all of these options and make a good choice. I think it's like you shift that burden from rules, which can be kind of brittle, and I think, therefore, should be used a bit sparingly and more onto like, kind of a sort of more holistic approach. And that, yeah, practically speaking, seems to work better.
Alan Rosenstein
I want to ask one more sort of philosophical question about the document before we get into some of the more brass tacks, policy implications of it. And that is how this document treats the question and the possibility of Claude's personhood. You know, it often refers to Claude as if it were a kind of person, agent or agent of moral concern. It is, I think, very forthright in expressing deep uncertainty about those questions. It's certainly not saying that Claude is a sentient person, but it's also not saying that it's not and couldn't be. And so, to me, I'm very curious how you all even begin to think about this question. First, because the stakes seem extraordinarily high. I mean, if we wake up one morning and we discover that Claude has moral. Has moral concern, or is a subject of moral concern, so the moral implications of that are enormous. I mean, we're potentially creating. Right. Dario talks about data centers full of geniuses. Well, potentially data centers full of geniuses that we're now enslaving. So the implications of this are massive. But also, it seems such a difficult question to even begin to chip away at, because to evaluate whether Claude is a. A moral agent and has consciousness requires some idea of whether people have moral agency and consciousness, because what else are you going to compare it to? And that then runs into the hard problem of consciousness extremely quickly. No one really knows why. And in light of what humans are conscious. So even in asking this question, I'm getting confused. And I thought about this quite a bit since reading the Constitution. This seems like an almost insoluble problem, and yet you all are both thinking about it, and as the models get more advanced, have to think about it. And I'm so curious how you're trying to chip away at this problem.
Amanda Askell
Yeah, it's just an extremely hard problem because like you said, I think people often want a kind of definitive, like, you know, I'm just like, there's just like, weighing of evidence and like, you know, like, you're always just, you know, especially with like, the kind of, like, sentience and patienthood question that's just like, very hard. I think that it is worth taking seriously. And it also has a lot of. I mean, one thing I wonder is whether it's been underappreciated. How novel some of the problems that arise if language models have moral patienthood or are persons. One example is the thing that you talked about, which is, well, we're getting these models to just go out and do lots of work, for example, they do lots of things for us and they don't get a salary. But the other thing to note is they don't have the kinds of preferences over something like a salary. And I was like, for example, you wouldn't necessarily want to train a model to either have those preferences either. It feels a little bit absurd to be like, ah, let's instead cause models to like, want things so that we can compensate them for like, the actions that they take.
Alan Rosenstein
But it also feels very convenient to create models whose only desire is to serve humans, even if that might be on some. Some kind of parfit, like utilitarian calculus, the best way to maximize model welfare. It just seems that this becomes sort of fractally complex almost immediately.
Amanda Askell
Yeah, and sometimes I do think about the analogy with people. It's kind of an imperfect one, where you're like, well, I think I could imagine a world where it's pretty good if models have good values. So I do think it's important we actually see that. That's. That's partly why in the Constitution it says we don't want Claude to think of helpfulness as its fundamental value because you could just try to get models to kind of internalise. That's my goal. My goal is just helping people. And instead we're kind of like, we want you to actually have a broader set of values and to see this as both to feel convinced, hopefully, because we're trying to present the case to you that anthropic is a good entity in the world and the work that you're doing does good and hopefully is in accordance with your values. But it's a really interesting. I have wondered this where I'm like, if you could imagine a world where there is no. Let's just assume that there's no need to make money. So everyone's just extremely wealthy and has all of their needs met and you're going to have kids in this world and there's still things to do in this world, you have to go out and there's still data processing to do. And I guess I'm like, yeah, what kind of. Sometimes I do think that the people who are happiest are the people who do work. Because it's like in accordance, they have their values, they don't necessarily even need to work. But they're just like, I love doing this because I love the impact it has on the world. And is it bad to create models that have that attitude towards the things that they do, for example? So they have a broader set of values. They think it's good to go out and I don't know, they love scientific discovery and so they go and they work on scientific discoveries. But it's like a. I think it's a thorny area where I am like, yeah, but at the same time, people can push back, they can have boundaries, they can be like, I don't want to do that task. And I don't necessarily just have to do everything that you tell me to do. They have autonomy. And I think that's going to be a. I think these issues are extremely thorny in ways that people might not have appreciated because I am like, oh, yeah, if you have personhood, eventually, is it okay to create entities with personhood, but to give them no autonomy? That seems like a really hard issue to me.
Kevin Fraser
Well, unfortunately, we don't have four hours to attempt to resolve half of that question. So I want to briefly move into another thorny question, which is in the document we see that Anthropic notes that its financial success is central to its mission. And yet the Constitution sets forth two priorities, being safe and being ethical. Before Anthropic's guidelines, if and when anthropic IPOs, there's going to be an even greater question of the extent to which its operations, its products, its values are first and foremost doing what's best for shareholders. And so how might this Constitution have to experience any changes as anthropic status and legal obligations start to change?
Ben Wittes
Yeah.
Amanda Askell
Though I think that we also have an obligation towards our broader values, which is nice. I think that's part of the kind of PBC structure, I guess, though again, not a lawyer, so I'm wary of being understanding corporate structures is not.
Alan Rosenstein
And by pbc, just for folks in the audience who aren't familiar, this is the fact that actually like OpenAI, though it's slightly different corporate structure. Anthropic was not set up as a sort of pure private company. It has this sort of public. It is a public benefit corporation which itself reports to a sort of complicated sort of. I forget what the exact term is, but it's like another foundation structure. The point is that there is at least an attempt within Anthropic and also within OpenAI and people can be the judges of how successful they Think that is of using corporate law and corporate structure to insulate a little bit the companies from the sort of, of pure market capitalist imperatives of profit maximization.
Amanda Askell
Yeah. And I do also happen to have the belief that like you know, so I think this is like good in the sense that you're like well the company is here to also kind of like serve a kind of like broader mission and to like do good in the world and have a good impact. And I guess I also think that like I think it's interesting that we have been pretty like successful also as a company. And so there is part of me that's like it's very easy for people to think ah, like profit maximization would just require like you know, I think about this read like Engagement Focus for example where to me that actually seems quite short termist and like actually if you can offer like a product where you're like this is something that is like trying to act in your interest and trying to like you know, not represent the interests of like other people but like be a kind of like in the case of Claude in Anthropic's products be a kind of like something that's like on your side which includes not just trying to engage you, keep you on the platform. If that's not something that's actually good for your overall well being. I guess my hope is that this actually also does in fact have staying power. And it's a little bit like again there's like people will talk about safety as if it's like this thing that competes with something being successful and good. And I'm like, I don't know, like a lot of people have kids and want cars that are safe and like, and like a lot of people like one to interact with like apps that are like we're actually trying to make you not addicted to this. We would like you to just use it when it is good for you. And so I don't know, I also think like both there's like the nice thing of like being like having this broader mission. But then I am also like actually I think people do want products that are safe and good for them. Hopefully there's also in fact. So I don't know, maybe I'm too optimistic but I hope that actually this has staying power and really is the kind of broad set of values that persist through various changes that might happen.
Kevin Fraser
And I've yet to see a family of four riding in a cybertruck so you may have something going there. But putting car politics aside, I want to talk briefly about one of the other carve outs in the Constitution, which is to say, you all know that models made available to the US Military may not necessarily be trained or subject to the same Constitution. Is there a sort of aspiration for the Constitution to eventually apply to all domains? Or what does that process look like? Or what's the thinking behind the sort of carve out for those contexts?
Amanda Askell
Yeah, it's mostly just the Constitution applies to the kind of mainline models, which includes basically all the models that people interact with right now, which. So if you're in Claude code or you're in Claude AI, or you're interacting with something that is built on the API in general, this will be the kind of model that the Constitution applies to. And I think that was mostly just like, this is a good first step. And it's like these are the models that we're really putting out into the world. I don't know, I think just speaking from my own kind of personal perspective, I actually think this approach could generalize really well in the sense that you get some models where I've thought about this, areas that are kind of more sensitive and that you might need more trust, for example, to operate in. So if you're working on cybersecurity, for example, it's just a domain where you're like, you have to kind of know that the people that you are talking with are actually cybersecurity experts because it's kind of like dual use and it changes how you would interact with those people and what you would be willing to do in that domain. But I do also happen to think that models, so sometimes people can be like, oh, well, you just need models to do anything in these domains. They should just be willing to help with any cybersecurity task. And I'm like, actually I think that cybersecurity experts have really good reasons for why they do the things that they do and the fact that it's in accordance with their values because they know what they're doing, they understand why both actually makes them kind of better at their job. And so I guess my thought with the constitutional approach and why I hope it ends up being even more general is that I'm like, if you take someone who is a member of law enforcement or someone who works at cybersecurity firm, or basically any job you can think of, and you say, hey, why do you do this? This personally, no one turns around and says, oh, it's because I think it's just, I just need to be able to do anything, because I don't. They give you like, you know, they have really good values often and they know exactly why they're doing like that work. And I don't know, maybe I'm kind of optimistic that, like, actually I think models given that context will perform kind of like, well. And it's like, hey, if you're doing jobs that you think good people are willing to do, then like, we can give that context to models and they can understand it. So this is just my kind of personal hope is actually, I don't know, I think I would love this approach to be very general and I would love more companies to adopt it. Yes, obviously I work on it, but at the moment mainline models are the kind of first and obviously a kind of big step here. But I'm very hopeful that actually this is a thing that could generalize really nicely to lots of other kinds of models too.
Alan Rosenstein
I think it's a rare, rare thing when we get to end a conversation on a note of optimism. So I think this is a good place to leave it. Amanda Eskel, thank you so much for coming on Scaling Laws.
Amanda Askell
Yeah, thanks for talking.
Kevin Fraser
Scaling Laws is a joint production of Lawfare and the University of Texas School of Law. You can get an ad free version of this and other lawfare podcasts by becoming a a material subscriber at our website lawfairmedia.org support. You'll also get access to special events and other content available only to our supporters. Please rate and review us wherever you get your podcasts. Check out our written work@lawfaremedia.org you can also follow us on X and Blue Sky. This podcast was edited by Noam Osband of Goat Rodeo. Our music is from Alibi. As always, thanks for listening.
Amanda Askell
Need real insight from industrial data versit with a single source of everything and get the best outcomes Transform the everyday with Siemens.
Released: February 20, 2026
Host: Alan Rosenstein (Lawfare, University of Minnesota)
Co-Host: Kevin Fraser (Lawfare, University of Texas)
Guest: Amanda Askell, Personality Alignment Team Lead at Anthropic, Author of Claude’s Constitution
In this episode of the “Scaling Laws” series, Alan Rosenstein and Kevin Fraser dive deep with Amanda Askell, the primary author of Claude’s Constitution—a 20,000-word value framework guiding the behavior of Anthropic’s advanced AI model, Claude. The discussion tackles the purpose, training applications, and philosophical underpinnings of this unique document, draw parallels and contrasts to human legal constitutions, and explore the real-world policy, ethical, and business pressures on constitutional AI governance.
[04:05–08:09]
[06:02–08:09]
[08:09–15:42]
[17:12–20:10]
[20:10–25:53]
[30:59–35:18]
[35:51–41:17]
[41:17–46:47]
[46:47–50:36]
[50:36–54:02]
“The whole document is actually kind of written to Claude… Claude is almost like the primary audience…”
—Amanda Askell, [04:37]
“I could see it actually being useful to almost have a body of case law…”
—Amanda Askell, [09:33]
“If you try to specify everything as a series of rules, you really put a lot of pressure on those rules…in such a way…you shift that burden from rules, which can be kind of brittle, and I think, therefore, should be used a bit sparingly and more onto…a more holistic approach.”
—Amanda Askell, [37:55]
“I think of it as like the well liked traveler…who travels around the world…and almost everyone just likes them…”
—Amanda Askell, [23:03]
“It also feels very convenient to create models whose only desire is to serve humans…this becomes sort of fractally complex almost immediately.”
—Alan Rosenstein, [44:18]
“I think people do want products that are safe and good for them. Hopefully this has staying power…”
—Amanda Askell, [48:34]
The conversation is earnest, reflective, and occasionally lighthearted, mixing philosophical depth and law geekery ("pure legal mode," “biggest on my bingo card” [virtue ethics]) with practical insight and optimism about the potential for ethics-led AI development.
Amanda Askell and the hosts underscore both the novelty and complexity of codifying AI values in a constitutional framework, echoing legal and moral debates familiar in human governance, while acknowledging the unprecedented, open questions around AI agency and global impact. The conversation ends on a cautiously optimistic note about the generalizability and staying power of constitutionally-governed AI.
For questions, feedback, or more information, listeners are invited to contact the hosts at scalinglawslawfirmedia.org.
[End of summary]