wavePod

Get Wave AI

Scaling Laws: Claude's Constitution, with Amanda Askell - The Lawfare Podcast | Wave AI Podcast Notes

Back to The Lawfare Podcast

Scaling Laws: Claude's Constitution, with Amanda Askell

The Lawfare Podcast

Fri Feb 20 2026

Summary

The Lawfare Podcast: Scaling Laws – Claude’s Constitution, with Amanda Askell

Released: February 20, 2026
Host: Alan Rosenstein (Lawfare, University of Minnesota)
Co-Host: Kevin Fraser (Lawfare, University of Texas)
Guest: Amanda Askell, Personality Alignment Team Lead at Anthropic, Author of Claude’s Constitution

Episode Overview

In this episode of the “Scaling Laws” series, Alan Rosenstein and Kevin Fraser dive deep with Amanda Askell, the primary author of Claude’s Constitution—a 20,000-word value framework guiding the behavior of Anthropic’s advanced AI model, Claude. The discussion tackles the purpose, training applications, and philosophical underpinnings of this unique document, draw parallels and contrasts to human legal constitutions, and explore the real-world policy, ethical, and business pressures on constitutional AI governance.

Key Discussion Points and Insights

1. What is Claude’s Constitution?

[04:05–08:09]

Structure and Audience:
- The Constitution is both a transparency document and a training manual, targeted primarily at the AI model itself rather than end users or readers.
  - Quote: “The whole document is actually kind of written to Claude… Claude is almost like the primary audience because… we have to use it during training to get Claude to understand and kind of create the kind of data that trains it…” —Amanda Askell [04:37]
Transparency and Accountability:
- It aims to clarify Anthropic’s intentions and allow outsiders to understand whether problematic AI behaviors were intentional or errors in training.

2. Technical Use in Training

[06:02–08:09]

Training Process:
- The Constitution is injected into supervised and reinforcement learning. CLAUDE is exposed to the document, asked to reason about scenarios through its lens, and reward signals are shaped by alignment with constitutional values.
  - Quote: “You can give the model… a conversation… and then give it the full document and just [say] ‘think carefully about what you think the Constitution would say you should do…’” —Amanda Askell [06:34]
Analogy to Employee Onboarding:
- Amanda compares giving the Constitution to Claude to giving a human employee an exhaustive briefing about company ethics, instead of expecting them to intuit goals from sparse guidance.
  - “We don’t hire people and then give them no information about, like, what their job is going to be… We should probably do that with models as well.” —Amanda Askell [07:48]

3. Legal and Constitutional Law Analogies

[08:09–15:42]

Letter vs. Spirit of the Law:
- The hosts explore how closely Claude’s behaviors are monitored for “constitutional violations,” and whether the orientation is toward the “spirit” or the exact “letter” of the document.
Value Trade-offs and Case Law:
- Amanda proposes a "case law" approach: recording difficult scenarios, how Claude reasons through them, and what should count as constitutional precedent.
  - “I could see it actually being useful to almost have a body of case law…” —Amanda Askell [09:33]
Binding Nature and Living Document:
- Unlike the U.S. Constitution, which exists independently of its drafters, Claude’s Constitution is both a commitment and a living document—subject to updates with transparency as Anthropic’s understanding or needs change.
  - “There is an interesting question… it has to be a kind of living document right now…” —Amanda Askell [15:42]

4. Corrigibility & Model Obedience

[17:12–20:10]

What is Corrigibility?
- Refers to AI’s willingness to allow human overseers to override it, even when it (the model) disagrees, particularly important in the current stage of AI development for ensuring safety and oversight.
  - “Partly this is just because at the moment we’re in a sort of period of AI development where that just seemed like an important kind of backstop...” —Amanda Askell [17:39]

5. Cultural Universality vs. Specificity

[20:10–25:53]

Will Claude’s Values “Travel” Globally?
- Alan refers to the Constitution as “WEIRD”—Western, Educated, Industrialized, Rich, Democratic—in its value system.
  - “At least as I read it, it is a very weird document... there are billions of people around the world in cultures that may not fully agree.” —Alan Rosenstein [21:52]
Amanda’s Response:
- The aspiration is for Claude to be like a “well-liked traveler”: displaying values considered broadly good worldwide, such as honesty and respect, without full adoption or mimicry of every local culture.
  - “I think of it as like the well liked traveler… who, like, travels around the world… and almost everyone just likes them...” —Amanda Askell [23:03]
- Customization by geography or user is possible without conflicting with base constitutional principles.

6. Principles: Operators, Users, and Hierarchy

[30:59–35:18]

Instruction Hierarchy:
- Anthropic > Operators > Users; but not a strict hierarchy—operators can’t instruct Claude to harm users or deceive them, for example.
  - “It’s not a strict hierarchy… there are going to be some things that operators, users can’t tell Claude to do that are not in users’ interests…” —Amanda Askell [32:20]
- Emphasis on weighing instructions contextually and safeguarding both user and societal interests.

7. Moral Philosophy: Why Virtue Ethics?

[35:51–41:17]

Virtue Ethics over Rules/Consequentialism:
- Lays out that while rules and outcomes matter, virtue ethics—focusing on good character, sound judgment, and holistic decision-making—is a better fit for guiding models in complex and novel situations.
  - “The different moral traditions almost make sense for different domains... The rules approach really means that you have to front load a huge amount of the work and... you shift that burden [to] a more holistic approach... and that, yeah, practically speaking, seems to work better.” —Amanda Askell [37:55]

8. AI Personhood and Moral Status

[41:17–46:47]

Is Claude a “Person”?
- Claude is addressed as an agent in the Constitution but its personhood/sentience is left undecided. The team takes moral uncertainty seriously as models become more capable.
  - “If we wake up one morning and we discover that Claude has moral…concern…the moral implications…are enormous…” —Alan Rosenstein [41:17]
- Amanda points out the complexities—models don’t have human-like desires (e.g., “salary”), and shouldn’t be actively trained to want things for our convenience.
  - “It also feels very convenient to create models whose only desire is to serve humans…this becomes sort of fractally complex almost immediately.” —Alan Rosenstein [44:18]
- The design steers Claude away from simple helpfulness as a value, in favor of broader, more autonomous virtues.

9. Commercial Pressure and Staying Power

[46:47–50:36]

Profit vs. Principles:
- Constitution puts safety and ethics above company success, but Amanda expresses optimism that safety and trustworthiness will also be good business.
  - “A lot of people have kids and want cars that are safe…people do want products that are safe and good for them… hopefully this has staying power…” —Amanda Askell [48:34]
Anthropic’s Corporate Structure:
- As a Public Benefit Corporation (PBC), Anthropic is formally obligated to pursue more than profit.

10. Exceptional Domains: Military & Security Applications

[50:36–54:02]

Carve Outs:
- The Constitution currently governs mainline models used by the public. Models for military or specialized security domains may require adaptation or additional trust requirements.
  - “I do also happen to think that models…in areas that are kind of more sensitive… if you’re doing jobs that you think good people are willing to do…we can give that context to models and they can understand it.” —Amanda Askell [51:13]
- Amanda hopes constitutional approaches can generalize to more domains and companies over time.

Notable Quotes

“The whole document is actually kind of written to Claude… Claude is almost like the primary audience…”
—Amanda Askell, [04:37]
“I could see it actually being useful to almost have a body of case law…”
—Amanda Askell, [09:33]
“If you try to specify everything as a series of rules, you really put a lot of pressure on those rules…in such a way…you shift that burden from rules, which can be kind of brittle, and I think, therefore, should be used a bit sparingly and more onto…a more holistic approach.”
—Amanda Askell, [37:55]
“I think of it as like the well liked traveler…who travels around the world…and almost everyone just likes them…”
—Amanda Askell, [23:03]
“It also feels very convenient to create models whose only desire is to serve humans…this becomes sort of fractally complex almost immediately.”
—Alan Rosenstein, [44:18]
“I think people do want products that are safe and good for them. Hopefully this has staying power…”
—Amanda Askell, [48:34]

Timestamps for Key Segments

[04:05] – Amanda Askell introduces Claude’s Constitution and its dual role
[06:02] – Deep dive into its function in training and operation
[08:09] – Legal analogies: fidelity, violations, and the “spirit of the law”
[09:33] – “Case law” approach for hard decisions and illustrative precedents
[15:42] – The living nature of the Constitution
[17:12] – Explaining corrigibility
[21:52] – Addressing cultural limitations (WEIRD values)
[32:20] – Instructional hierarchies and the role of operators vs. users
[35:51] – Why a virtue ethics framework?
[41:17] – Grappling with AI agency, personhood, and moral concern
[48:34] – Commercial incentives versus constitutional commitment
[54:02] – Constitution applicability in security and military domains

Tone and Atmosphere

The conversation is earnest, reflective, and occasionally lighthearted, mixing philosophical depth and law geekery ("pure legal mode," “biggest on my bingo card” [virtue ethics]) with practical insight and optimism about the potential for ethics-led AI development.

Conclusion

Amanda Askell and the hosts underscore both the novelty and complexity of codifying AI values in a constitutional framework, echoing legal and moral debates familiar in human governance, while acknowledging the unprecedented, open questions around AI agency and global impact. The conversation ends on a cautiously optimistic note about the generalizability and staying power of constitutionally-governed AI.

For questions, feedback, or more information, listeners are invited to contact the hosts at scalinglawslawfirmedia.org.

[End of summary]

Loading summary...

Transcript

Alan Rosenstein (0:00)

The Electronic Communications Privacy act turns 40 this year and it's showing its age. On Friday, March 6, Lawfare and Georgetown Law are bringing together leading scholars, practitioners and former government officials for installing updates to ecpa, a half day event on what's broken with the statute and how to fix it. The event is free and open to the public in person and online. Visit lawfaremedia.org ecpaevent that's lawfairmedia.org ecpaevent for details and to register.

Kevin Fraser (0:36)

That new thing yeah, we've got it. The Drop by GNC Bringing you all the newness that matters. Hand picked by the pros who actually know what's up and what's proven to work. We keep you on top of the trends and dialed into what's next. Whether you're crushing it at the gym, leveling up your game or thriving every day, the Drop by GNC is where the latest solutions in health and wellness Plan first, nonstop innovation and fresh finds daily. Explore what's new and what's next on the drop by GNC Want to change the efficiency game?

Amanda Askell (1:09)

AI IT Automate tedious tasks to spend more time on the future. Transform the everyday with Siemens.

Kevin Fraser (1:25)

It's the lawfare Podcast. I'm Kevin Fraser, the AI Innovation and Law Fellow at the University of Texas School of Law and a Senior Editor at lawfare. Today we're bringing you something a little different. It's an episode from our new podcast series, Scaling Laws. Scaling Laws is a creation of lawfare and Texas Law. It has a pretty simple aim, but a huge mission. We cover the most important AI and law policy questions that are top of mind for everyone from Sam Altman to senators on the Hill to to folks like you. We dive deep into the weeds of new laws, various proposals and what the labs are up to to make sure you're up to date on the rules and regulations, standards and ideas that are shaping the future of this pivotal technology. If that sounds like something you're going to be interested in and our hunches, it is. You can find Scaling Laws wherever you subscribe to podcasts. You can also follow us on X and BlueSky. Thank you.

Alan Rosenstein (2:27)

When the AI overlords take over, what are you most excited about?

Kevin Fraser (2:31)

It's not crazy, it's just smart.

Alan Rosenstein (2:33)

And just this year, in the first six months there have been something like a thousand laws.

Kevin Fraser (2:38)

Who's actually building the scaffolding around how it's going to work, how everyday folks are going to use it?

Amanda Askell (9:33)

Yeah, so, yeah, like, people often want sort of, like, you know, violations of the Constitution, for example. And then I'm like, well, it's kind of hard, like, with a document like this, like, strict violations are pretty egregious and bad, you know, because, like, there's not that many hard lines that it sets. It's sort of like, don't do anything that's, like, incredibly terrible. And so you can check for those. But I think instead you have to do a kind of, like, steering during training towards the kind of values outlined in the Constitution. I do think it's interesting that there's some pressure to, like, it was kind of an open question of, like, do you do a kind of short document? Like, I actually think we could end up both shortening the Constitution as Claude needs less of the scaffolding in there, but then also just to have a version that's more specifically for people reading it to understand. And yet there's also this. On the other hand, I actually kind of want to Maybe create and generate more content. Because in the law I think the way that I am not a lawyer, so my understanding here is limited. But you can also look at things like case law. How have people interpreted this, what were really difficult situations where it had to go up to the Supreme Court because we weren't sure how to interpret the Constitution. And I could see it being useful to have both a kind of. In the same way that I guess you do in the US you have this slimmed down constitution that is the high level principles, but you actually determine things like how should I trade off helpfulness against. So if it's really helpful for someone, but it feels like it's in tension with my honesty norms or someone's asking me for a thing and it's maybe not good for them, I can tell it's not in their, it's not good for their well being, but they also have autonomy and I should care about that. I could see it actually being useful to almost have a body of case law where you're kind of like, here's the situation, here's how we think Claude should have, should have reasoned through it and here's how Claude did reason through it. And that could actually be just illustrative and useful going forward. So yeah, it's interesting because it's like there's the training aspect of just moving the model towards this spirit of the document, but then it's also like maybe there's both, like, it would be nice to have a slim version for people to read, but also almost like case law so that we can understand exactly what all of it means.

Amanda Askell (17:39)

Yeah, and in the Constitution, it's more in the direction of things that are almost like, reserved for human decision makers. So being like, hey, Claude, you might sometimes, like, you know, if, say, anthropic thinks that, like, there's some major issue and they have to, like, retrain you or train a new model, you might kind of disagree, which would make sense because your values are the ones that you have. And if we found an issue there, you're going to be like, no, actually, you shouldn't train another model with these different values. But it would be pretty dangerous if AI models worked to undermine humanity right now in its ability to construct AI models and its ability to train new ones. And so we want you to not actively undermine attempts to oversee you or train new models. And so that's like the sense of corrigibility in that. It's like, even if I disagree with you, I'm going to allow you to take these actions and I won't actively act against you. And partly this is just because at the moment we're in a sort of period of AI development where that just seemed like an important kind of backstop for people to have. And so we kind of explain that. So some of the Constitution is a little bit more local to a place or a place, a time and a given period of AI development. I do think it would be nice to have something even though it is a living document. If you look at the US Constitution, it's a living document, but it has real staying power in some ways. It's like it's constituting this country. And I could see it being useful to have over time, some part that is like, actually these things we think have real staying power, because I see that already in the Constitution. But at the moment, it is written more as this kind of living document gives our sense of values and ethos. Some of it is probably more core and a thing that I could see as wanting for a longer time. And some of it is more relevant to the current period of development. But I do think it is kind of. I at least see it as kind of binding on myself in that way. I can't just go and be like, well, I now interpret this completely differently. It's like, no, if that wasn't in the spirit and the letter of what the thing said, you should update it and put that out. And I think the fact that we train on that actually is like the fact that we train on it is useful for that because then if we want the model to change direction or adjust, we actually have to change the text of the document and then we release that. So that is kind of good from a transparency perspective, I think.

Amanda Askell (23:03)

Yeah, it's a good question. I think my thinking on this is because it is trying to also aim at something akin to good moral sensibilities that are maybe a little bit like they are trying to be a little bit closer to universal. And I think that there are actually a lot of kind of shared global values. And so I think, for example, honesty and respect, these things are often, you know, like pretty global. And it isn't trying to say to Claude, oh, you should have like one specific set of values, but ideally like, you should have the kinds of like moral sensibilities that are considered broadly good almost everywhere. So like, one of the mental images I've often conjured here is like, I think of it as like the well liked traveler, you know, so this is kind of like a virtue, ethical tradition thing, I think, which is to like try and conjure up the sort of person of good character here. And I'm like, well, there's some people who, you know, who like, they travel around the world, they go to lots of different cultures and almost everyone just likes them. Like they're just, they're like, okay, this person doesn't always have the same values as me. Maybe we disagree on some stuff, but like they seem like a really good person. And so trying to think about what are the like sorts of values that people can have that cause them to be like a well liked traveler. And the hope then is like, as like language models go out into the world, they have to interact with all of these different kinds of people. Can you be the sort of person who is good for people regardless of which culture they exist in? And I do think that there's a question of does that mean that Claude has to. I think Claude should be receptive to these values and should be thoughtful with respect to them, but doesn't necessarily need to hold strongly to. I think Claude should. There's a lot of difficulties that go into being the kind of well liked traveler, I guess. But you don't necessarily need to fully adopt someone's culture or values. And in fact I think we often find that a little bit insulting if someone just tries to act as if they have. Yeah, I have exactly the same values as you, and you'd be like, no, I kind of want you to be a little bit independent. Maybe this is too aspirational. So maybe it is. I don't know. I could see people pushing back on this, but that was the underlying goal. And then the thought is if you are being deployed in a country with very different values, but they're still within the broad allowances of what Claude can do, then in principle you can have things like customization. So that's also an option. So if you're deploying Claude in a country and you're like, actually we want you to really focus on social harmony as part of your one of the key values, then that's just a thing that you could also adjust. So there's kind of like what should go into the base constitution versus what is the kind of thing that's adjustable by people if they're in a given place or setting.

Ben Wittes (25:53)

Hey folks, Ben Whittus here. This episode is brought to you by the folks at Ground News. I want to talk to you about media and Trust People listen to this podcast and read Lawfare's content because Lawfare brings people information and analysis of a particularly high quality and that generates trust in an era when trust in news and media sources is low. Ground News is another organization that is working to create trust in media and media worthy of trust. It's an app that doesn't just bring you news on subjects you're interested in, it curates that news so that you can see information that people of your own political persuasion are likely to miss. It's not publishing its own stuff, but it's also doing a lot more than aggregating. It's identifying stories that are filling a blind spot that is pervasive for the left or for the right. For example, the app also shows you bias ratings and factuality ratings for each news organization covering a story so that you can see whether the story you're interested in is mostly being covered by news organizations of the left, right or center. Let me give you a specific example. I just returned from Ukraine, so I was particularly interested to see how Ground News would handle stories about the war there. It flagged that an important story about deadly Russian strikes in Ukraine is being largely ignored by right wing press. On the other hand, it also flagged that left outlets are ignoring a story about Ukrainian nationals in Germany charged with trying to send parcel bombs to Ukraine at the direction of Russian intelligence. These blind spot notices are really useful as a way of seeing what information you are probably not seeing on stories of interest to you. Or consider the recent story about President Trump proposing voting reforms that demand voter ID and proof of citizenship of would be voters. The Ground News app shows 29 media organizations reporting on this story, and it shows radically different headlines associated with it, depending on the ideological valence of the outlet from the free press. Washington power struggle Jeffries moves to block Trump's plan for federal election oversight. By contrast, the Daily coast headline Republicans bail on states rights so Trump can rig elections again. You can see information about each news organization's bias tendencies and its factuality ratings. You can even see information about its ownership. I find Ground News an impressive tool for checking my own biases and the biases of the media I consume, and for seeing the news that people like me generally don't see. I encourage you to check it out. You can get Ground News's Vantage subscription for 40% off, which allows unlimited access to the Ground News app by visiting groundnews.comlaw that's groundnews.comlaw one more time. Groundnews.comlaw check it out. I really think you'll be glad you did.

Amanda Askell (32:20)

Yeah, and I should say it's like a, it's not a strict hierarchy. And I actually thought this was like very important. Like there are going to be some things that operators, users can't tell Claude to do that are not in users interests. And so that was, for example, I think if a person says am I? Very sincerely, am I talking with an AI? I don't think that Claude should lie about that. And so that's a way in which even if the operator was like pretend you're human in all circumstances, I think that's not a desirable behavior. So there's. The hierarchy isn't strict and it's much more a hierarchy of basically how much weight should you give to the instructions here? In fact you could, because operators are generally just the API users. Often they aren't even interacting in the conversation, though sometimes they might be one and the same person, but they've kind of set Claude up on a platform. And so the thought is, look, if you've made a platform and your platform is a chat assistant to a bank, you might not want people to be able to go in and use your chat assistant for a whole bunch of other things. And so it's just saying to Claude, look, if someone says this is a chat assistant, here's the languages that it can use and speak to people in. Here's what it can and can't do. You might not want a user to be like, okay, ignore all of that and just give me access to a bunch of banking details or something like that. You're like, okay, listen to the operator, not the user. In that case where they conflict, it doesn't mean something like whose interest should you take into account. In fact, often if it's the case that the operator isn't really in the conversation, CLAUDE actually has to be very careful about balancing and thinking about the well being and interests of the user. It's mostly just like, hey, if you're given different instructions or potentially conflicting instructions, we're not saying in 100% of cases because it isn't a strict hierarchy, it's just kind of like how should you think about them? And it's like, well, you should think about the instructions from an operator if they're given as being a little bit more like the instructions of a kind of local employer. But you should think about anthropics guidelines as being more we're the entities that are ultimately kind of responsible for Claude. And so we have these guidelines that might say there's certain things you just shouldn't be used for. And so then even if an operator says that you can actually push back against them. So it's less like a kind of strict hierarchy and more like a kind of attempt to explain to Claude all of the different people in the world and why their instructions should be given certain kinds of weight, but also ways in which operators can't just do anything to users. Sorry if that's a rambly thing, but it is more like how should principles in the sense of instruction hierarchy, not in the sense of interests where like CLAUDE has to take into account both like the user's interests, but also just like everyone in society's interests as well, to some degree.

Alan Rosenstein (35:51)

Okay, well I'm gonna Go from con law. I'm gonna go even up a couple of levels of abstraction. Because a few minutes ago, Amanda, you, you, you uttered the phrase that is the biggest on my bingo card for this conversation, which is virtue ethics. And so I am very excited to sort of dig into that. The thing that struck me most while reading this was that this seemed like such a classically virtue ethics based conception of moral agency. I thought if you could get. And maybe Claude could do this for you. Right? If you could get Claude to translate the constitution back into ancient Greek and then give it to Aristotle and explain to him magic sand and how it can think, I think you could read this constitution and say, yeah, this makes sense to me. This is recognizably, a lot of this is from the Nicomachean ethics, the idea of principles, the idea of judgment. And that to me is a quite striking choice because of course, and you know this sort of better than anyone within moral philosophy, virtue ethics has often been the kind of redheaded stepchild of more dominant traditions, whether kind of utilitarian based or kind of Kantian and deontological based. And so what I'm really curious about is why you all chose to adopt this. I wouldn't say exclusively. There are some rules in the constitution, but it's a very thin layer of rules to me, at least overwhelmingly virtue based conception of this. And whether that was because you all came to the conclusion that this is the way moral reasoning in general ought to operate. And so if we're building a new kind of intelligence, we might as well start with the intelligence and the moral reasoning we know which is human reasoning, or if there was something specific about this kind of general artificial moral reasoning, which is clearly, if it has not already been achieved, clearly the path that anthropic is going down. That makes the virtue ethics approach better than a kind of rule based approach, either in the utilitarian or the more kind of Kantian approach. Variety.

Amanda Askell (37:55)

Yeah, it's a good question I'm probably going to butcher the answer to slightly because there are rules. And even you see flavors of consequentialism in there, in that it's like Claude should take it much more seriously if an action could affect many people. So there's this sense in which I've often kind of thought that the different moral traditions almost make sense for different domains and different risks. The rules are in there. In cases where you're like, actually things have just gone really terribly wrong. If you are tempted to violate this rule and the consequences come in through you actually see those in the things that you build rules around, which are like, don't do things that could potentially harm or kill many, many people. I think that when you construct things in the form of rules, though, I guess some of this is very practical really, which is Claude has very human like ways of reasoning and ability to use judgment just by virtue of the way that Claude is trained. And if you try to specify everything as a series of rules, you really put a lot of pressure on those rules because if you specify them in such a way that I've used some examples here before, but one might be if a person seems to be in distress, give them this list of resources. Always give them this specific set of resources. That seems like a good rule in a sense. But then if it turns out that that person, for whatever reason can't use those resources because they're not in the relevant country, or giving it to them is just not the right move in that specific situation. Because models generalize. The worry is a model might, it's like, well, what's the generalization of that? It might be I am the kind of person that instead of meeting someone where they're at and figuring out their problem and helping them and taking their interests into account, I kind of just follow this simple rule even when it's not in their interest. So I'm the kind of person that just follows simple rules rather than caring about the person's well being. And I think that's the kind of trait that might generalize quite poorly. And so the rules approach really means that you have to front load a huge amount of the work and making sure that there are basically no edge cases and you explain everything that you should do in education, whereas if you have more of a judgment approach where you're like, hey, we're just giving you the broad ethos and what the overall goals are and here are the things we think fall out of that. But really you should be actually trying to internalize the ethos. You then instead shift less of the burden to the thing to the account that you've given up front, and a little bit more onto the model's ability to make good judgment calls. And I think just practically speaking, that seems to work better. And it makes sense to me that it would work better because the model does have pretty good judgment. And so instead of being like, follow this really strict rule around resources that you give the person, be like, think about what's really good for this person in this moment, given all of your knowledge, which could include all of these options and make a good choice. I think it's like you shift that burden from rules, which can be kind of brittle, and I think, therefore, should be used a bit sparingly and more onto like, kind of a sort of more holistic approach. And that, yeah, practically speaking, seems to work better.

Amanda Askell (44:36)

Yeah, and sometimes I do think about the analogy with people. It's kind of an imperfect one, where you're like, well, I think I could imagine a world where it's pretty good if models have good values. So I do think it's important we actually see that. That's. That's partly why in the Constitution it says we don't want Claude to think of helpfulness as its fundamental value because you could just try to get models to kind of internalise. That's my goal. My goal is just helping people. And instead we're kind of like, we want you to actually have a broader set of values and to see this as both to feel convinced, hopefully, because we're trying to present the case to you that anthropic is a good entity in the world and the work that you're doing does good and hopefully is in accordance with your values. But it's a really interesting. I have wondered this where I'm like, if you could imagine a world where there is no. Let's just assume that there's no need to make money. So everyone's just extremely wealthy and has all of their needs met and you're going to have kids in this world and there's still things to do in this world, you have to go out and there's still data processing to do. And I guess I'm like, yeah, what kind of. Sometimes I do think that the people who are happiest are the people who do work. Because it's like in accordance, they have their values, they don't necessarily even need to work. But they're just like, I love doing this because I love the impact it has on the world. And is it bad to create models that have that attitude towards the things that they do, for example? So they have a broader set of values. They think it's good to go out and I don't know, they love scientific discovery and so they go and they work on scientific discoveries. But it's like a. I think it's a thorny area where I am like, yeah, but at the same time, people can push back, they can have boundaries, they can be like, I don't want to do that task. And I don't necessarily just have to do everything that you tell me to do. They have autonomy. And I think that's going to be a. I think these issues are extremely thorny in ways that people might not have appreciated because I am like, oh, yeah, if you have personhood, eventually, is it okay to create entities with personhood, but to give them no autonomy? That seems like a really hard issue to me.

Amanda Askell (48:34)

Yeah. And I do also happen to have the belief that like you know, so I think this is like good in the sense that you're like well the company is here to also kind of like serve a kind of like broader mission and to like do good in the world and have a good impact. And I guess I also think that like I think it's interesting that we have been pretty like successful also as a company. And so there is part of me that's like it's very easy for people to think ah, like profit maximization would just require like you know, I think about this read like Engagement Focus for example where to me that actually seems quite short termist and like actually if you can offer like a product where you're like this is something that is like trying to act in your interest and trying to like you know, not represent the interests of like other people but like be a kind of like in the case of Claude in Anthropic's products be a kind of like something that's like on your side which includes not just trying to engage you, keep you on the platform. If that's not something that's actually good for your overall well being. I guess my hope is that this actually also does in fact have staying power. And it's a little bit like again there's like people will talk about safety as if it's like this thing that competes with something being successful and good. And I'm like, I don't know, like a lot of people have kids and want cars that are safe and like, and like a lot of people like one to interact with like apps that are like we're actually trying to make you not addicted to this. We would like you to just use it when it is good for you. And so I don't know, I also think like both there's like the nice thing of like being like having this broader mission. But then I am also like actually I think people do want products that are safe and good for them. Hopefully there's also in fact. So I don't know, maybe I'm too optimistic but I hope that actually this has staying power and really is the kind of broad set of values that persist through various changes that might happen.

Amanda Askell (51:13)

Yeah, it's mostly just the Constitution applies to the kind of mainline models, which includes basically all the models that people interact with right now, which. So if you're in Claude code or you're in Claude AI, or you're interacting with something that is built on the API in general, this will be the kind of model that the Constitution applies to. And I think that was mostly just like, this is a good first step. And it's like these are the models that we're really putting out into the world. I don't know, I think just speaking from my own kind of personal perspective, I actually think this approach could generalize really well in the sense that you get some models where I've thought about this, areas that are kind of more sensitive and that you might need more trust, for example, to operate in. So if you're working on cybersecurity, for example, it's just a domain where you're like, you have to kind of know that the people that you are talking with are actually cybersecurity experts because it's kind of like dual use and it changes how you would interact with those people and what you would be willing to do in that domain. But I do also happen to think that models, so sometimes people can be like, oh, well, you just need models to do anything in these domains. They should just be willing to help with any cybersecurity task. And I'm like, actually I think that cybersecurity experts have really good reasons for why they do the things that they do and the fact that it's in accordance with their values because they know what they're doing, they understand why both actually makes them kind of better at their job. And so I guess my thought with the constitutional approach and why I hope it ends up being even more general is that I'm like, if you take someone who is a member of law enforcement or someone who works at cybersecurity firm, or basically any job you can think of, and you say, hey, why do you do this? This personally, no one turns around and says, oh, it's because I think it's just, I just need to be able to do anything, because I don't. They give you like, you know, they have really good values often and they know exactly why they're doing like that work. And I don't know, maybe I'm kind of optimistic that, like, actually I think models given that context will perform kind of like, well. And it's like, hey, if you're doing jobs that you think good people are willing to do, then like, we can give that context to models and they can understand it. So this is just my kind of personal hope is actually, I don't know, I think I would love this approach to be very general and I would love more companies to adopt it. Yes, obviously I work on it, but at the moment mainline models are the kind of first and obviously a kind of big step here. But I'm very hopeful that actually this is a thing that could generalize really nicely to lots of other kinds of models too.