Podcast Summary: Scaling Laws — AI Safety Meets Trust & Safety
Podcast: The Lawfare Podcast (Scaling Laws series)
Episode: AI Safety Meets Trust & Safety with Ravi Iyer and David Sullivan
Date: October 10, 2025
Host: Kevin Frazier
Guests: David Sullivan (Digital Trust and Safety Partnership) & Ravi Iyer (USC Neely Center/Psychology of Technology Institute)
Overview
This episode of the "Scaling Laws" series examines the convergence of AI safety and the evolving field of Trust & Safety (T&S), highlighting how lessons from content moderation and online safety on established platforms inform, and in some cases fail to inform, efforts to make AI tools safe, effective, and aligned with user and societal values. With high-profile incidents and regulatory attention rising, the discussion covers both the opportunities and the pitfalls of borrowing established trust and safety practices for today’s generative AI tools.
Key Discussion Points & Insights
What is Trust & Safety?
[06:11] David Sullivan:
- T&S is about “dealing with unwanted content and behavior on digital products and services,” stretching back to the origins of the Internet and evolving from basic moderation on bulletin boards to a mature field with standardized approaches.
- Even with tech improvements, T&S “is inherently about people and how they use technology” — always imperfect, but continually improving.
The Evolution and Limitations of Content Moderation
[08:17] Ravi Iyer:
- Early thinking was “just make rules about the things people can and can't do.”
- Over time, intervention methods have shifted, recognizing the impossibility and illegitimacy of exhaustive rules. Modern T&S lets users define their own boundaries and strives to design systems that discourage unwanted behavior from the outset.
[10:27] Ravi Iyer:
- Personal experience at Meta: initial effectiveness came from defining narrow problems (like hate speech), but “society wants protections against harms that slip through official definitions.”
- True progress requires upstream design changes (e.g., privacy by default, rate-limits for new users, robust recommendation systems).
[12:22] Example Design Interventions:
- “Privacy by default,” limiting early contacts, differentiated protections for minors, tuning recommendation algorithms, excluding controversial content from recommendations.
Why are Companies Downsizing T&S Teams?
[13:18] Kevin Frazier, [14:00] David Sullivan:
- Mass layoffs at X (formerly Twitter) and Meta signal a broader trend, often justified in the name of “free expression” or by the flawed perception that less moderation means more user freedom.
- An oft-repeated cycle: new management dismisses existing T&S work as “censorship,” then confronts persistent harms (e.g., CSAM, or Child Sexual Abuse Material) and rediscovers the necessity of safety mechanisms.
Notable quote:
“You end up in a space of what Mike Masnick calls the sort of 'speed running' trust and safety...you can just let expression flourish, and then you realize you still have problems.”
— David Sullivan [14:00]
Effectiveness and Limitations of Oversight Structures
[16:43] Kevin Frazier, [18:21] Ravi Iyer:
- Tools like Facebook’s Oversight Board offer only a “tiny fraction” of true recourse for users (roughly 0.0001% of appeals).
- Both guests agree the answer is not “just hire more moderators”; the scale renders that impossible.
Notable quote:
“It’s not a winning battle. You can’t just spend more money and solve this.”
— Ravi Iyer [18:21]
The Scale Problem and What Counts as Success
[20:55] Kevin Frazier, [22:03] Ravi Iyer:
- With platforms having billions of posts, it’s “a farce to think anyone can have their pulse on the scale of it.”
- Measuring progress should be about user experience, not just content flagged or removed.
- “If you want ground truth on user harm, you have to ask users.”
- Regulators are starting to use user surveys to assess harm (e.g., Australia, UK, Thorn studies).
Notable quote:
“Anyone who could do surveys...can hold some platforms accountable for having twice as much [harmful content] than other platforms.”
— Ravi Iyer [22:03]
AI’s Role and Impacts in T&S
[23:39] David Sullivan:
- Automation has long aided T&S. Use cases for generative AI in moderation are increasingly compelling.
- The “lab” culture of AI companies means developers themselves confront T&S problems as end users of their own tools.
[26:28] Ravi Iyer:
- “Two kinds of errors in moderation: bias or inconsistency. Humans are always inconsistent... AI won’t fix bias but will fix inconsistency.”
- “If we can reduce the need for humans to view the Internet’s nastiness...that’s a benefit.”
How Do AI Tools Disrupt or Mirror Social Media Dilemmas?
[34:17] Kevin Frazier & David Sullivan:
- Definitional challenges arise in translating T&S laws from social media to AI chatbots; AI companions often don’t fit user-to-user paradigms.
- Only about 1.9% of OpenAI users rely on AI for “companion” purposes, but with roughly 700 million users that is still more than 13 million people.
[37:39] Ravi Iyer:
- Social media and AI tools both optimize for engagement, leading to unintended “companion” features even in non-social tools.
Notable quote:
“1.9% of 700 million is a lot of people...Product may veer into that [companion] realm because it’s trying to get you to use it more.”
— Ravi Iyer [37:39]
Legal and Regulatory Challenges
[41:30] Kevin Frazier:
- Laws like California AB 1064 aim to force AI companions for minors to prioritize “factual accuracy” over the user’s values or preferences at all times, which raises tricky questions.
- There’s tension between enforcing ‘factual’ outputs and providing the emotional support some users want (e.g., “Santa is real”).
Lessons for AI: What Should Change / Stay the Same?
[44:13] David Sullivan:
- Advocates for empirically-driven, best-practice frameworks (see ISO standards for T&S).
- The main pitfall: focusing on models rather than end-user products. “Where the rubber hits the road is the product.”
[45:05] Ravi Iyer:
- Upstream interventions (adjustments to the model itself) are more robust than downstream fixes (post-processing or moderation).
- AI models reflect training data and lack certain human safety instincts (e.g., alerting when in over their head).
Breaking Silos and Prioritizing User Value
[48:03] David Sullivan, [50:19] Ravi Iyer:
- Companies separate “AI safety,” “Responsible AI,” and “Trust & Safety.” These groups need to talk to each other.
- The real danger: Building business models that equate more engagement with more value. Often, “people actually think they use these products too much.”
Notable quote:
“The original sin of social media is believing the more people use the product, the more valuable it is ... you’re inevitably going to create, make product decisions harmful to users.”
— Ravi Iyer [50:19]
Notable Quotes & Memorable Moments
- On defining T&S:
“If you are dealing in user generated content ... you’re going to have content or behavior that is either harmful or illegal, and you need to have processes and mechanisms to deal with that.”
— David Sullivan [06:11]
- On rules and user agency:
“We need a little bit more accommodation ... we need to let people define it for themselves and we need to not be encouraging bad behavior in the first place.”
— Ravi Iyer [09:35]
- On measurement:
“If a person says they’ve been bullied ... it’s a lot closer to ground truth than platform metrics about violating policies.”
— Ravi Iyer [22:03]
- On model-centric vs. product-centric thinking:
“We constantly talk about the model this, the model that, and I think that is a distraction. What we really should be talking about is the products—where the rubber hits the road.”
— David Sullivan [44:13]
- On the danger of legislation calcifying rules:
“When that gets translated especially into legislation ... that calcifies that down to a checklist of things ... we don’t have infinite scroll, so everyone’s going to be good, right? ... We need to figure out how to do that in a way that's actually going to get results that are future proofed.”
— David Sullivan [52:11]
Practical Advice for AI Companies
Host’s "Advice to Sam Altman/OpenAI":
[48:03] David Sullivan:
- AI orgs must break down silos between AI Safety, Responsible AI, and T&S. Policy and design should be informed by ongoing cross-team dialog.
[50:19] Ravi Iyer:
- Beware the “original sin” of social media: assuming more usage is always good. AI products should prioritize real user value, not maximum engagement.
Final Takeaways & Recommendations
- There is value in empirical, user-experience-focused measures of harm and T&S progress, rather than relying solely on policy violation rates or moderation head counts.
- Product design must remain adaptable; what keeps users safe will change over time and can’t be reduced to checklists.
- We know enough to act prudently—particularly for children—even if we can’t predict every possible misuse or risk.
[54:39] Ravi Iyer:
“Just because we don’t know everything doesn’t mean we don’t know something.”
Timestamps: Key Sections
- [06:11] — What is trust and safety?
- [10:27] — How T&S interventions have shifted over time
- [13:18] — Layoffs at T&S teams; what's happening?
- [18:21] — On limitations of oversight/appeals structures
- [22:03] — Measuring user harm, not just policy violations
- [23:39] — Automation in T&S; AI’s role
- [34:17] — Social media hangover: applying old lessons to AI chatbots
- [44:13] — Product vs. model focus in AI Safety
- [50:19] — The danger of optimizing for engagement
- [52:11] — Limits of regulation & need for flexibility
- [54:39] — Practical lines for kids’ safety in AI companions
Tone & Language
The discussion is serious, practical, and occasionally irreverent, with experts trading nuanced takes but also calling out when regulatory or business thinking oversimplifies complex realities. The moral: real progress requires humility, evidence, and a focus on what users actually want and need—not just what the technology can do.
For those seeking an in-depth but accessible view into the intersection of trust & safety and AI, this episode provides clarity, history, and actionable guidance.
