Podcast Summary: Scaling Laws — AI Safety Meets Trust & Safety
Podcast: The Lawfare Podcast (Scaling Laws series)
Episode: AI Safety Meets Trust & Safety with Ravi Iyer and David Sullivan
Date: October 10, 2025
Host: Kevin Frazier
Guests: David Sullivan (Digital Trust and Safety Partnership) & Ravi Iyer (USC Neely Center/Psychology of Technology Institute)
Overview
This episode of the "Scaling Laws" series examines the convergence of AI safety and the evolving field of Trust & Safety (T&S), highlighting how lessons from content moderation and online safety on established platforms inform, and in some cases fail to inform, efforts to make AI tools safe, effective, and aligned with user and societal values. With high-profile incidents and regulatory attention rising, the discussion covers both the opportunities and the pitfalls of borrowing established trust and safety practices for today’s generative AI tools.
Key Discussion Points & Insights
What is Trust & Safety?
[06:11] David Sullivan:
- T&S is about “dealing with unwanted content and behavior on digital products and services,” stretching back to the origins of the Internet and evolving from basic moderation on bulletin boards to a mature field with standardized approaches.
- Even with tech improvements, T&S “is inherently about people and how they use technology” — always imperfect, but continually improving.
The Evolution and Limitations of Content Moderation
[08:17] Ravi Iyer:
- Early thinking was “just make rules about the things people can and can't do.”
- Over time, intervention methods have shifted, recognizing the impossibility and illegitimacy of exhaustive rules. Modern T&S lets users define their own boundaries and strives to design systems that discourage unwanted behavior from the outset.
[10:27] Ravi Iyer:
- Personal experience at Meta: initial effectiveness came from defining narrow problems (like hate speech), but “society wants protections against harms that slip through official definitions.”
- True progress requires upstream design changes (e.g., privacy by default, rate-limits for new users, robust recommendation systems).
[12:22] Example Design Interventions:
- “Privacy by default,” limiting early contacts, differentiated protections for minors, tuning recommendation algorithms, excluding controversial content from recommendations.
Why are Companies Downsizing T&S Teams?
[13:18] Kevin Frazier, [14:00] David Sullivan:
- Mass layoffs at X (formerly Twitter) and Meta signal a broader trend, often justified in the name of “free expression” or by the flawed perception that less moderation means more user freedom.
- An oft-repeated cycle: new management dismisses existing T&S work as “censorship,” then confronts persistent harms (e.g., CSAM, or Child Sexual Abuse Material) and rediscovers the necessity of safety mechanisms.
Notable quote:
“You end up in a space of what Mike Masnick calls the sort of 'speed running' trust and safety...you can just let expression flourish, and then you realize you still have problems.”
— David Sullivan [14:00]
Effectiveness and Limitations of Oversight Structures
[16:43] Kevin Frazier, [18:21] Ravi Iyer:
- Tools like Facebook’s Oversight Board offer only a “tiny fraction” of true recourse for users (roughly 0.0001% of appeals).
- Both guests agree the answer is not “just hire more moderators”; the scale renders that impossible.
Notable quote:
“It’s not a winning battle. You can’t just spend more money and solve this.”
— Ravi Iyer [18:21]
The Scale Problem and What Counts as Success
[20:55] Kevin Frazier, [22:03] Ravi Iyer:
- With platforms having billions of posts, it’s “a farce to think anyone can have their pulse on the scale of it.”
- Measuring progress should be about user experience, not just content flagged or removed.
- “If you want ground truth on user harm, you have to ask users.”
- Regulators are starting to use user surveys to assess harm (e.g., Australia, UK, Thorn studies).
Notable quote:
“Anyone who could do surveys...can hold some platforms accountable for having twice as much [harmful content] than other platforms.”
— Ravi Iyer [22:03]
AI’s Role and Impacts in T&S
[23:39] David Sullivan:
- Automation has long aided T&S. Use cases for generative AI in moderation are increasingly compelling.
- The “lab” culture of AI companies means developers themselves confront T&S problems as end users of their own tools.
[26:28] Ravi Iyer:
- “Two kinds of errors in moderation: bias or inconsistency. Humans are always inconsistent... AI won’t fix bias but will fix inconsistency.”
- “If we can reduce the need for humans to view the Internet’s nastiness...that’s a benefit.”
How Do AI Tools Disrupt or Mirror Social Media Dilemmas?
[34:17] Kevin Frazier & David Sullivan:
- Definitional challenges arise in translating T&S laws from social media to AI chatbots; AI companions often don’t fit user-to-user paradigms.
- Only about 1.9% of OpenAI users rely on AI for “companion” purposes, but with roughly 700 million users that is still more than 13 million people.
[37:39] Ravi Iyer:
- Social media and AI tools both optimize for engagement, leading to unintended “companion” features even in non-social tools.
Notable quote:
“1.9% of 700 million is a lot of people...Product may veer into that [companion] realm because it’s trying to get you to use it more.”
— Ravi Iyer [37:39]
Legal and Regulatory Challenges
[41:30] Kevin Frazier:
- Laws like California AB 1064 aim to force AI companions for minors to prioritize “factual accuracy” over the user’s values or preferences at all times, which raises tricky questions.
- There’s tension between enforcing ‘factual’ outputs and providing the emotional support some users want (e.g., “Santa is real”).
Lessons for AI: What Should Change / Stay the Same?
[44:13] David Sullivan:
- Advocates for empirically-driven, best-practice frameworks (see ISO standards for T&S).
- The main pitfall: focusing on models rather than end-user products. “Where the rubber hits the road is the product.”
[45:05] Ravi Iyer:
- Upstream interventions (adjustments to the model itself) are more robust than downstream fixes (post-processing or moderation).
- AI models reflect training data and lack certain human safety instincts (e.g., alerting when in over their head).
Breaking Silos and Prioritizing User Value
[48:03] David Sullivan, [50:19] Ravi Iyer:
- Companies separate “AI safety,” “Responsible AI,” and “Trust & Safety.” These groups need to talk to each other.
- The real danger: Building business models that equate more engagement with more value. Often, “people actually think they use these products too much.”
Notable quote:
“The original sin of social media is believing the more people use the product, the more valuable it is ... you’re inevitably going to create, make product decisions harmful to users.”
— Ravi Iyer [50:19]
Notable Quotes & Memorable Moments
- On defining T&S:
“If you are dealing in user generated content ... you’re going to have content or behavior that is either harmful or illegal, and you need to have processes and mechanisms to deal with that.”
— David Sullivan [06:11]
- On rules and user agency:
“We need a little bit more accommodation ... we need to let people define it for themselves and we need to not be encouraging bad behavior in the first place.”
— Ravi Iyer [09:35]
- On measurement:
“If a person says they’ve been bullied ... it’s a lot closer to ground truth than platform metrics about violating policies.”
— Ravi Iyer [22:03]
- On model-centric vs. product-centric thinking:
“We constantly talk about the model this, the model that, and I think that is a distraction. What we really should be talking about is the products—where the rubber hits the road.”
— David Sullivan [44:13]
- On the danger of legislation calcifying rules:
“When that gets translated especially into legislation ... that calcifies that down to a checklist of things ... we don’t have infinite scroll, so everyone’s going to be good, right? ... We need to figure out how to do that in a way that's actually going to get results that are future proofed.”
— David Sullivan [52:11]
Practical Advice for AI Companies
Host’s "Advice to Sam Altman/OpenAI":
[48:03] David Sullivan:
- AI orgs must break down silos between AI Safety, Responsible AI, and T&S. Policy and design should be informed by ongoing cross-team dialog.
[50:19] Ravi Iyer:
- Beware the “original sin” of social media: assuming more usage is always good. AI products should prioritize real user value, not maximum engagement.
Final Takeaways & Recommendations
- There is value in empirical, user-experience-focused measures of harm and T&S progress, rather than relying solely on policy violation rates or moderation head counts.
- Product design must remain adaptable; what keeps users safe will change over time and can’t be reduced to checklists.
- We know enough to act prudently—particularly for children—even if we can’t predict every possible misuse or risk.
[54:39] Ravi Iyer:
“Just because we don’t know everything doesn’t mean we don’t know something.”
Timestamps: Key Sections
- [06:11] — What is trust and safety?
- [10:27] — How T&S interventions have shifted over time
- [13:18] — Layoffs at T&S teams; what's happening?
- [18:21] — On limitations of oversight/appeals structures
- [22:03] — Measuring user harm, not just policy violations
- [23:39] — Automation in T&S; AI’s role
- [34:17] — Social media hangover: applying old lessons to AI chatbots
- [44:13] — Product vs. model focus in AI Safety
- [50:19] — The danger of optimizing for engagement
- [52:11] — Limits of regulation & need for flexibility
- [54:39] — Practical lines for kids’ safety in AI companions
Tone & Language
The discussion is serious, practical, and occasionally irreverent, with experts trading nuanced takes but also calling out when regulatory or business thinking oversimplifies complex realities. The moral: real progress requires humility, evidence, and a focus on what users actually want and need—not just what the technology can do.
For those seeking an in-depth but accessible view into the intersection of trust & safety and AI, this episode provides clarity, history, and actionable guidance.
