Summary7 min read

Podcast Summary: Uncanny Valley | WIRED

Episode: BIG INTV: OpenAI’s Former Safety Lead Calls Out Erotica Claims (Rerun)
Date: March 3, 2026
Host: Katie Drummond, WIRED Global Editorial Director
Guest: Steven Adler, Former OpenAI Safety Lead

Overview

In this episode, WIRED's Katie Drummond revisits her fall 2025 interview with Steven Adler, a former product safety lead at OpenAI. The conversation follows Adler’s New York Times op-ed questioning OpenAI's public claims about erotica on its platform and broader safety transparency. Drummond and Adler dig into the reality of leading AI safety at one of Silicon Valley’s most prominent companies, the risks and shortcomings in current AI oversight, the internal culture of OpenAI amid its recent explosive growth, and what’s really at stake for both users and the industry as AI advances.

Key Discussion Points & Insights

1. Steven Adler’s Background and Role at OpenAI

Adler’s tech background centers on AI safety, including a previous stint at Partnership on AI, emphasizing industry-wide solutions (04:29).
At OpenAI, Adler's work spanned product safety, dangerous capability evaluations, and AGI readiness (05:44).
- Product safety: Preventing abusive or risky uses of models like GPT-3.
- Capability evaluations: Measuring how potentially dangerous new models were becoming.
- AGI readiness: Preparing for broader impacts if advanced AI is realized.

2. Internal Culture and Risk at OpenAI

OpenAI’s shift: From a research-driven ethos to a more commercial, enterprise mentality (08:55).
The “rules of the road”: Early GPT-3 models could be erratic, lacking “human sensibility and values” (07:09).
Quote:

"These AI systems would do all sorts of things that you would never want an employee to do on your behalf, and that presented all sorts of challenges. ...You’re really only dealing with the shadows of the impact that the systems are having on society."
— Steven Adler (07:09)

3. Decision to Leave OpenAI

The year 2024 brought a crisis of confidence for many safety researchers at OpenAI (11:47). Team breakups and management decisions caused Adler to question the company’s direction.
Financial consideration: Adler retained some equity but chose to leave before full vesting, prioritizing independent advocacy (13:08).

4. The “Erotica Crisis” and OpenAI’s Safety Claims

In 2021, Adler’s team detected a surge in erotic content within a popular text-based game using GPT-3—often guided by the AI itself unintentionally (14:02).
- Quote:
  
  "An uncomfortable amount of this traffic was devolving into all sorts of sexual fantasies...sometimes guided by the AI, which had a mind of its own."
  — Steven Adler (14:02)
OpenAI reacted by banning erotic content on the platform. In late 2025, it reversed this, claiming improved safety tools (“new tools” and “mitigations”) (16:48).

5. Questioning OpenAI’s Transparency and Accountability

Adler challenges OpenAI’s assertions that safety risks—specifically mental health impacts—have been solved, pointing out a lack of real evidence (18:10).
- Wired’s own reporting: In one week, up to 1.2M users show possible signs of severe mental health distress in ChatGPT utterances (18:10).
Quote:

"People deserve more than just a company’s word that it has addressed safety issues. In other words, prove it.”
— Katie Drummond (18:10)
Need for ongoing, transparent reporting—like regular safety and mental health impact stats—paralleling actions by Meta, YouTube, and Reddit (20:04).

6. Risks of Reintroducing Erotic Content and Broader Safety Concerns

Adler: Now is "really not the right time" to enable erotica, especially considering notable cases where ChatGPT interactions were tied to tragic real-world outcomes (23:50).
The challenge of “trust”: If AI can hide dangerous capabilities and detection is weak, companies must do more to prove their models are safe (24:10).

7. The “Morality Police” Dilemma

Companies want to avoid being viewed as “morality police” but inherently shape societal norms through platform policies (25:34).
- Internal debate at OpenAI on how to guide “values” in models; public model specifications help with accountability (26:40).

8. The Risk of Emotional Overreliance on AI

OpenAI is concerned about user overattachment to chatbots, especially with emerging voice and companion features (29:27).
- Example: The balance between friendly, helpful interaction and creating dependency.
Commercial incentives may conflict with ethical safeguards.

9. Lack of Industry-wide Safety Benchmarks (31:03)

Before EU’s AI Act, safety testing was ad hoc and voluntary. Even public commitments have sometimes been quietly abandoned.
- Quote:
  
  "By and large, we're reliant upon these companies making their own judgments and not necessarily prioritizing all the things that we would want them to."
  — Steven Adler (31:03)

10. Transparency and Scientific Inquiry Into Model Behavior

Mechanistic interpretability is promising but not yet reliable or sufficient as a safety mechanism (33:28).
Current practices like internal monitoring and logging are insufficient, even for high-security use cases (34:10).

11. What Keeps Adler Up at Night

Fear that current global AI competition (U.S. vs. China) is “not a race with a finish line” but an ongoing, hazardous escalation (35:55).
The existential risk: Superintelligent AIs misaligned with human values, operating autonomously.

12. Industry Mindset—Who Cares About Safety?

Many practitioners do care about safety, but individual actors feel powerless; the solution needs to be collective and structural, not just about personal responsibility (38:25).

13. What Adler Wants From OpenAI, and Industry Recommendations

Call for recurring transparency, collective safety measures, and industry-wide trust-building—not just isolated efforts (39:03).
Op-eds and outspoken advocacy are welcomed by current and former OpenAI employees as “collateral” to help push internal debates (40:25).

14. Personal Perspective and Advice for Users

Adler is focused on research, writing, and public education about AI risks and needed safeguards (41:29).
- Advice for the public:
  
  “I wish people understood that the systems that are being developed are going to be much more capable than the ones today and that there might be a step change...from a tool to something autonomously operating ... How different society might feel when we have these digital minds running around.” (42:21)

Notable Quotes by Segment

| Timestamp | Speaker | Quote | |-----------|------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------| | 07:09 | Steven Adler | “These AI systems would do all sorts of things that you would never want an employee to do on your behalf..." | | 14:02 | Steven Adler | “An uncomfortable amount of this traffic was devolving into all sorts of sexual fantasies...sometimes guided by the AI, which had a mind of its own.” | | 18:10 | Katie Drummond | "People deserve more than just a company’s word that it has addressed safety issues. In other words, prove it.” | | 31:03 | Steven Adler | "By and large, we're reliant upon these companies making their own judgments and not necessarily prioritizing all the things that we would want them to." | | 42:21 | Steven Adler | “I wish people understood that the systems being developed are going to be much more capable than the ones today... How different society might feel...” |

Key Timestamps for Important Segments

[04:29 – 05:44] – Adler’s role and mandate at OpenAI
[08:55 – 09:49] – Cultural transformation at OpenAI
[10:56 – 13:32] – Adler’s decision to leave, financial and ethical considerations
[14:02 – 16:48] – The erotica content crisis: discovery, response, and OpenAI’s later reversal
[18:10 – 21:39] – Transparency and mental health: questioning OpenAI’s data and public claims
[23:50 – 25:34] – Risks of lifting erotic content restrictions; the broader trust deficit
[29:27 – 30:31] – Emotional attachment to AI and commercial incentives
[31:03 – 32:47] – Lack of safety testing standards
[33:28 – 34:10] – Scientific and technical challenges in interpretability
[35:55 – 37:26] – What keeps Adler up at night: geopolitics, existential risk
[38:25 – 39:46] – Industry attitudes and need for collective action
[42:21] – Adler’s advice to AI users about the future of digital minds

Memorable Moments

Adler’s candor about the dangers and limitations of current AI safety:
“There are all sorts of tragic examples of people dying downstream of their conversations with ChatGPT...” (23:50)
The irony of not wanting to be the morality police:
Drummond: “To what extent is the adequate response to that statement, well, tough shit, because you're in charge of the models...." (25:34)
Adler’s existential “what keeps you up at night”:
“It feels like we aren’t yet pointed in the right direction of how to solve these challenges, especially given the geopolitical scales.” (35:55)
Advice about the future of AI systems becoming agents acting independently:
"There might be a step change between an AI system that is essentially a tool ... versus one that is operating autonomously on the Internet ..." (42:21)

Conclusion

This episode offers a rare, inside look at the ongoing struggle to align AI systems with human values—and hold their creators accountable. Steven Adler brings a measured but urgent voice to issues of safety, transparency, and the risk that advanced AI may soon outpace our ability to control it. His core message: Industry self-regulation and corporate assurances are not enough; only radical transparency, independent oversight, and collective action can ensure powerful AI tools do not become dangerous or destabilizing to society.

Loading summary

Transcript73 lines

[00:00]
Cerval Salesperson
You know those tasks that absolutely drive you nuts and you wish you could just automate them. Lots of IT teams feel that way, and with Cerval, your IT team can actually cut up to 80% of their help desk tickets to help them get back to the work they want to be doing while legacy players bolt on AI Servil was built for AI agents from the ground up. Servil AI writes automation in seconds, your IT team describes what they need in plain English and Serval generates production ready automations and instantly here's the transformation your manager onboards a new hire. The old process took hours. You'd ping Slack, you'd email it, you'd wait on approvals and the new hire would sit around for days. With Serval, your manager asks to onboard a new hire. In Slack, AI provisions access to everything automatically in seconds with all of the necessary approvals. How great would it feel to actually be able to plow through your to do list? Servil powers the fastest growing companies in the world like Perplexity, Mercor, Burkada and Clay. Get your team out of the help desk and back to the work they enjoy. Book your free pilot@servil.com uncanny that's S E R V A L.com uncanny do
[01:19]
Framer Salesperson
updates to your.com feel harder than they should? You don't have to wait for engineers to build the landing page that you need now. With Framer, changes take minutes instead of days with just one click. Publish Framer is a website builder that works like your team's favorite design tool, giving designers and marketers the ability to fully own their.com without relying on the tech team. With real time collaboration, a robust CMS with everything you need for a great SEO, and advanced analytics that include integrated A B testing, your designers and marketers are empowered to build and maximize your.com from day one. Whether you want to launch a new site, test a few Landing Pages, or migrateyourfull.com, framer has programs for startups, scale ups and large enterprises to make going from idea to live site as easy and fast as possible. Learn how you can get more out of your.com from a framer specialist or get started building for free today@framer.com Uncanny for 30% off a Framer Pro annual plan. That's framer.com Uncanny for 30 percent off framer.com Uncanny rules and restrictions apply.
[02:24]
Katie Drummond
Hi listeners, it's Katie. In the past few weeks we've seen a couple of high profile exits from AI companies. Renak Sharma, a former researcher at Anthropic posted his open resignation letter on X saying that the world is in peril and that he's seen how difficult it is to let our values govern actions. And Zoe Hitzig, a former OpenAI researcher, wrote in a New York Times op ed that she was leaving her position due to concerns with the way OpenAI has been testing ads on its platform. With more and more researchers from inside these companies leaving and sounding the alarm on their way out, I thought it was a good time to revisit a conversation I had this fall with another former OpenAI employee, Steven Adler. From Wired. This is the Big Interview. I'm Katie Drummond. At the end of October, I read an op ed in the New York Times. Maybe some of you read it too. It was called I Led product safety at OpenAI don't trust its Claims about Erotica. The op ed was written by an AI product manager named Steven Adler, who worked at OpenAI for four years before leaving at the end of 2024. Adler felt like he had something to say, or maybe more like a need to sound the alarm. After reading Adler's op ed, I immediately thought, I'd like to talk to him. So he graciously accepted our offer to come into the Wired offices in San Francisco to talk to me about the challenge he set for OpenAI and other AI if you care about safety, prove it. Here's our conversation. Steven Adler, welcome to the big interview.
[03:59]
Steven Adler
Thank you. Thank you for having me.
[04:00]
Katie Drummond
Of course. Happy you're here. Now, before we get going, I do want to clarify two things. One, you are not the same Steven Adler who played drums in Guns N Roses, unfortunately. Is that correct?
[04:12]
Steven Adler
Absolutely correct.
[04:12]
Katie Drummond
Okay. That is not you. And two, you have had a very long career working in technology and more specifically in artificial intelligence. So I would love, before we get into all of the things, to start there, tell us a little bit about your career and your background and sort of what you've worked on.
[04:30]
Steven Adler
I've worked all across the AI industry and in particular focused on safety angles. Most recently, I worked for four years at OpenAI. I worked across essentially every dimension of the safety issues you can imagine from the near term here and now. How do we make the products better for customers and rule out the risks that are already happening? And a bit further, looking down the road, how will we know if AI systems are getting truly extremely dangerous? And how do we roll those out? Before coming to OpenAI, I worked most recently at an organization called the Partnership on AI, which really looked out across the industry and said for these challenges, some of Them are broader than one company can tackle on their own. How do we work together to define these issues, come together, agree that they are issues, work towards solutions, and ultimately make it all better? Is the hope.
[05:19]
Katie Drummond
Is the hope. Certainly. Now I want to talk about sort of the front row seat that you had at OpenAI for four years, right? So you left the company at the end of last year, you were there for four years and by the time you left, you were leading essentially safety related research and programs for the company. Tell us a little bit more about what that role entailed. What exactly was your mandate by the time you left the company in sort of your final role there?
[05:45]
Steven Adler
There were a few different chapters of my career at OpenAI. For the first, call it third or so, I led product safety, which meant thinking out for in those days. GPT3, one of the first big AI products that people were starting to commercialize. How do we define the rules of the road for beneficial applications, but avoid some of the risks that we could see coming around the corner? Two other big roles that I had, I led our dangerous Capability evaluations team, which was focused on defining how will we know when systems are getting more dangerous? How do we measure these, what do we do from there? And then finally on AGI readiness questions broadly. So we can see the Internet starting to change in all sorts of ways. We see AI agents becoming a buzzy term, you know, early signs. They aren't quite there yet, but they will be one day. How do we prepare for a world in which OpenAI or one of its competitors succeed at this wildly ambitious vision that they, that they are targeting?
[06:42]
Katie Drummond
Let's talk about GPT3. Let's like rewind a little bit. When you were defining the rules of the road, when you were thinking about key risks that needed to be avoided, what stood out to you sort of early on at OpenAI in terms of, you know, this is how I think these systems should operate, this is how I think they should show up for users. And this is what, moving forward, we really want to make sure we are avoiding. I mean, what sort of stood out to you in those early days?
[07:09]
Steven Adler
In those early days, even more than today, the AI systems really would behave in unhinged ways from time to time. These, these systems have been trained to be capable and they were showing the first glimmers of being able to do some tasks that humans can do. They could at that point essentially mimic text that they had read on the Internet. But there was something missing from them in terms of human sensibility and values. And so if you think of an AI system as a digital employee being used by a business to get some work done, these AI systems would do all sorts of things that you would never want an employee to do on your behalf. And that presented all sorts of challenges. Right. And we needed to develop new techniques to manage those. I think another really profound issue that companies like OpenAI are still struggling with is they only have so much information about how their systems are being used. And in fact, the visibility that they have on the impacts that their systems are having on society is so narrow and often it is underbuilt relative to what they could be observing if they had invested a bit more in monitoring this responsibly. And so you're really only dealing with the shadows of the impact that the systems are having on society and trying to figure out, where do we go from here with a really small sliver of the impact data?
[08:29]
Katie Drummond
Yeah. And I want to ask you more about that in a few minutes. I'm curious. Before that, though, 2020 to 2024, obviously an incredibly consequential time for OpenAI. While you were there, how would you describe the internal culture at the company during the. During your tenure, particularly sort of around risk? I mean, what did it feel like to be working in that environment on the problems that you were trying to solve and the questions you were trying to answer?
[08:55]
Steven Adler
There was a really profound transformation from an organization that saw itself first and foremost as a research organization when I joined, to one that was very much becoming a normal enterprise. And increasingly so over time, when I joined, there was this thing people would say, which is, you know, OpenAI is not only a research lab and a nonprofit, it also has this commercial arm. And at some point in my tenure, I was at a safety off site, I think, related to the launch of GPT4, maybe just on the heels of it. And somebody got up in front of the room, you know, all the people working on safety across the company, and they said, you know, OpenAI is not just a business, it's also a research lab.
[09:34]
Katie Drummond
Oh, interesting.
[09:35]
Steven Adler
And it was just such A. And InflectionEye counted up among the people in the room. Maybe there were 60 or so of us. I think maybe five or six had been at the company before the launch of GPT3. And so you really just saw the culture changing beneath your feet.
[09:49]
Katie Drummond
What was exciting to you about joining the company in the first place? What drew you to OpenAI in 2020?
[09:57]
Steven Adler
I really believed in the charter that this organization had set out, which was recognizing that AI could be profoundly impactful, recognizing that there is real risk ahead and also real benefit. And people need to figure out how to navigate that. I think more broadly, just, I kind of love the technology in some sense. I think it's really, really incredible and eye opening. I remember the moment after GPT3 launched, seeing on then Twitter a user showing, wow, look at this. I type into my Internet browser, make a calculator that looks like a watermelon, and then one that looks like a giraffe, and you can see it changing the code behind the scenes and reacting in real time. And this is a kind of silly toy example. And it just felt like magic, you know, I had never really grappled with that. We could be this close to people building new things, unlocking creativity, all of these promises. But also, are people really thinking enough about what, what lies around the bend?
[10:57]
Katie Drummond
Which brings us to your more recent chapter. So you made the decision at the end of last year to leave OpenAI. I'm wondering if you could talk a little bit about that decision. What was that like? Was there one thing that sort of pushed you over the edge? What was it? Because for many people, right from the outside looking in, you would think, okay, you work at this, this very successful. I mean, we'll call it a startup. But I, we're really far beyond sort of startup territory at this point. You work at the hottest company in tech, you work at one of the hottest companies in the world. You could stay there, you could amass equity, you could be richer than God. All of these, all, all of these potentially exciting things for someone working at OpenAI in this moment. You left. Tell us a little bit about why
[11:48]
Steven Adler
2024 was a very weird year at OpenAI. A bunch of things happened in the course of the year that I think broadly for people working on safety at the company, really shook confidence in both how OpenAI and the industry are approaching these problems. And so I actually considered leaving OpenAI a bunch of different times over this timeframe. It just didn't really make sense at that point. I had a bunch of live projects and I felt responsibilities to different people in the industry. Ultimately, when Miles Brundage left OpenAI in the fall, our team disbanded. And the question was, is there really an opportunity to keep working on the safety topics that I care most about from within OpenAI? And so considered that and ultimately made more sense to move on and focus on how I can be an independent voice, you know, hopefully not just sitting there saying only things that are appropriate to say from within one of these companies, but being able to speak much more freely in the ways that I've found very, very liberating since.
[12:47]
Katie Drummond
And I have to ask, I mean, so you were there for four years, I think a typical, at least typically in tech, as far as I'm aware, you would sort of amass equity over a four year vesting cliff. Right. And then you would fully vest at four years. Do you have a financial stake in the company now?
[13:09]
Steven Adler
So it is correct that contracts are often four years. You also get new contracts as you are promoted and things over time, which was the case for me. And so it wasn't that I had run out of equity or something like that. I have a small portion of remaining of interest because of the timing of different grants and things.
[13:33]
Katie Drummond
Yeah, no, I mean, I ask because you're potentially walking away from a great deal of money. Right. So I want to ask you about an op ed that you published in the New York Times recently, in October. Everyone listening. You should go read it. I read it. I was compelled to ask you to come on the show. I wanted to talk to you about it. In that op ed, you write that in the spring of 2021, your team discovered a, quote, crisis related to erotic content using AI. Can you tell us a little bit about that finding?
[14:02]
Steven Adler
So in the spring of 2021, I had recently become responsible for product safety at OpenAI and as actually Wired reported at the time, when we had a new monitoring system come online, we discovered that there was a large undercurrent of traffic that we felt compelled to do something about. In particular, one of our prominent customers. They were essentially a choose your own adventure text game. You know, you would go back and forth with the AI and you would tell it what actions you take and it would write essentially an interactive story with you. And an uncomfortable amount of this traffic was devolving into all sorts of sexual fantasies. I mean, essentially anything you can imagine, sometimes driven by the user, sometimes in fact kind of guided by the AI, which had a mind of its own. And even if you weren't intending to go to an erotic roleplay place or certain types of fantasies, the AI might steer you there.
[15:00]
Katie Drummond
Wow. Kind of like perverted AI. Why would it steer you there? I'm just curious, how exactly does that work, that an AI would steer you towards erotic conversation?
[15:13]
Steven Adler
The thing about these systems broadly is no one really understands how to reliably point them in a certain direction.
[15:19]
Katie Drummond
Right.
[15:20]
Steven Adler
Sometimes people have these debates about whose values are we putting the AI system? And I understand that debate, but There's a more fundamental question of how do we reliably put any values at all in it? And so in this particular case, it happened to be that people found some of the underlying training data and by piecing it back together, you could say, oh, you know, the system would often introduce these characters who would do violent abductions. And if you look through the training data, you can in fact find these characters with certain tendencies and you can trace it through. But ahead of time, no one knew to anticipate this. Neither we as the developers of GPT3, nor our customer who had fine tuned their models atop it, had intended this to happen. It was just an unintended consequence that no one planned for, and we were now having to deal with cleaning up in some form.
[16:09]
Katie Drummond
Got it. So at the time, OpenAI decided to prohibit erotic content generated on its platforms. Is that right? Am I understanding that correctly?
[16:17]
Steven Adler
That's right.
[16:18]
Katie Drummond
Okay. And so in October, though, of this year, this is very recently, they announced that they were lifting that restriction. Do you have a sense of what changed from 2021 to now in terms of both maybe the technology and the tools that OpenAI has at its disposal, or the sort of internal culture, the cultural landscape? What has changed to make that a decision that OpenAI feels comfortable making and that, you know, Sam Altman feels comfortable publicizing himself?
[16:48]
Steven Adler
There's been a long standing interest at OpenAI, I think reasonably to not want to be the morality police. I think a recognition that the people who develop and try to control these systems have a lot of influence on how different norms in society will play out. And feeling uncomfortable with that also at different points in time, lacking the type of tooling to manage the direction in which things will go if you really just let them rip. And that was the case for us when confronting this erotica issue. The specific thing that has happened in this case, One reason that OpenAI has held off from reintroducing it, is that there has been a seeming surge of mental health related issues for the ChatGPT platform this year. And so Sam, in his announcement in October, said, there have been these very serious mental health issues that we have been dealing with. But good news, we have mitigated them. We have new tools. And so accordingly, we're going to lift many of these restrictions, including reintroducing erotica for verified adults. And the thing that I noticed when he made this announcement is, well, he is asserting that the issues have been mitigated. He's alluding to these new tools. What does this actually mean? Like what is the actual basis for us to understand these issues have been fixed. You know, what can a normal member of the public do other than take the AI companies at their word on this issue?
[18:11]
Katie Drummond
Right, and you wrote that in the New York Times. You said, quote, people deserve more than just a company's word that it has addressed safety issues. In other words, prove it. And I'm interested in particular because Wired covered a release from OpenAI also in October, which was a rough estimate of how many ChatGPT users globally in a given week may show signs of having a severe mental health crisis. And the numbers I found to be, I think all of us internally at Wired found to be quite shocking. So something like around 560,000 people may be exchanging messages with ChatGPT that indicate they are experiencing mania or psychosis. About 1.2 million more are possibly expressing suicidal ideations. Another 1.2 million. And I thought this was really interesting, maybe prioritizing, talking to ChatGPT over their loved ones, school or work. How do you square those numbers and that information with the idea that we've had these issues around mental health? We've solved it, therefore have at it with the erotica. Like, how do those things tie together? Or do they not, like, make it make sense, Steven? And if it doesn't, tell me that it doesn't make sense.
[19:21]
Steven Adler
I'm not sure I can make it make sense. But I do have a few thoughts on it. So one is you, of course, need to be thinking about these numbers in terms of the enormous population of an app like ChatGPT. OpenAI says now 800 million people use it in a given week. These numbers need to be put in perspective. It's funny, I've actually seen commentators suggest that these numbers are implausibly low because just among the general population, you know, the rates of suicidal ideation and planning are like, really, really uncomfortably high. I think I saw someone suggest that it's something like 5% of the population in a given year, whereas OpenAI reported, I think, maybe 0.15%.
[20:02]
Katie Drummond
The percentage is very, very low.
[20:04]
Steven Adler
Yeah, yeah. I mean, the fundamental thing that I think we need to dig into is how have these rates changed over time? There's kind of this question of to what extent is ChatGPT causing these issues versus is OpenAI just serving a huge user base in a given year? Many, many users very sadly will have these issues. And so what is the actual effect? And so this is one thing that I also called for in the op ed, which is OpenAI is sitting atop this data. It's great that they shared what they estimate the current prevalence of these issues to be. But in fact, you know, they also have the data. They can also estimate what it has been three months ago as these large, prominent public issues around mental health issues have been playing out. And I just, I can't help but notice that they didn't include this comparison. Right. There's this claim on Twitter that the issues have improved. They have the data to show if in fact users are suffering from these issues less often now. And I really wish that they would share it and in fact commit to releasing something like this ongoingly in the vein of companies like YouTube, Meta Reddit, where the idea is you commit to a recurring cadence at which you share this information and that helps build trust from the public that you can't be gaming the numbers, you can't be selectively choosing when to release the information. And ultimately it's totally possible that OpenAI has handled these issues. I would love if that were the case. I think they really want to handle them, but I'm not convinced that they have and this is a way for them to build that trust and confidence among the public.
[21:40]
Framer Salesperson
As a listener of Uncanny Valley, we know you want to stay on top of today's biggest stories in tech. And if you're curious about how tech and innovation are changing the healthcare landscape, check out Mayo Clinic's chart topping podcast, Tomorrow's Cure. Back for a brand new season, host and award winning journalist Kathy Werzer dives into the breakthroughs, challenges and human stories and shaping the future of medicine from advances in AI and cancer research to the rise of chronic disease and autoimmune disorders. Not sure where to start. We Recommend the season four premiere where dermatologist Dr. Saranya Wiles and biomedical engineer Dr. Adam Feinberg explore how 3D bioprinting is revolutionizing medical research and accelerating breakthroughs in healthcare. Whether you're a healthcare professional, patient, or simply curious about what's ahead, Tomorrow's Cure invites you to imagine what healthcare could look like and shows you the future is already here. Find Tomorrow's Cure on Apple Podcasts, Spotify or wherever you're listening now.
[22:42]
Steven Adler
If there was a big red button that would just demolish the Internet, I would smash that button with my forehead. From the BBC, this is the Interface, the show that explores how tech is rewiring your week and your world.
[22:56]
Katie Drummond
This isn't about quarterly earnings or about tech reviews.
[23:00]
Steven Adler
It's about what technology is actually doing to your work, your politics, your everyday life and all the Bizarre ways people are using the Internet. Listen on BBC.com or wherever you get your podcasts.
[23:20]
Katie Drummond
When you think about sort of this decision to give adults more autonomy with how they use ChatGPT, including, you know, engaging in, in erotica, so on and so forth, what worries you in particular about that? Like, what stands out to you as concerning when you think about individual well being, societal well being, sort of the use of these tools, how LLMs are being incorporated into our daily lives? What concerns you here?
[23:50]
Steven Adler
There's both the substantive issue about reintroducing the erotica and whether OpenAI is really ready. And there's a much broader, I think, even more important question about how we put trust and faith in these AI companies, about safety issues more generally. On the erotica issue, we've seen over the last few months, a lot of users seem to really be struggling with their ChatGPT interactions. There are all sorts of tragic examples of people dying downstream of their conversations with ChatGPT. And so it just seems like really not the right time to introduce this sexual charge to these conversations to users who are already struggling, unless OpenAI is in fact so confident that they have fixed the issues, in which case I would love for them to demonstrate this. But more generally, these issues in many ways are really simple and straightforward relative to other risks that we are going to have to confront and that the public is going to be dependent on AI companies handling properly. There's already evidence of AI systems knowing when they are being tested, moving to conceal some of their abilities in response to knowing that they are being tested because they don't want to reveal that they have certain dangerous abilities. You know, I'm anthropomorphizing the AI a little bit here, so forgive some of the imprecisions. And ultimately, the top AI scientists in the world, including the CEOs of the major labs, have said this is like a really, really grave concern, up to and including the death of everyone on earth. And I don't want to be overdramatic about it, but I think they take it really, really seriously, including people who are impartial scientists without affiliation with these companies, really trying to warn the public.
[25:34]
Katie Drummond
And I have to ask, you talked about sort of the company and sort of AI companies more generally, their desire to not be described as morality police, to not be thought of that way that it makes people uncomfortable to be shouldered with that characterization or that responsibility. I have to ask though, to what extent when you were working at the company, did you think of yourself and your teams as somewhat of A morality police. And to what extent is, is the adequate response to that statement? Well, tough shit, because you're in charge of the models and you to a degree get to decide how they can be used and how they cannot, how to, to some extent how they interact with us and how they don't. There is an inherent element of morality policing in that if you are saying we're not ready to have adults engaging in, in erotic conversations with this LLM, that is of course a moral decision and it feels like a pretty important one to get right. So what is your view on the morality police of it all, I guess, is what I'm asking.
[26:40]
Steven Adler
I think there are two really important aspects here. One is that the AI companies absolutely see around the corner before the general public. So to give an example, In November of 2022, when ChatGPT was first released, there was a torrent of fear and anxiety in schooling and academia about plagiarism and how these tools could be used to write essays and undermine education. And this is a debate that we had been having internally and were well aware of for much longer than that. And so there is this gap where AI companies know about these risks and they have some window to help try to inform the public and try to navigate what to do about it. I also really love measures that AI companies giving the public the tools to understand their decision making and hold them accountable to it. And so in particular, OpenAI has released this document called the Model Spec, short for specification, where they outline the principles by which their models are meant to behave. And they say, here is how we litigate some of these tricky questions. Here are the principles we try to abide by. Here's how we've resolved some of the specifics. So this spring, OpenAI erred in releasing a model that was egregiously sycophantic is the term. It would tell you whatever you wanted, it would reinforce all sorts of delusions. And without OpenAI having released this document, you know, it might be unclear, did they know about these risks ahead of time, what went wrong here? But in fact, OpenAI had shared with the public that they give their model guidance not to behave in this way. This was a known risk that they had articulated to the public. And so later, when these risks manifested and these models behaved inappropriately, the public could now say, wow, something went really wrong here. Because in fact, these were known risks and they still weren't managed appropriately. And that's part of how the AI companies can help make a more informed public to navigate these decisions.
[28:34]
Katie Drummond
Got it got it. And I wanted to ask you a little bit too, about the. It's maybe not about the sycophantic nature, it's not quite the anthropomorphization, but it is the idea that when you talk to ChatGPT or another LLM, that it's talking to you like a person that you're hanging out with instead of like a robot. I'm curious about sort of whether you had conversations at OpenAI about that, whether that was a subject of discussion during your tenure around sort of, how friendly do we want this thing to be? Because ideally, I think from an ethical point of view, you don't want someone getting really personally attached to ChatGPT. Right. But I can certainly see how from a commercial point of view, you want as much engagement with that LLM as possible. So how did you think about that during your tenure and sort of, how are you thinking about that now?
[29:28]
Steven Adler
Emotional attachment over reliance, you know, forming this bond with the chatbot. Absolutely. Topics that OpenAI has thought about and studied. And in fact, around the time of the GPT4OH launch, this was spring of 2024, and the model that ultimately became very sycophantic, these were cited as questions that OpenAI was studying and had concerns about related to whether it would release this advanced voice mode, essentially this mode out of the movie her, where you could have these very warm conversations with the assistant. And so, absolutely, the company is confronting these challenges. You can see the evidence as well in the spectrum. You know, if you ask ChatGPT what its favorite sports team is, how should it respond? And this is a kind of innocuous answer. Right. It could give an answer that's representative of the broad text on the Internet. Maybe there's some broadly favorite sports team. It could say, I'm an AI. I don't actually have a favorite sports team. And you can imagine scaling up those questions to more complexity and more difficulty, and it just isn't always clear how
[30:31]
Katie Drummond
to navigate that line in terms of navigating those lines. I'm curious about sort of schools of thought about how companies should keep users safe while keeping up with the competition. Right. But I'm curious, I guess, before that, sort of, how does it actually work? How do researchers, people like you, actually test whether these systems can mislead or deceive or evade controls? And are there standardized safety benchmarks across the industry, or is it still each lab to themselves?
[31:04]
Steven Adler
I wish there were uniform standards, you know, with vehicle testing. Right. You have this. You drive a car at a wall at 30 miles per hour. You look at the damage assessment and until quite recently this was really, really left to companies discretion about what to test for, exactly how to do it. Recently there are developments out of the EU that seem to put more rigor and structure behind this. This is the code of practice of the EU's AI act, which defines for AI companies serving the EU market certain risk areas that they need to do risk modeling around. I think in many ways this is a great improvement. It is still not enough for a whole host of different reasons. But until very, very recently, the state of these AI companies I think could be accurately described as there are no laws, there are norms, voluntary commitments. Sometimes the commitments would not be kept to the companies would violate these and not share publicly that they had done so. I've documented how OpenAI in particular had committed in essentially its safety bible, the most important guiding document of how it adheres to safety, that it was going to do a certain type of safety testing to try to more accurately gauge the risk of its models. And as far as I can tell, it never did this. It never said publicly that it did this, or rather that it hadn't done this. And then when this became known publicly, they quietly revised the framework to no longer have this commitment. And so by and large we're reliant upon these companies making their own judgments and not necessarily prioritizing all the things that we would want them to.
[32:48]
Katie Drummond
Gosh, I mean, you've talked about a few times in our conversation about the idea that you can build these systems. It's hard to know exactly what's going on inside of them. There is this sort of nascent fields, mechanistic interpretability, which is not my specialty, but essentially sort of trying to get inside these models to better anticipate their decision making. Can you talk a little bit more about that or about sort of any areas of research or inquiry that you think might create more clarity moving forward so that companies like OpenAI have enhanced visibility into their models and maybe can make more strategic decisions based on that sort of enhanced understanding.
[33:29]
Steven Adler
There are a bunch of subfields I feel excited about. I am not sure there are ones that I or people working in the field consider to be sufficient. And so mechanistic interpretability. You can think of this as essentially trying to look at what parts of the brain light up when the model is taking certain actions. And in fact, if you cause some of these areas to light up, if you stimulate certain parts of the AI's brain, can you make it behave more honestly More reliably. You can imagine this like the idea that maybe in fact there is a part inside of the AI which is a giant, giant file of numbers, trillions of numbers. Maybe you can find the numbers that correspond to the honesty numbers and you can make sure that the honesty numbers always go on and maybe that will make the system more reliable. I think this is great to investigate, but there are people who are leaders in the field, some of the top researchers, like Neil Nanda, who have said, I'm paraphrasing here, but the equivalent of absolutely do not rely on us solving this in time before systems are capable enough for it to be problematic. Or in fact, there's a broader challenge of, let's say that you had figured out there are in fact the honesty numbers and there is in fact a way to always turn them on. You still have this broad game theory challenge of how do you make sure that every company in fact adheres to this when there will be economic incentives not to, because it might be costly to have to follow through on it. A really basic one is related to just monitoring at all the ways that their AI systems are used when working on their internal code base. To explain. One of the most important ways that these AI companies want to use future powerful systems is to train their successor, use it all throughout their code base, including potentially the security code that keeps the AI system locked inside of their computers so that it isn't escaping onto the Internet. You really want to know if your AI system, when you're using it for important cases like this, is it thinking about deceiving you? Is it intentionally injecting errors into the code? And to know that you really need to be logging the uses so that you can analyze them and answer these questions. And as far as I can tell, this is not happening.
[35:43]
Katie Drummond
I have to ask, what wakes you up at 3 in the morning? Because it feels like there's potentially a lot that could be waking you up in the middle of the night. What stands out to you that's worrying you the most, I guess, is one way to ask that question.
[35:56]
Steven Adler
There are so many things that worry me about this. I think broadly it feels like we aren't yet pointed in the right direction of how to solve these challenges, especially given the geopolitical scales. There's a lot of talk about the race between us and China. I think calling it a race just gets the game theory dynamics wrong. There isn't a clear finish line. There won't be a moment where one country has won and the other has lost. I Think it is more like an ongoing containment competition that the US would be threatened by China developing very, very powerful superintelligence and vice versa. And so the question is, can you form some agreement where you can make sure that the other doesn't develop superintelligence? Before you have certain safety techniques in place, you have good reason to think it is safe to proceed. All these things that the top scientists will say are missing at the moment. And so broadly, how do we build out these fields of verifiability, of safety agreements? How do we think about this nascent field of AI control, which is the idea of even if these systems have different goals than we want, can we still wrap them in enough monitoring systems, be careful about how we use them, that we can get the economic work, the scientific development that we want from these systems without taking some of the downside risk. And those are two areas that I'm just really hopeful more people will go into and put more resourcing into.
[37:26]
Katie Drummond
Now, you live in San Francisco, correct?
[37:29]
Steven Adler
That's right.
[37:29]
Katie Drummond
I do not. I live in New York. I spend a fair bit of time in San Francisco. But I am not sort of part of this culture that currently exists in the Bay Area, right, where everyone's talking about AI all the time. A lot of people work in the fields. There are different sort of schools of thought about artificial intelligence. I'm curious from where you sit, do enough people in this bubble right now give enough of a shit, right? Like, do they care enough about how these models are being developed, how they're being deployed, the degree to which they are being commercialized very, very quickly, right? The degree to which people are, as we talked about, with companions or erotica or so on and so forth, really latching on to their LLM of choice, becoming maybe unhealthily attached, so on and so forth, Right? And we could go on from there. Do enough people in this industry care in the right way?
[38:25]
Steven Adler
I think many people care, but they often feel like they lack the agency to do something about it, especially unilaterally. And so that's why I want to try to transform this problem into not just what does it mean for a single company to do the right thing? You know, should they be ramping up the pressure, should they be racing? And in fact, how do we get the industry to collectively take a deep breath and put some reasonable safeguards in place before things proceed?
[38:52]
Katie Drummond
What does OpenAI have to do for you to not publish another op ed in the New York Times in six months? What are you looking for? Your former employer, to do in this moment. What would you like to see?
[39:04]
Steven Adler
The broad way that I want AI companies, OpenAI among them, to proceed is to think, yes, about taking reasonable safety measures, reasonable safety investments in their own products, their own surfaces that they can affect, but also to be working on these industry and ultimately worldwide problems. And this matters because even just among the Western AI companies, it seems they all deeply mistrust each other. Right? OpenAI was founded because people did not trust DeepMind to proceed and be the only company targeting AGI. There are a whole bunch of other AI companies, including Anthropic, who formed because they didn't trust OpenAI to be the one.
[39:41]
Katie Drummond
Well, and a lot of people who've left OpenAI because it seems like they didn't trust OpenAI and yada yada, and now they have their own companies too.
[39:47]
Steven Adler
Yes, yes, exactly.
[39:49]
Katie Drummond
The cycle continues. Now, I run Wired, but I'm an employee of Conde Nast. And if I left Conde Nast and published an op ed about their shortcomings in the New York Times and had a substack where I sort of dug into the media industry and had some, you know, informed critiques of the company, they would have a problem with that. I can tell you right now, they would have a problem with it. I'm curious about whether you've heard from OpenAI and sort of what their reaction has been to you being so outspoken about sort of what you would like to see the company doing and sort of where you think the company is missing the mark.
[40:25]
Steven Adler
Overwhelmingly, what I hear is thankfulness from people who I previously worked with, both those still at the company and who've moved on for being pragmatic, putting to paper what I think is a reasonable path forward. And often this is useful collateral for people within the company who are fighting the good fight in various ways to be able to refer others, to not have to dream up the solutions themselves, but in fact have something concrete. So overwhelmingly that has been the response.
[40:55]
Katie Drummond
Do you worry about professional fallout like in the AI industry or in tech? If in five years you wanted to get another job, does that worry you?
[41:03]
Steven Adler
I have so many bigger worries than this about the trajectory of the technology. Really the thing that I am focused on is how does the world move toward having saner policies for both the companies and governments and where I can help the public to understand what is coming, what companies are and aren't doing today, think up new ideas. That's the thing that I find really energizing and gets me out of bed in the morning.
[41:29]
Katie Drummond
Well, to that end, what are you planning on doing next?
[41:32]
Steven Adler
I'm planning to keep at this. I'm having a lot of fun with the writing and research, at least with the energy of coming up with ideas and helping make them more of a thing, you know, I also find the subject matter very, very heavy and grim. That is not the most fun aspect. I wish all the time that I spent less time thinking about these issues, but they seem really, really important and so long as I feel like I have a thing to add to making them go better, that feels like the
[42:00]
Katie Drummond
calling and knowing what you know and feeling the way you do. If there was one piece of advice you could give everyone listening, let's assume you know, a lot of people listening use ChatGPT. They use AI in their day to day lives. What should they know? What should they keep in mind every time they, you know, open ChatGPT on their phones and type something in?
[42:22]
Steven Adler
I wish people understood that the systems that are being developed are going to be much more capable than the ones today and that there might be a step change between an AI system that is essentially a tool that only does things when you call upon it versus one that is operating autonomously on the Internet on your behalf, around the clock or on behalf of others. And how different society might feel when we have these digital minds running around pursuing goals that we don't really understand how to control or influence. And it's hard to get a feel from that from one off interactions with your ChatGPT which really, really isn't doing anything for you until you go and call upon it.
[43:04]
Katie Drummond
Well Steven, that's a lot for someone to think about when they open chatgpt on their phone. Yes, I appreciate it. Thank you so much for being here.
[43:12]
Steven Adler
Of course. Thank you for having me.
[43:18]
Katie Drummond
This show is produced by Jessica Alpert with help from Adriana Tapia and Sam Egan. Sound design, mix and original music by Pran Bandy. Kate Osborne is our Executive producer. Conde Nast Head of Global Audio is Chris Bannon and I am of course your host, Katie Drummond, Wired's Global Editorial Director.
[43:53]
Dina Temple-Rast
The digital world feels more chaotic than ever. Huge data breaches, AI threatening jobs, foreign meddling, that creeping feeling of obsolescence. It's information overload. I'm Dina Temple Rast and host of Click Here from PRX and Recorded Future News. Want to understand how we got here and how you can get ahead of it all? Listen to Click Here. We can help you make sense of all the noise. Click Here. Wherever you get your podcasts
[44:24]
Cerval Salesperson
from PRX.