
The actor Hank Azaria on why his “Simpsons” characters need a human touch.
Host of The Opinions
This is the Opinions, a show that brings you a mix of voices from New York Times opinion. You've heard the news. Here's what to make of it.
Hank Azaria
My name is Hank Azaria. I'm an actor and voiceover artist and producer. I'm most known for my voices on The Simpsons, such as Moe the Bartender, Comic Book Guy, of course, Chief Wiggum, Snake, Professor Frink, of course, one of my personal favorites. Cletus the Slack-Jawed Yokel often makes himself known. The Sea Captain, "Yar," the old Sea Captain. Duffman, purveyor of Duff Beer.
Oh yeah, I was in Along Came Polly, did a French accent, Claude, the scuba guy. It goes on and on. I could probably think of doing this for a long time.
I didn't really think about AI seriously in terms of voice acting until about a year or two ago, when it started getting pretty serious. Pretty obvious that it was making a real run at sounding like humans can sound. Obviously the Scarlett Johansson thing was a big deal that brought it into everybody's consciousness.
Scarlett Johansson
Actor Scarlett Johansson says she was approached by tech company OpenAI to be the voice of ChatGPT. The actress said no thanks, but when the company released a voice assistant named Sky, it sure sounded a lot like Johansson.
Hank Azaria
And then since then I've had not one, but two companies pitch me tech, wanting me to get on board with the early versions of it to either protect my voice, my name, image, likeness and sound, or license it in a way that can be used in AI with my permission. In other words, oh, we want Moe the Bartender to be involved in this. Can we use your voice? You don't need to do it for us, but we have your permission to recreate it. I love my job and I consider myself the luckiest person in show business, if not on the planet, that I actually get paid a lot of money to do these voices on a show that's lasted approaching 40 years, and I don't want to be put out of a job on that show or any other. And that is a little bit frightening. You know, AI can mimic sounds better and better. I think within five years they'll be able to mimic them pretty perfectly. But I think there's a humanness that the AI can't do right now, at least vocally, and may never be able to do. That involves a character's motivation, certain emotions, subtleties of physicality, facially or otherwise. It just can't quite capture all those little minutiae that add up to a human being.
For example, let's take Moe. So this is what he sounds like. But there's inherent aggression in everything. Moe's, he's angry every... oh, in all moments.
There's hate underneath everything that he says. Chief Wiggum is inherently stupid. He doesn't understand almost anything. And there's a childlike innocence to him and, you know, a curiosity, a kind of wide-eyed happiness just to be alive as long as there's a donut in his future, which there always is.
Um, Snake is really sneaky. He's sizing you up always and trying to lull you into a sense of calm so that he can rob you. Those qualities, they're subtleties that are almost more the character than the sound of it itself. And I'm not sure the computer can or ever will be able to do it. What a hubristic statement that was. Cut to me out of a job in the cold while computers are laughing at me in Moe's voice.
Ah, you thought we couldn't do it, eh? How you like living on the street there, pal?
So the creation of Moe and Chief Wiggum, they're good examples of what I'm talking about here as far as the human element in this kind of work goes.
In the case of Moe, it started out as I loved Al Pacino as a kid and a teenager, so I did a young Al Pacino impression. Godfather Al, Dog Day Afternoon. "And you know, I'm dying here. Everybody's coming down on me here." That Al. And I actually auditioned for The Simpsons. I was doing a play in L.A. I was playing a drug dealer and I thought the Al Pacino voice would be good for that. So I auditioned with this voice here.
And they said they wanted it to be gravelly.
So you take young Al Pacino and you make him gravelly and you get Moe the Bartender. So a lot of times it's like I got a base voice I'm working with, then you give a little gravel on top. Lately I've been imitating Bruce Springsteen a lot.
I got a Springsteen tribute Band.
Another hero of mine since I was a teenager. And actually Moe is kind of halfway between Bruce and young Al Pacino. Right in the middle there.
Moe the Bartender. Agador Spartacus from The Birdcage is my grandmother. We are Sephardic Jews, Spanish-speaking Jews.
And she sounded like this.
And that one is a good example, because she was very loving and maternal and feminine.
So it isn't just the timbre of this voice, it's also the affection and the maternal instincts that come through with this voice. And then there's a physicality to the characters that exists. If you're doing a scene where you're chopping wood, it helps to actually... I don't think Moe's ever chopped wood, but if he was, he's always like.
Yeah, all right, well, get that going for you. As soon as I get this wood chopped, man. It's hard chopping this wood.
So you kind of gotta do it. It's hard to fake that quality in your voice unless you're exerting not only putting subtleties of emotion and motivation in, but yes, the physical exertion, the way emotion changes your voice. From what I've heard so far, AI can get pretty close, but pretty close, you know, only in horseshoes really, does that count.
What AI can do is bring analysis and ideas. I believe Google has this app now. You can put in any piece of writing you like. My wife put in a poem she wrote some years ago and it will create a 15-minute podcast, a male AI voice and a female AI voice chatting about the poem. It was utterly convincing. It was very smart, incisive commentary. The voices were completely believable, including ums and pauses. It sounded extremely lifelike. Now, there wasn't a tremendous range of emotion. When I tried to make it try to be funny, it really fell short. But what it had to say was quite interesting. So it's like ChatGPT, right? It gives you some really good ideas. And, you know, if you don't like what the ChatGPT came up with on the first try, just hit a button and it'll endlessly try variations on it. You're eventually going to get an idea or two you like. It's a tool like any other. And, you know, similar to the vocal aspect, it still needs a human to bring it home, but it's an incredible aid along the way.
Forgive me, my writer friends. You know, the way TV's done is there's a showrunner, a head writer, and there's a writing staff. I don't think you could replace the showrunners. I don't think you could replace the people with the vision and the real eye on the prize of creatively what we're going for here. But I'm pretty sure a head writer at this point doesn't need a staff. I'm pretty sure, based on ChatGPT prompts and then just zhuzhing scripts that the ChatGPT generates. As long as he's rewriting it, or she or they, I think that can be done.
And I can't imagine it's too long from now that a studio is going to keep paying for what they know the computer can do for free. I look at the AI visually, vocally, from a writing standpoint. It's very exciting. I enjoy it. I think it's mind-blowingly fascinating and fun to see the art and the songs. I just came upon the AI-generated "What if Led Zeppelin II was recorded in the '50s in rockabilly version?" It's awesome. "Way, way down inside, honey, you need it. I'm gonna give you my love. I'm gonna give you every inch of my love." I love Led Zep and, you know, I just find that really fun. I don't want to be replaced as a vocal performer in animation or on camera. By the same token, how fascinating that that can happen. I hope humans and AI can work together and collab. I mean, that's my hope, right?
Like, I do a. Yeah, what's up, doc? I love Bugs Bunny. I do a sort of passable Bugs Bunny. What's up, doc?
That's not too exact, but the computer could make it much more exact, especially if it's using my performance. And I'm probably a good enough performer and I know Bugs Bunny so well that I could probably do it, and we could tweak it to get it pretty much exactly like Mel Blanc sounded. But you still need someone like me, not just creating the voice to begin with, but also then knowing how to fix it once you hear it back.
People are gonna listen to and enjoy and watch what they like. And if it's passable and if it's good, they're gonna like it and they're gonna listen to it and they're not gonna care whether AI generated it or human generated it or some combination of the two. Right now that just basic human response is going in my favor because I'm pretty confident that what AI generates by itself as Moe the Bartender or anything else isn't going to cut it. But if it does start to cut it, people are going to listen to it and they're going to be grateful that it's so readily available. I mean, look what happened, you know, to the music industry. I cried a tear because the record industry reinvented itself. I got to listen to all the music for free all of a sudden. So I don't think people are going to feel much differently about any of this.
Host of The Opinions
If you like this show, follow it on Spotify, Apple or wherever you get your podcasts. This show is produced by Derek Arthur, Sophia Alvarez Boyd, Vishaka Durba, Phoebe Lett, Christina Samulewski and Gillian Weinberger. It's edited by Kari Pitkin, Allison Bruzek and Annie Rose Strasser. Engineering, mixing and original music by Isaac Jones, Sonia Herrero, Pat McCusker, Carol Saburo and Afim Shapiro. Additional music by Amin Sohota. The fact-check team is Kate Sinclair, Mary Marge Locker and Michelle Harris. Audience strategy by Shannon Busta, Christina Samulewski and Adrian Rivera. The executive producer of Times Opinion Audio is Annie Rose Strasser.
Podcast Summary: The Opinions – "A.I. Isn’t Coming for Moe the Bartender. Not Yet, Anyway."
Release Date: February 4, 2025
Introduction
In the February 4, 2025 episode of The Opinions, hosted by The New York Times Opinion team, actor and longtime voice artist Hank Azaria delves deep into the evolving intersection of artificial intelligence (AI) and voice acting. Azaria explores the potential threats and opportunities AI presents to his profession, drawing from his extensive experience voicing iconic characters on The Simpsons.
Hank Azaria’s Iconic Roles on The Simpsons
[00:47 – 01:17]
Hank Azaria begins by enumerating his diverse range of characters on The Simpsons, highlighting his versatility and the depth of his voice acting portfolio:
“I could probably think of doing this for a long time.” [00:59]
Azaria emphasizes his passion for his craft, illustrating the longevity and richness of his career within the show.
The Emergence of AI in Voice Acting
[01:17 – 03:27]
Azaria discusses the rising capabilities of AI in mimicking human voices and the implications for voice actors:
“Pretty obvious that it was making a real run at sounding like humans can sound.” [02:07]
He references the high-profile incident in which Scarlett Johansson was approached by OpenAI to lend her voice to ChatGPT and declined. Despite her refusal, OpenAI later released a voice assistant named "Sky" that sounded strikingly like her, raising concerns about unauthorized use of voice likenesses.
“I don't want to be put out of a job on that show.” [03:00]
Azaria expresses his apprehension about AI potentially replacing human voice actors, underscoring the emotional and nuanced elements that AI currently cannot replicate.
Limitations of AI: The Human Element
[03:27 – 06:15]
Azaria elaborates on the intrinsic qualities of his characters that go beyond mere voice modulation:
“There's hate underneath everything that he says.” [03:27] – Referring to Moe, highlighting the character's inherent aggression.
“There's a childlike innocence to him...” [03:36] – Discussing Chief Wiggum’s naivety and simplicity.
He argues that AI may imitate vocal tones but lacks the ability to infuse characters with complex motivations, emotions, and physical subtleties that define their personalities.
“It's hard to fake that quality in your voice unless you're exerting not only putting subtleties of emotion and motivation in...” [06:15]
Azaria stresses that AI falls short in capturing the depth and multifaceted nature of human performances, which are essential for bringing animated characters to life.
AI as a Collaborative Tool, Not a Replacement
[06:15 – 10:13]
While acknowledging AI’s advancements, Azaria remains optimistic about its role as an aid rather than a replacement:
“What AI can do is bring analysis and ideas.” [08:00]
He discusses various AI applications, such as generating podcast content and assisting in writing processes, drawing parallels to tools like ChatGPT. Azaria envisions a future where AI collaborates with humans to enhance creativity:
“I hope humans and AI can work together and collab.” [09:30]
Despite recognizing AI's potential to handle certain aspects of creative work, he maintains that the irreplaceable human touch is crucial for nuanced performances.
The Future of Creative Industries with AI
[10:13 – 12:15]
Azaria explores the broader impact of AI on creative industries, using the music industry as an example:
“I cried a tear because the record industry reinvented itself.” [11:00]
He anticipates similar transformations in voice acting and writing, where AI can handle preliminary tasks, allowing human creators to focus on refining and adding depth to the work. However, he underscores the importance of retaining human oversight to ensure quality and emotional resonance.
“People are gonna listen to and enjoy and watch what they like. … I'm pretty confident that what AI generates by itself as Moe the Bartender or anything else isn't going to cut it.” [12:00]
Azaria concludes that audiences will embrace whatever is good enough, whether a human, an AI, or some combination of the two made it. For now, he believes that works in his favor, because AI on its own cannot yet deliver a convincing performance.
Conclusion
Hank Azaria’s insightful discussion on The Opinions podcast sheds light on the nuanced relationship between AI and the creative arts. While acknowledging the impressive strides AI has made in voice replication and content generation, Azaria emphasizes the irreplaceable human elements that define authentic performances. His perspective advocates for a balanced approach, where AI serves as a supportive tool that enhances rather than supplants human creativity.
Final Thoughts
This episode provides a comprehensive exploration of the current and future landscape of AI in the creative sector, particularly in voice acting. Hank Azaria's seasoned perspective offers valuable insights into the delicate balance between embracing technological advancements and preserving the human essence that fuels artistic expression.