
Zoe Kleinman speaks to Sam Liang about why human beings are imperfect listeners
BBC Podcast Announcer
This BBC podcast is supported by ads outside the UK.
Zoe Kleinman
Hi, I'm Zoe Kleinman, the BBC's technology editor, and this is The Interview from the BBC World Service: the best conversations coming out of the BBC. People shaping our world, from all over the world.
Sam Liang
If you're not a little bit afraid, then you're not paying attention.
Interviewer (BBC Host)
We have never seen people so united. Do not make that boat crossing.
Zoe Kleinman
Do not make that journey.
Sam Liang
Being born in America, feeling American, having people treat me like I'm not. We're more popular than populism.
Zoe Kleinman
For this interview I met Sam Liang, chief executive and co-founder of artificial intelligence transcription startup Otter.ai, in our London studios. Sam Liang was born in China and moved to the US in 1991. He received a PhD from Stanford University before joining Google, where he led the search engine's location services. He co-founded the California-based Otter.ai in 2016. The startup has evolved from a voice-to-text transcription service to offer AI-powered recordings of live events, meeting summaries and content searches. He tells us why he thinks we won't be typing as much anymore and how avatars could soon take our place in meetings. He also explains why he wishes Otter had been around when he arrived in America in the early 90s.
Sam Liang
You know, when I first went to America, my English was so bad people didn't understand me. I couldn't understand other people. I wish I had Otter at that time to help me. At that time I literally carried a Sony Walkman with me in all the classes I went to.
Interviewer (BBC Host)
Did you?
Sam Liang
I had to record it because I couldn't understand the professor in real time. I had to record it and I had to listen to it several times to fully understand the lecture. If I had Otter at that time, I could have studied much better.
Zoe Kleinman
Welcome to The Interview from the BBC World Service with Sam Liang.
Sam Liang
We build Otter to capture all the voice data first, then use AI to transcribe it, use AI to analyze it, and also use AI to aggregate all this intelligence so that we can extract insights out of it. So it's not just a meeting note taker for a single conversation. More importantly, we're building a platform to create this conversational knowledge engine, to organize thousands of meetings and connect the knowledge in thousands of meetings.
Interviewer (BBC Host)
So how is AI shaping who gets heard? Is it changing the voices that we hear and, if what you're saying is correct, the voices that we will remember and the voices that we understand? Is it curating that, and will people lose out as a result?
Sam Liang
The power of AI is that it's able to capture everything. It's able to try to interpret everyone objectively. Human beings are imperfect in terms of their capability to listen and understand everyone. Unconsciously, when they listen, they don't hear everything. Based on their past experience, they may misinterpret certain things, they may unconsciously ignore certain things. So AI actually can fix that.
Interviewer (BBC Host)
What about accents and underrepresented languages? You know, there are 22 official languages in India and most of the world's largest chatbots can only manage about half of them at the moment.
Sam Liang
Yeah, accent. I'm not a native English speaker myself. I was born in China. I grew up in Beijing. I went to America in 1991. Even after 35 years living in America, I couldn't get rid of my accent. So when we started Otter, we built our own speech recognition technology ourselves. We actually stress test our system by speaking to our own speech recognition engine to make sure our engine can recognize our own accent. Of course, we also collect tons of speech data from all over the world to train the engine and to stress test it, so that it's able to recognize all kinds of accents: an American accent, a British accent. You know, in America, there are so many immigrants from all over the world, and everyone speaks with a different accent.
Interviewer (BBC Host)
Because you have a responsibility, don't you? If something is transcribed wrongly because the tool hasn't heard it properly, because it doesn't understand the accent, then you are at risk of misquoting somebody, aren't you? And essentially sharing misinformation.
Sam Liang
Exactly. It's a very challenging task to understand people's speech, especially with an accent. In addition, if you're just speaking about common topics, most AI can handle it relatively well. But oftentimes there are challenging words such as acronyms, jargon, unusual names, especially immigrants' names. They have unusual spelling. How do we recognize them correctly? They're all challenging. So we built this system that constantly learns new words, constantly learns new acronyms and jargon.
Interviewer (BBC Host)
And there are cultural nuances as well, aren't there? Some things mean different things to different cultures.
Sam Liang
Yeah, even the spelling is different between the UK and America. So we have to handle that.
Interviewer (BBC Host)
We get it right, obviously.
Sam Liang
Yeah, of course. That is the word-by-word transcription. There are a lot of challenges. We are handling it pretty well. But more challenging is understanding the meaning. So on top of the word-by-word transcript, we use AI to try to understand what people mean.
Interviewer (BBC Host)
You must have a treasure trove of data, of thousands, millions, billions of conversations. What biases have you had to correct within that data that you've used to train your tools?
Sam Liang
That's a good question. So far, we focus on one type of data: business meetings. We focus on enterprise intelligence. Of course, every business is different. I wouldn't say they're biased.
Interviewer (BBC Host)
But let's say, for example, we know that the majority of senior leadership teams tend to be men, don't they? Because that is an existing bias in society. And so your tool is trained on business meetings, senior leadership teams. It's going to be men. Does that lead to assumptions by the tool, then, that women's voices don't carry as much gravitas, or they're not as important, or they're not as visible?
Sam Liang
That is a problem. That is a problem because, as you said, most business leaders are men. So there is not enough meeting data where women spoke. It is a problem. AI actually can help correct some of the problems or detect some of the problems, because we have a special technology that can detect when a man speaks or when a woman speaks. We can actually measure the speaking time, whether the man interrupts a woman often,
Interviewer (BBC Host)
I bet they do.
Sam Liang
or whether the man dominates the session by speaking too much. Maybe the man speaks 90% of the time and only leaves 10% of the time for the woman to speak. So that's the first step, to detect problems, and it can also remind people when that situation happens, to help people be more mindful. I think that's a first step. The training data problem takes some time to fix. But we can start addressing that problem by detecting the problem and reminding people of that.
Interviewer (BBC Host)
You grew up in China, in Beijing, and you came to the US in the early 90s. What do you think about the current tensions between the US and China? Do you think it's helping or hindering the progress of the technology? Would it be quicker if everyone was working together?
Sam Liang
Of course, different countries have different political systems. I moved to America because I wanted to get freedom. I like that freedom. But of course, America has its own problems, whether you're Democrat or Republican. Every party has its problems. Of course, between countries, a lot of conflict happens due to misunderstanding, due to different beliefs. So this is why communication is so important between China and America. They speak different languages, they have different cultures, they don't always understand each other. So I hope AI can help people understand each other better. When they understand each other better, they can resolve more problems. It will take time, and a lot of things are lost in translation.
Interviewer (BBC Host)
So is that partly why you are doing this? Is there a personal reason?
Sam Liang
It is. It is a problem that hasn't been solved yet. I hope AI can help solve that. It will still take time.
Interviewer (BBC Host)
When you first came to the US, how easy did you find it to communicate?
Sam Liang
It's hard. When I first went to America, my English was so bad, people didn't understand me. I couldn't understand other people. I wish I had Otter at that time to help me. At that time I literally carried a Sony Walkman with me in all the classes I went to.
Interviewer (BBC Host)
Did you?
Sam Liang
I had to record it because I couldn't understand the professor in real time. I had to record it and I had to listen to it several times to fully understand the lecture. If I had Otter at that time, I could have studied much better.
Interviewer (BBC Host)
I feel like this explains a lot about where you've got to now and why you're so passionate about this.
Sam Liang
That's part of the reason. Because, you know, I need a tool like this to understand other people. And I think probably millions of other people need this too.
Interviewer (BBC Host)
Did you feel isolated?
Sam Liang
Yes. When I couldn't understand other people and I couldn't convince other people, I did feel isolated. I still have trouble communicating. That's a problem I want to solve, and I hope Otter AI can help me solve that too.
Zoe Kleinman
You're listening to The Interview from the BBC World Service.
Zoe Kleinman
You know how sometimes you find yourself replaying a conversation in your head and thinking about how you could have said things differently? Sam Liang does that for real. The BBC was recording our chat, but so was he. He said he'd be listening back to it on his phone later and asking Otter to give him some feedback on how well he answered my questions. I wonder what it had to say about me. He also arrived in his running shoes. He was getting ready for the London Marathon shortly afterwards. OK, let's return to my conversation with Sam Liang.
Interviewer (BBC Host)
Where are we drawing the line here between a useful AI tool that's recording and tracking stuff to be helpful and surveillance?
Sam Liang
We're definitely against surveillance, Big Brother by the government. I think we should definitely block that. But for business communication, we see the bigger problem in most enterprises is that information is fragmented. Most people don't have access to all the information they need to do their job.
Interviewer (BBC Host)
There is a class action lawsuit going on, isn't there, in California, which accuses your AI transcription bot of recording confidential conversations without consent. So there are people that are worried about it.
Sam Liang
We totally understand people worry about it. For that particular lawsuit, we don't think it has merit, for many reasons. First of all, all the Otter users consent to the terms of service. We set out that the Otter user needs to get consent from other speakers in the room or on a video conferencing platform before they use Otter. So it's under the control of that particular user. It's similar to the way people used, for example, a Sony Walkman in the past. It's a cassette recorder, right? Sony is not responsible for how you use it. The owner of that cassette recorder has the responsibility.
Interviewer (BBC Host)
I suppose what's different now is that that cassette was not likely to be broadcast anywhere, was it? Whereas things that are recorded now and transcribed can be shared hugely widely on the Internet.
Sam Liang
The sharing is controlled by that user as well. By default it's only accessible by that one user. That user can say, because there are 10 people in this meeting, it is in their best interest to allow everyone to have access to those meeting notes. Because why do we meet in the first place? For the 10 people to communicate with each other, to share knowledge with each other. Right? But if people forget what they discussed, they actually waste that time, which is very expensive because human time is very expensive.
Interviewer (BBC Host)
So for you, the onus is on the user to be responsible with Otter?
Sam Liang
Absolutely. We provide the tool, we provide a platform. But the user needs to get consent and also to manage who has access to that.
Interviewer (BBC Host)
That's kind of the social media approach, isn't it?
Sam Liang
It's the same as a document. When you write a document, the writer controls who can have access to it.
Interviewer (BBC Host)
I'm interested in the goal. The long-term goal of Otter: is it real-time transcription, translation, both of those things? Is that what you're aiming for?
Sam Liang
It is real-time transcription, but the information can be accessed both in real time and post-meeting. For a large enterprise with 10,000 people, 20,000 people, there may be millions of meetings happening every year. The enterprise actually invests a lot of money to pay people to go to meetings. They need to make sure the time spent in meetings generates a good return in terms of business value.
Interviewer (BBC Host)
In five to 10 years' time, maybe you and I won't be meeting at all. It will be our digital twins or our AI clones that are meeting on our behalf. Where does that leave Otter?
Sam Liang
There are several aspects. One is actually we are also building technologies to create a digital twin for every person. Eventually you can send your digital twin or your avatar which has your knowledge to attend some meetings on your behalf.
Interviewer (BBC Host)
Have you got one?
Sam Liang
For example, you're sick, you're on vacation, you're double booked. You could send your avatar.
Zoe Kleinman
Okay.
Interviewer (BBC Host)
Have you got one?
Sam Liang
In our team, we built a Sam avatar already. It's not perfect yet, but it can answer a lot of questions on my behalf.
Interviewer (BBC Host)
Do you like it?
Sam Liang
I like it. It's trained based on thousands of meetings I spoke in in the past. It also uses a lot of documents I wrote in the past, so it knows how I would respond to a lot of questions already.
Interviewer (BBC Host)
We know that Mark Zuckerberg has one too, or is working on one for talking to staff at Meta. Is that what you use the Sam avatar for as well?
Sam Liang
Yeah, that's what we're building as well. We also try to make it constantly learning everything new. The avatar I built two months ago uses all the historical data, right? But two months later there's something new. So that avatar needs to continuously learn new things.
Interviewer (BBC Host)
When I started out, I used to transcribe stuff by handwriting it down. I would listen back to interviews and write them down. And then I started typing them into Word documents. And now we are increasingly using AI tools to do the transcription for us and then to go back and check, you know, that the transcription is accurate. I'm slightly worried that I'm not really using my hands anymore. I don't handwrite, I don't type. What are we going to do with our hands?
Sam Liang
Of course you can use hands for many things. You can play tennis, you can play violin, you can, you can cook.
Interviewer (BBC Host)
Badly, but yes.
Sam Liang
So our prediction is that we'll increasingly rely on voice to interact with AI, because voice is the most natural way for us to communicate. Typing is actually a skill we learn in a later part of our life. When a baby starts to communicate, the baby first uses voice to communicate for many years before he or she starts learning writing or typing. Right? So writing and typing are harder skills to learn and to perfect. Unless you are a professional writer, most people are pretty bad at writing. They are also afraid of writing, but talking is much easier to do.
Interviewer (BBC Host)
So do you think that this is the beginning of the end of writing and typing? I mean, might we forget how to do it?
Sam Liang
It's possible. It's the same if you think about it. I've driven a Tesla for many years now. I use the full self-driving mode. It can take me from my home garage to the garage at my office by itself. So my driving skills have been degrading.
Interviewer (BBC Host)
Have they? You've noticed that?
Sam Liang
Yeah, it's already happening.
Interviewer (BBC Host)
Does it bother you?
Sam Liang
I hate driving manually.
Interviewer (BBC Host)
You don't like driving anymore?
Sam Liang
I don't like driving anymore because I'm afraid that I may cause an accident.
Interviewer (BBC Host)
So do you think the Tesla is better at driving than you are?
Sam Liang
Exactly. It's already driving better than me because it's trained by millions of miles of driving data. The car has eight cameras compared to two eyes I have.
Interviewer (BBC Host)
But you don't need eight cameras. Your two eyes are doing the job. You can have 10 driving lessons and get your driving license. You don't need all of that.
Sam Liang
Eventually, you don't need that. So in the same way, AI can help you write way better. AI is a way better writer than most, 99% of human beings in general.
Interviewer (BBC Host)
But what happens to our brains if we're not, if we're not training our brains in the way that we do when we learn to write and type, when we learn to drive? Is there a worry that we just won't be as sharp as we have had to be?
Sam Liang
Not necessarily. When you speak, you are still using your brain, right? You're expressing yourself, you're thinking aloud, especially when you are discussing with your colleague. There could be, you know, five or 10 people. There's a lot of interaction happening. Ideas can be created by talking. Of course, if you like to write, you can still do it, but you don't have to do the tedious part, you know, such as taking meeting notes.
Interviewer (BBC Host)
Do you think Otter will still be going in 10, 20 years, or do you think it will be part of something else?
Sam Liang
I think so. We're announcing a new platform called Conversational Knowledge Engine, which we think will capture as many conversations as possible, organize that in the platform so people can access that knowledge, so that they can do their job better, and also enables agents to help you with some of your workflows. So that's a huge market and it will take another 10, 20 years to grow. People may stop writing, but people will never stop talking.
Interviewer (BBC Host)
Will they ever stop having meetings?
Sam Liang
No, they won't stop having meetings. People will still talk all the time. They need to communicate, they need to build human relationships. So if you think about how many conversations are happening in the world, it's just so huge.
Zoe Kleinman
Thank you for listening to The Interview. You'll find more in-depth conversations on The Interview wherever you get your BBC podcasts, including episodes with Karim Beguir, boss of Africa's biggest AI firm, the former Prime Minister of Australia, Julia Gillard, and the musical icon Ringo Starr. Until the next time. Bye for now.
Podcast Summary: The Interview – Sam Liang, Otter.ai CEO: "AI Captures Everything"
BBC World Service | Hosted by Zoe Kleinman | May 3, 2026
In this episode of "The Interview," technology editor Zoe Kleinman talks with Sam Liang, CEO and co-founder of Otter.ai. The conversation explores the evolution of AI-powered transcription services, their societal impact, and the future of human communication in the workplace. Drawing from Liang’s personal journey as an immigrant and engineer, the episode covers themes of accessibility, diversity, data privacy, and the shifting role of human skills in a world increasingly mediated by machines.
The conversation balances technical insight with personal anecdotes, maintaining a reflective and accessible style. Liang’s responses are candid, sometimes humble, and often framed within his own immigrant experience. The host, Zoe Kleinman, asks probing yet conversational questions, encouraging Liang to elaborate on both ethical concerns and practical implications of AI in everyday life.
For listeners curious about the intersection of technology, identity, and the future of work, this episode provides a nuanced, global perspective on how AI tools like Otter.ai are reshaping how humanity communicates, remembers, and collaborates.