ElevenLabs CEO: Why Voice is the Next AI Interface

a16z Podcast: “ElevenLabs CEO: Why Voice is the Next AI Interface”

Date: November 4, 2025
Guest: Mati Staniszewski – Co-founder & CEO, ElevenLabs
Host/Panel: a16z team

Episode Overview

This episode dives into the meteoric rise of ElevenLabs, a leading AI voice technology company. Mati, its co-founder and CEO, reveals how ElevenLabs balances cutting-edge research with rapid product launches, why voice is poised to be the next major interface in AI, and how the company navigates the complex intersection of creativity, ethics, and enterprise-grade reliability. The conversation also addresses global talent strategies, adapting to the creative industries, and lessons learned in scaling a modern AI company.

Key Discussion Points and Insights

1. ElevenLabs’ Mission & Voice Marketplace

Timestamps: [00:00] – [01:56]; [13:32] – [17:02]

Voice as the Ultimate Interface: ElevenLabs began with AI-generated voice, then progressed to orchestrated voice agents and fully licensed AI music creation.
Serving Diverse Use Cases: The need to support a wide range of voices, accents, and speaking styles led ElevenLabs to launch a “Voice Marketplace,” where users can create and monetize their own voices.
Impact: Almost 10,000 voices on the platform; $10 million paid to creators.

“We launched Voice Marketplace where you could create your voice and then share it. And when the voice is shared, you earn money in the return. Today we have almost 10,000 voices. We paid $10 million back to the people in the community.”
— Mati, [00:00]; [13:32]

2. Research vs. Product: Striking a Balance

Timestamps: [01:56] – [06:45]

Fast Shipping & Small Teams: Success attributed to 20 small, autonomous product teams (5–10 people each), ensuring speed without losing quality.
Research/Product Tradeoffs: Preference to solve problems (like adjusting speech speed) through foundational research, but pragmatic about implementing product-level fixes if timelines demand.
Decision-Making: When research hurdles are likely to take over three months, product teams have freedom to patch with practical solutions.

“We are very against this idea of...sliders, any toggles. We don’t want to become same as previous generation of the editing suite. So instead, let's solve it on the research level...We resisted this…for nine months. We couldn't solve it on the research side. And then the product was a super simple solve.”
— Mati, [04:50]

3. Global, Remote, and Inclusive Team Building

Timestamps: [06:45] – [12:34]

Origins in Europe: The lack of natural-sounding dubbing in Polish media inspired the company’s founding.
Non-traditional Hiring: Early hires included people from surprising backgrounds (e.g., a call center employee who’d built a text-to-speech system).
Hubs + Remote: Blend of fully remote work and physical hubs (London, Warsaw, San Francisco) to balance deep focus and strong company culture.
Flat Organization & No Titles: Encourages impact regardless of tenure; six months to prove team efficacy; promotes lateral responsibility and quick advancement.

“We started fully remote…We hired a person that had incredible open source text to speech model and was working in the call center at the same time…He’s now one of the most brilliant researchers we have.”
— Mati, [08:16]

“We removed titles a year ago and it’s going well…The tenure will not define your position in the hierarchy. If you are smart and quick and passionate, you can elevate yourself very quickly.”
— Mati, [10:04]

4. The Evolving Relationship with Creative Industries

Timestamps: [12:34] – [19:53]

Early Resistance Melting: Initially, creative professionals were wary of AI, but voice AI is now widely adopted in music, advertising, and entertainment.
Working with Industry: ElevenLabs emphasizes partnership over disruption—consulting with creators and learning how AI can add value.
Marketplace Payoffs: Unique voices sometimes find unexpected success in new markets.
Music Licensing: Secured deals with music rights holders (Merlin, Kobalt) to allow commercial rights and integration into AI models, after roughly 18 months of negotiation.

“With labels, I think I’m still learning…We worked with labels to bring their music into the music model so we can do it in a licensed way…That was a hard process. It took us 18 months to figure out the agreement that works.”
— Mati, [15:50]

Hiring Domain-Specific Talent: For new areas, pairs in-house hires with experienced consultants (e.g., music lawyers) to bridge knowledge gaps and accelerate industry alignment.

5. Ethics, Regulation, and Legal Challenges

Timestamps: [17:02] – [19:53]

Building the Right Legal Team: Hiring effective legal counsel is tough; initial hires from big companies brought a risk-averse mindset incompatible with startup needs. Success came with risk-tolerant, commercially savvy hires.

“Every conversation was pointing out the risks…And now we hired a person…who understands the risk equation a lot better, where…they are like a true thought partner. Tremendous change for sure.”
— Mati, [18:28]

6. Shifting from Creator Brand to Enterprise Platform

Timestamps: [19:53] – [26:50]

Smooth Transition: Started as a tool for creators, then saw significant inbound demand from enterprises (especially for AI agents in healthcare and customer service).
Adapting Sales & Product: Moved from engineering-led sales to dedicated sales teams (80% sales, 20% engineering).
Enterprise Features: Built out orchestration tools, integrations, enterprise-level compliance and reliability—now a cornerstone of the offering.

“One obvious difference between PLG and sales is the cycle to work through and identify the right customers is much longer. In the early days…had to shield [teams] from that information and trust us, we'll do this…After 12 months it worked out. But that was probably the hardest culturally.”
— Mati, [23:37]

Product Management: Differentiates between pre- and post-product-market fit within teams; products not capturing a sufficient user base in six months are shelved.

“On the pre-product-market fit your mission is to ship until you think we've hit product-market fit. Usually we give the six month period...If not we kill the product.”
— Mati, [25:32]

7. Scaling Challenges & CEO Lessons

Timestamps: [26:50] – [29:55]

Transitioning Incentives: As the company grew (now 350+), incentive structures became critical; had to align commission with strategy—explicitly encouraging teams to escalate “close calls.”
Maintaining Clarity: Set explicit rules (e.g., not selling models to foundational model competitors) and ensured open communication about misalignment.

“In early days everybody would just operate on a passion basis. [Now] incentive structure really matters. If you…don’t make it extremely clear...strategy [and] commissions…need to be as close as possible.”
— Mati, [27:16]

“We had [a competitor] want to license our models for demos…and the incentive would suggest that you should sell to them. But luckily, we didn’t.”
— Mati, [29:39]

Notable Quotes

“The voice is the next AI interface—not just for accessibility, but for creativity, immersion, and productivity.”
— Mati (paraphrased theme, throughout)
“You can really find talent everywhere. It’s just how hard and how you look for them.”
— Host, [09:12]
“Avoiding this initial knee jerk reaction that AI is bad has been tremendous.”
— Mati, [16:59]

Memorable Moments & Timestamps

[08:16] Accidental Hire: Call center agent working on TTS joins as a standout researcher.
[13:50] Marketplace Surprise: A Spanish voice that was not popular at home becomes a top-3 option in English-speaking markets.
[15:50] 18-Month Negotiation: Licensing music for AI modeling—hurdles, patience, and the eventual breakthrough.
[25:32] Brutal Product Discipline: “If not we kill the product”—shelving efforts that don’t reach fit in six months.
[29:39] Ethical Sales Decisions: Turning down a lucrative deal with a rival, prioritizing long-term company values and strategy.

Conclusion & Takeaways

ElevenLabs has become a pivotal player by treating voice as the linchpin of future human-computer interaction and by moving quickly through autonomous teams, global hiring, and relentless R&D. Its collaborative approach to the creative industries and its strict alignment of incentives with strategy as it scales stand out as a model for other AI-driven startups. As voice becomes increasingly central to AI, ElevenLabs’ lessons on research, ethics, scaling, and global teamwork are useful for any tech innovator.
