Podcast Summary: This Day in AI Podcast
Episode: Can o3-pro write a book? ElevenLabs v3 and MCP as the AI Business Model - EP99.08-PRO
Date: June 13, 2025
Hosts: Michael Sharkey, Chris Sharkey
Episode Overview
In this episode, Michael and Chris Sharkey explore the latest advances in generative AI, focusing on ElevenLabs v3 voice synthesis, OpenAI's O3-Pro and its new pricing, and the rising significance of MCP (Model Context Protocol) as a future AI business model. The brothers test O3-Pro's capabilities in creative writing, debate the emerging economic potential of AI-accessible data, and reflect on how conversational AI's growing toolset may reshape journalism, software, and information access.
Key Discussion Points & Insights
1. ElevenLabs v3: Advances in AI Voice Synthesis
[00:04 – 06:37]
- Demo & Hands-On Impressions:
The Sharkeys experiment with ElevenLabs v3's new "emote technology," which enables nuanced emotional cues in voice generation through tagging (e.g., laughter, whispering, dramatics).
- Mike: "You can designate the voices, but then they have an automatic mode... add all the emotes for me based on the text. You don't even need to go and do it." [03:09]
- Chris notes its emotion fidelity outpaces Microsoft’s comparable offering.
- Current Limitations:
- Voice cloning is still weak in V3’s alpha, but generic voices are impressively emotive.
- For longform content (audiobooks, news), cutting and editing multiple takes can yield strong results.
- Broader Impact:
- AI-generated voices could soon perform entire plays or audiobooks with emotional accuracy.
- Chris jokes about the impact on professional narrators:
"Imagine the poor person who did like the Wuthering Heights audiobook... now a computer can just do it in 15 minutes." [03:09]
- Practical Observations:
- V3 handles context-dependent prompts (like “haha” for laughter) automatically.
- Bulk processing and emotive tags make it ideal for audio content creators.
- ElevenLabs is noted for rapid releases, with V3 being among their most impressive leaps.
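The tagged-script workflow described above can be sketched as a request payload. This is a minimal sketch, not a verified integration: the endpoint path follows ElevenLabs' public text-to-speech REST API, but the `model_id` value, the placeholder voice ID, and the specific tags shown are assumptions to check against the official documentation.

```python
# Sketch: building an ElevenLabs v3-style text-to-speech request with
# inline emote tags. "eleven_v3" and the voice ID are assumed placeholders.
VOICE_ID = "your-voice-id"  # hypothetical; use a real voice ID from your account
URL = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"

payload = {
    "model_id": "eleven_v3",  # assumed identifier for the v3 model
    "text": (
        "[whispers] I never thought the experiment would work... "
        "[laughs] but it actually did!"
    ),
}

# To send for real (requires an API key):
# import requests
# resp = requests.post(URL, json=payload,
#                      headers={"xi-api-key": "YOUR_KEY"})

print(payload["text"])
```

As the hosts note, v3 can also infer emotes automatically from context (e.g., a literal "haha" in the text), so explicit tags like these are optional rather than required.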
2. OpenAI O3 and O3-Pro: Pricing, Performance & Positioning
[06:37 – 14:24]
- O3-Pro Pricing Shift:
- OpenAI significantly cut O3's API pricing—now undercutting GPT-4o—and introduced O3-Pro at a far lower price point than O1-Pro, making these models viable for more practical, day-to-day API operations.
- Mike: "For comparison, input tokens on GPT-4o is $2.50 per million, and on O3 it's now $2 per million, so it's $0.50 cheaper." [07:44]
- Previously, higher-end models like O1 Pro were prohibitively expensive and not useful outside ChatGPT.
- Strategic Implications:
- Hosts speculate whether pricing aims to undercut competitors (e.g., Gemini 2.5 Pro, Anthropic).
- O3-Pro’s speed and tool-calling prowess make it stand out for tasks requiring multi-step reasoning.
- Model Use Cases and Limitations:
- O3-Pro excels as a background agent—ideal for long, context-heavy, step-intensive jobs rather than quick chats.
- Chris: "It really is a single shot, big job, your hardest problems kind of model. Not a day to day driver." [10:19]
- A key shift: users must rethink how and when to deploy each model—the question is less “What can it do?” and more “What context or problem is it suited for?”
- Some early community (Reddit) criticism is dismissed as a misunderstanding—O3-Pro isn't meant for casual “hello, how are you” chats.
3. O3-Pro in Practice: Prompt Engineering, Tool Calling & 'Book Test'
[14:24 – 41:36]
- Behavioral Shifts in the Model:
- O3-Pro gives more precise, less verbose answers than prior OpenAI models.
- It avoids unhelpful “data dumps” and can zero in on key issues—sometimes with higher accuracy than competitors, sometimes surprisingly wrong.
- Chris: "[O3-Pro] picked this non-obvious answer... it just felt like it wasn't shortcutting. It actually thought it through and came to its own conclusion." [14:24]
- Prompting and “Doom Paths:”
- O3-Pro resists conversational “doom paths” (repetitive or context-locked threads) due to its slower and more asynchronous workflow.
- Tool Calling & MCP (Model Context Protocol):
- Tool calling is essential for these models; O3-Pro relies even more on user instruction for when and how to use tools.
- Chris: "O3 Pro just right off the bat for me seemed like it was more inclined to try to answer the question itself rather than to go off and use a whole bunch of tools. And I really had to nudge it in that direction." [19:17]
- MCP explained:
- A way of connecting AI models with external tools via APIs, unlocking new research, automation, and agentic capabilities.
- Two architectures: Hosted publicly for AI to discover/access tools, or provided as a local list for more selective control.
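The tool-discovery and tool-invocation flow described above can be illustrated with the JSON-RPC 2.0 messages MCP defines. The method names (`tools/list`, `tools/call`) follow the MCP specification; the example tool itself (`get_balance_sheet`) is made up for illustration.

```python
import json

# 1. The client asks an MCP server what tools it offers.
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# 2. A server might answer with a tool catalogue like this
#    (hypothetical financial-data tool, per the episode's example).
list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [{
            "name": "get_balance_sheet",
            "description": "Fetch a company's balance sheet.",
            "inputSchema": {
                "type": "object",
                "properties": {"ticker": {"type": "string"}},
                "required": ["ticker"],
            },
        }]
    },
}

# 3. The model then invokes a tool by name with structured arguments.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "get_balance_sheet",
               "arguments": {"ticker": "ACME"}},
}

print(json.dumps(call_request, indent=2))
```

Both architectures the hosts mention fit this shape: a publicly hosted server lets any AI client discover the catalogue via `tools/list`, while a locally configured server list gives the user selective control over which catalogues the model ever sees.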
- Future of AI Agents:
- As MCP adoption increases, economic opportunities abound: Data providers, SaaS platforms, even governments can monetize trusted, high-fidelity data/API access for AI agents.
- Chris: "I strongly believe this is going to be the future. And I think there's going to be an entire economy around it where you pay for access to these worlds... trusted data... through mcp." [41:36]
- Practical Examples & Use Cases:
- Financial datasets as MCP: pulling balance sheets, crypto data for analysis by AI.
- Journalism: News orgs as “raw fact” MCPs, letting users build their own trusted feed, potentially disrupting traditional news.
- Research and memory: MCPs could grant AI agents read/write access to personal memories or academic databases—revolutionizing study and retrieval.
4. The O3-Pro Book Writing Experiment
[52:41 – 71:37]
- Experimental Setup:
- Mike challenged three LLMs (O3-Pro, Gemini 2.5 Pro, and Claude Sonnet 4) to write a space-themed retelling of The Count of Monte Cristo. Each model generated a blurb and opening chapter. The results were converted to audio via ElevenLabs v3 for a blind comparison.
- Mike: "I heard a lot of rumors that O3-Pro really excelled at creative writing..." [52:41]
- Results:
- Chris, as the subjective judge, found O3-Pro's sample the most compelling, Gemini 2.5 Pro a close second for faithfulness to the original plot, and Sonnet 4's result flat and cliched.
- The book test bore out the rumors: O3-Pro is an exceptionally strong creative writer, even if it doesn't follow instructions to the letter.
- Notable Quote:
- Chris, after hearing O3-Pro’s sample: "That's damn compelling. I kind of didn't want it to stop. I was genuinely enjoying that. That's going to be very, very hard to beat. I want to know what happens." [58:12]
- Mike: "I genuinely would keep reading this story." [68:50]
- The experiment highlighted both O3-Pro's strengths (creativity, focus, ability to draw in a reader) and the subjective nature of creative output evaluation.
5. Broader Reflections: AI Tool Stacks, Data Models, and the Future
[41:36 – 77:16]
- The Emerging Economy of AI Tools/Data:
- Hosts anticipate an “App Store”-like landscape where MCPs, specialized data sources, and API toolsets become products and subscriptions, leading to a new gold rush in data-driven SaaS.
- Mike: "This seems like the economic delivery method to me." [50:53]
- Parallel Tool Calling & Model Differentiation:
- Claude Sonnet 4 is praised for aggressively leveraging parallel tool calls, a crucial advantage for complex, multi-step workflows.
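The advantage Chris describes—"call these seven tools and then get back to me with all of the results"—comes from dispatching tool calls concurrently instead of one at a time. A minimal sketch with toy stand-in tools (the tools and latencies are invented for illustration):

```python
import asyncio
import time

# Toy stand-ins for tools a model might call; each simulates I/O latency.
async def fetch_price(ticker: str) -> str:
    await asyncio.sleep(0.2)
    return f"{ticker}: $100"

async def fetch_news(ticker: str) -> str:
    await asyncio.sleep(0.2)
    return f"{ticker}: no news"

async def run_parallel() -> list:
    # Dispatch both tool calls at once and wait for all results,
    # instead of awaiting each call sequentially.
    return await asyncio.gather(fetch_price("ACME"), fetch_news("ACME"))

start = time.perf_counter()
results = asyncio.run(run_parallel())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")  # finishes in ~0.2s total, not ~0.4s
```

Sequential awaits would take the sum of the latencies; `asyncio.gather` takes roughly the maximum, which is why parallel tool calling matters so much for multi-step agent workflows.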
- Changing Information Workflows:
- AI agents will shift info-seeking from search and curation to “give me the best answer from my chosen sources, as deep as required.”
- Curation and source selection (e.g., “Reuters MCP”), rather than content writing, becomes the valuable role.
- User Empowerment & Interface Paradigm Shift:
- Both developers and non-developers can orchestrate powerful custom workflows by mixing and matching MCPs (news, finance, government, academic, actions).
- The AI assistant of the near future will go well beyond chat—autonomously deciding when and how to fetch, act, or summarize information.
- Outlook on Model Innovation:
- The current period, while sometimes perceived as stagnant, is seen as the "preview of the future"—things will accelerate as tool ecosystems mature and integration becomes seamless.
- Chris: "It's not going to be like, which is the best chat model anymore." [77:16]
Notable Quotes & Memorable Moments
- On ElevenLabs v3's Progress:
Mike: "You can put the entire script in, right, and designate the voices... and have an automatic mode where you can say, add all the emotes for me based on the text. You don't even need to go and do it." [03:09]
- On O3-Pro's Creative Writing Ability:
Chris (after listening to the book test, clip 1): "That's damn compelling. I kind of didn't want it to stop. Like I was genuinely enjoying that. That's going to be very, very hard to beat." [58:12]
- On the Future of AI-Driven Journalism:
Mike: "It mightn't be the journalists writing the article, it might just be this Reuters MCP with all the facts that their journalists on the ground have found out." [41:13]
- On Parallel Tool Calling:
Chris: "The parallel tool calling, I think, is a vastly superior way to do it... call these seven tools and then get back to me with all of the results." [25:53]
- On the Paradigm Shift:
Mike: "I think we're just so used to this stuff now... this is just a preview of the future. It's not actually something that's that useful right now. But... as we increasingly add MCPs... it's an async, different world and it's coming and it is exciting and these models are the models that will power it." [76:16]
Key Timestamps for Important Segments
| Timestamp | Segment Description |
|-----------|---------------------|
| 00:04 – 06:37 | ElevenLabs v3 overview, emotes demo, voice cloning discussion |
| 06:37 – 14:24 | O3/O3-Pro pricing, model strengths, user strategies |
| 14:24 – 19:17 | Prompting, “doom path” explanation, model behaviors |
| 19:17 – 28:21 | Tool calling, MCP explained, integration challenges |
| 28:21 – 41:36 | AI agent ecosystem, business model futures, data curation |
| 52:41 – 71:37 | Book writing test: setup, results, discussion, literary analysis |
| 71:37 – End | Reflections, future tests (e.g., vision), conclusions, “boom factor” ratings |
Tone and Takeaways
True to the podcast’s promise, the hosts deliver accessible, self-deprecating, and humor-laced tech insights. This episode blends practical experimentation with broader speculation about where generative AI is headed, repeatedly highlighting how agentic AI, tool-calling, and data economy models could displace today’s workflows—whether in business, research, or creative writing.
The take-home message:
- AI is entering an era where the orchestration of tools, sources, and context will matter as much as the models themselves.
- Economically, the next big waves will be about who owns, curates, and supplies the data/tools for AI assistants.
- O3-Pro, while not flawless, is a major step forward—especially in creative and research contexts.
Listener Prompt:
The Sharkeys invite listeners to weigh in on the “book writing test” and share which sample they preferred—did O3-Pro’s creative storytelling impress you most?
End of Summary
