This Day In AI Podcast – Episode 99.18
Date: September 26, 2025
Hosts: Michael Sharkey & Chris Sharkey
Theme: Exploring the latest advances in AI models and tools—with a trademark self-effacing and comedic “proudly average” tech enthusiast spin. This week: major updates to Gemini 2.5 Flash, Google’s agentic improvements, OmniHuman’s lifelike lip-syncing, Suno V5 for music generation, Grok 4 FAST, ChatGPT Pulse, and some classic Sharkey hijinks with AI diss tracks and video pranks.
Episode Overview
The Sharkey brothers dig deep (but not too deep) into the week's most buzzworthy AI releases:
- Gemini 2.5 Flash: Google’s snap-upgraded “agentic” model.
- OmniHuman: Next-gen lip-sync video creation.
- Suno V5: AI-powered music studio gets a new level.
- Grok 4 FAST: xAI’s massive-context, fast-response challenger.
- ChatGPT Pulse: OpenAI’s move toward hyper-personalized AI assistants.
- Meta-commentary on model usability, the job market, and AI’s inevitable march toward turning everything into ads.
- Multiple AI-generated music tracks, diss raps, “Love Rat” odes to Geoffrey Hinton, and reflections on models’ strengths and weaknesses.
Key Discussion Points & Insights
1. The Eternal Bookshelf Problem (00:08 – 00:49)
- The show cold opens with Chris being called out (again) for his perpetually broken bookshelf—a running joke that’s grown into a metaphor for “disaster looming” and madcap priorities in the Sharkey household.
- Light banter establishes their authentic tone: “All of our spare time goes to work. So it’s probably never going to reach top of the list, to be honest.” – Chris (00:35)
2. Gemini 2.5 Flash: The New Agentic Standard
(00:49 – 04:17)
- Google listened and delivered: Gemini 2.5 Flash boasts improved tool-calling and “agentic” multistep orchestration, with speed and responsiveness as the standout features.
- Despite minor benchmark improvements on paper, the practical feel is “blazing, almost too fast to believe.”
- Chris: “If I had a list of the things I’d want improved in a model, they’ve hit all of them.” (02:40)
- Patricia: Stress-tests Flash with a multistep prompt—researches new AI models, writes a diss track in “Eminem” style, and generates a song, all in one shot.
3. OmniHuman: AI Video Lip-Sync Magic
(04:17 – 09:04)
- New model “OmniHuman” enables face-swapping, image cloning, and realistic lip-sync videos based on either recordings or generated AI voices.
- Patricia shows off a miniature musical: clones her face, lipsyncs a song written/recorded via Gemini Flash, and makes a micro music video.
- Chris: “The gestures and the movement and the quality of it... It's really come a long way, hasn’t it?” (07:09)
- Both imagine broad uses: training videos, musicals, corporate communication, even video podcasts.
4. Suno V5: Next-Level Song Generation
(09:04 – 12:42)
- Suno V5 is unleashed—Patricia demos a catchy, AI-generated track and discusses how “awkward” artifacts in older Suno versions are gone.
- “Very hard to distinguish this from a real song outside of the lyrics... it just sounds a lot more sophisticated.” – Patricia (12:28)
- Brothers riff on whether they’re neglecting “serious” applications in favor of making AI musicals (“But it is fun!” – Host 2, 10:15)
- Patricia’s one-line summary:
- “If I was summarizing V5, I'd say it just gets rid of a lot of those awkward parts.” – Patricia (12:28)
5. AI Diss Tracks and Music Video Chaos
(06:37, 11:39, 75:18)
- Full Gemini 2.5 Flash-driven diss track debuts (see [Memorable Quotes]).
- Geoffrey Hinton gets “dragged” in “Love Rat,” poking fun at his old AI job-loss predictions.
- Lyric Highlight:
- “GPT 5, you call yourself unified, that’s rich! You just patched up your flaws, you were leaking in the ditch...” – Chris, as AI rapper (06:37, 75:18)
6. Grok 4 FAST and the Multi-Model Shuffle
(16:58 – 36:21)
- The hosts are hands-on with Grok 4 FAST, which boasts a 2M token window, speed, and “parallel research” tool calling.
- Host 2: “In terms of understanding a problem and diagnosing what needs to be done, [Grok 4 FAST] was actually for me better than Gemini 2.5.” (30:35)
- Price caveats: cost doubles when using over 150k tokens.
- Kimi K2 gets a shout-out for speed and niche brilliance (“It’s the best at horse racing, by the way...”).
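The pricing caveat above is easy to reason about with a quick calculation. The sketch below is illustrative only: the 150k threshold and the doubling come from the episode summary, while the function name, the base price, and the choice to apply the doubled rate to the whole request (rather than only to the tokens past the threshold) are assumptions, not published pricing.

```python
def estimate_cost(tokens: int,
                  base_price_per_million: float,
                  threshold: int = 150_000,
                  multiplier: float = 2.0) -> float:
    """Estimate request cost under a simple two-tier pricing scheme.

    Assumption: once a request exceeds `threshold` tokens, the whole
    request is billed at `multiplier` times the base rate.
    `base_price_per_million` is a placeholder, not a real price.
    """
    over_threshold = tokens > threshold
    rate = base_price_per_million * (multiplier if over_threshold else 1.0)
    return tokens / 1_000_000 * rate


# Hypothetical base price of 1.00 per million tokens.
small_request = estimate_cost(100_000, base_price_per_million=1.0)
large_request = estimate_cost(200_000, base_price_per_million=1.0)
```

Under these assumptions, doubling the prompt size from 100k to 200k tokens quadruples the bill, which is why splitting a long context across several shorter requests can be cheaper.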
7. Tool Calling & Daily Driver Model Philosophy
(18:37 – 29:39)
- The Sharkeys echo popular sentiment: no single model rules all tasks—Gemini 2.5 Pro (old faithful) is their fallback for reliability, but GPT-5 remains the “smartest” for research and creative challenges.
- Discussion on the utility of “mid-tier” models for enterprise rollouts: speed, affordability, context size matter as much as smarts.
8. Enterprise Workflows, MCPs, and Routing
(26:26 – 29:10)
- Segment on breaking down big problems into lots of little tools for agents to orchestrate—advantageous with fast, affordable models like Flash.
- Scene-setting for the dream: composite “AI routers” dynamically allocating sub-tasks between big and small models.
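The routing idea can be sketched in a few lines. This is a toy illustration, not how any product discussed on the show works: the `SubTask` fields, the `"fast"` and `"large"` tier labels, and the 100k-token cutoff are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SubTask:
    name: str
    needs_deep_reasoning: bool  # hypothetical flag a planner would set
    context_tokens: int         # rough size of the input for this step


def route(task: SubTask, fast_model_limit: int = 100_000) -> str:
    """Pick a model tier for one sub-task.

    Hard or very large sub-tasks go to the large model; everything
    else goes to the fast, cheap model.
    """
    if task.needs_deep_reasoning or task.context_tokens > fast_model_limit:
        return "large"
    return "fast"


def plan(tasks: list[SubTask]) -> dict[str, str]:
    """Map each sub-task name to the tier chosen by `route`."""
    return {task.name: route(task) for task in tasks}


assignments = plan([
    SubTask("fetch_release_notes", needs_deep_reasoning=False, context_tokens=2_000),
    SubTask("diagnose_bug", needs_deep_reasoning=True, context_tokens=8_000),
    SubTask("summarize_repo", needs_deep_reasoning=False, context_tokens=400_000),
])
```

A real router would likely use a learned or model-based difficulty estimate instead of a boolean flag, but the shape is the same: many small tool-like steps, each sent to the cheapest model that can handle it.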
9. Nvidia & Oracle’s AI Arms Race
(36:21 – 39:57)
- Huge infrastructure investments: $100B+ by Nvidia, $300B+ by Oracle/SoftBank for data center hardware.
- Skepticism on whether compute scaling alone is the answer, especially if model breakthroughs boost efficiency.
10. Radiologists, “Islands of Automation,” & The Job Market
(41:28 – 61:05)
- The infamous Geoffrey Hinton prediction (“don’t become a radiologist!”) is called out as wrong, at least so far: radiologists are doing fine, wages are up.
- Patricia: “These models and these tools just make you far better at your job... You are just becoming more efficient at your job.” (45:33)
- Theory: AIs replace “grunt work,” but humans are still needed for complex synthesis and managing/exploiting model outputs.
- Jevons Paradox: Increased efficiency = increased demand, not job elimination.
11. Model Personalities, The Need to Switch, and “Bitching” to AIs
(48:49 – 53:49)
- Real-world: switching between Claude, Gemini, GPT for better results is normal.
- Patricia: “It's almost like you're bitching about the other model. You're like, oh hey, this is kind of what I got so far...” (49:20)
- Model “variety” adds creative value and problem-solving breakthroughs.
12. Agentic Future: True Agency, Tools That Build Themselves
(53:49 – 57:55)
- Deep dive into next-gen agency: models that can assemble their own toolkit, edit their own context, and “prune” errors as they go.
- Host 2: “If the agents themselves can change themselves to suit the problem they're trying to solve... that would be a real step forward.” (57:25)
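One way to picture an agent “pruning” errors from its own context is a pass over its step history that drops failed attempts once a later attempt with the same tool has succeeded. The sketch below is a minimal illustration under that assumption; the history format (dicts with `tool` and `ok` keys) and the function name are hypothetical and not taken from any framework mentioned in the episode.

```python
def prune_failed_steps(history: list[dict]) -> list[dict]:
    """Drop failed tool calls that a later successful call superseded.

    A failure is kept only if no later step with the same tool
    succeeded, since an unresolved error is still useful context.
    """
    succeeded_later: set[str] = set()
    kept: list[dict] = []
    # Walk backwards so we know, for each step, what succeeded after it.
    for step in reversed(history):
        if step["ok"]:
            succeeded_later.add(step["tool"])
            kept.append(step)
        elif step["tool"] not in succeeded_later:
            kept.append(step)
    kept.reverse()
    return kept


history = [
    {"tool": "search", "ok": False, "note": "timeout"},
    {"tool": "search", "ok": True, "note": "3 results"},
    {"tool": "render_video", "ok": False, "note": "bad codec"},
]
pruned = prune_failed_steps(history)
```

Here the timed-out search is removed because a later search succeeded, while the unresolved `render_video` failure stays in context.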
13. Adpocalypse: ChatGPT Pulse and The Next Google
(61:20 – 68:19)
- New OpenAI “Pulse” feature: daily updates drawn from your chats, with personalized summaries. It walks a razor’s edge between utility and privacy horror.
- Hosts are divided: handy features, but “just another way to sell hyper-targeted ads.”
- Patricia: “Everything this generation of technologists do... always comes back to how do we sell ads more targeted. And I really thought with AI it might be different.” (68:21)
14. Closing Reflections
- Excitement on the rapid responsiveness of new models (“Gemini 2.5 Flash is a genuine step up”), but the core challenge is now better tools for the models, not smarter models for the tools.
- Both see strong potential in agentic systems that self-organize, self-select tools, and dynamically route tasks for speed and efficiency.
- Closing plug: All demoed tools available via SimTheory.ai (with coupon code “still relevant” for a discount).
Notable Quotes & Moments
• “The gestures and movement and the quality of it... It's really come a long way, hasn't it?”
— Chris (07:09, on OmniHuman)
• “If I had a list of the things I'd want improved in a model, they've hit all of them.”
— Host 2 (02:56, on Gemini Flash)
• “Just patched up your flaws, you were leaking in the ditch. You're the high price model, the one that breaks the bank while I'm saving the tokens, you're running on empty tank...”
— Chris/AI Rap, “Agentic Upgrade” (06:37 and 75:18)
• “I might see if I can actually implement this into Video Maker as an option where it can cut between scenes of different people talking. Would be really cool. Anyway, I made you a present.”
— Patricia (08:24, on creative possibilities with OmniHuman)
• “It just gets rid of a lot of those awkward parts of the previous tracks that weren't even that bad. But now it's very hard to distinguish this from a real song.”
— Patricia (12:28, on Suno V5)
• “No matter how much stuff you put in the prompt to tell [GPT-5] it's your AI girlfriend... it just does its own thing. It's like its own little being.”
— Patricia (20:47)
• “Radiology wages are up 48%. Yet AI has exploded in the field. So what happened?”
— Patricia (41:38, referencing the “islands of automation” effect)
• “These models and these tools just make you far better at your job... you are just becoming more efficient.”
— Patricia (45:33)
• “It's almost like you're bitching about the other model. You're like, 'Hey, this is what I got so far.' And it forces you to reprompt... Maybe it's not always the model switch, but it's your frame going and bitching to a completely different superintelligent model.”
— Patricia (49:20)
• “If the agents themselves can change themselves to suit the problem they're trying to solve… I think that would be a real step forward in intelligence.”
— Host 2 (57:25)
• “Everything this generation of technologists do... always comes back to how do we sell ads more targeted. And I really thought with AI it might be different.”
— Patricia (68:21)
Timestamps for Major Segments
| Segment | Start | Notes |
|---------|-------|-------|
| Shelf metaphor, opening banter | 00:08 | Chris, Patricia |
| Gemini 2.5 Flash deep dive | 00:49 | Tool-calling, benchmarks |
| Gemini 2.5 Flash “diss track” | 06:37 | Song debut, Agentic rap |
| OmniHuman demo & use cases | 04:17 | Face cloning, lip sync |
| Suno V5 song demo | 09:04 | Music generation |
| “Love Rat” Geoffrey Hinton ode | 11:39 | Satirical musical |
| Grok 4 FAST evaluation | 29:39 | Speed, context, cost |
| Practical model comparisons | 16:58 | Safety models, “old friends” fallback |
| Nvidia/Oracle AI infra chat | 36:21 | Market speculation |
| Radiology, automation | 41:28 | Hinton callout, job market |
| Bitching about models, switching | 48:49 | Model variety value |
| Agency, model “self-editing” | 53:49 | Next-gen tool orchestration ideas |
| ChatGPT Pulse/Ad skepticism | 61:20 | Personalization vs privacy |
| Closing/Reflections | 69:11 | Agentic futures; SimTheory plug |
If You Only Listen to One Segment
- [06:37]: The AI-generated “Agentic Upgrade” diss track—showcases the practical, creative power of current AI (lyrics, music, video).
- [29:39] and [30:35]: Grok 4 FAST deep-dive—highlights real productivity gains and model diversity.
- [41:28] onwards: Big-picture discussion of how AI is reshaping jobs, undercutting oversimplified predictions, and what “agency” really looks like in practice.
Episode Takeaways
- Gemini 2.5 Flash is a genuine step up: Near-instant tool-calling, “agentic” orchestration, and a major preview of Google’s likely strategy for Gemini 3.
- Fast, “mid-tier” models gain power: In many workflows, speed and cost-effectiveness are now more important than peak “intelligence.”
- AI multimedia is now plug-and-play: Tools like OmniHuman and Suno V5 make complex creative outputs accessible, funny, and remarkably realistic.
- The “one true model” doesn’t exist: Savvy users blend models (including Claude, Gemini, GPT, Grok, and Kimi) for the best outcome, and sometimes just for a creative “second opinion.”
- Workflows are fragmenting: The “agentic” future is one of AI assembling its own toolkit, learning to evaluate its success, and (possibly) rewriting its own instructions and context on the fly.
- Personalized AI is coming—for better or worse: OpenAI and others are racing to “own” the user’s context, with the ultimate goal (as always) being hyper-targeted advertising.
- Most AI “jobpocalypse” predictions are overblown: Gains in efficiency are creating new kinds of demand and roles, not simply replacing humans.
For more details or to try the tools discussed, visit SimTheory.ai. Coupon code: “still relevant”.
Endnote
This episode blends technical insights with playful satire—musical AI demos and “diss tracks” sit beside practical discussions of workflow, model diversity, and the social/economic ripple-effects of rapid AI penetration. The tone is irreverent, curious, and self-aware, perfectly matching the “proudly average” brand of the show. Even if you’re light on AI expertise, you’ll leave entertained—and perhaps with a musical nagging suspicion that Gemini 2.5 Flash just might be coming for GPT-5’s lunch.
