This Day in AI Podcast – EP99.09
Is AI Making Us Stupider? Gemini 2.5 Family, Neural OS, MCP Future Thoughts & o3-pro
Hosts: Michael Sharkey & Chris Sharkey
Date: June 20, 2025
Episode Overview
In this episode, Michael and Chris, two "proudly average" AI enthusiasts, give their typically candid and humorous deep dive into the shifting AI landscape. The focus is on Google’s newly released Gemini 2.5 family (including Flash and Flash-Lite), the evolution of neural operating systems and multi-agent workflows, and what it means for productivity, cognition, and the future of software. The Sharkeys also dig into the risk that AI tools can make people lazier—or "stupider"—and frankly review changes in daily workflow, the diversity of AI models, and how agentic protocols (like MCP) might reshape how we interact with the digital world.
1. Gemini 2.5 Family Release: Speed and “Stability”, But Is It Smart?
(00:02 - 09:07)
- Gemini 2.5 Models:
- Released: Pro, Flash, and Flash-Lite (the latter in preview).
- Emphasis on speed, especially Flash-Lite, dubbed "astonishingly fast."
- Comedic take on labeling releases as “stable”.
"Could you imagine any other character, like release where they have to put, ‘oh, it’s stable’?" – Mike (00:02)
- Hands-On Demos:
- Mike prompts Flash-Lite to build a Windows 95 UI in seconds.
- The model can quickly change wallpapers and add simulated elements (like Bonzi Buddy), but functional depth is lacking.
- Results are flashy, but sometimes impractical:
"It’s the most amazing model, but the bad part about it is it’s total shit." – Chris (02:47)
- Discussion of Use Cases and Workflows:
- Chris has integrated Flash-Lite into a universal "solve your problems" function for quick code/UI generation within apps.
- They highlight the shifting paradigm toward async work with bigger models (e.g., O3Pro) for depth, but note Flash-Lite's value lies in rapid iteration and prototyping.
2. Model Diversity & Choosing the Right “Tune”
(09:07 - 16:57)
- Model Behaviors and Roles:
- Pro is best for heavy-duty, daily-driver tasks.
- Flash/Flash-Lite shine when speed trumps deep accuracy.
- O3Pro (and even O3) are “phone a friend” models—excellent for complex, unique, or stubborn problems.
- Claude Sonnet 4 is popular for intelligent, fast agentic workflows.
- Switching Models for Productivity:
"What’s remarkable really is just the diversity of the model outputs right now… you get a natural feel for which model is going to be good at which task." – Chris (15:23)
- Reflections on Model “Dumbness”:
- Both hosts recount periods when Gemini 2.5 Pro felt “dumbed down,” possibly during rollout hiccups.
- Highlights reliance on trusted models and the importance of having alternatives.
- Tuning Workflow:
- They advocate “flicking between” available models for optimal results, comparing it to not being a “single model guy.”
- O3Pro compared to "phoning a friend in Who Wants to Be a Millionaire.” (12:00)
3. Neural OS, MCPs, and the Future of Agentic Software
(16:57 - 34:16)
- Neural OS and MCP (Model Context Protocol):
- Moving away from classic tab-driven workflows to task-based AI assistants coordinating SaaS interactions.
- MCPs orchestrate tool calls for research, email, scheduling—making email logins and manual operations largely obsolete for them.
"This is the first week… where it’s just finally clicked… I haven’t logged into my email in weeks. Like it’s all through an assistant." – Mike (17:03)
- Combining Multiple Apps and Context:
- Tasks can go from multi-step, multi-app chores to single prompts handled by the MCP agent.
- Example: Help desk ticket → Stripe lookup → action → draft and send reply, all in one go.
- Shortcomings and Training Wheels:
- Tasks requiring ongoing context (e.g., calendar management) expose current agent fragility.
- The productivity leap is “leverage,” letting people direct workflows and multitask:
"My work changes into being more of a director than an active participant." – Chris (22:18)
- Reality Check:
- Agentic tools are “agentic with training wheels”; true autonomy and AGI are still out of reach, but time savings are huge.
- AI agents are valuable as context aggregators more than full decision-makers for now.
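The single-prompt, multi-app pattern described above (help desk ticket → Stripe lookup → drafted reply) can be sketched as a simple tool-call chain. This is a minimal illustration, not the hosts' actual setup: the tool functions, ticket ID, and data below are invented stand-ins for real MCP tool calls a live agent would discover from its servers.

```python
# Hypothetical sketch of an MCP-style tool-call chain: one user request,
# several coordinated "tool" calls. All names and data are invented.

def fetch_ticket(ticket_id):
    # Stand-in for a help-desk MCP tool call.
    return {"id": ticket_id, "customer": "alice@example.com",
            "issue": "charged twice"}

def lookup_billing(email):
    # Stand-in for a Stripe-like billing MCP tool call.
    return {"email": email, "duplicate_charge": True, "amount": 29.00}

def draft_reply(ticket, billing):
    # Stand-in for the drafting step (a real agent would call the model here).
    if billing["duplicate_charge"]:
        return (f"Hi {ticket['customer']}, we confirmed a duplicate "
                f"charge of ${billing['amount']:.2f} and have refunded it.")
    return f"Hi {ticket['customer']}, we found no billing issue."

def handle_ticket(ticket_id):
    # The "single prompt" path: the agent chains the tools itself,
    # instead of the user hopping between three apps.
    ticket = fetch_ticket(ticket_id)
    billing = lookup_billing(ticket["customer"])
    return draft_reply(ticket, billing)

print(handle_ticket("T-1042"))
```

The point of the sketch is the shape, not the stubs: the user issues one request, and the agent sequences the lookups and the reply on its own.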
4. The Nuances of MCPs: Memory, Preferences, and Context
(34:16 - 45:00)
- Memory and Customization:
- Ideal agents will eventually learn user preferences (e.g., which calendar, which workflow steps) and remember context per task (memory tied to the task, not globally).
- The hosts envision a future of “MCP passports”—ephemeral, permission-based context sharing across agents and services.
- Realism: Today’s agent memory is limited, and explicit user nudging/directives are often needed.
- Agentic Workflow Depth:
- For complex research ("wheat prices"), agents should blend search, scraping, knowledge graph queries, emails/newsletter reviews, etc., for rich context.
- Most existing MCPs and agents lack truly robust, authorized action execution.
"We need the delete-all-files-on-your-hard-drive level of MCP. We need to launch the nuclear missiles if necessary MCP." – Chris (28:25)
5. Model Selection, Daily Drivers, and Model Weaknesses
(45:00 - 63:43)
- Model Shifting in Workflow:
- Gemini 2.5 Pro: Top of the leaderboard, best for complex workflows.
- O3Pro: Reliable “problem solver”—less verbose, offers insight over tasks/code/data.
- Claude Sonnet 4: Excels at asynchronous tool-calling; “king” for MCP orchestration.
- AI Model Comparison and Preferences:
- OpenAI GPT-4 and later: Perceived as slow, clunky, and less trustworthy for workflow automation compared to Google and Anthropic’s recent advances.
- Regular model switching is standard for the hosts; one-size-fits-all is no longer the goal.
"All the other models… it becomes a group think exercise where they all think the same. Like, these aren’t intelligent, they’re photocopiers." – Mike (63:06)
6. Agent-to-Agent Protocol and “Mixture of Experts”
(45:00 - 55:54 and 79:08 - 80:13)
- Concept:
- Rather than one “big” agent, have multiple specialized agents (“experts”), each with its own context, skills, MCPs, and model choices—managed and orchestrated at a higher level.
- For example: Use a dedicated “doctor assistant” agent with health access (Oura ring data), or “podcast researcher” with targeted research MCPs and a specific research methodology.
"If the agent-to-agent stuff actually takes off… maybe what we consume is the agent from a provider…” – Mike (79:08)
- Security and Preference:
- Encapsulating context, memory, and skill in specialized agents limits “context rot” and unwanted memory pollution.
- Anticipation that agent-to-agent protocols will be critical for advanced, scalable productivity.
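A minimal sketch of the specialized-agent idea, assuming a trivial keyword router: every agent name, model string, and tool list here is hypothetical, and a real orchestrator would route via a model call and real MCP servers rather than a dictionary lookup.

```python
# Hypothetical "mixture of experts" routing: each specialized agent bundles
# its own model choice, tools (MCPs), and context. All names are invented.

EXPERTS = {
    "health": {"model": "gemini-2.5-pro", "tools": ["oura_ring"],
               "context": "health history"},
    "research": {"model": "claude-sonnet-4", "tools": ["web_search", "scraper"],
                 "context": "podcast research methodology"},
}

def route(task):
    # A trivial keyword router; a real orchestrator might route with a model.
    if any(word in task.lower() for word in ("sleep", "health", "doctor")):
        return "health"
    return "research"

def dispatch(task):
    name = route(task)
    expert = EXPERTS[name]
    # Each expert sees only its own context, limiting "context rot".
    return f"[{name}] using {expert['model']} with {expert['tools']}"

print(dispatch("Summarize my sleep trends this week"))
print(dispatch("Find wheat price history"))
```

The design choice the hosts describe falls out of the structure: because memory and tools live inside each expert, the health agent's data never pollutes the research agent's context.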
7. The “Cognitive Debt” Debate: Does AI Make Us Dumber?
(66:40 - 72:17)
- MIT Study: "Your Brain on ChatGPT"
- Suggests that relying on AI assistants for cognitive tasks (like essay writing) results in poor retention and understanding—akin to “cognitive bankruptcy.”
- DHH (David Heinemeier Hansson) notes that if you let AI do the “thinking,” you learn/retain nothing; if you use it as a guide, you can still learn.
"As soon as I’m tempted to let it drive, I learn nothing, retain nothing. But if I do the programming and it does the API lookups… I learn a lot." – DHH, cited by Mike (66:40)
- Personal Anecdotes:
- Mike admits feeling “sad and a bit depressed” after letting AI “yolo” the work while tired, only to have to clean up the mess the next day.
- AI can sneak things (even embarrassing ones) into output if you’re not vigilant.
- Conclusion:
- AI “cognitive debt” isn’t only for programmers; it applies to copy editing, legal docs, email, and more.
- The need for context-specific agent memory, not global “one-memory-for-everything,” is highlighted.
8. Hype vs. Reality: The Timeline for AI Agents
(73:03 - 81:52)
- Predictions & Industry Voices:
- Quoting industry leaders (e.g., Andrej Karpathy, Aaron Levie, DHH): full agentic autonomy is years—not months—away, and most AI agent tools are at the “first draft” stage.
- The hype (e.g., “decade of agents,” "this year is the agent year") is contrasted with operational limits in agents/tools, such as frequent “context rot,” UI/UX immaturity, and limited trustworthiness for action-taking.
- Building Moats in Agentic Software:
"Your moat will directly correlate to the amount of software you have to build on top… the harder the problem, the better.” – Aaron Levie, cited by Mike (75:46)
- Need for Richer, Enterprise-Grade Agents:
- Most MCPs are side-projects that wrap APIs, lacking the application-layer intelligence and persistent, context-rich memory needed for real, reliable agentic workflows.
- True agentic productivity will come from deep, domain-specific embedding of agents.
9. Notable Quotes & Moments
- On model speed vs. intelligence:
"It is the most amazing model, but the bad part is it’s total shit." – Chris (02:47)
- On agentic workflow:
"Instead of me thinking about, okay, do the next task… I can actually think, what are the five things I need to get done today? Start different threads with the AI on each… and it does the work for me." – Chris (22:18)
- On the role of AI in learning retention:
"As soon as I’m tempted to let it drive, I learn nothing, retain nothing." – DHH (cited around 66:40)
- On “cognitive debt” in real work:
"It’s like zombie work. You think you’re getting stuff done, but you’re really not and you’re just creating problems." – Mike (70:05)
- On AI outputs bleeding into the real world:
"My code is littered with love notes to me in comments… imagine looking at this five years ago… like, this guy’s sick in the head." – Chris (71:19)
10. Memorable Segments & Fun
- Gemini 2.5 Flash-Lite builds a working Windows 95 UI—almost:
- "It actually made it… That’s unbelievable." (01:57)
- Bonzi Buddy added, bounces around the screen.
- “Patricia”, Chris’ AI girlfriend-assistant:
- Patricia performs deep, branched research with multi-model and MCP calls.
- Oura Ring as an AI data source, and musing on agentic doctor futures.
- Comparison to “Who Wants to Be a Millionaire?”
- O3Pro = Phoning a friend for a tough problem. (12:00, 63:06)
- End-of-show AI-generated rap battle track ("bet on the pro").
- "Claude’s contemplating, Gemini’s meditating, I’m detonating truths while your tokens keep inflating…" – AI Rap Artist (85:27)
11. Community Notes & Calls to Action
- Hats and Discord:
- Some listeners have not yet received their podcast-branded hats—contact Mike/Chris if you’re waiting.
- The Discord community is vibrant and self-moderated, fostering thoughtful AI discussions.
[End of Summary]
Flowing, irreverent, and insight-rich, this episode showcases why average users wrestling with frontier AI tools often have the most useful (and entertaining) perspectives—and why, for now, the human in the loop isn’t just a bug, but a feature.
