Podcast Summary: Generative Now | AI Builders on Creating the Future
Episode: Logan Kilpatrick: Building Google Gemini
Host: Michael Mignano, Lightspeed Venture Partners
Guest: Logan Kilpatrick, Senior Product Manager, Google AI Studio
Date: December 5, 2024
Episode Overview
This episode features a lively, in-depth conversation with Logan Kilpatrick, the lead product manager for Google AI Studio and the Gemini API, recorded live at Generative NYC. Host Michael Mignano explores Logan’s journey through high-impact AI organizations, the dynamics and strategy behind Google Gemini, the evolving landscape of application development on frontier models, the future of AI scaling, app-layer opportunities, the meaning of and prospects for AGI, and more. The discussion ranges across hands-on product building, macro AI trends, and practical advice for founders and developers.
1. Logan Kilpatrick's Journey in AI (01:11–02:38)
- Career Path
- Logan’s trajectory spans roles at Apple (machine learning engineering), PathAI (digital pathology ML), OpenAI (joining at the end of 2022 as head of developer relations), and most recently Google.
- Memorable moment: "I had a job offer at IBM at the time...I genuinely at the time did not know if I should take the IBM offer or the OpenAI offer." (01:32)
- He shares the uncertainty and excitement of joining AI startups before their current prominence.
2. What AI Looks Like Inside Google (03:27–05:14)
- Organizational Structure
- Google DeepMind builds frontier generative models (Gemini).
- Other Google teams integrate those models into products (Ads, YouTube, Search).
- Google Cloud provides external developer access via Vertex AI.
- Google AI Studio (Logan's team) focuses on developers, offering quick trial and API key provisioning.
- "That sort of symbiosis...is a great competitive advantage for us because we feel the problems that builders feel." (04:40)
- Product Differentiation
- Gemini: proprietary, cutting-edge AI models.
- Gemma: open-weight models whose weights developers can download and run on-premise, though features lag behind Gemini.
- "The advantage for Gemma is you can actually own the weights..." (08:22)
- Vertex AI: Google Cloud’s enterprise-facing AI platform.
3. AI Studio and the Developer Experience (06:20–08:07)
- Mission
- AI Studio aims to deliver that “wow” moment for developers exploring Gemini, emphasizing rapid onboarding and enabling builders to quickly take ideas into production (a minimal API sketch follows this section).
- "It's not focused on like being a true consumer product, it's really focused on get you to that wow moment, building with AI and then ultimately click, get code, get a Gemini API key and ... build the next company..." (06:29)
- Team Structure
- Cross-functional, relying on Google’s broader model quality/testing teams.
- AI Studio is the external-facing front door and API, but shares credit with many internal Google teams: "There’s a ton of teams doing work to make all this happen." (07:15)
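To make the “click, get code, get a Gemini API key” flow concrete, below is a minimal sketch of a first Gemini API call. It assumes the google-generativeai Python SDK; the model name and environment variable are illustrative, not from the episode.

```python
# Minimal sketch of a first Gemini API call (assumes the google-generativeai
# Python SDK; the model name and env var are illustrative).
import os
import google.generativeai as genai

# AI Studio provisions the API key; here it is read from the environment.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])

# Instantiate a Gemini model and run a single text prompt.
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("In one sentence, what is a context window?")
print(response.text)
```

The intent mirrors the onboarding Logan describes: prototype a prompt in AI Studio, grab the generated code and an API key, and drop both into a project.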
4. Where Value Will Accrue in the AI Stack (09:14–12:08)
- Foundation vs. Application Layers
- Declining LLM costs make the app layer fertile for innovation and profit.
- "Founders and people building stuff actually are the ones who get to accrue that value." (09:50)
- However, “wrappers” (thin interfaces around foundation models) face commoditization risks.
- Building Differentiated Apps
- Logan calls for more experimentation with interaction paradigms beyond chat and voice.
- "It's not chat that is going to create all the value." (11:12)
- Highlights NotebookLM (Google product) for reimagining AI/human interaction.
5. Sectors and Opportunities in the AI "App Layer" (12:19–14:02)
- Consumer/Enterprise Divide
- "If you grab a random person off the street… they don't know or care about AI." (12:36)
- Near-term value: enterprise productivity, especially coding.
- Cautions against “technology-first” agent platforms for consumers: "They're not interested in agents, they're interested in their life being better, easier." (13:54)
- Memorable analogy: "They didn't care about GPS, they cared about Uber. They didn't care about the camera, they cared about Instagram." (13:57, Mignano)
6. The Future of Model Scaling and Moats (14:02–18:36)
- Scaling Laws and Limits
- Scaling is not automatic—it must be “earned” through innovation.
- "Moore's Law doesn't just happen because someone wrote it down... it happens because there's thousands of engineers..." (14:40)
- Ongoing algorithmic and hardware advances can reignite scaling after plateaus.
- Compute and Specialization
- Huge compute clusters offer short-term moats (e.g., “next 100K H100 cluster…” 16:09).
- Specialized, domain-specific models provide opportunities for startups (e.g., Cartwheel): "It is hundreds of thousands of times more compute efficient..." (17:16).
- These niches let smaller players compete outside generalist AI.
7. Talent as a Key AI Moat (18:36–20:29)
- People Over Everything
- Talent flows are central to AI leadership: "If the talent is the difference between you winning and not winning..." (19:13)
- Growing consensus between guest and host: people, not just ideas or capital, are the enduring moat.
- Cautions that talent gravitating toward AI places non-AI companies at a disadvantage.
8. Defining and Debating AGI (20:29–25:07)
- Working Definitions
- Logan: AGI = “models able to do most economically productive things humans can do.” (20:38)
- Sees AGI accomplishing digital tasks before physical world tasks.
- "You could have AGI today ... [but] we're still five years away from any large scale manufacturing run of humanoid robots that are actually going to scale out..." (22:15)
- Google’s Position
- Company-wide, Google’s mission remains organizing the world’s information and making it accessible, not AGI per se.
- DeepMind’s CEO Demis Hassabis is more openly focused on AGI: "I’ve heard Demis say a bunch of times he wants to make AGI." (23:12)
- Societal Impact and Education
- Logan’s concern: technology amplifies disparities as education/access lags. "There are billions of people on Earth who have never heard of ChatGPT...If the technology is actually that powerful, I think there'll be a lot of downside for the people who aren't in the know..." (24:00)
9. AI's Impact on Software and Developers (25:07–28:05)
- Coding and Productivity
- Logan predicts continued productivity augmentation for engineers, not their wholesale replacement.
- "The AI augmented software engineer is going to be able to do an incredible amount of stuff in the future..." (25:47)
- Complexity of prompting/guardrails limits “God model” auto-app generation in the short term.
- Low-Code & Democratization
- Sees new opportunities for non-coders, echoing long-standing low-code aspirations—“needed a couple more cycles of innovation.” (27:42)
10. The Future of Gemini Across Google (28:05–29:23)
- Ecosystem Penetration
- Expect to see Gemini everywhere, including vertical domains (e.g., Med-Gemini, LearnLM, Waymo research).
- Successful products will likely use specialized variants, not just base models with generic prompting.
11. Audience Q&A Highlights
Differentiation of Gemini (29:32–31:10)
- Gemini stands out for:
- Long context window.
- "Natively multimodal": handles video, audio, text together.
- Top creative writing abilities.
- Advice: Leverage each model's unique strengths, not just the obvious ones (see the multimodal sketch below).
“Gemini is the only model that does long context. It's the only model that's natively multimodal, can take in video, can do audio and all that stuff.” (29:52)
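As a hedged illustration of the long-context and natively multimodal points above, the sketch below sends a video file alongside a text prompt in a single request. It assumes the google-generativeai SDK's File API; the file path and model name are placeholders.

```python
# Hedged sketch of a multimodal (video + text) Gemini request, assuming the
# google-generativeai SDK's File API; file path and model name are placeholders.
import os
import time
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

# Upload a local clip, then poll until the service finishes processing it.
video = genai.upload_file("demo_clip.mp4")  # hypothetical local file
while video.state.name == "PROCESSING":
    time.sleep(2)
    video = genai.get_file(video.name)

# Pass the uploaded video and a text instruction together in one request.
model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    [video, "Summarize this clip and transcribe any speech you hear."]
)
print(response.text)
```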
AI Agents and Consumer Behavior (31:12–32:59)
- Skeptical on relinquishing control for personal/subjective tasks (e.g., shopping).
- Value lies in automating “the long tail of problems I’m not interested in solving.”
- Key: Target tasks consumers dislike or can be significantly improved by AI.
Speed as an Unlock for New Use Cases (32:59–35:28)
- Faster models (e.g., 200 tokens/sec via TPUs) unlock:
- Real-time monitoring/decisioning (e.g., sports, video action, high-frequency ad copy).
- “A whole long tail of enterprise value that's gone to companies that built those domain specific vision models...” (33:24)
- Latency is critical: many current applications are bottlenecked by model response speed (a streaming sketch follows below).
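One common way latency-sensitive applications reduce perceived response time is to stream tokens as they are generated rather than waiting for the full completion. A minimal sketch, assuming the google-generativeai SDK's streaming mode (not something discussed in the episode):

```python
# Minimal streaming sketch: print text chunks as they arrive instead of waiting
# for the full response (assumes google-generativeai; model name is illustrative).
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

# stream=True yields partial chunks, cutting perceived latency for real-time UIs.
for chunk in model.generate_content("Draft three short ad copy variants.", stream=True):
    print(chunk.text, end="", flush=True)
print()
```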
Logan’s Current Big Question (35:28–36:45)
- Struggle: Balancing effort between breakthrough “frontier” AI capabilities and high-utility, “boring but useful” features.
- "There's a lot of flashy AI stuff that it's very clear is not creating the value. A lot of the value is much of the boring stuff." (35:46)
- Internal debate: Innovation signaling vs. product value.
12. Notable Quotes (by Timestamp/Speaker)
- “You can’t just let the model run wild and do that thing. It’s going to burn a whole bunch of compute and then end up with not the Vertical AI SaaS thing that you really wanted it to do in the beginning...” – Logan (26:28)
- "They're not interested in agents, they're interested in their life being better, easier." – Logan (13:54)
- “Founders and people building stuff actually are the ones who get to accrue that value.” – Logan (09:50)
- “Moore’s Law doesn’t just happen because someone wrote it down... It happens because there’s thousands of engineers at [insert whatever hardware company] making that the reality.” – Logan (14:40)
- “Gemini is the only model that does long context. It’s the only model that’s natively multimodal, can take in video, can do audio and all that stuff.” – Logan (29:52)
- “The AI augmented software engineer is going to be able to do an incredible amount of stuff in the future...” – Logan (25:47)
13. Key Takeaways
- Google’s Gemini and associated products are built through a massive cross-functional effort, blending research prowess with product engineering and deep developer empathy.
- Model cost decline is shifting the value accrual to application builders—but differentiated user experiences, not simple wrappers, will win.
- Specialized, domain-specific models will allow nimble startups to coexist with foundational models built by tech giants.
- AGI timelines remain unclear; AGI for digital tasks may arrive sooner, but real-world, embodied intelligence ("moving atoms") is farther away.
- Success in AI’s future—at Google, startups, or elsewhere—will hinge on talent, effective problem selection, and user-centric product design, not just technological bravado.
End of summary.
