GKE 10 Year Anniversary, with Gary Singh
Kubernetes Podcast from Google
Hosts: Abdel Sghiouar & Kaslin Fields
Guest: Gary Singh, Outbound Product Manager, Google Kubernetes Engine
Date: October 29, 2025
Episode Overview
This special episode celebrates the 10-year anniversary of Google Kubernetes Engine (GKE). Kaslin and Abdel sit down with Gary Singh, Outbound Product Manager for GKE, to explore the journey of GKE over the past decade. Together, they reflect on the early days of Kubernetes and managed container orchestration, discuss how the landscape has evolved—especially with the rise of AI and automation—and consider what the future holds for GKE and Kubernetes at large. The conversation bridges lessons from the past, present improvements, and a vision for cloud-native operations driven by AI.
Key Discussion Points & Insights
Gary’s Story: From Early Kubernetes to Product Manager at GKE
- Gary’s Background ([03:38]):
- Gary’s role involves outbound activities, lots of customer interactions, and product testing.
- Previously worked at IBM, involved with early container technologies before Kubernetes and GKE.
- Used GKE as a customer before joining Google:
“...when I got here... thought it'd be cool to come to Google and work on Kubernetes. So that's... my mission and I got here.” ([05:29])
- Learning by Doing ([04:44]):
- Emphasizes being a "doing learner" and hands-on tester:
“I'm a doing learner.” ([05:12])
Early Days of Kubernetes & GKE
- Challenges in Early Adoption ([06:47]):
- Early Kubernetes and the era of “container orchestrator wars.”
- Setting up Kubernetes was difficult; tools like Minikube would have made adoption much easier:
“What if we had had Minikube on day one? ...there was a lot of struggles in setting up Kubernetes in the early days.” ([08:08])
- Barrier to Entry ([08:51]):
- Kubernetes was built for large enterprise-scale workloads, making it hard for individuals to test out.
Evolution and Milestones in GKE
- Simplifying Operations ([09:56]):
- Early GKE releases simplified running multi-machine Kubernetes clusters but still required networking expertise.
- The introduction of GKE Autopilot (2021) enabled "one-click production clusters" (see the sketch after this list):
“...with Autopilot... you really can have one click production clusters... push a button or you run GCloud... and now you know, we've sort of got that.” ([09:56])
- Focus Shift: From Infrastructure to Workloads
- GKE improvements have reduced the management burden—users focus more on applications and less on infrastructure.
- Automated upgrades and hands-off operations now possible:
“I leave clusters running and they upgrade and things seem to work right, you know, kind of that hands off experience.” ([09:56])
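For illustration only, here is a minimal sketch of that "push a button or run gcloud" experience from code, using the Google Cloud Go client for GKE. The project, region, and cluster name are placeholders, and the equivalent CLI call is roughly `gcloud container clusters create-auto NAME --region=REGION`; treat this as a sketch under those assumptions, not a definitive recipe.

```go
// Hedged sketch: requesting a GKE Autopilot cluster with the Google Cloud Go client.
// Project ID, region, and cluster name are placeholders.
package main

import (
	"context"
	"fmt"
	"log"

	container "cloud.google.com/go/container/apiv1"
	"cloud.google.com/go/container/apiv1/containerpb"
)

func main() {
	ctx := context.Background()

	client, err := container.NewClusterManagerClient(ctx)
	if err != nil {
		log.Fatalf("creating cluster manager client: %v", err)
	}
	defer client.Close()

	// Autopilot-enabled cluster: GKE manages nodes, upgrades, and scaling.
	op, err := client.CreateCluster(ctx, &containerpb.CreateClusterRequest{
		Parent: "projects/my-project/locations/us-central1", // placeholder project and region
		Cluster: &containerpb.Cluster{
			Name:      "anniversary-demo", // placeholder cluster name
			Autopilot: &containerpb.Autopilot{Enabled: true},
		},
	})
	if err != nil {
		log.Fatalf("creating cluster: %v", err)
	}
	fmt.Printf("cluster creation started: operation %s\n", op.GetName())
}
```

With Autopilot enabled, node provisioning, upgrades, and scaling are handled by GKE, which is the "hands-off experience" Gary describes above.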
Serverless, Extensibility, and Customization
- Serverless & “Prescription without Restriction” ([13:57]):
- The ecosystem now offers a spectrum—from deep sysadmin control to almost serverless operations.
- Striving for the right balance:
“The things should just work... then if you so desire... go down and tweak a specific parameter.” ([14:49])
- Extensibility—A Core Kubernetes Superpower ([15:45]):
- Flexibility sometimes feels “unfinished” by design, enabling powerful platform-building.
- The challenge is keeping user experience simple as extensibility grows.
AI Workloads and Infrastructure Optimization
- AI’s Impact on GKE ([17:39]):
- Modern AI workloads demand sophisticated, resource-aware scheduling and scaling.
- Kubernetes must now balance ease-of-use with deep hardware configurability.
- GKE enhancements, like custom compute classes and tight accelerator integration, bring flexibility without user burden (see the sketch after this list):
“...the main thing... is... leverage Kubernetes for its power of scaling ...but do they want to have to go in and configure every single kernel parameter, the overlay networking...?” ([18:22])
- Abstracting Complexity for Data Scientists:
- Expose just the necessary tuning points; optimize defaults for typical cases.
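As a sketch of what "expose just the necessary tuning points" can look like, the workload below only names a compute class through a nodeSelector and leaves machine selection to GKE. The class name `inference-optimized`, the namespace, and the image are hypothetical, and this assumes a platform team has already defined such a custom compute class in the cluster.

```go
// Hedged sketch: a workload that delegates hardware choice to a GKE custom
// compute class. The class "inference-optimized" and the image are hypothetical.
package main

import (
	"context"
	"fmt"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	ctx := context.Background()

	// Load kubeconfig from the default location (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatalf("loading kubeconfig: %v", err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatalf("creating clientset: %v", err)
	}

	// The pod only names a compute class; GKE decides which machine families
	// and capacity types back it, so the user is not tuning kernel or
	// networking parameters by hand.
	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "batch-inference"},
		Spec: corev1.PodSpec{
			NodeSelector: map[string]string{
				"cloud.google.com/compute-class": "inference-optimized", // hypothetical class
			},
			Containers: []corev1.Container{{
				Name:  "worker",
				Image: "us-docker.pkg.dev/my-project/repo/inference:latest", // illustrative image
			}},
		},
	}
	created, err := clientset.CoreV1().Pods("default").Create(ctx, pod, metav1.CreateOptions{})
	if err != nil {
		log.Fatalf("creating pod: %v", err)
	}
	fmt.Printf("created pod %s\n", created.Name)
}
```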
The Many Rabbit Holes: Platform Engineering and Abstraction Choices
- Platform Engineering and Avoiding Rabbit Holes ([20:31]):
- Kubernetes gives users many possible problem spaces ("rabbit holes"); GKE aims to let users choose which to dive into.
- Rise of platform engineering—building organization-specific platforms atop Kubernetes and GKE.
- GKE adds features for multi-cluster management, config, and scaling to help platform teams abstract and automate more.
Looking Forward: GKE & Kubernetes in the Future
- The Future: AI for Operations (AIOps, Autonomics) ([22:21]):
- Next wave is using AI to operate Kubernetes:
“How do we leverage AI more in terms of like running GKE and ...operating in your workloads?” ([22:21])
- Moving from users worrying about “how” to simply stating service objectives ("I want 150ms response time, figure it out").
- Dynamic, AI-driven customization: personalize dashboards, automate scaling decisions, turn logs and metrics into real-time recommendations.
- Observability & LLMs for Operations ([25:59]):
- AI can synthesize immense observability data, automate tasks, and reduce the "mental load" for cluster admins.
- True innovation is not just YAML generation but agent-based, data-aware operational intelligence.
Debating Automation & Trust in AI Agents
- SRE Concerns About AI Agents ([31:27]):
- Platform engineers may be cautious about giving control to agents; trust must be built over time.
- Early uses: reporting, snapshot generation, pod resizing proposals rather than automatic critical actions.
- Vision: Agents learn in safe dev environments, suggest or enact optimizations in prod.
- Non-determinism and Offloading Complexity ([33:22]):
- AI allows for less deterministic, more adaptive operational logic.
Notable Quotes & Memorable Moments
- “I'm a doing learner.” — Gary Singh ([05:12])
- “What if we had had Minikube on day one? ...there was a lot of struggles in setting up Kubernetes in the early days.” — Gary Singh ([08:08])
- “We can scale the infrastructure, but it's how do we figure out what the right way to scale the apps is... And I think that's where we can really help with the agents.” — Gary Singh ([31:51])
- “That's the thing with Kubernetes is that it's not just one hole to fall into. It's so many rabbit holes...” — Kaslin Fields ([20:31])
- “Prescription without restriction... things should just work... and then allow people to customize and tweak, where necessary.” — Gary Singh ([14:49])
- “There's always this notion that, yeah, I mean that's the, that's the trust over time, right. That you start to build [with automation].” — Gary Singh ([31:39])
- “Imagine a world where you're like, hey, here's my workload and you ... want to scale up in a new region... What if I could just deploy a workload, tell you what region I want it in. ...If we don't [have a cluster], we create a new cluster and... do that in like less than 20 seconds?... That to me is like the ultimate in ... serverless Kubernetes.” — Gary Singh ([38:10])
Timestamps for Key Segments
- Gary’s Introduction & Background – [03:38]
- Early Container Orchestration Wars – [06:47]
- Barriers to Entry, Setup Challenges – [08:08]
- How GKE Has Changed / Rise of Autopilot – [09:56]
- Serverless, Extensibility, Platform Engineering – [13:57] to [16:47]
- Adapting to AI Workloads, Custom Compute – [17:39] to [20:31]
- Platform Engineering, Multi-cluster Features – [20:49]
- The Future: AIOps, Automated Operations – [22:21]
- Observability & Data-Driven Ops – [25:59], [27:04]
- AI Automation, SRE Concerns, Trust – [31:27] to [33:22]
- Favorite New GKE & Kubernetes Features (In-place Pod Resizing) – [36:09]
- Top Feature Wish: Instant Cluster Creation/Serverless Kubernetes – [38:10]
Favorite Features & Wishlist
Favorite Recent Features ([36:09]):
- Open Source: In-place Pod Resizing (IPPR); a client-go sketch follows at the end of this section
- GKE: Container Optimized Compute (dynamic underlying compute resizing)
Most-Wanted Future Feature ([38:10]):
- Instant, on-demand cluster creation—“serverless Kubernetes”—where the user only needs to define their workload and region; GKE spins up clusters/compute resources in seconds, removing manual cluster management.
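For the in-place pod resizing favorite above, here is a minimal client-go sketch that patches a pod's resize subresource to change CPU without recreating the pod. The pod name, namespace, and resource values are hypothetical, and this assumes a Kubernetes version where in-place pod resize and the resize subresource are available.

```go
// Hedged sketch: in-place pod resize via the "resize" subresource.
// Pod name, namespace, and resource values are hypothetical.
package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	ctx := context.Background()

	// Load kubeconfig from the default location (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatalf("loading kubeconfig: %v", err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatalf("creating clientset: %v", err)
	}

	// Strategic merge patch against the pod's "resize" subresource: raise CPU
	// for the container named "app" without restarting the pod.
	patch := []byte(`{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"750m"},"limits":{"cpu":"1"}}}]}}`)
	pod, err := clientset.CoreV1().Pods("default").Patch(
		ctx, "web-0", types.StrategicMergePatchType, patch, metav1.PatchOptions{}, "resize",
	)
	if err != nil {
		log.Fatalf("resizing pod: %v", err)
	}
	fmt.Printf("requested resources now: %v\n", pod.Spec.Containers[0].Resources.Requests)
}
```

This same mechanism is what would let an agent propose or apply right-sizing changes without a disruptive restart, which is why it pairs naturally with GKE's Container Optimized Compute mentioned above.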
Closing Thoughts
The episode paints a picture of GKE’s transformation—from pioneering container orchestration at scale and making Kubernetes accessible, to a platform with a rich spectrum of operational abstraction, helping everyone from platform engineers to data scientists manage their workloads. With new demands from AI and the promise of smarter, AI-driven operations, both the possibilities and challenges for GKE and cloud-native platforms are greater than ever.
The future? More automation, more AI, more flexibility—with thoughtful attention to abstraction, user trust, and the choice of which “rabbit holes” platform teams need to care about.
