Kaslyn Fields
Hello and welcome to the Kubernetes Podcast from Google. I'm your host Kaslyn Fields.
Abdel Sigiwar
And I'm Abdel Sigiwar.
Kaslyn Fields
Google Kubernetes Engine turned 10 this year. In this episode we talk with outbound product manager Gary Singh about how the product has changed over the years and some of his favorite things that are happening now. But first, let's get to the news.
Abdel Sigiwar
The Knative project reached the CNCF graduated maturity level. The project was created at Google in 2018 with contributions from IBM, Red Hat, VMware and SAP, and was introduced as an incubating project to the CNCF in 2022. The project aims at removing much of the complexity of running modern workloads on Kubernetes by handling infrastructure tasks like autoscaling, routing and event delivery. Congratulations to the community for reaching this milestone.
Kaslyn Fields
llm-d introduced release 0.3. The new release provides a fast path to deploying high-performance, hardware-agnostic, easy-to-operationalize inference at scale for large language models. llm-d is an open source collaborative project between major cloud providers aimed at laying the foundation for LLM inference at scale on Kubernetes.
Abdel Sigiwar
vLLM introduced new updates to its Semantic Router. vLLM is an open source library for running large language model inference efficiently. It's available as a Python package or a container that can be used with Kubernetes. The Semantic Router is a mixture-of-models router that intelligently directs OpenAI API requests to the most suitable models from a defined pool based on semantic understanding of the request's intent. The new updates include a dashboard for visualization, a paper on why to use the Semantic Router and a YouTube channel. The roadmap for the router is also publicly available.
Kaslyn Fields
The CNCF introduced the Certified Meshery Contributor. This certification validates technical proficiency in contributing to the Meshery open source project through written assessments. The certification consists of five distinct exams, each dedicated to one of Meshery's major architectural domains. Meshery is recognized as the CNCF's sixth highest velocity project and aims with this new certification to offer a thoughtfully designed contributor onboarding experience.
Abdel Sigiwar
Headlamp is an open source plugin created by Kubernetes contributors and designed to complement Karpenter. Headlamp's UI gives users real-time visibility into Karpenter's activity. It shows how Karpenter resources relate to Kubernetes objects, displays live metrics and surfaces scaling events as they happen. You can inspect pending pods, review scaling decisions and edit Karpenter-managed resources with built-in validation.
Kaslyn Fields
And that's the news. Hello and welcome to the show Gary Singh. We're going to be talking about the GKE 10 year anniversary today. But Gary, first, why don't you tell us a little bit about yourself.
Gary Singh
Hey, Kaslyn. Yeah, great to be here. Gary Singh. I'm one of the product managers for GKE. Been here at Google for, I don't know, four years or so. Yeah, my role is kind of an outbound role, so I do lots of events, talk to lots of customers, and I guess on the other side I'm probably one of the better testers of the product. I like to use the product a lot. So great to be here to talk about all the cool stuff we've been doing.
Kaslyn Fields
I was wondering how to work it in that you are someone that we can trust to give your true opinion.
Gary Singh
Yes, yes, I guess I do not, I do not, I do not tell a lie.
Kaslyn Fields
And the product has benefited because of it. I always love meetings that we get to be on together internally where we're talking about the product and features that we're thinking of creating and all kinds of things like that. Because I feel like you always have great insights into what the community really needs and what it's like to really work with GKE.
Gary Singh
Definitely. I think, you know, it's great to work with customers, hear stuff at various communities, go to meetups. Right? If you're out there, you hear the stuff. And yeah, I like to try everything. Right? If you're in a talking role, it's always better to know what you're really saying, sort of behind the scenes. And I think doing is the best way to learn.
Kaslyn Fields
Yeah, I feel like any new thing I can trust that you have probably already tried it out.
Gary Singh
Yeah, yeah, yeah. You know, there's visual learners, aural learners, and then there's doing learners. I'm a doing learner.
Kaslyn Fields
So you've been here at Google for about four years, but where did your journey with GKE begin?
Gary Singh
That's interesting. Yeah. So I guess, as a funny story, the first time I heard of GKE was way back when, well, Kubernetes, you know, in 2014 or whatever, 10 years ago. I guess I can say where, we're on a podcast. So I was at IBM at the time and we were building a bunch of stuff on containers. IBM had its own container service and everything. And then all of a sudden Google and Red Hat and a few others were like, well, we're going to do Kubernetes. And I was like, well, this is great, right? At least there's a sort of standard out there. And GKE was, I think, one of the first managed services out there. So I used it back in the old days, right, when I think the API server was still a single VM, right? We didn't have what we have today. And then most recently, fast forwarding to before I got here, I was running clusters all over the world in my previous job. And, you know, no cloud is in every region. So we used GKE in a number of places. And yeah, then it was pretty cool coming here. I was looking around, and it's like, you know, what do you want to do next? I was like, I like Kubernetes, thought it'd be cool to come to Google and work on Kubernetes. So that was kind of my mission, and I got here.
Kaslyn Fields
Yeah, can relate. And those early days of Kubernetes, during the container orchestrator wars, as we call them, was an interesting time because it was kind of all about containers, really. In 2013, 2014, it was like, there's this new virtualization isolation technology that was getting all of this hype and excitement for very good reason. But it was like, how do you run that at scale? And everybody was trying to figure that out at the same time.
Gary Singh
How do you run it at scale? How do you get things to talk to each other? There were a lot of single-machine solutions, right? There was Docker Compose, I think, out there. Docker had the original Swarm, not the Swarm mode that came in later. Everybody was building it. Amazon had a container service or something like that, right? It was like, all right, I got it. How do I scale this thing? How do I get this thing across multiple machines and all that? And I think the Docker company really helped push that, and then I think Kubernetes really helped push it to the mainstream for production, beyond just using it for development on your laptop.
Kaslyn Fields
Yeah. With things like minikube and that's, that's a whole other topic, minikube and K3s and how people actually develop on their machines.
Gary Singh
You know, the side thing I always keep reminding myself of, and I keep forgetting to submit it, maybe I'll do it for KubeCon EU, and maybe it'll be a good segue into the cool stuff we do, is: what if we had had Minikube on day one? I think it could have been different. Right? Because there were a lot of struggles in setting up Kubernetes in the early days. As we know, most people really don't want to deal with networking and don't know networking, and so getting it set up really put the focus on that rather than on deploying the apps and all the other cool types of resources we had. So I wonder, if we had had Minikube or kind on day one, whether it would have been a much more pleasurable experience, I feel.
Kaslyn Fields
Yeah, it's always a challenge for folks trying to get started with Kubernetes and trying to learn it for the first time. It's really a system that's meant for large enterprise scale kind of use cases. So testing it out on your own, it's a little difficult to figure out how to set it up in a way that kind of mimics the kind of production environments that are built with it. And those technologies have really grown. Not really what we're here to talk about but very important will play into.
Gary Singh
You know, what we've been doing for the last 10 years on, you know, with GKE. Trying to make it much easier to run this thing at scale. Right.
Kaslyn Fields
That's true. So how do you think GKE has changed over the years from the early days of containers being the hotness and then everybody trying to figure out how to orchestrate containers on them to Kubernetes being released by Google as open source and working with the Linux foundation to create the CNCF. And then GKE came out about a year later. 10 year anniversary of Kubernetes was last year. So we've been around for about 10 years now. I think things have changed a lot since then. What do you see?
Gary Singh
Yeah, I mean, like we said, nice segue. I think the early days were still all about just kind of getting the thing up and running, mapping it into the cloud, right? So I think the initial thing with GKE was that it was nice, right, that if you did want to run on multiple machines with Kubernetes, it was quite an easy way to get started. But again, as you looked at it, you still had to understand things like networking and setting all this stuff up, right? And I think as GKE has evolved, we're now at the point where, with Autopilot, which came out in 2021, you literally can have, you know, it's a marketing term, and I try not to do too much marketing, but you really can have one-click production clusters, right? You push a button or you run gcloud container clusters create-auto, right? And now we've sort of got that, which I think really starts to move the focus beyond running the infrastructure itself and how all that is connected, to more of a focus on workloads, right? And I think Kubernetes in general started to focus more on workloads. And a lot of what we've been doing on the GKE side has been focusing on the workloads themselves, whether it be AI workloads or stateful workloads, and then trying to make all the day-two operations kind of go away, right? Like automatic upgrades. And you've even seen the change in that since the first time GKE had it, which was still okay, better than doing it yourself, but now you can really kind of trust the thing.
And, you know, I leave clusters running and they upgrade and things seem to work, right? Kind of that hands-off experience. Which really means I just start caring about trying new workloads, trying new parts of it, right, rather than managing that infra.
Kaslyn Fields
That is a good callback: really the hardest thing about Kubernetes in the early days was getting it going. At least for me, Kelsey Hightower's Kubernetes the Hard Way is very famous for a reason, because it came around just at the right time, when people were trying to learn about Kubernetes and how it could help them orchestrate containers at scale, and nobody knew how to run the thing. Because it's this kind of obscure technology really rooted in the Linux kernel. So in order to deploy it in the early, early days, you had to understand the underlying Linux operating system components that actually made it possible, because you had to set up each one individually using the Linux system.
Gary Singh
Exactly, exactly. Right. And the kernels didn't support things and whatever. I think, you know, when you get to the point where you just know what the thing should do, right? It is an orchestrator of compute. The underlying stuff, can we just get that out of the way? And I think we've done a really nice job on GKE of getting to that point, where you don't worry about nodes, node pools and whatever. You literally describe it all when you deploy your workload. We have those things called compute classes now, even on Autopilot, right? And you're like, I want to run on an ARM machine: annotate your workload and, you know, get the right ARM machine for us. Or whatever it may be, an AMD or Intel machine. You don't have to necessarily worry about pre-provisioning all that stuff, right? And how do you size all that stuff? That's all the kind of minutiae, right? So I think we've really done a great job of bringing the power of container orchestration but giving it that ideal cloud experience, which is supposed to be that the cloud is quote unquote magic and just kind of scales for you. But how do you tie those two things together? And I think we've put a lot of stuff into open source to make that happen, but then really done a great job of making sure that that ties really well to Google Cloud infrastructure, right?
Kaslyn Fields
In the early days of Kubernetes, I don't know, maybe the middays, I feel like Kelsey Hightower's favorite thing to say was that one day, you know, we're not going to care about Kubernetes. It's going to be serverless. And I think it's really interesting how the ecosystem has just kind of grown this kind of natural spectrum between how deeply you want to be involved with that underlying Linux system. How much do you want to be a sysadmin who's really hands on with your machines and how they're configured versus how much do you just want to say, I'd rather pretend that there aren't servers and just run my workloads? And there's just this whole spectrum now of what level of control you want over that underlying hardware.
Gary Singh
Yeah, exactly. And I think, as we've changed over time, I use the term prescription without restriction. Right? Like, the things should just work. Here's the best practices, here's all the stuff that we've hardened, here's everything we've learned over all the years. Just give me that. And then if you so desire, because you just want to or because you have a specific need, how do you go down and tweak a specific parameter if you need to? And I think that's probably where Kubernetes has shone through. It sort of does everything. You can run anything on it. And I think that's where we found that fine line on GKE: how do you make it work for most use cases out of the box, but then allow people to customize and tweak where necessary?
Kaslyn Fields
I was talking to someone at an event last week who said that they ran Kubernetes clusters in the past, though they weren't at that particular moment, and they were saying that their favorite thing about Kubernetes was its extensibility. I think he said sometimes things would feel kind of unfinished or really hard to use, but it's kind of by design, because you want to implement that flexibility into the system. It's a platform for building platforms. So it's all about what you want to do with it. And when we were talking about the 10 year anniversary of open source Kubernetes last year, we also talked a lot about how the future of Kubernetes is all about that extensibility. We need a level of extensibility where it's flexible enough to do the things that people need it to do without being overly complicated. Because it has gotten really complicated and it's really hard to keep the user experience simple. And GKE faces that too.
Gary Singh
Yeah, most definitely. And I think that's where you start to ask, where's that fine line, where do you have the right levels of extensibility? Right. Especially in the AI workloads. So obviously pushing things like DRA was great on the open source side, right? But then that also allows us to better optimize our experience for using things like TPUs and make those have similar experiences, so that they don't look like foreign add-ons. Or having a standard set of APIs to make autoscaling work. The cluster autoscaler has its capabilities, but then we added things like node auto-provisioning, and now custom compute classes. So there's all these levels of extensibility, but you're still getting the core. So we give you an optimal experience, but we're still based on the fundamental core things that are in the open source ecosystem.
Kaslyn Fields
That's been the most interesting thing for me as Kubernetes has evolved during this new AI world that we live in is it's kind of going back to that high focus on the underlying infrastructure because those AI workloads are just so resource intensive that it really matters what kinds of underlying hardware accelerators you're using for those workloads and if they're optimized for those kinds of hardware accelerators. So I think Kubernetes itself has had to make some interesting shifts to meet that flexibility of like, you want it to be as easy to use as possible, but also you want to give people as much control as possible.
Gary Singh
And I think that's also an interesting area where, you know, what do you do on a managed platform like GKE versus what do you have to put in the core, right? Because the other interesting thing about AI is that a majority of people on the AI side, the end users, will end up being data scientists or doing training or whatever, right? For them, Kubernetes is a means to an end for running the AI workloads. The main thing they want to do is train. DeepMind wants to train Gemini, right? And they want to leverage Kubernetes for its power of scaling, orchestration, etc. But they don't want to have to go in and configure every single kernel parameter, or the overlay networking that makes things like Nvidia's cross-GPU networking work, or TPUs in there, right? So we put that level of extensibility in there, and then we try to expose, on the managed side, the GKE side, that experience that's just: okay, deploy your workload, describe it like this, and then we'll use the raw power of Kubernetes to push out that infrastructure. But, you know what I mean, you didn't have to come in and figure out how to create your 10,000 GPU node, how to network all that together, how to put everything together. You just described it: here's my thing, I need to shard the model, right? So we take advantage of some of the higher-level software for doing that, things like JetStream or JAX and whatever, libraries and frameworks, and how do you map those and make that easy to use?
And I think that's been the interesting focus on GKE: moving beyond making the infrastructure stuff more automatic with Autopilot to now making it easy for people to just deploy these workloads. And then of course the most advanced people can go in and tweak it if they need to. But in a majority of cases, right, you just kubectl apply -f some YAML, right, and it should work.
Kaslyn Fields
That's the thing with Kubernetes is that it's not just one hole to fall into.
Gary Singh
Yeah.
Kaslyn Fields
It's so many rabbit holes that you can fall into along the way. And so which rabbit hole do you want to fall into, and which ones do you want to ignore the existence of?
Gary Singh
And I think the thing that we've learned from our customers over time is interesting, because you start to see the rise of platform engineering, right? Kubernetes was obviously a great thing for platform engineers because it did a lot of that underlying cloud orchestration, as we said, and then they build their layers on it. But now you're like, well, okay, I want to build more of my stuff that's specific to my business, right? I don't want to have to build add-ons to Kubernetes to do that. So I think that's where, from the GKE side also, we've looked at how do we add multi-cluster stuff, right? How do we add additional management of this stuff: multi-cluster routing, config stuff. How do we build more core things into our managed offering to offload a lot of that, so that people can again just focus on the things that matter for running their applications? Because after years of doing that, you're just like, yeah, I think I just want to offload as much as possible to you guys, right?
Kaslyn Fields
Mm. Some people do and some people don't.
Gary Singh
Some people, yeah, we got. Right, yeah, yeah.
Kaslyn Fields
It's about which rabbit holes you want to fall into. So all of this, how GKE has changed over the years and what kind of environment it's existing in now and how it's changing for the AI sphere. But what do you see for GKE in the future?
Gary Singh
Yeah, I think we have some really interesting stuff coming up, and we've touched upon a few of the things. Right. I'm really excited about a lot of the work. So we've had to push the boundaries of scale due to AI, right? And then at the same time, as we talked about, people just want to tell us what they need and have us make this stuff happen. Right. So, yeah, AI. It's interesting. We talk a lot about running AI on Kubernetes and running AI on GKE. Obviously we've had a lot of focus on that. I think now we look at how do we leverage AI more in terms of running GKE and operating your workloads. Right? A canonical example is: I want to deploy an application, and people will say, hey, I'm going to deploy an HPA and I'm going to say the threshold is 70% CPU, or maybe they're lucky and they put in some custom metric. But at the end of the day, probably what they wanted to say for their web app or whatever it may be is, I want to have a response time of, you know, 150 milliseconds. Figure out what that means. Does that mean I should scale vertically? Does it mean I should scale horizontally? What's the right type of compute? And I think we start moving more in that direction, whether you call that AIOps, whether you call it autonomic computing, whatever, right? Some serverless environments partially have that. So I think that's really where we go. We continue to work on scale, because obviously the bigger you can scale, the better, right? You just take that out of the equation. So don't worry about scale. We'll get you the compute that you need, when you need it, how you need it, et cetera.
That's stuff like we've been doing with compute classes and things like that. But then how do we just move to: deploy your workload out there and we'll handle a lot of it for you? Sort of the next evolution of what we did with Autopilot. And then I think the next thing on top of that is, how do you manage that, right? Or how do you absorb that? We've had things like the Kubernetes console, the GKE console, you've got dashboards, Grafana, Prometheus, whatever you have, but it's really all about the data. And then can we have some agents just looking at what's actually running and starting to automate some of those things? Right? Here's what we've noticed; we've noticed these differences in there. And then also, and I hate using buzzwords, to be honest with you, but I do believe in this: you can have a lot of agents running for people, but can we also just give you your own more personalized experience? I think we have some of our PMs talking about giving you a more personalized experience. Why couldn't you just dynamically generate the UI or the experience, the model that you wanted, right? We can host it somewhere. But here's how you want to look at your Kubernetes distribution, right? Not how we at GKE thought you should look at it. So I think it really comes down to AI for running the workloads, agents for monitoring and managing that stuff, and then giving tailored experiences around what's going on with your workloads, etc., which can definitely change the game, I think, from a platform engineering perspective.
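To make the HPA example in Gary's answer concrete: the Kubernetes documentation describes the core Horizontal Pod Autoscaler calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), with no change when that ratio is within a tolerance of 1 (0.1 by default). The Python sketch below illustrates that formula only. The function name and the sample numbers are assumptions made for illustration, and the real controller also accounts for pod readiness, missing metrics and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Simplified HPA-style calculation (illustrative, not the controller's code)."""
    ratio = current_value / target_value
    # Within tolerance of the target: leave the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise scale proportionally and round up to a whole pod.
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against the 70% target from the example.
    print(desired_replicas(4, 90, 70))
    # The same arithmetic with a latency metric: 300 ms observed, 150 ms wanted.
    print(desired_replicas(4, 300, 150))
```

The formula only answers "how many replicas for this one metric". Whether a 150 millisecond response-time goal is better served by more pods, bigger pods or different hardware is the open question raised in the conversation, and nothing in this calculation decides it.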
Kaslyn Fields
That is such an important point. Anytime we're having a conversation about AI in the world of Kubernetes, I always kind of want to pause it for a second and be like, are we talking about running AI on Kubernetes? Are we talking about running Kubernetes with AI? Yes, because there's so much potential to make Kubernetes administrators' lives so much easier by using AI to help you understand how the system is running, because that's just so hard to understand right now. There's so much data for an administrator to parse through in terms of the logs of the workloads, the logs of the clusters, all sorts of pieces, the networking, and getting all of that into a view is very, very difficult. So I think there's a lot of interesting potential there. I think observability is a really important aspect to think about in the world of AI and Kubernetes and what's next in the future for GKE and Kubernetes in general.
Gary Singh
Yeah, I mean, there's obviously the stuff that goes out for app development folks, right, who are like, oh, generate my YAMLs and all that stuff. And obviously AI is fantastic for code generation and YAML generation, right? I admit I did it today. I don't have to write it. I'm like, I need an app that does this, just deploy it. And I have that kind of workflow always working. But like you said, there's lots of data. The thing about Kubernetes is, I guess the technical term is, it's an edge event-driven, asynchronous system. Right. Which makes it hard sometimes. So you run kubectl apply, you deploy a pod, and then it just comes back and says okay, right? But really all it did was validate that the API server accepted your pod spec. Then another task is deploying that, right? And you've got the logs there, and then you've got the nodes. If you're on GKE, those tie to: did the autoscaler have to kick in, and did that mean it had to deploy whatever underlying compute, and maybe there was a log, blah, blah, blah. Right? How do you connect all those things together? Well, we have all those sources, and you could hard code a lot of this. But to me, where a lot of the AI stuff becomes fantastic is, if we have the right data sources, whether that's model context servers or just APIs, AI gives you a lot of that connectivity and you can also streamline the logic. Most people know event-driven systems and response systems are fundamentally: a bunch of events come in, data is pulled, and you write a ton of if-then-else logic.
Fundamentally, even rules engines in the end are just a higher-level language that is still generating if-then-else statements. But what if we can avoid doing that, by putting more natural language in there, where it's actually figuring this stuff out on the fly, tailored to you, right? To me, that's the real power that we have with running Kubernetes with AI, or using AI to run Kubernetes. And we know a lot of the best practices, and we've tried to do that. We have a lot of those. But people don't always want to get an alert here or look at the console or whatever. Right? Can they have something that can automatically process that? And then how do we avoid having to hard code every rule? I think that's where AI makes this stuff much more streamlined and possible to try. Right?
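The "it comes back and says okay, but the work happens later" behavior described here can be shown with a toy model. The sketch below is not Kubernetes code: `ToyControlPlane`, `apply` and `reconcile_once` are hypothetical names invented for illustration, standing in for an API server that accepts a spec and a separate controller loop that acts on it afterwards.

```python
from collections import deque


class ToyControlPlane:
    """Hypothetical stand-in for an API server plus one controller loop."""

    def __init__(self):
        self.objects = {}      # object name -> current status
        self.events = deque()  # accepted objects waiting for a controller

    def apply(self, name):
        # Returns as soon as the spec is validated and stored.
        # Nothing is running yet when the caller sees "accepted".
        if not name:
            raise ValueError("invalid spec: name is required")
        self.objects[name] = "Pending"
        self.events.append(name)
        return "accepted"

    def reconcile_once(self):
        # A separate loop picks up the event later and does the real work.
        if not self.events:
            return None
        name = self.events.popleft()
        self.objects[name] = "Running"
        return name


if __name__ == "__main__":
    cp = ToyControlPlane()
    print(cp.apply("web"), cp.objects["web"])  # accepted Pending
    cp.reconcile_once()
    print(cp.objects["web"])                   # Running
```

In a real cluster the gap between "accepted" and "Running" spans several independent components (scheduler, autoscaler, node provisioning), each with its own logs and events, which is why correlating them is the hard part Gary points to.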
Kaslyn Fields
And flexible, right? Yeah. Just like the Google motto, making the world's information universally accessible and useful. Making your Kubernetes clusters maybe not accessible, in terms of, the words have some issues there.
Gary Singh
But accessible to platform administrators, or those who have the proper role. But yeah, I mean, there's some good stuff, right? There's lots of tools out there. I think we've put out kubectl-ai, that's out there. We've got an MCP server you can try out for GKE now. Right. And yeah, I think a lot of the work people are doing right now is, like I said, generating stuff. Right. And that's great, that's generative AI. But the best part is when you can have the generative part happening in response to the data that you're processing and the events that come in, and that's where you get into agent mode, or agentic mode. Right? And the more of that we can do, we can codify, without writing every single rule, a lot of the best practices that we have, right, and start to provide those recommendations. And then when people look at those recommendations, it's like, okay, well, here's my whatever agent, yeah, we're gonna turn that on. And when it sees these recommendations, it's just gonna start acting upon those. Right? And it becomes this pretty powerful self-contained system, where we're observing it, we have agents observing it. And then you may also have your own experience that you've dynamically created, tailored to what you may specifically be doing.
Kaslyn Fields
I'm sure that a number of SREs and platform engineers in the audience are however, at the moment clutching pearls like, I don't want agents running my production systems.
Gary Singh
Yeah, I think, you know, there's always this notion that, yeah, I mean that's the, that's the trust over time, right. That you start to build.
Kaslyn Fields
It's got to be rooted in data, there's got to be checks.
Gary Singh
And the thing is there's still so much more to do, right? It's just, you know, offloading some of that work that you've had to code a number of things up to do before, right? Now you're focusing more on the outer layers that matter, right? And, you know, there's lots of mundane tasks that people have to do, right? I mean, even stuff like, you know, agents that generate reports on snapshots and point in time, right. Those are probably your starting points, right. More static. But then you really could get to the point where, hey, you know, figure out how to. I'm excited about combining AI and stuff like that with the ability to resize pods. Right. That's a great feature, right? You know, we can now resize pods. We could have the VPA running, but imagine we could also just have some agents running specific to your kind of characteristics. Right. You know, people don't know what the right requests and limits to set are. You know, maybe turn it on in your development environment in the first place, and it'll find out the right kind of things to set and push those to production, and then someday in the future just leverage that to really help. Because we can scale the infrastructure, but it's how do we figure out what the right way to scale the apps is, right, and bind that to the infrastructure. And I think that's what's always problematic for people. And I think that's where we can really help with the agents.
Kaslyn Fields
This reminds me of trying to explain non-determinism to engineers. I feel like a lot of engineers really struggle with the concept of non-determinism because it's just kind of like, let the computer figure it out, offload some of that mental load and let the computer figure out some of the parts of it. Like, I was doing a project with PDF parsing where humans wrote a bunch of documents and they share them as PDFs and we need to derive rules from those, but they're written in natural human language and that's just really hard to parse. Writing that out deterministically is going to be so many if statements, so many kinds of things to consider. Whereas you can just be like, AI, figure it out. And it kind of reminds me too of the serverless to not-so-serverless spectrum in the world of infrastructure, and the many rabbit holes that we could all go down in the world of Kubernetes. It's which ones do you want to go down? Where is the value for you to focus with this infrastructure, with your systems or with your software? And what can you kind of offload elsewhere?
Gary Singh
Yeah, exactly, exactly. That's a good analogy, I think, on the, on the PDFs, right? The.
Kaslyn Fields
Yeah, that sounds really good.
Gary Singh
I had the pleasure of running an OCR product, but yeah, imagine, right, that you tried to do some forms, right, and if they don't match, how do you handle the exceptions? Right. And I think that's where that sort of training came in, right, on those old models. It'll do this sort of matching. And I think that's where this stuff sort of really starts to help us, you know, bring together these things that may not be obvious, right. I mean, AI is very good at figuring out how things kind of, you know, correlate or look similar. And as long as we can plug in those right data sources, right, it's using the tools to do it. You still have to have the right data. So Kubernetes has to have the right data, GKE has to have the right data, and then you as the platform team or whatever have to have the right data specific to your applications or your infrastructure. Right. And tying that all together. But you probably want to focus more on your stuff and then hope, you know, that we can now tie that together, have that full trace all the way through to the low-level stuff when you need it.
Kaslyn Fields
Yeah. And so I have two last questions for you. Number one, we've talked about a bunch of features of GKE and open source Kubernetes in this. What is your favorite new feature in GKE and/or open source Kubernetes?
Gary Singh
They're sort of tied together. My favorite feature in open source is definitely in-place pod resizing, or, I think that's what we call it here. What's it called?
Kaslyn Fields
IPPR.
Gary Singh
It has some other name, whatever the name is in.
Kaslyn Fields
Does it.
Gary Singh
I forget. Yeah, it's whatever that feature.
Kaslyn Fields
Oh yeah, it does, yeah.
Gary Singh
Yeah, we should know. We can look it up and edit it for the thing. But we call it IPPR. But yeah, the ability to resize pods. And so I think that enables great things like boost and whatever. And then from a GKE perspective, we introduced this thing that we call container-optimized compute, which allows us to kind of dynamically resize the actual underlying compute. I think when you combine those things together, it's pretty nice. We can scale horizontally super fast, but we can also boost applications super fast. So that becomes a core primitive that's needed to do all the AI stuff that I talked about that we should hopefully be getting to.
Kaslyn Fields
Yeah, it makes me so sad when I hear Kubernetes folks like platform engineers talk about clunkiness in autoscaling processes with Kubernetes clusters. I'm like, we have so many tools for this now. And especially right now, the platform is getting so much better at being able to autoscale super smoothly, which also terrifies a lot of the SREs and platform engineers out there. Because you don't want to have an incident where it's just scaled out of control and now you're paying crazy costs or control. Oh yes, that one too. So it's all about controlling the scaling. But even with controlled scaling, it should be super, super smooth. So I'm excited to see those features too. And a bit of a spicy question for you because I enjoy asking you spicy questions. What feature do you most want to see added to GKE?
Gary Singh
That's an interesting one. Okay, well, I guess the one that I would really love. So we talked about autoscaling or whatever, but, you know, I would love for us to be able to just deploy, just call the Kubernetes API. Like, you deploy your thing, you just have this endpoint, and then we just create clusters on the fly.
Kaslyn Fields
Oh, interesting.
Gary Singh
Because right now you still have to physically create a cluster. And yes, we've made it super simple, but it can still take, you know, a couple minutes, whatever, right, to start that cluster. But imagine a world where you're like, hey, you know, here's my workload, and you decided, I want to scale up in a new region all of a sudden because you needed a new AI accelerator. What if I don't have to even create a cluster? I just deploy a workload, tell you what region I want it in. If we have a cluster there, we deploy it on the cluster. If we don't, we create a new cluster and within a response. We could do that today, but it takes like six minutes. But what if we could do that in less than, like, 20 seconds? That to me is like the ultimate. It's still Kubernetes, but it's basically like serverless. It's like a Knative cluster. You know what I mean? When you deploy a Knative workload on a Kubernetes cluster, right, it's like it sort of scales. Now I'm just pushing it as a Kube API and a cluster just magically spins up and down and you don't worry about clusters anymore.
Kaslyn Fields
Yeah, I thought at first you were going to say, like, just a Kubernetes API that I can deploy my workloads to, and I was like, serverless. But yeah, it kind of all comes back together to kind of breaking down the walls more between Kubernetes and the concept of serverless, making it smoother for users to interact with the pieces they want to interact with and just not worry about the pieces that they don't want to. It's an interesting idea.
Gary Singh
I think, you know, it's like there's overloaded terms, right? They call it serverless Kubernetes or however you call it. But I think, in the end, the engine being Kubernetes is still an important piece to me, or I feel that's an important piece, because that gives you access to all of the compute infrastructure whether you need it directly or not. And so that just means, you know, you can run any workload, right? But now if I could just pull it out a little bit so that I don't have to worry about that, so that I can just create these things on the fly in a timely fashion, that just, you know, makes everything go away, right? Upgrades become easier, everything. Right. You get ephemeral clusters. Anyway, we're probably out of time, but you can get to this pretty cool, you know, dream, right? We wanted faster node startup. I want basically faster cluster startup anywhere.
Kaslyn Fields
I think a lot of people want that. I hear that a lot at events, especially in multi cluster conversations. Like, I want it to be easier to manage all of my clusters.
Gary Singh
Exactly. Selfishly, I do demos of apps all the time and cool features all the time. I don't want to have to have my clusters just sitting around. Say somebody's like, hey, can you show me how to do blah blah blah? Yeah, sure I can. Boom. Here it is. Just put this in your YAML, done.
Kaslyn Fields
And you're good to go. Awesome. Thank you so much, Gary, for hanging out with me today and talking about GKE.
Gary Singh
Thanks for having us. And yeah, looking forward to seeing everybody at, you know, KubeCon and some more GKE events, hopefully.
Kaslyn Fields
Yeah. Join us at KubeCon.
That brings us to the end of another episode. If you enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on social media @KubernetesPod or reach us by email at kubernetespodcast@google.com. You can also check out the website at kubernetespodcast.com, where you'll find transcripts, show notes, and links to subscribe. Please consider rating us in your podcast player so we can help more people find and enjoy the show. Thanks for listening and we'll see you next time.
Kubernetes Podcast from Google
Hosts: Abdel Sghiouar & Kaslin Fields
Guest: Gary Singh, Outbound Product Manager, Google Kubernetes Engine
Date: October 29, 2025
This special episode celebrates the 10-year anniversary of Google Kubernetes Engine (GKE). Kaslin and Abdel sit down with Gary Singh, Outbound Product Manager for GKE, to explore the journey of GKE over the past decade. Together, they reflect on the early days of Kubernetes and managed container orchestration, discuss how the landscape has evolved—especially with the rise of AI and automation—and consider what the future holds for GKE and Kubernetes at large. The conversation bridges lessons from the past, present improvements, and a vision for cloud-native operations driven by AI.
“...when I got here... thought it'd be cool to come to Google and work on Kubernetes. So that's... my mission and I got here.” ([05:29])
“I'm a doing learner.” ([05:12])
“What if we had had Minikube on day one? ...there was a lot of struggles in setting up Kubernetes in the early days.” ([08:08])
“...with Autopilot... you really can have one click production clusters... push a button or you run GCloud... and now you know, we've sort of got that.” ([09:56])
“I leave clusters running and they upgrade and things seem to work right, you know, kind of that hands off experience.” ([09:56])
“The things should just work... then if you so desire... go down and tweak a specific parameter.” ([14:49])
“...the main thing... is... leverage Kubernetes for its power of scaling ...but do they want to have to go in and configure every single kernel parameter, the overlay networking...?” ([18:22])
The Future—AI for Operations (AIOps, Autonomics) ([22:21]):
“How do we leverage AI more in terms of like running GKE and ...operating in your workloads?” ([22:21])
Observability & LLMs for Operations ([25:59]):
SRE Concerns About AI Agents ([31:27]):
Non-determinism and Offloading Complexity ([33:22]):
The episode paints a picture of GKE’s transformation—from pioneering container orchestration at scale and making Kubernetes accessible, to a platform with a rich spectrum of operational abstraction, helping everyone from platform engineers to data scientists manage their workloads. With new demands from AI and the promise of smarter, AI-driven operations, both the possibilities and challenges for GKE and cloud-native platforms are greater than ever.
The future? More automation, more AI, more flexibility—with thoughtful attention to abstraction, user trust, and the choice of which “rabbit holes” platform teams need to care about.