Kaslyn Fields
Hello and welcome to the Kubernetes Podcast from Google. I'm your host, Kaslyn Fields.
Abdel Sigiwar
And I am Abdel Sigiwar. This episode is a crossover with our friends at the SRE Podcast. I mean, the Prodcast from Google. Kaslyn joined Ben Good and Steve McGee to talk about Kubernetes for platform engineering.
Kaslyn Fields
But first, let's get to the news. Kubernetes 1.34 is coming at the end of August 2025. This release will not include any removals or deprecations, but is packed with an impressive number of enhancements. The Sneak Peek blog is out now, covering some of our most exciting features coming out in 1.34. Learn more by clicking the link in the Show Notes.
Abdel Sigiwar
Bitnami is moving most of their free images and Helm charts to a legacy repository starting August 28, 2025, with the exception of a few latest-tag images, which will remain free. If you need updates and security patches, you will have to move to the paid tier of the service. Check the link in the Show Notes and make sure you take action before the deadline.
Kaslyn Fields
Amazon Web Services announced support for 100,000 nodes on EKS. The announcement blog dives into the technical details of how the team at AWS managed to achieve this scale.
Abdel Sigiwar
The CNCF Cloud Native Glossary is working on introducing sign language in video format as an additional language for the glossary. Sign language is incredibly diverse, with over 300 sign languages used by 70 million deaf people worldwide. The CNCF Deaf and Hard of Hearing Working Group released a sign language style guide to allow contributors to create cloud native terms in sign language. Sharing videos of CNCF terminology in sign language allows deaf and hard of hearing technologists to establish a common language. The first six terms, like autoscaling, containers, and serverless, are already released, with more to come.
Kaslyn Fields
And that's the news.
Steve McGee
Hello.
Kaslyn Fields
Hello. I'm very excited for this very special episode, Steve.
Steve McGee
This is a very special episode. We're kind of double dipping here. This is great. So we're going to do one amount of work for two amounts of output, which is, I think, fantastic. So I'm Steve and pretty sure you're Kaslyn.
Kaslyn Fields
Yeah. Maybe we should introduce ourselves since folks may not know us.
Steve McGee
Yeah, that's a good point. We have two sets of audiences, and half of each don't know what's going on. So this is perfect. Why don't you go first, Kaslyn? Who are you?
Kaslyn Fields
All right. Hello, everyone who listens to the Prodcast and maybe doesn't listen to the Kubernetes Podcast regularly, or if you're just tuning in, wherever you may be listening from. I'm one of the co-hosts of the Kubernetes Podcast from Google, and I do all sorts of things Kubernetes, cloud native, and container related. And I'm excited about our episode today to talk about platform engineering, which we'll get to in a minute. But first, Steve.
Steve McGee
Yeah, who am I? I'm Steve McGee. I was an SRE for a long time and I'm in DevRel now, which means I get to talk on, like, podcasts and stuff about reliability and SRE-ing, but also other things that are related. So, I was on a team with Ben, our guest, on a DevOps-focused team, and we kind of went into the platform space. And I host the Prodcast, which is a podcast you should definitely subscribe to. If you're a regular Kubernetes Podcast listener, you should probably subscribe to both all the time for the rest of your life. That seems like a good idea.
Kaslyn Fields
Definitely. Yes.
Steve McGee
But there's a pretty good amount of overlap, I'd say, in the kind of reliability space and, you know, the platform space and the Kubernetes space. So, like, as you'll hear today, I think we're going to talk about. One of my favorite Kelsey Hightower lines is like, you know, Kubernetes being a platform for platforms, and here we have someone talking about platforms. It's going to be great. So. Dearest guest, Mr. Good, would you please introduce yourself? Who. Who the heck are you?
Ben Good
Yeah, happy to. My name is Ben Good. I don't host a podcast. Oh, you're really left out right now. I might have to change that in the future, so we'll see how this goes. But I'm a cloud solutions architect for Google, which is a fancy way of saying that I take the different Google products and non-Google products and, like, smash them together into solutions that, you know, hopefully solve customer problems and make things easier for everyone trying to do things in the cloud. I've been doing that for, oh, quite a few years at this point. I lose track and have to do math.
Steve McGee
No need, no need.
Ben Good
Lots of fun stuff. And I've been doing platform engineering of late, or actually for quite some time now at this point.
Steve McGee
You've been doing it. You've actually been doing it the whole time, Ben. It's amazing. Yeah.
Ben Good
Prior to Google, I did operations for a few different startups in the Denver-Boulder area, where I'm located. I was doing platform engineering back then, and we just didn't call it platform engineering. When DevOps started to be a thing, it was DevOps. And now it's called platform engineering, but with a slightly different take. So I've been doing it for a while, just didn't necessarily call it that or realize that's what was happening.
Kaslyn Fields
I am really excited to dive into this. Forget about the podcast stuff. This is what we're here for. So let's talk about platform eng and Kubernetes. That famous Kelsey Hightower quote that you were mentioning, Steve: Kubernetes is a platform for building platforms. Is that true, Ben? Is Kubernetes all you need to build platforms? Granted, Kelsey didn't say it was the only thing you need.
Ben Good
It is definitely a tool in the toolkit, and it's a big one when it comes down to it. Kubernetes provides a lot of different constructs and capabilities that make it a whole lot easier to build platforms. So, in my opinion, it is one of the tools in the toolbox to build a platform and make it successful. But it's not the one and only thing. Way back when, when I said I was doing platform engineering before it was called platform engineering, a company I worked for built what would have been called a portal in today's language. It was a way that other engineers on the team could go and spin up VMs that were in the platform, or databases that were in the platform. When we containerized and went to Kubernetes, a lot of that stuff became so much easier and went away, because the container abstracted away the VM-ness of things. So Kubernetes is super helpful, but it's not the only thing that you need to make a platform. That's my two cents.
Kaslyn Fields
So I'd love to know more about what else goes into it these days. I know, you know, last year, two years ago, whenever it was, we were all excited about Backstage as a tool. When I think about platform engineering, I'm imagining teams at various companies who are taking the underlying infrastructure and making it something that developers and other technologists throughout the organization can consume. So what does that look like for you today? Kubernetes is the base for managing the underlying compute infrastructure, but there are so many other types of infrastructure that you have to deal with, and then you also have to deal with how people are interacting with all of that. So what kinds of tools are you using to make that happen, and does that describe what you do?
Ben Good
Yeah, no, that's perfectly it. That's a very good description of what it is. So the way I think about it is, when you're doing platform engineering, you're taking the underlying technology, like you say, Kubernetes for compute, maybe managed databases for your database infrastructure, maybe self-managed databases. All the things that you need to make your applications run, and run at scale. Those are the things that you're building interfaces to and applying automation to. I really think that platform engineering is the process that you go through to glue all that stuff together. So from a technology standpoint, you're seeing lots of the typical stuff: a lot of Bash and a lot of scripting and, you know, a lot of Terraform and a lot of YAML, and it's all stuck together with some sort of automation tool in the background to run it and orchestrate it. And then that interface is fungible. I've recently been working on a project where the interface is a document in Firestore. It's not a fancy UI. You write a document in the proper document format into Firestore, and then automation kicks off and magic happens. That is the interface to it. You can get more advanced or user friendly with a tool like Backstage, but it doesn't have to be that. It's just got to be some well-defined interface. That's really what it comes down to. And that needs to be easy for your users or your engineers to adopt and make use of.
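A minimal sketch of the document-as-interface idea Ben describes, in plain Python. The schema (`team`, `service`, `environment`), the allowed environments, and the automation steps are all hypothetical and only illustrate the shape of a well-defined interface; nothing here talks to Firestore, and the project Ben mentions may look quite different.

```python
from dataclasses import dataclass

# Hypothetical request schema: these field names are illustrative assumptions,
# not the format used in the project described in the episode.
REQUIRED_FIELDS = {"team": str, "service": str, "environment": str}
ALLOWED_ENVIRONMENTS = {"dev", "staging", "prod"}


@dataclass(frozen=True)
class PlatformRequest:
    team: str
    service: str
    environment: str


def parse_request(doc: dict) -> PlatformRequest:
    """Validate a raw document and turn it into a typed request."""
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in doc:
            raise ValueError(f"missing field: {name}")
        if not isinstance(doc[name], expected_type):
            raise ValueError(f"field {name} must be {expected_type.__name__}")
    if doc["environment"] not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {doc['environment']}")
    return PlatformRequest(doc["team"], doc["service"], doc["environment"])


def handle_document(doc: dict) -> list[str]:
    """Return the (hypothetical) automation steps a valid document kicks off."""
    request = parse_request(doc)
    namespace = f"{request.team}-{request.service}-{request.environment}"
    return [
        f"create namespace {namespace}",
        f"apply baseline policies to {namespace}",
        f"deploy {request.service} to {namespace}",
    ]
```

The point of the sketch is that the contract lives in the document format, so a CLI, a portal, or a direct write could all produce the same request.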
Steve McGee
Yeah. There's a phrase that I know goes around with platform engineering, which is the idea of golden paths, right? Which is, you know, if you're going to do this thing, if you're going to go down this road, don't just wander through the wilderness, because there's a lot of branches you can get caught on. You know, there's a lot of dark corners where bad things happen. You can fall into a well or something. I don't know where this is going, but the idea is, if instead someone has already taken that path for you and has laid some bricks behind them, saying, this is the way, just follow this thing, then you're more likely to succeed. And often, from what I've seen, this tends to be kind of what you're saying: whether it's a document or a portal or something like that, it's really just, at the end of the day, a form of abstraction. So you have the developers who are like, look, I just want to get to the end of this road so I can write my code. Don't make me learn about the entire forest of infrastructure. Just get me to the point where my app is running and it's scalable and all this stuff. And what you're saying is, yeah, provide a series of paths in some form of abstraction. If that's a portal, great. If that's a document, great. Just, you know, something to make it a little bit easier on everyone. Does that line up with.
Ben Good
Yeah. And that documentation is a good counterexample to what, you know, everyone thinks it is: that it's got to be super easy and WYSIWYG and all those kinds of things. No, a golden path can be as simple as a document that lays out the three, four, five steps that you have to do to accomplish a task. And that is an example of a golden path. You can make it a whole bunch more than just documentation if you want to, but it doesn't have to be. It just has to meet the engineer where they're at in the tasks that they're trying to accomplish.
Kaslyn Fields
And that's one place where platform engineering really gets tricky, I think: you've got so many different ways that you could find your way through this forest. And so one of the key things that platform engineers do within an organization is understand how different parts of the organization need to use that infrastructure. And I love that you gave the example of using a Firestore document as the interface for the workflow of doing whatever this thing is. Because I think Backstage offered a vision of a unified interface: here's how you request compute from different sources and things like that. But it can be much simpler than that, depending on the use case. Just give them a Firestore document that they can put the thing in and it'll go. I love it.
Ben Good
Yeah, exactly. In the case of the Firestore document, you could write that through a little CLI. You could do that through Backstage. There's no reason you can't make that call via Backstage. There are millions of different ways that you could get that document into Firestore. It doesn't really matter how it happens, so long as it's easy enough for the people that are trying to use that system.
Kaslyn Fields
Flexible but structured platform engineering.
Ben Good
Well said.
Steve McGee
That sounds like a good tagline. We should put that somewhere. So I have to wear my SRE hat now. Once you've gone through the phases in your platform, whatever methodology you're using, you've abstracted away a lot of these inconvenient truths about infrastructure. You've sort of said, don't worry about it, just follow this path, you're good. You're basically getting the user, the developer, through the provisioning stage to the initial deployment stage. But it turns out there's more stages. We're not done yet. And this is where I come in, right? This is where the SREs come in, where it's like, okay, cool, we've deployed the thing. It mostly works. What's next? What do you have to do next in terms of the life cycle? Sometimes we call this day two operations, or day two whatever, or just observability. What are all the things that you would now want to make sure that your developers have access to, maybe through a portal, maybe through another method, beyond just that startup process? Like, what happens in the operate phase of this life cycle?
Ben Good
Yes. I think if you lump all that together, you need to provide visibility into what's happening with the workloads that are running on the platform. And that takes lots of different shapes and form factors. So if I think about the traditional, in air quotes, operations metrics, like what are my latencies, what are my failure rates, did my deployment succeed, those kinds of things: that's a level of visibility that you can expose back up to those engineering teams using this platform engineering concept. Make it easy for those developers and engineers to see what's happening. But I think, to your point, there's more to it than that. There's visibility into whether the application is working in the way that the users expect it to. Are the features and functionality working? That's something the platform can provide. Cost controls can come up through there, as can visibility into the different security and compliance regimes that they have to adhere to. Those kinds of things need to come up into the platform. And those are, to your point, all day two things. Like, great, you got the first couple of releases out, you got some users, now you have to keep it up and going, and the platform can provide those ways and those golden paths to make that easy to see.
Steve McGee
Yeah, sometimes they're referred to as non-functional requirements. Like, this has nothing to do with widget selling, but you still have to do it. Sorry. You know, the worry that teams have when they're not building this stuff through a platform is, now that I've deployed my widget-selling device, I have to do day two, but it's really more like day 222. It's going to take so long. Like, which regulation are we subject to? And how do you generate these artifacts for which lawyers to look at? Like, what are you talking about? So what are some patterns that you've seen customers of ours adopt? How do you do this in such a way that it's part of the platform and not just a thing you have to do, like a tax on your time? How do you do this properly?
Ben Good
Yeah, I think that the platform can provide those things for you. So if we go back to using Kubernetes as an example, there are things that you can do in your Kubernetes deployment around making sure that the right policies are in place, the right RBAC, all those kinds of things are in there, such that when you get a namespace, those policies just kind of come along with it. And that's an example of where you're getting those things via the platform for free. You can do those same things with metrics and observability and logging. You can plumb those up through such that the application teams can go in and see their logs without having to do any sort of extra work or magic to make that happen.
Steve McGee
Or log into another tool or something like that. Like, oh, I need the credentials for the logging app now. Great.
Ben Good
Yeah. So those are examples. Every implementation, how you go about it, is a little bit different depending on the tool stack and all those kinds of things. But that's what the platform can do for application teams and those other engineers: just do that for them. The term is shifting down. You might have heard that before. Those are examples of where you can shift those responsibilities down into the platform, and you just get that stuff for free because your application's running on that platform.
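As a rough illustration of policies that "come along with" a namespace, the sketch below builds a bundle of Kubernetes manifests as plain Python dictionaries. The object kinds and API versions are standard Kubernetes ones, but the specific baseline (a quota, a binding to the built-in `edit` ClusterRole, a default-deny ingress policy), the group naming, and the default limits are assumptions chosen for the example, not something described in the episode.

```python
def namespace_bundle(team: str, cpu_limit: str = "8", memory_limit: str = "16Gi") -> list[dict]:
    """Build the manifests that would ship with every team namespace.

    The baseline chosen here is a hypothetical example of "shifting down":
    the team asks for a namespace and receives quota, RBAC, and network
    policy without writing any of it themselves.
    """
    scoped = {"namespace": team}
    return [
        {
            "apiVersion": "v1",
            "kind": "Namespace",
            "metadata": {"name": team, "labels": {"team": team}},
        },
        {
            "apiVersion": "v1",
            "kind": "ResourceQuota",
            "metadata": {"name": "baseline-quota", **scoped},
            "spec": {"hard": {"limits.cpu": cpu_limit, "limits.memory": memory_limit}},
        },
        {
            "apiVersion": "rbac.authorization.k8s.io/v1",
            "kind": "RoleBinding",
            "metadata": {"name": "team-edit", **scoped},
            # Assumed convention: an identity-provider group named "<team>-developers".
            "subjects": [{
                "kind": "Group",
                "name": f"{team}-developers",
                "apiGroup": "rbac.authorization.k8s.io",
            }],
            "roleRef": {
                "kind": "ClusterRole",
                "name": "edit",
                "apiGroup": "rbac.authorization.k8s.io",
            },
        },
        {
            "apiVersion": "networking.k8s.io/v1",
            "kind": "NetworkPolicy",
            "metadata": {"name": "default-deny-ingress", **scoped},
            # An empty podSelector selects every pod in the namespace.
            "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
        },
    ]
```

Serializing the returned list to YAML and applying it is left out on purpose; how the bundle reaches the cluster depends on the automation tool in use.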
Kaslyn Fields
Observability is so important. I often hear platform engineers that I talk with when we ask for feedback on, like, features that we're developing and things like that. One of the things that I often hear is this kind of consideration of, well, that feature is great, but when I bubble that up to my users, how am I going to make sure that they know the pieces of that that are going to keep them from spending a ton of money or running up a crazy amount of compute or something like that? I need to make sure that it's all observable and visible to them through the platform that we are building.
Steve McGee
So yeah, I feel like another thing that we talk about, that is kind of what you're getting at, Kaslyn, is that when you're developing a platform, a platform doesn't just come out of the box in perfect form, with all these things that Ben has described. You kind of have to develop it, and I think this is basically a per-customer problem, per entity, per org. Because every company is a little bit different and has different requirements, and there are different levels of capability and just staffing even, and regulation. They're all slightly different snowflakes. And I mean that not in the pejorative sense, but just that they're all very different from each other. So one thing that I've heard, and maybe Ben, you can back me up on this, is that whenever a team builds a platform, they tend to name it. They give it a name and it's sort of a pet, and then they take care of it. This platform is a pet, and you love it and you feed it and you extend it to exactly what you need as a team, because it's always going to be a little bit different. Does that line up with your experience?
Ben Good
Oh yeah, yeah. Platforms are very much bespoke things to the engineering organization that they serve. Because, to your point, every engineering team operates at a different skill level, a different layer in the stack. You might have different teams that are responsible for different types of infrastructure that need to come together to work on the platform together. The one that I was lightly referencing earlier was called Divine Spork. That was the name that GitHub auto-generated for me, and I'm like, ah, that's kind of fun. Like, we're kind of spoon feeding, but it's kind of pokey, so it's a spork. And you get things automagically, so it's kind of divine. So that's what we called our platform at that company: Divine Spork. So yeah, they're very much bespoke things. They have names, they have teams that rally around them. And yes, you do care for them and feed them and love them, and hopefully they love you back and all that kind of stuff. But yeah, every platform is different, and I think it's a bit of an anti-pattern to go and look for the platform in a box that you can just click a button, install, and like, whoa, I have a platform. That doesn't typically happen.
Steve McGee
Yeah. As we alluded to before, with Kubernetes being a platform for platforms, I've actually heard of Backstage being not a portal, but a system for building portals. So it's kind of the same idea: it has all these knobs, so don't use them all, don't just click install. You've got to think about what it is you're trying to get out of this. And, you know, I think we've said in the past that the abstract idea of a platform is just a grouping of capabilities. So what are the capabilities that your team wants to adopt and then deliver through this platform? Whether it's through the Kubernetes side, or through the Backstage thing, or through the observability suite that you have through another company, or through the whatever. There's a load of different things, and a lot of it could be stuff you build yourself too. But at the end of the day, it's the set of capabilities, and the fact that they're working together in concert, hopefully, that makes it beneficial to your company.
Ben Good
Yeah, very much so. The folks that install and are successful with Backstage have teams or a group of people that are modifying it, building plugins for it, and creating the different templates. It isn't just something that you install. You build on it to expose those critical user journeys through the portal in a way that works, and that takes development effort and maintenance and care and love as well.
Kaslyn Fields
This customization for different organizations and for different types of users within organizations: there's a term that we used in our notes in preparing for this interview, which was deployment archetypes. I feel like all of this lends itself to that term, which I hadn't really seen before, but I really like it. So is that what you all think of as deployment archetypes: making these workflows customized for the environment that they exist within and the users that they're serving?
Ben Good
I think that what you want is a pattern for the ways that you deploy things, and those patterns are what you end up exposing out through the platform. So those would be the archetypes, and you can largely group them, as Steve is well aware, by the different reliability characteristics built into them. So I have this thing of this shape and size. I need to deploy it on this type of infrastructure with this load balancer configuration, this database configuration, and that supports this shape of user across these different regions of the world, or this form factor. And you provide a number of those things in the platform, and those become the archetypes, if you will. And those can be genericized. We've published some documentation around those. But that's the type of thing that you want to expose out through the platform. And then making it easy to pick the right archetype, or change between archetypes, is a task that the platform helps you with and gets you down that journey.
Steve McGee
So a good way to think about this from the pure Kubernetes viewpoint is, when you're trying to make a system more reliable and more robust, often what you do is you're like, we have one of these things. How about we have two, right? Like, just have a second one. And that's great. It's a good idea. That's generally the method that we use to make things more robust: we have two of them instead of one, or N of them. Right? But one thing that you could do naively is you're like, we have a cluster of stuff, let's make a second cluster. Done. Stamp it out, good to go. Ready, set, go. Except you put it in the same zone. You have two clusters that are in the same zone. And then when there's a problem in that zone, guess what? You actually only had one. You did not have two. Too bad. Bummer. You know, you get a bad day. So there's this idea of failure domains. And this will be familiar to a lot of SREs out in the world, but they're like Matryoshka dolls: you have your app, and then you have your namespace, and then you have your cluster, and then you have your zone, and then you have your region, and then you have, like, the universe or something. And your failure can happen at any point along these things. And so there's this paper that came out, I don't know, years ago at this point, called Deployment Archetypes, which is about how to apply this understanding of failure domains when you're doing deployments. And this is where, you know, zonal deployments versus regional deployments versus global deployments versus multi-region versus multi-zonal versus blah, blah, blah. There are these very abstract ideas of basically where to put your app, in which clusters, at which time. And it's too much. It's hard to understand all of it.
And so what we've seen people do in the past, and Ben has done a really good job of this with some customers that I've seen, is that there will just be a dropdown in the portal or in that doc or something. It's like, do you want this to be a big one or a little one? You know, and under the covers, it's picking one of these and it's making a bunch of choices for you so you don't make the wrong choice. So instead of saying, just give me two clusters in the same zone, it's like, no, no, no, no. Would you like a big one or a small one? And you say big. And it's like, okay, I'm going to give you two clusters in two different regions and not tell you. But then also the deployment is going to be able to take advantage of that, and the observability is going to be able to take advantage of that. And you don't even know the whole time. It just kind of magically happens. That's the vague idea that I get. But Ben has actually done it. So I'm curious if that holds true when you hit reality.
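The failure-domain point and the "big or little" dropdown can be sketched together in a few lines of Python. Everything named here is hypothetical: the `Placement` type, the zone and region names, and the two-entry `ARCHETYPES` table are made up for illustration, and a real platform would map the choice onto far more settings than cluster location.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    cluster: str
    zone: str
    region: str


def independent_copies(placements: list[Placement], domain: str) -> int:
    """Count how many copies fail independently at a given failure domain.

    `domain` is one of "cluster", "zone", or "region". Two clusters in the
    same zone count as one copy when the question is about zone failures.
    """
    if domain not in ("cluster", "zone", "region"):
        raise ValueError(f"unknown failure domain: {domain}")
    return len({getattr(p, domain) for p in placements})


# Hypothetical mapping behind the dropdown: the user picks a size, and the
# platform picks placements that are actually redundant.
ARCHETYPES = {
    "small": [Placement("cluster-1", "region-a-zone-1", "region-a")],
    "big": [
        Placement("cluster-1", "region-a-zone-1", "region-a"),
        Placement("cluster-2", "region-b-zone-1", "region-b"),
    ],
}


def choose_placements(size: str) -> list[Placement]:
    """Resolve the dropdown value to concrete placements."""
    if size not in ARCHETYPES:
        raise ValueError(f"unknown size: {size}")
    return list(ARCHETYPES[size])
```

With this shape, the naive "second cluster in the same zone" setup reports two copies at the cluster level but only one at the zone level, which is exactly the trap the dropdown is meant to hide from the developer.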
Ben Good
Oh, very much so.
Steve McGee
Cool.
Ben Good
Yeah, it definitely holds true. And the thing that you're doing with that little dropdown is you're abstracting away a lot of the super nitty-gritty detail of what cluster, what subnet, what this, what that, that you might have to go and know how to do otherwise. I think a lot of times when folks are trying to build a platform, they throw a bunch of Terraform or a bunch of YAML at folks and say, okay, go do the thing with the YAML or the Terraform or whatever your favorite language is.
Kaslyn Fields
A lot of the things that we've talked about in our conversation today are kind of evergreen topics. Reliability, making sure that the systems keep working, making sure that you understand how they're working, building golden paths. But I think a lot of that has kind of shifted recently due to industry trends. I wanted to know if you've seen any of that in your day to day work, Ben, if the types of golden paths that you're building these days are different and how and all of that kind of stuff.
Ben Good
I think the underlying technology changes a little bit, but really it's still the same thing I think that we've been doing just with slightly different twists or name or a change of focus on how do we do it, you know, at scale, if you will. I think the rate of change in the industry as a whole is. It's always ever increasing. Right. So like, I think the things that we have been doing, we just have to do more of those things and maybe with different technology, not. Not doing something different.
Kaslyn Fields
So yeah, in the Kubernetes space, what I've seen a lot of is, you know, hardware accelerators. It's a lot of the same kinds of stuff. But now we're enabling folks to use different types of hardware than we've used in the past, and for applications that are very particular about how they're using that hardware. So with Kubernetes, traditionally, a lot of the management of that underlying infrastructure is handled for you, and then you add another layer on top of it, the platform that users actually interact with, to further abstract that. And now folks want a different type of abstraction for those types of workloads that really need careful control over those underlying hardware resources. So that's where I've seen Kubernetes doing a lot of work recently to adapt to new changes.
Ben Good
That's fair. Like, you know, back when Kubernetes was young and when I started using it, it was CPU and memory. Those were the things you had to be concerned with. Now there's networking and, to your point, there's GPUs, and how do you go about getting access to those things? So yeah, you're beginning to add more to it, but really you're still orchestrating the access to the underlying infrastructure and providing more control than you were way back when. But it's the same process, or the same thing you're accomplishing, just a little bit different.
Steve McGee
I've heard also that customers will have different motivations for platforms than they used to. Like, sort of in the post-zero-interest-rate, post-ZIRP phenomenon, there's been a lot of consideration around cost management, right? And cost observability, things like that. And frankly, part of the problem with, you know, kind of a naive way of doing these abstractions through platforms is to just hide everything. And then you get a bill and you're like, oh, whoops, we did something. And so you want the ability to expose usage patterns, like, oh, we accidentally all the databases, early and often. Right? So you can actually see, in somewhat real time, your consumption of the database or whatever the expensive thing of the month is, and be able to control that and make a change. So if you do a deployment that accidentally writes 10 billion times, you should be able to, as a developer, notice that and rectify that all on your own. So self-service is one of these things that is really, really ubiquitous in the marketing of platform engineering. But often we think of that just in terms of, I want to make my service for the first time, that greenfield step. But self-service also applies in the operational context as well. So does this come up as a motivation when customers are talking to you about platform engineering, or am I making this up?
Ben Good
Yeah, like there's some customers that approach it from a cost control perspective. You know, not many approach it from a self service standpoint. I get a lot of customers that look at it from a security, compliance, governance perspective and those, those are reasons to go about it. But that can't be the angle that you take, in my opinion. So we can't take the angle of, well, we're going to do platform engineering to reduce costs and we're going to begin to make our engineering teams do something to reduce costs. We want to improve compliance and governance and those things. But forcing engineering teams onto the platform doesn't typically yield the desired outcome, if you will. There's oftentimes reasons why that happens, or, you know, you want to listen to those engineering teams and begin to design your platform in a way that supports those engineering teams, but also accomplishes the security and compliance and governance and cost controls. And, and those things come along when you begin to actually do platform engineering. So the motivation can be there, but it can't necessarily be the reason that you go to engineering teams and say, we're going to do platform engineering and you're going to use the platform because of this. That usually doesn't yield the best platform.
Kaslyn Fields
That opens up a whole new can of worms that we could dive into for a whole nother episode. At least diving into security considerations, regulatory compliance issues, and working all of those into how we build platforms for specific users. But I think we've covered some really awesome stuff today with making sure that you're designing platforms to serve the users that they need to serve, and that a lot of the underlying concepts are the same as they've ever been. It's about making sure that your infrastructure is usable by the teams within the organization. And I'm excited for all of those platform engineers out there listening to do that and hopefully not cause outages like the ones that we just experienced.
Steve McGee
Except that they do happen, right? You can't just wish them away. They will happen. Another thing that we didn't get to, which I'm amazed by: Ben, how did we not mention DORA? I'm even wearing the shirt under this jacket, so I know we blew it. But being able to track your team's success through metrics, and being able to show that these capabilities are actually making an improvement to the lives of the developers and to the business, go to dora.dev to see what all those words even mean. But delivering it through a platform is totally a good move. I believe you would agree with me on that. Side note: Ben and I work on the platform engineering chapter of the DORA survey every year, so I know he agrees with me.
Ben Good
It's a lot of fun. I definitely agree. A fun thing to think about from a DORA perspective in your platform engineering endeavors is using DORA metrics to understand your feature velocity and your application development practices, but also your platform engineering practices and your platform velocity, which I think is a fun thing to think about.
Steve McGee
Yeah, platforms are software too. Well, thank you Ben. Thanks for suffering through an outage with us and handling the mitigation with grace. And we'd love to have you back either on each of our episodes or maybe another double episode. Who knows? It'll be great. You never know what's going to happen in the future as we just learned.
Ben Good
This was a lot of fun. Thank you. I really appreciate it. Thank you much.
Abdel Sigiwar
That brings us to the end of another episode. If you enjoyed this show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on social media @KubernetesPod or reach us by email at kubernetespodcast@google.com. You can also check out our website at kubernetespodcast.com, where you will find transcripts, show notes, and links to subscribe. Please consider rating us in your podcast player so we can help more people find and enjoy the show. Thanks for listening and we'll see you next time.
Kubernetes Podcast from Google: Episode Summary
Title: Platform Engineering, with Ben Good
Hosts: Abdel Sghiouar, Kaslin Fields
Guest: Ben Good
Release Date: August 6, 2025
In this special crossover episode with the SRE Podcast, Kaslin Fields and Abdel Sghiouar delve into the realm of platform engineering within the Kubernetes ecosystem. Joined by Ben Good, a Cloud Solutions Architect at Google, the hosts explore how Kubernetes serves as a foundation for building robust platforms and the multifaceted aspects of platform engineering in modern cloud-native environments.
The conversation begins with updates on the latest Kubernetes release, Bitnami's repository changes, AWS's EKS scalability, and CNCF's initiative to incorporate sign language into their glossary. The core discussion focuses on platform engineering—its definition, components, best practices, and evolving trends influenced by industry demands and technological advancements.
Kaslin Fields introduces the topic by referencing Kelsey Hightower's famous quote: "Kubernetes is a platform for building platforms."
Kaslin Fields [05:55]: "That famous Kelsey Hightower quote that Kubernetes is a platform for building platforms. Is that true, Ben?"
Ben Good [05:59]: "Kubernetes is definitely a tool in the toolkit and it's a big one when it comes down to it. It provides a lot of constructs and capabilities that make it a whole lot easier to build platforms."
Ben emphasizes that while Kubernetes is a significant asset in platform engineering, it is not the sole component. It serves as one of many tools necessary to construct a comprehensive platform.
The hosts explore what constitutes modern platform engineering beyond Kubernetes.
Kaslin Fields [07:46]: "What kinds of tools are you using to make that happen and does that describe what you do?"
Ben Good [07:46]: "Platform engineering is about taking underlying technologies like Kubernetes for compute, managed databases, etc., and building interfaces and automation around them."
Ben elaborates that platform engineering involves integrating various infrastructure components and automating their management to provide seamless interfaces for developers.
The concept of "golden paths"—standardized, optimized workflows—is discussed as a means to guide developers through complex infrastructure setups.
Steve McGee [09:20]: "There's a phrase... golden paths, which is the idea of providing predefined routes for developers to follow to ensure success."
Ben Good [10:34]: "A golden path can be as simple as a document that lays out the steps to accomplish a task."
The discussion highlights that golden paths can range from simple documentation to more sophisticated interfaces, all aimed at reducing complexity for developers.
The term "deployment archetypes" refers to standardized deployment patterns tailored to an organization's specific needs.
Kaslin Fields [22:06]: "Deployment archetypes are making these workflows customized for the environment that they exist within and the users that they're serving."
Ben Good [22:06]: "You have patterns for ways to deploy things, and these patterns are what you expose through the platform."
This section underscores the importance of defining and implementing deployment archetypes to streamline application deployment processes.
The panel discusses the growing emphasis on observability and cost management within platform engineering.
Kaslin Fields [17:50]: "I need to make sure that it's all observable and visible to them through the platform."
Steve McGee [28:13]: "Cost management and observability have become critical considerations, enabling developers to monitor and control their resource usage effectively."
The conversation points out that modern platforms must provide robust observability tools and cost control mechanisms to prevent resource overconsumption and ensure application performance.
Customization is vital for platforms to cater to diverse organizational needs.
Ben Good [21:06]: "Platforms are very much bespoke things to the engineering organization that they serve."
Kaslin Fields [11:55]: "Flexible but structured platform engineering."
Customization ensures that platforms can adapt to different workflows, compliance requirements, and infrastructure setups, providing tailored solutions for various teams within an organization.
The panel examines how platform engineering practices must evolve in response to changing technologies and industry trends.
Ben Good [26:50]: "The underlying technology changes a bit, but we're still accomplishing the same things with different twists."
Kaslin Fields [27:27]: "Enabling folks to use different types of hardware... Kubernetes is adapting to new changes."
As new hardware accelerators and specialized workloads emerge, platform engineering must adapt Kubernetes abstractions to manage these resources effectively.
The integration of DORA (DevOps Research and Assessment) metrics into platform engineering practices is explored.
Steve McGee [32:58]: "Platforms are software too... Using DORA metrics to track platform engineering success."
Ben Good [32:58]: "Using DORA metrics can help understand feature velocity and platform performance."
Implementing DORA metrics aids in assessing the effectiveness of platform engineering efforts, ensuring they contribute positively to development velocity and operational performance.
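As a rough illustration of the idea Ben raises, the same calculation can be pointed at either application deployments or releases of the platform itself. The sketch below is a minimal one and not something shown in the episode: the `Deployment` record, its field names, and the sample data are all hypothetical, and it covers only three commonly cited DORA metrics (deployment frequency, lead time for changes, and change failure rate).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when the change reached production
    failed: bool            # whether the deployment needed remediation


def deployment_frequency(deploys, window_days):
    """Average number of deployments per day over the window."""
    return len(deploys) / window_days


def median_lead_time(deploys):
    """Median time from commit to production."""
    return median(d.deployed_at - d.committed_at for d in deploys)


def change_failure_rate(deploys):
    """Fraction of deployments that failed."""
    return sum(d.failed for d in deploys) / len(deploys)


# Made-up sample data for a one-week window. The same functions could be
# fed platform releases instead of application deployments.
deploys = [
    Deployment(datetime(2025, 8, 1, 9), datetime(2025, 8, 1, 15), False),
    Deployment(datetime(2025, 8, 2, 10), datetime(2025, 8, 3, 10), True),
    Deployment(datetime(2025, 8, 4, 8), datetime(2025, 8, 4, 20), False),
    Deployment(datetime(2025, 8, 6, 9), datetime(2025, 8, 6, 11), False),
]

frequency = deployment_frequency(deploys, window_days=7)
lead_time = median_lead_time(deploys)
failure_rate = change_failure_rate(deploys)
```

Running the same functions over a separate list of platform releases would give a first approximation of the "platform velocity" mentioned in the conversation. Real measurements would come from CI/CD and incident data, not hand-written records.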
The episode underscores that platform engineering is a multifaceted discipline essential for modern cloud-native application development. Kubernetes serves as a foundational tool, but successful platform engineering requires integrating various technologies, automating workflows, and providing flexible, developer-friendly interfaces. Golden paths and deployment archetypes simplify complex processes, while observability and cost management ensure operational efficiency and reliability. Customization and adaptability are crucial as technological landscapes evolve, and metrics like DORA provide measurable insights into platform performance.
Notable Quotes:
Ben Good [07:01]: "Kubernetes is super helpful, but it's not the only thing that you need to make a platform, in my two cents."
Kaslin Fields [12:22]: "Flexible but structured platform engineering."
Steve McGee [15:48]: "Self-service also applies in the operational context as well."
This episode provides a comprehensive exploration of platform engineering within the Kubernetes ecosystem, highlighting best practices, essential tools, and evolving strategies to meet the dynamic needs of modern organizations. By leveraging Kubernetes and complementary technologies, platform engineering enables scalable, reliable, and efficient application development and deployment.