
Kaslin Fields
Hello and welcome to the Kubernetes Podcast from Google. I'm your host, Kaslin Fields.
Abdel Sghiouar
And I am Abdel Sghiouar.
Kaslin Fields
And Happy New Year, everyone. Happy 2025. To kick off the new year, we're talking with John Belamaric. He's one of the co-leads of Working Group Device Management in open source Kubernetes. We'll talk about the awesome features the group is developing and what problems they're trying to solve. But first, let's get to the news.
Abdel Sghiouar
KubeCon Japan and KubeCon India both have CFPs open now. KubeCon Japan will be the first KubeCon event held in Japan, on June 16th and 17th; the CFP for that event closes on February 2nd. KubeCon India will be the second KubeCon event held in the country. The previous KubeCon India was held in December 2024 in New Delhi. KubeCon India 2025 will be held in Hyderabad on August 6 and 7, and the CFP for India closes on March 23.
Kaslin Fields
And that's the news. I am excited today to be speaking with John Belamaric. He is a senior staff software engineer at Google who has been involved with Kubernetes since 2016 and is currently a co-chair of both SIG Architecture and Working Group Device Management, which we're going to be talking about today. So welcome to the show, John.
John Belamaric
Thank you, Kaslin, for having me. I'm excited to be here.
Kaslin Fields
So you've been around the community for quite some time. 2016 was actually the first year that I went to a KubeCon. Did you go to Seattle that year?
John Belamaric
Yes, I did, yes.
Kaslin Fields
Nice.
John Belamaric
The election-night KubeCon. Yeah.
Kaslin Fields
Oh man. That was a thing.
John Belamaric
Yes, yes. That was my first KubeCon, and I was involved at the time with CoreDNS. I still am, but that was sort of my first. The company I worked for at the time was a DNS company, and so we got involved and brought CoreDNS to Kubernetes. So I had a lightning talk at my first KubeCon. I had three minutes, and I was a little nervous, so I went so fast that I finished it in less than one minute.
Kaslin Fields
Whoa.
John Belamaric
Which I'm sure was completely useless to everybody in the audience, but you know, that's how it goes.
Kaslin Fields
That almost never happens with lightning talks. They always run over. So, I don't know, props to you for that, I think.
John Belamaric
I don't think it was a good thing.
Kaslin Fields
That is super cool. And I could go on and on talking about the 2016 KubeCon, what an adventure it was. But let's talk about what's going on these days with Kubernetes. So we had our 10-year anniversary of the project last year in 2024, and, of course, AI. AI, AI, all the things. And I hear that that is related to what you're doing in Working Group Device Management. Could you tell me a little bit about the group, and what a working group is too, so that we can bring folks up to speed if they haven't heard of it?
John Belamaric
Absolutely, that's exactly right. So, a working group in the Kubernetes community: the way the Kubernetes community is organized, we have special interest groups, SIGs, and those actually own the code. It's just a bunch of people who work together, you know, in open source. But sometimes there's a problem that spans multiple SIGs. Our SIGs are things like Node, which focuses on the kubelet and related APIs; API Machinery, which focuses on the actual API server and all of that; and Scheduling, which focuses on the scheduler. So it's kind of broken up by components that are part of Kubernetes. But sometimes there's a feature you want to implement that's going to span and touch many of these different components. And so we have another concept, called a working group. A working group is sponsored by multiple SIGs, and it has a relatively short lifespan, usually a couple of years, that is gated by, you know, we finish this feature and we're done, and it dissolves. And the changes made to the code by that feature, by that implementation, are still owned by the SIGs that sponsored it. So that's what a working group is.
Kaslin Fields
Perfect.
John Belamaric
And with devices, well, if we roll back, say, to KubeCon Chicago, which was in November of 2023 (and we could roll back even further than that), that's kind of where it started to reach a fever pitch, where people were saying: we need to do things with AI, which means we need to do things with GPUs and accelerators. And some folks, in particular my co-chairs of Working Group Device Management, which didn't exist at the time, Patrick Ohly at Intel and Kevin Klues at NVIDIA, had put together a solution for this called dynamic resource allocation. Patrick had been working on it for several years. It originally came out of a different use case, but with a similar kind of need: more flexibly managing devices, or rather resources, on a node. And so there was a lot of excitement at KubeCon in Chicago around DRA. Unfortunately, that iteration of DRA also caused some anxiety among some of the SIGs, in particular the Scheduling and Autoscaling SIGs, because the specific design and implementation was super flexible, which is great, except when it's not, right? What it meant was that the autoscaler couldn't look at a Pod spec or a Deployment spec and a node and easily identify whether the new node it wants to create would satisfy that Pod spec, because of the level of flexibility built into that DRA functionality. So this put a bit of a halt on things, and we started discussions about how we might revise it. And that's around when I got involved, after that KubeCon, so around January of the next year.
Kaslin Fields
I did not expect this to take a turn into dynamic resource allocation. Very interesting.
John Belamaric
So, exactly right. What we decided is that there were a couple of things. One is that we needed to revisit DRA and how it was designed and structured, such that it meets the needs of the autoscaling community. And the scheduling community also had some concerns about certain aspects of it.
Kaslin Fields
So DRA, to give folks a little bit of context around this: DRA, or dynamic resource allocation, is a feature that was a primary feature within 1.33, the last release of Kubernetes, which just came out in December of 2024. And something that we talked about in the release episode was that it was kind of a revision of DRA, and I actually didn't know that going into that release episode. So this is really interesting to me, to hear a little bit about the origins of DRA, and it being very flexible and causing issues with the autoscaler and scheduler. And so now a new version of it is out in 1.33. Right?
John Belamaric
It's 1.32. 1.33 is the one we're starting.
Kaslin Fields
1.32. Darn it. 1.33 shadow applications are closing.
John Belamaric
Sorry, I should have stopped you.
Kaslin Fields
The release team is spinning up. So 1.33 will be next. 1.32.
John Belamaric
Yes. We just released the beta of DRA in 1.32, and that beta is the revised version that came out of our discussions after the Chicago KubeCon. After the Chicago KubeCon, we all worked together, and it came out of this working group. So, kind of going back: we had the Chicago KubeCon, these concerns were raised, the Kubernetes community met, and then we met at the next KubeCon. We discussed a lot of things offline, of course, and online, in Slack and meetings and everything. And what we realized is that DRA had been designed out of the Batch Working Group, and that there were a lot of use cases around things like inference that meant we needed something a little broader than batch. We also realized in our discussions at the KubeCon EU following the Chicago one that there were other things than accelerators that needed to come together to solve the whole problem. For example, sometimes your accelerator needs to talk to other accelerators over the network, and you want that accelerator and that network interface card to be on the same PCI bus, because NVIDIA, for example, has technology where, if they're on the same PCI bus, they can talk directly to each other and bypass the CPU. And it's a tenfold improvement in the I/O performance between them. So if we're just looking at accelerators, then we're not solving the whole problem. And so we took those few pieces of information and said: hey, we really need a new group that's going to understand all of these use cases around different types of workloads that use these specialized devices, and try to come up with a plan that works for the autoscaling community, works for the scheduling community, and, of course, meets the needs of those workloads. And that's how Working Group Device Management was born.
Kaslin Fields
It makes a lot of sense that your description of working groups was so clear, because my first introduction to working groups was the long-term support working group, and all of the shenanigans and drama in the community around that. And a primary feature of the long-term support working group is that it is meant to be a limited-time thing, so that point was very emphasized. But Working Group Device Management has all of these connections between the different SIGs, the different areas of Kubernetes, and that's a more primary part of it than I would say it is in long-term support, because long-term support is just talking about the whole project and how we deliver it to folks.
John Belamaric
Right.
Kaslin Fields
So it doesn't have as much of that piece of working groups. But so, you've got Batch. Was Batch a working group at that point? Yeah, it still is. Right?
John Belamaric
Is it a SIG now? I'm not sure if it's a SIG or a working group now, but I think it might be a SIG, because it may own.
Kaslin Fields
Certain code, but I think it might be.
John Belamaric
I don't recall. But Node, Scheduling, Autoscaling, Architecture, and Networking as well are all involved in Working Group Device Management, especially Networking, for the NICs, the network interface cards. And in fact, we're able to solve some of the low-level multi-network concerns for Kubernetes with DRA as well, because we're able to attach different devices to a pod, and a network interface card attached to a pod means you get access to another network. So it kind of solves some of the problems that Multus, for example, is used for today.
Kaslin Fields
That's a big swath of the project. That's a good chunk of the SIGs.
John Belamaric
Yeah, well, it's about the abstraction, right? Our physical machines have this abstraction already. Our kernel has this abstraction already, of devices that can be attached and put in these different types of namespaces. In some sense, we're just leveraging the logical constructs the kernel gives us of this abstract device, and, going back further, what Unix gives us: everything's a file. But anyway, the mission, if you go look at the charter for the working group, is to enable simple and efficient configuration, sharing, and allocation of accelerators and other specialized devices. What that means is: as a Pod spec author, I can put in some selection criteria, and that goes out and finds the right type of device based on that selection criteria, allocates it so that other people can't use it, and attaches it back to my pod. At the same time, I might have specialized configuration I want to attach to that. It's not just the selection of the right type; I want to configure it in a certain way, and there might be certain ways my administrator allows it to be configured and not. So it tries to get all of these different pieces into place.
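For readers following along, the flow John describes (a pod author stating selection criteria, the platform allocating a matching device and attaching it to the pod) can be sketched as a hypothetical manifest. The image, claim name, and template name below are illustrative assumptions, not details from the episode, and the exact fields depend on which DRA API version your cluster runs:

```yaml
# Hypothetical Pod that consumes a device allocated via DRA.
# "gpu-claim-template" is a made-up ResourceClaimTemplate name.
apiVersion: v1
kind: Pod
metadata:
  name: inference-pod
spec:
  containers:
  - name: app
    image: registry.example.com/inference:latest   # illustrative image
    resources:
      claims:
      - name: gpu        # references the entry in spec.resourceClaims below
  resourceClaims:
  - name: gpu
    resourceClaimTemplateName: gpu-claim-template  # admin-provided template
```

The scheduler resolves the claim against the devices vendors publish, and the node-side driver attaches the allocated device when the pod lands.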
Kaslin Fields
It's almost like when you try to implement new functionality with new hardware in a distributed system. You have to deal with the whole distributed system.
John Belamaric
Yeah, exactly, exactly.
Kaslin Fields
All the components of it. Basically because the system is made up of hardware.
John Belamaric
And that's the thing. To me, and maybe I'm biased because I'm one of the co-chairs here, but to me, one of Kubernetes' fundamental challenges in moving from the traditional web-app type of environment to AI environments was that our first 10 years, or maybe the majority of the first 10 years, were spent thinking about fungibility of hardware, making hardware as invisible as possible. We had certain workloads that we used as our primary use cases, and they would just consume that hardware in any old way. That works great for those types of applications. It doesn't work as well for these training and inference workloads, which have very specific hardware requirements and use very expensive hardware that's scarce. And we want to get the most utilization and the most utility out of it that we can. So, fundamentally, the goal of Working Group Device Management is to change Kubernetes' relationship with the hardware: change how Kubernetes understands the hardware and makes the hardware available to our users.
Kaslin Fields
That is a big ask. It is.
John Belamaric
It really is a big ask. Right now, our current effort is DRA, and that actually solves a substantial part of the problem, but not all of it. So we may have new things that come in after DRA, but I can't look that far ahead yet.
Kaslin Fields
Yeah, I can see how DRA would be an important part of that: the dynamic allocation of these new types of hardware resources. You're going to have to solve a lot of problems with the way Kubernetes looks at those resources in order to dynamically allocate them. So that makes a lot of sense as kind of a base level, but I do think there is going to be more, like you're saying here. Yes, we'll see what that is. I do also think it's funny sometimes, when we talk about the advent of AI workloads and how that's different from the web world Kubernetes was originally built in, that normally in technology you tend to go toward a world of more abstraction. But it feels like we've taken a step backward in that respect, because in the web application world we could abstract the hardware more, whereas these AI workloads are so specific in the kinds of hardware that they need and how they use that hardware, and can want to do things at a very granular level with the actual hardware you have. And so we're kind of going backwards there, and showing all of that detail again to the users.
John Belamaric
Yes. I think that's partly, though, a function of the newness of the hardware, right? CPUs have been around a long time. Memory had been around a long time before Kubernetes came along. So making it more fungible, more commoditized, was pretty easy to do; people have been working on that problem for many years. We are not at that state yet as an industry with our accelerators. They're not fungible. They're not even very equivalent at times, right? Even if you can somehow represent them, the workload won't run the same on this one versus that one. Their performance is different; so many characteristics are different. We don't even know how to measure it at times, and so we don't know how to measure the utilization sometimes. So we're just not at a state yet where the abstractions can be as useful as we'd like. Now, we are building abstractions, and they are useful, but the parts we're abstracting are the scheduling, selection, and configuration pieces. It's the orchestration layer we're abstracting, whereas with CPUs we can abstract at a bit lower level. I don't know if that makes complete sense, but it's kind of how I think about it.
Kaslin Fields
Yeah, I think that's a helpful way of looking at it. So Working Group Device Management has a very big mission in terms of its meaning and impact on the project, and kind of a wide range of areas. There's a lot of work going on with dynamic resource allocation, which is in 1.32, but there's continuing work on that. What are some of the work streams? How does the working group operate?
John Belamaric
I mean, DRA is really our primary work stream, but you could think of it in pieces; we break it into many pieces right now.
Kaslin Fields
I'd imagine you break it down. It sounds pretty big.
John Belamaric
Exactly, it's pretty big. So with DRA, there are a few aspects you can think about. One is the API for how device vendors specify their devices. Traditionally we had the device plugin, and the device plugin just says: here's a string and a count, and that's it, for the node, right? It's what we call an extended resource, where it says: I have nvidia.com/gpu, and I have eight of them. And that's it. DRA widens that API to allow a lot more detail, and on top of that, we allow very sophisticated models of how to represent devices. The simplest thing we started with in 1.32, and this is going to get much more sophisticated as we move on, is: instead of publishing here's a string and a count, we publish eight structures, right? There's eight of them, one for each device, and each has a bunch of attributes, and those attributes are of varying types. So you can have model name, vendor name, et cetera, but you can also have things like capacity: how much memory this thing has. So that's how the vendors publish information about the devices. Then there's another aspect, which is how users ask for those things. That's our claim, or resource claim, API. It allows the user to say: I need this particular model from this particular vendor. Or it can be more flexible and say: hey, I need any model from that vendor as long as it has more than eight gigs of memory. So this is a way we can allow some flexibility, or allow the user to underspecify, which leaves room for the platform, Kubernetes, to satisfy the request in different ways. And when you have room for that to happen, you have people with opinions; in particular, your cluster administrators have opinions and want to control which choices get made first. So we have something called DeviceClass that helps with that, and future things coming to help with that. So resource claims are the way the user specifies what they need.
DeviceClass is the prepackaged version of what a device might look like from an administrator's point of view; the administrator can attach configuration, for example. ResourceSlices, we call them, are where the driver, the vendor, publishes the information. And then we have an allocator, part of the scheduler, that's going to go and satisfy, or resolve, those claims against the available set of devices out there. And then we have a driver that runs on the node: when the allocation is made and the pod lands on the node, it attaches the device to the pod. So those are at least four different areas, and each one of them has its own set of KEPs in development. For the resource claim side, we have the basic claim we started with, but in 1.33 we're also likely going to have an alpha feature which allows you to say: you know what, you can satisfy this claim by giving me one of this type of device, or two of this other type of device, or four of this other type of device, which allows you to solve some of the obtainability problems we have in Kubernetes with GPUs. But that's a separate KEP, and it only touches the claim side: you don't have to touch the driver, you don't even have to touch how you publish those resources. The user just gets flexibility in how they specify their claims. So that's one area. I'll pause there, because I can go on and on, and you may have questions.
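As a rough sketch of the claim side John just walked through, a ResourceClaim under the 1.32-era API can underspecify the device and let the allocator choose. The class name, attribute domain, and CEL expression below are illustrative assumptions rather than details from the episode:

```yaml
# Hypothetical ResourceClaim: "one device from this class with at
# least 8Gi of memory". Class and attribute names are made up, and
# the exact CEL form depends on the DRA API version in your cluster.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: big-gpu
spec:
  devices:
    requests:
    - name: gpu
      deviceClassName: gpu.example.com    # DeviceClass set up by the admin
      selectors:
      - cel:
          expression: >-
            device.capacity["gpu.example.com"].memory.compareTo(quantity("8Gi")) >= 0
```

The vendor-published side lives in ResourceSlice objects that carry those per-device attributes, and the scheduler's allocator matches claims like this one against them.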
Kaslin Fields
I'm very curious how all of this is going to look in five to ten years, because that level of flexibility is very interesting. So I'm imagining, say you're a platform engineer who manages the infrastructure for a number of different development teams. Some of those development teams are doing AI workloads, and some of them maybe are doing traditional web applications or other kinds of applications, where they don't need the level of detail that the AI applications would need. So dynamic resource allocation, and especially the... what is it called again? The device... the definition piece. DeviceClass, there we go. So the DeviceClass piece is very useful for those AI-type workloads. Would you use dynamic resource allocation and device classes across the cluster also for the workloads that are not AI workloads, that don't need that level of detail and control over the devices? Or is it mainly meant just for those workloads, and you would do something different for the non-AI workloads?
John Belamaric
That's a very good question. The short answer is: you would. In my opinion, one should minimally specify what they need and let the system figure out how to optimize it. So if you don't need a device, you definitely shouldn't specify one. In theory, there are people working on drivers for DRA which model and represent CPU and CPU topology and memory topology, in which case, if you have needs like the ones we currently solve with something like Topology Manager, you potentially can solve them with this, with a little more flexibility. Topology Manager, or CPU Manager and things like that, are based on per-node settings, which is an artifact of the fact that they were built by SIG Node, not an artifact of the technology. So if people build the right drivers, and we can make it scale (that's the big issue; there could be scalability issues), then you could in theory use this for types of workloads that don't need specialized devices but have, say, specialized NUMA or other memory constraints or CPU constraints, or need pinned CPUs or things like that. And you could dynamically configure those on a per-workload basis rather than a per-node basis. Today it's per node, so you're cordoning off that node to say only this type of workload should go on it, and that creates a kind of chunky, blocky infrastructure, as opposed to an infrastructure you can cut up into smaller and smaller pieces. That's the long answer. The short answer is: regular workloads that don't need specialized devices shouldn't specify any of this stuff, and our existing systems will work perfectly well. The longer answer is: once you start getting some specialized needs, you should minimally specify what those needs are, to give the platform the flexibility to satisfy them in the most optimal way.
Kaslin Fields
We've said a few times on this show in our particularly AI focused episodes that it's a pretty good time to be an infrastructure engineer. One thing I'm getting from all of this is there's still a lot of room for infrastructure expertise here and a lot of need for it.
John Belamaric
Absolutely. And we're building the APIs such that if you're a platform engineer and you understand, say, even just your contracts with your cloud provider, where you have reservations versus where you have spot availability, we're enabling APIs that dovetail with things like Karpenter or Google's custom compute classes, which are autoscaling technologies. I'll hopefully have a talk at an upcoming KubeCon about where DRA dovetails with those technologies. So that, like I said earlier, the workload author can underspecify the request, meaning: I can take this or this or this. And then the cluster administrator, through these other autoscaling tools, decides which of this or this or this, based upon their preferences. So it decouples a little bit the workload author's efforts from the platform engineer's efforts, such that they can work independently without having to talk to each other every day. Because, you know, engineers don't like to talk to each other. They can avoid it.
Kaslin Fields
It can certainly be a challenge, especially in cases like this where the developers who are creating the application are just like, I want my code to do the thing.
John Belamaric
Exactly.
Kaslin Fields
Platform engineer, infrastructure person, please just make the computer do that. And things get dropped in that process.
John Belamaric
Exactly. And the infrastructure engineer is like, well, I have a budget, right. I can't just give you anything you want. I have to make sure that there's enough for everybody.
Kaslin Fields
Yeah, that can be a real challenge. So dynamic resource allocation and device classes in the world of AI workloads can help with that. One thing I was thinking as you were talking about all of this was that that sounds like a lot for infrastructure engineers to keep in mind. So splitting it up kind of between the cluster administrator and other roles makes sense, but still lends credence to the idea that infrastructure engineers have some job security here.
John Belamaric
Absolutely.
Kaslin Fields
So speaking of infrastructure engineers and users: how can listeners out there, who are maybe infrastructure engineers, or maybe building AI workloads, or whatever they may be doing, how can they support Working Group Device Management's work?
John Belamaric
We do have some end users involved, obviously, customers of those of us who are there. We have all the major cloud providers, NVIDIA, and other folks all involved in the working group, and we all talk to our own customers, and we have some end users who come to the meetings. But more is always better with respect to that, because we're building APIs that are hard to change and are likely going to have to live for another 10 years. And we're going to screw them up, because we're human, but the more information we have up front, hopefully the better we'll make them. As an example, I talked earlier a little bit about how we have a way to underspecify claims. We also have flexibility in the way devices get published by the vendor. One of the ways we're working on flexibility for device vendors is what we call our partitionable device model. The canonical mental model for this would be NVIDIA's multi-instance GPUs: you can take an individual GPU and break it into smaller GPUs, and you can do that dynamically, and we need to pick which one. So we want to represent that. We have a similar thing at Google: we have TPUs, and you can have eight of them in a node, but you can't consume any arbitrary two. If you want to consume two, they come in special pairs, or fours, or whatever, these topologies, so we need to represent that. Other vendors have similar use cases, and we would love to hear about them. Amazon has come and given us some of theirs, and Microsoft is there. But end users may have other constraints they want to put on that, or other ways they want to make use of it, and we've gotten feedback from different end users on that. That's just one small example of one of the things we're working on where we could use input from either vendors or end users. And the way you could help is, the first thing to do would be just reach out to us.
We have a Working Group Device Management channel on the Kubernetes Slack. Come to our meetings: we meet at 8:30am Pacific Time every other Tuesday. From the show notes, you can find out when our meetings are and where they are. We're super friendly and welcoming, everybody is welcome to contribute, and our meetings are very open. Anybody can add an agenda item, and we talk about it. If it looks like it's something people want to do, then you can create an issue in the Kubernetes repo to track it, an enhancement issue or whatever, depending on what it is. And we just start executing on it and tracking it in each meeting.
Kaslin Fields
Highly recommend popping into a meeting if you do want to give them work to do, because work that people know about and were told about in a meeting tends to get prioritized a little more easily than work they just saw on an issue in GitHub. So if you can.
John Belamaric
There's too many issues that go by. So yeah, yep.
Kaslin Fields
If you can pop into a Working Group Device Management meeting. And if you can't, maybe pop into one of the other SIGs that we mentioned; a lot of them are involved with this work. So if you had a question about one of these things, you might actually run into someone who knows something about it in one of the other meetings as well, depending on what meeting times work for you. So thank you so much, John, for being on today and teaching us about device management and what Kubernetes is doing to address it. I was just talking to some end users the other day, actually, who were saying they'd really like it if GPU autoscaling worked a bit better on Kubernetes. So I know there's demand for your work, and I look forward to seeing where it goes.
John Belamaric
That sounds great. And send those users to our working group.
Kaslin Fields
I'll see if I can do that.
John Belamaric
Thank you. Thanks so much.
Abdel Sghiouar
Well, hi Kaslin. Happy New Year.
Kaslin Fields
Thank you. It was super fun to get to talk to John to start off the year. He has done a lot of awesome work in the community. I've talked with him at KubeCons a bunch and seen a bunch of his lightning talks. We have him at the Google booth pretty often, I feel like. So it was really cool to get to hear more about Working Group Device Management, which I've seen a lot of stuff about around the community but haven't really gotten to dive into until now.
Abdel Sghiouar
Nice. I mean, we talked before about the fact that we were trying to get folks from both of the new working groups, Serving and Device Management, but we couldn't get to both of them last year. So I'm just happy that we managed to at least do it.
Kaslin Fields
Yeah, I didn't mention that in the episode, but a hint at why both of those were created: AI, the two.
Abdel Sghiouar
Most important acronyms, I guess, of 2024 and maybe 2025.
Kaslin Fields
Yeah.
Abdel Sghiouar
So I didn't have the time to listen to the episode, so why don't you walk us through what was discussed? I might have questions. I will for sure have questions.
Kaslin Fields
Yeah. So to start things off, right off the bat, I loved John's description of what working groups are in the community. I talked in the interview about how my first real working group that I was involved with was Working Group Long Term Support.
Abdel Sghiouar
Oh yeah.
Kaslin Fields
And that's kind of a weird one, because it's very much about what the industry is doing and how we support that in open source, rather than being inspired by a specific... well, I mean, it is kind of a specific technical need, but it's a little bit of a different perspective on how it relates to the different areas within Kubernetes, the different special interest groups. Whereas this one is, of course, AI functionality: we're getting a lot more AI workloads on Kubernetes, and so the SIGs were trying to address what was going on with the new workloads people are trying to run on Kubernetes. And they were encountering conflicts, because certain things that need to be created, new functionality that Kubernetes needs, really cross the SIG boundaries. And so they had to create this new working group. That kind of speaks to what working groups are meant to do: they're meant to be groups of folks working on projects that are cross-SIG and usually not forever. Usually the work they do ends up being owned by SIGs, so they don't own the code. So Working Group Device Management is just a really good example of a working group, I think.
Abdel Sghiouar
Yeah. So in my head, it sounds like what they're doing is coordination across multiple SIGs, to make sure that everything is lined up in the right...
Kaslin Field
Order, I guess. The primary feature they're working on right now, and it was very interesting to me that their work can be summed up with one feature, is Dynamic Resource Allocation, of course: DRA. So John talked about some of the conflicts between Scheduling, Autoscaling, Node, all of these different SIGs within Kubernetes that need to work smoothly together in order for hardware resources to be dynamically allocated. You need stuff on the node, you need autoscaling to work right for it, you need the scheduler to understand what kind of hardware it has and how to schedule things on it. And so they needed to implement new ways for users to specify what kinds of hardware they need. So it touches all of those. But he was saying that a lot of the code will end up being owned by the different SIGs, it sounds like. The different pieces fall into those categories nicely.
Abdel Sghiouar
Yeah. And I assume that, particularly with devices, and by devices, in the context of AI, we typically mean accelerators, but it could mean anything else, it's more complicated, because Kubernetes is a workload orchestrator, but you typically also have an infrastructure orchestration layer, whether that's your cloud provider, VMware, OpenStack, whatever gives you a VM or a node. And so having all of these things work together in a coordinated fashion is important.
Kaslin Field
I talked about one thing I find really interesting about our current situation in the industry with all of these hardware accelerators: there's just massive interest in using them and creating these new AI workloads. A weird thing is that it feels to me like we're kind of going backwards, because usually hardware becomes more commoditized over time, and we move toward abstracting away the underlying hardware. But right now it's like we've gone backwards a little bit, because everybody wants really close control of the hardware and a deep understanding of it. For certain types of AI workloads, you need to be really, really in control of what's happening with that hardware. So it's backwards in the sense that we're getting more fine-grained control rather than more abstracted control. But one feature of the working group's work is that it's flexible: you can have that level of detail, but there are also still ways to let the system decide things. They're trying to build in those abstractions now, and I think it's going to be interesting to see how that develops over the next several years.
Abdel Sghiouar
Yeah. I mean, a little bit off topic, but it's still related to this. I was on Reddit over the holiday season because, you know, when I don't have anything to do, I just go on Reddit, because it's fun. And there was a conversation going on in the Kubernetes subreddit about somebody saying that they have workloads running on Kubernetes, but they have to restart them every few days. Not let them restart automatically: restart them manually, pretty much. I was actually making the same exact face you are making right now. I will make sure to link the thread, because everybody was like, so you're running pods as VMs? And they were like, yeah, yeah, because the workloads have to be restarted so they can start from a fresh state. And I'm like, all right, sounds like we're not moving forward as much as we wish we were.
Kaslin Field
One thing I'm always telling folks, especially management, about this world is that you'd think, since we've been around for 10 years, people know what's going on. They don't.
Abdel Sghiouar
They don't.
Kaslin Field
People are still making the transition from the VM world into the container world. Things are still brand new to a huge swath of the industry. And that's okay. And great. Honestly, it's great to still continue introducing it to people for the first time. Those conversations are fun.
Abdel Sghiouar
I just liked the thread because the person mentioned that, oh, we adopted Kubernetes without really knowing much about what we were doing, so this is the result of what we did. Right. So it was like, okay, that's interesting. I will make sure to link the thread. I think it's fun.
Kaslin Field
And folks are going to be dealing with the ramifications of that for years and years to come.
Abdel Sghiouar
Oh, for sure, for sure. Yeah, for sure. Cool. Well, that sounds cool. I'm excited to listen to the episode later. So, yeah, thank you very much for your time.
Kaslin Field
Of course. Happy 2025. I'm excited to see a lot of the work that you're working on in 2025. Is there anything that you want to call out that's coming up in this year that you're excited about?
Abdel Sghiouar
A lot of AI, but, yeah, I mean, 2025 is looking super exciting. There will be a bunch of things going on. We are gearing up for Kubecon Europe, obviously. It's like, what, 12 weeks from now?
Kaslin Field
We are so difficult.
Abdel Sghiouar
So difficult. Yeah. There are quite a lot of KCDs happening this year, and there's a lot of content to be created. Yeah, there will be a lot of things going on. I'm excited. 2025 will be a good year.
Kaslin Field
The biggest year of Kubecons yet.
Abdel Sghiouar
Yeah. There are five this year.
Kaslin Field
What have we got?
Abdel Sghiouar
Yeah, we got like, Europe, US as usual. We still have China and then we have India and Japan. Right?
Kaslin Field
Yeah. So I think that's five. Five Kubecons.
Abdel Sghiouar
Yeah. And then you have the Open Source Summits and...
Kaslin Field
Yeah. Open Source Summits, Kubernetes Community Days, like you were saying.
Abdel Sghiouar
Yes. Then there will be, I think Cloud Security Con again this year.
Kaslin Field
Oh, right.
Abdel Sghiouar
There will be something like 30 KCDs...
Kaslin Field
Wow.
Abdel Sghiouar
Around the world. Right. And then third-party events, and first-party events.
Kaslin Field
So it's going to be a big year for the Kubernetes and cloud native communities. Infrastructure is not slowing down in the era of AI.
Abdel Sghiouar
No, it's not, actually. Speaking of exciting things, I was scrolling on LinkedIn yesterday and I saw this thread. I did not know, but apparently the default ingress controller in Kubernetes, and by default I mean the community-maintained one most people install, is called ingress-nginx. Very confusing: it has nothing to do with the NGINX company's own ingress controller, it's just called ingress-nginx. But apparently they are moving toward a new implementation called InGate, and I'm excited to explore that. We might actually have to have an episode about it on the podcast.
Kaslin Field
All right. I have been hearing about some other exciting ingress things happening this year with Gateway API, so hopefully we'll talk more about that soon as well.
Abdel Sghiouar
Yes, for sure. I know we chatted about that. And as we discussed last year, we will also have some end-user episodes: not CNCF-affiliated projects, but companies that are actually using Kubernetes internally. We are planning to do some of those.
Kaslin Field
Yeah, I definitely want to feature more of those stories of folks out there using Kubernetes and cloud native technologies to do really awesome things. Those are always my favorite. I used to love it when there was a use-case track at KubeCon and other container events. I feel like the events have kind of moved away from that; there's not a single track dedicated to use cases anymore, and that makes me sad. But whenever I'm track chairing or reviewing CFPs, I'm always excited if it's a use case.
Abdel Sghiouar
Yeah. Actually, speaking of that: if you are listening to this section of the episode and you have a very interesting use case, email us. We would be curious.
Kaslin Field
Yes, we'd love to feature you on the show.
Abdel Sghiouar
Yeah, just email us: what are you working on? How are you using Kubernetes? What challenges do you have? If you have any particular questions or anything interesting that you want us to explore, feel free to use email or DMs on social media. We're open to listening to what you have to say.
Kaslin Field
And I think that's an excellent note to close on. Thank you, everyone, for listening to our first episode of 2025, and we hope to talk with you soon.
Abdel Sghiouar
Thank you.
Kaslin Field
That brings us to the end of another episode. If you enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on social media @KubernetesPod, or reach us by email at kubernetespodcast@google.com. You can also check out the website at kubernetespodcast.com, where you'll find transcripts, show notes, and links to subscribe. Please consider rating us in your podcast player so that we can help more people find and enjoy the show. Thanks for listening, and we'll see you next time.
Kubernetes Podcast from Google – Episode Summary: "Device Management in Kubernetes, with John Belamaric"
Release Date: January 15, 2025
Hosts: Abdel Sghiouar & Kaslin Field
Guest: John Belamaric, Senior Staff Software Engineer at Google and Co-Chair of the Kubernetes Working Group Device Management
In the premiere episode of 2025, hosts Kaslin Field and Abdel Sghiouar delve into the intricacies of device management within Kubernetes, featuring insights from John Belamaric. The discussion centers on the evolving needs of the Kubernetes community, especially in the context of specialized workloads such as AI and machine learning.
The episode kicks off with Abdel sharing exciting updates about upcoming KubeCon events:
Timestamp: [00:41]
John Belamaric is introduced as a long-standing contributor to Kubernetes since 2016. He currently serves as a co-chair for both SIG Architecture and the Working Group Device Management. John shares his early experiences at KubeCon Seattle 2016, highlighting his role in bringing CoreDNS to Kubernetes.
Timestamp: [01:14] – [02:19]
John provides a comprehensive explanation of Kubernetes' organizational structure, distinguishing between Special Interest Groups (SIGs) and Working Groups. While SIGs focus on specific Kubernetes components like Node, API Machinery, and Scheduling, Working Groups are formed to tackle cross-cutting challenges that span multiple SIGs. These groups typically have a short lifespan, dissolving once their objectives are met.
Timestamp: [03:07] – [04:21]
The conversation shifts to the formation of the Device Management Working Group, driven by the burgeoning demand for AI workloads that require specialized hardware like GPUs and accelerators. John recounts the initial enthusiasm and subsequent challenges faced with Dynamic Resource Allocation (DRA) presented at KubeCon Chicago.
Timestamp: [04:23] – [06:21]
Definition & Purpose: DRA is a feature introduced to allow more flexible management of hardware resources on Kubernetes nodes. It aims to allocate devices dynamically based on the specific needs of workloads, particularly those related to AI.
Challenges: The initial implementation of DRA introduced complexities for the autoscaler and scheduler SIGs. The high degree of flexibility made it difficult for the autoscaler to predict whether new nodes would satisfy Pod specifications, leading to hesitation and the need for a redesign.
Timestamp: [06:21] – [07:44]
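To make the DRA model discussed here concrete, the sketch below shows the two core objects involved: a DeviceClass (typically published alongside a DRA driver) and a user-facing ResourceClaim. This is an illustrative sketch only, assuming the `resource.k8s.io/v1beta1` API shape used in recent Kubernetes releases; the driver name `gpu.example.com` and the class name are hypothetical, and exact fields have changed between releases while DRA matured.

```yaml
# Hypothetical DeviceClass: groups all devices advertised by one DRA driver.
apiVersion: resource.k8s.io/v1beta1
kind: DeviceClass
metadata:
  name: example-gpu
spec:
  selectors:
  - cel:
      # CEL expression evaluated against each device the driver advertises
      expression: device.driver == "gpu.example.com"
---
# A ResourceClaim is the user-facing request for one or more devices.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: single-gpu
spec:
  devices:
    requests:
    - name: gpu
      deviceClassName: example-gpu
```

Because satisfying a claim involves the scheduler and (for new nodes) the autoscaler, not just the kubelet, this API is exactly where the cross-SIG friction described above showed up.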
Notable Quote:
John Belamaric at [06:26] says,
"We need to revisit DRA and how it is designed and structured such that it meets the needs of the auto scaling community and the scheduling community."
In response to the challenges, the community decided to overhaul DRA, ensuring it aligns with the requirements of auto-scaling and scheduling. This led to the establishment of the Device Management Working Group, which encompasses multiple SIGs including Node, Scheduling, Auto Scaling, and Networking. The group's mission is to facilitate efficient configuration, sharing, and allocation of accelerators and specialized devices.
Timestamp: [07:49] – [09:55]
Notable Quote:
John Belamaric at [14:36] elaborates,
"The goal of working group device management is to change Kubernetes' relationship with the hardware and change how Kubernetes understands the hardware and makes the hardware available to our users."
John discusses how Kubernetes initially focused on making hardware as abstract and fungible as possible, suitable for traditional web applications. However, AI workloads necessitate a more granular and controlled approach to hardware management due to their specific and resource-intensive requirements. This shift represents a move towards increased complexity in hardware abstraction to optimize utilization and performance.
Timestamp: [13:00] – [17:27]
Notable Quote:
John Belamaric at [14:52] states,
"We're trying to change how Kubernetes understands and interacts with hardware to better support the specific needs of AI workloads."
The primary workstream within the Device Management Working Group is the continued development of DRA. This includes enhancing the API to allow device vendors to specify detailed device attributes and enabling users to make more nuanced resource claims. Upcoming features in Kubernetes 1.33 aim to provide greater flexibility in specifying device requirements, such as allowing multiple types of devices to satisfy a single claim.
Timestamp: [17:56] – [22:00]
Notable Quote:
John Belamaric at [18:07],
"With DRA, users can specify their needs more flexibly, allowing the system to optimize resource allocation based on the cluster's overall state and policies."
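As a rough illustration of how a workload consumes such a claim, a Pod references the claim by name and each container opts in; again a sketch assuming the v1beta1 API, with a hypothetical workload name and image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-worker            # hypothetical workload
spec:
  resourceClaims:
  - name: gpu
    resourceClaimName: single-gpu   # ResourceClaim created separately
  containers:
  - name: app
    image: registry.example.com/inference:latest   # hypothetical image
    resources:
      claims:
      - name: gpu                   # this container uses the allocated device
```

The indirection (claim object, then a named reference from the Pod) is what lets the system, rather than the user, decide which concrete device satisfies the request.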
John emphasizes the importance of community feedback in shaping the APIs and features related to device management. He invites infrastructure engineers, platform developers, and end-users to participate in discussions, contribute to the GitHub repository, and attend bi-weekly meetings to provide input and collaborate on solutions.
Timestamp: [27:55] – [31:03]
Notable Quote:
John Belamaric at [27:55],
"We're building APIs that are hard to change and are likely going to have to live for another 10 years. The more information we have up front, the better we can design them."
The episode wraps up with the hosts and John acknowledging the critical role of infrastructure engineers in the evolving Kubernetes ecosystem. They highlight the numerous upcoming KubeCon events and encourage listeners to engage with the community, share their use cases, and contribute to ongoing projects.
Timestamp: [31:03] – [42:32]
Key Takeaways:
Cross-SIG Collaboration: The Device Management Working Group exemplifies effective collaboration across multiple SIGs to address complex, cross-cutting challenges.
Dynamic Resource Allocation: DRA is pivotal in enabling flexible and efficient management of specialized hardware resources, crucial for AI workloads.
Community Engagement: Active participation and feedback from the community are essential in refining APIs and ensuring the longevity and scalability of Kubernetes features.
Future-Proofing Kubernetes: By addressing the nuanced needs of modern workloads, Kubernetes continues to evolve, maintaining its relevance and adaptability in diverse computing environments.
Stay Connected:
Subscribe and Rate:
If you enjoyed this summary, consider subscribing to the Kubernetes Podcast on your favorite podcast platform and leave a rating to help others discover the show!