
Guests are Nick Eberts and Jon Li. Nick is a Product Manager at Google working on Fleets and Multi-Cluster and Jon is a Software Engineer at Google working on AI Inference on Kubernetes. We discussed the newly announced Multi Cluster Orchestrator...
Kaslin Fields
Hello and welcome to the Kubernetes podcast from Google. I'm your host Kaslin Fields.
Mofi Rahman
And I'm Mofi Rahman.
Kaslin Fields
At Google Cloud Next, Abdel sat down with Nick Eberts, a product manager at Google, and Jon Li, a software engineer at Google, to talk about Multi-Cluster Orchestrator, or MCO, a new open source tool announced at KubeCon EU 2025. MCO addresses the need for managing workloads across multiple Kubernetes clusters, especially with expensive accelerated hardware like GPUs, which require efficient scaling. But first, let's get to the news.
Mofi Rahman
etcd has released version 3.6.0. This is the first minor version release in about four years, since June 2021. The new release includes some exciting updates like downgrade support and some significant performance improvements. Check out the blog on etcd.io for more details.
Kaslin Fields
Kubernetes 1.33 is now available in the Rapid channel on GKE. This update comes just two weeks after the open source 1.33 release.
Mofi Rahman
Kyverno 1.14.0 was released, marking a significant milestone in the journey to making policy management in Kubernetes more modular, streamlined and powerful. This release introduces two new policy types, ValidatingPolicy and ImageValidatingPolicy. Kyverno 1.14.0 begins a new chapter for Kyverno with the introduction of specialized policy types that separate concerns and reduce confusion about validation checks being written in various patterns by providing a more focused approach to functionality.
Kaslin Fields
And that's the news.
Abdel Sghiouar
Hello everyone and welcome to a new episode of the Kubernetes podcast. I'm your host Abdel and we're here live from Google Cloud Next 2025. I am here with John and Nick. Hi guys.
Nick Eberts
Howdy.
Abdel Sghiouar
We're going to be talking about Multi-Cluster Orchestrator, which is something we actually announced at KubeCon London last week. I don't know when this episode is going to be out, but last week, whatever that last week is. But before we get there, let's start with some introductions. Why don't we start with you, Nick? Who are you? What do you do?
Nick Eberts
My name is Nick Eberts. I am a product manager at Google, working on GKE and all of the multi cluster tooling that we build. Mainly fleets and of course multicluster orchestrator.
Abdel Sghiouar
Awesome. And Jon.
Jon Li
Yeah, I'm Jon. I'm a software engineer, also working on GKE closely with Nick. In the past year or so I've been focusing on inference on a lot of the Gen AI workloads.
Abdel Sghiouar
Like everybody, basically, these days. Awesome. So let's start with something basic. Multi-Cluster Orchestrator. We announced that last week. There is a blog out, there is a GitHub repo somewhere. Why don't you explain to us what MCO is?
Nick Eberts
Yeah, sure. Actually, before we explain what MCO is, I'm going to have John describe the problem. So John and I, we got together about a year ago to try and solve this particular problem. I'm going to let you go through and describe.
Abdel Sghiouar
Yeah, let's start with that.
Jon Li
Yeah. In the past two, three years, we've seen the cloud evolve. When we first built Kubernetes, it was built with the assumption that the cloud is infinite and the cloud is uniform. You have an infinite amount of capacity. CPUs, they're all about the same everywhere. That's the assumption we took 10 years ago, when Kubernetes was started. When accelerators started to enter the picture, it's not quite infinite. We get stockouts; sometimes our customers can't get capacity. It's also not uniform: there are different generations of GPUs and TPUs. That calls for a different solution. When customers have a stockout in one region, it's natural to expand out to different regions close by. And for a lot of the inference workloads, your latency requirement is not that strict, so going a little bit further is okay. So that kind of calls for a multi-region, multi-cluster solution.
Abdel Sghiouar
Got it, got it. And before we start talking about MCO, Nick, I guess this problem of managing multiple clusters is not even new to the inference world. It has existed for a while, and it has to do with things like "I need to run my app closer to where my customers are." That would be one of them. So it's not only about stockouts and running out of available hardware. But I want you, Nick, to try to answer this question: where are we right now? Where do you think we are on the question of one large cluster or multiple small clusters? Because that kind of blends into this conversation, right?
Nick Eberts
Yeah. At Google or even upstream Kubernetes, we're making pretty massive clusters these days. Like these days we have 65k nodes. I know you did a podcast about all that.
Abdel Sghiouar
Yeah.
Nick Eberts
But just because you can, I think, doesn't mean you necessarily should.
Abdel Sghiouar
Got it.
Nick Eberts
Even though your cluster can be massive, there's still the blast radius of the control plane on any particular cluster.
Abdel Sghiouar
Sure.
Nick Eberts
The idea here is that you don't want too many small clusters, and I don't think you necessarily want one massive cluster, but you certainly want some kind of middle ground in between, in which maybe you have a cluster that represents a shape of applications that bin pack nicely together in a region. And so the goal of the products that I build, and what I'm trying to push upstream, is this ability to think about how you want to bin pack applications together onto the same shapes of clusters. Clusters are fungible, and you don't really have to think too much about a physical cluster. Here's this set of configuration that represents these apps.
Abdel Sghiouar
Got it.
Nick Eberts
And I want to make sure that it runs highly available, and I have customers in maybe these three regions, so just make sure that it's there for them to answer lower latency requests.
Abdel Sghiouar
Got it, got it. And we're going to talk a little bit about the details of how MCO works, because I had to go do some reading to prepare for this episode. But let's get into it. What is MCO?
Nick Eberts
Yeah, I'm going to take a second here just to describe what we have without MCO, and maybe lead into what MCO provides. If you were going to build a multi-cluster inferencing engine, you could just build that. Literally, you could deploy n number of clusters across n number of regions. You can host that inferencing app in those regions. And by the way, inferencing is just a serving app. I know that there are some nuances there as it relates to networking and workload placement, but I think that you could think of them somewhat the same. The point is that you can have these clusters set up across multiple regions. Then you just need to make sure you have a load balancer that can reach them. And you want to make sure that load balancer can route traffic based on preferences.
Jon Li
Right.
Nick Eberts
So you can ensure that if you want most of your requests to go to a certain region, there's a preference so it can send traffic there. You could do all this in Kubernetes. There are lots of ways to solve it today. But the thing that you can't really solve for, unless you introduce something like a PaaS service running on Knative or any of these services that can scale to zero in a cluster, is taking those workloads, those inferencing engines running in all those clusters, and scaling them to zero.
Abdel Sghiouar
Sure.
Nick Eberts
Now this is one of the main differences when you're talking about accelerated hardware versus regular CPUs. Accelerated hardware is expensive. So, like, you don't necessarily want to have to pay for GPUs in regions where they're not serving requests. So just having to have a GPU sit in a region just in case is not really cost effective.
Abdel Sghiouar
Yeah.
Nick Eberts
Right. And so one of the problems that we're out to solve with Multi-Cluster Orchestrator is this idea of taking the HPA of those workloads in secondary regions from zero to one when there's a need for them to scale out, and then also taking them back to zero when there's no longer a need to have that inferencing engine running in that extra region. So Multi-Cluster Orchestrator's job is to allow an ML operator, or just a workload operator, to define a set of priorities. Those priorities are things like which regions they prefer, which clusters they prefer, stuff like that. It then evaluates those against actual capacity and returns a result: a recommendation for which particular cluster in your fleet this workload should land on. Now, I just want to be clear. It's just making a recommendation.
Abdel Sghiouar
Yeah.
Nick Eberts
So what Multi-Cluster Orchestrator is not is a CD tool. We have enough of those. I don't think we need a full one. And so the first implementation that you'll see is with Argo CD, because that's where most of our customers are right now. But we are working upstream to get it to work with Flux and also Config Sync down the line.
Abdel Sghiouar
And so the work you're doing upstream is to standardize the way those recommendations are spit out by MCO so they can be consumed by a CD tool.
Nick Eberts
Yeah. So MCO is open source, and we believe it's still early days, but the idea is that we're going to have almost like a cloud provider model. I shouldn't say cloud provider, but a provider model. So if you're a Kubernetes provider, you could plug in maybe an API that MCO is going to call to search for capacity.
Abdel Sghiouar
Got it.
Nick Eberts
And also we are going to make the metric that determines whether or not there actually is a capacity issue in any region that's live, open and accessible so that you could bring your own sort of metric to evaluate the running workloads and decide whether or not they're stocked out, or any logic that you want to decide to add another cluster or remove cluster. So those are the two. That's like the surface area for integration with other providers.
Abdel Sghiouar
Yeah, got it, got it. And Jon, you were going to say something about inference specifically. I want to hear your thoughts.
Nick Eberts
Great, yeah.
Jon Li
For inference, like Nick said earlier, it's just like a web service in a lot of senses. It is stateless, not keeping track of any state. It's not typically connected to any relational database. In that sense, yes, it's very much like a web server. There are also cases where it's not quite like a web server. Oftentimes the actual computation is done by an accelerator, and transformer inference is autoregressive, with the workload divided between prefill and decode. Prefill typically is very compute bound and decode is memory bandwidth bound. It's autoregressive in the sense that, you know, for one forward pass of the neural net, you generate one token, and then you keep generating until you hit the end of sentence token. So because of that nature, the latency here could be in the order of seconds, as opposed to what we're used to in the microservice world, where things are in the millisecond realm. So that's the major difference.
Abdel Sghiouar
Yeah, so to rephrase what you said, basically traffic for LLMs is not your typical web traffic in the sense that the request could be long, the size of the request could be big. And there is also the fact that a lot of these LLMs today are multimodal. So the request is not always text. It could be audio, it could be video, it could be a picture, it could be whatever, for both request and response. And I think that's partially what the Gateway API Inference Extension is trying to address. We're going to have an episode about that. But I want to go back to MCO. I was looking at the demo that you built, Nick. So there is MCO and there is this thing called workload placement, which is a CRD object, right?
Jon Li
Yeah.
Abdel Sghiouar
You deploy that, you say, I want this workload, these are my list of clusters in order of preference. And then it spits out a recommendation in its status field. And then you plug in a CD tool to take that recommendation and do something with it. Right?
Nick Eberts
Yeah.
Abdel Sghiouar
But MCO itself builds on top of two things, the Cluster Inventory API and the Cluster Profile API.
Nick Eberts
Correct.
Abdel Sghiouar
So what are those?
Nick Eberts
Yeah, so actually, just to be super clear, Cluster Inventory isn't an API as much as it's just a word to describe a number of cluster profiles. Cluster Profile is a CRD that we built upstream in SIG Multicluster, and it's essentially just a pointer to an actual cluster. The thing that we noticed is that a lot of providers were building their own cluster list, including us. We have fleets.
Abdel Sghiouar
Yeah.
Nick Eberts
Azure has fleets, OpenShift has their own version of things, and then a lot of multi-cluster tools were maintaining their own list. If you use Argo CD, the secrets on your central Argo CD server are essentially a list. If you're looking at MultiKueue, that's another service; it's a multi-cluster type of workload distributor.
Jon Li
Right.
Nick Eberts
That thing has its own list, and Multi-Cluster Orchestrator could have had its own list. But what we tried to do is normalize that list into an open source spec. Essentially, if you think about it, all of the clusters that are generated as cluster profiles in a namespace on a central hub cluster (and that's actually a term that we've decided to accept upstream, the hub cluster) represent a sameness boundary.
Abdel Sghiouar
Yep.
Nick Eberts
To some degree. And you can, you could consider them to be analogous to a fleet.
Abdel Sghiouar
Got it.
Nick Eberts
And that's what a cluster inventory is: it's a number of cluster profiles.
Abdel Sghiouar
Got it.
Nick Eberts
And, sorry, it's not just the cluster name. It's all the metadata that you want to decorate that cluster with.
Jon Li
It's almost like a capability list for a cluster: it's got GPUs in it, or that cluster has got certain networking. It's a way for you to express what this cluster can do.
Nick Eberts
Yeah, yeah.
Abdel Sghiouar
The word cluster inventory is pretty self-explanatory in my opinion, but it's still interesting to talk about it. So I think my follow-up question then would be: who generates the cluster profiles? By who, I don't mean the person; I mean which part of a system. Is it something that you would generate as a user manually, or would you write something to generate it, so it would query your cluster and then generate the profile and update it?
Nick Eberts
Yeah. I think the whole point of cluster profile is to remove the need for an end user to have to write a bunch of glue code to sync all of these disparate lists together. Before cluster profile, if you add a cluster, you'd have to add that cluster, you'd have to add the Argo CD secret, and you would have to add n number of other entries to a list somewhere that represents that cluster. So what I'm seeing, both with Microsoft certainly and us at Google, is we have a service that's generating the cluster profiles based on the cluster create. For example, if you're using GKE fleets, when you add a new cluster to the fleet, we are automatically going to create a cluster profile on your hub cluster, which you've identified with a label. Everything gets added. Every time you make a change to some metadata or a label, we're going to reflect that into the profile.
Abdel Sghiouar
Got it.
Jon Li
Yeah.
Abdel Sghiouar
And that's what MCO uses as an input to say these are my available clusters, and for each cluster these are the capabilities, as you said, Jon.
Nick Eberts
Yeah. Let's say, in terms of SQL, that's the select, right?
Abdel Sghiouar
Yeah.
Nick Eberts
And then there's a filter that you can apply to it, which is part of the spec of Multi-Cluster Orchestrator, in that placement.
Abdel Sghiouar
Yes, yes.
Nick Eberts
You could do regex or you can hard code the list, and then shortly we're going to provide a way for you to use label selection to decide which clusters. Because not every cluster in your fleet probably needs to be a target for MCO.
Abdel Sghiouar
Yeah, sure. You don't have to add all the clusters in there, in MCO, essentially. So then a follow-up question would be, the engineer in me is thinking one problem that could happen is if your cluster profile is not up to date. Would there be a situation in which MCO would make a recommendation that would be outdated in a way? Especially think about it as you have multiple clusters, and multiple people are using other tools to deploy these clusters. So how fast you can reflect the status of a cluster to MCO matters in this case.
Nick Eberts
Sure. I can only speak for how fast we could do it in GKE, and it's on the order of milliseconds. Pretty quick.
Abdel Sghiouar
Okay.
Nick Eberts
But that's up to. I think that's an implementation detail of the provider of cluster profile. That's not even an MCO thing. That's just like how quickly is your sync occurring between whatever that source of your cluster list is and the actual hub cluster profile.
Abdel Sghiouar
Got it, got it. Okay, cool. And so MCO is under SIG Multicluster, right?
Nick Eberts
Not yet.
Abdel Sghiouar
Not yet. Okay. That's the intent, that's what you're trying to do.
Nick Eberts
Yeah.
Abdel Sghiouar
All right. And when do you expect people will be able to like play around with this?
Nick Eberts
Yeah. So depending on the time of release, maybe today, but two weeks after Next, let's say, is our goal. The first thing you're going to see is the images. It's going to be first available, obviously, for GKE clusters, because this is the team that's building it. The images for the binaries will be public and available in the GitHub repo, and then you'll have a Terraform sample that shows you how to build it all out using our implementation.
Abdel Sghiouar
Got it. But.
Nick Eberts
And then shortly after that we're actually going to release the code into that very same repo. And then over the next six months I'm going to go through the process of working with SIG Multicluster to figure out how, where and when it gets pushed in.
Abdel Sghiouar
Got it. Yeah. And another question to you, Jon, on the inference side, because the demo that I saw that you built, the one that is internal for now, also leverages multi-cluster gateway for multi-cluster load balancing for inference. Is it based on the GKE Inference Gateway or just the regular gateway?
Nick Eberts
So that's based on a gateway class that we built for GCP; it's a multi-cluster, multi-region internal load balancer gateway class.
Abdel Sghiouar
Got it.
Nick Eberts
There's also some other work that we're doing upstream with the gateway spec, with respect to inference pools.
Jon Li
Yeah, there's a separate inference gateway.
Abdel Sghiouar
Yeah. So that was actually my actual question, and my follow-up would be: I did some reading about the inference extension, and I did actually give a talk at KubeCon last week, like a lightning talk, about it. Part of what this does is there is this thing called the endpoint picker, which kind of plugs the inference pools into the gateway itself to tell the gateway where to route traffic exactly.
Jon Li
Yes.
Abdel Sghiouar
So how do you see this plug into MCO? Is the inference extension going to be able to leverage MCO in a way?
Jon Li
Yes. Yes. So that's something we're still actively figuring out the details of. But I can give you some high-level ideas.
Abdel Sghiouar
Yeah.
Jon Li
So with MCO, you can think of the routing at that layer as region picking.
Abdel Sghiouar
Yes.
Jon Li
And then what EPP does is the endpoint picking.
Abdel Sghiouar
Yes.
Jon Li
So you can think of this in two layers. First, when the request goes in on the data path, you decide what region it's assigned to. And then once you get to the regional level, what EPP is going to give you is the capability, instead of doing round-robin load balancing, to use a custom metric, for instance KV cache utilization, to fully balance all your accelerators.
Abdel Sghiouar
Or queue size or something like that. Yeah.
Jon Li
So the latency that it works at is in the milliseconds. It needs to be fast because it needs to be able to route to an endpoint. That's the latency we're working with. And for region picking, what we wanted to do is send to regions where there's capacity, and the latency there doesn't actually need to be that low compared to the endpoint picking part. So think of this as a two-layer...
Abdel Sghiouar
Two-layer problem. Yeah.
Jon Li
So first pick the region, and we have ways to shape traffic to regions where there is capacity. And second, once a whole bunch of requests lands there, there are mechanisms to balance the load among all the accelerators. That's the endpoint picking.
Abdel Sghiouar
I think my question was slightly broader than that, in the sense that if you have a situation where you need to autoscale based on utilization, would MCO be able to be plugged in to make that workload placement, autoscaling recommendation? Or am I?
Nick Eberts
So MCO's job is to take it from zero to one and one to zero.
Abdel Sghiouar
Okay.
Nick Eberts
The HPA takes over once it gets to one.
Abdel Sghiouar
Got it, got it.
Nick Eberts
So you're going to configure the HPA on a metric that makes sense for that particular workload. In the case of inferencing, we're recommending KV cache, or maybe even queue depth of the LLM, or maybe...
Abdel Sghiouar
Metrics from the GPUs or something. Yeah, cool.
Nick Eberts
Awesome.
Jon Li
One thing I'll add though, on the multi-cluster gateway side, and I guess we're getting more into the details of GCP: there's something called preferred backend. That is for the customers to say, I want to shape my traffic to a particular region, for reasons like if people have bought a reservation in that region, they want to fill it up. So those are ways to steer traffic and shape traffic. And then once you've done that, MCO can autoscale that workload to that region. And also HPA can scale within the region: pods, and then nodes.
Abdel Sghiouar
Awesome, awesome. Thank you very much, folks. This is pretty cool. I'm looking forward to MCO coming out, and maybe one of you coming back on the podcast to tell us more about the new stuff.
Nick Eberts
My goal is to have this conversation with you again in six to seven months and talk about its sort of birth into SIG Multicluster and use by other companies besides Google.
Abdel Sghiouar
That's 100% awesome. Last time we had you on the show, that episode was very popular. So we're happy to have you back. And then maybe we can have Jon come back to talk about Szechuan peppers, because I've heard you guys gossiping about Szechuan peppers.
Nick Eberts
Do a whole episode on Copy.
Abdel Sghiouar
Well, probably not on the Kubernetes podcast, but...
Nick Eberts
All right, awesome.
Abdel Sghiouar
Thank you very much, folks.
Nick Eberts
All right, thanks.
Jon Li
Thank you.
Kaslin Fields
That brings us to the end of another episode. If you enjoyed the show, please help us spread the word and tell a friend. If you have any feedback for us, you can find us on social media at @KubernetesPod or reach us by email at kubernetespodcast@google.com. You can also check out the website at kubernetespodcast.com, where you'll find transcripts, show notes and links to subscribe. Please consider rating us in your podcast player so we can help more people find and enjoy the show. Thanks for listening and we'll see you next time.
Kubernetes Podcast from Google: Episode Summary
Title: Multi-Cluster Orchestrator, with Nick Eberts and Jon Li
Hosts: Abdel Sghiouar, Kaslin Fields
Release Date: May 28, 2025
In this episode of the Kubernetes Podcast from Google, hosts Kaslin Fields and Mofi Rahman introduce a conversation, recorded by Abdel Sghiouar at Google Cloud Next 2025, about managing workloads across multiple Kubernetes clusters. Abdel is joined by Nick Eberts, a Product Manager at Google, and Jon Li, a Software Engineer at Google, to discuss the newly announced open-source tool, Multi-Cluster Orchestrator (MCO). MCO aims to address the challenges of placing workloads, particularly those requiring expensive accelerated hardware like GPUs, across multiple clusters and regions.
Jon Li sets the stage by highlighting the evolving landscape of cloud computing:
“Kubernetes was built with the assumption of an infinite and uniform cloud.”
— Jon Li [03:56]
Originally, Kubernetes was designed under the premise that cloud resources—such as CPUs and memory—were abundant and consistent across regions. However, with the advent of specialized accelerators like GPUs and TPUs, this assumption no longer holds true. Challenges such as hardware stockouts and non-uniform availability across regions necessitate a more sophisticated approach to workload management.
Nick Eberts provides insight into the current Kubernetes cluster dynamics:
“Even though your cluster can be massive, there's still like the blast radius of the control plane on any particular cluster.”
— Nick Eberts [04:39]
While Google and other upstream Kubernetes contributors have scaled clusters to impressive sizes (e.g., 65k nodes), managing a single large cluster isn't always optimal. The risks associated with a single point of failure (“blast radius”) and the logistical complexities of handling diverse workloads across regions underscore the need for a balanced multi-cluster strategy.
MCO emerges as a solution to streamline multi-cluster management. Nick Eberts elaborates:
“The goal of the products that I build and what I'm trying to push upstream is this ability to think about how you want to bin pack applications together onto the same shapes of clusters.”
— Nick Eberts [05:22]
MCO facilitates the efficient distribution of workloads across multiple clusters by recommending a cluster placement based on operator-defined priorities (such as preferred regions and clusters) evaluated against actual capacity. MCO is not a Continuous Deployment (CD) tool: it only produces a recommendation, which a CD tool then acts on, so that resources like GPUs are used cost-effectively.
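As a rough illustration of the recommendation step described in the episode (an ordered set of preferences evaluated against actual capacity), the Python sketch below walks a preference list and returns the first cluster with enough free accelerators. Every name in it (`ClusterCandidate`, `free_accelerators`, `recommend_cluster`) is hypothetical and chosen for illustration; it is not MCO's API or data model, and the real capacity signal is meant to be pluggable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusterCandidate:
    """Hypothetical view of one cluster; not MCO's real data model."""
    name: str
    region: str
    free_accelerators: int  # assumed capacity signal, for illustration only


def recommend_cluster(
    preferred: list[str],
    candidates: dict[str, ClusterCandidate],
    needed: int,
) -> Optional[str]:
    """Return the first cluster, in preference order, with enough capacity."""
    for name in preferred:
        cluster = candidates.get(name)
        if cluster is not None and cluster.free_accelerators >= needed:
            return name
    return None  # no candidate fits; the caller decides what to do next


clusters = {
    "us-east": ClusterCandidate("us-east", "us-east1", 0),  # stocked out
    "us-west": ClusterCandidate("us-west", "us-west1", 4),
}
recommendation = recommend_cluster(["us-east", "us-west"], clusters, needed=2)
print(recommendation)
```

The function only returns a name. Acting on it (for example, syncing manifests to that cluster) is left to a CD tool, which mirrors the split of responsibilities described in the conversation.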
The conversation delves into the technical architecture of MCO, which builds on two upstream concepts, Cluster Inventory and Cluster Profile.
Nick Eberts explains:
“Cluster Profile is a CRD that we built upstream in SIG multicluster, and it's essentially just a pointer to an actual cluster.”
— Nick Eberts [11:15]
Cluster Inventory is not a separate API; it is the term for the set of Cluster Profiles kept in a namespace on a central hub cluster, which removes the need for each tool to maintain its own cluster list. Each Cluster Profile points to an actual cluster and carries metadata describing its capabilities (e.g., GPUs, networking configurations), enabling MCO to make informed placement recommendations.
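In the episode, Nick compares the inventory to a SQL "select" and says a placement can narrow it with a regex or a hard-coded list of clusters, with label selection planned. The sketch below shows one way such a filter could look in Python. The `ClusterProfileView` type, its fields, and the `select_clusters` function are assumptions made for this example and do not reflect the actual Cluster Profile CRD schema or MCO's placement spec.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClusterProfileView:
    """Hypothetical, simplified stand-in for a cluster profile."""
    name: str
    labels: dict[str, str] = field(default_factory=dict)


def select_clusters(
    inventory: list[ClusterProfileView],
    name_regex: Optional[str] = None,
    explicit_names: Optional[set[str]] = None,
) -> list[ClusterProfileView]:
    """Keep profiles that pass every filter that was supplied."""
    selected = []
    for profile in inventory:
        if explicit_names is not None and profile.name not in explicit_names:
            continue
        if name_regex is not None and re.fullmatch(name_regex, profile.name) is None:
            continue
        selected.append(profile)
    return selected


inventory = [
    ClusterProfileView("gpu-us-east", {"accelerator": "gpu"}),
    ClusterProfileView("gpu-us-west", {"accelerator": "gpu"}),
    ClusterProfileView("web-us-east"),
]
gpu_targets = [p.name for p in select_clusters(inventory, name_regex=r"gpu-.*")]
print(gpu_targets)
```

Filtering first keeps clusters that should never host the workload out of the candidate set, which matches the point made in the conversation that not every cluster in a fleet needs to be a target.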
Jon Li adds context-specific insights related to inference workloads:
“Prefill typically is very compute bound and decode is memory bandwidth bound.”
— Jon Li [10:19]
These workloads demand specialized handling due to their unique characteristics, such as prolonged processing times and substantial resource utilization.
MCO is designed to integrate seamlessly with popular CD tools. Nick Eberts states:
“The first implementation that you'll see is with Argo CD, because that's where most of our customers are right now.”
— Nick Eberts [08:03]
Currently, MCO supports integration with Argo CD, with plans to extend compatibility to other tools like Flux and Config Sync. This integration allows MCO to provide placement recommendations that CD tools can act upon, ensuring efficient deployment across clusters.
The discussion highlights how MCO specifically benefits inference workloads, which often require specialized hardware:
Jon Li explains the complexities of handling transformer-based inference:
“The latency here could be in the order of seconds as opposed to what we're used to in the microservice world.”
— Jon Li [10:19]
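MCO addresses the cost side of these challenges by taking a workload in a secondary region from zero to one when more capacity is needed and back to zero when it is not, so that expensive hardware like GPUs is paid for only when necessary. Once the workload is running in a region, the Horizontal Pod Autoscaler (HPA) takes over scaling on a workload-appropriate metric such as KV cache utilization or queue depth.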
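In the conversation, Nick describes MCO's job as taking a workload from zero to one in a region and from one back to zero, with the HPA taking over once it is running. The Python sketch below illustrates that division of labor with a simple spill-over rule: activate regions in preference order until one still has headroom. This rule, the function name `plan_regions`, and the `saturated` input (standing in for a user-supplied metric) are all assumptions for illustration, not MCO's actual logic.

```python
def plan_regions(
    preferred: list[str],
    active: set[str],
    saturated: set[str],
) -> tuple[list[str], list[str]]:
    """Return (regions to take 0 -> 1, regions to take 1 -> 0).

    preferred: regions in priority order.
    active:    regions where the workload currently runs.
    saturated: regions reported as out of capacity by some metric
               (hypothetical input; the metric itself is pluggable).
    """
    desired = []
    for region in preferred:
        desired.append(region)
        if region not in saturated:
            break  # this region still has headroom, so stop spilling over
    scale_up = [r for r in desired if r not in active]
    scale_down = [r for r in preferred if r in active and r not in desired]
    return scale_up, scale_down


# Primary region is saturated: bring up the next preferred region.
up, down = plan_regions(["us-east1", "us-west1", "europe-west4"],
                        active={"us-east1"}, saturated={"us-east1"})
print(up, down)

# Pressure is gone: the extra region can go back to zero.
up2, down2 = plan_regions(["us-east1", "us-west1", "europe-west4"],
                          active={"us-east1", "us-west1"}, saturated=set())
print(up2, down2)
```

Replica counts inside an active region are deliberately absent here. In the model described in the episode, that part belongs to the HPA, not to the placement decision.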
Nick Eberts outlines the roadmap for MCO:
“We're going to make the metric that determines whether or not there actually is a capacity issue in any region that's live, open and accessible so that you could bring your own sort of metric to evaluate the running workloads.”
— Nick Eberts [08:48]
MCO is set to be released as an open-source tool, with initial support targeting GKE clusters. The team plans to collaborate with SIG Multicluster to standardize and expand MCO’s capabilities, ensuring broad adoption across the Kubernetes ecosystem.
The episode concludes with enthusiasm for the upcoming release and future developments:
“My goal is to have this conversation with you again in six to seven months and talk about its sort of birth into sig multicluster and use by other companies besides Google.”
— Nick Eberts [20:24]
Hosts express anticipation for MCO’s impact on multi-cluster management and invite listeners to stay tuned for further updates and episodes exploring related advancements.
For those interested in exploring Multi-Cluster Orchestrator further, keep an eye on the GitHub repository and upcoming releases detailed in the episode.