
Interviewer
Fantastic. We have Darren and Cosimo from BP's Castrol Immersion Division. I need you guys to give me a little background or context of what that, what that looks like.
Darren
Absolutely. If you want to, go ahead, sure.
Cosimo
Yeah. So Castrol obviously wholly owned subsidiary of bp, well known worldwide lubricants company. And gosh, I guess close to three years ago now, I formulated what we call the thermal management business unit. And so thermal management started with full single phase immersion cooling three years ago. That was hugely adopted, over 500 megawatts in fact, in bitcoin mining, in fact, and was really coming on strong in data centers. And kind of took a step back to what's known as direct to chip or cold plate liquid cooling. And we've kind of embraced the whole thing. As a chemical company, we offer all those fluids, fluid management services. And so that's kind of where we're coming from.
Interviewer
And the immersion technology, where did that, how was that developed?
Cosimo
Immersion technology? Gosh, there have been different flavors of it around, coming from both Cray computing and also a bit from gaming.
Interviewer
In relation to AI data centers and scaling data centers, how does that apply?
Darren
As Dario said, liquid cooling in general has been around for what, 40 years, 50 years.
Cosimo
It's incredible.
Darren
Since the dawn of actually the mainframes. So it's not a new technology. And even immersion goes back. The difference here is the adoption at scale, as you were mentioning, with AI that generates so much heat that air just cannot do it.
Interviewer
So.
Cosimo
Exactly. Air is no longer adequate. So in the case of AI, you've got to embrace liquid cooling.
Interviewer
So where are we in the adoption of this?
Cosimo
You know, we were just talking about the major hyperscalers. The names that affect our lives every day have been deploying this probably four years now.
Interviewer
Okay.
Cosimo
Yeah.
Interviewer
So most of their.
Darren
So there's actually much more than people know. But the companies that are doing these don't like to broadcast what they are doing. And we cannot talk about our customers yet. We have NDAs with everybody.
Interviewer
There are people in Italy that would find us.
Darren
There are people in Italy that have baseball bats and have knees. So I like to keep. Which is one of the challenges of this industry, because we were actually talking about that on the stage a few minutes ago. The market is made of very large companies that we all know, and they all think that they are unique and special and smarter than anybody else. I don't want to say that they are all doing the same thing, because there's a difference here and there. But I mean, the technology is basic technology.
Interviewer
Let me ask this way then, stepping back a layer to the enterprises, what are the design considerations they have to consider?
Cosimo
Wow, it's really interesting. So the. You're talking about a plate on top of a chip stack and the heat transfer right there is. That's where all the action is.
Interviewer
Yeah.
Cosimo
And so, yeah. What is this plate made of? How thin is the metal? What kind of grease or metallic paste do you put in between to help things out? How fast can you run the liquid inside? How small are the channels? Bigger the channels? How are you pulling the heat out back there after you pull it out at the little spot? There's a lot going on.
Interviewer
What viscosity are we running out here?
Cosimo
It's essentially water.
Darren
It's essentially water viscosity. Viscosity is not a big challenge.
Cosimo
Yeah, yeah. The pump would blow up if viscosity were the challenge.
Interviewer
Okay, so what are the biggest technology challenges for you and what are the biggest for your company then for the companies you serve?
Darren
I would say the biggest technology challenge is, well, there's a few of them. As Darien said, the most common liquid cooling technology is the cold plate. Now with the cold plate, you cool down mainly the processor. It could be a CPU, GPU, it doesn't matter. But there's a large component of air cooling still involved. Now you can actually put cold plates everywhere and remove 98, 99% of the heat with cold plates. But then maintenance of the server becomes complicated, because everything is covered by cold plates, and if you have to change a memory, you have to. It's complicated. So that is the first challenge. The second challenge, I would say, is that it's still kind of a new technology. It has been around for a while, but not at scale. The adoption is at a very early stage. You ask who is actually adopting this? There's not a 1 million server liquid cool situation yet. It's coming in weeks, not even in months. It is happening right now because it was nice to have analysis. That said, now it's must have.
Interviewer
Give me an idea of how much we're talking about in terms of efficiency or cost savings from moving to liquid cooling.
Cosimo
So immersion cooling is more than 40%. So we're moving back over to put the entire server down inside the dielectric.
Interviewer
Yeah.
Cosimo
Okay. You can put it in any room you want.
Interviewer
Yeah.
Cosimo
There's no more air conditioning really needed other than just to keep a basic comfort level. So all of that specialty air handling, everything you've got going on is all gone now.
Interviewer
Yeah.
Cosimo
So that's a massive savings. When you come again to the cold plate direct-to-chip, you've still got to cool the rest of the server. So the room, you know, you've still got the air conditioning going on.
Interviewer
Does that a little have to do with a little thermal management?
Cosimo
Well, just, you know, you still got the memory parts of the server and everything else that are generating heat, but those don't have the plates touching them. So you know, the whole server still got to stay relatively cool. Probably a bit cooler than this room, you know, just to keep everything going. So it depends which way you go, how much energy you save. But again now it's flip flopping, particularly in the United States where energy costs are not that big of a deal that you just simply cannot cool an AI chip without the liquid cooling. So it's a matter of necessity, regardless if you're saving water or power or anything or not.
Interviewer
Who's your customer?
Darren
Well, as we said, we have NDAs with everybody, but I would say our main customers are the end users. The hyperscalers, for instance.
Interviewer
Yeah. So the top 10, 20 kind of companies in that space.
Darren
Well, no comment.
Interviewer
Okay.
Cosimo
Beautiful.
Darren
No.
Interviewer
Who has the decision-making authority at the company? Is this a CFO decision? CTO, head of the data center? Is it engineering?
Cosimo
It really steps its way through because, you know, this is another important point that the big guys have been through and everybody else is coming to: the reality is the complications you can have in the chemistry, even as simple as this. PG25 is propylene glycol, a commodity, plus water. Clearly there's plenty of that around, and a bit of additives thrown in for biofouling and for anti-corrosion. Problem is, that bit of additives is where the magic is. And that bit of additives is different for just about everybody who makes it. And you throw two pieces of magic together, you end up with magic that you didn't want sometimes. So you start to deploy this at scale.
Interviewer
That's the best way of putting it.
Darren
This basically applies for everything, by the way.
Cosimo
Yeah, I mean you start to deploy that at scale and three years from now you're like, we got all sorts of magic here. We weren't thinking about. We should have been checking this every six months. So that's a big part of the sort of overall management of the thing and that it's a bit more complicated than just throwing a commodity fluid in.
Interviewer
And so on the buying process side, you're doing a lot of tests with the companies.
Cosimo
Yeah. See, I completely forgot your question.
Interviewer
Yeah.
Cosimo
The point is it starts with the engineers, because they've got to deal with the magic, and it goes all the way through to procurement, because they've got to buy millions of gallons. And it's a stepwise process along the way.
Interviewer
That'll take me a year or two out. What are we going to see, just in terms of painting the roadmap and innovation?
Cosimo
I think we particularly, and we can't be the only ones, you know, have some innovations going on with additives where it's going to increase basically the heat capacity of liquid. So its ability to take more heat away without, you can't add, you know, extreme expense. There's a lot of things going on with, again, as I mentioned, there's greases right now that you could put in between the chipset and the cold plate. There are metallic pastes. So there's some balance in there, because the greases pump out as things expand and contract, because they're getting hotter and colder. And how do you add that back in, in a running server? Metallic pastes don't pump out, but they're not as good. So how do you keep getting more heat out with just sort of this basic physics? Because there's only so much you can change about the physics. There's all sorts of wild ideas, but wild ideas don't make it into production.
Interviewer
Right.
Darren
And then I would say the other aspect that we have to consider is that air cooling is not going anywhere. We focus on the tip of the pyramid, the super high specialized GPUs for AI, but the vast majority of data centers don't do that. So that's the shiny new object and it's where the money is going. But your pictures of your pets and your kids, they are not AI enabled. I mean, they don't demand super high. So the complication of the future is that it's going to be a combination of all the technologies all together. Five years ago you had 99.9% air cooled. In five years, you're going to have a mix of different technologies in the same room. And you need to be ready to handle all of them, to maintain all of them, to fix all of them. And therefore, as we've said, you need to have a list of partners that you work with that gives you the confidence that you can do that at scale globally. Because we are talking about the US, we are talking about Indonesia, Malaysia, we are talking about New Zealand. I mean, we're talking about night shifts, we're talking about thousands of people that do thousands of things.
Interviewer
That's fascinating.
Darren
It is gonna. Accident is gonna happen. I mean, it's just.
Interviewer
It has to fight for energy purposes and I mean.
Cosimo
Yeah.
Interviewer
What's your biggest concern?
Cosimo
People ignoring the potential. Just like I said, the potential. When you make. When you make. When you make new magic.
Interviewer
Yeah.
Cosimo
I mean, because obviously if you come with a clog, then suddenly you've got pumps that don't run and you've got some real problems. But you know, it's kind of like the person who gets less healthy over time. The heart attack doesn't come until way later. There's plenty of signs along the way, but if you ignore them all, you end up with a heart attack eventually. Well, the whole system was running poorly a long time ago. And that's the same thing with these servers. The heat, you're taking less of it away. Everything's running, you know, worse and worse and worse. And just saying, ah, you can mix this stuff, it doesn't matter. There's nothing going on there. It's just a little chemical in water.
Interviewer
This isn't our father's BP or Castrol anymore, is it?
Darren
It's not, but actually my biggest concern is that it's a different declination of what he said, actually. And my biggest concern is what we were talking about before, that we need to share experience and information. We are not doing things so differently, one to another. And the moment that we start actually sharing experiences and mistakes, the faster we solve those and we predict them. Yeah, this is my biggest concern, that everybody's keeping their cards very close to their chest.
Interviewer
Concern and opportunity. So good luck. I think you have the eye of the tiger and the chance to go make it happen. I like the race you're running now.
Cosimo
Thank you.
Darren
Thank you very much.
Interviewer
Thanks for coming by. Really appreciate it.
Darren
Thank you.
Cosimo
Bye Bye.
Podcast Summary: AI is Overheating Data Centers | The Liftoff with Keith
In the episode titled "AI is Overheating Data Centers," host Keith Newman engages in an insightful conversation with Darren and Cosimo from BP's Castrol Immersion Division. The discussion centers around the pressing challenges of thermal management in data centers, especially in the context of the burgeoning demands of artificial intelligence (AI).
Keith opens the discussion by introducing the guests, who provide background on Castrol's Immersion Division. Cosimo elaborates on the evolution of their thermal management business unit:
Cosimo [00:19]: "Castrol is a wholly owned subsidiary of BP, a well-known worldwide lubricants company. Nearly three years ago, I formulated what we call the thermal management business unit."
He highlights the shift from single-phase immersion cooling to direct-to-chip or cold plate liquid cooling, emphasizing the company's comprehensive approach to fluid management services.
The conversation delves into the historical context and modern adoption of liquid cooling technologies. Darren underscores the long-standing presence of liquid cooling:
Darren [01:33]: "Liquid cooling in general has been around for what, 40 years, 50 years. Since the dawn of actually the mainframes. So it's not a new technology."
However, Cosimo points out that the current wave of AI-driven data centers requires a scale of liquid cooling adoption previously unseen:
Cosimo [01:56]: "Air is no longer adequate. So in the case of AI, you've got to embrace liquid cooling."
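The guests do not put numbers on why air falls short, but a rough back-of-the-envelope sketch can illustrate the point. The snippet below uses the standard heat-transport relation Q = m_dot * cp * dT to compare how much air versus water would have to flow to carry away the same heat. The fluid properties are approximate room-temperature textbook values, and the 1 kW load and 10 K coolant temperature rise are hypothetical figures chosen only for illustration; none of these numbers come from the episode, and a real coolant such as a glycol-water mix will differ from plain water.

```python
# Approximate room-temperature properties (textbook values, not from the episode).
WATER = {"density_kg_m3": 997.0, "cp_j_kg_k": 4186.0}
AIR = {"density_kg_m3": 1.2, "cp_j_kg_k": 1005.0}


def volumetric_flow_m3_s(heat_w, delta_t_k, fluid):
    """Volume flow needed to carry heat_w with a coolant temperature rise of delta_t_k.

    Uses Q = m_dot * cp * dT, then converts mass flow to volume flow via density.
    """
    mass_flow_kg_s = heat_w / (fluid["cp_j_kg_k"] * delta_t_k)
    return mass_flow_kg_s / fluid["density_kg_m3"]


heat_w = 1000.0    # hypothetical heat load for one device, in watts
delta_t_k = 10.0   # hypothetical allowed coolant temperature rise, in kelvin

air_flow = volumetric_flow_m3_s(heat_w, delta_t_k, AIR)
water_flow = volumetric_flow_m3_s(heat_w, delta_t_k, WATER)
ratio = air_flow / water_flow

print(f"air:   {air_flow:.4f} m^3/s")
print(f"water: {water_flow * 1000:.4f} L/s")
print(f"air needs roughly {ratio:.0f}x the volume flow of water")
```

Under these assumptions the ratio depends only on the two fluids' volumetric heat capacities, so it stays the same whatever load or temperature rise is plugged in.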
Keith probes into the design considerations for data centers adopting liquid cooling. Cosimo responds by detailing the complexities involved:
Cosimo [03:12]: "What is this plate made of? How thin is the metal? What kind of grease or metallic paste do you put in between to help things out?"
The discussion highlights the intricate balance between materials, heat transfer efficiency, and maintenance challenges that enterprises must navigate.
Darren identifies two primary challenges in the widespread adoption of liquid cooling:
Maintenance Complexity:
Darren [04:04]: "With the cold plate, you cool down mainly the processor... the maintenance of the server becomes complicated because everything is covered by cold plates."
Early-Stage Adoption:
Darren [04:04]: "It's still kind of a new technology... not a 1 million server liquid cool situation yet. It's happening right now because it was nice to have analysis. That said, now it's must-have."
Additionally, Cosimo discusses the complexities introduced by chemical additives in cooling fluids:
Cosimo [07:17]: "There's a bit of additives thrown in for biofouling and for anti-corrosion... it's a bit more complicated than just throwing a commodity fluid in."
The conversation shifts to the tangible benefits of liquid cooling. Cosimo provides impressive figures regarding efficiency improvements:
Cosimo [05:35]: "Immersion cooling is more than 40%."
He explains that full server immersion drastically reduces the need for specialized air conditioning, leading to significant energy and cost savings.
Keith inquires about the customer base and the decision-making process within enterprises. Darren says their main customers are end users such as the hyperscalers, though non-disclosure agreements (NDAs) prevent them from naming any:
Darren [06:46]: "Our main customers are the end users, the hyperscalers... we cannot talk about our customers yet."
Cosimo adds that the purchasing process is incremental, starting with engineers and moving through to procurement as scalability increases.
Looking ahead, Cosimo anticipates innovations in fluid additives that enhance heat capacity without exorbitant costs:
Cosimo [08:47]: "We... have some innovations going on with additives where it's going to increase basically the heat capacity of liquid."
Darren also remarks on the persistent role of air cooling for non-AI workloads, predicting a hybrid future where multiple cooling technologies coexist within the same data centers.
The guests express concerns about the industry's reluctance to share experiences and information, which could hinder problem-solving and innovation:
Darren [12:07]: "We need to share experience and information. We are not doing things so differently, one to another... everybody's keeping their cards very close to their chest."
Conversely, they highlight significant opportunities in improving thermal management systems to support the relentless growth of AI technologies.
The episode concludes with Keith acknowledging the critical race to enhance data center cooling solutions amidst AI's explosive growth. Darren emphasizes the importance of collaboration and transparency in overcoming industry challenges:
Darren [12:40]: "The moment that we start actually sharing experiences and mistakes, the faster we solve those and we predict them."
Keith commends the guests for their pioneering efforts and the vital role Castrol's Immersion Division plays in shaping the future of data center infrastructure.
Notable Quotes:
Cosimo [03:12]: "What is this plate made of? How thin is the metal? What kind of grease or metallic paste do you put in between to help things out?"
Darren [04:04]: "It's still kind of a new technology... not a 1 million server liquid cool situation yet. It's happening right now because it was nice to have analysis. That said, now it's must-have."
Cosimo [05:35]: "Immersion cooling is more than 40%."
Darren [12:07]: "We need to share experience and information. We are not doing things so differently, one to another... everybody's keeping their cards very close to their chest."
This episode of "Liftoff with Keith Newman" offers a comprehensive exploration of the evolving landscape of thermal management in AI-driven data centers. With expert insights from industry leaders Darren and Cosimo, listeners gain a deep understanding of the technological advancements, challenges, and strategic considerations shaping the future of data center infrastructure.