A
Latitude Media, covering the new frontiers of the energy transition.
B
I'm Shayle Kann. I lead the early stage venture strategy at Energy Impact Partners. Welcome to Catalyst. So my friend Varun Sivaram came on this podcast back in August 2025, or roughly a century ago in AI terms. At that time we talked about his mission at Emerald AI to make data centers flexible, specifically at that time by shifting AI workloads to deliver compute flexibility in response to grid signals. It was then, and remains now, a somewhat controversial concept, largely, I think, because the history of data centers, dating back to the emergence of the cloud industry, always suggested that they are perhaps the most inflexible load on the planet. Not only would they generally pass on participation in demand response programs, but they actually needed N+2 reliability just to ensure that the spice would always continue to flow. But the world has changed in a bunch of ways since then. The strain that data centers are putting on the grid has become clearer and more present than ever. Pressure on electric power rates is high, and so affordability is top of mind across the board. And data center flexibility, either through compute orchestration or through behind-the-meter resources, has started to become a mainstream concept. I actually went on my own journey on this concept and spent a lot of time on it over the last year, and here's where I came out. First principles: data centers should be flexible assets. They just should. They have many different types of workloads with different degrees of urgency, and it's crazy to think that they couldn't differentiate. And in fact, some of them are now starting to differentiate. But the actual mechanics of getting them to do so, and getting all the players aligned, is really tricky. Anyway, long story short, I invested in Emerald. We announced just a couple weeks ago that we at EIP led a $25 million round in Varun's company.
So I brought him back on today for an update really on the extremely dynamic world of data center flexibility that's coming up next.
A
Catalyst is supported by Fishtank PR, an award-winning PR firm focused on climate and energy tech, renewables and sustainability. Fishtank is known for generating prominent and effective media coverage for the brands they work with. If you want a PR partner that's thoughtful, shoots straight and gets results, you'll like Fishtank PR. To learn more about Fishtank's approach, visit fishtankpr.com. That's F-I-S-H-T-A-N-K, fishtankpr.com.
C
When utilities need flexible capacity they can count on, they turn to EnergyHub. EnergyHub works with more than 170 utilities, coordinating over 2.5 million devices to manage 3.4 gigawatts of flexibility. Built for the moments when utilities can't afford uncertainty, EnergyHub builds and operates virtual power plants that utilities actually stake their grid planning on, coordinating EVs, batteries, thermostats and more through a single platform built for utility scale: predictive, verifiable and designed to perform when it counts. Learn more at energyhub.com.
B
Varun, welcome back.
D
Shayle, thank you for having me back.
B
It's nice to be on the other side, talking as partners in crime in your business. So you were on this podcast in August of last year, which, depending on how you look at it, is either a very long time or a very short time ago. I want to start with just your high-level perspective on what has happened in the market. And by the market in this context I mean data centers and the grid, and then specifically also your market, which is the provisioning of flexibility for data centers on the grid. So just over that last, whatever that is, nine, ten months, what do you see as having happened?
D
I think, Shayle, it is a very long time given the dynamics of data centers and the grid. But actually, before I answer that question, let me first say how delighted I am to get to work with you as a partner in crime. Ten years ago today was when you and I coauthored a Nature Energy article. We wrote this article about setting a new cost target for solar power. Really ambitious: $0.25 per watt, fully installed. And I think the latest prices in China show that the cost is roughly double that. So we're almost there.
B
More to come on that, because I have been doing some things that are still trying to hit that target. As you said, even in China we have not hit that target yet. I still think it's possible. I did not realize it was 10 years ago to the day, but obviously it's been a fun journey. So good to be on the same team here. Data centers and the grid.
D
Absolutely. So let me get back to that overview. It's been a long 10 months since I was last on the pod, and here's what's changed. Data centers are even more of an energy issue than we thought was the case back in 2025. The latest numbers show that in January of this year NERC forecast a summer peak increase of 224 gigawatts, almost all of which will come from data centers. Data centers now account for 94% of PJM's projected peak load growth. And by 2030, EPRI forecasts that data centers could use up to 17% of America's power. All of these are incredible, insane statistics, and they're reshaping the landscape of energy as we know it. And you've obviously talked about this a lot on your other podcast sessions, but I also think this particular topic is important to keep talking about. The last time we talked about it, the context was different than it is today. 224 gigawatts of peak load growth is between a quarter and a third of peak demand. That's a massive increase that data centers are going to be driving. And if we try to build our way out of this, as I told you on the last podcast, we risk higher rates and slower AI growth. And that's good for nobody.
B
I would add, you said higher rates. Yes, it is true that the data center build-out has just continued to accelerate over the past nine or ten months. Nothing has stopped that upward trajectory, and maybe it has gone parabolic; it's hard to tell exactly because we're in the middle of it. The one thing, though, that I think is much more front of mind today than it was in August of last year, and you alluded to it, is affordability. That has come front and center now in a way that it wasn't quite nine or ten months ago, when it was just starting to be. Now every conversation is about affordability.
D
Oh, absolutely. And look, we should be clear that historically the drivers of rate increases may very well not have been data centers. Data centers may have been conflated in the data. And some data shows that in areas where data centers grew more quickly, rates actually increased more slowly. But I think it's incontrovertible that going forward, if data centers do drive most peak load growth and peak load drives most rate increases, data centers, absent some mitigations to help with this exorbitant grid build out, data centers could very well drive affordability difficulties. And that's why I think that there's a risk that they become a problem. But there's also a real opportunity for data centers to become the true hero of solving this affordability challenge.
B
Right. And there's multiple ways to do that, and lots of them are starting to emerge. The unique tariffs that data center operators or developers are starting to sign with utilities I think are interesting, and there's a bunch of different structures of those. Let's narrow in though on the portion of that that is most relevant to you, which is making data centers flexible assets. So on that front, over the past nine, 10 months, what has changed?
D
A lot and a little has changed, I'd say. Let me put it this way. There is a deep divide that I observe between the level of flexibility, the service levels and tiers, that the compute industry offers and the level of flexibility in the service tiers that the utility and grid operator ecosystem offers in terms of electric service. And that schism is one of the reasons this is such a hard problem to solve. Even though I'm super excited and want to talk to you about the five commercial demonstrations we've done at Emerald AI since you and I last talked, it's still a fundamental challenge to make sure that we can take advantage of all of what I call the stranded power on the grid. So to back up, Shayle, for those viewers who didn't hear the last episode: data center flexibility, why is it important? Data center flexibility is important because if these new AI factories, as Jensen Huang at Nvidia calls them, can be flexible just a little bit of the time, they can utilize this vast amount of unused capacity on the grid. The grid is utilized at roughly 50% or less during most of the year, and therefore we have 100-plus gigawatts of stranded power capacity that could power AI factories connecting to the grid today, if during those rare peak moments AI factories could ramp down partially for a limited amount of time. That's why flexibility is so important. Now, I mentioned the schism because just this month we've seen a lot of movement on the AI and compute side in terms of flexibility tiers. Just earlier this week, Google announced its Flex and Priority inference tiers. So if you're a developer and you are buying tokens of artificial intelligence, you're running models, you're being served inference, you can choose to have those delivered absolutely immediately, or you can choose to have those delivered after a delay, and you'll pay different prices. But more importantly, it's not just the price component, it is the service component.
The service literally changes between those tiers. And it's not just Google that does this. Anthropic has a peak period for serving tokens, where you tend to hit your rate limits earlier, and a non-peak period. I know all of this because we have one of these token maxers, Nikhil, on our Emerald AI team, and he tells me every time he hits a rate limit. So given that there are literally different service levels, I believe it's going to be possible relatively quickly to take advantage of all this flexibility in the different ways people use compute to throttle their power consumption. The question, however, is on the other side, the utility side. It is still a work in progress for a utility to offer a different service level. Today you typically get one service level: you get power. There isn't a service level that says, for most of the year you get power, but some of the year we're going to ask you to be flexible.
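The two-tier idea Varun describes can be sketched in a few lines. This is a hypothetical model, not Google's or Anthropic's actual API; the tier names and the capacity flag are invented for illustration. Priority requests are served immediately, while flex requests queue until capacity (or grid headroom) frees up.

```python
from collections import deque

# Hypothetical sketch of flex vs. priority inference tiers.
# Tier names and the capacity model are invented; real offerings differ.
flex_queue: deque[str] = deque()

def submit(request: str, tier: str, capacity_free: bool) -> str:
    """Serve priority requests now; defer flex requests when capacity is tight."""
    if tier == "priority" or capacity_free:
        return f"served now: {request}"
    flex_queue.append(request)  # runs later, e.g. off-peak or when the grid relaxes
    return f"queued for later: {request}"

print(submit("summarize report", "flex", capacity_free=False))
print(submit("live chat reply", "priority", capacity_free=False))
```

The point of the structure: a flex-tier customer trades latency for a lower price, and the provider gains a pool of deferrable load it can throttle when the grid asks.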
B
There is to some extent, which is demand response, right? That's the legacy version of what we're talking about: you enroll in a program. This is less so for data centers historically, but for large loads, you enroll in a program, and as part of that enrollment, we are going to send you a signal, or we're going to call you, as it's been historically, a couple of times a year at peak hours, and we're going to ask you to ramp down. This is kind of an extension of demand response, right? What do you think of as being the same and different?
D
I completely agree. First of all, there are natural pricing tiers. If you're in ERCOT in Texas, you can pay more money for power right now, or you can ramp down and pay less money because power prices are high. Then there are demand response programs. As you mentioned, a utility might say you can make some money if you agree to curtail during this period. You might own a smart thermostat and be enrolled in a smart thermostat program, and they will pay you if you are willing to reduce your consumption. There are even mandatory programs where a prerequisite of your enrollment is your promise to curtail and a hefty penalty if you do not curtail. But what's missing from all of these is the ability to offer this service of curtailment at the large scale that data centers could theoretically provide it, and get a real benefit out of it. And that benefit is not just a cheaper cost of power or a flexibility payment. That benefit is a larger power connection, faster access to the power grid. Whether you're an existing data center seeking to increase your capacity or a brand new data center seeking to connect, you should be allowed to harvest the existing stranded capacity on the grid that can be served to you if you are willing to curtail every so often. And that product doesn't exist today, and with good reason. The electric utility industry for over a century has wanted to promise firm service to its customers. And so there really isn't this non-firm service tier that allows you to skip the line. And there are all kinds of legal considerations that come into play, but that's absolutely where we have to go.
B
You know, the interesting thing about that: my guess is that if you asked most folks who are in and around this market what the limiter is on data centers becoming flexible grid assets, particularly leveraging flexibility in compute, as opposed to leveraging on-site behind-the-meter resources, which we will talk about as well, most people would say the main limiter is actually on the compute side. It's because in the legacy cloud world, and then in AI as it has emerged since, there was an expectation of extraordinarily high reliability and low latency, and so your expectation of your workload getting curtailed is very low. That would have been the limiter. It's interesting that you're saying it's actually on the other side, because on the compute side it seems to be emerging. I saw that Gemini announcement as well, and it's cool to see Google doing that. And that makes the new limiter the other side of the equation, which again, as you said, is that the problem is not offering differential pricing or savings based on curtailment. It is actually saying, look, if you agree to curtail a certain amount for certain times, we will interconnect you faster or we will give you a larger interconnect. That's the thing that's missing.
D
Yeah, exactly. Let me first preface by saying it's not the case that faster and larger connections in exchange for flexibility aren't happening anywhere in the world. In fact, Google has now reached a gigawatt of contracted flexible capacity across, I think, five different utility territories, at least some of whom are willing to provide some of these benefits. So Google's really been a pioneer. Across the more than 3,000 American utilities, we have a long way to go to bring these differential service tiers onto the market. But still, we're making good progress. The point you made, Shayle, I think is a really important one: everybody sort of discounts data center flexibility because they're thinking about the compute side. They say these AI GPUs, or accelerators, are extremely valuable and produce wildly valuable tokens of artificial intelligence. If you watched Jensen's keynote at Nvidia GTC, you saw some of the ridiculous economics of operating token factories, AI factories. It's a great idea to maximize the tokens you are generating per watt of power. And so the intuitive response is: it is a terrible idea to ever curtail any GPU, because the economics just absolutely don't make sense. That, by the way, is one of the reasons why I feel fortunate that basically no one else is doing what Emerald does, because that in itself is an intuitive blocker to founding a company like this. But I believe it's actually on the other side, as you said, Shayle. I believe that if electric power utilities and grid operators offered a range of different service tiers, just like the cloud operators on the compute side offer a range of different service tiers to their end AI customers, innovation would solve this problem. And you would absolutely have data centers taking advantage of the lower service tiers. And they're not that low, by the way.
It's just, call it, 50 or 100 or 200 hours a year that you'd have to curtail. Data centers would very happily take utilities up on these lower service tiers in order to accelerate their connection or get a larger connection.
B
Yeah, I think there's another way to put it, which is that it is true economically that it's probably a dumb idea to curtail, to not maximize token generation, if the benefit of doing so is purely a cost savings on your electricity bill. So if it is traditional demand response or something like that, and the benefit you get is just that you save some money on your bill, those numbers don't pencil, largely because, as Brian Janous put it when he coined the term, the bit-watt spread is so big. It's just not that valuable to save some money on your electricity bill relative to the revenue that you're going to generate with your chips. So that is true, generally. You could disagree with me if you want. But the economics of getting a data center connected larger or faster are orders of magnitude different. And so if that is a benefit you can get, it actually does flip those economics.
D
So I completely agree with your second point, and I half agree and half disagree with the first point. On the second point, completely agree: if you've got a 200 megawatt data center and you are able to increase its capacity to 230 megawatts, or get there a year ahead of schedule, and you can swap in liquid cooling and next-generation Nvidia GPUs, you create billions of dollars of value, even netting out the cost of some downtime from curtailing during those rare peak load hours. You should absolutely take the deal on a new electric utility service tier. The earlier point you made, though, is: is there ever an economic incentive that makes it worthwhile, on the fly, to reduce your operational expense, your OpEx, by reducing your power cost or getting a flexibility payment to curtail a little bit? Even if the answer isn't yes in all cases today, and I think there are some cases where it is yes, I think the answer will increasingly become yes in the future, as the cost of inference, the cost of token generation, asymptotically approaches the cost of power, which is the only real operational expense input into the cost of intelligence generation. And even today there are efficiencies we can harvest on a temporary basis that mean I can reduce peak power by a larger amount than the token generation I'm giving up. This is getting a little technical, but by operating a little differently on the power-performance curve, on a part of it where I'm not losing as much performance but I am reducing quite a bit of power, you get a pretty good trade-off on a temporary basis, for example for some inference workloads. Microsoft has done a great job quantifying this. So there is some low-hanging fruit to harvest here, which means I believe even today it makes sense to be responsive, and in the medium and longer run it's going to make a ton of sense to be responsive just in terms of reducing your power bill.
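Varun's power-performance-curve point can be made concrete with a toy example. All numbers below are hypothetical and do not reflect any real GPU: near the top of the curve, capping power costs relatively little throughput, so a data center can shed a lot of watts while giving up few tokens.

```python
# Illustrative sketch (all numbers hypothetical) of choosing a GPU operating
# point on a power-performance curve. Near full power, power falls faster
# than throughput, so a temporary power cap costs relatively few tokens.

# (relative power, relative token throughput) at a few made-up clock settings
OPERATING_POINTS = [
    (1.00, 1.00),  # full power, full token throughput
    (0.80, 0.95),  # 20% less power for only 5% fewer tokens
    (0.60, 0.82),
    (0.40, 0.55),
]

def best_point_under_cap(power_cap: float):
    """Return the highest-throughput point whose power fits under the cap."""
    feasible = [(p, t) for p, t in OPERATING_POINTS if p <= power_cap]
    return max(feasible, key=lambda pt: pt[1])

power, throughput = best_point_under_cap(0.85)
print(f"power={power:.2f}, throughput={throughput:.2f}")
# Under a 15% power cap, this toy curve keeps 95% of token throughput.
```

The shape of the curve is the whole argument: if the first chunk of power reduction is nearly free in tokens, responding to rare grid peaks is cheap.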
B
It's a good point, particularly about over time. We're in a moment right now that is not reflective of where we're going to be in five years or 10 years, or who knows how many years, two years maybe. But inference should, and probably will, get cheaper and cheaper; the market will be saturated with it. At that point, the cost of energy does matter. So that is a good point. You alluded to one thing I wanted to ask you about. As you think about workload flexibility, talk to me about the different workloads and what is more and less suited. Obviously there's the training versus inference distinction, and I'm interested in your perspective on that. But even within a category, even within inference, for example, what have you learned, having run all these demos now, about which types of workloads have enough volume, are a big enough portion of overall workloads to matter, and have the most flexibility?
D
Yeah, absolutely. There are many different AI workloads; it's more than just a simple dichotomy of training and inference. Each of these categories has lots and lots of different subtypes. A good reference: there's a March 2026 paper published by Emerald's chief scientist, Professor Ayse Coskun at Boston University, with two others, both at BU and at Emerald AI, that shows an 18 to 55% power flexibility opportunity across a range of representative AI workloads spanning training, inference, fine-tuning and all of their subtypes. So there's a lot of inherent flexibility, is my overarching point. And then I can go into the various kinds, as you suggested. As you mentioned, we've now done these five demonstrations at data centers with Nvidia, with EPRI's DCFlex initiative, and with many other partners from Oracle to Nebius to National Grid. In each of these, we've tried to run real production-grade workloads. For example, in London, we ran workloads from real models, whether OpenAI's models, Meta's open source model, even an Alibaba model from China, to showcase that we could achieve performance levels that real customers find acceptable, making sure not to throttle workloads that the customer labels as mission critical or that should not be throttled, while at the same time precisely meeting grid objectives, whether that is responding within seconds to a scenario of a lightning strike or reducing power by 30 or 40% during the halftime of a soccer game, when in England everybody turns on their tea kettles. So we believe that many AI workloads can be throttled in a way that's acceptable to customers. Many of these will be fine-tuning, post-training, training-type workloads. There are other workloads, and by the way some inference workloads as well, like batch inference, where even if they can't be throttled, they might be batched differently.
So again, you're still basically reducing the power consumption in one data center, or workloads can be migrated. With Oracle, we showcased migrating AI workloads from one location, Virginia, to another, Chicago. This was during the Dominion winter peak period. The inference queries got rerouted in such a way that you were able to precisely meet the Dominion grid's power constraint while utilizing capacity far away, with the queries moving within milliseconds halfway across the country. So the user experience really wasn't changed. If you're chatting with a chatbot, which by the way is just one of a million different AI use cases, that experience isn't going to change very much if this geo-shifting is happening under the surface. So my only point here is there are so many different use cases. Google, when they announced their Flex and Priority inference tiers, cited cases like background CRM updates, large-scale research simulations, and agentic workflows where a model is browsing or thinking in the background as workloads that are inherently flexible, where you probably don't need an answer right this second. And of course that helps Google to better optimize its own AI infrastructure: you might be able to use your AI servers for queries that are more urgent. But it also helps us to throttle power use by tapping into the inherent flexibility of computational workloads. The last thing I'd close with on this: historically, data centers have been optimized as a closed system. There are computers, CPUs, GPUs, memory, storage; there are fiber optic networks; there are multiple data centers across the country; and you optimize this closed system. If you weren't just listening to this podcast and could see my hands, one circle is around the data center system and a bigger circle is around the data center plus the grid system.
No one has ever added the grid to the closed system. And once we add the grid, then we are optimizing not just for where there are servers available or where there are fiber optic congestion constraints, but also for where there are transmission line congestion constraints or where there is inadequate generation. And that overall optimization problem causes you to harvest computational flexibility in a different way, and often in a way that utilizes this massive fixed asset, the electric grid, and saves everybody money.
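The "bigger circle" optimization Varun sketches, treating grid headroom as a constraint alongside server availability, might look like this in miniature. Site names and all numbers are invented for illustration; a real orchestrator would weigh many more constraints.

```python
# Toy sketch of grid-aware workload routing: a batch of flexible inference
# work goes to whichever site has both free servers AND grid headroom,
# rather than optimizing the data center fleet as a closed system.
# All site data is hypothetical.

sites = {
    "virginia": {"servers_free": 120, "grid_headroom_mw": 0},    # at its winter peak
    "chicago":  {"servers_free": 80,  "grid_headroom_mw": 150},
}

def route_batch(sites: dict, needed_servers: int):
    """Pick a site with enough free servers and positive grid headroom; None if nowhere fits."""
    for name, s in sites.items():
        if s["servers_free"] >= needed_servers and s["grid_headroom_mw"] > 0:
            return name
    return None

print(route_batch(sites, 50))  # chicago: virginia has servers free but no grid headroom
```

In the closed-system view, Virginia wins (more free servers); adding the grid to the circle sends the work to Chicago instead, which is the Oracle Virginia-to-Chicago demonstration in microcosm.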
A
Are you tired of overpaying for big-name PR firms but not really knowing what they're delivering? Is your comms team wasting time reviewing lengthy messaging briefs and decks instead of engaging journalists or producing content? Are you wondering why your competitors are getting press and you aren't? Fishtank PR is an award-winning climate and energy tech, renewables and sustainability focused PR firm dedicated to elevating the work of both early stage and established companies. Whether you need to position yourself as a thought leader in between project announcements or translate complex ideas and technologies into tangible, compelling stories that resonate with the media, Fishtank can help. Check out fishtankpr.com. That's F-I-S-H-T-A-N-K, fishtankpr.com.
C
Virtual power plants are becoming a reliable way for utilities to manage capacity. But enrolling devices is just the start. What really matters is confidence: knowing those resources will perform when dispatched and being able to prove it. From the control room to the living room, EnergyHub's platform handles the full picture, from near real time forecasting and locational dispatch to the kind of rigorous verification that holds up when regulators, grid operators or leadership asks, did it deliver? Easy enrollment creates momentum. Proven performance builds trust. That's why more than 170 utilities rely on EnergyHub to manage over 2.5 million devices delivering 3.4 gigawatts of flexible capacity. See what that looks like at energyhub.com.
B
One thing that has struck me, as I've spent more and more time with you learning about what Emerald is building, and more broadly about the concept of compute flexibility for data centers, is that one of the challenges is it's a multi-party situation. There are a bunch of actors in any given situation. It's not as simple as you want it to be. It's not just data center and grid. Even within the data center, quote unquote, somebody is operating the data center, somebody else might be the cloud provider to the data center, somebody else might be running the workloads or actually be the customer. So can you walk me through how you think about the stack of who needs to do what? If we're going to deliver on this promise, and we're going to take advantage of the 100 gigawatts of latent capacity we've got on the grid by making data centers flexible, who needs to sign off on what?
D
Yeah, it is a wickedly complicated multi-party problem. I'm delighted, Shayle, that you got comfortable getting your hands around this and now we can work together. Look, I go back to an earlier question you asked, which is: what's the most critical thing that has to happen? The most critical thing is power utilities and system operators and regulators and governors saying, if you are willing to be power flexible, O data center, we want you in our state. You get to skip the line. You get to connect faster as a flexible load, fast track. You get a bigger data center, et cetera. If that happens, I believe everything else quickly falls in line. Now, you're right, the data center is not one monolithic entity. It comprises a lot of players. You might have, for example, a data center developer, owner and operator. I'll name one: Digital Realty, a terrific one that we partner with, operating a data center within which they have a tenant. That tenant might be one of the many folks we've partnered with, like Nebius, for example, or Oracle or Lambda, and by the way, it could be a hyperscaler as well. Within that cloud provider, you might then have a customer, and that customer may not be the end customer. You might have Together AI or Fireworks AI, which is an inference-serving service, which is then serving tokens and enabling an end customer to run models on them. There may be N layers here, and ultimately all of those layers have to work together so that the data center, at the point of common coupling, that interconnection point to the grid, adjusts its net withdrawal from the system consistent with the grid signals. This is complicated. Emerald seeks to be the easy button for data center flexibility. And in order to do that, we basically have to have modules at every layer of this stack.
We have a module for utilities. We have a module for the data center operator to interact and communicate with the utility. We have modules for the cloud operator and for the end user, with Emerald agents to help them most gracefully throttle the workloads they may want to throttle. We have agents elsewhere that work with the on-site energy resources to harness all of those resources as well. So it is a complicated stack. But I will say everybody becomes much more willing to work together when there's a real economic incentive. And it's the grid that sets that incentive, which is to say: you get connected faster, you get a bigger data center. If you're willing to do this, everybody else will work together.
B
Yeah, I agree with that. The prerequisite: if the grid says the right thing, everybody else falls into line. You just mentioned on-site resources. I want to talk about that for a second too, because I think part of what's happened as the concept of data center flexibility has gained more prominence is that it has morphed somewhat. Sometimes when people say data center flexibility, they're talking about workload flex. Other times they're talking about what, from the grid's perspective, makes a flexible data center. And in many cases right now, what's happening is that data center developers and operators are putting a bunch of assets behind the meter. Usually that means gas turbines of one kind or another, maybe some BESS battery storage, maybe some other generation; solar could be behind the meter as well, but it's behind-the-meter generation and storage. So there's one version of data center flexibility which is just: the grid, or the utility, says I need you to curtail this amount now, and you fire up your behind-the-meter generator and don't do any workload flex at all. There's another version where these things all play together. So walk me through how you see the landscape emerging, with the relationship between behind-the-meter physical resources and workload flex.
D
Totally. I'll say a prefatory point, which is that I believe AI factories belong on the grid. I think it's best for everybody. This is counterintuitive, by the way. A lot of folks might say, well, if the data centers just went off the grid, that would insulate ratepayers from the peak load increase that the data centers would cause, the bill increases, et cetera. But I believe that's a little bit shortsighted. The farsighted way of thinking about this is: as data centers become, by the end of this decade, up to 17% of America's load, and in the decade beyond, a quarter, a third, half of America's load, it would be a catastrophe if data centers were entirely decoupled from the electricity system, because the system loses its biggest source of anchor tenant revenue and the most exciting engine of American economic growth. It's a terrible idea to be completely off grid forever. But in order to get there, there may be a period of time in the near term where data centers say, I need to get online right this second, and therefore I'm going to build myself bridge power. That's highly rational, and the hope is we will be able to quickly bring those behind-the-meter resources to bear to support the broader electrical system and connect those data centers. With Nvidia, we made a major announcement at CERAWeek that Nvidia has a reference architecture, called DSX, for how AI factories should be laid out and should operate. One element of it is DSX Flex, the capability to be flexible, and Emerald is a software partner that helps to operationalize that. And we joined six of the largest American power companies to say, even if you are putting in bridge power, we're going to incorporate that into the DSX reference design. We're going to call them hybrid AI factories. And we're going to make sure that they can work together as a single unit to provide services back to the grid. In some sense this is a super flexible AI factory facility.
And the reason for this is you can coordinate the on-site resources, the gas generators, the batteries, alongside the computational flex. Because AI factories, these token factories, are inherently flexible. As I mentioned, there's that 18 to 55% of inherent flex built into AI workloads. You can take all of this together, and when you do get a grid connection, and ideally it comes faster than it otherwise would because you're flexible, you're able to provide real services back to the grid. So one of the things we at Emerald have been focused on is orchestrating not only the computational resources, slowing down workloads, moving workloads, but also doing that in tandem with the on-site energy resources: generators, cooling, batteries, fuel cells. The reason that's important is you might have a microgrid on site. It may be operated through the software systems of a Siemens or an Eaton or a GE Vernova, all of whom actually just entered this round in Emerald AI that you led, Shayle. And we will play very nicely with all of them. We'll integrate, and Emerald will help to recruit the amount of generation and flexibility from these on-site energy resources and coordinate it with the computational flexibility from the GPUs on site. And that entire unified amount of flexibility is what the grid sees in terms of a change in the net withdrawal of energy from the grid. So I look at bridge power and behind-the-meter resources as really a way to supercharge flexibility, not a way to remain a permanent island.
B
Yeah, I'll tell you the way that I think about it, and you can tell me if this resonates with you. There's a dispatch curve in electricity in general: when there's a certain amount of demand, you start at the bottom of the dispatch curve, which is basically the cheapest resource to generate, and you keep moving up the dispatch curve until you meet the demand. And so in the context of the broader electricity market, the low end of the dispatch curve is stuff with no marginal cost, which is mostly solar and wind. Right. And then you get further and further up, and other things get dispatched more and more. When there's a single asset, a single data center, that has multiple resources it can draw upon to meet a need, which in this case is going to be some amount of curtailment, from the grid's perspective it's kind of like a little mini dispatch curve. Right. And they may have one thing that can go in that dispatch curve, or they may have six; it doesn't really matter. And if you think about it in that context, then workload flex should be the bottom of that dispatch curve. In other words, the cheapest thing to deploy, assuming that you're still, to your point, taking advantage of the latent inherent flexibility. In other words, you're not sacrificing customer SLAs, you're not sacrificing performance to customers, things like that. If you take that to be true, then workload flex is the cheapest thing you can do, and you should do as much of it as you can as long as you don't sacrifice customer performance. Then if you need more, which you may well, you should dispatch things that cost more money, and that may be firing up the generator you have behind the meter, or dispatching your battery or fuel cells or whatever it is. All those things come at a significantly higher cost to dispatch. But you might be able to get more out of them, right?
You might have a gas generator behind the meter that is rated to the same capacity as the entire data center. And so if you need to flex down to zero, that's the way to do it. But if you need to flex down 20%, it actually might make more sense in most cases just to do the workload flex. So I think of it as this little mini dispatch curve that some data centers will ultimately have, but really only the ones that have the behind-the-meter resources.
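The "mini dispatch curve" idea above is essentially a greedy merit-order loop: deploy the cheapest flexibility resource first, then move up the cost stack until the curtailment target is met. Here's a minimal sketch of that logic; the resource names, costs, and capacities are illustrative assumptions, not figures from the episode.

```python
# Hypothetical sketch of a data center's "mini dispatch curve":
# meet a grid curtailment request by deploying the cheapest
# flexibility resources first (merit order).

def dispatch(resources, curtailment_mw):
    """Greedy merit-order dispatch: cheapest effective cost first."""
    plan = []
    remaining = curtailment_mw
    # Sort resources by effective cost ($/MWh-equivalent), ascending.
    for name, cost, capacity_mw in sorted(resources, key=lambda r: r[1]):
        if remaining <= 0:
            break
        take = min(capacity_mw, remaining)
        plan.append((name, take))
        remaining -= take
    return plan, remaining  # remaining > 0 means the target can't be met

resources = [
    # (name, illustrative cost $/MWh, flexible capacity MW)
    ("workload_flex", 10, 40),    # latent flex: cheapest to deploy
    ("battery", 60, 30),          # duration-limited
    ("gas_generator", 120, 200),  # can back off the whole facility
]

# A 60 MW curtailment: workload flex covers 40 MW, battery the other 20.
plan, shortfall = dispatch(resources, curtailment_mw=60)
```

As Varun notes next, the real curve is dynamic: costs and capacities shift with the workload mix and battery state, so this sort function would re-rank resources continuously rather than once.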
D
I love the dispatch curve analogy. All I'll say is I don't think it's a static dispatch curve. In electricity markets it's always that gas peaker that sets the marginal price when you have sufficient demand, right? It's always that skinny, pointy one at the right side of the dispatch curve. Whereas in the data center you'll have a complicated, dynamic, constantly changing dispatch curve. I agree with you that there's always going to be a fat, short part on the left-hand side of the dispatch curve. That's going to be some latent workload flexibility that we can just harvest. There will be customers who are willing to tolerate a little bit of flexibility, and there will be workloads that are inherently tolerant to some flexibility. I'll just note here, by the way, there are so many reasons that AI users are tolerant to workload flexibility, because all other kinds of things can happen in a data center that might require them to be flexible. So power is just the next thing we ask them to be flexible about. But in addition, there will be workloads that are less inherently flexible, or that are higher up on that dispatch curve. They might sandwich the battery, which, by the way, might have an operating constraint: it can provide a certain amount of power for a certain amount of time, and that sets the width of that bar, so to speak. But you might have some very interesting dispatch algorithms. One day you might even have what I call energy token arbitrage, or watt-token arbitrage, where you might actually choose to throttle tokens even before the grid requires you to do so, because it's economically optimal in that moment to charge your battery, let's say.
And I believe that as we build in intelligence, such as forecasting, which many of our five demonstrations have now done, we'll be able to forecast on both sides: on the grid side, when we expect an event to arrive, and on the AI side, when we expect a job that is more or less flexible to arrive on the scheduler. All of this means it's just a more complex dispatch curve. But I love the analogy, and for us grid wonks, it's a useful organizing principle.
B
Yeah, it's a good point on the charge-the-battery one. I think people haven't really thought this one through. Right. We're going to put a lot of batteries behind the meter at data centers; I'm pretty convinced that that's going to happen. But let's say you're a 200 megawatt data center and you have a 200 megawatt interconnection, and you add a battery. How do you charge that battery? Right. Either you need a bigger interconnect, because your total load is actually 200 megawatts plus the charging rate of the battery, which is going to be big, or you need to figure out how to be flexible on your power consumption from the data center, such that some of the time you can be simultaneously charging the battery and pulling from the grid. And so that's an inherent workload flex requirement that you're going to have to solve, unless you're going to get a much bigger interconnection, which nobody can get.
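The interconnect arithmetic here is worth making concrete. A quick back-of-the-envelope version, where the 50 MW battery charging rate is an assumption added for illustration (the episode only specifies the 200 MW figures):

```python
# Back-of-the-envelope check of the battery-charging constraint:
# a fixed interconnect must cover both IT load and battery charging.

interconnect_mw = 200     # data center's grid connection (from the episode)
battery_charge_mw = 50    # hypothetical battery charging rate (assumption)

# To charge without exceeding the interconnect, compute must flex down
# by at least the battery's charging draw.
max_it_load_while_charging = interconnect_mw - battery_charge_mw  # 150 MW
required_flex_pct = battery_charge_mw / interconnect_mw * 100     # 25.0%

print(f"Compute allowed while charging: {max_it_load_while_charging} MW")
print(f"Workload flex-down required: {required_flex_pct}%")
```

In other words, without oversizing the interconnect, charging the battery is itself a workload-flex event: the bigger the battery's charging rate relative to the interconnect, the deeper the compute flex required.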
D
Yeah, completely agree. And Shayle, I guess I just don't want to lose sight of the overarching story here, which, 38 minutes in, I'm now going to share. What you just described, Shayle, is an important functionality, and I'll call it the fourth bird that you can kill with this stone. But the first three birds are: first, let's get AI factories, data centers, connected much more quickly and at larger capacities to grids, thanks to flexibility. Second, let's keep rates low and stable, thanks to flexibility, by avoiding unnecessary grid buildouts. We still have to build, but nevertheless, if we can harness flexibility, we can slow the rise in peak demand while bringing on massive amounts of energy demand, megawatt-hour demand, from data centers that help to pay for the whole system. Third, let's keep the system reliable. The third bird here is that if AI factories can respond to system needs, that lightning strike, that soccer-game teakettle spike, a heat dome (which we demonstrated in Portland, Oregon with the utility PGE and Nvidia), and many other potential reliability issues, well then, with one solution, we'll be able to basically get the grid we want and the AI adoption that we want. It's that really rare holy grail solution. It's why there's so much chatter about it. But you've correctly laid out the reasons it's hard. There are a lot of actors you have to coordinate. There are a lot of opposing forces, such as the push to just go entirely off grid. And of course there's the lack of those differentiated service tiers from the electric power system. I think later this year, as you know, Emerald and Nvidia and some other partners, Digital Realty, EPRI, Dominion and PJM, will put together the world's first 100 megawatt commercial-scale AI factory that is truly power flexible. It's custom designed from the ground up to be power flexible, and it's going to be able to respond precisely to all of these grid needs, but at a commercial scale.
My hope is the community sees that. In parallel, we're getting to the point where more electric utilities understand they have to offer these differentiated service tiers and give you accelerated interconnection and larger connection sizes. And that's when, in late 2026 and 2027, this really takes off and solves all three of those problems.
B
All right, Varun, this was fun as always. Appreciate you coming back.
D
Really appreciate it, Shayle. Thanks for having me.
B
This show is a production of Latitude Media. You can head over to latitudemedia.com for links to today's topics. Latitude is supported by Prelude Ventures. This episode was produced by Max Savage Levinson, Anne Bailey and Sean Marquand. Mixing and theme song by Sean Marquand. Stephen Lacey is our executive editor. I'm Shayle Kann, and this is Catalyst.
Host: Shayle Kann
Guest: Varun Sivaram, CEO of Emerald AI
Produced by Latitude Media
This episode explores the rapidly evolving landscape of data center flexibility amid surging AI workloads and grid pressures. Shayle Kann reunites with Varun Sivaram—the CEO of Emerald AI, and now a business partner following Kann’s recent investment—to discuss recent breakthroughs, the economics and mechanics of flexible data centers, and how AI “factories” could help address pressing grid challenges without sacrificing growth or affordability.
| Segment                            | Timestamps    |
|------------------------------------|---------------|
| Market Overview & Recent Changes   | 03:22 – 07:47 |
| The Flexibility Opportunity        | 07:47 – 11:09 |
| Service Tiers: AI vs. Utilities    | 11:09 – 16:47 |
| Workloads & Real-World Demos       | 17:52 – 25:22 |
| Stack Complexity/Stakeholders      | 26:50 – 30:15 |
| Behind-the-Meter & Hybrids         | 30:15 – 39:16 |
| Synthesis & Future Outlook         | 40:01 – 42:23 |
For deeper dives, check out the full episode and related resources at Latitude Media.