Loading summary
Stephen Lacy
A quick note before we start: I am going to be hosting a live conversation on September 4 about the state-level playbook for clean energy after passage of the OBBB. You've heard us talking about it on this show, along with all the other chaos swirling around federal energy policy. There is so much that can be done at the state level now and in regional markets, and that is where companies need to focus their attention. So in my conversation with Advanced Energy United CEO Heather O'Neill, we're going to share insights from a forthcoming policy playbook on initiating reforms in states and in regional electricity markets, from streamlined permitting to grid-enhancing technologies and distributed energy resources. O'Neill is going to detail the specific reforms that can open new markets and accelerate deployment across red, purple and blue states alike. So that is September 4th at 2pm Eastern. This is of course just such an important topic, and you should join to ask questions live. Sign up by clicking the link in the show notes or go to latitudemedia.com/events. We'll see you September 4th. Latitude Media, covering the new frontiers of the energy transition.
Stephen Lacy
From Latitude Media, this is Open Circuit. This week: the rising energy cost of our digital lives. As we outsource more of our everyday cognitive work to AI, we're entering an era where simple tasks, writing an email, summarizing a meeting, drafting a to-do list, are all becoming energy-intensive computing events. A single AI agent booking your flight can use as much electricity as running your dishwasher. New reasoning models burn up to 90% more energy than traditional ones. And inference, the everyday running of AI models, now accounts for the vast majority of their lifetime energy use. So this week, MIT's Vijay Gaddupalli walks us through the numbers behind AI's growing power draw, from the hidden energy cost of being polite to a chatbot to the efficiency breakthroughs that could slow the trend. We'll unpack how our work and social lives are turning into real-world load growth, and what it will take to balance the equation.
Sponsor Announcer
Open Circuit is brought to you by Natural Power. For nearly two decades, Natural Power has provided engineering and consulting services for renewables projects across the US. Natural Power supports clients in wind, solar and battery storage with a focus on independent engineering, technical due diligence, energy estimation and developer support. With more than 245 gigawatts of project experience in North America and acceptance from major financiers, Natural Power is responsive, able to meet tight timelines, and pragmatic. Natural Power works with you to understand, quantify and mitigate risks. Learn more at naturalpower.com or click the link in the show notes. Open Circuit is supported by Sungrow, a global leader in PV inverters and battery storage systems. With a resilient global supply chain and 28 years of experience, developers trust Sungrow to deliver reliable, affordable energy solutions. Sungrow's PV and storage systems are ranked most bankable by BloombergNEF, thanks in part to the company's strong fire safety record. That record was demonstrated by live-streamed tests and certification from the New York City Fire Department. Learn more at sungrowpower.com or click the link in the show notes.
Stephen Lacy
Welcome to the show. I'm Stephen Lacy. I am the Executive Editor of Latitude Media. Normally I'm here with Jigar Shah and Kathryn Hamilton, but they are on summer break again, so this week we've got another great conversation from our Transition AI Conference. Last week you heard a conversation about the demand picture on the grid. And so this week I want to look at the demands of computing itself. And to set it up, I want to consider a concept that Sam Altman of OpenAI has been talking a lot about.
Sam Altman (quoted)
It's not that you plug your brain in one day, but it's you will talk to ChatGPT over the course of your life at someday, maybe if you want, it'll be listening to you throughout the day and sort of observing what you're doing. And it'll get to know you and it'll become this extension of yourself, this companion, this thing that just tries to help you be the best, do the best you can.
Stephen Lacy
That is Altman teasing his vision for always on AI. He often describes this world where ChatGPT is no longer a place to just ask queries in a chat window. Rather, it's a genuine companion. It's on all the time, giving constant feedback and motivation and whatever else you need. Mark Zuckerberg has said the same thing recently, citing the loneliness epidemic. He was talking about building AI that can fill in the gap in friendship. Does any of this sound familiar to you?
AI Character from 'Her'
Do you want to know how I work?
Vijay Gaddupalli
Yeah, actually. How do you work?
AI Character from 'Her'
Well, basically I have intuition. I mean, the DNA of who I am is based on the millions of personalities of all the programmers who wrote me. But what makes me me is my ability to grow through my experiences. So basically, in every moment I'm evolving. Just like you.
Stephen Lacy
That's a clip from the 2013 movie Her. If you haven't seen it yet, definitely go watch it. I was blown away when it first came out. It is such a beautiful film. And 12 years later, it just feels so on the nose. It's supposed to be a dystopian love story about a man who falls in love with his operating system. But Silicon Valley sees it as a playbook for the way we build relationships and interact with our AI. And this is unfolding right now. We're already seeing millions of people treating AI tools as much, much more than assistants. They're turning to them for emotional support, for conversation, for companionship. Medical professionals are even reporting a surge in what they call AI-induced psychosis. And when OpenAI released GPT-5 this month, it wiped out the personalities that people had become attached to. Users freaked out, forcing the company to reinstate the old model. I think OpenAI was even surprised by how attached users were to the personality of the model. And I bring this up not to create some kind of moral panic, although I am a little terrified about what our personal relationships with AI will bring. Rather, this is super interesting and important to understand from an energy perspective. Think about what normally happens when you interact with the Internet. You type a search query, and Google looks it up in an index, which is essentially a massive pre-organized database. You send an email, and the server just routes and stores your text. You browse a website, and it serves up pre-written content. These are relatively simple operations: look up, store, retrieve, transmit. But AI is fundamentally different. Large language models don't just look things up; they process each word through a complex network of mathematical operations. So instead of retrieving pre-existing information, they're generating new responses in real time, running billions of calculations to predict what word should come next.
And when millions of people are doing this, it adds up to power-plant-level demand. Now imagine tens of millions of people doing this all day, every day. This is what the AI labs envision: a shift from task-based tools to always-on companions. It fundamentally changes how we think about energy in this new era. And so to understand what it all means for our digital footprints, I sat down with Vijay Gaddupalli on stage at our Transition AI conference. Vijay is a senior scientist at the MIT Lincoln Laboratory Supercomputing Center. I laid out the scenario I just described for you and asked him about the increase in computing intensity and whether efficiency improvements could be a key counterbalance. Here's that conversation.
So basically, what I outlined was a shift from discrete AI tools to agentic, always-on AI, and how that will shape data center workloads and overall growth. So I'm curious: can you just talk about how the computing is evolving as the tools are progressing?
Vijay Gaddupalli
So it is really interesting, right? We wear a few different hats. On one side, I like to think that we're focusing on making computing, and AI specifically, more efficient. That's sort of the academic hat. And then I have a commercial hat, in which I make the problem worse. So it's kind of a fun thing. We've done a lot of measurements. We're seeing how customers are starting to use cloud computing, the types of workloads that they're running, and the intensity of these workloads is going up very, very significantly. I'll just give you a really simple agentic example. How many of you have used an AI agent to, I don't know, book flight tickets or something like that? Maybe a few of you, or at least you've seen the demos. So I've done that. I travel quite a bit, so I tried to see if that would help. It's kind of nice. But we did a quick analysis. We were seeing customers using these agentic systems on our cloud platform, and they're using a reasoning model like DeepSeek R1 or something similar. Each one of those tasks that they perform takes about 500,000 tokens worth of computing. Now, just to put that into perspective, if you don't speak tokens and watt-hours, that's roughly 3 kilowatt-hours of energy used for a single task. So every time you ask an AI agent, hey, do this thing, try that thing out, it's like running your dishwasher or your washing machine. That's how much energy each of these tasks uses. And that's what I mean by the energy intensity of these tasks going up. It may seem like you're asking the same question as maybe you did five months ago, maybe five years ago. Sure, the fidelity of the answers is going up, but so is the amount of energy and compute that's going into that. Running one of these reasoning models, simple, quote unquote, uses something like 16 to 24 or 32 GPUs just to run a single instance.
So if you're interacting with that for five to 10 minutes, as you might on a deep research or an agentic workflow, the energy cost starts to add up. And then combine that with the proliferation of this technology. It's all over the place. You can't avoid it. You can't do a Google search without running through a large language model at this point. And so I think that scale is what's really starting to build very quickly.
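The agentic-task arithmetic quoted above can be sketched in a few lines. The watt-hours-per-1,000-tokens value is an assumption chosen so the quoted figures (about 500,000 tokens, about 3 kWh) line up; real per-token energy varies widely by model and hardware.

```python
# Back-of-envelope check of the figures in the conversation:
# ~500,000 tokens per agentic task coming out to roughly 3 kWh.

TOKENS_PER_TASK = 500_000      # quoted figure for one agentic task
WH_PER_1K_TOKENS = 6.0         # assumed energy per 1,000 tokens (Wh)

def task_energy_kwh(tokens: int, wh_per_1k: float = WH_PER_1K_TOKENS) -> float:
    """Energy for one task in kWh under the per-token assumption above."""
    return tokens / 1_000 * wh_per_1k / 1_000

print(f"~{task_energy_kwh(TOKENS_PER_TASK):.1f} kWh per task")  # ~3.0 kWh
```

At 3 kWh per task, that is indeed roughly one dishwasher or washing machine cycle.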
Stephen Lacy
Yeah. One of the things I outlined earlier was how a lot of these small interactions add up. I gave the please-and-thank-you example that cost OpenAI tens of millions of dollars in computing and electricity costs. So talk about how the energy intensity of our overall digital lives is growing. You know, I don't think anybody in the room really raised their hand on agentic AI. These tools are still relatively new. I have not personally used some of the new ones; I tend to use them more as actual tools in my work life. But as we outsource a lot of these interactions to large language models, what does that do to our digital lives in total?
Vijay Gaddupalli
Yeah, our footprint is certainly growing very quickly, and in a way that we don't often realize. I love to give this example; I've used it before. I wrote this little children's book app for my son as he was learning to read, and it uses generative AI. It's a little iPad app. You can go in, talk to it, and it'll create a story based on whatever he wants. This was a couple of years ago. Then I started to look at the energy footprint of that app. It turns out that every time he generated one of these stories, he was using as much energy as the iPad's entire battery holds. The difference here is that the battery on the iPad was fine; luckily my code wasn't too crappy. But somewhere in Virginia, or wherever, you're seeing a little cloud of smoke come out thanks to this little story that was created by a 5-year-old. And so this is that digital footprint, I think, that we don't see, and it's growing very quickly as we're seeing costs going down. We're seeing this commoditization both on the hardware side as well as on, we'll just call it the token costs associated with running these models. We're finding it's much cheaper for people to build these complex apps on top of them. So where your prior "hey Alexa, what's the weather" would happen locally on device, it's now sending that to the cloud, just in case you decided to mix in another language or do something funky in the way you're talking about it. So sure, the responses, the answers, the quality are going up. But so is the digital footprint associated with this. Now my washing machine has to talk to the Internet to do things, right? And it has AI on it. I don't know what it does; it still screws things up. But everything that we're doing is just becoming more complex.
It's like cars in, say, the '70s. It was just bigger and bigger engines, but you're still driving maybe the same distance. So the fuel consumption went up significantly even though you're covering the same distance. It's sort of the same thing we're seeing with AI: people are putting bigger and bigger engines under the hood. You might still be going the same distance, but the amount of energy or power that we're using is just going up. And we're often not getting to choose whether we want that or not. It's just kind of being stuffed down our throats.
Stephen Lacy
So over the last few years there's been this intense focus on power. As the grid and power availability became the biggest bottleneck for AI expansion, it set off this real scramble to secure power and grid capacity. But what about on the compute side? How have those power constraints catalyzed movement and innovation in modeling, architecture and hardware design?
Vijay Gaddupalli
Yeah, I think there's a lot going on there. I like to describe it this way. We talk to a lot of people who are maybe not from the field; you can think of policymakers and folks like that. Ten years ago, if you were a company, you couldn't work hard enough to get your data in order. It was the big data era. We were all about collecting data, and a lot of enterprises now have very strong data governance strategies. About five years ago, when we started our company, the big bottleneck to AI's growth was compute. You could not get your hands on enough Nvidia GPUs, if you remember COVID times and stuff like that. Nowadays it's all about power, and can you get access to power. Now, one of the things my group has done over the last six years or so is run an annual survey of different computing platforms. We like to measure it in terms of what we call operations per watt, to give you an idea of the computing efficiency of the hardware. About five or six years ago, most of the commercial products lined up around 100 gigaops per watt. The number of watts would change based on whether it was a data center chip or an edge chip, but there was this straight line, and you'd find a lot of things lining up on it. As of the last year, that commercial curve has moved to about 1 teraops per watt. Five years ago, the research chips were at 1 teraops per watt; now the research chips are sitting at 10 teraops per watt. So one way of thinking about this is we've roughly 10x'd the operations per watt in the last five to six years. Now some of you are going to say that sounds an awful lot like Moore's Law in action. It's not really Moore's Law.
It turns out that we've just become a lot better at using different precisions, being able to perform lower-precision calculations and still get the same numerical accuracy as we did with higher-precision calculations. So a lot of the chips and the technology we've been building have really been about: can I get more operations per watt? I'll make the operation a little bit weaker, but it turns out I can still make it good enough and get an answer. And so that's where we're seeing a lot of innovation right now.
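The hardware trend Gaddupalli describes, roughly 100 gigaops per watt to 1 teraops per watt for commercial chips, implies a compounding yearly efficiency gain. A quick check using the quoted endpoints; the 5.5-year span is an assumption splitting the difference on "five to six years":

```python
# Compound annual improvement implied by a ~10x ops/watt gain
# over ~5.5 years.

def annual_gain(start_ops_per_watt: float, end_ops_per_watt: float, years: float) -> float:
    """Compound annual improvement factor in operations per watt."""
    return (end_ops_per_watt / start_ops_per_watt) ** (1.0 / years)

factor = annual_gain(100e9, 1e12, 5.5)    # 100 GOPS/W -> 1 TOPS/W
print(f"roughly {factor:.2f}x per year")  # about 1.5x per year
```

A sustained ~1.5x per year is faster than classic transistor-density scaling, which is the point: the gains are coming largely from precision tricks, not Moore's Law alone.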
Stephen Lacy
I think this is a good place to bring up DeepSeek. So earlier this year, when the Chinese company DeepSeek released its R1 model, supposedly trained with far fewer computing resources, it challenged the assumption that we just need more compute, that more compute equals better modeling. But you actually found that it was more energy intensive than other models. What did you find?
Vijay Gaddupalli
Yeah. So if we look at the world of how large language models are built and deployed, and the lifecycle associated with them: maybe five or six years ago, you probably saw a lot of news articles about the training cost of large language models. At that time, training these models was the expensive part. In fact, traditionally in building machine learning or AI tools, training has been the big thing, and we've deployed a relatively simple model after that. The training is where we're figuring out all the rules. Large language models somewhat flip that paradigm. If you talk to people who run these models, typically the inference side constitutes about 80 to 90% of the lifecycle energy costs associated with a model. So even though training is a one-time thing and can be very expensive, it's often a relatively small portion. That's what the OpenAIs and the Anthropics and all of these large companies are struggling with on the inference side, because that's where the traffic is growing. We actually see that the cost parity is about 200 to 300 million inferences, maybe even half a billion inferences, roughly translating to the training of the model. So coming back, I think, to your original question on DeepSeek. First of all, I love DeepSeek because it's one of the only open-weight reasoning models that we had. So it helps to maybe lay out the land a little bit. There are sort of two classes of large language models, at least as of today. The first are what we call standard models. These are monolithic models; this was GPT-3, GPT-4, Llama 3.2, Llama 3.3. These are essentially a large set of weights. You'd have a prompt, that prompt would propagate through these weights, and out would come some result: the output tokens that were created.
The energy impact of these models was somewhat linear; it was roughly proportional to the length of the prompt or the length of the output. So if you had a large document, that would use more energy, and if you had a small document, you'd use less. More recently, we have these models that are called reasoning models. These are what things like o1, o3 or DeepSeek R1 fall into. These reasoning models are slightly different. They take your input prompt the same way, but they also now have this internal chain of thought, the reasoning part. They take your prompt and bounce it around the model a bunch of different times to come up with some consensus across different parts of the model, and then spit out the actual result. So even though the prompt length may be small, or maybe the same size as before, because of this internal movement of information the model is constantly activating. It's almost like you're using a lot more tokens, even though you're not controlling how those tokens are actually being used. A colleague of mine actually did a very simple benchmark, and he found that using something like DeepSeek R1 versus a more traditional LLM, say Llama 3.3, used almost 90% more energy for the same task, for the same input prompt. So when we look at DeepSeek R1 and say it's efficient, it's efficient compared to the other reasoning models that were out there, say o3 or o1 at the time it was released, but not necessarily compared to some of the more standard LLMs. In fact, it's very difficult to measure the energy of something like DeepSeek R1, because a small change in the prompt can have a huge change in the energy consumed. Previously, adding a token had this somewhat linear relationship to how much energy was consumed.
Nowadays, because of this chain of thought, you might be bouncing the question around a few more times because you asked a slightly more complicated or complex question. So yeah, we've certainly seen this. As much as we talk about DeepSeek R1 being more energy efficient, that's in relation to other reasoning models, not necessarily to all large language models as a whole.
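One way to see where the ~90% figure can come from: hidden chain-of-thought tokens mean a reasoning model processes far more tokens than the visible prompt and answer suggest. The token counts below are illustrative assumptions, not numbers from the benchmark itself.

```python
# Token accounting for a standard LLM vs. a reasoning model on the
# same prompt. Hidden reasoning tokens inflate the work done.

def total_tokens(prompt: int, output: int, reasoning: int = 0) -> int:
    """Tokens actually processed, counting hidden reasoning tokens."""
    return prompt + reasoning + output

standard = total_tokens(prompt=200, output=400)                 # 600
with_cot = total_tokens(prompt=200, output=400, reasoning=540)  # 1140
overhead = with_cot / standard - 1
print(f"~{overhead:.0%} more tokens (and energy) for the same visible answer")
```

The user sees the same 400-token answer either way; the extra 540 reasoning tokens, and their energy, are invisible, which is also why small prompt changes can swing the cost so much.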
Sponsor Announcer
Open Circuit is brought to you by Natural Power. For nearly two decades, Natural Power has delivered expert, independent engineering and consulting services for renewables projects across the US and beyond. Success in project transactions requires an independent engineer who's laser focused on timelines, understands the nuances of risk, and collaborates seamlessly to develop solutions tailored to your needs. Natural Power excels at working within tight time constraints while ensuring diligence never takes a back seat. With deep expertise in wind, solar and battery storage, Natural Power delivers top-tier support in independent engineering, technical due diligence, energy estimation and developer support, accepted by major financiers. Their flexible approach ensures projects are built on a strong foundation, powered by expertise, driven by sustainability. That is Natural Power. Find out more at naturalpower.com or click the link in the show notes. Open Circuit is supported by Sungrow, a leading maker of PV inverters and battery storage systems. As you know, listening to this show, electricity needs are soaring. They're set to grow 2 to 3% each year in the US, and solar plus storage is best positioned to meet this demand. It's abundant, affordable, and relatively quick to get up and running. Sungrow has been producing this technology for 28 years. With a robust global supply chain, a strong fire safety record, and 740 gigawatts of power electronic converters installed worldwide, Sungrow is ready to drive our energy revolution. Click the link in the show notes to learn more, or visit sungrowpower.com.
Stephen Lacy
So you're also CTO of an AI cloud computing company, Radium Cloud. Where else are you finding inefficiencies in computing right now?
Vijay Gaddupalli
Yeah, so it's funny: we've just been building these systems without thinking about efficiency. We're seeing huge opportunities in improving the processors themselves, the way we're utilizing these processors, and how efficiently we're using them. I like to think of it as having a few knobs that you can adjust. On one end there is the camp that says, okay, we're just going to build more energy generation, we're going to add more things, and we probably have to do a little bit of that. But it also turns out there are a lot of efficiencies we're leaving on the table, because we either don't care or haven't looked deeply enough into them. And I think a company like DeepSeek shows us that when we have some constraints, we're actually pretty innovative and can invent and come up with new things. So I like to think of the knobs we can use. One of them is pretty simple: just do less compute. An example of that is the please and thank you, as you brought up. Think for a second about the prompt you're issuing, and if you can avoid that repetition back and forth, that can often have a pretty big energy efficiency impact. And even if you don't care about energy efficiency, it costs you less to skip that back and forth over and over. The second thing is you can use less energy-intensive computing. This would be maybe upgrading your hardware, or switching to lower precision. There are ways you can get the same answer, often with less energy consumption. This is kind of like driving a more fuel-efficient vehicle if you're looking to save fuel costs: there's often an upfront cost, but very quickly that can be recovered for most organizations. The third one, which I think is particularly exciting, and this is something that we're experimenting with, is that our software systems have no idea about the world around them.
And I'm not talking about a Skynet or something like that, but they are not environmentally aware. They don't know what your kilowatt-hour energy cost is at the time. They don't know what demand response pricing looks like. They don't know what the carbon intensity of compute looks like. Why aren't our LLMs, our foundation models, our generative AI tools somewhat aware of the environment around them? They use so much energy, and they're instrumented less than a light bulb in your home at this point. So how do we make sure that our software is more intelligent about the world around it? As an example, we did a simple experiment where we fed in the carbon intensity. For those who are familiar, there are pretty wild variations through the day in carbon intensity, which is how much carbon is emitted per unit of energy. We made a language model somewhat aware of this. We had this little controller in front of it, and we said, hey, when carbon intensity is high, just limit the length of your answer. We did this through what's called a generation directive. We were able to show that over a 48-hour period, we could reduce carbon emissions from the model by 70%, and we performed the same on the actual task it was trying to perform. So instead of giving a long answer, it just gave a shorter answer. It's still correct. And so it's these simple things. Some of these tools are starting to get commercialized, and it saves a lot of money. If you're someone who runs a company and has a big gen AI bill, you can actually reduce the amount of money you're spending by giving short, succinct answers early and then allowing people to probe further.
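A minimal sketch of the "generation directive" idea described here: a small controller in front of the model caps response length when grid carbon intensity is high. The thresholds and token caps below are invented for illustration, and the model call is a stub; the experiment described in the conversation used its own policy.

```python
# Carbon-aware response-length control: shorter answers during
# high-carbon-intensity hours.

def max_tokens_for(carbon_gco2_per_kwh: float) -> int:
    """Pick a response-length cap from current grid carbon intensity."""
    if carbon_gco2_per_kwh > 500:   # dirty hours: keep answers short
        return 128
    if carbon_gco2_per_kwh > 300:   # moderate hours
        return 512
    return 2048                     # clean hours: effectively uncapped

def answer(prompt: str, carbon_gco2_per_kwh: float) -> str:
    cap = max_tokens_for(carbon_gco2_per_kwh)
    # Stand-in for a real model call, e.g. llm(prompt, max_tokens=cap)
    return f"[answer to {prompt!r}, capped at {cap} tokens]"

print(answer("summarize this report", carbon_gco2_per_kwh=620.0))
```

The same pattern works against price signals: swap carbon intensity for the current kilowatt-hour cost or demand-response tier and the controller becomes a cost cap instead.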
Stephen Lacy
If I think about the possible innovations in computing: back in the late '90s, it's now famously known in energy circles that some analysts were saying the Internet was going to use half of America's electricity by the early 2000s. That never happened, of course. We saw extraordinary innovations in data center design, computing and hardware design, and data center electricity demand stayed relatively flat for decades even as capacity expanded radically. So do you think we can innovate our way around AI's power problem?
Vijay Gaddupalli
Absolutely. We sometimes tend to come at this and say, oh God, we're screwed. But it's actually not that bad; we'll figure something out. The question is, do we do the right thing, and can we do it in the right way? Obviously we're going to be able to scale up energy production, but are we doing it in a way that's consistent with our long-term goals? There's a lot of efficiency and locked-up energy capacity that's not being utilized. We can be stupid and just keep building more stuff without thinking about it. But if we can stop, pause, and think, we can probably do this in a way that's far more sustainable, and I mean that both economically as well as environmentally.
Stephen Lacy
I mean, if we think about the hyperscalers, for example, they have every economic incentive to keep this under control. Do you think that the incentives are aligned for them to continue to innovate?
Vijay Gaddupalli
Yeah, absolutely. A few weeks ago, a few of the tech CEOs were called into a Senate hearing. I don't fully agree with all of the comments Sam Altman made, but he said at one point that the price of AI and the price of energy are going to be directly related to each other, and that sentiment I somewhat agree with. Maybe there'll be this big constant term outside of it, but if you look at the complexity-type analysis, it's going to look very similar to that. So yes, there is an economic incentive to drive costs down. We in the US tend to be a little bit more forward leaning, more allowing of bringing on new energy sources, but if you go across the world, that's not necessarily the case. A lot of European countries are really clamping down. Asian countries may not have the ability to quickly scale up the type of technology they want to bring on. And so we're seeing a lot more people innovating on that side, and I think the hyperscalers absolutely have an economic incentive to do so as well. Now, there are going to be little ups and downs; that economic incentive won't always be there at every minute of every day. There will be some months, some years maybe, and I think the last couple of years have probably been that, where the incentive has been to put a new product online and look at costs secondarily, because you don't want to lose the race. But as we start to reach the peak of where the quality of these models is going and the types of tasks people are performing, the next thing is going to be, okay, how do we commoditize that? And a big part of that cost is going to be energy.
Stephen Lacy
So earlier, I think you put a lot of technical context around what we've been discussing throughout the morning, which is that the demand signal looks very different because of the computation patterns playing out. If you think about some of the high projections for data center demand, some people believe 60 gigawatts or more here in the US, then knowing what you know about the computing side, does that feel accurate to you? Does it feel like we could build more than that?
Vijay Gaddupalli
Yeah. I've worked with people who have calculated this number in a few different directions, top down and bottom up, and the numbers actually line up pretty accurately to somewhere around, in the US, probably about 300 terawatt-hours of increased energy consumption due to data centers over the next, I'll call it, three years or so. There have been a few ways they've been calculated, and I've seen them all come within, let's call it, 30% of this number. Which means there is maybe some truth to it. Now, it could be that we all say, you know what? Screw AI, I hate this, back to paper and pencils, and maybe it all falls apart. But so far that does not seem to be the case. In fact, and I can say this from our company's perspective, we're actually seeing a growth in demand at this point because we've been able to start driving down costs. It was a luxury item, sorry, it didn't even exist five years ago; it was a luxury item five months ago. Nowadays the costs have gone down. What used to cost $20 per million tokens is now a dollar per million tokens, in the span of about four to five months. So that is a 20x cost reduction in roughly half a year. And that's because, A, there are more players, so competition, and B, we've been able to build more technology to drive down costs, and some of the incentives and economics have changed. But that doesn't mean the actual energy consumption has gone down. The energy consumption has skyrocketed in the meantime; just the cost has gone down. So coming back to something we've also discussed before: Jevons paradox seems to be playing out right now.
And for those who aren't familiar with it, it's the principle, which I was actually made aware of a while ago by an economist on the team, that when the cost of a commodity goes down, instead of overall consumption of that commodity going down, it has a tendency to go up. And that's because the cost coming down, or the efficiency increasing, actually makes more people want to use it. So as much as I appreciate the work that folks like DeepSeek, and I think even OpenAI, are doing, and everyone's kind of moving in the direction of trying to make more efficient models, I think the utilization is just going to go up in a pretty significant way, at least for the next couple of years, as we AI-ify everything we do. Right? I mean, our company works with customers that you've definitely heard of and definitely interacted with, but you would have no idea that there's a generative AI solution happening behind the scenes.
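The Jevons Paradox dynamic described here can be sketched numerically. This is an illustrative model only: the demand elasticity, baseline token volume, and energy-per-token figures below are assumptions for the sketch, not data from the episode. The point it demonstrates is that if demand is elastic enough, a 20x price drop raises total energy use even when each token becomes several times more efficient to serve.

```python
def projected_energy_kwh(base_price, new_price, base_tokens_m,
                         elasticity, kwh_per_m_tokens, efficiency_gain):
    """Total energy after a price drop, assuming demand for tokens
    scales as (base_price / new_price) ** elasticity while energy
    per token falls by a factor of `efficiency_gain`."""
    demand_m = base_tokens_m * (base_price / new_price) ** elasticity
    return demand_m * kwh_per_m_tokens / efficiency_gain

# Hypothetical scenario: $20 -> $1 per million tokens (the 20x drop
# mentioned above), demand elasticity of 1.2, and a 4x efficiency gain.
before = projected_energy_kwh(20.0, 20.0, 1000, 1.2, 0.5, 1.0)
after = projected_energy_kwh(20.0, 1.0, 1000, 1.2, 0.5, 4.0)
print(f"energy before: {before:.0f} kWh, after: {after:.0f} kWh")
```

With these assumed numbers, consumption rises roughly ninefold despite the 4x per-token efficiency gain; the efficiency improvement is swamped by induced demand, which is the paradox in miniature.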
Stephen Lacy
Final question here. This is an energy industry audience, a lot of people who are on the development side, on the services side. What should they be watching in this space right now?
Vijay Gaddupalli
So I'd say the biggest thing, and I think even the previous panel talked a little bit about this, is flexibility. We often think about flexibility as either running a workload or not running a workload, which is not as appetizing to people. Even when we do simulations on data center utilization and usage, we often say either the workload runs or it doesn't run, but that's actually oversimplifying the problem. And more importantly, it makes flexibility less likely, because no one wants to say that their workload is not going to run. That doesn't sound fun. But there are ways to add flexibility, like throttling, for example. If you say the workload is going to run a tad bit slower, that can be huge. Right? And we've seen this. We work with a lot of data centers; in fact, my group at MIT released a data center data set a couple of years ago, and we're working on a new one. We've actually seen that you peak only about 1 to 2% of the time, and most of the other time you're well below that; your averages are often around 50 to 60%. It's not that the hyperscalers have told me this, but we've sort of calculated that they're probably sitting at 60 to 70% utilization on average. They absolutely do peak, but not that often. So if we can start to design around that fact and essentially take care of the peaks, and there are a few ways we can do that, and I'm happy to talk offline about what those could look like, I think there are some huge opportunities. So if you're coming from the grid or the energy sector or the services side, it might be helpful to have that conversation. A lot of folks like us care about speed to connection, and so I think we're also open to a little bit more.
You know, even if there was instrumentation required in order to get speed to power, I think people would be open to those types of conversations at this point. So you can shave down expensive times or peak demand times, and have the data center not just be a load on the energy side, but also a contributor at times.
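The utilization pattern described here, rare peaks on top of a roughly 60% average, hints at how much grid headroom throttling could buy. Below is a minimal sketch using a synthetic utilization trace; the spike rate, mean, and percentile threshold are hypothetical numbers chosen to mimic the shape described, not values from the MIT data set.

```python
import random

def peak_shaving_headroom(utilization, percentile=0.99):
    """Capacity (as a fraction of nameplate) freed if workloads are
    throttled so utilization never exceeds its Pth-percentile value."""
    s = sorted(utilization)
    cap = s[int(percentile * (len(s) - 1))]
    return max(s) - cap

# Synthetic hourly trace for one year: ~60% average utilization, with
# spikes toward full load about 1.5% of the time (hypothetical numbers).
random.seed(0)
trace = [min(1.0, random.gauss(0.6, 0.08)
             + (0.35 if random.random() < 0.015 else 0.0))
         for _ in range(8760)]

headroom = peak_shaving_headroom(trace)
print(f"mean: {sum(trace) / len(trace):.2f}, headroom freed: {headroom:.2f}")
```

In practice the trace would come from real telemetry, but the shape of the result is the argument: shaving peaks that occur only a percent or two of the time frees a meaningful slice of interconnection capacity without touching average throughput.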
Stephen Lacy
Vijay Gaddupalli is a senior scientist at the MIT Lincoln Laboratory Supercomputing Center. This conversation was recorded at Latitude Media's Transition AI Conference in June. As I said last week, stay tuned, because we're going to be announcing our West Coast edition soon; that is going to be in the spring of 2026. We also opened registration for our Power Resilience Forum, January 21st to 23rd in Houston, Texas. That conference is going to tackle the most pressing issues around grid resiliency with leading utilities, solution providers, and investors. Click the link in the Show Notes to learn more. And of course, you can find all of our coverage on data centers, resiliency, and the latest business trends in the energy transition at Latitude Media. We'll catch you next week.
Podcast: Open Circuit
Host: Stephen Lacy (Latitude Media)
Guest: Vijay Gaddupalli (Senior Scientist at MIT Lincoln Laboratory Supercomputing Center, CTO at Radium Cloud)
Release Date: August 20, 2025
This episode, recorded live at Latitude Media’s Transition AI Conference, explores the rapidly rising energy consumption of artificial intelligence (AI) and its implications for the digital energy footprint. As AI transitions from task-based tools to “always-on” agentic companions embedded in daily life, the host and guest dive into the mechanics, real-world use cases, and the urgent need for innovation around AI’s massive power appetite. The discussion breaks down the scale of electricity involved, looks at what's driving inefficiencies, and investigates whether tech ingenuity can offset the surging demand on power grids.
[22:03+]
Host challenges the “internet will use half of America’s electricity” hype of the late ’90s, asking whether a similar innovation curve can keep AI’s power hunger in check.
This episode challenges listeners to reconsider the booming energy costs being driven by the AI revolution—costs that stem not just from training but especially ongoing usage (“inference”). The conversation reveals both the urgency and the real-world optimism around technical and market solutions, arguing for collaborative, nuanced approaches to data center flexibility, smarter algorithms, and better integration between tech and energy sectors.
For those working at the intersection of AI and energy, the takeaway is clear: demand is surging, efficiency gains are real but insufficient on their own, and opportunities abound for creative, systemic solutions before AI overwhelms grid capacity.