Narrator/Host
Now let's narrow our focus from the broader markets to one single stock: Amazon. The tech giant is hosting its annual Amazon Web Services re:Invent conference down in Las Vegas this week. The cloud-focused confab draws developers, engineers and other thought leaders in tech to explore the latest cloud and AI projects happening under Amazon's roof, including a new AI chip. Let's go live now, where Bloomberg Tech co-host Ed Ludlow is joined by a special guest. Ed, take it away.
Interviewer/Bloomberg Tech Host
Yeah. Three pieces of news moved markets this morning: a new generation of frontier model from AWS, new agentic tools, and then a very quickly released, installed, and now ramping generation of in-house custom accelerator, which is Trainium 3. All points of discussion for Matt Garman, AWS CEO. You know, the basic point with Trainium 3, and you've moved quickly to bring it to the real world, is cost, performance, efficiency over the prior generation, but also over Nvidia GPUs, over Google TPUs. I think what people are trying to understand is that ramp part I was talking about: when do real-world customers use it beyond this anchor customer, Anthropic, which relies on it currently?
Matt Garman
Yeah, well look, we're quite excited about Trainium, and Trainium 3 in particular. As you mentioned, we're excited to get it into customers' hands, and part of where we have a benefit that we can bring to bear is, as you mentioned, getting it into market quickly. And it's because we control that full stack. We control the silicon development, we control the data centers that it lands in. We know that full environment and we can land that in very large clusters for people to take advantage of that, and the performance that we're seeing out of it is quite incredible. And so we're anxious and excited to get more and more people using it.
Interviewer/Bloomberg Tech Host
I've been able to go inside Annapurna Labs and look at the engineering work between the first generation of Trainium and the second. It wasn't just the accelerator, it's at the server level as well. That's right. But part of the surprise of today is this: you appear to be committing to an annual cadence of new generations of Trainium. How do you keep that up?
Matt Garman
Well, the key thing that we're focused on is making sure that we can iterate on the technology as fast as possible. The desire and the hunger out there for more power and more compute is almost insatiable. And so the more we can take an existing power footprint, an existing set of capabilities, and bring more and more compute into that for customers to build cool applications and cool environments and to get value from that, that's what we're focused on. And so we're going to be pushing that envelope as fast as we possibly can to get those new capabilities out to customers.
Interviewer/Bloomberg Tech Host
The pitch for Trainium, I mean for both the training and inference use cases, is that it's a great deal: effective performance. At the same time you went on stage and said AWS is by far the best place to run Nvidia GPUs. How are both possible?
Matt Garman
Well, I mean, both are possible because that is a great environment to run accelerators and compute in. And so we've been working for 15-plus years with the Nvidia team and Jensen and team to deliver outstanding capabilities for our customers. And when you're running a large cluster of Nvidia GPUs, people will tell you AWS is the best place: you get the best performance, the most stable cluster, the best capabilities out there and broad scale. That's why folks like OpenAI and others are running in AWS, and we have that choice. And so for others that want to be able to take advantage of Trainium, there are some use cases that are best for Trainium, and there are other use cases where Nvidia GPUs are going to be your best option. We want to have all of those available. And so we think that if we can continue to push the envelope on what Trainium can deliver for customers and make sure that we are supporting the latest and greatest from everything that the awesome team at Nvidia is delivering, that's going to be the best outcome for our customers.
Interviewer/Bloomberg Tech Host
The plan for AWS is to basically double capacity by end of 2027 to around 8 gigawatts. Do you have a sense of how you apportion that capacity between in-house silicon and server designs of Trainium versus Nvidia GPUs?
Matt Garman
We're just going to keep pushing as fast as we can and we'll see where customer demand drives us as we go. And as you said, we're massively adding capacity. In the last year alone we've added 3.8 gigawatts of capacity, and we'll continue to add more and more over the next couple of years, and we'll let customer demand drive us a little bit on what they're looking for and what they want. And that's what we always listen to and that's what we'll continue to listen to.
Interviewer/Bloomberg Tech Host
The focus with Trainium, in the time I've been able to interact with you and talk about, again, not just the accelerator but the server design level, is that there's a lot of benefit to the customer. When does that benefit start accruing to AWS in terms of profitability? Like, if it's such a good financial proposition, you must be able soon to say we're making a lot of money on this.
Matt Garman
Yeah, well, you're seeing some of the benefits accrue. You see things like Bedrock growing really, really rapidly, and you see Trainium powering that under the covers. And we announced today that more than half of all tokens and inference done in Bedrock are done on Trainium 2 servers under the covers. And so you're already seeing that benefit come. You see the models that we're building in Nova and Nova 2 start to get better and better over time and be accelerated by Trainium. And so we really think that there's a whole bunch of dimensions on which our customers, our partners and our own products are all going to get accelerated from Trainium.
Interviewer/Bloomberg Tech Host
Every time you come onto the program, I always offer the audience the opportunity to pose a question to you. There's a lot of interest in AWS, right? Many of your customers span global technology. Actually, most of the questions were about Anthropic, and there wasn't much said on stage. I think people are trying to understand what benefit and advantage AWS offers to Anthropic while they are ramping Trainium through Project Rainier, but also ramping their TPU allocations as well.
Matt Garman
Well, look, our partnership with Anthropic is incredibly strong and it's never been stronger. And we do a ton of collaboration with them, and as I mentioned, through Project Rainier it's a huge collaboration there to go build their current generation models. All their models run today and launch on day one on top of Trainium and on top of AWS, which we're incredibly excited about, and we'll continue that partnership for a long time. I think for them, they have a huge demand for compute, and so they'll go to other places where it makes sense to round out their compute needs, because they just have such massive needs for compute and they have customers in other clouds as well. But we're definitely their primary cloud provider and closest partner, for sure.
Interviewer/Bloomberg Tech Host
Supply constraints. So Anthropic is supply constrained. They can't get the compute they need. We've talked about the ramp on Nvidia GPUs and in-house silicon. Is there a supply constraint element with AWS? Are you able to get the chips that you need?
Matt Garman
Yeah, I think there's always... Any time you see an industry that's growing as fast as this is right now, when you think about AI and model development and chips, there are going to be constraints no matter what. There is more demand than there is supply. Sometimes it's in chips, sometimes it's in power in data centers, sometimes it's in, you know, different parts of that. At some points it's networking equipment, at some point it's transistors, you know, resistors or whatever it is. And you look at the entire supply chain that is needed to ramp up at such a massive rate, right? Never before has the technology industry ramped at the rate that we are right now. And so there are always constraints. And so it's not that there is necessarily one constraint where it's like, wow, I can't get Nvidia chips. We can get Nvidia chips. And actually Jensen and team have been incredibly supportive and great partners in helping us get capacity there. It's not that you can't get power, we're getting power all over the place. It's just that we're ramping all of these places at such rapid rates that always there's a constraint in that system, and it'll change every month you ask me what the current one is.
Interviewer/Bloomberg Tech Host
Throughout the day we're speaking with your team about the idea that we're moving from AI assistants to AI coworkers. You know, a particular focus on the agentic offering that you've done. You're in the camp of people, if you don't mind me saying, that sees basically 90% of the value in enterprise coming from agentic technology. Do you have any data or evidence to support that all of your customers are ready for that?
Matt Garman
Yeah, I don't think all of our customers are yet ready for that, but they're excited about it. So, you know, I think it'd definitely be an overstatement to say everybody's ready for it. And part of that is because it is going to take change. Right? People are going to have to change how they think about work. They're going to have to change their process flows, they're going to have to change some of the things about how they get work done. It's not just going to be a magic wand that's going to come in and magically get them to get value. But almost everyone that I talk to definitely sees that that's the path. The power of agents is what allows customers to actually get that work done. And when they see that efficiency gain, when they see themselves able to accomplish things they weren't able to do before, that is when it's worth it to go make these changes. And so there's going to be work for people and it's going to take some time. Right? We're 20 years into the cloud journey and still only a fraction of workloads have moved to the cloud. So it's going to take time. It's not like people are going to magically switch, and I think it's going to be really fast.
Interviewer/Bloomberg Tech Host
Now we just have 60 seconds. 20 years into the cloud journey, when you touch down in Vegas, everyone accepts AWS is number one in terms of scale infrastructure. The question is: is AWS number one in AI? Just in the 30 seconds we have left.
Matt Garman
Yeah, I think it's a question that we got a lot two years ago and not that much a year ago. And today I don't think we get that nearly as much. It's just people that are kind of playing the same tapes. We have a huge choice of models. We see that when customers are actually moving their workloads to production, they want to run those AI workloads on AWS. And that to me is the biggest signal. When we see our customers, they say, I ran proofs of concept in a lot of places. When I want to move it to production, I want to run on AWS. And that's the thing that we hear over and over again, which makes me think we're actually in a great position.
Interviewer/Bloomberg Tech Host
Matt Garman, AWS CEO, with the full-stack AI company pitch here in Las Vegas at re:Invent.
Date: December 2, 2025
Host: Ed Ludlow, Bloomberg Tech
Guest: Matt Garman, CEO of AWS
Event Context: Live from AWS re:Invent, Las Vegas, focused on AWS's latest advances in cloud and AI infrastructure.
This episode spotlights the rapid AI advances by Amazon Web Services under CEO Matt Garman, focusing on custom AI chip innovation, the practical AI race among cloud giants, capacity scaling, customer adoption, and AWS’s partnerships—particularly with Nvidia and Anthropic. Garman delivers a comprehensive look into AWS’s approach to AI infrastructure, addressing both technical and business aspects, and gives a candid outlook on industry challenges and AWS’s competitive edge.
AWS announced the next-generation “Trainium 3” accelerator chip, unveiled at re:Invent.
Garman explains that AWS benefits by controlling the whole stack—silicon, data center, server design—which enables faster delivery to customers and significant performance gains.
Matt Garman (01:40): “We control the silicon development, we control the data centers... and the performance that we’re seeing out of it is quite incredible.”
AWS appears to be committing to an annual release cadence for new chips and server generations.
Matt Garman (02:30): “The desire and the hunger out there for more power and more compute is almost insatiable... we’re going to be pushing that envelope as fast as we possibly can.”
Despite AWS’s advances with its own Trainium chips, Garman reasserts their ongoing commitment to Nvidia, with AWS touted as the best place to run Nvidia GPUs.
Matt Garman (03:22): “We’ve been working for 15+ years with the Nvidia team... people will tell you AWS is the best place, you get the best performance, the most stable cluster.”
Emphasizes customer choice—different workloads benefit from different accelerators—AWS aims to provide both best-in-class Nvidia infrastructure and proprietary solutions.
AWS plans to double its cloud capacity by end of 2027 (to ~8GW).
Asked about resource allocation between in-house silicon and Nvidia GPUs, Garman says customer demand dictates provisioning.
Matt Garman (04:34): “We’ll let customer demand drive us a little bit on what they’re looking for... and that’s what we’ll continue to listen to.”
Garman points to rapid growth in AWS Bedrock (AI services platform) as evidence of accruing financial benefits from custom silicon.
Matt Garman (05:18): “We announced... more than half of all tokens and inference done in Bedrock are done on Trainium 2 servers under the covers.”
AWS’s own Nova and Nova 2 model families are also being accelerated by custom chips.
The AWS-Anthropic partnership is “incredibly strong,” and Anthropic’s core models launch first on AWS/Trainium infrastructure—but due to massive demand, Anthropic uses other clouds as needed.
Matt Garman (06:18): “We’re definitely their primary cloud provider and closest partner for sure.”
Addresses AI compute supply constraints: issues are complex and shift quickly, not isolated to a single vendor or hardware type.
Matt Garman (07:14): “Never before has the technology industry ramped at the rate that we are right now... there are always constraints. It’ll change every month.”
AWS is investing heavily in “agentic” (autonomous AI coworker) technology, which Garman believes will provide “90% of the value” in enterprise AI—but says widespread customer adoption will take time and mindset change.
Matt Garman (08:38): “It is going to take change. People are going to have to change how they think about work, change their process flows… but they see that’s the path.”
Draws analogy to cloud adoption: still only a fraction of workloads have moved to cloud after 20 years.
Ludlow asks whether AWS, widely accepted as number one in infrastructure scale, is also number one in AI.
Matt Garman (09:44): “When we see our customers... when I want to move it to production, I want to run on AWS. And that's... what makes me think we’re actually in a great position.”
On controlling the full stack:
“We control the silicon development, we control the data centers that it lands in... we can land that in very large clusters for people to take advantage of that.”
— Matt Garman (01:40)
On AI demand:
“The desire and the hunger out there for more power and more compute is almost insatiable.”
— Matt Garman (02:30)
On AWS as Nvidia partner:
“When you’re running a large cluster of Nvidia GPUs, people will tell you AWS is the best place. You get the best performance, the most stable cluster.”
— Matt Garman (03:22)
On customer-led infrastructure growth:
“We’ll let customer demand drive us a little bit on what they’re looking for and what they want.”
— Matt Garman (04:34)
On Bedrock and custom silicon benefits:
“More than half of all tokens and inference done in Bedrock are done on Trainium 2 servers under the covers.”
— Matt Garman (05:18)
On supply chain constraints:
“Never before has the technology industry ramped at the rate that we are right now... always there’s a constraint in that system and it’ll change every month.”
— Matt Garman (07:14)
On the shift to AI coworkers:
“It is going to take change... almost everyone I talk to definitely sees that that’s the path.”
— Matt Garman (08:38)
On AWS’s AI leadership:
“When I want to move it to production, I want to run on AWS. And that’s the thing that we hear over and over again, which makes me think we’re actually in a great position.”
— Matt Garman (09:44)
Throughout, Garman is pragmatic, bullish, and energetic about AWS’s innovations. He is frank about both the pace and the challenges of the AI/cloud race, repeatedly emphasizing AWS’s philosophy of customer choice and bottom-up innovation while projecting quiet confidence in AWS’s market position.