
Tomer Cohen
AWS CEO Matt Garman is here to talk about AI, the state of the economy and Amazon culture. That's coming up right after this. The LinkedIn podcast network is sponsored by Dell AI Factory with Nvidia, which provides AI solutions that are easy to implement, secure and tailored to your business. Visit Dell.com to learn more. I'm Tomer Cohen, LinkedIn's chief product officer. In my new podcast, Building One, I interview some of the best product builders out there.
Matt Garman
People at the intersection of dreaming and building and learning.
Tomer Cohen
Together, you and I will learn from their experiences. If you're just as curious as I am, follow Building One wherever you listen.
Matt Garman
And check out the conversation on LinkedIn.
Tomer Cohen
Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world and beyond. We are here in Las Vegas, Nevada, at Amazon's re:Invent conference with the CEO of AWS, Matt Garman. Matt, so great to see you. Welcome to the show.
Matt Garman
Yeah, thank you for having me.
Tomer Cohen
So let's talk a little bit about infrastructure. You're the kings of building data centers, right? There's no one that does it better than AWS. But there are headlines in the AI world. Elon Musk took 122 days to build a 100,000 plus GPU data center. Does this show that scaling data centers is now core to competing in AI? Is it validation? What do you think about it?
Matt Garman
Well, look, we've been building data centers for almost two decades now. So this is something that we spend a lot of time on, and it's less something that we're out there bragging to the press about. But what we do is we provide.
Tomer Cohen
You can brag. What's the size of yours?
Matt Garman
Well, more what we do is provide infinite scale for customers. Our goal is for customers largely not to have to think about these things, right? We want them, across their compute, across their storage, across their databases, to be able to scale to any size. Take something like S3 as an example. It's an incredibly complex, very detailed system that keeps your data, keeps it durable and scales infinitely. And customers largely just put data in there and don't have to think about it. Today S3 actually stores 400 trillion objects. That's an enormous number that's hard to even get your head around. But it's just something where we keep scaling and we keep growing for our customers. As you think about AI now, these are power-hungry, massive data centers for sure. And AWS is adding tons and tons of compute all the time for our customers. Largely what we think about, though, is less how fast you can build one particular cluster. The absolute size of AWS dwarfs any other particular cluster out there. But we're focused on how we deliver the compute that customers need to go build their applications. Take somebody like Anthropic as an example. Anthropic has what are widely considered to be the most powerful AI models out there today in their Claude set of models. We're building together with them what we call Project Rainier. It uses our next-generation Trainium 2 chips, and this cluster that we're building for them in 2025 will be five times the size, in number of exaflops, of what they used to train the current generation of models, which are by far the most powerful ones out there. It's going to be five times that size next year, all built on Trainium 2, delivering hundreds of thousands of chips in a single cluster for them so they can train the next generation.
That's the type of thing where we work with customers, understand what's interesting for them, and then help them scale to whatever level they need. And that's just one of our customers. Of course, we have hundreds and hundreds of other customers as well.
Tomer Cohen
Here's my point. You're so good at this, right? Look at what you just talked about in terms of helping Anthropic scale the way that you are. That would lead me to believe that Amazon would have its own cutting-edge, state-of-the-art model, one that would lead and be better than the OpenAIs and the Anthropics. This is your core competency and this is what makes these models run. So why hasn't that happened?
Matt Garman
Our core competency is about delivering compute power for all of the people that need it. For a long time we've been very focused on how we build the capabilities to let our customers build whatever they want. Sometimes those are areas that Amazon also builds in, and other times they're not. Whether it's the database world, the storage world, the data analytics world, or the ML world, we build this underlying compute platform that everybody can go build upon, and sometimes we build services that compete with others out there in the market. Think about Redshift competing with Snowflake, who's also a very important partner of ours and a big customer of ours and somebody that we do a lot of partnering with. And then there are other times where there are applications that people build on top of AWS that Amazon doesn't go and build. So we operate across that whole swath. Sometimes we'll build and sometimes we won't. But that's the beauty of AWS: our goal is to build the infrastructure, this platform on which everybody can go build the broadest set of applications possible.
Tomer Cohen
But I'm thinking about it for AI specifically. And in the world that you play in, you have Google, they have their own model, they sell cloud services, you have Microsoft. Okay, they don't have their own models necessarily, but they have this deal with OpenAI. Pretty sure that OpenAI is exclusive on Azure. Now this is where a lot of the growth is coming from.
Matt Garman
And I think it's a mistake, actually. The interesting thing there, and this is where a lot of people started, is that I just think it's fundamentally the wrong way of thinking about it. A lot of times people think there's just going to be this one model, and I want to have the one model that's going to be the most powerful, the one model to rule them all. As you've seen over the last year, there isn't one model that's the best at everything. There are models that are really good at reasoning. There are models that provide open weights, so that people can bring their own data and fine-tune them and distill them and create completely new models that are completely custom for customers. For that, you may want to use a Llama model or a Mistral model. There are customers who really want to build the world's best images, and they might use something like a Stability model or a Titan model. There are customers that need really complex reasoning, and they might use an Anthropic model. There's a whole ton of these out there. Our goal is to help customers use the very best. It doesn't have to be one thing. We don't think that there's one best database, we don't think there's one best compute platform or processor, and we don't think that there's one best model. It's across that whole set. That's been our strategy, and customers have really embraced it. As they think about how they go build production applications, they want the stability, the operational excellence and the security that they get with AWS, but they also want that choice. It's incredibly important for them. And I think choice is important for customers.
No matter if they're building gen AI applications, no matter if they're picking a database offering, no matter if they're picking a compute platform, they want that choice. That is something that AWS has, from the very earliest days, really leaned into. I think it's an important part of our strategy, and it's maybe not the strategy that others have. Maybe others say it's just this one, and this is the one that we're gonna lean into, but it's not the strategy that we've picked. Our choice is around choice, and it's part of why we have the broadest partner ecosystem as well. It's why, as you walk the halls here at re:Invent, it's filled with partners who are building their business on top of AWS, who are leaning in and helping our joint customers accelerate their journeys to the cloud, their modernization efforts, their AI efforts. I think that is a lot of what makes AWS special.
Tomer Cohen
Okay, and I'm gonna move off this in a moment, but the reason why I'm asking these questions is because you do have at least a bet that big foundational models are gonna matter. That's the $4 billion you just invested in Anthropic. And I think that the strategy that AWS has makes a lot of sense, right? This Bedrock strategy: there are a lot of different models in there. People have their data in the cloud, and they're gonna build with the data they have within AWS, using Bedrock, picking models. But you also are limited by the fact that OpenAI is not there. I don't think Google's there. So wouldn't it make sense, in parallel to the bring-your-own-model strategy, to also use this capacity that you have to scale infrastructure to get in the game yourself?
Matt Garman
Look, what I will say is I'll never say never, right? I think that it's an interesting idea and we never close any doors. I think we're always open to, frankly, a whole host of things. We're always open to having OpenAI be available in AWS someday, or having Gemini models be available in AWS someday, and maybe someday we will spend more time focused on our own models, for sure. I think all of that is open, and that's part of what I think makes AWS special. Take our announcement earlier this year about partnering deeply with Oracle, about making Oracle databases available in AWS. Lots of people would say, oh, that's never going to happen, and it's against your strategy. Our strategy is to embrace all technologies, because we want anything that customers can use to be available and usable inside of AWS. And look, sometimes it happens today, sometimes it happens tomorrow, sometimes it happens weeks from now, months from now, years from now. But that is our goal: to make all of those technologies available for our customers.
Tomer Cohen
Okay, I'm gonna parse your language a little bit, because you said that you might be open to having OpenAI on Bedrock within AWS. Are you talking to them? Would you want to ask them to come on?
Matt Garman
There's nothing to announce there today, but I'm saying if customers want that, that's something that we would want and we'd love to make it happen at some point.
Tomer Cohen
Okay, well, maybe they're listening and they want to make that move.
Matt Garman
Yeah.
Tomer Cohen
But let's speak to the one that I think is the biggest challenger to them, the one that you have all this money in, which is Anthropic. So what does the 4 billion that you just invested in Anthropic get you, and how does that make you differentiated from other cloud providers?
Matt Garman
Well, there's a couple things I'd say. One is, you know, we make the investments in Anthropic because we think it's a good bet. They have a very good team. They've made some incredible traction in the market, and we really like where they're innovating.
Tomer Cohen
Yeah, we're definitely Claude heads on the show.
Matt Garman
Yeah, it's a fantastic product. Right. And Dario and team are very good, and they continue to actually attract some of the best talent out there in the market today. The other thing that we get from that is a deep collaboration on Trainium, and we've made a big bet on Trainium as an additional option for customers. You know, the vast majority of.
Tomer Cohen
We should define. Oh, sorry, go ahead. I'll just say the chips that people can. That companies can use to train their own models with.
Matt Garman
That's right.
Tomer Cohen
At AWS.
Matt Garman
That's right. Today, the vast majority of AI processing, whether it's inference or training, is done on Nvidia GPUs, and we're a huge partner of Nvidia. We will be for a really long time. They make fantastic products, by the way, and they continue to do that. When Blackwell chips come out, I think people are very excited about that next-generation platform. But we also think that customers want choice, and we've seen that time and time again. We've done that with general-purpose processors: we have our own custom general-purpose processor called Graviton. And so we actually went and built our own AI chips. The first version was called Trainium. We launched Trainium 1 a couple of years ago, in 2022, and are just GA'ing Trainium 2 here at re:Invent.
Tomer Cohen
So that's news that's happening this week.
Matt Garman
That will be announced in my keynote.
Tomer Cohen
Yes.
Matt Garman
Remember, this is being filmed. It may have already happened.
Tomer Cohen
We're going to release the transcript as your keynote hits in the podcast the day later.
Matt Garman
Great.
Tomer Cohen
But this is brand new news. Fresh off the press, folks.
Matt Garman
Fresh off the press. And so we'll have Trainium 2, and Trainium 2 gives really differentiated performance. We see 30 to 40% price-performance gains versus our instances that are GPU-powered today. So we're very excited about Trainium 2, and customers are really excited about it. And what Anthropic gives us, back to your question, is a leading frontier model provider that can really work deeply with us to build the very largest clusters that have ever been built with this new technology, where we can learn from them. Right? We learn what's working, what's not, and what things need to be accelerated, so that Trainium 3 and Trainium 4 and Trainium 5 and Trainium 6 can all get better as we continue to go, and so that the software associated with the accelerators gets better as well. I think that's one of the places where people who've tried to build accelerator platforms before have fallen down: the software support has not been as good. Nvidia's software support is fantastic. So that's a big area where they're helping us, as we iron out the creases and the kinks and try to figure out how we make sure that developers can start to use these Trainium 2 chips in a very seamless and high-performance way. We learn a lot from them as big users that are really leaning in. And they get that scale and cost benefit of running on this price-performance platform, which gives them a huge win. We think that from that investment we can both benefit as they deliver better and better models over time.
Tomer Cohen
There's an interesting thing that happens when I speak with people who are working in cloud or working to train models or working to build their own chips. There's always a preface. We love working with Nvidia and we're also building chips that compete with what they do. So how does that relationship work out? They don't get upset that you're trying to build the same. I mean they have a supply issue. But how does it work with them?
Matt Garman
No, no, I have a great relationship with Nvidia and Jensen, and this is a thing that we've done before. We have a fantastic relationship with Intel and AMD, and we produce our own general-purpose processors. It's a big world out there, and there's a lot of market for lots of different use cases. It's not that one is going to be the winner. There are going to be use cases where people are going to want to use GPUs, and there are going to be use cases where people are going to find Trainium to be the best choice. There are use cases where people find that our Intel instances are the best choice for them. There are ones where they find that the AMD instances are the best choice for them. And there's an increasingly large set where they find that Graviton, which is our purpose-built general-purpose processor, is the right fit for them. It doesn't mean that we don't have great relationships with Intel and AMD. And it means we'll continue to have a great relationship with Nvidia, because for them and for us it's incredibly important for Nvidia processors and GPU-powered instances to perform great on AWS. So we are doubling down on our investment to make sure that Nvidia performs outstandingly in AWS. We want it to be the best place for people to run GPU-based workloads, and I expect it will continue to be for a really long time.
Tomer Cohen
What's the buying process like with Nvidia? Because you want as many chips as you can get, I would imagine. You have Elon who buys them by the truckload. You have Zuckerberg who has been buying lots and I think he wants to power them with a nuclear submarine or something like that. So do you have to jostle with the other companies to get Nvidia chips or do you get every client?
Matt Garman
Nvidia's very fair about how they go about. I mean you can ask them about how they internally allocate. That's not really a question for me, it's for them. But they're very fair in dealing with us and we give long term forecasts and they tell us what they can supply. And we all know that there's been shortages in the last couple of years specifically as demand has really ramped up and they've been great about ensuring that we get enough to support our joint customers as much as possible.
Tomer Cohen
What about your inference chips, Inferentia? Because last time I heard you speak, you said that the activity within AI right now, gen AI, is 50% training, 50% inference. Does that ratio still hold, and how are you going to put the chips out there to allow companies to do cheaper inference? Because that's the issue with generative AI. It works well, but it's so expensive that companies build proofs of concept and only one-fifth actually make it out into production.
Matt Garman
Yeah, it's absolutely the case. We're still probably seeing about that ratio of 50/50. If anything, it's more and more inference than training, and increasingly we'll see more of the workload shift that way. Cost is a super important factor that many of our customers are definitely worried about and thinking about on a daily basis. If you think about where a lot of people were, they went and did a bunch of these gen AI tests, right, where they launched hundreds of proofs of concept across the enterprise without really paying attention to what the value was going to be or anything like that. And now they're looking at them and they say, well, the ROI is not really there. They're not really integrated into my production environment. They're just kind of these POCs that I'm not getting a lot of value out of, and they're expensive, as you mentioned. So there are two things that people are thinking about. One, how do I lower the cost so that it's much cheaper to run? That's your point about the cost of inference. And two, how do I actually get more value out of that, so the ROI equation just completely shifts and makes more sense? And it turns out it's probably not all 100 of those; it's probably two or three or five of those that are really valuable. On the cost side, there are a few things that we're doing to help people lower costs. Number one, I think Trainium 2 will have a material impact there. These models have gotten bigger and bigger. You mentioned Inferentia. Originally we had a small chip called Inferentia that would run really fast, lightweight inference. Now, as you're running models that have billions, tens of billions, hundreds of billions, trillions of parameters, they're way too big to fit on these small inference chips.
And effectively they're running on the same chips used for training; they're the exact same things. So you run inference today on H100s and H200s, or you run inference today on Trainium 2s or Trainium 1s. We may come out over time with other Inferentia chips, if you will, but they're really using a lot of that same architecture and they're still really large servers. So we actually expect that Trainium 2 is going to be a fantastic inference platform. Naming is not necessarily always our strong suit when it comes to what these chips are for, but it's going to be a fantastic inference platform. Think about that 30 to 40% price-performance benefit that customers are going to get: if you can run inference 30 to 40% cheaper compared to the leading GPU-based platforms, that's a pretty big price decrease. And then there are a couple of others. Also announced at re:Invent, we're launching automated model distillation inside of Bedrock. What that lets you do is take one of these really large models that's really good at answering questions and feed it all your prompts for the specific use case you want, and it'll automatically tune a smaller model based on those outputs, teaching the smaller model to be an expert only in the area that you want with regard to reasoning and answering. So you can get these smaller, cheaper models, say a Llama 8B model as opposed to a Llama 405B model, that are cheaper to run and faster to run, and you can still get them to be an expert at the narrow use case that you want. That, combined with cheaper infrastructure, is one of the things that we think is really going to help people lower their costs and be able to do more and more inference in production.
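The distillation flow described above can be pictured with a small sketch. It covers only the data-collection half of the idea: prompts for one narrow use case are sent to a large "teacher" model, and the prompt and answer pairs are saved as the examples a smaller "student" model would later be tuned on. This is not the Bedrock API. The names `build_distillation_dataset`, `to_jsonl` and `toy_teacher` are hypothetical, and the teacher here is a stand-in function rather than a real model call.

```python
import json
from typing import Callable, Iterable


def build_distillation_dataset(
    prompts: Iterable[str], teacher: Callable[[str], str]
) -> list[dict]:
    """Ask the large teacher model each prompt and record its answer."""
    records = []
    for prompt in prompts:
        records.append({"prompt": prompt, "completion": teacher(prompt)})
    return records


def to_jsonl(records: list[dict]) -> str:
    """Serialize the pairs one JSON object per line, a common tuning format."""
    return "\n".join(json.dumps(record) for record in records)


def toy_teacher(prompt: str) -> str:
    """Stand-in for a large model (assumption); a real teacher would be a model call."""
    return f"ANSWER({prompt.strip().lower()})"


# Prompts limited to the single use case the student should become expert in.
prompts = ["Is hail damage covered?", "Is flood damage covered?"]
dataset = build_distillation_dataset(prompts, toy_teacher)
jsonl = to_jsonl(dataset)
```

The tuning step itself is left out on purpose: it depends on the training stack, which the conversation does not specify.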
Tomer Cohen
Yeah, those small models seem to be the cost solution. Sounds like you're a believer. One more question about Nvidia. You've tested the new Blackwell chip. Is it the real deal?
Matt Garman
We have. You know, they're working on getting the yields up and getting it into production, but we're excited about that. Also at re:Invent, we're going to announce the P6, which is the Blackwell-based instance that's coming early next year, and we're excited about that. I think we are expecting about two and a half times the compute performance out of a Blackwell chip that you get out of an H100. So that's a pretty big win for customers.
Tomer Cohen
So you're on board with Jensen's "the more you spend, the more you save."
Matt Garman
That's right. That team has executed quite well and they continue to deliver huge improvements in performance and we're happy to make those available for customers. Okay.
Tomer Cohen
Should we talk about ROI?
Matt Garman
Sure.
Tomer Cohen
All right. It's the two-year anniversary of ChatGPT, and all these companies have rushed to put generative AI in their products. To this point, there are a couple of things that I've heard have worked well. AI for coding. AI that is a customer service chatbot with a little more juice. AI that can read unstructured documents and make a little sense of them. Those are the three big ones. I haven't heard much more outside of that. We're talking about something that's added trillions of dollars potentially to public company market caps. Something that has had the largest VC funding round and then probably the subsequent, yeah, three after that.
Matt Garman
Yeah.
Tomer Cohen
Are the three examples that I listed enough to make this worth the money?
Matt Garman
No, definitely not. But they're super valuable right now and they're just the tip of the iceberg. And that's the thing: you just have to look at the rest of the iceberg to realize how big the opportunity is and where we're going. And on those three, look, I think those are actually massive opportunities by themselves. We have a number of announcements here at re:Invent around Q Developer, making developers and their whole life cycle more valuable. Just using this as an example, the first generation of Q Developer was just code suggestions. Right? Code suggestions are super valuable, actually. They made developers much more efficient at coding. It also turns out that developers on average code about one hour a day. The rest of their day is spent doing documentation, writing unit tests, doing code reviews, going to meetings, upgrading existing applications, doing all that stuff that's not writing code. Maybe some ping pong in there. Yeah. And so as part of that, we're actually launching a bunch of new agents that do all of those things for you. You can just type in /test and it'll automatically write unit tests for you as you're sitting there coding. You can have a Q Developer agent write documentation for you as you're writing code, so you have really well-documented code when you're done and you don't have to go think about it. It'll even do code reviews and look for where you have risky parts of your code, where you maybe have open source parts whose licensing rules you should go look at and think about, and even where you may want to think about how you're deploying stuff: the things that you would expect out of somebody doing a code review for you before you go do a deployment. Q can now do all that for you.
Same on the contact center side. We're doing a ton of announcements around Connect, which is our contact-center-in-the-cloud offering, making it much more efficient for customers to get a ton out of that contact center, all powered by generative AI. And to your point, those use cases I think get more and more valuable as you add more capabilities. If you think about where things are going, and how I talked about code generation, a bunch of the value is in adding agents in there so they can do a bunch of these things. Now it's not just giving you code suggestions, it's actually going and doing stuff for you. It's writing documentation for you. It's helping you identify and troubleshoot where you have operations issues. It says, ooh, you have an operations issue, and it can look at and understand your whole environment. You can interact with it, and you and Q together can go and look and say, oh, it looks like some permissions over here were broken, and if you go fix those, it'll fix your application. So it saves tons of time across that whole development lifecycle. And I think that's where AI gets to be more integrated into the core of what a business is, the core of what you do. And you really have to learn it. That's where you get the value. There are a number of these, but there's a startup that we work with called EvolutionaryScale, and they use AI to try to discover new proteins and molecules that may be more applicable to solving certain diseases. Now, think about it: AI here is not just generating stuff.
Instead of being able to find tens or hundreds of new molecules a year, you can now find hundreds of thousands of different proteins, test all of them, figure out which are the most likely to be successful, and get drugs to market much faster. And that's a huge amount of additional revenue. So think about models and capabilities that can do that, whether it's in healthcare and life sciences, in financial services, or in manufacturing and automation. Every single industry, in our view, is going to be completely remade by generative AI at its core. That's where we think you get that huge value.
Tomer Cohen
I have a question about this. I was speaking with a developer friend who said, yes, AI can code, AI can do all these things, probably looking at the different things that these agents can do. The problem is, and this applies probably across the board, when you trust things to generative AI, something breaks and then you've lost the skillset to go in and fix that because you've relied so much on the artificial intelligence. What do you think about that? Isn't that a problem?
Matt Garman
No. What's three times four? Twelve. Yeah. You have Excel, but you still know how to multiply. Like, I would say maybe, but, you know, you're able to.
Tomer Cohen
Kind of do those things different than multiplication. This is.
Matt Garman
Well, it's different. But I think the key parts of coding are not the semantics of writing in a language. The key parts of coding are thinking about how you break down a problem and how you creatively come up with solutions. And I think that doesn't change. The tools change, and they can make you more efficient. But the core of what the developer actually does is not going to change. There are not a lot of developers today that know a lot about garbage collection. It's just true. They don't. Right? Because Java just does it for them and they don't have to worry about it. That doesn't mean that if it breaks, people all of a sudden don't know how to do garbage collection. They can go figure it out and do it. They just don't do it as part of their daily jobs, because it's not fun and it's not value-added, and they can focus more on writing code. Right? This is what new languages have done. And so increasingly, I think developers are going to get to do the things that are exciting. They're going to get to do the creative work. They're going to get to figure out how to go solve those interesting problems, and they're going to be able to move much faster because they don't have to worry about writing documentation. And someday if it breaks, they probably will know how to write documentation and will figure out how to fix that. It's not rocket science. It's just things they don't necessarily want to do today.
Tomer Cohen
So you're a believer in reasoning. I know that AWS has some news also this week that you're going to have automated reasoning tests, where it checks for hallucinations before an answer goes out. Another issue when it comes to ROI is, again, how can I trust it? It always comes out with wrong answers. So talk a little bit about your announcement this week and how reasoning can solve some of these issues.
Matt Garman
It's a different reasoning than you might be thinking about, too. Automated reasoning is a form of artificial intelligence that's been around for a while, and it's a thing that Amazon has adopted pretty significantly across a number of different places. What it does is use mathematical proofs to prove that something is operating as you intended. That's the historical use. An example of that is we use it internally to make sure that when you change permissions, our permissioning system is actually behaving as expected. Across all the places that permissions are applied, the surface area is too large for you to go check everything. This system can prove that they're applied in the intended way, because it knows how the system is supposed to operate. It can mathematically prove that, yes, your IAM permissions mean you can access this bucket or you can't access this bucket. We took that and we said, can we apply that to AI to eliminate hallucinations? It turns out you can't do it universally. But for selected use cases where it's important that you get the answer right, you can. As an example, say you're an insurance company and you want to be able to answer questions from people who say, hey, I have this problem. Is it covered? Right? You don't want to say yes when the answer is no, or vice versa. So that is one where it's pretty important to get it right. What you do is upload all your policies and all of your information into the system, and we'll automatically create these automated reasoning rules.
And then there's a process you go through that's a couple minutes, kind of 10, 15, 20, 30 minutes, where you as the developer answer questions of how it's supposed to interact and you tune it a little bit to say, yep, that's how you'd answer that type of question, or no, that's what this means. It'll ask you questions and you kind of interact with it and then it goes, okay, now I have a tuned model. Now, if you go ask it a question and you say, hey, you know, I ran my car through my garage door, is that covered by my insurance policy? It'll go and it will actually produce a response for you. And then it'll tell you that, yes, this is provably correct, that the answer is yes, and here are the reasons why in the documentation I have and why I feel confident. Or it'll tell you, actually, I don't know the answer. Here are some suggested prompts that I recommend you put back into the engine to see if you can get the answer correct, because I can't tell you. I came up with yes, but I actually don't know for sure that it's the right answer. Change the prompts. And it'll give you kind of tips and hints on how you can reengineer your prompts or ask additional questions to come back until you get an answer that's a for-sure answer that's provably correct by automated reasoning. So by this kind of mechanism, you're systematically able to actually mathematically prove that you got the right answer coming out of this and completely eliminate hallucinations for that area. Right. It doesn't mean that we've eliminated hallucinations all up, but just for that area. Yeah, if you go ask it, then, who's the best pitcher on the Mets? It may or may not give you a reasonable answer to that question.
Tomer Cohen
But let me ask you, what you're talking about also is very similar to what Marc Benioff talked about on the show last week, where he said that because companies have large stores of information within his platform, agents will be able to go in and pull it out and then present it and sort of help create the linkage to go from step A to step B. And it was interesting to me because I had always thought agents are going to be something that may be built by Anthropic, where it's my individual agent that goes out into the world and does what I need. And I think both you and Benioff, and correct me if I'm wrong, have this idea that the agent is going to be something that I'm going to interact with when I'm speaking with the company or actually is going to perform tasks at work. Maybe that's going to happen before consumers get them.
Matt Garman
Yeah, I think that that's right. I think that agents are going to be a really powerful tool. Actually, that connects to another thing that we're launching this week. Agents today are quite good at doing relatively simple tasks. What they're very good at, actually, is tasks that are pretty well defined in a particular narrow slice, where they go accomplish something. And so what a lot of people are doing is starting to launch a bunch of agents, right? One that's very good at going and doing, you know, one particular task, another one that's good at another task, another one that's good at another task. But increasingly you actually need those agents to interact with each other. Right? So we have an example in my keynote where we talk about, if you're thinking about, should I launch a coffee shop? Say you're a global coffee chain and you want to say, I'm going to launch a new location here. You might have an agent that goes out and investigates what the situation is in a particular location. You might have another agent that goes and looks at what the competitors are in that area. You may have another agent that goes and does a financial analysis of that particular area, another one that looks at the demographics of that zone, et cetera. And that's great. So now you have like half a dozen agents that go and do a bunch of these things. Saves you some time, but they actually kind of interact with each other, right? Like the demographics may change your financial analysis, as an example. And so that's super hard. And then if you want to do it across 100 different locations to see where the best one is, that's also hard to do. And it's super hard to coordinate because actually those also may be interrelated too. Like putting a coffee shop here and then another one two blocks down may interact with each other. They can't be independent.
So we launched a multi-agent collaboration capability where you basically have this kind of super agent brain that can actually help collaborate across all of them, break ties between them, help pass data back and forth between them. And so we think that this is going to be a really powerful way for people to really accomplish much more complicated things out there in the world. Again, there's a fundamental model under the covers that's driving a bunch of this reasoning and breaking these jobs into individual parts, and then the agents go and actually accomplish a bunch of this work.
Tomer Cohen
Okay, I'm just going to say before we go to break, I appreciate how much news that you're weaving into this.
Matt Garman
Yeah, sure.
Tomer Cohen
This is the ultimate number of keynote announcements that have been introduced into a podcast. So thank you for that.
Matt Garman
All right, welcome to re:Invent.
Tomer Cohen
Exactly. All right, we're going to take a quick break and come back with Matt Garman, the CEO of AWS.
And we're back here on Big Technology Podcast with Matt Garman of AWS. Let's talk about some Earth-centric topics, starting with nuclear. You have invested $500 million in a company called X Energy to do nuclear. You're also part of, I would say, a wave of companies that are reanimating nuclear energy in the United States. And part of that is because these nuclear plants had excess capacity that they needed to get off their hands. I just want to ask you a broad question: should we really believe in this moment for nuclear? Because on one hand it's, for the moment, cleaner than fossil fuels. On the other hand, we don't really know what happens with nuclear waste. We can't get rid of it. It has to sit in silos that could be damaging for the planet over time. So is it really, and this is sort of a sensitive one, but is it really an improvement to go to nuclear? And how can we be sure, given the long-term effects here?
Matt Garman
Yeah, look, I think nuclear is a fantastic option for clean energy. It is a carbon-zero energy that has a ton of potential. And as you look at the energy needs over the next couple of years and really the next couple of decades, whether it's from technology or broadly in the world, whether it's electric cars or just the general electrification of lots of things in our world, we're going to need a lot more energy. And, you know, we at Amazon are one of the biggest investors in renewable energy in the world. In the last five years we've done over 500 renewable projects where we've added and paid for new energy to the grid, whether they're solar or wind or others. And so, you know, we'll continue to invest in those projects. I think they're super valuable, and there's probably not going to be enough of those soon enough for us to really get to where we want to get from a clean energy perspective. And so I think nuclear is a huge portion of that. You know, look, there's always the fear mongering from back in the 60s and 70s about what nuclear used to be. Nuclear is an incredibly safe technology today. It's much different today. It turns out technology has changed in the last 50 years. It's improved a lot. And so there are a ton of improvements in that space. And we think that it is both a very safe and a very eco-friendly energy source that is going to be critical for our world as we keep ramping our energy needs. And we think that as part of that portfolio, right, you're going to have solar, you're going to have wind and you're going to have others, but nuclear is going to play an important role in that and we're excited about what that potential looks like. You mentioned X Energy. We do think that probably starting somewhere in 2030 and beyond, these small modular reactors, which is what X Energy builds, are going to be a huge component of this.
And so they'll be part of that portfolio of offerings. But today all these nuclear plants that people build are really large implementations. They cost billions and billions of dollars to build, and they produce lots of energy, which is great, but obviously all that energy is in one location and then you have to invest in a ton of transmission to get the energy to the actual place you need it to go. And they're big projects. These small modular reactors are much smaller. You can actually produce them almost like you produce gas turbines, in a factory-type setting, eventually. And you can put them where you need them. Right. So you can actually put them next to a data center, where transmission is not going to have to be an important factor. And so we think that that's a great solve for a portion of the world's energy needs as we continue to evolve over time. And it's one of the components of an energy portfolio that we're very excited about.
Tomer Cohen
Okay, so we'll be watching that closely. On the state of the economy, AWS had a few quarters of stagnant growth. It was still impressive growth, but it flatlined for a moment. And part of that was because customers.
Matt Garman
Not quite flatlined, but it was down from where it had been.
Tomer Cohen
Okay, but that's, I'm just talking about the percentage of.
Matt Garman
Yeah, yeah.
Tomer Cohen
Part of that was because the economy was in a rough moment. Everybody was looking for efficiency. And so what you did was, I think you made some deals with customers to help get their bills down or to help get them the most out of what they were doing so they could, you know, effectively live that efficiency motto. What does it look like right now? Is the economy back or are people still in efficiency mode?
Matt Garman
Yeah, I'd say. And by the way, it wasn't even just deals. We went and proactively jumped in with our customers and helped them figure out how they could reduce their bills. And we looked at where they could consolidate resources, where they could move to cheaper offerings, where they could maybe do more with less. And we were really proactive about helping customers reduce those costs because, from our view, it was important for them as they thought about how they got their economics in the right place, and it was the right thing to do for them and built that long-term trust. Now, customers, I think, number one, a lot of them have been optimized. Right. And there's only so much you can kind of squeeze into an optimized place. And customers are still looking for optimizations, but a lot of that work has been done and they're using some of that optimization to help fund some of the new development they want to do. A lot of that is in the area of AI. Much of that is in the area of migration and modernization, where they're moving from on-prem into a cloud world. And so some of those optimizations they did are helping them fund some of that work that's moving more of their workloads to the cloud and letting them go and build new AI experiences in AWS. And so that is where you've seen our growth start to come back up on a percentage basis. Some of that is customers leaning into those new experiences and doing more of those modernization migrations.
Tomer Cohen
Okay, I want to wrap on a culture question.
Matt Garman
Okay.
Tomer Cohen
Andy Jassy recently emailed the company and he said that as a consequence of scale. And I'm going to get it exactly as he said it. He says there have been pre meetings for pre meetings for the decision meetings. A longer line of managers feeling like they need to review a topic before it moves forward. Owners of initiatives feeling less like they should make recommendations because the decision will be made elsewhere. Was that going on within aws and what is the process to change that?
Matt Garman
Yeah, I think it's across Amazon. So it wasn't specific to AWS or to the rest of Amazon. It was definitely inside of AWS too. And I think, look, it's kind of a natural evolution. We have these leadership principles inside of Amazon. Things like being customer obsessed and really understanding the customer. And in order to really understand the customer, you've got to be close to the customer. And so you want a flatter organization, because the more layers you have, the more removed you are from customers. And we went through an era of explosive growth in just the number of people and the size of the company and the size of the business. And so throughout that, we just didn't always have the organizational structure exactly right. And, you know, we believe that a flatter organizational structure is better. The closer you are to the customers, the better decisions you're going to make, the faster decisions you're going to make. And you really want ownership to be pushed down to the people who really are making some of those decisions. And when you have a very kind of hierarchical organization where people don't feel like they have that ownership to make decisions, you go slow. And for us, speed really matters. And so I think Andy was just highlighting some observations we had, where I think he's incredibly thoughtful on these points, which I appreciate. We've had a lot of debate here, where nothing is broken, but you could see really early warning signs or stress around it. And for us, culture is so important, and doing the things in the right way. Being that customer obsessed, having the right level of ownership, is so important for what makes Amazon so special. And so it's kind of getting ahead of there being any problems. It's not like there was any burning problem. And we obviously could have just done nothing and kind of let things go for a while. But for us, that's not the Amazon way.
And so we're just being proactive, identifying that, hey, look, this is super important for us. And so let's just be aware of it, let's be upfront about it, think about it and be very intentional as we think about organizational structures and about where we can land. And I think all of that has been received really well because it turns out not many of those things customers complained about, they are really focused on ownership, they love being customer obsessed and most of that has been quite well received.
Tomer Cohen
So you can be a big company but not have big company culture.
Matt Garman
That's right.
Tomer Cohen
Okay, last one before we go. You've said that less than 20% of all workloads have moved to the cloud so far. What is the max number that that can get? Can that be 100% in time?
Matt Garman
I was going to say 100 is the smart answer.
Tomer Cohen
No, but what's realistic?
Matt Garman
Yeah, you know, I think it's a good question. I think if you think about how many workloads are out there, I don't know what the max is. I'm very bad about picking the maximum size of AWS, but I actually think that, at a minimum, that percentage can flip and it could be 80/20 versus 20/80 where it is today, or even less. I think there's a massive number of applications that just haven't moved. If you think about line-of-business applications, if you think about workloads that are in telco networks, if you think about workloads that are running inside of hospitals, it's not even just traditional data center workloads. There are a lot of these other workloads that would be much more valuable, they'd be much more connected, they'd be much more able to take advantage of advancements in AI if they were connected into the cloud world and running there. And so I think that there's a huge opportunity for us to continue to expand what it means to be in the cloud and to continue to migrate many of these workloads that just haven't moved. And so there's a massive opportunity. I think kind of flipping that percentage over time could be an interesting opportunity for us. And the size of the pie is getting bigger too. I think that's the other exciting thing about generative AI, that the total amount of compute workloads is actually significantly accelerating too.
Tomer Cohen
And timeline to flip.
Matt Garman
I think we're still a ways out for the whole thing to flip. There's just a massive amount of workloads out there, but we'll keep working on them and keep going as fast as we can.
Tomer Cohen
Matt Garman, great to meet you. Thanks so much for coming on the show.
Matt Garman
Yeah. Thank you.
Tomer Cohen
All right, everybody, thank you for listening. We'll be back on Friday, breaking down the news, and we'll see you next time on Big Technology Podcast.
Matt Garman
All right, Matt, cool. Thank you.
Tomer Cohen
All right, you're here all week till Wednesday.
Matt Garman
Okay, cool. Hope you enjoy it.
Big Technology Podcast: AWS CEO Matt Garman on Amazon's Big AI Chips Bet, Working With OpenAI, and Nuclear Energy
Release Date: December 4, 2024
In this insightful episode of the Big Technology Podcast, host Tomer Cohen engages in a comprehensive discussion with Matt Garman, the CEO of Amazon Web Services (AWS). Filmed at Amazon’s re:Invent conference in Las Vegas, Nevada, the conversation delves into AWS's strategic moves in artificial intelligence (AI), infrastructure scaling, collaborations with key players like OpenAI and Anthropic, investments in clean energy, and organizational culture within Amazon. This summary captures the key points, notable quotes, and the depth of the discussions that Matt Garman brings to the table.
Matt Garman opens the dialogue by addressing the significance of AWS’s extensive experience in building data centers. Highlighting AWS's long-standing expertise, he emphasizes the company's commitment to providing scalable infrastructure solutions for customers.
“We provide infinite scale for customers.” ([01:35])
Garman contrasts AWS’s approach with Elon Musk’s rapid construction of GPU data centers, underscoring that AWS focuses on delivering vast compute resources that allow customers to build and scale their applications seamlessly.
When questioned about why AWS hasn’t developed its proprietary AI models despite its robust infrastructure capabilities, Garman clarifies AWS’s strategic focus.
“We don't think that there's one best model. It's across that whole set.” ([05:00])
He explains that AWS prioritizes delivering a versatile compute platform, enabling customers to choose from a variety of AI models based on their specific needs. This approach contrasts with other tech giants who may prioritize developing and promoting a single dominant AI model.
A significant part of the conversation centers on AWS’s $4 billion investment in Anthropic, a leading AI model provider. Garman outlines the collaborative efforts between AWS and Anthropic, particularly focusing on the development of AWS’s custom AI chips, Trainium.
“Our goal is to help customers use the very best. It doesn't have to be one thing, it's not just one.” ([05:00])
This partnership aims to enhance AWS’s AI capabilities by leveraging Anthropic’s advanced models and AWS’s scalable infrastructure.
Garman provides an update on AWS’s AI chip technology, particularly the Trainium 2. He highlights the chip's superior performance and cost-efficiency compared to existing GPU-powered platforms.
“Trainium 2 is going to be a fantastic inference platform.” ([10:47])
Additionally, he discusses Inferentia, AWS’s inference chip, designed to reduce the costs associated with AI inference operations, addressing a critical barrier for businesses scaling their AI applications.
Despite AWS developing its own AI chips, Garman assures that the company maintains strong partnerships with key industry players like Nvidia.
“We are always open to take. Our relationship with Nvidia is great.” ([12:39])
He emphasizes that AWS values the performance of Nvidia’s processors and remains committed to optimizing their integration within AWS’s services, ensuring customers receive the best possible performance regardless of the underlying hardware.
Addressing the high costs associated with generative AI, Garman outlines AWS’s strategies to make AI more affordable and accessible.
“Trainium 2 is going to be a fantastic inference platform.” ([15:02])
He discusses initiatives like automated model distillation within AWS’s Bedrock service, which allows businesses to create smaller, more efficient AI models tailored to specific use cases, thereby significantly lowering inference costs and improving return on investment (ROI).
Garman introduces AWS’s latest advancements in AI services, including the development of intelligent agents capable of performing complex, multi-faceted tasks. He explains the concept of multi-agent collaboration, where multiple AI agents work together to accomplish intricate objectives.
“We launched a multi-agent collaboration capability where you basically have this kind of super agent brain.” ([30:34])
This innovation aims to enhance the efficiency and effectiveness of AI-driven operations across various business functions.
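The coffee shop example from the conversation can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not AWS's implementation or API: the agent functions, the scoring formula, and the site data are all hypothetical. It shows the one idea Garman describes, a supervisor that runs specialist agents in order and passes each agent's findings on to the next, so the financial analysis can depend on the demographics and competitor results.

```python
from typing import Callable, Dict, List

Findings = Dict[str, float]
Site = Dict[str, float]
Agent = Callable[[Site, Findings], Findings]


def demographics_agent(site: Site, findings: Findings) -> Findings:
    # Hypothetical specialist: reports expected daily foot traffic.
    return {"foot_traffic": site["foot_traffic"]}


def competitor_agent(site: Site, findings: Findings) -> Findings:
    # Hypothetical specialist: counts competing shops nearby.
    return {"competitors": site["nearby_shops"]}


def finance_agent(site: Site, findings: Findings) -> Findings:
    # Depends on what the other agents found (made-up formula).
    revenue = findings["foot_traffic"] * 2.0 / (1 + findings["competitors"])
    return {"score": revenue - site["rent"]}


# Order matters: the finance agent consumes the earlier agents' findings.
PIPELINE: List[Agent] = [demographics_agent, competitor_agent, finance_agent]


def supervise(site: Site) -> Findings:
    """Supervisor: run each specialist and share accumulated findings."""
    findings: Findings = {}
    for agent in PIPELINE:
        findings.update(agent(site, findings))
    return findings


def best_location(sites: Dict[str, Site]) -> str:
    """Compare candidate sites by the supervisor's final score."""
    return max(sites, key=lambda name: supervise(sites[name])["score"])
```

A real system would also have to handle the interactions between sites that Garman mentions, such as two shops a couple of blocks apart affecting each other, which this sketch deliberately leaves out.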
One critical challenge in AI is the issue of hallucinations, where AI systems generate incorrect or nonsensical outputs. Garman discusses AWS’s approach to addressing this through automated reasoning.
“We can mathematically prove that you got the right answer.” ([24:45])
By applying automated reasoning to selected use cases where a correct answer matters, such as insurance coverage questions, AWS aims to verify AI outputs against a customer’s own policies and to flag answers it cannot prove, thereby increasing trust and usability in real-world scenarios.
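A rough Python sketch can make the workflow Garman describes more concrete. It is a toy under stated assumptions, not the AWS feature or its API: the `Rule` type, the two example rules, and the claim fields are hypothetical, and simple rule matching stands in for real mathematical proof. What it illustrates is the three possible outcomes he mentions: the model's answer is confirmed by the rules, contradicted by them, or cannot be decided, in which case the user would be asked to rephrase or add detail.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Claim = Dict[str, str]


@dataclass(frozen=True)
class Rule:
    name: str
    applies: Callable[[Claim], bool]
    covered: bool  # verdict this rule yields when it applies


# Hypothetical rules, standing in for ones derived from policy documents.
RULES: List[Rule] = [
    Rule(
        "collision_with_own_home",
        lambda c: c.get("cause") == "vehicle_collision"
        and c.get("property") == "own_home",
        True,
    ),
    Rule("flood_excluded", lambda c: c.get("cause") == "flood", False),
]


def derive(claim: Claim) -> Optional[bool]:
    """Return the verdict the rules agree on, or None if undecidable."""
    verdicts = {rule.covered for rule in RULES if rule.applies(claim)}
    if len(verdicts) != 1:
        return None  # no rule applies, or the rules conflict
    return verdicts.pop()


def check(claim: Claim, model_answer: bool) -> str:
    """Compare a model's yes/no answer against what the rules establish."""
    verdict = derive(claim)
    if verdict is None:
        return "unknown"
    return "verified" if verdict == model_answer else "contradicted"
```

For the garage door question in the interview, `check({"cause": "vehicle_collision", "property": "own_home"}, True)` returns `"verified"` under these made-up rules, while a question the rules do not cover returns `"unknown"` rather than a guess.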
Shifting focus to sustainability, Garman reveals AWS’s $500 million investment in X Energy, a company specializing in nuclear energy. He advocates for nuclear power as a crucial component of a clean energy portfolio.
“Nuclear is an incredibly safe technology.” ([32:41])
Garman highlights the advancements in nuclear technology, particularly small modular reactors, which offer scalable and eco-friendly energy solutions that can be integrated closely with AWS’s data centers, reducing the carbon footprint and supporting the growing energy demands of the cloud infrastructure.
Discussing AWS’s performance in a fluctuating economy, Garman explains the company’s proactive measures to support customers through cost optimizations and strategic investments in new technologies like AI and cloud modernization.
“Customers have been optimized... now moving to AI and modernization.” ([35:15])
These efforts have helped AWS maintain its growth trajectory by enabling customers to fund new developments and adopt advanced technologies even in challenging economic conditions.
Addressing internal challenges, Garman reflects on Andy Jassy’s email about organizational inefficiencies due to rapid scaling. He outlines AWS’s commitment to fostering a flatter organizational structure to enhance customer obsession and expedite decision-making.
“A flatter organization is better.” ([37:42])
Garman emphasizes the importance of ownership and proximity to customers, ensuring that AWS remains responsive and agile despite its vast size.
Concluding the discussion, Garman tackles the topic of cloud adoption rates, noting that less than 20% of workloads have migrated to the cloud. He envisions significant growth potential as AWS continues to facilitate the migration of diverse and complex workloads.
“At a minimum, I think that percentage can flip and it could be 80/20 versus 20/80.” ([40:05])
Garman expresses optimism about the increasing integration of cloud technologies across industries, driven by advancements in AI and the expanding capabilities of AWS’s infrastructure.
Conclusion
Matt Garman’s conversation with Tomer Cohen provides a deep dive into AWS’s multifaceted strategies in AI development, infrastructure scaling, sustainable energy investments, and organizational culture. By prioritizing customer choice, fostering key partnerships, and continuously innovating its AI and energy solutions, AWS positions itself as a leader in the evolving tech landscape. This episode underscores AWS’s commitment to empowering businesses with scalable, efficient, and sustainable technology solutions.