
Narrator
Werner Vogels is the Chief Technology Officer at Amazon, where he has played a pivotal role in shaping the company's technology vision for over two decades. Before joining Amazon in 2004, Werner was a research scientist at Cornell University, where he focused on distributed systems and scalability, concepts that would later influence the design of AWS. He holds a PhD in computer science and has authored numerous academic papers on the reliability and performance of large-scale systems. As CTO, Werner has been instrumental in guiding Amazon's transition from an online retailer to a global cloud infrastructure provider. He is one of the key architects behind Amazon's push into cloud computing, helping to define the new model for delivering infrastructure. He is known for his pragmatic, customer-focused approach to technology and for championing ideas such as "you build it, you run it," "APIs are forever," and, more recently, Frugal Architecting, which emphasizes cost-effective and sustainable software design. In this episode, Kevin Ball sits down with Werner for a wide-ranging conversation. They discuss the early days of Amazon, the birth of AWS, the principles of the Frugal Architect, aligning costs to the business, collaboration between engineering and the business, technical debt, and much more. Kevin Ball, or K. Ball, is the Vice President of Engineering at Mento and an independent coach for engineers and engineering leaders. He co-founded and served as CTO for two companies, founded the San Diego JavaScript Meetup, and organizes the AI in Action discussion group through Latent Space. Check out the show notes to follow K. Ball on Twitter or LinkedIn, or visit his website, Kball LLC.
Kevin Ball
Hello, K. Ball here and I have the absolute honor of getting to introduce today the Chief Technical Officer of Amazon, Werner Vogels.
Werner Vogels
Thank you Kevin. Thanks for having me. And it's going to be a fun day.
Kevin Ball
I'm excited to get to talk to you. Now, I like to start, especially with people who have done a lot of speaking and being introduced places, by asking: how do you introduce yourself? What do you highlight when you get to say, here's who I am?
Werner Vogels
Ah, I've been doing this for 20 years and I'm kind of tired now, but I have a lot of stories to tell. In general, all these things in your past, that I've been an academic, that I was in the army and worked in hospitals, all these kinds of stories don't really matter. The last 20 years at Amazon have really formed me, especially because at Amazon every year is a different year. If you had asked me when I was 18 or so whether I would work for the same company for 20 years, I would have laughed at you. But now it's 20 years later and I'm still here.
Kevin Ball
So we wanted to talk today about cloud architecture. I think that with the rise of the cloud, Amazon has very much led that wave, and it has changed the way that we do software. And you've been very focused on it for the last few years. So how do you introduce this topic as a field of study for software engineers?
Werner Vogels
Well, maybe we should go back a little bit in time. That makes the whole story a bit easier to follow. When Jeff started Amazon, he didn't really want to start a bookshop. He was just fascinated by the Internet. What could you do on the Internet that you couldn't do anywhere else? And he just picked a bookshop. A good bookshop has 40,000 titles in stock, yet there are millions of books out there. So you could do something on the Internet that you couldn't do anywhere else. Unfortunately, nobody had written a book about e-commerce before, because the word didn't exist yet. This is '94. And so anything that Amazon tried to do after that, they basically had to invent themselves. Every piece of technology, everything you now know about e-commerce. I mean, now you would probably just use Shopify as a platform, but there was nothing. And so Amazon engineers have been really, really good at inventing themselves out of the corners that they got themselves into, because there were many things that nobody had done before. And the cool thing is, and this is already before the year 2000, that AI played an important role in all of that. Think about recommendations, similarities, customers who bought X also bought Y, things that we now consider normal everywhere. And we don't call it AI anymore, because it just works. Now, definitely in my earlier days as CTO of Amazon, we had fixed capacity, and we had a really good mechanism to predict what would happen the next year. So we bought capacity for that. Now, if I didn't buy enough, my customers would be unhappy. If I bought too much, the CFO was unhappy. So you always needed to be careful with that. And then, of course, Black Friday came, with four times the normal traffic. A nightmare, really. But what really happened five, six days before Black Friday is that suddenly a team would show up. 
They would say, oh, we've got this brilliant idea and it will deliver 50 million to free cash flow and we need to implement it immediately. And I'd go, no, we can't. But for some reason, we always made it work, because these constraints also breed creativity, and you found ways to do it. Now, when the whole cloud thing started, suddenly everything changed, of course, because you didn't have to live within the constraints that the original fixed hardware had. And I thought that would also become the point where people started to realize how much certain architectures cost. I love the way that we built AWS. We built amazing technology that nobody else had built before, and as CTO, of course, I'm amazingly proud of that. But maybe what I'm most proud of is that we changed the economic model. Before that, I was extremely frustrated as a CTO, because in dealing with vendors, I always felt that the vendor was in charge. I was never in charge. If I wanted to get the cost down of, let's say, this database, I needed to sign a five- to ten-year contract. I had no idea how much we would need at Amazon five years from now. So you massively overscaled, also to avoid the penalties you would get if they checked the number of databases that you were using. And then you would write this check, and this check had many zeros on it. And the moment you gave the check to the guy that you were dealing with, he didn't care anymore, because he was paid. At Amazon, we had this principle that we wanted to be the earth's most customer-centric company. You understand how that works in retail. But we started thinking, what would the earth's most customer-centric IT provider look like? Well, the first thing we needed to do was change the economic model. Instead of having to pay upfront, which people had to do with every other IT company, you only had to pay for what you used. And now that seems normal. 
But that was revolutionary at that moment.
Kevin Ball
No, it completely revolutionized things.
Werner Vogels
By the way, it's like your electricity at home. You pay for what you've used. You don't go to the electricity company and give them money for the rest of the year and hope it's sufficient. So in some sense it was a normal model, and I think people caught on to that pretty quickly. Lots of other large IT companies didn't. They were still very much addicted to the 70% margins that they had. So the biggest thing that we were really proud of was actually changing the economic model. This pay-as-you-go model, which I thought was completely natural, also allows you to think about the choices you make, because they result in cost. Now roll forward. In 2008, with the financial crisis, we already saw some of that: CFOs were somewhat concerned about the cost of digital infrastructure. But most organizations that we worked with used the COVID period to say, oh, let's accelerate our digital transformation, let's move everything. And so they weren't really that concerned. Now, in 2012, I gave my first keynote at re:Invent, and I gave a longer list of how I thought development would change, or was changing, on the cloud and all the reasons why we were doing that. Small building blocks, blah, blah, blah, deploy to two availability zones at minimum in production. And I also said, now you can architect with cost in mind. Everybody ignored that. Why? Because moving fast and innovating was way more important. Cost didn't matter. You could get all the things that you wanted, and you were really thinking about your idea and the things that you wanted to achieve and your customers and stuff like that. And the cost of all of that was sort of swept under the rug until a number of years ago, I think, when most of the CFOs, most of the financial people, started thinking, should this really be costing us this much? 
And that made me think that if our customers were becoming more concerned about cost, maybe we should revisit that topic and give them some solid advice, drawing on, by that time, was it 15 years at Amazon, the experiences that we had, and how we have integrated cost into many of the architectural decisions that we've made. And so that resulted in the Frugal Architect.
Kevin Ball
Yeah, absolutely. Well. And I think we are very much in this phase right now where it's harder to raise money, interest rates are higher. Everybody's looking at how do we cut costs and actually pay attention to this in a way that they weren't when money was flowing freely.
Werner Vogels
Absolutely. And especially think now about all the efforts in and around AI. There are plenty of models that cost $15 per million tokens, and there are models that cost $0.15 per million tokens. Which ones give you the better result? And by the way, when I say frugal architect, I don't mean cheap. I mean that you get maximum value for the money that you're spending. You decide what you want to do and then work backwards from that: this is how much it is going to cost me, do I really want that? So there's a bit of principled advice in the Frugal Architect, but there is also some practical advice: this is what you should be doing.
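The price gap Werner mentions is easy to make concrete. The sketch below is only illustrative: the two prices come from the conversation, while the monthly token volume and the function name are assumptions made up for the example. It says nothing about which model gives the better result, which is the other half of the trade-off.

```python
def monthly_token_cost(tokens_per_month: int, usd_per_million_tokens: float) -> float:
    """Return the monthly spend in USD for a given token volume and price."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens


# Hypothetical workload: 2 billion tokens per month (an assumption, not from the episode).
TOKENS_PER_MONTH = 2_000_000_000

premium = monthly_token_cost(TOKENS_PER_MONTH, 15.00)  # $15 per million tokens
budget = monthly_token_cost(TOKENS_PER_MONTH, 0.15)    # $0.15 per million tokens

print(f"premium model: ${premium:,.2f} per month")
print(f"budget model:  ${budget:,.2f} per month")
print(f"price ratio:   {premium / budget:.0f}x")
```

At any volume the ratio between the two is a factor of 100, so the question becomes whether the more expensive model returns enough extra value to justify it.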
Kevin Ball
Well, let's start to dive into that, because I think a lot of folks, they do, especially with the ease of scaling up and down with cloud, it's very easy to just say, oh, we'll use what we need. And it scales, and it scales. And at some point you say, how did we get here? This is so expensive. So you start from the design side, right? How do you think about designing your system?
Werner Vogels
Well, of course there are many systems out there that have already been built, but definitely with the first laws or lessons or whatever you want to call them, I really targeted the upfront thinking. There are things that are often non-negotiable upfront. Think about compliance, think about security, think about accessibility. Those are things that are just given. And then there's a bunch of other things, like reliability, fault tolerance, performance, things that you trade off against each other, and we will come to that later. I think cost should be part of that. You should be aware upfront that whatever architectural decision you make, there is a dollar amount associated with it. And you know what, that's fine, because after all, whether you were buying hardware or whether you now have operational costs in AWS, the money needs to flow anyway. But here, every little piece that you make has a financial consequence, and as such that is at least something you should keep in your head upfront. Now, I also wanted to make the case that, especially in AWS, until we can clearly give people the milligrams of CO2 for a particular computation that they've done, cost is a pretty good proxy for sustainability. The more you pay, probably the more resources you've used. And as such, you can keep these two things in balance. I mean, there are enough companies that require sustainability to be reported to the board these days. And I find, especially among the younger group of developers, that they are absolutely passionate about making sure that they don't ruin the planet while using AI, for example. And as such, having cost as just one of the other non-functional requirements, which we always have, makes you at least aware of it.
Kevin Ball
Yeah, that makes sense, and in some ways it makes me ask, why do we even need to say that? Of course cost is a thing that we need to think about.
Werner Vogels
There were many years where we didn't.
Kevin Ball
It's true.
Werner Vogels
No, but even in the earlier days of Amazon, where we ran on our own hardware, engineers were never concerned about cost. The hardware was there anyway. Of course, at a different level you were concerned about cost, but as an engineer, you just asked for five more servers and boom, there they were.
Kevin Ball
Yeah, absolutely. So let's look at what that is because maybe engineers aren't used to thinking about this. What are the important aspects to trade off in cost? I think your second principle is around how you align cost to different things within your business. So what does that look like?
Werner Vogels
One of the things that I find extremely important as an engineering organization is that you need to be deeply involved with the business. After all, we make technology for a business. You don't make technology in a vacuum just because you think it's fun. And you also have to remember that AWS is business to business. We make technology for other people to build applications with. And as such, you need to be closely aligned with your business unit to truly understand what they actually want from you. I've seen so many old situations where the IT department is somewhere else. They get a list of requirements, go build it, come back, and the business says, nonsense. And why? Often requirements change, and that's normal, or you discover new information, or the business changes trajectory, and things like that. So I've always believed that engineers should be in the same room as the business people, because then you start building applications or systems or support systems that actually meet the requirements of the business, instead of the business having to adapt to you. And I think that is really about making sure that you align the two. Also very important is that technology costs money. If you have to scale, and you scale up and your costs are growing, but your income as a company is coming from a completely different direction, let's say you've built it on a pay-as-you-go model but you actually sell a subscription, those two things don't really work out well. So think upfront about the dimension over which you expect to get revenue, and then make sure that if you grow, your costs grow over exactly the same dimension. Because otherwise I'm pretty sure you're going to run into trouble. A long time ago, I invested in a young business that was making these small MiFi devices. Now everybody has one. 
Or rather, now all your phones have Internet connectivity. People would buy these devices and then buy 10 gig or 20 gig or 30 gig on them. And at some moment the customers got frustrated. Why can't I just buy an unlimited package? It can be expensive. Okay, so they made a very expensive unlimited package. What do customers do? They start watching Netflix 24/7. And as such, usage exploded far beyond what the package price covered. So having a financial model and a technology scaling model that are not aligned is a risk for your business.
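The MiFi story can be sketched as a small margin model. Everything numeric here is hypothetical (plan prices, the wholesale cost per gigabyte, the usage levels), as are the `Plan` class and the `monthly_margin` function. The only point is the shape: with a capped plan, cost and revenue grow along the same dimension, while with a flat-price unlimited plan the cost keeps growing with usage and the revenue does not.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Plan:
    name: str
    monthly_price_usd: float      # what the customer pays per month
    included_gb: Optional[float]  # None means unlimited data


# Hypothetical wholesale cost the provider pays per gigabyte carried.
COST_PER_GB_USD = 2.0


def monthly_margin(plan: Plan, used_gb: float) -> float:
    """Revenue minus bandwidth cost for one customer in one month."""
    if plan.included_gb is None:
        carried_gb = used_gb                         # cost follows usage without limit
    else:
        carried_gb = min(used_gb, plan.included_gb)  # cost is capped by the plan
    return plan.monthly_price_usd - carried_gb * COST_PER_GB_USD


capped = Plan("20 GB", monthly_price_usd=60.0, included_gb=20.0)
unlimited = Plan("unlimited", monthly_price_usd=150.0, included_gb=None)

for used_gb in (10, 20, 100, 500):
    print(
        f"{used_gb:>4} GB used: "
        f"capped margin ${monthly_margin(capped, used_gb):>8.2f}, "
        f"unlimited margin ${monthly_margin(unlimited, used_gb):>8.2f}"
    )
```

Under these made-up numbers the capped plan never loses money, while the unlimited plan turns negative once a customer streams enough, which is the misalignment Werner describes.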
Kevin Ball
Yeah, that makes a ton of sense. Now, pricing is also hard. A lot of folks won't pay for pay-as-you-go in different parts of the market. So how do you trade off those things, or evaluate this as one of the many trade-offs that you might have?
Werner Vogels
Well, trade-offs. First of all, I think even before we put cost into the mix, trade-offs were always a part of architecture. How highly available does this component need to be versus that component? And what kind of performance do we need? How much capacity do we need to get to this performance? And in that sense, measurement is extremely important: knowing and truly understanding what is happening there. If you get an average web page latency of 1.5 seconds, it means nothing. It means that 50% of your customers are getting a worse experience. You need to know how much worse, and you need to know how to control that endpoint, let's say the 99.9th percentile. How can you bring that in? How much does it cost me to actually do the engineering? And then do I get a return on that? The wisdom is that faster web pages give you more conversion. But there is a point of diminishing returns. Of course, after 1.1 seconds, nothing matters anymore. And so you need to think in your engineering about how much this work is going to cost and whether there is, from the business perspective, a return on it. Of course, we all want to build the fastest possible web pages ever. As an engineer, I would love nothing more than doing that. It's just that for the business, that's just busy work and useless, because there's no return on it. So that, I think, is really important in the trade-offs part. And it comes back to something else. Your application doesn't consist of one big thing. Take Amazon.com. You go to the web page of Amazon.com and there are a few things that always need to work, because without them we don't have a business: search, browse, shopping cart, checkout, and reviews. Because if reviews are not online, people are not that interested in buying. Then there are a number of things that are quite important to customers: customers who bought X also bought Y, recommendations, similarities, things like that. 
And then there's a bunch of nice-to-haves, like the bestseller list. Now you need to have a discussion with the business about how much money you should spend on the things that are tier one, the really important ones, tier two, and tier three. For tier one, you might say, oh, I absolutely need to replicate that over three different availability zones, or I need 99.9% availability. Well, that's going to cost you. For tier two, you may say, you know what, three nines might be fine. If recommendations are offline for five minutes, we can handle that. And for tier three, the bestseller lists, they may go, well, if that's offline for an hour, I don't think anybody misses it. And so this is a conversation you need to have with the business. We as engineers used to make those decisions ourselves. And then what we do is make everything four nines available, which is, from a business perspective, way too costly, because there are parts where, from a business perspective, you can live without them for five or ten minutes. And it has a significant impact on the bottom line. So there are all sorts of tools and tricks that you can think about to decompose your application in such a way. But the most important part is that you then take it to the business and have a conversation with them, because they're the ones who at some moment will have to pony up the money for it.
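Going back to the measurement point earlier in this answer, a short sketch shows why a single average hides the experience of the slowest customers. The latency samples are synthetic (a log-normal distribution chosen only because it has a long tail), and the nearest-rank `percentile` helper is written for the example rather than taken from any particular monitoring tool. Note that for a skewed distribution like this one the mean and the median are not the same number, so "half of customers are worse than average" is only approximately true.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so repeated runs use the same synthetic data

# Synthetic page-load times in seconds: most requests are fast, a few are very slow.
latencies = [random.lognormvariate(0.0, 0.6) for _ in range(100_000)]


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


print(f"mean:   {statistics.mean(latencies):.2f} s")
print(f"median: {percentile(latencies, 50):.2f} s")
print(f"p99:    {percentile(latencies, 99):.2f} s")
print(f"p99.9:  {percentile(latencies, 99.9):.2f} s")
```

The engineering question Werner raises is then how much it costs to pull the p99.9 figure in, and whether the business gets a return on that work.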
Kevin Ball
Yeah, well, and as you mentioned before, like, requirements change as well. It may be that the best seller list is not important when you launch it and you discover that drives tremendous amounts of purchases and so that ROI goes up and we need more reliability.
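Moving a feature between tiers, as in the bestseller example, changes its downtime budget, and that budget is simple arithmetic. The sketch below converts an availability target into allowed downtime per year. The tier-to-target mapping is an assumption made for illustration, not Amazon's actual targets, and the function name is invented for the example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_pct: float) -> float:
    """Allowed downtime per year, in minutes, for a given availability percentage."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


# Hypothetical targets per tier, chosen only to illustrate the spread.
tier_targets = {
    "tier 1 (search, cart, checkout)": 99.99,
    "tier 2 (recommendations)": 99.9,
    "tier 3 (bestseller list)": 99.0,
}

for tier, target in tier_targets.items():
    minutes = downtime_minutes_per_year(target)
    print(f"{tier}: {target}% allows about {minutes:,.0f} minutes per year")
```

Three nines allows roughly 526 minutes of downtime a year, four nines roughly 53, and each extra nine usually has to be paid for with more replication and more engineering, which is why the target is a business conversation.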
Werner Vogels
Yeah. Or you build things where, once you get them into the hands of customers, they go, like, what?
Kevin Ball
Yeah, that's probably more common. Right.
Werner Vogels
Then it needs to be cheap enough. In retail, definitely. We have this massive A/B testing environment, and instead of hiring focus groups and psychologists and whatever before we build things, you might as well build it, put it in front of customers, and see what they think about it. At some moment, we built something called Your Digital Soulmate. This is the person on Amazon who is, in terms of purchasing, just like you. We wouldn't tell you who it was, but we would tell you what that person bought that you didn't buy, with the idea that that might be inspirational. Customers hated it. They hated the fact that there was somebody else just like them. Now, that's already 10, 15 years ago. Maybe with the changes in social media and things like that, people don't care that much anymore. But it was much easier to quickly build it, put it in front of customers, and have customers go, nope, that's not what we want.
Kevin Ball
So thinking about that and knowing that things are going to sometimes prove out to be not actually valuable, how much cost alignment, how much architecture went into that before you built it, versus building it to be evolvable if it turns out to work well.
Werner Vogels
So if you think about innovation at Amazon, everybody knows we're really built out of all these small teams that each take care of their own little world. They're all charged with innovation. And some of these innovations, especially because our A/B testing environment is so robust, aren't terribly costly. But there are teams that are working really hard on what may look, to the outside world, like small problems. I don't know if you have ever bought shoes online. People who buy shoes often buy three pairs of the same shoe in different sizes and then send two back. I think that's a bad customer experience, and it's not great for the business either. So we have a small technical team that sits with the shoes team and tries to understand all of these kinds of things. Can we build data sets such that, you know, if these Nikes in size 11 fit you really well, maybe you should buy these trail shoes from Adidas, things like that. That doesn't cost a lot of money, but it may make a major difference for our customers if you can really give that good advice. And as such, small teams are all charged with doing that level of innovation. But the bigger things go up to the central management team. Of course, if it requires significant capital investment, then we have this principle that if we make a major capital investment, the result needs to have a significant impact on our balance sheet. It means we're not necessarily interested in putting in a lot of money and then just getting that money back. If this is successful, it should be really successful. AWS is probably a really good example of that. But think also about things like the Kindle. That wasn't something that you expected Amazon to deliver. And remember, we still sell the Kindle at cost. That means if you never read a book on Kindle, we don't make any money. 
And also, in the early days, one of the bigger challenges for customer service was that people would call in and complain about something, and the customer service agents would say, yeah, but that's not by Amazon, that's by a third party. People go, no, no, no, I bought this on Amazon. So a significant portion of our customer contacts had to do with third parties, and mostly with shipping, because people love to sell, but they hate to ship. And so we started Fulfillment by Amazon, so that they could put their goods in our warehouses and we would take care of the guarantees on delivery. Major difference. Did it require a lot of capital investment on our side? Yes. Or think about something like Prime. Prime is not just a gimmick. We had to completely re-lay out the entire fulfillment network to be able to afford Prime. And Prime doesn't pay for itself.
Kevin Ball
Prime is a really interesting example. What if we broke it down from the frugal architect principles? Right? So like you said, it doesn't pay for itself. So how does cost get incorporated into the architecture of what makes up Prime?
Werner Vogels
You know, if around 2000, 2001 or something like that, you had asked people, why don't you buy your lawn furniture on Amazon, they would go, lawn furniture? They sell books, music, video. Or your TV or your electronics or things like that. And one of the bigger challenges with Prime was not only to build a subscription model, but to make sure that people would understand that this would not only apply to books. If you buy a big-screen TV, it comes to you free in two days. And as such, it incentivizes our customers to do cross-category shopping, especially for those items that are maybe big or costly to transport. Now it all comes in the same bucket. Now, of course, one of the things that customers really want is convenience. And we are no longer talking about two-day delivery. We're now talking about one-day delivery. Or where I live for some part of the year, in Dubai, there are two delivery windows: today before 6 or today before 11. If you get something the next day, you're kind of disappointed. And this is how people change under the influence of technology and the kinds of things that we can do. But believe me, if you want to do same-day delivery in New York, you'll have to make some investments to make that happen. And as such, there's clear thinking behind it. But most of these, Prime too, are experiments. Nobody had done that before. And things are not an experiment if you already know the outcome. Some of these experiments, just by the nature of being an experiment, need to fail. We had this phone, I don't know if you remember it, the Fire Phone. An $800 million write-off. Not all of these investments work out the way that we planned. And that's okay. And when Jeff explained this at the shareholder meeting, there was nobody that batted an eye, because in exchange for this, there are 10 or 20 other big success stories to be made. 
And sometimes you make these gambles.
Kevin Ball
Well, and as you say, you can't know how it's going to turn out. That actually in some ways feeds back into your Frugal Architect principles. Beyond design, you have this measure-and-observe area where you say, okay, as we go, this is going to change. I'm sure Prime today looks very different than Prime when it was originally imagined. So how do you keep track of what's going on and then evolve it as you go?
Werner Vogels
Without measuring, you're flying blind, whether that's around cost or reliability or uptime or how people are changing their behavior under Prime. But I wanted to make one thing clear. In Amazon retail, we have the luxury of being able to experiment. You bring things in front of customers, they don't like it, you stop it, or whatever. In AWS, the world is different, because as soon as we launch something, people start building their business on top of it. That's not something you can suddenly pull the plug on. I mean, we're still running SimpleDB. You can't sign up for it anymore, but there are a number of customers that are still using SimpleDB, because they've tuned the hell out of it and know exactly how they want to run it. But we launched this. And it's the same with APIs. APIs are forever. API design is one of the hardest things to do, because you need to think about how this is going to evolve, because it is going to stay around for a while. And so measurement is extremely important, but so is making measurement visible to everyone. I often tell this story, and it goes back a long time. Most Americans don't know this story, but in 1972 there was an oil crisis. There were the hijackings and the events at the Olympics in Munich, with the hostage-taking of the Israeli athletes there. And so on Sundays, we couldn't drive a car. But also a lot of research was being done into why some houses used more energy than other houses, although they were comparable, with the same kind of family in them. It turned out that the families using more energy had their energy meter in the basement. The families using less energy had the energy meter in the hallway. That meant that every time they walked into the house, they were confronted with their energy usage. And that changes behavior. 
I remember one of the first jokes I made: now when we go home at night, we can turn off the lights. Because, you know, as engineers, we're used to having some desktop under the desk, and you just let it run when you go home. In the cloud, it's a much better idea to actually shut your development environment down, because you're not going to use it anyway, and it costs money. If you have that on a big monitor somewhere in your engineering environment and see how these things go up and down, it changes behavior. The same goes for getting good insight into changes like, do I want to run this on Intel, or do I want to run this on Graviton, and things like that. Immediately seeing that your cost drops by 30% is a big motivator.
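The "turn off the lights" point can be put into numbers with a few lines of arithmetic. The hourly rate and the working-hours schedule below are assumptions invented for the example, not AWS prices; only the comparison between an always-on environment and one that runs during working hours matters.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours


def weekly_cost(hourly_rate_usd: float, hours_running: float) -> float:
    """Cost of one environment for a week at a flat hourly rate."""
    return hourly_rate_usd * hours_running


# Hypothetical numbers: a development environment billed at $0.40 per hour,
# actually used 10 hours a day on 5 weekdays.
HOURLY_RATE_USD = 0.40
WORKING_HOURS_PER_WEEK = 5 * 10

always_on = weekly_cost(HOURLY_RATE_USD, HOURS_PER_WEEK)
working_hours_only = weekly_cost(HOURLY_RATE_USD, WORKING_HOURS_PER_WEEK)
saving_pct = (1 - working_hours_only / always_on) * 100

print(f"always on:          ${always_on:.2f} per week")
print(f"working hours only: ${working_hours_only:.2f} per week")
print(f"saving:             {saving_pct:.0f}%")
```

With this schedule the environment runs 50 of the 168 hours in a week, so shutting it down outside working hours saves about 70% regardless of the hourly rate. Putting that number on a shared dashboard is the cloud version of moving the energy meter into the hallway.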
Kevin Ball
Yeah, we are natural optimizers. Give us a number and we'll try to push it up or down, whichever one it is.
Werner Vogels
It's a bit the same. We talked mostly about the case where you have total control over how you're building your applications, from scratch, in greenfield. Most of our applications aren't like that. They've been around for a few years. The people that developed them are probably not around anymore. But paying off technical debt is crucial in any organization. No matter how brilliant you were in developing your first version, I'm pretty sure there are some things to fix, some things to refactor, or some things to look at. Where are my costs going? And does that match the intuition of how much this should be costing? Before cloud, I remember there was this moment where I think we had 12 different search services within Amazon. Don't ask me why. We don't centralize those things. But you know, 12 different ones.
Kevin Ball
And it's a natural outcome of experimentation, right? Like experimentation leads to diversification. But then what do you do as a team?
Werner Vogels
You're allowed to move fast, and if you feel you're being hampered by another team that still has to complete something, you just go do it yourself. I don't know if you realize, but there has forever been a button in your orders that says digital orders. That was mostly because integrating digital into the traditional order pipeline was way too much work, and so we allowed that team to build their own pipeline. Not the best customer experience, probably. But we were talking about something else. What was that?
Kevin Ball
Technical debt.
Werner Vogels
Technical debt, yeah. And so, you know, there are certain engineers who love to tinker. I'm not saying every engineer is the same. Some engineers love to babysit an SAP system; they are conservative, and they make sure the thing will run to the max. Absolutely, those are the kind of people you want to hire for that. There are also engineers who would love to think and do some innovation; you ship them off to something else. But there's also a group of engineers that really love to look at the minuscule things that make things better. And, you know, if you can build a team out of these people and have them just go around the company and look at things: we have so many optimizers, we have so much deep insight into the execution of these things, all these flame graphs and things like that. Where's all this compute going? It shouldn't be going anywhere. And now they find gold. And it's always the same: premature optimization is not a good plan, and things like that. But that doesn't mean you shouldn't use your brain upfront when you're actually building something, and not shove this off to a later moment: oh, we'll look at this later. Technical debt is like the mortgage on your house. You know, if you don't pay off your mortgage, the bank comes and repossesses your house. If you don't actually eventually solve all your technical debt, it'll come back to haunt you. Whether it's in reliability, in cost, in performance, it will come back to haunt you. So it is a worthwhile effort to put some engineering against.
Kevin Ball
It's interesting thinking about technical debt in the context of what we've been talking about today in terms of aligning cost. I think as a business goes through these different phases, as you do experimentation, as a product goes through those phases, you assume technical debt because you're optimizing for different things. Right? So you may not optimize for cost upfront because you think there's a 70% chance this thing gets thrown away.
Werner Vogels
I have a pretty good example of that. When we started Amazon Fresh, we had no idea what the interface was going to look like. How did people want to interact differently with the Fresh interface versus, let's say, the normal retail interface? Yeah, of course they wanted subscription mechanisms and things set up, but you wanted the team that actually was building this to be really agile, to be able to move things around fast. So they started off in Ruby on Rails. Why? Because it's a great prototyping environment, good visuals, you can do this. But they knew on day one that the moment they needed to start scaling, it needed to be rewritten. Let's put it like that. Because it wouldn't be able to scale, or it would probably be able to scale, but at an enormous cost. And bringing it back to the normal Amazon principles: we decided to go with Ruby on Rails first because we didn't know what things were going to look like and it was a great prototyping environment. But then you do need to pay off that technical debt eventually.
Kevin Ball
I'm feeling exactly that right now in my day job. By the way, we're paying off a Ruby on Rails technical debt issue.
Werner Vogels
But yeah, what did I see yesterday? I mean, Stack Overflow is not that popular anymore, but they do have this survey, I think, which I pay attention to, and the drop in Ruby developers seems to be something like 75%. So there's quite a bit of movement in that, which I always like, because I think if there's one thing that we as engineers are always forced to do, it's learn new stuff. And it's the cool thing, I think, but, you know, you do need to do that.
Kevin Ball
Well, and I think programming languages are kind of at an interesting moment right now, especially with all of these LLM-assisted coding tools. Learning a new programming language is probably easier than it's ever been.
Werner Vogels
Yeah, I do think so. I think that goes for learning anything these days. You can get a little assistant on the side that will help you. It would be great if the assistant would actually be able to track your progress really well and start suggesting which things you should be paying a bit more attention to. But I'm pretty sure that will happen in the future. But one of the things with respect to programming languages and the Frugal Architect, let's come back to that one. I ended the Frugal Architect presentation with a quote from Admiral Grace Hopper, our famous first programmer. And she says the most dangerous word in the English language is "we have always done it this way." I think that goes to how we often treat things. You know: oh, I'm a Java programmer, or, oh, we're really good at Rails, or, you know what, this looks just like the project we did last year, let's just repeat that. And especially when it comes to, I think, sustainability and cost, we should really reevaluate which programming languages we're using. You know, if Python and Ruby are 75 times as inefficient as Rust, then maybe there should be some light bulb going on in your head, going like: well, if cost is important to us, or if sustainability is important to us, maybe we should investigate. Now, I know learning Rust is not the easiest thing to do. You have a compiler that is so picky that you start to hate it. But everything you get in exchange, blindingly fast, superior security properties, really makes it, in Amazon at this moment, the programming language of choice. We are rewriting significant portions in Rust. Senior principal engineers wrote a great article on my blog about why they moved from Java to Rust when they were building Aurora DSQL. And, you know, security plays a very important role in that, but also efficiency and cost.
The thing here is that there will be another programming language after Rust that is even more efficient and has even better security principles or whatever. As engineers, we'll be learning the rest of our lives. And I think it's fun. We, as engineers, have the most amazing jobs in the world. You know, we can go to work every day and create something new. Who else can do that? Nobody. We are the artists. We are the creatives. It doesn't look like that from the outside.
Kevin Ball
Yeah, well, and this brings us back a little bit, I think, to something we were talking about in terms of paying off technical debt and different languages maybe being appropriate for different phases of a project. How do you set up your project to enable you to, say, move easily from Ruby on Rails over to a Rust-based project or something like that? I know you've talked before about evolvable architectures. So what does that actually look like?
Werner Vogels
Evolvable architectures is a little bit different because, well, let's take Amazon S3. We started off with six microservices. That was the whole thing. Now it's well over 200 or over 300. I forgot what the number is. Danny can probably tell you everything that we added. But we never took S3 down. It wasn't that we were changing software or things like that and sent an email to all of our customers saying: sorry, but Friday evening between 6 and 10, S3 is down. Well, probably your TV won't work anymore, you can't get coffee, and I wouldn't be surprised if the beer taps don't work anymore. But, you know, you're always evolving your software because you know that a few years from now you will be running a completely different architecture. You think differently. S3 is a great example. In version one we were just storing three copies of an object on different servers, and then we started realizing that erasure coding actually could help us here: you still get the same level of durability, but you need to store less data, which is a significant cost improvement, of course. So you can introduce that. It doesn't mean you're going to recode all the old objects. Maybe once you touch them, you recode them, but otherwise you just leave them there. And so we know that over time, yeah, we went to different storage models where you would only store on two servers or on one server, and added all sorts of functionality. These days you can store your vector embeddings in S3. So evolvability is really realizing that this is not going to win forever. But we need to build it in such a way that for our customers it looks like it will run forever. It can't go down. So that, in terms of evolvability, is crucial for us. And when you think about technical debt, when you think about the example that I gave with multiple programming languages, sometimes what you do. So in Amazon we're organized in small teams.
Each team has ownership over the piece that they have control over. They can make decisions. There's agency there. So if you want to do something that may span some of these teams, what you do is you fire up another team. The other team has a task of either coordinating or maybe doing some of the work. Because remember, these teams are relatively small. They weren't sitting around waiting for more work. They had their roadmap planned out for them. But, you know, you may actually bring a team on board whose task it is to start carving up pieces of Fresh, rewriting them, making them a service so we can actually call them. We don't need to worry about how they're implemented. So, you know, you need to play around a little bit with your organizational structure as well to get these things done.
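The storage trade-off in the S3 example can be made concrete with a small sketch. The shard counts below are hypothetical and are not S3's actual scheme. The sketch assumes a maximum-distance-separable code such as Reed-Solomon, where any `data_shards` of the stored shards are enough to rebuild the object. The `read_object` helper is likewise an illustrative stand-in for "once you touch them, you recode them".

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data when keeping full copies."""
    return float(copies)


def erasure_overhead(data_shards: int, parity_shards: int) -> float:
    """Raw bytes stored per byte of user data with k data + m parity shards."""
    return (data_shards + parity_shards) / data_shards


def read_object(store: dict, key: str) -> bytes:
    """Return an object's data, migrating it lazily if it uses the old layout."""
    obj = store[key]
    if obj["layout"] == "replicated":
        obj["layout"] = "erasure_coded"  # recode only when the object is touched
    return obj["data"]


# Hypothetical objects written under the old three-copy layout.
store = {
    "old-report": {"layout": "replicated", "data": b"2006"},
    "untouched": {"layout": "replicated", "data": b"2007"},
}
read_object(store, "old-report")  # touched, so it gets recoded

print(replication_overhead(3))  # 3.0 bytes stored per byte of data
print(erasure_overhead(6, 3))   # 1.5 bytes stored per byte of data
```

With these assumed parameters, three copies survive the loss of two servers at 3x storage, while a 6+3 layout survives the loss of any three shards at 1.5x. Objects that are never read stay in the old layout, so both layouts have to remain readable behind the same API.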
Kevin Ball
Yeah, that makes sense. Well, and I think the carving off depends on that balance: the API is forever. The contract with the customer is forever. But with the implementation, you can't fall into the "we've always done it this way" trap.
Werner Vogels
No, because there is always something new. I mean, sometimes you build things in what you intuitively find the right way. There's a great website called the Amazon Builders' Library, and there's a whole bunch of, how shall I say, non-obvious thinking there. There's a great one by Colm MacCárthaigh, which is about constant work. And when you read it you think, that's inefficient, but in terms of stability of a system and other principles around systems, it looks great. That's a great document where things are being traded off against each other. A bit more bytes on the wire, a bit more work to be done, but the system itself is rock stable. So there's a lot of work you have to do, but you might play around with your organization to actually get that done.
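The constant work idea can be illustrated with a toy sketch: a publisher that sends its whole fixed-size status table on every cycle, whether or not anything changed. The class name, table size, and status strings are hypothetical and are not taken from the Builders' Library article or from any AWS system.

```python
from dataclasses import dataclass, field

TABLE_SIZE = 8  # fixed number of slots (assumed for illustration)


@dataclass
class ConstantWorkPublisher:
    """Publishes the full table every cycle instead of only the changes."""

    table: list = field(default_factory=lambda: ["healthy"] * TABLE_SIZE)

    def record(self, slot: int, status: str) -> None:
        """Update one slot in place; the table never grows or shrinks."""
        self.table[slot] = status

    def publish(self) -> list:
        """Return a full snapshot, so the payload size is the same each cycle."""
        return list(self.table)


publisher = ConstantWorkPublisher()
quiet_cycle = publisher.publish()  # nothing changed

for slot in range(TABLE_SIZE):     # worst case: every slot changes at once
    publisher.record(slot, "unhealthy")
busy_cycle = publisher.publish()

print(len(quiet_cycle), len(busy_cycle))  # same amount of work in both cycles
```

Sending only the changes would use fewer bytes on a quiet day, but the load would spike exactly when many things fail at once. Doing the same work every cycle trades some efficiency for predictable behavior under stress.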
Kevin Ball
Yeah, no, this makes sense. Well, we've covered a lot of this, kind of gone through all the different pieces of the frugal architect. We're getting close to the end of our time. Is there anything we haven't talked about today that you think would be important to cover before we wrap on this particular topic?
Werner Vogels
No, I think we've done well. But as always, I want this to be practical. I don't want this to be just some high-level architect that never got his hands dirty and suddenly starts talking to you about coffee cost. I think what's important is the relationship between the business and tech, because that's where money matters. And so if you keep that in mind, a bit of an agile working together with the business to make sure you build the right things, and requirements change. That's not new, by the way. In the 1990s, there were already enough reports about changing requirements being why all these big projects fail. And actually that is still the case. You know, I do believe that evolvability centers around decomposition. So decomposition into smaller services, but then also decomposition of the application into cells, such that you minimize the exposure to failures. And there are so many different technologies to think about when you build your system that, no, I'm still having a great time.
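Decomposition into cells can be sketched as a routing function that pins each customer to one cell, so a failure in one cell only reaches the customers assigned to it. The function names, the customer IDs, and the choice of a hash modulo the cell count are assumptions for illustration, not a description of how Amazon routes traffic.

```python
import hashlib


def cell_for(customer_id: str, num_cells: int) -> int:
    """Map a customer to a cell deterministically using a stable hash."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_cells


def affected_customers(customers: list, failed_cell: int, num_cells: int) -> list:
    """Customers whose requests would land on the failed cell."""
    return [c for c in customers if cell_for(c, num_cells) == failed_cell]


NUM_CELLS = 10  # hypothetical cell count
customers = [f"customer-{i}" for i in range(1000)]
impacted = affected_customers(customers, failed_cell=3, num_cells=NUM_CELLS)

print(f"{len(impacted)} of {len(customers)} customers sit in the failed cell")
```

A plain modulo reassigns most customers whenever the number of cells changes, so a real system would more likely keep an explicit customer-to-cell mapping or use consistent hashing. The sketch only shows how cells bound the blast radius of a failure.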
Kevin Ball
Yeah, I think there's no shortage of things to learn and to do in the engineering world.
Werner Vogels
Absolutely awesome.
Podcast: Software Engineering Daily
Host: Kevin Ball (K. Ball)
Guest: Werner Vogels, CTO of Amazon
Date: August 28, 2025
This episode features an in-depth conversation between Kevin Ball and Amazon CTO Werner Vogels, exploring the origins and ongoing evolution of cloud architecture, the foundational principles behind Amazon Web Services (AWS), and Vogels’ philosophy of the "Frugal Architect." The discussion covers Amazon’s early innovation culture, the pay-as-you-go model, aligning engineering with business requirements, the importance of measuring costs, technical debt management, and building evolvable systems. Throughout, Vogels offers pragmatic advice, rich anecdotes, and actionable insights for software engineers and leaders navigating the challenges of architecture, cost, and business impact.
[02:20 – 07:35]
"Unfortunately, nobody has written a book about E-commerce before because the word didn't exist yet. This is 94. And so anything that Amazon tried to do after that, they basically had to invent themselves."
— Werner Vogels [03:32]
[07:35 – 10:14]
"Instead of that, people have to pay upfront, which they had to do with every other IT company. You only had to pay for what you've used. And now that seems normal. But that was revolutionary at that moment."
— Werner Vogels [09:20]
[10:14 – 13:35]
"When I say frugal architect, I don't mean cheap. I mean that you get maximum value for the money that you're spending or what you want to do, and then work backwards from that."
— Werner Vogels [10:40]
[13:35 – 17:14]
"You need to make sure that if you grow, your costs grow over exactly the same dimension [as your business]. Because otherwise I'm pretty sure you're going to run into trouble."
— Werner Vogels [15:30]
[18:12 – 22:00]
"We used to as engineers to make those decisions. And then what we do is we make everything four nines available, which is from a business perspective, way too costly."
— Werner Vogels [21:28]
[22:17 – 29:47]
"APIs are forever. It's one of the hardest thing to do is API design because you need to think about sort of how is this going to evolve because this is going to stay around for a while."
— Werner Vogels [30:24]
[32:46 – 39:06]
"Technical debt is like the mortgage on your house. You know, if you don't pay off your mortgage, the bank comes and repossesses your house. If you don't actually eventually solve all your technical debt, it'll come back to haunt you."
— Werner Vogels [34:12]
[39:06 – 42:26]
"The most dangerous word in the English language is ‘we have always done it this way.’ ... If Python and Ruby are 75 times as inefficient as Rust, that maybe there should be some light bulb going on in your head."
— Werner Vogels, citing Grace Hopper [40:31]
[42:26 – 46:32]
"Evolvability is really realizing that this is not going to win forever. But we need to build it in such a way that for our customers, it looks like it will run forever."
— Werner Vogels [43:28]
"Constraints breed creativity. And you found ways to do it."
— Werner Vogels [05:52]
"Without measuring, you're flying blind. ... It changes behavior [when you make measurement visible]."
— Werner Vogels [29:47, 31:07]
"The relationship between the business and tech ... that's where money matters. ... A bit of an agile working together with the business to make sure you build the right things."
— Werner Vogels [46:45]
Werner Vogels brings a pragmatic, customer-centric philosophy to software architecture, emphasizing that cost, business alignment, and adaptability are as crucial as technical brilliance. His emphasis on clear measurement, cross-disciplinary collaboration, continuous learning, and permission to experiment (and fail) offers actionable guidance for engineers and leaders steering organizations through both innovation and constraint. Vogels’ parting reminder: architecture and systems must evolve continuously—businesses and technologies that insist on doing things the way they always did will inevitably fall behind.