Transcript
Host (0:00)
AWS S3 is the world's largest cloud storage service, but just how big is it, and how is it engineered to be as reliable as it is at such a massive scale? Mylan is the VP of Data and Analytics at AWS and has been running S3 for 13 years. Today we discuss the sheer scale of S3 in the data stored and the number of servers it runs on; how, seemingly overnight, AWS went from an eventually consistent data store to a strongly consistent one, and the massive engineering complexity behind this move; what correlated failure, crash consistency, and failure allowances are, and why engineers on S3 live and breathe these concepts; the importance of formal methods to ensure correctness at S3 scale; and much more. A lot of these topics are ones that AWS engineering rarely talks about in public. I hope you enjoy these rare details shared. If you're interested in how one of the largest systems in the world is built and keeps evolving, this episode is for you. This episode is presented by Statsig, the unified platform for flags, analytics, experiments, and more. Check out the show notes to learn more about them and our other season sponsors.
Interviewer (Gregory) (1:00)
So Mylan, welcome to the podcast.
Mylan (VP of Data and Analytics at AWS) (1:03)
Thanks for having me.
Interviewer (Gregory) (1:04)
To kick things off, can you tell me the scale of S3 today?
Mylan (VP of Data and Analytics at AWS) (1:09)
Well, if you want to take a step back and just think about S3, it is a place where you put an incredible amount of data. And so right now, S3 holds over 500 trillion objects and hundreds of exabytes of data, and we serve hundreds of millions of transactions per second worldwide. And if you want another fun stat, we process over a quadrillion requests every single year. And the scale of what's under the hood of all that is also pretty amazing. Fundamentally, we're disks and servers, which sit in racks, and those sit in buildings. And if you think about the scale of all that, we manage tens of millions of hard drives across millions of servers. And that is in 120 availability zones across 38 regions, which is pretty amazing if you think about it.
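To put the figures quoted above in proportion, here is a rough back-of-envelope sketch. It is not an AWS calculation. The 500 trillion objects and the quadrillion requests per year come from the conversation and are used as lower bounds. The 200 exabyte and 20 million drive values are assumptions, picked only as stand-ins for "hundreds of exabytes" and "tens of millions of hard drives". The per-drive number ignores replication and erasure coding, so treat it as an order of magnitude at best.

```python
# Back-of-envelope arithmetic on the scale figures quoted in the transcript.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds in a non-leap year

# Figures stated in the conversation, used as lower bounds.
requests_per_year = 10**15      # "over a quadrillion requests every single year"
object_count = 500 * 10**12     # "over 500 trillion objects"

# ASSUMPTIONS: the transcript only says "hundreds of exabytes" and
# "tens of millions of hard drives"; these exact values are placeholders.
assumed_stored_bytes = 200 * 10**18   # 200 EB (decimal exabytes)
assumed_drive_count = 20 * 10**6      # 20 million drives

# Average request rate implied by the yearly total.
avg_requests_per_second = requests_per_year / SECONDS_PER_YEAR

# Mean object size and logical objects per drive under the assumptions above.
avg_object_bytes = assumed_stored_bytes / object_count
objects_per_drive = object_count / assumed_drive_count

print(f"average requests per second: {avg_requests_per_second:,.0f}")  # about 31.7 million
print(f"average object size: {avg_object_bytes / 1000:,.0f} KB")      # 400 KB
print(f"objects per drive: {objects_per_drive:,.0f}")                 # 25 million
```

Because the yearly request figure is a lower bound, the implied average of roughly 32 million requests per second is a floor, which is consistent with the higher per-second rate quoted in the same answer.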
Interviewer (Gregory) (2:12)
So deep down, it all starts with hard drives sitting inside servers sitting inside racks. And then you have a bunch of these racks and then rows of them, buildings of them. Right? And that's what you said. So there's tens of millions of hard drives deep down at the bottom of this.
Mylan (VP of Data and Analytics at AWS) (2:27)
