A
Today we are diving into a problem that might be more common than we like to think: copying data between S3 buckets, or even between S3-compatible storage services. This is something that can happen if you are migrating some workloads to AWS. You have been using S3-compatible object storage and at some point you decide to go fully on AWS, so it makes sense to move all the data to S3 as well. Or maybe it's the other way around: maybe you are escaping from AWS for whatever reason, or maybe you're just escaping the object storage part. There are more and more S3-compatible alternative storage services, and some of them are actually becoming really, really competitive on pricing. So if you don't mind the extra complexity of having to manage workloads distributed across multiple cloud providers, this is actually something that can be an effective strategy to save some cost on your cloud expenses. Or there might be yet another use case. Maybe you are just copying data between two buckets, still in AWS, but they happen to be in different accounts. And you know that giving permissions across accounts can sometimes be challenging. If you're sticking to AWS, all the recommendations assume that you have one set of credentials that you can use to read the data and copy it across accounts, and that is not always an easy situation to be in. So this is another problem you might have to deal with when you're trying to copy data from one bucket to another in different regions and accounts. Today we're going to talk about all these different use cases, and we will share a bit of a story that we had personally and how we ended up building a small CLI tool that simplifies copying data between S3-compatible storage services. My name is Luciano and, as always, I'm joined by Eoin for another episode of the AWS Bites podcast. AWS Bites is sponsored by fourTheorem, but we'll tell you more about them later.
So let's get into the S3 and S3 compatible industry.
B
Yeah, the whole ecosystem around cloud storage is growing really rapidly. S3 has been around for a long time and is still dominating, but there are a lot of interesting alternatives, and some of them are really competitive on pricing, trying to grab some of the market share that S3 has and ride on its coattails. If you look at S3, you pay about $23 a month if you have a terabyte of storage on the standard tier. There's a decent enough free tier: you get 5 gigabytes, which might be enough, and 100 gigabytes of egress, which was recently increased by a significant amount, I think to try and combat some of this competitive evolution. If we look at some of the alternatives out there, DigitalOcean has DigitalOcean Spaces object storage, which is $5 a month fixed price.
But then you pay US$20 per terabyte per month, and they give you a 250 gigabyte free tier. Then there's Cloudflare R2, which is one of the entrants that I think really caused AWS to rethink their pricing strategy. That's about $15 per terabyte per month, and it is interesting for its zero egress fees approach. It seems like the market leader is keen to make it difficult for people to do egress, but the new entrants are very keen to say that they want to make that as cheap as possible. Similar to R2, you've got Backblaze B2, which is $6 per terabyte per month and seems like the cheapest option you can get right now. Another one is Wasabi, which is $7 per terabyte per month. And then you have Linode with Akamai: they have an object storage offering for US$20 per terabyte per month, but you get a terabyte of egress for free, which is pretty significant. Now, there are other options. You don't necessarily have to go with another cloud provider for object storage. You can host it yourself. MinIO is a reasonably popular one for people who want S3-compatible object storage. If you need to host it in a data center, you might say that's only for the brave. But obviously you might have existing storage that you've already invested in, and you might have compliance requirements that mean you have to keep it in your own data center. MinIO also has a managed cloud service, but this seems to be a bit of a premium offering because you have to spend at least $96,000 per year. With that you get 400 terabytes of storage, which works out to about $20 per terabyte per month. And there are a lot more; if you do a web search, you'll be inundated. One of the interesting aspects to this story is that last week DHH, of Basecamp, HEY, and interesting hot takes fame, wrote an article about moving off of S3 onto their own storage. Now, this isn't necessarily new. Other people have done this in the past.
If you consider Dropbox, I think their original solution was based on S3, and eventually they were doing so much storage that it made sense for them to build their own storage data centers. In the Basecamp and HEY case, I think we did a previous episode on DHH's hot takes on serverless, and it's a bit similar here because you don't get much of the detail. It's not a very quantitative analysis; it's much more opinion based. They don't really talk about the cost of hardware and how much it's going to cost them to operate it. But it's an interesting case you might have heard about, of people and companies considering alternatives to just paying for S3 and letting Amazon do all of the work for them. There's a link to that article in the show notes if you're curious. So we had a similar case recently, and you know all about this, Luciano. What's the backstory?
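As a quick check of the figures quoted in this answer, here is a minimal TypeScript sketch that encodes the per-terabyte prices as spoken and reproduces the MinIO arithmetic. It is an illustration only: it ignores egress, request fees, fixed monthly fees, and free tiers, and the names `pricePerTbMonth` and `monthlyStorageCost` are made up for this example.

```typescript
// Per-terabyte monthly storage prices in USD, as quoted in the episode.
// Egress, request fees, fixed fees, and free tiers are deliberately ignored.
const pricePerTbMonth: Record<string, number> = {
  "Amazon S3 (standard)": 23,
  "DigitalOcean Spaces": 20,
  "Cloudflare R2": 15,
  "Backblaze B2": 6,
  "Wasabi": 7,
  "Linode (Akamai)": 20,
};

// MinIO managed offering: minimum yearly spend and the capacity it includes.
const minioYearlyUsd = 96000;
const minioIncludedTb = 400;
// 96,000 / 12 months / 400 TB = 20 USD per terabyte per month.
const minioPerTbMonth = minioYearlyUsd / 12 / minioIncludedTb;

// Storage-only monthly cost for a given number of terabytes.
function monthlyStorageCost(provider: string, terabytes: number): number {
  const price = pricePerTbMonth[provider];
  if (price === undefined) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  return price * terabytes;
}
```

For example, `monthlyStorageCost("Backblaze B2", 10)` compares directly against `monthlyStorageCost("Amazon S3 (standard)", 10)` for a 10 terabyte bucket, before any egress or request charges are taken into account.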
A
Yeah, the backstory is that this was a few weeks ago. We needed to move the entire content of an S3 bucket to another storage, an S3-compatible storage managed by another cloud provider. Now, don't really ask us why. We are big fans of AWS and S3, as you know, but sometimes business requirements can get in the way and you end up in unexpected places and you just need to solve the problem. I'm sure you can relate. And especially now that you know all about these other competitive offers, you can see why businesses might decide to do something like this. So this was the situation we were in, and we thought this was a simple problem, right? How hard can it be to just copy data from S3 to something else that promises you S3-compatible APIs? It seems like you can just do an S3 sync and call it a day. But of course it's not that easy, and that's the reason why we are talking about it. I just want to explain some of the requirements we had so that you can understand why we ended up with a specific solution. Basically, we needed to copy all the objects from this bucket to another S3-compatible service. In fairness, it wasn't a huge amount of data. I think it was something like a couple of tens of terabytes, or maybe something more. But it was a lot of small objects, in the order of millions of very small objects. So the copy itself needs to be efficient. We wanted to make it efficient in terms of memory, so we didn't want to buffer everything on an intermediate machine just to copy it to the destination. We wanted to do some kind of copy on the fly: as you read the data from the source, you start to copy it to the destination. Another requirement was more for operational purposes, because there were applications actually using this data, and those applications also needed to transition to the new storage.
So the business decided that it made sense to prioritize newer files, because these would be the ones with, I guess, the higher probability of being used by the application. So another requirement is that the copy process should take that into account and prioritize more recent objects over older ones. And then the other thing is that it should be possible to interrupt the copy process at any point and resume it later. This can include something failing: maybe the machine needs to be rebooted, or maybe the copy process itself has a bug and just fails. We don't want to restart from scratch, because that would be a huge waste of time and also bandwidth. So let's figure out a way for the whole copy process to be interrupted at any time and resumed later. Again, how difficult could this be? S3 sync seems to tick most of the boxes here. But when we started to look into it, there were some problems that we'll tell you a bit more about later. And therefore we ended up deciding, okay, we are going to create our own little CLI utility that is able to read files from the original S3 bucket and copy them to the destination service. A little spoiler: it's called S3Migrate and it's fully open source. We'll share the link in the show notes. But before diving into the details of this solution, I think we should talk a little bit more about our analysis of the existing solutions and why we couldn't use anything that is already available.
B
Generally we don't like to have to invent these tools ourselves. And you might think that S3-compatible storage should just work with S3 tools like the AWS CLI and the AWS SDK. That's kind of what we thought too. But when we did a little bit more research, we realized that in this case it actually made some sense for this client to create a new tool from scratch. If you just Google for how to copy data between S3 buckets, you might end up on an AWS re:Post thread that suggests using either the CLI, that's the aws s3 sync command, which we use a lot to be fair, or S3 Batch Operations, which are very useful if you've got a large number of copies to do or a whole load of objects to manage in one batch. These are all good solutions, but there are a couple of fundamental challenges with them. The first one is that they assume you're all in on AWS. Naturally enough, they don't cover the scenario where you might be using an S3-compatible storage, either as a source or as a destination. Normally, when you do a copy operation on S3, it's managed by S3: the data doesn't have to go through your client, so you can just do a CopyObject API call. But that only works if the source and destination are on the same provider. Even if you are all in on AWS, if the two buckets live in two different accounts, you need to set up cross-account permissions. And that can add a lot of complexity, because if you're doing a CopyObject operation, it has to be signed with a signature from an IAM identity, and you can only have one principal there. You can't have two principals. So that principal must be authorized to access the source and the destination in the read and write modes you need. When you run the sync command, the AWS CLI operates with that one set of credentials.
And it isn't going to work if you've got something on S3 and the other side, the destination or the source, is on Cloudflare, for example. So we were looking for something that could operate with two different sets of credentials, one for reading from an arbitrary S3-compatible source and one for writing to another arbitrary S3-compatible destination. And since we couldn't find anything out of the box, being the nerdy programmers who probably suffer a little bit from not-invented-here syndrome, we thought, well, how difficult can it be to write a little CLI tool that uses the SDK to do what we want to do? Luciano, you wrote that tool, so how does it work?
A
Yeah, let me try to explain how it is built. So again, in a nutshell: effectively it's called S3Migrate. It tries to do something somewhat similar to aws s3 sync, but allows you to provide two separate sets of credentials. This is probably the main difference from an idea perspective. You don't necessarily have to have one single set of credentials; you can provide two, one for the source and one for the destination. The tool itself is written using Node.js, specifically in TypeScript, and it uses Commander.js for the CLI argument parsing and SQLite for data storage. We'll get into the details of that in a second, because it might sound weird right now. And of course it uses the AWS SDK for JavaScript version 3 to interact with S3-compatible endpoints. By the way, fun fact: if you look at most of these other providers, they all tell you to just use the AWS S3 SDK to interact with their APIs. This is actually a good sign that most providers are trying to be strictly compatible with those APIs, to the point that it's not even worth it for them to create their own clients, because you can just use the existing SDKs and clients. That made it a little bit easier for us, because we didn't need to learn a new set of libraries, or figure out whether, if we want this tool to work with multiple providers, we need some kind of abstraction layer where you plug in different SDKs. Thankfully, everything seems to work just fine with the AWS SDK for JavaScript. Now you might be asking the usual question here: why didn't you use Rust or Go? Of course this is something we could debate for hours, and we could have a flame war of sorts. But if you just want the long story short, I would have personally loved to write it in Rust, because I'm a big fan of Rust and I'm always looking for excuses to use Rust more.
But honestly, given that we have tons of experience in Node.js and TypeScript, and this seems like a use case where lots of existing tooling can support you in Node.js and TypeScript, it was just much easier and faster to deliver the solution using TypeScript. The other thing is that, from a performance perspective, it is true that Rust could maybe have made it a little bit faster and a little bit more frugal with memory. But at the same time, the real bottleneck here is networking speed. We are doing a progressive copy of the data, so really, networking is the real bottleneck here. If we had used Rust with multithreaded async I/O, the multithreading could have given us a way to parallelize the copy a little bit more. But there are other strategies that we put in place, and we'll talk about those later. So this is why we didn't use Go or Rust. But maybe it's an exercise for somebody, if you want to try to do something similar with one of those languages. As I said, the tool is fully open source and it's published on npm, so you can use it today. By using something like npx, you don't even need to install it. You can try it with just one command and see if it works for you. Now, we mentioned that there are two sets of credentials. It works in a similar way to the AWS CLI or the AWS SDK, meaning that you can use the usual environment variables, like the AWS access key ID, or the endpoint, and so on. If you just use the basic ones, that's kind of the default layer. But you can also override them for the source, with a source AWS access key ID or a source endpoint. And similarly, you can override them for the destination, with a destination AWS access key ID or a destination endpoint. And the tool also reads from .env files.
So if you prefer to just put all this information in a .env file because it makes your life easier, the tool is going to load a .env file automatically, if one exists in the current working directory. Now, the way it is a little bit different from sync is that there are actually two phases. You don't just run one command and it starts the copy. You need to run two different commands. The first command is called catalog, and that's what we call the catalog phase. What it does is a list operation on the source bucket, storing all the objects in a local SQLite database. This is effectively a mini state file, if you want, and it's what gives us the resumability feature. As we copy the files, we know exactly how many files there are to copy, so we can keep track of the progress and mark which ones have been copied. The other thing is that, because we also store the metadata related to all the objects as we discover them through the list operation, we can use that to do the sorting. So if you want to prioritize the files that are bigger, smaller, or newer, you can do that, and behind the scenes the tool is going to run a different SQL query with a different sorting based on your parameters. That's the reason why we have this intermediate step: to make it a little bit more flexible, to understand how many objects there are, to understand the current progress as you copy, and to prioritize different objects. Then, once you have done the catalog phase, you end up with this state file, which is effectively a SQLite database. You can open it with any SQLite-compatible UI or CLI just to see what's inside. And with that you can start the copy phase.
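To make the catalog idea described above more concrete, here is a minimal TypeScript sketch of a state file that supports both prioritization and resumability. It is not the tool's actual code: it models the catalog as an in-memory array instead of SQLite, and the `CatalogEntry` shape, the `Priority` names, and the SQL in the comment are all assumptions made for illustration.

```typescript
// One row of a hypothetical catalog; the real tool keeps its state in SQLite.
interface CatalogEntry {
  key: string;
  size: number; // bytes
  lastModified: number; // epoch milliseconds
  copied: boolean;
}

type Priority = "newest" | "oldest" | "largest" | "smallest";

// Comparators playing the role of an SQL ORDER BY clause.
const comparators: Record<Priority, (a: CatalogEntry, b: CatalogEntry) => number> = {
  newest: (a, b) => b.lastModified - a.lastModified,
  oldest: (a, b) => a.lastModified - b.lastModified,
  largest: (a, b) => b.size - a.size,
  smallest: (a, b) => a.size - b.size,
};

// Roughly equivalent to (hypothetical schema):
//   SELECT * FROM objects WHERE copied = 0 ORDER BY <priority> LIMIT <limit>
function nextBatch(catalog: CatalogEntry[], priority: Priority, limit: number): CatalogEntry[] {
  return catalog
    .filter((entry) => !entry.copied) // skip work that is already done
    .sort(comparators[priority]) // filter() returned a copy, so this is safe
    .slice(0, limit);
}

// Record a successful copy so that a restart does not repeat it.
function markCopied(catalog: CatalogEntry[], key: string): void {
  const entry = catalog.find((candidate) => candidate.key === key);
  if (entry) entry.copied = true;
}

// Progress is just a count over the same state.
function progress(catalog: CatalogEntry[]): { copied: number; total: number } {
  return { copied: catalog.filter((entry) => entry.copied).length, total: catalog.length };
}
```

Because the only thing a resumed run needs is the set of entries where `copied` is still false, interrupting the process at any point loses at most the objects that were in flight at that moment.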
So there is another command, s3migrate copy, where you specify the source bucket, the destination bucket and the state file. And of course through the environment you are providing all your credentials. And effectively this command is going to start to look at the state file, figure out what still needs to be copied and start to copy. And of course, being a CLI utility, one of the challenges of course is that you need to have it in some kind of host system or your own personal laptop, like wherever. Like it needs to be a process that runs somewhere. And of course you need to control that process, make sure it's a long running thing. So probably you're going to have some kind of remote machine somewhere, install the tool there, provide all the credentials, create the catalog and then run the command and just monitor that the application is progressing without any issues.
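The layering of credentials described in this answer, a default set plus optional source and destination overrides, can be sketched as a small pure function. This is an illustration under assumptions, not the tool's implementation: the `SRC_` and `DEST_` prefixes and the `ENDPOINT` variable name are hypothetical stand-ins for whatever the tool's README actually specifies.

```typescript
// Settings needed to talk to one S3-compatible endpoint.
interface EndpointSettings {
  accessKeyId: string | undefined;
  secretAccessKey: string | undefined;
  region: string | undefined;
  endpoint: string | undefined;
}

type Env = Record<string, string | undefined>;

// Resolve settings for one side of the copy.
// A prefixed variable (hypothetical "SRC_" / "DEST_" prefixes) wins;
// otherwise the unprefixed variable acts as the default layer.
function resolveSettings(env: Env, prefix: "SRC_" | "DEST_"): EndpointSettings {
  const pick = (name: string): string | undefined => env[prefix + name] ?? env[name];
  return {
    accessKeyId: pick("AWS_ACCESS_KEY_ID"),
    secretAccessKey: pick("AWS_SECRET_ACCESS_KEY"),
    region: pick("AWS_REGION"),
    endpoint: pick("ENDPOINT"), // hypothetical variable name
  };
}
```

In a Node.js program you would call `resolveSettings(process.env, "SRC_")` and `resolveSettings(process.env, "DEST_")` and build one client from each result, which is what lets the read side and the write side authenticate against different accounts or providers.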
B
Okay, it sounds pretty like there's a lot of capability here. And I guess the thing about building these tools is that it's achievable enough to get version one up and running. But already, even if you run it once or twice, you might be starting to think about how you can make it faster, especially trying to handle different types of data sets. You mentioned that in this case the subject was a lot of small files. On S3, you might also have a lot of large files and you're trying to optimize for IO and parallelism and request throttling, a lot of that kind of stuff. So what kind of performance optimizations did you think about so far?
A
Yeah, that's a very good question. And I'm going to start with a caveat: I think this is still a very early project. If you look at the repo, it clearly states that this is experimental, so don't trust it too much, or I would say trust but verify. I'm sure that there are still loads of opportunities to improve it, also in terms of performance. With that being said, what have we done so far to give you options to improve performance, both in terms of data transfer and in terms of how much memory is consumed at the host level? If you want to be very memory efficient, for example, there are options there as well. One thing worth mentioning is that we use Node.js streams to copy data. That's another thing that I'm a big fan of, so probably no surprises if people know me. The idea is that when you run a GetObject command using the AWS S3 SDK, the body that you receive in the response is a Node.js stream. So effectively you are not eagerly consuming that data. You can think of it almost like a pointer to where the data is, and then you can start to fetch it as you need it. Node.js streams also give you a nice API where you can combine streams together. You could have a stream to read and another stream to write, and you can pipe them together and let the data flow from one to the other. This is very useful because, when you do a PutObject operation, the body you are writing is also a stream. In Node.js terms, you have a readable stream for the get operation and a writable stream for the put operation. So you can easily combine a readable stream with a writable stream and create this pipe where you say, read from one place, write to another. And Node.js takes care of most of the complexity there because, for instance, it even handles backpressure.
If you are much faster at reading than you are at writing, what generally would happen is that you easily exhaust all the memory reading, because as you try to write, you are not able to flush all this data fast enough. So Node.js has a mechanism called backpressure handling, where it figures out when you have too much data accumulated and stops reading, gives the receiving system time to absorb all the writes, and then resumes reading. All of that happens automatically when you stream. So I think that's an easy optimization to have, because it's all built into Node.js and we just took advantage of it. There is some additional complexity, if we want to get into the nitty-gritty details. When you use streams, you are effectively reading and writing in chunks, so you get blobs of bytes, and they generally have a fixed size. The S3 API forces you to have a consistent chunk size when you're writing, and there is a minimum number of bytes that each chunk needs to have. So we have to put something in between, called a transform stream, that buffers enough data to be able to write. But that's about as much complexity as we added; Node.js streams take care of everything else. This is actually an interesting part, because it is another place where you can optimize. You can decide to increase the chunk size, which means you are going to use more memory in the host system, because you are creating bigger windows of data that are ready to be flushed. The bigger they are, of course, the more memory you are consuming in the host system. But at the same time, that means you are making fewer API calls to the storage service you're writing to. That can be convenient as well, because of course every API call has an overhead.
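The buffering step described above can be sketched as a small Node.js Transform stream that re-chunks incoming data into fixed-size pieces. This is a minimal illustration, not the tool's actual code: the class name `FixedSizeChunker`, the helper `splitIntoChunks`, and the caller-chosen `chunkSize` are assumptions, and no particular minimum size required by any storage API is encoded here.

```typescript
import { Transform } from "node:stream";

// Splits `data` into as many full `size`-byte chunks as possible,
// returning whatever is left over as `rest`.
function splitIntoChunks(data: Buffer, size: number): { full: Buffer[]; rest: Buffer } {
  const full: Buffer[] = [];
  let offset = 0;
  while (data.length - offset >= size) {
    full.push(data.subarray(offset, offset + size));
    offset += size;
  }
  return { full, rest: data.subarray(offset) };
}

// Transform stream that emits chunks of exactly `chunkSize` bytes,
// except possibly the last one, which carries the remainder.
class FixedSizeChunker extends Transform {
  private readonly chunkSize: number;
  private pending: Buffer = Buffer.alloc(0);

  constructor(chunkSize: number) {
    super();
    if (chunkSize < 1) throw new Error("chunkSize must be at least 1");
    this.chunkSize = chunkSize;
  }

  _transform(chunk: Buffer, _encoding: BufferEncoding, done: (error?: Error | null) => void): void {
    // Join leftover bytes with the new data, emit full chunks, keep the rest.
    const joined = Buffer.concat([this.pending, chunk]);
    const { full, rest } = splitIntoChunks(joined, this.chunkSize);
    for (const piece of full) this.push(piece);
    this.pending = rest;
    done();
  }

  _flush(done: (error?: Error | null) => void): void {
    // End of input: emit whatever is still buffered, even if it is short.
    if (this.pending.length > 0) this.push(this.pending);
    done();
  }
}
```

Placed between a readable and a writable with `pipeline` from `node:stream/promises`, a transform like this keeps the backpressure behaviour intact: when the writable side slows down, the transform stops being fed, and the memory held is bounded by roughly one chunk plus the stream buffers. A larger `chunkSize` trades more memory for fewer writes.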
So generally the suggestion is to find a balance. If you keep the chunk size too small, you are maybe doing too many writes, and there is an overhead on the operating system and everything else. But if you find a good chunk size, then you can probably optimize a little bit more on the write speed as well. Now, another interesting optimization is concurrency. With Node.js, you have a runtime that allows you to do concurrency relatively easily. Just be aware that this is still a single-threaded type of concurrency. In this case I think it works really well, because you are waiting for I/O most of the time. As you are waiting, you can have multiple copy operations that are interleaved with each other, and they will progress together. But of course this works up to a certain point. There is a parameter you can specify for the concurrency, and you can try to figure out the maximum amount you can use before you reach the point where you don't see an improvement in speed anymore, just because there are so many interleaved operations that you're wasting more time jumping from one operation to another than actually copying the data and making progress. So I think at some point it might be beneficial to use proper parallelism, spinning up multiple processes to do the copy. This is something that is supported by the tool, but it might be a little bit tricky. And actually aws s3 sync does something similar as well. You can create a catalog only for a certain prefix in your source S3 bucket, so you end up with as many catalogs as the prefixes you use. Then you can take those catalogs and, even on different machines if you want, do the copy operation only for those subsets of the data. So effectively you can parallelize the copy across multiple machines, which gives you more parallelism.
You can probably use more bandwidth as well, because of course bandwidth at some point can become a bottleneck too. The only issue with that is that it is a little bit more complex to set up. And it's always a little bit challenging to figure out good prefixes that spread the amount of data being copied equally across every part of your parallelized solution. So it is an option, and it is supported. But I guess, depending on the shape of your data, it might be easier or harder to adopt this solution.
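The single-process concurrency discussed in this answer, several copy operations interleaved while each waits on I/O, can be sketched as a small bounded-concurrency helper. This is a generic pattern written for illustration, not the tool's own code; `mapWithConcurrency` and its `limit` parameter are hypothetical names.

```typescript
// Runs `worker` over `items` with at most `limit` operations in flight.
// Results are returned in the same order as the input items.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (limit < 1) throw new Error("limit must be at least 1");
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next item nobody has claimed yet

  // Each lane repeatedly claims the next unclaimed item until none are left.
  // Claiming is safe without locks because JavaScript runs one callback at a time.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  }

  const laneCount = Math.min(limit, items.length);
  const lanes: Promise<void>[] = [];
  for (let i = 0; i < laneCount; i++) lanes.push(runLane());
  await Promise.all(lanes);
  return results;
}
```

With a helper like this, the copy loop becomes something like `mapWithConcurrency(batch, 8, copyOne)`, where `copyOne` is whatever function copies a single object. Raising the limit helps only while the process is mostly waiting on the network; beyond that point, separate processes working on separate prefixes are the way to get real parallelism.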
B
You mentioned that it's still in the early stages. So what would you like to see in terms of roadmap? What doesn't it do yet? Do you have any call to action for the audience to get contributing on this?
A
Yeah, I think there are two sets of things worth discussing. One is things that are not supported by design, and the other is things that are not supported just because we didn't have the time or the immediate need for those features. The things that are not supported by design are things like copying attributes or tags or ACL rules, anything that falls outside the data of the objects themselves. In S3, you have so many options of things you can configure: storage classes, lifecycles, things you can configure at the object level and at the bucket level. This tool intentionally doesn't try to replicate any of this. One of the reasons is that this was not our immediate need. But the other reason is that, if you analyze the problem a bit pragmatically, different S3-compatible storage services are going to have different levels of support for those features. So if you try to be comprehensive and support all of these things, I think you easily end up with a matrix of what is supported and what is not. And it quickly becomes obvious that either you create a system that is configurable and let the user figure out which configuration works for them, or it becomes effectively impossible to maintain this matrix of which storage supports which feature and then try to automatically leverage whatever is supported. So that's something that, by design, we didn't even try to implement. Similarly, encryption is another thing, if you have encrypted objects. I'm not actually sure, I haven't done a lot of testing, but we don't try to provide any option at this stage. So that is something that might get in the way if you're working with encrypted data in buckets. I guess it also depends on the encryption mechanisms you are using.
And the one thing that I would actually like to see, but we just didn't have the time to implement ourselves, is some kind of support for multipart uploads. This tool worked really well for us because all the files were relatively small. But I guess if you have some kind of media-intensive application, where maybe you have lots of big images or even videos, and you can have files that span multiple gigabytes, then maybe this won't be the most efficient way to copy your files. You probably want to do some kind of multipart upload to try to parallelize as much as possible, even within individual objects. So if anyone is open, maybe you are using this tool and you find it useful. It's open source, so feel free to send a PR. This is one feature that we would love to see.
B
Nice. And it would be great to get more development on this, because if we look at some of the alternative solutions, there are a couple of open source ones out there, but a lot of them seem to have been written by people who needed to solve a problem and then maybe not maintained so well. There's one from AWS Labs, which is relatively new and written in Go. It uses S3 Batch Operations, though, so it's AWS only and doesn't really solve this problem. There's an older one called s3s3mirror on GitHub. It's a Java-based one that allows you to mirror buckets from one to another, or from a local file system. And then there's one called Knox Copy, which was written in Ruby quite a while ago, but it seems to be quite deprecated now. If we look at not-so-open-source options, there's one called rclone as well. That's a tool that allows you to copy data between lots of different sources, which could be FTP, Dropbox, Google Drive, and it also includes S3. So it seems quite powerful, but we haven't tried it yet. And then there's another, a paid cloud service called Flexify. This is actually what DigitalOcean recommends for migrations. We haven't tried this, but thought it was worth mentioning in case you wanted to just throw money at the problem, I guess. It would be interesting to benchmark this. It depends on your use case, of course. But I wonder about tools like Mountpoint for S3, which we covered in a previous episode: if you just mount two different S3 buckets using different credentials on your file system and then do an rsync between them, what would the performance of that be like? I'm always kind of interested but skeptical about solutions that try to map object storage into a file system abstraction. But Mountpoint does work well for some cases, and the same goes for the FUSE-based, what do you call it, user-space file systems for S3 as well. So those are options that I'm interested to get other people's take on as well.
A
Yeah, and this seems like a common enough problem that I'm surprised that there isn't really a lot of literature out there or a lot of solutions. And I think it's going to become a more common problem with all these other solutions that are appearing everywhere. So I'm just curious to see what other people have, like if they had this kind of use case and what kind of solutions they came up with.
B
Will Cloudflare and all these other vendors start adding tooling to do a one-click import from an S3 bucket, do you think?
A
I wouldn't be too surprised to be honest if they do because I guess it's in their best interest. It's almost like all these tools that are trying to compete with like newsletter things and they all have imports from mailchimp. Right. Because it makes sense for them to try to make it easier for new customers. So I wouldn't.
B
They always work one way only, though. They never allow you to export.
A
Yeah. So I think that's all we have for today. Again, we are really curious to hear from you. Have you dealt with this kind of problem? Don't be shy. Let us know because we are always eager to learn from you and share our experience, not just in one way. So please give us some of your experience. But before we wrap up, let's hear it from our sponsors. We promised to tell you a little bit more about fourTheorem, and thank you, fourTheorem, for supporting yet another episode of AWS Bites. Migrating data is hard, but optimizing your cloud setup doesn't have to be. That's where our friends at fourTheorem come in. It's an AWS Advanced Consulting Partner specialized in serverless-first solutions to slash costs, scale seamlessly, and modernize your cloud applications. So whether you are streamlining infrastructure, accelerating development, or turning your tech team into a profit powerhouse, fourTheorem is there to help you maximize your AWS investment. Check out fourTheorem at fourtheorem.com and you can find a trusted partner for your next AWS project, I'm sure. So thank you everyone, and we'll see you in the next episode.
Hosts: Eoin Shanaghy & Luciano Mammino
Date: April 3, 2025
This episode explores the practical and strategic challenges of migrating large amounts of data between Amazon S3 and S3-compatible storage solutions. Eoin and Luciano discuss why organizations might look to move data away from (or into) S3, the growing landscape of S3 alternatives, and their experience building an open-source CLI tool—S3Migrate—to solve complex, cross-provider data transfer needs when standard solutions fall short.
“There are more and more S3 compatible alternative storage services and some of them are actually becoming really, really competitive on pricing... this is actually something that can be an effective strategy to save some cost.”
—Luciano (00:40)
“It seems like the market leader is keen to make it difficult for people to do egress, but the new entrants are very keen to say that they want to make that as cheap as possible.”
—Eoin (02:51)
“Effectively it's called S3Migrate. It tries to do something somewhat similar to AWS s3 sync, but allows you to provide two separate sets of credentials. This is probably the main difference...”
—Luciano (11:22)
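Technical stack: Node.js with TypeScript, Commander.js for CLI argument parsing, SQLite for the local state file, and the AWS SDK for JavaScript v3.
Rationale for technology choice: the team's Node.js and TypeScript experience made delivery faster, and network speed, not the language, is the real bottleneck.
Published on npm and runnable via npx; set credentials via ENV vars or .env files.
Workflow:
s3migrate catalog: Scans and stores file metadata locally.
s3migrate copy: Copies outstanding files, tracks completion.
Credential handling: the usual AWS-style environment variables act as defaults, with separate overrides for source and destination; the tool also loads a .env file from the current working directory.
Limitation: as a CLI, it has to run as a long-lived process on a host that you set up and monitor.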
“When you run a get object command... the body that you receive in the response is a Node.js stream... you are not eagerly consuming that data... you can pipe [read/write] them together.”
—Luciano (18:19)
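By design not supported: copying attributes, tags, ACLs, storage classes, lifecycle configuration, or encryption options; only object data is copied.
Wishlist: multipart upload support for large objects.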
“If anyone is open, maybe you are using this tool and you find it useful. It's open source, so feel free to send a PR. This is one feature that we would love to see.”
—Luciano (25:55)
“I'm always kind of interested but skeptical about solutions that try to map object storage into a file system abstraction. But Mountpoint does work well for some cases...”
—Eoin (28:21)
“We are really curious to hear from you. Have you dealt with this kind of problem? Don't be shy. Let us know because we are always eager to learn from you and share our experience.”
—Luciano (29:31)
This episode provides a practical, detailed look at the nuances of migrating data out of—or into—S3 and other object storages. The hosts’ open-source project, S3Migrate, fills a notable gap for those needing more flexibility than standard AWS tools provide. They encourage contributions, experience sharing, and further community discussion as S3-compatible migration becomes increasingly important for cloud professionals.