Great write-up; really interesting that the CPU ended up being the bottleneck in this experiment! Regarding the cost of sending this data out of AWS, did you run into any issues there using rsync? IIRC rsync copies the data over TCP like any other transfer, so wouldn't the egress charges make this expensive as well? In any case, that was my favorite part of the experiment!

My use case reduced 10 TB to only a couple of GB after processing, so downloading the result was very cheap.
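
To put rough numbers on that, here is a back-of-the-envelope sketch. The per-GB rate is a placeholder I made up for illustration, not a quoted AWS price, and the flat-rate model ignores pricing tiers and free allowances; plug in the current rate for your region. "A couple of GB" is taken as 2 GB.

```python
def egress_cost_usd(size_gb: float, rate_usd_per_gb: float) -> float:
    """Flat-rate estimate of a data-transfer-out charge: size times rate.

    Deliberately simplified: no pricing tiers, no free allowance.
    """
    return size_gb * rate_usd_per_gb


# ASSUMPTION: hypothetical rate in USD per GB, for illustration only.
ASSUMED_RATE_USD_PER_GB = 0.09

raw_gb = 10 * 1000   # 10 TB of input data (decimal units)
processed_gb = 2     # "a couple of GB" after processing, taken as 2 GB

raw_cost = egress_cost_usd(raw_gb, ASSUMED_RATE_USD_PER_GB)
processed_cost = egress_cost_usd(processed_gb, ASSUMED_RATE_USD_PER_GB)

print(f"Downloading the raw data:   ${raw_cost:.2f}")
print(f"Downloading the result:     ${processed_cost:.2f}")
print(f"Size reduction factor:      {raw_gb / processed_gb:.0f}x")
```

Whatever the actual rate is, the cost scales linearly with the bytes leaving AWS, so processing in place and downloading only the output cuts the bill by the same factor as the size reduction.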

