If you're not an Iron Mountain customer, this product probably isn't for you. It wasn't built to back up your family photos and music collection.
Regarding other questions about transfer rates - using something like AWS Import/Export will have a limited impact. While the link between your device and the service will be much fatter, the reason Glacier is so cheap is the custom hardware. They've optimized for low-power, low-speed operation, which will lead to cost savings from both lower energy use and longer drive life. I'm not sure how much detail I can go into, but I will say that they've contracted a major hardware manufacturer to create custom low-RPM (and therefore low-power) hard drives that can programmatically be spun down. These custom HDs are put in custom racks with custom logic boards all designed to be very low-power. The upper limit of how much I/O they can perform is surprisingly low - only so many drives can be spun up to full speed on a given rack. I'm not sure how they stripe their data, so the perceived throughput may be higher based on parallel retrievals across racks, but if they're using the same erasure coding strategy that S3 uses, and writing those fragments sequentially, it doesn't matter - you'll still have to wait for the last usable fragment to be read.
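To make the "wait for the last usable fragment" point concrete, here's a minimal sketch. It assumes a generic k-of-n erasure code (any k of the n fragments rebuild the object), and it reads "sequentially" as "fragments come off one spun-up drive, one after another". The 6-of-9 split and the 10 seconds per fragment are made-up numbers, not anything known about S3 or Glacier.

```python
def retrieval_time(fragment_ready_times, k):
    """Return when an object becomes readable under a k-of-n erasure code.

    The object can be rebuilt once any k fragments have arrived, so the
    answer is the k-th smallest fragment arrival time.
    """
    if not 1 <= k <= len(fragment_ready_times):
        raise ValueError("k must be between 1 and the number of fragments")
    return sorted(fragment_ready_times)[k - 1]


def sequential_ready_times(n, seconds_per_fragment):
    """Fragments read back to back from a single drive (assumed layout)."""
    return [(i + 1) * seconds_per_fragment for i in range(n)]


def parallel_ready_times(n, seconds_per_fragment):
    """Fragments read at the same time from n independent drives."""
    return [seconds_per_fragment] * n


# Hypothetical parameters, for illustration only.
N_FRAGMENTS = 9
K_NEEDED = 6
SECONDS_PER_FRAGMENT = 10.0

sequential = retrieval_time(
    sequential_ready_times(N_FRAGMENTS, SECONDS_PER_FRAGMENT), K_NEEDED
)
parallel = retrieval_time(
    parallel_ready_times(N_FRAGMENTS, SECONDS_PER_FRAGMENT), K_NEEDED
)
print(f"sequential: {sequential:.0f}s, parallel: {parallel:.0f}s")
```

Under these assumptions the sequential layout takes k times as long as the fully parallel one, no matter how fat the pipe on the other end is.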
I think this will be a definite game-changer for enterprise customers. Hopefully the rest of us will benefit indirectly - as large S3 customers move archival data to Glacier, S3 costs could go down.
My backup "wouldn't it be cool if" idea is, unlike the reasonable points above, a joke: imagine 108 USB hard drives chained to a poor PandaBoard ES, running a fistful at a time:
The Marvell ARM chipsets at least have SATA built in, but I'm not sure if you can keep chaining out port expanders ad infinitum the same way you can with USB. ;)
Thanks so much for your words. I'm nearly certain the custom logic boards you mention are done with far more vision, panache, and big-scale bottom-line foresight than these ideas. Even some CPLD multiplexers hot-swapping drives would be a sizable power win over SATA port expanders and USB hubs. Check out the port expanders on OpenCompute Vault 1.0, and their burly aluminium heat sinks:
Then there are failure conditions. EBS is an S3 customer. Glacier is an S3 customer. Some amount of isolation is desirable. If a bad code check-in from an S3 engineer causes a systemic error that takes down a DC, it would be nice if only S3 were impacted.
I probably shouldn't go into the hardware design (because 1) I'm not an expert and 2) I don't think they've given any public talks on it), but it's some of the cooler stuff I've seen, especially when it came to temperature control.
But at its price points, with most US families living under pretty nasty data cap or overage regimes, it sounds superb, with of course the appropriate front ends.
There's no good (reliable), easy and cheap way to store digital movies, e.g. DVD recordable media is small by today's standards and it's much worse than CD-Rs for data retention (haven't been following Blu-ray recordable media, I must confess, I bought an LTO drive instead, but I'm of course unusual). And the last time I checked very few people made a point of buying the most reliable media of any of these formats.
In case of disk failure, fire, tornado (http://www.ancell-ent.com/1715_Rex_Ave_127B_Joplin/images/ ... and rsync.net helped save the day), for this use case you don't care about quick recovery so much as knowing your data is safe (hopefully AWS has been careful enough about common mode failures) and knowing you can eventually get it all back. Plus a clever front end will allow for some prioritizing.
An important rule learned from Clayton Christensen's study of disruptive innovations (where the hardest data comes from the history of disk drives...) is that you, or rather AWS here, can't predict how your stuff will be used. So if they're pricing it according to their costs, as you imply, they're doing the right thing. Me, I've got a few thousand Taiyo Yuden CD-Rs whose data is probably going to find a second home on Glacier.
ADDED: Normal CDs can rot, and getting them replaced after a disaster is a colossal pain even if your insurance company is the best in the US (USAA ... and I'm speaking from experience, with a 400+ line item claim that could have been 10 times as bad since most of my media losses were to limited water problems), so this is also a good solution for backing them up. Will have to think about DVDs....
I personally don't have a feel for enterprise archival requirements (vs. backups), but I do know there are a whole lot of grandparents out there with indifferently stored digital media of their grand-kids (I know two in particular :-); the right middlemen plus a perception of enough permanent losses of the irreplaceable "precious moments" and AWS might see some serious business from this in the long term.
Separate from the play for replacing tape, there's also the ecosystem strategy. When you run large portions of your business using Amazon's services, you tend to generate a lot of data that ends up needing to be purged, else your storage bill goes through the roof. S3's Lifecycle Policy feature is a hint at the direction they want you to go - keep your data, just put it somewhere cheaper.
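As a rough illustration of the "keep your data, just put it somewhere cheaper" idea, here's a sketch of a lifecycle rule in the shape the present-day S3 API accepts (the `LifecycleConfiguration` argument of boto3's `put_bucket_lifecycle_configuration`). The helper name, rule ID, `logs/` prefix, bucket name, and 90-day threshold are all made up for illustration, and the snippet only builds the rule; it doesn't talk to AWS.

```python
def make_archive_rule(prefix, days_until_archive, rule_id="archive-to-glacier"):
    """Build one lifecycle rule that moves objects under `prefix` to Glacier.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    if days_until_archive < 0:
        raise ValueError("days_until_archive must be non-negative")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": days_until_archive, "StorageClass": "GLACIER"},
        ],
    }


# Example: archive anything under logs/ once it is 90 days old.
lifecycle_configuration = {"Rules": [make_archive_rule("logs/", 90)]}

# With AWS credentials configured, the dict could be applied like this
# (bucket name is hypothetical):
#
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-example-bucket",
#       LifecycleConfiguration=lifecycle_configuration,
#   )
```

The point is that nothing gets deleted: the rule just changes where older objects live.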
This could also be the case where they think they're going after tape, but end up filling some other, unforeseen need. S3 itself was originally designed as an internal service for saving and retrieving software configuration files. They thought it would be a wonder if they managed to store more than a few GB of data. Now look at it. They're handling 500k+ requests per second, and you can, at your leisure, upload a 5 TB object, no prob.
But maybe you're right. The thing could fail. Too expensive. After all, 512k ought to be enough for anybody.
Clearly, with online differential backups, you might be able to do things more intelligently.
I'm already looking forward to using Glacier, but for the foreseeable future it looks like "High End" archiving will be owned by tape. And just as Glacier will (eventually) make sense for >100 terabyte archives, I suspect tape density will increase, and then "High End" archiving will be measured in petabytes.
The tradeoffs will be different depending on how many tapes you write and how often you reuse them.
Agreed - how often you reuse tapes (and whether you do) has a dramatic effect on the "system cost" of your backup system.