So what do you do with those three-year-old 1 TB hard drives whose power-consumption-to-space ratio is no longer good enough? You can of course destroy them. Or you actually build a disk drive robot, fill the disks with Glacier data, simply spin them down, and store them away. Zero cost to buy the drives, zero cost for power consumption. Then add a 3-4 hour retrieval delay to ensure that those old disks don't have to spin up more than 6-8 times a day even in the worst case.
But that's just my personal theory anyway.
The problem with this theory, however, is that tape still would have been cheaper with roughly the same footprint, and tape has the benefit of often being forward-compatible too: as the drives improve, so does the storage capacity of the tape.
The author dismisses the idea that Amazon is using tape, but I haven't seen much evidence to support that.
Maybe the reality is that they use a bit of everything: a robotic disk library, a robotic tape library, maybe a robotic optical library. Maybe they're secretly ahead of the rest of current tape technology and are getting 20 TB out of a single tape. They could be using custom hard drives, or even a robotic platter library.
One thing for sure is that it's not active disk drives, and I don't believe they keep all the data on optical discs alone, given that optical discs degrade much more quickly than magnetic storage.
HDDs are not designed for long-term archiving. They tend to fail hard after being moved or left switched off for a long time.
This is also surprisingly easy to do with tape - simply leaving an LTO tape on its side can push failure rates to significant levels within a year.
If you care about data there's no substitute for multiple copies which are regularly verified. The idea that you can leave something on the shelf and expect to reliably read it is a dangerous myth. If you care about archival, build a system with the staffing and procedures needed to make that happen. This is enormously easier and cheaper to do with spinning disk below a certain level but if you have enough data the lower media cost of tape will balance out the increased overhead.
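For what it's worth, the "regularly verified" part doesn't need anything exotic. Here's a minimal sketch of a scrub pass over one copy of an archive - the paths and manifest format are made up for illustration, not anything from the article:

    import hashlib
    import json
    import os

    def sha256_of(path, chunk_size=1 << 20):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    def scrub(archive_root, manifest_path):
        """Return (missing, corrupted) file lists relative to the manifest."""
        with open(manifest_path) as f:
            manifest = json.load(f)          # {relative_path: expected_sha256}
        missing, corrupted = [], []
        for rel_path, expected in manifest.items():
            full = os.path.join(archive_root, rel_path)
            if not os.path.exists(full):
                missing.append(rel_path)
            elif sha256_of(full) != expected:
                corrupted.append(rel_path)
        return missing, corrupted

    # Hypothetical paths; run the same check against every copy, on a schedule.
    missing, corrupted = scrub("/mnt/archive_copy_a", "/mnt/archive_copy_a/manifest.json")
    print("missing:", missing, "corrupted:", corrupted)

Repair from a known-good copy whenever either list is non-empty; the verification is the easy part, the staffing and procedures to actually act on it are what most "shelf" archives lack.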
Do you have a source for that? I hadn't heard of it before, and I'm now interested to know by what percentage the failure rate increases.
I'd never heard about that before, either (not having had tape storage as a primary job), but a colleague was quite surprised by 20-30% failure rates for tapes first used within a year and asked our drive vendor about it. At the time, there was nothing mentioned on the media we bought and the tape vendor didn't have any official docs but the drive technician we dealt with said that it was well known among the support group - apparently a lot of customers either didn't know or didn't assume it'd be so dramatic.
They don't mention the rates though
A single disk certainly is bad for long term archive. But when you have thousands of disks and use multiple copies for all data with some fancy error correction you can still get extremely high reliability, you just need to calculate in the expected failure rate.
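A rough illustration of that calculation - the 5% annual failure rate and the assumption of independent failures are mine, purely for the example:

    # A real system also re-replicates after each failure, which pushes
    # durability far higher than this naive all-copies-fail number.
    def loss_probability(p_per_copy, copies):
        """Probability that every copy of an object fails in the same period."""
        return p_per_copy ** copies

    for copies in (1, 2, 3, 4):
        print(copies, "copies -> annual loss probability ~",
              f"{loss_probability(0.05, copies):.1e}")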
The best part about a robot is that you can take things out of a library and move them somewhere without power or epic cooling.
The drives and firmware you want for 24/7 use are not the same as what you want for intermittent use. On big old raid arrays, the scariest thing was powering them down because for sure, some of the disks would not come back again.
They could even have them just plugged in to power and lying in racks with custom firmware that triggers drives to automatically spin up every few hours/days. Then they could just manually plug them in when data from a drive is requested.
secretamznsqrl [dead]:
AWS does not re-use disks under any circumstance. It is strictly forbidden to protect customer data. Disks also do not leave the datacenter until they have been degaussed AND crushed.
“They’ve optimized for low-power, low-speed, which will lead to increased cost savings due to both energy savings and increased drive life. I’m not sure how much detail I can go into, but I will say that they’ve contracted a major hardware manufacturer to create custom low-RPM (and therefore low-power) hard drives that can programmatically be spun down. These custom HDs are put in custom racks with custom logic boards all designed to be very low-power. The upper limit of how much I/O they can perform is surprisingly low – only so many drives can be spun up to full speed on a given rack. I’m not sure how they stripe their data, so the perceived throughput may be higher based on parallel retrievals across racks, but if they’re using the same erasure coding strategy that S3 uses, and writing those fragments sequentially, it doesn’t matter – you’ll still have to wait for the last usable fragment to be read.”
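A toy illustration of that last point - with k-of-n erasure coding a restore can't complete until the k-th fragment is available, so one slow (spun-down) drive sets the pace. The coding parameters and latencies below are invented placeholders, not anything known about Glacier:

    import random

    def restore_latency(fragment_latencies_hours, k):
        """Time until the k-th fastest fragment has been read."""
        return sorted(fragment_latencies_hours)[k - 1]

    random.seed(0)
    n, k = 12, 8                       # hypothetical coding parameters
    # Pretend each fragment lives on a drive that may need spinning up first.
    latencies = [random.uniform(0.5, 4.0) for _ in range(n)]
    print(f"first fragment ready after {min(latencies):.1f} h, "
          f"restore complete after {restore_latency(latencies, k):.1f} h")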
Apologies for the diversion - what does this mean? Does it mean that when an item is erased from S3, S3 "encodes" the data so that the next person who gets the same physical disk space can't read what was there before?
Imo, they found a sweet spot of $/GB in a much higher-latency, lower-reliability region (analogous to increasing the overall capacity of a communication channel by using optimally many unreliable, low-power symbols with error correction instead of a few highly reliable, high-powered ones) -- disk manufacturers already use this aggressively for soft failures within the disk, but are obviously restricted on more systematic failures (i.e. if the whole drive fails there's nothing they can do).
If a single drive has failure probability P_failure, then with many drives they can get close to a (1 - P_failure) fraction of the raw capacity as reliable storage. So all they have to do is seek the optimum
$/GB_opt = min over C,D [ C / (D * (1 - P_failure(C))) ],
where C is the cost per drive and D is the drive capacity.
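Here's that optimisation as a toy calculation; the candidate drives (cost, capacity, failure probability) are invented numbers just to show the trade-off:

    # Invented candidates: (cost in $, capacity in GB, failure probability)
    candidates = [
        (180.0, 4000, 0.02),   # pricier, more reliable
        (120.0, 4000, 0.05),   # ordinary consumer drive
        ( 80.0, 3000, 0.10),   # cheap and crappy
    ]

    def cost_per_reliable_gb(cost, capacity_gb, p_failure):
        return cost / (capacity_gb * (1.0 - p_failure))

    for c in candidates:
        print(c, "->", round(cost_per_reliable_gb(*c), 4), "$/usable GB")
    print("optimum:", min(candidates, key=lambda c: cost_per_reliable_gb(*c)))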
- ability to transparently switch to slower technologies in the future
Now factor in that the disks are both large and crappy. A 120 MB/s read speed and a 1 TB disk would imply about 8,000 seconds, roughly 2 hours, to read a whole disk. Factor in the possibility of differential pricing as you mentioned (even longer queues), and you may get an upper bound of 3 or 4 hours.
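The arithmetic, spelled out (taking the figures as 120 MB/s and 1 TB):

    capacity_bytes = 1e12            # 1 TB drive
    read_rate = 120e6                # 120 MB/s sequential read
    print(capacity_bytes / read_rate / 3600, "hours")   # ~2.3 h to stream a full drive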
I'm just speculating though.
I wouldn't be surprised if Glacier's latency isn't purely artificial so much as a deliberate design decision that lets the architecture be very different from S3: pure streaming I/O, huge block sizes, nowhere near the same concurrent access, etc. That much time allows really aggressive disk scheduling, and it'd make it much easier to do things like spread data across a large number of devices with wide geographic separation.
Back in 2000, our studio switched from archiving our audio recordings on DAT to CD-R. Thinking the greatest threats were loss and scratches, they made three copies of every disc and stored them in different locations. Around 2007, all of those CD-R discs started becoming unreadable around the same time.
Video recordings were archived to DVD-R starting in 2003 with the same three copy approach. In 2009, after having successfully rescued all our audio recordings on to RAID5 network attached storage, we weren't so lucky with saving our early DVD-R data.
I always assumed the reason DVD-R discs lasted less time than CD-R discs before chemical decomposition was because of the greater density, but I never looked into it enough to know for sure.
I have to imagine if Amazon is indeed doing this, they have some sort of plan in place to write all data to fresh optical media every five years or so. If so, that would create a very different cost equation.
Etches instead of dye; the result is a claimed 1,000-year lifetime. The DoD has evidently certified it.
However, I worry about even finding optical readers in 10 years time.
The worry of not being able to find a device to read/play back your media is a common one for all storage types, analog and digital. For example, we have hundreds of reels of quarter inch tape from the fifties and sixties down in our library. The tapes themselves are safe (some may need a day in an easy-bake oven), but otherwise they still sound great. We recently had our Studer A-820 refurbished and it's probably in better than new condition, but I worry there won't be any people to do the service, or parts to do the service with if it needs attention in ten years.
The more obscure the medium, the trickier it gets. We have a bunch of early digital recordings stored on Betamax tapes that need both a working Betamax deck and a PCM encoder-decoder box in order to be read. Fortunately, those have all been transferred now.
This is why I'm a believer in thinking of maintaining an archive as being an active rather than static process. It's important to be periodically re-evaluating your digital assets to make sure they can be losslessly transferred to current file formats and modern storage media. It was probably never a good idea to simply "put it on a shelf and forget about it", but thankfully with digital assets, these migrations can be lossless, automated, and tested.
The NASA tapes problem (where they couldn't find drives) is definitely a concern on longer time scales. USB seems widespread enough, too.
It would be cute if someone made a self-contained archival device with display, designed for 100+ year lifespan. Solar powered (although generally external power is a simple enough interface that as long as specs are given, it shouldn't be too hard to recreate), multi-language, redundant, etc. Ideally with periodic integrity checks, a duplication function, etc. built in.
Seems kind of like an Internet Archive project, or OLPC or something.
CD-Rs ... who made the discs, and how did you store them? Or did you buy a huge batch at once? I've had great success for over a decade with Taiyo Yuden CD-Rs, over 3,000 recorded and not a single failure to read that wasn't due to a bad read drive, easily solved by using another. Although those retrieval results aren't yet statistically significant.
I've also had great luck with the Taiyo Yuden CD-Rs. We go through about a thousand discs a year for people who prefer to have a CD copy of their music to being able to download a file, and I can't remember the last time I had a failure. However, now that our master copies are 24-bit wav files on network storage, I haven't been as concerned with how those discs have been aging.
I suppose we probably have enough samples for an interesting study on the longevity of optical media, but unfortunately I just don't see us having the time to devote to something like that.
-RW discs supposedly last a lot longer because the RW tech uses a phase change metal instead of dye, and the phase changes are more durable.
It appears as though the early DVD-R discs were Mitsui MAM-A, with us switching over to Taiyo Yuden in early 2007. Since we have all that stuff on network storage, I haven't gone back to check how the Taiyo Yuden discs have fared over time.
This is the first time I can remember ever hearing of HHB gold archival CD-R discs. That's probably not a good sign given how much I was into this field back then; they certainly weren't on the recommended list of anyone I respected. After Kodak stopped making their gold discs, it was Taiyo Yuden, or Mitsui/MAM-A if you felt it gave you extra protection.
Ah, you did check to make sure that each disc burned could be read back, didn't you?
Now, when I want to see how bad a disc has gotten, I use PlexTools to do a Q-Check. However, we haven't considered anything on optical media to be a master copy for a while (at least five years). CDs only get burned when someone wants to listen to something in their car.
secretamznsqrl [dead]:
Here is a Glacier S3 rack:
3 x Servers
2 x 4U JBOD per server, each containing around 90 disks or so
Disks are WD Green 4TB.
Roughly 2 PB of raw disk capacity per rack
The magic as with all AWS stuff is in the software.
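A quick sanity check on those numbers (reading them as ~90 disks per JBOD):

    servers = 3
    jbods_per_server = 2
    disks_per_jbod = 90
    tb_per_disk = 4
    raw_tb = servers * jbods_per_server * disks_per_jbod * tb_per_disk
    print(raw_tb, "TB raw")          # 2160 TB, i.e. roughly 2 PB per rack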
I worked at AWS. The OP flatters AWS by arguing that they take care to make money and assuming that they are developing advanced technologies. That's not how Amazon works. Glacier is S3, with added code that waits; that's all that was needed. A second or third iteration could be something else, but this is what Glacier is now.
It all makes sense.
Here's a video of a 40PB library at NERSC:
What? You partially eliminate a few strawmen, and thus conclude that your pet theory is the only answer? The logic in this article is so weak that I presume this must be an example of 'parallel construction'. I have to guess that someone he trusts but is not allowed to quote has told him that Amazon is using BDXL disks, and he's pretending to reason himself to the same conclusion. My next best guess is that he's a few weeks late on his April Fool's post.
Assuming aggressive forward pricing by Panasonic or TDK, Amazon probably paid no more than $5/disc or 5¢/GB in 2012. Written once, placed in a cartridge, barcoded and stored on a shelf, the $50 media cost less than a hard drive – Blu-ray writers are cheap – Amazon would recoup variable costs in the first year and after that mostly profit.
OK, maybe the April Fool's explanation should come first. Because if you were trying to come up with plausible logic, surely you could do better than declaring that Amazon's private price for BDXL media is 1/5th that known to the public and that all other costs are zero. And even if one were to assume this admittedly unlikely scenario, wouldn't you need to write more than one disc for redundancy? But that's easy to solve: just wave the magic wand and halve Amazon's secret price down to a level where it makes sense.
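To put numbers on the redundancy objection - taking the article's assumed 100 GB discs at $5 each (its numbers, not known Amazon costs) and varying the number of copies written:

    def media_cost_per_stored_gb(price_per_disc, gb_per_disc, copies):
        return price_per_disc * copies / gb_per_disc

    for copies in (1, 2, 3):
        cost = media_cost_per_stored_gb(5.0, 100, copies)
        print(f"{copies} copies: ${cost:.2f}/GB one-time media cost "
              f"(Glacier charges ~$0.01/GB per month)")

Even at the optimistic single-copy price, the media alone eats several months of Glacier revenue, and redundant copies scale that linearly.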
Powered down hard drives seem like a much simpler explanation. The robotics don't seem that difficult if instead of bringing the disk to the backplane you keep the disk fixed in place and just attach the cable. Presumably you could design your own sturdier and easier to align connector, and leave the adapter in place. Or maybe there is some way to do it with a mechanical switch? Or do you even need a robot? If you built a jig so you could plug in a whole drawer of drives at once, maybe you could just hire someone to do it.
But I have always assumed it was -- like the mysteriously/inappropriately killed comments from jeffers_hanging on this thread claim -- just S3 with hard delays enforced by software.
If you are already doing S3 for the entire world at exabyte scale, and you have petabytes of excess capacity and a bunch of aging, lower-performance infrastructure sitting around... do you really need to fuck with risky new optical disk formats?
It seems to me that you could just start with selling slowed-down rebranded S3, and then iterate on it.
Glacier as S3 with sleeps seems to be a pretty reasonable extrapolation of that same idea. Plus, the sleeps are long enough that if and when they build it for real, it could be economically viable.
That said, if the quoted 0.3 cents is true, it should be viable as it is by just stuffing way more density in a rack and keeping them powered down most of the time... Though that rack would probably weigh tons, so you'd probably want to reinforce the floor of the data center a bit. We saw equinix facilities with concrete floors and ventilation delivered from above, so something like that could be an option.
I fail to see how it is useful as a backup service. This got me into serious trouble: I built a whole backup system around it, and now it is not what it was supposed to be.
It's easy to get a nasty surprise. If I pull enough to saturate my 28Mbps DSL downstream for 4 hours (~51GB), I'd pay nearly $100 that month, not 51GB*($0.01/GB)=$0.51. You really need a Glacier-aware restore program that is told your maximum budget, knows the pricing formula, and carefully schedules the retrievals to keep the peak rate below that threshold.
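A sketch of where the "nearly $100" comes from, as I understand the original pricing: Glacier billed on your peak hourly retrieval rate, multiplied across every hour of the month (the free retrieval allowance is ignored here for simplicity):

    link_mbps = 28
    hours_retrieving = 4
    hours_in_month = 720
    price_per_gb = 0.01

    gb_retrieved = link_mbps / 8 * 3600 * hours_retrieving / 1000   # ~50 GB
    peak_gb_per_hour = gb_retrieved / hours_retrieving
    fee = peak_gb_per_hour * price_per_gb * hours_in_month
    print(f"retrieved {gb_retrieved:.0f} GB, billed roughly ${fee:.0f}")

Spreading the same ~51 GB evenly across the whole month would keep the peak rate, and hence the bill, far lower - which is exactly why a budget-aware restore scheduler matters.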
If our office burns down, and I had our offsite backups in Glacier, the retrieval costs are peanuts compared to cost of losing the data (going out of business). If my home burns down, and I have my offsite backups in Glacier, a few hundred dollars to retrieve data would be nothing compared to the emotional loss of years worth of photos etc.
But I have triple copies at home - minimum - of almost all of my data (mirrored drives + regular snapshots on a third drive; and a lot of the stuff is also synced to/from one or more other computers), and similar setups at work: An offsite backup is last resort.
From what I remember of it, it's custom HDs, custom racks, custom logic boards with custom power supplies. The system trades performance for durability and energy efficiency.
Robin Harris' deductions in this article are worthy of Arthur Conan Doyle...
Or is it that Amazon has no competitor here yet, good enough to push it to lower the prices?
Glacier now costs the same as Google Drive, except Drive has none of the restrictions. AWS on-demand prices have been cut so much that they've actually left some of their prepaid discount prices ('reserved instances', in their parlance) higher than what they sell on demand.
Only if your usage is exactly equal to the maximum usage from one of Google's pricing tiers.
I came to that conclusion while modeling cost for Eye-Fi. Power is the big driver.
I'm not even saying robots. Just leave the disk there and power it off.
There are those online backup services that are unlimited for $X/month, but I have been bait-and-switched by those services twice now, and it takes such an incredibly long time (months) to upload everything... so I've quit relying on those as well.
You are right that external hard drives are a cheaper option, but you must actually take the time to transfer the drives offsite. I'm lazy and since I can't automate physical transfer of drives, that does not work for me. That's why I use Glacier via Arq.
Aside: CrashPlan and a few others allow me to backup to drives I've placed at a friend's house. But I'm not interested in setting that up.
If someone marketed a turn-key solution like this (home backup appliance, plus vehicle-based transfer appliance), would this be a good alternative to online based services?
Encryption would solve that.
I'm kind of hoping a good+cheap service actually comes along. All I'd really need to do is take a backup to work though and swap them out once in a while -- but I'm not disciplined enough for that :)
I certainly wish things were cheaper, but all of the "do it yourself" options have left me more than a little worried at my own lack of discipline working against me.
I've been hearing about this since pets.com was a hot investment and not a whole lot has shipped since then. For a new storage technology that's particularly concerning since you don't know how accurate current guesses as to cost will be and, critically, nobody has a baseline to say what reliability will be like in the real world.
That said, BDXL has shipped in volume for years and has less new technology involved so I would be surprised if it's significantly different than older Blu-Ray or DVD/CD systems which have been heavily tested.
The cost issue just doesn't seem to compute for me. And that's leaving aside the custom storage system setup costs.
Considering Amazon bought Kiva Systems, which made highly intelligent floor robots for warehouses, they obviously have the talent to build the robots necessary for such an operation.
With hard drives it would make more sense to do some development on the electronics side and build a system where lots of drives can be simultaneously connected to a small controller computer. All of the HD's don't need to be powered on or accessible all the time, the controller could turn on only few of them at a time. And of course also part of the controllers could be normally powered off, once all the harddrives connected to them are filled.
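A toy sketch of that kind of controller - the power budget, drive IDs, and FIFO spin-down policy below are all made up for illustration:

    from collections import deque

    MAX_POWERED = 4                  # hypothetical per-controller power budget

    def schedule(requests):
        """requests: drive IDs needing a read, in arrival order."""
        pending, powered, log = deque(requests), [], []
        while pending:
            drive = pending.popleft()
            if drive not in powered:
                if len(powered) == MAX_POWERED:
                    powered.pop(0)   # spin down the drive powered up earliest
                powered.append(drive)
            log.append((drive, list(powered)))
        return log

    for drive, powered in schedule(["d12", "d7", "d12", "d33", "d98", "d41", "d7"]):
        print("read", drive, "| powered:", powered)

The point is just that retrieval requests queue until their drive earns a power slot, so most of the rack stays dark most of the time.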
Tape is awesome, cheap and proven. It fits the MO of Glacier. This is not to say that it's not BDXL, but I'd be very surprised if Amazon bet the entire house on something so untested.
Really? I don't think this is true, I have only seen the question mark when it is on the original article as well. Any other examples?
I'm a primary source on this, having done it many times myself, and I know PG has as well. Sorry, but I don't have time to dig up examples. It might be fairly straightforward, though, if you wanted to write code against the HN Search API.
I think it's just a virtual product. Probably unused S3 capacity, at a much lower price, not a different technology.