
Glacier redux - wmf
http://storagemojo.com/2014/04/30/glacier-redux/
======
WestCoastJustin
> _Old hard drives that are no longer economical for more intensive service,
> supported by disk-handling robotics._

I am with the author in that I do not think they are using older disks. From
personal experience (working with petabytes of disk-based storage), I would
put money on the fact that AWS is not using stock, off-the-shelf older disks
and powering them on/off as a method of storage. The _last_ thing you want to
do with an older disk (5+ years) is _ever_ power it off. The chances of it
coming back to life decrease rapidly with each cycle. Picture seized bearings,
etc. Your best hope is to keep them continually moving and never powered down
for extended periods of time.

Also, the ratio of GB to power and physical space stops making sense after X
number of years. For example, you could replace three shelves of 1TB disks
with one shelf of 3TB disks, yielding lower power consumption in less physical
space! When you are at scale, data centre space and power are big factors. For
these reasons, I do not think they are using old stock disks.

NOTE: Maybe they are using some custom-made disks, but from what I know about
standard disks, even enterprise ones, they do not like being powered on/off
once they get old.

~~~
hga
Emphasizing the above: the study a few years ago of disk drives at big HPC
centers found that they don't follow a bathtub failure model. Rather, there
are very few infant failures, and wear-out becomes noticeable starting at
around 1-2 years in service.

~~~
lmm
The Google results I saw were more like an exponential curve, i.e. a
memoryless distribution - which would mean old drives are just as reliable as
new ones.

~~~
hga
The HPC paper struck me as a lot more detailed, rigorous and useful. I read
the Google paper later (assuming we're talking about the same one; they came
out at roughly the same time), and the only useful takeaway I got from it was
that disk companies seem to have solved moderately high temperature issues;
Google actually saw a correlation between higher temperatures and longer life.

------
reitzensteinm
The author handwaves away the power savings of powering down disks by talking
about the capital cost of providing power.

"Unless the prices of copper, PDUs and diesel-generators have started
following Moore’s Law, this is probably more true today than in 2007."

But this fallaciously assumes that Glacier servers would need to be like EC2
or S3 servers that are switched on for some random number of hours per day.
This isn't the case at all.

Diesel generators, for instance - what's the use? If you're replicating the
data around the world, it doesn't matter if your Glacier servers in one
location are powered down for days.

Cooling, power - just switch off _every single_ Glacier server at each
location during the peak hours of the day. No additional capital costs,
because you're never adding to peak power usage.

Distribution of reads - Profile customers based on how frequently they're
reading from the data store. I bet 50% of customers never read a single byte
back. Colocate these customers on the same servers, and you only have to power
up the server once a month to check if the data is still there.

Bonus - Profile data access patterns with simple heuristics to determine
what's likely to be read back in, and temporarily store that data in S3.
Imagine a company that archives everything to Glacier daily, but restores
day-, week- and month-old backups regularly. Keep all data less than a month
old in S3, and the rest on the indefinitely powered-down servers described
above.
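
A minimal sketch of what such a profiling/tiering heuristic could look like,
purely as a guess at the idea (the thresholds, names and function are
hypothetical, not anything AWS has described):

    # Hypothetical sketch: route archives to S3 or a powered-down tier based on
    # age and the customer's observed restore behaviour. All numbers are made up.
    from datetime import timedelta

    HOT_WINDOW = timedelta(days=31)   # keep month-old-or-newer data warm
    RESTORE_THRESHOLD = 1             # restores observed in the last quarter

    def choose_tier(archive_age, restores_last_quarter):
        """Return 's3' for data likely to be read soon, else 'cold' (powered-down disks)."""
        if archive_age <= HOT_WINDOW and restores_last_quarter >= RESTORE_THRESHOLD:
            return "s3"
        return "cold"

    # A customer who restores week-old backups regularly stays warm;
    # a customer who has never read a byte back goes to the cold tier.
    print(choose_tier(timedelta(days=7), restores_last_quarter=12))   # -> s3
    print(choose_tier(timedelta(days=400), restores_last_quarter=0))  # -> cold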

I actually quite liked the author's case for BDXL, but it seems he's attacking
a straw man of the other possible solutions. Is well-implemented BDXL more
cost-efficient than a naively implemented disk strategy? Not exactly news.

~~~
raverbashing
Makes me wonder something

Database machines are usually never shut down. And when they are, it's usually
a manual process (unless we're talking about a power failure or something)

"Diesel generators, for instance - what's the use?"

Backup power (not only for Glacier).

Reducing power consumption in peak hours is a good strategy; however, I'm not
sure how worried Amazon is about this (or about the difference in price
between regular hours and peak hours).

~~~
reitzensteinm
It's not about the price they're paying for power (peak hours from the power
company's perspective), it's about the peak usage of their data center.

As the article (correctly) says, the capital costs are dependent on your peak
usage. If you have 1000 servers using 400kW at peak, you need sufficient air
conditioning to extract 400kW worth of heat, and backup generation capable of
producing 400kW. It doesn't matter if you only use 100kW 16 hours a day - the
capital costs are the same.

I'm suggesting that Glacier could live entirely in non-peak periods, meaning
that the capital costs are unchanged and the demand curve is flattened.
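
A toy illustration of that point, with made-up numbers (nothing here is AWS
data):

    # Capital costs (cooling, generators, PDUs) track the *peak* draw, not the average.
    hourly_base_kw = [400 if 8 <= h < 16 else 100 for h in range(24)]  # EC2/S3 load
    glacier_kw = 150                                                   # cold-storage racks

    # Run the Glacier racks only in off-peak hours: the peak never moves.
    with_glacier = [kw + (glacier_kw if kw < 400 else 0) for kw in hourly_base_kw]

    print(max(hourly_base_kw))  # 400 kW peak before
    print(max(with_glacier))    # still 400 kW peak after; capital costs unchanged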

~~~
raverbashing
The real question then is: is there really a peak hour for datacenter
usage/consumption? What's the consumption difference between peak and regular
hours (and low-demand hours)?

How much power does Amazon use during the morning compared to peak Netflix-
watching time?

~~~
keypusher
Yes, I'm sure there is. Look at Google clicks by hour or similar metrics:
there is a significant curve during the day, and peak can be 4-5x the low.

------
greatzebu
If you have a lot of heavily-read data distributed across machines, you're
probably constrained by available spindles rather than available storage
space. So co-locating data that will almost never be read with heavily-read
data is effectively free. I would guess that Amazon stores Glacier data
alongside S3 data, and the lower price reflects the fact that the limiting
factor in their storage system is IO rather than capacity.
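
Rough numbers to make the spindle-vs-capacity point concrete (the figures are
illustrative assumptions, not Amazon's):

    # If hot S3 traffic saturates a drive's IOPS long before its capacity,
    # the leftover bytes are effectively free for cold (Glacier) data.
    drive_capacity_tb = 4.0   # assumed commodity drive
    drive_iops = 100          # rough random-IOPS budget of a 7200 rpm disk
    hot_iops_per_tb = 50      # assumed IOPS demand per TB of hot data

    hot_tb_per_drive = drive_iops / hot_iops_per_tb          # 2 TB fills the spindle
    free_tb_for_cold = drive_capacity_tb - hot_tb_per_drive  # 2 TB left over per drive

    print(hot_tb_per_drive, free_tb_for_cold)  # 2.0 2.0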

~~~
duskwuff
That doesn't offer any clear explanation of why they would charge extra for
early deletion of that data, though.

~~~
greatzebu
One possible explanation: the cost of initial IO to write your data is
amortized over that data's lifetime. The early-delete penalty ensures that
Amazon always makes enough on your data to justify the cost of writing it into
their system in the first place.
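
Roughly, something like this (illustrative numbers; the fee structure is my
reading of the idea, not Amazon's published formula):

    # The ingest cost of a GB is recouped out of its first few months of storage
    # revenue; delete early and you pay for the unamortized remainder.
    storage_price_per_gb_month = 0.01   # Glacier's headline storage price at the time
    amortization_months = 3             # window assumed to cover the write cost

    def early_delete_fee(months_stored):
        remaining = max(0, amortization_months - months_stored)
        return remaining * storage_price_per_gb_month

    print(early_delete_fee(1))  # deleted after 1 month -> charged for 2 more
    print(early_delete_fee(5))  # past the window -> no fee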

------
michaelt

      But NONE of the Hacker News commenters addressed Sony and 
      Panasonic’s continued investment in high-density optical 
      disc technology. [...] There has to be a business reason 
      for the continued investment, i.e. customers prepared to 
      buy a lot of product in the future and buying a lot right 
      now.
    

If Amazon were the only customer for high-density optical storage, Sony and
Panasonic would be crazy to invest in developing it, because Amazon would have
all the negotiating power in the relationship.

There must be other customers somewhere if _two_ companies are continuing to
develop this stuff.

~~~
eli
Is it really that unusual to have two vendors competing for a single very
large customer?

~~~
dsr_
If the customer is a major government, no. Otherwise? Yes.

------
xhrpost
The deletion charge is interesting. Does deletion guarantee a scrubbing of the
data as soon as possible? If so, I could see that as justification for the
fee, since some sort of significant work is involved in retrieving data, as
implied by the hours of wait time. If not, though, the author has a good point
in that Amazon needs to make at least $0.03/GB in order to be profitable.

~~~
watson
I would be surprised. Anyway, why would you pay for scrubbing if the data is
less than 3 months old, but get it for free if not? If they do scrub, I don't
think the two are related.

------
batbomb
Okay, this is beyond silly. Still ignoring tape.

Explain how BDXL, a new format that has never proved itself over more than a
few years, that costs $45 for a few hundred GB, and for which there is AFAIK
very little data on re-writability (which is probably terrible and close to
one-time use), could be any more profitable, reliable, or useful than tape.

I don't buy it. Tape would be the logical short-term choice to get started,
because Amazon could just go buy an off-the-shelf tape library and add tapes
from Oracle as needed, versus, again, engineering their own BDXL library on
the expectation that it would cost less than tape, taking into account factors
such as:

1. Reliability and degradation
2. Supply
3. Cost
4. Reusability/re-writability

~~~
dmourati
Not tape.

[http://www.zdnet.com/amazon-launches-glacier-cloud-
storage-h...](http://www.zdnet.com/amazon-launches-glacier-cloud-storage-
hopes-enterprise-will-go-cold-on-tape-use-7000002926/)

~~~
batbomb
No, it doesn't definitively say the backing storage is not tape-based. It only
says:

    
    
        "Essentially you can see this as a replacement for tape," 
    

and:

    
    
        "inexpensive commodity hardware components"
    

Nowhere did they explicitly deny that it may be tape-backed.

In addition, from the article:

    
    
         Instead, Glacier runs on "inexpensive commodity hardware components", he said, noting that the service is designed to be hardware-agnostic.
    

Which may allude to the fact that the backing storage itself may be flexible
(a combination of HDD, tape, possibly BDXL).

The author himself only acknowledges:

    
    
        This suggests the system will be based on very large storage arrays consisting of a multitude of high-capacity low-cost discs.
    
    

Which isn't definitive in the slightest. Also, the article is over 18 months
old.

I've said it before: I wouldn't be terribly surprised if they used old
commodity hardware to get started, but the economics and characteristics of
tape still seem much more amenable to the use case.

~~~
dmourati
"Replacement for tape" is tape? OK.

~~~
VintageCool
"Essentially you can see this (our tape) as a replacement for [your] tape".

------
justinsb
What I'd like to see is a storage product that aggregates the unused storage
space on EC2 instances.

* By default instance storage isn't attached, so there's probably a lot of completely available capacity.

* Even if attached, it's rare that the full capacity would be used, so thin provisioning would leave some space available.

* Some host machines won't be fully allocated.

I imagine taking this pool of capacity and using erasure coding and
replication to build reliable storage. As your volumes come and go, you need
to make sure it remains available, which is why I imagine erasure coding
across a large number of customers. By integrating with the guest -> host
assignment function you can ensure that you never lose data, if need be
delaying scheduling until you've copied data elsewhere.
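
To make the erasure-coding part concrete, here's a toy single-parity version
(a real system would use something like Reed-Solomon across many more hosts,
but the shape is the same):

    # Split a blob into k data chunks plus one XOR parity chunk, so any single
    # chunk (i.e. any one host's scrap of free space) can vanish without data loss.
    from functools import reduce

    def xor_bytes(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    def encode(data: bytes, k: int = 4):
        size = -(-len(data) // k)  # ceiling division
        chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(k)]
        return chunks, reduce(xor_bytes, chunks)

    def rebuild(chunks, parity, lost):
        survivors = [c for i, c in enumerate(chunks) if i != lost]
        return reduce(xor_bytes, survivors + [parity])

    chunks, parity = encode(b"archive bytes spread across idle EC2 disks")
    assert rebuild(chunks, parity, lost=2) == chunks[2]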

You'd have to throttle reads & writes to ensure that the guests weren't unduly
impacted (easier with SSDs' predictable IOPS), and the splitting and erasure
coding would make for slow reads & writes as well. But this makes the
economics a lot more attractive (free!).

One thing that suggests Glacier could be doing this: if I was AWS, and I was
doing this, I would not be in any rush to tell EC2 customers that I was
"stealing" their unused capacity!

~~~
GregorStocks
I don't see why Amazon would want to buy enough disks to support 100% of what
they promise and then sell the unused capacity at below cost when they could
just buy fewer disks instead. Either way they'd be in trouble if their EC2
customers suddenly started wanting to use all the space they were promised.

~~~
justinsb
It's a good point: a cloud provider could choose to try to under-provision
disks instead. The problem though is that (some) disks are local to the
machine; it's not easy to move physical hard disks around when the
calculations are wrong (and if you have to get it right on a per-host basis,
it's more likely to go wrong). It is however easy to move chunks of data
around, particularly if you use something like erasure coding to give you a
huge amount of flexibility.

In short, your way is a good alternative, but my guess is that buying the full
capacity and selling the surplus is probably roughly cost-equivalent, and
considerably less likely to end up with you not being able to sell the full
capacity of any given host.

------
enricotal
The Amazon Glacier secret is hiding in plain sight. I think it is just a
virtual product: probably unused S3 capacity at a much lower price, not a
different technology.

~~~
asn0
Maybe Glacier stores data in the unused parts of S3 disk sectors: the empty
part of the last sector of files that don't completely fill 512k. It takes 4-5
hours to retrieve because it has to pull the data without impacting dozens or
hundreds of S3 drives.

------
vilpponen
Full disclosure: I work at UpCloud, a cloud hosting provider, and I think
about business in this industry day and night.

One thing that gets overlooked in almost all comparisons is the pricing model.
I actually use Glacier through Arq (a brilliant backup tool for Mac), but the
catch is in the requests. I recently uploaded 200GB of photos to Glacier and
the upload process cost me about $10. The monthly storage is about $2.

The thing is that you shouldn't compare the storage price alone, but the total
cost of storing your data in Glacier, which was also overlooked in the
original article.

I'm sure AWS has understood this through their S3 storage lifecycle and has
thus developed a pricing model for Glacier that arouses interest in the
product like no other.

~~~
klodolph
$10 for upload? Pricing is $0.05 per 1,000 UPLOAD requests plus $0.00 per
byte. That means your backup system made 200,000 UPLOAD requests. I don't
think I have that many files I want to back up, and I would hope that any
system using Glacier as a back-end would bundle smaller files into larger
archives.

(Well, I guess it means that your request size averaged 1 MB. If I'm already
using Glacier, I would be perfectly happy with archive granularity coarser
than 1 MB.)
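
For what it's worth, the arithmetic behind that guess (the request rate is my
recollection of Glacier's pricing at the time, so treat it as an assumption):

    upload_bill = 10.00               # dollars, as reported above
    price_per_1000_requests = 0.05    # assumed Glacier upload-request rate
    data_uploaded_gb = 200

    requests = upload_bill / price_per_1000_requests * 1000  # 200,000 requests
    avg_request_mb = data_uploaded_gb * 1024 / requests      # ~1 MB per request

    print(int(requests), round(avg_request_mb, 2))  # 200000 1.02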

------
timfrietas

        Glacier is significantly cheaper than S3
    

Yes, as long as you put something in and almost never take it out.

    
    
        They charge for deletions in the first 3 months
    

What if this is just a disincentive to pulling content out and treating
Glacier like S3?

    
    
        Power is not the driving cost for Internet scale infrastructure
    

It is not the only cost, but it is still one of the largest factors, no?

    
    
        Sony and Panasonic continue to invest in a product that has no visible commercial uptake
    

That means nothing in and of itself. Whoever wins such a market needs to be
one of the first there; innovator's dilemma, etc.

    
    
        Facebook believes optical is a reasonable solution to their archive needs
    

Do they? I saw one mention in the author's previous post of James Hamilton
commenting on a Facebook cold-storage system using Blu-ray, but it is unclear
to me whether it is in production.

Assuming it is true though, it is likely an apples-to-oranges comparison.
Glacier provides archival restoration for presumably largely enterprise-level
customers. Facebook backs up data from users, and I'd presume this is from
deactivated accounts, etc., and unlikely to need urgent restoration.

~~~
thatthatis
Is there another example of Amazon doing disincentive pricing?

Every example of their pricing I've seen has been cost plus, cost plus, and/or
cost plus.

Thus it seems at least an order of magnitude more likely that the $0.03
reflects some cost.

~~~
timfrietas
Fulfillment by Amazon has disincentives for heavy/large items, for pulling
inventory back so it can be mailed back to you, and for not prepping your
items correctly when shipping them in.

------
dmourati
Aside from burying the cite (2 links to his own blog) for why the "cost of
provisioning a single watt of power is more expensive than 10 years of power
consumption," the author's canonical source eventually turns out to be a 404:

[http://static.googleusercontent.com/media/labs.google.com/en...](http://static.googleusercontent.com/media/labs.google.com/en/us/papers/power_provisioning.pdf)

The data is also > 7 years old.

EDIT: working link:

[http://static.googleusercontent.com/media/research.google.co...](http://static.googleusercontent.com/media/research.google.com/en/us/archive/power_provisioning.pdf)

------
richardw
If power provisioning and space are an issue, it seems unlikely to me that
they're sitting with many drives all hooked up for immediate power-up. It
would make more sense to separate the storage media from the power connection,
so you get the most out of each power connection. Therefore: removable media,
whether those are hard drives or optical.

There's a data retrieval period of multiple hours [1]. That doesn't sound like
they're just powering up a drive. It sounds like they're moving something
around or doing some kind of linear read (as opposed to random-access). I'd
bet on "a retrieval job fetches this stack of read-only media and connects it
to the powered device, which loads it onto hard drives for quick download".

[1] "Retrieval jobs typically complete within 3-5 hours" \-
[http://aws.amazon.com/glacier/faqs/](http://aws.amazon.com/glacier/faqs/)

------
watson
I've always thought that the deletion fee for <3-month-old data was a way to
limit "misuse of the service" - call it a deterrent if you will - more than
actually a way of recouping costs, because I guess the costs will be there no
matter when you delete the data.

------
cbsmith
I still think it is possible to make the whole thing work using a tape
library.

------
snowwrestler
If the goal is to manage capital costs, I don't see how funding the
development of an entirely new data storage hardware ecosystem is a reasonable
answer. You control capital costs with commodity hardware, not cutting edge.

Yet that is what the author thinks Amazon is doing with BDXL.

So who's funding high-capacity optical storage? Hmm, can we think of a
customer who ingests huge amounts of data, wants to keep it for a long time,
and has no fear of funding cutting-edge product development? Yes: defense
departments.

------
mark_l_watson
My friend Robin Harris wrote the original article.

A little off topic, but it seems really strange to me that Amazon is not
transparent about the technology. Because of the high charge for fast reads, I
tend to believe that the underlying storage is some form of media that gets
mounted, perhaps like Facebook's Blu-ray archival system.

------
atdt
Is data really so heterogeneous? How many files in your home directory are
generic, with bit-exact duplicates existing in the home directories of many
other users? And the remaining files, which are uniquely yours -- what percent
of each of them consists of generic data, like file headers?

~~~
lmm
I have 100gb of gaming screencaptures. That data exists nowhere else (some
people may have copies of the final cuts on youtube, but that's tiny in
comparison); if you could store it as input data + game code you could
compress it by a lot, but I highly doubt Amazon's that far ahead of mainline
video encoding technology.

Other than that, I think the big culprit will be photos; everyone's family
photos are different (the JPEG header is a tiny proportion of a modern 5MB
photo), and that's one of the most popular things for people to back up on
these kinds of services.

Plenty of data is generic in the way you say, but plenty of it isn't. So I
don't think there's any free lunch here.

------
olegkikin
Don't forget that S3 costs an additional $0.12/GB for transfer out.

So either cheap VPSs or dedicated servers are still much cheaper if you don't
want to deal with Glacier.

------
toolslive
Another question: what is the difference in power consumption between an HDD
that's powered up but idle and one that's being used to read from or write to?

As an aside, erasure codes allow you to reduce power consumption, since the
redundant fragments are only necessary for safety, not for regular retrieval.
You don't need Glacier to benefit from that. (But Glacier might be an
optimization of that strategy.)
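
A quick sketch of the power argument with a k-of-n code (the wattage and code
parameters are assumptions for illustration):

    # With a k-of-n erasure code, only the k data fragments need spinning drives
    # for normal reads; the n-k parity fragments can stay spun down until a failure.
    k, n = 10, 14               # e.g. 10 data + 4 parity fragments per stripe
    idle_watts_per_drive = 6    # assumed spinning-idle draw of one HDD

    all_spinning = n * idle_watts_per_drive   # 84 W if every fragment stays up
    data_only    = k * idle_watts_per_drive   # 60 W if parity drives sleep

    print(all_spinning, data_only)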

~~~
wmf
Most of the power is used to keep the drive spinning.

------
olegkikin
Don't forget that S3 costs an additional $0.12/GB for transfer out.

Dedicated servers are still much cheaper than even Glacier.

------
toolslive
Question: how does Amazon's Glacier relate to this?
[http://dl.acm.org/citation.cfm?id=1251214](http://dl.acm.org/citation.cfm?id=1251214)

~~~
andrewguenther
It doesn't. Just coincidence.

------
nicpottier
Erm, isn't it getting a bit meta to have a post responding to the HN comments
on a previous post of yours? Surely that is exactly what the HN comments are
for?

~~~
NotHereNotThere
Robin's blog post is not only a response to HN comments (as you may have
noticed, he also addresses an observation made by a commenter on his own
blog).

And he not only responds to the comments, but adds plenty of other information
and reasons why he suspects optical media is used by Glacier.

So no, I don't see anything "meta" about this post.

