
Glacier Costlier Than S3 for Small Files - drue
https://therub.org/2015/11/18/glacier-costlier-than-s3-for-small-files/
======
raverbashing
The real question is why are people glaciering small files?

Tar everything and send it up, then do incremental backups.

Amazon charges per operation, so reduce the number of operations.
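
A minimal sketch of the "tar everything" idea, using only Python's standard `tarfile` module. The function name `bundle_directory` and the paths are hypothetical, and the upload itself is left out; the point is only that many small files become one object, and so one billed request.

```python
import tarfile
from pathlib import Path


def bundle_directory(src_dir: str, archive_path: str) -> int:
    """Pack every regular file under src_dir into one gzip-compressed tar.

    Returns the number of files added, i.e. how many per-object
    requests were collapsed into a single upload.
    Note: archive_path should live outside src_dir, otherwise the
    archive would try to include itself.
    """
    src = Path(src_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir so the archive is relocatable.
                tar.add(path, arcname=path.relative_to(src).as_posix())
                count += 1
    return count
```

Incremental backups would need extra bookkeeping (for example, tracking modification times), which this sketch does not attempt.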

~~~
lucaspiller
My guess is they don't realise. AWS lets you set up lifecycle rules to archive
S3 objects to Glacier automatically. It doesn't explicitly say it is cheaper
(it says "this may reduce your costs"), but they probably just saw the
headline prices for Glacier and moved everything.

~~~
smackfu
Also, I would guess a lot of Glacier users are probably just slotting it in
place of some other system that doesn't have the same pricing model.

------
boulos
This is one of the things I love about GCS Nearline: it really is a penny per
GB per month, plus the retrieval charges (which for my lazy rsync backup is
never).

Disclaimer: I work at Google on Compute Engine, but not on GCS.

------
TazeTSchnitzel
> For consistency and precision, the following units are used throughout this
> article.

> KB: 1,024 bytes, expressed as 2^10

Why not just use KiB? Unlike KB, it's unambiguously binary.

~~~
drue
I primarily wanted to note that I used 2^X notation in the math, in case it
was confusing for folks. I used KB/MB/GB to be consistent with the language
AWS uses on their S3 pricing page
([https://aws.amazon.com/s3/pricing/](https://aws.amazon.com/s3/pricing/)).

------
nine_k
Glacier has a very clear use case, to my mind.

It is useful for keeping archives of massive data you're _unlikely to ever
need,_ but are legally obliged to keep around, or just want to have available
for a very improbable later examination. Think of some huge transaction logs
from two years back.

For a case like this, you don't _need_ fast retrieval, and mostly you don't
need retrieval at all. You plan ahead to only ever retrieve a small percentage
of this data. The rest will be silently discarded when the retention period
has expired.

If your use case is not like that, Glacier probably makes little sense for
you.

This is totally _not_ a backup, which you likely keep in order to restore the
entire state as soon as possible.

~~~
drue
Right, but you missed the salient point: it also does not make sense for files
smaller than about 200KB.
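
A rough sketch of why small objects lose out. It assumes the per-object overhead AWS documents for lifecycle-archived objects (32 KB billed at Glacier rates plus 8 KB billed at S3 rates) and a one-time per-object transition fee. Every price below is an illustrative assumption, not a figure from the article, and the function name `months_to_break_even` is hypothetical; plug in current numbers from the pricing page.

```python
import math

KB_PER_GB = 1024 * 1024  # binary units, matching the article's 2^N convention


def months_to_break_even(
    size_kb: float,
    s3_price: float = 0.03,                  # assumed $/GB/month, S3 standard
    glacier_price: float = 0.007,            # assumed $/GB/month, Glacier
    transition_price_per_1000: float = 0.05, # assumed $ per 1000 transitions
    glacier_overhead_kb: float = 32.0,       # assumed overhead at Glacier rates
    s3_overhead_kb: float = 8.0,             # assumed overhead at S3 rates
) -> float:
    """Months until archiving one object to Glacier pays for itself.

    Returns math.inf when the object is so small that the overhead
    makes Glacier storage cost more than S3 every month.
    """
    s3_monthly = size_kb * s3_price / KB_PER_GB
    glacier_monthly = (
        (size_kb + glacier_overhead_kb) * glacier_price
        + s3_overhead_kb * s3_price
    ) / KB_PER_GB
    monthly_saving = s3_monthly - glacier_monthly
    if monthly_saving <= 0:
        return math.inf
    one_time_fee = transition_price_per_1000 / 1000
    return one_time_fee / monthly_saving
```

Under these assumed prices, very small objects never break even, and the payback period shrinks as object size grows.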

------
jakozaur
AWS is more honest with S3 Infrequent Access:
[https://aws.amazon.com/s3/storage-classes/](https://aws.amazon.com/s3/storage-classes/)

No additional metadata, but you get charged for at least 128 KB. Luckily,
lifecycle transition doesn't move files smaller than 128 KB. However, even
using lifecycle transition you pay $0.01 per 1000 transitions. It doesn't seem
like much, but for smaller items it can decrease savings a lot. E.g. if your
average file is 6 MB, then you will lose 4 days of savings on S3 IA compared
to the standard class.
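
The arithmetic behind that estimate can be sketched as follows. The $0.01 per 1000 transitions and the 128 KB minimum come from the comment above; the two storage prices in the usage example are illustrative assumptions, and the function name `transition_breakeven_days` is hypothetical.

```python
import math


def transition_breakeven_days(
    object_size_mb: float,
    standard_price_per_gb_month: float,
    ia_price_per_gb_month: float,
    transition_price_per_1000: float = 0.01,
    min_billable_kb: float = 128.0,
    days_per_month: float = 30.0,
) -> float:
    """Days of IA savings consumed by the one-time transition fee.

    Returns math.inf if moving the object to IA saves nothing.
    """
    size_gb = object_size_mb / 1024
    # IA bills each object as at least min_billable_kb.
    ia_billable_gb = max(size_gb, min_billable_kb / (1024 * 1024))
    saving_per_month = (
        size_gb * standard_price_per_gb_month
        - ia_billable_gb * ia_price_per_gb_month
    )
    if saving_per_month <= 0:
        return math.inf
    fee_per_object = transition_price_per_1000 / 1000
    return fee_per_object / (saving_per_month / days_per_month)


# Assumed prices ($/GB/month): 0.03 for standard, 0.0125 for IA.
days = transition_breakeven_days(6, 0.03, 0.0125)
```

With these assumed prices a 6 MB object takes a few days to pay back the fee, in the same range as the figure quoted above; the exact number depends on the prices used.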

------
tghw
Personally, I'm looking forward to Backblaze's cloud storage, which is being
advertised at $0.005/GB/month. Cheaper than Glacier, without all of the
transaction fees.

[https://www.backblaze.com/b2/cloud-storage.html](https://www.backblaze.com/b2/cloud-storage.html)

~~~
tw04
The whole custom API instead of S3 or Swift compatibility is really, REALLY
annoying.

~~~
toomuchtodo
Someone from Backblaze previously mentioned on HN they're working on an
S3-compatible API.

------
bound008
AWS has a wizard for bucket storage policy where they tell you this
information explicitly with a big warning sign.

------
aps-sids
Unfortunately, I learned it the hard way :(

