
Lower Cost S3 Storage Option and Glacier Price Reduction - jeffbarr
https://aws.amazon.com/blogs/aws/aws-storage-update-new-lower-cost-s3-storage-option-glacier-price-reduction/
======
cperciva
This looks like it would be perfect for Tarsnap -- the data Tarsnap stores is
almost always kept for 30+ days, and it's almost always in objects of 128kB or
more. The $0.01/GB for reads would be annoying (one of the reasons Tarsnap is
hosted in EC2 is that it has free data transfer to and from S3; data is
regularly retrieved and stored back after filtering out blocks marked for
deletion), but it would be cheaper.

One thing concerns me however: _Standard – IA has an availability SLA of 99%._

If this is just a reduced SLA but the actual availability is likely to be
similar, that's fine. But if the actual availability is not expected to hit
99.9% -- say, if the backend implementation is "one copy in online storage,
plus a backup in Glacier which gets retrieved if the online copy dies" that
would be completely inadequate.

Hopefully we'll get more details over time.

~~~
jeffbarr
This is online storage.

If a GET fails, just retry as usual (most higher-level libraries do this
automatically, sometimes with a backoff mechanism).
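A minimal sketch of that retry-with-backoff pattern, in generic Python (the function and parameter names here are illustrative, not part of any AWS SDK):

```python
import random
import time

def get_with_retries(fetch, max_attempts=5, base_delay=0.1):
    """Retry a flaky operation with exponential backoff and jitter.

    `fetch` is any zero-argument callable that raises on failure.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts, propagate the failure
            # Sleep base_delay * 2^attempt, plus jitter so that many
            # clients retrying at once don't all hit S3 simultaneously.
            time.sleep(base_delay * (2 ** attempt) +
                       random.uniform(0, base_delay))
```

With uncorrelated 1% failures, five attempts like this bring the effective failure rate down to roughly 10^-10.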

~~~
cperciva
_If a GET fails, just retry as usual_

Thanks! This is a very important detail which isn't documented anywhere:
Retries are likely to succeed. A service where 1% of requests fail but
failures are completely uncorrelated is far more usable than a service where
0.01% of requests fail but they keep on failing no matter how many times you
retry them.
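The arithmetic behind that point, assuming failures are independent with probability p per attempt:

```python
p_fail = 0.01      # 1% of requests fail, independently of each other
attempts = 3

# If failures are uncorrelated, the chance that all n attempts fail is p**n.
p_all_fail = p_fail ** attempts   # ~1e-6 after just 3 tries

# Contrast with 0.01% of requests failing persistently: retrying a "stuck"
# request never helps, so the effective failure rate stays at 1e-4.
p_stuck = 0.0001
```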

~~~
zeckalpha
Additionally, assuming your block data is hash-addressed, i.e. you're not
changing the S3 objects once they are in S3, adding CloudFront in front of
your buckets may go a long way toward increasing that percentage.

However, its SLA is a little more involved (for better or worse):
[http://aws.amazon.com/cloudfront/sla/](http://aws.amazon.com/cloudfront/sla/)

I'm not accessing S3 from EC2, though, so another benefit for me was that it
brought my S3 network costs way down.

------
gaul
Updated cross-provider comparison:

[http://gaul.org/object-store-comparison/](http://gaul.org/object-store-comparison/)

~~~
aembleton
Thanks for that. After looking at that comparison, I came across
[https://www.runabove.com/index.xml](https://www.runabove.com/index.xml) who
provide a good deal.

Storage at 1c/GB/month, outgoing traffic at 1c/GB, and no charge for incoming
traffic. Data is replicated 3 times.

~~~
hackerboos
RunAbove is managed by OVH. Very cheap, but not the most reliable, at least on
their dedicated servers.

------
timdierks
How does Amazon claim 11 9's of durability when the chance of an asteroid
extinction event is roughly 1000x as high?

This isn't a joke: I can't find any documentation on the risk model that lets
them estimate 11 9's and what class of risks it includes.

------
Trisell
With the cost per GB for Glacier storage now, any small-to-medium company
would be fiscally irresponsible not to use it as a primary disaster recovery
option. $84 per year per TB is ridiculous for geographically diverse storage.
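That figure checks out, assuming the reduced Glacier price of $0.007/GB/month implied by the $84 number:

```python
glacier_rate = 0.007                      # $/GB/month (assumed reduced price)
per_tb_year = glacier_rate * 1000 * 12    # ~$84 per TB per year
```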

~~~
Wilya
Everything related to Glacier is ridiculously complicated.

The pricing is tricky (the per-GB price is cheap, but the retrieval can get
horribly expensive). There's the fixed 4-hour delay for all actions (including
listing stored files), which makes any interaction a pain. And there aren't
really any good clients or high-level libraries that abstract away this
complexity.

For disaster recovery, I would certainly go for something simpler and easier
to use. When everything is on fire, the last thing I need is to be dealing
with a tricky API to restore the company's files.

~~~
georgeott
There are Glacier clients that allow you to manage the "Restore Speed" so you
don't get hit with ridiculous price hikes.

Glacier is PERFECT if you just need to restore a photo or document, and not
the entire repo.

~~~
dexterdog
But you have to store things in larger archives or the per-object overhead
hurts your pricing. When Glacier first came out I really wanted to use it, but
it had so much complexity over just treating it like an object store that I
didn't use it. Then add the fact that S3 Standard kept coming down in price
and Glacier just stood still (thus the name).

~~~
georgeott
I'm currently storing my entire photo collection of about 50GB and ~20,000
photos. It cost me about $3 to upload the entire library. I pay around 50
cents a month for storage. YMMV, but I'm very happy with it.

------
JoshTriplett
These prices make the new Standard-IA storage class significantly cheaper than
Reduced Redundancy Storage, even if you end up reading the data back.

However, I find it interesting that in addition to the cost per GB to retrieve
data, this new storage class also has a significantly higher per-request cost.
Actually, it looks cheaper to upload an object as a different storage class
and then transition it to Standard-IA, since a PUT to IA costs $0.01 per
1,000, but a PUT to another class costs $0.005 per 1,000, and the cost to
transition another class to IA is $0.01 per 10,000. It's a small difference
($0.04 per 10k objects), but if you store an obscene amount of data on S3,
that seems like enough of a difference to matter.
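A quick sanity check of those per-request numbers, using the prices as quoted in the comment:

```python
direct_ia_put = 0.01 / 1000        # $/object: PUT straight to Standard-IA
standard_put = 0.005 / 1000        # $/object: PUT to another storage class
transition_to_ia = 0.01 / 10000    # $/object: lifecycle transition to IA

per_10k_direct = 10_000 * direct_ia_put                              # $0.10
per_10k_via_transition = 10_000 * (standard_put + transition_to_ia)  # $0.06
savings_per_10k = per_10k_direct - per_10k_via_transition            # ~$0.04
```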

~~~
cperciva
I'm going to go out on a limb and guess that Standard-IA is basically Standard
except with slow disks instead of fast. Slow disks have lower $/GB, but higher
$/IOPS.

~~~
JoshTriplett
Seems unlikely, given some of the other properties it has; more likely,
Standard-IA has fewer copies of the data, and then a backup in Glacier, which
explains why it has high durability but low availability.

------
venning
This seems like a smart response to Google Cloud Nearline Storage. Slightly
more expensive, but with the S3->IA->Glacier lifecycle mapping, the ultimate
costs may be lower with AWS; certainly the flexibility will be there.

Amazon, staring at its mighty armory, goes on the hunt for a tiny chink to
repair.

~~~
boulos
Just a clarification: the difference in cost-per-byte stored is 25% (1.25
cents vs 1). The rates are all so small that it's hard to see, but when you
say $12.50 vs $10 per TiB/month, I think that makes it more "visible". As you
get into the PiB range ($10k/month on Nearline) and then consider storage for
a year, you've got a difference of $30k/year/PiB. For individuals doing a
small backup, even 25% isn't huge in absolute numbers, but in the petabyte
range it matters a lot.
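Spelled out with the quoted rates (using decimal GB/PB for simplicity; a binary PiB is ~13% larger, so the figures above are approximate):

```python
ia_rate = 0.0125       # $/GB/month, S3 Standard-IA
nearline_rate = 0.01   # $/GB/month, Google Cloud Nearline
gb_per_pb = 1_000_000  # decimal petabyte

monthly_diff = (ia_rate - nearline_rate) * gb_per_pb  # ~$2,500/month per PB
yearly_diff = 12 * monthly_diff                       # ~$30,000/year per PB
```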

Disclaimer: I work on Compute Engine (and not GCS or Nearline).

------
omribahumi
Just released a tool yesterday to give insight into your S3 bucket size:
[https://github.com/EverythingMe/ncdu-s3](https://github.com/EverythingMe/ncdu-s3)

~~~
gallamine
Can you say a few words about this? What's the advantage over doing a
recursive ls with something like s3cmd?

~~~
omribahumi
Just like the difference between ncdu and regular du. It lets you drill down
into the directory hierarchy and spot the exact dir/file that consumes a lot
of space.

------
devit
Aren't these services still wildly overpriced?

At $0.0125/GB/month, 6TB costs $75 per month.

But a 6TB hard drive costs less than $300, which means that, assuming the data
is stored on 3 hard drives for redundancy, they break even in less than a
year.

However, hard drives seem to last at least 3-5 years on average, so this
service seems to be priced at 3-5 times what it costs Amazon.
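The break-even arithmetic, using the figures from this comment:

```python
s3_monthly = 0.0125 * 6000   # ~$75/month for 6 TB at $0.0125/GB/month
drive_cost = 300             # upper bound on one 6 TB drive, per the comment
replicas = 3                 # assume 3 copies for redundancy

# Months of S3 billing needed to cover buying the drives outright.
break_even_months = replicas * drive_cost / s3_monthly   # ~12 months
```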

And there is even a $0.01/GB charge for retrieval on top.

There are other costs, but they should be relatively small at scale.

Am I missing something? If not, why doesn't anybody compete with Amazon and
provide more reasonable pricing?

~~~
ju-st
Glacier isn't even using hard drives, so their per-TB costs should be way
lower (raw storage and infrastructure).

~~~
kalleboo
> Glacier isn't even using hard drives

That's still speculation, right? Some have theorized they use offline hard
disks.

~~~
acveilleux
I would expect offline HDDs as well, possibly combined with tapes for
secondary copies to meet durability requirements.

------
amatix
Would be nice to apply this to EBS snapshot storage too; that can add up
mighty quick.

------
hendry
If I could have a git API to glacier I would be happy.

I have tried various Glacier clients and they all seem to suck, so I have
trouble sanely tracking exactly what I have stored there. :( Unusable.

~~~
phamilton
Are you using Glacier-specific clients? I just use an S3 client and configure
the bucket to put everything in Glacier.

For example, I have a cron job calling `s3cmd sync` for my photos on my iMac
once a day.

~~~
hendry
I'm aware of that easy way of getting stuff into Glacier, but once it
"expires" to Glacier, how do you check what's actually there?

It quickly becomes a nightmare for me. Hence I need git!
[http://natalian.org/2015/04/13/How_I_organise_my_media/](http://natalian.org/2015/04/13/How_I_organise_my_media/)

------
ck2
Spent the past 10 minutes looking for a simple grid chart of S3 pricing and
gave up.

And "Infrequent Access" is not in the price calculator.

wtf, why is it so complicated to compare?

~~~
PhantomGremlin
Did you see this?

[https://aws.amazon.com/s3/pricing/](https://aws.amazon.com/s3/pricing/)

It has quite a lot of info.

Edit: you must view that page with JS enabled, otherwise no prices are shown.
Perhaps that was your problem?

~~~
ck2
Didn't show in Firefox; had to open it in Chrome to see the table. Thanks!

Firefox logs the dreaded

 _"Error: InvalidStateError: A mutation operation was attempted on a database
that did not allow mutations."_

which is a known coding problem.

------
finalight
So, do S3 objects become IA automatically if they're accessed infrequently, or
do we have to implement that on our side?

~~~
kalleboo
No, but you can configure your bucket to behave that way (and then, after even
more time, move objects to Glacier if you wish).
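For reference, a sketch of the lifecycle rule that sets this up, shaped for boto3's `put_bucket_lifecycle_configuration`; the bucket name, rule ID, and day thresholds here are hypothetical:

```python
# Hypothetical rule: tier objects to Standard-IA after 30 days and to
# Glacier after 90. The dict shape follows the S3 lifecycle API
# (Transitions entries with Days and StorageClass).
lifecycle_config = {
    "Rules": [
        {
            "ID": "tier-down-old-objects",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to every object in the bucket
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}

# Applying it with boto3 would look roughly like:
#   import boto3
#   s3 = boto3.client("s3")
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=lifecycle_config)
```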

