

Amazon S3 Lifecycle Management for Versioned Objects - jeffbarr
http://aws.amazon.com/blogs/aws/amazon-s3-lifecycle-management-update/

======
jxf
The versioning is cool in and of itself, but auto-archiving to Glacier after
an expiration date is a fantastic feature. That's going to be a huge value-add
for a lot of teams I work with.

What I'd really love to see on S3, though, is the ability to make partial or
append-only updates without needing to re-upload the entire bitstream. As of
the last time I checked, this wasn't directly possible at the API level (it
seems most people store deltas and "reconstruct" the file programmatically
when they need this).

This would enable an entire class of use cases that's otherwise hard to
support on S3 but would, I think, be very useful (streaming logs, sensor
data, etc.).
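
For concreteness, the workaround looks something like this (a rough sketch
with boto3; the bucket and key names are made up, and it ignores pagination):

    import boto3

    s3 = boto3.client("s3")
    BUCKET = "my-log-bucket"       # hypothetical
    PREFIX = "server-logs/parts/"  # each "append" becomes its own object

    def append(part_num: int, data: bytes):
        # Emulate an append by writing the delta as a new object.
        s3.put_object(Bucket=BUCKET, Key=f"{PREFIX}{part_num:010d}", Body=data)

    def reconstruct() -> bytes:
        # Rebuild the full file by concatenating the deltas in key order.
        resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX)
        keys = sorted(obj["Key"] for obj in resp.get("Contents", []))
        return b"".join(
            s3.get_object(Bucket=BUCKET, Key=key)["Body"].read() for key in keys
        )

Every read then pays for a LIST plus one GET per delta, which is exactly why
a native append would be nicer.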

~~~
leef
Sounds like something to use Kinesis[1] for. Read the Kinesis stream, process
the updates, and write/rewrite the S3 file.

1 - [https://aws.amazon.com/kinesis/](https://aws.amazon.com/kinesis/)
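
Something like this (a rough sketch with boto3; the stream and bucket names
are made up, and a real consumer would use the KCL, checkpoint its position,
and handle resharding):

    import time
    import boto3

    kinesis = boto3.client("kinesis")
    s3 = boto3.client("s3")

    # Hypothetical single-shard stream.
    shard_it = kinesis.get_shard_iterator(
        StreamName="server-logs",
        ShardId="shardId-000000000000",
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]

    buffer = []
    while shard_it:
        resp = kinesis.get_records(ShardIterator=shard_it, Limit=1000)
        buffer.extend(rec["Data"] for rec in resp["Records"])
        if resp["Records"]:
            # Rewrite the whole object with everything seen so far.
            s3.put_object(Bucket="my-log-bucket", Key="logs/current",
                          Body=b"".join(buffer))
        shard_it = resp.get("NextShardIterator")
        time.sleep(1)  # stay under the per-shard read limits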

~~~
jxf
Repeatedly updating S3 files from a stream that produces a lot of updates
(e.g. 1-second-resolution server logs) would get very expensive, very quickly!

Something like that might work if you only periodically rewrote the S3 file
(e.g. once an hour). But it's still much less efficient than just allowing
delta updates.

~~~
toomuchtodo
You'd want to write your logs to S3, then process and cache the results in
memory, either in something like Redis or inside your analytics app on an EC2
instance.
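
Roughly like this (a sketch; the bucket, key, and Redis instance are
illustrative):

    import boto3
    import redis

    s3 = boto3.client("s3")
    cache = redis.Redis(host="localhost", port=6379)

    def recent_logs(key: str) -> bytes:
        # Serve hot reads from Redis; fall back to S3 and backfill the cache.
        cached = cache.get(key)
        if cached is not None:
            return cached
        body = s3.get_object(Bucket="my-log-bucket", Key=key)["Body"].read()
        cache.set(key, body, ex=3600)  # expire after an hour
        return body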

------
TheMagicHorsey
Amazon keeps blowing me away with the speed of new feature introductions and
the breadth of their cloud platform.

I really hope that Rackspace and other players figure out how to make an open
standards version of Amazon's platform, or else we are all basically going to
be locked into Amazon for the foreseeable future.

Once you get hooked on all these awesome features, it's hard to migrate to
another provider.

~~~
thinkmassive
Eucalyptus Systems makes an open source, AWS-compatible implementation of a
subset of AWS services:

[https://www.eucalyptus.com/](https://www.eucalyptus.com/)

------
nl
11 nines durability!

I don't think I've ever seen that before. What does that work out to in terms
of data loss?

~~~
jzwinck
Amazon explains here: [http://aws.typepad.com/aws/2010/05/new-amazon-s3-reduced-redundancy-storage-rrs.html](http://aws.typepad.com/aws/2010/05/new-amazon-s3-reduced-redundancy-storage-rrs.html)

"If you store 10,000 objects with us, on average we may lose one of them every
10 million years or so."
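
The arithmetic behind that sentence, if you want to check it (11 nines of
durability implies an annual loss probability of 10^-11 per object):

    p_loss = 1e-11           # 99.999999999% durability
    objects = 10_000
    years = 10_000_000
    print(p_loss * objects * years)  # 1.0 -- one lost object, on average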

~~~
gjm11
Which is of course rubbish; events like the sudden fall of civilization,
large-scale nuclear war, all governments on earth suddenly declaring very
large computing or storage facilities illegal out of paranoia about runaway
AI, etc., are surely much more probable than 10^-11 per year and have the
potential to wipe out S3 entirely.

It's arguable that all of these are sufficiently major events that if any of
them happens you won't _care_ that you just lost all your data stored in S3
because you'll be too busy fighting off wolves, dying of radiation sickness,
or whatever.

More to the point, though, I take it the real purpose of these many-9s
guarantees is that if you store a very, very large number of things on S3 the
danger of losing one of them is still very small. So, for instance, if you
store 10^8 objects in S3 then the probability of losing any of them in a given
year is allegedly about 0.1%, which is actually fairly credible. (But when
that happens, a substantial fraction of the time it's because of a really
major disaster and you probably lose a lot more than one object.)
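
To check that 0.1% figure, assuming losses are independent:

    p, n = 1e-11, 10**8
    print(1 - (1 - p) ** n)  # ~0.001, i.e. about 0.1% per year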

We could do with more precise ways of describing durability guarantees, to
distinguish between a small probability of losing lots of data and a large
probability of losing a tiny amount of data.

~~~
ceejayoz
That argument applies to any uptime guarantee from any vendor. When Rackspace
says "99.9% uptime!" no one complains that they should have to say "unless
nuclear war happens".

~~~
gjm11
If the figure is 99.9% then I don't think it needs any such disclaimer; large-
scale nuclear war is pretty improbable these days. It's only once you start
getting to large numbers of 9s that these spectacular low-probability events
begin to matter.

~~~
dmd
> pretty improbable these days

[http://upload.wikimedia.org/wikipedia/commons/4/4b/Doomsday_Clock_graph.svg](http://upload.wikimedia.org/wikipedia/commons/4/4b/Doomsday_Clock_graph.svg)

------
kolev
Finally! This will save us a lot of money; we wrote tools to do this
ourselves, but the cost keeps increasing, especially when you have to scan the
whole bucket.
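
For anyone retiring their own tooling, the new feature boils down to a rule
like this (a sketch with boto3; the bucket name and day counts are made up):

    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-versioned-bucket",  # hypothetical
        LifecycleConfiguration={
            "Rules": [{
                "ID": "archive-old-versions",
                "Filter": {"Prefix": ""},
                "Status": "Enabled",
                # Move noncurrent versions to Glacier after 30 days...
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": 30, "StorageClass": "GLACIER"}
                ],
                # ...and delete them entirely after a year.
                "NoncurrentVersionExpiration": {"NoncurrentDays": 365},
            }]
        },
    )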

------
randall
Whoa, jeffbarr hangs out on HN? Go figure.

~~~
kolev
Jeff Barr and Werner Vogels are two huge reasons I'm such a fan of AWS. Jeff
is very responsive on Twitter as well.

~~~
jeffbarr
Well, thank you, that's great to hear. I really enjoy what I do and I hope
that it shows.

And to the GP post, I have been on HN for 7.23 years.

