
AWS Storage Gateway - bpuvanathasan
http://aws.amazon.com/storagegateway/
======
ridruejo
More details at Werner's post

[http://www.allthingsdistributed.com/2012/01/The-AWS-
Storage-...](http://www.allthingsdistributed.com/2012/01/The-AWS-Storage-
Gateway.html)

~~~
jeffbarr
And even more details in mine:

[http://aws.typepad.com/aws/2012/01/the-aws-storage-
gateway-i...](http://aws.typepad.com/aws/2012/01/the-aws-storage-gateway-
integrate-your-existing-on-premises-applications-with-aws-cloud-storage.html)

------
latchkey
I don't have a personal need for this right now, but the fact that it stores
the data as EBS volumes is pretty cool. I could imagine having local servers
automatically mirrored so that they could be failed over to instances on ec2.
Very powerful stuff indeed.

~~~
cperciva
Note that the mirroring is asynchronous -- if your local server fails you can
replace it with an EC2 instance, but you lose anything since the last
snapshot.

~~~
kondro
Not since the last snapshot. It syncs as close to real-time as possible
without the latency and network speeds affecting the performance locally.

This solution isn't designed to replace fault-tolerance on local hardware. It
is for close to realtime offsite replication and backup.

Data in S3 is stored in at least three geographically separate locations and
snapshots are very fast and very efficient on storage space.

The final major advantage you get through a solution such as this is that if
you do have your primary site go down (floods, tornados, etc), you can bring
up all your existing images via EC2 without having to have a bunch of
redundant hardware sitting around waiting for disaster to strike.

And what do you pay for this? $125/month plus a per-GB storage cost that is
CHEAPER than enterprise storage generally is.

~~~
sciurus
Let's say I need 50TB of usable space. I can purchase an Equallogic PS6100E (a
mid-level "enterprise" storage device) with twenty-four 3TB drives and 5 years
of support for $85,000. Rack space and power aren't free, so let's say the
total cost of equipment and facilities over 5 years is $100,000.

In contrast, storing 50TB for 5 years on S3's reduced redundancy storage would
cost $250,000. If I ever need to transfer any of that data back to my data
center, there'll be a hefty bill for that as well.
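A rough sketch of the arithmetic behind that comparison, using only the two totals quoted above. The per-GB rates are back-derived from those totals rather than taken from any price list, and 1 TB = 1024 GB is an assumption.

```python
# Rough 5-year cost comparison using only the figures quoted in this comment.
# Assumption: 1 TB = 1024 GB. The per-GB rates are derived, not published prices.

GB_PER_TB = 1024  # assumed conversion


def implied_rate_per_gb_month(total_cost, capacity_tb, months):
    """Per-GB monthly rate implied by a total bill over a period."""
    return total_cost / (capacity_tb * GB_PER_TB * months)


capacity_tb = 50
months = 5 * 12
on_prem_total = 100_000  # equipment + facilities estimate from the comment
s3_total = 250_000       # reduced redundancy estimate from the comment

s3_rate = implied_rate_per_gb_month(s3_total, capacity_tb, months)
on_prem_rate = implied_rate_per_gb_month(on_prem_total, capacity_tb, months)
ratio = s3_total / on_prem_total

print(f"S3 implied rate:      ${s3_rate:.3f}/GB-month")
print(f"On-prem implied rate: ${on_prem_rate:.3f}/GB-month")
print(f"S3 / on-prem ratio:   {ratio:.1f}x")
```

Neither side includes transfer-out charges, staff, or a second site, which are the points the replies argue about.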

~~~
evandena
With Amazon you could do disaster recovery (DR) onto their other platforms.
With your Equallogic example you would need to mirror that data to another
device with recovery targets in another location.

~~~
nknight
At the scales he's talking about, 10 Gbps uplinks are easily available. That
said, the killer isn't the equipment costs, it's the people.

------
donky_cong
So you are basically paying $125+ to have S3 cached locally and get an iSCSI
interface?

~~~
nl
Which is worth it.

The price of enterprise backup solutions is crazy.

------
yalogin
This should make EMC sit up and take notice. Amazon is doing great. I would
have expected Dropbox to do this after iCloud was released. They should
at least mimic this now.

------
bprater
Can someone give a brief explanation of what this service is good for?

~~~
chubs
It sounds to me like enterprisey Dropbox: it backs up your files from your
local file server to the AWS cloud, and if your server dies, bung in a new one
and all your files will reappear. Great idea, although maybe something you
could already do with Dropbox?

~~~
soult
As far as I understood the product page and Werner Vogels' blog post[1], it
does not run at the file level, but at the filesystem level. So you won't be able to
access single files from within S3, but rather have whole disk images stored
in Amazon S3, ready to be restored back to your local datacenter or to be
mounted on EC2 servers.

1: [http://www.allthingsdistributed.com/2012/01/The-AWS-
Storage-...](http://www.allthingsdistributed.com/2012/01/The-AWS-Storage-
Gateway.html)

~~~
kisielk
Actually even lower level than that. iSCSI just makes a block device available
over a TCP/IP network. You can use that block device however you like, write
random data to it, partition it into multiple volumes, use it as part of a
volume group or disk pool, etc. The individual block device doesn't need to
have a filesystem on it.
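To make the "just a block device" point concrete, here is a minimal sketch in which a temporary file stands in for the iSCSI volume. The 512-byte block size and the helper names `write_block` and `read_block` are assumptions for illustration, not anything from the gateway itself.

```python
# A temporary file stands in for the block device (illustration only; on a
# real initiator this would be a device node exposed by the operating system).
import os
import tempfile

BLOCK_SIZE = 512  # bytes per block (assumed)


def write_block(dev, index, data):
    """Write exactly one block at the given block index."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block")
    dev.seek(index * BLOCK_SIZE)
    dev.write(data)


def read_block(dev, index):
    """Read one block back by its index."""
    dev.seek(index * BLOCK_SIZE)
    return dev.read(BLOCK_SIZE)


with tempfile.TemporaryFile() as dev:
    dev.write(bytes(BLOCK_SIZE * 8))   # an 8-block "device", zero-filled
    payload = os.urandom(BLOCK_SIZE)   # arbitrary bytes, no filesystem involved
    write_block(dev, 5, payload)
    print(read_block(dev, 5) == payload)
```

Nothing here knows about files or directories; a filesystem, LVM, or a database would be layered on top of reads and writes like these.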

------
mopoke
From my brief reading of Werner's blog post, it seems that this is very
similar to what Nasuni have been selling.

------
bwarp
This is really cool. Having just done a large and complicated S3 integration
into a legacy soup of 20-year-old, filesystem-based document management kludge,
this would have made life much easier (and considerably cheaper!).

Not only that, it solves a lot of problems, such as the danger of storing
backup snapshots on-site, and it helps with archival, easy deployment, and
access to S3's CDN functionality.

Sold as far as I am concerned!

(Yes I know it's expensive, but it's cheaper than buying something in-house
and employing another hairless monkey to manage it).

------
kondro
Sounds like a great idea. I can't wait until they release Gateway-Cached
Volumes myself as it better suits my use-case.

~~~
tiernano
The volumes are cached... from reading the post, you write synchronously to
the iSCSI device, and the data is then sent asynchronously to S3.
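A toy sketch of that write path, assuming nothing about how the gateway is actually built: `WriteBackVolume` is a hypothetical class, and two dicts stand in for the local disk and S3. The write returns as soon as the local copy is updated, and a background thread drains a queue to the remote side.

```python
# Toy model of "write locally now, upload later". Hypothetical names throughout;
# this is not the gateway's implementation.
import queue
import threading


class WriteBackVolume:
    def __init__(self):
        self.local = {}    # stand-in for the local disk
        self.remote = {}   # stand-in for S3
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._upload_loop, daemon=True)
        worker.start()

    def write(self, block, data):
        """Synchronous from the caller's view: returns after the local write."""
        self.local[block] = data
        self._pending.put((block, data))

    def _upload_loop(self):
        """Background thread: copies queued writes to the remote store."""
        while True:
            block, data = self._pending.get()
            self.remote[block] = data
            self._pending.task_done()

    def flush(self):
        """Block until every queued write has reached the remote store."""
        self._pending.join()


vol = WriteBackVolume()
vol.write(0, b"hello")
vol.flush()
print(vol.remote == vol.local)
```

Anything still sitting in the queue when the local machine dies is what would be lost, which is the window discussed further up the thread.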

~~~
moe
I'm not sure if that's what kondro meant but what I would like to see is a
virtually unlimited size ("elastic") volume where the local disk acts only as
the cache.

This would make a wide range of big-storage use-cases ridiculously trivial -
those where only ~10% of the data-set is frequently accessed.

I.e. one could lazily scale the expensive local storage with throughput
demand, while the S3 backing store takes care of the long-tail (which can
easily be many terabytes long when you're dealing with media files).
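As a sketch of what I mean, here is a small LRU read cache in front of a larger backing store. `CachedVolume` is a hypothetical name, a dict stands in for S3, and LRU is just one eviction policy I picked for illustration.

```python
# Small local cache in front of a large backing store (illustration only).
from collections import OrderedDict


class CachedVolume:
    def __init__(self, backing, capacity):
        self.backing = backing      # dict standing in for the S3 backing store
        self.capacity = capacity    # how many blocks fit on local disk
        self.cache = OrderedDict()  # least recently used entries come first
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.backing[key]        # slow path: fetch from backing store
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used
        return value


backing = {i: f"block-{i}" for i in range(100)}
vol = CachedVolume(backing, capacity=10)
for _ in range(5):
    for key in range(10):  # the "hot" 10% of the data set
        vol.read(key)
print(vol.hits, vol.misses)
```

With a hot set that fits in the cache, only the first pass goes to the backing store; the long tail stays remote until someone asks for it.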

~~~
sciurus
Let's say you expect to grow to 20TB of data. Storing that for one month on S3
costs $2,560 (standard) or $1,700 (reduced redundancy). In contrast, a Dell
R515 with twelve 2TB drives costs $7,000. Over a year, that's one-third to
one-quarter the price of using S3.
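Put another way, here is a quick break-even sketch using the same figures. It deliberately ignores power, rack space, staff, and redundancy on the server side, so treat it as a lower bound on the server's real cost.

```python
# Months until cumulative S3 spend reaches the one-off server price.
# Figures are the ones quoted above; running costs of the server are ignored.
import math


def months_to_break_even(server_cost, monthly_s3_cost):
    """First whole month in which cumulative S3 spend reaches the server price."""
    return math.ceil(server_cost / monthly_s3_cost)


server_cost = 7_000
monthly_s3 = {"standard": 2_560, "reduced redundancy": 1_700}

for tier, monthly in monthly_s3.items():
    yearly = monthly * 12
    months = months_to_break_even(server_cost, monthly)
    print(f"{tier}: ${yearly:,}/year on S3, break-even after {months} months")
```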

Implementing a tiered storage system yourself is pretty complex. Using this S3
gateway might be simpler, but it's not trivial (e.g. you'll need VMware ESXi
just to get started).

~~~
moe
Well, I only glanced at their current offering, missed the VMware part. My
request was mostly wishful thinking.

I.e. instead of VMware it'd be more useful for us to hook in with a FUSE-layer
or a patched variant of a filesystem such as GlusterFS.

You're of course correct about the pricing. Their current prices cover some
middle-ground but would need to be discounted to make it feasible for larger
deployments. However, at the low end (your 20TB figure) the price seems already
justifiable when you factor in staff and infrastructure costs (rack+power
alone make up for half of the difference).

