
Amazon Snowball - polmolea
http://aws.amazon.com/importexport/
======
lisper
But can you trust it?

When I returned to JPL after working at Google for a year I was tasked with
evaluating a Google Search Appliance. We ultimately decided not to keep it,
and so we had to erase the disks, which now contained sensitive data. The
appliance had a "self-destruct" feature that supposedly erased all the data,
but there was no way to verify it. After lengthy negotiations with Google
(some people just have a hard time grasping the idea that just because a file
has been deleted doesn't mean the data is actually gone) we eventually got
them to agree to let us open the enclosure and take out the disks. Forensic
analysis revealed that they had not in fact been erased.

Caveat emptor.

~~~
Swannie
From: [https://aws.amazon.com/blogs/aws/aws-importexport-
snowball-t...](https://aws.amazon.com/blogs/aws/aws-importexport-snowball-
transfer-1-petabyte-per-week-using-amazon-owned-storage-appliances/)

"...The data will be 256-bit encrypted on the host [running the Snowball
client?] and stored on the appliance in encrypted form. The appliance can be
hosted on a private subnet with limited network access."

So I assume the data is encrypted asymmetrically.

"...ship it back to us for ingestion. We’ll decrypt the data [using the
private key specified in the job,] and copy it to the S3 bucket(s) that you
specified when you made your request[/job]. Then we’ll sanitize the appliance
in accordance with NIST Special Publication 800-88 (Guidelines for Media
Sanitization)."

That links to
[http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP...](http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-88r1.pdf)

There are a few different types of sanitisation (clear/purge/destroy), and
Amazon doesn't specify which type. I assume they would go with "clear", and
maybe in a few select places (I'd hope storage media) "purge".

"Clear" is scary though, as for network devices, it is only "full
manufacturer’s reset to reset the router or switch back to its factory default
settings", and for HDDs it is "Overwrite media by using organizationally
approved and validated overwriting technologies/methods/tools. The Clear
procedure should consist of at least one pass of writes with a fixed data
value, such as all zeros. Multiple passes or more complex values may
optionally be used".

So what vector do you want to protect? Accidental data egress shouldn't happen
as the data is encrypted. However there are more interesting vectors, such as
getting hold of the public key and injecting your own data into another
company's buckets...

~~~
lisper
But that's the whole point: Can you trust that the box does what Amazon says
it does? Because the Google box did not do what Google said it did, but if I
hadn't been _very_ insistent about it (to the point of having a number of
people think I was being a total dick), we would never have known.

------
hughes
"Never underestimate the bandwidth of a station wagon full of tapes hurtling
down the highway."

\- Andrew S. Tanenbaum

~~~
rsync
You can drive that station wagon right up to your chosen rsync.net location:

[http://www.rsync.net/products/oob.html](http://www.rsync.net/products/oob.html)

------
devit
As usual, the pricing is not very friendly, and apparently designed to lock
your data into AWS or exploit your weak negotiating position once you buy in.

While you can send in 50TB for $200, taking the same 50TB out costs an
additional $1500 charge (50000 * 0.03).

[assuming they are not transferring the data over the Internet, the cost to
AWS should be the same or cheaper for reading]

~~~
desdiv
$1700 is amazingly cheap when you consider that it costs $4300 [0] to transfer
that same 50TB out of EC2.

[0] [https://aws.amazon.com/ec2/pricing/](https://aws.amazon.com/ec2/pricing/)

$4300 = 10000 * $0.09 + 40000 * $0.085
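
A minimal sketch of the comparison above, assuming 50TB is treated as 50,000 GB and using only the per-GB rates quoted in this thread (they are a snapshot, not current pricing). The helper `tiered_cost` and the constant names are made up for illustration.

```python
from decimal import Decimal


def tiered_cost(total_gb, tiers):
    """Price total_gb across (tier_size_gb, price_per_gb) tiers, in order.

    Anything beyond the listed tiers is left unpriced in this sketch.
    """
    remaining = total_gb
    cost = Decimal("0")
    for tier_size_gb, price_per_gb in tiers:
        if remaining <= 0:
            break
        billed_gb = min(remaining, tier_size_gb)
        cost += billed_gb * price_per_gb
        remaining -= billed_gb
    return cost


# Rates as quoted in the thread; 1 TB is treated as 1000 GB.
EC2_EGRESS_TIERS = [(10_000, Decimal("0.09")), (40_000, Decimal("0.085"))]
SNOWBALL_JOB_FEE = Decimal("200")
SNOWBALL_EXPORT_PER_GB = Decimal("0.03")

data_gb = 50_000
ec2_cost = tiered_cost(data_gb, EC2_EGRESS_TIERS)
snowball_cost = SNOWBALL_JOB_FEE + data_gb * SNOWBALL_EXPORT_PER_GB

print(f"EC2 transfer out: ${ec2_cost:,.2f}")
print(f"Snowball:         ${snowball_cost:,.2f}")
```

Decimal is used so the tier sums come out exact rather than picking up floating-point noise.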

~~~
devit
... which is also wildly overpriced.

According to multiple sources, Internet transit in the US now costs less than
$1 per month for a 1 Mbps line for large deals, which translates to $1 for
324GB, which translates to $0.003 / GB.

Amazon charges 15-30 times that.

(it appears that traffic can be much more expensive in places other than the
US and presumably Europe)
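
A rough sketch of where the 324GB figure comes from, assuming a 30-day month, decimal gigabytes, and a line that is saturated the whole time (all three are simplifications). The AWS rates in the loop are the ones quoted elsewhere in this thread, and the variable names are illustrative only.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month
LINE_RATE_BPS = 1_000_000              # 1 Mbps
MONTHLY_PRICE_USD = 1.0                # "less than $1" taken as exactly $1

# Bits per month -> bytes -> decimal gigabytes.
bytes_per_month = LINE_RATE_BPS * SECONDS_PER_MONTH // 8
gb_per_month = bytes_per_month / 1e9
transit_usd_per_gb = MONTHLY_PRICE_USD / gb_per_month

print(f"{gb_per_month:.0f} GB per month, ${transit_usd_per_gb:.4f} per GB")

# Compare against the per-GB rates mentioned in this thread.
for label, aws_usd_per_gb in [("EC2 first tier", 0.09),
                              ("EC2 next tier", 0.085),
                              ("Snowball export", 0.03)]:
    ratio = aws_usd_per_gb / transit_usd_per_gb
    print(f"{label}: {ratio:.1f}x raw transit")
```

A real link is never fully utilized, so the effective per-GB transit cost would be somewhat higher than this lower bound.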

~~~
utrex
It may cost that much to buy that capacity, but it costs a lot more than that
to run the large scale organizations (CAPEX/OPEX) that build and buy these
services. You're not just paying for a pipe, you're paying for the
corporation.

~~~
johansch
> You're not just paying for a pipe, you're paying for the corporation.

You are paying their profit margin, yeah.

It is wildly overpriced, no matter how you look at it. Operating costs even
when done on a much smaller and more inefficient scale than for AWS do not
make the total cost for incremental bandwidth usage THAT much larger.

~~~
azinman2
What's the alternative?

~~~
faizshah
Centurylink has a pretty formidable IaaS offering, it's very well priced and
they charge very little (comparatively) for bandwidth. The only problem is
their storage solution is extremely expensive but using S3 or soon Backblaze
B2 as a storage layer is good enough for me.

~~~
grkvlt
So you're saying S3 is the alternative to S3 ... ?

~~~
faizshah
I said their IaaS is a cheaper alternative to AWS IaaS with the exception of
their S3 alternative. I suggested B2 is the cheapest alternative to S3 do you
have a suggestion?

------
vvanders
The e-ink display showing a shipping label is brilliant.

~~~
Carrok
This article says it's not e-ink but just a Kindle strapped to the side.

>It has a Kindle on the side, which functions as an automatic shipping label.

[http://techcrunch.com/2015/10/07/amazon-launches-
snowball-a-...](http://techcrunch.com/2015/10/07/amazon-launches-snowball-a-
rugged-storage-appliance-for-importing-data-to-aws-by-fedex/)

~~~
notatoad
Unless I'm misjudging the scale of this thing, that screen looks a fair bit
smaller than a Kindle. I think it's just TechCrunch using "Kindle" as a
generic word for e-ink display.

~~~
unfletch
The Amazon exec who introduced the Snowball during the keynote also said it
included a "Kindle".

~~~
A010
Perhaps a mini Kindle; I estimate from the image that it's about 2x4" max.

------
jonkiddy
An AWS version of a sneakernet.

[https://en.wikipedia.org/wiki/Sneakernet](https://en.wikipedia.org/wiki/Sneakernet)

~~~
wanderfowl
Yep, all I could think was "Wow, that's a good deal fancier than a station
wagon full of tapes".

Cloud backup companies have had similar services for a while now, but it's
nice to see AWS adopting it.

~~~
toomuchtodo
I have to admit, while it makes for good storage lock-in, I was impressed that
they only charge 3 cents/GB to get the data back out.

Someone else in this thread thought $1500 was expensive to get 50TB back out.
If you use this for disaster recovery, you could get all of your data back
onsite quickly for a very low (comparative) cost, versus trying to provision
high speed connectivity.

~~~
imajes
Don't forget, there's the ongoing storage cost at S3, which also adds up
really quickly.

~~~
toomuchtodo
3 cents/GB is _cheap_. Go with S3's infrequent access class at 1 cent a GB
(since you won't be incurring the charge for retrieval through S3, you'll be
pulling back out through Snowball), and it's even cheaper.

$10/TB/month? Where else can I store data reliably that cheap? (Yes, Backblaze
is half that price. I hope they become a worthy adversary to AWS S3 to drive
prices further down).

~~~
imajes
If it cost you about $1000 to buy a diskpack (4*6TB drives) you could create
backups and send them to at least a half dozen locations for less money than
using S3 to store that data.

Yes, S3 is cheap(ish). But given Snowball is a snapshot backup service, it's
not comparatively cheaper than it would be to distribute that same data by
creating a clone and sending it to a safe place.

~~~
toomuchtodo
That's not really how business IT works (unless you're sending tape off to
Iron Mountain, which has its own costs and storage fees).

S3 is the cheapest "real" business storage option besides Backblaze's new
storage offering. S3 can't be compared to shipping disks someplace where they
sit offline.

------
silveira
More details [https://aws.amazon.com/blogs/aws/aws-importexport-
snowball-t...](https://aws.amazon.com/blogs/aws/aws-importexport-snowball-
transfer-1-petabyte-per-week-using-amazon-owned-storage-appliances/)

------
mpeg
I'm curious about how they're going to approach this from the fraud
perspective. This is a $200 charge for a device that has 50TB storage, which
would probably cost you around $2000 to buy.

There are people out there who will sign a contract under a fake name / address
with a phone provider and sell the phones, and the way the providers fight
against it is usually by running credit checks and verifying the address against
them. Ultimately, this is very hard to detect when it involves identity theft.

~~~
vvanders
Couldn't you just put a hold for $2k and release it when the device is
returned?

~~~
mpeg
Yes, you could, the pricing page doesn't mention anything about it though,
hence why I wonder.

------
msandford
The XKCD about FedEx's bandwidth seems particularly appropriate:
[https://what-if.xkcd.com/31/](https://what-if.xkcd.com/31/)

~~~
guelo
His MicroSD card calculations are off because it would take a month to insert
an airplane-load of those cards one by one into some computer to copy them
into your storage.

Which also makes me wonder why this Snowball device only does Ethernet. Seems
like it should have an eSATA port.

~~~
scott_karana
Maybe it'll be a 10GbE interface?

~~~
toomuchtodo
Ethernet is most ubiquitous. You don't need physical access to a server with
data, just network access. Most environments outside of datacenters don't have
10GbE switches or NICs yet.

EDIT: It appears it supports 10GbE natively. I assume it'll also support lower
Ethernet speeds.

~~~
scott_karana
> I assume it'll also support lower Ethernet speeds.

Yeah, ethernet has pretty much always downgraded well. That's why a 10GBASE-T
interface is perfect :P

(I should have been more clear with my initialisms)

------
sengstrom
The math for the time to transfer comparison is interesting:

"Even with high-speed Internet connections, it can take months to transfer
large amounts of data. For example, 100 terabytes of data will take more than
100 days to transfer over a dedicated 100 Mbps connection. That same transfer
can be accomplished in less than one day, plus shipping time, using two
Snowball appliances."

With a 100 Mbps connection it takes over 100 days [1] but with a 100 times
faster connection (10 Gbps) it takes less than a day :)

[1] Assuming no network overhead it is 92.6 days
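
A quick sketch of that calculation, assuming decimal units (1 TB = 10^12 bytes), no protocol overhead, and a link that stays saturated. `transfer_days` is just an illustrative helper name.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes, link_bps):
    """Days needed to push data_bytes over a fully saturated link."""
    return data_bytes * 8 / link_bps / SECONDS_PER_DAY


DATA_BYTES = 100 * 10**12  # 100 TB, decimal

days_100mbps = transfer_days(DATA_BYTES, 100 * 10**6)
days_10gbps = transfer_days(DATA_BYTES, 10 * 10**9)

print(f"100 Mbps: {days_100mbps:.1f} days")
print(f"10 Gbps:  {days_10gbps:.2f} days")
```

The 10 Gbps figure is the same number divided by 100, which is the whole joke: the appliance wins mostly by having a faster port.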

~~~
nacs
Good luck trying to saturate the 10Gbps port (on both the read and write
sides) however.

------
itsjustjoe
In my last role we would often need to upload large amounts of data for our
clients to AWS. When this data got into the terabytes we would ship a NAS box
to our customer and then send that to Amazon. On more than one occasion Amazon
fubar'ed the upload on their end (why would you move drives around in a RAID
5/6 array?). Maybe since this is an AWS-branded solution it will be more
reliable.

------
driverdan
This has interesting security implications for both sides. Is the device 100%
offline or does it phone home when you connect it to your network or transmit
any other data? What if someone gets the device and hacks it to scan Amazon's
networks when sent back?

~~~
MichaelGG
I'm gonna guess they have all sorts of physical tamper detection capabilities
to prevent this. And perhaps a software load that gets wiped every time, so in
case you find a bug in their software (iSCSI? NFS? whatever) it might be hard
to escalate.

~~~
Swannie
Tamper detection won't prevent anything. It would just be an indicator that
_something_ appears to have happened.

The software load that is wiped every time is a first, and extremely basic,
line of defence.

Realistically I'd hope the OS is on a SD card that they can literally take out
and throw away after they have the data off (you can pwn the micro-controller
on an SD card) - and replace with a freshly baked card.

~~~
MichaelGG
Presumably if their sensors say the system has been cracked open, they don't
just ship it out to another user. (And they could have many layers of sensors,
telling them if it was just via damage (hitting it with a forklift) or someone
really getting in.) Considering the potential downside, I'm sure they've done
some work here.

------
softwarerocks
All the innovation at AWS is amazing. If they ever stop charging by the
Gigabyte for bandwidth and move to a flat model then I would be tempted to
switch to them for all my sites.

~~~
tomschlick
Why would they do that? It makes sense to charge per GB. It gets people using
them when they are small and it's cheap to use S3/EC2, and then as they scale up
it's easier to just stick with AWS.

------
mkobit
> Snowball currently supports importing data to AWS. Exporting data out of
> AWS will be supported in a future release.

I'm interested in hearing about how they are going to do this.

------
stonogo
Forgive me, but what exactly is 'petabyte-scale' about a 50TB NAS with a dog-
slow link?

------
imajes
Say I wanted to do something similar, but move data around locally between two
NAS appliances, but not incur a double disk charge - (i've got a pack of ~ 20
disks in NAS brand a, but i want to move to NAS brand b. The disks work in
both, but need reformatting).

Does anyone know of a service where I could rent a 20TB device like snowball
but not push it to S3?

~~~
mey
I know Dell has glorified portable hard drive arrays in Pelican-style cases
for moving data between their SAN systems. Unfortunately I don't recall the
name and I don't know if they are available for rental. If you have a Dell
VAR, have them ask the Dell storage group.

Edit: Correction, it doesn't appear to be an array, just a single device so
1.5-2TB max. Also basically targeted at this SAN solution only.

~~~
imajes
Yeah, I need to move 20TB in one go - so i can repurpose the disks.

~~~
Twirrim
[https://www.synology.com/en-
us/products/DS1515+#spec](https://www.synology.com/en-
us/products/DS1515+#spec)

5 * 6TB drives, RAID5 would put you at nearly 24TB of storage, if my brain is
in gear.

~~~
cthalupa
He wants to rent something temporarily, so that he can copy stuff off the
drives, take them out of the nas, format them in the new nas so it's
compatible, copy stuff off the rented appliance to the new nas, ship the
rented thing back off.

He wants to avoid having to buy enough drives, etc., to hold all of the
data at once.

------
fensterblick
Getting terabytes of data into AWS/Hotel California is great, but I wish there
was a way to get the data out just as quickly!

~~~
evanpw
> AWS Import/Export Snowball is a petabyte-scale data transport solution that
> uses secure appliances to transfer large amounts of data into _and out of_
> AWS.

~~~
icebraining
Yes, but "Currently, Snowball doesn't support exporting data."

[http://docs.aws.amazon.com/AWSImportExport/latest/DG/introdu...](http://docs.aws.amazon.com/AWSImportExport/latest/DG/introduction.html)

------
mkobit
The picture on the page doesn't give me an accurate estimate of the size. They
are actually 50 pounds (says on the blog)!

~~~
res0nat0r
They demoed one on stage at re:invent just a bit ago, it is about the size of
a desktop PC with hard plastic case all around it. I think he said it is ~47
lbs.

------
devy
It's amazing that Amazon has vertically integrated so many of its own and
external products/services into this appliance:

\- Kindle's E-Ink

\- AWS IAM / KMS/ SNS

\- Amazon Carrier? (perhaps in the future?)

\- GPS-powered chain-of-custody tracking (AWS working on it, perhaps Amazon
Drone delivery in the future?)

------
meritt
When I export data from S3, what do I get for a given bucket? Just basically a
file system? How is the metadata stored? What about object versions?

I'm curious what the end result looks like in doing this.

~~~
icebraining
"Currently, Snowball doesn't support exporting data."

[http://docs.aws.amazon.com/AWSImportExport/latest/DG/limits....](http://docs.aws.amazon.com/AWSImportExport/latest/DG/limits.html)

~~~
meritt
Wow. Amazon is really embracing the ship fast, break things, roll with the MVP
even when it lacks 50% of the features.

~~~
luhn
Amazon wants to get companies' data into AWS so they're locked in. They don't
want that data flowing the other way.

------
marcosscriven
I thought they already allowed you to mail in hard drives?

~~~
polmolea
They did, but you had to buy them yourself and most of the time you didn't
have a use for them afterwards. With Snowball they manage the process end to end.

~~~
larrys
What's interesting is that they don't mention this on the marketing page for
snowball. As in "also if you want you can mail your own harddrive, see this
page for details". While most would think "why rain on the parade of this new
service by mentioning the old service" with Amazon it's more than that. It's
this entire idea of weaning people off of legacy ways of doing things (with
new names and new processes) so it's harder for any competitor to offer the
same type of service, unique way of doing things or handholding. After all
anyone can accept (in theory) a mailed in hard drive. Much harder to offer a
solution like this with hardware and so on. So to me this is obviously
deliberate and consistent with Amazon wanting to raise an entire generation on
a new paradigm of getting things done.

Edit: And yes this way it's easier for them as well and removes "missing power
supplies" (big deal actually, but I get the point..)

------
ck2
_the E Ink shipping label will automatically update_

Are you kidding me? Instead of a 25 cent shipping label they use a $100 e-ink
display?

The display alone might get the device stolen.

~~~
gamegoblin
They mentioned in the keynote they are using a kindle as the display; given
that Amazon sells refurbished kindles for $65, I imagine the actual cost is a
good bit lower. They also mentioned that the kindle also functions as the
user-interface, so it's more than a glorified shipping label.

------
chx
I am sure most companies which are the target of this will welcome plugging in
an outside device into their internal networks with open arms.

~~~
eru
Relying on securing the perimeter is gonna blow up in their face sooner or
later.

But in any case, you can just connect it to a sacrificial server on its own
network that has nothing else on it.

------
timonv
Oh Amazon, please get your naming right. Why does every service have to be
named so utterly confusingly? I was expecting some kind of cloud NLP library.
Would have been cool.

Out of curiosity, does anyone have a real life example where they send
petabytes over the wire? You know, outside the adult industry.

~~~
dagw
Lots of scientific experiments can easily generate huge data sets of raw
measurements that need to be moved from the locations of measuring instruments
back to the lab.

------
pc2g4d
Named after the horse in Animal Farm?

~~~
Swannie
I assume named after the fact that a snowball is a lump of cloud matter, in a
flight ready state?

------
iancarroll
Isn't this a bit risky? What happens if someone keeps it? 50TB is a lot more
than $200.

~~~
silveira
"The data will be 256-bit encrypted on the host and stored on the appliance in
encrypted form", key is stored in KMS, according to
[https://aws.amazon.com/blogs/aws/aws-importexport-
snowball-t...](https://aws.amazon.com/blogs/aws/aws-importexport-snowball-
transfer-1-petabyte-per-week-using-amazon-owned-storage-appliances/)

~~~
crazysim
I think the poster was talking about someone simply gutting the appliance for
the 50TB drives. With a bit of fraud here and there on top.

~~~
djhworld
From the looks of it the delivery provider is UPS, if they lose it or a rogue
employee guts it during transit, I think AWS will probably have clauses in
their contracts to go at them full pelt with a lawsuit

------
gcb0
safer than Google's option of asking you to ship the unencrypted drive.

~~~
hrez
encrypt it before shipping

~~~
gcb0
then they can't import it.

------
jonknee
Sneakernet as a Service!

------
letstryagain
They should have called it 'speedball'

------
beachstartup
the biggest message this sends is something nobody is talking about: amazon is
not afraid of sending hardware on-premises.

------
spot
Can this work with Arq?

------
rajeshmoov
interesting how much we need to pay for the GB

------
chinathrow
Another easy method to move your customers' data to AWS - where I'm sure some
three letter agencies feast over each newly arrived platter of data.

I'm still waiting for the big leak on how AWS cooperates with NSA at large.

------
buro9
Well, that's an unfortunate name.

[http://www.urbandictionary.com/define.php?term=snowball](http://www.urbandictionary.com/define.php?term=snowball)

Reminiscent of when Microsoft called an overlay dialog a "floater" and all the
South Africans and Brits in the room started laughing.

[http://www.urbandictionary.com/define.php?term=floater](http://www.urbandictionary.com/define.php?term=floater)
(the 2nd definition)

~~~
karlkatzke
I really don't think there's a word that doesn't mean /something/ dirty in
some part of the world.

~~~
antidamage
Yeah, but this is a bit like calling it Amazon Cumswapper. There's no other
valid technical use of the word.

~~~
grkvlt
Well, apart from meaning a ball of snow, such as might be thrown by children
at each other for fun. Really, that's where my mind went with the word
'snowball', much like the rest of the anglophone world. Yes, there may be
slang and other juvenile repurposings of the word, but in normal, everyday
(well, wintertime everyday) usage snowball means exactly what the OED says it
means. Not sure why anyone would imagine otherwise?

