
How to Move 8PB from NYC to SJC? - throwbigdata
I have been asked to copy 8PB from a customer's datacenter in NYC to our datacenter in SJC. I expect to start in 2 weeks and I want to be done 30 days later.<p>The NYC DC is currently unknown (!). I should learn this next week. The data is a collection of files sized 100k to 2MB, average size 500k.<p>What's the best way to move it?<p>Options:<p>Option 1 - Ship my storage servers out there. Each has a 24 core controller with 128GB RAM, a SAS card with 2x HGST 60-disk 10TB JBODs, and a 40gig NIC. We use FreeNAS. I could ship it out, copy the data and ship it back. I could either just ship the disks or the whole storage system.<p>Option 2 - AWS Snowball. I figure I can get 3Gbps based on testing so the copy would take 30 days.<p>Option 3 - AWS Snowmobile - overkill for this?<p>Option 4 - Find 1 or more temp 10G or 100G circuits.<p>Option 5 - Run it through AWS Direct Connect, assuming I can get this at the east coast DC. I have high-bandwidth AWS DX in the SJC facility.<p>Option 6 - Something else?<p>Your thoughts, suggestions, recommendations and experience appreciated!
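
For a rough sense of scale, the figures in the post (8PB, a 30 day window, 3Gbps, 500k average file size) can be turned into a sustained rate and a file count. This is only a sketch: it assumes decimal units (1 PB = 10**15 bytes), a fully utilized link, and no protocol or filesystem overhead, none of which the post states.

```python
# Back-of-the-envelope transfer math for the numbers in the post.
# Assumptions (not from the post): decimal units, a fully utilized
# link, and zero protocol or per-file overhead.

PB = 10**15
SECONDS_PER_DAY = 86_400


def required_gbps(total_bytes: int, days: float) -> float:
    """Sustained rate in Gbps needed to move total_bytes in `days` days."""
    return total_bytes * 8 / (days * SECONDS_PER_DAY) / 1e9


def days_at_rate(total_bytes: int, gbps: float) -> float:
    """Days needed to move total_bytes at a sustained rate of `gbps`."""
    return total_bytes * 8 / (gbps * 1e9) / SECONDS_PER_DAY


def file_count(total_bytes: int, avg_file_bytes: int) -> int:
    """Approximate number of files for a given average file size."""
    return total_bytes // avg_file_bytes


total = 8 * PB
files = file_count(total, 500_000)
print(f"rate needed for 30 days: {required_gbps(total, 30):.1f} Gbps")
print(f"days at a single 3 Gbps: {days_at_rate(total, 3):.0f}")
print(f"files: {files:,}")
print(f"files per second over 30 days: {files / (30 * SECONDS_PER_DAY):,.0f}")
```

Under these assumptions, a 30 day window needs roughly 25 Gbps sustained, so a 30 day estimate at 3Gbps only works out if that rate is per device and about eight devices are loaded in parallel. The file count matters too: at around 16 billion files, per-file overhead may dominate raw bandwidth.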
======
jwilliams
If you already own the servers and can do without them, then that'll be a fast
option for sure. That said, that's quite a bit to insure on the shipping
(that's a quarter to half of a million $ of discs alone?). Hate to think of
the damage if one of those servers were dropped. Plus you're shipping twice,
so that's double the cost.

You can probably do it via tape for circa $50k. Probably a lot less if you buy
in bulk. It also depends on the kind of compression you can get on that 8PB.

If you can handle tape (500-1,000 tapes is a lot to do manually), I'd consider
shipping the tape. Then try selling the tape at the destination (surely there
is a decent market in SJC).
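
A quick sketch of the tape count, for anyone checking the 500-1,000 estimate. The cartridge capacities are assumptions that do not come from this thread: LTO-7 at 6TB native and LTO-8 at 12TB native. The compression ratio is whatever the data actually achieves, so it is left as a parameter.

```python
import math

PB = 10**15
TB = 10**12


def tapes_needed(total_bytes: int, tape_bytes: int, compression: float = 1.0) -> int:
    """Cartridges needed, rounded up. `compression` is the achieved ratio (1.0 = none)."""
    return math.ceil(total_bytes / (tape_bytes * compression))


# Native capacities below are assumptions, not figures from the thread.
for name, capacity in [("LTO-7", 6 * TB), ("LTO-8", 12 * TB)]:
    plain = tapes_needed(8 * PB, capacity)
    squeezed = tapes_needed(8 * PB, capacity, compression=2.0)
    print(f"{name}: {plain} tapes uncompressed, {squeezed} at 2:1 compression")
```

Already-compressed data (images, video, archives) may see a ratio close to 1.0, so the uncompressed figure is the safer one to budget for.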

~~~
thiago_fm
I would do the same. Perhaps find some companies that do this and get quotes
from them.

~~~
throwbigdata
What companies do this?

------
abra_kadabra
I agree with the rest of the comments that the absolute fastest way is to ship
the disks and then copy the data. That said, if you can't or would rather not
ship anything because of the risk of disks getting ruined, then the best thing
to do is to find a high-bandwidth internet connection and use something like
IBM Aspera
([https://www-03.ibm.com/software/products/en/high-speed-file-...](https://www-03.ibm.com/software/products/en/high-speed-file-transfer))

Aspera is widely used in the movie industry to move TB-size movies around
(~2.5TB). The key to Aspera is that it assumes packet order doesn't matter, so
it will max out your pipes. It also doesn't have to report back immediately
about which packets were dropped, which makes it pretty efficient at getting
the data across quickly.
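
To illustrate why waiting on acknowledgments hurts over a long link, here is a sketch of the standard bandwidth-delay arithmetic for a single TCP stream. It says nothing about how Aspera works internally. The 70 ms round-trip time is an assumed coast-to-coast figure, not a measurement from this thread.

```python
def max_tcp_gbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound for one TCP stream: at most one window in flight per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e9


def window_for_rate(gbps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product in bytes: the window needed to keep the link full."""
    return gbps * 1e9 * rtt_seconds / 8


RTT = 0.070  # assumed NYC <-> SJC round trip, in seconds

print(f"64 KiB window: {max_tcp_gbps(64 * 1024, RTT) * 1000:.1f} Mbps ceiling")
print(f"window to fill 10 Gbps: {window_for_rate(10, RTT) / 1e6:.1f} MB")
```

With large windows or many parallel streams, plain TCP can also fill a long link, so this is a tuning consideration rather than a hard limit.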

------
brudgers
_Never underestimate the bandwidth of a station wagon full of tapes hurtling
down the highway._ -- Andrew S. Tanenbaum

~~~
twobyfour
This. Don't bother shipping the servers. Ship the disks. Copy them. Ship them
back. Easy peasy.
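
To put rough numbers on the ship-the-disks plan, this sketch counts the 10TB disks and 120-disk systems from the original post, then works out the effective bandwidth of a shipment. The load, transit, and unload durations are made-up placeholders, and the disk count is raw capacity with no redundancy or filesystem overhead.

```python
import math

PB = 10**15
TB = 10**12
SECONDS_PER_DAY = 86_400


def units_needed(total_bytes: int, unit_bytes: int) -> int:
    """Number of storage units (disks, shelves, ...) needed, rounded up."""
    return math.ceil(total_bytes / unit_bytes)


def effective_gbps(total_bytes: int, load_days: float,
                   transit_days: float, unload_days: float) -> float:
    """Door-to-door bandwidth of a shipment, counting copy time at both ends."""
    total_seconds = (load_days + transit_days + unload_days) * SECONDS_PER_DAY
    return total_bytes * 8 / total_seconds / 1e9


disks = units_needed(8 * PB, 10 * TB)             # raw capacity, no redundancy
systems = units_needed(8 * PB, 2 * 60 * 10 * TB)  # 2 JBODs x 60 disks per system
print(f"{disks} disks, {systems} systems")

# Placeholder schedule: 10 days to load, 3 in transit, 10 to unload.
print(f"{effective_gbps(8 * PB, 10, 3, 10):.1f} Gbps effective")
```

Even with generous copy time at both ends, the placeholder schedule comes out above what a single 10G circuit could deliver.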

------
jtchang
Shipping storage servers is by far the fastest way. You can encrypt data if
needed and have options to either fly the thing out or ship it ground.

------
throwaway8347
IBM just rolled out a Snowball competitor [1]. It has a 10Gbps port and might
be a good option.

[1]: [http://www.datacenterdynamics.com/content-tracks/servers-sto...](http://www.datacenterdynamics.com/content-tracks/servers-storage/ibm-rolls-out-cloud-courier-service-with-120tb-storage-blocks/98969.article)

------
jtchang
This sounds like a fun project.

------
PaulHoule
Option 1

