
You can buy a dedicated server ($50-200/mo) hooked up to a 1Gbps line ($1-2k/mo). Assuming you fully saturate the link, 900TB will take 83 days to download. Of course you can also rent a bigger line, or more of them. And this doesn't consider storage costs. But it's certainly less than $80k.
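The 83-day figure checks out as a back-of-envelope calculation (decimal units throughout):

```python
# 900 TB over a fully saturated 1 Gbps link
bits_total = 900e12 * 8          # 900 TB expressed in bits
seconds = bits_total / 1e9       # at 1 Gbps
days = seconds / 86400
print(round(days, 1))            # ~83.3 days
```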

Edit: I think you could even rent at a datacenter that peers with Amazon, hook your server with the files up to a VPC via IPsec, then move the files to S3 or Glacier via a server in the VPC. I'm not sure you could avoid the PUT costs, but I think (not 100% sure) the peering would let you avoid Amazon bandwidth charges.




And how much do you need to spend on HDDs and how many do you need for 900TB of data?

WD 10TB Gold drives on Newegg are $414 each, so 90 of them is $37,260. Now you want at least 25% RAID redundancy, which puts you at $46,575. Then, how many U of rack space do you need for all those drives? It's not just something you can throw in a 1U host. Even packing the ~113 drives into servers 12 at a time, you still need at least ten 3U servers, 30U, so that's a pretty expensive hosting proposition on top of the nearly $50K for drives alone.
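The drive math above, spelled out (the $46,575 figure is 25% cost overhead on top of the 90-drive base):

```python
drive_price = 414            # WD 10 TB Gold on Newegg, per the comment
base_drives = 900 // 10      # 90 drives for 900 TB raw
raw_cost = base_drives * drive_price
with_redundancy = raw_cost * 1.25   # 25% RAID redundancy overhead
print(raw_cost, int(with_redundancy))  # 37260 46575
```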


A chassis design that holds the HDDs vertically[1] would more than halve the space, down to 12U instead of 30U, which is slightly easier to stomach. However, 900 TB is still less than half of SoundCloud's reported 2.5 PB[2], though I have no idea how much of that 2.5 PB consists of downsampled copies which wouldn't need to be stored.
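The 12U figure follows if you assume a 45-bay 4U chassis like the linked Supermicro:

```python
import math

drives = math.ceil(90 * 1.25)      # ~113 drives including 25% redundancy
bays_per_chassis = 45              # assumed 45-bay 4U vertical-mount chassis
chassis = math.ceil(drives / bays_per_chassis)
print(chassis * 4)                 # 12 rack units
```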

[1] https://www.supermicro.com/products/system/4U/6048/SSG-6048R... [2] https://aws.amazon.com/solutions/case-studies/soundcloud/


This might be an application where the cost of a tape drive + many tape cartridges will be far less than the cost of HDDs. LTO-6 tapes are ~$4/TB and the drive is a few $K.
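A rough tape count, assuming LTO-6 native capacity of 2.5 TB per cartridge and the ~$4/TB figure above:

```python
import math

tapes = math.ceil(900 / 2.5)       # cartridges needed for 900 TB
tape_cost = tapes * 2.5 * 4        # at ~$4/TB of tape media
print(tapes, int(tape_cost))       # 360 tapes, ~$3600 in media
```

Even tripling that for a drive and spares, it's an order of magnitude under the HDD quote.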

Tapes are also more reliable than HDDs for long-term archival storage.


S3 inbound transfer is free, from anywhere. You'd use several dedicated servers consuming a queue of URLs, packaging them into WARCs and pushing them into S3, no VPC or IPsec tunnel required (random S3 key prefixes ftw, to avoid creating bucket hotspots).
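The key-prefix trick can be sketched like this; the function name and prefix length are illustrative, and the actual upload would go through something like boto3's `put_object`:

```python
import hashlib

def warc_key(url: str) -> str:
    """Build an S3 object key with a short hash prefix so sequential
    uploads spread across the bucket's key space instead of piling
    onto one hot partition. (Sketch; naming scheme is hypothetical.)"""
    prefix = hashlib.sha256(url.encode()).hexdigest()[:4]
    safe = url.replace("://", "_").replace("/", "_")
    return f"{prefix}/{safe}.warc.gz"

print(warc_key("https://example.com/tracks/123"))
```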

Then you'd hope someone came along with a big chunk of AWS credits to use snowball to migrate your bucket(s) of data to the Internet Archive when they could safely accept said data.


AWS Direct Connect transfer out of AWS into the connected DC is billed per GB, according to https://aws.amazon.com/directconnect/pricing/, and it's quite the pretty penny!


Yes, but it looks like transfer in is free (other than paying for the pipe). So in the scenario we're discussing, you could get 900TB into AWS fairly cheaply.



