Consider donating to the Internet Archive.
> Archive Team plans a large-scale backup of SoundCloud soon, but seriously, please donate money to the Archive. http://Archive.org/donate
Edit: my info is dated. Archive Team backed off. Smoke if you've got 'em.
By honoring SC's request Archive Team has allowed the indie Library of Alexandria to burn...
That's what pisses me off about VC. Everything's a fucking product...
Edit: I was mixed up. I was thinking of what it would cost SoundCloud to transfer all that music. I guess storing it would be much cheaper...
Edit: I think you could even rent space at a datacenter that peers with Amazon, hook your server with the files up to a VPC via IPsec, then move the files to S3 or Glacier via a server in the VPC. Not sure if you could avoid the PUT costs, but I think the peering would let you avoid Amazon's bandwidth charges (not 100% sure).
WD 10TB Gold drives on Newegg are $414 each, so $414 * 90 = $37,260. Now you want at least 25% RAID redundancy on top of that: $46,575. Then how many U of rack space do you need to store all those drives? It's not something you can throw in a 1U host. Even if you can fit those roughly 113 drives (90 plus 25%) into servers 12 at a time, you still need at least 10 servers at 3U each, so 30U. That's a pretty expensive hosting proposition, beyond the nearly $50K for drives alone.
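For anyone who wants to plug in their own numbers, here is a rough Python sketch of that back-of-envelope math. The drive price, the 90-drive count, the 25% overhead, and the 12-drives-per-3U figure are taken from the comment above; the function and parameter names are made up for illustration, and rounding up to whole drives and whole chassis is an assumption.

```python
import math

def estimate_storage(data_drives=90, price_per_drive=414, redundancy=0.25,
                     drives_per_chassis=12, chassis_height_u=3):
    """Back-of-envelope drive cost and rack space (all defaults are assumptions)."""
    raw_cost = data_drives * price_per_drive
    # Scale the drive cost by the redundancy overhead, as in the comment.
    cost_with_redundancy = raw_cost * (1 + redundancy)
    # You can't buy a fraction of a drive, so round the physical count up.
    total_drives = math.ceil(data_drives * (1 + redundancy))
    # Same for chassis: a partially filled server still takes its full height.
    chassis = math.ceil(total_drives / drives_per_chassis)
    return {
        "raw_cost": raw_cost,
        "cost_with_redundancy": cost_with_redundancy,
        "total_drives": total_drives,
        "chassis": chassis,
        "rack_units": chassis * chassis_height_u,
    }

print(estimate_storage())
```

With the defaults this comes out to $37,260 raw, $46,575 with the overhead, 113 drives, 10 chassis, and 30U. It leaves out servers, power, and hosting fees entirely, which is the expensive part the comment is pointing at.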
Tapes are also more reliable than HDDs for long-term archival storage.
Then you'd hope someone came along with a big chunk of AWS credits and used Snowball to migrate your bucket(s) of data to the Internet Archive once they could safely accept it.
Since incoming traffic is free and outgoing traffic to other Google services is free, bandwidth costs would be minimal (just the size of the requests, not the responses).
If you directly run traffic agreements with ISPs, and peer directly, you can get below half of that.
Where are you getting those moon prices? AWS? GCP? Never use their prices to get a fair estimate; AWS and GCP's prices are orders of magnitude off from a fair price. Unless you're in Silicon Valley, running systems yourself at a traditional datacenter will always end up cheaper. (The reason is that Amazon and Google have to pay far higher wages than you do, and you get to cut out a middleman.)
Still, it's quite impressive to be able to manage that much data.
That said, if they pay the rates you mentioned, then it's no wonder the company fell apart.
Also, I was thinking that G Suite would be the cheapest option since it has "unlimited" storage.
The most egregious example to me was when Madlib released a remix of a Kanye West song that he had produced and it was taken down for copyright by UMG (a shareholder in Soundcloud). That's when I knew the sun had set on Soundcloud as the open creative community it had started as.
There are still people interested in archiving in #soundbutt on EFNet but no plan as there is no place to store all the data. :(
I continue to be interested in helping with these efforts but multiple petabytes of online storage is not trivial ...