Ask HN: cloud data storage cheaper than AWS S3?
39 points by dpapathanasiou on Feb 17, 2020 | 34 comments
I'm working on a new neural network based project, which means I'll need to access and store more data than I can keep on my hard drives.

While I'm familiar with AWS S3 from my day job, I was wondering what other alternatives are out there that may be cheaper (or even free)?

I'm also aware of Dropbox, Google Docs, etc., but ideally I'd like to have programmatic access via an API.





BigQuery, at $20/TB for active storage and half that for cold storage, is the best structured (e.g. columns or JSON) solution where import and export are free. You even get SQL, since the product is a data warehouse.


If you need that much storage, I would posit that cloud storage is going to be a problem not due to cost but due to network performance. A bunch of cheap storage is great but if it takes you days to transfer data there it might be a dealbreaker.

Have you considered a few external hard drives? You can get a couple of terabytes for pretty cheap these days.



If you can tolerate risk of data loss, VPS or even physical hosts are likely to be far cheaper than any cloud. Look at OVH, Hetzner, Linode, DigitalOcean.


DigitalOcean Spaces has the same API as AWS S3, and the download bandwidth costs a lot less. I would recommend it.
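Since Spaces speaks the S3 API, existing S3 client code can usually be pointed at it by overriding the endpoint. A minimal sketch, assuming the third-party boto3 library is installed; the endpoint URL, bucket name, file names, and the helper names `s3_client_kwargs` and `make_client` are illustrative assumptions, so check the provider's documentation for the real endpoint of your region:

```python
def s3_client_kwargs(endpoint_url: str, access_key: str, secret_key: str) -> dict:
    """Build keyword arguments for boto3.client() that target an S3-compatible service."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,  # the only real difference from plain AWS S3
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def make_client(endpoint_url: str, access_key: str, secret_key: str):
    """Create an S3 client for the given endpoint (boto3 is imported lazily)."""
    import boto3  # third-party: pip install boto3

    return boto3.client(**s3_client_kwargs(endpoint_url, access_key, secret_key))


# Example usage (needs real credentials, so it is left commented out):
# client = make_client("https://nyc3.digitaloceanspaces.com", "KEY", "SECRET")
# client.upload_file("train.tfrecord", "my-bucket", "datasets/train.tfrecord")
```

The same pattern should carry over to the other S3-compatible services mentioned in this thread by swapping the endpoint URL.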


Linode, Pair and Rackspace are established players in this field. (Pair.com is the oldest I know of).


According to Wikipedia, OVH is the third-largest hosting company in the world based on physical servers and the largest hosting company in Europe. They are not a fly-by-night operation.

And IMHO Hetzner is by far the best service provider there is for bare metal and plain old VM services.


Didn't know about OVH, thanks. A long time ago, I was advised to always look first at the age of the company when choosing a web host (and then at their financials and client base). I've found this advice to be solid. A lot of web hosting companies fold up ...


"Risk of data loss".

Is there anything that guarantees no data loss or zero risk?


The risk profile of storing something in S3, where your data is stored in at least three different Availability Zones, is completely different from that of a managed server or VPS where, if you so choose, it might be stored without any redundancy at all.


I assume everything depends on how much storage you need, how long you need to store it, and whether the data is easily reproducible.

Generally storage isn't cheap, and cloud storage is quite expensive in the long run. If you need storage for more than a year, I would invest in your own local HDDs: put them in your PC, or buy a used NAS or PC. You will get much better performance, and this would be the most cost-effective solution.

Keep in mind that transfer to cheap cloud storage is often slow. I tried keeping my backup with a few different providers, and it could take literally months to upload 6TB of data. Also keep in mind that you may be charged separately for data transfer on every access, so the cloud cost may be much higher than expected.
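To get a feel for the upload problem, here is a rough back-of-the-envelope sketch. The uplink speeds and the 80% efficiency factor are assumptions for illustration, not measurements from any provider:

```python
def upload_days(size_tb: float, uplink_mbps: float, efficiency: float = 0.8) -> float:
    """Days needed to upload size_tb (decimal terabytes) over an uplink of uplink_mbps megabits/s.

    efficiency is an assumed fraction of the nominal link speed actually achieved.
    """
    size_bits = size_tb * 1e12 * 8                  # terabytes -> bits
    effective_bps = uplink_mbps * 1e6 * efficiency  # usable bits per second
    return size_bits / effective_bps / 86_400       # seconds -> days


# Hypothetical uplink speeds for the 6TB backup mentioned above.
for mbps in (10, 100, 1000):
    print(f"{mbps:>5} Mbit/s: {upload_days(6, mbps):6.1f} days")
```

At an assumed 10 Mbit/s uplink this works out to a bit over two months for 6TB, which is in line with the experience described above.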

If you plan to use this for a shorter period, I would go with OVH; they probably have the best quality/cost ratio. Depending on your needs, I would suggest getting a dedicated storage server or using their Data Storage (3x replicated, $0.0112/month/GB, plus outgoing transfer at $0.011/GB). They also have cold storage for about $0.0023/month/GB.
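A small sketch of how storage and egress charges combine, using the OVH prices quoted above (2020 figures; verify current pricing). The dataset size and monthly read volume are hypothetical, and `Plan` is just an illustrative helper:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    storage_per_gb_month: float  # USD per GB stored per month
    egress_per_gb: float         # USD per GB transferred out

    def monthly_cost(self, stored_gb: float, egress_gb: float) -> float:
        """Total monthly bill: storage charge plus outgoing-transfer charge."""
        return stored_gb * self.storage_per_gb_month + egress_gb * self.egress_per_gb


# Prices as quoted in the comment above.
ovh_replicated = Plan("OVH Data Storage (3x replicated)", 0.0112, 0.011)

stored_gb = 6_000   # hypothetical: a 6TB dataset
egress_gb = 12_000  # hypothetical: the whole dataset is read back twice a month
cost = ovh_replicated.monthly_cost(stored_gb, egress_gb)
print(f"{ovh_replicated.name}: ${cost:.2f}/month")
```

With these assumed numbers the egress charge is roughly twice the storage charge, which is why the transfer fees matter as much as the headline per-GB price.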


Wasabi is decently inexpensive at $0.0059 per GB/month https://wasabi.com/

I currently use it as my personal cloud backup


Keep in mind Wasabi has a minimum retention period, so if you create a bucket and delete it, you still have to keep paying.


FWIW, mastodon.social has had frequent problems with the availability of Wasabi.


I like Wasabi for the price and use it for my personal 4TB backup. I used Wasabi for company CCTV storage (100TB+) but switched to Azure Cold storage because of Wasabi availability issues.

I prefer Wasabi for my personal backups because I can use my own backup/encryption scripts (using rclone) instead of a closed-source service like CrashPlan.


I second Wasabi; its API is compatible with AWS S3.


Storj Tardigrade

https://tardigrade.io/


Keep in mind that if you use this data often as training data, you want to store it close to your GPU. No point in saving money on the data storage and having your expensive GPU idle because you are waiting for data to download from Dropbox...


What is your tolerance for losing data?


Google gets ready for its entry in cloud services market

https://www.headlinesoftoday.com/technology/tech-reviews/goo...


You don't mention how much data. At some size point, a cheap storage server and minio (open source s3 compatible) might be a better value.

At the low end, OVH has a 4×4TB HDD SATA + 1×500GB SSD NVMe server for ~$90/month.

Of course, you have to configure and administer it, so not for everyone.


Depending on your needs, you can rent a dedicated server with hard drives. For example, Hetzner has an offering of 10x 10TB HDDs for $200/month or so.

Disclaimer: I have never used Hetzner's services nor can I vouch for them.
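For comparison with per-GB cloud pricing, the dedicated-server offers quoted in this thread can be turned into a cost per usable terabyte. A rough sketch: the prices are the approximate ones mentioned above, the 0.5 usable fraction (mirroring every drive) is an assumption, and admin time is not counted:

```python
def cost_per_tb_month(monthly_usd: float, drives: int, tb_per_drive: float,
                      usable_fraction: float = 1.0) -> float:
    """USD per usable TB per month for a rented server with the given drives.

    usable_fraction is the share of raw capacity left after redundancy
    (1.0 = no redundancy, 0.5 = every drive mirrored).
    """
    usable_tb = drives * tb_per_drive * usable_fraction
    return monthly_usd / usable_tb


# Offers as quoted in the thread (approximate 2020 prices).
offers = {
    "OVH 4x4TB (~$90/month)": (90, 4, 4),
    "Hetzner 10x10TB (~$200/month)": (200, 10, 10),
}
for name, (usd, drives, tb) in offers.items():
    raw = cost_per_tb_month(usd, drives, tb)
    mirrored = cost_per_tb_month(usd, drives, tb, usable_fraction=0.5)
    print(f"{name}: ${raw:.2f}/TB raw, ${mirrored:.2f}/TB mirrored")
```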


I've used them for the last 5 years. They are OK. Their network is not redundant (core routers) so you should expect some downtime at least once per year for several hours. But it's the only complaint I have. The good part is that you can buy several servers for 100-200 EUR and build a decent HA with fixed monthly cost. Our AWS bill was always hard to predict. Also, they have a VPS cloud service that is decent and we use it for non-core services.


As you said: Depending on needs

S3 is not a single-host solution: it will neither become unavailable nor lose data if one particular host (or even the datacenter it is in) becomes unavailable. If you don't need those properties, then building your own hosts might be cheaper.


Scaleway Object Storage

S3 compatible and comes with free bandwidth

https://www.scaleway.com/en/object-storage/


Too big for my hard drives covers a lot of space.



S3 may be cheaper than you think.

If you're willing to tolerate delays of an hour or more in accessing your data, AWS Glacier Deep Archive is 70 cents per terabyte-month.

That's pretty awesome in my book.

If you need to access the data in under an hour, it comes out to around $2 per TB/mo.


I am not sure if you are calculating this correctly, but for me 70 cents/TB/mo is not a realistic number. The cheapest Glacier storage for me is $0.004 per GB per month, which makes a terabyte $4 per month. That number is for the Ohio or Oregon data centers. And their retrieval rates are super expensive, as in $0.03 per GB. Glacier might be cheap, but it definitely is not the cheapest.


No, the cheapest Deep Archive is $0.0007 per GB-month, or 70 cents per TB/mo.

Note they've recently reduced their pricing, so you might be looking at old marketing. But my latest bill charged me 0.0007.

You're right, of course, that retrieval in and out is much more expensive, and so it's prob not a great fit for your use case.


Glacier Deep Archive is $1/TB-month and Glacier is $4/TB-month, and neither is cost-effective for immediate or recurrent retrieval.
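The per-GB and per-TB figures traded in this subthread are easy to mix up, so here is the conversion spelled out. This is only arithmetic on the numbers as quoted by the commenters (assuming 1 TB = 1000 GB), not a statement of current AWS pricing:

```python
def per_tb_month(price_per_gb_month: float, gb_per_tb: int = 1000) -> float:
    """Convert a USD per GB-month price to USD per TB-month."""
    return price_per_gb_month * gb_per_tb


def retrieval_in_storage_months(retrieval_per_gb: float, storage_per_gb_month: float) -> float:
    """How many months of storage one full retrieval of the same data costs."""
    return retrieval_per_gb / storage_per_gb_month


# Figures as quoted in the comments above.
quoted = {
    "Glacier at $0.004/GB-month": 0.004,
    "Deep Archive at $0.0007/GB-month": 0.0007,
}
for label, price in quoted.items():
    print(f"{label}: ${per_tb_month(price):.2f} per TB-month")

# Quoted Glacier retrieval rate of $0.03/GB against $0.004/GB-month storage.
months = retrieval_in_storage_months(0.03, 0.004)
print(f"One full retrieval costs about {months:.1f} months of storage")
```

On the quoted Glacier numbers, pulling the data back once costs about as much as seven and a half months of storing it, which is the point about retrieval being the expensive part.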


Whoa! That's nothing!

Is there an easy way for a consumer to use this for backup?


Why don't you try Usenet? Unlimited storage for $3.14/month[1].

There are a couple of Python libraries out there for posting and fetching, but it'll definitely be shabbier than a purpose-built service. However, for seriously large storage requirements, you can't beat that price.

I have a full PC backup I did 5 years ago drifting around on Usenet somewhere. It was still there when I checked a year ago.

[1] https://newsgroupdirect.com/member/billing/?planid=189&deal= but I've had problems with their service, and personally use FrugalUsenet for a little more.



