How to: Compete with Amazon S3 without Buying Hardware (adamsmith.cc)
66 points by adamsmith on Apr 27, 2012 | 37 comments

Tahoe-LAFS[1] is a storage system that works similarly. It splits data into n fragments, of which any k are enough to restore the original. This gives an expansion factor of only n/k (DATA_SIZE * n/k stored in total), while you can still lose n-k fragments without data loss. Additionally, all data is encrypted, signed, and optionally deduplicated.
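A toy sketch of the any-k-of-n idea, using a single XOR parity fragment with k=2, n=3 (Tahoe-LAFS itself uses Reed-Solomon coding with configurable k and n; the fragment layout here is purely illustrative):

```python
# k=2, n=3: split data into two halves plus an XOR parity fragment.
# Any 2 of the 3 fragments reconstruct the original.

def encode(data: bytes) -> list[bytes]:
    half = (len(data) + 1) // 2
    a, b = data[:half], data[half:].ljust(half, b"\0")
    parity = bytes(x ^ y for x, y in zip(a, b))
    return [a, b, parity]

def decode(frags: dict[int, bytes], length: int) -> bytes:
    # frags maps fragment index (0=first half, 1=second half, 2=parity)
    if 0 in frags and 1 in frags:
        a, b = frags[0], frags[1]
    elif 0 in frags:  # second half lost: b = a XOR parity
        a = frags[0]
        b = bytes(x ^ y for x, y in zip(a, frags[2]))
    else:             # first half lost: a = b XOR parity
        b = frags[1]
        a = bytes(x ^ y for x, y in zip(b, frags[2]))
    return (a + b)[:length]

data = b"hello, tahoe!"
frags = encode(data)
# Lose any one fragment and still recover the original:
assert decode({0: frags[0], 2: frags[2]}, len(data)) == data
assert decode({1: frags[1], 2: frags[2]}, len(data)) == data
```

Storage used is 3 half-sized fragments, i.e. 1.5x the data size, yet any single fragment loss is survivable.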

One of the authors of Tahoe-LAFS started a company that ported the whole system over to cloud storage providers.[2] It's still in alpha, but it's definitely worth a look if you want secure, encrypted storage without relying on a single cloud provider.

1: https://tahoe-lafs.org/trac/tahoe-lafs

2: https://leastauthority.com/

Now this is a cloud application.

Re: Least Authority: awesome, thanks for the link!

I've been watching the storage industry for years as a hobby-passion. Adam pretty much hit all the major points. The storage space is still open for disruption, but it's a hard, high-risk business.

It's not like building a website. You need serious funding for hardware. You need people to manage the hardware. You need complex software to manage the data and ensure security. One serious breach early on and you're done.

Competing with Amazon is especially hard since S3 is well established and entrenched. If you use EC2 you're going to use S3.

Pricing would be a primary factor in competing. 3x redundancy is unnecessary; I'm not sure why services still do that. Reed-Solomon or similar redundancy algorithms can provide better protection while using less space. They have CPU overhead, but CPUs aren't going to be the bottleneck for a storage service; bandwidth and hard drives will be.
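Back-of-the-envelope comparison of the two schemes (the 10+4 Reed-Solomon layout is an illustrative parameter choice, not from the article):

```python
# Storage overhead: 3x replication vs. a 10-of-14 Reed-Solomon layout
# (10 data shards + 4 parity shards; any 10 of the 14 recover the data).

data_tb = 100  # terabytes of user data

replication = 3 * data_tb         # three full copies: tolerates losing 2 copies
reed_solomon = data_tb * 14 / 10  # 10 data + 4 parity: tolerates losing any 4 shards

print(replication)   # 300 TB stored
print(reed_solomon)  # 140.0 TB stored
```

Same (actually better) failure tolerance at less than half the raw disk, which is the pricing lever the comment is pointing at.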

Edit: This would be if you built from the hardware up. I don't think offering a service like S3 on top of other storage services would work as a business. You'd have to deal with too many vendors, too much variation in APIs / software / hardware, lack of control, latency issues, and much tighter margins. IMO you'd be better off starting with bare metal. You could do something like this for personal, smaller scale storage but growing it to scale would be a nightmare.

We've[1] been doing this for 11 years now, just as you describe. We built the bare metal ourselves, we own it, and the buck stops here.

Most importantly, unlike the OP (who says "the big hosting guys don't have a track record of building complex systems software") and your own post (which speaks of "complex software"), we run an architecture that is as simple as possible.

Our failures are always boring ones, just like our business is.

You are correct that a chain of vendors, ending in a behemoth[2] that nobody will ever interact with and that will never take responsibility, is a bad model.

So too is a model whose risk you cannot assess. You have no idea how to model the risk of data in an S3 container. You can absolutely model the risk of data in a UFS filesystem running on FreeBSD[3].

[1] rsync.net

[2] Amazon

[3] ZFS deployment occurs in May, 2012

I imagine S3's response times will be much better than this Frankenstore model's, which may be an issue for certain applications.

Also you need to host infrastructure software that knows where your data is sitting, how to deal with provider failures, how to efficiently route requests, etc. which means yet-another-thing-to-configure.

Finally, if the volume of data you're storing is so expensive on S3, I have to wonder why you have all this non-revenue generating data stored in the first place. Processing it also seems more expensive now because the free bandwidth you get from EC2<->S3 won't apply in the Frankenstore model.

All good points!

I should have mentioned this more explicitly, but you could take the buying-raw-storage model and use it to do anything S3 does, I think. E.g., you could have three independent whole copies, or one whole copy and 1.5 copies distributed widely.

The only thing I can think of that Amazon could do that you couldn't, if the raw storage providers are untrusted, is serve the data with no additional hops, since the data would be encrypted.

It really isn't that expensive; see http://www.backblaze.com/ and specifically http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v...

Obviously they are targeted at backups but you wouldn't need to change a lot to improve performance (mostly it would be in software + some caching boxes I think).

We have ~8PB of spinning storage that we built with the backblaze storage pods, and use Openstack's Swift Object Storage for the software layer. Works like a champ.

This sounds very interesting and is something that I'm thinking about doing as well. One "limitation" that I see with the BackBlaze pods is that they may not perform well under heavy everyday use. They were designed as write-mostly devices, but my use case would be very read-heavy, and I'm not sure how they would hold up.

Do you have any information on what/how you went about building your storage system? If not, would you be so kind as to create some text (blog, how-to's, etc.) that detailed your setup and how it performs under your workload?

Before my current startup, I worked at Fermi National Accelerator Lab on the CMS detector data taking team for the LHC. I spent a year there getting to admin the spinning storage (~5PB) on Nexsan Satabeasts (very nice, but very expensive for ~48-96TB of disk per enclosure) and ~17PB of storage on Storagetek tape silos (also, of less consequence, ~5500 nodes that reconstructed collider data from raw data we streamed over 40Gb/s optical links from CERN).

After my experience there (both technical and political), I left to do big storage. We settled on the Backblaze enclosures due solely to cost (cheap is cheap); read performance is sub-optimal, but we try to compensate with heavy caching in memory and intelligent read assumptions ("what might someone request next based on past read requests") at the app level (sitting on top of Nova).

I could do a blog post, but I have to check with my partner to make sure they're cool with me spilling that much info =)

Hope this has helped a bit. If you haven't guessed yet, I love object storage.

A blog post would be really interesting, even if you couldn't disclose everything.

What are your startup and blog URLs? You don't have any profile info.

"In 2006 a 320 GB hard drive cost $120. Today (Thailand floods aside) that much money will snag you a 3 TB drive."

Floods or not, the current price isn't $120. It's 50% higher than that.


Shows one of the cheapest 3 TB non-enterprise drives. It looks like 3 TB was $120 for ~2 weeks. Looking at enterprise drives, 3 TB is closer to $300.

This article is basically advocating RAID 5 across many storage providers.

*edit: From the pictures, the article is advocating RAID 10. Nonetheless, RAID 5 would be just as feasible for additional storage.

People always conveniently forget that when you store data at S3 or Rackspace, they're keeping 3 replicas. If you store 3 TB, it's not just the cost of a single 3 TB drive; it's 3x. To do it at home you'd need to buy three 3 TB drives at $300 each.

That's right. The main point is that these services still benefit from the falling cost of storage.

Suppose the price of _x_ amount of storage were halved recently. If they store their data in three replicas, shouldn't their price become (1/2)^3, or 1/8th of the original? That would only prove his point further, or I'm missing something.




Each drive costs $x. Storing one drive's worth of data across 3 providers, each keeping 3 copies, costs $9x. If drive prices halve, the formula is still 9 times the (new) drive price: the cost scales linearly with drive price, not as a cube.

I don't think anyone is proposing 3 providers, who each store 3 copies. Either use one provider who stores 3 copies (Amazon) or three providers who each store one copy.

"People always conveniently forget that when you store data at S3 or Rackspace, they're keeping 3 replicas. If you store 3 TB, it's not just the cost of a single 3 TB drive; it's 3x. To do it at home you'd need to buy three 3 TB drives at $300 each."

Existing companies store multiple copies transparently... Either way, 1 copy or 3 copies, I don't think the math is wrong. Just change $9x to $3x.
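To make the arithmetic in this subthread concrete: replication multiplies cost by a constant factor, so halving the drive price halves the bill; it does not cube the savings (a small sketch, with hypothetical prices):

```python
def storage_cost(drive_price, providers=1, copies_per_provider=3):
    """Cost to hold one drive's worth of data: the replication
    factor is a constant multiplier on the per-drive price."""
    return drive_price * providers * copies_per_provider

assert storage_cost(300) == 900   # 3 copies at $300/drive
assert storage_cost(150) == 450   # price halves -> cost halves, not 1/8th
assert storage_cost(300, providers=3) == 2700  # the "9x" case upthread
```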

> This article is basically advocating RAID 5 across many storage providers.

That is correct. And to be more precise, I'm advocating RAID 5 across storage providers as a service, so people who just want to store data don't have to manage anything.

If this were built on top of (or in conjunction with) Openstack swift (http://swift.openstack.org), it could be done with a common storage backend. There are already several public cloud storage providers using swift (Rackspace, HP, Softlayer, Internap, KT, and others), and a growing number of private deployments as well (Wikipedia probably being the most recognizable name).

If you'd like to talk more about this, send me an email (address in profile).

Aren't most Swift providers already using 2x-3x duplication? Putting parity on top of that would be like RAID15 (the reason you've never heard of RAID15 is that it would be crazy expensive).

Yes, swift provides multiple redundant copies of the data. It depends on what you are protecting against: failure of a piece of your data or failure of a hosting provider.

If you simply rely on "dumb disks" spread across multiple providers to provide availability, then you may be interested in looking at the nova-volumes part of openstack (it provides block storage attached to nova VMs). As part of openstack, it's an open system that is seeing rapid adoption.

However, one of the most-requested features for swift is to provide support for logical clusters that span a wide geographic area. This could potentially allow multiple providers to collaborate in providing a multi-provider storage system. However, I'd guess that the technical problems are much simpler than the business problems in setting up multi-provider clusters.

>> This article is basically advocating RAID 5 across many storage providers.

> That is correct.

The diagram and the text description in the article describe RAID 1+0, aka RAID 10.

No, it's RAID 5. Well, specifically, it's RAID 6 but customizable. You can lose any k drives.

Wouldn't the next issue become bandwidth? Sure, you have cheap storage, but you still need to manipulate the data somehow. I know S3 only charges for outbound data. What about the other companies? Is bandwidth free and plenty?

This is a really interesting question.

At scale, this could be managed, I think, through a combination of (a) shipping hard drives around, (b) caching, and (c) peering between storage providers. For example, shipping hard drives around would be expedient if you wanted to switch out a raw storage provider. The optimal strategy also depends on the access patterns and latency requirements.

It seems solvable, but not trivial.

A big cost of running a redundant data storage service is data transfer.

To store two replicas of each piece of data, you must receive the data at one replica, transmit it to the other replica, and receive it at that replica. The data goes in at one server, then back out, and then in at the other server. To store 1 GB of data, you must pay for 3 GB of data transfer. Data transfer is expensive.

Amazon works around this problem by building data centers in clusters, interconnected with low-cost connections. When you upload to S3, your data goes over the Internet only once.
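The transfer accounting described above generalizes: with r replicas and no free internal network, storing one object costs one inbound transfer at the first replica plus one outbound and one inbound transfer for each additional replica (a sketch of that arithmetic, not anyone's actual billing model):

```python
def transfer_gb(object_gb: float, replicas: int) -> float:
    """GB of paid data transfer to place `replicas` copies of an object,
    assuming every replica-to-replica hop crosses a billed network link."""
    # first replica: receive once; each extra replica: one send + one receive
    return object_gb * (1 + 2 * (replicas - 1))

assert transfer_gb(1, 2) == 3  # the 3 GB-for-1 GB example above
assert transfer_gb(1, 3) == 5
```

With a low-cost interconnect between data centers (Amazon's approach), the internal hops fall out of the billed total and only the first inbound gigabyte crosses the public Internet.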

So basically https://nimbus.io/ on rented hardware.

> Amazon S3 has high margins today. ...

> ... despite the fact that hard drive costs fall 50% per year.

Citations for both statements please.

Even if both are true, it may be the case that hard drives are not the primary cost of running a large cloud storage service.

I'm not sure I completely understand. Are you recommending people shop around? Do you want someone to develop a service to shop around for storage? Or, do you simply want a cheaper competitor?

Thanks for the comment. I just edited the post to try to make it clearer.

I'm proposing that anyone could start a company, Foo Inc, who would sell redundant storage and compete with S3. Instead of operating your own hard drives, you rent hard drives connected to the Internet from a variety of providers. Of course your customer would know that you were doing this, and advanced customers could even choose their own blend of raw storage providers to optimize for different things.

Towards the end of the post I mention briefly that instead of a startup (Foo Inc), this ecosystem could be set up in a decentralized way (think Bitcoin vs. central banking), though that is far less realistic.

It's true that cost/GB has fallen, but cost/IOPS hasn't followed suit. If your I/O maxes out when the disk is 10% full, you can't really do much with the other 90%.

Break the file into 3 equal parts and compute a parity file, then store the four pieces separately, not necessarily with the same cloud provider. Any 3 of the 4 pieces recover the file, so you store 4/3 of the data instead of a full second copy. Result: you pay 1/3 less for storage. So if Dropbox is so clever, why aren't they doing this for their customers? XOR'ing/Reed-Solomon has been used this way since Usenet.

So... let me get this straight. You came up with an idea that requires building a strong brand, which takes a lot of money, and that Amazon can squelch the second there's any hint of success. And this made the HackerNews front page. What?

Or building an open-source tool that individual users could use to manage raw storage in an S3 fashion. It's an interesting idea.

Or one that many little startups everywhere could use to compete locally with Amazon in the SMB space.
