

Ask HN: riak/luwak vs openstack swift for private S3 clone? - shedam

Hello,
what is the best choice for a private cloud storage between openstack's swift and a riak/luwak solution?
The purpose of the storage is long term archiving of files (1MB - 500MB), what would you choose and why?
======
rzezeski
I know nothing about Swift so I'll keep my comments strictly related to
Luwak.

Luwak performs really well as a large object store...up to a point. I've
personally seen latencies of < 4s when writing/reading a 700+MB CSV across 3
physical nodes (keep in mind, in the case of write, that doesn't mean all the
data was strongly consistent yet...it's all async). Luwak has a really cool
feature that also acts as a double edged sword -- it's a persistent data
structure [1]. When you insert data it's chunked into blocks which are then
keyed by their hash, i.e. a Merkle Tree [2]. If 2 (or more) blocks of data are
the same only one instance will ever be created which acts as a general form
of compression. The flip side of this is that you can't just willy nilly
delete a file. You must perform garbage collection (via reference counting).
Currently this is not implemented in the main line but I have a branch with a
prototype implementation that uses Map-Reduce under the covers [3]. It scaled
for me up to about 20+ GB of data and then I started hitting timeouts. I had
plans to take this further but went a different direction for my purposes
(which wouldn't relate to your problem at all so I'm not going to bother
stating them).

[1]: <http://en.wikipedia.org/wiki/Persistent_data_structure>

[2]: <http://en.wikipedia.org/wiki/Hash_tree>

[3]: <https://github.com/rzezeski/luwak/tree/delete2-1.0>
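To make the dedup/GC idea concrete, here's a toy sketch (this is an illustration, not Luwak's actual code; the block size and names are made up): blocks are keyed by their content hash so identical data is only stored once, and deleting a file only reclaims a block once its reference count drops to zero.

```python
import hashlib

BLOCK_SIZE = 4  # tiny, for demonstration only


class BlockStore:
    def __init__(self):
        self.blocks = {}     # hash -> bytes
        self.refcounts = {}  # hash -> int
        self.files = {}      # name -> ordered list of block hashes

    def put(self, name, data):
        keys = []
        for i in range(0, len(data), BLOCK_SIZE):
            chunk = data[i:i + BLOCK_SIZE]
            key = hashlib.sha1(chunk).hexdigest()
            if key not in self.blocks:  # dedup: identical blocks stored once
                self.blocks[key] = chunk
            self.refcounts[key] = self.refcounts.get(key, 0) + 1
            keys.append(key)
        self.files[name] = keys

    def get(self, name):
        return b"".join(self.blocks[k] for k in self.files[name])

    def delete(self, name):
        # "GC via reference counting": a block is dropped only when no
        # file references it any more.
        for key in self.files.pop(name):
            self.refcounts[key] -= 1
            if self.refcounts[key] == 0:
                del self.blocks[key]
                del self.refcounts[key]


store = BlockStore()
store.put("a", b"AAAABBBB")
store.put("b", b"AAAACCCC")  # shares the "AAAA" block with file "a"
print(len(store.blocks))     # 4 logical blocks, but only 3 stored
store.delete("a")            # shared block survives because "b" still uses it
print(store.get("b"))
```

This is also why you can't just delete a file directly: without the refcounts you'd have no way of knowing whether a block is shared with another file.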

------
jarnold
(I'm going to have some bias here as I've done Swift deployments as well.)

One of the advantages of using something like Swift for long-term storage is
the simplicity of how data is stored on disk. Data isn't chunked up and
distributed throughout a cluster; each replica is whole and on disk. And, as
notmyname mentions, there are auditors that continuously run to check for
bitrot and to ensure replicas are in place. The data is extremely durable in a
Swift cluster if you deploy and configure everything right.
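The auditor idea is simple to sketch (an illustration of the concept, not Swift's implementation): keep a checksum alongside each replica, then periodically re-read the data and compare, flagging any mismatch as bitrot.

```python
import hashlib
import os
import tempfile


def write_object(path, data):
    # Store the object together with a checksum of its contents.
    with open(path, "wb") as f:
        f.write(data)
    with open(path + ".md5", "w") as f:
        f.write(hashlib.md5(data).hexdigest())


def audit(path):
    # Re-read the object and compare against the stored checksum.
    with open(path, "rb") as f:
        actual = hashlib.md5(f.read()).hexdigest()
    with open(path + ".md5") as f:
        expected = f.read()
    return actual == expected


d = tempfile.mkdtemp()
obj = os.path.join(d, "obj")
write_object(obj, b"archived file contents")
print(audit(obj))              # intact replica passes the audit

with open(obj, "r+b") as f:    # simulate a flipped bit on disk
    f.seek(0)
    f.write(b"X")
print(audit(obj))              # the auditor now detects the corruption
```

When a real auditor finds a bad replica it would also trigger re-replication from a good copy, which is the part that makes the durability story work.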

I've done most of my price modeling for petabyte-scale deployments, but it
actually has good scaling-down properties as well. If you're concerned about
power, 2.5" drives may be an option. Although I would avoid 'green' drives as
Swift tends to keep the drives quite active (replica and integrity checks) and
you won't see much benefit.

------
notmyname
I'm going to lean towards swift, but then again, I'm a core dev for swift and
I haven't played with riak/luwak.

Swift is ideal for storing static content (backups, web resources, documents).
It's designed to be very scalable (both with concurrency and with total
storage space). And it's designed to work with commodity hardware (read:
cheap), so it handles failures well (ensuring that data is replicated and safe
from bit rot). Also, swift has basic S3 compatibility support.

Please let me know if you have other, more specific questions, and I'll try to
answer as best I can.

~~~
shedam
Thank you very much for your answer. I have several questions: can we mix
nodes with different storage capacities (for example, when we add new nodes
later)? Is this kind of storage really cheaper than a NAS with RAID at "small"
capacities like 40 or 60TB of total space? OK, the hardware is cheaper, but
what about other costs like power consumption and cooling?

~~~
notmyname
Power consumption and cooling will, of course, be very dependent on your
chosen hardware and hosting (DC).

Yes, swift can handle heterogeneous drive sizes. For example, you can start
with 2T drives and add 3T drives when they become more cost effective. Your
initial 40-60TB is well within reason for a swift cluster. I know of several
running clusters at much larger scale than that (PBs of data, billions of
objects).
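The way mixed drive sizes work out can be sketched with a toy weighted ring (hypothetical 2T/3T device weights; swift's real ring builder is considerably more careful about balance and failure domains): each device gets placement slots in proportion to its capacity, so bigger drives naturally absorb more objects.

```python
import collections
import hashlib

# Hypothetical devices, weighted by capacity in TB.
devices = {"2T-drive": 2, "3T-drive": 3}
SLOTS = 1000

# Each device claims ring slots proportional to its weight.
ring = []
for dev, weight in devices.items():
    ring += [dev] * (SLOTS * weight // sum(devices.values()))


def place(obj_name):
    # Hash the object name onto a ring slot to pick its device.
    h = int(hashlib.md5(obj_name.encode()).hexdigest(), 16)
    return ring[h % len(ring)]


counts = collections.Counter(place("obj-%d" % i) for i in range(10000))
print(counts)  # the 3T drive ends up holding roughly 3/5 of the objects
```

Adding a bigger drive later just means giving it a proportionally larger weight and rebalancing; no node needs to match the capacity of the ones already in the cluster.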

Cost considerations are also highly dependent on your particular hardware
choice. We recommend that you optimize your hardware for price per GB rather
than for IOps or RAM or CPU. As you can imagine, there are nuances to all of
this, too.

In case you haven't seen it yet, all the auto-generated docs for swift are at
<http://swift.openstack.org> and much help can be found in #openstack on
freenode IRC.

I'd recommend that you first look at the swift all-in-one docs
(<http://swift.openstack.org/development_saio.html>) for running an entire
cluster in a single VM. It should give you a good feel of the different parts
of the system.

