Riak: A decentralized key-value store (basho.com)
33 points by mace on Aug 7, 2009 | 12 comments



It looks like a nice distributed datastore. I can't tell how fast it is (my guess is 'pretty slow right now'), but it has all the right scaling properties.

It works like a hashtable where the keys are distributed onto nodes based on their hash values. Each node takes a subset of the keyspace, and this subset can be dynamically reconfigured (so you can add nodes later and not have to move everything). You can also replicate each key onto several different nodes for fault-tolerance.
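A minimal sketch of that idea, assuming a consistent-hash ring with virtual nodes. This illustrates the general technique, not Riak's actual code; `HashRing`, `vnodes`, and `preference_list` are names made up for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # Map a string onto a point on the ring (a large integer).
    return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring; not Riak's implementation."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes  # ring positions claimed per node (assumed value)
        self._points = []     # sorted ring positions
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # A new node claims `vnodes` positions; only keys that now fall
        # just before one of those positions change owner.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def preference_list(self, key, n=1):
        # Walk clockwise from the key's position, collecting n distinct nodes.
        # The first is the primary owner; the rest hold replicas.
        if not self._points:
            return []
        start = bisect.bisect(self._points, _hash(key))
        result = []
        for offset in range(len(self._points)):
            point = self._points[(start + offset) % len(self._points)]
            node = self._owner[point]
            if node not in result:
                result.append(node)
                if len(result) == n:
                    break
        return result
```

With a ring like `HashRing(["a", "b", "c"])`, calling `add_node("d")` only reassigns keys that land on the new node's positions; everything else stays where it was. `preference_list(key, n=3)` gives the nodes that would hold the three replicas.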

It doesn't attempt to address transactions; instead, when people make branching updates, it keeps all the branches. (Think of how Git works -- it handles fast-forwards automatically.) You have to merge divergent branches yourself -- when you do a get and there are multiple branches, you get all the heads.
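Here is a rough sketch of the "keep all the heads" behaviour, assuming vector clocks are used to tell a fast-forward from a real branch. `SiblingStore`, `descends`, and `merge_clocks` are hypothetical names for illustration, not an API from the Riak docs, and the sketch ignores corner cases such as one actor writing twice from a stale read.

```python
def descends(a, b):
    """True if clock `a` has seen every update recorded in clock `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def merge_clocks(clocks):
    """Combine several clocks by taking the per-actor maximum."""
    merged = {}
    for clock in clocks:
        for actor, count in clock.items():
            merged[actor] = max(merged.get(actor, 0), count)
    return merged


class SiblingStore:
    """Hypothetical in-memory store that keeps every unmerged head."""

    def __init__(self):
        self._data = {}  # key -> list of (clock, value) heads

    def get(self, key):
        # Returns all current heads; more than one means a merge is needed.
        return list(self._data.get(key, []))

    def put(self, key, value, actor, context=None):
        # `context` is the clock the writer saw when it last read the key.
        clock = dict(context or {})
        clock[actor] = clock.get(actor, 0) + 1
        # Heads the new write descends from are superseded (fast-forward);
        # heads it has not seen stay around as siblings.
        heads = [(c, v) for c, v in self._data.get(key, [])
                 if not descends(clock, c)]
        heads.append((clock, value))
        self._data[key] = heads
        return clock
```

A writer resolves a conflict by reading all heads, merging the values however the application sees fit, and writing back with `merge_clocks` of the heads' clocks as the context, which collapses the siblings into one head again.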

It's based on Erlang and has a pluggable backend storage system, so you don't have to deal with ETS if you don't want to. (Hooray)

It has a builtin mapreduce framework. The docs suggest that it does as much of the work as possible for a given set of keys on the node which contains those keys, minimizing transfer costs. That's a very nice property.
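To make the locality point concrete, here is a toy sketch, assuming each node can run the map step and a first reduce pass over its own keys so that only small partial results cross the network. `local_mapreduce`, `nodes`, and `owner_of` are invented for the example and do not reflect Riak's actual mapreduce interface; the reduce function is assumed to be safe to apply to its own outputs (like a sum).

```python
from collections import defaultdict


def local_mapreduce(nodes, owner_of, keys, map_fn, reduce_fn):
    """Toy node-local mapreduce (hypothetical, single process).

    nodes:     node name -> dict of key -> value held by that node
    owner_of:  function mapping a key to the node name that holds it
    map_fn:    called as map_fn(key, value) on the owning node
    reduce_fn: takes a list of results; must be re-applicable to its
               own outputs, since it runs per node and then once more
    """
    # Group the requested keys by the node that stores them.
    by_node = defaultdict(list)
    for key in keys:
        by_node[owner_of(key)].append(key)

    # Each node maps and pre-reduces its own keys; only one partial
    # result per node would need to be sent to the coordinator.
    partials = []
    for node, node_keys in by_node.items():
        local = nodes[node]
        mapped = [map_fn(k, local[k]) for k in node_keys]
        partials.append(reduce_fn(mapped))

    # Final reduce over the per-node partial results.
    return reduce_fn(partials)
```

For a word count, `map_fn` could return the number of words in each value and `reduce_fn` could be `sum`; the coordinator then adds up one number per node instead of pulling every document over the wire.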

I'm sort of excited about this project. I still want a non-relational distributed database that can do fast range queries over arbitrary properties -- I hate to be iterating over millions of items when I just want the latest 10. Give me the tools to define those indices over my data, and I'll be a happy man. (CouchDB comes the closest for me so far...)


I still want a non-relational distributed database that can do fast range queries over arbitrary properties -- I hate to be iterating over millions of items when I just want the latest 10.

MongoDB does exactly that. But it's not nearly as good on the decentralization front as Riak.

A crossbreed between Riak and Mongo - that would be the perfect feature set.


Have you looked at http://neo4j.org/ ?


Given that a graph database does not meet any of the design criteria listed, why exactly would one consider neo4j as an option? It is great for what it does, but completely unsuited to the task described by the parent comment.

A few features of Riak appear somewhat related to features of neo4j (the ancestor/child/sibling links, which are only lightly explained in the available docs and do not seem well thought out, judging by the documentation available at the moment), but if you want real graph operations you would not use something like Riak...


Well, I was responding to lincolnq's comment, not Riak itself:

"I'm sort of excited about this project. I still want a non-relational distributed database that can do fast range queries over arbitrary properties"

I thought neo4j's range queries and Lucene indexer might be of interest, or at least worth discussing. It's certainly ideal for those requirements.


This is actually quite helpful, thanks. I did look at neo4j, but I got as far as noting "this doesn't have queries" before giving up. So thanks for pointing out that Lucene might help with the querying part.


Good point. For some reason I keep forgetting the combination of those two items within neo4j.


Man, so many to choose from: MongoDB, Riak, and HBase. Anyone know of comparisons?


http://www.metabrew.com/article/anti-rdbms-a-list-of-distrib...

http://themindstorms.blogspot.com/2009/05/quick-reference-to...

The comparisons in those posts will probably be outdated soon, with so many new NoSQL data stores appearing.


Have a look at Tokyo Cabinet/Tyrant too.


I'd like some as well. Add Project Voldemort to that list too (used and developed by LinkedIn).


As well as Redis, Cassandra, CouchDB, and more. It's a confusing space :P



