

My Year of Riak - inaka
http://inakanetworks.com/blog/2011/08/25/when-to-use-riak/

======
rarrrrrr
The killer feature I'm looking for is Riak's "it just works" -- especially in
the case of nodes failing, soft failing, going offline, timing out, whatever.

In my situation, I don't care about the performance at all, because I don't
have many keys at any given moment. The few that I have matter greatly.

I care that when I store a key, it's reliably, durably, stored and replicated,
and that when nodes fail I don't have to do anything special to keep running.
(This is in contrast to PostgreSQL, MySQL, or Mongo replication, where you
have to fail over, then switch back eventually, and it takes special effort.)
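
A minimal sketch of why quorum-style replication keeps working through a node failure without any failover step. It assumes N=3 replicas and a write quorum of 2; the `Node`, `quorum_put`, and `quorum_get` names are hypothetical and are not Riak's API. A real store also handles versioning, hinted handoff, and read repair, which this toy ignores.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory replica; not a Riak data structure."""
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)


def quorum_put(replicas, key, value, w):
    """Write to every reachable replica; report success if at least w acknowledged."""
    acks = 0
    for node in replicas:
        if node.up:
            node.data[key] = value
            acks += 1
    return acks >= w


def quorum_get(replicas, key, r):
    """Read from reachable replicas; fail if fewer than r respond."""
    replies = [node.data.get(key) for node in replicas if node.up]
    if len(replies) < r:
        raise RuntimeError("not enough replicas reachable")
    # Toy conflict resolution: take the first reply that has the key.
    return next((v for v in replies if v is not None), None)


nodes = [Node("a"), Node("b"), Node("c")]  # N = 3 replicas
nodes[2].up = False                        # node c fails
ok = quorum_put(nodes, "k", "v1", w=2)     # a and b acknowledge, so the write succeeds
nodes[2].up = True                         # c comes back without the key
nodes[0].up = False                        # now a fails instead
value = quorum_get(nodes, "k", r=2)        # b still holds the value
print(ok, value)
```

The client never switches to a standby; it just needs enough replicas to answer.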

AFAICT, it's not provided by Redis or CouchDB either, because their
replication is async -- keys can get lost.

Having looked at a bunch of options in the last couple weeks, it seems like
only Riak and Cassandra truly offer durable, synced replication that isn't
difficult to admin. (...and of the two of them, Riak's documentation gives
much more confidence about the ongoing admin efforts.)

Has anyone used any solid options I've perhaps overlooked?

~~~
btilly
Be warned, shortly after nodes join/leave, Riak has a real possibility of
rearranging which nodes will respond to requests for which keys without making
sure that those nodes actually have those keys. The result is that key/value
pairs can become inaccessible for some time until data migrates under the
hood.

This is unlikely to be a problem in practice. But it is a possibility to be
aware of.
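
The window described above can be sketched with a toy ring. This is an illustration only: it assumes a single replica per key, 8 partitions, and a CRC32 hash, none of which match Riak's real ring, and every name here is hypothetical.

```python
import zlib

PARTITIONS = 8  # assumption: tiny ring for illustration


def partition(key):
    """Map a key to a partition with a deterministic hash (stand-in for the real one)."""
    return zlib.crc32(key.encode("utf-8")) % PARTITIONS


ring = ["n1"] * PARTITIONS        # partition index -> owning node
stores = {"n1": {}, "n2": {}}     # per-node key/value data


def put(key, value):
    stores[ring[partition(key)]][key] = value


def get(key):
    return stores[ring[partition(key)]].get(key)


keys = [f"user:{i}" for i in range(20)]
for k in keys:
    put(k, "profile")

# n2 joins and takes ownership of the odd partitions before any data has moved.
for p in range(1, PARTITIONS, 2):
    ring[p] = "n2"
missing_before = [k for k in keys if get(k) is None]

# Handoff: n1 ships the keys it no longer owns to n2.
for k in list(stores["n1"]):
    if ring[partition(k)] == "n2":
        stores["n2"][k] = stores["n1"].pop(k)
missing_after = [k for k in keys if get(k) is None]

print(len(missing_before), len(missing_after))
```

Any key whose partition changed owner is unreadable between the ownership change and the end of the handoff loop.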

~~~
bretthoerner
This changed a lot recently in master. I believe they wait until the new node
has all of the data before sending it requests. I'm sure it will be part of
1.0.

~~~
pharkmillups
This (and a few other less-than-elegant operations) were all handled as part
of this:

<https://github.com/basho/riak_core/commit/c4b80137998359f0db6eec769c0295db70e61739>

------
devongall
Definitely agree on the expense of a list-keys operation. Be sure to avoid it
at all costs.

Some of the Riak documentation was incomplete/incorrect, which made
implementation a little sticky, but the mailing list is extremely responsive
and helpful.

Otherwise, have had a great experience with Riak thus far. Looking forward to
the ease of scaling as well!

~~~
willbmoss
It's worth noting that not only is the list keys operation expensive, but
since it uses Bloom filters, it's not guaranteed to return all keys.

My sources at Basho tell me that this is fixed in 1.0, but until that's
officially released, basically don't try to list keys.

~~~
seancribbs
Yes, the problem was not necessarily using a Bloom filter, but that it was too
small. However, 1.0 is smarter about which vnodes it sends list-keys requests
to and thus obviates the need for the Bloom filter (at least for that
operation).
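
To illustrate how an undersized Bloom filter can drop keys, here is a rough sketch. It assumes the filter is used to skip keys already seen from another replica, which is a guess at the mechanism and not taken from Riak's source; `BloomFilter` and `list_keys` are hypothetical names.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter backed by a Python int used as a bit array."""

    def __init__(self, bits, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.field = 0

    def _positions(self, item):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, item):
        for pos in self._positions(item):
            self.field |= 1 << pos

    def __contains__(self, item):
        return all((self.field >> pos) & 1 for pos in self._positions(item))


def list_keys(replica_streams, bloom_bits):
    """Merge key streams from replicas, skipping keys the filter claims to have seen."""
    seen = BloomFilter(bloom_bits)
    result = []
    for stream in replica_streams:
        for key in stream:
            if key in seen:   # may be a false positive: a new key is silently dropped
                continue
            seen.add(key)
            result.append(key)
    return result


keys = [f"key-{i}" for i in range(1000)]
streams = [keys, keys, keys]                     # three replicas holding the same keys
small = list_keys(streams, bloom_bits=64)        # filter far too small for 1000 keys
large = list_keys(streams, bloom_bits=1 << 20)   # generously sized filter
print(len(small), len(large))
```

With 64 bits the filter saturates quickly, so at most 64 keys can ever be accepted; the remaining keys look like duplicates and never appear in the listing.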

------
metabrew
I'd love to use riak for all the reasons mentioned in this article, and more.

The single missing 'feature' (design decision) that I can't live without is
that you can't efficiently do range queries/order-by on the key in riak today.

Hopefully this will get easier with secondary indexes / riak-search
integration. Not clear yet.

~~~
samg_
It's so important to evaluate whether you need range queries before picking a
tech like Riak.

They are coming, though. That's what I hear, at least.

~~~
rzezeski
If we are talking about performing range queries on an index then Riak already
has it in the form of Riak Search. In 1.0 this is also supported by secondary
indices.

If we are talking about performing a range operation on the primary key which
returns the matching *objects*, then no, Riak doesn't currently offer that.
However, given its support for an ordered data store such as leveldb in 1.0
it should only be a matter of time before that is possible.

Just to try it out I already implemented this for fun on my fork.

<https://github.com/rzezeski/riak_kv/tree/native-range>

<https://github.com/rzezeski/riak-erlang-client/tree/native-range>
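
A rough sketch of why an ordered backend matters for primary-key range queries. It is not the code in the branches linked above; `OrderedStore` and `range_scan_unordered` are hypothetical names, and a sorted Python list stands in for an ordered store such as leveldb.

```python
import bisect


def range_scan_unordered(store, start, end):
    """Unordered backend: every key has to be examined to answer a range query."""
    return sorted((k, v) for k, v in store.items() if start <= k <= end)


class OrderedStore:
    """Keys are kept sorted, so a range is two binary searches plus a slice."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def range(self, start, end):
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


ordered = OrderedStore()
unordered = {}
for day in reversed(range(1, 32)):        # insert out of order on purpose
    key = f"2011-08-{day:02d}"
    ordered.put(key, day)
    unordered[key] = day

hits = ordered.range("2011-08-24", "2011-08-26")
print(hits)
```

Both approaches return the same objects; the difference is that the unordered scan touches all 31 keys while the ordered one only touches the matching slice.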

~~~
metabrew
This is great. I hope mainline riak gets support like this soon, for backends
that support it.

------
dsl
I love Riak. The one minor complaint that I have is that having a small
development/test cluster is pretty painful. If you have <N nodes, you end up
with duplicated data in memory and on disk. Sucks for developers who would
like a local instance to test with on their virtualized dev boxes.
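
A simplified sketch of where the duplication comes from. It assumes a ring of 64 partitions handed out round-robin and a preference list of N consecutive partitions; Riak's real claim algorithm is more involved, and the function names are hypothetical.

```python
from collections import Counter


def build_ring(nodes, partitions=64):
    """Assign partitions to physical nodes round-robin (simplifying assumption)."""
    return [nodes[i % len(nodes)] for i in range(partitions)]


def preference_list(ring, first_partition, n):
    """The n consecutive partitions that hold the replicas of one key."""
    return [ring[(first_partition + i) % len(ring)] for i in range(n)]


def copies_per_node(nodes, n, first_partition=0):
    """Count how many replicas of a single key land on each physical node."""
    ring = build_ring(nodes)
    return Counter(preference_list(ring, first_partition, n))


print(copies_per_node(["dev1"], n=3))                    # all three copies on one box
print(copies_per_node(["dev1", "dev2", "dev3"], n=3))    # one copy per node
print(copies_per_node(["dev1"], n=1))                    # a single copy for local testing
```

With fewer physical nodes than N, several partitions in the preference list belong to the same machine, so that machine stores the same object more than once.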

~~~
siculars
I change n_val to 1 in the default_bucket_props[0] when I just want a quick
instance to test against.

[0] <http://wiki.basho.com/Configuration-Files.html#app.config>

