AOL, Meet Riak

foobarbazetc · on March 22, 2011

Serious question: how do you guys get around the slowness of Riak? I've tried deploying it to like 5-6 nodes, and I max out at like 5000 GET/s with multiple clients/threads/whatever for a single key on giant ass machines with plenty of CPU and RAM.

rzezeski · on March 22, 2011

You are performing a GET for the same key each time?

Don't quote me on this, but I took a quick glance at the code and it looks like all gets are [currently] serialized at the vnode level. Since your key will always hash to the same vnodes (increasing the machine count doesn't help), requests will pile up in those vnodes' mailboxes.

I also see an asynchronous get defined in the vnode, but I don't see any way to use it.
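
To make the hashing point concrete, here's a rough Python sketch of the idea. It is not Riak's actual code: the partition count, the replica count, the bucket/key encoding, and the round-robin `owner` function are all assumptions picked for illustration. What it shows is that the partitions serving a key depend only on the key's hash, so adding machines moves partitions around but still leaves the same few vnodes handling every request for that key.

```python
import hashlib

RING_SIZE = 2 ** 160     # size of the SHA-1 output space
NUM_PARTITIONS = 64      # assumed number of partitions (vnodes), fixed up front
N_VAL = 3                # assumed number of replicas per key


def preference_list(bucket, key, num_partitions=NUM_PARTITIONS, n_val=N_VAL):
    """Return the partition indexes that would serve this bucket/key."""
    # Hash the bucket/key pair to a position on the ring (encoding is assumed).
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    # Find the partition containing that position, then take the next
    # n_val - 1 partitions around the ring as the other replicas.
    first = position * num_partitions // RING_SIZE
    return [(first + i) % num_partitions for i in range(n_val)]


def owner(partition, num_nodes):
    """Hypothetical round-robin mapping from a partition to a machine."""
    return partition % num_nodes


if __name__ == "__main__":
    partitions = preference_list("forms", "hot-key")
    for num_nodes in (5, 6, 12):
        machines = [owner(p, num_nodes) for p in partitions]
        # The partitions never change; only the machines hosting them do.
        print(num_nodes, "nodes:", partitions, "on machines", machines)
```

With a single hot key, every GET lands on the same `N_VAL` partitions no matter how many nodes are in the cluster, which is why spreading reads across more keys helps and adding machines does not.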

That said, if this isn't an artificial use case then I'd probably throw a short-lived cache in front of Riak for situations like this. I mean, if this were a more traditional database, you wouldn't run the same query 5000 times if you know the answer isn't likely to change.
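
For what it's worth, a short-lived cache can be very small. Below is a minimal sketch in Python, assuming a hypothetical `fetch` callable that does the real Riak GET (no actual client API is shown or implied) and a one-second TTL chosen arbitrarily. It is not thread-safe and never evicts old keys, so treat it as an illustration of the idea rather than something to deploy.

```python
import time


class ShortLivedCache:
    """Cache values for a short time so repeated reads skip the backend."""

    def __init__(self, fetch, ttl_seconds=1.0, clock=time.monotonic):
        self._fetch = fetch      # hypothetical callable: key -> value
        self._ttl = ttl_seconds  # how long a cached value stays fresh
        self._clock = clock      # injectable so the expiry logic can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Still fresh: answer from memory without touching the backend.
            return entry[1]
        # Missing or expired: go to the backend once and remember the answer.
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

With something like this in each client process, 5000 GET/s for one key turns into roughly one backend read per TTL window per process, at the cost of serving data that may be up to `ttl_seconds` stale.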

foobarbazetc · on March 23, 2011

Thanks -- the serialization causing the bottleneck makes sense. I'll try with more keys. :)

megaman821 · on March 22, 2011

I haven't used Riak, but in general, if I had that kind of load I would put Varnish in front of it, or use a lighter protocol than HTTP, such as protobufs.

billybob · on March 21, 2011

We're using Riak to store forms that can have varying schemas as JSON. That and the lack of a single point of failure were two of the big draws.

siculars · on March 21, 2011

Looks like this guy built a ZooKeeper lite on top of riak_core. That looks interesting. For those who don't know, the "riak" NoSQL product is actually composed of many submodules that each handle a certain portion of the overall application. riak_core is the module that coordinates the distribution of work among the nodes in a single cluster.