It works like a hashtable where the keys are distributed onto nodes based on their hash values. Each node takes a subset of the keyspace, and this subset can be dynamically reconfigured (so you can add nodes later and not have to move everything). You can also replicate each key onto several different nodes for fault-tolerance.
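To make the keyspace idea concrete, here is a minimal consistent-hashing sketch in Python. It is not Riak's actual ring code; the `HashRing` class, the `vnodes` parameter, and the `preference_list` name are all made up for illustration. It only shows the general technique: nodes own slices of a hash ring, adding a node takes over some slices without reshuffling the rest, and each key can be placed on several distinct nodes.

```python
import bisect
import hashlib


class HashRing:
    """Hypothetical consistent-hash ring (illustrative, not Riak's implementation)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # ring points per node; assumed value
        self._ring = []        # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # Map a string onto a large integer keyspace.
        return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Each node claims several points so the keyspace is split evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def preference_list(self, key, n=3):
        """Return up to n distinct nodes for key, walking clockwise on the ring."""
        if not self._ring:
            return []
        start = bisect.bisect(self._ring, (self._hash(key), ""))
        owners = []
        for i in range(len(self._ring)):
            node = self._ring[(start + i) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == n:
                    break
        return owners
```

With this layout, calling `add_node("d")` on a three-node ring only changes the primary owner of keys that land on one of d's new points; every other key stays where it was.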
It doesn't attempt to address transactions; instead, when people make conflicting (branching) updates, it keeps all the branches. (Think how Git works -- it handles fast-forwards automatically.) You have to merge the branches yourself -- when you do a get and there are multiple branches, you get all the heads back.
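Here is a rough sketch of that branching behaviour, assuming versions are tracked with vector clocks. `SiblingStore`, `descends`, and `merge_clocks` are hypothetical names, not Riak's API. A write whose clock descends from an existing head replaces it (the fast-forward case); a concurrent write is kept alongside it, and a get returns every head.

```python
def descends(a, b):
    """True if vector clock a has seen everything recorded in b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def merge_clocks(clocks):
    """Element-wise maximum of several vector clocks."""
    merged = {}
    for clock in clocks:
        for actor, count in clock.items():
            merged[actor] = max(merged.get(actor, 0), count)
    return merged


class SiblingStore:
    """Hypothetical store that keeps all concurrent versions of a key."""

    def __init__(self):
        self._data = {}  # key -> list of (clock, value) heads

    def get(self, key):
        # Returns every head; more than one means the caller must merge.
        return list(self._data.get(key, []))

    def put(self, key, value, actor, context=None):
        # context is the clock the writer last read (None for a blind write).
        clock = dict(context or {})
        clock[actor] = clock.get(actor, 0) + 1
        # Drop heads the new version supersedes; keep concurrent ones.
        heads = [(c, v) for c, v in self._data.get(key, [])
                 if not descends(clock, c)]
        heads.append((clock, value))
        self._data[key] = heads
        return clock
```

To resolve a conflict, a client would read all the heads, combine the values however makes sense for its data, and write the result back with `merge_clocks` of the heads' clocks as the context. This sketch ignores the case of one actor writing twice from the same stale context.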
It's based on Erlang and has a pluggable backend storage system, so you don't have to deal with ETS if you don't want to. (Hooray)
It has a built-in MapReduce framework. The docs suggest that it does as much of the work as possible for a given set of keys on the node that holds those keys, minimizing transfer costs. That's a very nice property.
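The locality idea can be sketched like this. It is a toy single-process model, not Riak's MapReduce API: `local_mapreduce`, `owner_of`, and the dict-of-dicts standing in for nodes are all assumptions. The point is that map and a first reduce run per node, so only one small partial result per node has to travel to the coordinator.

```python
from collections import defaultdict
from functools import reduce


def local_mapreduce(nodes, keys, owner_of, map_fn, reduce_fn, initial):
    """Run map and a partial reduce per node, then combine the partials.

    nodes:     dict of node name -> dict of key -> value (stand-in for storage)
    owner_of:  function mapping a key to the node name that holds it
    reduce_fn: must be associative, with `initial` as its identity
    """
    # Group the requested keys by the node that stores them.
    by_node = defaultdict(list)
    for key in keys:
        by_node[owner_of(key)].append(key)

    partials = []
    for node_name, node_keys in by_node.items():
        store = nodes[node_name]
        # In a real cluster this part would execute on the node itself.
        mapped = (map_fn(key, store[key]) for key in node_keys)
        partials.append(reduce(reduce_fn, mapped, initial))

    # Only the per-node partial results are combined centrally.
    return reduce(reduce_fn, partials, initial)
```

For example, a word count would pass a `map_fn` that counts words in one value and a `reduce_fn` that adds two counts.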
I'm sort of excited about this project. I still want a non-relational distributed database that can do fast range queries over arbitrary properties -- I hate to be iterating over millions of items when I just want the latest 10. Give me the tools to define those indices over my data, and I'll be a happy man. (CouchDB comes the closest for me so far...)
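For what it's worth, the kind of index I mean looks roughly like this in miniature. `SortedIndex` is a hypothetical in-memory structure, not something any of these databases exposes: entries are kept sorted by a property value, so "latest 10" is a slice off the end rather than a scan. A real store would use something like a B-tree on disk; the sorted list here is just the simplest thing that shows the idea.

```python
import bisect


class SortedIndex:
    """Hypothetical secondary index: entries kept sorted by a property value."""

    def __init__(self):
        self._entries = []  # sorted list of (property_value, key)

    def add(self, value, key):
        # Keep the list sorted on insert (O(n) here; fine for a sketch).
        bisect.insort(self._entries, (value, key))

    def latest(self, n):
        """Keys of the n entries with the largest property values, newest first."""
        tail = self._entries[max(0, len(self._entries) - n):]
        return [key for _, key in reversed(tail)]

    def range(self, low, high):
        """Keys whose property value v satisfies low <= v < high."""
        # A 1-tuple sorts before any 2-tuple with the same first element.
        lo = bisect.bisect_left(self._entries, (low,))
        hi = bisect.bisect_left(self._entries, (high,))
        return [key for _, key in self._entries[lo:hi]]
```

Both lookups find their starting point by binary search instead of iterating over every item.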
MongoDB does exactly that. But it's not nearly as good on the decentralization front as Riak.
A crossbreed between Riak and Mongo -- that would be the perfect feature set.
A few features of Riak look somewhat related to features in Neo4j (the ancestor/child/sibling links, which are only lightly explained in the available docs and don't seem well thought out, judging by the documentation that exists at the moment), but if you want real graph operations you would not use something like Riak...
"I'm sort of excited about this project. I still want a non-relational distributed database that can do fast range queries over arbitrary properties"
I thought Neo4j's range queries and Lucene indexer might be of interest, or at least worth discussing. It's certainly ideal for those requirements.
The comparisons in those posts will probably be outdated soon, with so many new NoSQL data stores appearing.