I'd be interested in hearing some thoughts on backups. Would a regularly scheduled rsync to S3 be an adequate solution? I figure if it runs from cron every other minute, the deltas would be pretty small, and not much would be lost if a datacenter or cluster goes down.
We use Riak at Dropcam to store H.264 video streamed from cameras running 24/7, which needs to be available for playback on the website and mobile apps. We needed a block store that could scale to handle petabytes, tolerate disk and node failures, and stay available for both writes and reads at all times.
Riak fits the bill very well, though we did need to make a few mods to work around certain problems. But it's Erlang/OTP, fairly small code-wise, and easy to modify.
We use Riak as our primary datastore at Showyou, with Redis, Solr, and Memcached helping out. The big advantage of Riak is high availability and easy management, but for sorting/listing keys, range queries, and low latency we need other datastores which maintain post-commit-hook updated indices.
Yes, sorry, the wording was somewhat ambiguous. We use post-commit hooks in Riak to update external datastores, as well as callbacks within our ORM for many indexes stored in Riak itself.
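To make the pattern concrete, here is a minimal sketch of "a hook fires after each write and keeps an external sorted index up to date". It is Python, not the Erlang that real Riak post-commit hooks are written in, and `HookedStore`, `SortedIndex`, and the `created` field are all hypothetical names made up for illustration, not anything from Showyou's code or Riak's API.

```python
from bisect import insort
from typing import Callable, Dict, List, Tuple


class HookedStore:
    """Hypothetical in-memory key/value store that calls hooks after each write."""

    def __init__(self) -> None:
        self._data: Dict[str, dict] = {}
        self._hooks: List[Callable[[str, dict], None]] = []

    def add_post_commit_hook(self, hook: Callable[[str, dict], None]) -> None:
        self._hooks.append(hook)

    def put(self, key: str, value: dict) -> None:
        self._data[key] = value      # the write is stored first
        for hook in self._hooks:     # hooks only run after the write is stored
            hook(key, value)

    def get(self, key: str) -> dict:
        return self._data[key]


class SortedIndex:
    """Hypothetical external index: (field value, key) pairs kept in sorted order."""

    def __init__(self, field: str) -> None:
        self.field = field
        self._entries: List[Tuple[int, str]] = []

    def update(self, key: str, value: dict) -> None:
        # Drop any stale entry for this key, then insert the new one in order.
        self._entries = [e for e in self._entries if e[1] != key]
        insort(self._entries, (value[self.field], key))

    def range(self, low: int, high: int) -> List[str]:
        # Range query the key/value store itself cannot answer cheaply.
        return [k for v, k in self._entries if low <= v <= high]


store = HookedStore()
index = SortedIndex("created")
store.add_post_commit_hook(index.update)
store.put("video:1", {"created": 30})
store.put("video:2", {"created": 10})
store.put("video:3", {"created": 20})
print(index.range(10, 20))  # ['video:2', 'video:3']
```

The point of the split is that the primary store stays a plain key/value store, while sorting and range queries are answered by whatever the hook feeds.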
Don't know why that got downvoted, since it's a link to relevant prior discussion. But here's a more specific link (to the Riak thread in that discussion): http://news.ycombinator.com/item?id=2685053
For now, nothing. My systems use Redis as a datastore at first, then move to Riak when they outgrow Redis. So far my tiny systems haven't outgrown Redis.
When I have time, I'll add distributed counter support to Riak and take over the world. (Let's count everything. In real time with massive concurrency, full historical resolution, and no single points of failure.)
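For anyone wondering what a counter with no single point of failure can look like, one well-known approach is a grow-only counter (G-Counter) CRDT: each node only increments its own slot, and replicas merge by taking the per-node maximum. The sketch below is just that textbook technique in Python. It is an assumption about one possible design, not necessarily what the commenter has in mind and not a Riak API; `GCounter` and the node names are hypothetical.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: one slot per node, merged by per-node maximum."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        # A node only ever touches its own slot, so concurrent writers never conflict.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Taking the max per node makes merging commutative, associative, and idempotent.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


a = GCounter("node-a")
b = GCounter("node-b")
a.increment(3)      # concurrent updates on two replicas
b.increment(2)
a.merge(b)          # replicas exchange state in either order
b.merge(a)
a.merge(b)          # merging again changes nothing
print(a.value(), b.value())  # 5 5
```

Full historical resolution, as mentioned above, would need more than this (for example one such counter per time bucket), which the sketch does not attempt.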
- PostgreSQL for the user database, since it maps nicely to Django and SQLAlchemy models
- Redis for caches and non-persistent data
- Riak for everything else
We love the fact that it is easy to scale, that Bitcask is safe (it writes to disk :-p) and easy to back up (rsync or cp is enough).
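As a rough illustration of the "cp is enough" style of backup, here is a small Python sketch that copies a data directory into a timestamped snapshot directory. The function name and both paths are hypothetical, where the Bitcask data directory actually lives depends on your Riak configuration, and the sketch assumes that a plain file copy of the directory is an acceptable backup, as the comment above suggests.

```python
import shutil
import time
from pathlib import Path


def snapshot_dir(data_dir: Path, backup_root: Path) -> Path:
    """Copy data_dir into a new timestamped directory under backup_root."""
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = backup_root / f"{data_dir.name}-{stamp}"
    # copytree creates the target (and missing parents) and raises
    # FileExistsError if a snapshot with the same timestamp already exists.
    shutil.copytree(data_dir, target)
    return target
```

Calling something like `snapshot_dir(Path("/path/to/bitcask"), Path("/path/to/backups"))` from cron on each node would give one directory per run. Unlike rsync, this copies everything every time, so it only suits small data sets.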