
I've decided to use Riak as well. Does anyone have examples of how they used it with their data model?

For example, this article says "With appropriate logic (set unions, timestamps, etc) it is easy to resolve these conflicts". However, timestamps are not an adequate way to do this, because events in a distributed system are only partially ordered. magicd may be serialising all requests to Riak to mitigate this (essentially using magicd's own clock as the time reference), in which case they're losing out on the distributed nature of Riak (magicd becomes a single point of failure and a bottleneck).

Insight into how others have approached this would be awesome.

There are several ways to approach this. The simplest is to just take last-write-wins, which is the only option some distributed databases give you. For cases where this isn't ideal, you can resolve write conflicts in a couple of ways.

One way is to write domain-specific logic that knows how to resolve your values. For example, your models might have some states that can only happen after another state, so conflicts of this nature resolve to the 'later' state.
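A minimal sketch of that idea, assuming a made-up order lifecycle (the `STATE_ORDER` list, the `Order` struct, and `resolve_by_state` are all hypothetical names, not anything from Riak or magicd):

```ruby
# Hypothetical lifecycle: each state can only happen after the ones before it.
STATE_ORDER = %i[created paid shipped delivered].freeze

# Hypothetical sibling value: a plain struct standing in for a decoded object.
Order = Struct.new(:id, :state)

# Resolve conflicting siblings by keeping the one furthest along the lifecycle.
def resolve_by_state(siblings)
  siblings.max_by { |s| STATE_ORDER.index(s.state) }
end

siblings = [Order.new(1, :paid), Order.new(1, :shipped), Order.new(1, :created)]
puts resolve_by_state(siblings).state  # prints "shipped"
```

No clocks are involved here: the winner depends only on the values themselves, so every client resolves the same set of siblings to the same result. A state missing from `STATE_ORDER` would need separate handling.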

Another approach is to use data-structures or a library designed for this, like CRDTs. Some resources below:

A comprehensive study of Convergent and Commutative Replicated Data Types http://hal.archives-ouvertes.fr/inria-00555588/

https://github.com/reiddraper/knockbox
https://github.com/aphyr/meangirls
https://github.com/ericmoritz/crdt
https://github.com/mochi/statebox
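To give a feel for what these data types do, here is a rough sketch of the simplest one, a grow-only set, where merging siblings is just set union. The `merge_gset` helper is a hypothetical name for illustration and is not taken from any of the libraries above:

```ruby
require 'set'

# Grow-only set (G-Set): elements are only ever added, and merge is set union.
# Union is commutative, associative, and idempotent, so replicas converge
# regardless of the order in which siblings are merged.
def merge_gset(siblings)
  siblings.reduce(Set.new) { |acc, s| acc | s }
end

a = Set[:alice, :bob]
b = Set[:bob, :carol]

merged = merge_gset([a, b])
puts merged.to_a.sort.inspect            # prints "[:alice, :bob, :carol]"
puts merged == merge_gset([b, a])        # prints "true"
puts merged == merge_gset([a, b, a, b])  # prints "true"
```

The trade-off is that a plain G-Set cannot express removals; supporting those takes a richer structure.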

Are there any connector libs that provide "simple" last-write-wins out of the box?

Simple last-write-wins can be configured per-bucket, so client libs can be naïve: http://wiki.basho.com/HTTP-Set-Bucket-Properties.html
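Roughly, setting that property over HTTP could look like the sketch below. The host, port, bucket name (`sessions`), and the `/buckets/<bucket>/props` path are assumptions on my part, so check the linked page for the exact URL your Riak version expects. The request is only built here, not sent:

```ruby
require 'net/http'
require 'json'
require 'uri'

# Assumed values: a local Riak node on port 8098 and a bucket named "sessions".
uri = URI('http://localhost:8098/buckets/sessions/props')

# Build a PUT request that turns on last-write-wins for the bucket.
request = Net::HTTP::Put.new(uri)
request['Content-Type'] = 'application/json'
request.body = JSON.generate(props: { last_write_wins: true })

puts request.body  # prints {"props":{"last_write_wins":true}}

# Uncomment to send it to a running node:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```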

It's not hard: def merge(siblings) siblings.sort_by { |s| s.timestamp }.last end

Or, in Knockbox/Meangirls, strategies like LWW-set.

Until your clocks get out of sync.
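A toy illustration of that failure mode, with made-up numbers (the `Sibling` struct, `lww_merge`, and the 5-second skew are all hypothetical):

```ruby
Sibling = Struct.new(:value, :timestamp)

# Timestamp-based last-write-wins: keep the sibling with the largest timestamp.
def lww_merge(siblings)
  siblings.max_by(&:timestamp)
end

# Node A's clock runs 5 seconds fast; node B's clock is accurate.
# Real order of events: A writes "old" at true time 100,
# then B writes "new" at true time 102.
skew_a = 5
write_a = Sibling.new('old', 100 + skew_a)  # stamped 105
write_b = Sibling.new('new', 102)           # stamped 102

puts lww_merge([write_a, write_b]).value  # prints "old"
```

The write that actually happened later is silently discarded, because the merge can only see the timestamps and not the real order of events.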

Unless I'm missing something, I would assume they run magicd on all servers that run the application. Thus Riak's degree of redundancy is independent of magicd's degree of redundancy, since each instance of magicd can communicate with the entire Riak pool.

Yep! This is exactly how it works. Each app node runs a magicd which connects to an haproxy instance on localhost (connected to every machine in the database cluster), so when a Riak node goes down we don't miss a beat.

What's magicd? Googling did not help :-(
