

Riak 2.0.0 RC1 - cmeiklejohn
http://lists.basho.com/pipermail/riak-users_lists.basho.com/2014-July/015556.html

======
morsch
I have a hard time keeping up with all the NoSQL engines out there. What can I
do about it? I could easily evaluate, say, CouchDB in a side-project, but
that's a) not necessarily going to tell me a lot about its characteristics at
scale, and b) I can't do this even for the big NoSQL engines out there.

Is there a good overview? I realize that this is sort of an oxymoron -- a high
level overview is doomed to fail because you can't compress the complex
characteristics down to a few bulletpoints. Antirez put it this way [0]: _That
said I think that picking the good database is something you can do only with
a lot of work. Picking good technologies for your project is hard work, so
there is to try one, and another and so forth, and even reconsidering after a
few years (or months?) the state of the things again, given the evolution
speed of the DB panorama in the recent years._

That was a comment to a pretty good overview[1], which despite being 3 years
old is still useful. Apart from the purely technical characteristics, social
characteristics such as rate of updates, adoption (and by whom?), openness are
also interesting. You just "know" these things for the fields you're working
in, but they're very hard to tell from outside and rarely discussed.

[0]
[https://news.ycombinator.com/item?id=2053594](https://news.ycombinator.com/item?id=2053594)
[1] [http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis](http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis)

~~~
mateuszf
You can find a lot of good articles about the CAP characteristics of various
NoSQL products here: [http://aphyr.com/tags/Jepsen](http://aphyr.com/tags/Jepsen)

------
Pfiffer
Ooo, there's some support for CRDTs in here:

[https://github.com/basho/riak/blob/2.0/RELEASE-NOTES.md#conv...](https://github.com/basho/riak/blob/2.0/RELEASE-NOTES.md#convergent-data-types)

~~~
lomnakkus
That's great, but isn't the whole point of CRDTs that you _don't_ actually
need any special support as long as you get "access" to conflicting writes and
can resolve them at the application level? It seems that they would be trivial
to implement atop Riak 1.x, unless I'm missing something.

Don't get me wrong: Built-in support is fine and all, but it's hardly going to
save the world.

EDIT: Just to add something other than skepticism, I found this video with
Mark Shapiro quite interesting (and on-topic!)

[http://msrvideo.vo.msecnd.net/rmcvideos/153540/dl/153540.mp4](http://msrvideo.vo.msecnd.net/rmcvideos/153540/dl/153540.mp4)

~~~
tsantero
Managing the # of actors, safely dealing with tombstones, and reducing space
complexity are just a few reasons why one might want CRDTs on the server side.

~~~
lomnakkus
Can you add (or link to) something with more details?

"Managing the # of actors" doesn't make much sense to me without further
context. I know that Erlang is actor-based, but that shouldn't matter if I'm
e.g. a client of the system? Also, I'm not sure what "safely dealing with
tombstones" would mean -- what additional safety is added by "server-side"
CRDTs which cannot be achieved with client-side code (and how so?). (Etc.)

~~~
cmeiklejohn
Full disclosure: I’m a current employee of Basho Technologies.

Other than some supporting work [1] [2] introduced into 2.0 to provide
advanced causality tracking, you’re correct in assuming we could have
introduced more [3] CRDTs as part of the Riak 1.x series. We could have also
implemented all of the CRDTs we provide in the client as well, which is
similar to what the SwiftCloud CRDT reference platform does.

There are a couple of important things to note here, however:

* When talking about merging conflicting writes, we are specifically referring to state-based CRDTs (which is what we have implemented in Riak), not operation-based ones.

* Retrieving conflicting writes from the client, or siblings as we call them in Riak, requires bringing all of the siblings to the client, performing the merge operation, and shipping the updated state back. Given this, the number of siblings an object can have on disk, assuming all merge operations happen at the client, is potentially unbounded if you never ever read, and only ever write. When implemented on the server, we can ensure that we perform this merge operation during both the read and write cycle, keeping the sibling count down to one and reducing the amount of state we need to ship to the client.

* In addition, we use the coordinating node (really, a combination of virtual node and partition index) of the write as the "participant" or "actor" for the operation. This is not to be confused with actor-model-based languages. This allows us to have better control over actor growth; when dealing with clients all writing to CRDTs, every single participant needs to have a unique actor id. Recall that most of the CRDTs track per-actor counts, for instance the G-Counter, which is structurally equivalent to a vector clock, although semantically different. This introduces a problem of garbage collection. Interval tree clocks are one solution for addressing the problem, but cannot be used as the basis for some CRDTs. [4]

* Finally, there is work underway in making state-based CRDTs more efficient through "delta-CRDTs" [5], which allow for a more efficient optimistic and anti-entropy repair mechanism.
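The actor bookkeeping described above can be made concrete with a minimal state-based G-Counter sketch (illustrative Python, not Riak's Erlang implementation; all names here are made up): each actor owns one entry, and merge is the element-wise maximum, so replicas converge regardless of merge order.

```python
def increment(counter, actor, amount=1):
    """Return a new counter state with this actor's entry bumped."""
    updated = dict(counter)
    updated[actor] = updated.get(actor, 0) + amount
    return updated

def merge(a, b):
    """Join two counter states: per-actor maximum (commutative,
    associative, idempotent, so delivery order doesn't matter)."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0))
            for actor in set(a) | set(b)}

def value(counter):
    """The counter's value is the sum over all actor entries."""
    return sum(counter.values())

# Two replicas (actors) increment concurrently...
r1 = increment({}, "vnode-1")       # {"vnode-1": 1}
r2 = increment({}, "vnode-2", 2)    # {"vnode-2": 2}
# ...and merging in either order yields the same converged state.
assert merge(r1, r2) == merge(r2, r1)
assert value(merge(r1, r2)) == 3
```

Note that every distinct writer needs its own entry, which is exactly the actor-growth concern: with every client acting as an actor, the map grows without bound, whereas using the coordinating vnode as the actor keeps the entry count proportional to the cluster, not the client population.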

While the most notable resource for exploring CRDTs continues to be the
comprehensive report by Shapiro et al. [6], in practice many of the data
structures outlined there have unbounded growth in garbage (specifically
referring to items such as the OR set, which tracks an object for every
operation performed). Therefore, we rely on some of the more optimized
representations which don’t accumulate garbage. [7] In addition, the
conflict-free, composable, replicated map structure provided by Riak 2.0 was
specifically invented by Basho, and it is the first of its kind. [8] It took
many hours and iterations on QuickCheck models to ensure that, given somewhat
arbitrary composition, merge operations happened correctly. This is why
there has been interest in exploring alternative ways of checking or building
these models. [9]
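The unbounded-garbage point about the OR set can be seen in a toy sketch (illustrative Python following the naive spec, not Riak's optimized representation): every add records a unique tag and every remove keeps tombstones, so state grows with the number of operations even when the set's visible membership stays tiny.

```python
import uuid

def add(adds, element):
    """Each add gets a globally unique tag."""
    adds.add((element, uuid.uuid4().hex))

def remove(adds, tombstones, element):
    """A remove tombstones only the tags it has observed."""
    observed = {pair for pair in adds if pair[0] == element}
    tombstones |= observed

def members(adds, tombstones):
    """Live elements: those with at least one un-tombstoned tag."""
    return {e for (e, tag) in adds if (e, tag) not in tombstones}

adds, tombstones = set(), set()
for _ in range(1000):          # re-adding the same element 1000 times...
    add(adds, "x")
remove(adds, tombstones, "x")
add(adds, "x")

assert members(adds, tombstones) == {"x"}          # one visible element...
assert len(adds) + len(tombstones) == 2001         # ...2001 pairs of state
```

This is the garbage the optimized representations in [7] are designed to avoid.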

By storing these CRDTs on the server side, we are also able to provide an
operation-based interface for interacting with these objects from all of our
clients, and leave the complexity of implementing the CRDTs out of the client.
This additionally allows for our search offering, Yokozuna, to be able to
index these data types and provide query over their values.

[1]
[https://github.com/basho/riak_kv/pull/746](https://github.com/basho/riak_kv/pull/746)

[2]
[https://github.com/basho/riak_core/pull/463](https://github.com/basho/riak_core/pull/463)

[3] [http://basho.com/counters-in-riak-1-4/](http://basho.com/counters-in-riak-1-4/)

[4]
[http://gsd.di.uminho.pt/members/cbm/ps/itc2008.pdf](http://gsd.di.uminho.pt/members/cbm/ps/itc2008.pdf)

[5]
[https://twitter.com/xmal/status/467331615535149059](https://twitter.com/xmal/status/467331615535149059)

[6] [http://hal.inria.fr/inria-00555588](http://hal.inria.fr/inria-00555588)

[7] [http://arxiv.org/abs/1210.3368](http://arxiv.org/abs/1210.3368)

[8]
[http://dl.acm.org/citation.cfm?id=2596633](http://dl.acm.org/citation.cfm?id=2596633)

[9] [http://arxiv.org/abs/1406.4291](http://arxiv.org/abs/1406.4291)

* Edited to fix citation formatting.

~~~
cmeiklejohn
For what it's worth, I'm putting up a more permanent resource with
information, links to papers, etc., on my blog:

[http://christophermeiklejohn.com/crdt/2014/07/22/readings-in...](http://christophermeiklejohn.com/crdt/2014/07/22/readings-in-crdts.html)

------
lobster_johnson
This looks really nice. The lack of automatic merging was really the last
hurdle preventing me from wanting to use Riak.

One issue: I'm surprised that the merge algorithm seems to generally pick the
last write as the winner. For example, about registers the docs say: "The most
chronologically recent value wins, based on timestamps". But that's not always
correct: for example, if client A reads version 10 and writes version 11, and
then some client B has version 9 and writes version 12, you have a conflict
where client A's version should win even though it's older, since its history
is more correct.
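The distinction at issue here is causal dominance versus wall-clock time. A version-vector check (illustrative Python, not Riak's actual code) shows why the two writes in the example are concurrent siblings rather than an ordered pair:

```python
def descends(a, b):
    """True if version vector `a` has seen everything `b` has."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())

def concurrent(a, b):
    """Neither write's history contains the other's: a true conflict."""
    return not descends(a, b) and not descends(b, a)

# Client A's write causally extends its read; client B's write stems
# from an older read. Neither vector dominates the other, so a raw
# timestamp comparison could wrongly let B's later-but-staler write win.
a_write = {"A": 2, "coord": 10}
b_write = {"B": 1, "coord": 9}
assert concurrent(a_write, b_write)
```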

Or is the documentation just ambiguously worded? It says the algorithm is
_weighted_ towards last-write-wins, but also that it takes history into
consideration.

~~~
Russelldb
Only last-write-wins registers use last-write-wins to resolve conflicts. Maps,
Sets, and Flags use an Add-Wins/Observed-Remove semantic based on fine-grained
causality tracking methods borrowed from dotted version vectors; counters are
vectors of actor->count pairs. None of these types use wall-clock time at all.

For something like a single string register inside a Map (which is the
register you refer to), a simple LWW seemed adequate. Maybe in the future we
can add some more complex type here, perhaps with causal history + timestamp
arbitration (like Riak's allow_mult=false).
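The add-wins/observed-remove semantic can be sketched in miniature (toy Python with made-up names, not Riak's implementation): a remove tombstones only the add-tags it has observed, so a concurrent add under a fresh tag survives the merge.

```python
import uuid

def merged_members(adds_a, tombs_a, adds_b, tombs_b):
    """Merge two OR-set replicas: union both sides, then keep elements
    that still have at least one un-tombstoned tag."""
    adds, tombs = adds_a | adds_b, tombs_a | tombs_b
    return {e for (e, tag) in adds if (e, tag) not in tombs}

shared = {("key", "tag-1")}                 # both replicas saw this add
# Replica A removes "key", tombstoning the only tag it observed...
adds_a, tombs_a = set(shared), set(shared)
# ...while replica B concurrently re-adds it under a fresh tag.
adds_b, tombs_b = shared | {("key", uuid.uuid4().hex)}, set()
# After merging, the concurrent add wins: "key" is still present.
assert merged_members(adds_a, tombs_a, adds_b, tombs_b) == {"key"}
```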

~~~
lobster_johnson
So a register in a map will resolve correctly vis-a-vis my example above?

------
losvedir
As primarily a RoR developer, I've had my interest in learning more about
Erlang piqued by Jose Valim's new language, Elixir, which looks Ruby-esque but
runs on the Erlang VM.

Erlang has always been on the outskirts of my awareness as something that
might be worth looking into, but I can't quite determine what it's best used
for. I know it's behind a lot of telecom stuff, and that it powers WhatsApp,
but when you're not dealing with massive scale where you need distributed
computing and fault tolerance, are there still benefits?

Does Erlang make sense for a side-project web app? Or is it mostly for
enterprise-level applications?

------
kitd
Very interesting.

Riak is one of the core components in the new Spine2 data messaging and
handling hub being built for the NHS here in the UK.

The NHS is one of the largest producers and consumers of data in Europe and
the new hub is being evolved using FOSS and agile methods.

More details: [http://www.ehi.co.uk/news/ehi/8534/spine2-built-in-house-on-...](http://www.ehi.co.uk/news/ehi/8534/spine2-built-in-house-on-open-source)

------
alphadevx
I'm most excited about the embedded Solr search engine in this release
(Yokozuna), always felt that architecturally search and data should sit in the
same place.

~~~
riffraff
Is that actually Solr, or is it just Lucene-based? I.e., do you use the same
clients, schemas, and extension modules you'd use with normal Solr?

~~~
alphadevx
It's Solr:

"Yokozuna comes pre-bundled with Solr 4.0.0 running in the Jetty container."
(they now bundle 4.1.0).

Source:
[https://github.com/rzezeski/yokozuna/blob/v0.3.0/docs/RELEAS...](https://github.com/rzezeski/yokozuna/blob/v0.3.0/docs/RELEASE_NOTES.md)

~~~
seancribbs
The Solr version in the Release Candidate is actually 4.7.0.

------
politician
Congrats Basho!

