
Antidote: CRDT-based distributed database - KirinDave
http://syncfree.github.io/antidote/
======
cmeiklejohn
Member of the SyncFree Consortium (and committer to Antidote) here.

Here's a video from J On The Beach this year on the details around the Just-
Right Consistency approach that might help answer some questions:
[https://www.youtube.com/watch?v=Vd2I9v3pYpA](https://www.youtube.com/watch?v=Vd2I9v3pYpA)

~~~
cmeiklejohn
Here are links to other things in the SyncFree group.

Lasp (previously on HN):

[https://news.ycombinator.com/item?id=14300763](https://news.ycombinator.com/item?id=14300763)

[https://news.ycombinator.com/item?id=15444156](https://news.ycombinator.com/item?id=15444156)

SyncFree project review results:

[http://cordis.europa.eu/result/rcn/197514_en.html](http://cordis.europa.eu/result/rcn/197514_en.html)

Legion:

[https://syncfree.lip6.fr/index.php/2-uncategorised/54-legion](https://syncfree.lip6.fr/index.php/2-uncategorised/54-legion)

SyncFree homepage:

[http://syncfree.lip6.fr](http://syncfree.lip6.fr)

~~~
cmeiklejohn
Further follow-up: we're (members of the SyncFree consortium and its follow-up
project for moving CRDTs, Antidote, and Lasp to the edge) considering
raising money to industrialize Antidote (and its associated tool chain) for
building Just-Right Consistency applications.

If you'd be interested in any of the above, or just using the database and/or
trying the tools, reach out to me via email christopher.meiklejohn at
gmail.com or cmeik on Twitter.

------
skrebbel
At first glance, this looks amazing. I truly believe CRDTs are the solution to
lots of distributed systems problems, and that exposing their characteristics
to developers directly, rather than trying to abstract them away in nicer, but
leaky, abstractions, is the right way to go.

That said, a major part of why databases are hard is that reliable storage is
hard. I see remarkably little about this on Antidote's homepage. I wish this
were just a frontend to some battle-tested storage engine, but it appears that
this is not the case.

~~~
cmeiklejohn
(committer on Antidote, Riak and Lasp)

We're investigating a backend that works on LevelDB and RocksDB. We just
haven't had the academic resources to get it implemented yet. However, it's
largely an engineering resource problem and not a theoretical problem.

~~~
runT1ME
CRDTs accumulate garbage, and need a global "sync" to GC. How do you mitigate
this from having performance impacts?

~~~
anne_biene
[Disclaimer: I am an Antidote maintainer]

Some CRDTs support garbage collection directly - if you run them in a causally
consistent environment. Antidote is causally consistent and has a Set and Map
implementation that work like this; for these CRDTs you don't need a global
sync.

~~~
runT1ME
Thanks for the reply. Don't you negate some of the advantages of CRDTs by
mandating causally consistent environments? Can you speak to that more?

------
kevmo314
This has that "too good to be true" vibe, and I can't find much information on
the authors or the SyncFree Consortium organization that backs the project
besides their own website.

Is this at the cost of fast writes or flexible schema? The pitch video doesn't
seem to mention any cons, yet seems to avoid mentioning the type of data or
mutations supported. I guess I'll go read their publications.

~~~
KirinDave
It's a government sponsored project so one would expect their publicity to be
limited. That's part of the reason I posted it. Stuff like this is extremely
exciting.

It's at the cost of playing by the rules of CRDTs. Making CRDTs consequence-
free is ongoing research.

~~~
teleclimber
> Making CRDTs consequence-free is ongoing research.

What do you mean by that?

I found possibly related language on this page the other day [0]:

CRDTs are "[t]ypically not suited for editing application with consequent UI."

Can you point me in the right direction? My googling got me nowhere. Thanks.

[0] [https://irisate.com/collaborative-editing-solutions-round-
up...](https://irisate.com/collaborative-editing-solutions-round-up/)

~~~
KirinDave
I dunno about that quote, Treedoc works pretty well.

In general there are tradeoffs to CRDTs and not everyone loves those
tradeoffs. As time goes on we find CRDTs that make better tradeoffs.

For example, the earliest examples of CRDT sets grew linearly with the values
added to the set, without regard for deletion. For all time. That's a pretty
steep cost.
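
To make that cost concrete, here is a minimal Python sketch of a two-phase set (2P-Set), one early set CRDT design in which removals are recorded as tombstones that are never discarded. It is an illustration only: the class and method names are made up for this example, and it is an assumption that this is the kind of set meant above.

```python
class TwoPhaseSet:
    """Two-phase set: a grow-only set of additions plus a grow-only set of removals."""

    def __init__(self):
        self.added = set()
        self.removed = set()  # tombstones; these are never discarded

    def add(self, value):
        self.added.add(value)

    def remove(self, value):
        # Only values that have been added can be removed.
        if value in self.added:
            self.removed.add(value)

    def contains(self, value):
        return value in self.added and value not in self.removed

    def merge(self, other):
        # Merging is a union of both components, so it never shrinks the state.
        merged = TwoPhaseSet()
        merged.added = self.added | other.added
        merged.removed = self.removed | other.removed
        return merged

    def visible_size(self):
        return len(self.added - self.removed)

    def stored_size(self):
        return len(self.added) + len(self.removed)


replica = TwoPhaseSet()
for i in range(1000):
    replica.add(i)
    replica.remove(i)

print(replica.visible_size())  # 0
print(replica.stored_size())   # 2000
```

The set looks empty to the application, yet every value ever added is still stored twice, which is the "for all time" growth in question.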

~~~
bpicolo
LSEQ appears to improve upon treedoc in ways, too: [https://hal.archives-
ouvertes.fr/hal-00921633/document](https://hal.archives-
ouvertes.fr/hal-00921633/document)

------
shalabhc
Related ideas exist in David Reed's 'Atomic Actions' which use pseudotime:
[http://www.cs.sfu.ca/~vaughan/teaching/431/papers/reed83.pdf](http://www.cs.sfu.ca/~vaughan/teaching/431/papers/reed83.pdf):

"thinking about objects as sequences of unchangeable versions, object
histories"

"the correct construction and execution of a new atomic action can be
accomplished without knowledge of all other atomic actions in the system that
might execute concurrently"

Is anyone aware of comparisons between these two streams of ideas (pseudo time
based action and CRDT based)?

~~~
jlu
FYI, that paper is from 1983 and CRDTs were introduced in 2007. Conceptually
they share the same idea, but most modern CRDT implementations tend to use
Lamport timestamps to resolve conflicts causally.
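
As a rough illustration of the Lamport-timestamp approach, here is a minimal Python sketch of a last-writer-wins register. The `LWWRegister` name and its fields are hypothetical, not taken from Antidote or any particular library; the tie-break on replica id is one common convention, assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    """Last-writer-wins register ordered by (Lamport counter, replica id)."""
    value: object = None
    counter: int = 0
    replica_id: str = ""

    def write(self, value, replica_id):
        # A new write gets a counter above everything this replica has seen.
        return LWWRegister(value, self.counter + 1, replica_id)

    def merge(self, other):
        # The higher timestamp wins; the replica id breaks ties between
        # concurrent writes so that every replica picks the same winner.
        if (other.counter, other.replica_id) > (self.counter, self.replica_id):
            return other
        return self


base = LWWRegister()
a = base.write("from-a", "a")  # concurrent write on replica a
b = base.write("from-b", "b")  # concurrent write on replica b

# Both merge orders converge on the same state.
assert a.merge(b) == b.merge(a)

# A write made after seeing the merged state dominates both earlier writes.
later = a.merge(b).write("later", "a")
assert later.merge(b) == later
assert b.merge(later) == later
```

Note that the timestamp respects causality (a write that has seen another always wins over it) but picks an arbitrary winner among truly concurrent writes, so one of them is silently dropped.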

In short, Antidote looks like a decent solution to this kind of problem.

------
jnordwick
For anybody else wondering what CRDT means: conflict-free replicated data
type.

[https://en.m.wikipedia.org/wiki/Conflict-
free_replicated_dat...](https://en.m.wikipedia.org/wiki/Conflict-
free_replicated_data_type)

------
chrisweekly
CRDT: Conflict-Free Replicated Datatype

[https://en.m.wikipedia.org/wiki/Conflict-
free_replicated_dat...](https://en.m.wikipedia.org/wiki/Conflict-
free_replicated_data_type)

------
manigandham
There was also Datanet announced last year:
[http://highscalability.com/blog/2016/10/17/datanet-a-new-
crd...](http://highscalability.com/blog/2016/10/17/datanet-a-new-crdt-
database-that-lets-you-do-bad-bad-things.html)

It's now rebranded as Kuhiro:
[http://highscalability.com/blog/2017/11/6/birth-of-the-
nearc...](http://highscalability.com/blog/2017/11/6/birth-of-the-nearcloud-
serverless-crdts-edge-is-the-new-next.html)

~~~
dvirsky
There's also CRDB, a CRDT based version of Redis (closed source, and
disclosure - I work for Redis Labs) [https://redislabs.com/redis-enterprise-
documentation/adminis...](https://redislabs.com/redis-enterprise-
documentation/administering/database-operations/create-crdb/)

------
davidw
I wonder what it looks like in terms of resource usage.

I think there's a strong case for something like this for IoT type devices.
Imagine the simple case of adding a name to a contact list on a mobile device,
and wanting that to get synced up with a series of other devices.

~~~
cmeiklejohn
(committer on Antidote, author of Lasp, both SyncFree projects)

Antidote is designed for DC deployments, in the hundreds of clusters. It
provides causal consistency and transactions.

Lasp is designed for high-churn, edge deployments (with CRDTs) and provides
eventual consistency: it's designed for 1,000+ nodes.

They share a bit of code, and you might want to evaluate both to determine
which is a better use case: Lasp might be a better fit for the IoT
application, because it was designed for that.

Here's a HN post on Lasp's scalability:

[https://news.ycombinator.com/item?id=15444156](https://news.ycombinator.com/item?id=15444156)

Here's a HN post on Lasp:

[https://news.ycombinator.com/item?id=14300763](https://news.ycombinator.com/item?id=14300763)

------
continuational
How would I do a "put if absent" in this database? Is it efficient?

~~~
marc_shapiro
If the thing you "put" into is a CRDT, then two concurrent "put"s will be
merged, if that's OK for your application. If however you want to disallow
concurrent "put"s then you need to add some concurrency control to the CRDT
layer. See the CISE tool [https://youtu.be/HJjWqNDh-
GA](https://youtu.be/HJjWqNDh-GA),
[http://dx.doi.org/10.1145/2911151.2911160](http://dx.doi.org/10.1145/2911151.2911160).
Antidote doesn't yet support concurrency control; it's a work in progress.
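
A minimal Python sketch of what "merged" can mean here, assuming a map whose values are grow-only sets. The `put_if_absent` and `merge` helpers are hypothetical and are not Antidote's API; they only show why a purely local absence check cannot exclude a concurrent put elsewhere.

```python
def put_if_absent(replica, key, value):
    """Check absence locally only; puts made concurrently elsewhere are invisible."""
    if key not in replica:
        replica[key] = {value}
        return True
    return False


def merge(left, right):
    """Merge two replicas key by key, taking the union of the value sets."""
    merged = {}
    for key in left.keys() | right.keys():
        merged[key] = left.get(key, set()) | right.get(key, set())
    return merged


dc1, dc2 = {}, {}

# Both replicas see the key as absent, so both puts succeed locally.
ok1 = put_if_absent(dc1, "user:42", "alice")
ok2 = put_if_absent(dc2, "user:42", "bob")

# After replication the key holds both values instead of exactly one.
merged = merge(dc1, dc2)
print(sorted(merged["user:42"]))  # ['alice', 'bob']
```

If the application can live with both values, no coordination is needed; if exactly one put must win, some form of concurrency control has to sit on top.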

------
xinsight
The video lists pros/cons for strongly consistent and eventually consistent
databases, but only has pros for a "just right consistency" database. What are
the cons?

~~~
joaomlneto
The idea of the "just right consistency" is that it brings the best of both
worlds, without any drawbacks.

Your application works as well as if it were executed fully under strong
consistency, but with improved scalability for the set of operations that can
execute in an eventually consistent model.

[https://www.youtube.com/watch?v=HJjWqNDh-
GA](https://www.youtube.com/watch?v=HJjWqNDh-GA)

~~~
erik_seaberg
The biggest drawback is that only some operations can be supported. E.g.,
without strong consistency you can detect double-spending from an account but
you can't prevent it, because the validity of an operation can't depend on
operations a datacenter hasn't seen.
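
A small Python sketch of that detect-but-not-prevent situation, using a PN-counter style balance. The `PNCounter` class and the datacenter names are made up for illustration and do not reflect how Antidote or any bank implements this.

```python
class PNCounter:
    """Balance kept as per-replica totals of deposits and withdrawals."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.increments = {}
        self.decrements = {}

    def value(self):
        return sum(self.increments.values()) - sum(self.decrements.values())

    def deposit(self, amount):
        current = self.increments.get(self.replica_id, 0)
        self.increments[self.replica_id] = current + amount

    def withdraw(self, amount):
        # The check only covers operations this replica has already seen.
        if self.value() < amount:
            return False
        current = self.decrements.get(self.replica_id, 0)
        self.decrements[self.replica_id] = current + amount
        return True

    def merge(self, other):
        # Per-replica totals only grow, so the element-wise max is a safe merge.
        for key, count in other.increments.items():
            self.increments[key] = max(self.increments.get(key, 0), count)
        for key, count in other.decrements.items():
            self.decrements[key] = max(self.decrements.get(key, 0), count)


dc1, dc2 = PNCounter("dc1"), PNCounter("dc2")
dc1.deposit(100)
dc2.merge(dc1)  # both datacenters now see a balance of 100

# Concurrent withdrawals: each passes its local check.
first = dc1.withdraw(100)
second = dc2.withdraw(100)

dc1.merge(dc2)
print(first, second, dc1.value())  # True True -100
```

The negative balance shows up only after the merge, which is the detection; preventing it would require the two withdrawals to coordinate before either one is accepted.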

~~~
KirinDave
Financial examples are bad because in fact the financial world IS eventually
consistent. It's quite possible to withdraw the same $100 from an account via
multiple ATM machines.

~~~
nine_k
With ATMs and debit cards, I think it's generally not true; they seem to use
the online mode and update the balance of a checking account within seconds.

With credit cards, you can indeed start more transactions against the same
balance, and you're never sure in which order they will complete.

~~~
KirinDave
Gonna repeat, it is DEFINITELY possible to spend the same money multiple times
with one ATM card, using old ATM machines with other payment methods.

No, I will not further detail how here.

~~~
nine_k
Interesting! A bit of research in this area may literally pay :)

~~~
readams
You can do this, but the bank will know and the police will show up.

~~~
KirinDave
I don't mean to endorse such behavior, but the folks who used a loophole after
stealing my ATM card details from a data breach never got caught using a
variety of ApplePay based variants of this attack (now fixed, btw).

In general folks who get serious about it never get caught. Which is why folks
who give a damn about the world don't talk about specifics on public forums.

------
fiatjaf
I wanted a database that would receive "events" asynchronously and store
them, but at the same time would process these events (from some piece of
previously written code) to generate a queryable schema.

If I wanted to change the schema later, the database would let me just rewrite
the code and it would reprocess all the received events since the beginning.

My use case is not anything high-performance or with thousands of writes --
it's the opposite.
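
A minimal in-memory Python sketch of the log-and-reprocess idea described above, assuming events are plain dicts. `EventStore` and the two projector functions are hypothetical names for illustration, not an existing database API.

```python
class EventStore:
    """Append-only event log plus a view derived by a replaceable projector."""

    def __init__(self, projector):
        self.events = []
        self.projector = projector
        self.view = {}

    def append(self, event):
        # Store the raw event first, then fold it into the current view.
        self.events.append(event)
        self.projector(self.view, event)

    def replace_projector(self, projector):
        # Changing the "schema" means rebuilding the view from the full log.
        self.projector = projector
        self.view = {}
        for event in self.events:
            projector(self.view, event)


def count_by_user(view, event):
    view[event["user"]] = view.get(event["user"], 0) + 1


def total_by_user(view, event):
    view[event["user"]] = view.get(event["user"], 0) + event["amount"]


store = EventStore(count_by_user)
store.append({"user": "ann", "amount": 5})
store.append({"user": "bob", "amount": 7})
store.append({"user": "ann", "amount": 3})
print(store.view)  # {'ann': 2, 'bob': 1}

store.replace_projector(total_by_user)
print(store.view)  # {'ann': 8, 'bob': 7}
```

Because the log is the source of truth, a full replay is cheap to reason about at low write volumes like the ones described.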

~~~
jlu
Sounds like kappa architecture is something that will fit your request.

[http://milinda.pathirage.org/kappa-
architecture.com/](http://milinda.pathirage.org/kappa-architecture.com/)

------
avodonosov
One more idea about distributed consistency without synchronization:
[http://avodonosov.blogspot.com.by/2016/09/partitioned-
availa...](http://avodonosov.blogspot.com.by/2016/09/partitioned-available-
and-consistent.html)

Comments?

------
ShabbosGoy
Can anyone explain the advantages/disadvantages of using CRDTs over OT
(operational transformation)?

~~~
marknadal
OT requires a centralized server for intention resolution.

CRDT does not. It can be fully P2P.

------
haadcode
This is absolutely awesome!

I'm very excited for AntidoteDB, for its use cases but also for the
underlying, pioneering work you're doing on CRDTs. Thank you for doing it! <3

------
dqv
Does anyone have this compiling on Erlang 20?

~~~
anne_biene
We tried a couple of weeks back, but at that point most dependencies had not
been upgraded yet. The problem is that, to my knowledge, there is no riak_core
for Erlang 20.

~~~
cmeiklejohn
I suggest that you build on top of the Riak Core from Heinz Gies. That's what
we are using in some of our new projects.

~~~
anne_biene
This is what we actually do. I checked a couple of weeks ago with Heinz, but
there wasn't a version available (but maybe I misunderstood him...).

Can you point me to an Erlang 20 compatible version?

~~~
cmeiklejohn
Ah, we might be pulling a branch for our Riak Core work. I can look into it
tomorrow and get back to you.

------
Cieplak
Looking forward to Kyle Kingsbury’s Jepsen review.

~~~
elvinyung
Jepsen goes into detail about general CRDTs here:
[https://aphyr.com/posts/285-jepsen-riak](https://aphyr.com/posts/285-jepsen-
riak)

~~~
lostcolony
Yes, but devil's in the details. The fact CRDTs are provably able to have
specific consistency guarantees doesn't mean a particular implementation does
it correctly in all contexts. Without testing, "We built this (city) on CRDTs"
isn't very useful.

~~~
KirinDave
What's nice about building databases on compositions of CRDTs is that if you
can validate the individual CRDTs via automated testing, you have a very high
degree of confidence the composition of those CRDTs will do something similar.

No one's arguing that a Jepsen test shouldn't be done. Just that it'll
probably be very different in character from more invented industry
technology.
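
As a sketch of what automated validation of an individual CRDT could look like, the Python below checks the three merge laws (commutativity, associativity, idempotence) for a grow-only counter on randomly generated states. The functions are hypothetical and use only the standard library; this is not how Antidote is actually tested.

```python
import random


def merge(left, right):
    """G-Counter merge: element-wise maximum of per-replica counts."""
    return {
        key: max(left.get(key, 0), right.get(key, 0))
        for key in left.keys() | right.keys()
    }


def random_state(rng, replicas=("a", "b", "c")):
    """Random counter state; some replicas may be missing entirely."""
    return {r: rng.randint(0, 20) for r in replicas if rng.random() < 0.8}


def check_merge_laws(trials=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = (random_state(rng) for _ in range(3))
        assert merge(x, y) == merge(y, x)                      # commutative
        assert merge(merge(x, y), z) == merge(x, merge(y, z))  # associative
        assert merge(x, x) == x                                # idempotent
    return True


print(check_merge_laws())  # True
```

Passing checks like these says the data type converges under reordering and redelivery; it says nothing about the replication, storage, or failure handling around it, which is the part a Jepsen-style test would exercise.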

~~~
lostcolony
Yes; if you can prove a system was built atop an academically sound paradigm,
it's more likely to adhere to the guarantees that paradigm is intended to
provide than a system built atop an unproven paradigm. Kinda goes without
saying, though. :P

Unsnarkily, I agree that implementation and composition of CRDTs is
straightforward compared with other approaches people have
taken. But if correct behavior is a requirement, full testing is too,
regardless of how easy it -should- be in theory.

------
marknadal
Hey, author of GUN (the current most popular generalizable CRDT based
database), and want to say I'm impressed. I'm often the first to nitpick
things but this looks great:

\- Built in Erlang

\- Great explainer videos

\- Well documented CRDTs that you accept

\- Team of university-related researchers in CRDTs.

I'll be looking through your guys stuff more. But good job! We need more
people like you guys out there.

~~~
dang
We detached this comment from
[https://news.ycombinator.com/item?id=15863587](https://news.ycombinator.com/item?id=15863587)
and marked it off-topic.

