
Keeping CALM: When Distributed Consistency Is Easy - fofoz
https://m-cacm.acm.org/magazines/2020/9/246941-keeping-calm/fulltext
======
lukevp
This is great stuff, will need time to read it and digest. I feel that CRDTs
are something that should have been invented much earlier than 2011.
Formalizing the conversation around what the base level assumptions are that
are necessary to build these systems is really exciting. I have a conceptual
understanding of the implications of distributing state and coordination of
changes on that state, but it’s so much easier for us all to get things right
when there’s best practices and understanding around these concepts. It’s kind
of like how Raft made an easy to understand and include consensus library, or
Yjs the same for CRDTs, or libsodium makes it easy(er) to do security
correctly. It helps us develop the native units of computing for distributed
systems, in the “geographically distinct edge computing sense” and not in the
“a bunch of nodes with fast interconnects” sense, where offline is common and
coordination has major performance implications to UIs.

------
SmooL
I find the similarities & differences between this and CRDTs interesting.

This seems to be saying that any algorithm with a monotonic output with
respect to input information "has a consistent, coordination-free distributed
implementation".

As I understand it, for data to be monotonic, the data must be partially
ordered.

CRDTs require partial ordering, as well as a merge() function so as to create
a lattice.
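A minimal sketch of that lattice structure (my own illustration, not from the article), using a grow-only set, the simplest state-based CRDT: the partial order is subset inclusion and merge() is set union, i.e. the least upper bound.

```python
# Sketch: a grow-only set (G-Set), the simplest state-based CRDT.
# Partial order = subset inclusion; merge() = set union, which is the
# least upper bound and turns the states into a join-semilattice.

class GSet:
    def __init__(self, items=()):
        self.items = frozenset(items)

    def add(self, item):
        return GSet(self.items | {item})

    def merge(self, other):
        # Union is commutative, associative, and idempotent, so
        # replicas converge no matter the order of merges.
        return GSet(self.items | other.items)

    def __le__(self, other):
        # Partial order: "other contains at least my information".
        return self.items <= other.items

# Two replicas diverge, then merge in either order to the same state.
a = GSet().add("x").add("y")
b = GSet().add("y").add("z")
assert a.merge(b).items == b.merge(a).items == {"x", "y", "z"}
```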

It seems, then, that CRDTs have stronger requirements. That makes sense,
since CRDTs are about sharing data, whereas this CALM theory is only talking
about making a local decision.

~~~
dwenzek
> CRDTs are about sharing data, whereas this CALM theory is only talking
> about making a local decision.

Both CRDT and CALM are about sharing data to make a local decision which is
globally consistent.

Both use an order relation over datasets to model the concept of "adding
more info to some partial input or output".

I would say that the difference is in their focus. CALM determines the
frontier where no coordination is required and provides general criteria (no
need to retract former output, to _hear everything there is to hear_, or to
know all the participants). CRDTs provide a means to meet these criteria: by
taking the least upper bound of former partial results, a CRDT ensures that
the outcome only grows.
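A small hypothetical illustration of those criteria: a monotone query whose output only grows as input accumulates, next to the article's ¬∃-style query, whose answer may have to be retracted when more input arrives.

```python
# Sketch: monotone vs. non-monotone queries over a growing input set.
# "seen" is partial input (user, message) pairs arriving in any order.

def monotone_query(seen):
    # "Which users have posted at least once?" -- output only grows.
    return {user for (user, _msg) in seen}

def non_monotone_query(seen):
    # "Has carol NOT posted?" (a ¬∃-style query). The answer can flip
    # from True to False as more input arrives, so finalizing it
    # requires coordination.
    return all(user != "carol" for (user, _msg) in seen)

partial = {("alice", "hi")}
fuller = partial | {("carol", "hello"), ("bob", "yo")}

assert monotone_query(partial) <= monotone_query(fuller)  # never retracted
assert non_monotone_query(partial) is True
assert non_monotone_query(fuller) is False                # retracted!
```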

------
vivekseth
Another related conversation when this was discussed on The Morning Paper:
[https://news.ycombinator.com/item?id=19316737](https://news.ycombinator.com/item?id=19316737)

------
j-pb
So basically every system which allows you to 'delete' stuff, instead of
'forgetting' it, has it (distributed state) wrong.

Good thing that's not every database ever _cough_

~~~
hinkley
I think they're saying that deletion is a consequence of reachability.

A 'delete' is then the removal of all edges that lead to the forgotten item.

~~~
j-pb
I think you're conflating the deadlock-detection/garbage collector examples
with the general gist of the paper, or maybe I just misunderstood?

My point is that 'deletion' in every dbms - be it SQL or NoSQL - is equivalent
to logical negation of a statement, which in turn is equivalent to the
non-monotonic query example that they use throughout the article, `¬∃` ("not
exists").

There is a major logical difference between an explicit delete (negation in a
closed world interpretation) and a forget (temporary inconsistency in an open
world interpretation), in the sense that the latter can be repaired by
subsequent messaging.
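One standard way to recover monotonicity here (my sketch, not from the thread) is the two-phase set: instead of negating a fact, deletion becomes monotone growth of a tombstone set, so a re-delivered old message cannot "resurrect" the item.

```python
# Sketch: a two-phase set (2P-set). Naive delete is non-monotonic --
# merging with a replica that still holds "x" would resurrect it.
# Here both components only ever grow, so merge is a least upper bound.

class TwoPSet:
    def __init__(self, added=(), removed=()):
        self.added = frozenset(added)
        self.removed = frozenset(removed)  # tombstones: monotone knowledge

    def add(self, x):
        return TwoPSet(self.added | {x}, self.removed)

    def remove(self, x):
        # Deletion is expressed as *adding* a tombstone, not negation.
        return TwoPSet(self.added, self.removed | {x})

    def merge(self, other):
        return TwoPSet(self.added | other.added,
                       self.removed | other.removed)

    def contents(self):
        return self.added - self.removed

# Replica A removed "x"; replica B still carries it. Merging in either
# order keeps "x" deleted, because the tombstone wins.
a = TwoPSet({"x"}).remove("x")
b = TwoPSet({"x"})
assert a.merge(b).contents() == b.merge(a).contents() == set()
```

The classic trade-off is that a tombstoned element can never be re-added, which is why richer designs such as OR-sets exist.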

This means that you can't have easy distributed consistency with our current
databases and data models.

This is partially addressed by temporal databases like datomic and juxt,
however they too have delete/retract as logical negation, which makes their
monotonicity properties limited to the set of 'change operations' instead of
the data itself, which in turn makes it hard to work with. If their
monotonicity were on the data-model layer the entire database would
essentially become a CRDT.

~~~
sriram_malhar
Sure, negation and deletion are non-monotonic. And sure, they require
coordination and waiting if you really want serializability. But the modern
default is snapshot isolation, and experience has shown that there are many
many applications that are quite willing to live with this weaker level of
isolation (eventual consistency). You can get monotonic reads and writes
without having to wait for all replicas to respond, which is the point of
this paper.
Multi-version databases are lattices.
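A toy model of that last claim (hypothetical, not from the comment): if state is a grow-only set of versioned writes, merge is union, and a snapshot read just selects the highest version at or below its timestamp.

```python
# Sketch: a multi-version store as a lattice. State is a grow-only set
# of (key, version, value) writes; merge = union (least upper bound);
# a snapshot read picks the newest version visible at its timestamp.

def merge(store_a, store_b):
    # Writes only accumulate, so merging never loses information.
    return store_a | store_b

def snapshot_read(store, key, snapshot_ts):
    versions = [(v, val) for (k, v, val) in store
                if k == key and v <= snapshot_ts]
    return max(versions)[1] if versions else None

a = {("k", 1, "old")}
b = {("k", 1, "old"), ("k", 2, "new")}
s = merge(a, b)
assert snapshot_read(s, "k", 1) == "old"   # older snapshot stays stable
assert snapshot_read(s, "k", 2) == "new"
```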

~~~
jupp0r
> experience has shown that there are many many applications that are quite
> willing to live with this weaker level of isolation (eventual consistency)

My experience has also shown that the effort of getting this right is vastly
underestimated initially, and that eventually most of these many many
applications end up with a set of bugs triggered by edge cases nobody thought
about.

~~~
hinkley
thedailywtf used to be a fun place for seeing people answer the question,
"what could go wrong?" but it got too dark or cringey for me (I can find dark
without any help, thank you.)

There's a whole ecosystem of minor catastrophes that don't quite warrant a
blog post. Do we need a name for the Chesterton house on the corner of Dunning
Way and Kruger Lane?

At the same time, it's really hard to learn why that fence is there until you
at least see someone else get injured by ignoring the signs, so in that
respect TDWTF probably taught a lot of people about quite a few fences.

