

The demise of eventual consistency - mycodebreaks
http://gigaom.com/2013/11/02/next-gen-nosql-the-demise-of-eventual-consistency

======
haberman
> A system that keeps some, but not all, of its nodes able to read and write
> during a partition is not available in the CAP sense but is still available
> in the sense that clients can talk to the nodes that are still connected.

Sure, as long as the clients can reach the nodes that are "still connected."
But clients can get partitioned from server nodes too. For example if there
was a complete fiber cut across the Atlantic, both clients and servers in the
US would be partitioned from clients and servers in Europe. Whichever side has
the majority of replicas (say the US) gets to keep operating. You can try to
say that the service is still "available", but that doesn't help the clients
in Europe.

If the client is partitioned from the majority of replicas, it's game over. At
that point, the system has to give up either consistency or availability _from
the perspective of that client_.

And of course, if the majority of replicas suddenly goes down completely
because of power failure or something like that, then the system truly is
unavailable no matter where you put a client. There is no way for the
surviving nodes to know that the dead nodes aren't actually alive and still
accepting writes.
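The majority-quorum rule being described can be sketched in a few lines. This is an illustrative model, not any real database's API: a partition side may only serve writes if it can reach a strict majority of replicas, which is why a minority side (or a cluster whose majority has died) must refuse service.

```python
def can_serve_writes(reachable_replicas: int, total_replicas: int) -> bool:
    """A partition side may accept writes only if it holds a strict
    majority of replicas; otherwise it must refuse service (giving up
    availability) to preserve consistency."""
    return reachable_replicas > total_replicas // 2

# 5 replicas split 3 (US) / 2 (Europe) by a fiber cut:
assert can_serve_writes(3, 5) is True    # US side keeps operating
assert can_serve_writes(2, 5) is False   # Europe side is unavailable

# If the 3-replica majority loses power entirely, the 2 survivors still
# can't form a majority, so the whole system is unavailable:
assert can_serve_writes(2, 5) is False
```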

> New distributed, consistent systems like Google Spanner concretely
> demonstrate the falsity of a trade-off between strong consistency and high
> availability.

I am pretty sure that even Spanner becomes unavailable if a majority of
replicas go down (or the client is partitioned from them).

~~~
jasode
>If the client is partitioned from the majority of replicas, it's game over.
At that point, the system has to give up either consistency or availability
from the perspective of that client.

It sounds like the article's author simply shifted the problem from "_eventually consistent_" to "_eventually 100% available to 100% of the clients_" -- e.g. when the fiber-optic cable is repaired.

That can be a perfectly fine tradeoff, but it would be clearer if the author spelled it out explicitly.

I believe his more notable point is that the current crop of immature databases pushes the logical reasoning about "eventually consistent" too far up into the application layer, which in turn causes unnecessary pain for developers. I think focusing on this one concept would have made a better blog post, because it addresses many real-world requirements of "eventual consistency".

~~~
haberman
Even under that analysis, a consistent system is completely unavailable (0% of
clients can get service) when a majority of replicas suddenly go down
completely.

The point about eventual consistency pushing inconsistency too high in the
stack makes sense to me though. The spanner paper makes a similar argument.

~~~
mtdewcmu
I thought that the original justification for eventual consistency was that,
for some kinds of data, availability is much more important than consistency.
It could be that, in reality, that kind of data doesn't exist, because any
data worth keeping is worth the trouble of keeping consistent. Or, it could be
that the use case exists, but it's uncommon and seldom worth maintaining a
separate database system if you are not Facebook. The article didn't make
those arguments, though. It argues that weak guarantees by the database make
client code harder to write, which ought to be kind of obvious, if consistency
is what you need.

~~~
rdtsc
> It could be that, in reality, that kind of data doesn't exist, because any
> data worth keeping is worth the trouble of keeping consistent

Not necessarily. What is important is not "dropping" the data on the floor in an unpredictable fashion. The sane eventually consistent models will present the conflicted versions to the user. (Optionally, all sides pick the same value as the winner, but never throw away the others.)

That is what Riak does in its correct (sadly non-default) configuration and
that is what CouchDB does.

This bubbling up of eventual consistency to the very top layer is the correct behavior. The database might find that both you and your friend withdrew $100 from the same account. Now that account perhaps has a negative balance. But the important thing is that it keeps both transactions. So something above can decide to pick a winning one (maybe using a timestamp), or pick neither and cancel both, or cancel the account because of possible fraud.
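The withdrawal scenario above can be sketched concretely. This is a hedged illustration of sibling preservation in the spirit of Riak/CouchDB, not their actual APIs: each partition side records its own operation, the merge keeps the union of both logs so no transaction is dropped, and the resulting negative balance is surfaced to the layer above rather than hidden.

```python
def merge_siblings(side_a_ops, side_b_ops):
    """Union of (timestamp, amount) operation logs from two partition
    sides; nothing is silently discarded."""
    return sorted(set(side_a_ops) | set(side_b_ops))

def apply_ops(initial_balance, ops):
    """Replay all withdrawals against the starting balance."""
    return initial_balance - sum(amount for _ts, amount in ops)

# Both you and a friend withdraw $100 from a $150 account during a partition:
ops = merge_siblings([(1, 100)], [(2, 100)])
assert ops == [(1, 100), (2, 100)]     # both transactions preserved
assert apply_ops(150, ops) == -50      # negative balance surfaced, not hidden
```

The application layer can then decide what to do with the -50 balance: honor both withdrawals, cancel one by timestamp, or flag the account.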

------
rdtsc
> Building the complex, scalable systems demanded by today's highly connected
> world with such weak guarantees is exceptionally difficult.

So is building a complex, scalable system that breaks the laws of physics or theorems. The world is eventually consistent. Some business domains can handle that. Even some banking operations are eventually consistent: you can go to an ATM in Australia and one in the US at the same time and overdraw your account. That is eventual consistency. Banks would rather work that way than expect you to wait half an hour while they decide on a globally shared, consistent
state of your account.

> Essentially, that engineer needs to manually do the hard work to ensure that
> multiple clients don’t step on each other’s toes and deal with stale data.

Sometimes that is not that hard. Sometimes business logic allows for a custom (user-based) reconciliation of conflicts. In some cases that engineer has to rely on magic unicorns that another engineer (who built the DB) put into the product to make it beat the CAP theorem. Or the administrator has to handle a global restart of all the servers because the whole cluster has become unavailable after, say, one node blew up or got partitioned. That is not
_always_, in all cases, better than eventual consistency.

> Google addressed the pain points of eventual consistency in a recent paper
> on its F1 database

So the answer is installing expensive GPS receivers on the roofs of your datacenters and running wires down to the clusters of machines? Yeah, that works better in some cases. But it is not always the best answer.

> Vendors should stop hiding behind the CAP theorem as a justification for
> eventual consistency.

Vendors should lie and market-speak their way out of a theorem? This has been
done before with other database products. So maybe FoundationDB is choosing
that path to follow...

> Dave Rosenthal is a co-founder of FoundationDB.

I don't know. As I often say, sometimes the biggest enemies of an idea are its most ardent supporters.

~~~
Dave_Rosenthal
"beat the CAP theorem" ... "break the laws of physics" ... "lie and market-
speak their way out of a theorem"

Strong words, but the article doesn't advocate any of the above. It advocates
choosing consistency over availability and lays out the reasons why that is a
good choice. Choosing consistency is hardly a radical idea and many of the
most popular NoSQL databases actually choose consistency (e.g. HBase,
MongoDB).

Indeed even Riak, the biggest advocate of eventual consistency, is working
hard to build strong consistency into Riak 2.0.

~~~
rdtsc
But choosing consistency over availability in all cases is not the answer; one is not a strictly superior choice over the other, as the article proposes.

At one level, striving for consistency in a large distributed system is fighting the laws of physics. It has been attempted, and so far Google probably has the best handle on it, but it requires tight coupling with a time-synchronization service.

> Choosing consistency is hardly a radical idea

_Claiming_ to choose it is not radical. Actually doing it is. If you read the "Call me maybe" series, you'd see that most supposedly consistent databases fail. Granted, that series measures partition tolerance, but partition tolerance is usually not a choice to be made (according to its author, Aphyr).

> Indeed even Riak, the biggest advocate of eventual consistency, is working
> hard to build strong consistency into Riak 2.0.

That is what I've heard. Consistency for some data is the right choice, no
argument there. At the same time they are _not_ throwing away or switching
away from eventual consistency.

In fact, Riak is the database that, when configured to preserve conflicted siblings, actually did better than most in the "Call me maybe" series. CouchDB is another database that does this right: it preserves merge conflicts explicitly during replication. Users can choose to ignore them, but it doesn't arbitrarily throw data away. It is not in the series because its replication and cluster topology are user-defined and user-managed, so there is no default, single distributed-cluster setup.
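The deterministic-winner behavior being credited to CouchDB can be sketched roughly like this. This is an illustrative model only, not CouchDB's real revision algorithm: every replica sorts the conflicting revisions by the same deterministic key, so they all agree on a winner without coordinating, while the losing revisions remain stored and readable until someone resolves them.

```python
def pick_winner(revisions):
    """Deterministically order conflicting revisions so every replica
    independently agrees on the same winner; losers are returned as
    preserved conflicts, never discarded."""
    ordered = sorted(revisions,
                     key=lambda r: (r["rev"], r["value"]),
                     reverse=True)
    return ordered[0], ordered[1:]

winner, conflicts = pick_winner([
    {"rev": 2, "value": "edit-from-laptop"},
    {"rev": 2, "value": "edit-from-phone"},
])
assert winner["value"] == "edit-from-phone"  # same winner on every replica
assert len(conflicts) == 1                   # the loser is still available
```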

Most of the failures of these databases come from trying to sweep the effects of conflicts in eventual consistency under the rug. Even Riak's last-write-wins default did that. MongoDB and others failed too. That doesn't mean eventual consistency is flawed. Eventual consistency is a physical reality, and in some cases it is also a viable business-application pattern.

~~~
mtdewcmu
This reminds me of a paper I read in college, "The Dangers of Replication."[1]
It lays out the fundamental limits on distributed updates, and it's a good one
to read if you haven't seen it before.

[1]
[http://www.cs.cmu.edu/~natassa/courses/15-823/current/papers...](http://www.cs.cmu.edu/~natassa/courses/15-823/current/papers/gray96danger.pdf)

~~~
rdtsc
That is a pretty good paper, from 1996, and what it talks about is still relevant. Back then CAP wasn't talked about, and distributed databases were not the topic of casual conversations between developers.

It mentions Lotus Notes. As a piece of trivia, the creator of CouchDB (Damien
Katz) originally worked on Lotus Notes (at IBM). Then created CouchDB in his
spare time. I believe its design is influenced by Lotus Notes quite a bit.

~~~
mtdewcmu
It wasn't a new paper when I took the class, either. None of the papers I read
for that class were new. I took the fact that it was assigned reading to mean
that the professor considered it timeless.

------
mtdewcmu
This was written by somebody with something to sell, and it sounds like it. I
was hoping he would explain the misreading of the CAP theorem, but it ends by
promising that future databases will be more powerful. Eventually, I guess?

------
zimbatm
Would love to see a call-me-maybe article on FoundationDB...

~~~
mtdewcmu
I gave you my number, but you saved it in MongoDB. Call me, maybe?

~~~
zimbatm
MongoDB did get tested :)
[http://aphyr.com/tags/jepsen](http://aphyr.com/tags/jepsen)

------
yohanatan
'Eventually consistent' is not at all a "new term".

