
A Critique of the CAP Theorem - yarapavan
http://arxiv.org/abs/1509.05393
======
devit
Indeed, applying the CAP theorem to real-world databases makes no sense,
because the CAP definition of "available" is unnecessarily restrictive.

The real tradeoff is much simpler: if you want a consistent system, it will be
slower and more expensive.

Regarding availability in a consistent system, as long as more than half of
the servers are working and connected with each other, they will be able to
elect a leader and function in a consistent way (using the Raft protocol for
instance). Now, as long as a client can connect to at least 1/k of the servers
in the majority, it can keep trying to connect to random servers until it
finds a reachable one in the functioning majority, in expected time
proportional to k.
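
A minimal sketch of those two claims, assuming a toy model where servers are just names in sets. The helpers `has_quorum` and `find_majority_server` are hypothetical and not part of Raft or any library; the client here does not know which servers are in the majority, so it samples from all of them.

```python
import random

def has_quorum(connected: int, cluster_size: int) -> bool:
    """True if `connected` servers form a strict majority of the cluster."""
    return 2 * connected > cluster_size

def find_majority_server(all_servers, majority, reachable, rng, max_tries=1000):
    """Try random servers until one is both reachable and in the majority.

    Returns (server, attempts), or (None, max_tries) if every try failed.
    """
    servers = sorted(all_servers)
    for attempt in range(1, max_tries + 1):
        candidate = rng.choice(servers)
        if candidate in majority and candidate in reachable:
            return candidate, attempt
    return None, max_tries

if __name__ == "__main__":
    all_servers = [f"s{i}" for i in range(15)]
    majority = set(all_servers[:8])          # 8 of 15 can elect a leader
    reachable = {"s0", "s1", "s14"}          # client reaches 2 of the 8, so k = 4
    rng = random.Random(0)
    tries = [find_majority_server(all_servers, majority, reachable, rng)[1]
             for _ in range(10_000)]
    # Each try succeeds with probability 2/15, so the mean is about 15/2 tries,
    # which stays within a constant factor of k.
    print(has_quorum(len(majority), len(all_servers)), sum(tries) / len(tries))
```

Each attempt is an independent coin flip, so the number of attempts is geometrically distributed and its mean grows linearly with k.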

For systems within a single datacenter, redundant networking makes partitions
almost impossible, and having enough hosts makes losing more than half almost
impossible, so the only failure mode is losing the whole datacenter. For
systems distributed among multiple datacenters, availability is almost
guaranteed unless a global catastrophe causes a global Internet partition
(such as Eurasia and America no longer being connected).

The only issues, again, are that it's more expensive because you need more
nodes and redundant networking and that it's slower because nodes have to
communicate among each other before commits can be confirmed, especially if
that needs to be done across datacenters. In particular, write throughput does
not necessarily scale with more nodes in a consistent system, since all writes
could modify the same value (after reading it) and that's not parallelizable
in the general case.

~~~
bcoughlan
A lot of distributed systems literature seems to be too abstract for real
world systems engineering. Where does one go to learn about architecting a
distributed system that works in the real world?

~~~
adamnemecek
The author of this paper, Martin Kleppmann, is writing a book "Designing Data-
Intensive Applications"
([http://shop.oreilly.com/product/0636920032175.do?cmp=af-
stra...](http://shop.oreilly.com/product/0636920032175.do?cmp=af-strata-books-
videos-product_cj_9781491903094_%25zp)). I've been reading it via the O'Reilly
immediate access and I think that it's the book you are looking for.

~~~
dwenzek
Indeed, he has written a less formal post on the matter: "Please stop calling
databases CP or AP" ([http://martin.kleppmann.com/2015/05/11/please-stop-
calling-d...](http://martin.kleppmann.com/2015/05/11/please-stop-calling-
databases-cp-or-ap.html))

------
mslot
CAP is a simple and obvious principle: If you have a partition, meaning you
cannot always meet your read/write quorums, you can choose:

- availability: always accept writes, but reads in other partitions might not
see them so there's no consistency.

- consistency: writes fail/block until the partition is resolved, so there's
no availability.
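
A rough sketch of that choice, assuming a toy in-memory replica. The `Replica` class and its `mode` flag are made up for illustration and do not correspond to any real database API.

```python
class Replica:
    """One node that follows either an "AP" or a "CP" policy (hypothetical model)."""

    def __init__(self, mode: str, cluster_size: int):
        self.mode = mode
        self.cluster_size = cluster_size
        self.data = {}

    def write(self, key, value, reachable_peers: int) -> None:
        # Count this replica plus the peers it can still talk to.
        has_quorum = 2 * (reachable_peers + 1) > self.cluster_size
        if self.mode == "CP" and not has_quorum:
            # Consistency first: refuse rather than risk divergence.
            raise RuntimeError("write rejected: quorum unreachable")
        # AP, or CP with a quorum: accept the write locally.
        self.data[key] = value

if __name__ == "__main__":
    # A 5-node cluster split 3/2; both replicas sit on the 2-node side.
    ap = Replica("AP", 5)
    ap.write("x", 1, reachable_peers=1)   # accepted, but the other side may not see it
    cp = Replica("CP", 5)
    try:
        cp.write("x", 1, reachable_peers=1)
    except RuntimeError as err:
        print(err)                        # unavailable until the partition heals
```

On the 3-node side of the same split, the CP replica would still accept writes, which matches the point that data not affected by the partition stays both consistent and available.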

Clearly, parts of your data that are not affected by a partition might still
be consistent and available.

The point is, in classical database systems 100% of transactions are
consistent. In distributed databases this is only possible if you sacrifice
availability some of the time, or alternatively you can sacrifice consistency
some of the time.

The author of this paper is suggesting a more elaborate theory that involves
network delay, which is great, though criticising what is effectively a
mathematical truth seems strange.

~~~
Randgalt
You have the correct response. The paper is a sleight of hand. It defines CAP
as something it's not and then attacks the straw man.

------
jamii
For me the most interesting part was:

> ...we can prove that certain levels of consistency cannot be achieved
> without making operation latency proportional to network delay.

'Consistency requires waiting' is a pretty well known rule of thumb for
distributed systems but this is the first quantitative proof that I've seen.
It's really useful to see exactly what kinds of consistency impose latency and
how that latency varies with respect to network conditions.

~~~
seiji
Well, guaranteed consistency requires _coordination_, and if your coordination
mechanism runs over a network, then the latency of your writes will be some
multiple of your network latency.

"Network" doesn't necessarily have to mean high-latency Ethernet, though. You
can have a network running on top of an embedded backplane in a blade system.
There are ways to minimize latency, but latency is as latency does.
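
As a back-of-the-envelope sketch, assume a leader-based design where the leader waits for acknowledgements from a majority before confirming a write. The function name and the round-trip figures below are invented for illustration, not measurements.

```python
def commit_latency_ms(client_rtt_ms: float, follower_rtts_ms: list[float]) -> float:
    """Lower bound on write latency when a leader waits for a majority.

    The cluster is the leader plus its followers. The leader counts as one
    vote, so it needs acknowledgements from cluster_size // 2 followers.
    """
    cluster_size = len(follower_rtts_ms) + 1
    acks_needed = cluster_size // 2
    if acks_needed == 0:
        return client_rtt_ms
    # The slowest of the fastest `acks_needed` followers gates the commit.
    replication_ms = sorted(follower_rtts_ms)[acks_needed - 1]
    return client_rtt_ms + replication_ms

if __name__ == "__main__":
    # Followers on a shared backplane (made-up 0.05 ms round trips).
    print(commit_latency_ms(0.05, [0.05, 0.05]))
    # Followers in other datacenters (made-up 40 ms and 80 ms round trips).
    print(commit_latency_ms(1.0, [40.0, 80.0, 80.0, 120.0]))
```

The shape of the formula is the same in both cases; only the round-trip times change, which is why a faster interconnect shrinks the cost without removing it.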

------
brudgers
Eric Brewer discussed _his_ CAP theorem earlier this year on SE-Radio:
[http://www.se-radio.net/2015/05/the-cap-theorem-then-and-
now...](http://www.se-radio.net/2015/05/the-cap-theorem-then-and-now/) In the
interview, Brewer mentions that the nuance has always been part of it, but a
decade ago the unnuanced "pick any two" elevator pitch was what upset database
vendors and developers. And it was close enough as a model to be useful.

------
hyperion2010
Reading this and some of the comments, I wonder what a galaxy-scale distributed
system might look like and whether speed of light would be sufficient to cope
with trade that could proceed at say, 10% the speed of light.

~~~
wmf
This is a classic:
[http://www.princeton.edu/~pkrugman/interstellar.pdf](http://www.princeton.edu/~pkrugman/interstellar.pdf)

------
macintux
I think everyone's been looking for more nuanced ways to describe the
tradeoffs involved in creating distributed systems. Another look at the
problem:
[http://radlab.cs.berkeley.edu/people/fox/static/pubs/pdf/c18...](http://radlab.cs.berkeley.edu/people/fox/static/pubs/pdf/c18.pdf)

(Edit: Worth pointing out that comes from Mr. CAP himself, Eric Brewer.)

------
lemevi
Seems like the paper includes a pretty straightforward observation that CA
just doesn't make much sense. How can you have consistency and availability if
there's a partition? That's when you get the strong versus weak consistency or
eventual consistency distinctions. If you look at the Call Me Maybe posts by
Aphyr, consistency is a problem when there's a partition in a lot of
real-world software.

~~~
seliopou
That's actually the exact opposite of what he says. He says that CA can be
vacuously satisfied by a system by simply going unavailable. It makes total
sense, but it is pathological behavior. It's a confusion that many people have
about CAP, which I tried to clarify in a blog post[0] a couple months ago, in
response to some other articles that were going around at the time.

[0]: [http://computationallyendowed.com/blog/2015/07/09/cap-
theore...](http://computationallyendowed.com/blog/2015/07/09/cap-theorem-
logic.html)

~~~
brianpgordon
"CA can be vacuously satisfied by a system by simply going unavailable"

That kind of takes the A out of CA.

------
nutate
The tie-ins with the Google Dataflow paper are at least superficial (if not
deeper on deeper reading) in that they mention: "latency, correctness and
cost" as their drivers: [http://blog.acolyer.org/2015/08/18/the-dataflow-
model-a-prac...](http://blog.acolyer.org/2015/08/18/the-dataflow-model-a-
practical-approach-to-balancing-correctness-latency-and-cost-in-massive-scale-
unbounded-out-of-order-data-processing/)

------
itistoday2
Money quote: _"we believe that CAP has now reached the end of its
usefulness"_

From Conclusion:

 _In this paper we discussed several problems with the CAP theorem: the
definitions of consistency, availability and partition tolerance in the
literature are somewhat contradictory and counter-intuitive, and the
distinction that CAP draws between “strong” and “eventual” consistency models
is less clear than widely believed._

 _CAP has nevertheless been very influential in the design of distributed data
systems. It deserves credit for catalyzing the exploration of the design space
of systems with weak consistency guarantees, e.g. in the NoSQL movement.
However, we believe that CAP has now reached the end of its usefulness; we
recommend that it should be relegated to the history of distributed systems,
and no longer be used for justifying design decisions._

~~~
seliopou
This is precisely the point of an article titled Consistency Tradeoffs in
Modern Distributed Database System Design[0]. CAP focuses on failures and not
much else. There needs to be a richer vocabulary to describe all the axes of
performance in a properly (or partially) functioning distributed system.

[0]: [http://cs-www.cs.yale.edu/homes/dna/papers/abadi-pacelc.pdf](http://cs-
www.cs.yale.edu/homes/dna/papers/abadi-pacelc.pdf)

