

Forfeiting Partition Tolerance in Distributed Systems - nkeywal
http://blog.thislongrun.com/2015/07/Forfeit-Partition-Tolerance-Distributed-System-CAP-Theorem.html

======
jhugg
I don't understand. When you run into partitions, you need to make a decision:
risk being inconsistent, or shut down. Ignoring the problem is just choosing
inconsistency by default. You haven't actually gone CA.
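
A minimal sketch of that decision, assuming a single replica that already knows whether it can reach its peers. The names `Policy`, `Replica`, `partitioned`, and `unreplicated` are hypothetical and only illustrate the trade-off; they are not taken from any real system.

```python
from enum import Enum


class Policy(Enum):
    CP = "refuse writes while partitioned"
    AP = "accept writes while partitioned"


class Replica:
    """Toy replica (hypothetical) that must pick a behavior under partition."""

    def __init__(self, policy):
        self.policy = policy
        self.partitioned = False  # assumed to be set by some failure detector
        self.data = {}
        self.unreplicated = []    # keys written without reaching the peers

    def write(self, key, value):
        if self.partitioned and self.policy is Policy.CP:
            # Give up availability to keep a single consistent history.
            raise RuntimeError("unavailable: cannot reach peers")
        self.data[key] = value
        if self.partitioned:
            # Stay available, but this write may conflict with the other side.
            self.unreplicated.append(key)
        return True
```

A replica that never checks `partitioned` at all takes the same path as `Policy.AP`, except that it does not even record which writes may have diverged.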

~~~
nkeywal
You could also shut down permanently, for example if the system is corrupted
and cannot restart after a partition. There is no inconsistent history in this
case, just complete unavailability (you could also claim to be CP).

You can also be partly available and partly inconsistent (2PC with heuristic
resolution). Here you're neither AP nor CP.

Partition intolerance (CA) is a specification. It says that any network
partition is a serious issue for the system and may break one or more
invariants (e.g., atomicity in a SQL database).

~~~
jhugg
> You could also shut down permanently, for example if the system is
> corrupted and cannot restart after a partition. There is no inconsistent
> history in this case, just complete unavailability (you could also claim
> to be CP).

This is precisely CP.

> You can also be partly available and partly inconsistent (2PC with heuristic
> resolution). Here you're neither AP nor CP.

This is basically what AP systems do, with various ways to manage
inconsistency. Dynamo-style eventual consistency (EC) is but one.

> Partition intolerance (CA) is a specification. It says that any network
> partition is a serious issue for the system and may break one or more
> invariants (e.g., atomicity in a SQL database).

If you can break C in the face of a partition, then you're not CA, are you?

CA is not meaningful. CAP is about choosing between availability and
consistency in the face of partitions, which are essentially unavoidable in
any non-trivial multi-node system. There are maybe some interesting things to
say about latency, but I suspect there isn't much that hasn't been said.

Side note: Mike Stonebraker posited in 2010 that partitions on a network are
rare. I'm not going to outright call him wrong, but for the purposes of
someone building a distributed system to be run for others on a non-appliance,
you're going to run into plenty of partition events on a LAN. VoltDB changed
the way our product behaved in version 3.0 to aggressively kill nodes if there
is any potential for split brains. We originally intended this feature for the
cloud only, but too many users with their own hardware were hitting partitions
in surprising configurations.
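
To make the "kill nodes if there is any potential for split brains" idea concrete, here is a minimal sketch assuming a plain strict-majority rule. This is a generic illustration, not VoltDB's actual algorithm; `should_stay_up` and its parameters are hypothetical names.

```python
def should_stay_up(reachable, cluster_size):
    """Return True if this side of a partition may keep running.

    reachable:    nodes this node can still talk to, counting itself
    cluster_size: nodes in the cluster before the partition

    Assumption: only a strict majority survives, so two disjoint sides
    can never both stay up.
    """
    if not 1 <= reachable <= cluster_size:
        raise ValueError("reachable must be between 1 and cluster_size")
    return 2 * reachable > cluster_size


# 5-node cluster split 3/2: only the 3-node side keeps running.
print(should_stay_up(3, 5), should_stay_up(2, 5))
# 4-node cluster split 2/2: neither side has a strict majority.
print(should_stay_up(2, 4))
```

Under this rule an even split takes the whole cluster down. A real system would need some tie-breaker to avoid that, which is beyond what this sketch covers.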

~~~
nkeywal
> This is precisely CP.

Hmm. I would prefer to reserve 'precisely CP' for a system that shuts down one
of the partitions, not one that cannot restart after a partition, even if
formally you can use both (i.e. CP/AP).

> VoltDB changed the way our product behaved in version 3.0 to aggressively
> kill nodes if there is any potential for split brains. We originally
> intended this feature for the cloud only,

Quite interesting.

