
Flexible Paxos: Quorum intersection revisited - mgrosvenor
https://arxiv.org/abs/1608.06696
======
rusanu
TL;DR summary (to my understanding, no sane human can ever claim it can
summarize Paxos):

The claim is that, once a leader is elected (ie. Q1), is no longer necessary
to attain a majority quorum for actually accepting writes (ie. Q2). A minority
quorum can accept writes, provided the minority contains at least one node
that participated in the leader election. By increasing the leader election
quorum number to higher than N/2+1 (as to have a sufficient number of nodes
that participated in the Q1 election), the cluster can then operate much
faster because writes require only minority quorum. The drawback is that it no
longer tolerates N/2-1 failures, as N/2-1 failures leaves too few electors to
choose a new leader in Q1.

NB. the Paxos terminology uses terms like 'decide a value', but practically in
clusters this is equivalent to 'accept writes' so I used that instead for
easier comprehension.

~~~
btrask
If your summary is accurate (haven't read the paper yet), I don't think this
works, because if you don't have a proper quorum, you can't know that the
leader is still valid at the time of an event. It might've been re-elected in
the meanwhile. The only way to know is to "check in" with all the other nodes.

I recently proposed this idea (informally) and had to retract it:
[https://bentrask.com/?q=hash://sha256/b40971e7b30324fdda15ce...](https://bentrask.com/?q=hash://sha256/b40971e7b30324fdda15ce03)

Disclaimer: totally not an expert.

~~~
jlouis
The nice thing about the paper is that it is transparent. It includes a TLA+
specification for the claims that they make.

In turn, you have a concrete model checked implementation you can talk about
or use a basis for understanding where either their or your proposed idea
either holds or fails.

Note that a model can specify the wrong correctness criterion in practice. So
you may have a "proof" which works, yet the proposition proven is wrong.

In general, when working on distributed systems, we usually want some kind of
formal criterion of correctness. The failure modes of such system are quite
hard to get right, and hence retractions of claims are plenty. Sadly, we have
too little literature on how to on-board people on model checkers such as
TLA+, proof assistants such as Coq, Isabelle/HOL or NuPRL, and QuickCheck
systems such as Haskell or Erlang QuickCheck, Clojure's core.spec (I believe),
etc.

~~~
rusanu
Book recommendation: Specifying Systems: The TLA+ Language and Tools for
Hardware and Software Engineers [0]. I did read it, but unfortunately I can't
say I actually understood it as in to be able to use it day-to-day.

[0] [https://www.amazon.ca/Specifying-Systems-Language-
Hardware-E...](https://www.amazon.ca/Specifying-Systems-Language-Hardware-
Engineers/dp/032114306X)

------
rusanu
Related discussion of the 'A More Flexible Paxos' blog quoted in the paper:
[https://news.ycombinator.com/item?id=12292590](https://news.ycombinator.com/item?id=12292590)

------
rollulus
A friendlier story here: [http://hh360.user.srcf.net/blog/2016/08/majority-
agreement-i...](http://hh360.user.srcf.net/blog/2016/08/majority-agreement-is-
not-necessary/)

------
TimFogarty
There's an easy to understand visualisation of distributed consensus
(specifically the Raft algorithm) here:
[http://thesecretlivesofdata.com/raft/](http://thesecretlivesofdata.com/raft/)

------
arthursilva
Not a paper, but I liked this related article
[https://news.ycombinator.com/item?id=11813180](https://news.ycombinator.com/item?id=11813180)

