Can we take a second to appreciate that this result will rapidly become the first thing taught in every single distributed systems class? And if this holds as a generalization of non-Byzantine (trusted-node) distributed consensus as a field, then she has defined its Turing Machine equivalent. And it is even remarkably easy to understand! That is wild!!
Props to the authors. Personally, this is going on the top of my “to memorize” list.
tl;dr: I don't see world-changing novelty here, theoretically. There are dozens of other consensus algorithms in the world, some randomized, some not quorum-based, etc. This is interesting if you're in Paxos land, trying to tune your Paxos for different fault-tolerance requirements. It's in no way comparable to a computational model.
This is not true.
It generalizes quite many more than that.
It happens to use Paxos as the example.
This is not a property of all consensus algorithms. Consensus algorithms don't need to behave this way. Paxos and derivatives behave this way, but I'm not sure Raft, etc. slide in as neatly (that would require a substantial analysis).
In particular, this generalization does not capture any nondeterministic algorithm, any algorithm tolerating Byzantine faults (or various other consensus problems), any non-quorum-based algorithm, Nakamoto-style consensus (which is a Byzantine consensus algorithm), Ben-Or, etc., etc.
So, my question for you: what other algorithms do you think this paper generalizes, that are useful outside of the narrow scope of optimizing Paxos?
Don't forget that people often just want to use an algorithm and are not necessarily that interested in the underlying mathematical structure, only in whether it works reliably and does a better job than the existing solution.
That is, you can describe Paxos, Raft, etc. using the generalization given here.
The result is fairly understandable.
(Of course, as mentioned, Raft is trivial in her framework because there is only one thing deciding values; Raft spends its time in the leader-election part finding that one thing.)
Making multiple computers all store the same thing is hard because they have to send messages between each other, and sometimes messages get lost.
Other people have tried to write down how they solved this problem, but their explanations were hard for adults to understand.
This person has tried a new way to solve this problem, and a new way to explain their solution that uses ideas that many big kids who go to big kid school to study computers can understand.
1) Heidi Howard’s papers are very fun to read. She’s done lots of work I’ve enjoyed esp wrt Raft
2) She touches on two things in this story's paper that I first read in another paper of hers, and they are a joy to realize for two reasons: a) she made a practical improvement to the already simple Raft protocol, and b) quorum != (obvious) majority. In (e.g.) Raft she identifies that quorum actually spans two states (see the sketch after this list), which are
* leader election and
* “regular” mode with a leader and the requisite # of followers
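To make that quorum observation concrete, here's a minimal Go sketch (names and structure are mine, not from the paper) of the flexible-quorum condition: every leader-election quorum must intersect every replication quorum, but quorums within the same phase don't have to intersect each other.

```go
package main

import "fmt"

// A Quorum is a set of node IDs.
type Quorum map[int]bool

// intersects reports whether two quorums share at least one node.
func intersects(a, b Quorum) bool {
	for n := range a {
		if b[n] {
			return true
		}
	}
	return false
}

// valid checks the flexible-quorum condition: every leader-election
// quorum must intersect every replication quorum. Quorums within the
// same phase are NOT required to intersect each other.
func valid(election, replication []Quorum) bool {
	for _, e := range election {
		for _, r := range replication {
			if !intersects(e, r) {
				return false
			}
		}
	}
	return true
}

func main() {
	// 5 nodes: elections need 4 of 5, replication only 2 of 5.
	// Since 4 + 2 > 5, any election quorum meets any replication quorum.
	election := []Quorum{{1: true, 2: true, 3: true, 4: true}}
	replication := []Quorum{{4: true, 5: true}, {1: true, 5: true}}
	fmt.Println(valid(election, replication)) // true
}
```

The point being: elections can be made expensive and rare while the steady-state replication quorum shrinks, which a plain majority-everywhere rule never lets you do.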
Read/watch on and be amazed
Is anyone aware of a forum where consensus algorithms are discussed? A mailing list, subreddit, or website?
I found a Raft mailing list a while ago, but I'm looking for a place for a more general discussion, where real experts hang out.
Or Howard's blog post: https://hh360.user.srcf.net/blog/2019/02/towards-an-intuitiv...
Twitter used to be a useful place for general distsys discussions, but I haven't been around for a few years now so I'm not certain how much still takes place.
This tweet, for example, I would have expected to see more people chime in on.
There's a distributed systems theory Google Group, but it appears to have completely died off:
To find out which avenues for theoretical discussion are currently active, I would suggest reaching out on Twitter to Chris Meiklejohn (cmeik), Michael Bernstein (mrb_bk), or Kyle Kingsbury (aphyr).
In practice though, it seems easier just to run separate instances of Paxos.
Edit: ok after that sat there for a while I realized that generating shared pseudorandom bits would be just as effective.
Whereas Paxos tries to maintain a fully distributed consensus state, Raft instead proposes a protocol for automatically electing a leader node from a group of follower nodes, and coordinating the replication of the write log through that leader node.
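For intuition, here's a heavily simplified Go sketch of that leader-coordinated write path (all types and names are mine, not the Raft paper's): the leader appends locally, fans the entry out to followers, and considers it committed once a majority has acknowledged.

```go
package main

import "fmt"

// Entry is one record in the replicated log.
type Entry struct {
	Term    int
	Command string
}

// Follower is a toy stand-in for a remote node's log.
type Follower struct{ log []Entry }

// append simulates a successful AppendEntries RPC; a real follower
// would first check term and log consistency before accepting.
func (f *Follower) append(e Entry) bool {
	f.log = append(f.log, e)
	return true
}

// Leader replicates all writes through itself, as Raft prescribes.
type Leader struct {
	term      int
	log       []Entry
	followers []*Follower
}

// Propose appends locally, replicates, and reports whether the entry
// is committed (acknowledged by a majority of the cluster).
func (l *Leader) Propose(cmd string) bool {
	e := Entry{Term: l.term, Command: cmd}
	l.log = append(l.log, e)
	acks := 1 // the leader counts itself
	for _, f := range l.followers {
		if f.append(e) {
			acks++
		}
	}
	return acks > (len(l.followers)+1)/2
}

func main() {
	l := &Leader{term: 1, followers: []*Follower{{}, {}}}
	fmt.Println(l.Propose("x=1")) // true: 3 of 3 nodes hold the entry
}
```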
I haven't had the pleasure of experimenting with either one though and would love to hear from people who have.
She happens to use Paxos as the example to show they are non-optimal, in part because Raft is already deliberately non-optimal.
The other reason is that Raft is completely trivial in her framework.
It is equivalent to only having one thing decide the registers
(and all the hard work is in deciding which thing).
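If it helps to see the register idea in code, here's my rough reading of it as a Go sketch (my naming, not the paper's notation): a decision is a write-once register, and "only one thing decides" means a single process is allowed to perform that write.

```go
package main

import (
	"errors"
	"fmt"
)

// WriteOnceRegister can be set exactly once; later writes fail.
// This mirrors the immutable registers in Howard's framework.
type WriteOnceRegister struct {
	set   bool
	value string
}

var ErrAlreadyWritten = errors.New("register already written")

func (r *WriteOnceRegister) Write(v string) error {
	if r.set {
		return ErrAlreadyWritten
	}
	r.set, r.value = true, v
	return nil
}

func (r *WriteOnceRegister) Read() (string, bool) {
	return r.value, r.set
}

func main() {
	var r WriteOnceRegister
	fmt.Println(r.Write("A")) // <nil>: the first write decides the value
	fmt.Println(r.Write("B")) // register already written
	v, _ := r.Read()
	fmt.Println(v) // A
}
```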
If we call "heartbeat" the frequency at which a node syncs to the network (achieves consensus), then lower-heartbeat nodes only need to stay close to nodes with a faster heartbeat than theirs and elect leaders within that group. The leaders of that group in turn form a group with the other leaders of that layer and listen to their own leader, who belongs to the group above (like zooming out from a group so the group becomes a node in a larger graph).
In the end there is only one state after all, and the graph as a whole is the ultimate leader with the highest heartbeat. We wouldn't need many layers to cover all nodes, assuming laws of nature and all. Then just go a layer up/down if the network gets too large, and use a karmic system to position new/reactivated nodes. Seems scalable enough.
Sorry OP, it might be too early and I'm just thinking out loud in a comment box.
Hence, the single leader election is not actually necessary (and this is one of their points: the existing algorithms require much stronger quorum guarantees than are actually necessary for correctness).
The Raft paper is really easy to read and understand, but the actual code is very tricky.
It took a while to get the building blocks together: the log file, which can be traversed, read, and rolled back/truncated (use a kv store for this, trust me), and the transport mechanism to deal with sending complex objects and handling errors and timeouts.
Tracking terms, states (leader, follower, candidate), rejecting out of date terms, voting, elections -- it's all very complex. You'll have to build some really, really good multi-threaded tests that simulate out-of-order and dropped messages, then really pound away at the system. And good logging to find that odd corner case that keeps coming up and throwing mysterious messages.
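To give a flavor of the term bookkeeping, here's a minimal Go sketch (not taken from any real implementation): every RPC carries a term, stale terms are rejected, and seeing a newer term forces a node back to follower.

```go
package main

import "fmt"

type State int

const (
	Follower State = iota
	Candidate
	Leader
)

// Node holds the persistent term/vote state every Raft role tracks.
type Node struct {
	state       State
	currentTerm int
	votedFor    int // -1 means no vote cast this term
}

// HandleRequestVote applies the two core term rules: reject anything
// from an older term, and step down on seeing a newer one.
func (n *Node) HandleRequestVote(term, candidateID int) bool {
	if term < n.currentTerm {
		return false // stale candidate: reject
	}
	if term > n.currentTerm {
		// Newer term: adopt it and revert to follower.
		n.currentTerm = term
		n.state = Follower
		n.votedFor = -1
	}
	// Grant at most one vote per term (log up-to-date checks omitted).
	if n.votedFor == -1 || n.votedFor == candidateID {
		n.votedFor = candidateID
		return true
	}
	return false
}

func main() {
	n := &Node{state: Follower, currentTerm: 3, votedFor: -1}
	fmt.Println(n.HandleRequestVote(2, 7)) // false: term 2 is stale
	fmt.Println(n.HandleRequestVote(4, 7)) // true: new term, vote granted
	fmt.Println(n.HandleRequestVote(4, 9)) // false: already voted for 7
}
```

And that's before the out-of-order and dropped-message cases the tests have to cover.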
I haven't even tried membership changes yet.
Order is important here - is this inherent in your key?
I did two implementations, one with plain text files and one with a kv store. For plain text files, I had to maintain file offsets of the records so I could traverse backward and truncate records that had to be replaced, and then break the files into segments so I could retire segments that are no longer needed. It had all kinds of little bugs.
Then I reimplemented using a kv store, and it took a tiny fraction of the time. And surprisingly had better performance.
Just create a good interface at the start, use the kv store underneath, and then later if you don't want the dependency swap it out for the more complicated code.
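Concretely, that interface might look something like this Go sketch (method names are my guesses at the operations described above: append, traverse, truncate):

```go
package raftlog

// LogStore abstracts the Raft log so the backing store can be swapped
// later: a kv store to start, hand-rolled segment files if you ever
// want to drop the dependency.
type LogStore interface {
	// Append writes an entry at the next index and returns that index.
	Append(entry []byte) (index uint64, err error)
	// Get reads the entry at index, so the log can be traversed.
	Get(index uint64) ([]byte, error)
	// TruncateFrom discards index and everything after it, for
	// rolling back entries from a deposed leader.
	TruncateFrom(index uint64) error
	// LastIndex reports the highest index currently stored.
	LastIndex() (uint64, error)
}
```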
AIUI, Bitcoin's "consensus" is just: whoever has the longest valid blockchain wins. The algorithm tunes the difficulty such that it takes ~10 minutes to find a block, so two competing chains are unlikely to stay the same length for long. That is, one chain will find the next block before the other and, in the meantime, be able to distribute that fact to the other nodes and thus "win".
> Honest generators only build onto a block (by referencing it in blocks they create) if it is the latest block in the longest valid chain. "Length" is calculated as total combined difficulty of that chain, not number of blocks, though this distinction is only important in the context of a few potential attacks.
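So the fork-choice rule the quote describes boils down to something like this toy Go sketch (field names are mine): sum the difficulty over each valid chain and pick the heaviest.

```go
package main

import "fmt"

// Block carries the difficulty target it was mined against.
type Block struct{ Difficulty uint64 }

// weight sums difficulty over a chain; "longest" really means
// heaviest by total work, not most blocks.
func weight(chain []Block) uint64 {
	var total uint64
	for _, b := range chain {
		total += b.Difficulty
	}
	return total
}

// best picks the valid chain with the most cumulative work.
func best(chains [][]Block) []Block {
	var winner []Block
	var max uint64
	for _, c := range chains {
		if w := weight(c); w > max {
			max, winner = w, c
		}
	}
	return winner
}

func main() {
	a := []Block{{10}, {10}, {10}}          // 3 blocks, total work 30
	b := []Block{{25}, {25}}                // 2 blocks, total work 50
	fmt.Println(len(best([][]Block{a, b}))) // 2: the shorter chain wins
}
```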
Raft and Paxos are non-Byzantine. They assume that the nodes are actively trying to follow the protocol, though there may be such things as network partitions and crashes.