How to Build a Highly Available System Using Consensus

rubiquity · on Oct 11, 2020

One of many timeless works from Butler Lampson. The patterns are still widely in use to this day. The trade offs between lease times and availability are so clearly put, reading this could save a lot of hard earned lessons. This paper is such a great top-down explanation of consensus and Paxos.

If I could go back in time I’d start off my interest in distributed systems by reading just this paper. That would have been a much better approach than feeling like I’m getting beat in the face over and over again by a water mill. The path I took was like many others: You see some distributed database that gets you really excited. CAP leads to Paxos, that leads to FLP, you rebel for a few months by getting interested in CRDTs, and then you have a bunch of hang overs from papers not using the term consistency consistently. The rest of it can best be drawn by giving a toddler a box of crayons.

Just read this instead.

avinassh · on Oct 11, 2020

> If I could go back in time I’d start off my interest in distributed systems by reading just this paper.

do you have more recommendations for such papers in distributed systems?

inopinatus · on Oct 11, 2020

three favourite bookmarks of mine:

Time, Clocks, and the Ordering of Events in a Distributed System, Lamport. (1978) http://lamport.azurewebsites.net/pubs/time-clocks.pdf

Impossibility of Distributed Consensus with One Faulty Process, Fischer, Lynch, Patterson. (1985) http://macs.citadel.edu/rudolphg/csci604/ImpossibilityofCons...

A History Of The Virtual Synchrony Replication Model, Birman. (2010) http://www.cs.cornell.edu/ken/History.pdf

rubiquity · on Oct 11, 2020

There are a lot of these lists on the internet and they're all personal, but since you asked, here would be mine. This list is just a starting point.

If I started over, I would focus solely on building the intuition of time and then start with the strictest consistency models and work my way down. I'd ignore persistence entirely and every time state is mentioned I'd just picture an in-memory number or tuple. Much like the rest of computing, everything we're doing today is based on the original ideas of the 70s/80/90s, just with more sharding. A lot more sharding.

- Time, Clocks, and the Ordering of Events in a Distributed System[0]

- The many faces of consistency[1]

- Understanding Replication in Databases and Distributed Systems[2]

- Viewstamped Replication Revisited[3]

- Consensus on Transaction Commit[4]

A lot of people recommend the FLP, CAP, and popular industry papers such as Chubby and Dynamo but I don't find them very useful in the beginning. CAP is too vague to be useful anymore and it's much better to speak in terms of specific consistency models instead. The FLP paper is certainly worth reading later on, but all you need to know in the beginning is that reaching consensus (termination and agreement) over networks with unbounded delivery times (the Internet) is impossible. Once you understand this, or don't understand it but have convinced yourself to accept it, the reasons for why fault detectors enter the picture is a lot easier to digest.

I prefer Viewstamped Replication as a deep dive into learning consensus because I find it to be a lot more concrete in contrast to Paxos.

0 - https://lamport.azurewebsites.net/pubs/time-clocks.pdf

1 - http://sites.computer.org/debull/A16mar/p3.pdf

2 - http://www-users.cselabs.umn.edu/classes/Spring-2018/csci898...

3 - http://www.pmg.lcs.mit.edu/papers/vr-revisited.pdf

4 - https://lamport.azurewebsites.net/video/consensus-on-transac...

0xbadcafebee · on Oct 11, 2020

It always struck me as odd how papers like this make it all make sense, and then you go and run one of these things, and realize that an available system doesn't mean a system that runs well. It just means a system which is capable of running well.

The bugs in all this complex software are out of scope of the conversation. It's a theoretical system, after all. The fact that the more complex and distributed a consensus-based system is, the more likely failure of the system is, isn't mentioned. Presumably either because it's implied that you should only ever use this when you have to, or because it's actually elementary.

The most stable highly available consensus-based systems are dependent on highly available hardware and networks. As soon as either become flaky, bugs start coming out of the woodwork like a log on a roaring fire (and this explains why Jepsen is so effective at finding bugs). Othe other hand, HA systems that don't depend on consensus can be smaller, and their software is typically easier to reason about, leading to fewer classes of bugs as well as a potentially smaller codebase which leads to less bugs.

teddyh · on Oct 11, 2020

> The most stable highly available consensus-based systems are dependent on highly available hardware and networks. As soon as either become flaky, bugs start coming out

“The network is reliable” is fallacy #1 of the “Fallacies of distributed computing”: https://en.wikipedia.org/wiki/Fallacies_of_distributed_compu...

0xbadcafebee · on Oct 11, 2020

#9 should be "distributed systems are superior to centralized systems", but the people who made the list must have found it too obvious to include.