can anyone recommend a good overview of Raft vs. Paxos?

While this doesn't answer your question, you will find this interesting - https://ramcloud.stanford.edu/~ongaro/userstudy/

I'm not sure that this experiment is all that useful in determining the practicality of using Raft instead of Paxos. Anybody implementing one of these algorithms is going to spend more than one hour learning it. In my experience, there are lots of corollaries that must be proven in order to actually implement the spec described by the Raft paper (mainly because it relies on assumptions that cannot be taken for granted in all production systems). Paxos is a bit more established, so this is not as big of an issue; however, I still do not have an intuitive understanding of Paxos. That lack of intuition can (and will) lead to non-deterministic bugs in your implementation. A more useful study would track bugs in Raft & Paxos implementations.

That's pretty interesting and doesn't look like it has been discussed on HN before. You should submit it.

I'm almost positive that I heard about it here. It's Raft's primary distinguishing feature from Paxos, right?

But hey, if you hadn't heard about it then there's a good chance that other HNers haven't either.

I can't really give an overview, but I'll tell you what I know :).

Paxos is a rather simple algorithm for reaching consensus on a value once: http://research.microsoft.com/en-us/um/people/lamport/pubs/p...

The two phases of the algorithm are described at the bottom of page 5. Section 3 also describes how to use Paxos to implement log replication (Multi-Paxos), which is comparable to Raft. Unfortunately, this section does a lot of hand-waving, so it requires the implementer to have a thorough understanding of the underlying issues and to find good solutions to problems such as membership changes.
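
To make the two phases a bit more concrete, here is a rough sketch of single-value Paxos with in-process acceptors. This is not code from the paper: the names `Acceptor`, `prepare`, `accept` and `propose` are my own, there is no networking, no failure handling and no persistence, and it only shows the promise/accept rules.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Acceptor:
    promised: int = -1          # highest proposal number promised so far
    accepted_n: int = -1        # number of the last accepted proposal (-1: none)
    accepted_value: Any = None  # value of the last accepted proposal

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_value
        return False, -1, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise was made."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value):
    """Run both phases; return the chosen value, or None if no majority."""
    majority = len(acceptors) // 2 + 1

    # Phase 1: collect promises from the acceptors.
    replies = [a.prepare(n) for a in acceptors]
    promises = [(an, av) for ok, an, av in replies if ok]
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, adopt the one with the
    # highest proposal number instead of our own value.
    prev_n, prev_value = max(promises, key=lambda p: p[0])
    if prev_n >= 0:
        value = prev_value

    # Phase 2: ask the acceptors to accept the (possibly adopted) value.
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks >= majority else None
```

With three fresh acceptors, `propose(acceptors, 1, "a")` chooses `"a"`, and a later `propose(acceptors, 2, "b")` also returns `"a"`, because the second proposer has to adopt the value that was already accepted. That adoption rule is what makes the decision stick.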

Conversely, the Raft paper is much more instructive to implementers, giving a clear overview of the necessary functions, messages and state: https://ramcloud.stanford.edu/raft.pdf

Personally, I find Multi-Paxos a lot more elegant, as it is essentially derived by working backwards from the safety constraints. As a distributed systems researcher, it is very obvious to me why it works and I can do a basic implementation almost from memory. Unlike Raft, it is also independent of time. Another nice property is that it is symmetric/masterless. All nodes can do reads and writes at any time, though whether that's a good idea is another matter. Of course, these are rather subjective and mostly aesthetic notions.

Multi-Paxos is generally seen as harder to implement, because the Paxos Made Simple paper is not very explicit about implementation, and the original paper is... well... read it :) http://research.microsoft.com/en-us/um/people/lamport/pubs/l...

Raft is getting quite popular, because people find it easy to follow the instructions of the paper, but this can be slightly deceptive. For example, using stable storage for your state is essential for crash-safety (meaning fsync-ing before answering), but I don't see where Hashicorp's raft implementation does that. Raft relies on the combination of several rules to reach consensus reliably, for which Ongaro has given a rather elaborate proof using 20 pages of TLA+ in his thesis. It's quite easy to make a mistake in your implementation and break one of the rules. The same goes for Paxos, but the rules are, at least to some extent, more obvious.
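
To illustrate what "fsync-ing before answering" means, here is a minimal sketch, assuming a POSIX filesystem and a JSON file for the persistent state. It is not how Hashicorp's library does it; `persist_state` and `handle_request_vote` are hypothetical names, and the vote handler leaves out the log up-to-date check from the Raft paper.

```python
import json
import os


def persist_state(path, current_term, voted_for):
    """Durably write the term and vote before the caller replies to anyone."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"current_term": current_term, "voted_for": voted_for}, f)
        f.flush()               # push Python's buffer to the OS
        os.fsync(f.fileno())    # ask the OS to push the data to the disk
    os.replace(tmp, path)       # atomically swap in the new state file

    # fsync the directory so the rename itself survives a crash (POSIX only).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def handle_request_vote(path, state, term, candidate_id):
    """Simplified vote handler: grants at most one vote per term."""
    if term < state["current_term"]:
        return False
    if term > state["current_term"]:
        state["current_term"] = term
        state["voted_for"] = None
    if state["voted_for"] in (None, candidate_id):
        state["voted_for"] = candidate_id
        # Persist first, answer second. If the node crashed after replying
        # but before the write hit disk, it could vote twice in one term.
        persist_state(path, state["current_term"], state["voted_for"])
        return True
    return False
```

The ordering is the whole point: the reply only goes out after `persist_state` returns.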

A big benefit of Raft is that it already includes quite a few optimizations that would be necessary to make a Multi-Paxos implementation efficient, and you don't have to come up with them yourself. You can generally expect Raft implementations to outperform Multi-Paxos implementations. On the other hand, Multi-Paxos can be tailored to a specific use-case, such as Google's Megastore, Chubby and Spanner, which would be much more difficult with Raft.

The Hashicorp implementation is normally used with the BoltDB store, which is stable on disk. You can find it in the hashicorp/raft-boltdb package.

Yeah, rqlite enables the BoltDB storage option.