When you read Paxos Made Simple it really all seems so, well, simple. But then you get inconsistent commits and look at the traces of what happened and just go "How?!"
This stuff is hard!
The closest thing I'm aware of is TLA+.
In my experience, async (which includes futures/promises and actor-like mechanisms) makes the nuts-and-bolts problems way easier: avoiding race conditions on variables, avoiding deadlock, and managing multiple things going on at once.
You still need fuzzing and model checking to make sure you got the strategic stuff right.
That said, the team I work on is about to release our first Raft-based product, so I might have a different opinion in a few months.
One thing I wish Raft had: a learner role, which acts like a follower that can't start an election until it has caught up with the rest of the cluster. etcd has it, but I wish it was part of Raft itself, along with bulk log transfer.
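For readers who haven't met the learner idea, here is a minimal sketch of the rule described above. It is not etcd's implementation; `Node`, `maybe_promote`, and `max_lag` are hypothetical names made up for illustration, and a real system would also have to handle membership changes and quorum accounting.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical per-node state; names are illustrative, not any library's API."""
    node_id: str
    role: str = "learner"   # "learner", "follower", "candidate", or "leader"
    match_index: int = 0    # highest log index known to be replicated on this node

    def can_start_election(self) -> bool:
        # A learner receives log entries but never times out into candidacy.
        return self.role == "follower"


def maybe_promote(node: Node, leader_last_index: int, max_lag: int = 0) -> bool:
    """Promote a learner to a voting follower once it is within max_lag entries
    of the leader's log. Returns True if a promotion happened."""
    if node.role == "learner" and leader_last_index - node.match_index <= max_lag:
        node.role = "follower"
        return True
    return False


if __name__ == "__main__":
    n = Node("n4")
    print(n.can_start_election())                    # still a learner
    n.match_index = 100
    print(maybe_promote(n, leader_last_index=100))   # caught up, so promoted
    print(n.can_start_election())
```

The point of the design is that a freshly added, empty node cannot disrupt the cluster with elections it could never legitimately win.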
Article pointing out a very common issue that anyone who has tried implementing Raft runs into:
Letting a follower forward requests to the leader on the client's behalf is not easy to implement correctly, which is why the most popular Raft-based software (the HashiCorp stack) doesn't do it. Not worth it.
I'm honestly surprised by this comment! I've written multiple Raft implementations, and request proxying was one of the easiest things to get right--it doesn't have to touch the Raft subsystem at all. Could you talk a little more about this?
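To make the disagreement concrete, a rough sketch of what request proxying can look like when it sits entirely outside the consensus layer. Everything here is an assumption for illustration: `Server`, `NotLeader`, and the in-process `peers` dict stand in for real RPC, and appending to `applied` stands in for proposing an entry and waiting for it to commit.

```python
from typing import Dict, List, Optional


class NotLeader(Exception):
    """Raised when a request cannot be served or forwarded."""


class Server:
    def __init__(self, node_id: str, peers: Dict[str, "Server"]):
        self.node_id = node_id
        self.peers = peers                    # hypothetical stand-in for RPC stubs
        self.leader_id: Optional[str] = None  # last leader this node heard from
        self.applied: List[str] = []

    def is_leader(self) -> bool:
        return self.leader_id == self.node_id

    def handle(self, command: str, hops: int = 0) -> str:
        if self.is_leader():
            # Stand-in for: propose to Raft, wait for commit, apply.
            self.applied.append(command)
            return f"ok from {self.node_id}"
        # Forward at most once so stale leader hints cannot cause a loop.
        if self.leader_id is None or hops >= 1:
            raise NotLeader(f"{self.node_id} does not know a usable leader")
        return self.peers[self.leader_id].handle(command, hops + 1)


if __name__ == "__main__":
    cluster: Dict[str, Server] = {}
    for name in ("a", "b", "c"):
        cluster[name] = Server(name, cluster)
    for s in cluster.values():
        s.leader_id = "a"
    print(cluster["b"].handle("SET foo bar"))   # forwarded to the leader
```

The sketch deliberately ignores timeouts and retries. That is where the hard part tends to live: if the forwarding node gives up waiting, the command may still commit on the leader, so the client has to treat the result as indeterminate rather than failed.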
> If you follow basic software engineering principles, you'll find distributed systems easier to approach.
When I implemented Paxos I had tests and when they failed they spit out an exact trace of what happened in what order and on what node. Sometimes it was still excruciating to figure out what happened. Here's a comment which you can think of as a bug tombstone. It took me half a day to figure out after I had a trace to analyze the issue.
Full-scale blackbox testing of a database system is similar to dogfooding. You only use it when you have high confidence that you have exhausted the possibilities of unit and integration tests. It's clear this project did not start with exhaustive unit tests.
It reminds me a bit of FoundationDB, which is also a terrible program nobody should entrust with data they ever want to see again. The first time I tried to use it, it ran out of memory and crashed in about ten seconds. I found the problem, which was that their huge-page-aware allocator, which has no tests, had never actually been used by anybody on a machine with huge pages. It was a core library of a released database which had never been executed by anyone. This Redis thing is the same: nobody had ever said "RAFT SET foo bar"; if they had, they would have seen the problem right away.
I can't speak to "exhaustive", but Redis-Raft did have an extant unit and integration test suite prior to our collaboration. Here's what they looked like: https://github.com/RedisLabs/redisraft/tree/ff9fb28c74db880c...
I'm hesitant to draw too strong a conclusion here, and I can't speak for the Redis Labs team, but I do suspect that this is somewhere where... having an outside tester, like Jepsen (or a suitably adversarial QA team) can help detect missing-stairs sorts of problems. Coming from the perspective of a prospective operator (and having some experience with testing distributed systems), I immediately said "of course I want proxy mode by default", when this wasn't how the Redis-Raft designers necessarily intended things to be used--they intended smart clients to make it so that users wouldn't actually need proxy mode, so they hadn't focused on testing it that way.
To me that would be the ur-example of "proving it is correct, but that doesn't mean there aren't bugs in it"
I really enjoy the Jepsen analyses and they've made me think a lot harder about distributed systems. Thanks!
Also, if you are looking for a linearly scalable distributed pub-sub with strong guarantees around consensus and message persistence, it might be worth looking at Apache Pulsar.
Small typo: I believe the link in the sentence "Tangentially, we were surprised to discover that Redis Enterprise’s claim of “full ACID compliance”..." was copy/pasted incorrectly.
> In future work, we believe it be prudent to explore other types of operations: GET and PUT, perhaps, or operations on sets.
Should say GET and SET.