One of the things that surprised me about this analysis was just how many bugs we found in the actual Raft implementation. Usually when I test Raft-based systems, the bugs are at the edges--like the coupling of the system to the Raft library, treating it like an externally queryable log rather than the driver of a state machine, and so on. We found integration bugs here too, but also a fair number of issues in the Raft library itself--and this is despite Redis-Raft having existing integration tests!
Has anyone approached Jepsen about running an analysis on the Erlang Ra implementation? I believe they've been running Jepsen tests internally; I'm just curious whether they're thinking about getting an official analysis at some point. Thanks for all that you folks do!!
* https://github.com/rabbitmq/ra
Pre-existing! It's a fork of willemt's https://github.com/willemt/raft/, which has been around since 2013, and has property-based fuzz testing! It really does look like it's got its own extensive tests; I'm surprised we found issues.
This stuff is hard!