I don't normally vote articles up but this looks really cool.

It would be good if they gave more info about their testing methodology, and also provided something like an HAProxy implementation.

Also, I don't see any mention of what happens if the arbiter falls over.

Yes, the paper talks about having a secondary arbiter that does watchdog pings; if the primary dies, the secondary waits for the queues to flush and then takes over, statelessly. It's a huge hole in the paper.
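
To make the failover idea concrete, here is a rough sketch of how a standby could behave. This is only my reading of the description, not the paper's code: the `StandbyArbiter` class, the `ping_primary` callable, and the `max_missed` threshold are all made up for illustration.

```python
from collections import deque


class StandbyArbiter:
    """Hypothetical secondary arbiter: watches the primary and takes over
    once the primary is presumed dead and all queues have drained."""

    def __init__(self, ping_primary, queues, max_missed=3):
        self.ping_primary = ping_primary  # callable, True if the primary answered
        self.queues = queues              # work already scheduled by the primary
        self.max_missed = max_missed      # assumed threshold, not from the paper
        self.missed = 0
        self.active = False

    def tick(self):
        """Run one watchdog period; return True once this arbiter is active."""
        if self.active:
            return True
        if self.ping_primary():
            self.missed = 0               # primary is alive, stay passive
            return False
        self.missed += 1
        if self.missed < self.max_missed:
            return False                  # not enough missed pings yet
        # Primary presumed dead. Wait for previously scheduled work to drain,
        # so no allocation state has to be copied from the dead primary.
        if any(self.queues):
            return False
        self.active = True
        return True


if __name__ == "__main__":
    queues = [deque(["pkt1", "pkt2"]), deque()]
    standby = StandbyArbiter(lambda: False, queues, max_missed=2)
    standby.tick()       # first missed ping
    standby.tick()       # second missed ping, but queues still hold work
    queues[0].clear()    # simulate the queues flushing
    print(standby.tick())
```

The point of waiting for the flush is that the standby never needs the old arbiter's state; the cost is that nothing new gets scheduled while the queues drain.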

