
Copysets and Chainsets: A Better Way to Replicate (2014) - nodivbyzero
http://hackingdistributed.com/2014/02/14/chainsets/
======
GauntletWizard
Precisely where they went wrong:

    In practice, the speed of recovery is typically bottlenecked by the incoming bandwidth of the recovering server, which is easily exceeded by the outgoing read bandwidth of the other servers, so this limitation is typically not a big deal in practice.

If you're recovering to _one_ server, you're going to have a bad time. With
random distribution, you recover to _every_ server, equally, over a very short
period of time. The tradeoff is that you'll have a lot of churn, as temporary
failures cause a lot of data to be rereplicated, and then the extra copies
deleted as they come back online. On the other hand, this helps balance your
utilization and load.
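
The difference is easy to see with back-of-envelope arithmetic (illustrative numbers, not from the article):

```python
# Recovery time for one failed 10 TB server, assuming a 10 Gb/s NIC
# (~1.25 GB/s) per server. Numbers are made up for illustration.
TB = 1e12
data = 10 * TB          # data to re-replicate from the failed server
nic = 1.25e9            # per-server NIC bandwidth, bytes/s
n_servers = 1000

# Recovering onto a single replacement server: its inbound NIC is the bottleneck.
single_target = data / nic                      # seconds

# Random placement: the lost replicas are scattered, so roughly every
# surviving server receives an equal slice in parallel.
scattered = data / (nic * (n_servers - 1))      # seconds

print(f"single target: {single_target / 3600:.1f} h")  # ~2.2 h
print(f"scattered:     {scattered:.1f} s")             # ~8 s
```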

The actual insight is that you want failure domain anti-affinity: if you have
1000 servers on 50 network switches, you want your replica selection
algorithm to pick not three different machines at random, but three different
_switches_ at random. If you have three AZs, put one replica of each copy in
each of the three. Copysets can provide this, but as stated in the article,
they're much more likely to give you Achilles' heels - a typical failure
won't hurt and won't cause any unavailability, but the wrong one takes you
down hard, with N% data loss rather than thousandths of a percent.
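
Switch-level anti-affinity is a tiny amount of code. A minimal sketch (names and topology are made up, not from the article):

```python
import random

def pick_replicas(machines_by_switch, r=3):
    """Failure-domain anti-affinity: pick r distinct switches at random,
    then one machine on each, so no single switch failure can take out
    two replicas of the same chunk."""
    switches = random.sample(list(machines_by_switch), r)
    return [random.choice(machines_by_switch[s]) for s in switches]

# 1000 machines on 50 switches, 20 per switch (hypothetical cluster).
cluster = {f"sw{s}": [f"sw{s}-m{m}" for m in range(20)] for s in range(50)}
replicas = pick_replicas(cluster)
# All three replicas land on distinct switches.
assert len({name.split("-")[0] for name in replicas}) == 3
```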

In short: failures happen. Recovering from them is what matters, not
convincing yourself that they can't happen.

~~~
rescrv
I think you're pointing out a good tradeoff here. The original copysets work
lets you explicitly trade the likelihood of data loss against the amount of
work done for recovery. A cluster replicated to minimize the likelihood of
losing a replica set under correlated failure will have a higher cost of
recovery from failure. A cluster replicated to minimize recovery time (e.g.,
RAMCloud's random allocation) will likely lose entire replica sets upon a set
of (even random) failures.
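
A toy Monte Carlo makes the tradeoff concrete (parameters are illustrative, not from either paper): random placement almost always loses _something_ under a multi-server failure, while copyset-style placement rarely loses anything, but loses a whole group when it does.

```python
import random

def loss_prob(groups, n_servers, failed=6, trials=2000):
    """Fraction of trials in which `failed` simultaneous random server
    failures wipe out at least one full replica group."""
    sets = [frozenset(g) for g in groups]
    losses = 0
    for _ in range(trials):
        dead = set(random.sample(range(n_servers), failed))
        if any(g <= dead for g in sets):
            losses += 1
    return losses / trials

N, R, CHUNKS = 54, 3, 5000
# Random placement: every chunk gets its own random trio of servers.
rand_groups = [random.sample(range(N), R) for _ in range(CHUNKS)]
# Copyset-style placement: chunks are confined to N/R fixed, disjoint trios.
copysets = [range(i, i + R) for i in range(0, N, R)]
copy_groups = [random.choice(copysets) for _ in range(CHUNKS)]

print("random placement loss prob :", loss_prob(rand_groups, N))  # near 1
print("copyset placement loss prob:", loss_prob(copy_groups, N))  # much smaller
```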

Chainsets were an attempt to add the properties of copysets to a system based
upon chain replication.

Working with the original Copysets authors, we refined the chainsets
algorithm into a tiered replication algorithm that can enforce independence
assumptions on the replica set (what you've termed anti-affinity). Here's the
paper on the subject: [https://www.usenix.org/conference/atc15/technical-session/pr...](https://www.usenix.org/conference/atc15/technical-session/presentation/cidon)

