CAP Twelve Years Later: How the "Rules" Have Changed (infoq.com)
49 points by thebootstrapper on May 30, 2012 | 3 comments




A great article that clarifies many of the CAP theorem debates that raged back in late 2010.

A quick summary...

1. Partitions are rare; why sacrifice consistency or availability for them in the general case?

Thus consider a system with a "partition mode" that either limits operations to preserve consistency, or allows operations that risk consistency, depending on what the application needs.

This is the approach taken by some of the newer NoSQL databases like Cassandra, which tends to prefer availability over consistency, though it can be tuned to prefer consistency in some cases; and by the newer distributed RDBMSs like Google F1, which tends to prefer consistency over availability but generally remains highly available across data centres: http://research.google.com/pubs/pub38125.html
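
A minimal sketch of the "partition mode" idea in Python (the class and method names here are hypothetical, not taken from any particular database):

    from enum import Enum

    class Mode(Enum):
        NORMAL = "normal"
        PARTITIONED = "partitioned"

    class PartitionAwareStore:
        """Toy key-value store that changes behaviour when a partition is detected."""

        def __init__(self, prefer_availability):
            self.prefer_availability = prefer_availability
            self.mode = Mode.NORMAL
            self.data = {}
            self.dirty_keys = set()  # writes accepted during the partition

        def on_partition_detected(self):
            self.mode = Mode.PARTITIONED

        def write(self, key, value):
            if self.mode is Mode.PARTITIONED:
                if not self.prefer_availability:
                    # CP-style choice: refuse the write rather than risk divergence.
                    raise RuntimeError("write rejected: partition in progress")
                # AP-style choice: accept the write, remember it for recovery.
                self.dirty_keys.add(key)
            self.data[key] = value

        def on_partition_healed(self, remote_data, merge):
            """Partition recovery: reconcile diverged state with an
            application-supplied merge function."""
            for key in self.dirty_keys | set(remote_data):
                self.data[key] = merge(self.data.get(key), remote_data.get(key))
            self.dirty_keys.clear()
            self.mode = Mode.NORMAL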

2. CAP isn't a single choice for a system. Systems are composed of many sub-systems that often make a different CAP choice, depending on the operation or data or user involved.

3. The properties of partition tolerance, availability and consistency are continuous rather than binary.

Availability can be measured percentage-wise, whereas consistency isn't as easily measured but comes in many different forms, some weaker and some stronger. Partition tolerance can also take many forms, such as tolerating certain common partitions but not other (perhaps low-probability) ones.
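
One concrete way to see this continuum is quorum replication: with N replicas, W write acknowledgements and R read acknowledgements, R + W > N guarantees every read quorum overlaps the latest write, while smaller R and W trade that guarantee away for availability and latency. The arithmetic in Python (generic quorum math, not any specific product's API):

    def read_overlaps_write(n, w, r):
        """With N replicas, W write acks and R read acks, R + W > N
        guarantees every read quorum intersects every write quorum."""
        return r + w > n

    # N=3 replicas:
    print(read_overlaps_write(3, 2, 2))  # True  -> strongly consistent reads
    print(read_overlaps_write(3, 1, 1))  # False -> eventual consistency, but
                                         #          each request needs only one replica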

Reflecting on the last two points, consider that a system might have (a rough sketch in code follows the list):

- an offline mode in its HTML or mobile client applications, which enables availability over consistency when certain Internet WAN failures occur;

- a traditional RDBMS on the server, preferring consistency over availability when partitions occur in the data centre;

- a backup RDBMS in a second data centre, which uses asynchronous log-shipping replication to preserve availability when partitions occur between data centres (with a chance of data loss), but preserves both consistency and availability when partitions occur between the Internet and the primary data centre (but not the secondary).
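
Here's that composition as a toy Python write router; every component name is hypothetical, and the point is only that each branch makes a different CAP choice:

    class PrimaryUnreachable(Exception):
        pass

    class ComposedSystem:
        """Toy write router: each sub-system makes its own CAP choice."""

        def __init__(self, client_cache, primary_rdbms, secondary_rdbms):
            self.client_cache = client_cache        # AP: offline queue on the client
            self.primary_rdbms = primary_rdbms      # CP: consistent within its data centre
            self.secondary_rdbms = secondary_rdbms  # warm standby fed by async log shipping

        def handle_write(self, request, client_offline=False):
            if client_offline:
                # Client <-> Internet partition: choose availability,
                # queue the write locally and sync later.
                return self.client_cache.enqueue(request)
            try:
                # Normal path: the primary RDBMS gives consistency first.
                return self.primary_rdbms.commit(request)
            except PrimaryUnreachable:
                # Inter-data-centre partition: fail over to the secondary,
                # accepting the replication lag (possible data loss).
                return self.secondary_rdbms.commit(request)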

These are all nuanced choices that are very common. One could replace the above RDBMS with a wide-area Riak Enterprise cluster or Cassandra cluster. The tradeoffs would be different - say, no data loss, but the application would have to be much more explicit about how to deal with inconsistencies when recovering from a partition between the two data centres.
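
To make the "explicit about inconsistencies" part concrete: in Dynamo-style stores, concurrent writes on both sides of a partition surface as conflicting versions, and the application supplies the merge. The classic example is shopping-cart-style set union (a generic sketch of the pattern, not any vendor's actual API):

    def merge_carts(siblings):
        """Resolve conflicting cart versions by set union: nothing is lost,
        but a deleted item can reappear (the known cost of this merge rule)."""
        merged = set()
        for cart in siblings:
            merged |= cart
        return merged

    # The two data centres diverged during a partition:
    dc1 = {"book", "mug"}
    dc2 = {"book", "lamp"}
    print(merge_carts([dc1, dc2]))  # {'book', 'mug', 'lamp'}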

Overall, it's good that we have new, more scalable & available data management options beyond the "all consistency, all the time" traditional RDBMS. But the right solution differs based on the application and the level of scalability required. One shouldn't throw out consistency just because it seems like the new way of doing things; equally, one shouldn't throw out availability out of fear of data loss or inconsistency. It depends on your problem.


The ATM example at the end of the article is awesome and a must-read, even if you just skim the rest.
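
For those who haven't read it: Brewer lets withdrawals proceed during a partition but caps them at a small limit k, bounding how far the balance invariant can drift; any resulting overdraft is compensated afterwards (e.g. with a fee). Roughly, in Python (the limit value here is made up):

    PARTITION_WITHDRAWAL_LIMIT = 200  # the article's 'k'; the actual value is a risk decision

    def withdraw(balance, amount, partitioned):
        """Return the new balance, or raise if the withdrawal is refused."""
        if partitioned:
            # We can't see withdrawals at other ATMs, so instead of
            # refusing service we bound the risk per withdrawal.
            if amount > PARTITION_WITHDRAWAL_LIMIT:
                raise RuntimeError("partition: limit is %d" % PARTITION_WITHDRAWAL_LIMIT)
        elif amount > balance:
            raise RuntimeError("insufficient funds")
        # During a partition the merged balance may come out negative;
        # the bank compensates afterwards (e.g. an overdraft fee).
        return balance - amount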



