The real tradeoff is much simpler: if you want a consistent system, it will be slower and more expensive.
Regarding availability in a consistent system: as long as more than half of the servers are up and connected to each other, they can elect a leader and operate consistently (using the Raft protocol, for instance). And as long as a client can reach at least a 1/k fraction of the servers in that functioning majority, it can just keep trying to connect to random servers until it finds a reachable one, in expected time proportional to k.
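As a minimal sketch of that retry behavior (hypothetical names; try_connect stands in for whatever your client library actually does): if a 1/k fraction of the servers is reachable, this loop succeeds after k attempts in expectation.

    import random

    def find_reachable(servers, try_connect):
        # Pick random servers until one answers. If a 1/k fraction of
        # `servers` is reachable, this is a geometric trial with success
        # probability 1/k, i.e. k attempts in expectation.
        while True:
            server = random.choice(servers)
            conn = try_connect(server)  # assumed to return None on timeout/refusal
            if conn is not None:
                return conn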
For systems within a single datacenter, redundant networking makes partitions almost impossible, and having enough hosts makes losing more than half almost impossible, so the only failure mode is losing the whole datacenter. For systems distributed among multiple datacenters, availability is almost guaranteed unless a global catastrophe causes a global Internet partition (such as Eurasia and America no longer being connected).
The only issues, again, are that it's more expensive, because you need more nodes and redundant networking, and that it's slower, because nodes have to communicate with each other before commits can be confirmed, especially if that has to happen across datacenters. In particular, write throughput does not necessarily scale with more nodes in a consistent system: all writes could modify the same value (after reading it), and that is not parallelizable in the general case.
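To make that last point concrete, here's a hedged sketch (the store.get / compare_and_swap interface is made up for illustration) of why read-modify-write on a single key serializes no matter how many writers you add:

    def increment(store, key):
        # Each writer must read the latest committed value before it can
        # write, so N concurrent increments to the same key still commit
        # one at a time; the losers retry against the newer value.
        while True:
            value, version = store.get(key)
            if store.compare_and_swap(key, version, value + 1):
                return  # committed; concurrent writers will retry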
derp, nope. No amount of redundant networking can avoid administrator mistakes or accidental "our redundancy all runs through the same conduit and ninjas chopped it in half during a new buildout."
better researched anecdotes: https://aphyr.com/posts/288-the-network-is-reliable
> enough hosts makes losing more than half almost impossible
Losing network connectivity is indistinguishable from losing hosts, though, and losing hosts is indistinguishable from your application just not responding any longer (dumb programming language GC pauses? kernel panic and halted? infinite docker deploy loop? SSD slowly failing so writes now take 1,000,000 times longer than before? Y2038 bug? all applications running on the same timer or compaction schedule and deciding to blip for the exact same 30 seconds?). So the effective failure probability for hosts is your network failure probability plus your host failure probability, and host failure probability includes software failures.
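For what it's worth, that indistinguishability is easy to demonstrate: a timeout-based probe (plain Python sockets here, nothing fancy) returns the exact same answer for a dead host, a paused process, and a partitioned network path.

    import socket

    def probe(host, port, timeout=1.0):
        # A timeout can't tell these apart: host down, process paused,
        # disk hanging, or network path partitioned.
        try:
            socket.create_connection((host, port), timeout=timeout).close()
            return "responding"
        except OSError:
            return "not responding (for any of the above reasons)"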
Your post was full of good information. Why did you have to start it by being an asshole?
more derp available at http://themajestichusky.tumblr.com/archive
You can just consider that as a failure of the whole datacenter.
The ninjas could just as well chop all the external network connections, which would result in actual datacenter failure (from a client/service PoV), so it shouldn't increase the rate that much.
You can of course say "it will never happen" and just let chance decide what happens to the data when a partition does occur.
Partitions and datacenter failures only determine whether the system is up or not, and thus its availability properties.
Plus, we live in cloud la la land these days. You have no idea how any of your machines/VMs are connected together. We can assume nothing.
When a partition does happen, you get to pick one of two options (sketched in the code below):
- availability: always accept writes, but reads in other partitions might not see them, so there's no consistency.
- consistency: writes fail or block until the partition is resolved, so there's no availability.
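Roughly, in (made-up) code, with can_reach_quorum, quorum_commit, and the rest as placeholder names:

    class Unavailable(Exception):
        pass

    def handle_write(replica, key, value, mode):
        if replica.can_reach_quorum():
            # no partition: both modes behave the same
            return replica.quorum_commit(key, value)
        if mode == "AP":
            # accept locally; the other side of the partition won't see it
            replica.local_log.append((key, value))
            return "ok (reads elsewhere may be stale or conflicting)"
        # mode == "CP": refuse rather than diverge
        raise Unavailable("partitioned: write rejected to preserve consistency")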
Clearly, parts of your data that are not affected by a partition might still be consistent and available.
The point is, in classical database systems 100% of transactions are consistent. In distributed databases this is only possible if you sacrifice availability some of the time, or alternatively you can sacrifice consistency some of the time.
The author of this paper is suggesting a more elaborate theory that involves network delay, which is great, though criticising what is effectively a mathematical truth seems strange.
> ...we can prove that certain levels of consistency cannot be achieved without making operation latency proportional to network delay.
'Consistency requires waiting' is a pretty well-known rule of thumb for distributed systems, but this is the first quantitative proof I've seen. It's really useful to see exactly which kinds of consistency impose latency and how that latency varies with network conditions.
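If memory serves, the bound in question traces back to Attiya and Welch (hedging: I'm quoting the shape of the result from memory, see the paper for the exact statement): in any sequentially consistent implementation over a network with delay d, the worst-case operation latencies must satisfy |write| + |read| >= d, so at least one of the two operation types has to wait on the network, while linearizability forces both reads and writes to wait.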
"network" doesn't necessarily have to mean high latency ethernet though. You can have a network running on top of an embedded backplane in a blade system. There are ways to minimize latency, but latency is as latency does.
(Edit: Worth pointing out that comes from Mr. CAP himself, Eric Brewer.)
That kind of takes the A out of CA.
The interesting thing about CA systems is that their behavior strongly resembles that of non-distributed systems. It's just like having one SPOF machine!
I don't get why people seem to have such a problem with the CA tradeoff but seem to understand the CP or AP tradeoffs. If you trade away partition tolerance and then have a partition, your clients need to handle that, just like they'd need to handle inconsistency, or unavailability, if one of those tradeoffs had been selected instead.
> In this paper we discussed several problems with the CAP theorem: the definitions of consistency, availability and partition tolerance in the literature are somewhat contradictory and counter-intuitive, and the distinction that CAP draws between “strong” and “eventual” consistency models is less clear than widely believed.

> CAP has nevertheless been very influential in the design of distributed data systems. It deserves credit for catalyzing the exploration of the design space of systems with weak consistency guarantees, e.g. in the NoSQL movement. However, we believe that CAP has now reached the end of its usefulness; we recommend that it should be relegated to the history of distributed systems, and no longer be used for justifying design decisions.