> I strongly disagree with this. Most of our interaction with the world is perceived consistently. When I touch an object, the sensation of the object and the force I exert upon the object is immediate and clearly corresponds to my actions. There is no perception of reordering my intentions in the real world.

You misunderstand me, and I appreciate that I should have been clearer: a single observer in a single location that is both the sole observer and creator of events will have a consistent view. But this isn't a very interesting scenario - the interactions that matter to us are operations that require synchronisation between at least two different entities separated in space.

Consider two identical objects, on opposite sides of the world, that represent the same thing to you and another person. When you move one, the other should move identically.

Suddenly, consistency requires synchronisation bounded by the time it takes to propagate signals between the two locations, to ensure only one side is allowed to move at once.
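To put rough numbers on it (mine, purely illustrative), the physics alone puts a floor on how fast such a synchronised move can complete:

    # Back-of-the-envelope sketch: a coordinated move between two sites on
    # opposite sides of the world can't complete faster than one round trip.
    ANTIPODAL_KM = 20_000       # rough half-circumference of Earth
    FIBER_KM_PER_S = 200_000    # light in fibre, roughly 2/3 of c
    one_way_s = ANTIPODAL_KM / FIBER_KM_PER_S
    print(f"round trip >= {2 * one_way_s * 1000:.0f} ms")  # ~200 ms floor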

This is the cost of trying to enforce consistency. If you lose your network, one or both sides (one, if you have a consistent way of determining who is in charge) become unable to modify, and possibly even to read, the state of the system, depending on how strict a consistency model you want to enforce. In this specific case, that might be necessary.

But most cases instead involve multiple parties doing mostly their own thing, occasionally needing to read or write information that affects other parties.

E.g. consider a forum system. It requires only a bare minimum of consistency. You don't want it to get badly out of sync, so that e.g. there's suddenly a flood of delayed comments arriving, but a bit of inconsistency works just fine. We have plenty of evidence for this in the form of mailing lists and Usenet, both of which run just fine over store-and-forward networks.
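To make that concrete, here's a minimal sketch (the tuple layout and names are mine, purely illustrative) of why this works: if each replica's comment set is merged by union, replicas converge no matter in which order, or how late, comments propagate:

    # Each comment is an (id, author, body, timestamp) tuple. Merging two
    # replicas is just set union: commutative, associative, idempotent, so
    # store-and-forward delivery order and duplicates don't matter.
    def merge(replica_a, replica_b):
        return replica_a | replica_b

    def render(comments):
        # Display order is recovered locally; late arrivals just slot in.
        return sorted(comments, key=lambda c: c[3])

    a = {("c1", "alice", "first!", 100)}
    b = {("c1", "alice", "first!", 100), ("c2", "bob", "a reply", 105)}
    assert merge(a, b) == merge(b, a)  # same state regardless of sync order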

Yet modern forum systems are often designed with the expectation of using database transactions to keep a globally consistent view. For what? If the forum is busy, the view will regularly already be outdated by the time the page has rendered on the user's screen.

> Furthermore, this availability vs consistency dichotomy is an overstatement of Brewer's theorem. For instance, linearizable algorithms like Raft and Multi-Paxos DO provide high availability (they'll remain available so long as a majority of nodes in the system operate correctly). In fact, Google's most recent database was developed to be highly available and strongly consistent [2].

They'll remain available to the portion of nodes that can reach a majority partition. In the case of a network partition, that is not at all guaranteed, and often not even likely. These solutions are fantastic when you do need consistency, in that they will ensure you retain as much availability as possible, but they do not guarantee full availability.

Spanner is interesting in this context because it greatly optimises throughput by relying on tightly bounded clock drift: if you know your clocks are precise enough, you can reduce the synchronisation needed, because you can order operations globally whenever timestamps are further apart than your maximum clock uncertainty. That helps a lot, but it still requires communication.
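A minimal sketch of the idea (not Spanner's actual API; the interval representation is my own): each event carries an uncertainty interval from its drift-bounded local clock, and two events can be ordered without any communication exactly when their intervals don't overlap:

    # Events carry (earliest, latest) bounds from a drift-bounded clock.
    def definitely_before(a, b):
        # a happened before b if a's latest possible time precedes
        # b's earliest possible time.
        return a[1] < b[0]

    e1 = (100.000, 100.007)  # timestamp +/- 7 ms clock uncertainty
    e2 = (100.020, 100.027)
    e3 = (100.005, 100.012)
    assert definitely_before(e1, e2)      # orderable for free
    assert not definitely_before(e1, e3)  # intervals overlap: coordinate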

In instances where you can safely forgo consistency, or forgo it within certain parameters, you can ensure full read/write availability globally (even in minority partitions), even in the face of total disintegration of your entire network. More importantly, it entirely removes the need to limit throughput to replication and synchronisation speeds. This is why it's worth considering carefully when you actually need to enforce consistency.



> You misunderstand me, and I appreciate that I should have been clearer: a single observer in a single location that is both the sole observer and creator of events will have a consistent view. But this isn't a very interesting scenario - the interactions that matter to us are operations that require synchronisation between at least two different entities separated in space.

My point in bringing that up is that applications don't present the appearance of entities separated in space; that's why weak consistency is unintuitive. We expect local realism from nearby objects, and we expect the same from an application: that its behavior reflects our immediate actions.

> E.g. consider a forum system. It requires only a bare minimum of consistency. You don't want it to get badly out of sync, so that e.g. there's suddenly a flood of delayed comments arriving, but a bit of inconsistency works just fine. We have plenty of evidence for this in the form of mailing lists and Usenet, both of which run just fine over store-and-forward networks.

Most threaded or log-like things fare well with weak consistency (because they're semi-lattices with lax concerns about witnessing inconsistency). I'm not saying there aren't examples where it's a tradeoff worth making; I'm saying that many applications are built on non-trivial business logic that coordinates several kinds of updates, where it is tricky to handle reordering.
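For a concrete (hypothetical) example of where reordering bites: a guarded withdrawal doesn't commute with a deposit, so two replicas that apply the same operations in different orders diverge, where a set union or max-based semi-lattice merge would not:

    # A guarded withdrawal is not commutative: replicas applying the same
    # ops in different orders end up in different states.
    def apply(balance, op):
        kind, amount = op
        if kind == "deposit":
            return balance + amount
        if kind == "withdraw":  # only succeeds if funds suffice
            return balance - amount if balance >= amount else balance
        return balance

    start = 50
    a = apply(apply(start, ("withdraw", 80)), ("deposit", 50))  # fails, then +50 -> 100
    b = apply(apply(start, ("deposit", 50)), ("withdraw", 80))  # +50, then -80 -> 20
    assert a != b  # replicas diverge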

> They'll remain available to the portion of nodes that can reach a majority partition. In the case of a network partition, that is not at all guaranteed, and often not even likely. These solutions are fantastic when you do need consistency, in that they will ensure you retain as much availability as possible, but they do not guarantee full availability.

Weakly consistent databases can't guarantee full availability either when a client is partitioned from them; in fact, there's no way to guarantee full availability of internet services to an arbitrary client. I think the importance of network partitions is somewhat overstated by HA advocates anyway. Partitions are convenient in the distributed systems literature because they can stand in for other types of failures, but link failures (and other networking failures; link hardware failures are disproportionately responsible for networking outages [1]) happen an order of magnitude less often than disk failures.

> In instances where you can safely forgo consistency, or forgo it within certain parameters, you can ensure full read/write availability globally (even in minority partitions), even in the face of total disintegration of your entire network. More importantly, it entirely removes the need to limit throughput to replication and synchronisation speeds. This is why it's worth considering carefully when you actually need to enforce consistency.

I'm all for giving careful consideration to your model. And I agree there are cases where weakening consistency makes sense, but I think they are fairly infrequent. Weakening consistency semantics per operation to the fullest extent can add undue complexity to your application by requiring some hydra combination of data stores, or come at a large cost by requiring you to roll your own solution to recover properties like atomic transactions. You called this "laziness" earlier, but I think it is actually pragmatism.

[1]: http://research.microsoft.com/en-us/um/people/navendu/papers...



