Consistency is Consistently Undervalued (kevinmahoney.co.uk)
220 points by kpmah on Sept 17, 2016 | 124 comments



My opinion is exactly the opposite: consistency is overvalued.

Requiring consistency in a distributed system generally leads to designs that reduce availability.

Which is one of the reasons that bank transactions generally do not rely on transactional updates against your bank account. "Low level" operations as part of settlement may use transactions, but the bank system is "designed" (more like it has grown by accretion) to function almost entirely by settlement and reconciliation rather than holding onto any notion of consistency.

The real world rarely involves having a consistent view of anything. We often design software with consistency guarantees that are pointless, because the guarantees can only hold until the data has been output, and the data is often obsolete before the user has even seen it.

That's not to say that there are no places where consistency matters, but often it matters because of thoughtless design elsewhere that ends up demanding unnecessary locks, killing throughput, failing if connectivity to some canonical data store happens to be unavailable, etc.

The places where we can't design systems to function without consistency guarantees are few and far between.


As you say, the low-level operations in banking do provide atomicity, which is often lacking in distributed databases.

Also, the banking system does have a system of compensating transactions (e.g. returned checks) when constraints are violated, as well as an extensive system of auditing to maintain consistency.

Do most distributed systems have the ability to detect inconsistency? How do they manage to correct problems when found? Is it mostly manual? I'm genuinely interested in hearing what people have done.

I've certainly experienced inconsistencies in very low-stakes online situations that were still annoying. For example, a reddit comment that was posted on a thread but never shows up on my user page. Or when I stopped watching a movie on Netflix on my PC to go watch something else with my wife on our TV, but Netflix won't let us because it thinks I'm still watching on the PC.

Come to think of it, there was also something that used to happen frequently with Amazon video, though I haven't seen it happen in a while. When I would resume a video, it gave me two choices of times to resume, one based on the device and one based on the server. The odd thing was that I had only ever used one device to watch the video but it was always the server-stored time that was right. Just a bug, I guess, but that does indicate the kind of inconsistencies that have to be resolved somehow.


> Do most distributed systems have the ability to detect inconsistency?

They do if you build them to have them. Your bank account works without requiring consistency because every operation is commutative: You never do anything that says "set account to value x". You have operations like "deposit x" and "withdraw x".

For operations that require you to move money from one account to another, you need to guarantee that both steps or neither gets durably recorded, and you may (or may not, depending on your requirements) insist on local ordering, but even there global consistency is unnecessary.
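A tiny sketch (hypothetical, in Python) of why deposit/withdraw operations commute while "set account to value x" does not:

```python
from functools import reduce

# Operations reified as signed deltas ("deposit 100", "withdraw 50",
# "deposit 25"): applied in any order, the final balance is the same.
ops = [+100, -50, +25]

def apply_ops(balance, deltas):
    return reduce(lambda b, d: b + d, deltas, balance)

assert apply_ops(0, ops) == apply_ops(0, ops[::-1]) == 75

# "Set balance to x" is not commutative: the last write wins, so the
# outcome depends entirely on arrival order.
sets = [100, 50]
assert sets[-1] != sets[::-1][-1]  # set(100);set(50) ends at 50; reversed, 100
```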

> How do they manage to correct problems when found? Is it mostly manual? I'm genuinely interested in hearing what people have done.

You design systems to make them self-correcting: Log events durably first, and then act on the events. If possible, make operations idempotent. If not, either make operations fully commutative, or group them into larger commutative operations and use local transactions (so require local consistency) to ensure they are all applied at once. The point is not to require no atomicity, but to localise any guarantees as much as possible, because the more local they are, the less they affect throughput.

On failure, your system needs to be able to determine which operations have been applied, and which have not, and resume processing of operations that have not.
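A minimal sketch of that log-first, idempotent-apply pattern (class and method names invented for illustration):

```python
import json
import os

# Log events durably first, then act on them. "apply" is idempotent
# (keyed by operation id), so replaying the log after a crash is safe.
class LoggedAccount:
    def __init__(self, path):
        self.path = path
        self.applied = set()   # operation ids already acted on
        self.balance = 0

    def log(self, op_id, delta):
        with open(self.path, "a") as f:
            f.write(json.dumps({"id": op_id, "delta": delta}) + "\n")
            f.flush()
            os.fsync(f.fileno())   # durable before we act on it

    def apply(self, op_id, delta):
        if op_id in self.applied:  # idempotent: a replay is a no-op
            return
        self.balance += delta
        self.applied.add(op_id)

    def recover(self):
        # After a failure, replay the whole log; idempotence means
        # already-applied operations are skipped and missing ones resume.
        with open(self.path) as f:
            for line in f:
                op = json.loads(line)
                self.apply(op["id"], op["delta"])
```

A process that dies between `log` and `apply` simply calls `recover()` on restart, which determines which operations have been applied and resumes the rest.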

> I've certainly experienced inconsistencies which were in very low-stakes online situations that were still annoying. For example, a reddit comment that was posted on a thread, but never shows up my user page. Or when I stopped watching a movie on Netflix on my PC to go watch something else with my wife on our TV, but Netflix won't let us because it thinks I'm still watching on the PC.

This type of thing is often down to copping out on guarantees when a system includes multiple data stores, and is often fixed by applying a local transactional event log used to trigger jobs that are guaranteed to get processed. E.g. consider the Reddit example - if you want to be able to effortlessly shard, you might denormalize and store data about the comment both tied to a thread and tied to a user. If these are in two different databases, you need some form of distributed transactional system.

A simple way of achieving that is to write a transactional log locally on each system (as long as operations are designed to not need coordination) that ensures that both "attach post to thread" and "attach post to user page" will get logged successfully before giving a success response to the end user. You can do that by using a commit marker in your log. Then you "just" need a job processor that marks jobs done when all operations in a transaction have completed (and that can handle re-running the same ones).
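A rough sketch of that commit-marker log plus job processor for the Reddit-style double write (all names invented; a list stands in for a durable local log):

```python
# Both denormalized writes are logged, then a COMMIT marker; only after
# the marker is durably logged is the user given a success response.
log = []

def submit_comment(txn_id, thread_id, user_id, text):
    log.append((txn_id, "attach_to_thread", thread_id, text))
    log.append((txn_id, "attach_to_user", user_id, text))
    log.append((txn_id, "COMMIT", None, None))  # ack the user after this

threads, users = {}, {}  # stand-ins for the two sharded databases

def process(log):
    # The job processor only acts on committed transactions, and each
    # step is idempotent, so it can safely be re-run after a crash.
    committed = {t for (t, op, _, _) in log if op == "COMMIT"}
    for txn_id, op, target, text in log:
        if txn_id not in committed:
            continue  # never ack'd to the user; ignore or retry later
        if op == "attach_to_thread":
            threads.setdefault(target, set()).add(text)
        elif op == "attach_to_user":
            users.setdefault(target, set()).add(text)
```

Because each step is idempotent, running `process` twice (e.g. after a mid-run failure) leaves the same result.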

But this is something people often fail to do, because it's easier to just write out the code for each step sequentially and deal with the rare instances where a server dies in the middle of processing a request.

So the basic pattern for ditching global consistency is to write transaction logs locally, representing each set of operations you need to apply, and put in place a mechanism to ensure each one is applied no matter what. Then use that "everywhere" including in the many places where you should've had guarantees like that, but don't because developers have been lazy.


> You design systems to make them self-correcting: Log events durably first, and then act on the events. If possible, make operations idempotent. If not, either make operations fully commutative, or group them into larger commutative operations and use local transactions (so require local consistency) to ensure they are all applied at once. The point is not to require no atomicity, but to localise any guarantees as much as possible, because the more local they are, the less they affect throughput.

You can't do this in general, because we often want to build applications where the user expects their actions won't be reordered. People often forget this: the user is sort of the unstated extra party in a distributed system. A canonical example is privacy settings on a social network. If I want to hide my profile from a user by unfriending them and then post something unkind about them, it doesn't really work for me that the application is free to reorder my actions.

At some point your desire to have HA via a weakly consistent system conflicts with intuition and usability of an application.


> You can't do this in general, because we often want to build applications where the user expects their actions won't be reordered.

The key being they don't expect their actions to be reordered. In most systems this is trivially handled by logging the event time as seen from the user's system and ensuring the user sees a consistent view of their own operations.

There are exceptions, naturally, but the point is not that you should be able to do this "in general", but that expecting global consistency is a relatively rare exception.

If I post a comment here, I expect my comment to be visible when I return to the page. But there is no reason for me to notice whether there is a strict global ordering to when posted comments first appear on the comment page. If someone presses submit on their post 100ms before me, but their comment first appears to me the next time I reload, while it appeared the opposite way for the other poster, it is extremely unlikely to matter.

> A canonical example is privacy settings on a social network. If I want to hide my profile from a user by unfriending them and then post something unkind about them, it doesn't really work for me that the application is free to reorder my actions.

This is a design issue. Your privacy settings are a local concern to your account; they most certainly do not need to be globally consistent. All you need to do to avoid problems with that is to ensure consistency in routing requests for a given session, and enforce local consistency of ordering within a given user account or session depending on requirements.

As long as you do that, it makes no difference to you or anyone else whether the setting and post were replicated globally in the same order or not.


> Your privacy settings are a local concern to your account; they most certainly do not need to be globally consistent. All you need to do to avoid problems with that is to ensure consistency in routing requests for a given session, and enforce local consistency of ordering within a given user account or session depending on requirements.

I think you're misunderstanding what the problem is here. I want global consistency (really I want two party consistency) to prevent the said user from witnessing the unkind thing I posted about them. It's necessary that my own actions appear in their original order to me, but also to anyone else who is plausibly affected by their reordering.

The full example also involves posting on a mutual friend's wall and having that post hidden by unfriending. You could route all requests to fetch posts to the owning poster's pinned set of servers, but this would be extremely expensive. Sharding solutions also weaken availability guarantees, because you necessarily shrink the set of nodes able to service a request.


> I think you're misunderstanding what the problem is here. I want global consistency (really I want two party consistency) to prevent the said user from witnessing the unkind thing I posted about them.

You don't want global consistency and you're saying it yourself.

What you want is to be presented with a view that is consistent with the actions you've taken (it's really UX 101).

The internals of the system don't matter to you. They only matter to us, as the builders. We can make the system internally 'inconsistent'[0] yet have it always present a 'consistent'[0] view to the user.

[0] I don't think the word consistent describes well the situation. Nothing is ever strictly perfectly consistent.


You seem to be assuming that I'm a UX designer. I'm not, I'm a backend engineer. Just a note though: I wouldn't say something like "The internals of the system don't matter to you. They only matter to us, as the builders." to a UX designer, since that's super condescending.

Sequential consistency is basically the most lax model you can get away with in this case (and have it generalize to other similar operations). This is a very strong consistency model and weakening linearizability to this basically provides no advantage for availability. There is a proof that anything stronger than causal consistency cannot provide "total availability", which I'd argue is somewhat mythical anyways (except when you're colocated with where data is being stored).

My point here is that this is clearly a case where many of the strongest ordering guarantees are required to get the desired behavior, and subsequently availability must be sacrificed.


I was making the distinction on being the user VS being the builder of the system. The user doesn't need to know or understand the internals.

> consistency... consistency... consistency

What if I am not aiming for consistency in the first place?

We both understand that "total availability" is mythical. So why not target partial availability in the first place?


I'm not saying you should not have availability. On the contrary, many strongly consistent algorithms actually have very good availability characteristics. The whole point of being distributed is often to increase availability. Algorithms like Raft and Multi-Paxos can provide availability so long as a majority of nodes remain reachable to the client, which I think is pretty acceptable in many cases.

This is sort of why the availability vs consistency dichotomy is a false dilemma. Sure, having linearizability means you cannot have total availability and vice versa, but generally you can have very good availability and preserve strongly consistent semantics.

What I disagree with is that trading strong consistency to achieve total availability is useful in many domains. To be fair, there are some: for write-heavy workloads that tend to materialize a readable view in the future, or for things like immutable caches, it can be extremely useful to weaken consistency. I just have often found that for the bulk of business logic, weaker consistency semantics make unacceptable tradeoffs for users.
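The majority-quorum availability point can be stated in one line (a sketch, ignoring partitions within the majority itself):

```python
# A Raft/Multi-Paxos style cluster can keep serving strongly consistent
# operations as long as a strict majority of its nodes are reachable.
def quorum_available(reachable_nodes, total_nodes):
    return reachable_nodes > total_nodes // 2

assert quorum_available(3, 5)       # 5-node cluster tolerates 2 failures
assert not quorum_available(2, 5)   # ...but not 3
assert quorum_available(2, 3)       # 3-node cluster tolerates 1 failure
```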


> It's necessary that my own actions appear in their original order to me, but also to anyone else who is plausibly affected by their reordering.

No, it's necessary that the effects appear in the right order, and in the example case what matters is that the privacy setting is correctly applied to the subsequent post and stored with it.

> The full example also involves posting on a mutual friend's wall and having that post hidden by unfriending

If that is a requirement - and it's not clear it would be a good user experience, as it would break conversations (e.g. any reply; and what do you do about quoted parts of a message?) and withdraw conversation elements that have already been put out there - it still does not generally require global consistency. It does require prompt updates.


Do you really need global consistency for that? I think you can get by with user-consistent ordering as long as your data model's right.

It sounds like you're saying that the friend-state for a given post is part of the permissions for the post, and therefore part of the post itself. In which case, when a user posts, I'd include the version number of their friend-state in the post's permissions. Then on view attempts, I make sure that the friend-state I'm working with is at least as new as the friend-state referred to by the post.
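A sketch of that versioned friend-state check (all names hypothetical):

```python
# Each post records the friend-state version it was created under; a
# viewer's replica must hold a friend-state at least that new before it
# is allowed to decide visibility.
def make_post(text, friend_state):
    return {"text": text, "min_version": friend_state["version"]}

def can_view(post, local_friend_state):
    if local_friend_state["version"] < post["min_version"]:
        return None  # replica is stale: refuse to decide until caught up
    return local_friend_state["friends"]

# I unfriend (version bumps to 1), then post something unkind.
state_v1 = {"version": 1, "friends": False}
post = make_post("something unkind", state_v1)

stale_replica = {"version": 0, "friends": True}
assert can_view(post, stale_replica) is None  # stale replica won't leak it
assert can_view(post, state_v1) is False      # up-to-date replica hides it
```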


There are definitely solutions for handling this with consistency semantics that are something weaker than linearizable consistency. But they usually become expensive in other ways and lose a lot of the availability benefits to recover the consistency properties you do need. Most of the models we're even talking about here are something in the realm of sequential or causal consistency, which are still considered strong consistency models (and are usually implemented on top of linearizable algorithms that allow for weakening invariants that protect against witnessing inconsistency; Spanner is an example of this).

Your friend-state solution only protects the visibility of the post if the friend-state between the two users has not changed more recently than both the post and your version.

I.e.:

Friend state 0 is we're friends.

Friend state 1 is we're not friends.

Friend state 2 is we're friends.

Friend state 3 is we're not friends.

I have friend state 2, and the post is at state 1. I accept my version incorrectly and am able to view the post, despite the fact that we're no longer friends.


I'm seeing a lot of comments that are basically saying, "we don't need no stinking transactions because we can mostly reinvent them in the app." And the point of the article was, yeah, or you could pick a solution that already solves this.


Was that the point of the article? I didn't read it that way. E.g.: "In conclusion, if you use micro-services or NoSQL databases, do your homework! Choose appropriate boundaries for your systems."


You are right. The article says to understand your choice, it doesn't say WHAT to choose.

Pretty sure the subtext is that developers are making their lives harder than they need to by not understanding the choice they are making.


Do you perhaps have a different example that might make the problem more apparent here? Because if I post something you don't want to see, give you permission to see it, and then take it away again, I can live with you seeing it. I probably won't even know if you saw it because there was some sort of system glitch or if you just happened to look during friend state 2.


> I want two party consistency

aka snapshot consistency.

(Thoroughly enjoyed your "the user is sort of the unstated extra party in a distributed system", btw.)


Correct me if I'm wrong, but I feel like bank transactions aren't always commutative. Consider:

balance = 0
balance += 100
balance -= 50

You can't swap the order of addition and subtraction, because that would mean withdrawing from an empty account, and afaik you can't do that. Is there something I'm missing?


You can execute a withdrawal operation on an empty bank account. E.g. a credit card or a loan account are both accounts that are allowed to go negative, but regular accounts with overdrafts also do, and regular accounts without overdrafts are also often allowed to go negative (but often with prohibitive fees and restrictions attached).

It is entirely up to the bank whether or not the operation will be allowed to complete, but regardless of the response from the bank, the end result of the operations will be correct. It may differ, though (it will differ if the bank would reject an operation in one order, but accept all of them when presented in a different order).

You are right that they are therefore technically not strictly commutative, in the sense that the outcome may differ in edge cases.

But the point is that what will not happen, is that you will get incorrect results. E.g. money won't "disappear" or "appear" spontaneously.


There are a couple ways you could do this, though according to jerf in a comment below, banks often don't implement transactions as commutative.

Most commonly things like this are done by somehow reifying the operation as data and attaching some information to determine ordering. One way would be to attach a timestamp to bank transactions. By itself, that would be enough to ensure bank transactions commute (but without bounds on clock synchronization, wouldn't guarantee that you see those operations in the order you issue them). You could also require a single consistent read to an account to get a state version number and attach an incremented version of that in your transaction. Ties in both cases could be resolved by account id of the person initiating the transaction.
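A sketch of reifying operations as data with ordering information attached (timestamp plus an initiator id as a tie-break; names invented):

```python
# Each transaction is a record carrying its own ordering information, so
# logs from different replicas can be merged and applied in one agreed
# order, regardless of which replica does the merging.
def merge_and_apply(balance, *logs):
    ops = sorted((op for log in logs for op in log),
                 key=lambda op: (op["ts"], op["initiator"]))
    for op in ops:
        balance += op["delta"]
    return balance

log_a = [{"ts": 1, "initiator": "alice", "delta": +100}]
log_b = [{"ts": 2, "initiator": "bob", "delta": -50}]

# Merging in either order yields the same result.
assert merge_and_apply(0, log_a, log_b) == merge_and_apply(0, log_b, log_a) == 50
```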


I don't have much to add to your excellent post; I just wanted to underline this:

> Log events durably first, and then act on the events.

Nothing has changed my view of how to build systems more than the log-then-act paradigm. So many problems go away when I start from that perspective.

I think it took me so long to see it because so many common tools work against it. But those tools were built in an era when most systems had "the database", a single repository of everything. That paradigm is very slowly dying, and I look forward to seeing what tools we get when the distributed-systems paradigm becomes dominant.


Exactly, not least because it is fantastic for debugging to be able to roll a copy of a sub-system back and re-process operations and watch exactly what breaks instead of trying to figure things out from a broken after-the-fact system state.


Does commutative mean something different in CS than in math? In math, commutative means that the order of operands doesn't matter.

For example: addition is commutative, because 1+2 = 2+1

Subtraction, however, isn't.

Since a transfer is basically a transfer from one account to the other, it couldn't possibly be commutative in a mathematical sense.

I suspect there might be a better term for the property you are trying to name.

"Atomic" seems adequate to me...


(TL;DR - this is a concept that computer science borrowed from math)

In math (say abstract algebra / group theory), the operations on your database form a group under composition. Given two operations f,g (say f is "subtract $10" and g is "subtract $20"), the composition of those operations could be written "f•g" (depending on notation this can mean g is done first or f is done first, typically the former convention is used).

(So, instead of numbers and plus (+) in your example, we're using functions and composition (•) as our group.)

This group, of operations under composition, is said to be commutative if f•g = g•f for all f and g. (That is for all combinations of your possible operations.)

For addition/subtraction operations you're good to go.

If you consider operations on a hash table to be things like 'set the value of key "a" to 10', the order in which the operations are executed does matter, and the group of those operations, under composition, is not commutative.

The subject of CRDTs in computer science is an attempt to create data types whose operations are commutative.

One reason to want to deal with commutative operations is that you don't have to figure out the order in which the operations officially occurred.
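The f•g = g•f condition can be checked directly (a Python sketch; `f` and `g` are the two example operations above):

```python
# Deposit/withdraw operations commute under composition...
f = lambda balance: balance - 10   # "subtract $10"
g = lambda balance: balance - 20   # "subtract $20"
assert f(g(100)) == g(f(100)) == 70

# ...but "set" operations on a hash table do not: last writer wins,
# so composing them in a different order gives a different table.
set_10 = lambda table: {**table, "a": 10}   # set key "a" to 10
set_20 = lambda table: {**table, "a": 20}   # set key "a" to 20
assert set_10(set_20({})) == {"a": 10}
assert set_20(set_10({})) == {"a": 20}      # order changes the outcome
```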


A bank transfer is rarely atomic unless within the same bank (and even then it depends how centralised their operations are). A transfer decomposes into operations on two separate accounts, which may involve additional steps (e.g. the sending bank will record having received a request to transfer it, record it's taken money out of the source account, record settlement, record the transfer has completed, etc.)

When talking about the operations being commutative, we're generally talking about each individual account, though it may apply wider as well.

In any case, the difference is that we are not talking about the elements of a given operation. You are right that if you see the operations as being "balance = balance + x" or "balance = balance - y", then the subtraction in the latter is not commutative.

But the operations seen as indivisible units are commutative.

Consider the operations instead as a list, such as e.g. "add(x); sub(y)" (in reality, of course, we'd record each transaction into a ledger, and store a lot more information about each).

See that list as an expression with ";" as the operator. This is commutative - the operands to ";" can be reordered at will. That ability to reorder individual operations is what we're talking about in this context, not the details of each operation.


My (admittedly limited, so far) experience in the financial industry has indicated that, while the norm is to use settlement and reconciliation in situations where making ACID-style guarantees is impossible, ACID-style guarantees are most emphatically _not_ forgone where it isn't necessary to do so.

One big reason for that is that the settlement and reconciliation process is easier to manage when the parties involved can at least be confident that their own view of the situation is internally consistent. Gremlin-hunting is a lot less expensive if you can minimize the number of places for gremlins to hide.


The bank example really needs to die; it's not reflective of what people often use databases for.

1. Bank transactions are easy to express as a join-semilattice[1], which means it's intrinsically easy to build a convergent system of bank accounts in a distributed environment. Many problems are not easily expressible this way, and usually trade off an unacceptable amount of space when formulated as a join-semilattice that still works with complicated types of transactions.

2. A bank transaction has a small upper bound on the types of inconsistency that can occur: at worst you can make a couple overdrafts.

3. A bank transaction has a trivial method for detecting and handling an inconsistency. Detection is easy because it's just a check that the transaction you're applying doesn't leave the account with a negative balance. Handling is easy because the bank can just levy an overdraft charge on an account holder (and since they have the ability to collect debts on overdrawn accounts, this easily works for them). Complicated transactions often don't have a trivial method of detecting inconsistency, and handling inconsistency is rarely as simple as being able to punish the user. In fact corrective action (and likewise, inconsistency) is often non-intuitive to users, because we expect causality in our interaction with the world.
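For point 1, a minimal sketch of what "expressible as a join-semilattice" buys you: a PN-counter-style account where merge is elementwise max (illustrative only, not a production CRDT):

```python
# Per-replica totals of deposits (P) and withdrawals (N). Merge takes
# the elementwise max: commutative, associative and idempotent - the
# join of a semilattice - so replicas converge no matter the delivery
# order or how often updates are re-delivered.
def merge(c1, c2):
    return {k: max(c1.get(k, 0), c2.get(k, 0)) for k in set(c1) | set(c2)}

def balance(p, n):
    return sum(p.values()) - sum(n.values())

# Replica r1 recorded a $100 deposit; replica r2 a $30 withdrawal.
p1, n1 = {"r1": 100}, {}
p2, n2 = {}, {"r2": 30}

assert merge(p1, p2) == merge(p2, p1)                # order-insensitive
assert balance(merge(p1, p2), merge(n1, n2)) == 70   # converged balance
```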

> The real world rarely involves having a consistent view of anything

I strongly disagree with this. Most of our interaction with the world is perceived consistently. When I touch an object, the sensation of the object and the force I exert upon the object is immediate and clearly corresponds to my actions. There is no perception of reordering my intentions in the real world.

We expect most applications to behave the same way, because the distributed nature of them is often hidden from us. They present themselves as a singular facade, and thus we typically expect our actions to be reflected immediately and in the order we issue them.

Furthermore, this availability vs consistency dichotomy is an overstatement of Brewer's theorem. For instance, linearizable algorithms like Raft and Multi-Paxos DO provide high availability (they'll remain available so long as a majority of nodes in the system operate correctly). In fact, Google's most recent database was developed to be highly available and strongly consistent [2].

[1]: https://en.wikipedia.org/wiki/Semilattice

[2]: http://research.google.com/archive/spanner.html


"Bank transactions are easy to express as a join-semilattice[1],"

That may be true for a given set of transactions in a very abstract way, but that has some serious practical problems. First, like it or not, banks have a history of making transactions non-associative and non-commutative; there are many reported cases from the wild of banks deliberately ordering transactions in a hostile manner so that the user's bank account drops below the fee threshold briefly even though there is also a way to order the transactions so that doesn't happen. So as a developer, telling your bank management that you are going to use a join-semilattice (suitably translated into their terms) is also making a policy decision, a mathematically strong one, about how to handle transactions that they are liable to disagree with.

There is also the problem of variable time of transmission of transactions meaning that you may still need to have some slop in the policies, in a way that is not going to be friendly to the lattice.

The nice thing about join semilattices and the similar mathematical structures is that you get some really nice properties out of them. Their major downside, from a developer point of view, is that typically those properties are very, very, very fragile; if you allow even a slight deviation into the system the whole thing falls apart. (Intuitively, this is because those slight deviations can typically be "spread" into the rest of the system; once you have one non-commutative operation in your join-semilattice it is often possible to use that operation in other previously commutative operations to produce non-commutative operations... it "poisons" the entire structure.)

I think you're badly underestimating the difficulty of the requirements for anything like a real bank system.


Apologies, you're right, I've never implemented bank accounting in the real world. I'm only talking about it in the sense of an account balance as a "toy problem", where it's often used to discuss the virtues of weak consistency by being able to use overdraft charges.

Whether or not it's actually implemented with something like CRDTs in the real world isn't really my point; my point is that it's a cherry-picked example of a system that clearly does reorder operations and has weaker-than-usual concerns about witnessing inconsistency.


> There is no perception of reordering my intentions in the real world.

Never tried to type something by thinking only of the word (rather than specifically the sequence of key strokes) only to have your fingers execute the presses in slightly the wrong order -- giving you something like "teh" instead of "the"? I have, a lot actually, and if I'm in a hurry or distracted, it can even pass by unnoticed, leaving me with the impression I typed the correct word while having actually executed the wrong sequence. (Hurry or distracted I think would be the most analogous to systems which don't take the time to register aggregate feedback and verify correct execution after the fact.)

Humans have at least one experience of trying to execute a distributed sequence of actions and having the output mangled by the actual ordering of events -- an experience so common they invented a word for it: the "typo".

It's insanely easy to explain to people that sometimes really complicated systems make a typo when they're keeping their records, because it's such a shared, common experience.


It's a cool metaphor, and I'm sure there are occasions where it's easy to pass this off as innocuous. But inconsistent semantics can produce blemishes and bugs that can be difficult to anticipate and can very negatively impact user experience if you aren't careful. The typo explanation is acceptable when I just see some comment I posted being reordered after a page refresh. It's less so if I get locked out of my account because my password change hasn't fully replicated to all servers (and even less so if my consistency model doesn't converge).

To be clear: there's an economic tradeoff to make here that isn't identical at all scales. At a certain point, lost revenue due to user dropoff from unavailability is more expensive than users who stop using a site due to a frustrating experience (and sometimes that scale itself is the compelling feature able to stave off a user from giving up on your app completely). But in most cases, I think it's advisable to start with stronger consistency semantics and only weaken them when a need presents itself. Hell, Spanner is the foundation for Google's ad business; if that's not living proof that you can preserve strong consistency semantics and keep your system sufficiently available, I'm not sure what is.


> The typo explanation is acceptable when I just see some comment I posted being reordered after a page refresh. It's less so if I get locked out of my account because my password change hasn't fully replicated to all servers (and even less so if my consistency model doesn't converge).

The typo model is merely meant to explain why it happened. You'd be no more happy if you were locked out of a door because a locksmith fat-fingered the code on setting a pin. Not all typos are harmless, and that wasn't the point of my analogy.

Helping people understand why these technology mistakes happen -- through intuitive models like typos -- helps us discuss when mistakes are and aren't acceptable, and how we're going to address the cases where they're not. It simply gives me a way to discuss it with non-technical people: under what situations would a person doing the computer's job making a typo be catastrophic instead of merely annoying?

Most managers or clients can understand that question, at an intuitive level.

> But in most cases, I think it's advisable to start with stronger consistency semantics and only weaken them when a need presents itself.

Completely agree! Unless you have a specific reason it's necessary for you, you likely don't have to actually address the inconsistency directly and are better off letting the database handle that for you.


Oh apologies, I misunderstood what you were saying! Yes, agreed, this is a good method for communicating what's occurring in weakly consistent systems to stakeholders. Definitely going to use this in the future.


Do you (or anyone) know of a better example than the bank account? I wouldn't be against changing it.

I wanted to communicate through a fairly succinct and understandable example that there are potentially high stakes, and I wanted to show a 'NoSQL' example rather than a micro-services one. It was the first one that came to mind.


Apologies, I didn't mean it as a critique of your post. What I meant was that it's often used as a strawman example of why you don't need consistency[1]. Bank transactions are extremely simple compared to the majority of things we build on top of databases.

[1]: http://highscalability.com/blog/2013/5/1/myth-eric-brewer-on...


Also, banks have very low consistency requirements.

Almost every transaction is of the form: 1. estimate authorization; 2. present request for settlement; 3. settle during batch processing; 4. any additional resolution steps (eg, returned checks)

The estimator used in 1 need not be fully consistent because a) as long as the periods of inconsistency are low (eg, < 10 sec) it won't affect the usage patterns of most accounts and b) the bank need only quantify the risk to their system (and keep it below the cost of improving the system), rather than have it be risk-free. Risk management techniques like preventing rapid, high-value transactions on top of a basic estimated value like "available balance" are fine for the estimation used in the authorization phase.

The actual settlement is run as a special batch process during hours when the bank can't receive new transactions, making it a special case where consistency of the outcome is easy to implement. It's acceptable if the settlement leaves some number of accounts negative or requiring additional steps (such as denying the transaction to the presenting bank), as long as that number isn't too high, i.e. the estimator in step 1 is right most of the time.

Basically, the only requirement the banks have is that a) they can estimate how inconsistent or wrong their system is until their batch settlement where they resolve the state and b) the number of accounts which will be in error when settled remains low. Then they just hire people to sort out some of the mess and eat some amount of losses as easier than trying to make the system perfect.

The fact is that banks make better use of what databases are and aren't than most designs, because they're very clear about when and where they want consistency (weekdays, 6am Eastern in the US) and the (non-zero!) rate at which the system can be in error when it resolves the state. Any application that can be clear about those can get good usage of databases.
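To make the shape of that concrete, here's a toy sketch of the estimate/authorize/settle flow described above (all names and numbers are hypothetical, not how any real bank's system works):

```python
# Hypothetical sketch of the estimate/settle pattern described above.
# Authorization (phase 1) checks a possibly-stale "available balance"
# estimate; settlement (phase 3) runs later as a batch and flags the
# accounts that ended up negative for manual resolution (phase 4).

class Account:
    def __init__(self, balance):
        self.estimated = balance   # fast, possibly stale estimate
        self.settled = balance     # authoritative, updated in batch
        self.pending = []          # transactions awaiting settlement

def authorize(account, amount):
    """Phase 1: cheap check against the estimate; may be wrong."""
    if account.estimated >= amount:
        account.estimated -= amount
        account.pending.append(amount)
        return True
    return False

def settle(accounts):
    """Phase 3: batch settlement; returns accounts needing resolution."""
    overdrawn = []
    for acct in accounts:
        for amount in acct.pending:
            acct.settled -= amount
        acct.pending.clear()
        acct.estimated = acct.settled  # reconcile the estimate
        if acct.settled < 0:
            overdrawn.append(acct)     # phase 4: hand off to a human
    return overdrawn
```

The point of the sketch is that `authorize` never needs a globally consistent view; only the batch `settle` pass does, and it runs when nothing else is writing.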


> The estimator used in 1 need not be fully consistent because a) as long as the periods of inconsistency are low (eg, < 10 sec) it won't affect the usage patterns of most accounts

A lot of transactions that need to be reconciled happen without involving 1, e.g. checks and offline card transactions. Even ATMs can often continue to dispense cash when their network is offline, though arguably 1 applies there in the form of maximum withdrawal limits.

But in general I agree with you - their system is built around having accepted, calculated and accounted for the risks rather than trying to mitigate every possible risk. Which makes sense, because a large proportion of a bank's business is to understand and manage risk, and some risks are simply cheaper to accept than to mitigate via technology or business processes (and the ones that aren't, or aren't any more, are where you'll see banks push new technology, such as making more settlements happen faster and fully electronically).


> I strongly disagree with this. Most of our interaction with the world is perceived consistently. When I touch an object, the sensation of the object and the force I exert upon the object is immediate and clearly corresponds to my actions. There is no perception of reordering my intentions in the real world.

You misunderstand me, and I appreciate that I should have been clearer: a single observer in a single location that is both the sole observer and creator of events will have a consistent view. But this isn't a very interesting scenario - the types of interactions that matter to us are operations that require synchronisation between at least two different entities separated in space.

Consider if you have two identical objects, on opposite sides of the world, that represents the same thing to you and another person. When you move one, the other should move identically.

Suddenly consistency requires synchronisation limited by the time it takes to propagate signals between the two locations to ensure only one side is allowed to move at once.

This is the cost of trying to enforce consistency. If you lose your network, suddenly both, or at least one (if you have a consistent way of determining who is in charge), are unable to modify, and possibly even to read, the state of the system, depending on how strictly you want to enforce consistency. In this specific case, that might be necessary.

But most cases rather involves multiple parties doing mostly their own things, but occasionally needing to read or write information that will affect other parties.

E.g. consider a forum system. It requires only a bare minimum of consistency. You don't want it to get badly out of sync, so that e.g. there's suddenly a flood of delayed comments arriving, but a bit of lag works just fine. We have plenty of evidence in the form of mailing lists and Usenet, both of which can run just fine over store-and-forward networks.

Yet modern forum systems are often designed with the expectation of using database transactions to keep a globally consistent view. For what? If the forum is busy, the view will regularly already be outdated by the time the page has rendered on the user's screen.

> Furthermore, this availability vs consistency dichotomy is an overstatement of Brewer's theorem. For instance, linearizable algorithms like Raft and Multi-Paxos DO provide high availability (they'll remain available so long as a majority of nodes in the system operate correctly). In fact, Google's most recent database was developed to be highly available and strongly consistent [2].

They'll remain available to the portion of nodes that can reach a majority partition. In the case of a network partition that is not at all guaranteed, and often not likely. These solutions are fantastic when you do need consistency, in that they will ensure you retain as much availability as possible, but they do not at all guarantee full availability.

Spanner is interesting in this context because it does some great optimization of throughput, by relying on minimising time drift: If you know your clocks are precise enough, you can reduce synchronisation needed by being able to order operations globally if timestamps are further apart than your maximum clock drift, which helps a lot. But it still requires communication.
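The ordering trick can be sketched in a few lines (a heavy simplification of Spanner's TrueTime reasoning; the drift bound is a made-up number):

```python
# Rough sketch of ordering by bounded clock uncertainty (a simplified
# TrueTime-style idea). EPSILON is an assumed maximum clock drift.

EPSILON = 0.007  # e.g. 7 ms max clock uncertainty; hypothetical bound

def interval(timestamp):
    """A local timestamp really means a range [t - eps, t + eps]."""
    return (timestamp - EPSILON, timestamp + EPSILON)

def definitely_before(t1, t2):
    """True only if t1's whole uncertainty interval precedes t2's:
    only then is it safe to order the two events globally without
    any cross-node communication."""
    return interval(t1)[1] < interval(t2)[0]
```

Events whose timestamps are further apart than the drift bound order for free; only the close ones need actual coordination, which is why tight clocks help throughput so much.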

In instances where you can safely forgo consistency, or forgo consistency within certain parameters, you can ensure full read/write availability globally (even in minority partitions) even in the face of total disintegration of your entire network. But more importantly, it entirely removes the need to limit throughput based on replication and synchronisation speeds. This is the reason to consider carefully when you actually need to enforce consistency.


> You misunderstand me, and I appreciate that I should have been clearer: a single observer in a single location that is both the sole observer and creator of events will have a consistent view. But this isn't a very interesting scenario - the types of interactions that matter to us are operations that require synchronisation between at least two different entities separated in space.

My point in bringing that up is that applications don't present the appearance of entities separated in space; that's why weak consistency is unintuitive. We expect local realism from nearby objects, and in the same way we expect an application to immediately exhibit the effects of our actions.

> E.g. consider a forum system. It requires only a bare minimum of consistency. You don't want it to get badly out of sync, so that e.g. there's suddenly a flood of delayed comments arriving, but a bit of lag works just fine. We have plenty of evidence in the form of mailing lists and Usenet, both of which can run just fine over store-and-forward networks.

Most threaded or log-like things are examples that fare well with weak consistency (because they're semi-lattices with lax concerns about witnessing inconsistency). I'm not saying that there aren't examples where it's a tradeoff worth making; I'm saying that many applications are built on non-trivial business logic that coordinates several kinds of updates where it is tricky to handle reordering.

> They'll remain available to the portion of nodes that can reach a majority partition. In the case of a network partition that is not at all guaranteed, and often not likely. These solutions are fantastic when you do need consistency, in that they will ensure you retain as much availability as possible, but they do not at all guarantee full availability.

Weakly consistent databases can't guarantee full availability either when a client is partitioned from them. In fact there's no way to guarantee full availability of internet services to an arbitrary client. I think the importance of network partitions is somewhat overstated by HA advocates anyway. They're convenient in distributed systems literature because they can stand in for other types of failures, but link failures (and other networking failures, though link hardware failures are disproportionately responsible for networking outages[1]) happen an order of magnitude less often than disk failures.

> In instances where you can safely forgo consistency, or forgo consistency within certain parameters, you can ensure full read/write availability globally (even in minority partitions) even in the face of total disintegration of your entire network. But more importantly, it entirely removes the need to limit throughput based on replication and synchronisation speeds. This is the reason to consider carefully when you actually need to enforce consistency.

I'm all for giving careful consideration to your model. And I agree, there are several cases where weakening consistency makes sense, but I think they are fairly infrequent. Weakening consistency semantics per operation to the fullest extent can add undue complexity to your application by requiring some hydra combination of data stores, or come at a large cost by requiring you to roll your own solution to recover some properties like atomic transactions. You called this "laziness" earlier, but I think this is actually pragmatism.

[1]: http://research.microsoft.com/en-us/um/people/navendu/papers...


>designs that reduces availability.

It's a slimy abuse of language to call a service "available" when it's giving incorrect output and corrupting its persistent state.

I can write a program with lots of nines that is blazingly faster than its competitors if its behavior is allowed to be incorrect.


A lot of the time there is no "correct" output.

E.g. a search result from a search engine is outdated the moment it has been generated - new pages come into existence all the time, and besides, the index is unlikely to be up to date.

What matters is whether "close enough" is sufficient, and a large proportion of the time it is. E.g. for the search engine example, I don't care if I get every single page that matched my query right at the moment I pressed submit. I care that I find something that meets my expectations.

Getting "good enough" results, and getting them in a reasonable time, beats getting no results or waiting ages because the system insists on maintaining global consistency.

Yes, there are cases where "good enough" means you need global consistency, but they are far fewer than people tend to think.


> I can write a program with lots of nines that is blazingly faster than its competitors if its behavior is allowed to be incorrect.

Congratulations! You just understood distributed systems, or, put more simply, the real world of app development.

Most of the programs in the real world ARE allowed to be incorrect at times.

Corollary: Most of the programs in the real world do NOT need to be correct every single time.

These statements should be on top of system design and requirement gathering classes.


Banks are forced to be eventually consistent because of their distributed nature. If it was practical to have a big central ACID database I'm sure that's what they'd use. There is an extra layer of complexity with distributed systems.

That said, if the problem has been thought about, I'm happy. My frustration is with people not understanding the trade-offs.


The problem is that a big central ACID database enforces strict limits on throughput, limited at best by the speed of light (in reality it's worse).

Enforcing a globally consistent view basically requires coordination. Coordination requires that a signal traverse a distance and be acted on, and the serialisation point can go no faster than the instruction throughput of a single CPU core.

When the sources of operations are geographically spread out, you will be limited based on the fastest communication possible between those locations in addition to the processing latency.

Ultimately this means that there are scaling limits to throughput of a fully serialised system that causes real problems all the time.

Sometimes a single global truth is still fast enough. Very often it isn't, when e.g. your users are spread out globally. That's why it's important to consider avoiding consistency when it is not necessary.
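A quick back-of-the-envelope illustration of those limits (the figures are rough assumptions, not measurements):

```python
# Back-of-envelope sketch: a fully serialised system that must
# coordinate across a geographic link can commit at most ~1/RTT
# transactions per second per serialisation point. The speed figure
# (~2/3 c in optical fibre) and distances are rough assumptions.

SPEED_IN_FIBER_KM_S = 200_000  # approx. speed of light in fibre

def max_serialised_tps(distance_km):
    """Upper bound on serialised commits/sec if each commit must wait
    for a full round trip over the given distance (processing time
    and routing overhead ignored, so reality is worse)."""
    round_trip_s = 2 * distance_km / SPEED_IN_FIBER_KM_S
    return 1 / round_trip_s

# New York <-> London (~5,600 km): roughly 17-18 fully serialised
# commits per second, no matter how fast your CPUs are.
```

That's the ceiling physics imposes before any software inefficiency; pipelining and batching help only when operations don't have to wait on each other, which is exactly what forgoing consistency buys you.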


I guess whether it's overvalued or undervalued depends on whom we're talking about. Your point is well taken, but I think the real problem is that inexperienced programmers want the best of both worlds. In other words, they design systems that require consistency to function correctly because it's easier, but then they do not actually guarantee said consistency, because hey, schemaless is so much faster than boring old SQL.

The way I like to think about it (and mind you, I'm an early-stage kind of guy, so I rarely work on huge systems) is that consistency guarantees give you a lot of leverage, and while it's almost unattainable in the real world, in the computing world we have extremely powerful CPUs and networks that allow communication between any two points on the globe in less than a second. There is frankly a huge number of applications that can be built on an ACID database and never have to worry about the offline or distributed use cases that would require building a proper distributed system.


The bank example is terrible.

The inconsistency is an absolute PITA for the consumer. The delays are are a terrible, terrible user experience, and the consumer has to spend more money in the end.

It's extremely expensive, both in money the bank has to spend on auditing and in the legal framework that has to be put in place.

The bank example is the perfect argument for consistency, not against.


It's a perfect argument for shorter transaction processing times. The bank inconsistency is there because historically it was impossible to guarantee consistency without making the processing times far worse. E.g. want to make a payment via check? Sorry, need to send a courier to the branch holding your account to verify the funds are there, and then afterwards send a courier to inform of the updated account state before another payment would be allowed.

But the shorter you make the processing time, the less benefit you get from introducing consistency requirements, to the point where the important thing to address is the remaining cases where processing transactions takes time, not the consistency (at least as long as banks also don't take the piss by exploiting the lax consistency to charge unwarranted overdraft fees).


This!

There is no single "truth".

Under the unfortunate consequences of the CAP theorem and the speed of light, consistency grinds to a halt as the number of observers increases. Time itself would have to stand still in the limit if at each time step every particle in the universe had to agree on the state of the system.

What we need to preserve is mostly strong "causal consistency", which means that from a single observer's perspective, the system "appears consistent".

This article is really good and isn't obscured by NoSQL vs SQL arguments.

http://www.grahamlea.com/2016/08/distributed-transactions-mi...


For the specific case of distributed datastores, my threshold for considering any NoSQL/NewSQL option is scaled writes and consistent reads. Designing a system around consistent reads is much simpler, with no quirks that need to be explained to end-users.


I think you are saying that consistency matters, but that a system is more robust with settlement and reconciliation, where consistency is like a luxury that can optimize the already-robust method.


In most cases what matters is very localised consistency - e.g. I want to see the comments I post on a comment page right away - while global consistency matters much less, as long as the system keeps converging (it may or may not be consistent at any given point in time, but the older a change is, the more likely it is to appear the same everywhere).
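One standard way to get that localised guarantee is session-level read-your-writes: the client remembers the version of its own last write and refuses to read from a replica that hasn't caught up. A toy sketch (all names hypothetical):

```python
# Sketch of session-level read-your-writes. A client tracks the
# highest version it has written and only accepts reads from replicas
# that have replicated at least that far; global consistency is never
# required, only "I can see my own comment".

class Replica:
    def __init__(self):
        self.version = 0
        self.data = {}

    def apply(self, key, value, version):
        self.data[key] = value
        self.version = version

class Session:
    def __init__(self):
        self.last_written = 0  # highest version this client wrote

    def write(self, primary, key, value):
        self.last_written = primary.version + 1
        primary.apply(key, value, self.last_written)

    def read(self, replica, key):
        if replica.version < self.last_written:
            return None  # stale for *this* session; try another replica
        return replica.data.get(key)
```

Other clients happily read from any replica; only the writer's own session is pinned to sufficiently fresh ones, which is far cheaper than global coordination.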


An absolute need for consistency is rare, unless you're dealing with anything remotely similar to "financial transactions".

Thankfully the author's premise of "NoSql == inconsistency" is wrong in at least one case: Google Cloud Datastore offers ACID compliance. I use that to great effect (I run a subscription SaaS with usage quotas).


There's a big difference between availability being predictable and unpredictable - it's called consistency.

Picking a pattern, and sticking to it is fundamental to systems flow.

Yes, having "randomness" is important, but trying to cultivate chaos out of utter disorder is something that, at least historically speaking, people are collectively not good at.

Chaos functions best when it's on the edge of a system, not embedded in it.


Amen. Whether or not the article's example is a good one, in a world without consistency you need to worry about state between _any_ two database operations in the system, so there's nearly unlimited opportunity for this class of error in almost any application found in the real world.

The truly nefarious aspect of NoSQL stores is that the problems that arise from giving up ACID often aren't obvious until your new product is actually in production and failures that you didn't plan for start to appear.

Once you're running a NoSQL system of considerable size, you're going to have a sizable number of engineers who are spending significant amounts of their time thinking about and repairing data integrity problems that arise from even minor failures that are happening every single day. There is really no general fix for this; it's going to be a persistent operational tax that stays with your company as long as the NoSQL store does.

The same isn't true for an ACID database. You may eventually run into scaling bottlenecks (although not nearly as soon as most people think), but transactions are darn close to magic in how much default resilience they give your system. If an unexpected failure occurs, you can roll back the transaction you're running in, and in almost 100% of cases this turns out to be a "good enough" solution, leaving your application state sane and data integrity sound.
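For anyone who hasn't leaned on this: the whole pattern is just a try/rollback around the transaction. A minimal illustration, using sqlite3 only because it ships with Python (a server database gives you the same semantics):

```python
# Illustration of the "roll back and stay sane" property of ACID
# databases. sqlite3's connection context manager commits on success
# and rolls back on any exception. Table and values are made up.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute("UPDATE accounts SET balance = balance - ? "
                         "WHERE id = ?", (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? "
                         "WHERE id = ?", (amount, dst))
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint fired; nothing was applied

transfer(conn, 1, 2, 30)   # succeeds
transfer(conn, 1, 2, 999)  # overdraws: whole transaction rolled back
```

After the failed transfer, neither UPDATE is visible: the application never observes the half-applied state, which is exactly the "default resilience" the parent describes.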

In the long run, ACID databases pay dividends in allowing an engineering team to stay focused on building new features instead of getting lost in the weeds of never ending daily operational work. NoSQL stores on the other hand are more akin to an unpaid credit card bill, with unpaid interest continuing to compound month after month.


A database is an abstract concept, and ACID transactions on databases have nothing to do with the interface semantics used to interact with the shared state being managed.

You can have non-ACIDic SQL databases and ACID NoSQL databases.

While it is true that many KV stores do not present a fully consistent view over distributed data, there are some that do.

It is very important that people shed this idea that only SQL databases can have strong consistency, that's for sure.


> It is very important that people shed this idea that only SQL databases can have strong consistency, that's for sure.

Why? In practice, people are going to be building things using the same common paths: Postgres, Mongo, MySQL, Redis, etc., and from a purely pragmatic standpoint, the SQL databases that you're likely to use come with strong ACID guarantees while the NoSQL databases don't.

If you can point me to a NoSQL data store that provides high availability, similar atomicity, consistency, and isolation properties provided by a strong isolation level in Postgres/MySQL/Oracle/SQL Server, and has a good track record of production use, I'm all ears.


Berkeley DB.

I use it in production for a game that supports over 100,000 concurrent users and it's been around for a very long time.

It's fully ACID by default, but you get the ability to drop elements of ACID in exchange for greater performance at a per-table or per-transaction level. It also supports two different isolation approaches on a per-table level: locking or MVCC (like Postgres).

It's a really great NoSQL database from the days before the term NoSQL existed.


BerkeleyDB is great. I'm honestly surprised it doesn't see more use. It's right in the sweet spot for software that doesn't quite need a relational model, and that seems to be a lot of software.


Check out lmdb for a better alternative to it.


How did I not know about this awesome-looking tool? Thanks for the mention! Wikipedia link for others:

https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Databa...


Yup, and we used to have FoundationDB, sadly not any more, but don't get me started on that.

Still there are plenty of highly distributed stores to go around, and it's good to know that people are actively using them with great success.

It's very nice to have fine control over consistency vs performance even if you do have to think about it as a developer, but then that's our job :-)


Berkeley DB was great, but Oracle killed it by changing the license to AGPL. That change meant many projects that use it couldn't update to the latest version without violating the license.

LMDB has taken its place in a lot of projects that used to use Berkeley.


https://github.com/cockroachdb/cockroach

which was inspired by this paper:

http://research.google.com/archive/spanner.html

which Google made available for all in the form of its Cloud Datastore:

https://cloud.google.com/datastore/

"We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions." [1]

[1] http://highscalability.com/blog/2012/9/24/google-spanners-mo...


A bit out of the scope, since it is an embedded database designed to run on mobile devices, but Realm[1] is a NoSQL database (in this case Object Database) with full ACID transactions, and is running on millions of devices.

[1] http://realm.io


https://cloud.google.com/datastore/

I haven't used it in a few years, but from what I remember you could have ACID-like operations on grouped entries, because grouping them puts their physical storage in the same place, so an ACID transaction becomes a localized operation.



> While it is true that many KV stores do not present a fully consistent view over distributed data, there are some that do.

Consistency is always relative to what the DBMS can enforce, which is in turn relative to what the DBMS's schema language can express. A KV store can be trivially consistent in that, between writes, it always presents the same view of the data. But it won't be consistent in the sense of presenting a view of the data that's guaranteed not to contradict your business rules. The latter is what users of relational databases expect.


Totally agree. I've dipped my toes into NoSQL systems, but things like MongoDB just look like trouble waiting to happen.

On the other hand, SQL makes it hard to maintain the state definition; I had to develop a mechanism to upgrade the DB from any possible state to the latest state.

Still, this allows me to define very accurately what is a valid row or entry in the database, which means my Applications need to make far fewer assumptions.

I can just take data from the DB and assume with certainty that this data has certain forms, formats and values.

MongoDB makes none of these guarantees.


It's absolutely true that migrations in SQL databases can be a pain. I don't actually think that's a bad thing - what's really going on there is that the DBMS is throwing data integrity issues into your face, and encouraging you to think about them.

A lot of developers don't want to do that, because they want to think of data integrity as this tangential concern that's largely a distraction from their real job. A lot of developers also have a habit of cursing their forebears on a project for creating all sorts of technical debt by cutting corners on data integrity issues, while at the same time habitually cutting corners themselves in the name of ticking items off their to-do list at an incrementally faster pace.

This isn't to say that NoSQL systems make your system inherently less maintainable. But I do think NoSQL projects gained a lot of their initial market traction by appealing to developers' worst instincts with sales pitches that said, "Hey, you don't even have to worry about this!" and deceptively branding schema-on-read as "schemaless". So a reputation for sloppiness might not be deserved (or at least, that's not an issue I want to take a position on), but, in any case, it was very much earned.


I am very fond of pushing as much validation into the database as I can, so I use `NOT NULL`, foreign keys, `CHECK` constraints, and whatever I can to let the database do the job of ensuring valid data. All of us make mistakes, and making sure there are no exceptions is just what computers are good at. I think you're right about "a lot of developers", but I find data modeling, including finding the invariants, pretty satisfying actually.

Reading your comment reminds me of ESR's point in The Art of Unix Programming that getting the data structures right is more important than getting the code right, and if you have good data structures, writing the code is a lot easier. The database is just a big data structure, the foundation everything else is built on, and the better-designed it is, the easier your code will be.


That's not ESR's point. It's shoplifted from Brooks.


IMO validating in the DB also makes code faster.

When I get data from a well validated DB, I don't have to check that it is valid or has certain formats, that is ensured by the DB.

I can remove checks from the entire CRUD stack, if it's invalid data it does not get created or won't make it into the update.

The easiest way to check data is then to try the INSERT or UPDATE; if it works, it's valid.

Otherwise I'd need to validate all fields, see if any contain errors, if any are missing, etc etc.
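A small illustration of that insert-and-let-the-constraints-decide approach (sqlite3 here for portability; its CHECK support is cruder than Postgres', and the schema is made up):

```python
# Letting the database enforce invariants: the INSERT either succeeds
# with valid data or fails atomically, so application code never has
# to re-validate what it reads back. Schema and checks are examples.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL CHECK (email LIKE '%_@_%'),
        age   INTEGER NOT NULL CHECK (age >= 0)
    )
""")

def insert_user(conn, email, age):
    """The easiest validation: attempt the INSERT and let the
    constraints decide whether the row is valid."""
    try:
        with conn:
            conn.execute("INSERT INTO users (email, age) VALUES (?, ?)",
                         (email, age))
        return True
    except sqlite3.IntegrityError:
        return False

insert_user(conn, "alice@example.com", 30)  # accepted
insert_user(conn, "not-an-email", 30)       # rejected by CHECK
insert_user(conn, "bob@example.com", -1)    # rejected by CHECK
```

Every row that survives the INSERT is guaranteed to satisfy the constraints, so the rest of the CRUD stack can skip those checks entirely.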


>DBMS is throwing data integrity issues into your face, and encouraging you to think about them.

And I love it to death for it.

It also allows me to do some nifty little tricks.

I've developed my own migration tool that works by exploiting this data integrity; it actually helps me identify and upgrade the database dynamically without worrying about which updates have been applied and which haven't.


"MongoDB: Because /dev/null doesn't support sharding"


Minor point, but MongoDB added document validation in 3.2 so you can add guarantees about what documents are being stored.

https://docs.mongodb.com/manual/core/document-validation/


Pardon my ignorance, but wouldn't you have the same kind of issues once you distribute any system, NoSQL or not? CAP applies to all distributed systems.

I get the feeling NoSQL just exposed the issue and became synonymous with it.


And in case you think there's a general solution to that problem, there isn't: https://en.wikipedia.org/wiki/CAP_theorem

Still, it's funny how banking seems to be the canonical example for why we need transactions given that most banking transactions are inconsistent (http://highscalability.com/blog/2013/5/1/myth-eric-brewer-on...).


Oh, in addition to my other reply I should point out that the 'C' in CAP is not the same thing as the 'C' in ACID. Be careful with that one, it can cause confusion! http://blog.thislongrun.com/2015/03/the-confusing-cap-and-ac...


Well, they're eventually consistent. It's not like it hasn't been thought through. :)

It's a useful example because it shows how there can be serious consequences for getting it wrong.


Disclaimer: I work on distributed systems and have spoken around the world on them, so I am very biased.

I think something a lot of people miss is that the universe itself is not strongly consistent. This is Einstein's theory of relativity. It fundamentally takes time for information to travel.

So if strong consistency is not viable, even at a physics level, what can we do instead? That is why our team here at http://gun.js.org/ believe in CRDTs. What are CRDTs? They are data structures that are mathematically proven to produce the same results on retries - so even if the power goes out, or the network fails, you can safely re-attempt the update.

This means you fundamentally don't need "transactions" or any of these jargon words that people often throw out. Sadly CRDTs are becoming another one of those jargons, despite how simple they are in reality.
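For anyone curious what that looks like concretely, a grow-only counter is the textbook minimal CRDT (this is not GUN's implementation, just the standard structure). Merge is commutative, associative, and idempotent, so retries and reordering can't corrupt the result:

```python
# Textbook G-Counter CRDT sketch (not any particular library's code).
# Each node increments only its own slot; merge takes the element-wise
# max, which is commutative, associative, and idempotent -- so
# re-applying an update after a failure is harmless by construction.

class GCounter:
    def __init__(self):
        self.counts = {}  # node_id -> count contributed by that node

    def increment(self, node_id, amount=1):
        self.counts[node_id] = self.counts.get(node_id, 0) + amount

    def merge(self, other):
        """Combine state from another replica; safe to repeat."""
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self):
        return sum(self.counts.values())
```

Two replicas can increment independently while partitioned, then exchange state in any order, any number of times, and still converge to the same total.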


"This is Einstein's theory of relativity. It fundamentally takes time for information to travel."

Which is why Google's hack in Spanner and F1 RDBMS of simply using or waiting out the known time delay was so clever. Also relies on a tool that was a test of Einstein's theory. ;)

"That is why our team here at http://gun.js.org/ believe in CRDTs."

Thanks for the link. I'll look into it. Plus the liberal license that gives it a chance of making a dent in commercial sector or in combo with BSD/Apache projects.


Transactions can have different isolation levels. And sometimes the problem at hand can be implemented using transactions with weak isolation levels which are not that hard to implement using your favorite NoSQL database that support CAS operation. I recommend this article: http://rystsov.info/2012/09/01/cas.html
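For reference, the CAS primitive the linked article builds on looks roughly like this (a hypothetical in-memory stand-in, not any particular database's API):

```python
# Hypothetical in-memory stand-in for a store with versioned
# compare-and-set, the building block for the weakly isolated
# transactions the linked article describes.

class CasStore:
    def __init__(self):
        self.data = {}  # key -> (version, value)

    def get(self, key):
        return self.data.get(key, (0, None))

    def cas(self, key, expected_version, new_value):
        """Write only if nobody else wrote since we read.
        Returns True on success; the caller retries on failure."""
        version, _ = self.data.get(key, (0, None))
        if version != expected_version:
            return False
        self.data[key] = (version + 1, new_value)
        return True

def atomic_update(store, key, fn, retries=10):
    """Optimistic read-modify-write loop built on CAS."""
    for _ in range(retries):
        version, value = store.get(key)
        if store.cas(key, version, fn(value)):
            return True
    return False
```

Concurrent writers can't silently overwrite each other: a lost race just means a retry with the fresh value, which is enough to build several weak isolation levels on top.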


Yes, be aware of the transactional capabilities of your database. Some ACID databases in their default configuration don't provide all transactional guarantees!


I just wanted to add that the Postgres documentation [1] and Wikipedia [2] are both really well written when it comes to explaining each isolation level and the guarantees that it provides. Well worth the reading time.

[1] https://www.postgresql.org/docs/current/static/transaction-i...

[2] https://en.wikipedia.org/wiki/Isolation_(database_systems)#I...


Before I read this article, the following question popped into my mind and it miiiiiight be tangentially related -- yea probably not, blame the title ;) When taking the concept of consistency, does consistency have an effect that is akin to compound interest?

For example, imagine someone doing the same thing year after year diligently. (S)he'd increase his or her skill by, say, 10% a year (I have no clue what realistic numbers are). Would that mean that the compound interest effect would occur?

I phrased it really naively, because while the answer is "yes" in those circumstances (1.1 ^ n), I'm overlooking a lot and have no clue what I'm overlooking.
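Taking the commenter's back-of-the-envelope model literally (the 10% figure is illustrative, as they say), the compounding works out like this:

```python
# 10% skill growth per year compounds exactly like interest: 1.1 ^ n.
years = 10
growth = 1.10
result = growth ** years
print(result)  # about 2.59: ten years of 10% gains is roughly a 2.6x improvement
```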

I know it's off-topic, it's what I thought when I read the title and I never thought about it before, so I'm a bit too curious at the moment ;)


The problem is that when the system does not guarantee consistency, you force the application developers using the system to solve that problem. Each application developed will have to solve the same problem. Besides the fact that the same effort is repeated over and over again, you are also forcing application developers to solve a problem for which they probably do not have the right skill set. In short, that strategy is wasteful (replicating work) and risky (they'll make mistakes).


I sort of agree. The examples in the article are ways in which people play fast and loose with consistency, often using a NoSQL store that has poor support for atomicity and isolation. This is a helpful message, because I've definitely seen naively designed systems start to develop all sorts of corruption when used at scale. The answer for many low-throughput applications is to just use Postgres. Both Django and Rails, by default, work with relational databases and leverage transactions for consistency.

Then, there is the rise of microservices to consider. In this case, I also agree with the author that it becomes crucial to understand that the number of states your data model can be in can potentially multiply, since transactional updates are very difficult to do.

But I feel like on the opposite side of the spectrum of sophistication are people working on well-engineered eventually consistent data systems, with techniques like event sourcing, and a strong understanding of the hazards. There's a compelling argument that this more closely models the real world and unlocks scalability potential that is difficult or impossible to match with a fully consistent, ACID-compliant database.

Interestingly, in a recent project, I decided to layer in strict consistency on top of event sourcing underpinnings (Akka Persistence). My project has low write volume, but also no tolerance for the latency of a write conflict resolution strategy. That resulted in a library called Atomic Store [1].

[1] https://github.com/artsy/atomic-store


I think it's a bad example because this should not be the way to develop in this kind of system (microservices).

In these environments you atomically create objects in your application's "local" storage and have a reconciliation loop for creating objects in other services or deleting these orphan "local" objects.
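A sketch of the reconciliation-loop pattern described above (all names are hypothetical): create the object atomically in the service's own storage first, then let a background loop create it in the other service, retrying transient failures. A real loop would also delete orphaned local objects whose remote creation is permanently rejected.

```python
local = {}    # stands in for this service's durable store
remote = {}   # stands in for the other microservice

def create_locally(obj_id, data):
    local[obj_id] = {"data": data, "remote_ok": False}

def reconcile():
    for obj_id, rec in list(local.items()):
        if rec["remote_ok"]:
            continue
        try:
            remote[obj_id] = rec["data"]   # remote create must be idempotent
            rec["remote_ok"] = True
        except Exception:
            pass  # transient failure: retry on the next pass

create_locally("profile-1", {"name": "Ada"})
reconcile()
print(local["profile-1"]["remote_ok"])  # True
```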


I think it's a good example because this should not be the way to develop this kind of system - and yet people do it this way! :)


If anyone is interested in this sort of thing, I found this a great article: http://www.grahamlea.com/2016/08/distributed-transactions-mi...


To actually do a distributed transaction, I would look into algorithms such as 2PC or 3PC. Although before going there, I would seriously consider consolidating different backends into one scalable option (banking transaction example is a bit contrived, although I gather the author is just trying to make a point).

At the service-integration level, we can leverage reliable message-queue middleware to make sure a task is eventually delivered and handled (or put in a dead-letter queue so we can do a batch cleanup).

Also, as a general principle, I would make each of those sub-transactions idempotent, so that retrying multiple times won't hurt, and there would be a natural way of picking the winner if there are conflicting ongoing commit/retry attempts.
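A minimal sketch of that idempotency principle: every sub-transaction carries a unique id, and the handler remembers the ids it has already applied, so a retry or redelivery cannot double-apply. (Names and amounts are illustrative.)

```python
applied = set()
balances = {"a": 100, "b": 0}

def deposit(tx_id, account, amount):
    if tx_id in applied:   # a retry of an already-handled message
        return
    balances[account] += amount
    applied.add(tx_id)

deposit("tx-42", "b", 30)
deposit("tx-42", "b", 30)  # retried: harmless, applied only once
print(balances["b"])  # 30
```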


Why isn't the OP using Event Sourcing "commands" for the "Bank Accounts" example?


I'm not actually that familiar with event sourcing. I'd be interested in reading more if you have a pointer to some resources!

Edit: I wasn't familiar with the terminology, but it sounds like what I have written about here: http://kevinmahoney.co.uk/articles/immutable-data/


I believe Event Sourcing opens a whole other can of worms which would detract from the point of the article, such as whether the event streams have a well-defined order (especially if each account is its own aggregate), or whether the resulting event should be "Transaction Completed" (assumes pre-conditions were checked) or "Transaction Attempted" (checks pre-conditions before altering state).


My teams' event sourcing implementation stores and publishes events in a well defined order within a specific aggregate.

There are scenarios where a two-phase commit is used to ensure that invariants across aggregates are maintained.

We use an ACID compliant database to store our domain events and we project our events into a relational schema. When projecting events, we use transactions to make sure the database updates from a specific event are either all applied or not at all.


Yes, this is a typical way of doing things in the CQRS/ES world, but there are many things left unsaid:

- would you have a single aggregate for "all bank accounts" so that one financial transaction equals one event, or is there a good reason to have multiple aggregates?

- is command execution transactional, or is it possible for another thread to generate (possibly conflicting) events between the time you check a precondition and the time you write the intended events?

My team uses an ES implementation with very strong guarantees: one aggregate per macro-service, and command execution is transactional, but I'm fairly certain that this is not the way CQRS/ES is described or recommended.


I would have one aggregate instance per bank account. Each has a separate state and business rules would need to be applied at the individual account instance (e.g. overdraft fees, interest, etc). If you have to enforce rules at the bank level based upon actions taken within the account then you can have a bank aggregate that you can interact with through a saga and confirm changes through a two phase commit.

Command execution is not transactional, but writing to the domain event store is. That means that two writers cannot write the same domain event version number to the event store. Say an aggregate is at version 20. The next action would put it at 21. If two threads generate version 21 of the aggregate, only one of them can write it to the event store and the other gets an exception. Domain events are written in a transaction batch to the event store so we cannot get any writes in the middle. So, the first to write the next version wins. The other one loses.

We also partition our command and event processors by aggregate id. The same aggregate cannot have its events or commands processed more than one at a time because the processors are synchronous per partition.
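A sketch of the optimistic-concurrency write just described (the store and event shapes are made up for illustration): the event store accepts a batch only if the caller's expected version matches the stream's current head, so of two concurrent writers one wins and the other gets a conflict exception.

```python
class ConcurrencyConflict(Exception):
    pass

class EventStore:
    def __init__(self):
        self.streams = {}  # aggregate_id -> list of events

    def append(self, aggregate_id, expected_version, events):
        stream = self.streams.setdefault(aggregate_id, [])
        if len(stream) != expected_version:
            raise ConcurrencyConflict(aggregate_id)
        stream.extend(events)  # the batch lands all-or-nothing

store = EventStore()
store.append("acct-1", 0, [{"type": "Opened"}])
store.append("acct-1", 1, [{"type": "Deposited", "amount": 50}])
try:
    store.append("acct-1", 1, [{"type": "Withdrawn"}])  # stale writer
except ConcurrencyConflict:
    print("second writer loses")
```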

If I understand you correctly then you have one instance of a very large aggregate per bounded context? If so, then no, it is not in the spirit of CQRS/ES. If you are handling commands in transactions, then the designers of your system may have chosen to favor consistency instead of the availability that CQRS provides.


We do have the same constraint on event store writes, which we then use to implement command-level transactions: a command is a pure function that maps a command model to a set of events, and if the set of events cannot be written to the stream, the command model is updated and the function is called again. This does, indeed, have a very strong bias towards consistency, although the main objective is to allow multiple command processors per aggregate, which lets us embed the command processors in services that are inherently multi-instance (e.g. web applications).

I am not sure I understand why having multiple aggregates would provide availability. Unless the idea is that the events from different aggregates are stored on separate servers, so that a single-machine outage would only take down some aggregates? If so, I agree that is the case in theory, though with limited practical applicability in our situation.


Aggregates represent consistency boundaries. To enforce consistency you would theoretically need to have your entire state loaded when you go to process a command for a single bank aggregate.

Since each command produces a new version of the bank and many commands are coming in at the same time, most of them will fail when writing to the event store. Do they keep retrying until they succeed? Either way, this is not efficient and could affect the availability of your servers.

If you have multiple aggregates, the consistency boundaries are smaller and therefore you have a smaller state that you have to keep consistent. There are fewer operations on a single aggregate and fewer opportunities for contention arising from simultaneous operations.

Also, if you are using queues for your commands and events (we are, but I suspect you are not), then you can partition your queues such that you can process your workload N ways without worrying about things happening out of order. Each aggregate processes serially within a partition. If you have just one large aggregate then you would have to jump through a lot of hoops to be able to partition the work while maintaining ordering guarantees.

I can really only guess at the details of your implementation, but I am guessing that your design has accounted for all of the above in some way. If there is one thing I've learned, it's that there are many, many ways to implement CQRS. If your software works correctly under load, then I am not sure it would matter if it fits squarely into the definition of CQRS. In fact, you may have something new altogether that presents a better way to achieve the same goals as CQRS without the same cognitive overhead.


I understand better now what you're saying, though I would have called it "ability to scale" rather than "availability".

Our architecture prevents us from parallelizing the execution of commands within a bounded context. The ability to execute commands on any number of servers is for availability (transparent failover if an instance dies) rather than performance, since the event stream acts as a global lock anyway.

In practice, our system clocks in at a comfortable 1000 commands per second under stress tests; during our peak hours and on our busiest aggregate, we only have to execute one command per second, so we can afford at least a x100 increase before we need to consider changing our architecture (and we're no longer an early-stage start-up, so x100 would mean a lot of revenue).

Similarly, the full state for our largest aggregate (not the command model, but the entire state of all projections) fits snugly in a few hundreds of megabytes, so all instances can keep it in-memory.


You can use event logs and eventual consistency to solve this problem.

Basically you make the transfer of money an event that is then atomically committed to an event log. The two bank accounts then eventually incorporate that state.

See http://www.grahamlea.com/2016/08/distributed-transactions-mi...

But I agree that often life is easier if you just keep things simpler. If you require strong consistency, as with the user/profile example, don't make that state distributed. If you do make it distributed, you need to live with less consistency.
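A sketch of the event-log approach above: a transfer is a single atomically appended event, and each account's balance is just a fold over the log, so both sides eventually reflect the same transfer. (Account names are illustrative.)

```python
log = []

def transfer(src, dst, amount):
    # One atomic append records the whole transfer.
    log.append({"src": src, "dst": dst, "amount": amount})

def balance(account):
    # Each account derives its balance by folding over the shared log.
    total = 0
    for e in log:
        if e["dst"] == account:
            total += e["amount"]
        if e["src"] == account:
            total -= e["amount"]
    return total

transfer("alice", "bob", 25)
print(balance("alice"), balance("bob"))  # -25 25
```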


If we're talking about user-created events, I think you should have a permanent log anyway. Otherwise, if there is any kind of bug or any kind of weirdness in your data, how are you supposed to figure out what the heck happened? You can try to outwit chaos by preventing all inconsistency and bad data, but that's a game you will eventually lose.

From a more religious perspective, every action a user takes is sacred data that you should not lose, unless you deliberately want to lose it for privacy reasons. You can try to rely on your software to have no bugs and never confuse your user, but again that's a game you eventually lose. I would rather keep that action log so I have a place to rebuild from when things are lost. Otherwise your only choice is to reverse-engineer your data. Data which, if not technically corrupt, is corrupted from a human standpoint.

And if you want undo you need a log and playback anyway.

To me data consistency at the database level is not a real solution to the problem. It is a good tool, but it only solves a very narrow slice of predictable bugs. It doesn't help at all with the inevitable bugs you can't predict. A log based approach gives you a powerful tool in all kinds of tough situations.


This profile example is missing the better approach: avoid the dependency of creating the user before creating the profile.

Create the profile with a generated uuid. Once that succeeds, then create the user with the same uuid.

If you build a system that allows orphaned profiles (by just ignoring them) then you avoid the need to deal with potentially missing profiles.

This is essentially implementing MVCC. Write all your data with a new version and then as a final step write to a ledger declaring the new version to be valid. In this case, creating the user is writing to that ledger.
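A sketch of the write ordering described above (the stores are hypothetical dicts): the profile is written first under a fresh uuid, and only the final user write "commits" it. A crash between the two steps leaves an orphaned profile, which the system simply ignores.

```python
import uuid

profiles = {}
users = {}

def create_account(name):
    profile_id = str(uuid.uuid4())
    profiles[profile_id] = {"bio": ""}        # step 1: may end up orphaned
    users[name] = {"profile_id": profile_id}  # step 2: the "ledger" write
    return profile_id

pid = create_account("ada")
print(users["ada"]["profile_id"] == pid)  # True
```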


Well, we're playing make believe with the requirements. In fairyland the user and the profile need to exist together. Only the fairies know why!

If you can relax the requirements, you can relax the constraints.


That's the point. Requirements are rarely as strict as we presume they are.


Good article.

I've stopped using bank transfers as an example for Acid transactions, and instead talk about social features:

- if I change a privacy setting in Facebook or remove access to a user, these changes should be atomic and durable

- transactions offer good semantics with which to make these changes. They can be staged in queries, but nothing is successful until after a commit.

- without transactions, durability is hard to offer. You would essentially need to make each query flush to disk, rather than each transaction. Much more expensive.


Depends on your POV. Startups undervalue it and corporations overvalue it. At the end of the day, it's just risk management.


In case anyone is wondering what an "atomic change" means in database terminology: https://www.gnu.org/software/emacs/manual/html_node/elisp/At...


Maybe I'm crazy, but I never see atomic libraries that are called like this:

    bank_account1.withdraw(amount)
    bank_account2.deposit(amount)
Isn't this kind of thing always called in some atomic batch operation?

    transact.batch([
        account[a] = -8,
        account[b] = 8
    ]).submit()
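In practice the batch shape sketched above usually takes the form of an explicit transaction: both updates run inside it, so either both balances change or neither does. (Standard-library sqlite3, purely for illustration.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()

with conn:  # commits on success, rolls everything back on an exception
    conn.execute("UPDATE account SET balance = balance - 8 WHERE id = 'a'")
    conn.execute("UPDATE account SET balance = balance + 8 WHERE id = 'b'")

rows = dict(conn.execute("SELECT id, balance FROM account"))
print(rows)  # {'a': 92, 'b': 8}
```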


Yes; "with atomic...:" seems to be pseudo-syntax.


The universally hated JEE can do distributed transactions by default. Yes, with pitfalls, but it can. (It is usually hated by devs who have never used it properly.)


The problem with JEE is when you leave Java Transactions - i.e., interact with some external system as part of a transaction. Then there is no way around writing the transaction compensation/recovery/reconciliation logic to handle partial failures. On the other point, I agree that JEE is unfairly maligned. We build a modern AngularJS/Material frontend on a JavaEE 7 backend and get all the benefits of a scalable, secure enterprise platform from Java EE.


[flagged]


We're always up for the Romantics (though I prefer Coleridge), but maybe not like this. Your comments have been inappropriate for Hacker News lately, especially the one where you impersonated another user. This sort of pattern will get an account banned, so please stick to civil, substantive comments only.


But consider this:

You are using mysql, you make a transaction with say deposit and withdraw.

What happens on the mysql machine if you pull the plug exactly when mysql has done the deposit but not the withdraw?

The ONLY difference between SQL transactions and NoSQL microservice transactions is the time between the parts of a transaction.

Personally, I use a JSON file with state to execute my NoSQL microservice transactions, and it's a lot more scalable than having a pre-internet-era SQL legacy design hogging all my time and resources.


I don't mean to be unkind, but is this meant to be a parody?

There are no "parts of a transaction" because a transaction is definitionally atomic.

Transactional file modification is a fairly tricky problem, and I'd be surprised if you'd actually implemented a safe system in that manner. What's certain though is that spinning up MySQL or Postgres and using it to store simple records is essentially a zero-cost setup task - so I doubt it's ever going to "hog all your time and resources".


How is it atomic? This goes for all 5 of the replies (one of you did not downvote the discussion; I'm guessing that person is the only one not from the US). How does mysql unwrite the deposit? There are always "parts" to everything. Two-phase commit does not solve anything unless you have "undo the thing I did before the power was lost".

Please refer to source code to prove your argumentation.


> How does mysql unwrite the deposit?

Yes, the database reverses the changes, using its transaction log.

https://en.wikipedia.org/wiki/Transaction_log

    If, after a start, the database is found in an
    inconsistent state or not been shut down properly, the
    database management system reviews the database logs for
    uncommitted transactions and rolls back the changes made
    by these transactions. Additionally, all transactions that
    are already committed but whose changes were not yet
    materialized in the database are re-applied. Both are done
    to ensure atomicity and durability of transactions.


> Please refer to source code to prove your argumentation.

I'm not digging through source code for you; transactional atomicity is a well-known and thoroughly researched problem.

Transactions in an ACID system are atomic by definition. They're designed in such a way that either an entire transaction occurs, or no part of a transaction occurs – in other words, it's not possible to partially apply a transaction, by design.

There are a bunch of different approaches to implementing this. I'd guess that MySQL does something like writing the complete set of modifications in a transaction to disk as a separate buffer, and only once the entire transaction is complete, updating some associated metadata to add the transaction to the database. A power failure at any point will result in an uncommitted transaction, which will have no effect on the database.

Here's the appropriate Wikipedia article for further reading: https://en.wikipedia.org/wiki/Atomicity_(database_systems) . Suffice it to say that if you are rejecting the idea that atomicity exists, then I don't know what else to tell you. It's a core concept in information systems.


I don't think so; if a MySQL transaction touches two separate tables, at some point it has to write one and then the other, remembering to undo the first if the second fails (with a power outage, for example).

I'm not rejecting anything, all I'm asking is; where is the code and how does it work? If you don't know then how come you are so confident it does work?

I'm pretty sure the "write some status to a file and roll back if the transaction is incomplete on startup" code is pretty horrible in all SQL systems. And on top of that, it doesn't scale. The status quo is always worth questioning.


>> What happens on the mysql machine if you pull the plug exactly when mysql has done the deposit but not the withdraw?

Nothing happens. And the client discovers that the connection to the db was closed before the 2 phase commit was done and gives an error to the user.


> What happens on the mysql machine if you pull the plug exactly when mysql has done the deposit but not the withdraw?

Nothing bad -- the deposit will never be seen. It sounds like you need to read up on what ACID means.

(EDIT: Clarification)


_You are using mysql, you make a transaction with say deposit and withdraw._

_What happens on the mysql machine if you pull the plug exactly when mysql has done the deposit but not the withdraw?_

MySQL guarantees that this cannot happen. That's what transactions are all about.

For example you can put into the log the two operations and then an end of transaction marker. When the system comes back after a crash you can replay those transactions from the log that have the end marker, and roll back those that don't.
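A sketch of exactly that recovery rule: on restart, replay only those logged operations whose transaction reached its commit marker; anything without the marker is treated as if it never happened. (The log record shapes are made up for illustration.)

```python
log = [
    ("begin", "t1"),
    ("set", "t1", "a", 92),
    ("set", "t1", "b", 8),
    ("commit", "t1"),
    ("begin", "t2"),
    ("set", "t2", "a", 50),
    # power lost here: t2 never reached its commit marker
]

def recover(log):
    # Replay only operations belonging to committed transactions.
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    state = {}
    for rec in log:
        if rec[0] == "set" and rec[1] in committed:
            _, _, key, value = rec
            state[key] = value
    return state

print(recover(log))  # {'a': 92, 'b': 8}
```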


This is not how database transactions work.

A transaction means that the amended state is not visible to any readers until it has been fully committed. Pulling the plug means the transaction is lost, and the writer knows that it was not committed (and so can try again).



