
Consistency is Consistently Undervalued - kpmah
http://kevinmahoney.co.uk/articles/consistency-consistently-undervalued/
======
vidarh
My opinion is exactly the opposite: consistency is overvalued.

Requiring consistency in a distributed system generally leads to designs that
reduce availability.

Which is one of the reasons that bank transactions generally do _not_ rely on
transactional updates against your bank. "Low level" operations as part of
settlement may use transactions, but the bank system is "designed" (more like
it has grown by accretion) to function almost entirely by settlement and
reconciliation rather than holding onto any notion of consistency.

The real world rarely involves having a consistent view of _anything_. We
often design software with consistency guarantees that are pointless because
the guarantees can only hold until the data has been output, and are often
obsolete before the user has even seen it.

That's not to say that there are no places where consistency matters, but
often it matters because of thoughtless designs elsewhere that end up
demanding unnecessary locks, killing throughput, failing if connectivity to
some canonical data store happens to be unavailable, etc.

The places where we can't _design_ systems to function without consistency
guarantees are few and far between.

~~~
knucklesandwich
The bank example really needs to die; it's not reflective of what people often
use databases for.

1\. Bank transactions are easy to express as a join-semilattice[1], which
means it's intrinsically easy to build a convergent system of bank accounts in
a distributed environment. Many problems are not easily expressible this way,
and usually trade off an unacceptable amount of space to be formulated as a
join-semilattice while still working with complicated types of transactions.
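
For readers unfamiliar with the term, here is a minimal sketch (my own
illustration, not from [1]) of what "bank accounts as a join-semilattice" can
look like: model an account as a grow-only set of (transaction id, amount)
entries, with set union as the join. Union is associative, commutative, and
idempotent, so replicas converge regardless of delivery order or retries -
note that this buys convergence, not invariants like a non-negative balance:

```python
# Toy join-semilattice ledger: state is a set of (txn_id, amount)
# pairs; the "join" (least upper bound) is set union, which is
# associative, commutative, and idempotent, so replicas converge.
class LedgerReplica:
    def __init__(self):
        self.entries = set()  # {(txn_id, amount)}

    def record(self, txn_id, amount):
        # Recording the same txn twice is harmless (idempotent).
        self.entries.add((txn_id, amount))

    def merge(self, other):
        # The join operation: set union.
        self.entries |= other.entries

    def balance(self):
        return sum(amount for _, amount in self.entries)

a, b = LedgerReplica(), LedgerReplica()
a.record("t1", +100)
b.record("t2", -30)
b.record("t1", +100)   # duplicate delivery: no effect after merge
a.merge(b)
b.merge(a)
assert a.balance() == b.balance() == 70
```

Note that nothing here prevents the merged balance from going negative; as
point 3 below says, detecting that after the fact is the bank's cheap way out.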

2\. A bank transaction has a small upper bound on the types of inconsistency
that can occur: at worst, you can make a couple of overdrafts.

3\. A bank transaction has a trivial method for detecting and handling an
inconsistency. Detection is easy because it's just a check that the transaction
you're applying doesn't leave the account with a negative balance. Handling is
easy, because the bank can just levy an overdraft charge on an account holder
(and since they have the ability to collect debts on overdrawn accounts, this
easily works for them). Complicated transactions often don't have a trivial
method of detecting inconsistency and handling inconsistency is rarely as
simple as being able to punish the user. In fact corrective action (and
likewise, inconsistency) is often non-intuitive to users because we expect
causality in our interaction with the world.

> The real world rarely involves having a consistent view of anything

I strongly disagree with this. Most of our interaction with the world is
perceived consistently. When I touch an object, the sensation of the object
and the force I exert upon the object is immediate and clearly corresponds to
my actions. There is no perception of reordering my intentions in the real
world.

We expect most applications to behave the same way, because the distributed
nature of them is often hidden from us. They present themselves as a singular
facade, and thus we typically expect our actions to be reflected immediately
and in the order we issue them.

Furthermore, this availability vs consistency dichotomy is an overstatement of
Brewer's theorem. For instance, linearizable algorithms like Raft and Multi-
Paxos DO provide high availability (they'll remain available so long as a
majority of nodes in the system operate correctly). In fact, Google's most
recent database was developed to be highly available and strongly consistent
[2].

[1]:
[https://en.wikipedia.org/wiki/Semilattice](https://en.wikipedia.org/wiki/Semilattice)

[2]:
[http://research.google.com/archive/spanner.html](http://research.google.com/archive/spanner.html)

~~~
jerf
"Bank transactions are easy to express as a join-semilattice[1],"

That may be true for a given set of transactions in a _very_ abstract way, but
that has some serious practical problems. First, like it or not, banks have a
history of making transactions non-associative and non-commutative; there are
many reported cases from the wild of banks deliberately ordering transactions
in a hostile manner so that the user's bank account drops below the fee
threshold briefly even though there is also a way to order the transactions so
that doesn't happen. So as a developer, telling your bank management that you
are going to use a join-semilattice (suitably translated into their terms) is
also making a policy decision, a mathematically strong one, about how to
handle transactions that they are liable to disagree with.

There is also the problem of variable time of transmission of transactions
meaning that you may still need to have some slop in the policies, in a way
that is not going to be friendly to the lattice.

The nice thing about join semilattices and the similar mathematical structures
is that you get some really nice properties out of them. Their major downside,
from a developer point of view, is that typically those properties are very,
very, _very_ fragile; if you allow even a slight deviation into the system the
whole thing falls apart. (Intuitively, this is because those slight deviations
can typically be "spread" into the rest of the system; once you have one non-
commutative operation in your join-semilattice it is often possible to use
that operation in other previously commutative operations to produce non-
commutative operations... it "poisons" the entire structure.)

I think you're badly underestimating the difficulty of the requirements for
anything like a real bank system.

~~~
knucklesandwich
Apologies, you're right: I've never implemented bank accounting in the real
world. I'm only talking about the account balance as a "toy problem", where
it's often used to discuss the virtues of weak consistency via the ability to
use overdraft charges.

Whether or not it's actually implemented with something like CRDTs in the real
world isn't really my point. My point is that it's a cherry-picked example of
a system that clearly does reorder operations and has weaker-than-usual
concerns about witnessing inconsistency.

------
brandur
Amen. Whether or not the article's example is a good one, in a world without
consistency you need to worry about state between _any_ two database
operations in the system, so there's nearly unlimited opportunity for this
class of error in almost any application found in the real world.

The truly nefarious aspect of NoSQL stores is that the problems that arise
from giving up ACID often aren't obvious until your new product is actually in
production and failures that you didn't plan for start to appear.

Once you're running a NoSQL system of considerable size, you're going to have
a sizable number of engineers who are spending significant amounts of their
time thinking about and repairing data integrity problems that arise from even
minor failures that are happening every single day. There is really no general
fix for this; it's going to be a persistent operational tax that stays with
your company as long as the NoSQL store does.

The same isn't true for an ACID database. You may eventually run into scaling
bottlenecks (although not nearly as soon as most people think), but transactions
are darn close to magic in how much default resilience they give to your
system. If an unexpected failure occurs, you can roll back the transaction
that you're running in, and in almost 100% of cases this turns out to be a
"good enough" solution, leaving your application state sane and data integrity
sound.

In the long run, ACID databases pay dividends in allowing an engineering team
to stay focused on building new features instead of getting lost in the weeds
of never ending daily operational work. NoSQL stores on the other hand are
more akin to an unpaid credit card bill, with unpaid interest continuing to
compound month after month.

~~~
tscs37
Totally agree. I've dipped my toes into NoSQL systems, but things like MongoDB
just look like trouble waiting to happen.

On the other hand, SQL makes it hard to maintain the state definition; I had
to develop a mechanism to upgrade the DB from any possible state to the latest
state.

Still, this allows me to define very accurately what a valid row or entry in
the database is, which means my applications need to make far fewer
assumptions.

I can just take data from the DB and assume with certainty that it has certain
forms, formats, and values.

MongoDB makes none of these guarantees.

~~~
bunderbunder
It's absolutely true that migrations in SQL databases can be a pain. I don't
actually think that's a bad thing - what's really going on there is that the
DBMS is throwing data integrity issues into your face, and encouraging you to
think about them.

A lot of developers don't want to do that, because they want to think of data
integrity as this tangential concern that's largely a distraction from their
real job. A lot of developers also have a habit of cursing their forebears on
a project for creating all sorts of technical debt by cutting corners on data
integrity issues, while at the same time habitually cutting corners themselves
in the name of ticking items off their to-do list at an incrementally faster
pace.

This isn't to say that NoSQL systems make your system inherently less
maintainable. But I do think NoSQL projects gained a lot of their initial
market traction by appealing to developers' worst instincts with sales pitches
that said, "Hey, you don't even have to worry about this!" and deceptively
branding schema-on-read as "schemaless". So a reputation for sloppiness might
not be deserved (or at least, that's not an issue I want to take a position
on), but, in any case, it was very much earned.

~~~
pjungwir
I am very fond of pushing as much validation into the database as I can, so I
use `NOT NULL`, foreign keys, `CHECK` constraints, and whatever I can to let
the database do the job of ensuring valid data. All of us make mistakes, and
making sure there are _no_ exceptions is just what computers are good at. I
think you're right about "a lot of developers", but I find data modeling,
including finding the invariants, pretty satisfying actually.
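
As a minimal sketch of this style (using SQLite from the Python standard
library; the table and columns are hypothetical), the database itself refuses
invalid rows, so application code never has to re-check them:

```python
import sqlite3

# Let the database enforce the invariants: NOT NULL, UNIQUE, and a
# CHECK constraint on the balance. The schema here is illustrative.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        email   TEXT NOT NULL UNIQUE,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
db.execute("INSERT INTO accounts (email, balance) VALUES (?, ?)",
           ("a@example.com", 100))

# Invalid data is rejected by the database, not by application code.
try:
    db.execute("INSERT INTO accounts (email, balance) VALUES (?, ?)",
               ("b@example.com", -5))
except sqlite3.IntegrityError:
    print("CHECK constraint rejected the negative balance")
```

Any row that survives the constraints is guaranteed to have the declared
shape, which is exactly the "fewer assumptions" benefit described above.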

Reading your comment reminds me of ESR's point in _The Art of Unix
Programming_ that getting the data structures right is more important than
getting the code right, and if you have good data structures, writing the code
is a lot easier. The database is just a big data structure, the foundation
everything else is built on, and the better-designed it is, the easier your
code will be.

~~~
tptacek
That's not ESR's point. It's shoplifted from Brooks.

------
olalonde
And in case you think there's a general solution to that problem, there isn't:
[https://en.wikipedia.org/wiki/CAP_theorem](https://en.wikipedia.org/wiki/CAP_theorem)

Still, it's funny how banking seems to be the canonical example for why we
need transactions given that most banking transactions are inconsistent
([http://highscalability.com/blog/2013/5/1/myth-eric-brewer-on...](http://highscalability.com/blog/2013/5/1/myth-eric-brewer-on-why-banks-are-base-not-acid-availability.html)).

~~~
kpmah
Well, they're eventually consistent. It's not like it hasn't been thought
through. :)

It's a useful example because it shows how there can be serious consequences
for getting it wrong.

------
marknadal
Disclaimer: I work on distributed systems and have spoken around the world on
them, so I am very biased.

I think something a lot of people miss is that the universe itself is not
strongly consistent. This is Einstein's theory of relativity. It fundamentally
takes time for information to travel.

So if strong consistency is not viable, even at a physics level, what can we
do instead? That is why our team here at
[http://gun.js.org/](http://gun.js.org/) believes in CRDTs. What are CRDTs?
They are data structures that are mathematically proven to produce the same
results on retries - so even if the power goes out or the network fails, you
can safely re-attempt the update.

This means you fundamentally don't need "transactions" or any of the other
jargon words people often throw around. Sadly, CRDTs are becoming another
such jargon term, despite how simple they are in reality.

~~~
nickpsecurity
"This is Einstein's theory of relativity. It fundamentally takes time for
information to travel."

Which is why Google's hack in Spanner and the F1 RDBMS, simply using or
waiting out the known time delay, was so clever. It also relies on a tool that
was a test of Einstein's theory. ;)

"That is why our team here at [http://gun.js.org/](http://gun.js.org/) believe
in CRDTs."

Thanks for the link. I'll look into it. Plus, the liberal license gives it a
chance of making a dent in the commercial sector or in combination with
BSD/Apache projects.

------
xanton
Transactions can have different isolation levels, and sometimes the problem at
hand can be implemented using transactions with weak isolation levels, which
are not that hard to build on your favorite NoSQL database that supports a CAS
operation. I recommend this article:
[http://rystsov.info/2012/09/01/cas.html](http://rystsov.info/2012/09/01/cas.html)
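
As an illustration of the CAS-based approach mentioned here, a toy sketch in
Python; the `VersionedStore` class is a stand-in for whatever conditional-write
primitive (version number, etag, conditional PUT) your store actually exposes:

```python
import threading

# Toy versioned store with a compare-and-swap (CAS) primitive: a write
# succeeds only if the caller names the version it read.
class VersionedStore:
    def __init__(self):
        self._data = {}          # key -> (version, value)
        self._lock = threading.Lock()

    def get(self, key):
        return self._data.get(key, (0, None))

    def cas(self, key, expected_version, value):
        with self._lock:
            version, _ = self._data.get(key, (0, None))
            if version != expected_version:
                return False     # someone else wrote first; caller retries
            self._data[key] = (version + 1, value)
            return True

def atomic_increment(store, key, delta):
    # Optimistic retry loop: read, attempt conditional write, repeat.
    while True:
        version, value = store.get(key)
        if store.cas(key, version, (value or 0) + delta):
            return

store = VersionedStore()
for _ in range(100):
    atomic_increment(store, "counter", 1)
assert store.get("counter")[1] == 100
```

The retry loop is the whole trick: lost races are detected by the version
check and simply retried, with no locks held across the read-modify-write.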

~~~
kpmah
Yes, be aware of the transactional capabilities of your database. Some ACID
databases in their default configuration don't provide all transactional
guarantees!

~~~
brandur
I just wanted to add that the Postgres documentation [1] and Wikipedia [2] are
both really well written when it comes to explaining each isolation level and
the guarantees that it provides. Well worth the reading time.

[1]
[https://www.postgresql.org/docs/current/static/transaction-i...](https://www.postgresql.org/docs/current/static/transaction-iso.html)

[2]
[https://en.wikipedia.org/wiki/Isolation_(database_systems)#I...](https://en.wikipedia.org/wiki/Isolation_\(database_systems\)#Isolation_levels)

------
mettamage
Before I read this article, the following question popped into my mind and it
miiiiiight be tangentially related -- yea probably not, blame the title ;)
When taking the concept of consistency, does consistency have an effect that
is akin to compound interest?

For example, imagine someone doing the same thing year after year diligently.
(S)he'd increase his or her skill say 10% a year (have no clue what realistic
numbers are). Would that mean that the compound interest effect would occur?

I phrased it really naively because, while the answer is "yes" in those
circumstances (1.1 ^ n), I'm overlooking a lot and have no clue what I'm
overlooking.

I know it's off-topic, it's what I thought when I read the title and I never
thought about it before, so I'm a bit too curious at the moment ;)

------
toolslive
The problem is that when the system does not guarantee consistency, you force
the application developer using the system to solve that problem. Each
application developed will have to solve the same problem. Besides the fact
that the same effort is done over and over again, you are also forcing
application developers to solve a problem for which they probably do not have
the right skill set. In short, that strategy is wasteful (replicating work)
and risky (they'll make mistakes).

------
acjohnson55
I sort of agree. The examples in the article are ways in which people play
fast and loose with consistency, often using a NoSQL store that has poor
support for atomicity and isolation. This is a helpful message, because I've
definitely seen naively designed systems start to develop all sorts of
corruption when used at scale. The answer for many low-throughput applications
is to just use Postgres. Both Django and Rails, by default, work with
relational databases and leverage transactions for consistency.

Then, there is the rise of microservices to consider. In this case, I also
agree with the author that it becomes crucial to understand that the number of
states your data model can be in can potentially multiply, since transactional
updates are very difficult to do.

But I feel like on the opposite side of the spectrum of sophistication are
people working on well-engineered eventually consistent data systems, with
techniques like event sourcing, and a strong understanding of the hazards.
There's a compelling argument that this more closely models the real world and
unlocks scalability potential that is difficult or impossible to match with a
fully consistent, ACID-compliant database.

Interestingly, in a recent project, I decided to layer in strict consistency
on top of event sourcing underpinnings (Akka Persistence). My project has low
write volume, but also no tolerance for the latency of a write conflict
resolution strategy. That resulted in a library called Atomic Store [1].

[1] [https://github.com/artsy/atomic-store](https://github.com/artsy/atomic-
store)

------
calind
I think it's a bad example because this should not be the way to develop in
this kind of system (microservices).

In these environments you atomically create objects in your application's
"local" storage and have a reconciliation loop for creating objects in other
services or deleting these orphan "local" objects.

~~~
kpmah
I think it's a good example because this should not be the way to develop this
kind of system - and yet people do it this way! :)

------
muteor
If anyone is interested in this sort of thing, I found this a great article:
[http://www.grahamlea.com/2016/08/distributed-transactions-mi...](http://www.grahamlea.com/2016/08/distributed-transactions-microservices-icebergs/)

------
chajath
To actually do a distributed transaction, I would look into algorithms such as
2PC or 3PC. Although before going there, I would seriously consider
consolidating different backends into one scalable option (banking transaction
example is a bit contrived, although I gather the author is just trying to
make a point).

At the service integration level, we can leverage reliable message queue
middleware to make sure a task is eventually delivered and handled (or put in
a dead letter queue so we can do a batch cleanup).

Also, as a general principle, I would make each of those sub-transactions
idempotent, so that retrying multiple times won't hurt and there is a natural
way of picking the winner if there are conflicting ongoing commit/retry
attempts.
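
The idempotency idea can be sketched like this (the `AccountService` class
and `op_id` parameter are my own illustrative names, not from any real
system): tag each attempt with a unique operation id and ignore ids that have
already been applied.

```python
# Idempotent sub-transaction: retries keyed by a unique operation id
# are no-ops, so a caller can safely retry after a timeout without
# knowing whether the first attempt landed.
class AccountService:
    def __init__(self):
        self.balance = 0
        self.applied = set()     # operation ids already handled

    def deposit(self, op_id, amount):
        if op_id in self.applied:
            return               # duplicate retry: ignore
        self.applied.add(op_id)
        self.balance += amount

svc = AccountService()
svc.deposit("op-123", 50)
svc.deposit("op-123", 50)        # retry after a timeout: safe
assert svc.balance == 50
```

In a real system the `applied` set would live in durable storage alongside
the balance, updated in the same local transaction.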

------
fagnerbrack
Why isn't the OP using Event Sourcing "commands" for the "Bank Accounts"
example?

~~~
victorNicollet
I believe Event Sourcing opens a whole other can of worms which would detract
from the point of the article, such as whether the event streams have a well-
defined order (especially if each account is its own aggregate), or whether
the resulting event should be "Transaction Completed" (assumes pre-conditions
were checked) or "Transaction Attempted" (checks pre-conditions before
altering state).

~~~
partisan
My team's event sourcing implementation stores and publishes events in a
well-defined order within a specific aggregate.

There are scenarios where a two-phase commit is used to ensure that invariants
across aggregates are maintained.

We use an ACID compliant database to store our domain events and we project
our events into a relational schema. When projecting events, we use
transactions to make sure the database updates from a specific event are
either all applied or not at all.

~~~
victorNicollet
Yes, this is a typical way of doing things in the CQRS/ES world, but there are
many things left unsaid:

\- would you have a single aggregate for "all bank accounts" so that one
financial transaction equals one event, or is there a good reason to have
multiple aggregates?

\- is command execution transactional, or is it possible for another thread to
generate (possibly conflicting) events between the time you check a
precondition and the time you write the intended events?

My team uses an ES implementation with very strong guarantees: one aggregate
per macro-service, and command execution is transactional, but I'm fairly
certain that this is not the way CQRS/ES is described or recommended.

~~~
partisan
I would have one aggregate instance per bank account. Each has a separate
state and business rules would need to be applied at the individual account
instance (e.g. overdraft fees, interest, etc). If you have to enforce rules at
the bank level based upon actions taken within the account then you can have a
bank aggregate that you can interact with through a saga and confirm changes
through a two phase commit.

Command execution is not transactional, but writing to the domain event store
is. That means that two writers cannot write the same domain event version
number to the event store. Say an aggregate is at version 20. The next action
would put it at 21. If two threads generate version 21 of the aggregate, only
one of them can write it to the event store and the other gets an exception.
Domain events are written in a transaction batch to the event store so we
cannot get any writes in the middle. So, the first to write the next version
wins. The other one loses.
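
The version-conflict rule described here can be sketched as follows (a toy
in-memory stream, not the actual implementation being discussed): a writer
must name the version it expects, and a stale writer gets an exception
instead of a silent double-write.

```python
# Optimistic concurrency on an event stream: an append succeeds only
# if the stream is still at the version the writer expects.
class ConcurrencyError(Exception):
    pass

class EventStream:
    def __init__(self):
        self.events = []         # version n means n events written

    def append(self, expected_version, batch):
        if len(self.events) != expected_version:
            raise ConcurrencyError("stream moved past expected version")
        self.events.extend(batch)  # the whole batch lands, or none of it

stream = EventStream()
stream.append(0, ["AccountOpened"])
stream.append(1, ["Deposited 100"])     # this writer saw version 1
try:
    stream.append(1, ["Withdrew 100"])  # stale writer also saw version 1
except ConcurrencyError:
    pass                                # loser reloads and retries
assert stream.events == ["AccountOpened", "Deposited 100"]
```

The first writer to claim the next version wins; the loser reloads state and
decides whether its command still makes sense.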

We also partition our command and event processors by aggregate id. The same
aggregate cannot have its events or commands processed more than one at a time
because the processors are synchronous per partition.

If I understand you correctly then you have one instance of a very large
aggregate per bounded context? If so, then no, it is not in the spirit of
CQRS/ES. If you are handling commands in transactions, then the designers of
your system may have chosen to favor consistency instead of the availability
that CQRS provides.

~~~
victorNicollet
We do have the same constraint on event store writes, which we then use to
implement command-level transactions: a command is a pure function that maps
a command model to a set of events, and if the set of events cannot be written
to the stream, the command model is updated and the function is called again.
This does, indeed, have a very strong bias towards consistency, although the
main objective is to allow multiple command processors per aggregate, which
lets us embed the command processors in services that are inherently multi-
instance (e.g. web applications).

I am not sure I understand why having multiple aggregates would provide
availability. Unless the idea is that the events from different aggregates are
stored on separate servers, so that a single-machine outage would only take
down some aggregates? If so, I agree that is the case in theory, though with
limited practical applicability in our situation.

~~~
partisan
Aggregates represent consistency boundaries. To enforce consistency you would
theoretically need to have your entire state loaded when you go to process a
command for a single bank aggregate.

Since each command produces a new version of the bank and many commands are
coming in at the same time, most of them will fail when writing to the event
store. Do they keep retrying until they succeed? Either way, this is not
efficient and could affect the availability of your servers.

If you have multiple aggregates, the consistency boundaries are smaller and
therefore you have a smaller state that you have to maintain consistency
across. There are fewer operations on a single aggregate and fewer
opportunities for contention arising from simultaneous operations.

Also, if you are using queues for your commands and events (we are, but I
suspect you are not), then you can partition your queues such that you can
process your workload N ways without worrying about things happening out of
order. Each aggregate processes serially within a partition. If you have just
one large aggregate then you would have to jump through a lot of hoops to be
able to partition the work while maintaining ordering guarantees.

I can really only guess at the details of your implementation, but I am
guessing that your design has accounted for all of the above in some way. If
there is one thing I've learned, it's that there are many, many ways to
implement CQRS. If your software works correctly under load, then I am not
sure it would matter if it fits squarely into the definition of CQRS. In fact,
you may have something new altogether that presents a better way to achieve
the same goals as CQRS without the same cognitive overhead.

~~~
victorNicollet
I understand better now what you're saying, though I would have called it
"ability to scale" rather than "availability".

Our architecture prevents us from parallelizing the execution of commands
within a bounded context. The ability to execute commands on any number of
servers is for availability (transparent failover if an instance dies) rather
than performance, since the event stream acts as a global lock anyway.

In practice, our system clocks in at a comfortable 1000 commands per second
under stress tests; during our peak hours and on our busiest aggregate, we
only have to execute one command per second, so we can afford at least a x100
increase before we need to consider changing our architecture (and we're no
longer an early-stage start-up, so x100 would mean a lot of revenue).

Similarly, the full state for our largest aggregate (not the command model,
but the entire state of all projections) fits snugly in a few hundreds of
megabytes, so all instances can keep it in-memory.

------
ah-
You can use event logs and eventual consistency to solve this problem.

Basically you make the transfer of money an event that is then atomically
committed to an event log. The two bank accounts then eventually incorporate
that state.
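
A minimal sketch of this pattern (my own toy code, not from the linked
article): the append to the log is the only atomic step, and each account's
balance is derived by folding the log, so there is never a state where the
money has left one account but not arrived at the other.

```python
# Transfer-as-event: one atomic append to the log, balances derived
# from it. A list stands in for a durable, atomically-appendable log.
log = []

def commit_transfer(src, dst, amount):
    # The single atomic commit point for the whole transfer.
    log.append({"from": src, "to": dst, "amount": amount})

def balance(account):
    # Each replica eventually folds the log into its local view.
    total = 0
    for e in log:
        if e["to"] == account:
            total += e["amount"]
        if e["from"] == account:
            total -= e["amount"]
    return total

commit_transfer("alice", "bob", 25)
assert balance("alice") == -25 and balance("bob") == 25
```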

See [http://www.grahamlea.com/2016/08/distributed-transactions-mi...](http://www.grahamlea.com/2016/08/distributed-transactions-microservices-icebergs/)

But I agree that life is often easier if you just keep things simple. If you
require strong consistency, as with the user/profile, don't make that state
distributed. If you do make it distributed, you need to live with less
consistency.

~~~
erikpukinskis
If we're talking about user-created events, I think you should have a
permanent log anyway. Otherwise, if there is any kind of bug or any kind of
weirdness in your data, how are you supposed to figure out what the heck
happened? You can try to outwit chaos by preventing all inconsistency and bad
data, but that's a game you will eventually lose.

From a more religious perspective, every action a user takes is sacred data
that you should not lose, unless you deliberately want to lose it for privacy
reasons. You can rely on your software to have no bugs and never confuse your
user, but again, that's a game you will eventually lose. I would rather keep
that action log so I have a place to rebuild from when things are lost.
Otherwise your only choice is to reverse-engineer your data - data which, if
not technically corrupt, is corrupted from a human standpoint.

And if you want undo you need a log and playback anyway.

To me data consistency at the database level is not a real solution to the
problem. It is a good tool, but it only solves a very narrow slice of
predictable bugs. It doesn't help at all with the inevitable bugs you can't
predict. A log based approach gives you a powerful tool in all kinds of tough
situations.

------
phamilton
This profile example is missing the better approach: avoid the dependency of
creating the user before creating the profile.

Create the profile with a generated uuid. Once that succeeds, then create the
user with the same uuid.

If you build a system that allows orphaned profiles (by just ignoring them)
then you avoid the need to deal with potentially missing profiles.

This is essentially implementing MVCC. Write all your data with a new version
and then as a final step write to a ledger declaring the new version to be
valid. In this case, creating the user is writing to that ledger.
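
A sketch of the ordering being proposed (the dict-backed stores are stand-ins
for two separate services): write the dependent record first under a fresh
uuid, then "commit" by creating the user that points at it. A crash between
the two steps leaves only an orphaned profile, which readers never see.

```python
import uuid

# Stand-ins for two separate services' storage.
profiles, users = {}, {}

def create_account(name):
    profile_id = str(uuid.uuid4())
    # Step 1: write the profile. If we crash here, the profile is
    # orphaned but harmless - nothing references it yet.
    profiles[profile_id] = {"bio": ""}
    # Step 2: the "ledger" write that makes the version valid.
    users[name] = {"profile_id": profile_id}
    return profile_id

pid = create_account("alice")
assert users["alice"]["profile_id"] in profiles
```

A periodic cleanup job can garbage-collect profiles that no user ever came to
reference, but correctness never depends on it running.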

~~~
kpmah
Well, we're playing make believe with the requirements. In fairyland the user
and the profile need to exist together. Only the fairies know why!

If you can relax the requirements, you can relax the constraints.

~~~
phamilton
That's the point. Requirements are rarely as strict as we presume they are.

------
morgo
Good article.

I've stopped using bank transfers as an example for ACID transactions, and
instead talk about social features:

\- if I change a privacy setting on Facebook or remove a user's access, these
changes should be atomic and durable

\- transactions offer a good semantic with which to make these changes. They
can be staged in queries, but nothing is successful until after a commit.

\- without transactions, durability is hard to offer. You would essentially
need to make each query flush to disk, rather than each transaction. Much more
expensive.

------
xarien
Depends on your POV. Startups undervalue it and corporations overvalue it. At
the end of the day, it's just risk management.

------
fagnerbrack
In case anyone is wondering what an "atomic change" means in database
terminology:
[https://www.gnu.org/software/emacs/manual/html_node/elisp/At...](https://www.gnu.org/software/emacs/manual/html_node/elisp/Atomic-Changes.html)

------
matt_wulfeck
Maybe I'm crazy, but I never see atomic libraries that are called like this:

    
    
        bank_account2.deposit(amount)
        bank_account1.deposit(amount)
    

Isn't this kind of thing always called in some atomic batch operation?

    
    
        transact.batch([
            account[a] = -8,
            account[b] = 8
        ]).submit()

~~~
JoachimSchipper
Yes; "with atomic...:" seems to be pseudo-syntax.

------
sz4kerto
The universally hated JEE can do distributed transactions by default. Yes,
with pitfalls, but it can. (It is usually hated by devs who have never used it
properly.)

~~~
jamesblonde
The problem with JEE is when you leave Java Transactions - i.e., interact with
some external system as part of a transaction. Then there is no way around
writing the transaction compensation/recovery/reconciliation logic to handle
partial failures. On the other point, I agree that JEE is unfairly maligned.
We build a modern AngularJS/Material frontend on a JavaEE 7 backend and get
all the benefits of a scalable, secure enterprise platform from Java EE.

------
bullen
But consider this:

You are using MySQL, and you make a transaction with, say, a deposit and a
withdrawal.

What happens on the MySQL machine if you pull the plug exactly when MySQL has
done the deposit but not the withdrawal?

The ONLY difference between SQL transactions and NoSQL microservice
transactions is the time between the parts of a transaction.

Personally, I use a JSON file with state to execute my NoSQL microservice
transactions, and it's a lot more scalable than having a pre-internet-era SQL
legacy design hogging all my time and resources.

~~~
matthewmacleod
I don't mean to be unkind, but is this meant to be a parody?

There are no "parts of a transaction" because a transaction is definitionally
atomic.

Transactional file modification is a fairly tricky problem, and I'd be
surprised if you'd actually implemented a safe system in that manner. What's
certain though is that spinning up MySQL or Postgres and using it to store
simple records is essentially a zero-cost setup task - so I doubt it's ever
going to "hog all your time and resources".

~~~
bullen
How is it atomic? This goes for all 5 of the replies (one of you did not
downvote the discussion; I'm guessing that person is the only one not from the
US). How does MySQL unwrite the deposit? There are always "parts" to
everything. Two-phase commit does not solve anything unless you have "undo the
thing I did before the power was lost".

Please refer to source code to prove your argumentation.

~~~
matthewmacleod
_Please refer to source code to prove your argumentation._

I'm not digging through source code for you; transactional atomicity is a
well-known and thoroughly researched problem.

Transactions in an ACID system are atomic by definition. They're designed in
such a way that either an entire transaction occurs, or no part of a
transaction occurs – in other words, it's not possible to partially apply a
transaction, by design.

There are a bunch of different approaches to implementing this. I'd guess that
MySQL does something like write the complete set of modifications in a
transaction to disk as a separate buffer, and only once the entire transaction
is complete updating some associated metadata to add the transaction to the
database. A power failure at any point will result in an uncommitted
transaction, which will have no effect on the database.
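
The general shape of such a commit protocol can be sketched with plain files
(a toy illustration of the principle, not how MySQL actually works): stage
the complete new state in a side file, fsync it, and make a single atomic
rename the commit point.

```python
import json
import os
import tempfile

# Stage all changes in a side file, then one atomic rename is the
# commit point: a crash before the rename leaves the old state intact,
# a crash after it leaves the new state complete. Never half-applied.
def commit(path, changes):
    state = json.load(open(path)) if os.path.exists(path) else {}
    state.update(changes)
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())      # durable before the commit point
    os.replace(tmp, path)         # atomic: readers see old or new state

with tempfile.TemporaryDirectory() as d:
    ledger = os.path.join(d, "ledger.json")
    # Both halves of the "transaction" land in one rename.
    commit(ledger, {"deposit": 100, "withdraw": -100})
    assert json.load(open(ledger)) == {"deposit": 100, "withdraw": -100}
```

Real engines use write-ahead logs rather than whole-state rewrites, but the
invariant is the same: a single atomic step flips the database from "none of
the transaction" to "all of it".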

Here's the appropriate Wikipedia article for further reading
-[https://en.wikipedia.org/wiki/Atomicity_(database_systems)](https://en.wikipedia.org/wiki/Atomicity_\(database_systems\))
\- suffice it to say that if you are rejecting the idea that atomicity
exists then I don't know what else to tell you. It's a core concept in
information systems.

~~~
bullen
I don't think so; if a MySQL transaction touches two separate tables, at some
point it has to write one and then the other, remembering to remove the first
if the second fails (with a power outage, for example).

I'm not rejecting anything, all I'm asking is; where is the code and how does
it work? If you don't know then how come you are so confident it does work?

I'm pretty sure the "write some status to a file and roll back if the
transaction is incomplete on startup" code is pretty horrible on all SQL
systems. And on top of that, it doesn't scale. Defending the status quo is
always worth questioning.

