
If Eventual Consistency Seems Hard, Wait Till You Try MVCC - boynamedsue
http://www.xaprb.com/blog/2014/12/08/eventual-consistency-simpler-than-mvcc/
======
AlisdairO
I certainly wouldn't dispute that it can be hard to reason about concurrent
operations in traditional database systems - particularly if you want to do it
across multiple database systems, which do indeed have differing
implementations. That said, if the author thinks eventual consistency is
easy/not-that-bad/whatever by comparison, I would tend to question their
understanding of programming for eventual consistency. Eventual consistency
for non-trivial applications is really, really, really hard to get right.

Consistency in relational databases is a hard topic because maintaining
consistency in any concurrent system is hard. Traditional systems provide you
with a set of tools (isolation levels, foreign keys) that make a decent
programmer capable of building a safe concurrent application. Throwing away
those tools and replacing them with nothing does not make life easier. The
tool is easier to understand, but the problem is harder to solve.

~~~
kokey
Something that also stands out to me from the article is the general
negativity about the implementations the author knows in depth, and the
positivity about the implementations he doesn't know that well and has only
read about. I think this is what often drives people to adopt new things that
avoid some old problems, while the new problems they create are not yet well
understood, even less so the possible solutions to those new problems.

------
jeffdavis
PostgreSQL supports true serializability, while maintaining good performance
and concurrency:

[http://www.postgresql.org/docs/9.4/static/transaction-iso.html#XACT-SERIALIZABLE](http://www.postgresql.org/docs/9.4/static/transaction-iso.html#XACT-SERIALIZABLE)

It's based on fairly recent research by Michael J. Cahill, et al.

It's simple. If you are confused, then set default_transaction_isolation =
'serializable'. You will get errors if there's a data race.
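
To make "errors if there's a data race" concrete, here is a minimal sketch
(the `accounts` table and its invariant are hypothetical) of the classic
write-skew race that SERIALIZABLE detects in PostgreSQL:

    -- In each session (or once in postgresql.conf):
    SET default_transaction_isolation = 'serializable';

    -- Hypothetical invariant: the two balances together must stay >= 0.
    CREATE TABLE accounts (id int PRIMARY KEY, balance int);
    INSERT INTO accounts VALUES (1, 50), (2, 50);

    -- session A:
    BEGIN;
    SELECT sum(balance) FROM accounts;   -- sees 100, decides -100 is safe

    -- session B:
    BEGIN;
    SELECT sum(balance) FROM accounts;   -- also sees 100
    UPDATE accounts SET balance = balance - 100 WHERE id = 2;
    COMMIT;                              -- succeeds

    -- session A again:
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    COMMIT;
    -- ERROR: could not serialize access due to read/write dependencies
    -- among transactions (SQLSTATE 40001). PostgreSQL may raise this at
    -- the UPDATE or at COMMIT; either way, just retry the transaction.

Under READ COMMITTED both commits would succeed and the invariant would be
silently broken; under SERIALIZABLE the application only needs a retry loop
for SQLSTATE 40001.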

Given that, what's the point of the article? That sub-SERIALIZABLE modes have
complex semantics? Yes, that's true, but they are still much more likely to
help you than the NoSQL "you're on your own to avoid races" approach.

If you want to avoid lots of really challenging problems, PostgreSQL is often
the best bet by far.

~~~
VieElm
> If you want to avoid lots of really challenging problems, PostgreSQL is
> often the best bet by far.

How about having zero single points of failure? How does PostgreSQL help you
avoid that challenging problem?

~~~
adontz
I'm pretty sure that every attempt to solve all development and maintenance
problems in a single piece of software will fail. Also, asking "How do I do X
with Y?" is the wrong framing: "How can you ensure zero single points of
failure?" is the right question, not "How does PostgreSQL help you ensure
zero single points of failure?".

------
Dave_Rosenthal
I agree with the author that the various levels of isolation, etc. within the
current crop of SQL databases are a morass. I'll point to some recent fine
work by Martin Kleppmann
([https://github.com/ept/hermitage](https://github.com/ept/hermitage)) that
explores the issue and shows how many systems fall short of serializable
isolation in their default modes. (And sometimes in modes labeled
"serializable"!) In his tests, three databases actually achieve full
serializability: PostgreSQL, MS SQL Server, and FoundationDB.

But don't give up on ACID yet! If you can actually get real serializability,
you have a very powerful model that is also very simple to reason about.
Serializable isolation gives you a virtual “whole database lock” that lets you
modify lots of things all over the database before you “unlock” it and hand
the database to the next client. The amazing thing about MVCC combined with
serializable isolation is that you get to use this "simplest possible" mental
model even though hundreds or thousands of concurrent clients might be all
hitting the database at the same time.

~~~
curun1r
> The amazing thing about MVCC combined with serializable isolation is that
> you get to use this "simplest possible" mental model even though hundreds or
> thousands of concurrent clients might be all hitting the database at the
> same time.

I think you've proved the author's point better than he ever could. With
serializable isolation, you don't have concurrent clients all hitting the same
data at the same time...you've got locking/blocking. The two things you're
trying to put together are mutually exclusive.

The whole point of the article is that writing concurrently to a database is
necessarily complex. ACID databases
attempt to sweep that complexity under the rug and create a simpler mental
model. But those abstractions are leaky and bubble up to the surface in ways
that are often unexpected or difficult to handle. This is especially true when
you try to scale beyond a single machine...achieving serializable isolation
whilst replicating between database nodes is all but impossible.

Most NoSQL databases take the approach of explicitly exposing the complexity
to the application with the assumption that concurrency is an absolute
requirement and the application will understand how to achieve consistency
better than a generic data tier can.

~~~
Dave_Rosenthal
Hmm... Serializable isolation and concurrency go together fine and certainly
don't require blocking or even locking when using MVCC and optimistic
concurrency. For that matter serializable transactions and scalability are not
incompatible either (though few NoSQL systems have tackled the problem yet).

I think the author's main point is that sub-serializable isolation levels are
confusing and can lead to wrong behavior in subtle ways. I agree, but to me it
sounds like the failing is in using too weak a consistency/isolation model,
not one that is too strong!

~~~
pbailis
> Hmm... Serializable isolation and concurrency go together fine and certainly
> don't require blocking or even locking when using MVCC and optimistic
> concurrency. For that matter serializable transactions and scalability are
> not incompatible either (though few NoSQL systems have tackled the problem
> yet).

For non-conflicting operations, I agree entirely that serializable
transactions can scale just fine. In fact, just about every serializable
concurrency control algorithm (e.g., two-phase locking, partitioned OCC, MVCC
with clever timestamp allocation) can scale just fine for non-conflicting
operations.

However, for conflicting read-write operations, serializability will incur
higher costs than many alternatives (e.g., eventual consistency, weak
isolation implemented in a scalable manner). Serializing access to shared data
items fundamentally requires some serial execution.
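
A quick sketch of that unavoidable cost, using a hypothetical hot counter
row; whichever serializable mechanism you choose, the conflicting
transactions cannot overlap:

    CREATE TABLE counters (id int PRIMARY KEY, n bigint);
    INSERT INTO counters VALUES (1, 0);

    -- session A:
    BEGIN;
    UPDATE counters SET n = n + 1 WHERE id = 1;   -- takes the row lock

    -- session B:
    BEGIN;
    UPDATE counters SET n = n + 1 WHERE id = 1;   -- blocks behind A...

    -- session A:
    COMMIT;                                       -- ...until A finishes

    -- session B now resumes and can COMMIT.
    -- With locking, the conflicting writes run strictly one after the
    -- other; with an optimistic scheme, one transaction aborts and
    -- retries. Either way, throughput on this one row is serial.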

This fundamental overhead is a key reason why almost every commodity RDBMS
defaults to one of these weaker isolation levels and why databases like Oracle
don't support serializable isolation at all. It's not that their respective
database architects don't _know_ how to implement serializability -- it's that
weak isolation is faster and, in many cases, reduces deadlocks and system-
induced aborts, even if it increases programmer burden.
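
Those defaults are easy to check from a client (a sketch; syntax as of the
then-current versions):

    -- PostgreSQL defaults to READ COMMITTED:
    SHOW default_transaction_isolation;   -- read committed

    -- MySQL/InnoDB defaults to REPEATABLE READ:
    SELECT @@tx_isolation;                -- REPEATABLE-READ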

~~~
Dave_Rosenthal
> Serializing access ... fundamentally requires some serial execution.

This is true, but I don't think this requirement limits scalability because
the serial part can be arbitrarily cheap (e.g. approve ordered batches of
work).

You also say that many databases could implement serializable transactions
but don't because of the "higher costs" and that "weak isolation is faster".
This sounds like a tradeoff to me, so, of course, there will never be one
right answer. Well, maybe someday for problems that permit I-confluence :)

However, the article attests to the high costs of sacrificing serializability
in programmer productivity and system complexity. Those are serious downsides
that need to be weighed very carefully against any actual, measured,
performance advantages that are on the table.

------
quizotic
One of the functions of an Operating System is to provide (the illusion of)
isolation between Processes and the resources they acquire. Roughly,

Transaction : Database :: Process : OperatingSystem

The four transaction isolation levels, defined decades ago, were an
attempt at categorizing incomplete isolation. Can you imagine launching an
operating system process and saying "It's OK if my memory locations are
changed by another process"? Or "Don't let my memory locations be changed by
another process, but it's OK if another process prevents me from deleting a
resource"? That's what we're asking of our non-serializable database
transactions.

Why not just provide strong isolation? Partly because it is hard and partly
because the performance impact of strong transaction isolation is greater than
the performance impact of process context switching.

But if you're living with less than strong transaction isolation, then
tautologically, strange and unexpected things eventually happen: seeing state
that never existed, seeing state changes that you didn't make, failing to see
state that you should have seen. Rarely, the application can reliably detect
and handle some of these situations. Typically, the application only thinks it
can.

Sometimes, I think the core issue is with the notions of isolation and
serializability themselves. Developers want to believe that events are noticed
simultaneously with their occurrence, and that all observers (transactions)
see the same history unfold in the same order. But the pesky physical world
doesn't work that way.

------
nickik
While reading this I was thinking about Datomic. I know that transactions are
serialisable and actually serialised and stored in the database. Essentially
what you get is a full DB lock and you can access the full database, or even
do whatever you want. That of course does lock the database transactor;
reading can still go on consistently.

Wasn't it also a finding of the VoltDB research that coordination is the
problem, and that you have to do all transactions on a single core? Am I
remembering this correctly?

Seems to me Datomic hits a very nice spot where you have relatively fast
writes and conceptually unlimited reads.

If somebody knows more than me, I'm happy to learn.

PS:

Also, why are we still saying "NoSQL" and making generalisations about it? By
now there are so many NoSQL databases that they have literally nothing in
common except that they are not traditional SQL databases.

~~~
krakensden
The author also has thoughts on Datomic:
[http://www.xaprb.com/blog/2013/12/28/immutability-mvcc-and-garbage-collection/](http://www.xaprb.com/blog/2013/12/28/immutability-mvcc-and-garbage-collection/)

tl;dr: append-only is great as long as you don't use your database much.

------
spacemanmatt
It would seem a much more honest argument to me if MVCC were being compared to
another implementation of consistency management, rather than compared to NO
IMPLEMENTATION at all.

~~~
amirouche
That's not the point of the article, which is not very well made. The author
tries to explain that MVCC also has its quirks under high write workloads;
it's not a silver bullet, even when considering "eventually consistent
databases" which "do nothing to help you", quite the contrary, because that
is a design choice.

I think this article must be read in the context of the "ACID or no ACID"
debate, where Google with F1 [1] and App Engine [2], FoundationDB, and
WiredTiger push for strong transactional features _because it's easier for
developers_.

[1]
[http://research.google.com/pubs/pub38125.html](http://research.google.com/pubs/pub38125.html)
[2] A Design for a Distributed Transaction Layer for Google App Engine

------
friendzis
> “what’s the difference between Consistency and Isolation again?”

This is what bothers me most about this article. How can a person with
seemingly advanced knowledge of relational databases not understand that
Consistency is about _dataset state_ and Isolation is about _transactions_?
It's not that you get consistent results between transactions or even
queries, but that if a key is an integral number, then `itoa()` works (given
the bit length is not too large) no matter what, period.

If you want to support concurrent read and write operations on a database,
then it's not much different from multithreaded programming: hard, and
impossible to get right without compromises.

~~~
amirouche
The author makes this statement rhetorically, without explaining it beyond
saying that all four ACID properties are tightly coupled.

In the case of Consistency and Isolation, you _might_ have an inconsistent
database if Isolation allows phantom reads. Now that I think (again) about
it, it seems to me Consistency can only be a consequence of Atomicity and
Isolation. Maybe not... because if Isolation allows writes based on phantom
reads, then the database will end up inconsistent even if it is Atomic and
otherwise strongly isolated.

tl;dr: yes, the author was quick in that sentence; that doesn't mean he is
stupid. The `itoa()` example is far-fetched and not very instructive.

~~~
friendzis
Sorry about that, had to finish up quickly. I will try to clarify my view, and
sorry again if this wall of text is too long. I just wanted to stress that
Consistency is about _data_ itself, while Isolation is about _dataset
acquisition_ (the same goes for Atomicity).

My view is that Consistency should only be mentioned when talking about data.
It is much like method argument type checking in type-safe languages: if you
get float passed in it WILL contain a valid number and you do not need any
additional explicit checking for that. Consistency is data validation.
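
In SQL terms, that kind of Consistency is what declared constraints give you;
a minimal sketch with a hypothetical schema:

    CREATE TABLE customers (
        id int PRIMARY KEY
    );

    CREATE TABLE orders (
        id          int PRIMARY KEY,
        customer_id int NOT NULL REFERENCES customers (id),  -- no orphans
        quantity    int NOT NULL CHECK (quantity > 0)        -- no bad rows
    );

    INSERT INTO customers VALUES (1);

    -- Rejected outright, regardless of isolation level or concurrency:
    INSERT INTO orders VALUES (1, 1, 0);    -- violates the CHECK constraint
    INSERT INTO orders VALUES (2, 99, 5);   -- violates the foreign key

Any row you can ever read from `orders` satisfies those rules, and that
guarantee needs no cooperation from concurrent writers.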

Isolation, on the other hand, is a form of concurrency control. Once you have
shared data between concurrent threads, care must be taken. SQL brings this to
a new level by having only shared data between threads (transactions). Since
performance is directly related to concurrency control techniques, some
compromises must be made. The article is just about those compromises. Phantom
reads can be viewed as a manifestation of holding and releasing locks
(semaphores) during thread execution.

I believe that ACID properties are all orthogonal, and databases should be
examined for where/when/how those properties are violated, e.g. Isolation is
not always ensured because of the mentioned MVCC.

Atomicity: data is treated in atomic datasets; Consistency: data validation;
Isolation: data immutability during a transaction; Durability: data retention
(and, as a consequence, Consistency) over prolonged periods of time.

But that is my own humble opinion.

------
epe
I find this:
[http://martin.kleppmann.com/2014/11/25/hermitage-testing-the-i-in-acid.html](http://martin.kleppmann.com/2014/11/25/hermitage-testing-the-i-in-acid.html)
a more informative read on essentially the same topic.

~~~
optimusclimb
Thanks for posting that link, I agree it does a slightly better job.

------
taeric
I'm not entirely sure I understand how things are any better in either world.
Especially if you are approaching the problem with the idea that you will have
no failure conditions, things are going to be tough. Prohibitively so.

------
mwcampbell
> Sorry, I'm not impressed with serializable isolation via a single writer
> mutex.

I think that for a large number of applications, this would work just fine.
Especially if the underlying storage is SSD-based -- something that's now
easily obtained via VPS providers like DigitalOcean, Linode, and Vultr. After
all, most database-driven applications aren't large-scale operations. So it's
probably better to favor correctness and simplicity over concurrency and
scalability.

------
Terr_
I think part of the goal with Eventual Consistency is that it lets you attack
the underlying issues in a qualitatively different way.

Relying on the DB, your options are constrained by your choice of relational
tables and rows, and further affected by your DB vendor and configuration.
When you lift the conflicts out to a higher layer of abstraction, you can do
something more object-oriented, leveraging some of the same tools you'll need
anyway to deal with other inconsistencies and quirks.

------
grogers
I'm probably the minority on this, but I think innodb's docs on its isolation
modes are clearer and easier to understand than postgres'. That could be
because I have internalized them over many years though.

To me, the only sensible isolation level for transactions is serializable.
We've had so many consistency issues because developers didn't understand when
to lock and when not to. Everyone (including me) gets it wrong from time to
time.

Very careful use of nonlocking selects can be correct and improve concurrency,
but across a huge codebase, bugs slip in. If I could turn on a mode where the
default is "select ... lock in share mode" but I could opt into a "select ...
nonlocking", I would be eternally grateful. Judicious use of wrappers that
force the user to specify the lock mode has helped curb this trend quite a
bit.
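
For reference, the InnoDB spellings being contrasted here look like this (a
sketch; `accounts` is a hypothetical table):

    -- Nonlocking (consistent) read: returns a snapshot and takes no locks,
    -- so a concurrent writer can change the row while you compute with it:
    SELECT balance FROM accounts WHERE id = 1;

    -- Locking read: blocks writers to the row until the transaction ends,
    -- so read-then-write logic stays correct:
    SELECT balance FROM accounts WHERE id = 1 LOCK IN SHARE MODE;

    -- Or, when the same transaction intends to update the row itself:
    SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;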

I'd still take MVCC with tons of warts over eventual consistency any day of
the week. It is impossible to reason about eventual consistency because the
database is literally allowed to do basically anything. Causal consistency
should really be the default starting point for AP systems.

~~~
Animats
_To me, the only sensible isolation level for transactions is serializable._

There's something to be said for this. As was discussed a few weeks ago,
"repeateable read" semantics differ quite a bit between MySQL/InnoDB and MS
SQL. In MySQL/InnoDB, you can have two transactions update the same record in
repeatable read mode, and lose the second update without an error being
reported.
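
Concretely, against a hypothetical `accounts` table where `balance` starts at
100, the lost update looks like this under InnoDB's REPEATABLE READ:

    -- session A:
    BEGIN;
    SELECT balance FROM accounts WHERE id = 1;        -- reads 100

    -- session B:
    BEGIN;
    SELECT balance FROM accounts WHERE id = 1;        -- also reads 100
    UPDATE accounts SET balance = 110 WHERE id = 1;   -- intends "add 10"
    COMMIT;

    -- session A:
    UPDATE accounts SET balance = 120 WHERE id = 1;   -- "add 20", based on 100
    COMMIT;
    -- InnoDB: final balance is 120, and B's +10 is silently gone.
    -- PostgreSQL's REPEATABLE READ instead aborts A's UPDATE with
    -- "could not serialize access due to concurrent update".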

But you take a huge hit on performance in serializable mode. You can only do
one thing at a time.

~~~
nehan
No. In serializable mode it simply _appears_ that one thing is happening at a
time. You can touch unrelated data concurrently with no performance impact.
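
For example (a sketch, again with a hypothetical `accounts` table): two
serializable transactions over disjoint rows run concurrently and both
commit, because there is no read/write overlap to order:

    -- session A:
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    UPDATE accounts SET balance = balance + 1 WHERE id = 1;

    -- session B, at the same time:
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    UPDATE accounts SET balance = balance + 1 WHERE id = 2;
    COMMIT;   -- succeeds

    -- session A:
    COMMIT;   -- also succeeds: the transactions share no data, so either
              -- serial order explains the result and nothing had to wait.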

------
rectang
While they aren't distributed, Apache Lucene and Apache Lucy (the "loose C"
port of Lucene that I work on) both implement MVCC: each index reader object
represents a point-in-time view of the index that persists for the object's
lifetime.

Core developers such as myself have to deal with some subtleties when
implementing MVCC, but I wouldn't say that MVCC is too terribly hard for our
users. The thing is, our interface does not look anything like SQL -- you have
to use a different mental model when interacting with our data stores, and of
course they are not suitable for all applications where you might use SQL.

What I took away from the article is that MVCC does not fit well within SQL
semantics.

------
grandalf
This is a great thing to read and discuss, as opposed to blindly jumping on
one bandwagon or another.

A relational database is a framework for storing data that comes with some
rules and in exchange helps to guarantee certain characteristics. A flat file
is too. The choice of which one to use depends on the problem.

Often there are several good choices. Imagine all the relational db zealots
who frown upon using a relational db in a slightly unconventional way, such as
storing events in a single table using a JSON column in Postgres. Yet Postgres
provides some useful features, and so it may actually be very smart to use it
that way.

------
klochner

> Does the first part of this excerpt contradict the second part? (Emphasis
> mine)

No, it's perfectly clear and consistent. Does anyone else find it confusing?

------
debacle
I wish people would not blog as experts about things they don't know a great
deal about.

------
empthought
> Many developers, including myself, have written applications that fall afoul
> of the MVCC implementation and rules.

Color me shocked.

------
imanaccount247
He's trying to claim that MVCC pushes complexity onto the user, but the
examples of that are just typical "MySQL does it wrong". So that means MySQL
pushes complexity onto the user, not MVCC.

~~~
spacemanmatt
To his credit, the author acknowledges PostgreSQL is not his strong suit. As a
pg fanatic myself, I think there is a lot of negativity toward SQL in general
that is more fairly leveled at MySQL and Oracle in particular.

