
What Write Skew Looks Like - foldU
http://justinjaffray.com/what-does-write-skew-look-like/
======
ztorkelson
Well written and detailed comparison of snapshot and serializable isolation
levels.

From the article:

 _A more concrete reason than “nothing else makes sense” is a question of
local vs. global reasoning. If a set of transactions must maintain some kind
of invariant within the database (for instance, the sum of some set of fields
is always greater than zero). In a database that guarantees serializability,
it’s sufficient to verify that every individual transaction maintains this
invariant on its own. With anything less than serializability, including
Snapshot, one must consider the interactions between every transaction to
ensure said invariants are upheld. This is a significant increase in the
amount of work that must be done (though in reality, I think the situation is
that people simply don’t do it), a point made by Alan Fekete in this talk[1]
on isolation._

This bears repeating. In any non-trivial transactional application, the
combination of concurrency and non-serializable transaction isolation levels
is a serious bug farm. Most app/service backends fall into this trap.

Unfortunately, most OLTP database systems have incredibly poor implementations
of the serializable isolation level. Postgres is a notable exception, and it's
nice to see some newer database systems--like CockroachDB--making progress in
this area.
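
For anyone who hasn't used it: opting in per transaction in Postgres is a one-liner, and it can also be made the default. A minimal sketch (be prepared to retry on serialization failures):

    -- Per-transaction opt-in to Postgres's serializable isolation:
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    -- ... reads and writes ...
    COMMIT;  -- may fail with a serialization error (SQLSTATE 40001); retry

    -- Or set it as the default for the session:
    SET default_transaction_isolation = 'serializable';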

[1] [https://www.youtube.com/watch?v=IP-S_RHlsEQ](https://www.youtube.com/watch?v=IP-S_RHlsEQ) (Thanks for the link; I hadn't seen this talk before.)

~~~
trhway
>If a set of transactions must maintain some kind of invariant within the
database (for instance, the sum of some set of fields is always greater than
zero). In a database that guarantees serializability, it’s sufficient to
verify that every individual transaction maintains this invariant on its own.
With anything less than serializability, including Snapshot, one must consider
the interactions between every transaction to ensure said invariants are
upheld.

This is a difference between practice and theory. The invariant mentioned above is of theoretical interest. In practice such an invariant would look like "the sum of some fields is always equal to the value in that field of that row" (it is left as an exercise to the reader to see that the former theoretical invariant, and indeed any other, can without loss of generality always be implemented as this practical invariant with a target field :) Since basic Read Committed still means serialized write access to that "sum" target field, it is enough to verify only that each transaction writing to any of those fields also writes to that "sum" field and upholds the invariant. There is no need to pay the huge performance price of serializing those transactions against all the rest of the transactions in the system. This is why Read Committed is verifiably enough for the correct execution of so many applications (i.e. any application whose invariants are implemented in that "practical" form). It is basically like multi-threaded programming with correctly implemented synchronization of access to shared data.

~~~
voidmain
> it is enough to verify only that each transaction writing to any of those
> fields is also writing to that "sum" field and that it does uphold the
> invariant

This is a good try at a theory of when you can use weak isolation. But as
stated, I believe it is false. If the invariant is A+B=C, I think this
sequence is permitted by READ COMMITTED:

T1: read A=1

T2: read B=1

T2: write A=2

T2: write C=3

T2: commit

T1: write B=10

T1: write C=11

T1: commit

Now A=2, B=10, C=11
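
For concreteness, here is a sketch of that interleaving as two READ COMMITTED sessions against a hypothetical one-row table t(a, b, c) holding A, B, C:

    CREATE TABLE t (a int, b int, c int);
    INSERT INTO t VALUES (1, 1, 2);  -- invariant holds: 1 + 1 = 2

    -- Session 1 (T1)                     -- Session 2 (T2)
    BEGIN;  -- READ COMMITTED default
    SELECT a FROM t;  -- sees A = 1
                                          BEGIN;
                                          SELECT b FROM t;  -- sees B = 1
                                          UPDATE t SET a = 2, c = 3;
                                          COMMIT;
    UPDATE t SET b = 10, c = 11;  -- C computed from the stale A = 1
    COMMIT;
    -- Final row: a = 2, b = 10, c = 11, and 2 + 10 <> 11.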

More to the point, invariants aren't always local to a row, and people don't
usually even fully articulate them.

> There is no need to pay the huge performance price of serializing those
> transactions with all the rest of transactions in the system

Serializable execution can be made very, very fast when there aren't actually a lot of conflicts. And when there are, you can do the hard work of trying to prove that something weaker will work. The failure mode in the other direction is silent corruption of your data.

~~~
trhway
What I was talking about, in its simplest implementation (more performant/scalable variants are more complicated, so I won't go into them now), would look like this: the transactions involved with an invariant should start by locking the "target" field of the invariant, the field C in your example:

T1: lock C

T2: attempt to lock C - blocked until T1 completes.

I.e. we have these two transactions serialized between themselves, without unnecessary extra serialization against all the rest of the transactions in the system. The "target" field just plays the role of the mutex associated with the invariant (and this is why any invariant can be converted to that form, just by having a mutex, i.e. a "target" field, associated with it). From the theory of multi-threaded programming with mutexes we know that everything can be implemented correctly in such a model. I.e. any application can be made correct with READ COMMITTED isolation if the required set of "mutexes" and their acquire/release is correctly implemented.
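
In SQL terms, a minimal sketch of that pattern, reusing the hypothetical one-row table t(a, b, c) from above with a row lock standing in for the mutex:

    BEGIN;  -- READ COMMITTED suffices
    -- "Lock C": a concurrent transaction reaching this line blocks until
    -- we commit, so the two are serialized only against each other.
    SELECT c FROM t FOR UPDATE;
    SELECT a, b FROM t;           -- read the inputs under the lock
    UPDATE t SET b = 10, c = 11;  -- re-establish A + B = C
    COMMIT;                       -- releases the row lock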

>Serializable execution can be made very very fast when there aren't actually
a lot of conflicts.

Not exactly. Serializability is itself a pretty expensive global invariant, and your application would pay for it in performance and scalability even though it may not really need it.

~~~
voidmain
> any application can be made correctly with READ COMMITTED isolation if
> required set of "mutexes" and their acquire/release is correctly implemented

This is true, but it's "a big if." And the result might be faster than a good
implementation of serializable isolation (since the latter is working to
maintain some invariants that you don't actually care about), but it could
_easily_ be much slower: very broad invariants will require very broad locks
which cause much more "serialization" than is required for serializability.

Let's say you want the same sum invariant but across rows: all _users'_ balances must sum to zero. The straightforward application of your reduction
balances must sum to zero. The straightforward application of your reduction
is to essentially have a global mutex for all balance transactions, but this
is nowhere near the most scalable solution. Whereas a good serializable
database will handle this situation scalably, safely, and automatically.
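
To make the contrast concrete, a sketch with hypothetical table names: the reduction funnels every transfer through one lock, while SERIALIZABLE lets disjoint transfers commit concurrently:

    -- The reduction: a single "mutex row" guards the global sum invariant.
    BEGIN;  -- READ COMMITTED
    SELECT 1 FROM balance_sum_mutex FOR UPDATE;  -- global choke point
    UPDATE balances SET amount = amount - 5 WHERE user_id = 1;
    UPDATE balances SET amount = amount + 5 WHERE user_id = 2;
    COMMIT;

    -- Serializable: no explicit lock; transfers touching different users
    -- don't conflict and can run in parallel.
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    UPDATE balances SET amount = amount - 5 WHERE user_id = 3;
    UPDATE balances SET amount = amount + 5 WHERE user_id = 4;
    COMMIT;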

------
sriram_malhar
I like Jim Gray's simple explanation (as always) of the difference.

A database has two white marbles and two black marbles. Transaction T1: Update
all white to black. Transaction T2: Update all black to white.

A serializable database will ensure that the transactions happen in some
serial order. At the end of performing them one after another, either all
marbles are white or all black.

In snapshot isolation, T1 updates only the white marbles and T2 updates only the black marbles, so they are not seen as conflicting. The end result is that the database contains two black marbles (formerly white) and two white marbles (formerly black), because the two transactions were each working off a snapshot and never conflicted.
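
Since Postgres's REPEATABLE READ is implemented as snapshot isolation, the marble example can be reproduced directly (a sketch with an assumed marbles table):

    CREATE TABLE marbles (id int, color text);
    INSERT INTO marbles VALUES (1, 'white'), (2, 'white'),
                               (3, 'black'), (4, 'black');

    -- Session 1 (T1)                      -- Session 2 (T2)
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    UPDATE marbles SET color = 'black'
      WHERE color = 'white';
                                           BEGIN ISOLATION LEVEL REPEATABLE READ;
                                           UPDATE marbles SET color = 'white'
                                             WHERE color = 'black';
    COMMIT;
                                           COMMIT;
    -- The write sets are disjoint, so snapshot isolation sees no conflict:
    -- the colors simply swap. Under SERIALIZABLE, one of the two would
    -- abort with a serialization failure.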

------
jchrisa
If you are interested in a similar discussion of consistency levels, write
skew, and how FaunaDB allows indexes to opt-in to serializability, this blog
post covers a lot of ground, but brings up index write skew toward the end:
[https://blog.fauna.com/acid-transactions-in-a-globally-distributed-database](https://blog.fauna.com/acid-transactions-in-a-globally-distributed-database)

------
dang
Url changed from [https://www.cockroachlabs.com/blog/what-write-skew-looks-like/](https://www.cockroachlabs.com/blog/what-write-skew-looks-like/), which points to this.

~~~
zokier
Pictures are now 404 for me at justinjaffray.com, so while the crdb link might not be canonical, it is a bit more useful right now.

~~~
foldU
Sorry, that was my fault! Should be fixed now.

