
Practical Guide to SQL Transaction Isolation - beliu
https://begriffs.com/posts/2017-08-01-practical-guide-sql-isolation.html
======
IgorPartola
One tool to add to your toolbox as a developer is named locks. Say you have a
table for orders and a table for order items and a table for payments.
Collectively they represent an order, but that is a higher level concept that
isn't something an RDBMS understands. When the user on your site
simultaneously sends two updates, such as "remove item from the cart" and "pay
and finish", you will have a slew of simultaneous updates to the rows across
the three tables. Unless you use the highest isolation level, you will get
inconsistencies. If you do use serializable isolation level you might get a
deadlock, depending on how conscientious you are about updating everything in
the same order. You could do an initial read from the orders table with FOR
UPDATE which will provide you an implicit lock on that order, but you still
have issues with new rows being inserted into your order items table that will
not be range locked.

So use named/advisory locks: create a named lock called "order-1234", where
1234 is the ID of the order. Acquire it before doing any manipulations to the
order across all its tables. Then release it. Semantics of named locks differ
across RDBMSs. Unsurprisingly Postgres has saner and more flexible semantics
than MySQL, but both are perfectly suitable for the purpose. This technique
can save you a ton of headaches.
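A minimal in-process sketch of the pattern (Python threading locks stand in for the database's advisory locks; in Postgres the real thing is `SELECT pg_advisory_xact_lock(hashtext('order-1234'))`, which is held until the transaction commits or rolls back — the function and table names below are made up):

```python
import threading
from contextlib import contextmanager

# In-process stand-in for a named/advisory lock registry. The registry
# guard only protects lazy creation of per-name locks.
_locks = {}
_registry_guard = threading.Lock()

@contextmanager
def named_lock(name):
    with _registry_guard:
        lock = _locks.setdefault(name, threading.Lock())
    with lock:
        yield

def finish_order(order_id, log):
    # All mutations to the order's three tables go inside the lock, so
    # "remove item from the cart" and "pay and finish" can never interleave.
    with named_lock(f"order-{order_id}"):
        log.append(f"updated order {order_id}")
```

Two simultaneous requests for the same order simply serialize on the name; requests for different orders don't contend at all.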

~~~
stickfigure
_If you do use serializable isolation level you might get a deadlock,
depending on how conscientious you are about updating everything in the same
order._

It sounds like you misunderstand optimistic concurrency. If your transaction
touches a piece of data, and that data changes (by another transaction) while
your transaction is in progress, then your transaction will roll back on
commit. With a tiny bit of library code, your transaction will retry until
success. In the case of severe contention you may see repeated retries and
timeouts, but you will never see deadlocks - at least one transaction will
successfully commit each "round".

I disagree - database locks should not be used by the average developer.
Understand optimistic concurrency and use serializable isolation. If (and only
_if_!) you have a known performance problem with highly contended data,
consider a locking strategy. 99 out of 100 developers will never need it, and
most of that 1% will botch locks on the first try anyways.

If your database doesn't support optimistic concurrency, get a better
database.
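The "tiny bit of library code" might look roughly like this sketch, where `SerializationFailure` stands in for your driver's conflict error (in Postgres, SQLSTATE 40001):

```python
import random
import time

class SerializationFailure(Exception):
    """Stand-in for the driver's conflict error (SQLSTATE 40001 in Postgres)."""

def run_transaction(unit_of_work, max_retries=10):
    """Run unit_of_work, retrying whenever the commit loses an optimistic race."""
    for attempt in range(max_retries):
        try:
            return unit_of_work()
        except SerializationFailure:
            if attempt == max_retries - 1:
                raise
            # Brief randomized backoff so contending transactions de-synchronize.
            time.sleep(random.uniform(0, 0.01 * (attempt + 1)))

# Demo unit of work that loses the race twice before committing.
attempts = {"n": 0}
def flaky_commit():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"
```

The caller just writes `run_transaction(flaky_commit)`; the retry policy lives in one place instead of being scattered through application code.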

~~~
IgorPartola
I partially agree with what you are saying. I understand optimistic
concurrency, especially the constraints it places on your code: you must in
all cases access the data in the same order, otherwise you do get deadlocks.
You also must implement the retry mechanism. I am not familiar with all the
web frameworks out there (this is where I often encounter these types of
issues), but I don't know of one that automatically retries requests for you.
That "tiny bit of library code" is something that as far as I know _you_ have
to create for every situation, and it will be specific to your situation. I
have yet to see a production codebase that actually does this instead of
simply returning an error to the user saying "try again".

If your database doesn't fully support ACID transactions, yes, get a better
DB. And no, you don't need to use named locks all the time. As an example, I
have a codebase where I use the standard transaction semantics everywhere
using the Django ORM, except one very specific and critical bit of code where
I want to be dead certain that a thing can't happen twice in a row. So
basically out of, say, 100 or so places where I commit a transaction, only one
actually uses named locks, but where it does, it solves the problem in a very
elegant way.

So I agree with you that you shouldn't litter your code with these. But you
absolutely can use them either to (a) guard complex and critical parts
of your code or (b) quickly augment a huge existing codebase that
keeps running into deadlocks or, worse yet, lock timeouts.

~~~
stickfigure
_I understand optimistic concurrency, especially the constraints it places on
your code: you must in all cases access the data in the same order, otherwise
you do get deadlocks._

The second half of this statement negates the first :-)

You have confused pessimistic concurrency with optimistic concurrency. An
optimistic system detects collision _at commit time_ and the actual order of
operations in your transaction is irrelevant. You will never experience a
deadlock, only aborts/retries.

In a pessimistic concurrency scenario, you take explicit locks as you access
resources. Those locks block access by other transactions. This creates the
dining philosophers problem and the risk of deadlocks. When locking resources,
you must access resources in a consistent order to prevent deadlocks.
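The "consistent order" rule can be sketched in a few lines (Python threading locks stand in for row/table locks; the sort is what breaks the dining-philosophers cycle):

```python
import threading
from contextlib import contextmanager

# Pessimistic locks for three resources. A wait cycle (deadlock) can only
# form if two transactions grab them in different orders.
LOCKS = {"orders": threading.Lock(),
         "order_items": threading.Lock(),
         "payments": threading.Lock()}

@contextmanager
def locked(*names):
    # Always acquire in sorted order, release in reverse: no cycles possible.
    ordered = sorted(names)
    for n in ordered:
        LOCKS[n].acquire()
    try:
        yield
    finally:
        for n in reversed(ordered):
            LOCKS[n].release()
```

Even if one caller asks for `("payments", "orders")` and another for `("orders", "payments")`, both actually acquire in the same order, so neither can hold a lock the other is waiting on.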

Please avoid database locks and let smart databases do what they are supposed
to!

~~~
IgorPartola
Learn something new every day. Yes you are correct, I meant pessimistic
concurrency because that's what I'm familiar with. Which databases support
optimistic concurrency and full ACID transactions?

~~~
stickfigure
Postgres, MS SQL Server, Oracle... and more exotic ones like Google's Cloud
Datastore and Spanner. Probably many more.

~~~
IgorPartola
Interesting. I use Postgres and can't find any mention of this in the docs:
[https://www.postgresql.org/docs/current/static/mvcc.html](https://www.postgresql.org/docs/current/static/mvcc.html)
talks about the usual row and table locks, and
[https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=Opti...](https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=Optimistic)
is empty. Is it known by some other name in the Postgres land?

Also, I found this and similar on SO:
[https://stackoverflow.com/questions/17431338/optimistic-
lock...](https://stackoverflow.com/questions/17431338/optimistic-locking-in-
mysql)

So if this technique actually requires me to write my database access code at
this level, it means that using something like an ORM is nearly impossible in
a way that makes an ORM useful.

~~~
stickfigure
That stackoverflow answer is wrong - or at least, only applies to MySQL. The
author is apparently unaware of other databases.

If you put Postgres in repeatable read or serializable[1] mode, it manages
optimistic concurrency for you. Collisions will produce an error in the second
transaction attempting to commit. The only "trick" is that your client code
should be prepared to retry transactions that have concurrency failures, and
this is very easily abstracted by a library. Your SQL or ORM code does not
change (I use Hibernate, fwiw).

Some good reading:
[https://www.postgresql.org/docs/current/static/transaction-i...](https://www.postgresql.org/docs/current/static/transaction-
iso.html#XACT-SERIALIZABLE)

[1] Repeatable Read won't detect collisions caused by phantom rows, so pick
carefully. Most CRUD apps don't care about phantom rows.

~~~
IgorPartola
How does this work if your transactions also do external things? Say you have
two concurrent transactions that charge a credit card via an external API, as
well as mark the order as paid in your DB? Retrying that transaction would be
disastrous, no?

~~~
stickfigure
This is a problem that affects all distributed systems irrespective of
database isolation levels. For example, your call to the external API might
time out; did the charge succeed or not? Database locking doesn't help you.

The solution comes in two parts:

1) Ensure your call to the card processor is idempotent and retry it. For
example, grep Stripe's documentation for the "Idempotency-Key" header. Every
processor should have a similar mechanism. Conveniently, if your database
transaction unit-of-work includes the remote call, it will retry automatically
in an optimistic scenario.

2) Get a webhook call from the credit card processor. Even though you are
retrying your transaction, you can't guarantee that your server didn't crash
without a successful commit. This potentially leaves a charged card without
recording the purchase; the webhook will sync it up afterwards.

Distributed systems are hard if you think past the happy path. I wish every
REST API would have an equivalent of Idempotency-Key.
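A toy sketch of point 1 (the class and field names are made up; Stripe's real mechanism is the `Idempotency-Key` HTTP header, which makes a retried POST replay the original response instead of charging twice):

```python
import uuid

class ToyProcessor:
    """Hypothetical card processor that honors an idempotency key:
    retries with the same key replay the stored result, not the charge."""
    def __init__(self):
        self._results = {}
        self.charges_made = 0

    def charge(self, amount_cents, idempotency_key):
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"id": f"ch_{self.charges_made}", "amount": amount_cents}
        self._results[idempotency_key] = result
        return result

# The caller generates one key per logical payment, then retries freely.
processor = ToyProcessor()
key = str(uuid.uuid4())
first = processor.charge(5000, key)
second = processor.charge(5000, key)   # a retry after a timeout
```

Because the key identifies the logical payment rather than the request, a retried transaction (or a client that never saw the first response) is harmless.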

~~~
IgorPartola
Yup. I'm familiar with Stripe's solution to this and am really happy with it. I
also wish more APIs had that.

But this brings me back to my original point: having a library that blindly
retries POST requests that result in database transactions on every conflict is
not going to work. Until every external API, from printers to credit card
processors, has an Idempotency-Key equivalent, you just can't do that
generically; you will have to do it with a custom bit of code for every
situation.

Isn't it easier to just acquire an advisory lock for the critical bit and take
your own database's concurrency contention out of the equation?

~~~
stickfigure
A lock still doesn't give you a transaction across two systems that aren't
transactionally connected. If you're specifically worried about retries you
can create a "task" (row in a table) in the transaction and have a consumer
read tasks and propagate the message.

A lock is seductive because in the happy path it looks like you have a
distributed transaction. In practice making reliable calls across distributed
systems involves more complexity. The hard part is figuring out what to do
when your remote call times out and you aren't sure if it succeeded; once you
resolve that, you can usually work with any concurrency scenario.
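A sketch of the task-row idea using stdlib `sqlite3` (table names are made up). The point is that the task commits atomically with the business change, so a crash can never record the order as paid without also recording the need to make the remote call:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
db.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, payload TEXT,"
           " done INTEGER DEFAULT 0)")

# One transaction: the order update and the outgoing task commit together.
with db:
    db.execute("INSERT INTO orders (id, status) VALUES (1, 'paid')")
    db.execute("INSERT INTO tasks (payload) VALUES ('charge-card:order-1')")

def drain_tasks(db, send):
    """Consumer: deliver pending tasks (idempotently) and mark them done."""
    for task_id, payload in db.execute(
            "SELECT id, payload FROM tasks WHERE done = 0").fetchall():
        send(payload)                      # remote call goes here
        db.execute("UPDATE tasks SET done = 1 WHERE id = ?", (task_id,))
    db.commit()
```

The consumer can crash and re-run safely as long as `send` is idempotent, which is exactly where the Idempotency-Key discussion above comes back in.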

------
contingencies
Weird to see hand-drawn sideways semi message sequence charts. (Looks vaguely
reminiscent of musical notation!)

For those yet to discover it, a great tool is
[http://www.mcternan.me.uk/mscgen/](http://www.mcternan.me.uk/mscgen/) ...
obligatory JS equivalent [https://mscgen.js.org/](https://mscgen.js.org/)

------
barrkel
This article mostly addresses transaction isolation from the correctness point
of view, rather than the tradeoff between semantics and performance.

Higher isolation levels add read locks that can block concurrent transactions.
That's the biggest practical reason everyone doesn't use serializable, but
this practical guide doesn't really address it. There's not much discussion of
the gotchas whereby you can accidentally break concurrency in an application.

For example there are somewhat surprising sources of contention, e.g. adding
rows with foreign keys adds locks to the rows being referenced in foreign
table, to stop them getting deleted. Or consider gap locks in queries with
range predicates.

IMO the guide is mostly theoretical / introductory rather than practical. It's
an OK starting point for someone who knows almost nothing about databases that
are being used as shared state for a concurrent application.

~~~
anarazel
> Higher isolation levels add read locks that can block concurrent
> transactions.

Not in postgres. REPEATABLE READ doesn't add any locks, and SERIALIZABLE adds
locks, but they're optimistic. So blocking isn't something you're going to see
- the relevant thing to measure is going to be the rate of transactions being
rolled back due to serialization failures.

~~~
barrkel
Sure, but the article isn't about postgres.

~~~
anarazel
> Sure, but the article isn't about postgres.

Indeed. But your point was:

> Higher isolation levels add read locks that can block concurrent
> transactions.

Which isn't generally true, given the example of postgres.

------
blaisio
This is a great article. It explains the various concurrency issues in fairly
plain English, which is great! I wish more people were aware of these
isolation levels and their consequences. Many application developers will
either a) assume the database will always figure it out for them (in other
words, they assume it is always serializable, with no errors) or b) hold great
fear of the database and its mystical use of locks - this type of developer
tends to avoid running queries at all cost, lest they wake the beast!

------
nthcolumn
I was an Informix DBA years ago and the other day an old friend who now works
with a company supporting game developers told me that Informix is (still)
really popular principally due to the way it handles locks and multi-
concurrency. Surprised to say the least.

------
groue
I'd also like to thank the author for his attempt at raising general
awareness of transaction isolation. Even when you deal with simpler databases
like SQLite, isolation is a serious topic. When serializing accesses hinders
performance, dealing with the "WAL mode" that gives SQLite single-
writer-multiple-readers abilities is anything but trivial, because of the
huge amount of documentation that has to be read in order to reach the desired
level of isolation. I've written an SQLite library in Swift that provides _by
default_ the snapshot isolation level, which gives most developers both
peace of mind and non-blocking reads. Check out its "concurrency guide" if
you're interested:
[https://github.com/groue/GRDB.swift/blob/master/README.md#co...](https://github.com/groue/GRDB.swift/blob/master/README.md#concurrency)

~~~
groue
Unique to GRDB.swift, as far as I know, is the ability to perform database
writes and then wait until snapshot isolation has been reached in another
connection before releasing the writing lock. This allows reading from a
_known_ database state while allowing other writes to be performed. The
ability to read from a precisely known database state enables nice features
like database observation and reactive patterns, while taking advantage of
the WAL mode by not holding locks longer than necessary.

------
marklgr
I'm not sure I get the Dirty Writes example about not always being able to
roll back: it first says going back to state A would lose w2[x], so it stays in
state C. Then it says that aborting t2 can't go back to states B or C,
concluding that the scenario thus can't be rolled back. But what about going

------
sciurus
If you enjoy this post, you'll _love_ Martin Kleppmann's book Designing Data-
Intensive Applications. Chapter 7 covers exactly this topic in more depth.

[http://dataintensive.net/](http://dataintensive.net/)

[https://www.safaribooksonline.com/library/view/designing-
dat...](https://www.safaribooksonline.com/library/view/designing-data-
intensive-applications/9781491903063/)

------
Twisell
Very interesting, although I'm not sure about the last point, where the author
asserts that transaction errors can't be managed inside a pl/pgsql function.

It's a little bit twisted, but I think you could write a pl/pgsql function
that acts as a client of its own database using fdw or dblink.

This is pretty twisted and might have side effects, but I'm pretty
curious about it.

Technically the author is still damn right, because this will be using a
client library at some point, as he explains.

------
DDerTyp
Off topic: I LOVE your theme / CSS. Wish there were a WordPress theme
like that! :)

