In MySQL, a statement within a transaction can return "ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction". This can be forced by the following sequence of events. (This assumes an InnoDB table in Repeatable Read mode.)
Process A does a START TRANSACTION.
Process B does a START TRANSACTION.
Process A does a SELECT which reads row X.
Process B does a SELECT which reads row X.
Process A does an UPDATE which writes row X.
Process B does an UPDATE which writes row X, and gets the deadlock error.
A SELECT does lock parts of the database. "Repeatable read" means that if you read the same data item twice within the same transaction, you get the same result, even if someone else is changing the data. This requires locking. If you try to update the data in conflict with another process, you'll get a deadlock error, but if you COMMIT a select-only transaction, you won't. You do have to COMMIT select-only transactions, or you'll fill memory with locks and stall out updates.
MySQL in repeatable read mode actually doesn't prevent this anomaly (it doesn't deadlock). You need to be in serializable mode to get the deadlock.
Should MySQL have the same behavior? It detects the problem and blocks, but then does the update when the other transaction finishes, losing one update. I tried this for different values of "value" in each process, and it still fails. (The test sets the same value from each process, so you can't see who wins the race or if the database treated the update as a no-change transaction.)
Since you have a test suite, try getting it in front of the SQL committee for the next revision of the standard.
The author didn't really show that this was true. He obviously understands isolation well, so I would tend to believe him, but it would be nice to see an example.
In a simple case (not sure that it matches the exchange's case), postgresql in any configuration will prevent this problem:
CREATE TABLE account(
  id int8,
  balance numeric,
  check (balance >= 0)
);
Does someone have a counterexample, or more details about the case at the exchange that would not be solved by postgres?
I'd also like to point out that postgres's implementation of true serializability performs quite well; there isn't much of a cost to using it over snapshot isolation.
Your example check constraint indeed appears to work as advertised. Trying

UPDATE account SET balance = -1.00 WHERE id = 1;

fails with a check constraint violation.
Still it's not clear exactly what the author sees re: where postgres fails. I'll have to read the article again. With any luck someone more knowledgeable than I am will shed light on the question.
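For anyone who wants to rerun that test without a postgres instance handy, sqlite also enforces CHECK constraints, so the same statement can be exercised from Python (the id column is assumed, matching the UPDATE's WHERE clause):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account(id INTEGER, balance NUMERIC CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100)")
try:
    # The statement from the test above: drive the balance negative.
    conn.execute("UPDATE account SET balance = -1.00 WHERE id = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```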
This paper: http://dl.acm.org/citation.cfm?id=1376690 outlines the general issue. This paper discusses Postgres's solution: http://drkp.net/papers/ssi-vldb12.pdf
Also, Peter Bailis's blog has info and links to papers if you want more detail (http://www.bailis.org/blog/when-is-acid-acid-rarely/)
What I don't know is the particular problem the exchange encountered, and why postgres's isolation would not have prevented it. Based on the very brief descriptions of the problem that I've seen, it seems like postgres would have prevented it, but I don't have enough information to say for sure.
Think of a transaction that inserts a row representing a withdrawal, updates a materialized total balance, and checks that it's positive. Under snapshot isolation, two concurrent instances of this transaction could both commit. The materialized balance, however, would reflect only one of the debits, and would be inconsistent with queries that recompute the aggregate in full.
Postgres pre version 9.2-ish would allow this situation even in "serializable" mode. Later versions wouldn't.
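A toy, single-connection simulation of what snapshot isolation permits (the concurrency is faked by reusing one snapshot value, and the table names are invented for the demo): both transactions see the same starting balance, both positivity checks pass, and afterwards the materialized total disagrees with the full recomputation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE withdrawal(amount NUMERIC)")
conn.execute("CREATE TABLE summary(balance NUMERIC)")
conn.execute("INSERT INTO summary VALUES (100)")

# Both transactions start from the same snapshot: balance = 100.
snapshot = conn.execute("SELECT balance FROM summary").fetchone()[0]

# Txn 1: withdraw 60; check 100 - 60 >= 0 passes; commit.
assert snapshot - 60 >= 0
conn.execute("INSERT INTO withdrawal VALUES (60)")
conn.execute("UPDATE summary SET balance = ?", (snapshot - 60,))

# Txn 2: also saw balance = 100, so its check passes too.
assert snapshot - 60 >= 0
conn.execute("INSERT INTO withdrawal VALUES (60)")
conn.execute("UPDATE summary SET balance = ?", (snapshot - 60,))

materialized = conn.execute("SELECT balance FROM summary").fetchone()[0]
recomputed = 100 - conn.execute("SELECT SUM(amount) FROM withdrawal").fetchone()[0]
print(materialized, recomputed)  # 40 -20
```

The materialized total says 40; summing the withdrawal rows says -20. A serializable implementation (postgres's SSI included) would abort one of the two transactions instead of committing both.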
FYI I'm an engineer at FDB, happy to help.
> The idea of isolation is that we want our database to be able to process several transactions at the same time (otherwise it would be terribly slow)
Not necessarily true. Things like Prevayler and LMAX provide isolation by processing transactions one at a time, and they're very fast. They manage this by keeping everything relevant hot in RAM. LMAX, for example, can do 6 million TPS for a financial trading platform. You can read Martin Fowler writing about LMAX here: http://martinfowler.com/articles/lmax.html
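A minimal sketch of that single-writer idea (this is the spirit of the design, not Prevayler's or LMAX's actual API): all state lives with one thread, commands are queued to it, and each command runs to completion before the next starts, so the check and the update can never interleave.

```python
import queue
import threading

state = {"balance": 100}       # all mutable state owned by the writer thread
commands = queue.Queue()

def writer():
    while True:
        cmd = commands.get()
        if cmd is None:        # shutdown sentinel
            break
        amount, reply = cmd
        # Check and update are atomic: no other command can run in between.
        if state["balance"] - amount >= 0:
            state["balance"] -= amount
            reply.append("ok")
        else:
            reply.append("insufficient funds")

t = threading.Thread(target=writer)
t.start()

results = []
for _ in range(2):             # two 90-unit withdrawals against a balance of 100
    reply = []
    commands.put((90, reply))
    results.append(reply)
commands.put(None)
t.join()
print([r[0] for r in results], state["balance"])  # ['ok', 'insufficient funds'] 10
```

One withdrawal succeeds and one is refused, with no locks on the data itself; throughput then depends on keeping each command's work short and the state in RAM.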
Which, as far as production-common database implementations go, is lightspeed for implementing new academic work (9.1, the first version with the feature, was released in 2011).
"Internet commenters, in their infinite wisdom, were quick to point out that if you’re dealing with money, you had better use an ACID database. But there was a major flaw in their argument. Most so-called ACID databases — for example Postgres, MySQL, Oracle or MS SQL Server — would not have prevented this race condition in their default configuration."
I hesitate because I don't really understand the details of the situation the exchange faced. But, going by the linked references here:
it appears that the pattern in question, if translated very unnaturally to SQL, is something like:
CREATE TABLE account(id int8, balance numeric);
SELECT balance FROM account WHERE id = 123;
-- application sees 100, subtracts 90, sees that
-- it's still positive and does:
UPDATE account SET balance = 10 WHERE id = 123;
But that's ridiculous. Anyone using SQL would instead do:
CREATE TABLE account(id int8, balance numeric, check(balance >= 0));
UPDATE account SET balance = balance - 90 WHERE id = 123;
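The difference between the two patterns can be shown concretely with sqlite standing in for postgres (the lost update is simulated sequentially by reusing one stale read; same caveat as above, it's a sketch):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account(id INTEGER, balance NUMERIC CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (123, 100)")

# Unnatural pattern: both "clients" read 100, each computes 100 - 90 = 10,
# and both writes succeed -- the second withdrawal is silently lost.
read_a = conn.execute("SELECT balance FROM account WHERE id = 123").fetchone()[0]
read_b = read_a  # client B read before A wrote
conn.execute("UPDATE account SET balance = ? WHERE id = 123", (read_a - 90,))
conn.execute("UPDATE account SET balance = ? WHERE id = 123", (read_b - 90,))
lost_update_balance = conn.execute(
    "SELECT balance FROM account WHERE id = 123"
).fetchone()[0]
print(lost_update_balance)  # 10, after two 90-unit withdrawals from 100

# Natural pattern: the arithmetic happens inside the database, and the
# CHECK constraint rejects the second withdrawal outright.
conn.execute("UPDATE account SET balance = 100 WHERE id = 123")  # reset
conn.execute("UPDATE account SET balance = balance - 90 WHERE id = 123")
try:
    conn.execute("UPDATE account SET balance = balance - 90 WHERE id = 123")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```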
So, the author is technically right, but only if (a) you translate the NoSQL-isms into SQL in an unnatural way; and (b) you don't bother to use SERIALIZABLE mode, which costs very little in most situations.
I agree with the author that isolation is tricky, and developers should not be expected to understand the nuances. And I applaud the development of a testing framework to really understand the various kinds of isolation and how they apply to different products. But the example is a bad one, because it actually does work just fine in postgres, and probably many other systems.
Although your second example is probably what a human would write, an ORM framework would very likely generate a transaction looking like your first example.
Another example would be inserting a transaction into a table, and summing the transactions in the account in order to calculate the account balance. Making that safe requires preventing phantom reads, which means requiring serializability.
You say serializable costs very little in most situations. I can't claim to know what most situations are like, but I've seen many people who have tried serializable and found it too slow for them. User a-priori gives an example elsewhere in this thread: http://www.michaelmelanson.net/2014/03/20/transactions/
My point is that weak isolation is very subtle, easy to get wrong, and you don't know that you got it wrong until it's too late. We need better understanding and better tools so that concurrency is less easy to screw up.
And the point about ORMs is valid, but still not enough to back up your unqualified claim that using postgres would not have solved the problem. It might not have, but postgres offers a lot of tools to solve this problem (we didn't even discuss SELECT ... FOR UPDATE), and even inexperienced users are at least more likely to have stumbled into one of those solutions.
I very much agree that isolation issues are subtle traps for many users, even in SQL. I would like to see SERIALIZABLE become more common, and eventually the default, in postgres. I also like the fact that you're writing real tools to check up on these in a formal way.
But please be a little careful when making statements like that, because it can turn people away from the systems most likely to help them.
I would disagree that offering an alternate solution that avoids the problem is a rebuttal of the fact that there are quirks in isolation level implementations across RDBMSs that can surprise even seasoned developers.
It still looks to me like they would have stood a better chance with postgres.