Doubt this ever really happened. Can hardly imagine debating serializability in a court of law; Oracle v. Google was bad enough.
But I'd like to hear from engineers who have seen write skew bugs and other transaction anomalies cause business issues.
I dunno, maybe I didn't get the memo that the database of the future will be single-threaded.
Deadlock happens under some, but not all, implementations of serializability via locking. Several database systems developed in my lab use locking to achieve serializability, yet never deadlock. Examples include:
(1) Calvin: http://www.cs.umd.edu/~abadi/papers/calvin-sigmod12.pdf
(2) Orthrus: http://www.cs.umd.edu/~abadi/papers/orthrus-sigmod16.pdf
(3) PWV: http://www.cs.umd.edu/~abadi/papers/early-write-visibility.p...
Bottom line: serializability does not necessarily mean deadlock. Deadlock can be avoided via non-locking implementations, or even in well-designed locking implementations.
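The core idea behind such deadlock-free locking can be sketched in miniature: if every transaction acquires all of its locks up front, in one agreed-upon global order, then no waits-for cycle can form. Here is a toy Python sketch of that ordering discipline; the lock table and transfer transaction are invented for illustration, and the real systems linked above are far more sophisticated:

```python
import threading

# Two shared "rows", each guarded by its own lock.
locks = {"row_a": threading.Lock(), "row_b": threading.Lock()}
balances = {"row_a": 100, "row_b": 100}

def transfer(src, dst, amount):
    # Acquire every needed lock up front, in a fixed global
    # (here: lexicographic) order. Because all transactions agree
    # on the order, a cycle of waiters -- and hence a deadlock --
    # is impossible.
    keys = sorted({src, dst})
    for key in keys:
        locks[key].acquire()
    try:
        balances[src] -= amount
        balances[dst] += amount
    finally:
        for key in reversed(keys):
            locks[key].release()

# Opposite-direction transfers that could deadlock under naive
# acquire-as-you-go locking complete fine here.
t1 = threading.Thread(target=lambda: [transfer("row_a", "row_b", 1) for _ in range(1000)])
t2 = threading.Thread(target=lambda: [transfer("row_b", "row_a", 1) for _ in range(1000)])
t1.start(); t2.start(); t1.join(); t2.join()
total = sum(balances.values())
print(total)  # money is conserved: 200
```

The price of this trick is that a transaction must know its full read/write set before it starts, which is exactly the assumption deterministic systems like Calvin are built around.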
One of the points in the conclusion of the post warrants being reiterated at this point:
"If you find that the cost of serializable isolation in your system is prohibitive, you should probably consider using a different database system earlier than you consider settling for a reduced isolation level."
Such isolation levels are notoriously difficult to reason about—even for experienced practitioners—and their misuse can and does introduce persistent data anomalies that can be costly to remediate.
Generally speaking, performance issues are significantly easier to diagnose and resolve than data anomalies, and they may be addressed in a targeted fashion as the need arises.
There’s no substitute for thinking. But if I had to prescribe general advice, it’d be this:
(1) When given the choice, select a modern database system which supports scalable and efficient serializable and snapshot transaction isolation levels.
(2) Use serializable isolation, by default, for all transactions.
(3) In the event that your transactions are not sufficiently performant, stop and investigate. Profile the system to identify the bottleneck.
(4) If the bottleneck is contention due to the transaction isolation level, stop. Assess whether the contention is inherent or whether it is incidental to the implementation or data model.
(5a) If the contention is incidental, do not lower the isolation level. Instead, refactor to eliminate the contention point. Congratulations; you are now done.
(5b) Otherwise, lower the isolation level—only for one or more of the transaction(s) in question—by a single step. Carefully assess the anomalies you have now introduced and the ramifications on the system as a whole. Look for other transactions which could intersect concurrently in time and space. Implement compensatory controls as necessary to accommodate the new behavior.
(6) Repeat only as necessary to achieve satisfaction.
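Steps (2) and (3) imply an application-side retry loop: run everything at serializable isolation and treat serialization failures as retryable rather than fatal. A minimal sketch follows; `SerializationFailure` is a stand-in for whatever your driver raises (e.g. SQLSTATE 40001 in PostgreSQL), and the backoff parameters are arbitrary:

```python
import random
import time

class SerializationFailure(Exception):
    """Stand-in for a driver's serialization-failure error."""

def run_transaction(txn_fn, max_attempts=5):
    # Retry the transaction on serialization conflicts, backing off
    # with jitter so competing transactions de-synchronize.
    for attempt in range(max_attempts):
        try:
            return txn_fn()
        except SerializationFailure:
            time.sleep((2 ** attempt) * 0.01 * random.random())
    # Persistent conflicts are a signal to profile, per step (3).
    raise RuntimeError("transaction kept conflicting; investigate contention")

# Toy transaction that conflicts twice, then succeeds.
attempts = {"n": 0}
def txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise SerializationFailure()
    return "committed"

result = run_transaction(txn)
print(result)  # -> committed
```

The important property is that the transaction body is a function that can be safely re-executed from scratch, so a conflict costs only latency, never correctness.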
See the comment above from abadid: it's entirely possible to impose a global ordering of all transactions, either up front or retroactively, without admitting deadlocks.
Fine if you have a few dozen rows, as we did in testing, but after a while in production the DB struggled hard once the row count went beyond a hundred thousand... Turned out the serialized access caused it to issue a lock on every single row (for reasons I still don't understand).
The serialized access was used to basically implement a fancy autoincrement ourselves. I rewrote the code to use a unique constraint and a retry loop instead, which eliminated the speed issue entirely.
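A hypothetical reconstruction of that fix, using Python's stdlib SQLite for brevity (table and column names are made up): instead of serializing all access to compute the next ID, let a UNIQUE constraint arbitrate concurrent allocations and retry on collision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (seq INTEGER UNIQUE, payload TEXT)")

def insert_next(payload, max_attempts=10):
    # Optimistic allocation: read the current max, try to claim max+1,
    # and let the UNIQUE constraint reject us if another writer won.
    for _ in range(max_attempts):
        (cur_max,) = conn.execute(
            "SELECT COALESCE(MAX(seq), 0) FROM tickets").fetchone()
        try:
            conn.execute("INSERT INTO tickets VALUES (?, ?)",
                         (cur_max + 1, payload))
            conn.commit()
            return cur_max + 1
        except sqlite3.IntegrityError:
            # Another writer claimed this seq; reread and try again.
            conn.rollback()
    raise RuntimeError("could not allocate a sequence number")

first = insert_next("a")
second = insert_next("b")
print(first, second)  # -> 1 2
```

The conflict window shrinks to the single INSERT, so concurrent writers only pay a retry when they actually collide, rather than queueing behind a serialized read-modify-write.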
Consider a common scenario where you SELECT a set of rows and take a SUM over a column. Suppose your query and another query begin reading from the same committed state of the same table, that the other query runs an UPDATE against a set of rows in that table, and that the other query commits before yours does. To remain consistent, the database system must detect whether the other query updated any row that affects the WHERE filter you used when scanning the table; otherwise your sum could be incorrect, because the other query's committed state might cause your query to compute the SUM over a different set of rows, or over modified values. The only way for the database system to guarantee there is no conflict is to keep track of every single row your query accesses, even rows passed over by the WHERE clause!
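A toy, single-connection illustration of the predicate problem (the schema is invented for the example): an UPDATE can move a row across the WHERE boundary, so the "same" SUM query covers a different row set after the concurrent commit, which is exactly what a serializable system has to detect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, region TEXT, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)",
                 [(1, "us", 100), (2, "us", 50), (3, "eu", 70)])

q = "SELECT SUM(balance) FROM accounts WHERE region = 'us'"
before = conn.execute(q).fetchone()[0]   # row 3 filtered out: 150

# A concurrent transaction reclassifies row 3 into the predicate.
conn.execute("UPDATE accounts SET region = 'us' WHERE id = 3")
conn.commit()

after = conn.execute(q).fetchone()[0]    # row 3 now included: 220
print(before, after)
# A serializable system must notice that the UPDATE touched a row the
# reader's predicate *could* match -- even though the reader never
# returned row 3 -- or the two outcomes cannot be placed in any
# consistent serial order.
```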
Serializability is a well studied concept. You can find lots of good resources about algorithms for implementing concurrency control and for detecting whether or not two transactions are serializable. The high-level summary is that it takes a lot of operational bookkeeping to guarantee that two queries have no conflicts, especially when you are using real-world examples having many filters and joins.
Why is putting a lock on the table itself not an option?
Performance gains, and not blocking the database for many other readers and writers, will in many scenarios outweigh the possibility of reading uncommitted/dirty data.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
(whether you are querying from within a transaction or not)
and you may be surprised how this may solve many of your performance problems in your application/service.
For transactions which must write, the entire point of this article is that anything less than SERIALIZABLE leaves you subject to anomalies whose effects on your data can be very difficult to predict beyond trivial scenarios.
If you must lower consistency for performance's sake, then so be it, but I would strongly argue that should be the exception, not the rule, and not be a decision made lightly.