Hacker News new | past | comments | ask | show | jobs | submit login
PostgreSQL anti-patterns: read-modify-write cycles (2014) (2ndquadrant.com)
224 points by kornish on Dec 21, 2016 | hide | past | web | favorite | 76 comments

Another solution is to not use update statements at all. Obviously this involves some changes to the tables. And it defers the balance calculation to query time.

Analogous to the example from the article:

  create table transaction (user_id integer, amount integer);
  insert into transaction values (1, -100);
  select sum(amount) as balance from transaction;

This approach is good if you do not have to apply some check on the balance before inserting a new transaction. If you do (say verifying that balance does not go negative) then you have at least two options:

1) you make transactions serialized

  select sum(amount) as balance from transaction where user_id = 1;
  -- do something with balance
  insert into transaction values (1, -100); 
2) or you use row level locking on an additional table (say user, while this conflicts with the system table user, but for the sake of the example we assume this is not the case) to serialize the access.

  select user_id from user where user_id = 1 for update; 
  select sum(amount) as balance from transaction where user_id = 1;
  -- do something with balance
  insert into transaction values (1, -100);
Now it may be tempting to change this to

  select balance from user where user_id = 1 for update; 
  -- do something with balance
  update user set balance = balance + (-100) where user_id = 1; 
  insert into transaction values (1, -100);
Just for clarity: in most cases you should register your transaction event anyway (with a time stamp) for later auditing, but you now will have a risk of introducing inconsistencies between calculated balance and transactions due to the programming errors (that is partially committed transactions).

Also you should obviously create proper primary keys and indexes.

Also you should include proper error/exception handling.

Also if you already do locking, then you could also consider different locking strategies described in the article.

Another solution would be to have a view that calculates the balance from the transactions.

Then your sql code can check the balance without doing a sum (which might get expensive, and often the same value until a transaction changes it) just by joining to that view.

i.e. `WHERE total_view.balance > 0`

I like this idea as I am pro-history keeping and prefer not to make irreversible changes like an update.

What are the space concerns for keeping data in this way? For something like a retail bank (audit issues aside which probably make this necessary anyway) does space suddenly become a big factor?

This is basically the Event Sourcing pattern. You start with an initial state, keep the events relating to it that might change it, and thus the current state can always be found by replaying those events. In the example above you don't even need to replay them in order, although maintaining order is definitely useful in more sophisticated uses of this pattern.

Double-entry book-keeping is done this way. The data certainly grows over time, but generally at a predictable and manageable rate.

Most people using long-lived event-sourced data structures have some kind of snapshot/checkpoint/archive mechanism, which going back to the accounting example might simply be an opening balance.

Accounting systems usually have a "close of day" or "end of year" event at which transactions can be consolidated.

With postgresql and MySQL this can be very limiting performance wise if you need to materialize the total after every transaction on a ledger that recieves a high rate of transactions. Continuous queries ought to help with this and supposedly pipeline DB is refactoring into a postgresql module; should be interesting.

This is also the pattern used by most version control systems, which use several strategies to deal with long histories efficiently. We use them all the time to move backwards in time, and between alternative realities until we eventually merge to a consistent state all users agree on.

The space concerns depend on the usage. But in general I feel people worry too much and too early about this. People also worry about calculating balance being not efficient. You need really large amounts of data before any of this becomes a problem. When it does become a problem it can be solved by caching.

I think the more realistic concern with this approach is that some queries might get complicated when they involve more complex logic based on the balance.

> When it does become a problem it can be solved by caching.

Solved or made worse by caching?

made worse by caching

Caching may relieve some stress (on say a dashboard), but if an application is doing calculations around account balances, and uses event sourcing-like structures in the database, then caching isn't going to help as those calculations are going to still need to be done for each new balance changing event.

So caching is never worse (unless you accidentally used the cached value where you needed the live value), but isn't necessarily going to relieve a lot of DB stress.

Since it's append-only, can't you "cache" or snapshot the calculation in pieces? e.g., "Today @ 00:00 all events on this account summed up to 100." Then you just replay today's events. You don't have to recalculate anything unless you're changing your transformations. Make your checkpoints whatever interval is necessary to make on-demand event replay acceptably performant.

A retail bank would already keep tabs of ALL changes to an account anyway, if not in this format, then in separate logging tables. They don't just keep only a final tally.

Typically you do both, keep a log and keep a current table (you wouldn't recalculate every time. As for space, you archive in a regular basis (say every year). That's why you cannot check you're bank statement from 2 or 3 years ago.

Especially banks really need to maintain the transaction log. One could also argue that keeping the aggregated balances could be some DB side trigger on update operations for separate "balances" table or generated with some kind of hourly/daily batch job to not put too much load on the operational DB with every customer browsing the online bank.

Triggers get a lot of hate because of separation from application level version control etc., but these are also long discussions to get into.

edit: This article's contents are good to be included with all the most junior level application design/development training material.

You can check triggers, or any db changes, into version control. Usually is more a migration script like Active Record. And sometimes its more like a log of changes than a total current state, but its definitely doable.

Yes, I just see this kind of arguments coming from quite senior level engineers every now and then. I see the reasoning that application/system level logic should be kept separate from the database, but aggregations are a good example in my opinion of things that database engines are good at, and also as the aggregated data does not insert any new operative data/transactions this should be mostly fine.

In practice, loaded OLTP systems use a combined approach. Balances are computed at regular points in time (at the beginning of each month or quater), and final balance is efficiently calculated from the value at the nearest point and changes from the transaction log.

Khmm select sum(amount) as balance from transaction where user_id = 1;

I got caught by this a couple of years ago, but only because I was using a JSON field and as far as I knew at the time, to modify a JSON value I had to take it out, modify it and write it back. The mistake was ever choosing to use a JSON field for data that was changing rapidly, and being updated from multiple sources. When I detected the issue, I added a FOR UPDATE to my SELECT.

It made me realize though the incredible safety I'd come to take for granted with a more traditional relational approach. Prior to that I would have had a separate table for the data in that JSON field, with each object within it, represented by its own row. Given my scenario that would have been perfectly safe, because the latest update to a specific object was always the right one.

The original design choice was perceived as "easier".

I find if my first idea is to use a JSON field in postgres, I am usually being lazy. Adding schema to the JSON and creating a related table is almost always better. I try to only use JSON fields for serialized data that will not be modified, mostly record keeping.

The JSON functions in postgres are different enough from SQL that I try to avoid them. Writing SQL interspersed with JSON functions becomes clunky, and is not near as powerful as just plain SQL.

This describes not only JSON fields/blobs in a traditional RDBMS, but also the impulse that drives much NoSQL adoption. It's just easier to be able to dump whatever you want in there, developers love it, no chatter about normalization and all of that silly DB stuff. This almost invariably comes back to haunt them, often in astonishingly damaging ways.

Use schemaless options, including JSONB in PgSQL or doc DBs like Mongo, for data that really needs it due its unpredictable nature, NOT data that you're too lazy to draw out a schema for.

One place we use them effectively at the healthcare company where I work is as an abstraction over the vastly complex and varied data we have on file. Put a little more simply than the reality, we "tag" patients with various statuses, conditions, and other things. All the tags are stored in a single table. The metadata for those tags, which can be, for instance, an array of hospitalization dates, a dictionary/map of various lab values, etc., is stored in a JSONB column.

There is a "population explorer" in our web application that allows users to search for patients based on the existence (or lack) of any combination of tags, and additionally to filter each tag by the values contained. The JSONB metadata is generated in SQL, the filtering is done on the frontend, and the searching is done by a query generated in the Python backend. It is surprisingly easy to maintain and extend, and even after a year and a half, we haven't run into any insurmountable issues, or even any difficulties of note.

For instance, if a user searched for all patients tagged with a hospitalization since November 1, the following steps would happen (this is vastly different from the actual code, just trying to give a sense of how it functions):

1. the backend generates an SQL query:

  select ...
  from patient p
  where exists (
    select 1
    from tag t
    where t.tag_type_id = :type_id
      and t.deleted is false
      and t.patient_id = p.id
2. the backend iterates through the results, filtering as necessary

  exclude_list = []
  if filters:
      for key, patient in results:
          for filter in filters:
              # each filter is a lambda generated by the
              # user-defined parameters (in this case,
              # since November 1)
              if not filter(patient):

  for key in set(exclude_list):
      del results[key]
3. on the frontend, if the filters for any of the desired tags are changed, a process very similar to the above is run to recompute the display set

However, on the whole, I tend to agree with you. Really the only other places we use JSON at present are in storing API requests, where the data sent with each HTTP POST are in widely differing formats, and in various intermediate abstraction layers, where data from different sources contains different columns, and we can carry forward the "outlier" columns as a JSON object for later reference.

Out of interest, why did you guys design the system to perform filtration at the application layer instead of generating SQL for filtering and pushing that into the DB?

I also work in health care, and also am using JSON columns in Postgres to store some "schemaless" (in the sense that the schema is not fixed in advance and subject to change over time) data in a convenient way. There's a serialization mapper that turns it into a normalized structure for some types of consumption, but the source of truth is the JSON column and the normalized structure is generated from it.

I use the json field to store precomputed/-assembled write-once-read-many json documents. For that I find it incredibly useful, as the JSON functions allow me to drill into the generated JSON for analysis and diagnostics.

When you're using Javascript the temptation is almost unbearable. But you are correct.

How are "serializable" transactions in PostgreSQL different from optimistic concurrency control? The docs say:

> In fact, this isolation level works exactly the same as Repeatable Read except that it monitors for conditions which could make execution of a concurrent set of serializable transactions behave in a manner inconsistent with all possible serial (one at a time) executions of those transactions. This monitoring does not introduce any blocking beyond that present in repeatable read, but there is some overhead to the monitoring, and detection of the conditions which could cause a serialization anomaly will trigger a serialization failure.

As far as I can tell, this doesn't make transactions any more serializable; it just fails them if they wouldn't have been serializable anyway. And then clients typically retry. That sounds just like OCC. Like OCC, I'd expect that under any kind of contention, this could lead to very large numbers of failures and retries. It's not quite livelock, since at least one client will always make forward progress, but close to it.

Well, first: to properly implement OCC, you need to validate not just your write set (as implemented in some ORMs) but also your read set. Otherwise you're only addressing write-after-write anomalies, leaving write-after-read unaddressed.

Second: there's the issue of precision to think about. Neither Postgres SSI nor OCC are perfectly precise; both may incorrectly reject transactions which would not have actually violated serializability. But Postgres SSI is an improvement because it allows certain classes of (provably non-anomalous) write-after-read interleavings to proceed.

The SSI paper[1] is a good read on the subject, and even contains an interesting description of a 3-way serialization conflict involving (requiring(!)) a read-only transaction such that if the read-only transaction were omitted then there would be no conflict among the other two read-write transactions.

[1]: https://drkp.net/papers/ssi-vldb12.pdf

One major difference is that optimistic concurrency control is handled by the calling application and not in the database. Your app should handle the error in either case, but in the case of OCC you end up doing the bookkeeping yourself, which may be preferable (lower overhead for Postgres to handle the query for instance).

derp. OP has a small statement on the differences:

> Unlike SERIALIZABLE isolation, it works even in autocommit mode or if the statements are in separate transactions. For this reason it’s often a good choice for web applications that might have very long user “think time” pauses or where clients might just vanish mid-session, as it doesn’t need long-running transactions that can cause performance problems.

It sounds like if the result is the same if the transactions are reordered, then the second transaction is not aborted. Simple version numbers will always fail if the version number is incremented.

They will behave similarly in similar situations. There's four scenarios for 2 transactions:

1) Transaction 1 starts and commits before Transaction 2 starts. 2) Transaction 1 starts; Transaction 2 starts; Transaction 1 commits; Transaction 2 attempts to commit. 3) Transaction 2 starts; Transaction 1 starts; Transaction 2 commits; Transaction 1 attempts to commit. 4) Transaction 2 starts and commits before Transaction 1 starts.

1+4 and 2+3 are the same. In 1+4, the transactions are serialized - they are wholly independent of each other, so the second transaction will always see the updated version, so there's no worries about failing on that.

In 2+3, the version number increments when the first transaction commits. For optimistic concurrency control, the version number will have changed, which is detected prior to updating. (In the OP's example, the query is an UPDATE balance WHERE version = X, and since the version is X+1, zero rows get updated.) For the SERIALIZABILITY concern, it will fail because internally the version has changed for the given row ("version" referring to the metadata Postgres keeps to track which rows are current), kicking it back to the user with an error.

I don't think there's a scenario where reordering lets a SERIALIZE'd transaction succeed but optimistic locking fails. They are effectively - as far as the methodology used - the same thing (with some differences that the OP mentions).

EDIT: minor readability things

Nitpick: while the gist is correct, the 2nd diagram illustrating how transactions don't help is subtly misleading. The diagram implies that time flows down in the diagram. If so, then the update on the right won't occur until after the commit on the left. The update on the left really does acquire "write" lock on the balance, so the 2nd update on the right cannot occur until after the lock on the left is released when the left commits.

That said, the effect is exactly as the author claims. The transaction on the right will ignore the change made by the one on the left.

That was the section I stalled at --- the author doesn't explain why the transactions don't help, just states that they don't, and provides a diagram with no context. I don't understand what they're getting at here.

Surely as soon as one transaction tries to modify the database, it'll lock out all other reads to the affected rows from other transactions, so as to prevent them observing changed data until the transaction's committed? So avoiding the entire problem? Isn't this the entire reason transactions exist?

Edit: I haven't actually used PostgreSQL; I grew up on SQLite. I see from the docs that SQLite transactions are all serializable, which from the linked article seems to be what you get with PostgreSQL's BEGIN ISOLATION LEVEL SERIALIZABLE. So... if by default PostreSQL's transactions aren't like this, what are they? If they're not serializable, then aren't they simply broken, i.e. not enforcing the transactions properly?

While time is down, it's the time of a hypothetical user entering the command, not the time at which commands actually affect data in the table. I think that's pretty intuitively clear to the reader, especially since the article is about counterintuitive behavior.

The point is that if a user on the right entered the update command, it would NOT return until after the transaction on the left committed, because there's a database lock on the balance when the left transaction did its update.

I'm glad to see optimistic concurrency control mentioned. In my experience even seasoned developers are unaware of this pattern, which should be really the default in many database-handling code.

I'm not aware of ORMs handling it either (except for my in-house code generator). Is there anything out there that handles it well?

From the optimistic locking page: This locking mechanism will function inside a single Ruby process. To make it work across all web requests, the recommended approach is to add lock_version as a hidden field to your form.

Unless I'm misunderstanding, that's absolutely terrible advice, from both a security and proper layering standpoint.

> Unless I'm misunderstanding, that's absolutely terrible advice, from both a security and proper layering standpoint.

Not trying to start a flamewar but... that's Rails in a nutshell. The default patterns are just straight up bad for any kind of large app built by more than one or two people. The gotchas mentioned in this post basically have no built-in rails patterns to mitigate them so you have to know how to write an app without the ORM to effectively use ActiveRecord, even though it's viewed as a great abstraction that you should always use so your code stays readable and you and your future team members won't have to worry about the implementation details in SQL.

For instance there's no built-in way to take advantage of Postgres UPSERT so every codebase I've come across does it wrong with a naive read then create (sometimes wrapped in a .transaction block which doesn't necessarily fix it as the post points out but is used as like a rain dance in many rails codebases whenever something's acting weird and we're not sure why). The built-in Rails increment() is not atomic, and the atomic increment_counter() helper still encourages you to do the wrong thing! (it doesn't use UPDATE .. RETURNING so you see people incrementing and then reading which obviously is no longer atomic). And on and on.

I think if your lock version is a random hash that is updated each time you write via trigger or app code, then at best someone nefarious could change that value in the form and prevent themselves from being able to update the record.

It also depends on what you're trying to achieve. For instance, if you're just trying to help coordinate people updating a wiki page so that two people don't accidentally clobber each other's updates then it's probably not a huge concern.

From a layering perspective, you can probably implement this with checks and triggers in postgres so that it's mostly transparent to the client.

That said, I wouldn't bother myself ;)

SQLAlchemy suppports optimistic concurrency control[1] and row level locking[2]

[1]: http://docs.sqlalchemy.org/en/latest/orm/versioning.html

[2]: http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalche...

Doctrine2 (PHP) supports this natively.

I've never used this approach, but after working a lot with row-level locks (SELECT ... FOR UPDATE) errors in environments with sustained load, I think next time I'm going to evaluate this approach.

Row-level locks need to be carefully implemented in order to avoid deadlocks, which can occur for a number of reasons (eg. missing indexes, other transactions locking rows in different orders, etc.).

GORM in Grails uses optimistic concurrency control by default, which is how I learned about the issue in the first place.

IIRC Hibernate supports it (haven't used Hibernate in ages though).

JPA 2.0 has both optimistic and pessimistic locking.

Mongoose uses optimistic locking by default, I believe.

I believe Marten supports this


Am I the only one who would consider a CTE for this, selecting the initial balance and the proposed balance minus amount in the CTE, testing it in the query following the CTE and potentially applying it, and then returning the original and new balance to the application. If both are the same then the update was not applied (not enough balance), and if different the update was applied (enough balance for the withdrawal).

This way it all occurs within a single transaction, even though it is several logical steps, and is simple to reason about.

That's the optimistic way - do the transaction, return result and if the result doesn't pass the validation, rollback.

In postgres, you don't really need to do CTE - UPDATE can return the new row if you add RETURNING * so you can just do the update and look at result.

In Oracle (and what we did in actual bank) was pl/sql procedure that did

- select for update - do the validation - apply the transfer, or return error. - let the application rollback or commit

This is hard to do without procedural language - you can work with CTEs and functions, but SQL is great at dataflow and bad at control flow, and you either drop to procedural language in database, or have the application do the actual logic.

CTEs are still vulnerable to race conditions afaik, plus postgres does a poor job of optimizing them performance wise (that is, it doesn't) http://blog.2ndquadrant.com/postgresql-ctes-are-optimization...

In principle if you want to do the (withdrawal) balance check then you can incorporate it into a single query

  UPDATE balance SET balance = balance - 100
  WHERE user_id = 1 and balance >= 100;
Only problem is that you now actually do not know if the query did not update because of insufficient balance or because lack of user in the database.

That is why I prefer the CTE approach.

Query 1 (in CTE): Verify all is correct and the current balance (and whether you have enough balance)

Query 2: Perform update and obtain new balance

Then return a summary showing the starting balance and end balance for the transaction (proof of what, if anything, happened).

Great article. This is something I've been thinking about for a while working on Ruma, my Matrix homeserver implementation. I think this is a type of bug that a lot of application developers miss, but the consequences can be quite catastrophic depending on the nature of the data.

As a small aside, if anyone is interested in Matrix, Rust, Diesel (the Rust ORM), or the intersection of any of these things, you might be interested in discussing this further on this Ruma issue: https://github.com/ruma/ruma/issues/132

I'm guilty here too.

But one side of me wonders why SQL databases do not better adapt to those 'anti-patterns' in software development, when developers for a variety of reasons are more confident in having logic in their app code than their DB code.

The way I like to think about it is that the database does not know nor does it care (nor should it!) about the meaning behind the queries you're sending it. It cares about fetching and storing data correctly first and foremost - it can't tell your intent from the query you're sending it. It has to be told what to do (transactions, readability levels, etc) to ensure the data reflects what the app developer expects.

RDBMSes are general-purpose tools that need a little bit of extra information to give you better guarantees on how data is manipulated. You can reconfigure the database ensure serializability by default, but if you want it available for specific queries the app developer has to be the one to ensure that constraint.

A really good resource is the following book http://www.apress.com/us/book/9781484207611

Oracle Database Transactions and Locking Revealed Authors: Kyte, Thomas, Kuhn, Darl

It's Oracle-specific though, because of one of the authors(Tom Kyte , the Tom behind asktom.oracle.com not so long ago ) I highly recommend it.

In addition to this, postgres also has Advisory Locks, if you want to handle the locking mechanism client side and don't want to fail update by a client that still wants to do the update anyway for some reason.

Essentially, how to make a bank account in SQL. The author then explains how transactions won't help solve the problem, followed by an example with transactions that actually does solve the case.

At university we learn that transactions should be serializable and that they are meant for these exact cases.

Transactions aren't actually true transactions unless they are serializable.

Most databases don't actually default to serializable transactions, but instead try to get close by implementing snapshot isolation. These aren't true transactions, since snapshot isolation doesn't truly isolate transactions.

The article had a bunch of nice ways to get better isolation than snapshot for a bunch of special cases. These are all very nice, since they can be useful for mainting performance together with integrity.

My professors at university still laugh at commercial databases and calls them toys for still not having implemented more efficient serializable transactions.

If you professors could actaully solve that problem efficiently they'd be billionaires.

Toys which are propping up most of civilization then :)

What is holding back more efficient serializable transactions becoming more mainstream?

Is it:

A lag between academia and industry? Entrenched software which makes it hard to implement / update to modern technology Performance issues which make the benefits for safer transactions not worth the computation tradeoff

Well if they know of a database that implements a workable subset of SQL and has efficient SERIALIZABLE transaction isolation we'd all love to hear about it.

In the meantime, hundreds of thousands of businesses are built on REPEATABLE READ.

The efficiency challenge with serializable transactions comes mostly in the context of distributed transactions.

Netezza/IBM had a reasonably efficient implementation. Instead of broadcasting individual locks, they shared ordering relations between transactions as they were discovered.

Essentially, when transaction A reads a record and concurrent transaction B modifies the record, then B serializes after A (even if B's modification occurs before A's read in real time). These ordering relationships form a a graph. If a new ordering relationship between transactions causes the formation of a cycle, then some transaction must be aborted.

Isn't Postgres just that? Or are you saying it's not efficient?

Yeah, Postgres's implementation of SERIALIZABLE is not efficient (at least compared to the default).

I'm not really sure how any implementation can be particularly efficient. It seems that either an implementation would have to maintain a huge number of specific row locks or you have to start locking whole tables.

> Yeah, Postgres's implementation of SERIALIZABLE is not efficient (at least compared to the default).

Sure, you pay some price. But if that overhead actually proves to be a problem for you, we'd like to hear about it. There's definitely some room for optimizations, but so far fewer people hit those than some people (i.e. me) anticipated.

> I'm not really sure how any implementation can be particularly efficient. It seems that either an implementation would have to maintain a huge number of specific row locks or you have to start locking whole tables.

You don't need full row locks, and you can summarize up "ranges" of locks if necessary for space reasons. I.e. go from row level to page level, to full table level. That will still be correct, just cause more rollbacks.

In my personal use I found the overhead to be negligible. But then I've never used it for a project that really had to scale.

For simple CRUD applications SERIALIZABLE is great, because you basically don't have to worry about concurrent updates anymore. Well worth the price imho.

Used throughout, serializable transactions will frequently deadlock, because the order of read operations needs to be globally consistent and the database doesn't have enough insight into the data dependencies and processing of query results by the application to reorder things.

MSSQL transactions are trule serializable by default. MySQL InnoDB supports it too but you have to change the transaction isolation mode.

I believe the default for SQL Server is actually read committed.


Sorry, I was wrong.

MSSQL is read committed by default according to their documentation.

Please add [2014].

I should note that this is not Postres specific - any MVCC ACID database will behave this way. Some use locks instead of MVCC and behave differently. But it definitely belongs into "What every programmer should know about transactions, ollected works".

Sorry, my ability to edit the title expired - but maybe dang can add it.

Yeah, this definitely belongs in the essentials. Lots of developers treat transactions as a race condition panacea. This article is a nice case study about how expressing relative SQL transformations (e.g. "subtract 100 from current value") using absolute values ("set the value to 200"), even absolute values derived from relative application logic, can be prone to fault.

id like to read such a collected works :o

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact