Hacker News new | comments | show | ask | jobs | submit login

What's the trouble with PostgreSQL's sequence support? (http://www.postgresql.org/docs/9.1/interactive/sql-createseq...)

You got me really excited, but I don't think this is what I want. Imagine a table "Chapter" with a Foreign Key "Book". Not uncommon would be to have a 'sequence' or 'rank' column in the Chapter table defining it's order within a book. Sequences shouldn't overlap (accomplishable with a unique constraint on sequence and book_id), but they also shouldn't have gaps, and it'd be nice if inserting a Chapter at a sequence had predictable behavior (like.. it incremented all chapters in that book with sequence >= it's).

I find it a pretty typical and profound source of frustration, and since the whole idea of RDBMS is to model relationships.. and relationships so often have integral sequence/rank.. it's always seemed to me like something that ought to've be considered 'in scope' during the design of SQL.

OK, that's definitely a matter for a custom trigger on the table.

Another option would be to keep the chapter order in an array column on the Book table, since arguably the arrangement of chapters belongs to the book rather than the chapter.

PostgreSQL array documentation is here: http://www.postgresql.org/docs/9.1/static/arrays.html

arguably the arrangement of chapters belongs to the book rather than the chapter

That's a really excellent point, and array columns look like a really cool solution! I suppose the only bummer is that it introduces a way to get things out of sync (I could delete a Chapter row, but forget to remove it from the array in Book), but in general this is a much better approach than the status-quo solution.

I think you should be able to write a delete rule on Chapter that will ensure that no array in Book contains that chapter, whether through cascading the delete or through raising an error.

If you're ever thinking of writing a trigger for any purpose, think again. And again. They are used improperly far more often than not. Things should not happen as "magic" side effects in a database, which is what triggers do. Generally better to be explicit with code in a function or stored procedure.

In general I disagree, but in this specific case you are right that a procedural interface to manipulating books/chapters is preferable.

Essentially if your API is single-table DML, then triggers are preferable. If multiple tables need to be manipulated at the same time to complete the full operation, then there should be a procedural API and triggers are likely unnecessary.

You should investigate exclusion constraints. Another scenario where they come in handy is a calendar interface where you need to ensure no overlapping per record per time.

Indeed, exclusion constrains are really neat! FTR, here's their description in the PostgreSQL manual:


Oh wow. These are pretty wacky/interesting. Also, sign of a well designed API: I think I gleaned more understanding from reading an actual EXCLUDE constraint definition than I did from reading the description of what it does in the docs.

The problem is that what you want can't be had within an MVCC framework. To get what you want, you really need global, thread-safe mutable data. MVCC instead gives you multiple parallel universes that spring into existence and then disappear, possibly making indelible changes on the next parallel universes to be created in the process.

It boils down to two problems. You can't be sure you're the only transaction in progress, and you also can't "take back" a sequence number. If you could guarantee you were the only transaction, you could just use COUNT. But since you can't, you could have two transactions in progress that both get the same COUNT result and you wind up with two "Chapter 5"s. You can put a UNIQUE constraint on that table, and then one of your transactions fails that constraint. That way you retain integrity and you get (at least for a while) sequential numbering, but there is no in-database way to ensure that a sequence of numbers has no gaps. This is because to do so would be expensive but also not really relate to data integrity.

The purpose of PostgreSQL sequences is to reliably generate surrogate keys. That means every transaction in progress gets a separate ID from that sequence. They all have to be unique because they all might commit. If they rollback instead, it doesn't matter; those become gaps, but nobody is failing the UNIQUE constraint.

What are you going to do to seal the gap when you have two transactions in progress on the same sequence and one of them rolls back? Are you going to pause the other transaction and propagate some kind of decrement to the other transactions? What if I inserted that ID into another table or called a globally-visible stored procedure (sent an email, for example)? I'm not certain it would even be possible, but even if it were, it would clearly be extremely expensive, and therefore, probably not a feature you want on by default.

So I hope that explains why this is not "in scope." :)

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact