Table inheritance seems like a natural fit for devs coming from class-oriented languages. But then again, composition over inheritance seems to be the more frequent watchword these days. Any experiences using this feature?
Also, triggers seem pretty controversial in the discussions I've been privy to; most devs seem to hate them. Any positive experiences with them?
I think many developers want to treat RDBMSs as simple black-box data stores... something, admittedly, you can largely get away with in the majority of applications. I think this leads to a lot of developers avoiding anything that forces a real understanding of the database, since they can get a fair amount done without it, and that's understandable. Triggers, stored procedures, and database functions intrude on this solace and are therefore frowned upon by many.
Having said all that, I think it's better to have a firm understanding of and feel for RDBMS operations and development techniques, and that you can produce more solid work if you have that understanding. If you have that understanding (and actually have a problem appropriate for an RDBMS solution), there are roles for triggers/procedures/functions in the database... just understand where the boundaries are and assign clear responsibilities to the various pieces of the stack (that's the hard part, and the part at which many fail). I also think that if you're a developer who needs to make use of an RDBMS and you don't have a solid understanding of your RDBMS of choice, you're going to screw up a lot more than just the triggers/functions/procedures/etc. This is why I always approach projects where ORMs are important with caution. ORMs are just a tool, true, but oftentimes they are also the rug under which the database gets swept... or to put it another way, a framework for building technical debt. (Yeah, yeah, not always, but frequently.)
As for PostgreSQL table inheritance... just say no. You cannot overcome the object/relational impedance mismatch with that feature, and you end up compromising too much of why you might want an RDBMS in the first place. I've only seen it used well twice; one of those cases was table partitioning, and that is, from a user perspective, becoming much less necessary now that PostgreSQL has native declarative partitioning. PostgreSQL is an Object-Relational Database Management System, but I think you have to consider the object part of that as primarily being useful inside the database itself: it can help in more sophisticated uses of the database, especially if you are writing functions/triggers, but in my experience it isn't a great tool for mapping to OO systems outside of the database. I have found it much clearer from an information-architecture perspective, and easier to maintain data integrity, using standard RDBMS modeling techniques (with the exception of keeping certain unstructured data as JSON blobs).
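For anyone who hasn't bumped into the sharp edges, here's a minimal sketch (hypothetical tables) of the kind of integrity compromise I mean: constraints and indexes declared on the parent don't cover rows inserted into the children.

```sql
CREATE TABLE vehicles (
    id  integer PRIMARY KEY,
    vin text UNIQUE
);

CREATE TABLE cars (
    door_count integer
) INHERITS (vehicles);

-- The parent's PRIMARY KEY / UNIQUE constraints do not apply to rows
-- inserted into the child, so duplicates can exist across the hierarchy.
INSERT INTO vehicles (id, vin) VALUES (1, 'ABC123');
INSERT INTO cars (id, vin, door_count) VALUES (1, 'ABC123', 4);  -- not rejected
```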
I was one of those developers years ago. You find your language's ORM library and away you go, happily coding. It was only after running into some performance issues and other bottlenecks that I put my head down and spent a week learning the DB (granted, it's not a long time, but you can learn A LOT about SQL and your DB in 40 hours).
Not only did it solve my bottleneck issues, it also made it very apparent that there are a lot of things ORMs do that make everything harder. They have their place, but working with the DB/SQL/Triggers/Views/Mat Views/ ... improved my applications greatly.
Now, years later, anytime I'm working with a new developer I always stress that they at least learn the basics.
IMO, triggers should be used sparingly. For something like updating a "last changed" timestamp on a row, they can be useful. But they are places for side effects to hide and later become a source of frustration and performance problems. I almost always prefer to keep any business logic in stored procedures, not triggers.
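A minimal PostgreSQL sketch of the timestamp case (table and column names are made up); the trigger stamps the row and does nothing else:

```sql
CREATE OR REPLACE FUNCTION touch_updated_at() RETURNS trigger AS $$
BEGIN
    NEW.updated_at := now();  -- overwrite whatever the client sent
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER set_updated_at
    BEFORE UPDATE ON widgets
    FOR EACH ROW
    EXECUTE FUNCTION touch_updated_at();
```

(EXECUTE FUNCTION is the PostgreSQL 11+ spelling; older versions use EXECUTE PROCEDURE.)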
I generally agree with your statement, though I also consider how a particular piece of logic may affect data integrity.
For example, let's say I have an accounting system with a general ledger structured as a header table (one row per journal entry) and a child detail table (one row per GL account with either a debited or credited amount), and the idea that journal entries need to be "posted" to be considered part of the business transactional history. I may have a business rule that says: I can never post a journal entry that isn't balanced (debits must equal credits). It is always invalid to have a posted and unbalanced journal entry.
Assuming that the posting status is recorded in the header table, I may well write a trigger on the header table which, on updating the header record's status to posted, checks that all of the detail lines sum to a balanced journal entry and aborts the transaction if that rule is not met. This is a de facto encapsulation of business logic, but one important to data integrity. Sure, I could also use a stored procedure, but I may want a trigger involved because of the difference between an active and a passive application of the rule: a stored procedure for posting requires an explicit call... but just doing an UPDATE against the table won't call my stored procedure (or other application code for posting), whereas a trigger forces the issue. In many of the enterprise environments I work in, the main accounting application is not the only way data gets into the database, so I can't rely on those explicit calls always being made to enforce the rules... the database is the final authority, and forcing it there can also eliminate a class of issues. I may well have my trigger call my posting stored procedure, if that procedure were written the right way...
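Something like this sketch is what I have in mind (hypothetical table names; assume a boolean posted flag and signed detail amounts, debits positive and credits negative):

```sql
CREATE OR REPLACE FUNCTION check_posting_balance() RETURNS trigger AS $$
DECLARE
    total numeric;
BEGIN
    SELECT coalesce(sum(amount), 0) INTO total
      FROM gl_detail
     WHERE header_id = NEW.id;

    IF total <> 0 THEN
        -- Aborts the transaction: a posted, unbalanced entry can never exist.
        RAISE EXCEPTION 'journal entry % is not balanced (off by %)', NEW.id, total;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER enforce_balanced_posting
    BEFORE UPDATE OF posted ON gl_header
    FOR EACH ROW
    WHEN (NEW.posted AND NOT OLD.posted)
    EXECUTE FUNCTION check_posting_balance();
```

The trigger body could just as easily delegate to the posting stored procedure, which is the "written the right way" part.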
Yes, I'm taking other issues like performance as given as well, but I mostly wanted to show another aspect of the considerations around trigger use. [edit] Perhaps a more fine-tuned way to put it: business logic which doesn't change data, but ensures rules are met or the transaction is cancelled, helps with the side-effect issue you cite.
Thanks for your thoughtful comment on this. It’s interesting to hear good use cases for triggers.
Some years back I worked at a place that went deep on triggers and stored procedures, and it totally put me off using either (SQL Server). There was a lot of opaque behaviour that was hard to debug, and performance was terrible.
A friend is a great DB developer (one of the top on Stack Exchange), and when I asked him about triggers he said that he doesn't use them for updating data, ever, unless it's someone else's totally broken schema and he has to dig himself out of a hole.
On the other hand, he loves stored procs, and pretty much every interaction he lets apps have with the DB runs through stored procs. What are your thoughts on how heavily you should lean on stored procs?
I find inheritance vs. composition to be much less of an issue in RDBMSes than in OOP languages. The problem with inheritance in OOP is that you inherit the code, but not the data (at least, it is supposed to be hidden). This results in overly tight coupling between your class and the other class that you only need a piece of, and (I find) it is almost never what you want; composition and interfaces are usually preferable.
Whereas in an RDBMS, there is no "code" attached to a relation, and the data is explicitly exposed. Inheritance starts to look more like the typical pattern of a structure with common fields at the top level and disjoint fields inside a discriminated union. You can mirror this with RDBMS composition (foreign keys), but you lose the ability to enforce even basic structural constraints (tricks like mutually-referential foreign keys notwithstanding).
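A rough sketch of what I mean by mirroring the discriminated-union shape with composition (hypothetical tables): common fields in one relation, per-variant fields in child tables referencing it.

```sql
CREATE TABLE payments (
    id     integer PRIMARY KEY,
    kind   text    NOT NULL CHECK (kind IN ('card', 'wire')),
    amount numeric NOT NULL
);

CREATE TABLE card_payments (
    payment_id integer PRIMARY KEY REFERENCES payments (id),
    card_last4 text NOT NULL
);

CREATE TABLE wire_payments (
    payment_id integer PRIMARY KEY REFERENCES payments (id),
    iban       text NOT NULL
);

-- What plain foreign keys can't express: that a 'card' payment has exactly one
-- card_payments row and no wire_payments row. That's where the
-- mutually-referential foreign key tricks (or constraint triggers) come in.
```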
Constraint triggers are very useful for enforcing cross-table constraints.
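For example, a deferred constraint trigger sketch (hypothetical orders/order_lines tables) enforcing that an order has at least one line item by the time its transaction commits:

```sql
CREATE OR REPLACE FUNCTION check_order_has_lines() RETURNS trigger AS $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM order_lines WHERE order_id = NEW.id) THEN
        RAISE EXCEPTION 'order % has no line items', NEW.id;
    END IF;
    RETURN NULL;  -- the return value of AFTER row triggers is ignored
END;
$$ LANGUAGE plpgsql;

CREATE CONSTRAINT TRIGGER order_must_have_lines
    AFTER INSERT ON orders
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW
    EXECUTE FUNCTION check_order_has_lines();
```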
Triggers themselves, besides the typical use cases of logging and notifying channels, I find useful for non-disruptive schema upgrades. You can redirect updates, or fill in non-trivial default values, using them. (Rules could probably fill the same role.)
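A sketch of the schema-upgrade case (hypothetical accounts table): a new column is being introduced, and a trigger fills it in so writers that haven't been updated yet keep working.

```sql
ALTER TABLE accounts ADD COLUMN display_name text;

CREATE OR REPLACE FUNCTION fill_display_name() RETURNS trigger AS $$
BEGIN
    IF NEW.display_name IS NULL THEN
        NEW.display_name := NEW.first_name || ' ' || NEW.last_name;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER default_display_name
    BEFORE INSERT OR UPDATE ON accounts
    FOR EACH ROW
    EXECUTE FUNCTION fill_display_name();

-- Once every writer sets display_name itself, the trigger can simply be dropped.
```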
> Triggers themselves, besides the typical use cases of logging and notifying channels, I find useful for non-disruptive schema upgrades. You can redirect updates, or fill in non-trivial default values, using them. (Rules could probably fill the same role.)
BEFORE triggers are nice for constraints since they shouldn't amplify writes. But even there, they can easily be overlooked by application developers.