
I too have gone back to SQL after working with ORMs for 10+ years. Having worked with them on a wide range of projects and teams, I can say, without any reservation, they are not worth it.

Not worth it for small projects, nor for large ones. They significantly complicate the development workflow and add another layer of (oftentimes cumbersome) abstraction between the user and the data.

If you encounter any issues (and you'd better pray you don't), expect to spend hours stepping through reams of highly abstracted, byzantine code, all the while feeling guilty because you just want to open a database connection and send the SQL string you developed and tested in minutes.



I've ripped out broken ORM code on multiple projects with over-engineered domain models designed by people with no apparent knowledge of proper database design. This is the key problem with ORMs: they lead to lots of unnecessary joins, just so you can pretend databases do inheritance, or so all those tiny objects you will never query on can get dedicated tables with indexed columns. It's stupid. Such systems are also stupidly slow, fragile, and hard to maintain.

I've been on a project where we had 20+ tables. I made the point that the sole purpose of this database was producing json documents through expensive joins, documents that were then indexed and searched in Elasticsearch (i.e. this was effectively a simple document db). So we simplified it to a handful of tables with basically an id and a json blob, got rid of most of the joins, and vastly simplified the process of updating all this with simple transactions and indexing it into Elasticsearch with a minimum of joins and selects.

We also ripped out an extremely hard-to-maintain admin tool that was so tightly coupled to the database that any change to the domain made it more complicated and harder to use, because the full madness of the database's complexity basically leaked through into the UI.

ORMs don't have to be a problem, but they nudge people into doing very suboptimal things. When the domain is simple, the interaction with the database should be simple as well. We're talking a handful of selects and joins, and simple insert/update/delete statements for CRUD operations. Writing this manually is tedious, but it's something you do only once on a project. With modern frameworks, you don't end up with more lines of code than you'd generate with an ORM. All those silly annotations you litter all over the place to say "this field is also a column" or "this class is really a table" get condensed into nice SQL one-liners that are easy to write, test, and maintain.
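To make the "handwritten SQL one-liners" idea concrete, here is a minimal sketch using Python's stdlib sqlite3; the `users` table and helper names are hypothetical, not from any framework mentioned above:

```python
import sqlite3

# Plain SQL CRUD helpers: no annotations, no mapping layer, just the
# statements you'd have written and tested by hand anyway.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

def create_user(name, email):
    cur = conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))
    return cur.lastrowid

def get_user(user_id):
    # Returns (id, name, email) or None if the id is unknown.
    return conn.execute(
        "SELECT id, name, email FROM users WHERE id = ?", (user_id,)
    ).fetchone()

def update_email(user_id, email):
    conn.execute("UPDATE users SET email = ? WHERE id = ?", (email, user_id))

uid = create_user("Alice", "alice@example.com")
update_email(uid, "alice@work.example")
print(get_user(uid))  # (1, 'Alice', 'alice@work.example')
```

Each helper is a one-liner around a query string you can paste straight into a database shell to debug, which is exactly the property the ORM layer takes away.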


I whole-heartedly agree. The thing about ORMs is exactly that it is so easy to get a false sense of "everything working" while you're actually doing sub-optimal things all the time. It's just a fact that in order to do proper, extensive SQL operations you have to know SQL, and to learn it you should write a lot of SQL. An ORM gets you up and running quickly, but in the end I think it's better to learn it the hard way first and _then_, hopefully, be smart enough to know when it's better to use an ORM than to do everything by hand.

I understand both points of view: the ORM saves my brain from writing massive JOINs on multiple tables, yet at times it forces me to dissect an ORM query because it's doing something stupid. But if I hadn't used an ORM, I probably wouldn't even have made those stupidly complicated tables that I have to debug in the first place. Maybe even worse is that instead of learning SQL, I spent all my time learning the ORM's API.

It's a balancing act, with good points on each side. I like writing my own SQL; I think it makes me think harder about what I'm doing. Sure, I'll then probably end up writing my own helpers that might resemble a half-assed ORM, but as long as it is kept simple, and out of the hands of those who wish to over-abstract everything with their fancy design patterns, it should be highly efficient and easy to understand. And best of all, my understanding of SQL will be a lot more useful than knowing some language-specific ORM.


> ORMs don't have to be a problem but they nudge people into doing very sub optimal things.

I don't think good ORMs do any nudging. The issue arises when people assume that because they are using an ORM they don't have to learn the underlying DB. ORMs should be treated as tools that sit on top of your SQL knowledge and allow you to do certain types of things easier.

Like any tool, there are inappropriate use cases. On one side you have people using an ORM to implement a document store; on the other side you have people who end up hand-rolling a crappy ORM because they thought they didn't need one.


I don't do object relational mappings generally. I instead query by id or a few other columns and construct objects from json documents stored in text columns.

Frameworks for that are awesome and a lot easier to deal with, and the serializing/deserializing overhead is typically minimal. Columns in databases only have two purposes: indexed columns for querying (ids, dates, names, categories, etc.), with or without some constraints, and raw data (json or, for simple structures, some primitive values). Some databases even allow you to query the json directly, but in my experience this is kind of fiddly to set up and not really worth the trouble. The nice thing is that most domain model changes don't require database schema changes this way, because the only thing affected is your json schema. This makes iterating on your domain model a lot easier. You still have to worry about migrations, of course.
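A minimal sketch of this "indexed columns plus json blob" layout, again using sqlite3; the table and field names are illustrative. Only the fields you query on get real columns, and the whole domain object lives in the blob:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE documents (
        id       TEXT PRIMARY KEY,  -- queryable
        category TEXT,              -- queryable, indexed below
        body     TEXT NOT NULL      -- json blob: everything else
    )""")
conn.execute("CREATE INDEX idx_documents_category ON documents (category)")

def save(doc):
    conn.execute(
        "INSERT OR REPLACE INTO documents (id, category, body) VALUES (?, ?, ?)",
        (doc["id"], doc["category"], json.dumps(doc)),
    )

def load(doc_id):
    row = conn.execute(
        "SELECT body FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None

# Adding a new field like "hobbies" needs no schema migration:
save({"id": "p1", "category": "person", "name": "Bob", "hobbies": ["chess"]})
print(load("p1")["hobbies"])  # ['chess']
```

The point about easier iteration falls out directly: reshaping the domain object only changes what `json.dumps` writes, not the table.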

The added value of using a database is being able to manipulate data safely with transactions and query it efficiently. Bad ORM ends up conflicting with both goals, and the added value of a well-implemented ORM is usually fairly limited. At best you end up with a lot of tables and columns you did not really need, mapped to your objects and classes.

A good table structure often makes for a poor domain model and vice versa. The friction you get from the object-relational impedance mismatch is best avoided by treating them as two things instead of one. Bad ORM sweeps this under the carpet and, in my experience, does not address it (other than by providing the illusion that it is not a problem).


> I instead query by id or a few other columns and construct objects from json documents stored in text columns.

> Columns in databases only have two purposes: indexed columns for querying (ids, dates, names, categories, etc.) with or without some constraints, and raw data (json or for simple structures some primitive values.

You are basically describing a sort of ad-hoc document store with potentially limited ability to query. You've lost many of the benefits provided by a relational DB. Your ability to run large update queries or reports will be limited (unless your DB provides native json support, which, as you said, is fiddly).

If you are going to do this, why not use a NoSQL document store with support for transactions? Then you will get a tool that is designed to work with your use case.

Edit: If you use an ORM, a hybrid approach is possible, where you store some properties as separate columns and then store the less frequently accessed (or more dynamically structured) data in a json field (which you can deserialize on hydration or on request). The main downside of this hybrid approach is that moving a property out of the json field into a normal column would require using that fiddly native json support, or a fairly slow migration that would need to go through and deserialize each json field.
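The "fairly slow migration" path can be sketched in a few lines; this is a generic illustration with a hypothetical `items` table, not any ORM's migration API:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, data TEXT)")
conn.execute("INSERT INTO items (data) VALUES (?)",
             (json.dumps({"title": "Widget", "price": 9.5}),))

# Step 1: add the new dedicated column.
conn.execute("ALTER TABLE items ADD COLUMN price REAL")

# Step 2: backfill it by deserializing every row's json field -- this is
# the slow part: O(n) reads, parses, and writes over the whole table.
for item_id, data in conn.execute("SELECT id, data FROM items").fetchall():
    conn.execute("UPDATE items SET price = ? WHERE id = ?",
                 (json.loads(data).get("price"), item_id))

print(conn.execute("SELECT price FROM items WHERE id = 1").fetchone())  # (9.5,)
```

With native json support the backfill would instead be a single `UPDATE ... SET price = <json extraction>` statement, pushed into the database.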

> A good table structure often makes for a poor domain model and vice versa.

Can you clarify what you mean? This has not at all been my experience so I am curious and would love to see some examples.


Transactional semantics are problematic with a lot of nosql databases, but I've used a few, and you can work around this if you have some kind of consistency check using content hashes. Postgres is pretty nice these days for a wide variety of use cases, including nosql ones. And it does transactions pretty nicely.
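One way to read the "consistency checks using content hashes" idea is optimistic concurrency: a write only succeeds if the stored document still hashes to what the writer last read. This is a store-agnostic sketch (a plain dict stands in for the document store), not any particular database's API:

```python
import hashlib
import json

def content_hash(doc):
    # Canonical serialization so structurally equal documents hash equally.
    return hashlib.sha256(
        json.dumps(doc, sort_keys=True).encode()
    ).hexdigest()

store = {}  # stands in for the document store

def compare_and_put(key, new_doc, expected_hash):
    current = store.get(key)
    if current is not None and content_hash(current) != expected_hash:
        return False  # someone wrote in between; caller must re-read and retry
    store[key] = new_doc
    return True

store["a"] = {"n": 1}
h = content_hash(store["a"])
assert compare_and_put("a", {"n": 2}, h)      # succeeds
assert not compare_and_put("a", {"n": 3}, h)  # stale hash, write rejected
```

The retry loop this implies is weaker than a real transaction, but it does prevent the lost-update problem on stores that only give you per-document atomicity.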

Regarding the object relational impedance mismatch, check here: https://en.wikipedia.org/wiki/Object-relational_impedance_mi...

In short, there are lots of things you'd do differently in an OO domain model vs. properly normalized tables. A lot of what ORMs do is about taking object relations and mapping them to some kind of table structure. You either end up making compromises on your OO design to reduce the number of tables, or on the database side, where you end up with way too many tables and joins (basically most uses of ORM I've encountered in the wild).

For reporting, you can of course choose to go for a hybrid document/column-based approach. I've done that. In a pinch you can even extract some data from the json in an SQL query using whatever built-in functions the database provides. Kind of tedious and ugly, but I've done it.
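As one example of such built-in functions, here is SQLite's `json_extract` (assuming a SQLite build with the json functions enabled, which stock Python ships these days); Postgres's `->>` operator plays the same role:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO docs (body) VALUES (?)",
             (json.dumps({"customer": {"country": "NL"}, "total": 42}),))

# Pull individual values out of the json blob directly in SQL,
# using json path expressions.
row = conn.execute(
    "SELECT json_extract(body, '$.customer.country'),"
    "       json_extract(body, '$.total')"
    "  FROM docs"
).fetchone()
print(row)  # ('NL', 42)
```

Workable for the odd ad-hoc report, but as the comment says, path expressions scattered through query strings get tedious quickly compared to a schema designed for reporting.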

Or you can use something that was actually built to do reporting properly. I do a lot of stuff in Elasticsearch with aggregations, and it kind of blows most SQL databases out of the water for this kind of thing if you know what you are doing. In a pinch, I can do some SQL queries, and I've also used things like Amazon Athena (against json or csv in S3 buckets) as well. Awesome stuff, but limited. Either way, if that's a requirement, I'd optimize the database schema for it.

But for the kind of stuff people end up doing, where they have an employee and a customer class that are both persons that have addresses, plus a lot of data that is basically only ever going to be fetched by person id and never queried on, I'll take a document approach every time vs. doing joins between a dozen tables. I also like to denormalize things into documents. Having a category table and then linking categories by id is a common pattern in relational databases. Or you can just decide that the category id is a string containing some kind of urn or string representation of the category, and put those directly in a column or in the json. You lose the referential integrity check on the foreign key, of course; but then you should not rely on your database to do input validation, so that check would be kind of redundant anyway.


> Postgres is pretty nice these days for a wide variety of use cases; including nosql ones.

um... what? Are you meaning to say that Postgres does a pretty good job as a document store? (Which is not synonymous with "nosql".)

Despite what that Wikipedia article says, most (if not all) of the "impedance mismatches" described apply to most document stores as well. I would be curious to hear which of the mismatches described in that article you think are avoided by using Postgres as a document store. In my mind, the reason for using a document store is to have flexibility in the structure of your data (which can be a positive or a negative depending on your needs).

> Or you can use something that actually was built to do reporting properly. I do a lot of stuff in Elasticsearch...

Of course there are document stores with good reporting. I was talking specifically about the downside of using Postgres as a document store given your complaints about its native json support being fiddly.

> But for the kind of stuff people end up doing where they have an employee and customer class that are both persons that have addresses and a lot of stuff that is basically only ever going to be fetched by person id and never queried on, I'll take a document approach every time vs. doing joins between a dozen tables. I also like to denormalize things into documents.

I often de-normalize addresses in my tables, but that choice is based on how you want to store and update the data. A separate address table is good if you want to be able to automatically propagate address edits between records. A de-normalized address is good if you want to keep a record of the address as it was at the time it was used. De-normalization is always an option with a relational DB, but normalization is not always easy in some document stores.

> Having a category table and then linking categories by id is a common pattern in relational databases. Or you can just decide that the category id is a string that contains some kind of urn or string representation of the category and put those directly in in a column or in the json. You lose the referential integrity check on the foreign key of course; but then you should not rely on your database to do input validation so that check would be kind of redundant.

I'm not quite sure what you are on about here. You can use constraints on columns that are strings, and you can have tables that are composed entirely of an indexed string column to point that constraint towards. Integer ids are primarily used just to save space. (I don't really see how this is relevant.)
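For the record, referential integrity with string keys looks like this; a sqlite3 sketch with hypothetical `categories`/`products` tables, where the category table is nothing but an indexed text column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # sqlite enforces FKs only when asked
conn.execute("CREATE TABLE categories (name TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE products (
        id       INTEGER PRIMARY KEY,
        category TEXT REFERENCES categories (name)
    )""")

conn.execute("INSERT INTO categories (name) VALUES ('urn:cat:books')")
conn.execute("INSERT INTO products (category) VALUES ('urn:cat:books')")  # ok

try:
    # Unknown category string: rejected by the foreign key constraint.
    conn.execute("INSERT INTO products (category) VALUES ('urn:cat:nope')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

So using urn-like strings as category ids does not by itself force you to give up the foreign key check; that trade-off only appears once the string lives inside a json blob.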

I don't see anything here to justify your assertion:

> A good table structure often makes for a poor domain model and vice versa. The friction you get from the object relational impedance mismatch is best avoided by treating them as two things instead of one.

To be frank, it sounds to me like you ran across a bunch of poorly designed DB schemas (or schemas you didn't understand the design decisions for) and decided that it must be impossible to design good DB schemas and so you just use unstructured document stores instead.




