Hacker News new | past | comments | ask | show | jobs | submit login
ORM is an anti-pattern (2011) (seldo.com)
58 points by dnsco on June 10, 2016 | hide | past | favorite | 47 comments



This is The Technologist's Mistake: "X doesn't work well in all cases, therefore X is wrong."

ORMs are great for simple stuff, and most things are simple stuff, even in complicated applications. The best ORMs don't try to do too much and let you kick out to SQL/strings when things get gnarly. ActiveRecord is great in regard. ActiveRecord also shows you the SQL it is executing, so you know when you are getting screwed with N+1 issues (which are easy to fix) or whatever.

You are foolish to use SQL for basic CRUD and simple-to-intermediate queries and you are foolish to write a bunch of cryptic fluent API code when a bit of highly-specialized SQL will do. It's not an either/or, as long as your ORM layer is pragmatic about letting you get out to SQL where necessary.


> The proposed alternative of adding layers

I didn't see this in the article. As far as I can tell, the article's proposed alternatives are either using a non-relational data store, or using SQL directly. If anything, that will remove layers, not add them; it's the ORM itself that is an added layer.


Mmm, yeah, I was unfairly extrapolating a bit:

> Encapsulate your relational queries into a Model layer...

I was thinking of the people that end up having a model layer and then a query layer on top of the model layer and then a query executor layer on top of the query layer, etc.


>using a non-relational data store, or using SQL directly. If anything, that will remove layers,

A non-relational data store doesn't add layers up front, but you end up adding homegrown migration layers once you realize the schema you used for your documents is no longer adequate.


This isn't his argument; he's arguing that people expect X to work for some set of cases (either because it's the latest hotness or because X's maintainers advertise it for those cases) but it ends up being more work in those cases than the proven solution. This is different from "doesn't work well in all cases".


He's arguing that ORMs break down for complex situations, so you shouldn't use them.

It's a false dilemma. You can use an ORM for a lot of needs and then, when necessary, kick out to SQL. He says this "breaks the abstraction" but who cares? Abstractions aren't laws of physics: do what works.

False dilemmas are the best dilemmas, because you can just ignore them. Unfortunately, programmers seem particularly susceptible to this sort of fallacy. I think it is due to a form of puritanism that I often see in the mathematical mind.


Whenever I read these sort of anti-ORM posts I wonder exactly how simple the data model of these people's software is that they would rather write SQL for every trivial operation in their data model by hand than use abstraction layers that are nowadays well-established, mature, and far better tested than whatever ad-hoc intermediate layer people make by hand for their data models.

Between myself and another guy I maintain a system with hundreds of tables, and if I had to write basic fetch-by-id queries, trivial joins, index statements and a bunch of other things which should demand zero effort from myself, I wouldn't even have the time to deal with that boilerplate.

SQL is a remarkably difficult language to compose, a problem that most ORM APIs just don't have. The code necessary to makes dumb joins is long and prone to typing errors. Hell even error messages are usually better at the ORM layer than whatever weird syntax error MySQL or Postgres throw at you.

Every large system that does not use an ORM at the beginning eventually end up growing a half-assed pseudo-ORM library which has been objectively worse in every regard than using quality ORM libraries like SQLAlchemy, the Django ORM, Yii's, etc.

The impedance mismatch problem is just not there because I do want a database to return objects. The only time that doesn't happen is when I want relational records for analytics and number crunching; even then most ORMs don't fail me.

Having to write a couple SQL queries by hand because the ORM fails you for every thousand ORM queries sounds like throwing the baby out with the bath water.


>Whenever I read these sort of anti-ORM posts I wonder exactly how simple the data model of these people's software is that they would rather write SQL for every trivial operation

It's actually the exact inverse: it's ORMs that are only good for fairly simple data models and become pain points for anything more advanced, after which level you end up with hand-rolled SQL through the ORM or complex reinvention of all SQL concepts at the ORM layer in an ad-hoc way.

SQL is good for both simple and overly complex models. ORM are ok-ish with simple (but still redundant) and a crappy layer of faux-simplicity and pain on more advanced models.

Not to mention that they promote not knowing the details of your data model and letting the ORM create it through another ad-hoc reinvention of the db's DDL.


It's not the complexity of the model, it's what you are doing with it. Almost all applications that use a database do some kind of CRUD. These operations, no matter how complex your model is, are easily handled by ORMs.

But one of the big advantages of an RDBMS is being able to perform more complex data analysis and transformations. You can do that with an ORM but the result is usually harder to understand and less performant than doing it in raw SQL.

Most advocates of ORM are thinking of the former case and most proponents of ORMs are thinking of the latter case. The trick is not to get boxed into one solution.


I think there are a lot of loaded expectations on what it means to be an ORM. Maybe this is no-true-scotsmaning, but I distinguish between things providing convenient query builders from full ORMs.

Generally speaking, I'm happy with anything that handles marshalling / unmarshalling and can generate typesafe (when applicable) queries for me (so long as its happy to get out of the way on those handful of occasions that I'd like to use some handwritten SQL).

Things I've used in the past like this include

  * Knex - http://knexjs.org/
  * Slick - http://slick.lightbend.com/
  * Opaleye - https://hackage.haskell.org/package/opaleye
Generally these don't call themselves ORMs though. The primary distinction I've found is that they don't conflate my domain model with my schema. They don't enforce a 1:1 correspondance to a table or a try to concoct some scheme of representing inheritance with multiple tables and they don't try to take on myriad other unrelated duties like validation, etc. They allow me to build my application in the hexagonal style ( http://alistair.cockburn.us/Hexagonal+architecture ) in an OO language.

Abstraction of SQL is not something I have a problem with, its trying to infect my domain model with persistence concerns, or even worse by trying to pull transaction handling out of the domain of the database.


I have yet to see a data model so complicated that it cannot be modeled in the more frequently used ORMs. I have tables with two dozen FKs that participate in 7 or 8 joins and it's still not a problem.

You mention all these things and yet in 10 years of experience they have not been a problem for anyone I know.

Have you even used a good ORM?


>Have you even used a good ORM?

Well, I've professionally used Hibernate in Java, Django's ORM in Python, and Sequelize in Node. Also used Active Record for (more CRUD-ish) Rails stuff.

I'm sure someone will point to some obscure "better" ones (like there's always the "use a different distro" answer to Linux issues), but those are the most commonly used ORMs in their respective environments as far as I know.

>You mention all these things and yet in 10 years of experience they have not been a problem for anyone I know.

I could anecdotally point to people I know, but let's have a measurable metric: the support forums for those ORMs are chock full of questions and reports for such problems.

Besides, the description of what an ORM can handle as if it's impressive, namely several FKs and joins ("I have tables with two dozen FKs that participate in 7 or 8 joins and it's still not a problem")is hardly the pinnacle of data modeling that a modern DB is capable of.

Simple multiple FKs and JOINs are nothing, or course an ORM should (and could) be able to handle them (although even there you can see users struggling with performance issues, over eager bring-the-world-in queries, etc).


"it's ORMs that are only good for fairly simple data models"

That's not true. Maybe you haven't used a powerful ORM, but with a good ORM you can scrape a fair bit of bad architecture under the rug.

"SQL is good for both simple and overly complex models"

SQL is poor for complex models. Imagine a schema refactor, and then a SQL query that breaks on aggregation without you getting any notice of it. It's easily done. Maintenance is a key quality of good software. There's other maintenance problems with platforms based around reldbs as well: it's arduous to maintain parallel test and production environments, and to have dev environments that operate at the same scale as your production environment.


I've found that sqlAlchemy core (the layer below the ORM) works exceedingly well for me–it's composable, exposes SQL easily as needed, makes abstraction easy etc. The major benefit is that the objects in the expression builder represent the parse tree, rather than representing SQL strings directly.


I've had the opposite experience. The concept of a query builder is great, but SQLAlchemy is a pain. Its documentation is particularly hard to traverse or search, and it's super unintuitive. Perhaps the nature of query building is deceptively complex, but it seems like the library just tries to be too magical and ends up making things worse instead of better.


Yeah, the documentation is really bad. I found that there were two types of objects returned by sqlAlchemy: "query objects" (which can then be executed or compiled into SQL) and "selectables" (which are tables, joins, etc. and would compile to queries that have select statement, just something you could select from.) Almost all operations consist of creating or composing these. But this wasn't anywhere in the documentation!

Once you figure out these tools there's actually no magic–it's pretty easy to figure out exactly what the expression builder is doing–but finding these took a bunch of trial and error.


> it's pretty easy to figure out exactly what the expression builder is doing

I don't know about that. I touched SQLAlchemy again this very morning for the first time in several months. I was just trying to get the count of a table with several rows, so I ran `table.count()`, but it kept returning 0. The problem happened on MySQL and Postgres. It turns out that this doesn't compile to `select count()`, but instead it returns the count of the first column in the SQLAlchemy table object's column list. In my case, that column was all NULL, so the count was coming back as zero.

I thought maybe this was an accident, but here's the body of the count() function from SQLAlchemy's source:

``` if self.primary_key: col = list(self.primary_key)[0] else: col = list(self.columns)[0] return Select( [functions.func.count(col).label('tbl_row_count')], whereclause, from_obj=[self], *params) ```

I run into this sort of unintuitive behavior every time I touch this library. :(


> Whenever I read these sort of anti-ORM posts I wonder exactly how simple the data model of these people's software is that they would rather write SQL for every trivial operation in their data model by hand than use abstraction layers that are nowadays well-established, mature, and far better tested than whatever ad-hoc intermediate layer people make by hand for their data models.

I prefer SQL query builders--they remove the pain of writing raw SQL--without the complexity of an ORM. You simply compose your SQL query the way you would compose any other data structure, and for statically-typed languages, you get some compile-time validation via the type system.


"Whenever I read .. that they would rather write SQL"

If they do write SQL. There are ways to structure complex domains that don't involve relational databases. Having said that, the alternatives presented in this article are weak, and suggest the author is not aware of a world beyond relational databases.


I used to think ORMs were an anti-pattern too until I started realizing I was essentially implementing them every time I wrote any type of model that wrapped around SQL, as suggested in the "Use SQL in the Model" section. First we need some finders, OK--wrote those, and now we need a way to create some data... and then update it... maybe we'll combine those into a save method. Oops... looks like I just wrote an ORM. Except my ORM took me more time to write and is a lot less tested.

I don't think of the benefits of an ORM being abstraction away from SQL or being more efficient. I think of them as code and tests written that I would of otherwise had to write myself.


Might as well say that SQL is an anti-pattern, too, and complete the flame-bait.

The unfortunate reality is that databases require a bit of work to use effectively, and it doesn't matter what method you use, it's not going to be transparent (but both SQL and ORM can work).


Blanket statements are an anti-pattern.


This one is the most succinct response. I like it.


Only a Sith deals in absolutes!


Death to all fanatics!


I agree with a lot of what the author has to say. Is it a coincidence that every project that used an ORM I've worked on in my 10 years experience has ended up a mess?

I end up spending more time troubleshooting and asking the "why is it generating these queries?" and trying to hack ActiveRecord to get it to do what I want.

I'm sure we have all had that moment where we actually have the SQL query already written and now face the task of trying to convince the ORM to just do what you want, while trying to conform to whatever horrible and leaky API they have created. A few hours later, after scouring StackOverflow, Google and wading through abstraction upon abstraction base classes you realize you're so far away from your original problem at this point, and you need to go home, so you just get the damn database driver and execute the query, what you could have done in the first place. You feel defeated and hope the others don't shun you for your "improper" use of the ORM.

Or maybe you do find a clever way of hacking what you need to do in the ORM, but a lot of the time I just don't care anymore to make things fit in someone else's' poor design.


This is a psychological problem not a technical one. If you've already written the SQL the right choice is to use that and not translate it into the ORM. The other way is madness, which you know, but you don't accept it.

An ORM is supposed to save you time. And it does. For the bulk of database interaction an ORM will save you a lot of effort. But it's not the be-all-end-all. It has limitations. And when you hit those limitations, you just don't use it. This isn't a huge strike against ORMs in general because every technical choice is a trade off. You just have to accept that it is a trade off. And because it's a trade off, sometimes it'll be the wrong choice for a task.

I've been in your situation and I happily add the SQL to the model directly and go back to my day. I don't worry that I didn't use the ORM. It's there to help me, not hurt me.


I guess I'm just not sure exactly what, save for trivial queries the ORM is actually good at abstracting?

We can all write simple queries with simple joins (I hope), that doesn't really save much time. In my experience anything more than a trivial model with a boiler-plate relation it becomes not worth it...


ORMs are good for manipulating complex object models and doing mostly trivial queries. However, the vast majority of queries are trivial. Maybe 2-5% of queries or bulk modifications can't be easily modeled. The result is still huge savings in time, effort, and code.

Also, if you're basing your entire understanding of ORMs on ActiveRecord, you're missing out on a lot.


One ORM I've used recently which does some cool stuff is Diesel, for Rust.

On a few of these counts, I think it does quite well:

> Efficiency is "good enough": none of the ORM layers I've seen claim efficiency gains. They are all fairly explicit that you are making a sacrifice of efficiency for code agility. If things get slow, you can always override your ORM methods with more efficient hand-coded SQL. Right?

Relevant GitHub PR: https://github.com/diesel-rs/diesel/pull/283 -- Diesel's query builder is often faster than a raw SQL string, and the queries are typechecked against a provided database schema at compile time.

> When you fetch an object, which of its properties (columns in the table) do you need? ORM can't know, so it gets all of them (or it requires you to say, breaking the abstraction). Initially this is not a problem, but when you are fetching a thousand records at a time, fetching 30 columns when you only need 3 becomes a pernicious source of inefficiency.

This is trivially easy to avoid in Diesel (and, I suspect, in other ORMs I haven't yet used. Just add a .select((tuple, of columns)) function call to the start of any query you're building.

> Entirely anecdotally, I claim that the abstraction of ORM breaks down not for 20% of projects, but close to 100% of them.

IMO a good ORM allows seamless integration of "raw" SQL through the use of type-checked function calls which integrate into the rest of the ORM. Diesel does this, and IIRC so does SQLAlchemy.

> Objects are not an adequate way of expressing the results of relational queries.

Right. But an ORM need not always force returning objects. Diesel (and again, IIRC SQLAlchemy) are examples.


I use Django ORM and SQLAlchemy on a daily basis. The Django ORM is somewhat limited, but it is OK for what it is used for. SQLAlchemy is a joy to use, it maps SQL 1:1 but is way more secure and easier to use than string-based composition of SQL statements. Pony ORM is other Python alternative worth a look.

This bold statement is clearly wrong, at least for Python.


+1 on SQLAlchemy. I find myself using the SQL expression API all the time these days for ad hoc queries where in the past I would have fired up psql.

    from sqlalchemy import create_engine, func, select

    query = select([
        my_table.c.column1,
        my_table.c.column2
    ]).select_from(
        my_table
    ).where(
        my_table.c.column3 == datetime.date(2016, 5, 10)
    ).where(
        func.coalesce(func.array_length(my_table.c.my_stupid_column4_is_an_array, 1), 0) > 0
    )

The above:

a) isn't too much more boilerplate-y than the actual SQL statement itself but is composable so later on in my IPython session I can easily do

    other_query = query.order_by(-my_table.c.column1).limit(10)
b) can always just be written in straight SQL if I don't remember the specific SQLAlchemy invocation or can even be written as a mish-mash of SQLAlchemy expression language and textual SQL.


ORM is really nice when you have a very large complex object you want to send out to the browser. Have the browser send it back, have the server validate it and save/update it. For that, it is beautiful.

The N+1 query problem and such have to be avoided, but that's just knowing how to use the tool properly. For querying, it's usually better to use the criteria API or in some cases straight SQL, but for doing simple crud stuff with complex objects it's pretty hard to beat an ORM.

The complex querying stuff is moving to Hadoop and Spark for larger data sets anyway, so if anything, using the same database software for OLAP and OLTP databases is almost an anti-pattern these days, at least for larger data sets.


This article misses some of the biggest benefits of an ORM - a unified structure of the data models interacted with in code.

ActiveRecord sucks, but that doesn't mean all ORMs suck.


The value I get from ORM isn't the sql generation, it's in the removal of tedious boilerplate to map between a single row of a query result and an object.


Think about this. Why developer accept the heavy cost of learn something like C++ or even python, yet in the case of a language of SQL suddenly they think: No, is toooo hard!

Is worst when they refuse to use a relational engine, yet, use a NOSQL and re-implement, badly, what a relational engine give.

Like the well know quote about lisp:

http://www.c2.com/cgi/wiki?GreenspunsTenthRuleOfProgramming

Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of CommonLisp.

Is possible to say:

Any sufficiently complicated data manipulation program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of a relational engine.

-------

Is clear that SQL is not very good, but use a database engine and know enough to do it well is WAY EASIER than learn C++, python or Javascript. Like a lot!


Damm, look like someone else already state the rule about databases:

http://www.c2.com/cgi/wiki?GreencoddsTenthRuleOfProgramming


ORMs are just tools. They aren't pro- or anti- anything.

Unquestioned adoption (or its brother, knee-jerk eschewance), argument by sunk cost and/or emotional attachment, analysis by "let me google another blogpost or two", and other forms of herd-following and fanboy-ism? These are anti-patterns.


In my experience, the main problem with ORMs is that they promote the tendency to reuse their model instances for business logic entities, meaning they become overly coupled with the persistence mechanism, namely SQL databases in this case. The architecture becomes much more flexible if the business logic is kept in an independent layer -- which may or may not be object-oriented -- which then uses ORM instance objects for interacting with the database.


My experience is that ORMs fail at dealing with reports, searching data and all large data updates. If all your application needs is a single row of data tied to some other set of single rows of data then an ORM might be of limited benefit to you. I have had multiple contracts to fix the performance problems introduced by ORMs. In the test database the ORM was good - in the real world not so good.


Every approach has ups and downs, but my fundamental gripe with ORM as it exists is that it does not match business data model very well.

Here is a shameless link to something I wrote on that topic; it complements the above article: http://unframework.com/orm-vs-key-value/


We use spocs and adapters where I work and I like being able to write sql. It's a powerful language and don't see why people shy away from it.

Storing objects over using relational... be careful! Relational dbs give you decent guarantees that blobs or schemaless dbs don't. Often data is more important than code.


At least some of these points are problems with particular implementations and not with ORMs themselves. I personally find LINQ to be huge improvement over bare SQL.

Sure, there are some weak points, but IMHO none of the alternatives mentioned here are any better.


Technically linq isn't an orm, but I guess there's still a lot of overlap with points in the article. Eg you still need to know SQL and the linq abstraction on top of that.

What I like about linq/EF is that it doesn't force you into other anti patterns (active record, repository etc) but that's probably just a matter of opinion.


Better title: ORMs are my favorite thing to strawman.


ORM promotes mutable entity types.


To me, the evolution of programming communities and their attitudes toward mutation is interesting. It seems like 2011 was long ago and somewhat "less enlightened."




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: