
ORM is an anti-pattern - jeromegn
http://seldo.com/weblog/2011/06/15/orm_is_an_antipattern
======
eftpotrm
Personally, in developing quite a lot of different data-backed apps, I've
never really found the problem ORMs are solving to be a hugely significant
one; it seems like a 'quick fix' for coders who don't really understand SQL
anyway, which always felt to me to be attacking the problem in the wrong
place. SQL isn't _that_ hard....

In any case, while I don't dispute that it _might_ offer speed of startup
advantages for some developers, it seems no-one is so far disputing that it
simply doesn't scale and, if your project really takes off, it will be
creating problems. Call me a fogey if you will but I don't like the idea of
launching a project that I know will need very substantial rearchitecting too
early in its life.

~~~
ianterrell
It's hard to believe this isn't just trolling.

Writing a large app directly using SQL is not "hard," but A) it's
exceptionally boring, B) it necessitates mental context shifting from your app
language to SQL, and C) it requires you to think about nuts and bolts routine
below-your-paygrade solved concerns like SQL injection, transaction
management, type casting, etc.
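
To make point C concrete, here is a minimal sketch of what handling injection and transactions by hand looks like, using Python's built-in sqlite3 module. The `users` table and the `add_users` helper are made up for illustration; they are not from the article.

```python
import sqlite3

# Hypothetical schema, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

def add_users(conn, names):
    # "with conn" commits on success and rolls back on an exception,
    # so the batch is all-or-nothing.
    with conn:
        for name in names:
            # The ? placeholder lets the driver bind the value,
            # so it is never spliced into the SQL text.
            conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

hostile = "Robert'); DROP TABLE users;--"
add_users(conn, ["alice", hostile])
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
stored = conn.execute("SELECT name FROM users WHERE id = 2").fetchone()[0]
```

None of this is difficult, but every query in the app has to remember to do it.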

Two more points:

I've never seen a large app that either didn't use a popular ORM or didn't end
up REINVENTING large chunks of functionality ORMs provide. Good developers
write DRY code, and SQL everywhere leads to, eventually, some library that
looks remarkably like... an ORM...

Finally, in the case of large web applications (since that's HN's bread and
butter), an ORM'd app will not need a "very substantial rearchitecting" to get
rid of the ORM. That statement is patently absurd. Early in its life, for
scaling (always, IF it becomes necessary) what it will need is dropping down
to SQL to tune _some_ aspects of _some_ pages. Architect for doomsday if you
like (or, conversely, if you need extra hours to bill) -- or architect to
ship.

~~~
MartinCron
_A) it's exceptionally boring_

Perhaps it's because it's so exceptionally boring that it's so error-prone.
If you forget to wire something up by hand (especially if it's nullable)
and don't notice while you're coding, it can be one of those subtle little
bugs that lingers in the background until it really bites you.

Also, you forgot:

D) Difficult and time-consuming to create test automation for.

~~~
drivebyacct2
Yes! At least LINQ-to-SQL gives compile time type checking.

~~~
johnzabroski
Linq 2 SQL has an ambiguous relationship-path decision function when deciding how
to map the Linq expression tree model to the SQL DBMS's model.

Compile-time type checking is very illusory with heterogeneous execution, not
only for this reason but also because it is a distributed application and the
schema can change.

~~~
drivebyacct2
And I've been bitten by the truth of your statement. It sure beats the pants
off writing SQL by hand.

------
mechanical_fish
My conclusion, drawn from the title alone: The term _antipattern_ has
apparently jumped the shark.

Spend five minutes decoding a particularly hairy regular expression? Regexps
are an antipattern. Someone writes an inefficient SQL query? SQL is an
antipattern. Stub your toe on a curb? Curbs are an antipattern.

~~~
NyxWulf
This is my favorite response. One of the reasons I came to HN from Reddit was
that the proportion of populist/echochamber articles and comments to well
thought out and insightful articles and comments changed unfavorably imo. It
makes me sad to see the trend continue on HN, perhaps it is inevitable.

My frustration with this article is that while it makes some valid points that
could well be articulated as things to watch for, it instead takes that to the
extreme and generalizes a single person's biased experience to the notion of a
more universal truth. Unfortunately the world is much more complicated than
these types of articles like to admit.

Yes SQL is well known, and solves some sets of these problems, but it also
introduces other problems that can overwhelm depending on the environment and
the scale. As an example, at my last job I worked at a company that worked in
banking and mortgages. The tangle of banking and mortgage laws is different
state by state, and in some cases county by county. At this particular company
an unusually large percentage of the people knew how to write SQL. So vast
amounts of business logic were written into SQL. This was stored variously in
stored procedures, DTS steps, SSIS steps, Excel VBA functions, Access
functions, etc., etc., ad nauseam. May not sound like much until you realize
that we counted 50 thousand Excel documents on the network stores, 10 thousand
Access databases, 10 thousand stored procedures, etc., etc. This was for an IT
team of about 20 people. Does this mean SQL is an anti-pattern? No, it just
means that tools taken to the extreme and used well beyond their design tend
to stop functioning well.

ORMS certainly have their problems and can definitely slow things down.
However almost all of those arguments could be and were used to argue why
assembly was better than C, why C was better than Java, why Java was better
than Ruby...almost seems like a pattern. ORMS can save you on maintainable
code footprint, and can reduce complexity if they are used correctly. Of
course that doesn't mean they can unthinkingly be applied in every situation
with no design. The contra is also not true. Unthinkingly applying SQL in all
situations is just as perilous and fraught with issues.

I suppose it is against human nature, but I wish for the day when people could
have reasonable discussions about issues, recognize that other people have
some valid points, and possibly learn from each other and not feel the need to
over-sensationalize everything.

~~~
Stiffy
This isn't just "a single person's biased experience". This is an experience
that millions of developers and users have experienced time and time and time
and time and time and time and time and time again.

ORMs make for a bad developer experience, a bad user experience, and thus a
bad overall experience. For any serious app using an ORM, this will always be
the case.

------
mgkimsal
"The whole point of an abstraction is that it is supposed to simplify"

No, it's supposed to abstract. A simplification is supposed to simplify. Often
abstractions have the benefit of simplification, but it's not a requirement.

I migrated a project from MySQL to PostgreSQL last summer, and the project was
built on Grails with GORM. I had to migrate the data by hand (mostly easy,
save for a couple of edge cases like boolean columns), and I had to change the
jdbc driver. That was pretty much it. No rewriting of SQL, no changing of
escaping logic, etc. I tell a lie - the auto-sequence generation stuff of
postgresql wasn't playing nice with some of the GORM identity stuff, and my
code had made some assumptions that turned out not to be 100% true. Those
_likely_ would have shown up had I written my own stuff rather than relied on
GORM, but it was a little bit of a pain to track those down.

All in all, using the ORM _abstracted_ away the need to write against specific
database commands and syntax. A byproduct of that was simplification of most
use cases of the database, but the key use was abstraction.

------
InclinedPlane
I've seen a lot of the problems that ORM creates with big projects. The most
egregious is lack of control. You'll run into some problem caused by some
quirk of your ORM system and you'll dig down into the SQL and learn precisely
what's causing it, but you still won't be able to fix it because you don't
know the magic voodoo incantations to change your config or the ORM client
code in the right way to fix it.

When ORM starts to get in the way like that it really makes you wonder whether
it's worthwhile.

~~~
sunchild
I would love to hear about an example of this kind of case.

My experience is that ORM queries, used properly, deliver predictable results.
If they don't because of a bug in the ORM, can't you just drop down to the
native query and move on?

~~~
chrisjsmith
It's mainly due to the fact that MORE abstraction means MORE complexity. I
currently have to deal with an NHibernate mess of over 1000 domain objects (!)
and it's an absolute nightmare on the performance and
maintainability front.

~~~
smhinsey
To be fair, 1,000 (or however many you need) stored procedures probably
wouldn't be all that much better. I've seen systems taking both approaches and
I've seen both go right and wrong. It has more to do with the team than the
technology, I think.

~~~
chrisjsmith
We replaced 15,000-odd stored procedures with NHibernate. It's the same turd
in a different coat.

------
sunchild
What is it about coders and blogs that brings out the "cranky old man" vibe?

An ORM is an insanely convenient way for newbies to use various data stores
while avoiding learning umpteen different query languages. The teaching value of
ActiveRecord for newbies is hard to overstate. It's also a damn nice way to
move your application closer to platform independence – a valuable thing in
today's PaaS integrated stacks.

If you seek efficiency and performance, don't use an ORM. Lick the freezing
cold metal, if you want. Nothing is stopping you from doing what you like!

(Also, I haven't dropped down to SQL since Rails 3.x and meta_where. Yes, I
realize that my applications "won't scale". They are appropriately scaled for
their intended purposes.)

~~~
bluekeybox
> An ORM is an insanely convenient way for newbies to use various data stores
> while avoiding learning umpteen different query languages

Let's count these umpteen different query languages: (1) SQL, (2) ... ?

> If you seek efficiency and performance, don't use an ORM.

If you don't need efficiency and performance, why use a DBMS at all? Isn't the
whole point of a DBMS to make data access not only secure but also efficient
(B-tree indexes, etc.)?

~~~
true_religion
The variants of SQL are different between different databases.

Even the data-types supported are different.

It's not uncommon for consumer and enterprise products to have to work with
different databases. In the case of a consumer product, it'd be a client
having only one database type installed and requiring that your software use
it. You benefit by writing in an ORM because you can have the same codebase
for multiple db installs.

With enterprise customers it'll be having _multiple_ databases installed at
once, and having your code interface _between_ all of them. You benefit with
the ORM by not having to remember and handcode all of the query differences.
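
As one small illustration of those dialect differences, here is a sketch of how an auto-incrementing primary key is declared in four databases. The `create_table_sql` helper and the `users` table are hypothetical; a real ORM handles far more than this one clause.

```python
import sqlite3

# The same concept, spelled four ways.
AUTO_ID = {
    "mysql": "id INT AUTO_INCREMENT PRIMARY KEY",
    "postgresql": "id SERIAL PRIMARY KEY",
    "sqlite": "id INTEGER PRIMARY KEY AUTOINCREMENT",
    "sqlserver": "id INT IDENTITY(1,1) PRIMARY KEY",
}

def create_table_sql(dialect, table, extra_columns):
    """Build a CREATE TABLE statement for one dialect (hypothetical helper)."""
    columns = [AUTO_ID[dialect]] + list(extra_columns)
    return f"CREATE TABLE {table} ({', '.join(columns)})"

# Only the SQLite variant can be run here without a server.
conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("sqlite", "users", ["name TEXT"]))
conn.execute("INSERT INTO users (name) VALUES ('ann')")
new_id = conn.execute("SELECT id FROM users").fetchone()[0]
```

Multiply that by pagination, string functions, and date handling, and the appeal of letting a library pick the spelling is easy to see.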

------
mistermann
One aspect that always comes up is the "inefficiency" of doing a select * from
a table with 30 columns when you only need 4 columns. 99% of the time the
millisecond performance difference doesn't matter, and if it does, there is a
standard non-default way to handle it in most ORMs.

However, one aspect that is usually conspicuously absent in anti-ORM blog
posts is that of development time and cost. ORM usage practically guarantees
some known inefficiencies in the code, but it lets you implement and pivot really
quickly, and the time and money saved are easily more than enough to pay for a
bump in hardware to overcome the 10% slower code. But to do so is heresy for
these people... selecting columns from the database that you do not use is just
not done, full stop. Which is cheaper, in dollars, is irrelevant.

~~~
epscylonb
This surprised me as well, I was told that in Postgres at least, there is no
difference in query speed between selecting a subset of a record and the
entire thing.

~~~
IgorPartola
Forget the speed of the database. What happens when you need to transmit the
returned data across a network? Would you rather transmit 32 bytes per row, or
10kB?
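
A rough sketch of that difference, using an in-memory SQLite table with one wide column. The `docs` table and the `payload_size` helper are invented for the example, and summing text lengths is only a crude proxy for what a real wire protocol would send.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO docs (title, body) VALUES (?, ?)",
    [(f"doc {i}", "x" * 10_000) for i in range(100)],
)

def payload_size(rows):
    # Rough proxy for bytes on the wire: total length of every value as text.
    return sum(len(str(value)) for row in rows for value in row)

everything = conn.execute("SELECT * FROM docs").fetchall()
narrow = conn.execute("SELECT id, title FROM docs").fetchall()

print(payload_size(everything), "vs", payload_size(narrow))
```

The database may do similar work either way; the network and the client do not.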

~~~
falcolas
Or, like I had happen recently, 10GB?

Of course, that comes down to the scaling problem. The application I'm
currently working on can't use ORMs (though we have rolled a few of our own
read-only ones), because of 1) the size of the data we're going through, and
2) the lack of support for splitting reads and read/writes to different
connections.

~~~
IgorPartola
(1) is definitely a problem I've seen. SQLAlchemy for example takes forever to
initialize objects representing even a single row. Fine if you have less than
100 tables with less than 100 rows each and almost never join, but start
scaling, and you run into issues.

~~~
zzzeek
"forever to initialize a single row" is obviously some pretty severe
hyperbole, but the slowness of loading rows into fully formed objects is
largely a product of CPython's sluggishness in creating new objects. This
also has nothing to do with how many rows are in the table, only how many rows
you happen to be fetching over the wire for a particular query.

Performance is also something SQLAlchemy is obsessive about, so the speed of
loading objects in SQLA has increased by orders of magnitude over the course
of 0.4, 0.5, 0.6, 0.7. If your opinions are based on earlier versions, your
impressions may be out of date.

~~~
IgorPartola
Sorry to be vague. In my benchmarks it takes around 30 ms to initialize an
instance of a model as a result of a query using Python 2.6 and sqlalchemy
0.6. This is in contrast to raw MySQLdb query initialized as a dict which
takes 2-4 ms.
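
For anyone who wants to try this kind of comparison themselves, the shape of the measurement can be sketched without a database at all. This does not reproduce the figures above: `Model` is a bare stand-in class, not a SQLAlchemy model, and a real ORM does much more per row (identity map, change tracking), so treat the output as illustrative only.

```python
import timeit

class Model:
    """Stand-in for an ORM-mapped class; real ORMs do far more per row."""
    def __init__(self, id, name, email):
        self.id = id
        self.name = name
        self.email = email

COLUMNS = ("id", "name", "email")
# Pretend these tuples just came back from a driver.
ROWS = [(i, f"user{i}", f"user{i}@example.com") for i in range(10_000)]

def as_dicts(rows):
    return [dict(zip(COLUMNS, row)) for row in rows]

def as_objects(rows):
    return [Model(*row) for row in rows]

dict_seconds = timeit.timeit(lambda: as_dicts(ROWS), number=20)
object_seconds = timeit.timeit(lambda: as_objects(ROWS), number=20)
print(f"dicts: {dict_seconds:.3f}s  objects: {object_seconds:.3f}s")
```

The cost scales with the number of rows fetched, not with the size of the table they came from.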

~~~
zzzeek
MySQLdb is written in C. So you're comparing the speed of raw C code to pure
Python (which runs on top of that C code). The vast difference in speed
between pure C and Python has little to do with SQLAlchemy which I can assure
you generates those objects as fast as possible while still maintaining proper
behavior.

~~~
IgorPartola
MySQLdb is written in Python. _mysql is a Python extension wrapped around
libmysqlclient. _mysql is about twice as fast as MySQLdb BTW.

Never mind all that. My point is that if you want fast code, write it in raw
SQL.

~~~
zzzeek
not really sure what you mean. MySQLdb, i.e. this project:
<http://mysql-python.sourceforge.net/FAQ.html>? When you ask this library to execute a
query and return rows, the network communication with the MySQL server is done
with C code, i.e. with _mysql . It doesn't make sense to say "_mysql is twice
as fast as MySQLdb" - "_mysql" is an intrinsic component of it. The Python
portion of MySQLdb does very little.

Here is from the FAQ included with the project:

"MySQLdb is a thin Python wrapper around ``_mysql`` which makes it compatible
with the Python DB API interface (version 2). In reality, a fair amount of the
code which implements the API is in ``_mysql`` for the sake of efficiency."

In fact, if you _do_ want MySQLdb that's pure Python, there's an entirely
separate project for that called pymysql: <http://code.google.com/p/pymysql/>.
This library will be a lot slower than MySQLdb.

------
div
Labeling an ORM as an anti-pattern is throwing the baby out with the
bathwater. Sure, you will encounter some cases in which your ORM will be a
pain in the ass or even actively work against you, but most good ORMs will
allow you to talk to the database directly.

For example, both Hibernate and ActiveRecord allow you to just throw straight
sql to your database, returning a bunch of key value data.

Which is exactly what a good solution does: provide large gains for the common
cases, and get out of the way for the edge case.
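
The escape hatch looks roughly like this. This is a sketch with Python's sqlite3 rather than Hibernate or ActiveRecord, and `raw_query` and the `posts` table are made-up names; the point is only that hand-written SQL goes in and plain key/value rows come out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become addressable by column name
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO posts (author, views) VALUES (?, ?)",
    [("ann", 10), ("bob", 5), ("ann", 7)],
)

def raw_query(conn, sql, params=()):
    """Escape hatch: run hand-written SQL, return plain dicts."""
    return [dict(row) for row in conn.execute(sql, params)]

totals = raw_query(
    conn,
    "SELECT author, SUM(views) AS total FROM posts GROUP BY author ORDER BY author",
)
```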

------
perlgeek
> If your data is objects, stop using a relational database.

What does that even mean?

My data, is, well, data. Tables and rows are just ways to represent my data,
as are the nested hash and array structures of document storage systems. Oh,
and tables and rows are also objects.

What data is "object" and what data is "non-object"?

~~~
seldo
Yes, I could have clarified that. Here's an attempt:

"Relational" data is data whose value stems from its relationships with other
data. For instance, if it is statistical data that is viewed in aggregate
rather than as individual rows. Or if you need to answer the question "how
many rows look like this row?"

"Object" data is data that is useful in and of itself, and is largely self
contained. A pretty good example is a blog post: each post has a bunch of
metadata, including possibly a string of comments. But you seldom if ever run
queries across batches of blog posts (other than indexing them by date).

It's always bugged me that blog entries -- the staple of the ORM tutorial --
have little to no relational value, which is why they work so well in ORM.

~~~
glenjamin
Splitting blog posts by date is hardly something that's rare. See also, all
blog posts by tag, all posts by author, all posts containing the word
"javascript", latest comments by a particular user etc.

Well organised blogs have a reasonable amount of relational data.

------
mixonic
Embedding strings of one language in a second language is an anti-pattern.

I've been at a bunch of NYC dev events recently, and people at both Goruco and
Percona Live were hating on ORMs. ORMs have gotten _really good_ in the last
few years, I think the haters just haven't been using them.

Show developers a good alternative and they will go there. Some of the basic
points made in this article ring true, but the suggested alternatives are
weak. ARel is a great start to a non-ORM database wrapper in Ruby! Somebody
just needs to go there.

------
encoderer
Somebody may have already mentioned this, but there's a _fantastic_ essay _The
Vietnam of Computer Science_ (2004) on this subject. It's long but so, so
worth it.

[http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Compute...](http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx)

~~~
wulczer
My first reflex was to write a comment mentioning that essay and then I found
this. Wish I could upvote you more...

------
adelevie
I'd be interested in the author's take on ActiveRecord's implementation of
Relational Algebra with ARel[1]:

> To manipulate a SQL query we must manipulate a string. There is a string
> algebra, but its operations are things like substring, concatenation,
> substitution, and so forth–not so useful. In the Relational Algebra, there
> are no queries per se; everything is either a relation or an operation on a
> relation. Connect the dots and with the algebra we get something like
> “everything is named_scope” for free.

Also, if I couldn't use something like ActiveRecord in my Rails apps, I'd end
up re-writing most of its functionality in my model code somewhere. If I don't
get Model#find_by_some_attribute() for free, then I have to spend time writing
it.

[1]
[http://magicscalingsprinkles.wordpress.com/2010/01/28/why-i-...](http://magicscalingsprinkles.wordpress.com/2010/01/28/why-i-wrote-arel/)
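
To show what "everything is a relation or an operation on a relation" buys you, here is a toy sketch in Python. It is not ARel's actual API; `Relation`, `where`, `order_by` and `to_sql` are names invented for the example. Each operation returns a new relation, so partial queries can be named and reused instead of concatenating strings.

```python
class Relation:
    """Tiny immutable query builder; a toy, not ARel's actual API."""

    def __init__(self, table, wheres=(), params=(), order=None):
        self.table = table
        self.wheres = tuple(wheres)
        self.params = tuple(params)
        self.order = order

    def where(self, condition, *params):
        # Returns a new relation; the receiver is left untouched.
        return Relation(self.table, self.wheres + (condition,),
                        self.params + params, self.order)

    def order_by(self, column):
        return Relation(self.table, self.wheres, self.params, column)

    def to_sql(self):
        sql = f"SELECT * FROM {self.table}"
        if self.wheres:
            sql += " WHERE " + " AND ".join(self.wheres)
        if self.order:
            sql += f" ORDER BY {self.order}"
        return sql, self.params

posts = Relation("posts")
published = posts.where("published = ?", 1)  # reusable, like a named scope
recent_by_ann = published.where("author = ?", "ann").order_by("created_at")
```

`published` stays usable on its own after `recent_by_ann` is derived from it, which is the part string concatenation makes awkward.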

------
JulianMorrison
I like the iBatis (now renamed mybatis) approach: explicit queries in a
separate file that say "input an object of this type reading these fields,
output an object of that type setting those fields" and contain raw SQL to be
thus parameterized.

This avoids the two largest flaws: live proxies with hidden state pretending
to be simple data objects, and SQL being generated with no control. It also
avoids a mistake I've only seen two ORMs make but they're common ones:
defining its own dialect of not-quite-SQL.

You still get objects mapped in and out of DB queries, it saves you the
pointless grunt work of "copy A, put it in B" and it prevents the as-bad-as-
ORM anti pattern of "SQL scattered throughout your code".
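
A rough Python sketch of the same idea, for anyone who hasn't seen it. This is not mybatis itself: the `QUERIES` dict stands in for the separate mapping file, and `AuthorFilter`, `AuthorSummary` and `run_query` are hypothetical names.

```python
import sqlite3
from dataclasses import dataclass, asdict

@dataclass
class AuthorFilter:      # input object: its fields feed the query
    min_books: int

@dataclass
class AuthorSummary:     # output object: filled from the result columns
    name: str
    book_count: int

# Raw SQL kept in one place, keyed by name; a dict stands in for the file.
QUERIES = {
    "prolific_authors": """
        SELECT a.name AS name, COUNT(b.id) AS book_count
        FROM authors a JOIN books b ON b.author_id = a.id
        GROUP BY a.id
        HAVING COUNT(b.id) >= :min_books
        ORDER BY a.name
    """,
}

def run_query(conn, name, params, result_type):
    # Named placeholders are bound from the input object's fields.
    cursor = conn.execute(QUERIES[name], asdict(params))
    columns = [d[0] for d in cursor.description]
    # Results are plain data objects: no proxies, no hidden state.
    return [result_type(**dict(zip(columns, row))) for row in cursor]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO books VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
""")
prolific = run_query(conn, "prolific_authors", AuthorFilter(min_books=2), AuthorSummary)
```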

~~~
narrator
I think iBatis is great for querying data out of the db and ETL operations,
especially when I'm using a lot of database features. However, IMHO, Hibernate
is better for CRUD ops on individual records because it takes care of dirty
flagging and managing relationships.

~~~
JulianMorrison
In my experience Hibernate fails badly at dirty flagging and saves every field
in the object rather than updating the changed properties (active record is
better). I suspect that's a design decision, to avoid an object in cache being
partially stale relative to the DB. But it's a problematic solution to
something that didn't need to be a problem.

Hibernate does manage relationships - but that is a misfeature. It's _doing
the wrong thing well_. The right thing is not to try to model relations as
objects, but to model queries as methods. The relationships exist _only_ in
the database - they are not duplicated into the data objects.

~~~
Zvez
"The relationships exist only in the database" Entity Person contains
a collection of entity ContactInfo. In the DB we have tables PERSON and CONTACT_INFO
and a foreign key from CONTACT_INFO to PERSON. In this example, what is the
difference between the relations in code (between entities) and the relations in
the DB (between tables)?

~~~
JulianMorrison
I wonder if you understood what I was saying? A tool like "mybatis" just reads
and creates unproxied POJOs. The relationships are in the DB and are expressed
in the SQL. The result objects are just results. With this type of data
access, there is no duplication between the DB's relationships and the ORM's
relationships which shadow them.

------
ianterrell
"In the long term has more bad consequences than good ones."

I would hypothesize that one of the long term good consequences of healthy ORM
options is the existence of the vast majority of database backed applications
we all know and love. Sure, when/if they got popular someone had to tune some
SQL, but how many of those projects would have even been started without
ActiveRecord or Hibernate or EJB3 or CoreData?

The opinion that "ORM is an anti-pattern" is ridiculous nonsense.

------
SeoxyS
The main problem with ORMs is that they're trying to work around
non-object-oriented data stores. Layers of abstractions and ORMs in particular are
generally good things, but they can't do magic when it comes to dealing with
SQL.

If you're going to be using an ORM, I'd strongly recommend rethinking your
data store. Object databases such as MongoDB are a perfect fit, but even a
key-value store like Cassandra would be a much better option than SQL. I think
it's interesting to note that Core Data, Cocoa's ORM, is one of the fastest
data stores out there. It uses SQLite, but defines its own schemas. I believe
it'll also let you store pure binary data.

~~~
chc
Small nitpick: If you're using a non-relational database, the library you use
to connect to it is not an object-relational mapping layer.

~~~
SeoxyS
You have a point! I guess I'm bastardizing the definition of an ORM. What I
mean by it was a library that automatically maps the data layer to objects in
your application code, and (often) gives you tools to work with these objects.

I wrote an "ORM" for MongoDB which adds functionality such as transparent
relationships.[1] Basically, even though it's not a _relational_ db, it lets
you do things like this:

    
    
        foreach ($author->books as $book) {
            // $author->books is an iterator; it lazily loads a
            // Book object on every iteration, so $book is a Book.
            echo $book->author->name;
            // $book->author is an Author object, automatically
            // and lazily loaded and cached.
        }
    
    

[1]: <https://github.com/kballenegger/MongoModel>

Among other many cool features. The point of highlighting that though, was to
illustrate that giving up traditional RDBMS doesn't mean giving up on awesome
relationships. The only thing missing is subqueries—but honestly, I don't
think that's a very big loss.

------
DavidMcLaughlin
This seems incredibly naive.

ORMs reduce code duplication. They speed up development, especially when
you're treating the underlying data storage as a "dumb" datastore that could
just as easily be sqlite or H2 as MySQL or Postgres.

As for ORMs having some sort of negative impact on the queries sent to the
underlying database - it really depends on what ORM you use but any ORM I've
used had support for pre-loading relationships in advance when required,
removing that N+1 problem.
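
For readers who haven't met the N+1 problem, here is a minimal sketch using sqlite3. The `CountingDB` wrapper and the `authors`/`books` tables are made up for the example; the join in `titles_preloaded` stands in for whatever pre-loading option a given ORM offers.

```python
import sqlite3

class CountingDB:
    """Wraps a connection and counts the queries sent through it."""
    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO books VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1'), (4, 3, 'C1');
""")

def titles_lazy(db):
    # One query for the authors, then one more per author: N+1.
    result = {}
    for author_id, name in db.query("SELECT id, name FROM authors ORDER BY id"):
        rows = db.query(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id", (author_id,)
        )
        result[name] = [title for (title,) in rows]
    return result

def titles_preloaded(db):
    # Same data in a single round trip via a join.
    result = {}
    rows = db.query(
        "SELECT a.name, b.title FROM authors a "
        "JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result

lazy_db, eager_db = CountingDB(conn), CountingDB(conn)
lazy = titles_lazy(lazy_db)
preloaded = titles_preloaded(eager_db)
print(lazy_db.count, "queries vs", eager_db.count)
```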

I also want to add that I wrote an ORM for the first company I worked for and
when it was finished it was a drop-in replacement for 90% of the queries in
our application - and I mean that literally the SQL generated by the ORM was
exactly the same as the SQL being replaced. The queries that it couldn't
replace (mainly reporting queries) _already had an aggressively tuned caching
layer in front of them anyway because they were so hairy_.

But the real point is this: the performance of the ORM didn't really matter
because we were a database driven website that needed to scale - so we had
layers upon layers of caching to deal with that issue.

And that is an extremely important point - the way ORMs generalise a lot of
queries (every query for an object is always the same no matter what columns
you really need) lends itself to extremely good cache performance. Take the
query cache of MySQL for example - it stores result sets in a sort of LRU. If
you make n queries for the same row in a DB but select different columns each
time - you store the same "entity" n times in the query cache. Depending on
how big n is, that can cause much worse cache hit performance than simply
storing one representation of that entity and letting all n use cases use the
attributes they need.
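
A toy model of that effect, assuming a cache keyed on the query text and parameters. `QueryCache` and `fake_fetch` are invented for the sketch; this is not how MySQL's query cache is implemented internally, only the keying behaviour described above.

```python
class QueryCache:
    """Result cache keyed by SQL text and parameters (toy, unbounded)."""

    def __init__(self, fetch):
        self.fetch = fetch
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, sql, params=()):
        key = (sql, params)
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = self.fetch(sql, params)
        return self.store[key]

def fake_fetch(sql, params):
    # Stand-in for the actual database call.
    return {"id": params[0]}

# Hand-tuned: three call sites each pick their own columns for user 7.
tuned = QueryCache(fake_fetch)
for columns in ("name", "name, email", "email, created_at"):
    tuned.get(f"SELECT {columns} FROM users WHERE id = ?", (7,))

# ORM-style: every call site issues the same whole-entity query.
uniform = QueryCache(fake_fetch)
for _ in range(3):
    uniform.get("SELECT * FROM users WHERE id = ?", (7,))
```

The tuned version stores the same user three times and never hits; the uniform version stores it once and hits on every call after the first.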

Now, relying on MySQL's query cache for anything would not be smart, but
replace it with memcached or Redis or whatever memory-is-a-premium cache and
the same point stands. Another example to drive the point home is a result set
where you join the result entities to the user query so that you can get all
the results back in a single query. In theory this is a great way to reduce
the number of queries sent to your DB but if you have caching then there are
many times where you could have very low cache hit ratios for user queries
since they tend to be unique (for example they use user id) but where you
could still get great cache hit performance if certain entities appear often
across all those result sets by leaving out the join and doing N+1 fetches
instead.

ORMs prevent you from scaling as much as using Python or Ruby over C does.

So I guess that leaves the point about leaky or broken abstractions. Well I
would never claim that you can abstract across a whole bunch of databases
anyway, I think that's a ridiculous claim that most ORMs make. Abstractions
that try to hide the underlying technology are really just
a lowest common denominator of all the feature sets. So if you chose some
technology because you really wanted a differentiating feature then most
likely you will find yourself working against such abstractions. Interestingly
enough, the dire support for cross-database queries which are perfectly legal
in MySQL but not in other vendors is the reason I had to roll my own ORM. But
the productivity and maintainability benefits were well worth it.

So yeah I guess what I'm saying is: premature optimization is the root of all
evil, there are no silver bullets, and performance and scalability are about
measuring and optimising where needed. And finally: ORMs are not an
anti-pattern.

~~~
akeefer
Calling the author naive is, I think, uncharitable. I've also written my own
ORM layer, used for large-scale, high-performance enterprise applications, and
I largely agree with the original author's post, so I certainly don't think
it's naive. It sounds to me like the thoughts of someone who's run into some
real problems in real situations.

I don't disagree with the points you've made around caching, but I do think
you're simplifying the problem a bit. Not all performance tuning in DB-
intensive applications is around caching, and it often involves query tuning,
indexing, and traditional DB-level stuff.

A large part of the abstraction leak around ORMs is around both the caching
and that DB-level performance tuning. You have to understand what code is
going to generate what queries so that, at the very least, you can tune them
by adding in the appropriate indexes in the database. All of a sudden, you're
living in SQL land, examining query plans, etc. But if you decide that the
change you need to make is to the SQL itself, the ORM layer suddenly gets in
your way: you either have to bypass the ORM layer to drop into raw SQL, which
at worst is hard to do and at best tends to massively reduce the value
proposition of the ORM framework, or you have to try to tweak your code to get
it to generate the query that you want, which is often frustrating and far
more difficult than just writing the SQL yourself. I don't think I'm that much
in the minority of having an experience like, "Hmm, the query I really need to
write needs to use an ORDER BY statement that includes a few case clauses . .
. now how do I convince this query-generation framework to spit that out so
that I don't have to pull back all the results and do the sorting in memory?"
It's also worth mentioning that caching doesn't help tune writes, so if
scaling your product requires scaling writes, you're probably going to be
mucking around in SQL land.
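
To be concrete about the kind of query I mean, here is a minimal sketch against SQLite. The `tickets` table and its values are made up; the point is the ORDER BY with a CASE expression, which is trivial in SQL and often awkward to coax out of a query generator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT, opened TEXT);
    INSERT INTO tickets VALUES
        (1, 'closed', '2011-06-01'),
        (2, 'urgent', '2011-06-03'),
        (3, 'open',   '2011-06-02'),
        (4, 'urgent', '2011-06-01');
""")

# Urgent first, then open, then everything else; oldest first within each group.
ordered_ids = [row[0] for row in conn.execute("""
    SELECT id FROM tickets
    ORDER BY CASE status
                 WHEN 'urgent' THEN 0
                 WHEN 'open'   THEN 1
                 ELSE 2
             END,
             opened
""")]
```

The alternative is pulling every row back and sorting in application memory.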

There's a similar problem around query-generation layers that attempt to allow
you to just write normal methods and have things executed on the database;
because the code is so far removed from the SQL, it makes it really, really
easy to write really terribly-performing queries or to write things that will
do hugely unnecessary amounts of work.

On a more trivial point, the fetching all columns when you only need a subset
of them problem is really an issue sometimes, especially if you A) have to
join across a bunch of tables, B) the columns that you want could be retrieved
from indexes, rather than requiring actual row reads, or C) the columns that
you care about are strictly several removes away from the original search
table, but the ORM layer loads everything in between. (For example,
Foo->Bar->Baz, my WHERE clause is on Foo, but the only columns I care about
are the id on Foo, which is in the index, and a few columns on Baz . . . how
do I tell my ORM layer to load nothing else from Foo and nothing at all
from Bar? It's a different problem than pre-fetching, because I just don't
want anything loaded.)
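
Here is roughly what that Foo->Bar->Baz case looks like in plain SQL, sketched with sqlite3. The column names (`kind`, `payload`, `notes`, `label`, `score`) are invented; only the three-table shape comes from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, kind TEXT, payload TEXT);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER, notes TEXT);
    CREATE TABLE baz (id INTEGER PRIMARY KEY, bar_id INTEGER, label TEXT, score INTEGER);
    CREATE INDEX foo_kind ON foo (kind);
    INSERT INTO foo VALUES (1, 'a', 'big blob'), (2, 'b', 'big blob');
    INSERT INTO bar VALUES (10, 1, 'unused'), (20, 2, 'unused');
    INSERT INTO baz VALUES (100, 10, 'x', 5), (200, 20, 'y', 9);
""")

# Filter on foo, join through bar, but select nothing from bar
# and only the id from foo.
rows = conn.execute("""
    SELECT foo.id, baz.label, baz.score
    FROM foo
    JOIN bar ON bar.foo_id = foo.id
    JOIN baz ON baz.bar_id = bar.id
    WHERE foo.kind = ?
""", ("a",)).fetchall()
```

Bar participates only as a join path; neither `foo.payload` nor `bar.notes` is ever read into the application.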

Now, that's not to say that ORM layers can't be made to perform; of course
they can, pretty much all of them have the sorts of hooks you describe, and
there's plenty of empirical evidence to that effect. But sometimes the way you
make them perform is by just bypassing them.

There's another abstraction point, which is that supporting multiple databases
often leads to a least-common-denominator functionality approach; for example,
if you want to use a db-specific spatial data type, the ORM has to either
provide db-specific functionality, or it might just not support handling that
type of data well. The same often goes for things like db-specific functions
or query hints; if the ORM layer doesn't handle those things for you, you have
to bypass it and drop into raw SQL if you need them.

So really, the argument is not, "ORMs are not functional and no one should
use them," it's related to the value proposition of an ORM layer. The value
proposition is "This tool will make your life easier, will save you from
having to write SQL, and will help you work across multiple databases." If the
tool makes life harder than it otherwise would be, then it's not useful, even
if it's still possible to do work in it.

So the question is largely around whether or not they make life easier.
In the simple case, I think the answer is that yes, they do: they make it
easier for beginners to get off the ground, they make it easy to do simple
queries and writes, and the performance probably doesn't matter anyway.

When things get more complicated, though, the question becomes a lot less
clear. Yes, the ORM layer makes it easier to have structured queries that can
be cached . . . it also makes it harder to have one-off queries that can be
tuned easily based on exactly what data is needed and tweaked to convince the
database to generate the right query plan, and it makes it much harder to look
at some DB stats, identify a poorly-performing query, and then map it back to
the code that generated that query. I know of applications that have basically
had to bypass ActiveRecord more and more as they scaled to just do raw SQL
queries because making ActiveRecord perform was simply too hard or not
possible.

So personally, I prefer an ORM approach that does minimum stuff to let me do
the simple things simply (pull rows back and map them to an object, execute
simple queries directly on that table), but that's designed from the ground up
with the idea that dropping straight into SQL is a normal, accepted part of
the workflow, rather than some one-off thing that you should rarely do. But it
really depends on your project and your comfort level with SQL.
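
A rough sketch of what such a minimal layer could look like, using only the Python standard library. The `User` class, the `users` table and the helper names `fetch_all` and `get` are assumptions for illustration, not any particular library's API.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class User:
    id: int
    name: str
    email: str


def fetch_all(conn, cls, sql, params=()):
    """Run arbitrary SQL and map each row onto cls by column name."""
    wanted = {f.name for f in fields(cls)}
    cur = conn.execute(sql, params)
    cols = [d[0] for d in cur.description]
    return [cls(**{c: v for c, v in zip(cols, row) if c in wanted})
            for row in cur]


def get(conn, cls, table, pk):
    """The simple case: load one row by primary key.

    `table` is interpolated into the SQL, so it must come from the
    developer, never from user input.
    """
    cols = ", ".join(f.name for f in fields(cls))
    found = fetch_all(conn, cls, f"SELECT {cols} FROM {table} WHERE id = ?", (pk,))
    return found[0] if found else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                 [(1, "ann", "ann@example.com"), (2, "bob", "bob@example.org")])

one = get(conn, User, "users", 1)
# Hand-written SQL goes through the same mapper, so it is not a special case.
dot_org = fetch_all(conn, User, "SELECT * FROM users WHERE email LIKE ?", ("%.org",))
```

The design choice is that `get` is just a thin convenience over `fetch_all`, so writing the SQL yourself costs nothing extra.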

~~~
DavidMcLaughlin
> Yes, the ORM layer makes it easier to have structured queries that can be
> cached . . . it also makes it harder to have one-off queries that can be
> tuned easily based on exactly what data is needed and tweaked to convince
> the database to generate the right query plan, and it makes it much harder
> to look at some DB stats, identify a poorly-performing query, and then map
> it back to the code that generated that query.

The problem I have with your post is that you are repeatedly mistaking the
high level idea of an ORM with the (seemingly) Spartan implementations that
you have used.

Here is an ORM that provides support for automatically profiling the
performance of queries over a period of time:

<http://squeryl.org/performance-profiling.html>

Sample output:

<http://squeryl.org/profileOfH2Tests.html>

I cannot begin to tell you how much time this has saved me when optimising
performance of my webapp.

Note that that particular ORM also has the advantage of having type safe
queries - i.e. it can tell at compile time if there's a syntax error in your
query (subject to bugs in the ORM :)) - even in dynamically generated queries.
In practice this is a fantastic feature as it is so much safer than building
up SQL queries with string manipulation and dealing with multiple code paths
that depend on user input. The test paths alone in such code (even if you have
a "query builder" layer) are the stuff of nightmares.

There are many features missing from Squeryl though that I've had in other
ORMs because it makes different trade-offs. But this is what you do when you
choose a library, and it's important to understand what trade-offs you're
making upfront... otherwise you might find yourself writing off an entire
approach to software development as an anti-pattern because you picked the
wrong library.

~~~
akeefer
I think you're missing my point on the performance side; yes an ORM layer can
help you identify slow queries, but it's pretty much the database query plan
that will tell you why it's slow. Is it doing a full table scan instead of
using an index? Is it applying joins in a sub-optimal order? Are the
statistics just off, which causes it to use a bad query plan? At that point
you're already in SQL/DBA land, but now you have to map that knowledge back to
the ORM layer to fix things.
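
As a small illustration of the "full table scan instead of an index" question, this sketch asks SQLite for its query plan before and after adding an index. The `orders` table and the index name are assumptions; other databases expose the same idea through their own EXPLAIN syntax, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 50, i) for i in range(1000)],
)

QUERY = "SELECT id, total FROM orders WHERE customer_id = ?"


def plan(conn, sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


# Without an index the planner has to scan the whole table.
before = plan(conn, QUERY, (7,))

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index in place the plan should mention it by name.
after = plan(conn, QUERY, (7,))
```

Nothing in this check involves the ORM, which is the point being made: the diagnosis happens in the database's terms, and the fix then has to be carried back to whatever generated the SQL.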

My experience with type-safe query layers is that they tend to be incomplete;
they simply don't let you generate the full range of SQL queries because
you're restricted by the language's type system. That said, I'm not
particularly familiar with squeryl (and Scala's type system is certainly more
expressive than most statically-typed languages), so I can't say what its
limitations are, I can only make general statements.

Anyway, I think it's fair to say it's difficult to talk about ORM generally
due to the differences between frameworks and approaches. So I'll try to
phrase things more clearly, and say that I think the author's original intent,
and the part I agree with, is the fundamental premise that ORM abstractions
are inherently leaky and that performance needs often result in a desire to go
around the ORM framework to handle something more natively in SQL. Some ORM
frameworks embrace those limitations, and allow you to use them when you want
to and to work around them when you don't; other frameworks fight that
limitation and attempt to swallow the world such that you never have to leave
the ORM framework, and those frameworks tend to be the ones that become
frustrating to work with.

So if I were to attempt to charitably read the original post, I'd say that
perhaps saying it's an "antipattern" is taking it too far, but saying that
it's a fundamentally flawed, leaky abstraction is totally accurate, and that
recognizing that it's fundamentally leaky means that you, as a developer,
should probably take that into account in your application design and your
library selection, and that there are some techniques that might help you to
do that.

~~~
lloeki
In this discussion, everyone seems to broadly assume that it's an
all-or-nothing affair.

More precisely, it's a cost-benefit decision. If most queries you will make
are hampered by the ORM then by all means, don't use one. But if like in many
(most?) situations, an ORM greatly abstracts and eases design and development
for 90%+ of your operations and you have like 10% queries to be either tuned
or handwritten (even if it has to be handwritten against multiple database
types to preserve portability), then an ORM is a net benefit.

Compare it to inline asm in C, or C modules in Python: the higher level stuff
makes it efficient to work with top level concepts 90% of the time, but
sometimes you have to go down to be actually efficient, or even simply be able
to do something, even if that means losing some form of independence (which
would then mandate writing the same function for a different platform if you
want to preserve portability). Not only is it not an anti-pattern, it by no
means follows that the abstraction is fundamentally _flawed_.

The very idea that "going around" an ORM is somehow proving that ORMs are
flawed is simply wrong. There are problems that ORMs are built to solve, and
there are problems they can't ever solve. "Going around" is part of the deal
because it's not a "work around", it's a "work together".

This is very visible in the article, especially the moment the author states
that _"I claim that the abstraction of ORM breaks down not for 20% of
projects, but close to 100% of them"_. Indeed this is the case, but for close
to 100% of those close-to-100%-projects where it "fails", the ORM is helpful
for managing 90% of data access implementation. The 10% remainder may need to
be implemented at a lower level, but wouldn't it be silly to spend a lot of
time hand-writing the 90% of the code that gets used 10% of the time? That is
what ORMs are for: they free up time to develop the 10% of code that is
critical both in usage volume and in performance. Of course such ratios are
highly project dependent, which is what warrants a thoughtful analysis to
select the right tool for each task, _of which there can be multiple in a
single project, or even a single object_. ORM, just like NoSQL, is simply not
the be-all and end-all solution, yet that does not make it any less valid a
pattern.

(edit: cosmetic/typo)

------
jgrahamc
One of the problems I frequently see is that people complaining about ORM and
SQL are thinking mostly of some object wrapping a row (or set of rows) in a
table. Then they get into trouble when they want to wrap something more
complex involving joins between tables.

All these problems would disappear if people used database views. Then their
nice ORM layer (say ActiveRecord) would work perfectly and the nasty joining
and updating would be taken care of by the database. I've often wondered if
people even realize that database views exist and how powerful they are:
<http://en.wikipedia.org/wiki/View_(database)>

Of course, it's only relatively recently that MySQL has started supporting
views properly (in 5.0).

The other nice thing about views is that it means your code using the ORM is
simplified because you aren't indirecting through different objects to get at
specific values you need to display. It also means that only the necessary
data is retrieved from the database.
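
A minimal sketch of the idea with SQLite, assuming a made-up `authors`/`posts` schema and a view named `post_summaries`. To calling code, or to an ORM model pointed at it, the view looks like a single flat table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY,
                        author_id INTEGER REFERENCES authors(id),
                        title TEXT, body TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'Views', 'long body'),
                             (2, 2, 'Joins', 'long body');

    -- The join lives in the database; callers never write it.
    CREATE VIEW post_summaries AS
        SELECT posts.id AS id,
               posts.title AS title,
               authors.name AS author_name
        FROM posts
        JOIN authors ON authors.id = posts.author_id;
""")

# Only the three columns the page needs come back; posts.body is never fetched.
summaries = conn.execute(
    "SELECT id, title, author_name FROM post_summaries ORDER BY id"
).fetchall()
```

One caveat for this particular engine: SQLite views are read-only unless you add INSTEAD OF triggers, so the "updating" half of the argument depends on which database you use.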

~~~
G_Morgan
That is the other problem. Every database is a mash of semi intentional subtle
incompatibility with the standard and a host of non-standard features. When an
ORM comes along many try to expose the non-standard features that make sense
in some way but end up needing a custom solution for each DB (see how you do
an auto sequence ID for an entity bean between Postgres and MSSQL).

So you have a semi portable layer interfacing with a semi portable
environment.

------
gerardo
The relational-object problem is called Object-Relational impedance mismatch,
duh! <http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch>

For me, fast Web Application development is worth the tradeoff. I usually
begin to hate sql on the second month of a project.

~~~
wccrawford
I love SQL. What I hate is the tedium of turning the results into something
useable. ORMs eliminate that pretty handily.

As they say, "If it didn't exist, I'd have to invent it."

------
gte910h
This is a person who doesn't write many large scale systems:

You will have considerably more (sometimes serious) bugs if you write _all_
your SQL by hand all the time in a app that uses a lot of DB queries.

Yes, you still need to understand what the ORM does when you do certain
things, you still need to understand what nasty joins you're writing and all
that. But you can let all the minutiae of what you DO write work out well in a
rote, well tested manner.

The article smells a bit of a guy who didn't know SQL or had a team member who
didn't, and they thought just using an ORM would work.

If your app is successful, you will usually need to optimize things. But this
is true for SQL or any time saving abstraction as well, not just ORMs.

~~~
hahainternet
What I found most telling was the talk about doing complex joins in your
application.

You do them in views and stored procedures. Your application code should be
distinct from the data it sources, so people don't have to wade through 400
lines of terrible PHP just to change 'fullname' to 'concat(etc)'.

Don't torture your DBAs.

~~~
dazzer
I agree with rimantas. Views and Stored Procedures should not be used to
perform business logic stuff i.e. I should not have to create a view with
massive joins just because a logic need requires it. And I should not have to
dive into my database when a business rule changes!!! Views and Stored
Procedures are useful when you lack a layer of abstraction (e.g. An MS Access
FrontEnd) where you may want to put security restrictions on the data that is
exposed to a particular group of users, OR for performance reasons where a
reasonably complex query can be run faster as a stored procedure. Of course
this is my personal view, and is definitely a point of contention for many
people.

The whole point of an ORM is to abstract the data from application code.
Business Logic can be built on top of it with minimal knowledge of the
underlying data storage system except in exceptional cases. ORM frameworks aim
to simplify the process of writing this boilerplate code and continue to
fulfil most common use cases.

------
code_duck
I had a _lot_ of problems working with ORMs when I was 1-2 years into
programming. However, I also felt a lot of resistance to learning to use a
framework vs. straightforward, procedural code for web apps.

It's a matter of wanting to take the time to learn _another_ system, API, DSL,
what-have-you just in order to work with something you already know - SQL. The
dislike of HQL resonates with me - I was wondering why I would ever work with
PHP Doctrine's DQL. Building SQL queries out of a sequence of OO method calls
seems absurd, too. As the article and comments note, you shouldn't _have_ to
know SQL well to use an ORM.

There are definitely issues with the ORM/Framework working against you, too. I
love the organization and features in Rails or Django, but I hate when I spend
hours working out how to do something that would take 5 minutes in plain PHP.
Same with ORMs. Getting them to do the right type of join, not make
unnecessary calls, etc. can be a pain. Sometimes it's that I don't know the
software well enough, which could either be my own problem or just a
reasonable lack of desire to devote my brain to it. Other times it's that the
given ORM really does have shortcomings, conceptually and in its level of
development.

The one ORM I've had the most luck with is Django's. It's straightforward,
does what I want, is well documented, and doesn't have _too many_ features.

------
Swannie
In the comments I'm noticing no one asks: when should or shouldn't you use an
ORM?

Most of the discussions are over the merits of either approach, when to me it
seems an ORM has many places it belongs. And a few it doesn't.

For most database of record systems, which are a large chunk of your average
webapp, an ORM is a godsend. When I say DBOR, I mean things like articles,
posts, comments, users, products, transaction history. An ORM saves a large
amount of work writing SQL, it covers 95% of your queries (particularly
insert,update,delete and simple gets) with minimal effort. You create the
model, and let it get dealt with by the ORM. Your objects are mainly records.
The pain comes when you start wanting to do analytics and interesting reports
- but stick with a reporting tool, and keep this out of your application, and
you feel less pain.

But this breaks down when you move to a database that represents a complex
real world system. If you're working on a model that represents, for example,
an electrical distribution system, these are not really records. They
represent a vast set of complex interrelations. Of course there are still
records, but in isolation, away from the complex relationship of say
pole->{location,type,maintenance history,conductors,insulator type}, and
conductor->{poles traversed,length,a end location,a end join type,b end
location, b end join type,material,material batch number,power circuit
carried} etc. etc. Then your queries to "find all customers affected by the
pole at these coordinates", requires joins through: pole, conductor, circuit,
serviced area, customers... we're moving rapidly to lots of complex queries,
where hand crafting really is the way to go.

------
stcredzero
_Death by a thousand queries_

Wait a moment here. In my experience, most problems like this can be solved by
noting something like: When we get X, we also get all of the associated Y's
and their Z's. Declarative association of a Batch Query with certain
retrievals isn't a newbie project, but it's something a lone programmer can
put together in a week in a good ORM. (I've done it.) I would expect this
library feature to be common in the Ruby/Python world.
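
To show the difference in query counts, here is a sketch with plain sqlite3 rather than any real ORM. The `x`/`y` tables and the helper names are assumptions; a declarative version would generate the batched form from a description of the association instead of hand-coding it.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE y (id INTEGER PRIMARY KEY,
                    x_id INTEGER REFERENCES x(id), label TEXT);
    INSERT INTO x VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO y VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")

queries = []  # log of every statement issued, so the strategies can be compared


def run(sql, params=()):
    """Execute a query and record it in the log."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def load_naive():
    """One query for the X rows, then one more per X."""
    result = {}
    for x_id, _name in run("SELECT id, name FROM x ORDER BY id"):
        ys = run("SELECT label FROM y WHERE x_id = ? ORDER BY id", (x_id,))
        result[x_id] = [label for (label,) in ys]
    return result


def load_batched():
    """One query for the X rows, one for all of their Ys, stitched in memory."""
    ids = [x_id for x_id, _name in run("SELECT id, name FROM x ORDER BY id")]
    grouped = defaultdict(list)
    if ids:
        marks = ", ".join("?" for _ in ids)
        sql = f"SELECT x_id, label FROM y WHERE x_id IN ({marks}) ORDER BY id"
        for x_id, label in run(sql, ids):
            grouped[x_id].append(label)
    return {x_id: grouped[x_id] for x_id in ids}


naive = load_naive()
naive_queries = len(queries)

queries.clear()
batched = load_batched()
batched_queries = len(queries)
```

With three X rows the naive loader issues four queries and the batched one issues two; the gap grows linearly with the number of X rows, while the batched count stays fixed per level of association.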

~~~
jeswin
It has been a supported feature in just about every other ORM too (Hibernate,
Entity Framework, Linq-to-sql..), for ages.

The author doesn't seem to have understood ORMs well enough.

~~~
Swannie
I think the author's point is that by the time you have gone to learn all of
the complexities of an ORM, and all the steps you must take to sidestep when
you can't bend it to shape... you could have much more easily written it by
hand.

~~~
stcredzero
If you have a Declarative Batch Query, then you get a bunch of SQL and
plumbing code for free. Writing the join and the iterative code "by hand" to
do the same thing is 10x more work.

------
Spyro7
I think that designating ORMs as "anti-patterns" is a bit strong. Perhaps my
understanding of what an ORM is supposed to accomplish is different from the
author's, but I think that some of the criticisms that the author levels
against ORMs are a bit off.

Inadequate abstraction - I would make the argument that it doesn't make sense
to expect an ORM to be able to completely abstract away from the underlying
database. The reason that the documentation of the various ORMs is sprinkled
with SQL concepts is that the ORM is providing a window into an SQL-based
environment. I would never

Incorrect abstraction - I actually agree with this point, but this does not
really seem to reflect on ORMs. This point has much more to do with the
ongoing debate between the NoSQL movement and relational databases.

Death by a thousand queries - I hardly think that this is a knock against all
ORMs. Different ORMs have different solutions (or a lack thereof) to this
problem. I use Django a lot, and Django's built-in ORM offers a lot of
"frills" that can help to protect against this (lazy loading, selective
loading of columns, selected loading of related models). I know that, in the
Ruby world, Datamapper seems to have some ways of dealing with this problem as
well. It really isn't as simple as saying all ORMs do this therefore all ORMs
are bad. The reality is more nuanced.

Ultimately, my principal problem with this piece is that it seems to conflate
its argument for NoSQL and its argument against ORMs. NoSQL is wonderful, but
it seems to be somewhat orthogonal to the value of ORMs.

ORMs are not perfect, and there is plenty of room for improvement; however,
writing everything in SQL solely due to performance fears will usually turn
out to be a case of premature optimization.

~~~
riffraff
FWIW even Rails' ActiveRecord has the same frills (lazy/eager loading,
selective loading of columns, loading related models via join or via grouped
selects). It even has modules that do these fixes automatically :)

------
mgkimsal
One other thing struck me reading this - it feels like premature optimization.
Assuming that every ORM is going to be slow and inefficient to the point where
you'll need to override or rewrite all the queries will lead to an inefficient
use of developer time, and assumes you know a lot about what will matter under
real world use conditions.

Yeah, sure, that ORM is adding 200% overhead to the SQL query - it's pulling
back _30_ columns instead of _4_! And... it's taking 38 milliseconds and is
run 4 times per day. So what?

And when the model changes and you have an extra few columns to represent more
data? You've now got to hunt through every SQL query that could possibly
reference that table and make sure it's dealing with the new columns
appropriately, instead of having an ORM let the computer do what computers do
- compute the changes required.

Yes, there are other anecdotes that can be trotted out to prove the opposite
of my 38ms story above. Then we'll fall back to 'right tool for the right
job', and ORMs are currently a good middle ground tool for many of the
projects people are developing. Perfect? No. Useful? Yes.

~~~
keithnoizu
Mhh, the premature optimization school of thought. KISS is good and all, but I
think Einstein had it right with his "Everything should be made as simple as
possible, but not simpler."

A modest amount of additional upfront work is probably worth it if it saves
you time and effort in the long run. It's the same reason why it's probably
worthwhile to learn a framework rather than rolling your own organic solution,
setting up templating, using version control from the get go. They all add
upfront complexity but pay off in the long run.

------
cturner
The article makes a strong case against ActiveRecord, not against Object
Relational Mapping.

Under the heading "The problem with ORM" the author writes,

    
    
        The most obvious problem with ORM as an abstraction is
        that it does not adequately abstract away the implementation
        details. The documentation of all the major ORM libraries is
        rife with references to SQL concepts. Some introduce them
        without indicating their equivalents in SQL, while others
        treat the library as merely a set of procedural functions for
        generating SQL. 
    

This is true of ActiveRecord, it's untrue of object graph ORMs like Apache
Cayenne.

I find two patterns to be key to effective ORM:

* Data Access Objects. This is for when you have nothing, and want to get an entrypoint into the schema. In this case, you should be able to write near-pure SQL to get what you want.

* Entity Objects. This is what the DAO will give you back - either an individual or a list. Each instance represents a row in a table, and has methods that will do lookups to foreign keys. Once you have this, you have an entrypoint into the data graph, and can use foreign keys to crawl around to wherever you need to go.

The DAO layer is a simple, centralised place where you can implement
permissioning logic.

If you need to do something high-performance (usually some sort of report),
you create a custom DAO, and have it return custom entities (instances of
classes that don't have 1 to 1 association with a table) that fit your need.
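
A compact sketch of the two patterns in Python with sqlite3. The `departments`/`employees` schema and all class names are assumptions for illustration; a real object-graph ORM would generate most of this from the schema.

```python
import sqlite3


class Department:
    """Entity: one row of the departments table."""

    def __init__(self, conn, row):
        self.id, self.name = row["id"], row["name"]


class Employee:
    """Entity: one row of employees, able to follow its foreign key."""

    def __init__(self, conn, row):
        self._conn = conn
        self.id = row["id"]
        self.name = row["name"]
        self.department_id = row["department_id"]

    def department(self):
        """Crawl the graph: look up the related row only when asked."""
        return DepartmentDAO(self._conn).by_id(self.department_id)


class DepartmentDAO:
    def __init__(self, conn):
        self.conn = conn

    def by_id(self, dept_id):
        row = self.conn.execute(
            "SELECT id, name FROM departments WHERE id = ?", (dept_id,)
        ).fetchone()
        return Department(self.conn, row) if row else None


class EmployeeDAO:
    """Entry point into the schema: near-pure SQL in, entities out."""

    def __init__(self, conn):
        self.conn = conn

    def where(self, condition, params=()):
        # `condition` is a SQL fragment written by the developer;
        # user-supplied values must go through `params`.
        sql = "SELECT id, name, department_id FROM employees WHERE " + condition
        return [Employee(self.conn, r) for r in self.conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets entities read columns by name
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT,
                            department_id INTEGER REFERENCES departments(id));
    INSERT INTO departments VALUES (1, 'eng'), (2, 'ops');
    INSERT INTO employees VALUES (1, 'ann', 1), (2, 'bob', 2), (3, 'amy', 2);
""")

staff = EmployeeDAO(conn).where("name LIKE ? ORDER BY id", ("a%",))
dept_names = [e.department().name for e in staff]
```

Because every read starts at a DAO, that is also the one place where a permission check could be added, and a reporting DAO could return a different, non-table-shaped class from the same kind of method.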

I've found that after a certain point of complexity in an application, it
becomes impractical not to use an ORM. It's like working in a type-unsafe
language. You refactor something, and SQL-in-code breaks all over the place.
That path leads to the hiring of dedicated DBAs, and abstraction of the schema
behind stateless layers of PL/SQL in a doomed attempt to get to grips with the
complexity of the problem space.

I worked on a system with a very tough customer where they repeatedly demanded
major schema changes that were sitting in front of a business logic layer and
frontend that had already been written. While the project had lots of
problems, those particular refactorings were very straightforward. I was able
to modify the ORM, and then just fix compilation problems and a few obvious
tentacles from them until the application recompiled, at which point it worked
again.

Some more criticism:

    
    
        This leads naturally to another problem of ORM:
        inefficiency. When you fetch an object, which of its
        properties (columns in the table) do you need? ORM can't
        know, so it gets all of them (or it requires you to say,
        breaking the abstraction).
    

I'm rusty, but I remember that at least in WebObjects EOF you can
nominate what you want to retrieve, including automatic joins to retrieve
stuff over foreign key jumps.

The author's first suggested alternative "Use objects" offers worse technical
debt than ActiveRecord. I anticipate there are a lot of shitty systems being
built on top of key-value stores. You can get fast results doing it, but it has
technical debt and doesn't scale horizontally. The key-value store is becoming
the next generation equivalent of "Oh we'll just build it in excel, and worry
about the consequences later on". But it depends on what you're doing. There are
situations where foreign keys are good.

The second alternative is "Use SQL in the Model". The advice of the heading
doesn't match the content of the text that follows. I think the author means
to recommend building a service that wraps the model by answering questions.
If not, that's the point I think that should be made.

It's common for companies to create a database, and then have many entrypoints
into it. This is a mistake and creates technical debt. As soon as you have
multiple entrypoints like this, you lose ability to refactor your schema
(because it's impractical to get multiple stakeholders to make concurrent
changes) and your system rots.

Instead, you create a model service that wraps the schema, but also has
stateful knowledge. For example - it knows the permissions of the user who is
talking to it and can tailor its response based on their permissions. Then you
return results in a transport format. I can't recommend a good, mainstream
mechanism for this. JSON and YAML are fiddly because they're typed; XML is
unnecessarily verbose.

Anyway - there's no reason not to use a good ORM in this business logic layer.
For small systems - sure - use SQL in the model. For the larger stuff, you
have a more maintainable system if you use an ORM. But if it's a complicated
space, steer towards Cayenne or Hibernate, rather than active record patterns.

~~~
bad_user
I never found anything wrong with Active Record patterns, in all the projects
I worked with. Of course, it depends on the ORM and it's a pattern that lends
itself especially well with dynamic languages (IMHO, Hibernate sucks).

Of course an ORM will have SQL concepts because that's the nature of the
beast. You can't abstract those away.

There is also nothing wrong with _generating SQL_ for you, as generating SQL
by hand is a task that involves string concatenation. And when adding filters
to that SQL based on lots of runtime conditions, it gets freaking painful to
do it - it's a lot nicer to work with expressions that are building a syntax
tree, to which you can add and add whatever you need.
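
A toy version of that idea, to show why it beats string concatenation. The `Cond` and `Group` classes and the `user_filter` function are hypothetical names, not any ORM's real API; each node carries its own bound parameters, so runtime conditions can be combined without ever splicing values into the SQL text.

```python
from dataclasses import dataclass


class Node:
    """Base for expression nodes; & and | build a larger tree."""

    def __and__(self, other):
        return Group("AND", (self, other))

    def __or__(self, other):
        return Group("OR", (self, other))


@dataclass(frozen=True)
class Cond(Node):
    """Leaf: a SQL fragment plus the values for its placeholders."""
    sql: str
    params: tuple = ()

    def render(self):
        return self.sql, self.params


@dataclass(frozen=True)
class Group(Node):
    """Inner node: child expressions joined by AND or OR."""
    op: str
    parts: tuple

    def render(self):
        rendered = [part.render() for part in self.parts]
        sql = "(" + f" {self.op} ".join(s for s, _ in rendered) + ")"
        params = tuple(v for _, ps in rendered for v in ps)
        return sql, params


def user_filter(min_age=None, country=None, include_admins=False):
    """Build a WHERE clause from whichever runtime options are set."""
    tree = Cond("active = ?", (1,))
    if min_age is not None:
        tree = tree & Cond("age >= ?", (min_age,))
    if country is not None:
        tree = tree & Cond("country = ?", (country,))
    if include_admins:
        tree = tree | Cond("role = ?", ("admin",))
    return tree


where_sql, where_params = user_filter(min_age=18, include_admins=True).render()
```

The placeholders and the parameter tuple stay in step however many branches fire, which is the part that gets painful to keep right by hand.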

Also, about efficiency, the dumb ORM in Django can do this (prefetching
relationships on one level):

    
    
        user_rs = User.objects.filter(groups__name='Badass')\
           .select_related(depth=1)
    

It can also do this:

    
    
        rs = user_rs.values_list('groups__name', 'username')
    

Which returns an array of (group, name) tuples. No inefficiency there and no
leaking abstraction either.

You can also drop to pure SQL, if you're so inclined for efficiency reasons.
In that SQL you can also restrict the columns you're selecting, and so the
model objects returned will lazily load missing attributes on request ...

    
    
        user = User.objects.raw('select id, username from auth_user limit 1')[0]
    
        # already retrieved
        user.username
    
        # this works, but a second query is made
        user.first_name
    

So, yeah, people can use raw SQL or whatever fad-du-jour they want. I'll just
be busy working on stuff.

~~~
cturner
The way I think about it, active record is the strong option for crud, but
doesn't scale well past it.

Once you're dealing with many tables I think you need the ability to
accumulate a graph representing a new state and then commit it as a
transaction.

While you could do this in an adhoc manner with active record, it's not a way
of life in the way it is in EOF/Cayenne.

There are complications of the object to relational mapping that it doesn't
deal with. For example - no mechanism for dealing with situations where
someone changes data underneath the object graph a user is working against in
memory. AR would blast over the top of the data with what it has in memory,
potentially breaking in the process.

Some ORM systems have locking levels. For example, with 'optimistic locking'
you get an exception if data changed underneath you. It's handy to be able to
catch an exception, email the support team, and tell the user to freeze and
contact them, in a way that doesn't discard the data they've just entered.
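
One common way to get that behaviour is a version column, sketched here with sqlite3. The `accounts` table, the `StaleRecordError` name and the dict-based record are assumptions for illustration; ORMs that offer optimistic locking wrap the same check behind their save call.

```python
import sqlite3


class StaleRecordError(Exception):
    """Raised when the row changed after we read it."""


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, "
    "version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO accounts VALUES (1, 'ann', 1)")


def load(conn, account_id):
    """Read the row together with the version it had at read time."""
    row = conn.execute(
        "SELECT id, owner, version FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return {"id": row[0], "owner": row[1], "version": row[2]}


def save(conn, record):
    """Write only if the version we read is still current, then bump it."""
    cur = conn.execute(
        "UPDATE accounts SET owner = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (record["owner"], record["id"], record["version"]),
    )
    if cur.rowcount != 1:
        raise StaleRecordError(f"account {record['id']} was modified elsewhere")
    record["version"] += 1


# Two users read the same row, then both try to write.
first = load(conn, 1)
second = load(conn, 1)

first["owner"] = "bob"
save(conn, first)

second["owner"] = "carol"
try:
    save(conn, second)
    conflict = False
except StaleRecordError:
    conflict = True  # the second user's in-memory data is still intact

final_owner = conn.execute("SELECT owner FROM accounts WHERE id = 1").fetchone()[0]
```

The second write is rejected instead of silently overwriting the first, and `second` still holds what that user typed, so it can be shown back to them or sent to support.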

I find it backwards that active record exposes methods like insert, delete,
update. These are low-level sql concepts, and you shouldn't need to care about
them within an object system where you're thinking in terms of a graph of
data. It's very wired to the database though. As far as a "mapping" between
object and relational, active record is crudely simple.

On a positive note, you can hammer an AR out in your language of choice from
memory, without reference to any external libraries or the like. It's a good
wrench in the toolbox.

~~~
bad_user
Yeah, but the relational model is terrible for representing graphs, no matter
the tools you use, you are still bound by the representation you choose in a
relational database.

What I like about AR is precisely its close correspondence with a relational
database. Objects have create/save because they correspond to DB-tables and
that's not something I want to forget when working with such a database.
Personally I want to control when a specific object gets persisted, as to me
that point is crucial, both for performance and for data integrity.

When the relationships between tables get too complex, that's a good time to
reevaluate your data-structures and how you process that data -- as in, do you
really need a live graph inside your requests, or will a pre-processed view
do, while having hooks for inserting new data and rebuilding your models
asynchronously (on a single thread to avoid race conditions)?

Also, there are some databases around that are specialized on storing graphs.
Even if the relational model is very general-purpose, sometimes you're better
off choosing a system that's more suited for your goals.

    
    
        a mechanism for dealing with situations where 
        someone change data underneath the object graph 
        a user is working against in memory
    

I kind of understand what you're saying, but you should avoid sharing of
resources like that. Choose stateless requests, even when building non-web
apps.

Do not let users trip over each other and if you do, construct a trail of
actions taken to be able to Undo (as in, the Command pattern).

~~~
cturner
I like your points.

It clarifies for me that the goals of Active Record and other forms of ORM are
so different that it's misleading to categorise them together. Active Record
presents a simple and unambiguous interface to the way things are, [other
systems that I don't have a noun for] try to abstract it away.

The way we use databases is interesting. In the early days of computing, a lot
of the low hanging fruit was reached via software that wrapped databases. As a
result, the database tools reached awesome power, maturity, and mindshare. As
a result, they now get used for things well outside of their domain.

I suspect there are patterns that get squeezed out by this, and it's a topic
I'm interested in exploring.

------
geebee
I've used two different formal ORMs, ActiveRecord and JPA (backed by
hibernate), and I've never felt completely at ease with them. In fact, I was
lining up to agree that ORMs suck, except that I realized I'm probably using
one no matter what I do.

If I have a model object, and I want to persist it in a relational
database, then I'm going to need to do something that persists and retrieves
this object back and forth from the RDBMS, right? And if I want to retain the
flexibility to switch to a different database (or different persistence
strategy in general), then I'm going to need some way to specify the implementation
details for each possible approach, and swap them in depending on which
approach I take. The java world tends to handle this with an ORM and a DAO
tier these days, using DI to swap in the desired implementation (sigh, Java
really is a soup of acronyms these days), whereas Rails developers tend to use
migrations. But either way, I'm pretty much stuck with an ORM. It may be an
ORM that works at a very low level, directly with objects, sql, connections,
and transactions rather than through a higher level API, but it's still an
ORM... (right?)
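
A bare-bones sketch of that "low level ORM" in Python: one interface, two interchangeable persistence strategies, and application code that only sees the interface. `Note`, `NoteStore` and both store classes are made-up names for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Note:
    id: int
    text: str


class NoteStore(Protocol):
    """What the application needs from any persistence strategy."""

    def save(self, note: Note) -> None: ...
    def find(self, note_id: int) -> Optional[Note]: ...


class MemoryNoteStore:
    """Keeps notes in a dict; handy for tests."""

    def __init__(self):
        self._notes = {}

    def save(self, note):
        self._notes[note.id] = Note(note.id, note.text)

    def find(self, note_id):
        return self._notes.get(note_id)


class SqliteNoteStore:
    """Same interface, with the SQL in plain sight."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, text TEXT)"
        )

    def save(self, note):
        self._conn.execute(
            "INSERT OR REPLACE INTO notes (id, text) VALUES (?, ?)",
            (note.id, note.text),
        )

    def find(self, note_id):
        row = self._conn.execute(
            "SELECT id, text FROM notes WHERE id = ?", (note_id,)
        ).fetchone()
        return Note(*row) if row else None


def rename(store: NoteStore, note_id: int, text: str) -> Optional[Note]:
    """Application code: depends on the interface, not on a database."""
    note = store.find(note_id) or Note(note_id, "")
    note.text = text
    store.save(note)
    return store.find(note_id)


stores = (MemoryNoteStore(), SqliteNoteStore(sqlite3.connect(":memory:")))
results = [rename(store, 1, "hello") for store in stores]
```

Whichever store is passed in, `rename` behaves the same, and the object-to-row mapping is still there; it has just been written by hand.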

In spite of the inevitability of an ORM (I really hope I'm defining this
correctly), I'm going to agree with a lot of the points made in this blog
post. If I'm using SQL, I really don't like having the SQL hidden from me. And
I really can't stand languages (like HQL) that force me to re-learn a variant
of SQL. ActiveRecord and Migrations are, without question, very productive,
but I like to see the objects and understand very directly how they are being
persisted and retrieved. I want to see the fields and methods, and I want to
see the SQL. I've found that I almost always end up changing it, and it's
easier to do that when it isn't all hidden from me.

Rails offers me so much that I can get over this little issue of mine, but I
don't feel the same way about Hibernate. My personal experience is that if
Rails isn't going to help me, and something has pushed me to use Java, I
probably need to write a lot of lowish level code anyway.

------
pnathan
Well, I can't speak for others, but in my small-scale use of databases, hand-
coded SQL has never been an issue.

~~~
ry0ohki
Same here, ORM just adds another layer of frustration for me.

------
Lagged2Death
I'm a noob when it comes to designing and implementing programs that interface
with a relational database. Based on my small experience so far with an ORM,
I'd say this post is spot-on, clearly articulating the frustrations I've felt
on my project. A friend of mine even wrote a blog post about the problems I've
had:

<http://es-cue-el.blogspot.com/2011/03/entity-framework-and-inheritance.html>

That said, though, I do wish there were more detail on this point:

 _The programming world is currently awash with key-value stores that will
allow you to hold elegant, self-contained data structures in huge quantities
and access them at lightning speed._

I'd love to know more about such libraries, frameworks, or tools, but this
isn't a lot to go on.

~~~
rahoulb
I suspect a major problem with ORMs is that a number of people are using them
without a full knowledge of SQL beforehand. Relational databases are complex
and you need to understand what's going on if you want to make best use of
them.

I've never had any issues with ActiveRecord (although I agree that its mixing
of data access and business logic _can_ be a problem) - but I also know what
needs joins, which columns to include, which indexes to use, when to drop to
raw SQL; all from years of writing complex SQL and stored procedures by hand
(and I never want to go back to that). And I don't expect ActiveRecord to magically
guess that stuff for me.

------
kstrauser
I inherited an incredibly hairy, large, mission-critical database at my
current job. While we're slowly phasing in its replacement, we'll be
interacting with the current mess for a long time to come.

There are seemingly endless little insanities, like "this column references
upper(substr(othertable.column,5))". Instead of trying to remember all such
idiosyncrasies, I defined them all in SQLAlchemy and added a _lot_ of unit
tests to make sure I don't accidentally break one later. Now I can use
programmer-friendly ORM joins in production code and not have to worry about
getting all the weird rules right each time.
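
A rough sketch of the same idea, using only the standard library's `sqlite3` instead of SQLAlchemy: the odd join rule is written down once and pinned with a test. The table and column names are invented for illustration; only the `upper(substr(..., 5))` shape comes from the example above.

```python
import sqlite3

# The odd rule, written down exactly once. Table and column names are
# hypothetical; only the upper(substr(..., 5)) shape is from the comment.
LEGACY_JOIN = "orders.customer_code = upper(substr(customers.legacy_key, 5))"


def orders_with_customers(conn):
    """Return (order_id, customer_name) pairs using the centralised join rule."""
    sql = (
        "SELECT orders.id, customers.name "
        "FROM orders JOIN customers ON " + LEGACY_JOIN + " "
        "ORDER BY orders.id"
    )
    return conn.execute(sql).fetchall()


def make_test_db():
    """Tiny in-memory fixture that exercises the rule, including a non-match."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (legacy_key TEXT, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_code TEXT);
        INSERT INTO customers VALUES ('old-abc1', 'Acme'), ('old-xyz9', 'Globex');
        INSERT INTO orders VALUES (1, 'ABC1'), (2, 'XYZ9'), (3, 'NOPE');
        """
    )
    return conn


def test_legacy_join():
    # Order 3 has no matching customer, so it must not appear.
    assert orders_with_customers(make_test_db()) == [(1, "Acme"), (2, "Globex")]
```

With SQLAlchemy the rule would presumably live in the mapping itself rather than in a string constant, but the payoff is the same: production code never has to restate it.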

I'm perfectly comfortable working in SQL. I don't want to write it directly
all the time, though, any more than I want to have to write assembler all the
time.

------
cwp
Yup, spot on analysis.

I do think he missed one thing, however. The few times I've seen ORM layers
work well is when they're custom-built for a specific application. It's still
not ideal, but it lets you put all the ugliness in one place, and regain
efficiency by sacrificing generality.

------
d4nt
I'd say the problem is more with OO. The author touches on this a little
towards the end, but ORM fulfills the demand to put relational data into
objects. The issue is with all these developers insisting that they want to
think only in a limited set of types, even when the page they're rendering
needs half the properties of one type and a few more from another type.
Something like linq2sql can actually be the answer here, if you use it to
select an anonymous type containing exactly what you want, then you only need
one round trip and you haven't wasted any processing.
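
For readers outside .NET, a loose Python analogue of that projection: one query that returns only the fields a page needs, drawn from two tables, in a shape defined for that page. The `orders` and `customers` tables and the `OrderSummary` type are assumptions made up for this sketch.

```python
import sqlite3
from typing import NamedTuple, Optional


class OrderSummary(NamedTuple):
    """Exactly what one page needs: two order fields and one customer field."""

    order_id: int
    total: float
    customer_name: str


def load_order_summary(conn, order_id) -> Optional[OrderSummary]:
    # One round trip; no full Order or Customer object is ever built.
    row = conn.execute(
        "SELECT o.id, o.total, c.name "
        "FROM orders o JOIN customers c ON c.id = o.customer_id "
        "WHERE o.id = ?",
        (order_id,),
    ).fetchone()
    return OrderSummary(*row) if row else None
```

The unused columns (addresses, notes and so on) never leave the database, which is the saving being described.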

------
absconditus
"If your project really does not need any relational data features, then ORM
will work perfectly for you, but then you have a different problem: you're
using the wrong datastore. The overhead of a relational datastore is enormous;
this is a large part of why NoSQL data stores are so much faster."

I have never been able to receive a straight answer to this question: Is there
a "NoSQL" database that provides the same ACID properties that major RDBMS
databases do? Things like "eventual consistency" are entirely unacceptable for
the software that I work on.

~~~
smharris65
I'm not trying to say it's the best fit for you, but Apache CouchDB is fully
ACID compliant: <http://couchdb.apache.org/docs/overview.html>

------
dazzer
Interpreted pedantically, removing ORM techniques means dealing directly with
resultsets or loosely typed structures. This is definitely what I DO NOT want
in any of my views. If you're working with an OO language like C#.NET or Java,
good luck!

Any abstraction on any level is going to add a performance hit no matter what.

If this was really an issue, wrap your ORM Framework stuff (differentiating
from ORM the pattern) in a DAL layer so that your BLL does not worry about the
existence of the ORM. Then as you scale, optimise your DAL with either inbuilt
optimisations or when desperate write your own SQL (if you don't even know SQL
then you're a poor excuse of a developer)

Think of them like Ikea furniture - they don't look great, and they often
don't fit a household that has complex requirements. But they're
highly modular, and easy to assemble. So when you need something in a jiffy,
just bring it home, fix it up and it'll perform its purpose. When it no longer
fits the purpose, get something else. And every household has to just start
somewhere.

------
trustfundbaby
I'd like to learn more about the author's background with ORMs and especially
activerecord ... I know SQL pretty well, in fact when I started using Rails, I
insisted on still writing my own SQL queries by hand.

ActiveRecord might be an anti-pattern, or it might not ... I really couldn't
care less, what I do know is that I enjoy dealing with the database using
active record _far more_ than error prone dynamically constructed SQL queries
I was doing back in my PHP days.

It makes my life as a coder easier ... I mean, have you ever tried to
construct a really complex search function on a web app using SQL? ... it's a
pain and a half. ActiveRecord makes stuff like that much easier (named scopes
in Rails especially)
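
For anyone who hasn't had the pleasure, this is roughly what the hand-rolled version looks like, sketched in Python with `sqlite3`. The `products` table and the three filters are invented for the example. Every optional filter has to contribute a clause and its parameters in step, which is the bookkeeping that scopes take off your hands.

```python
import sqlite3


def search_products(conn, name=None, max_price=None, in_stock=None):
    """Build a parameterised WHERE clause from whichever filters were supplied."""
    clauses, params = [], []
    if name is not None:
        clauses.append("name LIKE ?")
        params.append("%" + name + "%")
    if max_price is not None:
        clauses.append("price <= ?")
        params.append(max_price)
    if in_stock is not None:
        # No parameter needed here, so params stays untouched.
        clauses.append("stock > 0" if in_stock else "stock = 0")

    sql = "SELECT name FROM products"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY name"
    return [row[0] for row in conn.execute(sql, params)]
```

Values always go through `?` placeholders; only fixed clause text is concatenated, which is what keeps this from being the error-prone kind of dynamic SQL.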

Yes, if you don't understand databases, you're going to use an ORM in shameful
ways, but it works well ... very well, if you know what you're doing and you
take the time to learn your craft.

I'm glad to have ActiveRecord in my tool belt every morning when I get to work
and that ... is what really matters to me.

------
pspeter3
I think that there are definitely some valid points about efficiency while
using ORMs when the queries get more complicated. However on a simpler scale,
like a blog, ActiveRecord or other ORMs aren't horribly inefficient and are
faster and easier for the programmer which is why people ultimately use them.

~~~
sawsaw
The point Seldo was making, however, is that for simple applications like this
a relational database doesn't make sense, and is in fact overkill.

~~~
ianterrell
Except RDBMSs are ubiquitous. Whether or not they're overkill, they're cheap,
easy, and well understood.

What web host in the world doesn't run MySQL? What platforms don't support
Sqlite3? What PaaS doesn't give you easy access to Postgres or another
database?

Compare that to support and sysadmin knowledge of Mongo, Redis, Couch,
whatever: there's no comparison.

It's only "overkill" in the very limited sense of, what, CPU cycles? Unused
relational algebra potential?

~~~
Joeri
I think the argument is that SQL+ORM is not that different from NoSQL. You
have a point that it's probably easier to use an ORM than a NoSQL store in the
real world.

------
skittles
If a project uses object-oriented programming and a relational database, it
will have an ORM. Either one written somewhere else (Hibernate, Entity
Framework, etc.) or one written in house (whether or not it is thought of as
an ORM). An ORM maps object data to relational data and back. That's all.

------
dasil003
I started writing a response in a comment, and then I started frothing at the
mouth, and pretty soon it ballooned into a whole blog post:

<http://news.ycombinator.com/item?id=2659442>

------
jtchang
So I don't have deep experience with ORMs except for SQLAlchemy. I've worked
with others such as Hibernate.

All I can say is that I love SQLAlchemy. Projects that use it make my life
easier. It is easier to read and _troubleshoot_ than pure SQL.

Just because the majority of ORMs suck doesn't mean they all suck. It's like
saying "web frameworks" are anti-patterns. ORM done right can be a godsend.
Like all patterns knowing when to apply it is key. Go around creating
FactorFactoryFactoryObjectFactory and of course you'd think it is an anti-
pattern.

------
vertice
god. thank you.

I have never met an ORM that didn't eventually rub me the wrong way.

~~~
geekfactor
As they say, familiarity breeds contempt.

I can name a hundred things my wife does that annoy me, but that doesn't mean
I'm not worlds better with her than without.

~~~
true_religion
True, but some people might be better off looking for new partners---and new
alternatives to ORMs.

------
danssig
>When you fetch an object, which of its properties (columns in the table) do
you need? ORM can't know, so it gets all of them (or it requires you to say,
breaking the abstraction).

Not true. If you take the NHibernate approach of returning a proxy argument
then you can get clients to "tell you" without breaking the abstraction. You
normally don't worry about this, though, because pulling 30 properties usually
isn't much different than pulling 3.
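
A toy illustration of the proxy idea in Python. This is not NHibernate's actual mechanism, just a sketch of the principle: the caller reads attributes as usual, and the proxy fetches a column only the first time it is touched. The `people` table, its columns and the `queries` counter are all assumptions for the example.

```python
import sqlite3

# Hypothetical schema; also serves as a whitelist for the column name below.
COLUMNS = ("name", "email", "bio")


class PersonProxy:
    """Stands in for a row of `people`; loads each column on first access."""

    def __init__(self, conn, person_id):
        self._conn = conn
        self._id = person_id
        self._cache = {}
        self.queries = 0  # exposed only so the behaviour can be observed

    def __getattr__(self, column):
        # Python calls this only when normal lookup fails, i.e. for columns.
        if column not in COLUMNS:
            raise AttributeError(column)
        if column not in self._cache:
            # Safe to interpolate: column was checked against COLUMNS.
            row = self._conn.execute(
                f"SELECT {column} FROM people WHERE id = ?", (self._id,)
            ).fetchone()
            self.queries += 1
            self._cache[column] = row[0] if row else None
        return self._cache[column]
```

Client code just writes `person.name`; by doing so it has "told" the proxy which property it needs, and `bio` is never fetched unless someone asks for it.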

------
Bdennyw
I have felt the pain of Hibernate. It just sucks. But I think that it is
possible to make an ORM like solution work. A great example is NeXT's EOF and
now Apple's Core Data. The combo of awesome mapping tools and Objective-C's
dynamism make for a system that works very well for its intended purpose. Mind
you Core Data is not a database and likely would not work well for a web
service, but I believe that EOF did.

------
rjurney
Agree. Talked about how severe ORM impedance mismatch is here:
<http://datasyndrome.com/post/3257282059/data-driven-recursive-interfaces-for-graph-data>

Your model needs to fit your view.

------
nathanlrivera
Generalizations are an anti-pattern.

------
shaydoc
ORMs tend to suck; they're the easy way out. Delegating control to a custom ORM says to
me, OK give me performance issues. Design your Domain model, keep it simple at
the DAL and use Stored Procedures, easy life, ultimate flexibility.

------
andybak
Sometimes 'good enough' really is good enough.

For Christ's sake, there's a zillion ways to mitigate ORM-related performance
hits. One of those zillion is 'stop using an ORM' but it's not likely to be
your first choice.

------
forgotAgain
Seems like a "horses for courses" argument but maybe that's just the sign of
my being a cranky old man. More important than using an ORM is that everyone
consistently uses the same toolset for the project.

Personally I can't say I'm a big fan of using an ORM. Just too many bad tastes
in my mouth over the years from bad implementations. It's probably improved by
now but I long ago developed tools to generate the boilerplate code I need to
work with a database. This gives me a generated data layer and a bare business
object layer that moves the data out of the data layer. With the metadata
available from databases it's fairly simple to automate the generation of the
code.

Once you have a tool that generates the code then an ORM has much less to
offer.
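
As a rough idea of how little it takes, here is a sketch that reads column metadata from SQLite and emits the source of a matching Python dataclass. It is an assumed, stripped-down stand-in for this kind of generator, not the actual tool; the type map covers only three SQLite types and anything else falls back to `object`.

```python
import sqlite3

# Deliberately tiny mapping from declared SQL types to Python annotations.
SQL_TO_PY = {"INTEGER": "int", "TEXT": "str", "REAL": "float"}


def generate_dataclass(conn, table):
    """Return Python source for a dataclass mirroring one table's columns."""
    # PRAGMA cannot take a bound parameter here, so restrict the name first.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")

    # Each row is (cid, name, type, notnull, dflt_value, pk).
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    if not columns:
        raise LookupError(f"no such table: {table}")

    class_name = table.title().replace("_", "")
    lines = ["@dataclass", f"class {class_name}:"]
    for _cid, name, sql_type, *_rest in columns:
        py_type = SQL_TO_PY.get(sql_type.upper(), "object")
        lines.append(f"    {name}: {py_type}")
    return "\n".join(lines) + "\n"
```

The returned text still needs `from dataclasses import dataclass` at the top of whatever file it is written into; a real generator would also emit the load and save code from the same metadata.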

------
joeburke
Another mistake the author is making is calling the ORM "an abstraction".

ORM is a mapping technology: it takes input from one world and turns it into
data suitable to another world. It doesn't abstract anything.

~~~
dazzer
Technically, (my interpretation is that) it abstracts your application from
your db. The ORM acquires the data, allowing your application code to
concentrate on using the data.

------
tcarnell
I 100% completely agree with the article. ORMs are dangerous.

ORMs solve a problem we don't really have, but introduce a whole load of new
problems we never had before (lazy loading does not work, object relations
work completely differently from table constraints, designing the domain layer to
fit our persistence layer, learning new proprietary query languages, inability
to control SQL queries)

I am continuously amazed at how keen developers are to adopt them as a core
part of their product.

Thanks for the article - I feel relieved others feel the same way!

------
kunley
It's a pity and a sign of ignorance that people make the implicit assumption that
ORM == active record pattern, while the other ORM pattern, data mapper, is in
fact widely used and superior for many use cases.
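
For anyone unsure of the difference, a minimal side-by-side in Python over `sqlite3`. The `notes` table and all class names are invented for the sketch; the only thing it is meant to show is where the persistence code lives in each pattern.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

SCHEMA = "CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"


class ActiveRecordNote:
    """Active record: the object carries its own connection and SQL."""

    def __init__(self, conn, body, id=None):
        self.conn, self.body, self.id = conn, body, id

    def save(self):
        cur = self.conn.execute(
            "INSERT INTO notes (body) VALUES (?)", (self.body,)
        )
        self.id = cur.lastrowid
        return self

    @classmethod
    def find(cls, conn, note_id):
        row = conn.execute(
            "SELECT id, body FROM notes WHERE id = ?", (note_id,)
        ).fetchone()
        return cls(conn, row[1], row[0]) if row else None


@dataclass
class Note:
    """Data mapper: the domain object knows nothing about the database."""

    body: str
    id: Optional[int] = None


class NoteMapper:
    """All SQL for Note lives here, outside the domain object."""

    def __init__(self, conn):
        self.conn = conn

    def insert(self, note):
        cur = self.conn.execute(
            "INSERT INTO notes (body) VALUES (?)", (note.body,)
        )
        note.id = cur.lastrowid
        return note

    def find(self, note_id):
        row = self.conn.execute(
            "SELECT id, body FROM notes WHERE id = ?", (note_id,)
        ).fetchone()
        return Note(body=row[1], id=row[0]) if row else None
```

`Note` can be constructed and tested with no database at all, which is one of the usual arguments for the data mapper style.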

------
gerardo
BTW, how are you working around the most obvious problems of an ORM these
days?

~~~
bni
By not using them

------
clistctrl
Yesterday morning I would've called this guy a cranky old man... but doing
some coding last night made me want to kill something. Coming from an Active
Record background I tried using Linq to SQL. My application has a WPF front
end and a Windows service on the backend. Passing the same object between the
2 is driving me insane. The problems are so much more cryptic, and the code I
had to write to go around it completely negates any reason for using it in the
first place. I'll be ripping it out tonight.

~~~
kprobst
ToList() usually solves most marshalling problems (not that that is your
problem, but it's quite common). Nothing wrong with Linq2SQL, it's just
another technology that has a learning curve.

There's more than enough information out there to solve most problems I've
ever run into with Linq or EF. Whatever you're dealing with, someone out there
probably already dealt with and blogged about it or asked a question and got
an answer somewhere.

------
maresca
Is this guy talking about object relational mapping or object role modeling?

~~~
andybak
I'm guessing the former as I've never heard of the latter.

Is it time for a Central Registry for TLAs? (CRT! Damn. It's taken...)

