
(Some) ORM haters do get it - blambeau
http://www.revision-zero.org/orm-haters-do-get-it
======
debacle
I don't understand where this ORM 'divide' is coming from.

ORMs are powerful, because they let you say less and do more. For 90% of the
queries out there, an ORM is fine.

SQL is powerful, because you can control and fine-tune your statements. For
the remaining 10%, use SQL.

Are ORMs bad? No. Can you use them for everything? No.

The same thing can be said for almost every technology in existence.

~~~
fusiongyro
The divide is fundamentally about choosing what's in charge of your system,
the system being composed of your databases, your applications, and your
supporting infrastructure (your scripts, your migrations, etc.) To relational
database folk such as myself, the central authority is the database, and our
principal interests are what ACID exists to provide: concurrent, isolated,
atomic transactions that cannot be lost, on top of a well-defined schema with
strong data validity guarantees. To us, what's most important is the data, so
everything else must serve that end: the data must always be valid and
meaningful and flexible to query.

The side that argues for ORM has chosen the application, the codebase, to be
in charge. The central authority is the code because all the data must
ultimately enter or exit through the code, and the code has more flexible
abstractions and better reuse characteristics.

The reason for the disagreement comes down to disagreement about what a
database is about. To the OO programmer, strong validation is part of the
behavior of the objects in a system: the objects are data and behavior, so
they should know what makes them valid. So the OO perspective is that the
objects are reality and the database is just the persistence mechanism. It
doesn't matter much to the programmer _how_ the data is stored, it's _that_
the data is stored, and it just happens that nowadays we use relational
databases. This is the perspective that sees SQL as this annoying middle layer
between the storage and the objects.

To the relational database person, the database is what is real, and the
objects are mostly irrelevant. We want the database to enforce validity
because there will always wind up being tools outside the OO library that need
to access the database and we don't want those tools to screw up the data. To
us, screwing up the data is far worse than making development a little less
convenient. We see SQL not as primarily a transport between the reality of the
code and some kind of storage mechanism, but rather as a general purpose data
restructuring tool. Most any page on most websites can be generated with just
a small handful of queries if you know how to write them to properly filter,
summarize and restructure the data. We see SQL as a tremendously powerful tool
for everyday tasks, not as a burdensome way of inserting and retrieving
records, and not as some kind of vehicle for performance optimization.

At the end of the day, we need both perspectives. If the code is tedious and
unpleasant to write, it won't be written correctly. The code must be written--
the database is not the appropriate thing to be running a web server and
servicing clients directly. OOP is still the dominant programming methodology,
and for good reasons, but encapsulation stands at odds with proper database
design. But people who ignore data validity are eventually bitten by
consistency problems. OODBs have failed to take off for a variety of reasons,
but one that can't be easily discounted is that they are almost always tied to
one or two languages, which makes it very hard to do the kind of scripting and
reporting that invariably crop up with long-lived data. What starts out as
application-specific data almost invariably becomes central to the
organization with many clients written in many different languages and
frameworks.

We're sort of destined to hate ORM, because the people who love databases
aren't going to love ORM no matter what, and people who hate databases will
resent how much effort they require to use properly.

~~~
mattmanser
But the truth is the days of DB being king are almost gone.

These days you just don't hear about DBAs at all any more. You used to see
constant jokes about DBAs being a pain in the ass and stopping programmers
doing X or Y. ORMs are going to win because there aren't enough of you left.
Stored procedures, triggers, etc. are going to be viewed as ancient technology
back from the days of yore when people didn't understand how to code properly.

~~~
silverbax88
I'm not sure where you're working that there are no DBAs... not an enterprise
shop.

Honestly, every time I see how badly Facebook handles data and caching, I
can't help but wonder why they don't use a real data store and DBAs.

~~~
nbm
I would be interested to know what you mean by "real data store", and what you
believe Facebook does badly at in terms of handling data and caching and how
it could be improved.

(I am an engineer/developer/whatever at Facebook, and I'm always interested in
hearing the perception of the company's technology from the community.)

~~~
mistermann
I have two questions:

1. I've always been under the impression that for what Facebook does, a
traditional RDBMS simply cannot handle the scale (like, not even close). Is
this correct?

2. I'm also under the impression that due to the architecture Facebook runs
on, from time to time some lesser-important data (ie: a status update or
comment) can be lost (temporarily or permanently) and this is not considered
unacceptable. (It seems perfectly reasonable to me for this particular use
case.)

~~~
nbm
Perhaps it is best for the database team to talk about that themselves -
wouldn't want to put words in their mouths. They gave a Tech Talk in December
last year, which you can see at <http://livestre.am/1aeeW>

------
kstrauser
He's wrong.

SQL is great for what it does, but I use ORMs for reasons other than writing
queries in a different way. I inherited an utter mess of a schema that wasn't
even remotely close to 1NF. Imagine fields containing comma-joined sets of
values, and with column names not even remotely related to what they actually
held. For legacy and business purposes, updating the schema was a non-starter.

So I used SQLAlchemy to remap the schema into something usable. I wrote
getters that split out those comma-joined fields and returned the desired
value against tables that required indexes like:

    UPPER(SUBSTR(name_delpt,1,STRPOS(name_delpt,',')))

I wrote (and therefore more importantly _documented_) the bizarre and complex
way some of the tables joined together. I wrote something that was unit-
testable and that could be used as a foundation for other work so that I
wouldn't have to memorize the insane corner cases and reproduce them from
scratch each time I needed to access the data in some little-used table.
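
As a rough illustration of the getter idea above (not the actual code, and deliberately a plain Python class rather than SQLAlchemy's real mapping API), something like the following keeps the splitting logic in one testable place. The `LegacyCustomer` name, the sample data, and the assumption that the first two comma-joined values are a name and a department are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LegacyCustomer:
    """Wraps one row of a hypothetical legacy table.

    `name_delpt` stands in for a badly named column that actually holds
    comma-joined values, e.g. "Smith, Accounting, Building 4".
    """

    name_delpt: str

    def _parts(self):
        # Split on commas and strip stray whitespace around each value.
        return [p.strip() for p in self.name_delpt.split(",")]

    @property
    def name(self) -> str:
        # Assumption: the first comma-separated value is the name.
        return self._parts()[0]

    @property
    def department(self) -> Optional[str]:
        # Assumption: the second value, if present, is the department.
        parts = self._parts()
        return parts[1] if len(parts) > 1 else None


row = LegacyCustomer(name_delpt="Smith, Accounting, Building 4")
print(row.name, row.department)
```

Callers only ever see `name` and `department`, so the odd storage format is written down (and unit-tested) once.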

I _didn't_ write a more convenient way to say `SELECT * FROM blog`. In
general, that doesn't interest me and I wouldn't have bothered with it. ORMs
are great - if used well! - for encapsulating all of the little bits of
business cruft in one central, easy-to-manage place. They're handy for roughly
the same reasons that subroutines are handy.

~~~
thespons
This is an interesting use and I like the idea, but I don't think it makes him
wrong. You don't actually need an ORM to do this. The same logic could be
applied to data sets retrieved through regular queries. Or you could create
views/stored procedures with the same logic.

------
ianterrell
The author's claim that ORMs are bad computer science is probably accurate.
Fortunately, they're really good engineering.

~~~
Roboprog
We could be more subtle, and go with "it depends". For getting and putting
data to be used on an edit form, using an object/class which actually has (non
relational constraint) "business logic" in it, ORM good. For batch mass
update, ORM bad. (one wonders if every table needs a custom class, though)

~~~
richardlblair
Batch updates aren't where it ends. As a Django developer with a strong SQL
background I find my hands are tied far more often than I would like.
Sometimes this is caused by bugs like the current group by bug[1], other times
it's caused by the design of the ORM.

I do agree that the ORM is handy for things like "Get me all the things in
this table", and "update this single record using a form", but this author is
dead on. There is a logical reason behind the hate developers have for ORMs.

[1] <https://code.djangoproject.com/ticket/17144>

~~~
gouranga
Django's ORM is a piece of junk compared to a proper one such as SQLAlchemy,
Hibernate or NHibernate. I wouldn't go drawing conclusions until you've
experienced something else.

~~~
soc88
If Hibernate is one of the "proper" ones, then I can happily declare ORMs to
be a non-working, time-wasting, over-complicated POS.

~~~
chris_wot
Your post is of limited value. What, in particular, are the issues that you
have with Hibernate?

~~~
chris_wot
What in particular is it about my question that caused it to be voted down?
I'm actually very interested in hearing the issues that the responder is
having with Hibernate. Unfortunately, because they haven't stated what it is
that caused them so many problems, it's of very limited value to the
discussion being had.

~~~
gouranga
I voted you back up. I'm slightly concerned that the down vote was simply
retaliatory against the norm which appears to be common here.

I too would like to understand.

------
mistermann
Can anyone point out the coup de grâce the author seems to think he has
arrived at?

He points out some (well known) ways that ORMs can be used inefficiently, and
acknowledges the techniques that have been developed to work around these, but
then seems to conclude that he has proven once and for all that ORMs are bad.
I totally missed the connection on that part. Is it that SQL is _better_ in
_dealing with sets_ than an ORM (a fact no one denies), therefore you should
not use an ORM?

~~~
sgift
The connection (according to the author as far as I've understood him) is:
"All the techniques that have been developed to work around the inefficiencies
of ORMs are just reinventing the wheel. The solutions have been there for 40
years and your workarounds are only needed because you think about individuals
(object-oriented) instead of sets (relational)."

~~~
mistermann
> The solutions have been there for 40 years and your workarounds are only
> needed because you think about individuals (object-oriented) instead of
> sets (relational)

(This is to OP rather than you)

Well if that's the argument he's making, one example I can think of, in the
case of an extremely complex update, while it always _can_ be done in pure
SQL, it is much easier to logically code using an ORM, perhaps even using
individuals rather than sets (the horror). And while this implementation might
execute slower (.1 second vs .01 second), it is vastly simpler to read and
refactor without screwing something up (ie: economically cheaper), and as for
the performance argument, it only needs to be _fast enough_.

~~~
bunderbunder
The problem is, in the real world the performance divide between iterative and
set-based solutions often spans much more than just one order of magnitude.
For someone who's got solid experience in constructing set-based queries, the
SQL solution can quite often be simpler to read and refactor as well. That's
admittedly a big 'for', though. Good database folks reason about this stuff in
fundamentally different ways, and there's a lot more to creating a good
database person than simply teaching a programmer SQL.

In respects like that, ORM's greatest benefit is also its greatest downfall.
Using an ORM means you can put people who don't have a strong grasp of
databases in charge of your databases. Sadly, it also means that you've put
people who don't have a strong grasp of databases in charge of your databases.

For that matter, only needing to be "fast enough" is fine if you're the only
kid in the playground. That's often a safe assumption to make if you're
writing app code, but less so with databases. If the database is being shared
by a number of applications, or if the server is hosting multiple databases,
or if you have to worry about concurrency, then being "fast enough" probably
isn't enough. Because you've also got to think about all the other ways that
your queries could be affecting everyone else, and making sure you aren't
subjecting your server to the tragedy of the commons.

Which comes to another nice thing about having a dedicated database person. It
means there's someone whose official bailiwick is the DBMS. If app A isn't
experiencing any performance problems itself, but _is_ causing performance
problems for app B (say, because of some perverse locking situation), that's a
bug that a DB guy is best positioned to diagnose and fix. If the application
isn't too tightly coupled to the database (i.e., sprocs are in place) then he
can even quietly fix it on the server side without having to hassle anyone
about the application code. A team that's too ORM-reliant, on the other hand,
risks failing to include anybody who's even well-equipped to _recognize_ the
problem, let alone fix it.

------
scotty79
For easiest things SQL is cumbersome.

Getting single row by primary key which is 90% of access is overly verbose in
SQL so ORM wins.

For slightly more complicated cases SQL is much faster and easy to write so
people who have to increment field in all rows that satisfy simple condition
go: "ORM sucks".

But for more complicated cases like trimming data tree in some places SQL
quickly becomes too much of a puzzle for most programmers to deal with so they
prefer ORM again because it's doable there and most of the times works.
Dedicated SQL users who are not good at puzzles in such cases write full
fledged program (if their SQL dialect allows for that) and instead of bringing
data to their iterative or recursive programs they bring their programs to the
data which creates hard to debug, unreadable often unversionable
monstrosities.

There should be some merge between databases and programming languages that
could combine beauty of syntax of modern programming languages and efficiency
of massive data handling of modern databases.

Why is it ok to have standard hashmap implementation in a language but not
file backed hashmap or btree index?

~~~
mckoss
I think this is the motivation behind .Net Linq - bringing the relational
query semantics into the language.

<http://msdn.microsoft.com/en-us/library/bb308959.aspx>

------
dcminter
There is a need to manipulate relational data from object oriented code. ORMs
are tools that facilitate that. The Object/Relational impedance problem
doesn't go away if you hand-carve the code, it just makes you work hard on all
the points of contact instead of just the problematic ones.

The real "problem" with ORM is when people use such tools as a way of avoiding
having to understand databases (and specifically SQL). Fortunately that's
becoming less common at least within the Enterprise Java world where I live
and breathe.

~~~
mb_72
I think you've hit on something here. I came to start using ORMs after 10+
years of writing / optimising databases and SQL. When using an ORM (and most
of my work is done with a probably not-well-known one, XPO from DevExpress)
I'm aware of what is (or should be!) happening under the hood; my prior
experience with 'bare metal' is extremely useful, nay _essential_ to creating
a performant system. On occasion, it's necessary to do a direct SQL query, but
not often.

Sure, the apps I'm writing are for small-medium business, but XPO 'just
works'. Context is important; if I was working on something with more users /
tighter speed requirements, an ORM may or may not be the best choice. Still,
this falls into the 'right tool for the job' category that good developers are
already aware of.

------
parfe
Any system I've worked in that didn't use an ORM still implements an ORM.

Save methods which run inserts or updates. GetAll methods: get_by_username,
get_by_id, get_by_email, get_recent, get_by_foreign_keyed_object,
get_by_other_foreign_keyed_object.

And all those hand coded methods directly tied to the dialect of the db the
original developer used.
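
A minimal sketch of what such a hand-rolled layer tends to look like, using Python's standard `sqlite3` module. The `UserStore` class, the `users` table and its columns are made up for illustration; only the shape (a save that picks insert or update, plus a family of get_by_* finders) comes from the description above.

```python
import sqlite3


class UserStore:
    """Hypothetical hand-written data access class: an ORM in all but name."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS users "
            "(id INTEGER PRIMARY KEY, username TEXT UNIQUE, email TEXT)"
        )

    def save(self, user):
        # Insert when there is no id yet, otherwise update in place.
        if user.get("id") is None:
            cur = self.conn.execute(
                "INSERT INTO users (username, email) VALUES (?, ?)",
                (user["username"], user["email"]),
            )
            user["id"] = cur.lastrowid
        else:
            self.conn.execute(
                "UPDATE users SET username = ?, email = ? WHERE id = ?",
                (user["username"], user["email"], user["id"]),
            )
        return user

    def _one(self, column, value):
        # `column` is only ever a fixed name passed by the finders below,
        # never user input, so formatting it into the SQL is safe here.
        row = self.conn.execute(
            f"SELECT id, username, email FROM users WHERE {column} = ?",
            (value,),
        ).fetchone()
        if row is None:
            return None
        return dict(zip(("id", "username", "email"), row))

    def get_by_id(self, user_id):
        return self._one("id", user_id)

    def get_by_username(self, username):
        return self._one("username", username)


store = UserStore(sqlite3.connect(":memory:"))
saved = store.save({"id": None, "username": "parfe", "email": "p@example.com"})
saved["email"] = "new@example.com"
store.save(saved)
print(store.get_by_username("parfe"))
```

Every new lookup means another method and another SQL string, which is roughly the mapping work an off-the-shelf ORM would otherwise generate.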

Using Hibernate I had to write raw SQL for some reporting. I've never written
raw SQL in Django (the projects I've used Django on self-select for
simplicity).

And using SQLAlchemy/Twisted as a backend for a desktop application I haven't
found a need yet, for performance or correctness. I have one query I'm eyeing
for an SQL rewrite, but it'll probably be a week of work to make sure it works
correctly and I'd rather release this phase than save 30 seconds on a weekly
query.

I've reached a point where ORM complaints don't really make all that much
sense. The issue seems to be "The ORM breaks down doing X and Y and Z so I had
to hand write SQL!"

But you'd be writing X Y and Z anyway if you weren't using an ORM, so what's
the issue?

~~~
mistermann
Exactly. Or more explicitly, the anti-ORM argument, to me, consists of: ORM
works fine for A through W, but SQL is better for X, Y, Z. So rather than just
the common sense solution of writing _just_ X,Y,Z in raw SQL, _everything_ has
to be written in raw SQL.

~~~
wvenable
The argument works both ways; in fact, I think it's worse the other direction.
Most projects use the ORM and just the ORM and it's against policy to write
anything in raw SQL even if it's better for X, Y, Z.

~~~
mistermann
I'm sorry but I don't think you're telling the truth.

I've never encountered in real life, _or during any online discussion_ , the
all-or-nothing sentiment from advocates of ORM.

Could you link to _anything_ online where an ORM advocate argues that you
should _never_ drop down to raw SQL?

~~~
wvenable
The attitude is prevalent in the design of many ORMs. I'm both a huge advocate
of ORMs and of SQL. A good ORM provides a simple and direct mapping from
storage to the object model. But most ORMs go beyond that and try to cover all
the query and performance possibilities available from SQL. Some ORMs have
their own text-based query language!

I've met developers who can happily (and effectively) work with an ORM but
hardly even know SQL! They certainly don't know SQL well enough to use it in
the situations where it would be most effective.

I'm starting to feel like really effective set-based understanding of SQL is
becoming sort of a lost art.

------
iainduncan
As is so often the case, IMHO the poster is ignoring the business side of the
equation in favour of "correct". A good ORM is fantastic for speeding up
_development_. If you design your db around being well usable with the ORM,
the potential development gains (at least in dynamic languages like Python)
are HUGE. I left Django in part because of their ORM. But using SQLAlchemy, we
develop far, far, faster than if we were using SQL directly. And it's flexible
enough to allow me to drop into the SQLAlchemy Query language when I need to
hand tune a query, or to SQL itself if I really need to get close to the
metal.

Will there be downsides one day? Probably. Will they come even close to the
business value of the amount of the coding time we've saved during the
critical bootstrap phase? No way in hell. As they say, those are problems I'd
love to have.

------
nsxwolf
This article is all about RBAR vs. set based operations. It has little to do
with ORMs per se. That naive ORM users may tend toward RBAR is beside the
point.

Go RBAR when it doesn't matter - when it's convenient and you know how it is
going to scale ahead of time. A user updating his profile. Creating an order.

Go set based the rest of the time - when you're processing a large batch of
data for thousands of user accounts, offload that work into a stored procedure
and call it.

ORM does exactly what it is meant to do, and if you're working with data in an
OO model, you're _going to be doing ORM_ , whether you know it or not - you
will either pull a decent ORM tool off the shelf, or you WILL be writing your
own very bad one.

~~~
recursive
What is RBAR?

~~~
nsxwolf
Row-By-Agonizing-Row, or iterating over a list of rows and operating on each
individually with a new query.

Your average Java programmer is used to iterating over collections in a while
loop, where 30,000 in-memory objects can be quickly modified. It's tempting
for said programmer to do the same to ORM-backed objects and issue 30,000 sql
update statements across the wire.
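
To make the contrast concrete, here is a small sketch using Python's standard `sqlite3` module and an in-memory database. The `accounts` table and its columns are invented for the example, and with an in-process database there is no real network cost, so it only counts statements rather than measuring time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, active INTEGER, credits INTEGER)"
)
conn.executemany(
    "INSERT INTO accounts (id, active, credits) VALUES (?, ?, ?)",
    [(i, i % 2, 0) for i in range(1, 1001)],
)

# RBAR: fetch the matching rows, then issue one UPDATE per row.
rbar_statements = 0
rows = conn.execute("SELECT id FROM accounts WHERE active = 1").fetchall()
for (account_id,) in rows:
    conn.execute(
        "UPDATE accounts SET credits = credits + 1 WHERE id = ?", (account_id,)
    )
    rbar_statements += 1

# Set-based: one statement does the same work for every matching row.
cursor = conn.execute("UPDATE accounts SET credits = credits + 1 WHERE active = 1")
set_based_rows = cursor.rowcount

conn.commit()
print(rbar_statements, set_based_rows)
```

Both approaches touch the same rows; the difference is that the loop sends one statement per row while the set-based version sends a single statement.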

------
euroclydon
Well now I'm reminded of my initial thought when a colleague first introduced
me to an ORM: "It feels wrong to use this tool to just map every object to a
table." Of course I went on to write many, at best, moderately complicated web
apps very fast using NHibernate and didn't miss writing vendor-specific SQL or
column to property mapping boilerplate.

But this article is a breath of fresh air. I may just try Dapper for the next
project

<http://code.google.com/p/dapper-dot-net/>

~~~
koide
I agree, microORMs are the right solution if you are trapped in the
mainstream. I'm a very happy user of Dapper.

------
ExpiredLink
This article is very much to the point. ORMs use the wrong abstractions. They
try to 'map' 4GL to 3GL. The results are necessarily unsatisfactory.

~~~
soc88
Absolutely right!

------
Ixiaus
I hated _every single_ ORM "system" I have ever come in contact with from many
different languages with the exception of one: SQLAlchemy. SQLAlchemy isn't
just easy and intuitive, it is _really smart_ and well built.

For a long time I was on the side of the ORM haters, until I tried SQLAlchemy
- it truly is an awesome ORM package and I have yet to find anything like it
in PHP/Ruby/etc...

------
bitdiffusion
I think the other point the author makes is that it's not possible to write
efficient code that is entirely abstracted from the underlying data (see his
loop examples).

i.e. if you have to write your code in a specific way to make the ORM behave
correctly (constantly thinking about what kind of sql your code is
generating), then the abstraction becomes a lot less useful.

------
FuzzyDunlop
I'm well open to correction on this, since I've not much of a clue, but with
all this ORM back-and-forth, and relational databases, why do we not see more
usage of graph databases?[1] From the wiki, it says they map more directly to
OO applications. Is there a reason relational databases are still used by
default?

[1] <http://en.wikipedia.org/wiki/Graph_database>

~~~
biafra
Sometimes the reason is that operations is used to it.

------
jakejake
As an author of an ORM I can agree with the OP that using them incorrectly can
lead to horrible punishment on the database. He seems to be most concerned
with the n+1 issue which happens when you loop through objects and another
query is performed on each iteration.

If you're going to use an ORM then you absolutely have to learn the mechanics
to avoid n+1 queries. Every ORM is a little different, some are easier to tune
than others.
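
For anyone who hasn't seen the n+1 pattern spelled out, a minimal sketch with Python's standard `sqlite3` module follows. The `authors`/`posts` schema and the `run` helper that counts queries are assumptions made up for the example; a real ORM would hide the per-row query behind a lazy-loaded attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")

counter = {"queries": 0}


def run(sql, params=()):
    # Hypothetical helper: execute a query and count how many were issued.
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


# n+1: one query for the authors, then one more query per author.
n_plus_one = {}
for author_id, name in run("SELECT id, name FROM authors ORDER BY id"):
    titles = run(
        "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (author_id,)
    )
    n_plus_one[name] = [title for (title,) in titles]
n_plus_one_queries = counter["queries"]

# Single query: let the database do the join, then group rows in memory.
counter["queries"] = 0
joined = {}
for name, title in run("""
    SELECT a.name, p.title
    FROM authors a LEFT JOIN posts p ON p.author_id = a.id
    ORDER BY a.id, p.id
"""):
    joined.setdefault(name, [])
    if title is not None:  # authors without posts come back with NULL
        joined[name].append(title)
joined_queries = counter["queries"]

print(n_plus_one_queries, joined_queries, n_plus_one == joined)
```

With three authors the loop issues four queries against one for the join, and the gap grows linearly with the number of parent rows.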

I personally favor a basic mapping that the ORM does automatically combined
with an advanced mapping that allows you to basically write queries for
special purposes and map them to transient objects that don't necessarily
exist in your schema. You have to do this for things like aggregate queries or
calls that require several joins but only need a couple of columns from each
table.

The OP raises some good points but I do think that ORMs can be used properly
to great effect.

------
SoftwareMaven
Yet another software religious war, which seems to boil down to "if you don't
think my technology is correct 100% of the time, you are insulting my honor".

Do other fields get into constant pissing matches like this? Languages,
libraries, process, licenses, editors, Operating systems, you name it,
software engineers are fighting about how much better theirs is and how you
are an idiot for not seeing the true light their vast intelligence is trying
to bequeath unto you.

What is it about our brains that makes the subtlety of "use the right tool;
every problem isn't a nail" so difficult? Or is it just hard wired into our
need to be identified with a community?

~~~
pbz
It's probably directly proportional to one's OCD level or amount of pain
accumulated through the years. There's also the need to show and feel that
you're superior, and sometimes the need to point out that somebody's
superiority is irrelevant.

------
superasn
I think the problem with ORM is that at the end it is only a wrapper over SQL,
say like Winzip is for Zip. So while it may look pretty and easy to use, it
always has to obey the rules of the host program, and so workarounds such as
these have to be invented.

On the other hand if somehow ORM was part of the core compiler AND database,
then somehow it could be possible that even when you write a for loop on the
top and have if conditions inside the block, or perform a join, the
compiler understands it and pre-compiles your code without needing such
workarounds (as there aren't two separate layers to join). So you would treat
objects as objects and never have to worry about how the wrapper is being
generated or what kind of indexes or queries it will run finally. I'm not an
expert at compilers though but for strictly typed languages it could be
possible.

~~~
pbz
That would never work though, at least not in the perfect, non leaky way the
author and you would like. When you have a loop, even if the database could
somehow understand that loop, you can put anything inside it. You could make a
call to an outside service, write to disk, etc. With a single SQL query the
database can take it and optimize it since it has a full understanding of its
data domain.

------
ScottBurson
There's a problem that's inherent to any client/server architecture: what work
should get done on the client, and what should get done on the server? The
author presents this as an ORM issue, but it has nothing specifically to do
with ORMs; you'd have the same problem using an OODB server.

------
chris_wot
I agree with a lot of what he says, however what about the case of getting
10 rows with 20 columns - each column needing a join. The optimizers often go
nuts over this sort of thing, as that's 20! combinations the optimizer must
iterate over to get a good join order.
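
For a sense of scale, the 20! figure can be worked out directly. This is only the number of orderings of 20 items; it says nothing about how any particular optimizer prunes or samples that space.

```python
import math

# Number of distinct orderings of 20 items (20 factorial).
orderings = math.factorial(20)
print(f"{orderings:,}")  # 2,432,902,008,176,640,000
```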

Apparently, Postgres has a genetic optimizer that handles this... curious to
see if this is an issue or not.

Incidentally, I'd like to say that I loved the author's Relational Basics II
at <http://www.revision-zero.org/relational-basics-2> I've come to the
conclusion that SQL is particularly limited in its application and
implementation. I'd love to see a better declarative language for databases!

------
skybrian
This is all very nice, except that users often do want to work on individual
records (at least when updating them). Codd's simplifying assumption is for
mathematicians, not users.

------
soc88
The idea of translating the declarative way of doing things to an imperative
approach (that's basically what ORMs are doing) is imho a huge failure. It
just never worked decently.

These days, we have languages which integrate rather nicely into the
declarative mindset, so no need anymore for such bizarre "paradigm
translators".

~~~
koide
What are those languages and how do they integrate nicely into the declarative
mindset?

