
Object-Relational Mapping is the Vietnam of Computer Science (2006) - diffix
http://www.codinghorror.com/blog/2006/06/object-relational-mapping-is-the-vietnam-of-computer-science.html
======
einhverfr
To be honest, I prefer to approach this from the other angle. Namely you start
with relations and map logic relating to these back up to your object model.
This provides cleanliness of code, performance, and hand-tuned design that you
don't get with an ORM.

However hand-coding the SQL has some advantages a lot of folks don't want to
discuss. While relations and objects are to some extent incompatible, focusing
on writing good relational code usually results, at least in anything I write,
in a net reduction in codebase size. This is because there are a lot of things
that can be done in relations faster and easier than could be done in objects.

The answer in my view is not to pick one or the other but to define an
interface across them. This is often an ORM in some projects but having talked
with some ORM developers I respect, at least some tend to map objects to
relations where the relations are views, thus encapsulating the relational
logic behind relational processing (which is smart).

Again I go almost the other direction. I prefer to define logic interfaces
which encapsulate the relational logic and then add a service locator to
handle the interface. This was a big motivation behind
PGObject::Simple::Role.[1]

I am not convinced that ORMs are so bad if they are also used with updateable
views (and thus providing a stable interface for the application, protecting
the internals of the storage from the application's need for intimate
knowledge). However, I personally find it just simpler to write SQL.

[1] [https://github.com/ledgersmb/PGObject-Simple-
Role](https://github.com/ledgersmb/PGObject-Simple-Role)

------
heyadayo
This was a fun blog post, and was shockingly prescient: SQLAlchemy rose right
about this time to become one of my favorite libraries and an impressive feat
of solving the ORM issue with finality.

They have a sort of low-level python SQL api, a higher level declarative ORM
layer, and a custom glue that let's you wire objects and sql together in
arbitrary but useful & maintainable ways.

I think this was possible because the SQLAlchemy devs had this insight: "SQL
databases behave less and less like object collections the more size and
performance start to matter; object collections behave less and less like
tables and rows the more abstraction starts to matter."

~~~
habitue
Yeah, I think sqlalchemy broke the mold. It starts from the idea that you
already know and understand SQL and relational constructs well, then adds a
layer of abstraction that lets you skip tons of boilerplate and write really
elegant code. Personally, while there are lots of good decisions in sqlalchemy
like the unit of work pattern (in contrast to the active record pattern), the
one that pays off time and again is representing the SQL ast as python
objects. It eliminates and entire class of text munging issues that occur when
you write raw SQL and just completely makes you wonder "where's the mismatch
they keep talking about?"

~~~
_pmf_
Do you know of any Java-OR mappers in the same spirit?

~~~
teacup50
Java: [http://www.jooq.org/](http://www.jooq.org/)

Scala: [http://slick.typesafe.com/](http://slick.typesafe.com/)

They also have the benefit of being fully type-safe across query projections.

~~~
lukaseder
I don't think Slick can be considered a very SQL-centric API. While the Slick
folks embrace the relational model, they certainly do not embrace the SQL
language. Just as with LINQ, this can be desirable if you want to reason about
collections in a more general sense. On the other hand, the SQL standard has
gone far beyond the "occasional" OUTER JOIN that can turn out to be a true
challenge to Slick.

In other words, SQL has never been purely relational. It is a beast of its
own.

------
taspeotis
Honestly I'm extremely happy with EF6 and LINQ. If you know what you're doing
[1] it works extremely well.

Having used EF6+LINQ in 2014 I feel this post from 2006 is quite outdated. The
article tells you as much:

> Work in this space has thus far been limited, constrained mostly to research
> projects and/or "fringe" languages, but several interesting efforts are
> gaining visibility within the community, such ... the LINQ project from
> Microsoft

LINQ's come a long way since 2006.

[1] Mostly this comes down to understanding the leaky abstraction between what
IQueryable can do [2] with respect to what an ORM can do with SQL generation

[2] Also knowing your IQueryables from your IEnumerables.

------
geophile
I wrote an early Java ORM (1996-2000) and I've come to the conclusion that
ORMs are a bad idea. ORMs make the easy even easier, and the difficult
impossible. The basic problem is that writing good, high-performing SQL is
hard enough as it is. You have to understand indexes, clustering, execution
plans, and query optimization to get good performance. Add an ORM to the
picture and then you have to figure out how to say what you want in this brand
new language, through a mapping layer.

I don't want to give up either objects or relational databases, (the choice
that some other comments have suggested). What I am willing to give up is an
ORM generating SQL. Writing SQL doesn't bother me. What bothers me is dealing
with Java PreparedStatements, Connections, and ResultSets. This is all trivial
boilerplate stuff, that an automated approach can do easily. I want to
preserve my brain and fingers for the hard stuff, writing clean, fast Java and
SQL.

I have heard that iBatis takes this approach, but I haven't tried it myself.

~~~
einhverfr
> The basic problem is that writing good, high-performing SQL is hard enough
> as it is. You have to understand indexes, clustering, execution plans, and
> query optimization to get good performance.

Not to mention cache specifics (something that has personally bitten me).

To be honest, my preference is actually to write a service locator against
stored procedures. This allows queries to be maintained separately from the
application (and loosely coupled from them also). This is not too far from
what iBatis does except that I would use stored procedures instead of mapped
sql.

What db's are you using? If Pg only, interested in porting the PGObject
approach to Java?

------
programminggeek
I think the bigger problem with ORM's or database driven application design in
general is that you start to see the world through the eyes of the database
and how you move things around inside it. Thus, your code resembles your
database more than your database resembles your code.

Ironically, developers would never think to do this with a storage mechanism
like the filesystem. There is no great popularity in filesystem based ORM's.
Somehow when we deal with the filesystem we treat it as it is - data storage
and retrieval. When we deal with 3rd party api's we tend to keep them at arms
length as well. Yet, when we deal with the database, we treat it as some other
thing that seeks to influence the design of our code on a fundamental level.

I wrote about this like a year ago: [http://brianknapp.me/the-filesystem-
test/](http://brianknapp.me/the-filesystem-test/) and I don't think much has
changed since then.

~~~
lugg
Sounds like you want to start using go. It basically assumes a database to be
a black box much like file storage or an api, you keep away from it until you
really need to put that data somewhere. This has the added benefit of being
able to unit test almost all of your code. The only parts you can't
effectively test are the storage layers like apis, fs, and db. Hmm to be a
little more clear, I dont mean you avoid it, I mean you abstract it away into
simple storage access interfaces the same way you would with fs or apis. The
last thing you want is orm riddled logic.

~~~
zurn
Can you provide a link? A quick web search only turns up the "sql" package -
that seems to be just a low-level SQL interface that doesn't prescribe much.

~~~
grey-area
That's all golang has, it's pretty much like other languages :)

github.com/lib/pq is a typical driver, which returns rows of values which you
read to recreate your objects.

------
siliconc0w
I work at a large enterprise. At one point the large enterprise decided to
invest heavily in an _ahem_ non-relational model of storing data. This has
proven to be pretty dumb. It's both less performant and less reliable than the
relational databases it purports to replace. On top of this, any interesting
application of the data besides basic CRUD is either slow and a total pain in
the ass or just not possible.

SQL isn't going away because it's valuable to exploit relationships between
data and because it's pretty good at doing that. Anything that tries to
replace it, eventually looks suspiciously like it. (I'm looking at you,
MongoDB aggregate functions)

------
joe_the_user
The object-relational mapping problem is a perennial question so I think it is
great to have an interesting survey of the topic even if it is a little old.

The thing is that the object oriented programming model and the relational
database model are both good models. However, the relational model is a
logical model specifically intended to be independent of the particular
structure used to store data (presentation independence: it uses only a few
relations so as to allow data to be presented using any logical conditions on
those relations) while object oriented programming is primarily focused on
creating specific storage structures (it's not a specific model but a set of
tools for build whatever-the-heck). You can create a limited definition of
object this is compatible with the relational model (see [1]) but such
definition more or less neuters OO as a tool chest for building any structure
that tickles your fancy(such a limit can be good or bad depending on the
context).

[1] For example, the daunting Third Manifesto (TTM), by Hugh Darwen and C.J.
Date [http://www.thethirdmanifesto.com/](http://www.thethirdmanifesto.com/)

~~~
einhverfr
However, the whole point of object encapsulation is to provide that
independence, right?

I think the issue is how that independence of interface is maintained. In the
relational model, the question is one of transparency in internals and set (or
more properly bag) operations, while in objects it is defined interfaces and
encapsulation.

It seems to me that's the problem, but you can bridge them by creating stable,
defined interfaces (updateable views for example).

~~~
joe_the_user
" _However, the whole point of object encapsulation is to provide that
independence, right? ...the issue is how that independence of interface is
maintained._ "

OK, a different way to put it is that the relational model provide one kind of
independence and object orientation provides another. The way that a foreign
key column is not OO encapsulated but does reference a different column allows
SQL expressions that aren't dependent on which value is the "real" value.

" _you can bridge them by creating stable, defined interfaces_ "

Yes you can. At any one point in the development of an application, you can
pick a mapping and say "no problem", this is true.

The problem is that if you develop an application further, have an object
return an object instead of a value say, then the problem of how to change you
mapping to reflect the development is hard, harder than one would think. It's
not the snap that's hard, it's the evolving application that uses OO and
relational models that becomes weirdly hard, where the "impedance mismatch"
become evident.

~~~
einhverfr
> The problem is that if you develop an application further, have an object
> return an object instead of a value say, then the problem of how to change
> you mapping to reflect the development is hard, harder than one would think.
> It's not the snap that's hard, it's the evolving application that uses OO
> and relational models that becomes weirdly hard, where the "impedance
> mismatch" become evident.

I am not so sure that is really the problem though. As I noted, if you do ORM
mappings to updateable views, that both simplifies and complicates things. On
one hand it means more code to maintain because you have to map physical
storage relations to application-facing logical ones.

But on the other hand, it solves the problem you are describing quite nicely
because your apps only see an intermediate form of the data. You control this
intermediate form and it is independent of how the data is stored.

The real problem I see with ORM usage generally is that instead of thinking
about _interfaces_ they think the ORM is the interface, but this means that
the application depends on intimate knowledge of how the data is stored, a
dependence the relational model is designed to get you out of. That isn't
really a problem of the relational model -- as you point out, this
independence is exactly what the relational model _provides_ and so if you
aren't getting it, it is because you aren't really using the relational model.

Rather it is a problem of trying to eliminate relational thinking from app
development.

------
GFK_of_xmaspast
By "Vietnam" do you think he means "an entity that they tried to crush and
demonize but is actually doing kind of ok in 2014, all things considering".

~~~
ximeng
It's a quote from someone else's essay which explains it as a situation where
initial results are easy but finishing the job is hard. A strained analogy at
best, and the essay is very verbose.

"One of the key lessons of Vietnam was the danger of what's colloquially
called "the Slippery Slope": that a given course of action might yield some
early success, yet further investment into that action yields decreasingly
commensurate results and increasibly dangerous obstacles whose only solution
appears to be greater and greater commitment of resources and/or action. Some
have called this "the Drug Trap", after the way pharmaceuticals (legal or
illegal) can have diminished effect after prolonged use, requiring upped
dosage in order to yield the same results. Others call this "the Last Mile
Problem": that as one nears the end of a problem, it becomes increasingly
difficult in cost terms (both monetary and abstract) to find a 100% complete
solution. All are basically speaking of the same thing--the difficulty of
finding an answer that allows our hero to "finish off" the problem in
question, completely and satisfactorily."

~~~
ww520
That describes NoSQL perfectly.

~~~
collyw
I just got the whole NoSQL marketing talk from a developer I am collaborating
with on a project. "Its so quick, just connect Angular to Elasticsearch and
blah blah blah". So no thought has been given to security (its going to be a
biomedical research application). I can foresee lots problems in the future.

------
porlw
Code comes and goes, data is forever.

ORM is an attempt to bypass proper data design, when in fact more attention
should be paid to the data model in the DB than the structure of the code.

Functional programming has a much better impedance match to the relational
concept; maybe this will mitigate the issue in the future.

------
tudorconstantin
this post goes so deep into architectural aspects, that becomes philosofical
and makes us almost forget the not so elevated principle of "get shit done,
fast". The ORM that I'm using (Perl's DBIx::Class) never let me down:

* it has model generators

* need an _insert into_ or an _update_? bang, one line of code

* find_or_create? one line of code

* acess, create, update or delete related records? one line

* it has chained where clauses and hits the db only when it needs to access the data

* want hashes instead of objects in order to send them as a json response in a REST service? one line of code

and many more features. Is there an O/R "impedance mismatch" ? I don't know, I
don't care and I don't use it if there is

------
fennecfoxen
> I tend to err on the side of the database-as-model camp, because I think
> objects are overrated.

That approach is really quite fine and wonderful, if you can put all your
business logicy awesomesauce into the database layer. But it can make a
variety of problems really awkward if you want that business logic to interact
with code that lives in a different process... starting with "validating user
input on a web form" and going on from there.

Also, I'm curious whether you'd write stored procedures in your database to
set up objects for your automated testing suites, when you want to test the
business logic in your stored procedures. :P

~~~
einhverfr
> That approach is really quite fine and wonderful, if you can put all your
> business logicy awesomesauce into the database layer. But it can make a
> variety of problems really awkward if you want that business logic to
> interact with code that lives in a different process... starting with
> "validating user input on a web form" and going on from there.

I don't know. We've certainly had our awkward moments with LedgerSMB. However,
there are a bunch of things that have to be understood to do this right.

1\. Business logic in the db requires you to think carefully about interfaces.
Getting the right interfaces all the way down is rather difficult but it can
be done with appropriate attention to detail.

2\. Error handling is a very specific case of #1. You need to think about
notifying the application of an error, not the user.

> Also, I'm curious whether you'd write stored procedures in your database to
> set up objects for your automated testing suites, when you want to test the
> business logic in your stored procedures. :P

I just test the stored procedures in transactions that roll back. This is
pretty nifty because I can even test write operations on a production db
without worrying about polluting the production instance with test data.

------
snomad
For the past few years, I abandoned the ORM and went straight to Stored
Procedures using Data Transfer Objects / DAOs. It is no panacea, but has
worked well for these reasons:

1\. I can grep all of my SQL! No dynamic SQL generation means all SQL
statements are known. This alone has been huge to me. Of course it depends on
storing the SPs in version control and settling on good naming conventions
(eg. always write table.column). 2\. The SPs are easily testable and
verifiable endpoints to themselves. 3\. Small Performance boost of using
compiled / validated SQL. 4\. No dynamic SQL also means that performance
tuning is a snap. I see a bad actor I can track it down easily and fix the
problem.

One thing that has made this possible is a convention of naming SPs by
tablename_DMLname_[description]. e.g. customer_sel_byemail.

~~~
jmnicolas
What is "DMLname" in "tablename_DMLname_[description]" ?

~~~
9876543210
A named query, or any other form of named SQL statement.

It could be a view (a select query, stored as a database object) or a
procedure (a grouped set of sql statements that might involve changes that are
written to the database, possibly a simple insert, or simple update, or simple
delete, but potentially much more complex).

DML is one class of SQL statements, and DDL is another. The two together
encompass most of your common, practical database operations.

------
dustingetz
Datomic purports to solve the O/R impedance mismatch. I attempted to explain
this in a blog post:
[http://www.dustingetz.com/2013/03/26/orm.html](http://www.dustingetz.com/2013/03/26/orm.html)

------
gavinpc
I've never used an ORM, but I tend to associate their use with yer basic CRUD
application that people write for money -- i.e. something not that interesting
where you wouldn't mind "saving some time" if you could avoid caring exactly
what string gets sent to the SQL engine -- or, perhaps, knowing SQL at all,
for that matter.

Am I wrong? Are people using ORM's in their "passion" projects? Or are they
just a way to take some of the pain (and perhaps insecurity, think SQL
injections) out of boilerplate apps?

~~~
timdev2
I gave up on ORMs after a year or so (back around 2006) of ActiveRecord doing
the "makes the easy even easier, makes the hard impossible" syndrome someone
else mentioned.

Then a couple of years ago, I started playing with a DataMapper/Unit-of-Work
ORM (Doctrine2, in PHP-land). My experience, writing a moderately sized
warehouse-management application, has been very good.

In Doctrine, you don't write your schema/DDL. Doctrine does, based on your
Entity and Association mapping. You write plain-old-PHP-objects, and then map
their relations (in XML, annotations, or YAML). Doctrine reads and validates
your data model, and generates DDL statements to create the schema. Big
surprise: it creates pretty much exactly the same schema you'd expect. (And it
will generate and manage migrations, too)

Unit of Work is a powerful pattern, too, especially for web apps. You just
worry about updating entities in memory. The ORM then flushes them to the
datastore in a single transaction at the end of the request (so, by default,
you have a one-to-one mapping between http requests and transactions in the
RDBMS).

Datamapper/UoW is probably overkill for CRUD stuff; that's where ActiveRecord
shines. But it pays dividends when you're dealing with more complicated state-
transitions on your underlying entities, and there are lots of cross-cutting
concerns.

~~~
kiksy
I read Jeff's article and the comments here with interest. My only experience
with an ORM is Doctrine2 (about 2.5 years) , and I've never really hit any
issues. Whenever I've needed to duck out using the ORM in an edge case, its
only been a few lines of code and very undisruptive.

In a recent project I manually mapped a large sprawling ancient schema
manually using Doctrine2, and still never hit any major blockers. The code
just abstracts away over the cracks.

Does anyone have any real world example articles on where an ORM has totally
failed, and why?

------
acjohnson55
I don't feel like I experience this problem in Django. But this might just be
because I programmed database-driven apps pre-ORM, and as such I am
predisposed to making my object model more relational to begin with. Or
perhaps it's the case that using Django's model layer is really just an OOP
veneer on SQL. In any case, I don't feel like I've had a major pain point
here.

------
ericHosick
The real problem is more general.

Relations are easily persisted because they only have fields meaning they are
easy to persist (store).

Objects are not easily persisted because they have methods meaning you are not
always assured that the object is in a safe persist-able state (unless they
are value objects).

------
lukaseder
It would not have had to be the vietnam of computer science. There are
essentially two approaches at interacting with RDBMS:

\- "Top down", domain-model centric, hierarchical, object-oriented, etc.

\- "Bottom up", relational-model centric, relational, etc.

We've also nicely summarised this on our page here: [http://www.hibernate-
alternative.com](http://www.hibernate-alternative.com)

ORMs solve the first approach pretty well. ORMs have never claimed to solve
the second approach. But people are lazy and constantly looking for magic
bullets, which is why they are trapped by thinking that ORMs _should_ also be
a solution to the second approach.

But even Gavin King always stressed the fact that one shouldn't use Hibernate
for _everything_.

------
yeukhon
I think people have forgotten about ZODB, an impressive concept in the early
2000s. But sadly it didn't take off. Slightly off topic here but worth looking
at.

------
hayksaakian
What's changed since 2006?

At least from my knowledge:

the rise of Rails and ActiveRecord

a multitude of back end JavaScript technologies (node, et.al)

NoSQL databases have become more prominent and widely accessible.

How has the Author's opinion changed?

I could submit a link-baity article from the 90's decrying object oriented
programming as a pointless dead end in computer science, but I doubt the
original author would hold the same opinions

~~~
chilldream
Can't speak for the author's current opinion, but he did voluntarily start a
Rails app recently:

[http://www.discourse.org/faq/#tech](http://www.discourse.org/faq/#tech)

------
danielheath
The problem is stunningly difficult and widely believed (but not proven) to be
intractable.

However, there's good work advancing the state of the art in this field. Look
at Object Role Modelling implementations like ActiveFacts (
[https://duckduckgo.com/?q=ActiveFacts](https://duckduckgo.com/?q=ActiveFacts)
).

------
partomniscient
I don't see how posting links to to old articles (2006) with sensationalist
titles adds value to a _new_s site. Sorry.

~~~
chad_oliver
Hacker news isn't just a news site. That's one half, but the other half is the
discussion we bring to the table. I assume diffix thought that the article was
an interesting topic starter.

------
jackmaney
To be honest, I've never really understood the point of using an ORM. Sure, I
guess it would be handy if you needed to migrate your data from one RDBMS to
another, but other than that, I find it much easier to write SQL (with
parameterized queries when applicable).

~~~
misterjangles
In my case it just saves a lot of time on "plumbing" code. Having DB entities
represented consistently as objects is helpful for doing all kinds of things.
- having events fire on save or delete, hooks for business-logic validation,
etc. boring stuff perhaps but time-saving and keeps things really well
organized.

My ORM of choice doesn't take away the ability to write raw SQL when you want,
though. I wouldn't want to use an ORM where you couldn't extend it with your
own SQL.

------
tsax
This is certainly intriguing. .NET has moved steadily in the opposite
direction with Linq-to-Sql classes, followed by the Entity Framework which is
now the default. I wasn't even aware that this school of thought advocating
the opposite direction existed.

~~~
misterjangles
My first introduction to ORMs was hibernate. I immediately loved the idea but
to be honest it can be extremely frustrating to get things done - especially
if you are good at writing SQL already. It can feel like handcuffs!

I kinda suspect that started the whole anti-ORM thing in response to the
ridiculous complexity of some frameworks.

I still use ORMs but I've decided that it has to be a light approach that
doesn't try to totally hide the SQL code from you.

------
islon
After I moved to Clojure and start to deal with raw data instead of objects I
remember all the pain of ORMs from my Java days… Jeff is right: objects are
overrated.

------
carsongross
ActiveRecord has done it about as well as I've seen, by just bailing to
strings when it got hard. Worse is better wins again.

~~~
EGreg
[http://github.com/EGreg/Q](http://github.com/EGreg/Q)

Check out the Db library.

------
EGreg
From my own experience, if you use an ORM, that is a strong indication you
might have been better off with a NoSQL db like Riak. Consider your access
patterns and partitioning requirements.

~~~
stormbrew
I don't think that's remotely true. People tend to use ORMs for much more
superficial reasons like not wanting to manually build SQL query strings
(which is valid, as this is error prone, but I still maintain that it's
superficial) or because the environment provides them. I've seen many apps
that are extremely well suited for a relational database but function terribly
because the ORM doesn't let them get at an actual relational model.

~~~
EGreg
So you just proved my point. Just because the ORM is there doesnt mean you
have to use it. There are libraries that generate relational SQL without ORM!

~~~
stormbrew
I don't think so? I agree that just because the ORM is there doesn't mean you
have to use it, but where I disagree is the idea that people incorrectly using
an ORM means they should be using an object or document database. They should
use a tool suited to their job, but often they're using an ORM because it's
been presented to them as a path of least resistance and a silver bullet.

You can't tell anything about what tool is appropriate for the job they're
trying to do from their lack of investigation.

