
What ORMs have taught me: just learn SQL - numo16
http://wozniak.ca/what-orms-have-taught-me-just-learn-sql
======
dicroce
Here's a thought experiment. Lets say we lived in a world without SQL and the
default way to talk to DB's was through an ORM....

And then someone came and said: "I created this concise and super flexible
language for querying data." Would people want it? I think they would, and
we'd see tons of articles about vast forests of objects being replaced by
small snippets of SQL.

~~~
Daishiman
When I look at SQL through the lenses of hindsight I see a language that's not
amenable to IDEs (it's harder to autocomplete columns if you must write those
before the table name, as an example), and has questionable and verbose
syntax.

While straight relational algebra is actually quite readable, despite all the
efforts of most the anti-ORM crowd, at the end of the day the business logic
that works on business objects is _much_ more important than the storage
backend, and thinking in terms of objects seems to be the preferred
alternative when reasoning in that context.

I've been doing backend logic for quite a while now, and I have to say that
the special features RDBMs offer are great for reporting, but reporting is a
very, very small part of what most code that interacts with the business
domain does. And while that may be because SQL is _great_ at aggregating data,
the fact that it can't easily be plugged with the rest of my business logic is
a huge impediment.

This wouldn't be a problem if my entire business logic were to reside in the
database, but real world applications interact with external APIs, regular
files, and a whole other bunch of stuff. So the fact that ORMs speak the
language of my business logic is a far greater advantage.

I also find the argument of inefficiency to be a strawman. Well-written ORMs
are quite explicit (and lazy) about what they're doing, and standard best
practices would dictate that you should be properly describing the scopes and
fields you're fetching when your rows become wide enough. But SQL demands that
too; you can trivially fetch * from a table. And SQL's limitations mean that
you don't have access to all the sweet abbreviations ORMs provide such as
scopes, custom query managers, aliases, built-in result caches, and being able
to avoid the worst cases of vendor-specific SQL.

I started using ORMs because I was tired of writing the same SELECT statement
with 10 slight variations, stored in a source file for a different language,
having to deal with row casting, and being unable to plug in simple code to
fetch related entities. SQL thus far has not advanced _one bit_ in this area,
and until someone comes up with a way to modularize the language so it can
provide those features, purists will still complain while most of us keep
using ORMs to avoid verbosity, bugs, and compatiblity problems.

~~~
walshemj
Sorry Toad or Work manager and the MYSQL tools are perfectly acceptable IDE's
for SQL development.

~~~
Daishiman
"Perfectly acceptable" doesn't reach the standard of type inference and
feedback that you can get with modern IDEs for their supported languages.
Visual Studio gives far more feedback for LINQ than SQL, and it's damn more
useful for debugging queries.

~~~
walshemj
I suggest you need to learn to code SQL and the procedural extensions for the
dialect in use then - and OO is not the be all and end all of development.

~~~
Daishiman
I have over a decade of writing SQL, thank you very much. I still stand by my
assertion and I there's no way I'll be writing procedural SQL for the use
cases I deal with on a daily basis.

------
terhechte
I used to write raw SQL for many years, then, around 2005 switched over to
ORMs in order to be able to target different databases, have a nice model,
etc. Lets be honest here, the ease of justing doing:

    
    
        p.username = "Carl"
        p.age = 33
        p.save
    

instead of "update users set username=:username, age=:age where id=:id" has a
ton of advantages. For one, some sort of syntax or type checker is actually
trying to understand your queries and makes it easy to find typos _before_ the
database laments in the middle of a huge transaction. Strongly typed languages
are even cooler here (most notably Slick for Scala, which has a fully type-
checked DSL for database querying which makes it really difficult to create
typos) [1].

However, the downside of this is that it only works as long as whatever you
need from your database is the lowest common denominator of database features.
I'm currently working on a Clojure + Postgres project and I'm using all the
bells & whistles that Postgres has to offer: HStore types, Json Types,
subqueries in subqueries, Upserts, functions, etc. I have a rather complex
database that I'm running analytical queries against, and using an ORM for
this would simply not work. I'd spend half my time trying to figure out how to
implement a certain Postgres feature in -insert-or-name-here. I actually
started with Korma [2], a simple Clojure ORM, and gave up because it was too
tedious trying to figure out how to get it to correctly run my 4-function
column result query.

I've since switched to YeSQL [3] which follows a very interesting idea: You
write queries in .sql files (full editor support, jay) and tag every query
with a name. YeSQL then reads these .sql files during compile time and
dynamically generates clojure functions with the correct amount of parameters
based on these queries. It is kinda the best of both worlds. Example:

queries.sql:

    
    
        -- name: get-users
        -- a function to retrieve the user by name
        select * from users where username=:name
    

clojure.clj:

    
    
        (defqueries "some/where/queries.sql")
    
        ;; now I can do:
        (get-users "carl")
    

I think it really depends on the use case. If the database model is simple and
there're no crazy database technologies in use, I'd rather go and use an ORM
again I guess.

[1] [http://slick.typesafe.com/](http://slick.typesafe.com/) [2]
[http://sqlkorma.com/](http://sqlkorma.com/) [3]
[https://github.com/krisajenkins/yesql](https://github.com/krisajenkins/yesql)

~~~
Fishkins
Thanks for sharing your experience. I've been meaning to try Slick, and YeSQL
sounds like a nice way to reduce some boilerplate with no real downside. I go
back and forth about how I feel about ORMs. I think everyone can agree you'll
need to learn SQL for any non-trivial project, even if you end up using some
abstraction on top of it.

On a tangent: you mentioned Upserts in Postgres features. I thought Postgres
didn't have any kind of Upsert. Was it added recently or something?

~~~
terhechte
Postgres indeed doesn't have Upsert yet, so I'm going the default way of
locking the table, and implementing it via a slightly more complex query. I
was just too lazy to explain that in my earlier comment. The problem is the
same: The syntax below can't really be represented well in a ORM.

    
    
        BEGIN;
        LOCK TABLE search_tracking IN SHARE ROW EXCLUSIVE MODE;
        WITH upsert AS (UPDATE search_tracking SET count=count+1 WHERE keyword = 'John Doe' RETURNING *) INSERT INTO search_tracking (keyword, count) SELECT 'John Doe', 1 WHERE NOT EXISTS(SELECT * FROM upsert);
        COMMIT;

~~~
NicoJuicy
But this aint that hard, i suppose it could also be done in Postgres (query
using MS SQL Server)

Table1 SET (...) WHERE Column1='SomeValue'

IF @@ROWCOUNT=0

    
    
        INSERT INTO Table1 VALUES (...)

~~~
_ZeD_
not atomic

~~~
sk5t
Thus transactions...

------
rosser
I've caught a lot of flak for saying this, but I'm convinced that all ORMs are
ultimately tech debt. Sure, they get you up and running quickly, but once
you're there, you'll invariably find yourself wanting to do things that
require you to work _against_ and _around_ your ORM to accomplish. By pretty
much any definition I've ever encountered, that's "tech debt"

~~~
lpolovets
(caveat: I'm a developer but I haven't used ORMs very much.)

Don't most ORMs let you write raw SQL when you really want to? In that case,
you could use the ORM for simple things, but revert to raw SQL when you need
more power. Or is that not the case?

~~~
endlessvoid94
Yes, but the ORM often influences the schema design. That can be very painful
down the road when you realize your tables are actually tables, rather than
instances of objects, which would be what your ORM led you to believe.

~~~
anko
I think the problem is not that an ORM often influences schema design, it's
that Relational Databases/SQL often influence application design.

People complain that an ORM isn't using a relational database effectively. The
greatest contribution of the rise of ORMs is that relational databases are
hard to use properly.

Bring on the ACID compliant document databases.

~~~
gaius
No you have it backwards. RDBMS are as they are because maths (relational
algebra and calculus). There is deep theory behind doing things this way. You
can put data in without needing to know how it will be accessed and used (and
vice versa). NoSQL just doesn't have this rigor. You have to tightly couple
what creates the data with what consumes it. THAT is just begging for trouble
down the line.

~~~
anko
That is absurd. Saying that using a certain data access api tightly couples
you to it is as crazy as saying using variables tightly couple you to ram (and
using ram is obviously bad!).

I honestly don't think you understand what tight coupling means.

You have data. To access the data you use an api. SQL is forcing you to use a
generalised API, which is very old, hard to use and more importantly; hard to
test.

If you actually want rigor in your SQL API, you use stored procedures. So now
you're maintaining two languages (SQL and stored procedures) in addition to
your application language.

For me, for most things, I just write a webservice which talks to whatever
database I want. That's the API I expose. Anything can consume the API as long
as it follows my RESTful spec.

With this architecture, I; 1) Don't have to struggle with SQL, making
development faster. 2) Don't need a DBA making development cheaper. 3) Get to
write in one language making testing a lot easier, which in turn makes quality
higher. 4) can scale easier, picking whatever data storage characteristics are
important to me.

Lastly, saying that NoSQL just doesn't have this rigor really makes me wonder
what you think of google's bigtable or amazon's simpledb?

------
lukaseder
Interestingly enough, no one ever listened to Gavin King (creator of
Hibernate), when he said that you shouldn't use an ORM for everything.

It is relatively easy to draw a clear line between using:

    
    
      * ORMs for domain model persistence (complex, stateful CRUD)
      * SQL for relational model interaction (complex, stateless querying)
    

Bottom line:

    
    
      * Don't use ORMs for querying
      * Don't use SQL for complex CRUD

~~~
GordyMD
Thank you for sharing this. Do you have a source for this quote? It completely
reflects my experience over the years of what makes sense given the relative
merits and pitfalls of each method.

~~~
lukaseder
Yes, the source is here:

[https://plus.google.com/+GavinKing/posts/LGJU1NorAvY](https://plus.google.com/+GavinKing/posts/LGJU1NorAvY)

------
mamcx
For long time, I have no idea that SQL & OO are not friends. I work in a
language where such problems don't exist. And it have nice ways to move data
between tables and data structures and objects (For example: SELECT..INTO
Array NAME).

Is FoxPro. Even the stored procedures were foxpro, all along the stack, from
UI to inner DB actions. Is a shame that this kind of programming is "lost"
today.

This is kinda like work with the postgresql main language everywhere, but the
language is not half-powerfull. You don't need a "database" language and a
"application" language. In fact, the whole experience was very good, and even
let some people to do his own apps (kinda like Acces) because the approach was
not intimidating, like VB, C or Python is (ie: Torwards do database apps).

So in short: Back them we don't use a ORM, we don't need it, we still have
nice syntax and helpers. I still try to backport some of this to my own code,
but the SQL is SOOOO restrictive that the effort stop fast.

~~~
makmanalp
You know, I remember visual foxpro coming with microsoft visual studio 6.0,
and not really understanding much about it. Care to expand upon what made it
awesome?

I found some stuff here but it doesn't really explain it well:
[http://www.foxprohistory.org/articles_4.htm](http://www.foxprohistory.org/articles_4.htm)

~~~
mamcx
It shine for business apps (aka: the most common of it).

Is more "batteries included" than python. You can do a full app with just Fox,
because it include reporting, database, database designer, form builder, menu
builder, OO, procedural, SQL, a _really good_ grid. Seriously, I have only
know of 2 great grid controls: The VFP one and the DevExpress for Delphi one.

Have a truly good grid was something damm useful. Manipulate data was easy,
you could make a app only around the grid!

But all of this included.

And this could be made to work for the _end user_. So, your reports could be
designed by a end-user. Or build forms. Or menus. Similar to lisp where
everything are the "()" in fox everything was a table. A form? a table. A
report? a table. A class? a table (if designed visually. You can make
everything in text like a regular program too).

You can make dynamic the whole app if the developer learn to use the language
fully.

Was also relatively easy to grasp. A power user could work on fox with a dozen
of commands (and remember: The grid, like a excel spreadsheet, was enough for
a lot of task), but in contrast with acces, you can "upgrade" the mess to be
more professional.

Is simple for a programmer, too. My first programming class was with foxpro,
and at the end of it we have made a FULL app with menus, forms, tables,
reports and mini programs.

The closet thing today? Perhaps the django admin, but it is far less powerful
and constrained.

The weak of fox was that MS prefer to push Acces (and both fight for the same
kind of user), it neglect the DB engine pushing for sql server and, well,
neglect the language.

------
pdeuchler
Most of the time I see arguments like this the author probably has an
inconsistent and de-normalized schema. With the caveat that all growing
systems gather warts, and yes, it's nigh upon impossible to keep your data de-
normalized at scale with sufficient complexity, an ORM coupled with a decently
designed schema is an unbeatable combination. Not only do you get fast
development, code re-use, and easy benchmarking but you also get all the
knowledge and expertise that's been baked into the ORM. Most ORM's now come
with miles of security features built in, and are able to infer and optimize
on complex queries and joins better than your average non-DBA developer
(assuming your data size isn't in the millions of rows, per table, with < 3
tables to join... aka 90% of all queries).

Of course general solutions sometimes can't match up with tightly coupled,
highly optimized, extremely specific hand crafted SQL, and duh, making things
easier can encourage bad practices, but that's just parroting tautologies and
ignoring all of the benefits you get with an ORM. In fact, good ORM's will
even provide tools to allow you to construct custom queries in the same scope
as an ORM query, allowing you to have your cake and eat it too. Why would you
throw out the baby with the bathwater when you can just rewrite your most
egregious 10% of queries in SQL, while allowing the rest of your app to
merrily chirp on? Hopefully one day someone super smart can create a brilliant
ORM that incorporates machine learning, whizbang functional tools, and _insert
trendy something or other here_ and we can all relegate SQL to the status of
"DB assembly" where it belongs. I, for one, am embracing an era where devs no
longer have to shlep around arcane DBA wisdom tidbits in comments above
grotesque SQL queries. Why would you ever reject high level abstraction over
tedious minutiae?!

P.S. This is all ignoring the ease of which you can switch database
backends/engines with ORM's, which could be a 500 word comment all on it's own

------
nmjohn
> A recent count of one such table in my work resulted in over 600 attributes
> and 14 joins to access a single object, using the preferred query
> methodology

At some point between inception and having 600 columns someone had to have
stepped in said is this necessary? Maybe I haven't worked with big enough data
sets, but to me that number seems insanely high - and I really don't see how
the ORM would be any worse than raw SQL - if they were equally optimized.

It sounds to me like best practices SQL are being compared with worst
practices ORM. Now many ORMs, by default, may lead a developer to use bad
practices, which I will concede is a problem.

But in optimizing a query with SQL you only select necessary fields - and
don't use any more joins than required for the data you need - you only gain
over the ORM if you didn't limit the scope of the object you asked the ORM for
and instead asked for the full object.

> Attribute creep and excessive use of foreign keys shows me is that in order
> to use ORMs effectively, you still need to know SQL.

Well there is your problem! Thinking you can completely ignore the underlying
database mechanism of an abstraction layer is a little naive - if you want to
maintain maximum performance. Though bad marketing of some ORM's may claim
differently which is a problem as well.

I agree, there is a potential downside to ORM's. Raw SQL likely will get you
better performance almost 100% of the time. Is that necessary? Well, it
depends. If the ORM is being used with best practices, I'd say until you're at
the scale of billions of pageviews a month, it'd be negligible. However if
you're essentially doing a SELECT * and JOINING on all related tables just to
get one field from the Users table - then the SQL will win hands down.

~~~
mgkimsal
"It sounds to me like best practices SQL are being compared with worst
practices ORM."

AMEN.

I've written bad SQL and good SQL. I've written bad ORM stuff, and good ORM
stuff. You don't learn how anything is bad until you make mistakes and learn
from them. Ideally you learn from others' mistakes too, but some things you
just end up having to internalize through experience.

I still take a decent ORM over raw SQL for 90% of the work I do, because much
of it is boilerplate/repetitive stuff. Knowing something else will handle
escaping and basic relations for me without a whole lot of boilerplate behind
it is great, then I write some SQL by hand when I hit a wall with the ORM
(complexity or performance). If you're trying to fit 100% of every single data
query in to an ORM, and you bend the data too much to fit the constraints of
the ORM tool, yes, there will be problems. But that's sort of just common
sense - once you feel you're forcing a tool to do something it's not suited
for, back up and ask if there's a different way to get your results; doesn't
have to mean throwing out the whole tool.

I've worked with plenty enough stupid "raw SQL only" projects to know the real
answer to all this is experience and knowledge, vs just following 'one true
path' regardless of your ability to understand it. Ever had a user table with
190 columns, named "is_usa", "is_argentina", "is_germany", etc, one for each
country, so the developer could determine what country someone lived in? Every
single request, 190+ queries: "select is_argentina from user where id=5",
"select is_iran from user where id=5", etc. But hey, it was done by hand - no
evil ORM to hinder the awesomeness of raw SQL, right?

------
TeMPOraL
> _I 've found myself thinking about the database as just another data type
> that has an API: the queries. The queries return values of some type, which
> are represented as some object in the program. By moving away from thinking
> of the objects in my application as something to be stored in a database
> (the raison d'être for ORMs) and instead thinking of the database as a
> (large and complex) data type, I've found working with a database from an
> application to be much simpler. And wondering why I didn't see it earlier._

Exactly. ORMs feel to me like a result of trying to stick together two
different abstractions with a duct tape and hoping it will save us from
writing code.

The two abstractions would be:

\- #1 Database as "just another data type that has an API", i.e. what is
stored in the database and what are your business objects are _two different
things_ and should be explicitly separated. ORMs break this by Active Record
pattern.

\- #2 Database as invisible persistence layer that automagically stores the
state of your business objects, in the same way as garbage collector
automagically takes care of reclaiming unused memory. ORMs break this by
leaking SQL all over the place.

Getting #2 and not having to deal with RDBMS explicitly would lead to a great
simplification of code, however we seem to be nowhere near that with current
libraries (I heard some rumors that some Common Lisp Metaobject Protocol magic
can get close, but I need to see this with my own eyes). So until that time I
try to stick to abstraction #1. I do use Active Record, as it simplifies
things, but try to explicitly avoid treating ORM objects as business objects.

------
meritt
ORMs are great for simple CRUD operations. As soon as you want to do anything
mildly complex or desire efficiency, you need to write SQL.

As long as the ORM helps me with the former and gets out of my way for the
latter, I'm totally happy to utilize then.

~~~
silverbax88
I agree with this, but SQL is also great for writing CRUD, and 99% of database
interaction is CRUD.

~~~
spacemanmatt
That may be true by frequency of query execution, but not by frequency of
appearance in an application.

------
bkeroack
Another problem with ORMs is that they make performance diagnostics much
harder. DB-side, you might have a list of worst-performing queries and
examining it reveals a huge, hundred-line monstrosity of a query. Because the
queries are ORM-generated and are not usually very readable it isn't exactly
clear which part of the application is generating it (or why). Even further,
if your DBA says you could make the query more efficient by changing X and Y,
it's not likely that you can take her advice since the query is
programmatically generated! So you have the choice of abandoning the ORM in
performance-sensitive paths and inserting raw SQL (making the code ugly) or
living with poor performance.

I'm a fan of abstractions only when necessary. I want to have control over
100% of the code stack. When performance issues crop up (and they inevitably
will), I don't want to just shrug and go "well, Hibernate must be doing
something weird..."

~~~
jurre
This is simply not true, lots of tools are able to tell you exactly which line
of code is responsible for which query.

~~~
bkeroack
Such as? Using ASP.NET and NHibernate, I've not seen any way to do this other
than old fashioned log statements and guesswork.

~~~
daigoba66
[http://www.hibernatingrhinos.com/products/nhprof](http://www.hibernatingrhinos.com/products/nhprof)

Worth every penny. You get full stack traces and line numbers automatically
with each query.

------
_RPM
Personally, I see value in learning the actual "protocol" that the ORM's are
abstracting. For instance, it is convenient to use an ORM if you understand
that it is just a large abstraction of the actual syntax sent to the database
server. People learning ORM's before SQL have a disadvantage though, in my
opinion. Because without learning the SQL required, they are tied down to
their particular ORM implementation. It is kind of like learning jQuery before
JavaScript.

------
makmanalp
Over and over I keep finding that _just_ an ORM is not enough, but raw SQL is
hideous in a different way.

ORMs map nicely when you are indeed modifying objects, but somethings don't
map well that way. So don't map them that way! What we need is a low level
abstraction layer alongside the ORM.

The main problem with raw SQL is that what you really want is a genuine
programming language. You almost want programmatic access to the SQL AST, so
you can generate syntax as opposed to concatenate strings together. Kind of
like a DOM API, but for SQL.

Let me give you an example from what I'm working on now. So let's say you have
a generic query for fetching monthly aggregated import / export values between
countries. Sometimes you want to filter by one country. Sometimes you want to
filter by 10 countries. Sometimes you want to use the column containing the
inflation-adjusted value instead of the regular column. Sometimes you want the
average export instead of the total export.

With a low level abstraction layer, I can do stuff like
already_complicated_query.filter(another_param==5). Or I can write a function
that does get_world_trade_aggregate(country="USA", aggregate="average") and
it'll generate the right query for me. But then that's not even all, if I have
to modify or filter that query further in some other part of the code, in some
unexpected way totally doable. It's less often that I have to write a whole
new function or duplicate code.

So the point isn't the abstraction, it's the fact that it makes queries
modifiable and composable, which is something you just can _not_ have in raw
sql. So given a query, I can decide to add something to it or change it
without having to change the text based SQL. You know how it's good practice
to split your 1000 line function into smaller, modular pieces? With this you
can do that with your ginormous SQL query too. AND it's (well, almost)
guaranteed to generate correct and safe SQL.

This also makes everything much more maintainable. E.g. If I change table
names or migrate fields, often it's a non-issue and there is a single place I
need to change that. But with raw SQL you have to go in and do a ctrl-f and
hope you catch everything, but often you don't because someone decided that
that field name should come from a variable or something.

Raw SQL spits into the face of everything we've learned about programming
languages in the past 60 years.

Anyway, the neat thing is that this magical system exists, as the SQLAlchemy
Core
([http://docs.sqlalchemy.org/en/rel_0_9/core/tutorial.html](http://docs.sqlalchemy.org/en/rel_0_9/core/tutorial.html)).
Enjoy!

~~~
snarfy
>The main problem with raw SQL is that what you really want is a genuine
programming language. You almost want programmatic access to the SQL AST, so
you can generate syntax as opposed to concatenate strings together. Kind of
like a DOM API, but for SQL.

What you described is basically EF+LINQ.

    
    
        var query = context.Users.Include("Users.Group").Where( u => u.UserType 
        == UserTypes.Basic).Select( u => u.Group);
    

That's going to spit out a SQL statement that joins Users to Group and selects
all groups where the UserType is UserTypes.Basic. You can even do
ToTraceString() to see the generated statement.

~~~
virtualwhys
The same in Scala + Slick would be:

    
    
        val q = Users.filter(_.userType == UserTypes.Basic).map(_.group)
    

Although I'm not sure what the context.Users.Include bit in the EF example is
doing. Both will generate statement at compile time, and I assume EF queries
are composable as is the case with Slick.

Slick's readability does suffer though with more complex queries --
unfortunate they strayed away from SQL semantics in favor of Scala
collections. We'll see where it goes...

~~~
snarfy
Include() on an ObjectContext will return the corresponding foreign key
objects. It's a poor man's join to related entities. There is also Join(), but
that will join by anything.

~~~
virtualwhys
Ah ok, I see. Slick actually does have a nice shorthand for fkey joins:

    
    
        val q = for{
          ur <- UserRole; u <- ur.user; r <- ur.role
        } yield(ur,u,r)
    

which would produce semantic equivalent to:

    
    
        select ur.id, ur.a, ..., u.id, u.a ...
        from userrole ur
        join user u on u.id = ur.userId
        join role r on r. id = ur.roleId
    

grouping and sorting is where Slick becomes less readable SQL-like DSL and
more boilerplate DSL full of meaningless tuples O_o

------
jstsch
A good ORM is not a substitute for SQL. They help you with mundane things and
you can still write SQL if you want. I like the approach of RedBeanPHP
(www.redbeanphp.com).

~~~
mgkimsal
+1 for redbean - most of my colleagues hate it, but I've found it's a good
balance between getting rid of a lot of boilerplate, but not getting in the
way when I need straight SQL.

------
JohnBooty
I wrote "raw" SQL for many many years before using ORMs, so I feel a lot of
this guy's pain and agree with most of his points. _Especially_ the part about
still having to know SQL even though you're using an ORM.

Not sure I understand his solution to this one, though!

    
    
      Window functions are relatively advanced SQL that is painful 
      to write with ORMs. Not writing them into the query likely 
      means you will be transferring a lot of extra data from the 
      database to your application.
    
      In these cases, I've elected to write queries using a 
      templating system and describe the tables using the ORM.
    

Templating system? Whaaaat? Here's what I do in ActiveRecord + Postgres and/or
SQLite. (Edit: OK, I think I understand! Other people in this very HN thread
have discussed templated approaches, like
[https://news.ycombinator.com/item?id=8134170](https://news.ycombinator.com/item?id=8134170)
)

1\. I stick to ActiveRecord whenever possible for the simple stuff. It's
really good for like 75% of what I want to do. Being able to chain scopes and
things is neat and a good example of something you can't easily do in SQL.

2\. When I need to work some SQL magic that ActiveRecord can't do (or can't
_easily_ do) that's when I write a SQL view or function. So my class might be
something like:

    
    
      # Example is Ruby+ActiveRecord but concepts should be applicable elsewhere
      class SomeClass < ActiveRecord::Base
    
        def some_simple_method_that_uses_activerecord
          SomeClass.where(foo: bar).order(:baz)
        end 
    
        ...
    
        def some_method_that_needs_sql_magic
          SomeClass.find_by_sql "select a,b,c,d,foo,baz from some_crazy_view_or_table"
        end 
        
      end
    

This has been working pretty well. For the simple CRUD stuff that ActiveRecord
is good at, I use ActiveRecord's built-in query interface. For the other
stuff, it's easy to write a SQL view/function and simply reference that. It's
not _perfect_ because obviously you risk having too much of your app logic
live in the database layer... although, then again, I feel equally icky about
having too much of the data logic live in the app layer.

~~~
plaguuuuuu
Let's talk practicality rather than sitting in some ivory tower and muttering
about best practices

With AR, if this is the worst case where you write some raw SQL, what's the
alternative? The alternative seems far more painful pragmatically speaking and
this, while being a little ugly, seems to work just fine without impacting
productivity or performance.

~~~
JohnBooty

      > With AR, if this is the worst case where you write some
      > raw SQL, what's the alternative? The alternative seems
      > far more painful pragmatically speaking and this, while
      > being a little ugly, seems to work just fine without
      > impacting productivity or performance.
    

Do you know if other ORMs allow this kind of relatively painless use of "raw
SQL?"

I've only used ActiveRecord and some of the .NET "micro ORMs" like Dapper and
PetaPOCO.

Dapper and PetaPOCO definitely support raw SQL. That's kind of their main
focus - they take a row of SQL results and map those database columns to your
class's properties, and by design they don't do a whole lot else.

But I don't know about other big/popular ORMs like Hibernate...

~~~
Hakeashar
Even the bulkiest ORMs allow you to use raw SQL. That's why you can use 80 -
90% of the features on the regular basis and hand-tweak regions which cause
performance problems or places where you just _have_ to write SQL (e.g.
recursive queries).

In EF, there's either: Database.SqlQuery<T> \- [http://msdn.microsoft.com/en-
us/library/gg696545%28v=vs.113%...](http://msdn.microsoft.com/en-
us/library/gg696545%28v=vs.113%29.aspx)

which can return _any_ object or: DbSet<T>.SqlQuery -
[http://msdn.microsoft.com/en-
us/library/gg696332%28v=vs.113%...](http://msdn.microsoft.com/en-
us/library/gg696332%28v=vs.113%29.aspx)

which returns tracked entities, so you can write raw SQL (e.g. call a stored
procedure) and just use the 'mapper' part of the framework.

NHibernate has CreateSQLQuery - [http://www.nhforge.org/doc/nh/en/#querysql-
creating](http://www.nhforge.org/doc/nh/en/#querysql-creating)

I like micro-ORMs, but when you want to skip writing tedious INSERT or UPDATE
queries, you have to add extensions to them (at least to Dapper); that, and
SQL strings do not really lend well to refactoring and type safety...

~~~
JohnBooty
Oh man, that's great to know. Thank you for that reply!

~~~
Hakeashar
No problem! Happy to help :)

ORMs get a lot of flak and while some of it is truly earned, the rest comes
from the misuse/abuse of the tool. I always thought that using ORM
functionality (where convenient) _together_ with SQL (where necessary or
convenient for different facet of the application) was the best from both
worlds.

And then you can of course mix different ORMs in one project, so you can use
EF in areas where performance does not really matter that much or if you're
doing a lot of CRUD and Dapper (or something like Insight.Database if you like
stored procedure-to-interface mapping) in hot paths or analytic-heavy piece.

------
Joeboy
In Django, for me the killer feature of the ORM is that it's (mostly) database
agnostic, which means that you can use Postgres in production and in-memory
sqlite when testing, which makes testing a gajillion times faster.

If you start writing custom SQL you have to introduce horrible bodges to work
with whatever database is in use.

~~~
matdrewin
Database portability is a poor argument for using ORM in my opinion.

In the real world, most apps don't change database engines during their
lifetime. You wait for the rewrite and then you pick a new engine.

If you stick to ANSI compliant SQL, you should be fine when porting over
databases. It's not perfect but it'll get you most of the way.

~~~
xienze
> In the real world, most apps don't change database engines during their
> lifetime.

Perhaps not, but a lot of times you may have to support multiple database
engines at once (this is more true of e.g. middleware than an end-user
application obviously).

------
dkrich
I definitely agree with the crux of the article (that it's important to know
SQL if you want to program with an ORM), but not for the same reasons as the
author. The reason I agree that it's important to know SQL is that not knowing
SQL will in some ways limit what you can do with your applications. This may
never affect you based on your use cases, but there may be times when it does,
and not knowing how to write queries will lead you to believe that things
can't be done with your application that could be.

In the article, it sounds like he's building some reports from a database with
a framework, and is challenged because there are massive amounts of data that
work best for these specific reports in an unnormalized layout (this is where
the attribute creep comes into play). If you use an ORM to try and build these
reports on the app side as opposed to the database side, then you will
struggle.

However, my take on this is pretty simple. Don't build the complex logic in
aggregating the report data on the app side. Use materialized views on the
database side, and use your app to do a dumb retrieval of the data from the
view with the ORM. The complexity will lie in the materialized views, so your
ORM queries can actually still be quite simple and your performance won't
suffer.

Obviously, this route would require knowledge of SQL to implement, which
brings me back to my original point. ORM developers should have an
understanding of SQL.

Long story short, for specific use-cases, it's good to know SQL. For most it
doesn't really matter.

------
sida
I think the problem is when ORM influences/encourages particular schema
designs. When you no longer see tables as tables (which is storage) and rather
see your database tables as instances of objects (how you would like to
consume the data)

ORM (rails/AR in particular) makes it very diffcult to work with joins and
build an object that read from multiple tables.

One workaround I think is to use database views. And see views as "instances
of objects" and back ORM classes with them.

~~~
twic
Java's JPA interface, of which Hibernate is an implementation, makes it pretty
simple. Straight from the documentation of EclipseLink (another
implementation):

    
    
      // query for a primitive
      Query query = em.createNativeQuery("SELECT SYSDATE FROM DUAL");
      Date result = (Date)query.getSingleResult();
    
      // query for a pair of primitives (bletcherous, but not disastrous)
      Query query = em.createNativeQuery("SELECT MAX(SALARY), MIN(SALARY) FROM EMPLOYEE");
      List<Object[]> results = query.getResultList();
      int max = results.get(0)[0];
      int min = results.get(0)[1];
    
      // query for a mapped object
      Query query = em.createNativeQuery("SELECT * FROM EMPLOYEE", Employee.class);
      List<Employee> result = query.getResultList();

------
Lavinski
I hardly use an ORM anymore these days, however this article has a few issues.

First even when using SQL the author runs into the problem where splitting the
database into two and having a reporting database would be much more efficient
instead of having one database trying to meet all your needs. When you do
writes you want transactions and 3rd normal form but when reporting 3rd normal
form becomes a downside. This applies to attribute creep and data retrieval.

Second the dual schema problem is one that I think most ORM users know how to
avoid. I generate the schema from the code directly, maybe with a little bit
of fluent migrations to help move data.

The issue with transactions is a strange one. Ideally this is handled as a
cross cutting concern in your application. This means it's consistent and
transactions can be explicit and predictable. I'd do the same thing for any
application.

The biggest issue here is I think the author has chosen the wrong tool. This
application sounds like it would be well suited for event sourcing. I'm not
going to go into it here but they solve these issues in an interesting way.
Plus the data is event-based anyway.

------
earino
My favorite pattern, which I use on almost all personal code is the
phrasebook, implemented here in perl: [http://search.cpan.org/~rani/Class-
Phrasebook-0.88/SQL/SQL.p...](http://search.cpan.org/~rani/Class-
Phrasebook-0.88/SQL/SQL.pm)

Basically it's "give your SQL a symbolic name, define how your code interacts
with that SQL, and use it as such."

Mind you, ORMs can be awesome, but in my experience ORMs are fantastic at
implementing the parts of an application that aren't the competitive advantage
of your application. The parts that actually make my code special, the parts
where fundamental assumptions regarding codes interaction with data fall
apart... those are the parts where ORMs don't particularly shine, especially
as time moves forward.

------
Xorlev
For those of us on Java, I've grown to love
[http://jdbi.org/](http://jdbi.org/). JDBI has a lot of features that are
convenient (e.g. auto-mapping of columns to a POJO), but synthesizes DAOs for
you from interfaces annotated with SQL queries, e.g.

    
    
      @RegisterMapperFactory(BeanMapperFactory.class)
      public interface TripDAO {
          @SqlQuery("SELECT trip_start AS start, MAX(timestamp) AS end FROM location_updates WHERE trip_start = :start GROUP BY trip_start ORDER BY trip_start DESC")
          public Trip trip(@Bind("start") Date tripId);
      }
    

And allows you to plug in providers for mixing in Optional types,
PGgeometries, whatever you can fit into an interface.

~~~
beermann
I've been using JDBI for a new project for about a month or so and I like it a
lot as well. I will say that the documentation is a bit sparse though, or at
the least hard to find by digging through blog posts. When I figure out how to
do what I'm trying to do however, I love it.

For example, it took me a while to find out that @CreateSqlObject was a thing
and that it solved any hackery I was trying to do to have my DAOs reference
one another. Or if you google "jdbi transactions," you aren't led to a page
that actually shows you how to use @Transaction. I feel like the only thing
holding back widespread use is some better documentation.

~~~
Xorlev
I'll agree with that. I had to really dig through the user group to find
answers to a lot of things. e.g. @SingleValueResult if you want to return an
Optional<Item> from a query.

I didn't know about @CreateSqlObject, thanks!

------
marcosdumay
"I much prefer to keep the data definition in the database and read it into
the application. It doesn't solve the problem, but it makes it more
manageable. I've found that reflection techniques to get the data definition
are not worth it and I succumb to managing the redundancy of data definitons
in two places."

My experiency is almost the oposite of that. I've found that automatic
migrations are one of the best features of ORMs (only on par with getting
reflection on the data structure), and the OO syntax much better for defining
data structure than DDL.

Nowadays I favor ORMs even when I don't use the OO data interface, just
because it's easier to maintain.

~~~
hamburglar
Do people actually use ORM-generated migration in large production
environments? In my experience, schema mods to big production tables need to
be planned out carefully and run by hand on the inactive master db (using
mysql mmm replication terminology) in order to pull them off without downtime.
You can't just have the next app push blindly start doing ALTER TABLE
statements. Perhaps there are clever schemes modern ORMs use to package up the
migrations to be run on the inactive master? I honestly don't know, but I had
the impression that automatic migrations via the ORM were more or less
unusable for anything nontrivial.

~~~
marcosdumay
To be honest, I never tried migrating without downtime[1].

I don't see any reason why applying the changes in a inactive mirror would be
dificult. They are fit for diverse environments, since you'd apply them first
at the development machine, then testing[2], and only then at the live system.
The main issue is what to do with the data generated while you migrate, but
you solve that the same way you solve a normal hand-made migration, you ask
the ORM to create the operations at the right order, and divide it in batches.
It's just more work than a couple of minutes of downtime justified for me.

[1] Postgres has no problem doing some ALTER TABLE statements at the live
mirror, what makes most migrations trivial, ORM or not. I mostly do those
live, but monitor closely because it's dead-lock prone.

[2] I think data migration is the only reason I create separated testing
environments nowadays. Otherwise, I try to keep development envs complete
enough for tests, and just recreate them as needed. It makes everything much
faster.

------
thirdreplicator
Hibernate can almost be used as the definition of pernicious

[http://www.merriam-webster.com/dictionary/pernicious](http://www.merriam-
webster.com/dictionary/pernicious)

"causing great harm or damage often in a way that is not easily seen or
noticed"

The most pernicious thing about Hibernate is the "caching feature" (read as
"time-bomb") layer that doesn't write objects right away when modifying an
object. So instead of immediately seeing changes in your database, they
gradually creep in from other application instances as the write-timeout
expires, slowly corrupting your data without any errors occurring. Great idea
guys.

~~~
kovrik
So true!

With Hibernate there are a lot of things happen under the hood. And all you
can do is either struggle with Hibernate, or hope that those under-the-hood-
things won't have a strong impact on your app's performance.

------
notme_
I agree with Woz. Any developers working with RDBMS should learn SQL. ORMs are
just wrappers for SQL statements. They are not the same thing, and knowing one
without the other means your skills are weak.

Many comments in this thread are painful to read.

------
manicdee
Learn to recognise anti-patterns and deal with them ASAP:
[http://pragprog.com/book/bksqla/sql-
antipatterns](http://pragprog.com/book/bksqla/sql-antipatterns)

Also, learn about red/green/refactor. Refactor mercilessly. Learn both ORM and
SQL, understand the technology you're working with.

The "wide tables" problem is easily addressed through "associated data"
tables. Don't store everything in the one model. You have one model which is
for computation, searching, comparison, etc. You have another model which is
for storage of stuff that is only required at report time. This is a ORM-level
of "partitioning". Just avoid "fetch-related" style ORM requests and you're
golden.

It's really worth reading whatever references are available on tuning your
particular database for performance. You can take the advice onboard and let
the performance advice contribute towards your ORM design.

The SQL-templating approach is tempting, but it will very quickly lead to
problems as someone adjusts a template and adds or removes parameters. Do you
have tests in place to ensure that the SQL templates still function as
expected? How tightly bound is your SQL to one particular vendor's product?
Been there, done that, bought the t-shirt. Did you know that MySQL will
automatically update the first TIMESTAMP column any time that row is updated?
It doesn't even ask!

~~~
morgo
The non-standard TIMESTAMP behavior is deprecated btw, and can be disabled via
--explicit_defaults_for_timestamp:

[http://dev.mysql.com/doc/refman/5.6/en/server-system-
variabl...](http://dev.mysql.com/doc/refman/5.6/en/server-system-
variables.html#sysvar_explicit_defaults_for_timestamp)

------
johan_larson
The worst part about ORM is that there are so many systems for doing it. Out
of six employers, two had their own handrolled ORM systems.

I think what's going on here is that elementary mapping of objects to and from
relational tables is really easy and obvious, so people go ahead and implement
it. Unfortunately once you get past the easy bits and have to worry about
breaking objects across tables, foreign keys, transactions, caching, and other
fun stuff, you have a great big mess and you probably messed up some of the
edge cases.

------
callesgg
The main problem with ORM's is that they keeps the database model in the code
rather than in the database. I know that is what some people like, i myself
USED to think that was nice.

What i eventually learned was what Linus Torwalds said:
[http://programmers.stackexchange.com/questions/163185/torval...](http://programmers.stackexchange.com/questions/163185/torvalds-
quote-about-good-programmer)

As long as your data structures are good and clear the code to handle them
seams almost obvious.

~~~
ak39
Here's the quote: "Bad programmers worry about the code. Good programmers
worry about data structures and their relationships."

Wow, thanks for posting the excellent link.

------
zak_mc_kracken
> for me, ORMs are more detriment than benefit

Well, that's not a great start to the article. There are dozens of different
ORM's, so unless you get more specific, this makes as much sense as saying
"For me, programming languages are more detriment than benefit".

All ORM's are not equal and they all operate at various levels of abstraction.
Some are very close to SQL, others sit very high on the abstraction chain, and
each project might benefit from a different one based on a lot of criteria.

------
junto
This is why I love Dapper micro-ORM in the .NET world. It gives you the best
of both worlds.

You are still close to SQL, but you get objects back. Started by Sam Saffron
(StackOverflow) and used by StackOverflow themselves, it is fast, well
written, concise and easy to use.

Basic usage:

    
    
       IEnumerable<Account> resultList = conn.Query<Account>(@"
                        SELECT * 
                        FROM Account
                        WHERE shopId = @ShopId", 
       new {  ShopId = shopId });
    

Performance of SELECT mapping over 500 iterations - POCO serialization:

    
    
       Hand coded (using a SqlDataReader)	47ms
       Dapper ExecuteMapperQuery<Post>	49ms
       ServiceStack.OrmLite (QueryById)	50ms
       PetaPoco				52ms
       BLToolkit				80ms
       SubSonic CodingHorror		107ms
       NHibernate SQL			104ms
       Linq 2 SQL ExecuteQuery		181ms
       Entity framework ExecuteStoreQuery	631ms
    

\- [https://code.google.com/p/dapper-dot-
net/](https://code.google.com/p/dapper-dot-net/)

\- [http://www.tritac.com/bp-24-dapper-net-by-
example](http://www.tritac.com/bp-24-dapper-net-by-example)

\-
[http://en.wikipedia.org/wiki/Dapper_ORM](http://en.wikipedia.org/wiki/Dapper_ORM)

\- [http://code.google.com/p/dapper-dot-
net/source/browse/Tests/...](http://code.google.com/p/dapper-dot-
net/source/browse/Tests/PerformanceTests.cs)

------
guscost
These annoyances are familiar enough, but I think that the author is
underselling the benefits of this kind of abstraction. For one project I'm
currently using an ORM (NHibernate) with several patterns depending on the
requirements and the context:

\- With single-table queries and simple joins, having a repository available
means that these are a joy to write and maintain with LINQ. And with some
preparation the mappings can prevent all sorts of runtime issues.

\- Complicated queries which might be tricky to generate optimized SQL for can
be written in HQL, meaning that the data access code does not have to be
coupled to one brand of persistence store.

\- Really quirky stuff that only works in Oracle, for example, can be written
raw with CreateSQLQuery(). Here the ORM is basically functioning as an
extremely bloated DB session manager, but this is not very often.

I definitely agree that familiarity with the database query language is needed
to build non-trivial applications, but at the same time I would not enjoy
building many of those applications without the option to query a table in one
line of type-safe code. Some folks might say that I'm not using all the
features of the ORM unless I have every single foreign key relationship mapped
out to lazy-loaded collections and such, but I'm not really comfortable enough
with the technology to want to do that, and I don't really mind the extra bit
of housekeeping required to handle those cases.

Don't get me wrong, if you must prioritize raw performance or need all the
flexibility of your query language, the right decision might be to ditch your
ORM. With my current project, stability and maintainability are much more
important, so I'll be sticking with the convenient tools for now.

------
Smudge
Use the right tool for the job. If accessing a meaningful subset of your data
requires 14 joins and returns 600 attributes, your data model probably isn't
going to be friendly to many ORMs. Just write the SQL, maintain the sprocs and
tvfs, and give up on trying to fit everything neatly into an object-to-table
mapping.

But if you're tired of rewriting the same, CRUD-like SQL over and over again,
and want a common framework for accessing similarly structured data, you'll
inevitably write an ORM (however minimal) and then you're facing the same
problems that every ORM has attempted to solve to date.

The biggest danger in using an ORM is that it will constrain how you structure
your data model. It took me a long time to become comfortable with using
(materialized) views to back my models, or using multiple models for a single
table. But being able to divorce my data model from my conceptual model has
helped me immensely, in terms of avoiding a lot of the problems the author is
talking about. Rails' ActiveRecord is surprisingly friendly when it comes to
these sorts of things.

------
jacobsenscott
Like anything, you always need to learn at least what is going on one layer
below the abstraction layer you are working in. So if you are using an ORM you
need to learn SQL. If you are using ruby or python you need to learn what the
interpreter is actually doing. If you are using C you need to know what the
machine code is doing, etc.

------
julie1
Well I love SqlAlchemy, and DBIx::Class

I also like true RDBMS.

The most important feature of _R_ dbMs is the Relationships for me and the «on
delete/on update» combined with FKI and constraint. It makes possible to have
complex in relationship that are consistent and that can be added/removed
without destroying the consistency of the state in the DB. And most ORM only
people I know overlook this part, which is a must for transactionality. Your
data stay consistent other time.

Also, people tend to think querying with ORM are cool. They often query DB as
random access record. However for me complex query are more like intersection,
symmetric difference, union of records based on their joins. And SQL is so
good at it.

ORM can have hook on transactions, making it possible to call a webservice on
commit and rollback.

ORM can have mixins to help with having password handled carefully.

So, what is wrong at my opinion, is neither SQL nor ORM but the idea that only
one is enough to learn.

------
zoomerang
The problem with raw SQL queries is that they don't compose. Using an ORM, I
can do two things.

1\. Pass around query objects, and build queries derived from others. I can
also combine multiple queries into one and split the results.

2\. Update model records in multiple places passing them through several
layers of business logic before serializing.

This is on top of the other obvious benefits of ORMs, such as abstraction over
my storage engine. I can write a single 'query' that be be executed against a
variety of SQL servers, Salesforce, MonoDB, an in-memory cache, or whatever
else I want to do with it.

As a real-world example of why this matters - On my current project, I have a
heuristic algorithm that allocates pending jobs to workers that meet the
required skillset. As part of this, I have individually defined rules that
must be met. As part of the operation, each rule can query the database for
relevant results.

Each rule is a standalone module, and by design cannot have any knowledge of
other rules. Rules can be written by third parties and are plugged in and
loaded at runtime. To make this work, we can either

1\. Run one query per rule, hitting the database more than needed 2\. Compose
all the rules into one query, hitting the database once.

Using an ORM, I'm able to take a base query and fold it through all the
available rules, allowing each one to return an updated query to be merged.
Some rules conflict - they may require results that another rule has filtered
out. To solve this, the ORM will automatically detect the conflict and build a
single 'Query' object that compiles to a broader query behind the scenes,
takes the results, stores them in an in-memory cache, and then runs the
individual rules in-memory against them to get a final resultset. In the worst
case scenario where this is not possible, it will compile to the minimum
possible number of SQL queries to satisfy all of the individual callers.

As a result, each rule can run a different query against the database, getting
the individual resultset it wanted, while not hitting the database so heavily.

Why not just query multiple times? In this case, we're running this against a
Salesforce database. On top of the fact that you pay per API call, there's
anywhere up to 2 seconds of latency before getting a result. Composing the
queries means we take an operation that might have taken a few minutes and
used a lot of expensive API calls into an operation that takes a few seconds
and uses 1 API call.

At the end of this I get a resulting immutable object. I can perform business
logic on this in multiple places, accumulating changes, at the end of which I
have an `update` object containing an original version and a delta. I can then
just update the things that actually need updating, and can run additional
business rules intelligently based on which fields were changed. If there are
multiple records that need updating, the ORM will batch them for me to further
reduce API calls.

Using raw SQL, it would be possible to implement a rough hack that
approximates this, but it would be nowhere near as efficient or scalable.

~~~
HolyHaddock
> this is on top of the other obvious benefits of ORMs, such as abstraction
> over my storage engine. I can write a single 'query' that be be executed
> against a variety of SQL servers, Salesforce, MonoDB, an in-memory cache, or
> whatever else I want to do with it.

This is a trade-off, not an obvious benefit. In programming to a lowest-
common-database API, one loses the ability to use any actual features of the
specific database technology being used. It would be very interesting to know
what proportion of projects need to smoothly change the underlying DB
technology (it is of course debatable whether ORM's actually let you do this),
vs what proportion find themselves hampered by the less-powerful-than-SQL
database manipulation API offered by an ORM.

------
dventimi
Here's a question. WHY do we even bother with ORMs? Put another way, what
problem are they trying to solve?

~~~
CmonDev
Make data layer more testable and refactorable by decoupling from a specific
data storage. I would trade a horrendous large SP for a horrendous large C#
codebase any day.

~~~
dventimi
I'd choose "Door No. 3": neither C# nor stored procedures, but declarative db
business rules using appropriate data types, constraints, default values,
views, access controls, and triggers.

------
bellerocky
I'd rather write Django ORM than write create and alter table SQL, and
migration SQL any day of the week. The developers who wrote Django's ORM are
also way better at writing SQL and database related code than I am and sure I
could spend all the time I need to become so proficient that my migrations
work as nicely as migrations and syncing in Django and Django related projects
like south, or I could just spend that time writing my application using
Django ORM. If and when problems happen, as they have, I usually find easy
ways to solve them. I've been using Django for over 4 years in high load
environments with few frustrations.

Before using Django I wrote my own SQL with web.py, and PHP before that. When
I work on non-Python projects I use non-relational databases instead of SQL.

~~~
hcarvalhoalves
To be sincere, I find Django's ORM is one of the weakest ones (e.g., the API
doesn't support a simple GROUP BY). If you want to make a good case for ORMs,
Django's may not be a very strong argument.

In about 5 years working with Django the ORM has been the _only_ component
that consistently gave headaches. I have filled a couple bug reports about it
generating non-sense/slow queries too (like generating queries with `DISTINCT`
for no reason, with no way to override).

~~~
Grue3
>the API doesn't support a simple GROUP BY

This is just plain wrong (see aggregate/annotate).

~~~
hcarvalhoalves
Let's word it differently: the API doesn't support a simple way to emit an
arbitrary GROUP BY clause.

The aggregate/annotate APIs do not cover all legitimate uses of GROUP BY
besides very simple cases (sum, avg, etc).

------
serve_yay
I wonder if one's stance on this issue moves in cycles as one gains
experience. You start off with the ORM cause it's easy, then you make a mess
and decide to be rigorous which includes coding all-SQL, then you realize how
tedious that is and why ORMs came about in the first place.

------
weavie
Most people who complain about ORMs complain about the input - the querying
aspect of them. And it is true, particularly for complex queries ORMs can be
pretty hideous. If a query goes beyond a certain level of complexity I would
much rather replace them with a series of views or stored procedures. ORMs are
not good for complex queries.

For me, where ORMs do shine is with their output. If you have two tables, A
and B with one to many relationships between them - with pure SQL running a
join on these tables will return a single result set. Table As data will be
duplicated for each row of B. With an ORM you can get back a single object A
containing a collection of B's rows.

This is enough reason for me to reach for an ORM for anything but the simplest
of problems.

Lazy loading generally also comes for free.

------
euphemize
I think ORMs like SQLAlchemy are really useful for many many use cases. I
don't think most people who work with ORMs deal with the kind of complexity
described by the author, let alone work on such a specific application for 30
months at a time. In that sense, ORMs are super powerful tools that cut down
your work, shortens your code and do nifty optimizations once in a while

With that being said, this article totally resounds with me, having resorted
to using direct SQL in almost all of my large-ish projects. Using an ORM
always starts out nicely, and then eventually it gets messier and messier.
Specifically, when your DB data has subtle interactions with a cache, I have
found it harder to keep ORM code clean and readable.

~~~
maaku
SQLAlchemy isn't like most ORMs. It's data model is actually closer to the
relational model than OOP objects, and therefore lacks many of the ORM warts.

I wish more ORMs were like SQLAlchemy.

~~~
CmonDev
Dynamic typing is quite good for this sort of integration scenarios.

------
buro9
The one thing I like about ORMs (where I control the ORM): I can prevent any
SQL query from being run that doesn't have a WHERE statement.

Especially when dealing with anything that changes data, lack of WHERE is very
dangerous.

That said, I write raw SQL. It's the only way to get the performance. As an
abstraction it's already great and I don't feel I need another abstraction on
top of it.

An example of the kind of query that an ORM is bad at:

    
    
        SELECT id
              ,value_from_row_based_functions(some_column) AS foo
          FROM bar
         WHERE some_column = 'blah'
         ORDER BY id DESC
         FETCH FIRST 25 ROWS ONLY
    

It's bad because the ORM will merrily do this even though the row based
function may be expensive. Over hundreds of rows this isn't an issue, but over
millions or tens of millions it kills your query (if it doesn't timeout it
will merrily carry on for hours).

Good SQL always reduces the amount of work involved, you tailor it as you
would your regular code to avoid doing work that isn't needed (reading rows
you will discard through a filter, etc).

The above could be written as:

    
    
        SELECT id
              ,value_from_row_based_functions(some_column) AS foo
          FROM (
                    SELECT id
                          ,some_column
                      FROM bar
                     WHERE some_column = 'blah'
                     ORDER BY id DESC
                     FETCH FIRST 25 ROWS ONLY
               ) AS something
    

And now the row based function is only applied to the 25 rows that are going
to be returned.

If you look at the very bottom of this:
[http://www.postgresql.org/docs/9.2/static/textsearch-
control...](http://www.postgresql.org/docs/9.2/static/textsearch-
controls.html) you'll see this is a real-world problem.

ORMs can stitch together your SQL in a certain way that is fairly optimal for
most common scenarios... but you _will_ encounter a scenario in which raw SQL
is the only way to go.

~~~
Ma8ee
I'm almost 100% certain that the function will be called only 25 times no
matter which way you write the query. Not only is the select statement
semantically at the end, but any engine worth anything optimizes this kind of
things.

~~~
buro9
You are 100% incorrect when it comes to Postgres. Feel free to try it out.

------
mtimokhin
I think, the problem is in all-or-nothing approach when discussing that topic.
It's always "abstract pure SQL" vs "abstract pure ORM", but I think the
solution is in reasonable compromise between them.

First, it's definitely bad idea to write ORM that works with absolutely any
database schema. If you accept some limitations in schema design (and some in
model layer too), you can really benefit in more simple ORM design.

For instance, you can eliminate partial load at all, breaking stored entities
into separate aspects in separate tables and using some tricks for fetching
associations (see below).

Second, ORM should be as lightweight as possible, so you can really use it and
not fight it.

There are things that just should not be used in ORM, if we don't want to
solve some complicated performance problems. I believe, it's really bad idea
to use auto-generated SQL JOINs on ORM side, it's very hard to control and
optimize them. There is simple and clever solution by Jakub Vrana
([https://www.facebook.com/jakubvrana/posts/415359675151430](https://www.facebook.com/jakubvrana/posts/415359675151430)),
that works really good if implemented carefully on ORM side. When you really
need JOIN, you can use VIEW on database schema side or (better) use some kind
of denormalization.

You can optimize VIEW on SQL side, you can move from views to denormalization
without breaking model layer etc.

Using stored code on SQL side is good, you can maintain your database without
any special client tools or code, and you could automate a lot of
denormalization using triggers.

We have some slides (in Russian, sorry) on this topic here
[http://www.slideshare.net/interlabs-ru/model-
patterns](http://www.slideshare.net/interlabs-ru/model-patterns) and here
[http://www.slideshare.net/interlabs-ru/sustainable-
schema](http://www.slideshare.net/interlabs-ru/sustainable-schema) and
implemented lightweight proprietary ORM (in PHP) based on principles above,
possible open it later.

------
joshdance
ORMs can definitely help. When you have a large team of developers, not
everyone does things the same way, or even the right way.

Also, what if you want to change your database structure? Wouldn't an ORM help
in that instance as well?

------
lexx
If I have to code a serious application with complex schema, ORM is out of the
discussion. If I want to kick fast prototype project I would consider it. But
in the end I know that I will regret it!!!!

Great article!

------
thom
This is fascinating for me, because I've recently had the opportunity to watch
someone from a pure maths background learn SQL from scratch. It seems like a
very different experience when you arrive on day one appreciating the
underlying theory of relational databases, as opposed to my experience from a
more enterprisey background. I didn't realise what a mental block the subset
of SQL available to Hibernate had imposed until recently, and I suspect things
like ActiveRecord hide even more of the good stuff away.

------
snarfy
The first ORM I used was basically a bunch of C++ macros for Oracle, about 16
years ago. I've been using both SQL and ORMs ever since in various forms. I'm
not sure how anyone could ever use an ORM effectively without knowing SQL.

Some people only know how to administer their box through X11 apps, but others
know the command line. It feels like that when working with devs that do not
know SQL very well. They'd be lost without their ORM.

------
rollthehard6
Disclaimer - I have little know how of ORMs and haven't developed anything
more than scripts for a long long time.

What happened to 4GLs when the web came? i.e. It looks a little like this
would be a solved problem if there were a open source web 4GL language that
integrated SQL into say, python as deeply as SQL is intergrated into Oracle
PL/SQL or Ingres OpenROAD whilst providing a decent web app framework to go
along with it.

------
LBarret
ORM like anything aren't all in the same league. I didn't like them blindly,
but SQLAlchemy works very well (because its layered architecture).

------
mightymouse07
I haven't seen anyone say this yet but it seems that this guy used the wrong
tool for the job he was doing. He needed to get off of the ORM long before his
attribute count got up to 600. ORM is not the right tool to use for his
problem. He was trying to fit the glass slipper onto the giant ugly step
sister and then wrote an article about how all glass slippers are garbage.

------
mnw21cam
People should try out InterMine. It's an ORM (and other tools) that I wrote a
while back, and runs the backbone of a few genetics research databases.
Funnily enough, the object-oriented query language shared almost all its
syntax with SQL. It also does a load of things to avoid some of the problems
in this article.

If it ain't broke, don't fix it.

------
Dowwie
Can anyone tell me which of the OP's arguments against ORMs isn't already
addressed in SQLAlchemy? I can't find any, but maybe I've overlooked
something.

Anyone who is already coding in Python and hasn't yet discovered SQLAlchemy
would find that doing so is a great investment in the little time that is
available.

------
a-saleh
Ever since we learned about Relational Algebra, I have felt, that as an
approach it is superior to both SQL and ORM.

------
kayman
Learning ORM without learningn sql is great for beginnners. Ex: Django
Framework for new comers. But like any abstracction, learning SQL will allow
you to optimize w raw sql as needed. Its like trying to learn coffeescript
without learning JavaScript. It always helps to learn from bottom up..

~~~
bni
I disagree. Some of the worst horrors Ive seen over my carrer is systems
written by developers using an ORM "So they don't have to learn SQL".

I find some folks has these really strong beliefs:

1\. SQL is for DBAs, Im a developer! 2\. SQL is legacy and if you use it
directly you must be some kind of luddite. 3\. I tried to write some SQL once
by copy/pasting someone else's, and I got really strange results back.
Nevermind that I dint try to actually understand the query.

------
segmondy
I always let the database do heavy lifting. I write 100 line SQL that if you
turned into code, will turn into 1000 line of code. It scares most folks but
whatever, it runs much faster and once it has been fined tuned, runs fine
without needing much. I just comment the hell out of it.

------
xienze
I find ORMs quite handy, and they've always generated tight SQL.

Oh yeah, we had to support an additional DB recently... all that involved was
changing the JDBC connection information and dropping in a new driver. Good
luck doing that when everything is written in raw SQL.

~~~
dicroce
I usually just restrict the SQL I'm using to something reasonable and I've
never had any trouble...

------
vetler
What ORMs have taught me: people will complain endlessly about how ORMs don't
work, when they don't bother to understand how they work, what they
should/shouldn't be used for, and how to use them appropriately.

------
krapp
I have a laravel project for which all of my models are raw SQL statements for
this reason. At the very least, it makes the code more portable, and it makes
it easier to reason about the statements when you can actually see them.

~~~
xienze
It makes your code less portable. Raw SQL means you're essentially tied to a
single DB provider. What happens when you need to move to a different one?

~~~
Xorlev
Then you should probably rewrite your statements anyways to take advantage of
platform-specific features. I won't write a query on MySQL the same as I would
on Postgres, because there's usually a better way.

CRUD is about all that can honestly stay the same, as IMO that's where ORMs
shine.

------
CmonDev
"(And hey, with the advent of "devops", the divide between the developer and
the database administrator is basically non-existent)"

Not really. Devops is a new role. Not everyone would like to work in devops.

------
atmosx
ORMs are a breath of fresh air for developers IMHO. I can concentrate in DB
design, use SQLite for dev and deploy PSQL on the server (production). Awesome
level of abstraction :-)

------
sytelus
After looking at kind of monstrosity like Entity Framework, I've given up on
traditional ORMs. My current choice for db access is micro-ORMs like PetaPOCO.

~~~
CmonDev
Check the EF7 - they started from scratch:

[https://github.com/aspnet/EntityFramework](https://github.com/aspnet/EntityFramework)

~~~
sytelus
Doesn't look that promising:
[https://github.com/aspnet/EntityFramework/wiki/Using-
EF7-in-...](https://github.com/aspnet/EntityFramework/wiki/Using-EF7-in-
Traditional-.NET-Applications)

Look at the simplicity and beauty of micro-ORM like PetaPOCO:
[http://www.toptensoftware.com/petapoco/](http://www.toptensoftware.com/petapoco/)

The fact is you don't want to put in large teams in creating ORMs because that
would invariably mean feature creep and lot of black magic happening behind
the scene. ORMs should be super light weight, a single page doc should be
enough to explain everything with it and there should be no "magic".

------
manicdee
Does anyone have a decent version control system for stored procedures? How do
I keep them in my git repository alongside my code? How do I maintain them?

~~~
numo16
If you use MS SQL Server, Red Gate has a solution for that [0]. I don't know
if there is something similar for other RDBMSs.

[0]: [http://www.red-gate.com/products/sql-development/sql-
source-...](http://www.red-gate.com/products/sql-development/sql-source-
control/)

------
spacemanmatt
I find the species of SQL in any sufficiently complex application are too
varied for a blanket rule. Add performance demands, and some ORMs are right
out.

------
recursive
A small nit on the opening:

> ... ORMs are more detriment than benefit. In short, they can be used to
> nicely augment working with SQL in a program ...

That statements contradictory.

------
digitalpacman
Pretty sure you are missing one of the biggest rules of ORMs. This rule being
you don't use them for reporting software or reporting queries.

------
evantahler
I've generally found the following mantra to hold true:

ORMs for a singe record (and associations) Raw SQL for groups

------
techhackblob
SQL is not hard and certainly no harder then something like the entity
framework

------
Datsundere
how about writing a reusable database wrapper class?

you can generalize queries : query(sql, args[]):

and for specific purposes

addName("name"): query('insert into table ..., "name") .. other functions(..)
: query('..','')

------
seldo
See also from a few years ago:
[http://seldo.com/weblog/2011/08/11/orm_is_an_antipattern](http://seldo.com/weblog/2011/08/11/orm_is_an_antipattern)

I agree with basically everything in this post, obviously :-)

------
dreamfactory2
There seems to be an underlying assumption that SQL performance is crucial. Of
course an ORM is going to produce less efficient SQL, that's the tradeoff.
However, object caching may eliminate this concern.

For example, I've worked with a system where each field is a table and data
retrieval can require huge inefficient joins just as you'd expect. Due to
general revulsion at the idea when it was introduced, there was an initiative
to create materialized views. But this completely collapsed when it was
discovered that the existing object caching benchmarked just as well whilst
removing system complexity. Also worth thinking beyond selects - another very
big advantage of table per field is in altering schema on large datasets,
which was a major undertaking on more read-efficient schema. I think this is
an example of worse being better.

------
lafar6502
I agree that SQL is a brilliant data processing language, while C#, Java or
C++ are terrible at it. And there are obvious benefits to learning SQL and
being able to use it efficiently, makes imho much more sense than throwing
away RDBMS because they're 'slow'. But if you write applications in an oo
language objects are quite natural way of representing both data and logic.
Sometimes you do application-level transaction management or use the mapping
metadata to implement higher-level features, sometimes you have to integrate
application events with data operations, or you need your data objects to live
outside of database transaction - ORM saves you lots of coding in such
situations. And adds services like caching, prefetch, intercept/triggers etc
so you don't have to invent it.

~~~
recursive
I believe that C# is a better data processing language than SQL is, assuming
it can access the data. That's mostly thanks to the strength of linq.

~~~
lafar6502
Linq is just a useful syntax, there's nothing about data processing there.
It's only good for small data sets that fit entirely in memory.

~~~
recursive
I completely disagree. SQL is just syntax too. Syntax for a language that can
process data, just like linq. And Linq can also operate on IQueryable<T>,
which can represent data structures too large to fit in memory. It functions
just fine on infinite data structures for that matter too.

