No database access abstraction is going to be great for all situations. Whether it's object-oriented, functional, or DSL-based, there are going to be situations where it's frustrating to use. However, each of these abstractions has a purpose and a clear benefit for a certain subset of situations, or else it likely wouldn't have been created.
Since people find some abstractions useful for some subset of situations, why waste so much time and energy fighting against its use entirely--labeling it an "anti-pattern"? On the other hand, why waste time and energy arguing for its ubiquitous use? Like with all other technology choices, the right prescription is to use an abstraction when it makes sense, and to abandon it when it does not.
Maybe it would be more constructive to frame the discussion in this way: what cases are ORMs useful for, and what cases are they bad at? How can we identify the good use cases from the bad? Which use cases seem to be good uses but turn out to be bad, and why? What are alternative solutions for those bad use cases? Are there different philosophies of ORMs with different tradeoffs? What are those philosophies and tradeoffs and how can we choose which one most suits our problem?
ORMs are awesome for getting things done quickly, but as the article states - and as those very same ORMs state themselves - you do sacrifice some performance.
As usual with these articles, there is a false dichotomy that one should choose between ORM or no ORM. It's a perfectly sound decision to use an ORM and hand-optimize whenever the need arises.
Also, ORMs usually _are_ optimized for the 90% use case, so going off on your own to write those queries in plain SQL for your entire domain model is passing up a huge amount of leverage.
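That division of labor can be made concrete with a toy sketch: a generic mapper that covers the routine CRUD, plus a plain-SQL escape hatch for the query it can't express well. `TinyMapper` and every name here are invented for illustration - no real ORM looks like this - but the point is the split between leveraged generic code and hand-written SQL.

```python
import sqlite3

# Minimal stand-in for what an ORM gives you "for free" on the 90% case:
# generic insert/select-by-id derived from a column list. Illustrative only.
class TinyMapper:
    def __init__(self, conn, table, columns):
        self.conn, self.table, self.columns = conn, table, columns

    def insert(self, **values):
        cols = ", ".join(self.columns)
        marks = ", ".join("?" for _ in self.columns)
        cur = self.conn.execute(
            f"INSERT INTO {self.table} ({cols}) VALUES ({marks})",
            [values[c] for c in self.columns])
        return cur.lastrowid

    def get(self, row_id):
        cur = self.conn.execute(
            f"SELECT id, {cols} FROM {self.table} WHERE id = ?".format()
            if False else
            f"SELECT id, {', '.join(self.columns)} FROM {self.table} WHERE id = ?",
            (row_id,))
        return cur.fetchone()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, plan TEXT)")

users = TinyMapper(conn, "users", ["name", "plan"])
uid = users.insert(name="ada", plan="pro")
print(users.get(uid))       # the 90% case: generic CRUD, no SQL written

# The other 10%: drop to hand-written SQL for the aggregate the
# generic mapper can't express.
print(conn.execute("SELECT plan, COUNT(*) FROM users GROUP BY plan").fetchone())
```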
Now a normal person would say "stupid satnav", but an ORM fan would say "stupid car" (i.e. blame the database for being slow).
It is possible for someone to improperly use X, therefore X should be banned.
ORMs - OK if CRUD is your '90% use case'.
ORMs - fail if you need anything slightly more complicated.
But they really don't fit!
I like the way the original author explained it: the problem is that people were forced, or convinced, to use the wrong abstraction. Thanks to the massive marketing of the really big players since the 80s, relational databases are everywhere; over the years most developers took for granted that they MUST use SQL, and so stopped asking whether they really need the relational model.
Luckily we have other options these days. There are plenty of production-ready non-relational database engines. To simplify your application layer and really map your OO data, please do use a document database - it fits. That's it. No amount of pretending will dissolve the differences between the relational model and the object-oriented model.
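As a toy illustration of that fit (all data here is made up): a nested object graph serializes directly into a single document, while the relational version of the same data needs several tables and a join to reassemble the object - and that reassembly is exactly the ORM's job.

```python
import json

# A nested object graph that maps awkwardly onto flat relational tables
# but trivially onto a single document. Purely illustrative data.
order = {
    "id": 42,
    "customer": {"name": "ada", "email": "ada@example.com"},
    "lines": [
        {"sku": "A-1", "qty": 2, "price": 9.99},
        {"sku": "B-7", "qty": 1, "price": 24.50},
    ],
}

# Document store: one write, one read, no joins, no mapping layer.
doc = json.dumps(order)
restored = json.loads(doc)
assert restored == order

# The relational version of the same data needs three tables
# (orders, customers, order_lines) plus foreign keys and a join
# to rebuild the object graph.
```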
If in doubt, go back to the theory, to the computer science. You don't have to dig through tons of papers from the 70s, when the relational model had its peak of scientific development - just buy yourself the CTM book and work through all the computational models, with examples in real code. Please do the homework. Otherwise any discussion will again be about habits and beliefs, not about science and engineering.
And - I hate to say it - most of the comments like "here we go again" really are about habits and beliefs. How sad. The foundation of our occupation is hard science and engineering; we strive to build better things on that concrete foundation. By sticking to the belief that "OO + relations fit", we abandon that foundation and turn into believers in a cult created decades ago by the marketing people of a few companies everybody knows, in order to massively increase sales of RDBMSs. This is not engineering.
Please distance yourself from the habits of the specific tools you're accustomed to; please think about it.
None of this is saying a particular technology is wrong or bad but it's a reminder that Eric was right to point out that they all have drawbacks which developers should consider when picking a good fit for their application. Blindly asserting that a particular one is good or bad is no better than telling carpenters they only need hammers.
Even Google, for all of their massive resources and BigTable's deserved respect, uses a ton of SQL databases - and they didn't add SQL support to App Engine simply because they were unwilling to tell developers to try something new.
tl;dr: "Know your data access patterns and pick a good fit with your resources"
I also know that they use at least one big financial reporting package, but I'm not sure whether the use of Oracle Hyperion is evidence for or against that wisdom.
Do a Google search for "mysql google contributions" and plenty of stuff pops up.
This does not mean ORMs are an anti-pattern, or bad. ORMs provide a lot of value for the 90% use case that people often need to bang out under various time constraints (an MVP, a project for a customer, etc.).
There is a trade-off when using an ORM that you, or your team, should be aware of. Once you start hitting the limits of what an ORM can reasonably do for you, it's fine to either:
- drop down to hand-optimized SQL for specific queries / submodules of your project (e.g. a set of reporting pages)
- figure out whether you are trying to punch a cube through a circle by using a relational database where a NoSQL-type store fits better, and consider moving part of your data to a different storage engine.
I haven't had a lot of experience with NoSQL databases yet, but I think it's also interesting to consider that relational databases have been around for a long time, and a lot of the corner cases where they are unwieldy are known. NoSQL databases are not as vetted against reality yet, so abandoning one for the other may amount to nothing more than trading one set of problems for another.
But there are plenty of abstractions code virtually never has to plumb around (Many kernel APIs/memory management, network protocols, various programming language abstractions, etc).
If an abstraction requires "plumbing around" in almost every project, that is indeed a problem with the abstraction.
Yes! We get it! We got it years ago: ORMs are a compromise over an irreducible set-theoretical problem. I got it in CS class and I get it now. You can't square the circle -- no one is claiming to have squared the circle.
But for 90% of the cases, it turns out that really experienced developers use and leverage ORMs to boost productivity and eliminate code duplication. It happens every day. It's not habits and tools, it's from careful consideration of the results of what happens in practice. Good Baconian Science should teach us to go out into the world and observe and report. And what we've learned is that neither side of the Object/Relational tension really ever fully wins out, and what works is experienced devs using the right tool at the right time.
We code in the real world, get burned by bad practice, and adjust. That's why ORMs have survived: good devs have found ways to use them as effective band-aids on difficult problems. We don't code in a classroom, and I couldn't care less whether my code violates anybody's CS or set-theory dogma. I have features to ship.
- they create a dangerous illusion of abstraction - one which in fact leaks very quickly
- they require programmers to learn 2 things anyway
- they initially speed up the coding, but later slow it down, especially when you have to tune things (how many times have you peeked at the generated SQL only to feed it to EXPLAIN ANALYZE?)
- they can promote bad habits - code which brings your engine to a crawl
- they hinder debugging, especially tracking down performance problems, often effectively sweeping a problem under the carpet to be discovered later by the admins
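The tuning workflow alluded to above - paste the generated SQL into EXPLAIN and look at the plan - can be sketched with SQLite's `EXPLAIN QUERY PLAN` standing in for PostgreSQL's `EXPLAIN ANALYZE`. Table and index names here are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, kind TEXT)")
conn.executemany("INSERT INTO events (user_id, kind) VALUES (?, ?)",
                 [(i % 10, "click") for i in range(1000)])

# Imagine this is the SQL your ORM generated for a filtered count.
query = "SELECT COUNT(*) FROM events WHERE user_id = ?"

# Without an index, the plan is a full table scan...
plan = conn.execute("EXPLAIN QUERY PLAN " + query, (3,)).fetchall()
print(plan)   # detail column mentions a SCAN of events

# ...after adding one, the same query uses the index instead.
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
plan = conn.execute("EXPLAIN QUERY PLAN " + query, (3,)).fetchall()
print(plan)   # detail column now names idx_events_user
```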
There's also a funny thing: most of the world seems to happily use active record (as a pattern), although when you dig deep into it, it turns out to be far inferior to data mapper (as a pattern). That also demonstrates that many people are going on habit, not engineering, and stick to what's used around them instead of investigating the topic. This is not bad per se; but when it comes to discussing the problems, such people are in no position to argue, precisely because they haven't done their homework.
So the problems above are caused by exactly the mismatch one would expect from the theory. You get what you pay for.
I guess we could all learn the most from the history of relational databases, or just try to remember what relational databases really are and what other examples of the technology are around. For example, Prolog is relational (apart from its logic engine). If one builds something in it, it becomes much clearer how well it "fits" objects or not.
Document databases are usually not a perfect fit either in my experience (I have worked on a few MongoDB projects). The fact remains that you are responsible for storing the state and retrieving that state in an optimal way (this is where you usually end up writing some custom document db queries, or creating large multi-indexed collections specifically for reports).
What would be ideal is if state were managed transparently for you using a lot of shared / clustered memory and a transaction log - i.e. something like Terracotta on the JVM. (No, I don't work for them, and I've never even used it; I just love the theory behind it.) You'd still want to archive data somewhere (relational or document), so you would still have that complexity for large datasets.
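A toy sketch of that idea - in-memory state made durable by an append-only transaction log that is replayed on restart. This illustrates only the general principle, claims nothing about how Terracotta actually works, and all names are invented:

```python
import json, os, tempfile

# State lives in a plain dict; durability comes from appending every
# mutation to a log file, and recovery is just replaying that log.
class LoggedDict:
    def __init__(self, log_path):
        self.log_path = log_path
        self.state = {}
        if os.path.exists(log_path):          # recover by replay
            with open(log_path) as f:
                for line in f:
                    self._apply(json.loads(line))

    def _apply(self, op):
        if op["kind"] == "set":
            self.state[op["key"]] = op["value"]
        elif op["kind"] == "del":
            self.state.pop(op["key"], None)

    def set(self, key, value):
        op = {"kind": "set", "key": key, "value": value}
        with open(self.log_path, "a") as f:   # log first, then apply
            f.write(json.dumps(op) + "\n")
        self._apply(op)

path = os.path.join(tempfile.mkdtemp(), "tx.log")
d = LoggedDict(path)
d.set("balance", 100)
d.set("balance", 175)

restarted = LoggedDict(path)   # simulate a crash + restart
print(restarted.state)         # state survives via the log
```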
Forget NoSQL, how about NoFetch? (oversimplified perhaps).
Application requirements may sometimes put broad limits on technology choices, but the link isn't very strong, as there is a huge gap between the problem space and the solution space (evidenced by the army of software developers needed to translate between the two).
Software is not hardware and it's not even engineering in my view. The extent to which we in the software space are able to define and redefine jobs as well as tools makes this kind of thinking useless.
As far as I know, Doctrine 2.0 solves a lot of these problems, but I've not used it yet.
If a requirement change demands a different backend, using an ORM actually makes the switch easier, faster, and less error-prone.
I like to see people push back on that; let's at least get to a state where we can consider dropping it from a project.
That is covered in the article. ORMs are an advantage in the very early stages of a project, but a disadvantage in the later stages. Since a project by definition spends the least amount of its lifetime in the early stages, don't paint yourself into a corner!
For most of my projects I use Django nowadays. Its ORM is usually sufficient. In cases where it is not, I add a query method to the model class that just executes the most efficient SQL query possible and returns the results in the most efficient format possible. For mature applications this can result in quite a lot of those query methods, and the ORM then plays a lesser role.
However, even in those cases the ORM continues to be a convenience to me as a developer, e.g. because it powers the Django admin interface and because, together with South (a database schema migration tool), it makes schema management a breeze.
An ORM is a tool. And each tool has its own place and time. So, yes - there we go again ...
Thanks for mentioning this. I'm just learning Django, mainly for using its admin interface to avoid a lot of CRUD, and South seems just what I need to evolve my schema while I go along.
Said developers will never take the next step of swapping out bad ORM-generated SQL for good hand-tuned SQL. The app is therefore doomed.