Here we go again. I'm pretty sure I've seen this post here (and the famous Vietnam of Computer Science one) several times before.
No database access abstraction is going to be great for all situations. Whether it's object-oriented, functional, or DSL-based, there are going to be situations where it's frustrating to use. However, each of these abstractions has a purpose and a clear benefit for a certain subset of situations, or else it likely wouldn't have been created.
Since people find some abstractions useful for some subset of situations, why waste so much time and energy fighting against their use entirely, labeling them an "anti-pattern"? On the other hand, why waste time and energy arguing for their ubiquitous use? As with all other technology choices, the right prescription is to use an abstraction when it makes sense, and to abandon it when it does not.
Maybe it would be more constructive to frame the discussion in this way: what cases are ORMs useful for, and what cases are they bad at? How can we tell the good use cases from the bad? Which use cases seem to be good uses but turn out to be bad, and why? What are alternative solutions for those bad use cases? Are there different philosophies of ORMs with different tradeoffs? What are those philosophies and tradeoffs, and how can we choose which one best suits our problem?
ORMs are useful for one thing for sure: creating the illusion that the OO way of thinking and the relational way of thinking fit together.
But they don't really fit!
I like the way the original author explained it: the problem was that people were forced or convinced to use the wrong abstractions. Due to massive marketing by really big players since the 80s, relational databases are everywhere; over the years most developers took it for granted that they MUST use SQL, so they stopped asking whether they really need the relational model.
Luckily, things are different these days. There are plenty of production-ready non-relational database engines. To simplify your application layer and really map your OO data, please do use a document database - they fit. That's it. No amount of pretending will dissolve the differences between the relational model and the object-oriented model.
If in doubt, please go back to the theory, to the computer science. You don't have to dig through tons of papers from the 70s, when the relational model was at the peak of its scientific evolution - you can just buy yourself the CTM book and go through all the computational models with examples of real code. Please do your homework. Otherwise any discussion will be about habits and beliefs again, not about science and engineering.
And - I hate to say it - most of the comments like "here we go again" are really about habits and beliefs. How sad. The foundation of our occupation is hard science and engineering. We strive to build better things on that concrete foundation. By sticking to the idea that "OO + relations fit" we abandon this foundation and turn into believers in a cult created decades ago by the marketing people of a few companies everybody knows, in order to massively increase sales of RDBMSs. This is not engineering.
Please, step back from the habit of using the specific tools you're accustomed to; please think about it.
So, say I do use a document database and object serialization is indeed easy and fun again. Now I need to do reporting, only it's really slow until I heavily denormalize (i.e. reinvent half of a SQL database and update every time my reports change). Now I need to share data across objects so it turns out I did need joins after all. Maybe I could invent an Object-Non-Relational-Mapper to make some of that complexity easier to maintain…
None of this is saying a particular technology is wrong or bad but it's a reminder that Eric was right to point out that they all have drawbacks which developers should consider when picking a good fit for their application. Blindly asserting that a particular one is good or bad is no better than telling carpenters they only need hammers.
Even Google, for all of their massive resources and BigTable's deserved respect, uses a ton of SQL databases - and they didn't add support to AppEngine simply because they were unwilling to tell developers to try something new.
The answer is to use both. Do you really need realtime reporting? If you don't, store everything in a document database and write a small job to copy the data to a relational store for querying and reporting.
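A rough sketch of what such a copy job could look like, using Python and an in-memory SQLite database as a stand-in for the relational store. The `documents` list, the `order_items` table, and the `copy_to_reporting` function are all made up for illustration; a real job would read from whatever document database you run.

```python
import sqlite3

# Hypothetical documents, shaped the way a document store might return them.
documents = [
    {"_id": "o1", "customer": "alice",
     "items": [{"sku": "a", "qty": 2, "price": 5.0}]},
    {"_id": "o2", "customer": "bob",
     "items": [{"sku": "a", "qty": 1, "price": 5.0},
               {"sku": "b", "qty": 3, "price": 2.0}]},
]

def copy_to_reporting(docs, conn):
    """Flatten nested order documents into one relational row per line item."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS order_items ("
        "order_id TEXT, customer TEXT, sku TEXT, qty INTEGER, price REAL, "
        "PRIMARY KEY (order_id, sku))"
    )
    rows = [(d["_id"], d["customer"], i["sku"], i["qty"], i["price"])
            for d in docs for i in d["items"]]
    # INSERT OR REPLACE keeps the job idempotent when it is re-run.
    conn.executemany(
        "INSERT OR REPLACE INTO order_items VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
copy_to_reporting(documents, conn)

# Reporting is now an ordinary aggregate query.
revenue = conn.execute(
    "SELECT sku, SUM(qty * price) FROM order_items GROUP BY sku ORDER BY sku"
).fetchall()
print(revenue)
```

The point is only that the reporting side stays plain SQL; how often the job runs decides how stale the reports are.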
If you're going to store in a relational database anyway, why have the expense of having to develop and support both? It'd make more sense to try to either denormalize - with some support framework - or just use the SQL database from the beginning, the deciding factor being how important aggregate operations are to your app and simple questions like how much your NoSQL database costs to operate vs. your SQL database.
tl;dr: "Know your data access patterns and pick a good fit with your resources"
Google has contributed quite a few large patches to MySQL. I doubt they would be spending engineering time on improving MySQL if they did not use it internally for at least _some_ projects, and probably many.
Do a Google search for mysql google contributions and plenty of stuff pops up.
Yes, there is an impedance mismatch in mapping data from relational data stores to object instances.
This does not mean ORMs are an anti-pattern, or bad. ORMs provide a lot of value for the 90% use case that people often need to bang out under various time constraints (MVP, project for a customer, etc).
There is a trade-off when using an ORM that you, or your team, should be aware of. Once you start hitting the limits of what an ORM can reasonably do for you, it's fine to either:
- drop down to using hand-optimized SQL for specific queries / submodules of your project (e.g. a set of reporting pages)
- figure out if you are trying to punch a cube through a circle by using a relational database instead of a NoSQL-type store, and consider moving part of your data to a different store.
I haven't had a lot of experience with NoSQL databases yet, but I think it's also interesting to consider that relational databases have been around for a long time, and a lot of the corner cases where they are unwieldy are known. NoSQL databases are not as vetted against reality yet, so abolishing one for the other may amount to nothing more than trading one set of problems for another.
Wow, kunley, I just had a flashback to 15 years ago when Object DB zealots were making this same kind of dogmatic go-back-to-set-theory argument.
Yes! We get it! We got it years ago: ORMs are a compromise over an irreducible set-theoretical problem. I got it in CS class and I get it now. You can't square the circle -- no one is claiming to have squared the circle.
But for 90% of the cases, it turns out that really experienced developers use and leverage ORMs to boost productivity and eliminate code duplication. It happens every day. It's not habits and tools, it's from careful consideration of the results of what happens in practice. Good Baconian Science should teach us to go out into the world and observe and report. And what we've learned is that neither side of the Object/Relational tension really ever fully wins out, and what works is experienced devs using the right tool at the right time.
We code in the real world, get burned by bad practice and adjust. That's why ORMs have survived: good devs have found ways to use them as effective bandaids on difficult problems. We don't code in a classroom and I couldn't care less whether my code violates anybody's CS or Set Theory dogma. I have features to ship.
Well, just assume that I'm also speaking from experience of delivering complicated applications to production and I'm tired of how ORMs become a pain in the long term:
- they create the dangerous illusion of an abstraction, one which leaks very quickly
- they require programmers to learn 2 things anyway
- they initially speed up the coding, but later slow it down, esp. when you have to tune stuff (how many times have you peeked at the generated SQL only in order to feed it to EXPLAIN ANALYZE...?)
- they can promote bad habits, code which slows your engine to a crawl
- they hinder debugging, esp. tracking performance problems, often effectively sweeping a problem under the carpet to be discovered by admins
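For anyone who hasn't done the "peek at the generated SQL and feed it to EXPLAIN" dance, here is a minimal sketch of it in Python. It uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in for PostgreSQL's `EXPLAIN ANALYZE`, and the `users` table, the `generated_sql` string and the `plan` helper are assumptions for illustration, not output from any real ORM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

# Stand-in for a statement copied out of an ORM's query log.
generated_sql = "SELECT id, email FROM users WHERE email = ?"

def plan(conn, sql, params):
    """Return the human-readable detail column of each query plan row."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

before = plan(conn, generated_sql, ("a@example.com",))

# The usual fix once the plan shows a full table scan: add an index.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(conn, generated_sql, ("a@example.com",))

print(before)  # plan without the index
print(after)   # plan once the index exists
```

The exact wording of the plan rows varies between SQLite versions, so treat the printed text as something to read rather than to parse.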
There's also a funny thing: most of the world seems to happily use active record (as a pattern), although when you go deep into it, it turns out to be far inferior to data mapper (as a pattern). That also demonstrates that many people are acting on habit, not engineering, and sticking to what's commonly used instead of investigating the topic. This is not bad per se; but when it comes to discussing problems, such people are not in a position to argue, precisely because they haven't done their homework.
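To make the two patterns concrete, a minimal sketch in Python with SQLite. The class names (`ActiveUser`, `User`, `UserMapper`) and the `users` table are invented for illustration and are not taken from any particular framework.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Active record style: the domain object knows how to persist itself.
class ActiveUser:
    def __init__(self, name, id=None):
        self.id, self.name = id, name

    def save(self):
        cur = conn.execute("INSERT INTO users (name) VALUES (?)", (self.name,))
        self.id = cur.lastrowid

# Data mapper style: a plain object, plus a separate mapper that owns the SQL.
@dataclass
class User:
    name: str
    id: Optional[int] = None

class UserMapper:
    def __init__(self, conn):
        self.conn = conn

    def insert(self, user):
        cur = self.conn.execute(
            "INSERT INTO users (name) VALUES (?)", (user.name,))
        user.id = cur.lastrowid

    def find(self, user_id):
        row = self.conn.execute(
            "SELECT name, id FROM users WHERE id = ?", (user_id,)).fetchone()
        return User(*row) if row else None

a = ActiveUser("ada")
a.save()

u = User("grace")
mapper = UserMapper(conn)
mapper.insert(u)
found = mapper.find(u.id)
print(a.id, found)
```

In the mapper version `User` has no idea a database exists, which is the property the data mapper side of the argument cares about; the cost is one more class per entity.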
So, the problems above are caused by the mismatch which one can expect from the theory. You've got what you paid for.
I guess we could all learn the most from the history of relational databases, or just try to remember what relational databases really are and what other examples of this technology are around. For example, Prolog is relational (apart from its logic engine). Build something in it and it becomes clearer how well the relational model "fits" objects, or doesn't.
I'm not in love with relational databases or ORM by any means, but your post sounds like document database marketing.
Document databases are usually not a perfect fit either in my experience (I have worked on a few MongoDB projects). The fact remains that you are responsible for storing the state and retrieving that state in an optimal way (this is where you usually end up writing some custom document db queries, or creating large multi-indexed collections specifically for reports).
What would be ideal is if state were managed transparently for you using a lot of shared / clustered memory and a transaction log, i.e. something like Terracotta in the JVM (no, I don't work for them, and have never even used it; I just love the theory behind it). You'd still want to archive data somewhere (relational or document), so you would still have that complexity for large datasets.
Forget NoSQL, how about NoFetch? (oversimplified perhaps).
Erm, sir, philosophies? I'm talking about computer science as the foundation of a hacker's work. I haven't merely stated my opinions, but mentioned some materials recognized as state of the art in this science. Do you propose that we abandon it and put it into a bag of relativism? Because if you do, we can just ask PG to shut down this forum and go for a beer.
I don't see anyone arguing that the database is stupid for being slow. I see a bunch of anti-ORM people arguing that an ORM is never the right solution to any problem ever. I also see a bunch of people arguing that ORM mixed with SQL is perfectly fine and there is nothing 'broken' about it (which I think is the more moderate view). I don't see anyone here arguing the opposite extreme of "ORM is always the right approach to every problem." I view the "ORM fan" as a strawman for the most part in this whole discussion.
My experience as a DBA is that someone will develop using Hibernate, it'll go into production and run like a dog, and they'll go straight to their manager and say "It's the database". Because noooooo, it couldn't be their perfect code, could it?
I think the "right tool for the job" argument is weaker than it appears. It implies that the job is fully defined before we start to model or even before we start to think about the problem.
Application requirements may sometimes provide broad limits for technology choices, but the link isn't very strong, as there is a huge gap between the problem space and the solution space (evidenced by the army of software developers needed to translate between the two).
Software is not hardware and it's not even engineering in my view. The extent to which we in the software space are able to define and redefine jobs as well as tools makes this kind of thinking useless.
Good point. Maybe you have to choose a tool that allows you to discard it if you find a better tool once the problem becomes clearer. So, the question is: are ORMs easy to "unplug" if you need to use a better tool?
The problem with unplugging the ORM or interoperating is that the two biggest ORM frameworks are implemented such that they don't support composite primary keys and have stupid requirements like having an auto-increment column called "id" on every table (including tables used as part of a many-to-many relationship!!).
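For reference, this is the kind of schema being talked about: a many-to-many join table keyed by the pair of foreign keys, with no surrogate "id" column. It's a minimal sketch in Python with SQLite; the `students`, `courses` and `enrollments` tables are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses  (course_id  INTEGER PRIMARY KEY, title TEXT);
-- Join table keyed by the pair itself, with no surrogate "id" column.
CREATE TABLE enrollments (
    student_id INTEGER NOT NULL REFERENCES students(student_id),
    course_id  INTEGER NOT NULL REFERENCES courses(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO students VALUES (1, 'ada');
INSERT INTO courses  VALUES (10, 'databases');
INSERT INTO enrollments VALUES (1, 10);
""")

# The composite key enforces "one enrollment per student per course"
# in the database itself.
try:
    conn.execute("INSERT INTO enrollments VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)
```

With a surrogate "id" as the only key, the same guarantee needs an extra unique constraint on the pair, which is easy to forget.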
As far as I know Doctrine 2.0 resolves a lot of these problems, but I've not used it yet.
what cases are ORMs useful for, and what cases are they bad at?
That is covered in the article. ORMs are an advantage in the very early stages of a project, but a disadvantage in the later stages. Since a project by definition spends the least amount of its lifetime in the early stages, don't paint yourself into a corner!
You (and the article) would be correct if ORM versus no ORM were a black-and-white affair. However, as others have already pointed out in this thread, it is not. It is perfectly viable to use an ORM as the basis and gradually switch to 'raw' SQL if the need arises.
For most of my projects I use Django nowadays. Its ORM is usually sufficient. In cases where it is not, I add a query method to a model class that just executes the most efficient SQL query possible and returns the results in the most efficient format possible. For mature applications this can result in quite a lot of those query methods, and the ORM then plays a lesser role.
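The shape of such a query method, sketched without Django so that it runs on its own: plain Python with SQLite, where the `Order` class, the `orders` table and `totals_by_customer` are hypothetical names. In a real Django project the method would live on the model or its manager and use Django's own database connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
INSERT INTO orders (customer, total)
VALUES ('alice', 10.0), ('alice', 5.0), ('bob', 7.5);
""")

class Order:
    """Hypothetical model class; a real ORM would supply the usual accessors."""

    @staticmethod
    def totals_by_customer(conn):
        # Hand-written aggregate returned as plain tuples, so no model
        # objects are built for rows that only feed a report.
        return conn.execute(
            "SELECT customer, SUM(total) FROM orders "
            "GROUP BY customer ORDER BY customer"
        ).fetchall()

totals = Order.totals_by_customer(conn)
print(totals)
```

Callers still go through the model class, so the raw SQL stays in one place and can be swapped back for an ORM query if that ever becomes good enough.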
However, even in those cases the ORM continues to be a convenience to me as a developer, e.g. because it powers the Django admin interface and because together with South (a database schema migration tool) it makes schema management a breeze.
An ORM is a tool. And each tool has its own place and time. So, yes - here we go again ...
Early vs. later isn't the problem. You can write n+1 queries in your first model or your 1000th. The problem is developers who don't know how to watch logs, don't know how to write efficient SQL, and projects that don't use adequate monitoring in general. If your ORM use is getting worse as you go, you probably need code reviews or NewRelic or better developers.
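In case "n+1 queries" and "watch the logs" sound abstract, here is a small sketch in Python with SQLite. The `authors` and `books` tables and the helper names are invented for the example, and `set_trace_callback` stands in for whatever query log your stack provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
INSERT INTO authors VALUES (1, 'ada'), (2, 'grace');
INSERT INTO books VALUES (1, 1, 'notes'), (2, 2, 'compilers'), (3, 2, 'cobol');
""")

statements = []
conn.set_trace_callback(statements.append)  # log every statement SQLite runs

def titles_n_plus_one(conn):
    """One query for the authors, then one more query per author."""
    result = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id",
            (author_id,))
        result[name] = [title for (title,) in rows]
    return result

def titles_joined(conn):
    """The same data fetched with a single JOIN."""
    result = {}
    rows = conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id")
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result

def count_selects(fn):
    """Run fn and count the SELECT statements that reached the database."""
    statements.clear()
    out = fn(conn)
    return out, sum(s.lstrip().upper().startswith("SELECT") for s in statements)

slow, slow_queries = count_selects(titles_n_plus_one)
fast, fast_queries = count_selects(titles_joined)
print(slow_queries, fast_queries)
```

Both functions return the same dictionary here; only the number of round trips differs, and that number grows with the author count in the first version. Note that the JOIN version would skip authors with no books, so it is not a drop-in replacement in general.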
Yes, that's the whole point - an ORM is a crutch developers use to avoid learning how databases work. Yet weirdly they're happy to learn the ORM! Hell, in Hibernate you have a query language called HQL; why not just use SQL?
Said developers will never take the next step of swapping out bad ORM-generated SQL for good hand-tuned SQL. The app is therefore doomed.
You seem to have worked with a lot of lazy hack developers, and I can only sympathize. I hope you find a better group of colleagues soon, and maybe you'll find less use for words like 'never' and 'doomed'.