The side that argues for ORM has chosen the application, the codebase, to be in charge. The central authority is the code because all the data must ultimately enter or exit through the code, and the code has more flexible abstractions and better reuse characteristics.
The disagreement comes down to what each side thinks a database is for. To the OO programmer, strong validation is part of the behavior of the objects in a system: the objects are data and behavior, so they should know what makes them valid. So the OO perspective is that the objects are reality and the database is just the persistence mechanism. It doesn't matter much to the programmer how the data is stored, only that it is stored, and it just happens that nowadays we use relational databases. This is the perspective that sees SQL as this annoying middle layer between the storage and the objects.
To the relational database person, the database is what is real, and the objects are mostly irrelevant. We want the database to enforce validity because there will always wind up being tools outside the OO library that need to access the database and we don't want those tools to screw up the data. To us, screwing up the data is far worse than making development a little less convenient. We see SQL not as primarily a transport between the reality of the code and some kind of storage mechanism, but rather as a general purpose data restructuring tool. Most any page on most websites can be generated with just a small handful of queries if you know how to write them to properly filter, summarize and restructure the data. We see SQL as a tremendously powerful tool for everyday tasks, not as a burdensome way of inserting and retrieving records, and not as some kind of vehicle for performance optimization.
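To make the "small handful of queries" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The blog schema, table names and sample rows are all made up for illustration; the point is only that one query can filter (published posts only), summarize (counts per post) and restructure (one row per post, ready for a template) without the application looping over objects.

```python
import sqlite3

# Hypothetical schema and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        published INTEGER NOT NULL
    );
    CREATE TABLE comments (
        id      INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts(id),
        author  TEXT NOT NULL
    );
    INSERT INTO posts VALUES
        (1, 'ORM considered harmful', 1),
        (2, 'Draft', 0),
        (3, 'In defense of SQL', 1);
    INSERT INTO comments (post_id, author) VALUES
        (1, 'ann'), (1, 'bob'), (1, 'ann'), (3, 'cy');
""")

# One query builds the whole index page: filter, summarize, restructure.
index_rows = conn.execute("""
    SELECT p.title,
           COUNT(c.id)              AS comment_count,
           COUNT(DISTINCT c.author) AS commenter_count
    FROM posts AS p
    LEFT JOIN comments AS c ON c.post_id = p.id
    WHERE p.published = 1
    GROUP BY p.id, p.title
    ORDER BY comment_count DESC, p.title
""").fetchall()

for title, comment_count, commenter_count in index_rows:
    print(f"{title}: {comment_count} comments from {commenter_count} people")
```

The ORM-style alternative would typically load each post object and then its comments, which is more round trips and more code for the same page.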
At the end of the day, we need both perspectives. If the code is tedious and unpleasant to write, it won't be written correctly. The code must be written--the database is not the appropriate thing to be running a web server and servicing clients directly. OOP is still the dominant programming methodology, and for good reasons, but encapsulation stands at odds with proper database design. But people who ignore data validity are eventually bitten by consistency problems. OODBs have failed to take off for a variety of reasons, but one that can't be easily discounted is that they are almost always tied to one or two languages, which makes it very hard to do the kind of scripting and reporting that invariably crops up with long-lived data. What starts out as application-specific data almost invariably becomes central to the organization with many clients written in many different languages and frameworks.
We're sort of destined to hate ORM, because the people who love databases aren't going to love ORM no matter what, and people who hate databases will resent how much effort they require to use properly.
Speak for yourself. I love databases (note the plural form) and love ORM. ORM is a godsend for developing an application that has to work against different databases (postgresql, mssql, db2, etc).
Any conformant client code then must honor these rules, and oftentimes that means it must re-implement them, which is an acceptable cost if we have decided to use an RDBMS in the first place.
Now it's true, a given database may only implement a subset of all applicable business rules--maybe some fall outside the scope of the database, maybe it's preferable to offload some to a trusted client, maybe the business and database model have drifted apart over time, and no one wants to overhaul the database model due to all the dependencies involved.
That said, any rules that the database does implement are a good thing, especially those simple rules that can be implemented as constraints. And it's good because then you can program against them, from any client, from any code, inside the database and elsewhere, and you can make guarantees about what possible states the data could be in. This is generally a useful thing.
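As a rough illustration of rules implemented as constraints, here is a small sketch with Python's built-in sqlite3 module. The accounts table and the try_insert helper are hypothetical names invented for this example. Whatever client issues the INSERT, the database itself refuses rows that break the rules, so every reader can rely on emails being present and unique and balances being non-negative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the rules live in the schema, not in any one client.
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        email   TEXT NOT NULL UNIQUE,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")

def try_insert(email, balance):
    """Return True if the database accepted the row, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO accounts (email, balance) VALUES (?, ?)",
                (email, balance),
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("a@example.com", 100),  # valid row
    try_insert("a@example.com", 50),   # violates UNIQUE
    try_insert("b@example.com", -5),   # violates CHECK (balance >= 0)
    try_insert(None, 10),              # violates NOT NULL
]
row_count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(results, row_count)
```

A script in another language hitting the same table would get the same rejections, which is the guarantee being described: only the first insert lands, no matter how careless the client is.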
I admit I have no statistics, but it's been my experience that most places choose between a highly OO model + ORM and a highly relational model without.
My experience has always been a highly relational model, ORM or not, and business rules enforced in app layer or DB (or a mix of the two). I've always seen them as distinctly different decisions.
Personally, in the past I was always a "rules in the app layer" guy, because of the many advantages of doing it that way, but as I get older the more difficult but guaranteed correctness of implementing in the database is becoming more appealing (especially if it's not me that has to actually write the code!!)
Surely this is a flawed world view!
> To us, what's most important is the data, so everything else must serve that end
To you, yes, and I don't fault you for defending that perspective. But the real master who must be served is maximizing "profitability" while maintaining an acceptable level of risk.
The only people who are totally wrong are those, on either side of this argument, who ignore the very real advantages of the other side or the risks of their own. (Which would make the author of the original article the one that is most "wrong" in this discussion, as far as I'm concerned.)
This is ideological, right? What's most important is the business. Anyone that starts from the assumption that everything, EVERYTHING, must serve the end of the data, is wrong. Right?
We can make up interesting dilemmas all day. How about this one. There is an optimization that Facebook can make which is shown to increase monetization by 10%, but it creates some risk of data corruption. Engineers estimate that it will corrupt 0.01% of Facebook posts. Do you choose a 10% increase in monetization, or does everything have to serve the end of data integrity?
"I don't believe in hypothetical situations" -- Kenneth the Page, 30 Rock.
I would probably pick the data. It's what can be monetized and it's impossible to regenerate.
Having to recreate the software might even be beneficial in the long term if your engineers are careful enough to avoid second system syndrome.
What matters is consistency, usability, and agility. Throwing ORMs out the window will give you as much consistency as you can squeeze out of an SQL server, but will greatly reduce your agility. Using an ORM for everything will greatly increase your agility but will reduce your usability.
As in everything, there is a balance. People who fall on either side of that balance need to back away from the pulpit and rethink their stance.
These days you just don't hear about DBAs at all any more. You used to see constant jokes about DBAs being a pain in the ass and stopping programmers doing X or Y. ORMs are going to win because there aren't enough of you left. Stored procedures, triggers, etc. are going to be viewed as ancient technology back from the days of yore when people didn't understand how to code properly.
The database is where you store your data. If you have data whose integrity is critical to your organization, a properly designed and maintained database is going to save a lot of hassle.
I believe that databases will remain important, and maintaining data will always involve restrictions on how you can use it. Restricting data is not a relational database problem - it's more often than not a business constraint. Oftentimes you don't want programmers doing stupid things with your data :-)
I incidentally have stopped programmers from doing X or Y, but it was because the right answer was Z.
As for ORMs winning, I don't think it's a war. For some things I use and recommend ORMs, but for others I recommend using pure SQL.
You may be right about perception, but nearly every system I've worked on has contained a big ugly mess somewhere because the author didn't know how to use a SQL DB properly.
Honestly, every time I see how badly Facebook handles data and caching, I can't help but wonder why they don't use a real data store and DBAs.
(I am an engineer/developer/whatever at Facebook, and I'm always interested in hearing the perception of the company's technology from the community.)
1. I've always been under the impression that for what Facebook does, a traditional RDBMS simply cannot handle the scale (like, not even close). Is this correct?
2. I'm also under the impression that due to the architecture Facebook runs on, from time to time some less important data (e.g., a status update or comment) can be lost (temporarily or permanently) and this is not considered unacceptable. (It seems perfectly reasonable to me for this particular use case.)