However hand-coding the SQL has some advantages a lot of folks don't want to discuss. While relations and objects are to some extent incompatible, focusing on writing good relational code usually results, at least in anything I write, in a net reduction in codebase size. This is because there are a lot of things that can be done in relations faster and easier than could be done in objects.
The answer in my view is not to pick one or the other but to define an interface across them. This is often an ORM in some projects but having talked with some ORM developers I respect, at least some tend to map objects to relations where the relations are views, thus encapsulating the relational logic behind relational processing (which is smart).
Again I go almost the other direction. I prefer to define logic interfaces which encapsulate the relational logic and then add a service locator to handle the interface. This was a big motivation behind PGObject::Simple::Role.
I am not convinced that ORMs are so bad if they are also used with updateable views (and thus providing a stable interface for the application, protecting the internals of the storage from the application's need for intimate knowledge). However, I personally find it just simpler to write SQL.
They have a sort of low-level python SQL api, a higher level declarative ORM layer, and a custom glue that let's you wire objects and sql together in arbitrary but useful & maintainable ways.
I think this was possible because the SQLAlchemy devs had this insight: "SQL databases behave less and less like object collections the more size and performance start to matter; object collections behave less and less like tables and rows the more abstraction starts to matter."
They also have the benefit of being fully type-safe across query projections.
In other words, SQL has never been purely relational. It is a beast of its own.
Let me state this, SQLAlchemys pool/engine/SQL-Api is great, and it's got nothing whatsoever todo with ORM.
But SQLAlchemy does contain an ORM, and it does become unmaintainable just like every other ORM. It roughly goes like this:
1) You use the ORM
2) Things get slow
3) You write a bit of custom mappers
4) Soon all your logic can't be operated without custom mappers
5) Congratulations, you've now written all the handwired code you would have written without an ORM, plus all the ORM code on top, it's now unmaintainable, full of bugs and oversights AND slow.
And that about sums up everybodies experience with ORMs unless they do trivially small things. It's at this junction people begin thinking about solving this ORM problem. And they learn it's bloody hard.
Obligatory XKCD reference: There are N ORMs on the market, they all suck. Let's solve all these problem and do a new ORM to rule them all. There are now N+1 ORMs that all suck on the market.
Having said that, do not underestimate the number of apps doings "trivial" things. There are lots. The problem with SQLAlchemy I think is that it is actually advanced enough to have you use it right up to the point where you must rewite all your code because it suddenly is just dead slow. In contrast the Django ORM is too simplistic to get too far with it.
But, the problem is that, you rarely get to go to your boss and say, well, this thing we've been using to do, you know, our data stuff, it's completely wrong. We need to throw everything away and start from scratch.
The only situation in which you get to say that, is when you're explicitly make it clear that you're writing a raw prototype that'll not scale, and that nothing short of a complete rewrite is gonna be required as soon as you finish (and even then, it can turn out tricky to actually do that).
Unfortunately, most people who get themselves into the ORM mess don't realize that this coming for them in the future. So they don't go their boss and make the prototype proposal. This all but cements the entire failure of the project right then.
I find this works well. It speeds up development of all the simple stuff , and by the time it gets complicated, you can drop down to SQL.
I don't find the two ways of thinking incompatible, in fact the "extra" method on Djangos ORM is an intermediate where you let the ORM do its normal simple style of mapping, but allows you to add extra clauses to the SQL manually. Once you do this you see how the ORM works, and can choose quite easily when it is appropriate to use it.
You loose a bit of the magic that comes with Django by the time your queries get more complex (like auto generated admin pages), but the benefits outweigh the costs immensely in my case.
What this thread is talking about is when you have datasets that are several gigabytes in size and up, when you have many dozens of tables, all interrelated and when you need to squeeze as much performance out of your hardware as you can. When your code gets complex enough where you cannot just insert an object, but have to do all manner of verification through half the DB first. Or where you need to operate on 100,000 rows in an intelligent manner very fast (think using stored procedures and complex query parameters). This is a place that you might end up and if your entire codebase depends on an ORM, you are often out of luck.
In the long run, I think there several things to do: (1) know SQL well, (2) use an ORM when you know the project will be small, and (3) anticipate that if you are growing and using an ORM, you will need a re-write.
80% of the queries are simple and use the ORM directly. A few need an extra clause for stuff that isn't handled by the ORM, and I have a couple that need to be high performance (and are abstract enough that they don't fit well with an ORM), which are done via the raw SQL.
My experience was somewhat different in terms of ditching SQLAlchemy: the performance started to suffer as we kept adding clients. Also, developers started tripping over their own feet to do simple things as relationships got much more complex and the ORM did not allow us to impose certain rules easily (data did not always flow through the ORM as it was due to historical and performance reasons). Instead, a co-worker and I sat down and wrote a small library that did the common things for you. It was broken up into modules and we standardized on some specific methods for each "object": get, find, insert, delete, update. Each object also had non-standard more interesting methods that allowed us to manipulate relationships in a controlled fashion. The management did not mind this as it took very little time, was a gradual transition (we converted performance critical code first, then anytime we touched anything else).
Some things I learned from this:
1. Creating and maintaining a library like this is actually very little work. Who cares if it's raw SQL if it's super simple and you only write it once. Something like "SELECT * FROM vehicle WHERE id = ?" does not need unit testing, etc. and almost never changes. On the other hand the crazy complex queries we had were actually simpler in raw SQL than trying to express them in the ORM. To use an ORM efficiently you must know SQL already, except you can usually write better, clearer SQL than an ORM can.
2. The benefits in performance were huge. We could do all manner of dirty hacks inside the library functions since the library was self-contained.
3. Migrations were a pain, until we created a simple text file where we would append the latest migration. I used a tool that compared schemas to verify that the live database was in line with what a blank initialization of the DB looked like. This could definitely use some better tooling but practically this was a small pain.
4. The benefits from code that used this library were even bigger than the performance benefits. Other developers on the team were locked into what that could do with the data. If they needed to create a new method for a given table/object, the went ahead, but anything complex typically got reviewed by at least one other person and bugs were caught early. Moreover, all the data manipulation now happened in one single library so it was much harder to introduce inconsistencies.
5. We got to actually use our RDBMS. ORM's try to give you the lowest common denominator between the ORM's they support. They don't typically let you do views, sharding/partitioning (in the RDBMS sense, not in the application), SQL functions/procedures, etc. We got to do all of this and it often got to be the better solution.
Having used EF6+LINQ in 2014 I feel this post from 2006 is quite outdated. The article tells you as much:
> Work in this space has thus far been limited, constrained mostly to research projects and/or "fringe" languages, but several interesting efforts are gaining visibility within the community, such ... the LINQ project from Microsoft
LINQ's come a long way since 2006.
 Mostly this comes down to understanding the leaky abstraction between what IQueryable can do  with respect to what an ORM can do with SQL generation
 Also knowing your IQueryables from your IEnumerables.
I don't want to give up either objects or relational databases, (the choice that some other comments have suggested). What I am willing to give up is an ORM generating SQL. Writing SQL doesn't bother me. What bothers me is dealing with Java PreparedStatements, Connections, and ResultSets. This is all trivial boilerplate stuff, that an automated approach can do easily. I want to preserve my brain and fingers for the hard stuff, writing clean, fast Java and SQL.
I have heard that iBatis takes this approach, but I haven't tried it myself.
Not to mention cache specifics (something that has personally bitten me).
To be honest, my preference is actually to write a service locator against stored procedures. This allows queries to be maintained separately from the application (and loosely coupled from them also). This is not too far from what iBatis does except that I would use stored procedures instead of mapped sql.
What db's are you using? If Pg only, interested in porting the PGObject approach to Java?
Ironically, developers would never think to do this with a storage mechanism like the filesystem. There is no great popularity in filesystem based ORM's. Somehow when we deal with the filesystem we treat it as it is - data storage and retrieval. When we deal with 3rd party api's we tend to keep them at arms length as well. Yet, when we deal with the database, we treat it as some other thing that seeks to influence the design of our code on a fundamental level.
I wrote about this like a year ago: http://brianknapp.me/the-filesystem-test/ and I don't think much has changed since then.
You can view the database as merely being there to support your program, or you can view your program as being there to manage the database. If it's the latter, then it's not surprising that the database design impacts the design of your code. So we don't keep it at arm's length, because it's much more important and complex than a 3rd party API or even a filesystem.
Certainly when my team leader suggests another quick hack at the code level to make the database do what we want, I usually push back, and suggest a change at the database level to model the situation more accurately. His solution would be quicker, but mine will be more robust and maintainable (so less work in the long term).
If I needed to look in a folder containing a million files, find a subset containing the word 'foo' and match to the 'joining' files in another 4 folders (all of which contain 500k+ files) - I think my code would resemble my filesystem more than my filesystem resembles my code.
The file descriptor represents the file in your language and is the equivalent of parf of what some ORMs do. It's an explicit interface between external data and your program, just like a good SQL library (like SQLAlchemy).
The problem with ORMs is not with just storing data, that's pretty simple and you don't even really need an ORM for that. The most difficult problem comes when you introduce complex relations, and need a way to retrieve those datasets from the database. AFAIK golang does not help you in any way with that, so you'll have to invent your own ORM and conventions for representing things like belongs_to, has_many and join relationships. If you have join tables you'll be doing the relations yourself which is not always simple or performant - this is what ORMs are useful for.
Clearly ORMs break down in apps of a certain size and have their own issues (largely due to non-optimal SQL and/or a reluctance to break out of the ORM when necessary), but every app contains the sort of logic which is in an ORM, because it must map storage to in-memory and view representations. If you have logic to convert *sql.Rows to your objects, including relations, you have written a simple ORM.
github.com/lib/pq is a typical driver, which returns rows of values which you read to recreate your objects.
SQL isn't going away because it's valuable to exploit relationships between data and because it's pretty good at doing that. Anything that tries to replace it, eventually looks suspiciously like it. (I'm looking at you, MongoDB aggregate functions)
The thing is that the object oriented programming model and the relational database model are both good models. However, the relational model is a logical model specifically intended to be independent of the particular structure used to store data (presentation independence: it uses only a few relations so as to allow data to be presented using any logical conditions on those relations) while object oriented programming is primarily focused on creating specific storage structures (it's not a specific model but a set of tools for build whatever-the-heck). You can create a limited definition of object this is compatible with the relational model (see ) but such definition more or less neuters OO as a tool chest for building any structure that tickles your fancy(such a limit can be good or bad depending on the context).
 For example, the daunting Third Manifesto (TTM), by Hugh Darwen and C.J. Date http://www.thethirdmanifesto.com/
I think the issue is how that independence of interface is maintained. In the relational model, the question is one of transparency in internals and set (or more properly bag) operations, while in objects it is defined interfaces and encapsulation.
It seems to me that's the problem, but you can bridge them by creating stable, defined interfaces (updateable views for example).
OK, a different way to put it is that the relational model provide one kind of independence and object orientation provides another. The way that a foreign key column is not OO encapsulated but does reference a different column allows SQL expressions that aren't dependent on which value is the "real" value.
"you can bridge them by creating stable, defined interfaces"
Yes you can. At any one point in the development of an application, you can pick a mapping and say "no problem", this is true.
The problem is that if you develop an application further, have an object return an object instead of a value say, then the problem of how to change you mapping to reflect the development is hard, harder than one would think. It's not the snap that's hard, it's the evolving application that uses OO and relational models that becomes weirdly hard, where the "impedance mismatch" become evident.
I am not so sure that is really the problem though. As I noted, if you do ORM mappings to updateable views, that both simplifies and complicates things. On one hand it means more code to maintain because you have to map physical storage relations to application-facing logical ones.
But on the other hand, it solves the problem you are describing quite nicely because your apps only see an intermediate form of the data. You control this intermediate form and it is independent of how the data is stored.
The real problem I see with ORM usage generally is that instead of thinking about interfaces they think the ORM is the interface, but this means that the application depends on intimate knowledge of how the data is stored, a dependence the relational model is designed to get you out of. That isn't really a problem of the relational model -- as you point out, this independence is exactly what the relational model provides and so if you aren't getting it, it is because you aren't really using the relational model.
Rather it is a problem of trying to eliminate relational thinking from app development.
"One of the key lessons of Vietnam was the danger of what's colloquially called "the Slippery Slope": that a given course of action might yield some early success, yet further investment into that action yields decreasingly commensurate results and increasibly dangerous obstacles whose only solution appears to be greater and greater commitment of resources and/or action. Some have called this "the Drug Trap", after the way pharmaceuticals (legal or illegal) can have diminished effect after prolonged use, requiring upped dosage in order to yield the same results. Others call this "the Last Mile Problem": that as one nears the end of a problem, it becomes increasingly difficult in cost terms (both monetary and abstract) to find a 100% complete solution. All are basically speaking of the same thing--the difficulty of finding an answer that allows our hero to "finish off" the problem in question, completely and satisfactorily."
For history buffs from Europe, the experience of Germanicus (at least as recounted by Tacitus) in the swamps across the Rhine may have a similar implication. Oh look, we're the Roman army, the most powerful army in the world. Defeating a few barbarians should be easy! But then the actual requirements start piling up, one gets into the swamps, can't move about like one would have hoped, logistics gets disrupted, and eventually it takes outside help even to retreat.....
In the alternate history where the US did not turn tail and run out of Vietnam like a whipped dog, abandoning their former allies in the south to be utterly subsumed by their enemies from the north, how did the 1976 election turn out. Was it still Carter?
This is self contradictory. Failure of national morale in war is actually how wars are normally lost, not somehow something different than normal defeat. And its relevant to the analogy, in that...
> ORM are doing a good (well, at least passable) job of doing what they are intended to do, but are generally hated.
If that's true, its because what they are designed to do isn't what the people hating them need done. It indicates a gap between what is available and what is needed; intent may be interesting, but in the end not all that important.
ORM is an attempt to bypass proper data design, when in fact more attention should be paid to the data model in the DB than the structure of the code.
Functional programming has a much better impedance match to the relational concept; maybe this will mitigate the issue in the future.
* it has model generators
* need an insert into or an update? bang, one line of code
* find_or_create? one line of code
* acess, create, update or delete related records? one line
* it has chained where clauses and hits the db only when it needs to access the data
* want hashes instead of objects in order to send them as a json response in a REST service? one line of code
and many more features. Is there an O/R "impedance mismatch" ? I don't know, I don't care and I don't use it if there is
That approach is really quite fine and wonderful, if you can put all your business logicy awesomesauce into the database layer. But it can make a variety of problems really awkward if you want that business logic to interact with code that lives in a different process... starting with "validating user input on a web form" and going on from there.
Also, I'm curious whether you'd write stored procedures in your database to set up objects for your automated testing suites, when you want to test the business logic in your stored procedures. :P
I don't know. We've certainly had our awkward moments with LedgerSMB. However, there are a bunch of things that have to be understood to do this right.
1. Business logic in the db requires you to think carefully about interfaces. Getting the right interfaces all the way down is rather difficult but it can be done with appropriate attention to detail.
2. Error handling is a very specific case of #1. You need to think about notifying the application of an error, not the user.
> Also, I'm curious whether you'd write stored procedures in your database to set up objects for your automated testing suites, when you want to test the business logic in your stored procedures. :P
I just test the stored procedures in transactions that roll back. This is pretty nifty because I can even test write operations on a production db without worrying about polluting the production instance with test data.
1. I can grep all of my SQL! No dynamic SQL generation means all SQL statements are known. This alone has been huge to me. Of course it depends on storing the SPs in version control and settling on good naming conventions (eg. always write table.column).
2. The SPs are easily testable and verifiable endpoints to themselves.
3. Small Performance boost of using compiled / validated SQL.
4. No dynamic SQL also means that performance tuning is a snap. I see a bad actor I can track it down easily and fix the problem.
One thing that has made this possible is a convention of naming SPs by tablename_DMLname_[description]. e.g. customer_sel_byemail.
It could be a view (a select query, stored as a database object) or a procedure (a grouped set of sql statements that might involve changes that are written to the database, possibly a simple insert, or simple update, or simple delete, but potentially much more complex).
DML is one class of SQL statements, and DDL is another. The two together encompass most of your common, practical database operations.
Am I wrong? Are people using ORM's in their "passion" projects? Or are they just a way to take some of the pain (and perhaps insecurity, think SQL injections) out of boilerplate apps?
Then a couple of years ago, I started playing with a DataMapper/Unit-of-Work ORM (Doctrine2, in PHP-land). My experience, writing a moderately sized warehouse-management application, has been very good.
In Doctrine, you don't write your schema/DDL. Doctrine does, based on your Entity and Association mapping. You write plain-old-PHP-objects, and then map their relations (in XML, annotations, or YAML). Doctrine reads and validates your data model, and generates DDL statements to create the schema. Big surprise: it creates pretty much exactly the same schema you'd expect. (And it will generate and manage migrations, too)
Unit of Work is a powerful pattern, too, especially for web apps. You just worry about updating entities in memory. The ORM then flushes them to the datastore in a single transaction at the end of the request (so, by default, you have a one-to-one mapping between http requests and transactions in the RDBMS).
Datamapper/UoW is probably overkill for CRUD stuff; that's where ActiveRecord shines. But it pays dividends when you're dealing with more complicated state-transitions on your underlying entities, and there are lots of cross-cutting concerns.
In a recent project I manually mapped a large sprawling ancient schema manually using Doctrine2, and still never hit any major blockers. The code just abstracts away over the cracks.
Does anyone have any real world example articles on where an ORM has totally failed, and why?
I also appreciate all the work the SA devs have done to help mitigate SQL injection. I recommend using the SQL Expression Language without the ORM to help mitigate SQL injections (following from defense in depth). The oursql connector also helps with that as it supports real parameterized queries where as some of the other connectors do not.
I certainly do care what strings get sent to the Database though, I can tell you that! If you use an ORM and don't know how to inspect the queries it generates then I'd consider that to be somewhat negligent. That's just me, though.
Relations are easily persisted because they only have fields meaning they are easy to persist (store).
Objects are not easily persisted because they have methods meaning you are not always assured that the object is in a safe persist-able state (unless they are value objects).
- "Top down", domain-model centric, hierarchical, object-oriented, etc.
- "Bottom up", relational-model centric, relational, etc.
We've also nicely summarised this on our page here: http://www.hibernate-alternative.com
ORMs solve the first approach pretty well. ORMs have never claimed to solve the second approach. But people are lazy and constantly looking for magic bullets, which is why they are trapped by thinking that ORMs should also be a solution to the second approach.
But even Gavin King always stressed the fact that one shouldn't use Hibernate for everything.
At least from my knowledge:
the rise of Rails and ActiveRecord
NoSQL databases have become more prominent and widely accessible.
How has the Author's opinion changed?
I could submit a link-baity article from the 90's decrying object oriented programming as a pointless dead end in computer science, but I doubt the original author would hold the same opinions
However, there's good work advancing the state of the art in this field. Look at Object Role Modelling implementations like ActiveFacts ( https://duckduckgo.com/?q=ActiveFacts ).
It was old, but it at least pointed out a theme that still holds some validity.
I like getting injections of knowledge and themes, that one should be aware of, and from the replies and comments here, it seems that a lot of other people feel the same way too.
I got fucked by using .NET and Linq some years ago, so I found it to be an interesting read. Nowadays I use Objective-C and core data, and that is a much more pleasant experience, as you can see from the comments. :)
George Santayana (1863-1952)
My ORM of choice doesn't take away the ability to write raw SQL when you want, though. I wouldn't want to use an ORM where you couldn't extend it with your own SQL.
I kinda suspect that started the whole anti-ORM thing in response to the ridiculous complexity of some frameworks.
I still use ORMs but I've decided that it has to be a light approach that doesn't try to totally hide the SQL code from you.
Check out the Db library.
You can't tell anything about what tool is appropriate for the job they're trying to do from their lack of investigation.