This certainly doesn't match with my experience, and I've worked on dozens of projects using ActiveRecord. Some of these were hairy production projects with strict performance requirements and complicated schemas.
About 90% of the time, the ORM does the right thing. It creates, updates and deletes individual records, and it generates entirely reasonable queries. And it does this without requiring dozens or hundreds of lines of SQL.
You'll get much better performance out of ActiveRecord if you know a few tricks, however:
- Read the SQL queries in development.log to make sure they're sane, and use New Relic to find slow queries in production.
- Use 'include' to load a model and its associated objects in 2 queries instead of 1+N queries.
- Use 'select' to load only the necessary columns from your tables.
- Don't try to force it: If you hit something weird, never hesitate to fall back to SQL. ActiveRecord provides a dozen different places where you can inject small SQL fragments into your queries, or you can just call 'find_by_sql' and do everything by hand.
If you start out with basic ORM models, benchmark your database, and add some raw SQL as you scale, you can get reasonable rcesults with ActiveRecord. If it breaks down for 100% of projects, you either have a weird definition of "breaks down", or you're doing it wrong.
Now I'll grant you that this may not personally bug you that much. It bugs me quite a bit though.
If you are using a relational database, know sql, and want to eliminate a TON of boilerplate, use an ORM.
I don't know of many ORM's that claim "Use us and you won't have to know SQL". That's just a straw man argument.
If you define breaking down in the terms mentioned by the comment you are replying to, you are playing a semantics game to frame the debate to be: this ORM sucks because I can't use it without knowing sql.
It's just that in software, the argument "If I already know A, so why should I have to learn B" isn't really useful. Given that software is often about pushing down complexity, one of our main tools is building a "B" so that we don't have to do "A".
I know how to write socket code in C, but I'm glad for libraries that don't force me to think about that level--I can pop up a level or two. Examples continue ad infinitum.
On the other hand, I can buy the argument "B isn't really very good at abstracting A; I've tried to use B, and every time I use B, I end up just doing A anyway. So B isn't useful."
But with ORMs, my experience isn't so black-and-white. I find them very useful a lot of the time. When I run into problems, I can switch to doing SQL directly. That's an okay tradeoff for me. I can certainly understand if others don't find that tradeoff as useful.
My point was more B doesn't in my experience abstract A well enough to justify its use in anything other than toy or really simple apps that won't live very long or won't have much maintenance. I don't mind an upfront cost if it means I won't be pulling my hair out at 2am trying to make some database call take less then 10 minutes to complete. (Exaggerating there but you get the idea.)
If B doesn't do a great job abstracting A then the time spent learning B feels wasted to me.
(Edit : extrapolate to abstract)
Since most ORMs I've worked with are not really as complicated to learn as you try to make it sound, the trade-off is clear.
EDIT: fixed a typo.
Sometimes, this one place is actual a MySQL proxy. Check out ScaleDB, which they just launched - it looks promising.
But other times, it will be in your own app. That is the ORM :) But in that case I wonder how often you really need relational data stores.
You will actually arrive at some sort of ORM if you start abstracting your CRUD operations into ONE PLACE. Basically ORM is the product of DRY.
and it's not a bottleneck? it's shared nothing?
No one is forcing you to do anything, you can write your own database access layer if you like. Often developers who do this end up using parts of their code across different projects, basically they write their own ORM.
Or you can use a pre-built ORM, as far as I am concerned for me this is the saner choice. I really doubt I could write something on my lonesome that is as secure, performant and has the features of ActiveRecord in a non trivial amount of time.
But the important part is that whichever path you go down, if your project is growing, complex or needs to scale, you need to know your stack. You need to be able to modify or reconfigure each part. You need to understand at a high level what each part of the stack does, and be able to learn exactly what it does at a low level quickly should the need arise.
ORM has lots of benefits. but having to learn less is not one of them. Biggest one is that you don't need to write/maintain a separate code base (sql) for CRUD operations. that's BIG benefit even if you get to use it on 40% of your database interactions. Something as stupid as not worrying about creating the right data conversions and logical types to store data in, will avoid a ton of bugs from intermediate programmers.
I find ORMs useful for simple CRUD but that's so little of what I do that I mostly don't bother adding a huge framework to my code just to do some CRUD.
A set oriented DB API that cuts down on boilerplate is the way to go in my view. And that is simple enough to create on a weekend.
This works out very nicely for things like web applications where the currently-logged-in-user can only manipulate his own data. You can write your model so that it takes the current user and restricts all data to his rows so that your application doesn't even see data that the user is not allowed to see. It's really quite wonderful and something that raw SQL queries simply don't provide enough abstraction to handle flexibly.
Ultimately, people dislike ORMs for the same reason they dislike things like CoffeeScript. They're not mentally comfortable delegating code generation to a computer program -- that's their job. And so, they will complain whenever they are looking at code that's not exactly what the lower level of the system will see. But in general, that's due to inexperience rather than actual dislike.
The point of working with sets of tuples is that you can apply any set operation to a set of tuples and what you get is a set of tuples. You can use selection, projection or union and you still get a set of tuples. The recursive nature of this principle makes it so powerful and simple. Kind of like Lisp.
ORMs introduce a conceptual barrier between objects defined by domain classes created at design time and generic containers used to hold the result of projections and aggregate functions. If you join over two classes and project over some of their attributes you lose all the properties of the original classes, including any attribute accessor methods. Same thing with aggregate functions and grouping.
Even if the result of a query involving projections or aggregate functions happens to have the exact same attributes as another existing class, you don't get objects of that class and hence none of its methods. The tuples don't compare equal to objects of that class either because they don't share their type.
Different ORMs have various features to partially paper over these issues. I don't want to get caught up in a debate about what ORMs can or cannot do because powerful ORMs like hibernate can do almost anything and everything is optional.
The point is that they discourage me from working in a set oriented fashion because they make it hugely more complex than it needs to be. They create a conceptual mess and then they want me to learn a lot of tuning techniques to fix what is broken as a result of that mess. It makes no sense to me.
Same thing with aggregate functions and grouping.
The goal of a query is not to process sets of objects. The goal is to answer a question. The ORM gives you two choices. Either limit yourself to answers that can be expressed as objects of existing classes or get something that is fundamentally different from all other domain objects - a generic tuple of values.
You lose the homogenous, recursive transformation capability that the relational model provides.
In terms of simplicity, I suggest you take a few queries involving joins, group by, having and aggregates and translate them to procedural code. I've done that a lot. It's very eye opening. Functional languages using homogenous data structures like lists or generalized sequences can be similarily productive for data that fits into memory. That is a more and more viable alternative to SQL in my view, but OO systsms and querying/transformation just clash badly.
But what ORMs encourage you to do is to load a bunch of objects and then use procedural code to do the real thing.
I fully understand that not all types of algorithms and datastructures lend themselves to set oriented thinking. I work mostly with those nowadays. But for those cases it makes no sense to use RDBMS at all.
But what ORMs encourage you to do is to load a bunch of
objects and then use procedural code to do the real thing.
My rule of thumb is pretty simple. I use whatever takes fewer and/or simpler lines of code unless it's a lot slower.
The raging scalability debate is a different matter. I'm sure if you're Google there are good reasons to write more lines of code in order to scale better. I'm not Google so I can prioritize productivity over scalability.
This is not a good comparison. With ORM, you started with Python/Java/C#/etc + SQL, and ended with Python/Java/C#/etc + SQL (a lot less). There is nothing new added.
One example of the Data Mapper pattern is in python's SQLAlchemy. Another is in perl's DBIx::ObjectMapper.
I'm comfortable working like that, but it is exactly what the article is criticizing. Sure, it works - but the whole idea of an ORM was supposed to be avoding that.
At least that is what I believe, I have never seen an ORM advertising itself with "NEVER WRITE ANOTHER LINE OF SQL AGAIN".
Maybe some ORM zealots said that in the past, but that certainly isn't the common conception now. ORMs as they're actually used properly today allow devs who know SQL really well to eliminate code dup. If you can't drop into SQL quickly, stay away from ORMs, you're going to hurt yourself and others.
No database access abstraction is going to be great for all situations. Whether it's object-oriented, functional, or DSL-based, there are going to be situations where it's frustrating to use. However, each of these abstractions have a purpose and a clear benefit for a certain subset of situations, or else they likely wouldn't have been created.
Since people find some abstractions useful for some subset of situations, why waste so much time and energy fighting against its use entirely--labeling it an "anti-pattern"? On the other hand, why waste time and energy arguing for its ubiquitous use? Like with all other technology choices, the right prescription is to use an abstraction when it makes sense, and to abandon it when it does not.
Maybe it would be more constructive to frame the discussion in this way: what cases are ORMs useful for, and what cases are they bad at? How can we identify the good use cases from the bad? Which use cases seem to be good uses but turn out to be bad, and why? What are alternative solutions for those bad use cases? Are there different philosophies of ORMs with different tradeoffs? What are those philosophies and tradeoffs and how can we choose which one most suits our problem?
ORM's are awesome to get things done quickly, but as stated in the article, and stated by those very same ORM's, you do sacrifice some performance.
As usual with these articles, there is a false dichotomy that one should choose between ORM or no ORM. It's a perfectly sound decision to use an ORM and hand-optimize whenever the need arises.
Also, ORM's usually _are_ optimized for the 90% use case, so going of on your own to write those in plain sql for your entire domain model is passing up on a huge amount of leverage.
Now a normal person would say "stupid satnav" but an ORM fan would say "stupid car" (blame the database for being slow).
It is possible for someone to improperly use X, therefore
X should be banned.
ORMs - OK if CRUDs are your '90% uses cases'.
ORMs - fail if you need something slightly more complicated.
But they don't fit really!
I like the way original author explained it: the problem was that people were forced or convinced to use wrong abstractions. Due to the massive marketing of really big players since 80s, relational databases are everywhere around; most of developers during the years took for granted that they MUST use SQL, so they stopped thinking if they really need relational model.
Luckily we have other setting these days. There are plenty of production-ready non-relational database engines. To simplify your application layer and really map your OO data, please do use document database - they fit. That's it. No amount of pretending will dissolve the differences between relational model and object-oriented model.
If in doubt, please go back to the theory, to the computer science. You don't have to dig through tons of papers from 70s when relational model has had its peak of scientific evolution - you just buy yourself the CTM book and go through all the computational models with examples of the real code. Please do the lesson. Otherwise any discussion will be about habits and beliefs again, not about science and engineering.
And - I hate to say it - most of the comments like "here we go again" are really about habits and beliefs. How sad. The foundation of our occupation is hard science and engineering. We strive to build better things based on such concrete foundation. By sticking to the thinking "OO + relations fit" we abandon this foundation and we are turning into believers of some cult created decades ago by the marketing people of the few companies everybody knows in order to massively increase sales of RDBMSs. This is not engineering.
Please, put yourself in the distance from the habits of using specific tools you're accustomed to; please think about it.
None of this is saying a particular technology is wrong or bad but it's a reminder that Eric was right to point out that they all have drawbacks which developers should consider when picking a good fit for their application. Blindly asserting that a particular one is good or bad is no better than telling carpenters they only need hammers.
Even Google, for all of their massive resources and BigTable's deserved respect, uses a ton of SQL databases - and they didn't add support to AppEngine simply because they were unwilling to tell developers to try something new.
tl;dr: "Know your data access patterns and pick a good fit with your resources"
I also know that they use at least one big financial reporting packages but I'm not sure whether the use of Oracle Hyperion is evidence for or against wisdom.
Do a Google search for mysql google contributions and plenty of stuff pops up.
This does not mean ORM's are an anti-pattern, or bad. ORM's provide a lot of value for the 90% use case that people often need to bang out under various time constraints (mvp, project for a customer, etc).
There is a trade-off when using an ORM that you, or your team, should be aware of. Once you start hitting the limits of what an ORM can reasonably do for you, it's fine to either:
- drop down to using handoptimized sql for specific queries / submodules of your project (e.g. a set of reporting pages)
- figure out if you are trying to punch a cube through a circle by using a relational database over a nosql type storage, and consider moving part of your data to a different storage.
I haven't had a lot of experience with nosql databases yet, but I think it's also interesting to consider that relational databases have been around for a long time, and alot of corner cases where they are unwieldy are known. Nosql databases are not as vetted against reality yet, so abolishing one for the other may amount to nothing else than trading one set of problems for another.
But there are plenty of abstractions code virtually never has to plumb around (Many kernel APIs/memory management, network protocols, various programming language abstractions, etc).
If an abstraction requires "plumbing around" in almost every project, that is indeed a problem with the abstraction.
Yes! We get it! We got it years ago: ORMs are a compromise over an irreducible set-theoretical problem. I got it in CS class and I get it now. You can't square the circle -- no one is claiming to have squared the circle.
But for 90% of the cases, it turns out that really experienced developers use and leverage ORMs to boost productivity and eliminate code duplication. It happens every day. It's not habits and tools, it's from careful consideration of the results of what happens in practice. Good Baconian Science should teach us to go out into the world and observe and report. And what we've learned is that neither side of the Object/Relational tension really ever fully wins out, and what works is experienced devs using the right tool at the right time.
We code in the real world, get burned by bad practice and adjust. That's why ORMs have survived, because good devs have found ways to used them as effective bandaids on difficult problems. We don't code in a classroom and I couldn't care less whether my code violates anybody's CS or Set Theory dogma. I have features to ship.
- they create a dangerous illusion of an leaky abstraction which leaks very quickly
- they require programmers to learn 2 things anyway
- they initially speed up the coding, but later slow it down, esp. when you have to tune stuff (how many times you peeked generated SQLs only in order to feed them to EXPLAIN ANALYZE... ?)
- they can promote bad habits, code which takes your engine to a crawl
- they hinder debugging, esp. tracking performance problems, often effectively putting a problem under the carpet to be discovered by admins
There's also a funny thing: most of the world seems to happily use active record (as a pattern), although when you go deep into it it occurs far inferior to data mapper (as a pattern). That also demonstrates that many people are using habits, not engineering, and stick to what's used around instead of making some investigation on the topic. This is not bad per se; but when it comes to discussing problems, such people are not in the position to argue, precisely because they haven't done their homework.
So, the problems above are caused by the mismatch which one can expect from the theory. You've got what you paid for.
I guess we all could learn most from the history of relational databases or just try to remember what relational databases really are and what other examples of this technology are around. For example Prolog is relational (apart from logic engine). If one does something in it then it becomes more clear how much it "fits" objects or not.
Document databases are usually not a perfect fit either in my experience (I have worked on a few MongoDB projects). The fact remains that you are responsible for storing the state and retrieving that state in an optimal way (this is where you usually end up writing some custom document db queries, or creating large multi-indexed collections specifically for reports).
What would be ideal is if state was managed transparently for you using a lot of shared / clustered memory and a transaction log. I.e. something like Terracotta in the JVM (No I don't work for them, and have never even used it. I just love the theory behind it). You'd still want to archive data somewhere (Relational or Document), so you would still have that complexity for large datasets.
Forget NoSQL, how about NoFetch? (oversimplified perhaps).
Application requirements may sometimes provide broad limits for technology choices but the link isn't very strong as there is a huge gap between the problem space and the solution space (evidenced by the army of software developers needed to translate between the two)
Software is not hardware and it's not even engineering in my view. The extent to which we in the software space are able to define and redefine jobs as well as tools makes this kind of thinking useless.
As far as I know Doctrine 2.0 absolves a lot of these problems but I've not used it yet.
If the requirement change requires a different backend using an ORM is actually easier, faster and less error prone.
I like to see people push back on that; let's at least get to a state where we can consider dropping it from a project.
That is covered in the article. ORMs are an advantage in the very early stages of a project, but a disadvantage in the later stages. Since a project by definition spends the least amount of its lifetime in the early stages, don't paint yourself into a corner!
For most of my projects I use Django nowadays. Its ORM is usually sufficient. In cases where it is not I add a query method to a model class that just executes the most efficient sql query possible and returns the results in the most efficient format possible. For mature applications this can result in quite a lot of those query methods and the ORM than plays a lesser role.
However, even in those case the ORM continues to be a convenience to me as a developer, e.g. because it powers the Django admin interface and because together with South (a database schema migration tool) it makes schema management a breeze.
An ORM is a tool. And each tool has its own place and time. So, yes - there we go again ...
Thanks for mentioning this. I'm just learning Django, mainly for using its admin interface to avoid a lot of CRUD, and South seems just what I need to evolve my schema while I go along.
Said developers will never take the next step of swapping out bad ORM-generated SQL for good hand-tuned SQL. The app is therefore doomed.
Getting tired of this. The fact that a tool or method has limitations and risks doesn't make it an anti-pattern.
Airports are in relative short supply, so once you've taken the train, your stuck on it for quite a bit.
Migrating away from an ORM or using raw sql queries in addition to an ORM is possible at all times (with more or less work, depending on the ORM you've chosen). So you can start with the train and hop on the plane anytime you want or need.
The reason it has a different URL now is a long story: my hosting provider had a meltdown and I was less than religious about backups for my personal blog (oh, the shame!) so I lost timestamp information on this and a few other posts, as well as all the original comments. So it looks like I posted this a month ago, but it's actually more than 3 months old.
Anyway, feel free to argue about this all over again -- it's new to you. I just felt I'd post so nobody who remembers this the first time around thinks I'm intentionally trying to game HN.
This was just as applicable when you first posted three months ago, Laurie. : )
So I'm sorry your ORM failed you. But there's no way in hell I'd ever go near your codebase if you're shying off them in favour of raw SQL, but also because talking about things as being 'anti-patterns' is ... urgh.
It sounds like a lot of people who are Anti-ORM in the comments (here, and on her blog), are focusing on the saturation of relational databases, and touting that NoSQL style data stores are a better fit for certain problems.
No argument here
But in a relational DB, assuming you are using it for its intended purpose (storing relational models, and not "objects"), I don't see why anyone would want to eschew the use of a proper ORM, with the exception of incredibly small projects with minimal database access.
For medium to large projects, I've never once regretted using an ORM. Not having to write my CRUD statements? Having all of an objects relational data lazy-loadable and at my finger tips? Sign me up! Just my experience.
I feel as though this author has worked in environments with other developers who didn't "get" the benefits of ORMs, and thus the implementations of those ORMs suffered. This is just my opinion from what I've read. I could be wrong here.
Death by a thousand queries, however, is a completely valid complaint. In fact, it's my one of my only major gripes with ORMs. But maybe I've just never run into these cases that the author has that makes ORMs so unbearable to use. Maybe i'm just lucky?
Also, she re-blogged this from several months ago: http://seldo.com/weblog/2011/06/15/orm_is_an_antipattern
I've heard SQL Alchemy is pretty hot at that kind of thing.
Sure, as your app grows, you write SQL to speed up the bottlenecks. And that's O.K.
Also, one of the pro ORM arguments is that it generates database-specific SQL, making your code theoretically portable between different RDBMS. Hand writing SQL will break this.
Another point: your ORM would automatically pick up changes to your model and cope with it, not your hand-written SQL, making your code harder to maintain.
Yes, some ORM's are more flexible than others.
>> ... one of the pro ORM arguments is that it generates database-specific SQL ... portable between different RDBMS
Yes, hand writing SQL will break this but:
- it is rare in my experience to completely migrate to a different RDBMS
- migrating to a different RDBMS should easily be viewed by everyone from management to development as a very tricky process that will take time to perform correctly. Part of this time will be tuning your custom queries.
- the ORM promise that it's queries are RDBMS independent is not broken
>> Another point: your ORM would automatically pick up changes to your model and cope with it, not your hand-written SQL, making your code harder to maintain.
Writing everything by hand (known alternative 1 from the article) makes your code even harder to maintain because there will be zero changes that are automatically picked up.
Depending on the KV store you use, his second alternative may not have this problem.
Hand writing sql comes with a cost, but the overal cost of "ORM + a bit of handwritten sql" is likely to be much less than "everything handwritten".
> Another point: your ORM would automatically pick up changes to your model and cope with it, not your hand-written SQL, making your code harder to maintain.
You mean, I'd need to re-write LESS hand-written SQL with model changes. This is a selling point, not a down side.
In my experience this article is spot on. It's easy to say "oh sure just use custom sql for this edge case" Only those edge cases turn out to be increasingly common and it is increasingly hard to work with data that may be coming at you in multiple formats.
And if you understand the types of queries being done under the hood by your ORM then sometimes using a custom query to retrieve ids to be used in a simpler ORM query is not such a bad idea. I think it's a case of doing things consistently in your code base so you use your ORM when it helps you and don't fight it when it's not going to help you.
I've done this several times using Django's ORM.
Really, there are good and bad ORMs. Bad ORMs produce the results you mention. Good ones are much more adaptable.
Maybe the gp doesn't know about it or chooses not to use it, but it exists.
The other existing solution provided by the author is to switch to a key-value datastore, which still would not solve your problem.
I don't even think necessarily much of an initial productivity boost. SQL doesn't take all that long to write, and once you do, you have complete control over your app.
I'm not completely sure what's meant by the statement "Use SQL in the Model." I think I agree, though I'm hoping this doesn't mean including SQL in the model class itself. I'd avoid coding references to how an object will be persisted in the model class itself. I'd lean toward creating a DAO (or some other API for persisting/retrieving an object) that you can use/call without worrying about how it is happening. The difference is that I'd rather implement that API through SQL than through an ORM framework.
Active Record is a pattern. Ruby's ActiveRecord and Django's ORM are implementations of that pattern. The Active Record pattern has some of the problems mentioned in this article, with convenience being the major benefit.
Data Mapper is another pattern. Python's SQLAlchemy and Java's Hibernate are implementations of that pattern. These will get you a lot farther without bumping into limitations. And honestly, I've written some terribly ugly queries in SQLAlchemy and it still worked. I've even started with an ugly query I wanted to make in raw SQL and then made sure that it was possible using just the library and no SQL strings.
I'm not saying that to be snarky. If I hadn't used a good ORM like SQLAlchemy (although I'm sure there are others out there), I'd be inclined to agree.
"Unfortunately, these advantages disappear as the project increases in complexity: the abstraction breaks down, forcing the dev to use and understand SQL"
"Forced" to use and understand SQL? I think you have a basic misunderstanding of the purpose of ORMs, Laurie. The developer who can't use and understand SQL really has no business going near the ORM. They probably need to stay somewhere near the front-end doing design, layout and UI-centric work.
The purpose of an ORM, properly used, is not at all to completely remove SQL from the developer's toolkit, but to help developers who fully understand SQL and Objects eliminate boilerplate SQL. This has a huge dual benefit of productivity boost plus elimination of bugs by the reduction of code/SQL duplication.
If an ORM developer is not aware of SQL (i.e., watching their rails log for odd queries/joins) and the natural problems that arise, such as n+1, etc. then they probably shouldn't be writing data-layer concerns. ORMs are for experienced developers who know how to use things like the ActiveRecord or DataMapper DSLs when appropriate, and can also drop into straight SQL when appropriate. ORMs are power tools for serious developers who can use them for what they do well, and know how to avoid their weaknesses, which you outline well here.
Because of their benefit and the cumulative experience we've had in the last 10 years or so getting to know how to use ORMs effectively, they're not going away any time soon. However, they certainly have their pitfalls and it's important that we have posts like these every few years or so to remind us what those pitfalls are. So thanks!
Absolutely. First we have developers trying to use ORM (a.k.a. HideSQL) to avoid learning SQL. That fails. Now we have developers trying to use NoSQL to avoid learning SQL. This "avoid learning SQL" thing is shaping up to be a fantastic anti-pattern. I wonder what's next.
TBH: If you want all your relational logic in your app, NoSQL systems make an awful lot more sense these days -- esp. if you've got a decent API/wrapper-layer that can automatically manage the freshness of locally cached data for you.
I find that the problem when discussing ORMs is that we gave them two completely different responsibilities: generating queries and mapping the results... In fact, those two are already separated in Rails (AREL?).
I don't see much of a criticism to the mapping part (and the benefits given to you in form of validation/metadata/migration/schema versioning)... In fact, unless you're going to access the records directly via the database driver API (giving way to hideous coding), you'll probably reinvent the wheel or do the O-R mapping by hand to fill POCOs/POJOs/etc, as I've seen on so many enterprise projects (in different levels of elegance).
But then, again, somebody please correct me if I'm wrong.
In C#, using LInQ I can specify exactly what columns I want, how I want things joined, and how the resulting object should composed, without the need for adding additional named classes or augment my data-model. I get control of what gets sent to my DB and I get only the information I need returned, in the form I desire. And I certainly hope that C# still qualifies as an OOP language.
His rant seems to be that inadequate ORMs is an anti-pattern. But isn't investing anything in something you know to be inadequate by definition an anti-pattern? He is making his own circular argument and then trying to sneak up the scope beyond what is valid.
Needless to say, I think his argument is simplified, based on dated information and this invalid.
This doesn't seem like an insoluble problem; rather than use an ORM to generate an impenetrable glue layer, it would be better to use code generation tools to create a clean database model and a glue layer that can be easily overloaded as required.
What tools exist that give you ORM-esque generation at the start of your project, but allow you to do maintenance on the glue layer later in the project?
> This doesn't seem like an insoluble problem; rather than use an ORM to generate an impenetrable glue layer, it would be better to use code generation tools to create a clean database model and a glue layer that can be easily overloaded as required.
I'm not sure it would work though. One of the problems I see is inherent with code generation approach: what happens when you had to modify the generated code and then changed your meta-model and want to regenerate the model again ?
Besides, the problem of "fitting square pigs in a round hole" would remain: SQL is much more versatile and flexible than objects/structs: you simply can't just work with the a table=an object abstraction, which tends to yield inefficient database queries and/or headaches from the developers side. Example:
update table set one_field=value where some_condition_on_another_field.
The same applies for the select queries, especially for reporting and data aggregation needs, where you would select some columns from multiple tables, possibly with formulas and computed columns.
SQL isn't even turing complete. How can it be more versatile and flexible than your programming language of choice?
> How would you expect an ORM to generate a "dense" query like this with it's simplistic view of things ?
Then you might go all the way and try to abstract away the fact that you're reading a record from a database. For what, I don't know, but it feels often that there's some perceived benefit in pretending that everything, including data, is a native object and all objects are alike--even if you have a heavy marshalling mechanism to prove that isn't the case.
Also, learning a different ORM every time you switch environments is costly. Switching from, say, Django to LINQ2Entities has nearly no knowledge transfer. Dealing with the quirks between MSSQL, postgres, MySQL is comparatively simple.
I call it PluSQL: the ORM for SQL lovers:
Combine that with LINQ and you have a 80-90% coverage of complex joins.
Also combine it with CQRS principles and I hope that when my project requires scaling (and I so hope it does) then I can easily scale
select first_name, last_name, login, email from users where id = 5;
result = execute(query);
User u = new User();
u.firstName = result.get("first_name);
u.lastName = result.get("last_name);
u.login = result.get("login);
u.email = result.get("email);
The same also applies to the other direction: when you want to generate an SQL query that updates/inserts an object of yours into the database.
ORMs also handle (I didn't say well) relations between objects (1..1, 1..n, n..n).
It may be the case that you don't, but really -- if what you want to do with objects is retrieve them, modify them, and save them, then a key value store is more like what you need.
Basically it's just a standard way to do the basic stuff.
If you want optimized select queries, you can still write them, also inserts are optimal with ORMs.
INSERT INTO a SELECT * FROM b;
I personally have not felt the need to use and ORM yet. But then I have not built a CRUD heavy app either. Instead most of my time spent on database work is centered around reporting and other complex queries where no ORM I have seen yet helps much.
The pattern represent a very good abstraction concept that is irrelevant to the library implemtation.
Data Warehousing is all about JOINS, big JOINS. Or that thing they call it data-cubes. In that case you need query optimizer, lot and lots of buffers, data partitioning and several layers of caches. You also should use stored procedures, because it is a good way to structure and manage your code, same way modules work for other languages. So, you know, that old lovely DB2.
Even in old times, people who claims that there is a solutions that fits both cases were considered crazy. That is why no sane person considered MSSQL (leave alone MySQL) as something other that a taste-less joke.
Nowadays people forgot about designing in terms of data flows. Everything starts with installing some framework, such as Spring+Hibernate, Rails or some PHP crap. They forgot that not tables itself, but queries (which type, how often) to them is what matters, that indexes optimized for actual query flow is what performance of a server is all about, and that actual structure of tables (and corresponding indexes) must be adapted/redesigned for that particular production flow of data. That was a DBA's job.
Today some people believe that they can eliminate smart but costly engineers (DBAs or sysadmins) by some software which is marketed to them as Easy, Smart, Fast, Zero-thinking or whatever - ready meals for a mediocre crowds. OK, if you're building a 20 pages web-site for a 100 visitors per day, that might work - you can save some money and time, but, if it is a industrial or internet-scale solution, there is no chance that you can run Rails or say Code Igniter crap in production without huge changes or total redesign. No fucking way.
So, all those specialized solutions, such as memcached, redis, membase, mongoDB are about dealing with flows of technical data, such as AJAX queries from UI, logs, authorization requests, chats, photos and other unimportant things, OR about building a huge distributed cache layer above actual data store. But, of course, you cannot build a Data Warehouse out of it. (Or invent a complete different approach to dealing with data, such as map/reduce).
So, ORM is anti-pattern? It is not efficient? Ridiculous. It is just broken by design. ^_^
In all seriousness, I'd love to see a criticism of ORM that takes Arel into account (http://magicscalingsprinkles.wordpress.com/2010/01/28/why-i-...). I have yet to see one.
go read martin fowler's _patterns of enterprise application architecture_.