

Keep Your ORM Out of my Controller - mononcqc
http://ferd.ca/keep-your-orm-out-of-my-controller.html

======
Groxx
For large-scale projects, this makes a lot of sense. Without another layer of
abstraction to keep database-content-specific code out of your controllers,
you end up with:

 _db -> orm -> 100,000 flat-level queries tied directly to the format
returned by the orm_

With abstraction like they suggest, you can achieve:

 _db -> orm -> [your organization method here] -> your never-changing data
you requested_

Significantly better if you're working at that scale, and if you ever need to
change your DB/ORM. In a situation like that, I wholly agree: there should be
_zero_ ORM-tied code in your controller.

BUT: tutorials are rarely-if-ever for people working at that scale. You
_should_ have _some_ grasp of what you're doing if you're attacking a problem
of that scale, or at _least_ be beyond the tutorial stage. If tutorials
addressed this, they'd remind me of most beginning stuff for Java programming:

 _subclass this, override main, subclass that, subclass that, subclass that,
build a polymorphic inheritance structure, print "hello world". Easy!_

 _Totally_ opaque until you understand the underlying organization of Java,
and it tends to distract newcomers from things they _want_ to achieve, by
bogging them down in things they _should_ do when they have no comprehension of
_why_. Once you understand why, it's merely annoying; before that, you really
can't know if you're supposed to use an interface, inherit from Object or
Array or GenericArrayListWithBellsAndWhistlesRadixSorted, or just code the
damn thing by hand.

ORM tutorials are for coding the damn thing by hand, a necessary starting
point for building that scalable abstracted structure, and the foundation of
it. Not for helping you build a 3000-table, 5-billion-row, 10,000-request-per-
second Google-killer.

------
aphyr
I run into this question frequently: how much logic belongs in the controller,
and how much in the model. The difficulty, I think, is that every method you
add to the model requires a name, and the profusion of names leads to
complexity and poorly named methods.

An alternative is to collapse similar methods (for example, two queries
differing only by "order by") into one, and use a parameter or function
argument to distinguish the cases. But this pushes ORM-specific logic back up
into the controller--only now with an extra layer of indirection for the
programmer to remember how to use.
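A minimal sketch of that trade-off, using Python's standard `sqlite3` module rather than any particular ORM. The `books` table, the `list_books` function, and the ordering names are all hypothetical, chosen only for illustration:

```python
import sqlite3

# Hypothetical ordering vocabulary; the SQL stays in the model layer.
ALLOWED_ORDERINGS = {"name": "name ASC", "newest": "published DESC"}


def make_db():
    """Build a throwaway in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (name TEXT, published INTEGER)")
    conn.executemany(
        "INSERT INTO books VALUES (?, ?)",
        [("Alpha", 1990), ("Beta", 2010), ("Gamma", 2000)],
    )
    return conn


def list_books(conn, order="name"):
    """One method instead of list_books_by_name / list_books_newest_first."""
    # ORDER BY cannot be bound as a query parameter, so look it up in a
    # whitelist instead of interpolating caller input.
    order_sql = ALLOWED_ORDERINGS[order]
    rows = conn.execute(f"SELECT name FROM books ORDER BY {order_sql}")
    return [name for (name,) in rows]


conn = make_db()
by_name = list_books(conn)
newest_first = list_books(conn, order="newest")
```

The caller no longer writes SQL, but it does have to know which ordering names exist, which is the extra indirection described above.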

My approach has been to define methods on the model only when they are used in
more than one place or have a concise, meaningful name. Introducing the
abstraction later rarely requires modifying several aspects of the
application, and Sequel's def_dataset_method/chaining approach makes this
relatively painless.

~~~
Xurinos
> how much logic belongs in the controller, and how much in the model

I always thought this was really simple (at least based on the original Xerox
definition of MVC):

Controller transforms inputs/parameters as needed. Controller can use Model to
do this. Controller creates/manages views and sends latest data to the Views.

And a confusion-reducer: Controller is not between Model and Views. Views have
direct access to Model. It ONLY uses Model to figure out the situation.

I have found these clear separations helpful in a variety of applications. I
have also found that people that apply MVC to the web confuse themselves, too.
On the server side, your View is what you are returning to the browser.
JavaScript is part of the View of the server. It has its own MVC when it is
run in the browser environment, and the browser has its own MVC. It is
perfectly acceptable to have interlocking components, each with their own MVC.

~~~
mononcqc
A note: The architecture model where the controller sits between the model and
view is instead called 'Model-View-Adapter'
(<http://en.wikipedia.org/wiki/Model-view-adapter>), where the controller (or
adapter) is doing the mediation between the model and the view. It supposedly
allows for a stronger separation of concerns.

------
stevenwei
I've found the repository pattern
(<http://martinfowler.com/eaaCatalog/repository.html>) quite useful in large
projects to abstract away some of the detail of the ORM. Using this technique
has the added benefit of being able to mock out the repository during unit
testing (so you're not tied to having to setup/teardown your database between
every run, which if you have a large project, can take quite a bit of time).
It also helps with maintainability because all of your ORM related code is
basically in one place.
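As a rough sketch of the mocking benefit, assuming a hypothetical `Book` type and `BookRepository` interface (none of these names come from a real library): the controller-level function depends only on the interface, so a test can hand it an in-memory stand-in instead of a database-backed implementation.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Book:
    name: str
    price: float


class BookRepository(Protocol):
    """Hypothetical interface; an ORM-backed class would also satisfy it."""

    def cheaper_than(self, limit: float) -> list[Book]: ...


class InMemoryBookRepository:
    """Test double: same interface, no database setup or teardown."""

    def __init__(self, books):
        self._books = list(books)

    def cheaper_than(self, limit):
        return [b for b in self._books if b.price < limit]


def bargain_page(repo: BookRepository) -> list[str]:
    """Controller-level code: talks to the repository, never to the ORM."""
    return [f"{b.name}: ${b.price:.2f}" for b in repo.cheaper_than(10.0)]


fake = InMemoryBookRepository([Book("Alpha", 4.5), Book("Beta", 25.0)])
lines = bargain_page(fake)
```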

I do think this is overkill for smaller projects as you're just adding another
layer of indirection. In most projects, I think it's best to start out by
keeping things simple, and refactor towards these additional abstractions as
the need grows.

One thing I've always found when using an ORM (especially in web projects
although really in anything with a user interface) is that the abstraction is
always leaky, especially as you start to optimize for performance.

You might start out with a simple query for Books, but then realize for this
specific page, you're only rendering Book.name and Book.price, so you should
limit your query to those fields. Or you realize that for another page you
should be eager loading Book.authors and lazy loading something else.

If you don't pay attention to the fact that your data is ultimately backed by
the database you could end up doing things that cause your ORM to issue more
queries or request more data than is really necessary. But once you do start
paying attention and customizing/optimizing those queries, your queries can
end up very specific to the page that they're being rendered on.

------
mattmcknight
I think we call these Named Scopes:
[http://api.rubyonrails.org/classes/ActiveRecord/NamedScope/C...](http://api.rubyonrails.org/classes/ActiveRecord/NamedScope/ClassMethods.html)

Maybe it's this in Django:
[http://docs.djangoproject.com/en/dev/topics/db/managers/#mod...](http://docs.djangoproject.com/en/dev/topics/db/managers/#modifying-initial-manager-querysets)

The overall purist argument, that the controller should have no knowledge of
the database, seems ridiculous. If one removes or adds a field to the model,
one would expect there to be cascading changes across the whole system. If you
generate most of your SQL with an ORM, you can switch database with minimal
fuss. Although, how often do you do that anyway? ORM provides a lot of benefit
in reducing the amount of code you have to write for many operations.

~~~
cageface
Beginning programmers make the mistake of abstracting too little. Intermediate
programmers make the mistake of abstracting too much. Experienced programmers
know when abstractions add value and when they just add another layer of extra
complexity.

Generally speaking, designing your code up front to answer vague & ill-defined
portability concerns you _may_ encounter down the road means you're
abstracting too much & too early.

~~~
mononcqc
Can't help but feel this is directed at the post. While I can definitely agree
doing it for _every_ single ORM call for maintenance purposes is sometimes
overkill (some calls just are easy to refactor, or following a rule similar to
"3-strikes you're out" keeps it sane), I find it hard to imagine another way
to get code that is as easy to test and to mock. Any suggestion?

Then I think Groxx hit the nail on the head with his comment
(<http://news.ycombinator.com/item?id=1636322>). The advice is mostly valid
for large-scale projects, things you expect to stay forever, which is what I'm
maintaining at the moment; not for experimental or short-lived code (or
whatever is logical on that spectrum).

~~~
cageface
In modern web stacks you've already separated your view from your controller
from your models, and you've abstracted most or all of your database code into
an ORM. That's a lot of framework to punch through when your abstractions
start to leak already. At _some_ level your controller has to know something
about what objects exist in your database and what fields they contain.

I've also found it's extremely hard to predict ahead of time what projects
will turn out to be long lived and large scale so I try to build the simplest
thing possible first and then take advantage of the flexibility of modern
stacks to iterate and refactor from there. Things like named scopes in rails
should be more than enough to keep your controllers from getting too tightly
intertwined with your models.

~~~
mononcqc
I can see what you mean by the controller needing to know something about the
objects it will obtain. It makes sense, although I guess we might disagree on
how much info we want to let go through.

I think predicting the length and scale of projects likely depends on who you
work for and what products they want. I might be biased because I've spent
most of my professional life working for a large-ish business where products
in general end up being reused a lot by other internal products and all have
an active life of at least 3 to 5 years, during which different developers
will 'own' them. In this case, you can understand how planning ahead might be
necessary. I'm currently supporting an application that's 14 years old and
small leaks like that end up hurting a whole lot.

Then I guess I'm not as strict for home projects where I'll be the sole owner
and developer, but that's the opposite end of the spectrum.

~~~
cageface
_I might be biased because I've spent most of my professional life working for
a large-ish business where products in general end up being reused a lot by
other internal products and all have an active life of at least 3 to 5 years_

The interesting and frustrating thing about building software is that the
answer to most questions is "it depends". What you're describing may be
appropriate in your case. In the case of most of the projects I've been
involved with adding any significant extra abstraction on top of reasonable
use of an ORM would be overkill in most places.

------
teilo
The problem is not the ORM. Anyone who is writing such complex queries
directly in the controller needs to re-think how they have designed their
site. Specificity like this belongs in the View and the DB itself, not in the
Controller OR the Model.

Adding umpteen custom getters to the Model is not helping matters. It may be
easier to test, but it is just as unwieldy and difficult to maintain. It is
the wrong solution to the problem. These things need to be parameterized in some
way and made generic.

In Django you would likely do this with Q objects, and a front-end that lets
you filter down your results on-the-fly, and save the resulting query to the
DB for recall later. This is even easier to test, and prevents your Models
from getting cluttered with functions which are far too specific.
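To make "parameterized and generic" concrete without depending on Django, here is a small sketch of composable filter conditions in plain Python. The `Cond` class, `field` helper, and sample rows are hypothetical and only loosely modeled on the idea behind Q objects; this is not Django's API.

```python
class Cond:
    """Tiny composable filter (hypothetical; not Django's Q class)."""

    def __init__(self, test):
        self.test = test  # callable: row dict -> bool

    def __and__(self, other):
        return Cond(lambda row: self.test(row) and other.test(row))

    def __or__(self, other):
        return Cond(lambda row: self.test(row) or other.test(row))

    def __invert__(self):
        return Cond(lambda row: not self.test(row))


def field(name, value):
    """Condition that matches rows whose `name` field equals `value`."""
    return Cond(lambda row: row.get(name) == value)


def search(rows, cond):
    """One generic entry point instead of many specific getters."""
    return [row for row in rows if cond.test(row)]


# Made-up data for illustration.
authors = [
    {"name": "Ada", "country": "UK", "active": False},
    {"name": "Ben", "country": "UK", "active": True},
    {"name": "Cy", "country": "CA", "active": True},
]

active_british = field("country", "UK") & field("active", True)
not_british = ~field("country", "UK")

names_active_british = [a["name"] for a in search(authors, active_british)]
names_not_british = [a["name"] for a in search(authors, not_british)]
```

Conditions built this way are plain values, so they can be assembled from user input and combined with `&`, `|`, and `~` instead of adding one model method per combination.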

~~~
mononcqc
I didn't know about Q objects before. What are their capabilities regarding,
say, moving from an SQL database to a web-service or a third party
application?

I mean, this is the kind of stuff I have to do surprisingly often when
maintaining legacy applications and putting them up to date. Your data model
ends up changing or just not supporting all the same idioms again. You might
need to join some info coming from your database with half of it coming
from an RPC.

I've found that such changes in where your data comes from cause a whole damn
lot of problems on all the layers above. I haven't figured out a
better way than wrapping lots of it in fat models, but if Q objects have
something that can help with that I'd be very happy to learn about it.

~~~
teilo
Q objects are a way to represent complex SQL in a progressive fashion that
would be unwieldy or impossible with the standard Django ORM chaining
technique.

It really doesn't address your use case.

------
pvg
That's an impressively verbose and obfuscated way to re-state the very basic
and completely independent (of ORM, Django, Python, etc) notion that if you
have common code you should abstract it into a separate function.

------
sofuture
Using an ORM does buy you some abstraction, though. You get to worry about
your business objects and fields instead of your tables and columns.
(Depending on the language/situation this can be a big deal)

You're right that it doesn't provide _all_ the abstraction you probably want,
but it easily gives you a toehold on which to build that abstraction. Doing it
from scratch, by hand is often a much worse place to start.

~~~
mononcqc
I agree. Maybe I should have been clearer on my views. ORMs are a great tool;
the way they are used with the leaky abstraction is what I have a problem
with.

Restricted to the model layer, they're entirely worth it and provide lots of
help.

------
endlessvoid94
I hate rails. But one thing I like is that activerecord will build functions
for you. e.g.:

User.find_by_email(email)

I've started writing these types of functions manually now and it makes
EVERYTHING so much simpler.

I never thought I'd say it...but...thank you rails.

~~~
chrismsnz
So you're using ORM specific functionality in the controller? :)

~~~
endlessvoid94
I...guess so.

But it successfully removes the reason that becomes painful and leaky, e.g.,
queries. The function names are short and readable, and the execution of the
query is in the ORM where it belongs.
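A hand-written finder of that kind might look roughly like this in Python, using the standard `sqlite3` module. The `Users` wrapper, the `User` type, and the `users` table are hypothetical names for illustration:

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: int
    email: str


class Users:
    """Model-layer wrapper; callers never see SQL or column names."""

    def __init__(self, conn):
        self._conn = conn

    def find_by_email(self, email: str) -> Optional[User]:
        # The query lives here, so the calling code stays short and readable.
        row = self._conn.execute(
            "SELECT id, email FROM users WHERE email = ?", (email,)
        ).fetchone()
        return User(*row) if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('ann@example.com')")

users = Users(conn)
found = users.find_by_email("ann@example.com")
missing = users.find_by_email("nobody@example.com")
```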

------
mgkimsal
"This means that we're in the controller but we still have to grasp what the
hell is going on in the database. The abstraction is leaking!"

How many people JUST work in controllers and can remain oblivious to what's
going on in the DB? Just because you move the ORM or any DB call to a
different area of the code doesn't mean you shouldn't have a grasp of what's
going on.

If I wrapped all ORM calls in some service layer classes, that simply means
that now when I'm looking at my service layer classes I "still have to grasp
what the hell is going on in the database. The abstraction is leaking!".
Doesn't it?

~~~
mononcqc
Try renaming a table column. Which one of the two has more impact on the code?

Of course the same programmers can work from the model up to the view, but the
idea is to reduce the impact of changes in lower layers, and help with
testing. The example of the database info required is used to illustrate the
leak. It's not a horror in itself.

------
misterbwong
I think the author needs to make a clearer distinction between business logic
and view logic. I'm an advocate of creating a _view_ model layer above the
traditional model containing things like sorting and field suppression, etc.

The app stack might look like this: DB -> ORM -> BizLogic -> ViewLogic -> View

This has the effect of decoupling actual business logic (i.e. get me every
british author) from view logic (i.e. sort them by name in my table) and
allows the design to be more flexible in the future.
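A small sketch of that split, with hypothetical names throughout (`Author`, `british_authors`, `author_table_rows`) and an in-memory list standing in for whatever the ORM returns:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Author:
    name: str
    nationality: str
    internal_notes: str  # hypothetical field the table should not show


AUTHORS = [
    Author("Woolf", "British", "n/a"),
    Author("Twain", "American", "n/a"),
    Author("Orwell", "British", "n/a"),
]


def british_authors(authors):
    """Business logic: decides which authors qualify."""
    return [a for a in authors if a.nationality == "British"]


def author_table_rows(authors):
    """View logic: ordering and field suppression for one table."""
    return [{"name": a.name} for a in sorted(authors, key=lambda a: a.name)]


rows = author_table_rows(british_authors(AUTHORS))
```

Changing how the table is sorted touches only `author_table_rows`, and changing what counts as a British author touches only `british_authors`.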

------
richcollins
_Now your boss has decided he wants this example search on 6 different pages
(that require 6 different controllers). What do you do?

    
    
      Copy/paste the query within all 6 controllers
      Create a class with a method (or just a function) to handle it

_

Why is the idea that you shouldn't duplicate code novel enough to make it to
the front page of HN?

------
adamilardi
In Java ("an enterprise-level language") we never call the ORM directly from a
controller. Typically we do 3 or more layers.

controller->facade-layer(can handle transactions via annotations)->business
objects->dao OR controller->business objects->dao

------
hippich
ORM is not always Model. ORM is a tool to build Model. If queries become too
SQLish, add another layer on top of ORM with methods to retrieve data which
encapsulate all SQLish methods and provide nice and consistent interface.

------
bfjotld
How innocent the article is written :) People are expecting to be able to
build a cheap just hang in there type of flying thing, which would magically
transform itself into a stealth one with minimal tweaking here and there.

Of course not. And it's not about abstractions, middle-ware, etc. Everything
is a tool. If you need to abstract, you abstract. If you have experience, you
realize that the client is not aware of what he will want in the future. That
he will want triple sorts, grouping by, etc.

I recommend to the writer of the article to memorize the Tao of Programming.
Only Tao is perfect. Yeah, you can claim your tool is perfect, that is magic,
it's not, ask the guy that has to maintain the thing.

~~~
mononcqc
I'm the guy maintaining the thing, and I'm complaining about ORMs not being
perfect. I'm not sure I get your point.

~~~
bfjotld
It means that you still fight with the tool - which is a waste of energy.
Accept the tools are not perfect, get your benefits out of them (either
financial or intellectual - for example by making something better) and use
the remaining energy in something more constructive for yourself.

Quote from Tao of Programming: 'There once was a master programmer who wrote
unstructured programs. A novice programmer, seeking to imitate him, also began
to write unstructured programs. When the novice asked the master to evaluate
his progress, the master criticized him for writing unstructured programs,
saying, "What is appropriate for the master is not appropriate for the
novice. You must understand the Tao before transcending structure."'

------
hp1995acer
I'm going to make up a hypothetical problem, then have the developer use the
most retarded solution, and then use the resulting make believe coding chaos
to prove my point.

Yay Programming Blogging

:/

