

Building a higher-level query API: the right way to use Django's ORM - j4mie
http://dabapps.com/blog/higher-level-query-api-django-orm/

======
Aramgutang
The main point raised by the article is spot-on, and I'm ashamed to say that I
had never recognised it as an issue before reading it. It applies even more
strongly for more complex lookups (possibly involving Q objects), which I've
always felt would find a better home in models.py than in views.py. And I too
cringe every time I come across the django.db.models.manager source code.

Some thoughts:

The approach goes slightly against the commandment of "there should just be
one way of doing it". It's probably best to apply it sparingly, only where
there are very clear readability and/or DRYness improvements to be had.

The `PassThroughManager.for_queryset_class(TodoQuerySet)()` bit is a bit
intimidating, especially to someone not familiar with django-model-utils. I'd
probably take the time to write a `manager_with_queryset(queryset,
manager=models.Manager)` function to make things more readable.

When you're trying to convince someone that the way they currently code isn't
optimal, it really helps if the code you use to illustrate the current way
looks like something your reader is likely to write. The strangely formatted
filter chain at the start is thus really off-putting. Instead of adding three
disclaimers asking the reader not to focus on the implementation details, why
not just write it the way most people would:

    
    
        todos = Todo.objects.filter(
            owner=request.user,
            is_done=False,
            priority=1,
        )
    

No need to make the current way of doing things seem unnecessarily convoluted,
the point you're making still stands. Though I'll admit that when you're
forced to throw an .exclude() into the chain, it starts looking a lot like
your example.

~~~
randlet
Agreed. I found the contrived chained filters at the beginning of the article
off putting but the rest of the article was quite interesting.

~~~
richardlblair
I completely agree, but I think this was author's way of demonstrating how
ugly a call to the ORM can get. With his very simple to-do list app I think
this was the easiest way for him to do that.

In real life you would obviously have all your filter in one call (most of the
time), then perhaps .values call with an .annotation call in there too.

------
mitechie
I can't agree more. Always wrap your ORM with your own logic that makes sense
to you and your application. You'll find code reuse will go way up,
readability will go way up, testing is easier. You can actually test the model
without bootstrapping the whole app. I've been preaching this in my SqlAlchemy
talks and tutorials for years.

~~~
mgw
For our project I've been thinking about this a lot lately. Could you maybe
elaborate on how you did this with SqlAlchemy and how you write your tests
against the new model? Do you now of any more in-depth articles on the
subject?

This really seems to hit a sweet spot of moving business logic into the model
and I would love to use this in some form.

------
jpwagner
Wait, what's wrong with:

    
    
      class Todo(models.Model):
        content = models.CharField(max_length=100)
        # other fields go here..
    
        @classmethod
        def incomplete(cls):
            return cls.objects.filter(is_done=False)
    
        @classmethod
        def high_priority(cls):
            return cls.objects.filter(priority=1)

~~~
lojack
There's a couple reasons that's not ideal.

The big one is that you'd lose filter chaining. With your example you couldn't
do

    
    
        Todo.objects.high_priority().incomplete()
    

The other issue is a semantic one. In your example you could do:

    
    
        Todo.objects.all()[0].incomplete()
    

which will return a QuerySet of _all_ incomplete Todo items. This, at least to
me, doesn't make sense.

The last reason is that by using a Manager you are encapsulating this filter
data. If you later decide that you want to create a new model with similar
types of filters, then you'd have to rewrite these methods. With a Manager,
both models can simply use the same manager.

~~~
jpwagner
No, I believe you've read the code above incorrectly. This would be, for
example:

    
    
      Todo.high_priority().all()
    

and would allow, for example:

    
    
      Todo.high_priority().filter(id__gte=1)
    

I haven't tested chaining these, but this might work:

    
    
      Todo.high_priority().incomplete().filter(id__gte=1)

~~~
j4mie
Your last example wouldn't work. Todo.high_priority() returns a plain
QuerySet, which won't have your "incomplete" method (as that's defined on the
Model class in your example).

------
caioariede
For simple approaches you can also bitwise through Q objects:

    
    
      >>> from todo.models import Todo
      >>> 
      >>> Todo.objects.incomplete()
      [<Todo: 2>, <Todo: 3>, <Todo: 4>]
      >>> 
      >>> Todo.objects.high_priority()
      [<Todo: 1>, <Todo: 2>]
      >>> 
      >>> Todo.objects.incomplete() & Todo.objects.high_priority()
      [<Todo: 2>]

------
zacharyvoase
> Personally, I'm not completely convinced by the decorator-based idea. It
> obscures the details slightly, and feels a little "hacky".

The purpose of the decorator is, indeed, to obscure the implementation details
in favour of more semantic code. But then again we're using an ORM which makes
heavy use of metaprogramming to obscure the details of the database layer from
us; I don't see how this is a bad thing.

~~~
j4mie
Yep, and my criticism of your suggested approach wasn't intended to be
particularly strong by any means. I could definitely be sold on the idea. It
just felt like a workaround to a problem that could probably be solved in a
nicer way.

I think my main objection is that these query methods _should_ conceptually be
on the QuerySet, and so defining them on the Manager (the "wrong place") and
magically copying them to the QuerySet (the "right place") feels somehow worse
than the opposite.

I appreciate that you raised the discussion on the mailing list, as it
highlights the fact that this is a common problem in big Django codebases.
Even pulling something like PassThroughManager into core might work (perhaps
with a nicer "manager_with_queryset" API, as suggested by Aramgutang).

~~~
zacharyvoase
But my problem is that most people wouldn't even think of subclassing
QuerySet.

When we write methods that operate on collections of things, we typically use
@classmethod. Without @classmethod, we'd have to write a custom metaclass (and
instruct our class to use that) if we wanted even a single class method on a
class. Multiple inheritance would break (or at least be difficult to reason
about) when classes defined class methods, because there would have to be both
an instance method resolution order and a class MRO. Fortunately Python's
built-in `type` provides the descriptor protocol, which allows us to have
class methods and instance methods and properties and all these other nice
things without having to metaprogram or hack the interpreter.

All I'm asking for is a similar (if less ornate) interface for Django models,
wherein the methods that operate on collections of things can be defined
alongside the methods that operate on individual things, without requiring a
knowledge of Manager/QuerySet internals.

~~~
simonw
A few years ago I was experimenting with exactly this problem, and I came up
with an API that worked using an inner class on the model. I can't find the
code now, but from memory it looked something like this:

    
    
        class BlogEntry(models.Model, MagicManagerMixin):
            title = models.CharField(max_length = 128)
            is_published = models.BooleanField(default = False)
    
            class QuerySet(models.QuerySet):
                def published(self):
                    return self.filter(is_published = True)
    
        entries = BlogEntry.objects.published()
    

Where MagicManagerMixin was some scary code that made sure the objects Manager
would use the queryset subclass.

------
pandyashreyas1
Momentarily arguments sounds right(perhaps because it's intellectually
appetizing) but I think we are forgetting basic philosophy of every layered
architecture that "lower layer provides generic api to its higher layer"
allowing higher layer to customize its every possible needs using this api.
Django ORM does exactly same thing.

author seemed to be concerned about these issues:

1\. embedding businnes logic in views:

making query isn't business logic, business logic is in your database
constraints or sometimes if db constraints are not enough then by overriding
save() and validate(). queries belong in views because all we are suppossed to
do in views is mearly fetch data from a data structure(already modeled
according to biz. logic) and representing it as we see fit(thus the name
views). theoretically this representations(views) could be of infinite types
and changes over time so queries would change for every representation and
over time but we can't go about implementing all possibilities in models. And
this is what's antipattern because we are talking abaout putting views in
models(partially though).

2\. code reusability:

agreed, some queries could be repeted many a times and if complicated enough
may clutter the code. I recommend putting querries into functions and put the
functions in views or any similar aproach(I will think of one or you figure
out one and share) but they just dont belong in models. although I believe
full reusabilty can be achieved but in some cases if we can't- well we are
choosing 'division of functionality' over 'reusability'.

and most important of all if django's documentaion does not suggest inherting
for eg. queryset class then we shouldn't (even if you yourself coded the
django framework) because these implementation details are supposed to be
concealed and hence they are free to change it in future versions making our
code 'upgrade-ugly'(if that's the right term)

------
tocomment
I see what he's trying to do but this just seems like a ton of extra
"boilerplate" code when you're trying to make an app.

I'd rather spend extra time tracking down where I've used the is_done field if
I later change it to a status, than spend all this time writing a manager for
every query I do. Unit tests help with catching it if you've missed somewhere.

On the other hand, if you have an extremely complicated query, then this might
make sense. Maybe he chose that example for illustrative purposes but it's not
really the kind of thing he's recommending you use this for?

~~~
jonknee
It's not much overhead and considering the savings in views the LOC will
probably be equal or less. It doesn't make sense for every model or attribute,
but if you find yourself doing the same types of queries multiple times, this
can be a big savings.

------
DodgyEggplant
Correct. Another benefit: easier to implement cache between the view and the
DB down the road.

------
marcofucci
I'm a freelancer and I've been using this approach since ever. I wish more
developers did too because it's really frustrating when they don't and you
need to maintain their code.

------
JoshMock
Lately I've been re-discovering the value of the ORM and putting as much
business logic on your models as possible. This, after spending way too long
writing out lots of query logic in views instead. It's amazing how you can
many times reduce complexity from 10-20 lines to 2-3 lines, and gain
reusability, just by putting business logic where it belonged in the first
place.

~~~
pestaa

        > putting business logic where it belonged in the first place
    

I've been thinking a lot about where to put business logic, and I think the
models are the closest, but not the best place for them.

A significant portion of this logic for me affects more than one model, and
while you could solve this by using public interfaces, you still need to put
in _any_ of the models, which seems like a suboptimal approach.

The ORM is already responsible for several layers, and custom business logic
is not relevant there imho.

Still puzzled how to organize my code. MVC has started to fall apart for me.
Just my two cents.

------
ecesena
A bit OT. In Yii framework (php) these are called "scopes" and there is a nice
abstraction to define non-parametric scopes via arrays.
[http://www.yiiframework.com/doc/guide/1.1/en/database.ar#nam...](http://www.yiiframework.com/doc/guide/1.1/en/database.ar#named-
scopes)

------
richardlblair
This article is great and provides some awesome insight from someone who
clearly has been down this road before. It couldn't have come at a better time
for me. I was just about to implement my own manager today, but I'll take this
much cleaner approach.

Thanks!

------
2mur
Spot on. Never (okay very rarely) do business logic in your views. Just don't.
Model methods for row-level logic, managers for table-level logic. It's easier
to test, it's easier to abstract and it's easier to change your views later.

