Django Styleguide (github.com/hacksoftware)
177 points by r4victor on Jan 11, 2023 | 138 comments



I think there are too many concepts in this, but rather than being negative, here are some tips:

Always keep your models slim. Don't stuff template related stuff in there. You need to look at those models often, so compact is a win. course_has_finished(course) is not much longer than course.has_finished(), and will allow you to expand the functionality as time goes on. Do precomputation if you need the information in a template - that keeps your templates simpler and allows you to easily expand the complexity of the precomputation.
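A minimal sketch of the function-plus-precomputation style described above. A plain dataclass stands in for a Django model so the snippet runs on its own; the `Course` fields, the `today` parameter, and `course_context` are assumptions for illustration, not something from the styleguide.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Course:
    # Stand-in for a slim Django model: fields only, no behaviour.
    name: str
    end_date: date


def course_has_finished(course: Course, today: Optional[date] = None) -> bool:
    """Return True when the course ended before `today` (defaults to the current date)."""
    today = today or date.today()
    return course.end_date < today


def course_context(course: Course, today: Optional[date] = None) -> dict:
    # Precompute what the template needs, so the template itself stays free of logic.
    return {"course": course, "has_finished": course_has_finished(course, today)}
```

A view would pass `course_context(course)` to the template, and any extra complexity later goes into the function rather than the model or the template.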

Don't use class-based views, at least not outside very specific niches like the Django admin. Class-based views will transform a simple, composable call stack into a ball of inheritance mud. Inheritance is not a good code reuse tool.

Don't make separate apps in the same project, unless the project actually consists of several different, completely independent projects. You can make subdirectories without making apps, and thus avoid dependency hell.

Also be wary of the formerly South, now built-in migrations stuff. It's built around a fragile model (perfect history representation), so has lots of foot guns.

And be wary of 3rd party libraries with their own models. You can spend a lot of time trying to build bridges to those models compared to just doing what makes sense for your particular project. I think 3rd party libraries are perhaps best implemented without concrete models - duck-typing in Python lets us do this. This includes Django itself, by the way. User profiles didn't become good until Django allowed you to define the user model yourself.


> Always keep your models slim. Don't stuff template ...course_has_finished(course) is not much longer than course.has_finished()

I disagree with this. When troubleshooting or expanding code it is super convenient to import a model and have all of your methods on autocomplete. Especially when you need the same functionality in a view, a cron job, a celery task, and a DRF endpoint.

If you want to keep it clean you can put all your methods in a mixin class and import it from another file.
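A rough sketch of the mixin idea, with a plain class standing in for the Django model so it runs without Django. The names `CourseMethodsMixin`, `Course` and `end_date` are made up for the example.

```python
from datetime import date


class CourseMethodsMixin:
    """Behaviour kept in its own file; expects the host class to define `end_date`."""

    def has_finished(self, today=None):
        return self.end_date < (today or date.today())


class Course(CourseMethodsMixin):
    # Stand-in for a Django model; in a real project this would also
    # subclass models.Model and declare `end_date` as a field.
    def __init__(self, name, end_date):
        self.name = name
        self.end_date = end_date
```

The model file stays short, while `course.has_finished()` still shows up on autocomplete.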

> Also be wary of the formerly South, now built-in migrations stuff.

Things are much better than they were in South. But yes be careful, rule of thumb is always move forward.

> Don't use class-based views

Please. For the love of God, always use class based views for almost everything. Almost everything you need is a variant of one of the built in class based views, don't make me read your copy/pasted reimplementation of it.


> Please. For the love of God, always use class based views for almost everything.

This. 100%. Leverage as much pre-built stuff as you can, especially with something as important as your HTTP layer. Whenever I run out of CRUD verbs for a model and I need to add a custom endpoint, I'll implement it in a separate APIView subclass. Convention over configuration; write boring code.


In my view (hehe), Django's class based views are a good idea implemented poorly. In theory you should be able to use any of the built-in class based generic views with minimal customizations to suit your needs, except when you want to do such customizations you're left dealing with a huge inheritance tree of mixins. It's all magic unless you know or wanna read the documentation on what each mixin brings to the view, that is _if_ you know what mixins are involved exactly, of course.


Are you aware of https://ccbv.co.uk/? After I discovered that, class based views became easy.


And if you're using DRF there's https://www.cdrf.co/


Wow, 8 years of using Django and I did not know about either of these. Thanks!


> Always keep your models slim.

As simple as possible, but no simpler. Django models are meant to deal not just with the data, but also with business logic. If `course.has_finished` is a property of the course, why would you want to have a separate function outside of the class?

> Do precomputation if you need the information in a template

If the precomputation is only needed in a template, you can (should, IMO) use template tags.

> Don't make separate apps in the same project (...) avoid dependency hell.

My current pattern here has been to create one "core" project where I represent all the internal models of the domain of the application, and "adapter" apps if I want to interface/integrate with anything from the external world. This makes it easier to extend or replace third-party tools.

> (Migrations) It's built around a fragile model.

I wouldn't call it fragile, quite the opposite. There are some annoying limitations for sure (I didn't find a reliable way to change the primary key of a model, except for creating a whole new model and migrating the data to it), but I think they stem from a strong safety guarantee: a migration can only be applied if it is consistent.


At my current company, we've had many teams over the years fail to make business logic in model methods work, and I think many other people have had similar results. The issues usually boil down to some combination of "business logic is too coupled to the data model" and "this method lives at an intersection of these two models and creates weird dependency problems". I now feel that Django puts you down a path for failure by naming the DB layer "models" and not giving users a decent place to put cross-model domain logic.

My current preference is a functional core-imperative shell-style architecture where as much code lives in the functional core as possible. It's not very elegant with Django but it works fine. Cosmic Python (really accessible and fairly quick read if you have the time: https://www.cosmicpython.com/book/preface.html) has examples that are similar.


> The issues usually boil down to some combination of "business logic is too coupled to the data model" and "this method lives at an intersection of these two models and creates weird dependency problems".

Refactor is not a dirty word. The problems you are describing seem to be more of the nature of having too many things concentrated at specific model classes, and that this model should be decomposed, broken down. This is not a Django-specific issue.


> I now feel that Django puts you down a path for failure by naming the DB layer "models" and not giving users a decent place to put cross-model domain logic.

I think a lot of 'MVC-inspired' frameworks fail there, not just django. Rails... 'app/helpers' maybe? Laravel 'models' is it, and 'services' or a variation is something I see a lot of folks adding, but it's not an out of the box convention. I can't remember anything specific/explicit in the asp.net world either.


> If `course.has_finished` is a property of the course, why would you want to have a separate function outside of the class?

Because one should avoid passing Django models around. It leads to bad design. Have a selector or something that uses the ORM, but exposes some dataclass or pydantic model instead, and put the logic there.


Keep in mind that passing around querysets has performance advantages you wouldn't get by passing around dataclasses or similar.

For example, if you do a query like Model.objects.filter(related_model__in=RelatedModel.objects.filter(...)) the ORM will only run a single query, silently folding the second one into the first as a subquery.

If you pass lists of "RelatedModel", however, you would first have to run one query to get that list (raising potential edge cases with regard to atomicity and transactional isolation) and then pass the list of IDs to the outer query in a "WHERE related_model_id IN (...)", resulting in 2 queries in the end.
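To make the difference concrete without needing a Django project, here is a sketch using the standard library's sqlite3. The SQL is hand-written to mirror the two strategies (a single statement, written here as a nested SELECT, versus fetching IDs first); it is not output captured from the ORM, and the `related`/`item` tables and the `run` helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE related (id INTEGER PRIMARY KEY, active INTEGER);
    CREATE TABLE item (id INTEGER PRIMARY KEY, related_id INTEGER);
    INSERT INTO related VALUES (1, 1), (2, 0), (3, 1);
    INSERT INTO item VALUES (10, 1), (11, 2), (12, 3);
    """
)

queries = []  # every statement sent through run() is recorded here


def run(sql, params=()):
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def items_single_query():
    # One round trip: the inner selection never leaves the database.
    rows = run(
        "SELECT id FROM item WHERE related_id IN "
        "(SELECT id FROM related WHERE active = 1) ORDER BY id"
    )
    return [row[0] for row in rows]


def items_two_queries():
    # Two round trips: materialise the IDs in Python, then send them back.
    ids = [row[0] for row in run("SELECT id FROM related WHERE active = 1")]
    placeholders = ",".join("?" for _ in ids)
    rows = run(
        f"SELECT id FROM item WHERE related_id IN ({placeholders}) ORDER BY id", ids
    )
    return [row[0] for row in rows]
```

Both functions return the same rows; the second one just pays for an extra query and leaves a window between the two statements.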


That gain is often lost by people doing unoptimized queries all over the place, though, instead of a single place where the queries are optimized. And passing the fat django objects around often leads to accidental n+1 queries, since you can't really trust that looking up a property on your object doesn't do a new query. Often nicer to avoid it all, by having a gated access to the DB.
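For anyone who hasn't run into n+1 yet, a small sketch with sqlite3 from the standard library shows the shape of the problem. The `author`/`book` tables and the `run` helper are invented for the example; in Django the per-row lookup would be hidden behind an attribute access, and `select_related` is the ORM's way of asking for the join up front.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (1, 'A', 1), (2, 'B', 2), (3, 'C', 1);
    """
)

queries = []  # every statement sent through run() is recorded here


def run(sql, params=()):
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_with_authors_n_plus_one():
    # 1 query for the books, then 1 more per book to look up its author.
    books = run("SELECT title, author_id FROM book ORDER BY id")
    result = []
    for title, author_id in books:
        name = run("SELECT name FROM author WHERE id = ?", (author_id,))[0][0]
        result.append((title, name))
    return result


def titles_with_authors_joined():
    # The same data fetched in a single query.
    return run(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id ORDER BY book.id"
    )
```

With three books the first version issues four queries and the second issues one; the gap grows linearly with the number of rows.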

While I propose most often sending in a list of related IDs (premature optimization and all that), the function could just accept any iterable, and you from the outside could send in the lazy relatedmodel query.


You want to add an ORM to the ORM? Why?


How is that adding an ORM to the ORM? I want all django orm access to happen at defined places, instead of the spaghetti mess it is when people do SomeOtherModulesModel.objects.filter(..) and expose themselves to the internal workings of that module. Access it through a selector instead.


If for some strange reason your application has data that was not created with Django, sure.

But aside from that you are just adding another layer of abstraction that does not give any benefit when all your models are managed by Django already.


It gives a huge benefit, and not doing it is why most django code is incomprehensible and slow.

No longer should anyone in their module directly do something to a model and save it. They should always go through a service in the module owning that model, that makes sure everything is done correctly. So services.py and selectors.py works as a public API for the module, while the models are internal. Avoids having lots of other apps/modules depending on your app's internals.


> not doing it is why most django code is incomprehensible and slow.

> No longer should anyone in their module

> they should always go through a service

Weasel words and opinions-as-fact. Come back when you have a way to show that your approach gives any actual benefit.


I did give an example. What you call opinions-as-fact was me trying to explain the approach, not commanding anything. That you disagree (or can't comprehend it?) doesn't make it weasel words. Please don't behave like this towards me, read the guidelines. You can find a link at the bottom of this page. I'll leave this "discussion" here as it's unfruitful when you act so hostile.


If your idea of "explaining the approach" does not address the "why?" and depends on "should always X" and "should never Y", then you are resorting to weasel words.

"incomprehensible and slow". To whom? How slow? What about your approach makes it faster?

A sibling comment pointed out one important aspect: Django Querysets are lazily evaluated. I find it really hard to believe that having a layer that constantly marshals and unmarshals the data through pydantic can make anything faster than not having to fetch any data until you really need it.


An orm takes a selector (typically an sql query) and maps it onto an object.

What you're describing takes a selector, and maps it onto an object. Is it just that you want type hints or something?


No, what I'm describing is functions in a selector.py, like: def get_orders_for_date(date) -> list[Order]:

where Order is a pydantic model, not a fat django model. Other modules shouldn't know about my internal database. All other modules should use functions from this selector.py, they aren't allowed to use the OrderModel themselves directly, only the pydantic class. Because otherwise you end up with spaghetti.
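A sketch of what such a selector could look like. To keep it runnable without dependencies it uses a frozen dataclass instead of pydantic, and a list of dicts stands in for the ORM query; `_ORDER_ROWS` and all field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Stand-in for rows the ORM would return. In a real selectors.py this would be
# a queryset on the module's internal OrderModel.
_ORDER_ROWS = [
    {"id": 1, "placed_on": date(2023, 1, 10), "total_cents": 1500, "internal_flag": True},
    {"id": 2, "placed_on": date(2023, 1, 11), "total_cents": 900, "internal_flag": False},
    {"id": 3, "placed_on": date(2023, 1, 11), "total_cents": 4200, "internal_flag": True},
]


@dataclass(frozen=True)
class Order:
    """Public shape exposed to other modules; internal columns are left out."""

    id: int
    placed_on: date
    total_cents: int


def get_orders_for_date(day: date) -> List[Order]:
    # The only place that touches the storage layer for this use case.
    return [
        Order(id=row["id"], placed_on=row["placed_on"], total_cents=row["total_cents"])
        for row in _ORDER_ROWS
        if row["placed_on"] == day
    ]
```

Callers get immutable value objects, so they cannot trigger extra queries or save changes behind the owning module's back.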


Right... so you're talking about mapping an object from your database to a pydantic object.

So you want an ORM, but you want it without a save method? Or presumably with a save method that can only be called under specific circumstances?


> Don't use class-based views, at least not outside very specific niches like the Django admin. Class-based views will transform a simple, composeable call stack into a ball of inheritance mud. Inheritance is not a good code reuse tool.

People always say this but well-structured CBVs keep a generic interface that you'll be really glad exists when you have 80 views spread across 10 apps.

Composing function-based views is a PITA and when you're building an API with a bunch of auth/serialization/cache extras being bolted on it's way easier to keep disciplined and ordered. It is _trivial_ to mess up the order of callers for these things inside function-based views.
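The ordering problem is easy to reproduce with plain decorators. This is a toy sketch, not Django code: requests are dicts, and `require_auth`, `cache_response` and `profile` are invented names.

```python
import functools


def require_auth(view):
    @functools.wraps(view)
    def wrapper(request):
        if not request.get("user"):
            return {"status": 401}
        return view(request)
    return wrapper


def cache_response(view):
    cache = {}

    @functools.wraps(view)
    def wrapper(request):
        key = request["path"]
        if key not in cache:
            cache[key] = view(request)
        return cache[key]
    return wrapper


def profile(request):
    return {"status": 200, "body": "profile of " + request["user"]}


# Auth outside cache: every request is checked before the cache is consulted.
safe_view = require_auth(cache_response(profile))

# Cache outside auth: once a response is cached, it is served with no auth check.
leaky_view = cache_response(require_auth(profile))
```

Both compositions look nearly identical at the call site, which is the point: swapping two wrappers silently changes who gets to see the cached response.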


I disagree with almost everything you have said.

Fat models, thin controllers is the suggested strategy. It works well in my opinion / experience, though I guess it could get out of hand on very large projects.

There is something that feels very unnatural and unintuitive about your course_has_finished(course) versus course.has_finished() example. This was one of the principles behind OOP, keeping your data and functions / methods together, though I know OOP isn't trendy these days. It's far more natural to have it as a method of course than some random function, stored who knows where. I worked on a system with this type of design, it was pretty bad.

One thing I would like to see for larger projects is the ability to easily split models into separate files - a bit more like Java does with one class per file. Maybe you can do this already.

Class based views mean that a lot of code is written and tested for you. I'll agree that your view does need to be somewhat "standard" in what it's doing (anything you would see in the admin, list, create, edit, detail, login, etc), so if you have something more complex, such as multiple forms in one view, then they don't give a great deal of advantage. In that situation I would still likely choose a basic View class over functional views, but more for consistency.

I am probably in agreement on separate apps.

Migrations are one of the best features of Django. I have just spent 4 years working on a system without them and it was a shambles as you would expect. Everyone is scared to make database changes, so you get a ton of shitty application code to compensate for shitty database modelling. Tech debt in other words.

I can't think of any 3rd party apps where I have used the models directly, or if I did it was frictionless enough for me not to remember. So no real opinion on that one. 3rd party apps can be pretty hit and miss, especially if they get abandoned as you upgrade Django, so I would say use with caution, particularly for more obscure ones (just been revisiting django-celery-beat, that's been around for years so I doubt it's going away).


"Also be wary of the formerly South, now built-in migrations stuff. It's built around a fragile model (perfect history representation), so has lots of foot guns."

My experience has been opposite so I'd be interested in hearing your experiences if you are willing. I have been using the built in migrations since day 1 on a medium sized Django project with 350+ migrations and migration issues for our project have been exceedingly rare. edit: We have a small team of developers, so merge migrations are very rare for us, which might be a contributing factor.


Diamond-dependencies are hard to work with, and so we forbade them. Specifically, you can't revert an individual migration, you just specify your desired "target" to roll back to, and therefore you can't unapply a single branch of a diamond dependency. This means if your second branch of a diamond dependency breaks the DB, but your app depends on the first branch, you're SOL and are now manually running SQL to fix your production DB. (Can you tell I'm speaking from experience here? :) )

Your migration code uses the model classes, but the migrations are using a rehydrated version of the model that doesn't include any methods; another footgun. Basically you need to copy in any logic that you're using into your migration file, or else that migration's logic will change as your code is refactored. You might naively think that because `model.do_the_thing()` works now, the migration is somehow pickling a snapshot of your model class. It's not.
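A sketch of what "copy the logic in" can look like for a data migration's forward function. The app label `courses`, the model `Course`, its fields, and the cutoff date are all hypothetical; the function only relies on the `(apps, schema_editor)` arguments that `RunPython` passes in.

```python
from datetime import date

CUTOFF = date(2023, 1, 1)  # frozen into the migration, not read from app settings


def _is_finished(end_date):
    # Logic copied into the migration on purpose. Calling course.has_finished()
    # would fail here because historical models carry no custom methods, and
    # importing the real model would make this migration change as the code does.
    return end_date < CUTOFF


def mark_finished_courses(apps, schema_editor):
    Course = apps.get_model("courses", "Course")  # historical model, fields only
    for course in Course.objects.all():
        course.finished = _is_finished(course.end_date)
        course.save()


# In the migration file this would be wired up roughly as:
#   operations = [
#       migrations.RunPython(mark_finished_courses, migrations.RunPython.noop),
#   ]
```

The duplication is deliberate: the migration keeps doing what it did on the day it was written, no matter how the model is refactored later.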

Because of the above, you should really squash migrations frequently, but it's a big pain to do so -- particularly if you have dependency cycles between apps. ("Just Say No to Apps" is my default advice for new Django developers. If you have a real need for making a bit of library code sharable between multiple different Django projects then you know enough to break this rule. Within a single monolithic project, apps just cause pain.) Squashing migrations quickly gets to some extremely gnarly and hard-to-work-with logic in the docs.

Moving models between apps isn't supported by the migration machinery; it's an involved and dangerous process. One idea here that might save Apps is if you manually remove the App Prefix from your "owned" / internal apps; if I have Customer and Ledger apps, I don't really need to namespace the tables; `user`, `information`, `ledger_entries` are fine table names instead of `customer_user`, `customer_information`, `ledger_ledger_entries`. A normal DB admin would not namespace the table names. You need the app namespacing to make it safe to use someone else's apps, but I think namespacing for your own apps inside a single repo is harmful.

I find the migration framework to be worth using, but it's definitely got some sharp edges.


Maybe you need to turn this into an article because over the last decade of working with Django, we've learnt the same lessons, sometimes the hard way. Learning to never import real models into data migrations was a big one.

I recently wanted to move a model between apps and ended up going the route of create new table, copy all rows over, delete old table. It was annoying, but the only way to make it work with regular migrations.

We ended up writing our own script[1] to squash migrations, and I'd love to know if there's a better way. We needed something that works for clean installs or existing installs that already have the current migrations installed - so it generates empty migrations which get applied on existing installs, and then they get replaced with real initial migrations on clean installs starting from a new release.

1. https://github.com/nyaruka/rapidpro/blob/main/tools/squash_m...


Great answer thank you!


Same here. If I was starting a non-Python project tomorrow I'd consider using Django to manage the database schema - especially now that we can describe custom indexes and constraints in migrations. Our project has gone thru over 1000 migrations so far tho we squash them down to 50 or so about once a year.


That's interesting, if I was using TypeScript to access the data, how would I keep the schemas in sync between TS and python?


Prisma works very well and has usable migrations. Not as solid as Django's.

Tiangolo of FastAPI fame is working on https://sqlmodel.tiangolo.com/ which combines pydantic models with SQLAlchemy. Migrations are coming soon. This will probably be an excellent way to build APIs with a db. Then you can generate a typescript client from the built-in OpenAPI schema.


Can introspect and export Django models to JSON schema or similar, then in TS read it and use compiler low-level API to generate types. There may be libraries for either stage…


I've never used TypeScript to talk to a database but there might be tooling to generate classes from tables. Even if there isn't, manually keeping some TypeScript classes in sync might still be worth the effort, for being able to manage schema migrations easily elsewhere.


We do this manually along with a pydantic as a middleman between Django and TS. Works pretty well and is not a major inconvenience to keep things aligned.


if using DRF you export openapi to json and use openapi-typescript-codegen

if using graphene or strawberry you export sdl to json and use @graphql-codegen/typescript


> You can make subdirectories without making apps, and thus avoid dependency hell.

It's hard to get 100% right and thus our projects always have a few lazy foreign key relationships and inline imports to avoid cyclical imports... but I think our code is easier to manage because we try to model the dependency relationships by having things in separate apps.

> Also be wary of the formerly South, now built-in migrations stuff

Our experience from South to 4.x has been that the models/migrations system has matured significantly and is probably now the main selling point for Django for us.


I'm currently in circular import hell. My business logic has Jobs and Loads, and they both need to update each other under certain circumstances. Should these two things/monstrosities be lumped into the same app?


Well you're writing Python so you're never truly stuck when you run into dependency issues, but if you feel like you're in hell then maybe they do belong in the same app. All complex real world apps have complex dependencies between entities - all I would argue is that "put them all in the same package" generally isn't the most scalable solution.


The short answer is yes.

The long answer is if two entities are updating each other you might benefit from shifting all update responsibilities to one of them. Or even to a third entity that knows about both and keeps those two isolated from each other.


Woof. Thank you so much. I like the idea of a third party, like a mediator.

Would that mediator be another app? Or should it be some module sitting in the project directory? (I'm not even sure Django would import something like that.)


> Would that mediator be another app?

Yes!

It is an app that might not even have any model classes. But it will contain business logic. And it will probably speak domain language, which is great.

If you're lucky, those two other apps will become pluggable. You will probably never replace them, but separation of concerns is always nice.

The downside of course is that you will have 3 apps instead of 1. That's the balance you have to maintain.


So, the way I understand it, job/services.py and load/services.py depend on mediator/mediator.py which depends on job/models.py and load/models.py, instead of job/services.py ultimately using load/services.py, and vice versa.

Thanks so much!
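A toy sketch of that layout, with dataclasses standing in for the models and in-memory lists standing in for tables. `JobLoadMediator` and the rule "a job is done once all its loads are delivered" are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


# --- load/models.py stand-in: knows nothing about job logic ---
@dataclass
class Load:
    id: int
    job_id: int
    delivered: bool = False


# --- job/models.py stand-in: knows nothing about loads ---
@dataclass
class Job:
    id: int
    state: str = "open"


# --- mediator module: the only place that imports both ---
@dataclass
class JobLoadMediator:
    jobs: List[Job] = field(default_factory=list)
    loads: List[Load] = field(default_factory=list)

    def deliver_load(self, load_id: int) -> None:
        load = next(item for item in self.loads if item.id == load_id)
        load.delivered = True
        self._refresh_job_state(load.job_id)

    def delete_job(self, job_id: int) -> None:
        # Deleting a job also removes its loads; neither model needs to know.
        self.loads = [item for item in self.loads if item.job_id != job_id]
        self.jobs = [item for item in self.jobs if item.id != job_id]

    def _refresh_job_state(self, job_id: int) -> None:
        job = next(item for item in self.jobs if item.id == job_id)
        job_loads = [item for item in self.loads if item.job_id == job_id]
        job.state = "done" if all(item.delivered for item in job_loads) else "open"
```

The import arrows all point from the mediator to the two model modules, so the cycle between job and load logic disappears.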


This is good enough to break circular dependencies between individual modules, but keep in mind that circular dependencies between the apps remain (e.g. job depends on mediator, mediator depends on job).

I usually prefer to resolve those too. If job is small enough, all the orchestration of jobs should happen through mediator (same for load). If it's not plausible, then job can emit signals which mediator subscribes to.

A good place to start is to give a more descriptive name to the mediator. Sure, it mediates between the two, but it probably does that to implement some business process. Can you name it after that process?


> ...all the orchestration of jobs should happen through mediator (same for load)

So, in a smaller app, when a request comes into job/views.py or load/views.py then we immediately start working with JobLoadMediator to handle business logic between the two? I was just going to focus on specific tasks between job and load. I'll probably look into signals; I haven't used those in several years and as I recall, it felt hacky.

> give a more descriptive name to the mediator

When a Job is deleted, it needs to delete associated Loads. And the states of the Loads will affect the state of the Job. That's the main cycle I'm looking at right now.


> when a request comes into job/views.py or load/views.py then we immediately start working with JobLoadMediator to handle business logic between the two?

In my world, mediator.views and job.views would likely have different audiences.

mediator.views is for business domain (e.g. your API). Though name would not be mediator, it would be something domain-specific.

job.views could be for more low-level internal tooling (e.g. analytics/monitoring). Or it could be empty, if job is just dumb data object that has a lifecycle but doesn't require public API.

> When a Job is deleted, it needs to delete associated Loads. And the states of the Loads will affect the state of the Job

If you want to keep them apart, signals is the right (though not the easiest) answer. job owns (depends on) loads, and subscribes to loads signals. load doesn't know anything about job (ignore the fact that job is referenced in load.models as foreign key, that's just limitations of SQL DDL).
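Roughly, the signal direction looks like this. The `Signal` class below is a tiny stand-in for `django.dispatch.Signal` (only `connect` and `send`), so the sketch runs without Django; `load_delivered`, `JOBS` and the "all loads delivered means done" rule are made up for the example.

```python
class Signal:
    """Tiny stand-in for django.dispatch.Signal, just enough for this sketch."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        return [(receiver, receiver(sender=sender, **kwargs)) for receiver in self._receivers]


# --- load app: emits a signal, imports nothing from job ---
load_delivered = Signal()


class Load:
    def __init__(self, job_id):
        self.job_id = job_id
        self.delivered = False

    def deliver(self):
        self.delivered = True
        load_delivered.send(sender=Load, load=self)


# --- job app: depends on load and subscribes to its signal ---
class Job:
    def __init__(self, job_id, loads):
        self.id = job_id
        self.loads = loads
        self.state = "open"


JOBS = {}  # stand-in for the job table, keyed by job id


def on_load_delivered(sender, load, **kwargs):
    job = JOBS.get(load.job_id)
    if job is not None and all(item.delivered for item in job.loads):
        job.state = "done"


load_delivered.connect(on_load_delivered)
```

Only the job side references the load side; the load code just announces what happened and has no idea who is listening.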

The easiest is a subtle dependency: load can access job through foreign key, job can access loads through related_name. Circular dependency is still there, but it is resolved at runtime. This will start to cause pain when application grows, but is fine at small scale.

Though again, I'd probably merge those apps: if you can't reason about load without job, and can't reason about job without load, there's very little benefit in keeping them separate. It might make you feel more organized, but if the code is lying about your mental model, this is false organization.


> mediator.views is for business domain (e.g. your API) > job.views could be for more low-level internal tooling

That's very interesting. I'm mostly following the architecture from: https://phalt.github.io/django-api-domains/styleguide/

I knew this was going to be a large project, about 19 apps, supporting a trucking and inventory web/mobile app, and I wanted a sane/organized way to deal with all of the models and relations.

Do you know of some blog posts or books that go into the way you normally organize things?

> if you can't reason about load without job, and can't reason about job without load

There is a lot of interaction between all of the parts of the app, but particularly between Jobs, Loads, Inventory, Stages, and Drivers. In the future I might start with one app, and then add another if I absolutely have to.


If you try to keep everything in one app, you'll have gigantic views.py, models.py, etc.

Nobody wants to work with those. Also, a simple typo in those files can bring your whole system down and can make it hard to debug.

Apps, are a cheap (EXCEPT when you HAVE TO [but really, do you? Really?] move models between apps) way to keep YOU sane.


Right, no one wants to work on a 10k line view.py file. However, if you put everything into its own app, like I did, then you run into circular dependencies as your project gets closer to feature complete. So, the answer is somewhere in the middle. At the moment, I have about 20 apps, with 1 to 4 models per app, and 20% of all that is highly interdependent. I should have put the big, highly interdependent pieces in one app, and have the less connected pieces in separate, bare-bones crud-apps.


Although it goes against certain opinions, signals are the way to go, provided you set a norm for their use.

You can either decide that signals code will live in the app where the objects reside or the app of the target objects and that's it. A built-in and simple interface between objects.


Holy bovine. This finally worked. I've been working around the clock on this for two days now. Thanks again stranger!


Also, do not forget to integrate with Sentry [https://sentry.io/] or something that does the same thing.

It will also keep you sane.


> Don't make separate apps in the same project, unless the project actually consists of several different, completely independent projects. You can make subdirectories without making apps, and thus avoid dependency hell.

I wrote an article on this. My goto strategy is to create a project/core/views.py,models.py,apps.py,tests.py



yup that's that. Thanks for finding it for me :)


> I wrote an article on this.

Link or it didn't happen :)


lol forgot to include


Apps should be buried deep in the documentation in some “super advanced” section and not advocated much at all


Agree with migrations. I've spent a lot of time doing migration surgery to get things back into a consistent state. But mostly I've found that if you just let Django manage the database how it wants to then things will be fine. But get very familiar with merge migrations and fake migrations if things start to go awry.

I disagree with the perspective on CBVs though. I've been programming Django since 0.96 and CBVs have made nearly everything easier and better for me.


how is this possibly the highest upvoted thing here on django. it's like the opposite of good opinions.. they're bad opinions!

_(99% kidding, but i disagree with most of this lol)_


Agree with all of this, although I do like class based views.


Me too, I think it's important to acknowledge that no approach is perfect. Pick your poison.


Is there a better alternative to South (built-in now) for migrations? I always had issues but figured it was the de facto for a reason.


Good points. Agree with slim models and I personally like a service layer. FBVs are great, CBVs are ok, GCBVs are of the devil


> Always keep your models slim.

At least never put your business logic in your views, forms or even templates.


I am more or less in agreement, except for forms. Surely some validation could be considered business logic.


I've been heavily inspired by this styleguide over the years, but I still think it's a bit too complex. A few random thoughts:

  - I think "services" is too much of a loaded term. I prefer "actions", and I always use the function-based style.
  - I hate the naming of "APIs" in this document. They use the term "API" when they mean "endpoint" or "view".
  - "Reuse serializers as little as possible" is the single best piece of advice when using DRF. The inline InputSerializer thing is brilliant.
  - Having each endpoint only handle a single HTTP verb is brilliant.
  - URLs and views should be separate from all other business logic (models, actions etc).
  - For read endpoints and associated business logic, I'd encourage https://www.django-readers.org/ (disclaimer: I'm the author).


After 20 years in this industry I firmly believe that the code reuse thing is way oversold in universities. It seems to be the reason for so much crappy over engineered monstrosities. YAGNI and KISS are far better things to aim for. Avoid duplicate code, but don't start out aiming for reusable code, when most of the time your requirements won't be especially clear.


> "Reuse serializers as little as possible" is the single best piece of advice when using DRF. The inline InputSerializer thing is brilliant.

Can you expand on this? What is the InputSerializer as opposed to custom rest serializers?


I think the idea is that instead of thinking "here's the object I'm serializing" you should think "here's the view (endpoint) I'm serializing for".

Contrary to what people usually think, the shape of the serialized object is typically defined by the API endpoint, not by the object itself. Different endpoints can (and will) serve different shapes of the same object.

Even if two endpoints serve the same shape today, they can deviate tomorrow. When this happens, most people are trying to resolve it through DRF inheritance, which is wrong.
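A small sketch of "serialize for the endpoint, not for the object". Plain functions stand in for DRF views and their inline serializers; the `User` fields and endpoint names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str
    email: str
    is_staff: bool


def user_list_endpoint(users):
    # Output shape owned by this endpoint: a compact summary.
    def serialize(user):
        return {"id": user.id, "name": user.name}

    return [serialize(user) for user in users]


def user_detail_endpoint(user):
    # Same object, different shape; changing it cannot affect the list endpoint.
    def serialize(user):
        return {"id": user.id, "name": user.name, "email": user.email}

    return serialize(user)
```

The two `serialize` functions overlap today, but because neither inherits from the other, either endpoint can change its shape tomorrow without touching the other.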


Ah, the typical "where to put business logic in Django".

M in ActiveRecord MVC web frameworks is deeply misunderstood. M is not "data model" (it would be called DVC if that was the case). M is your domain model. It's the model of your business, model of the real world. It's the core of your application.

Another thing that I never understood, why are functions called services? Is it a subconscious desire to go back to enterprisey kingdom of nouns? (apparently it is [1])

A service is either something that lives on a network (e.g. database, payment gateway, microservice). Or a class that has a state. Your functions are neither of those, they are just functions.

Your business logic should live in the "models" namespace. Whether you put it on Model classes, or onto custom Managers, or just dump them into the module is not important, as long as you keep it consistent and keep your apps fairly small and isolated from each other.

Django already gives you enough tools to support big "enterprise" applications. It is far from perfect, but you'll get much further if you embrace the framework instead of fighting it.

If you really are attached to this "services" mindset then Django API Domains [2] is your best option.

[1] https://www.b-list.org/weblog/2020/mar/16/no-service/

[2] https://github.com/phalt/django-api-domains


I never understood why this was so hard or why people complicate this so much. You have a segment of your application that "does stuff" - some mix of classes and functions. This stuff has its own API. Then you have your web views call that code through that API (which is probably just calling functions...).

No, instead, that "does stuff" part has to be its own library, or, god forbid, its own service that lives somewhere else, with its own communication layer, its own auth...

Why are we making this so hard on ourselves?
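
A minimal sketch of the simple version described above, in plain Python so it runs without Django. All names here (`deactivate_expired`, the dict-based accounts, the two entry points) are hypothetical; the point is only that the web view and a scheduled job call the same function directly.

```python
from datetime import date


def deactivate_expired(accounts, today):
    """Business logic: a plain function with no web or queue dependencies."""
    expired = [a for a in accounts if a["active"] and a["expires"] < today]
    for account in expired:
        account["active"] = False
    return len(expired)


def deactivate_view(request, accounts, today):
    # Web entry point: call the function, then shape a response.
    count = deactivate_expired(accounts, today)
    return {"status": 200, "body": f"deactivated {count}"}


def deactivate_cron(accounts, today):
    # Scheduled entry point: same function, no HTTP involved.
    return deactivate_expired(accounts, today)


accounts = [
    {"expires": date(2023, 1, 1), "active": True},
    {"expires": date(2024, 1, 1), "active": True},
]
response = deactivate_view(None, accounts, today=date(2023, 6, 1))
```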


Typically it goes like this:

1. You found one case where complexity is essential.

2. That one case is not consistent with the rest of your app, and you were taught that inconsistency is bad.

3. Since you can't remove complexity from that case, for the sake of consistency you add complexity to all other cases.

Class-based views are a typical example. You found a place where CBVs are useful. Now some parts of your app use functions, some use classes, and that's inconsistent. So you edit your style guide to enforce CBVs everywhere. Now a simple healthcheck endpoint that returns "OK" has to be a class.

As some folks used to say, you can write Java in any language.

The right approach, of course, is to say "I'd rather have inconsistency than complexity". The challenge is that perception of complexity is subjective, but inconsistency is objective. So the right approach eventually loses, and every organization turns into a bureaucratic hell.


Business logic goes in the controller. That's why it's called a controller because it controls stuff like access to your data.


I try to use controllers just to connect incoming events or API calls to the business layer.

The control part is more like a traffic controller: just directing traffic.


Oh sweet summer child.

Please read this carefully: https://folk.universitetetioslo.no/trygver/1979/mvc-2/1979-1...

Business logic is supposed to be reused. Controllers in web frameworks (views in Django) expose no way to do it.


If you want to reuse logic you make another http request to the right endpoint.


Are you just trolling? Or are your cronjobs really making HTTP requests?


> We use Celery for the following general cases:

>

> Communicating with 3rd party services (sending emails, notifications, etc.)

> Offloading heavier computational tasks outside the HTTP cycle.

> Periodic tasks (using Celery beat)

Sigh. No mention of the trade-offs. There are simpler ways to do all these things. Celery is a big complex beast and it always pains me to see it as the default suggestion for simple tasks.


Because it’s mature, well integrated with Django and is a path so well-trodden there’s a McDonald’s on the way. Any possible use-case you can imagine for a job queue has been done in Celery and documented.

Celery being complicated is also entirely on the operational side; once you actually have Celery running, using it from within your app is simple enough.

Cron is awful for this use-case. You end up just inventing Celery but worse when you decide how your app and the cron scripts communicate. If you just want scheduled tasks, but simpler, use something like APScheduler.


> Celery being complicated is also entirely on the operational side; once you actually have Celery running, using it from within your app is simple enough.

Yeah, and to some degree improvements in devops quality of life in the last few years have softened my view. (I used to mainly use Webfaction without access to apt-get and my own hand-rolled scripted deployment. Ugh...)

But I'd still usually prefer a pure-Python solution without additional persistent processes - assuming there is one and it's fairly well-documented. Huey is pretty good from recollection.

> Cron is awful for this use-case. You end up just inventing Celery but worse when you decide how your app and the cron scripts communicate. If you just want scheduled tasks, but simpler, use something like APScheduler.

I was advocating for something like this. There's django-cron etc., which solve the issues around communicating with scripts.

I do see both sides of the debate between "complexity" and "solves problems out of the box". I'm generally on the Django side when the Flask vs Django discussion happens. There are always trade-offs.


I spent 3 years building a high scale crawler on top of Celery.

I can't recommend it. We found many bugs in the more advanced features of Celery (like Canvas). We also ran into some really weird issues, like tasks getting duplicated for no reason [1].

The most concerning problem is that the project was abandoned. The original creator is not working on it anymore and all issues that we raised were ignored. We had to fork the project and apply our own fixes to it. This was 4 years ago, so maybe things have improved since then.

Celery is also extremely complex.

I would recommend https://dramatiq.io/ instead.

[1]: https://github.com/celery/celery/issues/4426


> Because it’s mature, well integrated with Django and is a path so well-trodden

> Cron is awful for this use-case. You end up just inventing Celery

Isn't it the other way around?

Crons are way more mature, well integrated (mgmt commands don't require 3rd party modules), and extremely well trodden. Crons are super predictable, have sensible defaults and plenty of tooling. Which you will have to reinvent with Celery.

There are some benefits to programmatic crons, but the downsides are huge.


A lot of people use Django with uWSGI, which also comes with queues, cron, workers, cache and lots more. I've been stuck with Celery on previous projects for reasons. But I've been dying to try out uWSGI's built in features for this. Hearing great things about it.

https://uwsgi-docs.readthedocs.io/en/latest/Spooler.html

https://uwsgi-docs.readthedocs.io/en/latest/Cron.html

https://uwsgi-docs.readthedocs.io/en/latest/Mules.html

https://uwsgi-docs.readthedocs.io/en/latest/Caching.html


Fellow uWSGI fan here. Unfortunately uWSGI is now in maintenance mode, not because it is complete, which would've been fine, but because the maintainers are not able to dedicate time to it[0]


A long time ago I wrote a blog post about Celery use cases at https://nickjanetakis.com/blog/4-use-cases-for-when-to-use-c....

It applies to Django, Flask or any Python system using it. All of it still applies today.

It covers a few use cases with a before vs after of using Celery and touches on why I'd consider using Celery over other solutions such as async / await. The TL;DR is that Celery brings a lot to the table around tracking and retrying jobs. It's also nice to separate your web and worker workloads into different processes, since they usually have very different requirements.


I think as a concept a task queue with some workers makes a lot of sense, but having used Celery in production, it leaves a lot to be desired.

We’ve run into various bugs and weird performance gotchas (like workers prefetching jobs, which is terrible if the jobs aren’t all the same size).


Agreed. I have had more luck writing my own "jobs" engine that does stuff that I need that celery doesn't have anyway (retries, some record of execution, rate limiting).

I'm sure if I really tried, I could get Celery to be very reliable... but I never really got there.

Also for whatever reason I have NEVER been able to get celery to be reliable for its scheduling/cron stuff. It just starts to fail. I use this library for that, which I have never had problems with: https://schedule.readthedocs.io/en/stable/
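
To make the "retries plus some record of execution" part concrete, here is a small sketch using only the standard library. It is not the commenter's engine; `run_job`, its parameters, and the log entry format are assumptions for illustration, and rate limiting is left out.

```python
import time


def run_job(func, *, max_attempts=3, delay=0.0, log=None):
    """Call func(), retrying on any exception, and record every attempt.

    Hypothetical helper: `log` is a plain list that receives one dict
    per attempt; a real system would likely write rows to a database.
    """
    log = [] if log is None else log
    for attempt in range(1, max_attempts + 1):
        try:
            result = func()
        except Exception as exc:
            log.append({"attempt": attempt, "ok": False, "error": repr(exc)})
            if attempt == max_attempts:
                raise  # out of retries: surface the last error
            time.sleep(delay)
        else:
            log.append({"attempt": attempt, "ok": True, "error": None})
            return result
```

Passing in the `log` list lets the caller inspect what happened even when the final attempt raises.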


What type of situations did you run into while using it?

I've been using Celery for a long time in production now. Nothing crazy and it's a fairly basic set up of "execute job, thanks!" along with a beat server. Over the last 6-7 years an off the top of my head guess would be that there's been at least 5 million jobs processed. It's not huge volume when measured over years but it's been stable.

One server was running for 6 months without being updated. That's a Celery worker process running for ~180 days uninterrupted. It served hundreds of thousands of jobs without maintenance. A lot of these were pretty beefy tasks too, like performing HTTP requests that got 400 MB XML responses and then parsing them. I didn't even have things like `worker_max_tasks_per_child` set either.


Celery has a lot of gotchas, but primarily the issue I have had is with resource consumption. Maybe it is a problem with my config, but I haven't had this issue with other solutions.


Do you have additional examples of those simpler ways? I totally understand how Celery can be a hammer and everything is a nail type situation.


I am a big fan of Huey. And they have a Django module https://huey.readthedocs.io/en/latest/


Personally, I would just use serverless functions to achieve the same thing. If you're in the cloud already, chances are high you can trigger functions based on new records in a database, a newly uploaded file, or what have you. Then you don't need to import much outside of the serverless SDK, which should typically be pretty minimal.

That's how I did a timed function for a Django project we were hosting in Azure anyway.


Serverless functions are good for some stuff: clear this cache on this schedule, run this task when XYZ happens, etc..

But that is not the popular use case for Celery. Often you want "some code" (that you have likely already written) to be executed asynchronously. Sure, you can create a public interface for "some code", then write a record that the serverless function is watching for, which then calls back to that interface. But is it a web interface? Then you have a problem if the job takes too long to complete: what about HTTP timeouts? Now you're really creating a big circle for something that should be simple: execute some code outside of the request (send an email, hit an API, whatever).

Serverless functions really shouldn't contain much logic either, because it's too complicated to test.


I've used serverless in this way as well for a long-running process. The end user needed to upload a Shapefile, and some of them can be quite large, so I kicked off an Azure Function once the file was uploaded to parse it in the background. If something would otherwise halt the browser and should run in the background instead, that's where I'll use serverless if it makes sense. I'm not fond of having my web server running things in the background; it ruins my mental model of the web.


> I'm not fond of having my web server running things in the background; it ruins my mental model of the web.

Well yes, and you run into other problems as well (now if your process dies or you deploy or something you have to be careful to not kill running jobs).

How much of your logic is in the azure function?


Only things that needed to be done behind the scenes that could take a long time, like we would take JTIFFs that could be gigs of data, and analyze them and create variations of the same image. You don't want someone who just uploaded a 1GB file waiting for you to also analyze it as well.


Eh, yeah, I just don't like any real logic being in those types of systems. Just too hard to test and chase down issues.

I guess "convert this image" is ok, since it is kind of a "pure function" sorta.


Correct, I only use them for very specific needs where it would slow down my website, or be weird to spin off a new thread in the background. I do know Azure supports WebJobs which is probably the closest you can get to something like Celery, which funnily enough, is what Azure Functions (at least originally) are built on top of.


https://github.com/dabapps/django-db-queue

(I am the author)

EDIT: oh hi, Andy :)


Since everyone suggests alternatives to Celery, may I plug my own healthier celery?

https://github.com/NicolasLM/spinach


I have a bug with Celery I can't solve.

When I send an async job that gets data from various APIs and writes it all to a DB, and there is a lot of data, the Celery task finishes properly, but my Flask app becomes unresponsive. I have to restart Flask to get back to a normal state.

Would anyone know where I should check?


Sounds like a resource issue. Maybe you are opening application contexts and not handling them properly?


Django subreddit. Someone asks for a way to run a simple cron job once a week. Everyone jumps in and shouts "Celery". I chip in and say, nah, just set up a cron job, it will be simpler. Get downvoted by the hivemind.


I've been using the beta of Temporal's Python SDK with Django, and aside from some minor teething issues, it's very nice.


What would be your first choices for each of the above?


My choice for periodic jobs is cronjobs.


Do you mean a cronjob that calls a Django management command to do the work? Or invokes an API method? From my experience cronjobs have a lot of downsides as well. They're great for doing tasks local to the server the job is running on. Not so great for the kinds of tasks (periodic or transactional) that Celery/a real queue is designed for.


Crons are fine for running Django commands that are just talking to the same services as your web views (databases, cache, email services etc). It should be fine to let them run in their dedicated VM/containers, using Ansible etc to keep the crontab configurations in the repo.

Celerybeat has the advantage of better visibility e.g. you can configure them in the Django admin and check when they are running. If you are not using Celery though and your needs are simple, it's easier to just use plain crons.


I use them for invoking django commands on the same server. I do use celery for transactional jobs though. It’s only periodic jobs that get called with cron. For the context, I do this on small web apps with less than 20 dedicated servers (real servers not VPS), so there is a “manager” server that does nothing but run periodic tasks, management interventions, and cleanup operations.


Yeah. Or a simple cron wrapper like django-cron to get the best of both worlds.

For background tasks - you can just spawn a background process and keep a simple status table in the db so the main app can check if it's completed (assuming you even need that)

And for task queues that can handle the traffic most sites will need, there are things like django-huey.
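
A rough sketch of the "background process plus status table" idea from the second paragraph, using only `sqlite3` and `subprocess` from the standard library. The table name, the `WORKER` script, and the helper functions are all hypothetical, and there is deliberately no handling of crashed or orphaned workers, concurrency limits, or retries.

```python
import sqlite3
import subprocess
import sys

# Code run by the child process: do the work, then mark the row as done.
WORKER = """
import sqlite3, sys
db_path, task_id = sys.argv[1], int(sys.argv[2])
conn = sqlite3.connect(db_path)
# ... the slow work would happen here ...
conn.execute("UPDATE task SET status = 'done' WHERE id = ?", (task_id,))
conn.commit()
conn.close()
"""


def init_db(db_path):
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task "
        "(id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.commit()
    conn.close()


def start_task(db_path):
    """Record a pending task, spawn the worker, return (task_id, process)."""
    conn = sqlite3.connect(db_path)
    cursor = conn.execute("INSERT INTO task (status) VALUES ('pending')")
    task_id = cursor.lastrowid
    conn.commit()
    conn.close()
    process = subprocess.Popen(
        [sys.executable, "-c", WORKER, db_path, str(task_id)]
    )
    return task_id, process


def task_status(db_path, task_id):
    """What the main app would poll to see whether the task has completed."""
    conn = sqlite3.connect(db_path)
    row = conn.execute(
        "SELECT status FROM task WHERE id = ?", (task_id,)
    ).fetchone()
    conn.close()
    return row[0] if row else None
```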


> For background tasks - you can just spawn a background process and keep a simple status table in the db so the main app can check if it's completed (assuming you even need that)

I don't know if making my own bespoke queue system is a great idea. It seems simple enough, but it gets so much more complicated once you start seeing issues with it. Orphaned task processes sticking around on the server forever, concurrency control, error handling, etc. I'll pretty much always just use celery and not have to worry about it.


I'd never argue for building your own - just for using something simpler than Celery

It's not so bad now, as CI/CD, Docker etc. have made complex deployments easier to handle. But back when I was wrestling with Django, simply deploying Celery on a new host could easily waste an afternoon, and all those dependencies made me very nervous about the overall complexity.

I still weigh carefully anything that adds another long-running process or non-Python dependency to my sites.


This is good for really simple scheduling of stuff: https://schedule.readthedocs.io/en/stable/


I have a long-running thread on the Django forums with a bunch of opinions about this topic: https://forum.djangoproject.com/t/structuring-large-complex-...


In a similar fashion, for anybody interested I've written some of my guidelines on implementing Django apps: https://www.spapas.net/2022/09/28/django-guidelines/


I've been working on a project with Flask + SQLAlchemy. I have some SQL queries that return a fair number of rows (up to 20k) and are quite slow. SQLAlchemy does not seem to support caching the results, so I've started to use flask-caching[1] with Redis using the @cache.memoize() decorator.

Just wondering if I am taking the right path or if there is a better alternative.

[1]https://flask-caching.readthedocs.io


> flask-caching[1] with redis using the @cache.memoize() decorator.

> Just wondering if I am taking the right path or if there is a better alternative.

Yes, this can be a fine solution to slow queries and is used very often in many kinds of web applications.

However... 20k rows is not a very big number for a modern DB. If the DB query is really the slow part then you should investigate why: ensure the relevant SQL queries are written properly, and that the relevant tables are indexed properly for the queries that you are running on them.
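
One way to check whether a query can use an index is to ask the database for its query plan. The sketch below uses SQLite from the standard library purely as a stand-in (the `orders` table, its columns, and the index name are made up); other databases have their own `EXPLAIN` variants with different output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(20_000)],
)


def query_plan(sql, params=()):
    """Return SQLite's plan for a query as one human-readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the textual description.
    return " | ".join(row[-1] for row in rows)


QUERY = "SELECT * FROM orders WHERE customer_id = ?"

plan_before = query_plan(QUERY, (42,))  # no index yet: full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(QUERY, (42,))  # the plan now names the index

print(plan_before)
print(plan_after)
```

If the plan still shows a full scan after adding an index, the query's filter probably does not match the indexed columns.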


I've been using nginx for caching when the response isn't user-specific. Pretty painless to set up and doesn't bloat the codebase.

(But it's mostly for small personal projects, so grain of salt and all.)

https://www.nginx.com/blog/nginx-caching-guide/


Indexed?

If complex, what does SQLAlchemy output as the SQL? Could you optimise the query? If quite complex, could you optimise the tables by redesigning them?

Was the move to allow Redis to cache persistently across requests? Does it do this?

Are you timing each function to look for slowness?


I found this useful, being someone who self-taught Django and hacked together a project with no idea of architecture. To this day I don't even know what a selector refers to, lol; I've never used one.

I'm learning Node now for another mini project. Is there anything similar? I know how to achieve certain tasks, but I know nothing about structuring a project properly; the tutorials I've done never really go that deep.


I've found it helpful in several projects to implement the "services layer" described here as a state machine, modeling state transitions for a central object (e.g. an article can be drafted, submitted, reviewed, published). The state machine enforces permissible transitions and handles side effects which touch other models.
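
A minimal sketch of what such a state machine could look like in plain Python. The state names come from the example above; the strictly linear transition table, the `Article` class, and the `side_effects` hook are assumptions made for illustration.

```python
class InvalidTransition(Exception):
    """Raised when a transition is not in the allowed table."""


# Assumed transition table: current state -> states reachable from it.
TRANSITIONS = {
    "drafted": {"submitted"},
    "submitted": {"reviewed"},
    "reviewed": {"published"},
    "published": set(),
}


class Article:
    # Hypothetical stand-in for a Django model instance.
    def __init__(self, title):
        self.title = title
        self.state = "drafted"
        self.history = []


def transition(article, new_state, side_effects=None):
    """Move the article to new_state if allowed, then run side effects.

    side_effects maps a target state to a list of callables; this is
    where code touching other models would be hooked in.
    """
    if new_state not in TRANSITIONS[article.state]:
        raise InvalidTransition(f"{article.state} -> {new_state}")
    article.history.append((article.state, new_state))
    article.state = new_state
    for effect in (side_effects or {}).get(new_state, []):
        effect(article)
```

Because every state change goes through `transition`, the permitted paths live in one table instead of being scattered across views.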


HN comments are really disparaging, but after reading through I really liked the content and am going to pull out their service / serializer model to use in my project. Nice opinionated way to avoid structuring a program in a bad way.


Don't create apps just for the sake of project structure even when models from multiple apps are closely related.

Moving models from one app to another is doable but it is a pain. It's even worse if you are relying on GFKs.


Somewhat off-topic but django is fundraising over here https://www.djangoproject.com/fundraising/


Hello everyone,

Radoslav here (one of the authors of the mentioned Django Styleguide).

First of all - I want to thank everyone for the comments. The fact that someone took the time to read the styleguide & then write a comment / propose a different POV - is humbling.

I've read everything once & I'll do so at least a couple more times. There are interesting ideas & comments that we can apply!

And finally, I want to add some more context:

1. That particular styleguide has served us, and it's still serving us well. It's basically a list of ideas that we found useful, thanks to the various Django projects that we've been exposed to at HackSoft.

2. One core philosophy of the style guide is the ability to cherry-pick whatever makes sense to you. Even at our company, it's very rare to have 2 Django projects following the exact same structure. The styleguide is rather a framework / direction for things that have been proven to work, from our experience.

3. And of course, the styleguide can use some more love from us. We are sitting on a lot of unshared knowledge that needs to be structured and applied back to the Django Styleguide & the corresponding Django Styleguide example project.

4. We try to keep it pragmatic, so you can actually build something. For example, DDD sounds great, but lacks pragmatism and slows you down by a lot (at least, for us).

5. And finally - this is not the "only right way" to do Django. As there is no "right way" to build software. Luckily, there are always options. We'll update the list of other suggested approaches, so people can have a choice / navigate the space better.

One of the big topics that we want to touch upon is nesting apps. Alongside the "services / selectors" layer (btw - you can call this whatever suits you best), the ability to nest apps within apps is really powerful when it comes to the structure and longevity of a Django project. Having 50+ flat apps is not the best experience.

Our current focus is around building the company (HackSoft). The Django Styleguide will be soon to follow. We are slowly gaining more speed & we'll eventually get there.

All discussions around "How to do Django" are in fact really interesting. If we happen to meet at some future EuroPython / DjangoCon Europe - I'm always open to discuss in person. Otherwise, if you have specific comments / suggestions - you can submit them either as an issue / discussion, or just send an email to radorado@hacksoft.io

Cheers!


By the way, we are discussing more of this in our latest podcast episode (HackCast) here - https://www.youtube.com/watch?v=9VfRaPECbpY

Cheers!


Can't believe they're putting `class Meta` at the bottom of models.


What’s wrong with that?


The official coding style guidelines already state that it should come immediately after the fields.

https://docs.djangoproject.com/en/dev/internals/contributing...


No, it doesn't. Custom manager attributes go between the two:

```

The order of model inner classes and standard methods should be as follows (noting that these are not all required):

    All database fields
    Custom manager attributes
    class Meta
    def __str__()
    def save()
    def get_absolute_url()
    Any custom methods
```

And that guideline is not really a general guideline for how to build Django projects, but for contributing to Django itself. It might make sense to keep that convention for projects built with Django, but that doesn't make this a style guide for those projects.


Official Django docs do NOT say the `class Meta` comes __immediately__ after the fields.

The docs give an example (as linked by you) that looks like that, but a few sections down you can find this:

The order of model inner classes and standard methods should be as follows (noting that these are not all required):

All database fields

Custom manager attributes

class Meta

...


It's a pet peeve more than anything - I just hate it when I have to scroll around to find if a class is abstract or not, our team puts it at the top so that's never an issue. Having it anywhere else means it can be any arbitrary number of lines below the class definition making it harder to find.


I’ve never worked on a Django codebase that puts Meta at the top of a model definition. Not saying that it’s the wrong thing to do, but this just feels like feigned surprise because you surely also know that it’s far from common.


It very much was feigned surprise, though I do know a few Django devs who prefer top. I've never really understood the logic for hiding it in the middle or at the bottom even though both are much more common.


On testing, I think "one file & class per thing-to-test" is subtly bad advice. It's not harmful in the hands of someone that knows what they are doing, but it tends to point engineers towards a tightly-coupled test suite, which ends up making refactoring more painful and error prone down the road.

If you have one testclass per entity, then if you make any changes to the structure of your services/models/entities, you must restructure your tests too. This means you can't do the "dream refactor" where you don't touch your tests, and restructure your code without changing any behavior. If you rewrite your tests whenever your structure changes, how can you be sure you've not broken your tests?

Instead, I advocate for testing behaviors. In a tightly-integrated framework like Django, most of your tests are going to be integration tests (i.e. you have a database involved). You should bias those tests towards an integration-y approach that uses the public interfaces (Service Layer, if you have one) and asserts behavior. Ideally the tests should not change if the business logic has not changed. (In practice you'll often need to add some mapping/helper/setup code to make this true.)

If you have any fat-model type behavior that is explicitly scoped to a single model, then you can test that in isolation. Many Django projects call these "unit tests" even though they still involve reading and writing your model from the DB. I call them "focused integration tests". All that matters is that you have agreement on terminology inside your project. If you have extremely complex domain logic, it can be worthwhile to construct "true Unit Tests" that use dummy objects to test logic without hitting your DB. I've not found it worthwhile to mock the DB in most Django projects though.

To provide an example of where my "test behaviors not classes" advice differs from the OP's paradigm, let's say you split out a sub-object to provide a pluggable Strategy for a part of your Model's behavior -- you don't necessarily need to have detailed tests for that Strategy class if it's fully covered by the model's tests. Only the edge cases that are awkward to test at the higher level need to be tested at the granular level of the Strategy. Indeed, the first refactor that just creates a Strategy holding your existing behavior need not change any of your existing tests at all! In fact, if you do need to change existing tests, that suggests your tests were improperly coupled to the code under test, since a mere structural change like this should not affect the behavior of your application.

Even after adding more Strategy logic, most of your old ModelTests are still good; they still test the high-level behavior, and now also test the integration between your model and the new Strategy class. Basically, test at the most granular level that gives a clear, decoupled test for your behavior; resist testing every entity in isolation, because some entities have rich choreographies with other entities that make them hard to isolate. Sometimes you have to contort and tightly couple in order to test things at the very lowest level possible.
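
A small sketch of the Strategy example above, using plain classes and `unittest` instead of Django models so it runs on its own. `Order`, `FlatDiscount`, and the test names are hypothetical. The tests only touch `Order`'s public interface, so they would have passed unchanged before the discount logic was extracted into a Strategy.

```python
import unittest


class FlatDiscount:
    # Hypothetical Strategy extracted from Order during a refactor.
    def __init__(self, amount):
        self.amount = amount

    def apply(self, subtotal):
        # Never let a discount push the total below zero.
        return max(subtotal - self.amount, 0)


class Order:
    # The public interface (constructor arguments and total()) is the
    # same as before the refactor; the Strategy is an internal detail.
    def __init__(self, prices, discount_amount=0):
        self.prices = list(prices)
        self._discount = FlatDiscount(discount_amount)

    def total(self):
        return self._discount.apply(sum(self.prices))


class OrderBehaviorTests(unittest.TestCase):
    # Behavior-level tests: no mention of FlatDiscount anywhere.
    def test_total_without_discount(self):
        self.assertEqual(Order([10, 5]).total(), 15)

    def test_discount_reduces_total(self):
        self.assertEqual(Order([10, 5], discount_amount=3).total(), 12)

    def test_discount_never_makes_total_negative(self):
        self.assertEqual(Order([2], discount_amount=5).total(), 0)
```

A separate `FlatDiscount` test class would only be worth adding for edge cases that are awkward to reach through `Order`.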

Inspiration/further reading: https://blog.cleancoder.com/uncle-bob/2017/10/03/TestContrav.... (Grit your teeth through the "Socratic dialog" style. The principle being described is extremely valuable.)


I hate making webpages, but Django makes it as bearable as it can be.

I hate python, but love Django.

I think I have some sort of emotional attachment to Django.



