To me there's a sliding scale between productivity, where you use a heavy framework like Django or Rails to do everything for you, and control, where you write boilerplate to stitch all your favorite single-purpose libraries together using your preferred patterns.
They each have their purposes. Django will get you to market fast with all the features you need, and keep you there for a long time. But it forces (through its library structure) and encourages (through its common patterns) ridiculously tight coupling.
I work on a Django monolith now that runs an org needing to grow beyond it. Whenever we need something not quite offered by a Django library, or need to move something with different scaling needs out to another service, it's miserably difficult, because they followed all the Django recommended best practices. The framework controls you; you don't control it.
Now we're back to writing boilerplate to enforce a semblance of clean architecture onto it. It's kind of boring sometimes, but once a domain gets refactored out of the Django way, our ability to deliver features quickly and safely in that domain goes up 10x.
The "Fat Models" recommendation is one of the most destructive in my opinion: https://django-best-practices.readthedocs.io/en/latest/appli..., along with Django Rest Framework "Model Serializers". A JSON serializer that talks directly to the database is just madness.
So just don't use fat models. The only sensible way to use Django is to put all the business logic in service methods, not in models/managers or serializers/forms.
If all your business logic is in models then of course your app is going to be completely unmaintainable and it's going to take developers weeks to do things that should normally take a couple hours.
There is definitely a real problem in the Django community where lots of people have recommended architecting apps in bad ways, so then you get developers who want to implement the app the "standard" way that Two Scoops or whatever recommends. But Django itself is still a great tool; you just need to be willing to call out your teammates if they're unable to think for themselves.
I was just acquired into a team that enthusiastically recommended that book. Are there any alternative references I could look at or point to? I've used a good bit of Flask but don't have much experience with Django.
The book is actually worth reading; there are just some things that I strongly disagree with. The reason I'm writing my own guide is that there isn't anything else out there that I like.
I’m writing a Django style guide since all the existing ones are bad. If you send me an email then I’ll send you a draft, so that you have something to show your coworkers.
Absolutely. There's a big curve to understanding Django, Rails, .NET well enough to be able to prototype a real application. There's an even bigger curve to doing that in a maintainable way.
I think it's good to get familiar with a variety of ways of building applications over a career so you can pull from the best of them to, again, be able to focus on _business problems_ you have and can solve. To me that includes a sustainable development model and system architecture.
TL;DR Django models are the database, which makes them the wrong choice for presenting a service-layer interface to the persistence. They are inherently unable to hide, encapsulate, and protect implementation details from consumers that don't care about them or shouldn't be allowed to access them.
The Django model is a representation of the database state. It's an infrastructure-layer object. It's _very_ tightly coupled to the database.
Your business needs should not be so coupled to the database! While it is very helpful for an RDB to accurately model your data, a database is not an application. They have different jobs.
(The TL;DR of the following paragraph is "encapsulation and interfaces")
Your business logic belongs in the "service layer" or "use case layer". The service layer presents a consistent interface to the rest of the application, whether that is a Kafka producer, the HTTP API views, another service, whatever. Your service layer has sensible, human-understandable methods like "register user", "associate device with user", whatever. These methods are going to contain business logic that often needs to be applied _before_ a database model ever exists, or that needs to be applied after existing models are retrieved in order to present a nice, usable, uncluttered return value. Your service layer hides ugly or unnecessary details of the database state from the rest of the application. Consumers shouldn't care about these details, they shouldn't rely on them (so you can fix or change things without breaking the interface), and they very probably should not be given direct access to edit whatever they want.
If you do not do this and instead choose the fat models method all of the following will happen:
1. You will repeatedly write that business logic everywhere where you use the models. You'll write it in your serializers, your API views, your queue consumers/producers, etc. You'll never write it the same way twice and you damn sure won't test it everywhere.
2. You'll get tired of writing the same thing and you will add properties or methods on the model. This is the Fat Model! This might be appropriate for a convenience property or two that calculates something or decides a flag from the state of the model, but that's it. As soon as you start reaching across domains and implementing something like "register device for user" on the user model, or the device model, you are just reinventing a service layer in a crappy way that will eventually make your model definition 4000 lines long (not even remotely an exaggeration).
3. Every corner of your application will be updating the database - via the model - however it wants. They will rely on it! Whole features will be built on it! Now when it's time to deprecate that database field or implement a new approach, too bad. 20 different parts of your app are built on the assumption that any arbitrary database update allowed by the model is valid and a-ok.
Preferred approach:
1. Each domain gets a service layer, which contains business logic, but also presents a nice reliable interface to anything else that might consume that domain. This interface includes raising business logic errors that mean something related to our business logic. It does not expose "Django.models.DoesNotExist" or "MultipleObjectsReturned". It returns an error that tells the service consumer what went wrong or what they did wrong.
2. The service layer is the only thing that accesses or sees the Django models aka the database state. It completely hides the Django models for its domain from the rest of the application. It returns dataclasses or attrs, or whatever you want to use. The models are no longer running rampant all over the application getting updated and saved willy nilly. The service layer controls what the consumers in the rest of the application can know and do.
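To make the two points above concrete, here is a minimal sketch of what such a service could look like. Everything in it is hypothetical: the names `DeviceService`, `RegisteredDevice`, `UserNotFound` and `DeviceAlreadyRegistered` are made up for illustration, and a set and a dict stand in for the Django models so the snippet runs without a database. In a real app the service would be the only code that touches the ORM.

```python
from dataclasses import dataclass


class DeviceRegistrationError(Exception):
    """Base class for business errors raised by this (hypothetical) service."""


class UserNotFound(DeviceRegistrationError):
    pass


class DeviceAlreadyRegistered(DeviceRegistrationError):
    pass


@dataclass(frozen=True)
class RegisteredDevice:
    """Plain value handed back to callers instead of an ORM model."""
    user_id: int
    serial: str


class DeviceService:
    def __init__(self, users, devices):
        # Assumption: `users` is a set of known user IDs and `devices` maps
        # serial -> user_id. Both stand in for ORM queries in this sketch.
        self._users = users
        self._devices = devices

    def register_device(self, user_id: int, serial: str) -> RegisteredDevice:
        # Persistence-level "not found" becomes a business error the caller
        # can understand, rather than a leaked ORM exception.
        if user_id not in self._users:
            raise UserNotFound(f"no user with id {user_id}")
        if serial in self._devices:
            raise DeviceAlreadyRegistered(f"device {serial} is already registered")
        self._devices[serial] = user_id
        return RegisteredDevice(user_id=user_id, serial=serial)
```

Callers only ever see `RegisteredDevice` and the `DeviceRegistrationError` family, so the storage behind the service can change without breaking them.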
You will write more boilerplate. It will be boring. You will write more tests. It will be boring. But it will be reliable and modular and easier to reason about, and you can deliver features and changes faster and with much less fear of breakage.
Your business logic will live one place, completely decoupled, and it can be tested alone with everything else mocked.
How your consumers (like API views) turn service responses and errors into external (like HTTP) responses and errors lives in one place, completely decoupled, and can be tested alone with everything else mocked.
Your models will not need to be tested because they are just a Django model. They don't do anything that's not already 100% tested and promised by the Django framework.
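As a rough illustration of "tested alone with everything else mocked", the sketch below tests a tiny service function against a `unittest.mock.Mock` in place of the persistence layer. The function `register_user`, the `repo` object with its `email_exists` and `insert_user` methods, and the `EmailAlreadyTaken` error are all assumptions made up for this example, not anything from Django.

```python
from dataclasses import dataclass
from unittest.mock import Mock


class EmailAlreadyTaken(Exception):
    """Hypothetical business error."""


@dataclass(frozen=True)
class RegisteredUser:
    user_id: int
    email: str


def register_user(email, repo):
    # Business rule: emails are compared case-insensitively and trimmed.
    normalized = email.strip().lower()
    if repo.email_exists(normalized):
        raise EmailAlreadyTaken(normalized)
    user_id = repo.insert_user(normalized)
    return RegisteredUser(user_id=user_id, email=normalized)


def test_register_user_normalizes_email():
    repo = Mock()  # stands in for whatever wraps the ORM
    repo.email_exists.return_value = False
    repo.insert_user.return_value = 7
    result = register_user("  Ada@Example.com ", repo)
    assert result == RegisteredUser(user_id=7, email="ada@example.com")
    repo.insert_user.assert_called_once_with("ada@example.com")


def test_register_user_rejects_duplicates():
    repo = Mock()
    repo.email_exists.return_value = True
    try:
        register_user("ada@example.com", repo)
    except EmailAlreadyTaken:
        pass
    else:
        raise AssertionError("expected EmailAlreadyTaken")
    # Nothing should have been written when the rule rejects the request.
    repo.insert_user.assert_not_called()
```

Neither test needs a database or a Django test runner, which is most of the point.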
We started moving off "fat models" at my job and onto DDD (service methods, entities, etc.), and I have to say after a year I'm not a fan. Here are my beefs:
1. If you're not using models, it's a lot of work to stay fast.
If you've got a Customer instance, and you want to get customer.orders, you've got a problem if it's not lazy. If it's a queryset, you get laziness for free, if it isn't you have to build it yourself. God help you if you have anything even remotely complicated. You also need trapdoors everywhere if you want to use any Django feature like auth, or Django libraries.
2. You have to build auth/auth yourself
Django provides really nice auth middleware and methods (user_passes_test).
3. Service methods only do things something else should be doing.
You might be doing deserialization, auth/auth checks, database interactions, etc. All of that stuff belongs at a different layer (preferably abstracted away like @user_passes_test or serializers).
4. The model exposed by Django and DRF is actually pretty good, and you'll probably reimplement it (not as well)
The core request lifecycle is:
request -> auth -> deserialize -> auth -> db (or other persistence stuff) -> business stuff -> db (or other persistence stuff) -> serialize -> response
We've reimplemented all of those layers, and since we built multiple domains we reimplemented some of them multiple times. It probably would've been better to just admit "get_queryset" and the like are good ideas.
5. Entities are a poor substitute for regular objects and interfaces.
We've mostly ended up wrapping our existing models in entities, but just not implementing most of the properties/fields/attributes/methods. But again, we have to trapdoor a lot, we have trouble with laziness and relationships in general, and we have a lot of duplicate code in our different domains.
6. We have way too many unit tests.
Changing very small things requires changing 5 to 10 tests, each of which uses mocks and is around a dozen lines at least. Coupled with the level of duplication, this has really slowed us down. They also take _forever_ to run.
FWIW I think you're right about jamming too much into models; I think that works at a small scale but really breaks down quickly. I think at this point, my preferences are:
1. Ideally, your business logic should be an entirely separate package. It shouldn't know about HTML, JSON, SQL, transactions, etc. This means all that stuff (serialization, persistence) is handled in a different layer. Interfaces are your friend here, i.e. you may be passing around something backed by models, but it implements an interface your business logic package defines.
2. The API of your business logic package is the set of interfaces you expose and document. The API of your application is your REST/GraphQL/whatever API--that you also document.
3. Models should be solely database-specific. If you're not dealing with the database and joins and whatever, it doesn't go in models and it doesn't go in managers.
4. Don't make a custom user model [1].
5. Serialization, auth, and persistence should be as declarative and DRY as possible. That means class-level configuration and decorators.
6. Bias strongly against unit tests, and rely more strongly on integration tests. Also consider using them during development/debugging, and removing them when you're done.
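For point 1 in the list above, one way to have the business logic package define its own interface is `typing.Protocol`. This is only a sketch under assumptions: `LineItem`, `order_total` and `InMemoryLineItem` are invented names, and in a real project a Django model with matching attributes could satisfy the same protocol without the business package importing it.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable, Protocol


class LineItem(Protocol):
    """Interface owned by the business logic package (hypothetical)."""
    unit_price: Decimal
    quantity: int


def order_total(items: Iterable[LineItem]) -> Decimal:
    # Pure business logic: knows nothing about SQL, JSON, or HTTP.
    return sum((item.unit_price * item.quantity for item in items), Decimal("0"))


@dataclass
class InMemoryLineItem:
    # Satisfies LineItem structurally; a model-backed object with the same
    # two attributes would work just as well.
    unit_price: Decimal
    quantity: int
```

Because the protocol is structural, the persistence layer never has to inherit from anything in the business package.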
Does that seem reasonable to you? I spend a lot of time thinking about this stuff, and I would like my life to be less about it (haha) so, any insight you can give would be super appreciated.
I think we're agreeing on the majority of this. We have not chucked DRF or Django auth or anything. We've just created service layers to take the business logic out of the API views, API serializers, and DB models.
Each action looks like
1. Request arrives into the app, auth happens using DRF on the API view. This is all using Django & DRF built-ins.
2. In the API view: request data gets deserialized using DRF serializers, but no calculated fields or model serializers or other BS. JSON -> dict only. The dict does not have models in it, only IDs: profile_id, reservation_id, whatever. Letting the "model Serializers" turn a JSON location ID into a Location model is how you get 10 database queries before you've done _anything_. At this point we don't care if the location_id is valid. We are just deserializing.
3. Still in the API view: Dict dump from the serializer gets shoved into whatever format you're going to send to the service layer. For us this is often an attrs/dataclass. If we're calling the "Reservations Service" method "create reservation", we pass in location_id, start time, end time, and the User model. The User model in this case is breaking our policy of not passing models through the service boundary, but it's the one exception for the entire code base, because it's too useful not to take getting it for free from DRF's user auth. We would be basically throwing it away then re-calling for it in the service layer which is dumb.
4. Call the Reservations Service layer. The service layer is going to do n things to try to create the reservation. If it needs to insert related records, like in a transaction, cool. Its job is to provide a sane interface for creating a Reservation, and whatever related side effects, not to only ever touch the Reservation model/table and nothing else. The base of our domain is Reservation; creating a ReservationReceipt and a ReservationPayment is entirely within scope. Use the Payment model directly to do this if there's zero extra logic to encapsulate, or create a Payment service if you have a ton of Payment-creation logic you need to extract/hide from the Reservation service. You can still manage it all in a transaction if you want. The point is that the caller (the API layer) doesn't see this. It only sees that it's calling the Reservation Service.
5. The Reservation service will either return a dataclass/attrs object representing a successfully created Reservation, or raise a nice business error like ReservationLocationNotFound (remember when you passed in a bad location id to the API, but we didn't want to check it in the API layer?).
6. API View takes the service response & serializes it back, or takes the business error and decides which HTTP error it should be.
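Steps 2 through 6 can be sketched roughly as below. This is not our actual code: it is plain Python with no Django or DRF so it runs standalone, the view is a function returning `(status, body)`, a set stands in for the Location table, and it passes a bare `user_id` instead of the User model. `ReservationLocationNotFound` comes from the steps above; every other name (`CreateReservation`, `ReservationTimesInvalid`, `create_reservation_view`, and so on) is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import count
from typing import Tuple


class ReservationLocationNotFound(Exception):
    pass


class ReservationTimesInvalid(Exception):
    pass


@dataclass(frozen=True)
class CreateReservation:
    """Input to the service: IDs and values only, no models."""
    location_id: int
    start: datetime
    end: datetime


@dataclass(frozen=True)
class Reservation:
    """What the service returns to its callers."""
    reservation_id: int
    location_id: int
    start: datetime
    end: datetime


KNOWN_LOCATION_IDS = {1, 2}  # assumption: stands in for a Location lookup
_next_id = count(1)          # assumption: stands in for a database sequence


def create_reservation(cmd: CreateReservation, user_id: int) -> Reservation:
    # Service layer: the location is validated here, not in the view.
    if cmd.location_id not in KNOWN_LOCATION_IDS:
        raise ReservationLocationNotFound(cmd.location_id)
    if cmd.end <= cmd.start:
        raise ReservationTimesInvalid("end must be after start")
    return Reservation(next(_next_id), cmd.location_id, cmd.start, cmd.end)


def create_reservation_view(payload: dict, user_id: int) -> Tuple[int, dict]:
    # Steps 2-3: JSON-ish dict -> plain input object, no database access.
    try:
        cmd = CreateReservation(
            location_id=int(payload["location_id"]),
            start=datetime.fromisoformat(payload["start"]),
            end=datetime.fromisoformat(payload["end"]),
        )
    except (KeyError, TypeError, ValueError):
        return 400, {"error": "malformed request"}
    # Steps 4-6: call the service, then map its result or error to HTTP.
    try:
        reservation = create_reservation(cmd, user_id)
    except ReservationLocationNotFound:
        return 404, {"error": "location not found"}
    except ReservationTimesInvalid as exc:
        return 400, {"error": str(exc)}
    return 201, {
        "reservation_id": reservation.reservation_id,
        "location_id": reservation.location_id,
        "start": reservation.start.isoformat(),
        "end": reservation.end.isoformat(),
    }
```

The decision about which business error becomes which HTTP status lives only in the view, so the service can be reused unchanged from a queue consumer or a management command.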
Got it, yeah that makes sense. At a previous job, we invested pretty heavily in model serializers, but yeah they’re bonkers slow. Thanks for weighing in, really nice to talk about this stuff with someone with a lot of similar experience.