
Surviving Django, if you care about databases - pauloxnet
https://www.varrazzo.com/blog/2020/07/25/surviving-django/
======
ris
I can't say I can recommend following almost any of this as "general" advice.
It is, at best, advice from a complete expert _for_ complete experts who are
already pushing the edge of what django should be used for, and who never
expect to have to hand off their project to an average web developer. I've had
to come to terms with the fact that the average web developer _is_ going to
screw up e.g. writing their schema by hand in sql. And I know I'll likely be
the person inheriting this two generations of developer after the original
wizard left.

For an average web developer, someone just getting into django, or someone who
isn't yet running into the limitations that the author apparently is, this is
a recipe to just make a huge mess.

It's also very _specific_ advice. Recommending ignoring cross-database
concerns.. I'm sure that's fine if you're an end-user developer building and
running your own product, but if you're considering writing a library or
reusable app you'd like to be useful to other django users, it's probably
worth a second thought.

It's ironic though that the author makes such a prominent mention of "You
Ain't Gonna Need It", because that's exactly what I'd say about most of the
pimped-up tweaks mentioned here. 99% of people aren't going to run into these
limitations and will produce a much more maintainable app by following the
standard django doctrine.

~~~
bb88
For my company I wrote internal django apps that are completely database
agnostic. They can be run in testing with sqlite, postgres, oracle, etc.

~~~
berkes
That severely limits you though.

And I'm not talking about postgis, hstore or such exotic features. I'm talking
about indexes, transactions, locking and other basic features that are often
missing or implemented very differently across databases.

You may not need, say, GIST indices that help when sorting "tweets by recent
activity" for your proof of concept or even your test-env but you certainly do
for production, or else you are truly missing out. And such basic tools are
available in most database engines, but differ very much in exactly how they
are implemented, used, and tuned.

~~~
bb88
Django supports indexes, transactions, and locking, and if I need to specify
raw SQL I can do that during a migration say.

90% of what I need is already there. And if I need something special, I can
add it in later.

~~~
price
It's not really about what Django supports, but about what all the databases
you're using support. If you run it in testing with SQLite, you are not going
to have the same set of features for indexing, transactions, and so on as you
do in production with, say, PostgreSQL.

The parent comment mentioned GIST indexes; I'd add partial indexes and
triggers as two features that can make a huge difference when used even in a
very few places in an app (which is what we do in Zulip, on PostgreSQL.)

~~~
bb88
Sure, but does it matter when unit testing code or developing the 95% where it
doesn't?

It matters for production, sure, if you start getting slowdowns, and you can have
migrations that deal with that. But for cranking out APIs on DRF, say, it matters a
lot less than you think it does.

~~~
price
For that I guess it comes down to whether the features you're relying on are
purely optimizations (like partial indexes, fancy GIST index types, etc.), or
go to the semantics of how the code behaves (like triggers).
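
A minimal sketch of that distinction, using the `sqlite3` module only because it ships with Python (the thread is about PostgreSQL, and the table and index names here are hypothetical). Adding the partial index leaves the query result unchanged, while adding the trigger changes what ends up in the database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE message (id INTEGER PRIMARY KEY, body TEXT, is_read INTEGER DEFAULT 0);
    CREATE TABLE message_audit (message_id INTEGER, action TEXT);
    INSERT INTO message (body, is_read) VALUES ('hello', 0), ('world', 1);
""")

query = "SELECT id FROM message WHERE is_read = 0 ORDER BY id"
before = conn.execute(query).fetchall()

# Pure optimization: a partial index that only covers unread rows.
# The rows returned by the query are the same with or without it.
conn.execute("CREATE INDEX message_unread_idx ON message (id) WHERE is_read = 0")
after = conn.execute(query).fetchall()

# Semantics: a trigger that writes an audit row on every insert.
# Code that expects the audit row behaves differently if the trigger is absent.
conn.execute("""
    CREATE TRIGGER message_insert_audit AFTER INSERT ON message
    BEGIN
        INSERT INTO message_audit (message_id, action) VALUES (NEW.id, 'insert');
    END
""")
conn.execute("INSERT INTO message (body) VALUES ('third')")
audit_rows = conn.execute("SELECT message_id, action FROM message_audit").fetchall()
```

A test suite running on a database without the index would still pass; one running without the trigger would see an empty `message_audit` table.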

I think for us it's the case that all the fancy DBMS features we rely on in
the core functionality of the app are pure optimizations. So if we wanted to
run tests on SQLite, that'd work fine except for when testing the parts of the
app that happen to rely on those fancy DBMS features that aren't only
optimizations.

But I'd consider that a pretty unstable situation -- it'd mean that we had one
test setup for most of the app, and then still needed another test setup (with
Postgres) in order to test some parts of the app. When I've heard from people
working on apps that use different DBMSes for test and production, generally
they don't take this strategy and instead they just limit themselves to the
lowest-common-denominator features that exist in both. You can totally do that
(many people have!); but as berkes's original comment above said, if you do
you're missing out on some really valuable features.

And if you ever want to use even a little bit of those fancy DBMS features,
beyond pure-optimization index features, in some core part of the app -- boom,
you can't test on the more limited DBMS at all.

------
nikisweeting
In my 8 years of working on large Python codebases, I think the Django ORM is
one of the best pieces of software I've had the joy of using. I often
recommend it independent of the rest of Django (e.g. to people building Flask
apps that need an ORM+migrations).

Especially when working on a team with many junior developers who don't know
SQL (myself included in the early days), the Django ORM has been a shining
beacon of reliability and consistency allowing us to do complex database
schema changes without ever having to worry about messing them up or not being
able to revert our changes.

I don't think I can say the same for any other ORM I've worked with, in
particular I have really bad memories of trying to match Django's migration
reliability and flexibility using SQLAlchemy + Alembic.

Also in regards to switching databases, the ability to swap out PostgreSQL for
SQLite on smaller projects without having anything break is a feature I've
relied on several times, and I'm extremely grateful that maintaining that clean
break in the layers of abstraction was prioritized by the Django team.

Obviously the author of this article has had different experiences (and has
considerable authority in this field), but I caution that their experiences
are not universal among Python devs.

~~~
bastawhiz
I came here to post essentially this exact comment. Give me Django with only
routing, middleware, the admin panel, and the ORM and I'd be happy. The time
I've saved because of these things so I'm shipping code and not writing SQL by
hand is astounding. I've made the decision to use Python and Django over
_other languages and frameworks_ solely because of the ORM.

I honestly can't think of a single other piece of software that has actively
saved me as much time.

~~~
sergioisidoro
> Give me Django with only routing, middleware, the admin panel, and the ORM
> and I'd be happy.

Very much this. I tend to lean towards Flask for building simple APIs. I think
it's easier and faster to start a project and get something done.

But I always end up missing the admin features and the far superior ORM, and
always regret midway that I did not use Django instead.

------
bb88
A couple of real world data points.

Two Fortune 500 companies I worked at had projects in Django that used native
SQL to do migrations. This led to errors in prod during deployment.

The order of the scripts had to run in a prescribed order. They had to be
listed on the deployment instructions. So the migrations would run, and then
fail during a prod deployment for various reasons. Most notably a dependent
script hadn't run in prod yet. They had run in the Dev/QA environments and
everything had run just fine. Or Dev/QA had been hand tweaked and that change
hadn't made it to the script during prod deployment because they were testing
it for a month to see performance before moving to prod, etc... etc...

It was all ugly.

In one of the two companies they finally decided to stop handwriting SQL and
started trusting the Django migration system, and the problem with deployments
went away.

 _AND_ they still were able to hand write sql to optimize the performance of
the DB if needed.

~~~
MattGaiser
> Two Fortune 500 companies I worked at had projects in Django that used native
> SQL to do migrations. This led to errors in prod during deployment.

> The order of the scripts had to run in a prescribed order. They had to be
> listed on the deployment instructions. So the migrations would run, and then
> fail during a prod deployment for various reasons. Most notably a dependent
> script hadn't run in prod yet. They had run in the Dev/QA environments and
> everything had run just fine. Or Dev/QA had been hand tweaked and that
> change hadn't made it to the script during prod deployment because they were
> testing it for a month to see performance before moving to prod, etc...
> etc...

How do companies get this right? My organization has all these problems as
well and I lose 10-15% of my time chasing misplaced SQL scripts or SQL scripts
with improper tweaks.

~~~
skinnyarms
There are tools that help this: Visual Studio Data Tools, RoundhousE,
Flyway/Redgate - I highly recommend investing time in building them into your
workflows.

~~~
AlphaSite
Alembic is pretty great for this as well

------
VWWHFSfQ
> The motivation for writing this article comes from knowledge sharing with
> the team I'm currently collaborating with, who are using Django the
> canonical way. I am convinced they can better use the tools they have. Let's
> see if they can be persuaded to drop Django migrations...

I've built and worked on enough large Django projects to know that if you
don't do Django the canonical way, then I don't want any part of the project.
If I joined a team and saw this weird hodge-podge of custom migration
framework and shadowing ORM fields I would nope-out immediately. It sounds
like this post author just doesn't want to use Django's ORM. And that's fine,
you don't have to.

~~~
Daishiman
There's nothing worse than random teams reimplementing a migrations system
because they think they know better.

The Django migrations system is far from perfect, but it's light years ahead
of anything a small team can build.

~~~
gerbler
I don't think these practices scale well. One person becomes the default
"migration person" (with processes that are likely not as well documented as
Django) and the team becomes reliant on tribal knowledge.

~~~
nikisweeting
How is that different from one person becoming the default "migration person"
on a team that doesn't use Django?

~~~
Daishiman
The difference is that Stack Overflow has a ton of answers to even fairly
complicated questions and that reading the Django docs is an order of
magnitude easier than delving into some random person's badly written code.

------
zzzeek
Daniele is a great guy and I deal w/ him regularly as he's the author of
psycopg2. However the mistake he makes here is in the realm of database
abstraction tools having the sole rationale of "database agnostic code".
That's really not the main rationale, nor is it really "hide the database so
people don't have to learn SQL". The main one is "automation of work". A lot
of the comments here refer to situations where hand-rolling everything always
was just too error prone. There's a thing we do in computers when a job is
very repetitive, tedious, error prone, and follows a complete pattern that you
can define algorithmically, which is, _write a program to do it!_ That said,
I'm sort of glad Django took the heat on this one and not SQLAlchemy so I can
get to bed without obsessing :)

~~~
stickyricky
Thank you for creating SQLAlchemy and being so active on its mailing list.

------
airstrike
_> Django needs it because a web framework not tied to a single database
vendor is more valuable than one tied to a specific one - and that's fair
enough. But you don't: your web program, most likely than not, will not have
to switch from one database to another_

This is missing the forest for the trees.

Sure, my one app won't need to migrate from MySQL to Postgres a year down the
road, but as a developer, I'd much rather be allowed to abstract away each
vendor's idiosyncrasies and be able to focus on my Python code regardless of
whether I'm using sqlite, PostgreSQL, MariaDB, MySQL, Oracle or even MS-SQL

~~~
djrobstep
The idea that some library can abstract away database concerns to this extent
is wishful thinking.

You inevitably need to understand the SQL that your ORM is generating and what
the database is actually doing. Otherwise you're flying blind.

~~~
TylerE
Realistically what ends up happening is that the DB gets abstracted away to a
painfully feature-free lowest common denominator.

~~~
bastawhiz
Django handles this quite nicely. You get your basic lowest common denominator
subset by default. If you want something more powerful, you can explicitly
import a db-specific feature.

And for some features, you'll get a NotImplemented exception. In the ten years
I've been using Django, I've only seen these a handful of times, and only when
using SQLite.

------
acdha
This article is trying too hard to dress the author's limited experience up as
absolute truth. For example:

> Have you got PostgreSQL in production, but you want to test with SQLite
> because it's easier to set up? If so, your tests are just a tick-box
> exercise: you are not testing anything remotely plausible and resembling
> your live system.

This is not completely wrong but it's just telling you that the author has
never worked in an environment where it makes sense. Anyone who has knows that
this approach can be used to catch a high fraction of errors locally, while
not precluding a full slower CI build.
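
One way such a setup might look in a settings module, as a rough sketch: the `TEST_DB` variable and the default names are assumptions made up for illustration, while the `ENGINE` strings are Django's standard backend paths.

```python
import os


def database_settings(env=os.environ):
    """Return a Django-style DATABASES dict chosen by an environment variable.

    TEST_DB is a hypothetical variable name: unset (or "sqlite") gives a fast
    in-memory database for local runs, "postgres" is what CI would set.
    """
    if env.get("TEST_DB", "sqlite") == "postgres":
        return {
            "default": {
                "ENGINE": "django.db.backends.postgresql",
                "NAME": env.get("PGDATABASE", "app_test"),
                "USER": env.get("PGUSER", "app"),
                "PASSWORD": env.get("PGPASSWORD", ""),
                "HOST": env.get("PGHOST", "localhost"),
                "PORT": env.get("PGPORT", "5432"),
            }
        }
    return {
        "default": {"ENGINE": "django.db.backends.sqlite3", "NAME": ":memory:"}
    }


DATABASES = database_settings()
```

Developers get the quick SQLite loop by default, and the CI job exports `TEST_DB=postgres` to run the same tests against the production engine.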

> Even if you added some form of manual auditing to each save() method, it
> will not capture changes made outside Django. It wouldn't be very secure
> either: Django uses a single user to access the database so if someone
> manages to hijack that user, they would be able to change the data in the
> database and alter the audit tables to hide their traces.

Similarly, while it is true that Django defaults to using a single user for
everything, you can easily configure multiple connections (containing, say, an
issue with your normal web code but not your management commands), and you can
grant access to only SELECT and INSERT on the audit log table but not UPDATE
or DELETE, etc. There's nothing wrong with the separate approach, of course, but
it's not like it's some great discovery that frameworks are not intended to
cover every possible use-case and sometimes you want to turn them off.
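
As a hedged sketch of that configuration: the role names, database name and alias below are hypothetical, and the `GRANT` statements in the comments would be run once by a database administrator rather than by Django.

```python
# Hypothetical roles, created outside Django, e.g.:
#   GRANT SELECT, INSERT ON audit_log TO web_app;   -- no UPDATE or DELETE
#   app_admin owns the schema and is used only by management commands.
DATABASES = {
    # Used by normal web requests: a restricted role.
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "app",
        "USER": "web_app",
        "HOST": "localhost",
    },
    # Used only for maintenance work such as running migrations.
    "maintenance": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "app",
        "USER": "app_admin",
        "HOST": "localhost",
    },
}
```

Migrations could then be run with `manage.py migrate --database=maintenance`, so the role that serves web traffic never needs rights to alter the schema or rewrite the audit table.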

The extended “I hacked around because understanding the migrations module
seemed like more work and then I was too far in to reconsider” section has the
same problem. Being surprised that the migration system triggered an update
when you renamed a field is, well, exactly what it's designed to do. That's
exactly why there is an entire section in the documentation which tells you
exactly what to do in that situation and how to avoid data loss — in that
case, if the problem wasn't making foo.bar magic (which it almost always is)
you'd create multiple migrations to add the new column, move (and usually
convert) the data, and drop the old column. You can even run raw SQL if
there's something like an efficiency hack which makes it worthwhile. This
takes a couple of minutes and is reliable.
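
The three steps can be sketched in plain SQL through `sqlite3`. This is not Django's API (there the steps would typically be `AddField`, a `RunPython` or `RunSQL` data step, and `RemoveField`, in separate migrations), and the table and column names are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, fullname TEXT);
    INSERT INTO person (fullname) VALUES (' Ada Lovelace'), ('Alan Turing ');
""")


def step_1_add_column(db):
    # Add the new column alongside the old one; nothing is lost yet.
    db.execute("ALTER TABLE person ADD COLUMN display_name TEXT")


def step_2_copy_data(db):
    # Move the data, converting it on the way (here: trimming whitespace).
    db.execute("UPDATE person SET display_name = trim(fullname)")


def step_3_drop_old_column(db):
    # Drop the old column by rebuilding the table, which works on any
    # SQLite version; other databases can simply DROP COLUMN.
    db.executescript("""
        CREATE TABLE person_new (id INTEGER PRIMARY KEY, display_name TEXT);
        INSERT INTO person_new (id, display_name)
            SELECT id, display_name FROM person;
        DROP TABLE person;
        ALTER TABLE person_new RENAME TO person;
    """)


for step in (step_1_add_column, step_2_copy_data, step_3_drop_old_column):
    step(conn)

columns = [row[1] for row in conn.execute("PRAGMA table_info(person)")]
rows = conn.execute("SELECT id, display_name FROM person ORDER BY id").fetchall()
```

Keeping the steps separate means the old column still exists until the copy has been verified.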

~~~
tuatoru
> This article is trying too hard to dress the author's limited experience up
> as absolute truth.

Sure you're not doing the same thing? As Vivekseth writes below, the author
wrote psycopg2.

--- Allowing the DB user that the web app uses to have schema modification
privileges is a massive security hole. If you're not hacked, someone will
eventually drop the production database.

So, migrations in anything but SQL are a Bad Idea.

~~~
acdha
Yes, you’ll notice that I’m not saying his ideas have no place but that
they’re not universally applicable. For example, the security “hole” is
trivially avoidable by using a different settings file to do migrations on a
separate server/container which doesn’t get normal web traffic.

If you want SQL, you can also generate it from your migrations and send it
over to someone else to run. This is not uncommon in enterprise IT.

~~~
hitekker
To reiterate the point of the GP, you said:

> This article is trying too hard to dress the author's limited experience

Saying the author has “limited experience” is condescending and untruthful
given his career and accomplishments so far.

~~~
acdha
Limited does not mean he hasn’t accomplished anything, only that he’s not
speaking for the entire community and wording his post that way doesn’t add
anything to it.

He’s obviously very proficient, which is going to shape your perspective of
what’s easy and how much control you want just like the scale of the projects
you work on and the number and skill levels of your team.

There’s nothing wrong with his opinion - my objection is the overly broad
framing. It would have been just as good as “here are some things which worked
for us” and letting the reader decide whether they are in the same situation.

------
vivekseth
Not mentioned in the article, but the author also built psycopg2
([https://www.varrazzo.com/software/](https://www.varrazzo.com/software/))

~~~
jimhefferon
Thank you. Relevant.

~~~
giancarlostoro
For those who do not know, Django uses said library for PostgreSQL support.
Mind you, if you don't like Django's ORM, you don't have to use it.

~~~
cafard
Isn't it the standard Python library for PostgreSQL? A co-worker just
installed it for some work having nothing to do with Django.

~~~
giancarlostoro
It's the most popular, if you mean standard by that metric then yes, if you
mean standard as in built-in to Python, then no (I only mention this because
SQLite _does_ come with Python OOTB).

------
l0b0
This sounds like a "golden hammer" argument[1], basically saying that SQL can
do everything Django can do, and if you know SQL _really, really well_ you can
do powerful stuff which would be clunky in Django. Of course that's true,
because it's true of any two tools which have overlapping functionality. But
for the vast majority of things you need to do on a daily basis, and for
developers with >10x more experience with Python than SQL it just doesn't make
sense to use raw SQL any more than absolutely necessary. As soon as you use
raw SQL in Django most of the _maintainability_ advantages of an ORM
disappear.

All that said, I totally agree about throwing out DB portability for anything
but the simplest apps (or libraries, of course). Portability is extremely
costly in terms of which features we have available and how maintainable
the resulting code is, so it should only be a goal when absolutely necessary.

[1]
[https://en.wikipedia.org/wiki/Law_of_the_instrument](https://en.wikipedia.org/wiki/Law_of_the_instrument)

------
DangitBobby
> Much of the complexity of this system is designed to give you a specific
> feature: migrations abstracted from your database. In my opinion, if there
> is something less likely to happen than writing a complex app to run on
> interchangeable databases, it's the need to repeat the same history of
> migrations.

Unless you have a dev, staging, and prod setup. Then there's a good chance
you'll encounter a wide range of variation in the current database schema
compared to your models. I use the ability to roll back and forward migrations
_all the time._ Django ORM and the migrations (and the admin site) are _the_
features that make it impossible for me to justify the use of other frameworks
or languages in the vast majority of cases. The productivity boon is just too
much.

With that said, I have a personal policy of not letting the ORM get in the
way. If I find myself reaching out to the documentation for how to write a
particular query (something like window functions, for example), that's a good
indication that I may be better off writing the raw SQL.
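
For instance, a per-customer running total is short in raw SQL. The sketch below uses `sqlite3` with a made-up `payment` table and assumes SQLite 3.25 or newer for window function support; in a Django project the same statement could be sent through `Model.objects.raw()` or a `connection.cursor()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payment (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);
    INSERT INTO payment (customer, amount)
        VALUES ('a', 10), ('a', 5), ('b', 7), ('a', 1);
""")

# Running total of payments per customer, in insertion order.
RUNNING_TOTAL_SQL = """
    SELECT id, customer, amount,
           SUM(amount) OVER (PARTITION BY customer ORDER BY id) AS running_total
    FROM payment
    ORDER BY id
"""

rows = conn.execute(RUNNING_TOTAL_SQL).fetchall()
for row in rows:
    print(row)
```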

------
ensignavenger
I support a very large database, containing product data for a major US
retailer. If I understand the history of this database correctly (I am not
directly on the team that manages it, but my team supports them in various
ways) the database began life as an MS SQL DB. Many years ago it was migrated
to MySQL. Now, they are in the process of migrating it to Postgres. So yes,
migrating data to different DBs does happen, especially when you consider an
application's life over decades.

~~~
kevindong
Why was the underlying database changed?

The most obvious reason I can think of is licensing concerns with both MS
SQL and MySQL.

~~~
ensignavenger
The first migration from ms sql to mysql was licensing costs, and the desire
to host on Linux. The more recent plan to move to postgres is motivated in
part by features and in part by standardizing on postgres.

~~~
grey-area
Features that you wouldn't be allowed to use if you try to remain database
agnostic.

DB migrations happen, but not often enough to use a lowest-common-denominator
db agnostic library and try to remain compatible with all systems.

~~~
ensignavenger
Well, that is another area the article got partially wrong. There are many
postgres specific features included in Django, and others available via third
party packages. And for most things that the ORM doesn't specifically support,
it gets out of the way and lets you do them anyway, by executing your own
custom SQL, right from the ORM.

------
kissgyorgy
> And with comments! Explaining why a certain index or constraint exist!

You can put comments in Python files...

> With constraints named meaningfully, available for manipulation, not
> auth_group_permissions_group_id_b120cbf9_fk_auth_group_id.

You can name your migrations with:

    manage.py makemigrations --name "prettier_name"

> In my opinion, if there is something less likely to happen than writing a
> complex app to run on interchangeable databases, it's the need to repeat the
> same history of migrations.

Maybe he never worked in a team? Or wrote any integration tests at all? Or
implemented rollback? Because those are pretty much basic things for every
serious project.

> I have a patch_db.py script that I have used, with small variations, in
> several projects.

So you basically wrote your own schema migration tool, but one that is
inconsistent across projects, and you maintain it yourself. Not a good idea IMO.

~~~
dvarrazzo
> You can put comments in Python files...

only if the comment pertains to a django feature, not to a schema feature that
cannot be expressed in django (e.g. a partial index designed for a specific
query).

> You can name your migrations with:

In the example there is a foreign key name, not a migration name. It is
persisted in the database, it's not ephemeral like a migration name, for which
choosing a meaningful name has only a temporary value.

Just two factual corrections; for the rest our experiences diverge, that's
fine.

~~~
kissgyorgy
> for the rest our experiences diverge, that's fine.

[https://twitter.com/kissgyorgy/status/1291609448809627649](https://twitter.com/kissgyorgy/status/1291609448809627649)

------
BiteCode_dev
This article misses the point of the django orm and migration system
completely.

They are meant to allow for the dev of an ecosystem.

This is why you have so many django plugins, and why they are so easy to
integrate and play together.

This is why it's easy to have dev envs, tests, signals and api generation
with django.

It trades performance and flexibility for this. It's nothing more than a
classic technical compromise.

You may need raw sql later on in your django project. But if you feel like
starting with it, just go with flask. Django makes no sense.

~~~
giancarlostoro
The other funny thing is how easy it is to write migrations in Python: with the
RunPython operation in Django migrations you can migrate data over. I've yet
to need to write any SQL aside from stored procedures and database views.

You can even export all the contents of your DB to JSON or YAML and import it
in Django, so you probably very well could go from SQLite to MySQL / Postgres
if you really wanted to.

Edit:

The other actually funny thing is, Django doesn't force you to use its own
ORM. You can write code that doesn't use its ORM. You can override anything
and everything.

~~~
kissgyorgy
> You can write code that doesn't use its ORM.

You can, but what's the point using Django then? Everything is built around
the ORM API, that's how admin is automatically generating pages, etc, etc. You
lose the main advantage of Django if you don't use it.

~~~
giancarlostoro
ASP.NET has the same issue, but you could always override the admin bits. If
you _really_ want another library to manage Django's database usage, you can
go to town; nobody has bothered coding alternatives because the ORM is
sufficient. Web frameworks are not what you use if you're concerned with
micromanaging how you access a database; web microframeworks, on the other
hand, are designed for that level of detail. Something like CherryPy or Flask
is more what you'd want in that case.

------
orf
Migrations can, and do, run arbitrary SQL. At the end of the day the
migrations framework is just a SQL dependency graph, it can be used without
models if that's what you wish. By itself that is way better than ad-hoc
scripts.
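
The dependency-graph idea can be illustrated with the standard library's `graphlib` (Python 3.9+). This is a toy sketch, not how Django's migration loader is implemented, and the migration names are hypothetical:

```python
from graphlib import TopologicalSorter

# Each migration maps to the set of migrations it depends on.
MIGRATIONS = {
    "0001_create_users": set(),
    "0002_create_orders": {"0001_create_users"},
    "0003_add_order_index": {"0002_create_orders"},
    "0002_create_audit_log": {"0001_create_users"},
}


def plan(migrations, applied=frozenset()):
    """Return unapplied migrations in an order that respects dependencies."""
    order = TopologicalSorter(migrations).static_order()
    return [name for name in order if name not in applied]


full_plan = plan(MIGRATIONS)
remaining = plan(MIGRATIONS, applied={"0001_create_users", "0002_create_orders"})
```

Because the order is computed from declared dependencies and a record of what has already been applied, nobody has to keep a hand-written list of scripts in the deployment instructions.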

~~~
ris
> At the end of the day the migrations framework is just a SQL dependency
> graph.

It's not _quite_. During operation it actually mutates an in-memory shadow of
your model structure. And that's really clever, but if you've got a large set
of models it can be _really slow_.

~~~
dvarrazzo
It also breaks in interesting ways: I discovered just today that constants
defined on the Model subclass are not available when you use 'get_model()'. I
suspect methods wouldn't be accessible either?

~~~
MythicDev
As far as I know, this is by design; your migrations are supposed to operate
on a frozen state of what your models and code _were at a point in time_.

If you would rely on code outside of said migration, you would be breaching
that frozen state and potentially end up with unintended side-effects (e.g.
running a migration created 2 years ago that imports your code that changed
today). This is why you might have to sometimes copy-paste logic to your
python migrations, but you also guarantee that the migration always runs the
same way.
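
A toy model of that frozen state, in plain Python. It is only meant to illustrate the idea and is not Django's implementation; `Invoice`, `freeze` and `rebuild` are made-up names. A class rebuilt from recorded field names alone has the fields but none of the constants or methods of the class in today's code:

```python
class Invoice:
    """Stand-in for a model class as it exists in today's code."""

    STATUS_PAID = "paid"            # class constant
    field_names = ("id", "status")  # the schema-relevant part

    def is_paid(self):              # custom method
        return self.status == self.STATUS_PAID


def freeze(model):
    """Record only the schema-relevant parts of a model."""
    return {"name": model.__name__, "field_names": tuple(model.field_names)}


def rebuild(state):
    """Recreate a bare class from the frozen state alone."""
    return type(state["name"], (), {"field_names": state["field_names"]})


HistoricalInvoice = rebuild(freeze(Invoice))

print(HistoricalInvoice.field_names)
print(hasattr(HistoricalInvoice, "STATUS_PAID"))
print(hasattr(HistoricalInvoice, "is_paid"))
```

Anything the migration needs beyond the fields, such as the value of `STATUS_PAID`, has to be written into the migration itself.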

------
vandahm
I wish I could work on these magical projects that people with blogs work on,
in which everything works like it's supposed to do. In my experience, though,
it's not unusual to be forced by management to migrate to a new RDBMS a year
into a project. The last time it happened to me, we were using Rails and had
very little difficulty migrating from PostgreSQL to MySQL. We didn't need to
alter application code or rewrite tests and only had to manage config changes
and migrate the data. I imagine the transition wouldn't be much different if
we had been using Django.

------
whakim
Seems like the author should've just used a much less opinionated framework.
The last thing I want to do is choose a batteries-included framework and then
take on the responsibility of maintaining my own system to manage SQL
migrations. Also unclear where the author has worked, but I've had to replay a
bunch of migrations in order on many occasions.

------
tuatoru
The implication of the article is that if you're using a relational database
management system, _you need to know SQL_.

I agree. I wholeheartedly recommend Stephane Faroult's _SQL Success_ as a
beginner book. Read that with attention, and you'll be ahead of 95% of people
messing around with RDBs.

~~~
jordic
Also agree. SQL is like JavaScript: it's been in almost all the projects I've
worked on in the last 20 years. (Django is just another tool, an
implementation detail.)

------
lewisjoe
A lot of the problems stated are characteristic of any ORM, not just Django.
People pick ORMs specifically to trade loss of control for convenience and
tooling. Any escape hatch that the ORM provides is by definition hacky.

These days when I know it’s a long term project, I prefer laying out the
domain entities as plain classes and wrapping database operations as class
methods with raw database statements.

This lets me use the underlying database to its full potential and gives me
enough control without leaking database code into application logic.
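
A minimal sketch of that pattern, assuming a hypothetical `Customer` entity and using `sqlite3` so that it runs anywhere:

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    """Plain domain entity; all SQL lives in the class methods below."""

    id: int
    email: str

    @classmethod
    def create_table(cls, db):
        db.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)"
        )

    @classmethod
    def add(cls, db, email):
        cursor = db.execute("INSERT INTO customer (email) VALUES (?)", (email,))
        return cls(id=cursor.lastrowid, email=email)

    @classmethod
    def by_email(cls, db, email):
        row = db.execute(
            "SELECT id, email FROM customer WHERE email = ?", (email,)
        ).fetchone()
        return cls(*row) if row else None


conn = sqlite3.connect(":memory:")
Customer.create_table(conn)
created = Customer.add(conn, "ada@example.com")
found = Customer.by_email(conn, "ada@example.com")
```

Application code only sees `Customer` objects, while the statements inside the methods are free to use whatever the database offers.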

------
some_developer
> How many times have you worked on a project and, after 1-2 years of
> development, you have changed to a different database?
>
> I can tell you how many times it happened to me, I counted them: exactly
> never.

I guess this is the crux.

You hardly know in advance this is going to happen.

I experienced this transition (mysql>pgsql), though not with Django.

The ORM was perfect. If I recall correctly, only the customized SQL queries
(only a few present) had to be adapted, everything else just worked.

11/10 would choose such a framework again.

------
mnm1
I've migrated a production app from mysql to pgsql although not with django
but a different orm and language. The app also had some custom sql. The reason
was for proper json support as we have a 2tb table with unstructured data (1tb
at the time) that was slowing mysql down severely (most used table by far). A
later version of mysql with json support was not available on aws so wasn't an
option. This was about two to three years in so the app was actually almost
finished and in maintenance mode.

It worked. It took many months and copying the data and doing the migration
was a pita. The orm helped in that I had to modify fewer queries by hand, but
modifying the queries we did have was trivial also. It's one of only two times
off the top of my head that the orm was a net benefit rather than a liability
in the apps that use it. Would I use an orm in the future in case this comes
up? Hell no. I'd pick a proper db first, aka pgsql (that decision was before
my time). But even if I didn't, I'd still prefer to convert the sql one by one
from mysql to pgsql rather than incur the horrific penalties and extremely
slow run time and development time the orm imposed on us. Developing with the
orm was many times slower and the runtime was ten times slower than the actual
query time due to hydration. Rewriting a few thousand queries is nothing
compared to losing months to the slow development speed and the slow runtime
speed which required a completely separate rewrite of most of our
functionality alongside the orm just so we could serve api requests in a
manner somewhat reminiscent of not completely slow.

So yeah, even when switching from mysql to pgsql, the orm wasn't worth it. It
never is or has been.

------
zumachase
We've got a core product running on Django, and one thing the author doesn't
mention is testing migrations. The migration system is wonderful, except for
the difficulty in testing migrations. There's no sane/official way to do this.
And it's such an important thing that I don't get why the Django crew haven't
tackled it.

~~~
_AzMoo
I've used django-test-migrations before and was really happy with it.
[https://pypi.org/project/django-test-migrations/](https://pypi.org/project/django-test-migrations/)

~~~
zumachase
We've played around with that and it's on the roadmap. But it just adds
another set of fixtures to herd.

------
40four
Pretty good read! But I don't like how the author tries to characterize
generic ORMs like Django's as "Portability at all costs". I don't think the
point is that you can switch DBs a few years down the road if you choose.
He's right, nobody does that. Ever :) The advantage is that you can _start_ a
new project with whatever DB you already have, or are comfortable with. So, if
you want to begin a new Django project, you aren't locked into using MySQL, or
Postgres, or what-have-you. It is true you sacrifice being able to leverage
the full gamut of features a particular engine might offer, but that is a
compromise you make with any ORM, right? Or, if the limitations are too much to
bear, just don't use Django ORM; it's not required, it's just the path of
least resistance.

------
tabbott
I agree Django's migration system has a few unfortunate quirks (most notably
the help-text-change-migrations), but it's also worked great for Zulip across
over 300 database migrations, and I wouldn't recommend anyone do their own
migration framework on raw SQL files by hand.

The issues mentioned here in this article are real (E.g. we have a lint rule
blocking importing other code inside migration files), but the author is also
ignoring the benefits of Django's system, namely that it's a well-documented
system that does most of the work of managing migrations for you, and does a
good enough job that the vast majority of developers working on a large
software project (like Zulip, with 96 people who've contributed changes to
database migrations) will do it correctly without special training just from
the Django documentation.

And the downsides are certainly minor compared to the obvious downsides with
rolling one's own system with raw SQL (E.g. the fact that nothing validates
that the model definitions match the database schema in the author's
proposal!). I wouldn't recommend anyone take the author's advice for what to
do instead, but this article is helpful reading for anyone curious about what
issues one will encounter after a few years working with Django.

> How many times have you worked on a project and, after 1-2 years of
> development, you have changed to a different database?

FWIW, Zulip was originally developed with MySQL, and then about a year into
development we switched to Postgres. It also had a phase where we ran SQLite in
development environments. Building on Django made our life a lot easier doing
this.

> Once the migrations have been applied to all your testing and production
> systems, and all your developers' databases, just remove them.

This is only true if you don't care about being able to use tools like `git
bisect` where you might want to switch a development environment to older
versions and back again (I do this often as part of).

> If you do it in a loop, it results in a "ripple load", meaning you'll run 50
> queries, each one fetching one record, while you could have run a single
> query fetching 50 records, or just adding an extra JOIN to another query.
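
To make the quoted "ripple load" concrete, here is a minimal sketch using
Python's built-in sqlite3 instead of the Django ORM. The `author` and `book`
tables and the `stats` counter are made up for illustration; the point is only
the difference in query counts between the loop and the JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER REFERENCES author(id));
""")
conn.executemany("INSERT INTO author VALUES (?, ?)",
                 [(i, f"author {i}") for i in range(1, 51)])
conn.executemany("INSERT INTO book VALUES (?, ?, ?)",
                 [(i, f"book {i}", i) for i in range(1, 51)])

# Hypothetical stand-in for a query log: counts every statement we run.
stats = {"queries": 0}


def run(sql, params=()):
    """Execute one query, count it, and return all rows."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


def names_in_loop():
    """One query for the books, then one more per book for its author."""
    books = run("SELECT id, author_id FROM book ORDER BY id")
    return [run("SELECT name FROM author WHERE id = ?", (author_id,))[0][0]
            for _, author_id in books]


def names_with_join():
    """A single query that pulls the author in with a JOIN."""
    rows = run("SELECT author.name FROM book "
               "JOIN author ON author.id = book.author_id ORDER BY book.id")
    return [name for (name,) in rows]


stats["queries"] = 0
looped = names_in_loop()
loop_queries = stats["queries"]   # 1 query for the list + 50 for the authors

stats["queries"] = 0
joined = names_with_join()
join_queries = stats["queries"]   # 1 query in total
```

Both functions return the same 50 names; only the number of round trips to
the database differs.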

This is my biggest complaint about Django's ORM -- lazy fetching of foreign
keys is great in a management shell but rarely desirable in production, and it
results in inexperienced developers doing database queries in loops all the
time. We address this with a bit of middleware that formats log lines like
this e.g. `41ms (db: 10ms/2q)` (showing how much time was spent doing database
queries and how many queries were done) -- which lets one easily identify
problematic cases of this both in development and in production.
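
A rough sketch of how such a log line could be produced, in plain Python. This
is not Zulip's actual middleware: `QueryStats` and `format_timing` are
hypothetical names, and wiring the tracker into a real database layer is left
out.

```python
import time
from contextlib import contextmanager


class QueryStats:
    """Accumulates how many queries ran and how long they took in total."""

    def __init__(self):
        self.count = 0
        self.seconds = 0.0

    @contextmanager
    def track(self):
        """Wrap a single database call to record its duration."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.count += 1
            self.seconds += time.perf_counter() - start


def format_timing(total_seconds, db_seconds, query_count):
    """Format a summary in the style '41ms (db: 10ms/2q)'."""
    total_ms = round(total_seconds * 1000)
    db_ms = round(db_seconds * 1000)
    return f"{total_ms}ms (db: {db_ms}ms/{query_count}q)"


stats = QueryStats()
with stats.track():
    pass  # a real database call would go here

example = format_timing(0.041, 0.010, 2)
```

A request that suddenly reports dozens of queries in the `q` field is the
usual sign of a query running inside a loop.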

But if Django had a setting that just disables the lazy fetching of foreign
keys feature (making accessing an unfetched foreign key an error), I'd turn it
on for Zulip.

See also [https://zulip.readthedocs.io/en/latest/subsystems/schema-mig...](https://zulip.readthedocs.io/en/latest/subsystems/schema-migrations.html).

~~~
airstrike
_> This is my biggest complaint about Django's ORM -- lazy fetching of foreign
keys is great in a management shell but rarely desirable in production, and it
results in inexperienced developers doing database queries in loops all the
time._

I may be misunderstanding you, but isn't this what select_related and
prefetch_related exist for?

~~~
acdha
Those are how you optimize it, but first you have to recognize the need. The
easiest way to make this visible is to increase logging or install
django-debug-toolbar, so people can see when the query count jumps, but the
best way to prevent regressions is to use something like the testing
framework’s assertNumQueries() method:

[https://docs.djangoproject.com/en/3.0/topics/testing/tools/#...](https://docs.djangoproject.com/en/3.0/topics/testing/tools/#django.test.TransactionTestCase.assertNumQueries)

I’ve used that with a fuzzy number class which will say it’s __eq__ to any
number below a specified max so you can say e.g. my homepage should take less
than 10 queries and not have to touch it for minor changes while catching when
someone implements a new feature in a way which breaks your select_related
usage.
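
A minimal sketch of such a fuzzy number class. The name `AtMost` and the
choice of an inclusive upper bound are assumptions made for illustration, and
it only helps if the assertion compares the query count with a plain `==`
check.

```python
class AtMost:
    """Compares equal to any integer that does not exceed `maximum`."""

    def __init__(self, maximum):
        self.maximum = maximum

    def __eq__(self, other):
        if isinstance(other, int):
            return other <= self.maximum
        return NotImplemented

    def __repr__(self):
        # Shown in assertion failure messages instead of a bare number.
        return f"<at most {self.maximum}>"


budget = AtMost(10)
within_budget = (7 == budget)    # int defers to AtMost.__eq__
over_budget = (11 == budget)
```

Passing something like `AtMost(10)` where the test expects an exact query
count lets minor changes through while still failing when the count jumps
past the budget.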

------
luord
This is tricky, on the one hand keeping everything related to the database
close to or in the database is probably cleaner (in the clean architecture
sort of way), as it eases a separation between the business rules and the data
saving and loading.

On the other hand, doing this loses a lot of the sugar and development
benefits of using the ORM (either Django or SQLAlchemy, which I prefer), that
are often enough—and perhaps preferred—for 95% of projects.

If I were to choose, I'd pick the ORM, though trying to place it as high in
the stack as possible.

------
thedanbob
Interesting comparing this to Ruby on Rails' ORM, which takes a relatively
hands-off approach. ActiveRecord will produce a migration for you when you
generate a model from the command line, but otherwise migrations are entirely
the developer's responsibility. And the way he suggests handling patches
sounds a lot like how ActiveRecord does it: keep track of what migrations have
already been applied and only apply newer ones, and for a new database just
create the up-to-date schema rather than applying the entire history of
migrations.
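
The "keep track of what has been applied" half of that can be sketched in a
few lines of Python with sqlite3. This is neither ActiveRecord's nor the
article's implementation; the `applied_patch` table and the patch names are
hypothetical.

```python
import sqlite3

# Hypothetical, ordered list of patches: (name, SQL statement).
PATCHES = [
    ("0001_create_note",
     "CREATE TABLE note (id INTEGER PRIMARY KEY, body TEXT)"),
    ("0002_add_created",
     "ALTER TABLE note ADD COLUMN created TEXT"),
]


def apply_pending(conn, patches):
    """Apply patches not yet recorded in applied_patch; return their names."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_patch (name TEXT PRIMARY KEY)")
    done = {name for (name,) in conn.execute("SELECT name FROM applied_patch")}
    applied = []
    for name, sql in patches:
        if name in done:
            continue  # already applied on an earlier run
        conn.execute(sql)
        conn.execute("INSERT INTO applied_patch (name) VALUES (?)", (name,))
        applied.append(name)
    conn.commit()
    return applied


conn = sqlite3.connect(":memory:")
first_run = apply_pending(conn, PATCHES[:1])   # only the first patch exists
second_run = apply_pending(conn, PATCHES)      # picks up just the new one
third_run = apply_pending(conn, PATCHES)       # nothing left to do
columns = [row[1] for row in conn.execute("PRAGMA table_info(note)")]
```

Loading an up-to-date schema dump for a brand-new database, the other half of
the approach, is not shown here.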

~~~
_AzMoo
There's nothing stopping you from hand-writing Django migrations if you
choose. Django compares the state of the models to the state of the database
after all migrations have been run to determine if new migrations are
necessary, so as long as you've covered all of your model changes in your
hand-rolled migration you'll be fine.

~~~
thedanbob
Ah, good to know. I have no experience with Django and the article gave me the
impression that Django migrations can’t be handled manually.

------
rhema
I made a custom Django system for my personal website in 2013. It became too
old to work with my web host. So, now I just run the django locally long
enough to run a wget mirror. Voilà, static content.

------
Jasp3r
Terrible article, especially the part about migrations. The automatic
migration generation of Django of course cannot infer exactly how you want
your schema to change if you start renaming or altering columns.

The author's solution, instead of adding on to migrations then opts for
completely abandoning a tried and tested migration framework for arbitrarily
executing sql files? This is madness

------
jordic
Can Django work with multiple tenants on different postgresql schemas? Can
Django migrations handle this in an easy way (better than sql)? When projects
grow old (10+ years), things tend to get complicated (lots of hands, lots of
changed requirements), and that's when the advice in the article makes a lot
of sense. In the end the data is there, and the model is the database.

------
jordic
I would also add to the article that when you try to alter a busy table, you
should set a timeout first (otherwise you will lock the world):
SET statement_timeout = 50; Alter .... ;

------
Humphrey
History Lesson (if you care about migrations):

TLDR; I disagree: Django migrations are the best, and make reusable apps
possible!

I've been using Django since before they introduced migrations and it was
hell! Django only had "syncdb" then, which would create your tables initially,
but then you were on your own to write and run SQL manually for any model
changes. This became especially annoying when using 3rd party apps, because if
those apps changed the schema, you'd have to manually update your database for
the 3rd party changes.

Then came along django-south which was a reusable app which provided
migrations and solved all of the above problems. It became almost the standard
approach, and third party apps often provided south migrations.

South became so popular, and was such a big improvement to the development
workflow, that its codebase was merged into Django (after making some
significant API improvements) to become Django migrations. I can still
remember having to convert all the files in the migrations directory from South
to Django migrations.

Now we have a de facto standard for how migrations work, and they allow easy
use of custom SQL. Seriously, if you hate python based migrations as much as the
author does, write custom SQL but put it in a Django migration so you get all
the benefits of being able to rollback migrations etc.

Summary: when writing all of your own code for a single project then there are
only minor benefits from migrations. But they are still worth it! In fact, I
write lots of raw SQL in my migrations, eg for constraints, triggers, notify, and
just modifying my data. If I run SQL, it goes in a migration.

But migrations shine the brightest for reusable first and third party apps,
which let's admit, we all use in most projects. This allows a single reusable
app to be used for all projects regardless of database. Also, as a consumer of
a reusable app, I can let migrations do their thing, and not worry if an
update will break my database.

------
vbilopav
My advice is to avoid Django ORM altogether

------
1337shadow
Probably they should try django-north instead of their patch_db.py

------
jordic
Another important thing that the author doesn't mention is that if you keep
your sql model on the postgres side, you will be able to introduce other
languages (for example to optimize hot paths).

I agree 100% with the author. If you check other open source projects (in
other languages), they are mostly based on versioned sql patches. Sorry, but
from my point of view the migrations feature is absolutely over-engineered.

I also agree with him that it doesn't make much sense to keep Django+drf
when, if you use fastapi+pydantic+asyncpg, you get all the real innovation on
the python side.

Also, it's not a good argument that following the Django religion will lead to
proper future maintenance. Think of channels and the big mess around them :)

Sorry, I'm perhaps a bit biased: I started with Django 0.9 and used it until
1.9, and right now I'm not using it for anything new. Along the way I swapped
to go, and later came back to python thanks to asyncio.

~~~
ensignavenger
What mess around channels are you talking about? I've never heard anything
about such a mess?

Not everyone uses Django to build a REST API that connects with a javascript
SPA. And even if I was, I would still be skeptical that FastAPI could do
everything Django does for me. (I do like Pydantic a lot, and actually use it
some times in my Django apps.)

~~~
jordic
Look at what will happen to channels in the next releases when Django becomes
async...

To have a websocket popping info from a redis pubsub, all of these crazy
abstractions don't make any sense. It's just an async view reading from
aioredis.

If you need an orm (a real one), use sqlalchemy. The Django orm has its own
design flaws that a lot of dbas will complain about.

Anyway, my latest thinking is that it's better to keep your model logic as
close to sql as you can. You will swap out python or Django before you swap
out postgresql :)

