I mostly agree with this but with one (and a half) modifications and one addition.
Instead of storing a version as an integer, I strongly prefer naming each migration and storing the applied migrations in a database table.
Rather than relying on a single integer, I can simply write code that applies all the migrations which haven't yet been applied. This makes it much easier for the database to be modified in multiple concurrent branches without any merge pain.
At work we follow this practice and name our migrations something like "1-add-foo-table". If another dev makes a branch with a migration named "1-add-bar-table" then there's no conflict.
We also store our migrations as a directory rather than a single script. This lets us split things up into multiple files if needed. We also allow for both SQL and Perl (our main language) migration scripts in the directory, which is handy.
Finally, all migrations must be idempotent. In theory no migration should ever be run against the same database twice, but making our migrations idempotent is a little extra insurance.
For the curious, I wrote some Perl modules that help manage a system of migrations like what I just described:
Our tooling still requires an integer as the first part of the script name - but that's only used to determine overall execution order. The numbers can overlap and there can be gaps which helps with the branching problem.
I also agree that idempotent scripts are a key best practice.
We also persist a hash of the script in the database so that our tools can detect when a script has been modified after it was applied. This has helped catch and prevent a whole class of bugs.
As an extension here, you could store the entire script. Storage is cheap (at least for the size of the scripts that we're talking here most of the time).
I agree that versions should be named rather than numbered. Well, I guess that's an implementation detail. What really matters is having the equivalent of branches in a version control system. That way the main branch can use migrations a, b, c, d, and e. And the release branch for the legacy version that backports bugfixes but not new features can easily use migrations a, c, and e and skip b and d because those went with features that weren't backported.
In Python, sqlalchemy-migrate does this wrong and Alembic does this right. I worked on a big project that used sqlalchemy-migrate, and this caused us pain. It's difficult for a large project to change database versioning systems, so it's important to pick a good one from the start.
This is exactly the same conclusion I've come to over the years.
For compiled languages, I strongly recommend embedding the scripts in your jar/assembly/dll. That way you can sign it, and make deployment much much easier.
You'd have a database table where its entire contents explain what's been done to the database schema and the data migrations.
And then you make inserting a row named '3-add_customers_table' as part of the transaction for the migration. If this row already exists, the insert will fail, aborting the entire transaction.
Instead of storing a version as an integer, I strongly prefer naming each migration and storing the applied migrations in a database table.
Rather than relying on a single integer, I can simply write code that applies all the migrations which haven't yet been applied. This makes it much easier for the database to be modified in multiple concurrent branches without any merge pain.
At work we follow this practice and name our migrations something like "1-add-foo-table". If another dev makes a branch with a migration named "1-add-bar-table" then there's no conflict.
We also store our migrations as a directory rather than a single script. This lets us split things up into multiple files if needed. We also allow for both SQL and Perl (our main language) migration scripts in the directory, which is handy.
Finally, all migrations must be idempotent. In theory no migration should ever be run against the same database twice, but making our migrations idempotent is a little extra insurance.
For the curious, I wrote some Perl modules that help manage a system of migrations like what I just described:
* https://metacpan.org/release/Database-Migrator
* https://metacpan.org/release/Database-Migrator-mysql
* https://metacpan.org/release/Database-Migrator-Pg
The system is designed to be extensible so you can build on top of it, rather than a fully encapsulated tool.
There's also Sqitch (http://sqitch.org/). It's written in Perl but is entirely language-agnostic to the best of my knowledge.