I mostly agree with this but with one (and a half) modifications and one additio...

daigoba66 · on Aug 10, 2015

I've come to the same conclusions.

Our tooling still requires an integer as the first part of the script name, but that's only used to determine overall execution order. The numbers can overlap and there can be gaps, which helps with the branching problem.
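
A minimal sketch of that ordering rule, assuming hypothetical script names of the form `<integer>-<description>.sql`. Breaking ties on the full name is an assumption made here so the order stays deterministic; it is not something the tooling above is known to do.

```python
import re

def execution_order(script_names):
    """Sort scripts by their leading integer; gaps and repeated numbers are allowed."""
    def key(name):
        match = re.match(r"(\d+)", name)
        if match is None:
            raise ValueError(f"script name must start with an integer: {name!r}")
        # Assumption: scripts sharing a number are ordered by their full name.
        return (int(match.group(1)), name)
    return sorted(script_names, key=key)
```

Because only the relative order matters, two branches can each add a script numbered 10 without either one having to renumber.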

I also agree that idempotent scripts are a key best practice.

We also persist a hash of the script in the database so that our tools can detect when a script has been modified after it was applied. This has helped catch and prevent a whole class of bugs.
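
Roughly, the check could look like the sketch below. The choice of SHA-256, the function names, and the dictionary shapes are all assumptions for illustration, not a description of our actual tools.

```python
import hashlib

def script_hash(script_text):
    """Hash the script contents; SHA-256 is an arbitrary choice for this sketch."""
    return hashlib.sha256(script_text.encode("utf-8")).hexdigest()

def find_modified(applied_hashes, current_scripts):
    """Return names of applied scripts whose current text no longer matches.

    applied_hashes: name -> hash recorded in the database when the script ran
    current_scripts: name -> script text as it exists now
    """
    return sorted(
        name
        for name, text in current_scripts.items()
        if name in applied_hashes and applied_hashes[name] != script_hash(text)
    )
```

Scripts that have not been applied yet are ignored, since there is no recorded hash to compare against.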

joshka · on Aug 11, 2015

As an extension here, you could store the entire script. Storage is cheap (at least for the size of the scripts that we're talking about here most of the time).

dripton · on Aug 11, 2015

I agree that versions should be named rather than numbered. Well, I guess that's an implementation detail. What really matters is having the equivalent of branches in a version control system. That way the main branch can use migrations a, b, c, d, and e. And the release branch for the legacy version that backports bugfixes but not new features can easily use migrations a, c, and e and skip b and d because those went with features that weren't backported.
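
A small sketch of why names work better than a single version counter here: what is pending is a set difference against the branch's own list, not "everything above version N". The function name is hypothetical and this is not Alembic's API; it also ignores dependencies between migrations.

```python
def pending_migrations(branch_migrations, applied):
    """Return the branch's migrations that have not been applied, in branch order."""
    already_applied = set(applied)
    return [name for name in branch_migrations if name not in already_applied]

main_branch = ["a", "b", "c", "d", "e"]
release_branch = ["a", "c", "e"]  # b and d belong to features that were not backported
```

With a single integer version, a database at "version e" on the release branch would look identical to one that had also run b and d.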

In Python, sqlalchemy-migrate does this wrong and Alembic does this right. I worked on a big project that used sqlalchemy-migrate, and this caused us pain. It's difficult for a large project to change database versioning systems, so it's important to pick a good one from the start.

radicalbyte · on Aug 10, 2015

This is exactly the same conclusion I've come to over the years.

For compiled languages, I strongly recommend embedding the scripts in your jar/assembly/dll. That way you can sign it, and it makes deployment much, much easier.

ars · on Aug 10, 2015

> Finally, all migrations must be idempotent.

That must be tough. Are you doing it by simply checking an "I've already done this" flag?

ansible · on Aug 10, 2015

You'd have a database table where its entire contents explain what's been done to the database schema and the data migrations.

And then you make inserting a row named '3-add_customers_table' part of the transaction for the migration. If this row already exists, the insert will fail, aborting the entire transaction.
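
Something like this sketch, using Python's sqlite3 module because SQLite supports transactional DDL. The `schema_migrations` table name and the `apply_migration` helper are made up for illustration.

```python
import sqlite3

def apply_migration(conn, name, statements):
    """Run `statements` once; return False if `name` is already recorded.

    Expects a connection opened with isolation_level=None so that the
    explicit BEGIN/COMMIT/ROLLBACK below control the transaction.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)"
    )
    conn.execute("BEGIN")
    try:
        try:
            # The primary key makes this insert fail if the migration already ran.
            conn.execute("INSERT INTO schema_migrations (name) VALUES (?)", (name,))
        except sqlite3.IntegrityError:
            return False  # already applied; the finally clause rolls back
        for statement in statements:
            conn.execute(statement)
        conn.execute("COMMIT")
        return True
    finally:
        # Reached with an open transaction only on failure or early return.
        if conn.in_transaction:
            conn.execute("ROLLBACK")

conn = sqlite3.connect(":memory:", isolation_level=None)
apply_migration(
    conn,
    "3-add_customers_table",
    ["CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)"],
)
```

If any statement in the migration fails, the marker row is rolled back along with it, so the migration can be retried after the script is fixed.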

Or something like that...

daigoba66 · on Aug 11, 2015

If your RDBMS supports transactional DDL and conditionals then it is pretty simple.

E.g. adding a column: IF NOT EXISTS ... ALTER TABLE ... ADD ...;
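
For an RDBMS without that kind of conditional, the same check can live in the migration tool. This is a sketch against SQLite, which as far as I know has no IF NOT EXISTS form for ADD COLUMN, so it inspects `PRAGMA table_info` from Python; the helper name is hypothetical.

```python
import sqlite3

def add_column_if_missing(conn, table, column, definition):
    """Add `column` to `table` unless it already exists; return True if added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    # Identifiers cannot be bound as parameters, so only pass trusted names here.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {definition}")
    return True
```

Running it twice is harmless: the second call sees the column and does nothing, which is the idempotency being discussed.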

tajen · on Aug 10, 2015

This is how the Play Framework works too.