
I've been really happy with how my current company[0] has been doing migrations and I've seen a couple others do it but it seems like it should be more widespread.

Database Schema as Code

Instead of writing up and down migrations, you define what the end state should look like, and the computer figures out how to get there. This is just how the industry started managing server configurations (Puppet) and infrastructure (Terraform).

We use protocol buffers, so it was pretty straightforward to have a definition of what our tables should look like. We have a script that figures out the delta between two states (either proto files or a DB) and can calculate the schema migration SQL (e.g. CREATE TABLE, etc.).
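
As a rough illustration (hypothetical table and column names, with plain SQL standing in for our proto definitions), the flow looks something like this:

    -- Desired end state, as checked into the repo:
    CREATE TABLE patients (
        id         BIGINT PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL,
        birth_date DATE          -- new in this commit
    );

    -- The live database doesn't have birth_date yet, so the diff script
    -- calculates and emits the DDL needed to converge:
    ALTER TABLE patients ADD COLUMN birth_date DATE;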

From there, we run it through a safety check. Any unsafe migration (whether due to data loss or performance issues, e.g. DROP TABLE) requires an extra approval file.

There's no real difference between an up migration and a down migration (except that one tends to result in an unsafe migration). It's calculable at CI time, so we can give devs a chance to look at what it's going to do and approve any unsafe migrations. API compatibility checks enforce that you need to deprecate before you can drop.

DML, that is data changes, is handled by checking in a SQL file; CI will run it before the code deploy and after the schema migration.
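
For example, a checked-in DML file might look like this (hypothetical table and columns, just to sketch the shape of it):

    -- backfill_patient_display_name.sql
    UPDATE patients
    SET display_name = first_name || ' ' || last_name
    WHERE display_name IS NULL;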

Alembic is the one other place I've seen this concept (a couple of others mentioned it), so it's not new, but I'm surprised I haven't seen it in more places.

[0] Shameless plug: We're hiring if you're interested in changing how healthcare is paid for, delivered, and experienced. https://www.devoted.com/about/tech-jobs/




I call this declarative schema management, since the repo declares the desired state, and the tooling knows how to reach this state. This concept is finally catching on lately, although some huge companies have already been doing it this way for quite some time. Facebook is a key example; they've managed their schema changes in a pure-SQL declarative fashion, company-wide, for nearly a decade.

I'm developing a suite of tools [1] to provide declarative schema management and "schema change by pull request" functionality, initially targeting MySQL and MariaDB. A few large companies have built pipelines using one of my tools -- including Twilio SendGrid, who wrote about their process in-depth recently [2].

[1] https://skeema.io

[2] https://sendgrid.com/blog/schema-management-with-skeema/
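
As a rough sketch of that pull-request workflow (hypothetical table, MySQL syntax): the repo holds the desired CREATE TABLE, and the tooling computes the ALTER needed to reach it.

    -- users.sql in the schema repo: the desired state, edited in the PR
    CREATE TABLE users (
      id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
      email VARCHAR(255) NOT NULL,
      created_at DATETIME NOT NULL,
      PRIMARY KEY (id),
      UNIQUE KEY (email)    -- the line added by this pull request
    );

    -- On merge, `skeema push` would generate and run something like:
    ALTER TABLE users ADD UNIQUE KEY (email);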


This is good to know. As someone who hadn't done much with databases before, I was frankly worried, given that it didn't seem like many others were taking this approach when it made so much sense (we did have the advantage of a defined schema, which I know isn't always available). Seems like I just didn't know what to search for.

Git would never have worked if it required devs to write the up/down patches by hand - so why should we have to write the up/down migrations for our schema?

Excited to see more tooling around declarative schema!


I'm surprised this isn't more of a thing. It seems like the natural evolution of "[X] as code". I've always been a little turned off by migrations (though they were certainly an improvement over the previous situation, which was basically just nondeterministic changes made on the fly).


My thoughts exactly. But it's a major paradigm shift for those coming from the world of Rails/Django/etc migrations, and that unfamiliarity understandably leads to some initial resistance and skepticism.

fwiw, other declarative tools are starting to pop up -- besides my tool Skeema, some others I've seen recently are Migra [1] and sqldef [2]. And meanwhile a bunch of enterprise tools for MS SQL Server have operated in the declarative fashion for quite a long time, although usually with GUIs instead of being git / pull-request-driven. So I think/hope it's just a matter of time before this concept becomes more widely known.

[1] https://github.com/djrobstep/migra

[2] https://github.com/k0kubun/sqldef/


It's definitely a thing, e.g. SQL Server Data Tools has this as a default - Schema Compare and Data Compare - and you can just use declarative approaches to define your final state and let the tool take care of it.

That being said - if you want to do this, the downside is usually that it's slow as hell, and the non-migration approaches can cost you downtime.

Generic solutions to specific states often mean copying all the data somewhere else so you can modify the table and then put it back in a useful fashion - a migration often allows more piecemeal approaches.

Edit: a guy I like wrote a good set of model-vs-migration articles: http://dlmconsultants.com/model-vs-mig/


Curious, how do you deal with renaming fields or tables?

This is a (minor) pain point for traditional migration systems.


Excellent question! The short answer is Skeema doesn't directly support renames yet. Renames are inherently more imperative than declarative, so they don't fit in well with the model. I've thought about handling them via tracking their history/state, but it would be hacky.

Two workarounds exist in Skeema currently:

* You can do the rename "out of band" (e.g. manually, rather than via `skeema push`), and then update your schema repo via `skeema pull`. This isn't ideal, but then again, table/col renames typically involve nasty code-deploy-order complexities to begin with (regardless of migration system): there's no way to rename something at the same exact instant that your new code goes live, and it's difficult/annoying to write code that can correctly interact with both names.

* For new empty tables, happily a rename is equivalent to drop-then-re-add. So this case is trivial, and Skeema can be configured to allow destructive changes only on empty tables.

I've written a bit more about this at https://github.com/skeema/skeema/blob/master/doc/requirement... , as well as the bottom of https://www.skeema.io/blog/2019/01/18/declarative/ .


How about: a column named x_no_wait_y declares a column named y, but if a column named x exists, it gets renamed?


If you use MS SQL Server SSDT, you use refactor/rename and it finds all references and changes them; then when you go to deploy, it generates an sp_rename - 100% killer feature right there :)


Data migrations? Denormalizing columns from one table to one or more child tables, possibly more than one relation away? Switching one set of fields in a single table to be in a different table via a relation, converting something from 1:1 to 1:n?

The concept appeals to me, but it only seems to work for trivial migrations.


I'll openly admit that we don't have everything figured out. You're absolutely right that, currently, we constrain ourselves to admittedly simple migrations.

I think there's a whole set of problems to be solved in this space. Frankly, I'm as surprised as anyone that, given how declarative SQL is, we use procedural code to do migrations - so part of my post was hoping people would tell me what tool I should be using, or how this approach fails over time. Your examples are really helpful for me as I think through whether each one can be done automatically, worked around, or gotten by without.

It seems to me that we just lack a way to express these transitions mathematically in a form that tooling can act on. And of those, there's probably only a subset that can be done without taking downtime.

In particular, the class of migrations that you outline is a combination of DDL and DML changes and also carries quite a bit of code complexity to do without downtime. It's definitely a current weakness.


A totally valid point, but I'd argue those should be handled by a separate tool or process. Data migrations tend to be fully programmatic; tools and frameworks can help reduce the code required, but cannot handle every possible case. (having performed numerous multi-billion-row data migrations, I learned this painfully first-hand...)

For simpler cases, where it may make sense to run a data migration immediately after a schema change, a good generic middle-ground may be configurable hook scripts. A declarative schema management system can then pass relevant info to the hook (which tables were changed, for example) and then the script can run any arbitrary row data diff/apply/migrate type of logic.

I do understand your point though; for relatively straightforward data migrations, an imperative system can capture these much more cleanly by just coupling them with the corresponding schema migration code.


I honestly like the way Rails does it: capturing the imperative deltas and also dumping the final schema, which gets checked in. Not a big fan of down migrations - usually a waste of time.

Otherwise I like Percona's OSC, particularly how it can throttle table rewrites when there's competing work or replication is lagging too much. We're just at the point where we need to automate the OSC tool rather than using it as a point solution for migrating our bigger tenants.


I am guessing that you are probably not using Python/Django... but is this any different than what Django offers?

Django allows you to define your models (schema) and then run a command that will generate the migrations. If you don't like the migration that was generated, you can modify it. You can customize up and down operations.

There are also tools that will take an existing database and generate the Django models for you.

All of these operations can also export the exact SQL that will run on each supported RDBMS platform in case you want to be extra sure on what exactly will be executed.
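
For example (a hypothetical model and a Postgres backend), `./manage.py sqlmigrate myapp 0002` prints the statements that would run, something like:

    BEGIN;
    --
    -- Add field birth_date to patient
    --
    ALTER TABLE "myapp_patient" ADD COLUMN "birth_date" date NULL;
    COMMIT;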


Django migrations can be problematic because they're meant to be sequential and have interdependencies. I've had problems merging multiple feature branches because of this, even though there are no code conflicts.

A system like SaltStack or Puppet for databases would not have checked-in migrations; these would be generated on the fly at deploy time.

So you could very well have multiple state changes in a single run, by comparing actual DB state and desired DB state, then creating the SQL code as needed for that deployment.

Honestly not having to fiddle with the migrations table on a live server seems pretty nice ;-)

This could very well turn out to be Django's next gen migration tool...


> Django migrations can be problematic because they're meant to be sequential and have interdependencies. I've had problems merging multiple feature branches because of this, even though there are no code conflicts.

They're actually a directed graph; this means a conflict that should have been handled on the branches wasn't, and it would probably have been a problem regardless.


This was helpful to think about, thanks.

I've rarely encountered logical merge conflicts with migrations, but I could see it happening.

I used to be on the SQL Server team at Microsoft and had some exposure to the customer support teams. So data integrity and eliminating any potential for errors was huge.

So while I love the idea of migrations being generated on the fly, from the actual state of Production-System-5 to the desired state of commit 27a73e, I'm skeptical of it working that well in practice. Certain cases come to mind where there might be intermediate migrations from [full name] -> ([first name] [last name]) -> ([first initial] [last name]). The system would have to be smart enough to know that A -> C may require going A -> B -> C, or to prompt the engineering team for a better DML migration script.
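
For example (hypothetical columns, Postgres string functions), the two intermediate DML steps would be something like:

    -- A -> B: split full_name into first_name / last_name
    UPDATE people
    SET first_name = split_part(full_name, ' ', 1),
        last_name  = split_part(full_name, ' ', 2);

    -- B -> C: reduce first_name down to an initial
    UPDATE people
    SET first_initial = left(first_name, 1);

    -- A diff straight from A to C only sees full_name disappearing and
    -- first_initial/last_name appearing; it can't infer these steps on its own.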

Also, you will want there to be documentation about what was performed whether that is a migrations table that points to a .py file... or a .json output... or a log file.


Yeah. I'd love to see the academic paper with formalizations that help me understand the true scope of this problem. Your example is a great one that prompts many questions: is it possible to travel directly to the commit in O(1), or will the code have to calculate the diff of each commit and apply them one at a time in O(n)? And how much definition and dependency mapping do humans need to do to have it work correctly?


The closest I can think of is trying to define a set of CRDT-compatible operations that are expressive enough to describe your database schema, starting from an empty database. Then, the migration you need to perform is whatever the CRDT merge operator says you need to do.


That's great initially, but problems definitely crop up at scale:

* What happens when your company creates new systems that aren't Python/Django? You can either keep shoehorning all migrations into Django models, or have multiple separate schema change processes/pipelines... neither option is good.

* If someone makes an out-of-band schema change manually (either by accident or to do a rapid hotfix), you're no longer on a known version. Not sure about Django, but when this happens, most traditional migration systems cannot operate without additional manual cleanups... whereas declarative tools can inherently transition any state to any other state.

* Depending on the DBMS, with large tables and certain types of ALTERs, using an external online schema change tool is greatly preferable to just running an ALTER directly.

* Does Django support sharding / the notion that a change to a sharded model must be made to multiple shards?


* I see your point on not standardizing on one framework. Generally when that has happened for me, it turns into a new service and it has its own database/tables/migration management. It does get quite annoying, for sure.

* I've seen enough things go wrong that on my teams I do not allow DDL to be executed outside of a controlled process that comes from code. But yeah, if that were to happen, it would be annoying to figure out what was done and then try to re-model.

* With Django you can specify the exact SQL to run, so you can break up operations into multiple smaller steps - the canonical example is building a new column based on an old column. You first add the column as nullable, then you populate it in batches of ~10k records, then you add the constraints/indexes (see the SQL sketch after this list).

* I haven't used Django with sharding. It appears there are some posts about it, but it all appears to be community generated content and not part of the official docs.
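
A minimal sketch of that three-step pattern from the bullet above (hypothetical table/column names, Postgres syntax):

    -- Step 1: add the column as nullable (a cheap, metadata-only change)
    ALTER TABLE orders ADD COLUMN total_cents BIGINT;

    -- Step 2: backfill in batches of ~10k to avoid long locks / replication lag
    UPDATE orders
    SET total_cents = (total_dollars * 100)::bigint
    WHERE id IN (
        SELECT id FROM orders
        WHERE total_cents IS NULL
        LIMIT 10000
    );
    -- (repeat until no rows are updated)

    -- Step 3: only then add constraints and indexes
    ALTER TABLE orders ALTER COLUMN total_cents SET NOT NULL;
    CREATE INDEX CONCURRENTLY orders_total_cents_idx ON orders (total_cents);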

All-in-all, I could see that at a large scale, in very mature engineering organizations with lots of activity and complex operations, something like Django could fall short and a home-grown system like this may be beneficial, assuming it were reliable enough.


Sharding is usually handled by patching the QuerySet / ObjectManager, how Citus handles it is a good example: http://docs.citusdata.com/en/v8.1/develop/migration_mt_djang...


Prepare for the exciting future of DevOps transformation: it's `./manage.py makemigrations`.


This sounds nice! One question: you said that DML changes are handled via a checked-in SQL file. Does this simply mean a new SQL file for each migration? And how are DML changes connected to DDL changes? For example, if some code is two versions behind and is updated to the current schema, wouldn't this mean that the DDL is updated in one step to the current state, but the DML potentially in two steps, breaking the update?


That's correct. The DML changes as part of CI are somewhat new so we haven't ironed it all out yet.

Here's the scenario that I think you're laying out:

1. Commit A creates column foo
2. Commit B has DML that references column foo
3. Commit C removes column foo

This works fine if our CI deployer does each commit individually. First roll out any schema changes, then run any DML SQL.

However, our deployer might pick up all of those changes at once. Since we roll out the schema migrations first (in this case, a create + drop, which nets out to a no-op) and then run the DML (which will error), the rollup is an issue.
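
Spelled out with hypothetical names, the rolled-up deploy looks like:

    -- Commit A (schema):
    ALTER TABLE t ADD COLUMN foo TEXT;
    -- Commit B (DML):
    UPDATE t SET foo = 'some_value';
    -- Commit C (schema):
    ALTER TABLE t DROP COLUMN foo;

    -- Rolled up, the net schema diff is empty (foo is added and then dropped),
    -- so no DDL runs - but Commit B's UPDATE still executes and errors,
    -- because column foo never exists on the live database.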

In practice, we have yet to see this case (most of the time, the dev who writes the DML is close enough to the code to know if the column is going to be dropped soon, and we don't drop that many columns - in part because we know that there be dragons). But truthfully, I haven't thought about it much and need to think through what the impact is beyond this example. Thanks for helping me refine my thinking - I'll have something to ponder this weekend!


Yep, your example describes exactly (and better) what I meant. Thanks!


Can that handle column renames? Most schema-to-schema diff tools can't tell the difference between a rename and a delete/add.


I'll openly admit that we don't have everything ironed out. In fact, my next big project is to tackle derived columns (a rename is just a derived column where the transformation is the identity function).

It requires a bit more finesse and integration into our code base as it requires multiple deploys with code that knows how to handle both columns.


Strong agree. You can do much better than Rails/Django migrations.

I have been advocating for this approach for a while now: https://djrobstep.com/talks/your-migrations-are-bad-and-you-...


Not sure about the state of the world currently, after living in a BigCo filter bubble for the past few years, but do you even need custom tools to calculate the delta between the schema as checked into VCS and the database's actual state?

Spanner (https://cloud.google.com/spanner/), I think, can auto-compute the diff between its current state and a given schema, generate the appropriate SQL statements to perform a migration, and get user confirmation for destructive schema changes.


> This is just how the industry started managing server configurations (Puppet) ...

Yes, and CFEngine pioneered this in 1993 (Mark Burgess).

I make my living as a CFEngine consultant.



