Ask HN: What's wrong/right with Postgres migrations?
15 points by doganugurlu 10 months ago | 10 comments
Background:

- About 9 months ago we burned 2 separate days handling migrations on Supabase. A Supabase CLI and/or local Docker version change made it impossible for my teammate to bring up the db on their machine. Migrations couldn't be applied, and a fresh pull didn't work either. I thought this was a fluke.

- Working with a new team/stack (flask, alembic, postgres). Twice in the last month I had to change the `alembic_version` in the db just to get the migrations to apply. I think we found ourselves in this situation because we rewrote the history by having the new migration point to a previous revision (HEAD^) instead of the latest one (HEAD). Not sure what our motivation was.

- What am/are I/we doing wrong?

- What's the right/robust way to do this?

- How are teams of size >2 handling this?

- Isn't this scary?




> How are teams of size >2 handling this?

Directory with .sql files starting with a number. Each file contains a set of changes to the db. The db has a table with the number that was applied last. To migrate your db you check if you have a file with a number that is higher than the one in the db. Then you apply that file to your db. That’s it.
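
A minimal sketch of that loop, assuming a single-row schema_version table and psycopg2 (all names here are hypothetical):

    import glob, os, psycopg2

    conn = psycopg2.connect("dbname=app")  # hypothetical connection string
    with conn, conn.cursor() as cur:
        cur.execute("SELECT version FROM schema_version")
        current = cur.fetchone()[0]
        # zero-padded filenames (01_, 02_, ...) sort correctly
        for path in sorted(glob.glob("migrations/*.sql")):
            number = int(os.path.basename(path).split("_")[0])
            if number > current:
                cur.execute(open(path).read())  # apply the file
                cur.execute("UPDATE schema_version SET version = %s", (number,))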

Sounds like you are working in a way that is not intended by your tool / framework.


I do something similar.

We use a minified version of the production DB (baseline). People create numbered sql scripts. A helper script will restore the baseline DB on their local laptop, then apply the new sql files against it.

You write the deployment script explicitly rather than having it generated. Automatic diff-based generation is full of holes. Writing the deployment script directly works for all cases: renames, external data imports, flat files, etc.
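
A sketch of what such a helper might look like; pg_restore and psql are the real tools, while the file names and db name are assumptions:

    import glob, subprocess

    DB = "app_dev"  # hypothetical local database name
    subprocess.run(["dropdb", "--if-exists", DB], check=True)
    subprocess.run(["createdb", DB], check=True)
    # restore the minified production baseline
    subprocess.run(["pg_restore", "-d", DB, "baseline.dump"], check=True)
    # then apply the numbered scripts in order
    for path in sorted(glob.glob("scripts/*.sql")):
        subprocess.run(["psql", "-d", DB, "-v", "ON_ERROR_STOP=1", "-f", path], check=True)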

Feature branches? no problem.

    git fetch
    git rebase origin/master
If someone adds a new script you just bump your script number and continue. Scripts are often order-independent, so even if you both claim 04_foo.sql it may not matter, and if it does you just bump yours to 05_foo.sql.

I've been using plain old sql scripts for many years with exactly 0 deployment issues. The fact that everyone is constantly testing/validating a "deployment" on their local box is a huge win.

With a minified DB the full restore and script deployment can usually be done in under 10 seconds. With big data imports it may take longer, but those scripts can simply be renamed from 80_import.sql to 80_import.sql.SKIP for faster iteration.
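
The rename works because a glob like scripts/*.sql simply stops matching the renamed file, e.g.:

    import glob
    # 80_import.sql.SKIP no longer ends in .sql, so the loop skips it
    for path in sorted(glob.glob("scripts/*.sql")):
        print("would apply", path)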


Assume you and your teammate made db changes and generated migrations in your respective topic branches. Both migrations refer to the latest revision, as they should.

How do they merge? How does the migration tool know which migration to apply first?

Edit: You are right that we may not be using the tools in the intended manner. But having to check the latest migration and compare it against my local db seems a little error-prone and cumbersome, no?


> How do they merge?

Manually. When you merge your code back into a common branch you look at the existing migrations and the new migrations and make sure the numbers still make sense after the merge. You can avoid stepping on each other's toes by using a timestamp for the number in your filename, but you still have to check that the order makes sense when you merge.
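
A second-resolution timestamp prefix, for example (hypothetical naming scheme):

    from datetime import datetime, timezone

    # e.g. 20240501123045_add_users.sql: sorts chronologically, and two
    # branches are unlikely to pick the same second
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    print(f"{stamp}_add_users.sql")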

> How does the migration tool know which migration to apply first?

Migrations are ordered by the (ascending) number in the filename. (Or some other scheme, but this one is common.)

> But having to check the latest migration and compare it against my local db seems a little error-prone and cumbersome, no?

The number in the database should never be higher than the highest numbered file in source control. So you can increment without looking.


We use timestamps in file names. Before you merge you have to catch up your branch (or it gets automatically updated). CI runs the migrations, so if something doesn't work you will know it.


Pretty much what Flyway does.


I use dbmate, and for a deployment I package the migration files into a Docker container, which runs and applies the changes.

Firstly, I use a devcontainer for development so I know my versions line up. dbmate uses .sql files, which also makes things a lot easier.

You can see my setup here https://github.com/bionic-gpt/bionic-gpt

Have a look at the CONTRIBUTING.md to get an idea of the dev workflow.


> because we rewrote the history

Don't rewrite shared history.

As for how migrations are run: at my last place, each team/service had their own data store. Numbered sql files and forward-compatible changes only. Sometimes this meant application code would have to write to two locations, and a later update would change the read location.
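
The dual-write step of that pattern might look like this (tables and function are hypothetical, not from the commenter's codebase):

    def save_email(cur, user_id, email):
        # phase 1: write to both the old and the new location, while
        # reads still come from users.email
        cur.execute("UPDATE users SET email = %s WHERE id = %s", (email, user_id))
        cur.execute(
            "INSERT INTO user_emails (user_id, email) VALUES (%s, %s)"
            " ON CONFLICT (user_id) DO UPDATE SET email = EXCLUDED.email",
            (user_id, email),
        )
        # a later deploy flips reads to user_emails; only then can
        # users.email be dropped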

Current gig, everything is managed by the Django ORM. Great when you have a single db; it sucks to scale out to sharded tables and sharded db instances, and it's a pain when migrating to new data stores.


I've never experienced anything like this in Rails or Phoenix (Ecto) land.

The only time I had problems with database migrations was in JavaScript with Prisma. But that was because we had a local dev dataset, and the CEO had gone into past migration files and edited them manually. Prisma exploded every time, and it was a pain to fix.

It sounds like your pains are also coming from people manually editing past migration files and committing them. Don't do that!


You can merge heads in alembic, nothing special. You just need tests that run the migrations, and it's "just" like a code conflict when merging.
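
For reference, that's `alembic merge heads -m "..."` on the CLI; the same thing through alembic's Python API looks roughly like this (config path assumed):

    from alembic import command
    from alembic.config import Config

    # creates a new empty revision whose down_revision points at both
    # heads, giving the revision tree a single head again
    command.merge(Config("alembic.ini"), "heads", message="merge branches")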



