
Ask HN: Database migration on production without downtime - ainhu
How do you handle database migration on production servers without downtime?

We are currently struggling to find the right strategy for handling migrations of our database schemas without downtime. We are running a NodeJS application with MongoDB on Kubernetes, and can thus have pods of different versions running concurrently when deploying a new version (we currently perform rolling updates).

Some options we have been considering:

1. Using an evolutionary database design (only adding fields to the schema) and having the newer version write both the old and the new fields.

2. Extracting the DB repository to an external versionable service that reads/writes from the DB and returns a different schema version depending on which application version requests an entity.

Further suggestions are greatly appreciated!
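
For option 1, here's a minimal sketch of what the dual-write could look like with the official MongoDB Node driver (database, collection, and field names are all invented for illustration):

  const { MongoClient } = require('mongodb');

  // v2 splits "name" into firstName/lastName but keeps writing the
  // legacy "name" field so v1 pods running concurrently still read it.
  async function saveUser(users, user) {
    await users.updateOne(
      { _id: user._id },
      {
        $set: {
          firstName: user.firstName,
          lastName: user.lastName,
          name: `${user.firstName} ${user.lastName}`, // legacy field
        },
      },
      { upsert: true }
    );
  }

  async function main() {
    const client = await MongoClient.connect('mongodb://localhost:27017');
    await saveUser(client.db('app').collection('users'),
                   { _id: 1, firstName: 'Ada', lastName: 'Lovelace' });
    await client.close();
  }

  main().catch(console.error);

Once no old pods remain, a later deploy drops the legacy write and a backfill removes the old field.
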
======
redis_mlc
Some general points (I'll use MySQL 5.7 as an example):

- all major databases, including MySQL, have an equivalent to ALTER ONLINE now. However, not all operations are covered, and you need to develop pre-tested procedures for small and large tables. If you can't predict how long an ALTER will take, then you're not prepared enough. (Typically, under 400k narrow rows on SSD you don't even need ALTER ONLINE at ebb time.) See the sketch after this list.

[https://dev.mysql.com/doc/refman/5.7/en/innodb-online-ddl-operations.html](https://dev.mysql.com/doc/refman/5.7/en/innodb-online-ddl-operations.html)

- there are lots of trigger-based tools, like pt-online-schema-change

[https://www.percona.com/doc/percona-toolkit/3.0/pt-online-schema-change.html](https://www.percona.com/doc/percona-toolkit/3.0/pt-online-schema-change.html)

- using local SSD helps a lot; using EBS will be ugly for large tables. Active master/passive master flipping might be helpful for hard disks.

- for large tables, you still want to make changes at ebb time. In the US, that's generally around 6 or 7 pm PST.

- experienced DBAs keep things simple and understandable, so the "evolutionary database design" and "failing forward" approaches are recommended.

- data retention policies (say, 18 months) help with management.

- using tools like Ruby migrations is its own special thing, so you need to look under the hood there.
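
Regarding the ALTER ONLINE point above, a hedged sketch of an in-place, non-locking DDL run from Node (mysql2 client; connection details and table/column names are invented):

  const mysql = require('mysql2/promise');

  async function addColumnOnline() {
    const conn = await mysql.createConnection({
      host: 'localhost', user: 'app', password: 'secret', database: 'app',
    });
    // ALGORITHM=INPLACE, LOCK=NONE makes MySQL fail the statement up
    // front instead of silently copying the table and blocking writes.
    await conn.query(
      'ALTER TABLE orders ADD COLUMN shipped_at DATETIME NULL, ' +
      'ALGORITHM=INPLACE, LOCK=NONE'
    );
    await conn.end();
  }

  addColumnOnline().catch(console.error);

If MySQL can't satisfy LOCK=NONE for a given operation, it errors out immediately, which fits the pre-tested-procedures advice above.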

Pro Tip: Phrases like "Extracting the DB repository to an external versionable
service" give DBAs the willies. Sounds like a dangerous path to me.

Source: DBA.

~~~
ainhu
Thank you very much for the advice; it is really useful.

------
matt_s
I'm not familiar enough with MongoDB to know if it has peculiar behavior, so my
experience is with traditional RDBMSs.

You could write the backend/middleware and front end to handle both situations:
a column/data field being present vs. not. Get that code out there, then deploy
your DB changes separately. Once everything is working fine, remove the "shim"
code that handled whether a column was present or not. Most MVC web frameworks
have ways to mask the table/column names if needed. It may feel "icky", but
recognize that it is short term; since you can do rolling updates of your pods,
you should be able to deploy software-only changes nearly anytime, all the
time, without interruption.
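
A sketch of what such shim code could look like in a Node backend (field names are hypothetical): reads tolerate both schema versions while the migration is in flight:

  // v2 stores the address as a structured object; v1 stored one string.
  // The shim returns the new shape no matter which one the document has.
  function getShippingAddress(doc) {
    if (doc.shippingAddress !== undefined) {
      return doc.shippingAddress;          // new schema
    }
    if (doc.address !== undefined) {
      return { street: doc.address };      // old schema, adapted
    }
    return null;
  }

Once all rows/documents and all pods are on the new schema, this collapses to a plain property read and gets deleted.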

For the DB side: if the tables are large, an RDBMS will sometimes lock the
table for the update, which can cause issues. A technique we use is to copy the
table, do the migration on the copy (say, adding a column), apply any updates
that happened to the original table in the meantime, and then rename the tables
so the new migrated table becomes the actual one, with the original preserved
in case a rollback is needed. The rename is usually fast and shouldn't lock out
transactions.
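
A rough sketch of that copy-and-rename flow against MySQL (mysql2 client; table names invented). The "apply updates that happened in the meantime" step is the genuinely hard part, which tools like pt-online-schema-change solve with triggers, so it is only marked here:

  const mysql = require('mysql2/promise');

  async function copyMigrateRename(conn) {
    await conn.query('CREATE TABLE orders_new LIKE orders');
    await conn.query('INSERT INTO orders_new SELECT * FROM orders');
    await conn.query(
      'ALTER TABLE orders_new ADD COLUMN shipped_at DATETIME NULL'
    );
    // ...re-apply rows changed in "orders" since the copy started...
    // The swap itself is atomic and fast:
    await conn.query(
      'RENAME TABLE orders TO orders_old, orders_new TO orders'
    );
  }

  async function main() {
    const conn = await mysql.createConnection({
      host: 'localhost', user: 'app', password: 'secret', database: 'app',
    });
    await copyMigrateRename(conn);
    await conn.end();
  }

  main().catch(console.error);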

~~~
ainhu
We already perform migrations separately but we were trying to avoid needing
to be backwards-compatible. However, it looks like this is the best approach.
Thanks!

------
matisoffn
Depends on the operation. For most operations there are backwards compatible
options that generally result in a 2-step process.

SoundCloud (I don't work there) developed a Ruby tool for online database
migrations that copies the table and uses triggers to handle everything. Note
that it only works for MySQL. For PostgreSQL or others, the best option is
taking a backwards-compatible approach.

[https://github.com/soundcloud/lhm](https://github.com/soundcloud/lhm)
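
To make the 2-step idea concrete with one operation: renaming a MongoDB field might ship as (1) a release whose code reads and writes both the old and new names, then (2) a backfill like the sketch below (field names invented), after which a final release drops the old-name path:

  const { MongoClient } = require('mongodb');

  // Step 2: run only after every pod has the dual-read/dual-write code.
  async function renameField() {
    const client = await MongoClient.connect('mongodb://localhost:27017');
    await client.db('app').collection('users').updateMany(
      { uname: { $exists: true } },
      { $rename: { uname: 'username' } }
    );
    await client.close();
  }

  renameField().catch(console.error);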

~~~
ainhu
I guess backwards-compatibility in the app is probably the way to go. Thank
you for your comment!

------
itroot
I've done a lot of MongoDB migrations without downtime, with DBs up to
200-300 GB. The easiest way is to support both schema versions on your backend
side and then gradually migrate the data. It is easy to do when you can migrate
your data item by item, if that is possible. Feel free to PM me!
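
A sketch of that item-by-item approach (collection and field names are made up): a background script walks documents still on the old schema and migrates them in batches, so it can be stopped and resumed at any point:

  const { MongoClient } = require('mongodb');

  async function backfill() {
    const client = await MongoClient.connect('mongodb://localhost:27017');
    const users = client.db('app').collection('users');

    // Only touch documents that still lack the new field.
    const cursor = users.find({ fullName: { $exists: false } });
    let ops = [];
    for await (const doc of cursor) {
      ops.push({
        updateOne: {
          // Re-check in the filter so a concurrent live write wins.
          filter: { _id: doc._id, fullName: { $exists: false } },
          update: { $set: { fullName: `${doc.first} ${doc.last}` } },
        },
      });
      if (ops.length === 500) {
        await users.bulkWrite(ops);
        ops = [];
      }
    }
    if (ops.length > 0) await users.bulkWrite(ops);
    await client.close();
  }

  backfill().catch(console.error);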

------
jlisam13
Check out [https://github.com/github/gh-ost](https://github.com/github/gh-ost)

~~~
ainhu
This and pt-online-schema-change look promising, but they aren't really
applicable to NoSQL :(

