Love pglogical. I was thinking it'd be nice to orchestrate this setup using pglogical too; I'd attempt it if there's interest, especially with a bi-directional replication setup.
I checked the code and noticed some things that you might want to address in the future (I've done major version upgrades via logical replication myself several times using a script similar to this).
The "default" way of bringing the target DB in sync with the source one (CREATE PUBLICATION on the source then CREATE SUBSCRIPTION on the target) only works for moderately sized databases. Even on ~50GB I noticed that it may take hours to catch up with the origin. There are a couple of solutions:
1. Drop all indices except for the primary keys before performing initial copy, and then restore the indices after all tables have been copied and replication stream started.
2. Instead of copying into a blank database, you could instead create a logical dump using pg_dump with snapshot isolation, restore it on the target DB using pg_restore, create a subscription in the disabled state, advance the logical replication slot to the LSN with which you created the logical dump, and then enable the subscriptions. This way your target database would only have to process the delta between the time you created a dump and the current time. I have a snippet illustrating how we did it: https://gist.github.com/take-five/1dab3a99c8636a93fc69f36ff9....
3. We found out that pg_dump + pg_restore is still very slow on larger databases (~1TB), and it's untenable to wait several hours. Remember that while the dump is being restored, the source database is accumulating WAL that it must later send to the target DB. Left unchecked for 12-18 hours, this can lead to running out of disk space. That was unacceptable for us, so instead of creating the target DB from a logical dump, we created a copy of the source database from an AWS RDS snapshot, upgraded it using pg_upgrade, and then set up logical replication (similar to what's described in the Percona blog: https://www.percona.com/blog/postgresql-logical-replication-...). Something like this can probably be achieved with ZFS snapshots.
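A hedged sketch of option 2 above (names, URIs, and LSNs are placeholders; the linked gist has the real version). The key pieces are pg_dump's `--snapshot` flag, creating the subscription disabled and without the initial copy, and `pg_replication_origin_advance()`:

```shell
# 1. On the source: create the logical replication slot and export a snapshot
#    (e.g. CREATE_REPLICATION_SLOT over the replication protocol returns a
#    snapshot name); record the slot's LSN. Keep that session open while dumping.
pg_dump "$SOURCE_URI" --snapshot="$SNAPSHOT_ID" -Fc -f dump.bak

# 2. Restore the dump on the target
pg_restore -d "$TARGET_URI" dump.bak

# 3. On the target: create the subscription disabled, pointing at the
#    pre-created slot, and skip the initial table copy
psql "$TARGET_URI" -c "CREATE SUBSCRIPTION sub1
  CONNECTION 'host=source dbname=app user=replicator'
  PUBLICATION pub1
  WITH (copy_data = false, enabled = false, create_slot = false, slot_name = 'sub1');"

# 4. Advance the subscription's replication origin to the dump's LSN, then enable.
#    The origin name is 'pg_' || the subscription's oid (look it up in pg_subscription).
psql "$TARGET_URI" -c "SELECT pg_replication_origin_advance('pg_XXXXX', '$SLOT_LSN');"
psql "$TARGET_URI" -c "ALTER SUBSCRIPTION sub1 ENABLE;"
```

From that point the target only replays the delta accumulated since the snapshot.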
--
Otherwise, it was very similar to how you programmed your library.
Recently, I saw an article here on HN on the same topic of migrating Postgres databases using logical replication. There was a very nice idea to set up _reverse_ logical replication after switchover so that it's possible to roll back quickly in case something breaks.
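A hedged sketch of that reverse setup (object names are placeholders): after switchover, the new primary publishes and the old one subscribes, so the old database keeps receiving the writes and stays viable as a rollback target:

```sql
-- On the NEW primary (now taking writes): publish all tables
CREATE PUBLICATION reverse_pub FOR ALL TABLES;

-- On the OLD primary (now idle): subscribe without re-copying data,
-- since both sides hold identical rows at switchover time
CREATE SUBSCRIPTION reverse_sub
  CONNECTION 'host=new-primary dbname=app user=replicator'
  PUBLICATION reverse_pub
  WITH (copy_data = false);
```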
These are great shouts! Thank you so much for sharing. One of the operations I ran with this was ~500GB, and yeah, it takes a few hours to catch up. For us it wasn't an issue, especially since we were on AWS Aurora. That said, I can totally see it not being feasible on other database engines/systems. I considered the idea of dropping all the indices and re-creating them once the catch-up is complete; I'd like pg_easy_replicate to support that. It should make the initial COPY a lot swifter. On a ~200GB database it cut the total time down by 90%.
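A hedged sketch of the index-drop idea (this is not part of pg_easy_replicate today; it assumes you capture the definitions first so they can be replayed after catch-up):

```sql
-- Save the definitions of every non-primary-key index before the initial copy
SELECT indexdef
FROM pg_indexes
WHERE schemaname = 'public'
  AND indexname NOT IN (
    SELECT conname FROM pg_constraint WHERE contype = 'p'
  );

-- For each saved definition:
--   DROP INDEX public.some_index;
-- ...run the initial COPY and let replication catch up...
-- then replay each saved indexdef statement:
--   CREATE INDEX some_index ON public.some_table (...);
```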
The fastest way to switch over is probably adding an iptables rule on the old server to DNAT to the new server, unless you have a load balancer or other network middleware like Kubernetes' ClusterIP services (which use iptables or IPVS internally).
Waiting for a rolling restart or DNS cache expiration could take a while, during which the app is talking to a read-only database, if I understand correctly.
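For illustration, a hedged iptables sketch of that DNAT idea (addresses are placeholders; a matching POSTROUTING rule is needed so return traffic flows back through the old host):

```shell
# Allow the old host to forward packets
sysctl -w net.ipv4.ip_forward=1

# Redirect connections hitting the old server's 5432 to the new server
iptables -t nat -A PREROUTING -p tcp --dport 5432 \
  -j DNAT --to-destination 10.0.0.42:5432

# Rewrite the source address so replies come back via this host
iptables -t nat -A POSTROUTING -p tcp -d 10.0.0.42 --dport 5432 \
  -j MASQUERADE
```

New connections land on the new primary immediately, with no waiting on DNS TTLs or app restarts.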
When you do this, does it need to process the whole database before the server can start processing transactions, or is it more of an "upgrade in the background on the fly" kind of thing?
This is also a spot where having failover strategies and circuit breakers can help. Microservices make you think about these things but you don’t have to wait for that forcing function to think about them.
When the database goes down, you have to do something else. Could be the server crashed or could just be a Postgres upgrade. Five minutes is just about the right amount of time for an open circuit to do its job.
Because when I last tried it, the replica has to be the same major version (WAL streaming requires it) and then upgrading it to the latest version can take 10+ hours.
You'd want logical replication instead of physical replication. Logical replication doesn't stream the WAL as-is; it streams a logical representation of the row changes that can be replayed on the destination, so the major versions don't have to match.
It looks like this uses the pglogical plugin, but there's also a built-in decoder, I think it's called "pgoutput".
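For reference, the built-in path needs no extension: pgoutput is the default decoding plugin behind CREATE SUBSCRIPTION. A minimal sketch (names and connection details are placeholders):

```sql
-- On the source (wal_level must be set to 'logical')
CREATE PUBLICATION pub1 FOR ALL TABLES;

-- On the target, with a matching schema already loaded
CREATE SUBSCRIPTION sub1
  CONNECTION 'host=source dbname=app user=replicator'
  PUBLICATION pub1;  -- uses the pgoutput plugin under the hood
```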
> The switch is made by putting the user on the source database in READ ONLY mode, so that it is not accepting any more writes and waits for the flush lag to be 0. It is up to user to kick of a rolling restart of your application containers or failover DNS (more on these below in strategies) after the switchover is complete, so that your application isn't sending any read/write requests to the old/source database.
This does mean that there will be a "downtime" during which the DB is in read-only mode, right? I understand this period can be really small, but that depends on how long the app deployment takes.
That's right, it was a typo. I meant zero data loss and minimal downtime. There are some strategies in the readme, like using weighted DNS failover, to reduce the downtime even further without requiring an application deployment.
A simple LB like HAProxy works fine for this and similar cases. In a nutshell, HAProxy runs checks every n seconds (in your case they could be read-only mode checks) and disables the failing upstream.
We use it in front of MySQL/PG+Patroni/Redis and it works fine - though we only tested at low load, up to 5k qps for the PG case and up to 80k rps for the Redis case.
This can be done with pgbouncer and you can suspend the database so pgbouncer "holds" connections. If the application has high enough timeouts, it will look like it's just taking longer than normal to connect.
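A hedged sketch of that PgBouncer dance (assumes the admin console is reachable on port 6432 and the pool is named mydb):

```shell
# Hold new queries: clients block instead of erroring out
psql -p 6432 -U pgbouncer pgbouncer -c "PAUSE mydb;"

# ...perform the switchover, repoint mydb at the new server in pgbouncer.ini...

psql -p 6432 -U pgbouncer pgbouncer -c "RELOAD;"      # pick up the new config
psql -p 6432 -U pgbouncer pgbouncer -c "RESUME mydb;" # release the held clients
```

As long as the pause is shorter than the application's connection/query timeouts, the cutover just looks like a slow query.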
For that specific case, and if you can afford PgBouncer, I agree with you.
On a wider scale, HAProxy adds value - automatic tracking of sync/async replicas, replica lag windows, falling back to the master if all replicas are lagging/down, and so on.
I guess this is pretty similar to what Heroku does for pg upgrades?
If using this tool for upgrades with minimal downtime, if I understand right, you need to orchestrate your db callers to switch over to the new primary at the right point, when it's ready? Tips for getting that to happen at just the right point to minimize downtime?
Great question: you can watch the stats command. It returns JSON every second, including a switchover_completed_at field, which is updated as soon as the switchover is complete.
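For example, a hedged polling sketch (the `--group-name` flag and exact JSON field name are assumptions based on the comment above; check the readme for the real interface):

```shell
# Poll the stats output until the switchover has completed
until pg_easy_replicate stats --group-name cluster-1 \
      | jq -e '.switchover_completed_at != null' > /dev/null; do
  sleep 1
done
echo "switchover complete - safe to repoint the application"
```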
Let's say I have a Rails app and I want to do a major postgres version upgrade using this tool.
I see one of the requirements is that 'Both databases should have the same schema'. How should I manually load the schema to the new database (running on port 5433 for instance)?
Yeah I think the documentation is a bit ambiguous here. Postgres "schemas" have a very specific meaning, different from what most people think of when they hear the word. It's more like a table namespace.
I have historically had a “drop-create script” that could handle building me empty tables. Typically split up so that I can create tables, import (test) data from a snapshot or synthetic source, then enable constraints and indexes.
So even if a tool like that needed a little boost at the beginning to get going, I’d have something that already took care of that.
That’s a great question: internally what we have done is when preparing the new empty database, we also run a “bundle exec rake db:schema:load” against it. Depending on your setup, it can be hard to do.
I am exploring some options like loading schema from a schema.sql file or something along the lines of what pg_dump does for schema export.
The easy answer would be on some non-production copy of the Rails app, point at the new db in database.yml, and simply run `rails db:create; rails db:schema:load`, or some variation rails command like `rails db:prepare`.
Rather than this ad-hoc thing, you could also try to create a new rails "environment" for this use, I guess.
The postgres documentation for logical replication suggests using "pg_dump --schema-only" as a starting point.
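That path is framework-agnostic (connection URIs are placeholders):

```shell
# Export only the DDL from the source, then load it into the new cluster
pg_dump --schema-only --no-owner "$SOURCE_URI" > schema.sql
psql "$TARGET_URI" -f schema.sql
```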
I am sure the pg_easy_replicate tool brings something to the table (probably replication lifecycle management), but reading the docs, bare Postgres replication looks just as easy, if not easier, than the easy-replicate tool. I am going to try it on my toy DB and see how it goes.
This looks like a script that basically just runs all the necessary SQL for you. It looks pretty similar to an internal Python tool I came up with to do the same thing (basically just scripting out setting up logical rep and waiting for it to catch up, then executing switchover commands)
That's right! It's mostly a CLI orchestration tool (mentioned in the readme), with some logging, stats, and safety checks so that the user/engineer can be mostly hands-off during the process. You can even automate the entire process by wiring different pg_easy_replicate commands together.
We do that with Tines, where we are fully hands-off (zero-touch ops) and can kick off upgrades with switchover for tens of DBs (one at a time for now :D).
I would prefer to not have to install a new language on the server. There's Python and Perl already; you would certainly cover nearly all Linux installations with these two.
If you had included any content or justification in your comment, it might have gone differently. "I don't like software written in Ruby" is not interesting to anyone. No polarization.
I like code written in Assembly. No problem with that. If it's written in Assembly in the first place (not disassembled from another language), it's usually not that hard to read...
https://jstaf.github.io/posts/pglogical/
pglogical is magic. Keeping an eye on this project for the future, looks great.