Non blocking and zero downtime ALTER statements in PostgreSQL with pg-osc (shayon.dev)
138 points by postgresqlDB on March 6, 2022 | 23 comments



Nice tool! Shadow tables are quite convenient as they support arbitrary ALTER TABLE statements, but they also seem a bit wasteful to me. I'm working on a similar tool called Reshape (http://github.com/fabianlindfors/reshape), which instead uses temporary columns to reduce the time a migration takes and the extra storage necessary.


Hello! Author of the post here -

Yeah I agree, it's certainly a bit wasteful, especially during the operation. You can clean up the table automatically in the end with --drop. Love the concept and path with Reshape btw, I think it's very innovative.


Thanks. Reshape is quite promising.


Rather than plain SERIALIZABLE, maybe consider a SERIALIZABLE READ ONLY DEFERRABLE transaction to read the data. This is the only way in PostgreSQL to read from a table both without holding any locks, and also without suffering the potential of failed serialization (which would mean restarting the entire copy). [1]

Unfortunately this would mean you'd have to round-trip the data being copied through the client into a concurrent non-read-only transaction to write into the shadow table. However, I think you could avoid the client round-trip by exporting the snapshot from the read-only transaction using pg_export_snapshot() [2] and then, on the concurrent connection, copy in batches by alternating between a read-only transaction opened on this snapshot using SET TRANSACTION SNAPSHOT [3] in which you grab a WITH HOLD cursor on a LIMIT query (which copies the rows into an in-memory region), and a non-read-only transaction which actually writes to the shadow table. (The latter can be simply READ COMMITTED to avoid the possibility of serialization failure.)

And also of course you can't issue the DELETE on the audit table if the transaction is READ ONLY -- but instead of doing that, you could just read out the contents of the audit table (or the max serial), and skip those entries when applying the audit table later.

[1] https://www.postgresql.org/docs/current/transaction-iso.html...

[2] https://www.postgresql.org/docs/current/functions-admin.html...

[3] https://www.postgresql.org/docs/current/sql-set-transaction....
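A rough sketch of that snapshot hand-off, assuming hypothetical source_table/shadow_table names, a made-up batch size and cursor name, and whatever snapshot ID pg_export_snapshot() actually returns:

    -- Connection A: read the snapshot without holding locks or risking a
    -- serialization failure; keep this transaction open for the whole copy.
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
    SELECT pg_export_snapshot();   -- suppose it returns '00000004-0000006A-1'

    -- Connection B, once per batch: read through the exported snapshot into a
    -- WITH HOLD cursor, whose rows are materialized when the transaction commits.
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ READ ONLY;
    SET TRANSACTION SNAPSHOT '00000004-0000006A-1';
    DECLARE batch CURSOR WITH HOLD FOR
      SELECT * FROM source_table ORDER BY id LIMIT 10000;   -- next batch: add WHERE id > last_seen_id
    COMMIT;

    -- Connection B again, now in a plain READ COMMITTED transaction, so the
    -- write side cannot hit a serialization failure either.
    BEGIN;
    DO $$
    DECLARE
      cur refcursor := 'batch';      -- refers to the held portal by its name
      r   source_table%ROWTYPE;
    BEGIN
      LOOP
        FETCH cur INTO r;
        EXIT WHEN NOT FOUND;
        INSERT INTO shadow_table SELECT (r).*;
      END LOOP;
    END $$;
    CLOSE batch;
    COMMIT;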


Thanks for the suggestion, I will definitely take that into consideration. With SERIALIZABLE, the transaction and the lock that comes with it are currently harmless, but perhaps the design will change at some point in the future.


I haven't read the source code, but how does it handle materialized views? Views refer to source tables by relation id, not by name, so by default you end up with views pointing to old tables, which also means you can't drop them.

My use case is somewhat different. I have ~400M row tables which are not updated live, but I rebuild them from new source data, because it is faster that way (lots of columns, indices and FKs). There are also materialized views based on these tables, similarly with multiple indices.

I wrote some SQL scripts using information_schema, which prepare new tables for data import, rebuild indices and FKs, and then swap the tables. After that, the scripts recreate the materialized views from their definitions and swap them too. All of this happens without an ACCESS EXCLUSIVE lock, so it can still be used by the backend. It sucks, though. I wouldn't mind if there was a way to have views use table names, so I could just refresh them after swapping tables.
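A minimal sketch of one way to script that swap, with made-up object names (the real scripts use information_schema, as described above):

    -- Swap the freshly rebuilt table in under its old name.
    BEGIN;
    ALTER TABLE big_table RENAME TO big_table_old;
    ALTER TABLE big_table_new RENAME TO big_table;
    COMMIT;

    -- Views and materialized views still point at big_table_old, since they
    -- reference tables by relation id rather than by name. Pull their stored
    -- definitions so they can be dropped, recreated against the new big_table,
    -- and refreshed.
    SELECT matviewname, definition
    FROM pg_matviews
    WHERE schemaname = 'public';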


Very interesting. Just so I understand - are you asking how pg-osc updates the references in view(s) when it performs an ALTER on a table that is referenced by those view(s)?

If so, it actually doesn't handle that currently since, AFAIK, there is no good way to get the views updated without dropping and recreating them :(.


Yes, I was a bit unclear, but that was my question exactly. I hoped that there is some trick I don’t yet know. Dropping and creating these massive views is my nightmare.


One Weird Trick to avoid waiting ~forever to acquire exclusive locks when making Postgres schema changes (e.g. attaching/detaching partitions, which is where I usually need this):

1. begin a transaction;

2. rename your table (or the entire schema) to some temporary/anonymous name;

3. perform the DDL operation;

4. rename your table/schema back;

5. commit the transaction.

I have no idea why this works, but it does!
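Spelled out as SQL, with hypothetical table and partition names, the recipe above looks roughly like this:

    BEGIN;                                                   -- step 1
    ALTER TABLE measurements RENAME TO measurements_tmp;     -- step 2
    ALTER TABLE measurements_tmp
      DETACH PARTITION measurements_2022_02;                 -- step 3 (the DDL)
    ALTER TABLE measurements_tmp RENAME TO measurements;     -- step 4
    COMMIT;                                                  -- step 5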


I have an “I don’t know what I’m talking about but…” gut feeling that if it works the way you explain, then you are running into a bug and something isn’t working the way it should (as in, it should be slow, and being fast means something is wrong).


just a note that in newer versions of PG I believe partition changes no longer require an access exclusive lock on the parent table, which I'm looking forward to when we upgrade...


Worth noting that this approach uses the external-tool technique, which makes a copy of the table and keeps it in sync using triggers.

MySQL and MariaDB both support native online DDL, which makes ALTER statements non-blocking and zero-downtime in most cases, and even in-place (no whole-table data copy) in some cases.

pt-online-schema-change is still useful when you want control on when the tables are swapped over.


Historically, native online DDL in MySQL 5.6+ / MariaDB 10+ isn't replication-friendly -- despite being non-blocking on the primary, it blocks application of the replication stream on replicas, which makes it basically unusable at scale. (This is inherently a trade-off of MySQL/MariaDB using logical replication instead of physical replication of the tx log / WAL.)

The newer INSTANT algo in MySQL 8 and MariaDB 10.3+ solves this, but it is only usable for a limited subset of alter operations, such as adding a new column. That's one of the most common ALTER cases, so this feature is quite nice, but it certainly doesn't solve everything.
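For instance, an instant column add, with a hypothetical table and column:

    -- MySQL 8 / MariaDB 10.3+: metadata-only change, no table rebuild or copy.
    ALTER TABLE orders
      ADD COLUMN shipped_at DATETIME NULL,
      ALGORITHM = INSTANT;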

For this reason, external tools such as pt-online-schema-change are still pretty essential for MySQL/MariaDB deployments of any non-trivial size.

MariaDB 10.8, which is still pre-GA, adds a clever solution to the replication problem: https://jira.mariadb.org/browse/MDEV-11675 . It will be interesting to see if there are any real-world operational drawbacks to this approach, and whether MySQL offers this soon as well.


Given that PostgreSQL has transactional DDL, some of the planned safety features seem redundant.


Yeah, I have been mostly erring on the side of caution since the use cases can vary a lot. Transactional DDLs are a big savior for sure. Were there specific features that seemed redundant? I am always looking to iterate on this. - Author of the post
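For readers unfamiliar with transactional DDL, a small illustration with a hypothetical table and column:

    BEGIN;
    ALTER TABLE accounts ADD COLUMN balance_cents bigint;
    UPDATE accounts SET balance_cents = 0;   -- suppose this backfill turns out wrong
    ROLLBACK;                                -- the new column is rolled back too, atomically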


This is the same approach that Oracle uses with DBMS_REDEFINITION: https://docs.oracle.com/database/121/ARPLS/d_redefi.htm#ARPL...
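Roughly, the flow looks like this (schema and table names are hypothetical; the full parameter lists are in the linked docs):

    -- Online redefinition: copy into an interim table, catch up, then swap.
    BEGIN
      DBMS_REDEFINITION.START_REDEF_TABLE('app', 'orders', 'orders_interim');
      DBMS_REDEFINITION.SYNC_INTERIM_TABLE('app', 'orders', 'orders_interim');
      DBMS_REDEFINITION.FINISH_REDEF_TABLE('app', 'orders', 'orders_interim');
    END;
    /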

People that have never used the Oracle RDBMS give it grief because of Larry, the rep of the company, etc, which is a shame because the DB is great. This feature is more than 20 years old.

I’m pleased that Postgres is the best of the open source databases and that it’s leading as far as functionality goes.


> People that have never used the Oracle RDBMS give it grief because of Larry, the rep of the company, etc

It’s also not developer friendly, like at all. Starting with licensing, that insane installer, arcane configuration, documentation, error messages, standards conformity, column name limits. Our team hated every second of using it.


The patches. Jeebus the patches. Don't miss Oracle one bit.


> column name limits.

Those have been increased to 128 bytes with 12.2


Author of the post here - Thank you and appreciate the submission.


Does Postgres support online DDL like MySQL? I see the docs for this mention "pg-osc supports when a column is being added, dropped or renamed with no data loss", but MySQL supports this out of the box via the `ALGORITHM=INPLACE, LOCK=NONE` params. Just curious, as I'm currently learning about this.


Except for a few cases, it doesn't. This thread above on MySQL and online DDL is nice: https://news.ycombinator.com/item?id=30581314

The pg-osc doc says that since the ALTER statements can vary in nature, it supports the case where a rename is being done, so the data is preserved and synced in the order it's expected.


That’s pretty cool. I have used pt-online-schema-change on a few occasions in the past on MariaDB and it worked great.



