The fundamental problem is that a synchronous, lock-based approach is being used in an asynchronous, event-driven context (logical replication).
The Postgres team needs to go back to the drawing board and avoid polling altogether and more importantly have event-listener based approaches for primary and replicas separately
It's great to see ClickHouse contributing to Postgres though. Cross-pollination between two large database communities can have a multiplying effect.
> The fundamental problem is that a synchronous, lock-based approach is being used in an asynchronous, event-driven context (logical replication).
This isn't hit in ongoing replication, this is when creating the resources for a new replica on the publisher. And even there the poll based thing is only hit due to a function being used in a context that wasn't possible at the time it was being written (waiting for transactions to complete in a hot standby).
> The Postgres team needs to go back to the drawing board and avoid polling altogether and more importantly have event-listener based approaches for primary and replicas separately
Armchair software engineers are so incredibly annoying.
Postgres codebase is a shining example of good engineering. They walk a very fine line between many drawbacks. Code is battle tested (and surprisingly readable). Not many people (or groups) can do much better - and if they think they can, they are probably going to learn a lesson in humility.
> Postgres codebase is a shining example of good engineering. They walk a very fine line between many drawbacks. Code is battle tested (and surprisingly readable). Not many people (or groups) can do much better - and if they think they can, they are probably going to learn a lesson in humility.
To be fair, we have/PG has plenty crud-y code... Some of it even written by yours truly :(. Including some of the code involved here.
But just saying "go back to the drawing board" without really understanding the problem isn't particularly useful.
I'd be happy to make the changes to the codebase but there's definitely scope for improvement. Putting a sleep to avoid repeated lock acquisition should have raised an eyebrow or two when it was implemented.
As mentioned, there should be a read-replica specific code that works using event listeners. Repurposing the primary's code is what causes the issue.
The following pseudocode would give a better picture on the next steps
// Register to receive WAL notifications about this transaction
RegisterForWALNotification(xid);
while (TransactionIdIsInProgress(xid)) {
// Wait for WAL record indicating transaction completion
WaitForWALEvent();
CHECK_FOR_INTERRUPTS();
// Process any incoming WAL records
ProcessIncomingWAL();
}
UnregisterForWALNotification(xid);
}
The Postgres team needs to go back to the drawing board and avoid polling altogether and more importantly have event-listener based approaches for primary and replicas separately
It's great to see ClickHouse contributing to Postgres though. Cross-pollination between two large database communities can have a multiplying effect.