We got hit with a good bit of the postgres iceberg this week.
VACUUM by default runs with the TRUNCATE option. This will, if possible and worthwhile, truncate empty pages from the end of the heap and return them to the operating system. This operation takes an AccessExclusive lock, which blocks reads and writes. It is written to be very deferential and only runs if nothing else is trying to take a competing lock.
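For anyone who runs into the same thing, a rough sketch of the two knobs that control this behaviour (the table name is made up, and both require PostgreSQL 12+):

    -- Skip the truncation step for a single manual run
    VACUUM (TRUNCATE false) my_big_table;

    -- ...or turn it off for the table so autovacuum never attempts it
    ALTER TABLE my_big_table SET (vacuum_truncate = off);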
AccessExclusive is unusual among locks in that it is replicated to standbys. While the writer holds the lock, reads on the standbys are blocked. However, the writer knows nothing about those standby readers, so its conservative approach of only acquiring the lock if there's no contention is completely thwarted.
Finally, if the vacuum truncation is interrupted by lock contention, it must re-scan the heap to confirm the trailing blocks are still empty before proceeding with the truncation.
All of these details combined meant our writer got stuck in a "grab lock, scan, be interrupted, release lock, grab lock, restart scan, be interrupted, release lock... etc" loop. The replicated lock acquisitions seemed to arrive on the readers batched together, so the readers never got to run in between the loop iterations. The vacuum never actually succeeded because it kept re-scanning. We had to intervene manually to disable autovacuum and cancel the in-progress jobs.
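The intervention itself is nothing fancy; something along these lines does the job (table name is hypothetical again):

    -- Stop autovacuum from re-queuing the table
    ALTER TABLE my_big_table SET (autovacuum_enabled = off);

    -- Cancel any autovacuum worker currently chewing on it
    SELECT pg_cancel_backend(pid)
    FROM pg_stat_activity
    WHERE backend_type = 'autovacuum worker'
      AND query LIKE '%my_big_table%';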
We've hit plenty of weird postgres issues, but this was a new one for us.
Edit: also of note is a somewhat rare lack of clarity in the postgres docs. "VACUUM (without FULL)" is documented as only requiring a ShareUpdateExclusive lock, which is why autovacuum is considered safe to run. Turns out that's not true.
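You can actually watch this in pg_locks while a vacuum runs; once the truncate phase starts, a second, stronger lock shows up on the same relation for the same pid (table name hypothetical):

    SELECT pid, relation::regclass AS rel, mode, granted
    FROM   pg_locks
    WHERE  relation = 'my_big_table'::regclass;
    -- normally just ShareUpdateExclusiveLock for the vacuum backend; during
    -- the truncate phase an AccessExclusiveLock appears alongside it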
> so its conservative approach of only acquiring the lock if there's no contention is completely thwarted.
How is contention on the standbys communicated back to the writer? I'm confused about how the writer is interrupted mid-cycle if only the readers are experiencing lock contention.
Heh, this is pretty great. Is it good or bad that I know almost all of these?
Not sure "SERIAL is non-transactional" belongs at the bottom; I thought that was common knowledge... but that might be from years doing Oracle, where sequences are more explicit.
Many databases, including PostgreSQL, have a second reason “serial” sequences may have missing numbers. It's too inefficient to write to disk every time you get a new value from the sequence, so each allocation reserves a batch of consecutive values, not just one. The rest of the batch is then handed out from memory. If (when) the database crashes, the leftovers can't be restored to the sequence (since there's no finer-grained log of them), so on average the sequence will permanently skip half the batch size on each crash.
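For illustration, the batch size is something you can set explicitly with the CACHE clause (names below are made up); as far as I know PostgreSQL also logs a few dozen values ahead in the WAL even with the default CACHE 1, which is where the crash gaps come from:

    -- Each session reserves 50 values at a time; values reserved by a session
    -- that crashes or disconnects are skipped forever.
    CREATE SEQUENCE order_id_seq CACHE 50;

    CREATE TABLE orders (
        id     bigint PRIMARY KEY DEFAULT nextval('order_id_seq'),
        placed timestamptz NOT NULL DEFAULT now()
    );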
This is a pretty awesome way of summarising a topic. Perhaps an advanced version could offer an ELI5 explanation on hover-over. I'm in the first or second layer of the postgres version of the meme, yet I would gladly spend hours exploring the layers I don't yet know about (simply reading the names of the ideas in each layer was incredibly interesting, tbh).
I struggle to understand why the Halloween Problem is listed as a Postgres issue. The idea that an UPDATE would keep re-matching rows until none of them satisfy the WHERE condition is just categorically false for Postgresql.
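For anyone unfamiliar, the textbook case is an update that could push rows ahead of its own scan (table and column names are made up); Postgres's MVCC snapshot means the statement only ever sees row versions that existed when it started, so it can't re-find its own updates:

    -- Classic Halloween example: a 10% raise for everyone under 25k. A naive
    -- engine scanning a salary index and updating rows in place could keep
    -- rediscovering the same (now better paid) rows; Postgres does not.
    UPDATE employees
    SET    salary = salary * 1.10
    WHERE  salary < 25000;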
Yes, and they call it out as a concern, which it isn't with Postgresql.
> This could even cause an infinite loop in some cases where updates continually place the updated record ahead of the scan performing the update operation.
This just is not true for Postgresql, and they don't specify anywhere that they switched from talking about Postgresql to a different database.
So either they believe it's a concern with Postgresql, or they aren't being clear about when they're talking about different databases.
They didn't call it out as a concern; they said at the beginning of the section, "Halloween Problem is a database error that a database system developer needs to be aware of."
To be honest, just like you I also thought that it was related to PostgreSQL or other popular SQL databases somehow, until I re-read the section multiple times.