Visualizing PostgreSQL Vacuum Progress (dtrace.org)
73 points by helper on May 23, 2019 | 3 comments

Jim Starkey, the man who probably invented MVCC back in the 80s, had some interesting thoughts on this topic. Firebird, a fork of InterBase, used to rely on visiting transactions to clean up garbage. This meant that old versions of a record would hang around until a query pulled the record up, saw that there were garbage versions, and deleted them. At some point Firebird introduced a dedicated garbage collection thread to do the job instead, which to me sounds a bit like how vacuum works. Here is what he had to say about it:

...but the garbage collect thread is an unmitigated disaster. The theory of cooperative garbage collection is that a) garbage collection only takes place when the page has already been read, minimizing additional I/O, and b) it keeps the length of record chains to the theoretical minimum. The garbage collect thread not only defeats both of these, but is one of those rare features that the higher the load, the worse it works -- under load record chains get longer, which increases the cost of index and blob updates as all old record versions must be fetched and scanned, which increases the load even more, causing the garbage collect thread to fall farther and farther behind, increasing the load even more -- well, you get the picture.
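
For anyone who hasn't seen the cooperative approach before, here is a rough toy sketch of the idea in Python. It is not Firebird's actual code: the `Record` and `Version` classes, the `oldest_active` parameter, and the "visible if txn_id <= snapshot" rule are all simplifications I made up for illustration, and it assumes every transaction commits.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    txn_id: int   # id of the transaction that wrote this version (assumed ordered)
    value: str


@dataclass
class Record:
    # Version chain, newest first (hypothetical layout for this sketch).
    chain: list = field(default_factory=list)

    def write(self, txn_id, value):
        # A new version goes to the head of the chain; old ones stay behind it.
        self.chain.insert(0, Version(txn_id, value))

    def read(self, snapshot, oldest_active):
        """Return the value visible to `snapshot`, pruning garbage on the way.

        The reader has already fetched the record, so pruning here costs
        no extra I/O. That is the "cooperative" part.
        """
        self._collect(oldest_active)
        for v in self.chain:
            if v.txn_id <= snapshot:
                return v.value
        return None

    def _collect(self, oldest_active):
        # Keep everything newer than the oldest active snapshot, plus the
        # newest version that snapshot can still see. Anything older can
        # never be read again, so it is garbage.
        for i, v in enumerate(self.chain):
            if v.txn_id <= oldest_active:
                del self.chain[i + 1:]
                return
```

In this toy model every read trims the chain down to what some active snapshot could still need, so chains stay short exactly on the records that are being hit. A background thread would run the same `_collect` logic on its own schedule instead, which is where the fall-behind-under-load problem in the quote comes from.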


I regret I have but one upvote to give for this comment! In the hallway conversations around this problem at its most abstract, we have often used the GC analogy (aside: I am opposed to GC in systems software for essentially this same reason); it's good to know that this observation was being made by domain experts a long time ago! Thanks for the link!

Thanks, I'm glad you liked it. He's a very interesting man, in my opinion.
