
Postgres Autovacuum is Not the Enemy - narfz
https://www.citusdata.com/blog/2016/11/04/autovacuum-not-the-enemy/
======
bradknowles
The actual title of the article is "Postgres Autovacuum is Not the Enemy". The
word "Postgres" is a critical element here and should not be left off.

~~~
johncolanduoni
Yeah, I came up with some pretty dark stuff trying to imagine what an
"autovacuum" was.

~~~
devonharvey
I thought someone was finally sticking up for Roombas.

------
hyuen
The main problem here is that the autovacuum threshold is something like c + m *
nrows, and in large installations you can have all sorts of table sizes.

How much change is a lot? 1% of the table + 50 rows (for small tables)? I
would argue that sometimes it is better to use a fixed threshold, e.g. c = 1000,
m = 0.

All these approaches are hit or miss and differ per installation. What I
found useful is to choose the best parameters you can think of without
forcing autovacuum to run every time, and have an external job run vacuum
manually to clean up whatever got missed... eventually you can figure out the
right configuration.
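For concreteness, the c + m * nrows condition above can be sketched in a few lines of Python (the 50 and 0.2 defaults are Postgres's autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor; this is a rough sketch of the trigger condition, not the server's actual code):

```python
# Sketch of the autovacuum trigger condition:
#   dead tuples > autovacuum_vacuum_threshold
#                 + autovacuum_vacuum_scale_factor * reltuples

def vacuum_trigger(reltuples, threshold=50, scale_factor=0.2):
    """Dead-tuple count at which autovacuum kicks in for a table."""
    return threshold + scale_factor * reltuples

# With the defaults, a 100M-row table waits for ~20M dead tuples:
print(vacuum_trigger(100_000_000))                                   # 20000050.0
# The fixed threshold suggested above (c=1000, m=0) fires much sooner:
print(vacuum_trigger(100_000_000, threshold=1000, scale_factor=0))   # 1000
```

In practice the fixed-threshold variant would be applied per table via storage parameters (e.g. ALTER TABLE ... SET (autovacuum_vacuum_threshold = 1000, autovacuum_vacuum_scale_factor = 0)).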

~~~
mjevans
I think the idea for this is to either tune it for your problem cases or, if
it's an issue at your scale, divide and conquer your use case (small tables in
one physical DB, large tables in another).

------
pgaddict
I find the recommendation to leave the cost limit alone strange. The problem
is this is a global limit, shared by all autovacuum workers. The default (200)
means all workers combined should not do more than 8 MB/s reads or 4 MB/s
writes, which on current hardware are rather low limits. Increasing the number
of workers is good, but the total autovacuum throughput does not change -
there will be more workers but they'll go slower.

Also a note regarding the delay - the resolution really depends on hardware.
Some timers have only 10ms resolution, for example.

~~~
felixge
Interesting.

Are you sure that the limit is shared by all workers?

> When the accumulated cost reaches a limit (specified by vacuum_cost_limit),
> the process performing the operation will sleep for a short period of time,
> as specified by vacuum_cost_delay. Then it will reset the counter and
> continue execution.

\-- https://www.postgresql.org/docs/9.6/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST

Since every worker is its own process, I assume that the limit applies to
each worker, and overall vacuum throughput increases when using more workers.

That being said, your observation about the autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay being potentially too low may still apply.

Your throughput estimate for 8 MB/s (I think per worker) only applies for the
vacuum_cost_page_hit (default 1) case. For vacuum_cost_page_miss (default 10),
the throughput would only be 0.8 MB/s.

If my understanding of how a page miss is defined is correct, this will be
the common case unless one has increased shared_buffers from its default of
128 MB to something much larger. That's because Postgres counts a "page miss"
whenever the page is not in its shared buffers, even if it's in the host OS
page cache.

I might be wrong about the caches, so I'd love for somebody with more insights
to confirm/reject these assumptions.

 _Edit:_ Seems like your 8 MB/s estimate is based on the vacuum_cost_page_miss
case. But still, that's pretty low :)

~~~
pgaddict
I am pretty sure. Let me quote the official docs from
https://www.postgresql.org/docs/9.6/static/runtime-config-autovacuum.html#GUC-AUTOVACUUM-VACUUM-COST-LIMIT:

===========================================================
autovacuum_vacuum_cost_limit (integer)

    Specifies the cost limit value that will be used in automatic VACUUM operations. If -1 is specified (which is the default), the regular vacuum_cost_limit value will be used. Note that the value is distributed proportionally among the running autovacuum workers, if there is more than one, so that the sum of the limits for each worker does not exceed the value of this variable. This parameter can only be set in the postgresql.conf file or on the server command line; but the setting can be overridden for individual tables by changing table storage parameters.
===========================================================

So, that's pretty clear, I think. In practice the balancing happens in the
autovac_balance_cost() function. The autovacuum workers communicate through a
chunk of shared memory, and cost rebalancing is one of the things done through
it.
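A much-simplified sketch of that balancing (the real autovac_balance_cost() weights workers by their per-table cost settings; this assumes every worker runs with the defaults, in which case the split is even):

```python
# The configured cost limit is divided among active autovacuum workers
# so their *combined* I/O stays bounded. More workers => each one is
# throttled harder; total throughput does not grow.

def per_worker_limit(cost_limit, n_workers):
    """Cost budget each worker gets per delay round (even split)."""
    return cost_limit / max(n_workers, 1)

print(per_worker_limit(200, 1))  # 200.0 - one worker keeps the full budget
print(per_worker_limit(200, 3))  # each of three workers gets ~66.7
```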

Regarding the limits - yes, the 8MB/s is based on vacuum_cost_page_miss=10,
which means a read from the OS. The worker process wakes up every 20ms, i.e.
50x per second. As each round has 200 tokens, this means 10,000 tokens per
second. Assuming all of them are reads from disk/OS, we can do 1,000 of them
(because each costs 10), which is 1,000 x 8kB = 8MB/s of reads. OTOH writes
are about twice as expensive, leaving us only 500 writes, i.e. 4MB/s.
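That arithmetic as a quick Python check (using 1 MB = 1000 kB to match the rough figures above; the delay, limit, and cost defaults are from the Postgres docs):

```python
# Back-of-the-envelope autovacuum throughput under the default cost settings.
COST_DELAY_MS = 20    # autovacuum_vacuum_cost_delay: sleep between rounds
COST_LIMIT = 200      # tokens spent per round before sleeping
PAGE_KB = 8           # Postgres page size

rounds_per_sec = 1000 / COST_DELAY_MS          # 50 wakeups/second
tokens_per_sec = rounds_per_sec * COST_LIMIT   # 10,000 cost units/second

read_mb_s = tokens_per_sec / 10 * PAGE_KB / 1000   # page miss costs 10
write_mb_s = tokens_per_sec / 20 * PAGE_KB / 1000  # page dirty costs 20

print(read_mb_s, write_mb_s)  # 8.0 4.0
```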

~~~
felixge
I completely missed that.

So yeah, your comment is spot on. The default cost settings seem way too low.
I'll certainly try tuning them for our postgres servers :).

Thanks!

------
JulianWasTaken
Unreadable in portrait on mobile?

How does one even begin to think that blocking users from scrolling
horizontally on your site is a good idea?

~~~
barrkel
Indeed. overflow: hidden on the parent combined with float: left on the main
text - it's bizarre. Why would someone take the whole content of the page out
of the main flow like that?

It's also using some kind of javascript callbacks to get responsive placement
of the header, rather than media queries.

------
AznHisoka
What I've found is that with the default settings, if you're running over 150
million inserts/updates/deletes a day, your database is going to halt because
of transaction ID wraparound errors. Autovacuum simply can't catch up.

The solution for me was to batch inserts and updates in one transaction.
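The batching idea boils down to plain chunking, sketched below with the database calls stubbed out (with a real driver such as psycopg2, each chunk would become one multi-row INSERT committed inside a single transaction instead of one commit per row):

```python
# Group incoming rows into fixed-size batches so each batch shares one
# transaction/commit, instead of paying transaction overhead per row.

def batched(rows, batch_size=1000):
    """Yield rows in lists of at most batch_size."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

# 2,500 rows become three batches (and thus three commits, not 2,500):
chunks = list(batched(range(2500), batch_size=1000))
print([len(c) for c in chunks])  # [1000, 1000, 500]
```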

------
ckdarby
Site text gets cut off on mobile

