
Practical Guide to PostgreSQL Optimizations - omarish
https://tech.lendinghome.com/practical-guide-to-postgresql-optimizations-d7b9c2ad6a22#.d4s9e879m
======
sbuttgereit
The work_mem section is incorrect. The article says the configuration is per
worker, but it's actually per operation which requires work_mem... any given
worker can have multiple operations running in parallel. So actually getting a
good, but safe, value is more sensitive than the article suggests.

[https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#GUC-WORK-MEM](https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#GUC-WORK-MEM)

 _work_mem (integer)

Specifies the amount of memory to be used by internal sort operations and hash
tables before writing to temporary disk files. The value defaults to four
megabytes (4MB). Note that for a complex query, several sort or hash
operations might be running in parallel; each operation will be allowed to use
as much memory as this value specifies before it starts to write data into
temporary files. Also, several running sessions could be doing such operations
concurrently. Therefore, the total memory used could be many times the value
of work_mem; it is necessary to keep this fact in mind when choosing the
value. Sort operations are used for ORDER BY, DISTINCT, and merge joins. Hash
tables are used in hash joins, hash-based aggregation, and hash-based
processing of IN subqueries._
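The per-operation behavior quoted above is what makes sizing tricky: the worst case scales with both the number of sort/hash nodes per query and the number of concurrent sessions. A back-of-the-envelope sketch (function name and numbers are illustrative, not PostgreSQL's actual accounting):

```python
def worst_case_work_mem_bytes(work_mem_mb, ops_per_query, max_connections):
    """Rough upper bound: each sort/hash operation in each running query
    may claim up to work_mem before spilling to temporary files."""
    return work_mem_mb * 1024 * 1024 * ops_per_query * max_connections

# 64MB work_mem, ~4 sort/hash nodes per query, 100 connections:
total = worst_case_work_mem_bytes(64, 4, 100)
print(total / (1024 ** 3))  # 25.0 -- ~25 GiB, far more than "64MB" suggests
```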

~~~
pgaddict
It's not the only section that's incorrect. The effective_cache_size is wrong
too, shared_buffers is misleading (e.g. new PostgreSQL versions are not
subject to SysV kernel limits), and so on.

The recommendation to set low checkpoint_segments during restore is a bit
strange. Also, not a single word about checkpoint_timeout (when the
configuration should aim for timed checkpoints in the first place).

Haven't read further.

~~~
irrational
Do you have a recommendation of a similar type of post that has correct
information? I'm just getting into PostgreSQL so I don't know what the best
resources are.

~~~
sbuttgereit
I would say this is my favorite resource.

[https://www.packtpub.com/big-data-and-business-intelligence/postgresql-90-high-performance](https://www.packtpub.com/big-data-and-business-intelligence/postgresql-90-high-performance)

Note that, as published, it's a bit dated, but by and large much of it is still
applicable even today. Once you learn the basics, the PostgreSQL docs and
release notes can catch you up on what's different.

------
ceyhunkazel
I recommend following Christophe Pettus - PostgreSQL Proficiency for Python
People - PyCon 2016 video for postgresql basics and optimization

[https://www.youtube.com/watch?v=knUitQQnpJo](https://www.youtube.com/watch?v=knUitQQnpJo)

~~~
ceyhunkazel
Here are my notes from the video:

1) For Debian, use pg_createcluster instead of initdb:

pg_createcluster 9.5 main -D /data/9.5/main -E UTF8 --locale=en_US.UTF-8 -- -k

[https://wiki.debian.org/PostgreSql](https://wiki.debian.org/PostgreSql)

[https://makandracards.com/makandra/18643-how-the-change-the-locale-of-a-postgresql-cluster](https://makandracards.com/makandra/18643-how-the-change-the-locale-of-a-postgresql-cluster)

2) Create a separate volume/partition for the database data. Do not put the
version number in the mountpoint (/data, not /data/9.5); create a folder with
the version number inside it instead. Use EXT4 or XFS for the filesystem (ZFS
is extra credit for experts). EXT4 is on average a little faster than XFS, but
its standard deviation is higher.

3) postgresql.conf - most server settings. pg_hba.conf - who gets to log in to
which databases?

4) Important parameters:

    a) Logging
    b) Memory
    c) Checkpoints
    d) Planner

5) Logging: use the csvlog format.

[http://cheng.logdown.com/posts/2016/04/08/enable-logging-in-postgresql](http://cheng.logdown.com/posts/2016/04/08/enable-logging-in-postgresql)

[http://bonesmoses.org/2014/08/05/on-postgresql-logging-verbosity/](http://bonesmoses.org/2014/08/05/on-postgresql-logging-verbosity/)

log_destination = 'csvlog'
logging_collector = on
log_min_duration_statement = 250ms
log_checkpoints = on
log_lock_waits = on
log_temp_files = 0

[https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Serv...](https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server)

6) shared_buffers (cache for data). Sets the amount of memory the database
server uses for shared memory buffers. If total system memory is below 64GB,
set it to 25% of total memory; above 64GB, set it to 16GB.
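The rule of thumb above can be written out as a tiny sketch (the function name is mine, not a PostgreSQL API):

```python
def shared_buffers_gb(total_ram_gb):
    """Rule of thumb from the talk: 25% of RAM, capped at 16GB
    once total memory exceeds 64GB."""
    return min(total_ram_gb * 0.25, 16.0)

print(shared_buffers_gb(32))   # 8.0
print(shared_buffers_gb(128))  # 16.0
```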

7) work_mem (affects slow joins). Specifies the amount of memory to be used by
internal sort operations and hash tables before writing to temporary disk
files. The value defaults to four megabytes (4MB). Note that for a complex
query, several sort or hash operations might be running in parallel; each
operation will be allowed to use as much memory as this value specifies before
it starts to write data into temporary files. Also, several running sessions
could be doing such operations concurrently. Therefore, the total memory used
could be many times the value of work_mem; it is necessary to keep this fact
in mind when choosing the value. Sort operations are used for ORDER BY,
DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based
aggregation, and hash-based processing of IN subqueries.

Start low: 32-64MB. Look for 'temporary file' lines in the logs, then set
work_mem to 2-3x the largest temp file you see. Can cause a huge speed-up if
set properly! But be careful: it can use that amount of memory per planner
node.
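That tuning loop can be sketched as follows (hypothetical helper, assuming log_temp_files = 0 so every temp file is logged):

```python
def suggested_work_mem_mb(largest_temp_file_mb, factor=2.5, floor_mb=32):
    """Set work_mem to roughly 2-3x the largest 'temporary file' size seen
    in the logs, but never below the conservative 32MB starting point."""
    return max(floor_mb, int(largest_temp_file_mb * factor))

print(suggested_work_mem_mb(100))  # 250 -- 2.5x a 100MB temp file
print(suggested_work_mem_mb(5))    # 32  -- stays at the floor
```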

8) maintenance_work_mem (integer). Specifies the maximum amount of memory to be
used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE
ADD FOREIGN KEY. It defaults to 64 megabytes (64MB). Since only one of these
operations can be executed at a time by a database session, and an
installation normally doesn't have many of them running concurrently, it's
safe to set this value significantly larger than work_mem. Larger settings
might improve performance for vacuuming and for restoring database dumps.

Note that when autovacuum runs, up to autovacuum_max_workers times this memory
may be allocated, so be careful not to set the default value too high. It may
be useful to control for this by separately setting autovacuum_work_mem.

* 10% of system memory, up to 1 GB. Maybe even higher if you are having VACUUM problems.
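As a sketch of that guideline (illustrative function, values in MB):

```python
def maintenance_work_mem_mb(total_ram_gb):
    """Guideline from the talk: 10% of system memory, capped at 1 GB."""
    return min(int(total_ram_gb * 1024 * 0.10), 1024)

print(maintenance_work_mem_mb(8))   # 819  -- ~10% of 8GB
print(maintenance_work_mem_mb(64))  # 1024 -- hits the 1GB cap
```

Remember the autovacuum caveat above: up to autovacuum_max_workers of these allocations can exist at once.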

9) effective_cache_size should be set to an estimate of how much memory is
available for disk caching by the operating system and within the database
itself, after taking into account what's used by the OS and other
applications. This is a guideline for how much memory you expect to be
available in the OS and PostgreSQL buffer caches, not an allocation! This
value is used only by the PostgreSQL query planner to figure out whether the
plans it's considering would be expected to fit in RAM or not. If it's set too
low, indexes may not be used for executing queries the way you'd expect. The
setting for shared_buffers is not taken into account here--only the
effective_cache_size value is, so it should include memory dedicated to the
database too. Setting effective_cache_size to 1/2 of total memory would be a
normal conservative setting, and 3/4 of memory is a more aggressive but still
reasonable amount. You might find a better estimate by looking at your
operating system's statistics. On UNIX-like systems, add the free+cached
numbers from free or top to get an estimate.

* Set it to the amount of file system cache available.

* If you don't know, set it to 75% of total system memory (check with free -m).
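Both estimation paths can be sketched together (hypothetical function; values in GB, free/cached as reported by `free`):

```python
def effective_cache_size_gb(total_ram_gb, free_gb=None, cached_gb=None):
    """Estimate effective_cache_size: use free+cached from the OS when
    known, otherwise fall back to 75% of total system memory."""
    if free_gb is not None and cached_gb is not None:
        return free_gb + cached_gb
    return total_ram_gb * 0.75

print(effective_cache_size_gb(64))         # 48.0 -- the 75% fallback
print(effective_cache_size_gb(64, 4, 40))  # 44   -- free + cached
```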

------
pella
simple tool: [http://pgtune.leopard.in.ua/](http://pgtune.leopard.in.ua/)

~~~
theprotocol
Interesting! Thanks for posting this.

------
DrJokepu
SHMMAX basically stopped being an important factor in PostgreSQL 9.3 (released
almost 3 years ago) and later.

------
olalonde
I'm currently reading "Designing Data-Intensive Applications" by Martin
Kleppmann, and it has an interesting chapter on database storage engines,
including PostgreSQL (e.g. the B-trees section). I would definitely recommend
it if you'd like a high-level overview of how things work under the hood (it's
a good mix of theory and practice). It also helps you understand why and when
optimisations work.

