Hacker News new | past | comments | ask | show | jobs | submit login
PostgreSQL tips and tricks (gabrielweinberg.com)
148 points by swah on May 29, 2011 | hide | past | web | favorite | 27 comments

A lot of this advice is not very good, or at best is misleading.

1) Turning "enable_seqscan" off is almost never a good idea. If Postgres isn't using an index that you think it should, you should figure out the underlying reason why. This would be akin to killing a fly with a sledgehammer.

2) The main thing that is going to impact whether a given index is in memory is how often that index is used, not it's size. The advice about reducing the size of indexes is true (in that a smaller index uses less memory, which is always good), but if you have 10 indexes and you only actively use 1 of them, then those other 9 indexes aren't going be taking up much memory. The biggest issue is they are going to slow down your inserts.

3) Manually vacuuming isn't really necessary for newer versions of Postgres (>8.3 I believe, where autovaccum is enabled by default). There are times when you do need to VACCUM, but probably not that often in standard usage.

Here is some simple practice advice for postgres users:

1) If you are using an older version of Postgres, upgrade to Postres 9. It has a lot of huge improvements over previous versions.

2) Purchase & read Greg Smith's PostgreSQL 9.0 High Performance (as mentioned previously by another poster). This book is phenomenal.

1) I said that was "crazier" and pointed to the thread saying it was generally a bad idea. That is, I wasn't really endorsing it, just offering it as a last ditch option.

2) The idea (sorry if wasn't clear) but to make sure the indexes you routinely use (get hit by the site many times a second) are all in memory. In the past for me that has generally meant roughly all the indexes (in aggregate) fit in memory.

3) I have autovacuum on and find my daily vacuuming actually does a lot. Sure, I can probably tune it to be more effective, but it is easier just do put a daily vacuum in cron (and also a safeguard in case anything goes way out of wack).

On your points, I bought the book :)

Agreed! Most of this advice isn't that useful and in all honesty can be bad advice for many sites. Really "fast websites" use external caching to not hit the database at all.

The post wasn't about anything but the DB layer. I cache everything, but still when you hit the DB layer, you want it to return instantly if at all possible. I said specifically at the very top this wasn't a tuning or scaling guide.

As an aside, I'm reading Greg Smith's PG 9.0 High Performance, and it's an excellent in depth study of database optimization. The first 90 pages are bottom up: storage technologies and how to tune and benchmark them, memory, cpu, file system choices, and operating system tuning parameters. Once the fundamentals are in order, it covers the internals of PG in even more interesting detail.

Trivia: A fsync call on any file on ext3 will force a sync of not just that file but the whole volume; ext4 fixes that. If you buy large SATA drives, then partition them to "short stroke" using only the outside of the disk (discarding the slower spinning inner region) you get competitive performance to SAS disks for many workloads.

One of the best low level books I've found in a long time.

Is that book full of these trivias? I would love to read more trivias like that.

You may find http://blogs.msdn.com/b/oldnewthing/ interesting (a developer at Microsoft who blogs about Windows internals).

His books is a copy of his blog with just the internals blog posts.

Why the down vote? And why is that blog so interesting again?

It's a shame the book doesn't have a Kindle edition.

It could be your use of the word 'trivia.' At least in the US, it sometimes carries a negative connotation. It might read to some like you are being sarcastic.

ohh... sorry, English is not my native language that's why I didn't knew this. I wasn't being sarcastic, I actually liked the comment.

Thanks for the explanation astine :)

Also, trivia is a mass noun. "Lots of trivia like this", not "lots of trivias like this".

"Details" or "specifics" sounds less ... trivial. "Nitty gritty details" ("let's get down to the nitty gritty") is also OK.

Would one of a body of trivia be a "trivium?"

I dont personally work with postgres but you've convinced me to buy this book.

Awesome! Ordering now.

Postgres kicks ass.

Be very, very careful following the advice about disabling table scans ("enable_seqscan = false") — especially globally. In performance tuning terms, that's often the equivalent of swatting flies with a howitzer. (You can also set that on a query-by-query basis, which, depending on the query, may be more sensible.)

The fact is, though, that often-times, a table-scan is the most efficient query plan. Yes, indexes can speed things up, but they do so by increasing random IO (leaving completely aside the write-time penalty you pay for indexing). When your storage medium is spinning rust, increasing random IO will eventually — and much sooner than you think, at that — cost more than buffering an entire table into RAM and sorting/filtering there. Moreover, all that additional random IO will degrade the performance of the rest of your queries.

After many years using PostgreSQL too (8.2), I have other kind of advices. We didn't tweek much the configuration. Instead, we use date indexing, denormalization triggers and different kinds of materialization to have a good performance overall.

Date indexing: This was tricky in the beginning as a timestamp with timezone column can not be indexed (because the value is not fixed). So we use timestamp without timezone columns and handle zones another way. Then, for a big table with events that you want to keep for a long time, you can CREATE INDEX i_date ON events(ts::date). Then in all queries fetching for this table, make sure to add a WHERE ts::date = 'YYYY-MM-DD' clause, this way you will hit the date index and get fast queries even with billions of rows.

Denormalization triggers: Suppose you have a Russian puppets structure like usergroups, users, events, items. You need to access to items, and also most of the time filter by a 'type' value that is properly stored at usergroups level. The straight way is to do a quadruple join, but this can be slow. So we add a denormalized column 'denorm_type' in items, with two triggers: one trigger forbids direct update of 'denorm_type', another trigger reflects any change done at usergroups level in the 'denorm_type' column. This helps a lot.

Materialization is a higher level of denormalization than the one above. It is necessary to keep the 'real' data highly normalized, but most often one need to access to the data in a more user-oriented form, specially for reporting purposes. Views are excellent at that task, but they are computed in real-time. Materialization is the process to write some of these views on the disk, and index them, for faster access. With proper triggers, it is feasible to keep these materialized views safe (ie. read-only) and always up-to-date (any change in the base data triggers an update or insert in the materialized view).

While I agree with the second tip ("Replace a live table with DROP TABLE/ALTER TABLE") for ad-hoc stuff, its big disadvantage is that it requires knowledge of the indexes on the table. If you have one or more scripts to maintain that use this method on a particular table, and the indexes are changed in an unrelated update to your table structure, you have to make sure those changes are also reflected in your scripts.

Oracle allowed you to disable the indexes on a table (something like 'ALTER INDEX foo DISABLE'), which allowed them to become stale when data in the table was modified. You could do this right before a big import operation for example, and you could then rebuild (a different 'ALTER INDEX' syntax) them when the operation completed.

PostgreSQL doesn't appear to have an equivalent at the moment.

edit: wording

Doesn't PostgreSQL support something like:

    RENAME TABLE current TO current_old,
                 current_new TO current;
Which is the common pattern of atomically swapping out tables in MySQL with zero downtime.

Almost all DDL is transactional in PostgreSQL, so the usual equivalent to the above would be:

  alter table current rename to current_old;
  alter table current_new rename to current;
Transaction-safe DDL is a very handy thing indeed.

One thing that helped us a lot was setting

checkpoint_completion_target = 0.9 # up from 0.5 default

So the checkpoints get written out more evenly and cause less and shorter "stop the world" events. This was especially painful on Amazon EBS (which is a relatively bad idea by itself btw) with large shared memory size. Decreasing this actually improved performance for us.

Increasing shared buffers should not be so far down the list. It is the first, single biggest impact on performance that you can do. I think everyone knows that it's something that should be tuned right away, but I think it should very first on the list, since it's so easily overlooked by someone just starting out.

Almost always, a lightweight connection pooler like pgbouncer can do wonders. It is so easy to configure and have it up and running that there is really no reason not to use it.

Just make sure your developers don't use temporary tables in very strange ways. (like between different sessions. wtf.)

Good stuff! I just permanently archived a searchable copy.

Off topic, but some PostgreSQL love: after using DB2 on a customer job for the last 5 months I much more appreciate PostgreSQL which is the RDMS I usually use. It seems like PostgreSQL is developed and documented in such a way to make it easy to use while DB2 is designed and documented to optimize IBM consulting services revenue.

A few of these tips apply to mysql also.

Copy table from a tab delimited file. In mysql you can use load data infile which will do the exact same thing.

Indexes in memory obviously very important in mysql land too. Also when using joins make sure the columns being joined on are indexed.

Using innodb o_direct will prevent linux from swapping out the mysql process.

Mysql's default in memory table tables are very small which usually results in creating on disk tables, to prevent this increase tmp_table_size and max_heap_table_size. Alternatively you can specify mysql engine memory for in memory tables if you know how big they are going to be.

Using innodb o_direct will prevent linux from swapping out the mysql process.

This is close, but not exactly true. Linux is loathe to discard buffer cache (even with swappiness at zero) it will often choose to swap out MySQL code/data instead when faced with downward memory pressure.

Using O_DIRECT helps to not swap out MySQL because it doesn't load up the buffer pool with a bunch of already-cached-by-mysql data, but it doesn't prevent it. A big rsync job for example will still load every copied file into the buffer cache and potentially cause swapping.

If you want prevent the InnoDB buffer pool (or Oracle SGA) from ever being swapped out use huge pages; they can't be swapped out.

I've read that the split-LRU kernel patch (kernels >= 2.6.28) fixes a number of these bad swapping behaviors, but I haven't tried it myself.

I've been meaning to get around to playing with Python/PL and Postgres for a little while now

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact