I decided to start again from scratch. I discovered a handy little utility called pghero and set up its internal monitoring.
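For anyone curious, pghero ships as a Docker image, so getting it pointed at a database is a one-liner. The connection string below is a made-up placeholder; substitute your own credentials:

```shell
# Illustrative only -- DATABASE_URL is a placeholder, not a real instance.
docker run -ti -e DATABASE_URL=postgres://user:password@host:5432/dbname \
  -p 8080:8080 ankane/pghero
# Then browse to http://localhost:8080 for the dashboard.
```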
I backed up my old .conf, replaced it with a default, blew away all my old indices, and fired up the database. Of course, nothing worked well at first. But step by step, working with Postgres's internal tools and pghero, I took a fresh look at every previous assumption and was much more strategic about where and when I created an index and which tuning parameters I tweaked.
Long story short, I was able to knock 30 GB off the install's disk size (much of it unnecessary duplicate indices), accelerate the most common queries, and everything ran a lot more smoothly. Plus my Postgres installation was no longer fighting old .conf tuning tweaks that had been designed for much older versions of Postgres.
Just adding my vote for revisiting your old tuning parameters after your database (and your programmer/administrator) have matured a bit.
Adding indexes only after careful use of EXPLAIN/EXPLAIN ANALYZE on expected usage patterns is what yielded good results. Often we'd just blow the indexes away and fiddle until we'd added the right one to enable an index scan / index-only scan.
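That workflow might look something like this; the table and column names are invented for illustration:

```sql
-- Check how the planner handles a common query.
EXPLAIN ANALYZE
SELECT order_id, total
FROM orders
WHERE customer_id = 42;

-- If the plan shows a Seq Scan, try a covering index (INCLUDE needs PG 11+):
CREATE INDEX orders_customer_id_idx ON orders (customer_id) INCLUDE (total);

-- Re-run the EXPLAIN ANALYZE above and look for an Index Only Scan.
```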
Another big boon for us has been using ENUM types where appropriate (a small, known set of values for a column): the column then takes 4 bytes instead of N bytes for a string. I find them a bit easier to work with than foreign keys for this optimization because of their direct string mapping.
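A minimal sketch of the ENUM approach, with hypothetical names:

```sql
-- A small, fixed set of values for a column.
CREATE TYPE order_status AS ENUM ('pending', 'shipped', 'delivered', 'cancelled');

CREATE TABLE orders (
    id     bigserial PRIMARY KEY,
    status order_status NOT NULL DEFAULT 'pending'  -- stored as 4 bytes
);

-- Values still read and compare as plain strings:
SELECT id FROM orders WHERE status = 'shipped';
```

The direct string mapping is the draw: no join against a lookup table just to see a human-readable value.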
We've had really good mileage with Datadog's APM tools, RDS Performance Insights, and pghero, as you mentioned.
I'm my own worst enemy when it comes to indexes, because I'm not only the database administrator but also the analyst. So when I'm trying to solve an ad-hoc problem and hit a long-running query, there's a 99% chance I've "accidentally" created a new index to accelerate that query and then forgotten to remove it when I was done.
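One way to catch those forgotten ad-hoc indexes is to query the statistics collector for indexes that have never been scanned since the last stats reset (a sketch; interpret `idx_scan = 0` with care, since recently created indexes and replicas also show zero):

```sql
-- Indexes with no recorded scans, largest first -- cleanup candidates.
SELECT schemaname, relname, indexrelname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```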
Lesson learned, and also a good argument for disaggregating the administrative and analytical user permissions.
Can you give an example?
It sounds like normalising the schema would have similar (perhaps greater) benefits.
I recognize this would be a much different beast when it comes to a database that supports a hundred thousand users or something. However, PostgreSQL does have the ability to do hot standby and streaming replication, so it's likely not as difficult as one might imagine.
True, though you can tweak settings for when the cache writes out dirty pages. Percona has a blog post specific to Postgres that covers it.
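The knobs in question are mostly the checkpoint and background-writer settings. The values below are illustrative starting points only, not recommendations for any particular workload:

```
# postgresql.conf -- illustrative values, tune for your own hardware/workload
checkpoint_timeout = 15min            # spread checkpoints further apart
checkpoint_completion_target = 0.9    # write dirty pages out gradually
max_wal_size = 4GB                    # avoid forced checkpoints during bursts
bgwriter_lru_maxpages = 200           # cap pages written per bgwriter round
```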
There's also some other tuning there that isn't mentioned in the story linked on HN.
It's a combination of shared_buffers and the kernel's page cache size.
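In config terms, that interplay shows up as two settings: shared_buffers controls Postgres's own cache, while effective_cache_size is just a planner hint about how much total caching (Postgres plus OS page cache) is likely available. Common rules of thumb, not gospel:

```
# postgresql.conf -- rule-of-thumb starting points for a dedicated box
shared_buffers = 4GB          # often ~25% of RAM
effective_cache_size = 12GB   # planner estimate of shared_buffers + OS cache;
                              # allocates nothing, only influences plan choice
```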
This important configuration parameter is too often badly understood. IMHO it should be periodically and dynamically recalculated by the engine, at least on a server not dedicated to PG.
Learning about what goes on under the hood before you blindly change settings some tool spits out doesn't seem like a bad idea, I'd say.
The values in the post are pretty arbitrary, and about as useless as the comments in postgresql.conf used to be.
I have been using PostgreSQL in production since 7.4, and the best source of performance tuning tips for me has been PostgreSQL 9.0 High Performance by Gregory Smith. These days the PostgreSQL wiki is pretty good as well.
They do different things - pgTune tells you how to configure your system, this blog post tells you why.
Any comparisons and contrasts available?
I'm not running a production instance so tuning is gratuitous but nice.