
PostgreSQL performance considerations - mattyb
http://robots.thoughtbot.com/post/2638538135/postgresql-performance-considerations
======
pgr0ss
Here are a few extra tips:

\- "select relname, indexrelname, idx_scan from pg_stat_user_indexes" will
show how many times an index is used, so you can remove unused indexes

\- you can create indexes without locking tables (ie, while your site is up)
by using "create index concurrently"

\- you should use the autovaccuum daemon instead of manually running vacuum
analyze in most cases

------
acconrad
This is particularly important if you are working on Heroku. I just made the
switch from MySQL to Postgres and while they both run SQL, they have numerous
syntactical and performance considerations that need to be addressed so you
can transition into the Heroku cloud easier.

It's definitely worth considering for 9 and up, as at least when I was in grad
school for databases (~2008), MySQL outperformed Postgres.

~~~
torial
>MySQL outperformed Postgres.

For my part, I thought Postgres was slow until 5 months ago (when I came from
a port from SQL Server).

Then I learned something! The default install of Postgres is slow (at least on
Windows). And on benchmarks against the baseline SQL Server port I had, it was
about 2x slower. BUT there were a number of things I could do to make it
faster: a) Use the EnterpriseDB wizard for tuning Postgres --> resulted in a
2x to 3x performance improvement. b) Adjust the way I did updates (there is a
way to get Postgres updates to be fast, it just takes some massaging) c) Get
rid of views when possible -- I found that certain query compilation
optimizations I had taken for granted on SQL Server weren't available.

Doing all this resulted in a system that performed faster and w/ less RAM than
SQL Server(and I had spent a bunch of time tuning the SQL Server perf in the
first place!)

~~~
chunkbot
I'm liking the new trend of making databases "fast" by default, with "safe"
configurations. People who care about safety should take the time make it so;
people who want it fast, well, give it to them faster! So maybe PostgreSQL
should offer faster defaults?

~~~
hgimenez
Not sure what you mean by "safe" by default. Speed and performance have
nothing to do with security, if that's what you mean.

The reason that the default Postgres configuration does not perform well is
that Postgres runs on a number of operating systems, and it is not possible to
make assumptions about the hardware and configuration of the host. For
example, increasing configuration parameters like shared_buffers to any useful
value will require you to tweak the OS'es kernel resources. The Postgres
documentation does a great job at describing the process on a number of
operating systems:

[http://www.postgresql.org/docs/current/interactive/kernel-
re...](http://www.postgresql.org/docs/current/interactive/kernel-
resources.html)

~~~
andrewf
Plenty of software manages to share information between processes without
using SysV shared memory, and therefore being subject to these kernel resource
limits.

I'm not saying this should necessarily be fixed - I'm sure the Postgres team
has better things to be working on.

But it's not a natural, unavoidable limitation, either.

------
zzzeek
I can vouch for the inner joins blowing outer joins out of the water. Setting
up your postgresql.conf appropriate to its environment is helpful as well,
[http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Serve...](http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server)
has a good deal of explanation.

~~~
adamzochowski
But they have quite different meaning, one can't just replace OUTER with
INNER.

    
    
      Table A has m rows
      Table B has n rows
    

all rows return a match (aka: join on 1=1)

    
    
      OUTER JOIN will return m*n rows
      INNER JOIN will return m*n rows
    

none rows return a match (aka: join on 1=0)

    
    
      OUTER JOIN will return m+n rows
      INNER JOIN will return 0 zero rows
    

INNER join enforces that match has to exist.

OUTER join doesn't.

~~~
zzzeek
in my case its about the SQLAlchemy ORM which uses by default OUTER JOIN to
load a set of parent objects and their related objects. If the foreign key on
the parent is NOT NULL, you can replace the OUTER JOIN with INNER. So yes
clearly OUTER and INNER are different but its sometimes the case that the use
of OUTER is unnecessary, other times the case that a query can be restructured
to not require OUTER (such as, using an EXISTS to check for "no parent rows
found" instead of OUTER JOIN and a NULL check).

------
ghotli
Put your write ahead logs on a mirrored set of ssds as long as your write
patterns can handle a build up of logs before archiving to slower storage.
This gets the fsync out of the way so the application will block for a shorter
amount of time.

Postures relies heavily on disk cache, so be sure to tune shared memory small
enough to allow the kernel to cache commonly used disk pages. ZFS on FreeBSD
and OpenSolaris has the ARC as a second level of caching between disk cache
and slow disk IO. Te Adaptive Readahead Cache can take advantage of some ssds
to up access for your hotspots.

~~~
ghotli
Answering questions on the iPad before I go to sleep is a bad idea. My
apologies on the misspellings.

------
duskwuff
A bunch of these performance tips are equally applicable to any SQL database.
MySQL doesn't have partial indexes or GiST, for instance, and it uses ANALYZE
TABLE instead of VACUUM ANALYZE, but the rest of the tips apply to it too.

