

PostgreSQL needs a new load balancer - eksith
http://www.databasesoup.com/2012/03/postgresql-needs-new-load-balancer.html

======
pilif
The article is from 2012, but its contents is still very true. The only
general purpose solution out there is pgpool, but from my own experience,
unfortunately, it's less than stable at times and it has its own interesting
quirks you have to live with (like having a SQL parser that lags behind
PostgreSQL 1-2 releases).

Then there are bugs. Sure: As they are discovered, they get solved quite
quickly, but if you are depending on certain functionality, it can still be
very annoying.

It's also touted as a HA solution, but from my own experience last year, I
greatly increased the overall stability of an installation by getting rid of
pgpool - there's too much magic going on in there.

Finally, I was quite unhappy about them silently fixing a case where an
unauthenicated client could request all the servers memory to be consumed. No
security announcement - not even a release was made.

Unless you are very willing to spend a lot of time working around quirks and
deeply familiarizing yourself with the source code, I would recommend against
pgpool at this time, which leaves us, again, without any general-purpose load
balancing solution. Other solutions have restrictions that make them quite
situational (no prepared statements, non-deterministic cleanup after the
client disconnects, etc).

~~~
jwr
The article is definitely still true. When using Postgres in larger systems
it's always a problem when the client asks "well what about redundancy?" —
there is no good answer to that, for reasons outlined in the article and the
post above.

~~~
pilif
That's entirely not true. Postgres has built-in replication and you can even
do synchronous replication if you want to pay for the latency - you can't have
more redundancy than this.

Doing the failover is trivially done using something like keepalived or
heartbeat as Postgres has built-in support for quickly promoting a slave to
the master.

What we don't have, however is good load balancing. Also, because load
balancing is f'ing hard to get right when you are in a master/slave
replication configuration because it's really hard to know what queries you
can safely send to the slave. After thinking quickly, here are some cases: use
of nextval/currval, use of select after insert/update in a transaction,
querying unlogged tables, and probably many more.

A good load balancer knows about all of this.

~~~
willvarfar
I think in spirit the post you were replying to meant load balancing?

------
roncohen
We ditched PgPool for PgBouncer for connection pooling from our webservers.
PgPool tries to do way to much, and the documentation is horrible. PGBouncer
been rock solid compared to PgPool.

I'd love for PgBouncer to poll the servers it's connected to and automatically
talk to the master, so in case of master failure I would only have to promote
a new master (and STONITH) and PgBouncer would auto-failover to the new
master.

~~~
jaytaylor
Yes, PgBouncer is a little better but could still be improved upon in
significant ways. E.g. Getting PgBouncer working with SSL connections- You
have to do it through a specific version of stunnel. Then stunnel always has
to be running in addition to PgBouncer. Not friendly or fun to setup.

~~~
theatrus2
Out of curiosity, are you using Postgres with SSL for an offsite or multi-DC
replica? Would switching to a site-to-site VPN be more efficient here?

~~~
stingraycharles
Wouldn't using site-to-site VPN be adding a SPOF just as well, namely the VPN
server (and, in addition to this, make the VPN server a bottleneck for
transfer speed) ?

~~~
vidarh
You can "easily" enough set up two VPN connections on separate pairs of
machines and route to a virtual IP on each end and use ucarp or keepalived to
have one or the other take over. It's not pretty, but it works.

And the VPN server may very well become a bottleneck for transfer speed at
some point, but most of us won't ever need to deal with a level of bandwidth
where that's an issue.

------
mkhpalm
I think the elephant in the room for postgres is its lack of bidirectional log
replication. Once that finally happens, I'm sure dead-simple balancing tools
will start popping up everywhere.

~~~
wahnfrieden
Is there progress on this front?

~~~
Someone
Is that what
[http://wiki.postgresql.org/wiki/BDR_Project](http://wiki.postgresql.org/wiki/BDR_Project)
is about? They aim to "implement main BDR features into core Postgres" in
versions 9.4. Looking at past history, that could be the third quarter of 2014
([http://en.wikipedia.org/wiki/PostgreSQL#Major_releases](http://en.wikipedia.org/wiki/PostgreSQL#Major_releases)),
but of course, that is just a guess.

------
Qantourisc
Clustering is very hard to do correct. Clustering while maintaining unique
auto_increments, solving split brain, rolling back transactions, ... while
maintaining speed: even harder. So the cluster is never going to be "simple",
might be simple to install, but never simple.

Side node: if you want load balancing, no replication, yes then it's easy to
do :)

------
rpedela
What about Postgres-XC for load balancing? I know it does several things, but
isn't that one of the use cases?

