An important consideration is that increasing the pool size to many times the number of cores is very often counterproductive, especially as IO latency keeps decreasing. Also, there is often no justification for anything other than a constant-sized pool.
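As a rough illustration, the commonly cited starting point (from the PostgreSQL wiki's pool-sizing advice, as far as I know) keeps the pool near the hardware's parallelism rather than in the hundreds; a minimal sketch with placeholder numbers:

```python
# Heuristic starting point for a fixed-size pool, often quoted as
# pool_size = (core_count * 2) + effective_spindle_count.
# Both inputs below are illustrative placeholders, not measurements.
core_count = 8
effective_spindle_count = 2  # approaches 0 when the data set is fully cached

pool_size = core_count * 2 + effective_spindle_count
print(pool_size)  # 18 -- nowhere near "many times the number of cores"
```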
Removing it altogether made monitoring and debugging performance issues easier, and client-side pools were essentially "good enough" for our purposes.
Including pgbouncer in the base package would just mask the issue, because pgbouncer mainly saves you from short-lived connections.
Prepared statements work fine with PgBouncer when you have session pooling enabled (as opposed to transaction pooling).
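For reference, the distinction lives in pgbouncer.ini's pool_mode; a minimal sketch (host, port, and database name are placeholders):

```ini
[databases]
; placeholder upstream server; adjust for your setup
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_port = 6432
; session: one server connection per client session, so session state
; such as prepared statements survives between queries.
; transaction: the server connection is returned after each transaction,
; so server-side prepared statements are not preserved.
pool_mode = session
```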
It may not save you from all the pain of dealing with connections, but PgBouncer definitely helps a lot when you're working with remote Postgres servers and high latency (i.e., > 10ms). Mitigating the delay and cost of connection initiation is pretty huge.
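One way to see the effect is to time a fresh connection against reusing an already open one; a rough sketch assuming psycopg2 (the DSN is a placeholder):

```python
import time

import psycopg2  # assumption: psycopg2 is the client library in use

DSN = "host=db.example.com dbname=app user=app"  # placeholder DSN

# Fresh connection: pays TCP (and possibly TLS) setup plus auth round trips,
# which dominates once network latency exceeds ~10ms.
t0 = time.perf_counter()
conn = psycopg2.connect(DSN)
print("new connection:", time.perf_counter() - t0)

# Reused connection: only the query round trip remains.
t0 = time.perf_counter()
with conn.cursor() as cur:
    cur.execute("SELECT 1")
    cur.fetchone()
print("query on warm connection:", time.perf_counter() - t0)
conn.close()
```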
In my case that is true, but for a PHP webapp (for example) it is a lot less true.
Some things that'd help improve the argument:
1. Account for table/row locking in the benchmark: is it a factor or not?
2. Re-run the benchmark with some of the proposed solutions, e.g. local connection pooling and PgBouncer (a pooling sketch follows this list).
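For item 2, a minimal client-side pooling sketch using psycopg2's built-in pool (DSN and pool sizes are placeholders):

```python
from psycopg2 import pool  # assumption: psycopg2 is available

# Fixed-size local pool: one connection opened eagerly, at most ten total.
db_pool = pool.SimpleConnectionPool(
    minconn=1,
    maxconn=10,
    dsn="host=127.0.0.1 dbname=bench user=bench",  # placeholder DSN
)

conn = db_pool.getconn()
try:
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        print(cur.fetchone())
finally:
    db_pool.putconn(conn)  # return the connection to the pool, don't close it

db_pool.closeall()
```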
Another thing to be aware of with postgres is that connections in an error state (for example, due to trying to insert a row that violates a constraint) must be rolled back before they can be recycled.
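With psycopg2, for example, the reuse path has to roll back after a failed statement, or the connection is stuck in the aborted-transaction state (table and DSN are placeholders):

```python
import psycopg2

conn = psycopg2.connect("dbname=app")  # placeholder DSN

try:
    with conn.cursor() as cur:
        # Suppose this violates a unique constraint:
        cur.execute("INSERT INTO users (id) VALUES (1)")
    conn.commit()
except psycopg2.Error:
    # Without this rollback, every later statement on this connection fails
    # with "current transaction is aborted, commands ignored until end of
    # transaction block" -- so roll back before returning it to a pool.
    conn.rollback()
```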
Pooling introduces numerous other issues as well. For example, clients probably shouldn't all be using the same entitlements to connect, right? Surely there is some segregation of duties, with different clients using different roles and entitlements to access the system?
Finally, you have to show some before-and-after results. This article simply asserts that connection pooling would improve the situation, but does nothing to demonstrate the point. Even the details of the server matter: how many CPUs, how many drives, HDD or flash, what file system is being used?
Ironically, we exacerbated the symptoms by reducing average calls per page below 1.0. The burstiness of the traffic made it clear that the low-water mark for the connection pool was not working out.
I've not measured recently, but from the angle of a PG dev I'd be very surprised if that weren't true.
Note that one unfortunately cannot just compare the memory usage of individual postgres connection processes with that of a threaded connection model. Read-only memory is mapped multiple times, a lot of the read-write memory is copy-on-write and never modified after startup, and a good chunk of the required memory is in shared memory and thus not duplicated.
I strongly suggest using huge pages (huge_pages=on, and configure the OS to allocate them) if memory usage due to connection count is a problem when using PG. The page table is usually one of the biggest users of per-process memory.
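Concretely, something like the following; the page count is a placeholder and should be sized from shared_buffers and the kernel's Hugepagesize:

```
# postgresql.conf -- 'on' fails at startup rather than silently falling
# back to regular pages (the default is 'try')
huge_pages = on

# /etc/sysctl.conf (Linux) -- reserve 2MB huge pages; placeholder count,
# size it to cover shared_buffers
vm.nr_hugepages = 2048
```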
Contrast this with PostgreSQL, where none of the above is an option: every connection is a process, and you don't want thousands of those piled up.
And you should probably use HTTP too.
All database clients are blocking, which kills the CPU with IO-wait, and most of them use some binary protocol.
Here is the world's first and only async, parallel HTTP app server and HTTP/JSON database: