An important consideration is that increasing the pool size many times above the number of cores is very often counterproductive in the face of decreasing IO latency. Also, there may often not be justification for anything other than a constant-sized pool.
PgBouncer is basically a requirement of any production Postgres system, but I've always wondered why it simply wasn't part of the Postgres package itself? For instance, it means a dedicated EC2 instances is necessary for getting the most out of AWS RDS - even the whole point of RDS is to avoid managing that server..
We were big proponents of pgbouncer for years, but having one extra, sometimes capricious actor in our architecture ended up not being worth it. Especially if you're IO-heavy like we are, and that Web App -> DB pipe is the hottest path in the entire system.
Removing it altogether made monitoring and debugging performance issues easier, and client-side pools were essentially "good enough" for our purposes.
Ohhh, I understand. Long day. I got that backwards, I thought there was a server-side option I wasn't aware of. So you don't have a dedicated server with PgBouncer on it, you just move the pooler to where the requests originates from. I think this might be problematic in a highly elastic/distributed architecture but would work if you have a beefy dedicated application instance.
Correct, you'd have to instruct your pools to resize as machines are added and removed. We haven't had to do that so far, as we're not that dramatically elastic.
PgBouncer doesn’t save you from a lot of the actual pain when dealing with connections. And that’s prepared statements, you can’t pool prepared statements so high volume transactional databases need a lot of connections.
Including pgbouncer in the base package would just mask the issue because pgbouncer saves you (mainly) from the short lived connection.
> PgBouncer doesn’t save you from a lot of the actual pain when dealing with connections. And that’s prepared statements, you can’t pool prepared statements
Prepared statements work fine with PgBouncer when you have session pooling enabled (as opposed to transaction pooling).
It may not save you from all the pain of dealing with connections, but PgBouncer definitely helps a lot when you're working with remote Postgres servers and high latency (i.e., > 10ms). Mitigating the delay and cost of connection initiation is pretty huge.
I didn't know about this global locking but I'm assuming the row / table locking is taken into account in the testing. Presumably just inserting into a simple table with explicit unique ids each time should avoid issues with locking (I guess you'd want to avoid indexes and sequences).
This trips up a lot of newbies who have no concept of connection management or scope. I routinely get questions about it in regards to the orm library I wrote, which includes a connection pool of course.
Another thing to be aware of with postgres is that connections in an error state (for example, due to trying to insert a row that violates a constraint) must be rolled back before they can be recycled.
I wonder how alfresco ends up with such a huge recommended default pool size of 275 [1].
Is this bad architecture on their part or are different principles at play?
Nice article, but doesn't actually demonstrate that connections setup and tear down is the bottleneck. If you really have 500 active connections all "shouting" at the database simultaneously, the fact that those connections were obtained from a pool or not isn't going to change the throughput of the database.
Pooling introduces numerous other issues as well. For example, clients probably shouldn't all be using the same entitlements to connect, right? Surely there are some segregation of duties and roles that clients will be using alternative entitlements into the system?
Finally, you have to show some before and after results. This article simply asserts that connection pooling would improve the situation, but does nothing to demonstrate the point. Even the details of the server matter...how many cpus, how many drives, HDD or flash, what file system is being used?
I don’t have the numbers but I do know we are facing this problem at work. When you have microservices that are supposed to respond in <30 ms connection setup time becomes noteworthy.
Ironically we exacerbated the symptoms by reducing average calls per page below 1.0. The burstiness of the traffic made it clear that the low water mark for the connection pool was not working out.
note that while the MySQL connection-pool equation is pretty different from that of Postgresql's, at least in my experience MySQL connections use less memory (sorry, no measurements handy, feel free to correct me, but they generally are handled as a thread rather than an entire subprocess and we regularly see apps that spin up thousands of them fairly cheaply) and also have a lower startup time (also anecdotal, sorry), the MySQL world does finally have middleware connection pools now such as proxysql: https://proxysql.com/
> MySQL connections use less memory [...] handled as a thread
I've not measured recently, but from the angle of a PG dev I'd be very surprised if that weren't true.
Note that one unfortunatelycannot just compare the memory usage of a individual postgres connection processes with a threaded connection model. Read-only things are mapped multiple times, and a lot of the read-write memory is copy-on-write and never modified after startup. And a good chunk of the required memory is in shared memory and thus not duplicated.
I strongly suggest using huge pages (huge_pages=on, and configure the os to allocate them) if memory usage due to connection count is a problem when using PG. The page table is usually one of the biggest users of per-process memory.
There isn't much practical difference. MySQL's memory cost is lower, largely due to using a thread per connection rather than a process, so you can have a larger number of connections before running out of memory. But this really doesn't matter since by the time you run out of memory with PG or MySQL you'll probably have run into a situation where performance has tanked due to the amount of context switching going on. In both cases you likely want something like `num_cpu_cores * 2..3` connections to the database for optimal performance.
it is different, because if you have 100 processes all talking to the database, they can safely all have an application-local connection pool holding onto a dozen connections in a pool that are probably largely idle, but are ready in a flash. context switches aren't necessarily a problem because the database probably sees 5-10% of these connection receiving queries at a given moment. Also, mariadb has a thread-pool feature (https://mariadb.com/kb/en/library/thread-pool-in-mariadb/) that allows a significant reduction in threads for a given number of connections.
Contrast to Postgresql, where none of the above is an option, every connection is a process and you don't want thousands of those piled up.
The Aurora handbook strongly recommends using these techniques, or using a connection proxy if connection pools won't work for your situation. Having too many connections can cause your Aurora instances to scale up as well, limiting database performance, and leaving you with hundreds more instances than you want to pay for.
Have a look at draw.io (https://about.draw.io/features/) which can make a lot of fantastic diagrams. Be sure to look at the "More shapes..." options and choose the libraries that are of interest to you. Free and open-source.
An important consideration is that increasing the pool size many times above the number of cores is very often counterproductive in the face of decreasing IO latency. Also, there may often not be justification for anything other than a constant-sized pool.