Most distributions ship with conservative defaults for the maximum number of open file descriptors and concurrent socket connections. Especially for a database server, it's wise to tune these values before optimizing anything else.
> 1000 files per connection * 370 = 370,000 files
This is a very pessimistic estimate; in reality you'll need far fewer than 1000 file descriptors per connection. As long as you tune your file descriptor limit, you should be able to handle one or two orders of magnitude more connections than this.
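If you want to see what you're actually working with, here's a minimal sketch using Python's resource module (this is the per-process view; the system-wide ceiling is a separate sysctl):

```python
import resource

# Per-process file descriptor limits as this process sees them.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")

# Raise the soft limit up to the hard limit; raising the hard limit
# itself needs root (e.g. via limits.conf or the service manager).
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```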
A few resources that might help:
What I'm missing here is the Postgres version used.
Also, if 3.5GB of memory hurt you on a database server with 370 connections, you need to think about buying a DB server with more RAM (our main DB server has 512GB and typically ~220 connections; we don't sweat it and don't use PGBouncer to add latency).
Re work_mem: I think this is per query and thus might affect every connection?! At least that's what I gather from 'PostgreSQL 9.0 High Performance'.
Re temp_buffers: 'Sets the maximum number of temporary buffers used by each database session.' For us it practically means per connection. Not for you?
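If you want to check what your own server would allow per backend, here's a minimal sketch with psycopg2 (the DSN is a placeholder, not from the post):

```python
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # placeholder DSN
cur = conn.cursor()

# temp_buffers is per session; work_mem applies per sort/hash node,
# so a single complex query can use several multiples of it.
for name in ("work_mem", "temp_buffers", "max_connections"):
    cur.execute("SHOW " + name)  # names are hardcoded, not user input
    print(name, "=", cur.fetchone()[0])
```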
Shit, I'm missing the version! 9.3. I edited the blog post.
Re memory: I think that really depends on the application. We have many idle connections.
It means that potentially every connection may at some point use this amount of memory, but idle connections will not. If you have 20 busy connections and 350 idle ones, at most 20 x temp_buffers will be used and there is no benefit from PGbouncer in this regard. If you have 370 busy connections, all of them might use temp_buffers and your tradeoff can be 370 concurrently executing queries at higher memory cost vs. e.g. 20 concurrently executing queries via PGbouncer at lower memory cost and 350 stalled queries with potentially high latency.
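To put rough numbers on that tradeoff (assuming the default temp_buffers of 8MB; your actual setting may differ):

```python
temp_buffers_mb = 8          # assumed default; check SHOW temp_buffers
connections = 370
pool_size = 20               # hypothetical PGbouncer pool

# Worst case: every connection is busy and fills its temp buffers.
print(connections * temp_buffers_mb, "MB")   # 2960 MB
# Pooled: at most pool_size backends exist; the other queries queue.
print(pool_size * temp_buffers_mb, "MB")     # 160 MB
```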
> We have many idle connections.
In this case, you will not have any issues with memory use from temp_buffers, work_mem ...
I was trying to gather everything relevant to resources per connection. I'm not saying work_mem caused our problem.
I get it from a historical perspective where it would have been needless complexity to implement something else.
On an unrelated note it's pretty amazing what threads get conflated with in general.
And yet you can get >50% speedups for super common queries like "select * from foo where id = ?" if you prepare it once and then use execute instead of just running the select every time. Seems like maybe you're making assumptions you shouldn't.
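For illustration, here's roughly what that looks like with psycopg2 (the DSN, the table, and the timing loop are my assumptions for the sketch, not from the thread):

```python
import time
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # placeholder DSN
cur = conn.cursor()

# Parse and plan once per session...
cur.execute("PREPARE get_foo (int) AS SELECT * FROM foo WHERE id = $1")

start = time.perf_counter()
for i in range(10_000):
    # ...then only bind parameters and execute on each call.
    cur.execute("EXECUTE get_foo (%s)", (i,))
    cur.fetchall()
print("prepared:", time.perf_counter() - start, "s")
```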
Of course it has: http://www.postgresql.org/docs/9.3/static/sql-createfunction...
And you can choose between 4 languages: http://www.postgresql.org/docs/9.3/interactive/xplang.html
Read your link. Plans can be cached, but you have no way to ensure that they are, and no way to have the planning done up front at all: you have to wait until the function is run, and even then every code path in the function has to be exercised to get it planned. Half the time it decides not to bother caching the plan anyway. And the cache is per-session, not shared, so every single connection starts over with an empty cache, and there are tons of duplicate plans in the caches wasting space. Not cool, and by far the biggest thing keeping a ton of "enterprise" customers on SQL Server and Oracle.
Or would you have to roll your own pgbouncer setup for these systems?
Has anyone used this? Is it configurable, or do they just choose some reasonable defaults?
In my Python app, I switched from an SQLAlchemy connection pool per process (e.g. per gunicorn worker) to using a NullPool everywhere. No increase in latency; far fewer connections needed.
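Roughly what that change looks like (the DSN is a placeholder; this is a sketch, not my exact code):

```python
from sqlalchemy import create_engine, text
from sqlalchemy.pool import NullPool

# NullPool opens a fresh connection per checkout and closes it on
# release, so an idle gunicorn worker holds zero server connections.
engine = create_engine("postgresql://app@db-host/mydb", poolclass=NullPool)

with engine.connect() as conn:
    conn.execute(text("SELECT 1"))
# The connection is closed here, not returned to a per-process pool.
```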
Now true, in the event of a network partition, you lose the ability to open a database connection. But that's a small price to pay, right?
Even if you run pgbouncer on a different box, the situation is still better. Suppose you have a probability P of a partition between any two boxes. Then the probability of a partition taking down the system with pgbouncer is just P.
If you instead had N clients coordinating among themselves how many connections to use, your odds of failure go way up. The probability of a partition existing between some pair of clients is 1-(1-P)^(N(N-1)/2) ~= N(N-1)P/2 (for very small P). Once two clients can't communicate, neither one can safely connect to the DB (because they don't know whether the others have pushed the DB over the limit). All clients have to stop connecting to the DB if any client is ever disconnected from the other N-1 of them.
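A quick numeric check of that approximation (N = 20 and P = 1e-4 are just example values):

```python
P = 1e-4                      # example pairwise partition probability
N = 20                        # example client count
pairs = N * (N - 1) // 2      # 190 links that all have to stay up

exact = 1 - (1 - P) ** pairs  # probability at least one pair is cut
approx = pairs * P            # first-order approximation for small P
print(exact, approx)          # ~0.0189 vs 0.019; pgbouncer's risk stays at P
```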
In a distributed system, ensuring that SUM(x[i]) remains bounded for all time is a tricky problem to solve.
I've done this, long-term, in production. This is not how PGbouncer is designed to be used, and it causes serious problems under load. (I am still doing this, but it's on my todo list to remove this glitch from my architecture, as it once caused my entire site infrastructure to fail during a launch.) Put simply: PGbouncer is single-threaded, and it is not difficult to end up in situations where the one core PGbouncer is running on sits at 100% while the remaining cores (in my case, seven) running PostgreSQL are having no issue with the traffic.
I'm guessing it's a situation where you are running a large volume of very easy queries, for example primary key lookups with no joins?
Clients talking among themselves to establish whether they can create another connection is madness. They'll keep their connections at the maximum all the time anyway.
It didn't have a pool, but it's a mistake to think it just opened connections at will.
An addiction to globally shared state across threads multiplies race conditions and scaling problems, even if the GIL magically goes away. If you are incapable of using even processes effectively then I really fear for you when you have to scale out horizontally.
I'd say this more nicely if I could think of any way, but you don't know what you're talking about and you should read up before you begin talking about it again. It looks bad and you might mislead someone.
It doesn't work (for me) on Android so I started to use the app less and less.