Show HN: Supavisor – a Postgres connection pooler written in Elixir (github.com/supabase)
149 points by kiwicopple on April 9, 2023 | 49 comments
hey hn, supabase ceo here

this is a postgres connection pooler. it’s similar to pgbouncer, but built with Elixir and specifically designed for multi-tenancy.

it’s still under development, but it’s at a stage where we can gather feedback from the community and you can try it yourself. we aren’t using this in production yet, but we’re aiming to deploy it for a subset of databases in the next 2 months.

We have the following benchmarks (details in the readme):

  - Elixir Cluster maintaining 400 connections to a single Postgres database
  - 1_000_000 clients connecting to the Elixir cluster
  - Sending 20_000 transactions per second
  - Consuming 7.8G RAM and ~50% CPU on a 64vCPU machine
supavisor can be run as a cluster or a single node/binary. it’s handling 90%+ of the throughput of pgbouncer on a local machine (running pgbench).
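
for reference, a pgbench run of this flavor looks something like the following (flags, port, and duration here are illustrative, not the exact benchmark setup):

  pgbench -h 127.0.0.1 -p 6543 -U postgres -c 100 -j 8 -T 60 -S postgres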

we will place this in front of all supabase databases. It will eventually be able to handle multiple types of connections: traditional TCP connections, and HTTP connections for developers who are connecting to Postgres in serverless environments using Prisma, Kysely, Drizzle, etc

the proxy will serve as a connection buffer while we scale databases: scaling up compute with zero downtime, and for scale-to-zero, triggering a server restart when a connection is initiated

finally, i want to give a shout-out to José and the Dashbit/Elixir team. they were extremely helpful with the design & architecture, they have been valuable partners, and elixir continues to be an amazing language for tools like this and our Realtime server.




Feature request that would make me consider moving off of pgbouncer: make prepared statements work in transaction pooling mode.

We've found that we can double our throughput by using prepared statements, but can't do so because we use pgbouncer in transaction pooling mode.

Our newer systems (in Rust) bypass pgbouncer in favor of an in-process pool, so we get to take advantage of prepared statements there, but our legacy systems would benefit a lot from it.
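
For anyone following along, the failure mode in transaction pooling looks roughly like this (illustrative SQL; the statement name is made up):

  -- transaction 1 happens to run on server connection A:
  PREPARE get_user (bigint) AS SELECT * FROM users WHERE id = $1;

  -- transaction 2 may be routed to server connection B, where the
  -- statement was never prepared:
  EXECUTE get_user(42);
  -- ERROR:  prepared statement "get_user" does not exist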


There are existing patches to support prepared statements in PgBouncer:

https://github.com/pgbouncer/pgbouncer/pull/510

https://github.com/pgbouncer/pgbouncer/issues/695

FWIW, we are running a modified patch.


Interesting PR. It looks like https://github.com/pgbouncer/pgbouncer/pull/757 is more in line with multiple clients using the same prepared statements. I'll be watching that closely.


Can your client use the `binary_parameters` param?

e.g. https://blog.bullgare.com/2019/06/pgbouncer-and-prepared-sta...

Elixir's Ecto can (and that Go example) so if you use Elixir (or Go) you can use prepared statements with PgBouncer (or Supavisor!) in transaction mode.
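
(For the Go side, the knob from that post is lib/pq's binary_parameters connection-string parameter; host and credentials below are placeholders:)

  postgres://user:pass@pooler-host:6543/mydb?binary_parameters=yes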


I don't believe that provides the performance improvements we desire.

The main reason prepared statements improve throughput is that they cache query plans.

As I understand it, binary_parameters does not do that. Each query is parsed and evaluated individually (perhaps equivalent to unnamed prepared statements) and so nothing is saved when you execute the same query again.


Yeah, what you're looking for is "unnamed prepared statements". If your client supports this then you can use PgBouncer and get the throughput benefits.

Maybe binary_parameters is a different thing; need to look into it more, but this is a good one to add to the docs. And maybe we should officially support named prepared statements natively, if lots of clients don't support unnamed ones, though it definitely seems like we'd have to do much more accounting.


I'm not sure I understand. Are you saying unnamed prepared statements would enable query plan caching? I don't understand how that would work.


I will say that we are trying to keep the connection to the database open as long as possible and only disconnect when necessary. PgBouncer's db connections are more dynamic.

So if Postgres is doing query plan caching by session then caches would be built up as that query hits other db connections. In theory, this should be better.


Postgres does no query plan caching outside of prepared statements. From the docs:

A prepared statement can be executed with either a generic plan or a custom plan. A generic plan is the same across all executions, while a custom plan is generated for a specific execution using the parameter values given in that call. Use of a generic plan avoids planning overhead, but in some situations a custom plan will be much more efficient to execute because the planner can make use of knowledge of the parameter values. (Of course, if the prepared statement has no parameters, then this is moot and a generic plan is always used.)

By default (that is, when plan_cache_mode is set to auto), the server will automatically choose whether to use a generic or custom plan for a prepared statement that has parameters. The current rule for this is that the first five executions are done with custom plans and the average estimated cost of those plans is calculated. Then a generic plan is created and its estimated cost is compared to the average custom-plan cost. Subsequent executions use the generic plan if its cost is not so much higher than the average custom-plan cost as to make repeated replanning seem preferable.

https://www.postgresql.org/docs/current/sql-prepare.html
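
You can also watch the switch happen; a quick illustration with made-up table and statement names:

  PREPARE q (int) AS SELECT * FROM accounts WHERE id = $1;
  -- the first five EXECUTEs get custom plans; after that Postgres may
  -- switch to the cached generic plan, visible as "id = $1" (rather
  -- than a literal) in the output of:
  EXPLAIN (COSTS OFF) EXECUTE q(1);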


That doesn't say anything about named vs unnamed. This does though:

"If successfully created, a named prepared-statement object lasts till the end of the current session, unless explicitly destroyed. An unnamed prepared statement lasts only until the next Parse statement specifying the unnamed statement as destination is issued."

https://www.postgresql.org/docs/current/protocol-flow.html#P...


Ah, that makes sense.

Conclusion: I still want named prepared statements to take advantage of generic plans.


I assumed it did because I would think Postgres caches the query plan by some statement hash, not just by name.

Anyways, I'm probably wrong here but it looks like I'm about to figure this out now :D


> Elixir's Ecto can (and that Go example) so if you use Elixir (or Go) you can use prepared statements with PgBouncer (or Supavisor!) in transaction mode.

Ah, `binary_parameters` is just this Go lib thing. Ecto just sends unnamed prepared statements, where (TIL) query plans are not cached.
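
For anyone landing here later: Postgrex exposes a repo option that forces unnamed prepared statements, which is the usual knob behind a transaction-mode pooler:

  # in your repo config; makes Postgrex use unnamed prepared statements
  config :my_app, MyApp.Repo,
    prepare: :unnamed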


That would be very difficult to do: Postgres prepared statements are attached to connections, so supporting them would require connection pinning of some sort.

It would be nice to have a wider scope for prepared statements.


I understand why it's difficult, but most in-process connection poolers will manage prepared statements for you even as the connection pool grows and shrinks.

I know it's more complicated, but I imagine pgbouncer could detect "prepare foo as ..." and "execute foo(...)" and track whether a given session has had "foo" prepared. If not, rerun the "prepare foo as ..." on that session before forwarding the "execute foo(...)". That's essentially what the in-process version does.

There would be some complications around conflicts for a given named prepared statement, but that could be solved with some requirements on clients (e.g. use source hash in prepared statement names).
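
A rough sketch of that bookkeeping in Elixir terms (all names here are hypothetical, not Supavisor internals):

  # track which statements each server connection has prepared, and
  # lazily re-issue the PREPARE before forwarding an EXECUTE.
  # send_to_server/2 is a made-up helper, not a real function.
  def ensure_prepared(conn_state, name, prepare_sql) do
    if MapSet.member?(conn_state.prepared, name) do
      {:ok, conn_state}
    else
      with :ok <- send_to_server(conn_state, prepare_sql) do
        {:ok, update_in(conn_state.prepared, &MapSet.put(&1, name))}
      end
    end
  end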


I’m thinking we can just use :erlang.phash2 to pin a client connection process to a db connection process. The only thing is that a slow query could block a db connection for the others pinned to it.
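
i.e. something like this sketch (names are illustrative):

  # deterministically map a client process to one of pool_size db connections
  index = :erlang.phash2(client_pid, pool_size)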


Interesting approach, but that would be messy when multiple instances of an application all try to prepare the same query with the same name on the same server connection. Clients would need to know that a "prepared statement with that name already exists" error is OK and that they should go ahead and execute it.

Should we take this discussion into a GitHub issue?



I've had a great time contributing to this project, using Elixir to create a multi-tenant connection pooler. It's still under development, but we're excited to gather feedback and improve it. Elixir has been an awesome language to work with — let us know what you think!


I’m not too familiar with Elixir. Why did you choose it and why do you say it’s awesome?


Runs on the Erlang Virtual Machine, also known as the BEAM. Erlang was built by Ericsson in the '80s for the scalability and fault tolerance of telephone switches.

Give this a watch: "The Soul of Erlang and Elixir". It covers a lot of what's nice about the underlying platform.

https://youtu.be/JvBT4XBdoUE


This is a great video for introducing Elixir/BEAM.


Binary pattern matching is great for working with protocols.

Erlang process monitoring makes it very easy to handle client disconnects.

And Elixir makes using Erlang a bit more approachable.
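
For a flavor of the first point, a minimal sketch of framing a Postgres wire-protocol message (a 1-byte tag plus an int32 length that includes itself) with a binary match:

  def split_message(<<tag, len::32, rest::binary>>) when byte_size(rest) >= len - 4 do
    size = len - 4
    <<payload::binary-size(size), tail::binary>> = rest
    {:ok, tag, payload, tail}
  end

  # not enough bytes buffered yet; wait for more from the socket
  def split_message(_incomplete), do: :need_more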


I completely agree with chasers' points. Moreover, Elixir's excellent concurrency support and its fault-tolerant nature make it well suited for handling a high number of connections, ensuring our connection pooler's reliability.


Hi, I was running the million-connections benchmarks; happy to answer if you have any questions regarding this :) In short, it was pretty easy to do. The only things we had to tweak to make it work were some Elixir and Linux limits.


Contributor here. Happy to answer any questions.

An interesting bit... the syn library helps us guarantee there is only one connection pool alive on the cluster at a time and gives us callbacks to resolve the conflict if two get spun up.

https://github.com/ostinelli/syn
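
For a flavor of the registration side, a sketch against syn v3's API (the scope and name here are illustrative):

  # each tenant pool registers under a cluster-unique name in a syn scope;
  # if two nodes race, syn keeps one pid and fires the conflict callback
  :syn.add_node_to_scopes([:tenants])
  :ok = :syn.register(:tenants, {:pool, tenant_id}, self())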


Why not call this SupaPool? Supavisor is very BEAMie.

Question about telemetry.

These are the metrics emitted by Ecto:

- :idle_time - the time the connection spent waiting before being checked out for the query

- :queue_time - the time spent waiting to check out a database connection

- :query_time - the time spent executing the query

- :decode_time - the time spent decoding the data received from the database

- :total_time - the sum of queue_time, query_time, and decode_time
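
(These land in a standard telemetry handler; a minimal sketch, assuming the default Ecto event name for an app called my_app:)

  :telemetry.attach(
    "inspect-query-times",
    [:my_app, :repo, :query],
    fn _event, measurements, _metadata, _config ->
      IO.inspect(Map.take(measurements, [:idle_time, :queue_time, :query_time]))
    end,
    nil
  )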

To what extent are these metrics distorted by use of a Supavisor connection pooler? How could Supavisor compensate for that?


It is very BEAMy because we love the BEAM. All connections are supervised :D

Might be a little too close to "supervisor" though.

> To what extent are these metrics distorted by use of a Supavisor connection pooler? How could Supavisor compensate for that?

You mean if you have an Elixir app and are using Supavisor for connection pooling? They would be affected about as much as they would be if you used PgBouncer.

For the metrics Supavisor itself exposes: it has those Ecto metrics for interactions with its management database.

Supavisor doesn't use any libs for handling the wire protocol; it handles the wire protocol directly. We are exposing some similar metrics: https://github.com/supabase/supavisor/wiki/Metrics


Looking at the repo's mix file it seems to use poolboy for pooling. Given that dashbit and José helped out, was nimble_pool considered?


We discussed this. Don't remember exactly but @abc3erl had a lot of experience with poolboy.

We made sure to do local benchmarks along the way and there wasn't a meaningful difference.

Really the discussion was around doing as little message passing as possible to not copy data.

See:

https://github.com/supabase/supavisor/pull/4

https://github.com/supabase/supavisor/issues/7

So there are definite gains there, but if we want to do things across a cluster we're going to be copying data. The decision at the time was that the benefits of clustering outweighed the perf gains, considering we were basically already as fast as PgBouncer. This could change.


> It will eventually be able to handle multiple types of connections: traditional TCP connections, and HTTP connections for developers who are connecting to Postgres in serverless environments using Prisma, Kysely, Drizzle, etc

Is there a specific HTTP API to target for this or something brand new that will cover the desired use cases?


Unknown at the moment, but we're definitely open to prior art if there's an implementation you appreciate.

Naively, POSTing a query string and params would go a long way, I think.
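
e.g. a hypothetical shape (nothing decided, just to illustrate):

  POST /query
  {"query": "select id, email from users where id = $1", "params": [42]}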


at work we run pgbouncer on 4 vCPU (GCP c2-standard-4) machines at 60% load, and they do ~15k TPS at peak; however, we only peak at ~45k connections per node.

however, i think that you may have a lot left on the table in terms of optimizations that can be made. these numbers seem far under the performance of what i know the beam to be capable of.


Hey, from what we’ve seen, up to 50k per node should be OK for Supavisor on 4 cores as well. The TPS number is mainly there to demonstrate that we don't just create dummy connections; these connections are really used. And 20k TPS is just the maximum that the database node was able to handle; to bump TPS we had to spin up more db replicas. We want to run those tests as well at some point, but for now we're just verifying that we can connect millions of clients with some tangible TPS, without overcomplicating the setup too much.


Nice! Yeah, it's really our first go at it, so I'm sure we have room to improve.

PgBouncer was on the same instance, I'm assuming?


nah, we run a pool of common pgbouncers that route to the various postgres instances we run. they also run google's cloud sql proxy. it's actually haproxy -> pgbouncer -> cloud sql proxy -> postgres.

everything but postgres runs on one machine. we use haproxy to load balance replica connections across different pgbouncer backends for replicas of the same db.


Ha! Wait but why Google Cloud SQL proxy too?

If you are interested in talking this over it would be great to connect… chase@supabase.com.

We want to do load balancing too. Makes so much sense to do load balancing in the proxy.


cloud sql proxy makes connecting to cloud-managed postgres instances pretty simple/easy. honestly, we don't have a "great" reason other than that we set it up a long time ago and it's continued to work haha.

so pgbouncer connects to cloud sql proxy, which handles the connection concern.


What tool did you use for the diagram?


it's custom, built with figma.

we have a library of assets that our design team uses. I can ask if they can open source it if you would find it useful.


that would be awesome, thank you


Supabase team member here.

We made the Figma file 2 days ago, so we're still figuring out the wider diagram system.

But we’re planning on publishing it in the figma community as it might be useful for others.

I’ll update on here once we do.


My issue with poolers has always been routing. A pooler works fine against a vanilla single-node setup, but not in a multi-node cluster.

Can Supavisor route reads across multiple nodes and writes to just the primary, like pgpool does? If so, how? Is it documented? Could you share a link?

Thanks in advance.


Not yet, but we definitely are going to do that.


Perhaps a dumb question, but can you run it as a library in an existing Elixir app to pool the app's connections?


You could bring it in as an application, but you already get pooling with Ecto via Postgrex and DBConnection.

If you're using Elixir there's really no need for a separate connection pool until you have LOTS of nodes.
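
i.e. pooling is just repo configuration (pool_size defaults to 10):

  # pooling comes built in via DBConnection; you just size the repo's pool
  config :my_app, MyApp.Repo,
    pool_size: 10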


This is so cool! Have been needing something like this for a long time! The Supabase team smashed it on this one!


I am so glad this exists, great work.


Thanks!





