I'm a Postgres fan, and use it a lot, but I've never actually used it in a clustered setup.
What I'm looking at clustering for is not really scalability (we're still at the stage where we can scale vertically), but high availability and backup: if one node is down for an update, or crashes, the other node can take over. I'd also ideally like point-in-time restore.
There seems to be a plethora of OSS projects claiming to help with this, so it looks like there isn't "one true way". I'd love to hear how people are actually setting up their Postgres clusters in practice.
We've had the best luck with Patroni, but even then you'll find the documentation confusing, hit weird issues, etc. You'll need to set up etcd/Consul to use it. That's right: you need a second database cluster to set up your database cluster... Great...
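For what it's worth, once it's running the moving parts are at least easy to poke at: each node's Patroni agent serves a REST API (port 8008 by default), and GET /primary only returns 200 on the current leader, which is what HAProxy-style health checks key off. A rough sketch (the host list is made up):

    import urllib.request

    NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical cluster members
    PATRONI_PORT = 8008  # Patroni's default REST API port

    def find_primary(nodes):
        """Return the first node whose Patroni API reports it as primary."""
        for host in nodes:
            try:
                # /primary returns HTTP 200 on the leader, 503 on replicas.
                urllib.request.urlopen(
                    f"http://{host}:{PATRONI_PORT}/primary", timeout=2
                )
                return host
            except OSError:
                continue  # replica (503), unreachable, or mid-failover
        return None

    print(find_primary(NODES))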
I have no clue how such a community favorite database has no clear solution to basic HA.
A good alternative is to use a managed database service and not have to worry about it at all. However, if you do have to operate one, orchestration systems give you the tools to handle it properly WRT your availability and performance SLAs/SLOs.
However, if I had to run Postgres as part of something I deployed on k8s AND for some reason couldn’t use my cloud provider’s built-in solution (AWS RDS, Cloud SQL, etc.), I would probably go with using or writing a k8s operator. The big advantage of this route is that it gives you a good framework for coordinating the operational changes you need to handle in order to actually get failover and self-healing from a Postgres cluster, in a self-contained and testable part of your infrastructure.
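If you did write one in Python, a framework like kopf (also out of Zalando) handles the watch/reconcile plumbing for you. The sketch below is only to show the shape; the PostgresCluster CRD, the example.com group, and the spec fields are all made up for illustration:

    import kopf

    # Hypothetical CRD: a "PostgresCluster" resource describing desired state.
    @kopf.on.create('example.com', 'v1', 'postgresclusters')
    def create_cluster(spec, name, namespace, logger, **kwargs):
        replicas = spec.get('replicas', 2)  # made-up spec field
        logger.info(f"Provisioning {name} in {namespace} with {replicas} replicas")
        # ...create the StatefulSet, Services, PodDisruptionBudget, etc...

    @kopf.on.field('example.com', 'v1', 'postgresclusters', field='spec.replicas')
    def scale_cluster(old, new, name, logger, **kwargs):
        # Reconcile: the desired replica count changed; converge toward it.
        logger.info(f"Scaling {name} from {old} to {new} replicas")

You'd run this with `kopf run operator.py`; the point is that all the "what do I do when X changes" logic lives in one testable place.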
When setting up a few Postgres nodes with your chosen HA configuration you’ll quickly run into a few problems you have to solve:
* I lose connectivity to an instance. Is it ever coming back? How do I signal to the system that it's dead and buried, so it knows to spin up a fresh replica in the cases where this can't be detected automatically?
* How do I safely follow the process I need when upgrading a component (Postgres, HAProxy, PgBouncer, etc.)? How do I test that procedure, in particular the not-so-happy paths (e.g. a node decides to die mid-upgrade)?
* How do I make sure that whatever daemon watches the cluster to decide whether a state change is needed (due to a failure or a requested change) can both be deployed in an HA manner AND doesn't end up with multiple instances of itself issuing conflicting commands?
* How do I verify that my application can actually handle failover in the way that I expect? If I test this manually, how confident am I that it will continue to handle it gracefully when I next need it? (See the sketch after this list.)
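To make that last point concrete, here's the kind of check I'd want automated rather than done by hand: kill the primary mid-write and assert that every acknowledged write survives. This is only a sketch; the DSN, the `docker kill` stand-in for "the primary dies", and the probe table are placeholders for whatever your setup uses:

    import subprocess
    import time
    import psycopg2

    DSN = "host=pg-proxy dbname=app user=app"  # stand-in: your HAProxy/PgBouncer endpoint

    def write_row(i):
        # Fresh connection per attempt so we never reuse a socket to a dead primary.
        conn = psycopg2.connect(DSN, connect_timeout=3)
        try:
            with conn, conn.cursor() as cur:  # commits on clean exit
                cur.execute("INSERT INTO failover_probe (n) VALUES (%s)", (i,))
        finally:
            conn.close()

    written = []
    for i in range(100):
        if i == 50:
            # Stand-in for "the primary dies" (kubectl delete pod, etc. also work).
            subprocess.run(["docker", "kill", "pg-primary"], check=False)
        try:
            write_row(i)
            written.append(i)  # only acknowledged writes count
        except psycopg2.OperationalError:
            time.sleep(1)  # failover in progress; a real test would retry with backoff

    # After failover settles, assert everything we *acknowledged* is still there.
    conn = psycopg2.connect(DSN)
    with conn, conn.cursor() as cur:
        cur.execute("SELECT n FROM failover_probe ORDER BY n")
        assert [r[0] for r in cur.fetchall()] == written, "acknowledged write lost!"

With asynchronous replication the final assert here can genuinely fail, which is exactly the class of surprise you want to find in CI rather than during an incident.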
A k8s operator is a nice way to crystallize these kinds of state management issues on top of a consistent and easily observable state store (namely the k8s API's etcd instance). It also gives you a great way to run continuous integration tests where you actually throw the situations you're trying to prepare for at your failover logic (and your application code), which buys you some real confidence that your HA setup deserves the name.
But again, I wouldn't bite this off if you can use a managed service for the database. Pay someone else to handle that part, and focus on making your app actually not shit the bed if a Postgres failover happens. The vast majority of applications I've worked on that were pointed at an HA instance would have broken down during a failover (and in some cases did), due to things like expecting durability but using asynchronous replication. You don't get points for "one of the two things that needed to work to avoid that incident worked".
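Concretely, "handling failover" on the app side mostly means reconnect-and-retry plus idempotent writes, because a failover can kill the connection after COMMIT was sent but before the ack came back, so a retried transaction may already have landed. A sketch of the shape of it (psycopg2, with a made-up endpoint and table):

    import time
    import psycopg2

    DSN = "host=pg-proxy dbname=app user=app"  # stand-in endpoint

    def run_txn(fn, retries=5):
        """Run fn(cursor) in a transaction, reconnecting across a failover.

        Caveat: if the failure hits between our COMMIT and the server's ack,
        the transaction may have committed anyway -- fn must be idempotent.
        """
        for attempt in range(retries):
            try:
                conn = psycopg2.connect(DSN, connect_timeout=3)
                try:
                    with conn, conn.cursor() as cur:  # commits on clean exit
                        return fn(cur)
                finally:
                    conn.close()
            except psycopg2.OperationalError:
                time.sleep(2 ** attempt)  # back off while the cluster fails over
        raise RuntimeError("gave up waiting for failover to settle")

    # Usage: an idempotent insert keyed by a client-generated id.
    run_txn(lambda cur: cur.execute(
        "INSERT INTO events (id, body) VALUES (%s, %s) ON CONFLICT (id) DO NOTHING",
        ("evt-123", "hello")))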
> Pay someone else to handle that part
Aside from the money (managed Postgres is expensive), I'd actually like to understand what good, high-availability Postgres solutions look like.
In general, you need some sort of separate strictly serializable store, run in an HA configuration, to manage the state changes, and whatever manages those state changes needs to run in multiple instances across your fault domains as well. Others have mentioned Patroni; I’ve used it before (though with etcd, not k8s) and been quite happy with it. Be aware that (as the README cautions you) it’s not a total point-and-shoot tool; you do need to read through its caveats and understand the underlying Postgres replication features.
The documentation is pretty good; if you want to get an idea of what the logic looks like, they have a nice state diagram: https://github.com/zalando/patroni/blob/master/docs/ha_loop_...
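And for intuition about what that separate strictly serializable store is actually doing: it all bottoms out in compare-and-swap on a leader key held under a TTL'd lease, which is essentially what Patroni does against its DCS. A rough sketch with the python-etcd3 client (the key name, TTL, and node name are placeholders):

    import time
    import etcd3

    client = etcd3.client(host="127.0.0.1", port=2379)
    MY_NAME = "pg-node-1"

    def try_acquire_leadership(ttl=10):
        """One election round: create /pg/leader iff it doesn't exist yet.

        The lease means the key vanishes if we stop renewing it (i.e. we die),
        so a crashed leader can't hold the lock forever.
        """
        lease = client.lease(ttl)
        acquired, _ = client.transaction(
            # version == 0 means "the key does not exist yet"
            compare=[client.transactions.version("/pg/leader") == 0],
            success=[client.transactions.put("/pg/leader", MY_NAME, lease)],
            failure=[],
        )
        return acquired, lease

    acquired, lease = try_acquire_leadership()
    while acquired:
        # We're the leader: do leader things, and keep the lease alive.
        time.sleep(3)
        lease.refresh()

Because the compare-and-put is a single atomic transaction, two daemons racing for leadership can never both win, which is the answer to the "multiple instances issuing conflicting commands" problem above.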