
Approaches to PostgreSQL Replication and Backup - craigkerstiens
https://www.citusdata.com/blog/2018/02/21/three-approaches-to-postgresql-replication/
======
colanderman
A fourth option, for those in physical datacenters who do not need to run
active/active, is of course to eschew network replication entirely in favor of
a RAIDed JBOD shared between two or more servers. If you have the hardware,
this is often the easiest to configure, and the most efficient.

EDIT: I should clarify that, like the other HA solutions in the article, of
course you need to couple this with a backup solution, like periodic dumps to
S3, such as the article suggests.

~~~
colanderman
Those downvoting, care to give a technical reason why I am wrong?

~~~
koolba
The short answer is that RAID is not a backup solution.

~~~
rosser
It also doesn't manage failover. Shared storage is a totally viable approach.
(I've worked as a PostgreSQL DBA for over a decade and have built multiple
shared storage-based HA environments for multiple employers.)

If you mis-manage failover, _you will destroy your data_. This makes proper
backup even more important.

~~~
colanderman
Nor does Postgres replication manage failover. Either way you need a
third-party tool.

Yep, botched failovers are bad. Of course, a botched failover between
replicated DBs is no walk in the park either. In many such cases you'll still
want to pull backups anyway, if there's no clear way in your domain to merge
two "authoritative" databases.

~~~
rosser
> _In many such cases you'll still want to pull backups anyway..._

In _all cases_ where you care about your data, you must take backups. There is
_no substitute_ for that step.

Additionally, you want to periodically restore your backups and test them for
validity — ideally automatically. There's a saying in the ops world, borne of
bitter experience:

"Your backups are only as good as their last successful restore."
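In practice, a nightly restore check can be a small cron job: pull the most
recent dump, restore it into a scratch database, and fail loudly if anything
breaks. A rough sketch (the bucket, dump file, and table names here are all
placeholders, and this assumes a logical `pg_dump` custom-format backup):

```shell
#!/bin/sh
# Hypothetical nightly restore-verification job.
set -eu

# Fetch the latest dump (bucket/key are placeholders).
aws s3 cp "s3://my-backups/latest.dump" /tmp/latest.dump

# Restore into a throwaway database; --exit-on-error makes any
# restore failure abort the script with a non-zero status.
createdb restore_test
pg_restore --dbname=restore_test --exit-on-error /tmp/latest.dump

# Minimal sanity check: a table we know must exist and be non-empty.
psql -d restore_test -tAc "SELECT count(*) FROM important_table;" | grep -qv '^0$'

dropdb restore_test
echo "restore check passed"
```

Wire the script's exit status into your alerting and you get the "last
successful restore" signal automatically.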

~~~
colanderman
What grantwu said. You misinterpreted me. Seems to be a theme today. I guess I
should spell things out more clearly in the future.

~~~
rosser
Ok, my apologies. I've never heard that phrasing before, but now that you
clarify, I understand it.

Ambiguity in online discourse is an ongoing problem. Thanks for helping
resolve this occurrence without acrimony.

------
cuu508
Sorry for the dumb question, but which of the 3 approaches is pg_basebackup?

My understanding was/is that with pg_basebackup the new replica would only
need to replay state changes since the last checkpoint, not "the entire state
from the primary node—from the beginning of time". What am I missing?
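For reference, the invocation I'm asking about is along these lines (hostname
and role name are placeholders):

```shell
# Take a full base backup from the primary and prepare this node as a
# streaming replica. -X stream also fetches the WAL generated while the
# backup runs; -R writes the recovery/standby settings for you.
pg_basebackup -h primary.example.com -U replicator \
    -D /var/lib/postgresql/data -X stream -R
```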

~~~
koolba
Definitely the third one and possibly the first one depending on what’s being
backed up. It seems to imply it’s just a logical pg_dump backup going to the
offsite store but doesn’t clarify.

The second option is at the physical block layer. It’s not specific to any one
DB but is a bit of a blunt instrument (works everywhere but lacks finesse).

The third option would involve base backups being pushed off site as well as
archived WAL logs. The replicas would restore the base backups, then catch up
via WAL logs. Finally you can directly link up to the master to have a live
standby for read only queries. This is the approach used by the WAL-E and
WAL-G backup/restore tools and is the one I generally recommend as it has the
most flexibility and resilience.
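As a rough sketch of that third approach with WAL-G (paths and bucket are
illustrative, not a complete config):

```shell
# --- On the primary: postgresql.conf ---
# wal_level = replica
# archive_mode = on
# archive_command = 'wal-g wal-push %p'

# Push a base backup to the offsite store (e.g. S3, configured via env vars):
wal-g backup-push /var/lib/postgresql/data

# --- On a new replica ---
# Restore the most recent base backup...
wal-g backup-fetch /var/lib/postgresql/data LATEST
# ...then let recovery catch up from archived WAL via
# restore_command = 'wal-g wal-fetch %f %p'
# and finally stream live changes by pointing primary_conninfo at the master.
```

The same archive serves both disaster recovery (restore from scratch) and
replica bootstrapping, which is where the flexibility comes from.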

------
skewedlines
Hmm, just a quick clarification for the first approach. It says that the
replay operation may then introduce significant load on the primary node. Is
this because when new nodes are added, the logs have to be dumped to the new
slave nodes from the beginning, and the IO to do so is expensive, possibly
interfering with normal operation of the master node?

~~~
mixmastamyk
That’s how I read it.

