
PGHoard – PostgreSQL backup and restore service for cloud object storages - bjoko
https://github.com/aiven/pghoard/
======
melor
One of the pghoard developers here. We developed pghoard for our use case
([https://aiven.io](https://aiven.io)):

* Optimizing for roll-forward upgrades in a fully automated cloud environment
* Streaming: encryption and compression on the fly for the backup streams, without creating temp files on disk
* Solid object storage support (AWS/GCP/Azure)
* Surviving various glitches like faulty networks, processes getting restarted, etc.
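
To illustrate the streaming point above, here is a minimal sketch of chunked, on-the-fly compression that never writes a temp file. It is not PGHoard's actual code: the function name `stream_compress`, the chunk size and the preset are assumptions made for illustration, and the encryption step is left out so the example stays within the Python standard library.

```python
import io
import lzma

# Hypothetical chunk size; a real tool would tune this.
CHUNK_SIZE = 1024 * 1024


def stream_compress(src, dst, chunk_size=CHUNK_SIZE, preset=0):
    """Read src in chunks, compress incrementally, write to dst.

    Only one chunk is held in memory at a time and nothing is staged
    on disk. Returns the number of compressed bytes written.
    """
    compressor = lzma.LZMACompressor(preset=preset)
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer and return nothing yet
            written += dst.write(out)
    # Flush whatever the compressor is still buffering.
    written += dst.write(compressor.flush())
    return written


if __name__ == "__main__":
    source = io.BytesIO(b"postgres page data " * 100_000)
    target = io.BytesIO()
    size = stream_compress(source, target)
    print(f"compressed {len(source.getvalue())} bytes into {size} bytes")
```

In a real pipeline `src` would be the backup stream and `dst` an upload to the object store; both are in-memory buffers here only so the sketch runs on its own.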

Restore speed is very important for us and pghoard is pretty nice in that
respect, e.g. 2.5 terabytes restored from an S3 bucket to an AWS i3.8xlarge in
half an hour (1.5 gigabytes per second avg). This means hitting all of
cpu/disk/network very hard, but at restore time there's not typically much
else to do with them.
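
As a quick sanity check on the figures above, the arithmetic can be sketched as follows. This assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes); the helper names are made up for the example. With exactly 30 minutes the average works out to roughly 1.4 GB/s, and 1.5 GB/s corresponds to about 28 minutes, so the quoted numbers agree as roundings.

```python
def throughput_gb_per_s(total_bytes: float, seconds: float) -> float:
    """Average throughput in decimal gigabytes per second."""
    return total_bytes / seconds / 1e9


def minutes_needed(total_bytes: float, rate_bytes_per_s: float) -> float:
    """Minutes required to move total_bytes at a constant rate."""
    return total_bytes / rate_bytes_per_s / 60


TOTAL_BYTES = 2.5e12  # 2.5 TB, assuming decimal terabytes

# Average rate if the restore takes exactly half an hour.
print(round(throughput_gb_per_s(TOTAL_BYTES, 30 * 60), 2))  # 1.39

# Restore time if the average rate is exactly 1.5 GB/s.
print(round(minutes_needed(TOTAL_BYTES, 1.5e9), 1))  # 27.8
```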

------
manigandham
It would be good for the PG community to consolidate some of these projects
(wal-e, wal-g, pghoard, pgbackrest, barman, etc.) into the core functionality
at this point.

------
oskari
One of the PGHoard authors will be talking about PostgreSQL backups in the
cloud in a couple of weeks at PostgresConf NYC:
[https://postgresconf.org/conferences/2019/program/proposals/...](https://postgresconf.org/conferences/2019/program/proposals/postgresql-backups-in-the-age-of-the-cloud)

------
briffle
How does a 'periodic backup using pg_basebackup' compare to barman's rsync and
reuse_backup = link for speed of backups, and size of files?

I want to look at this (and Wal-G) but am not sure how much of a load I would
be putting on our db servers when they do the periodic backups. Our database
is pretty heavy on 'history' tables that don't change much once they are
written to.

------
seslattery
How does this compare to pgBackRest?
[https://pgbackrest.org/](https://pgbackrest.org/)

------
koolba
How’s the CPU usage for this when compressing? Are the libs C libraries (i.e.
native performance)?

~~~
melor
CPU usage varies based on the selected compression algorithm and level.
Snappy and LZMA are available now. Compression is native code. There are some
newer interesting algorithms (zstd/lz4) that we are looking into adding.

------
scottshamus
I'm going to look at this more later, but my first thought is: how does this
compare to WAL-E?
[https://github.com/wal-e/wal-e](https://github.com/wal-e/wal-e)

~~~
bruce_one
WAL-G ([https://github.com/wal-g/wal-g](https://github.com/wal-g/wal-g)) is
another alternative, and is sold as a successor to WAL-E, in case you
haven't heard of it :-)

~~~
bjoko
There was a discussion about WAL-G just a few days ago here on HN:
[https://news.ycombinator.com/item?id=19259099](https://news.ycombinator.com/item?id=19259099)

------
tpetry
With wal-e, wal-g and pghoard it's getting really difficult to choose a
solution; they are so similar.

~~~
oskari
Feature sets of all the recent backup and restore systems are becoming more
and more alike, but when we started working on PGHoard there were no good
options that were built to efficiently utilize different cloud object stores
(S3 + GCS, Azure, Swift, etc.)

Our original announcement of PGHoard at
[https://aiven.io/blog/postgresql-cloud-backups-with-pghoard/](https://aiven.io/blog/postgresql-cloud-backups-with-pghoard/)
lists some of the reasons we had for building a new system from scratch.

Nowadays there are many good options for handling basebackups and WAL, and one
of the largest remaining issues is the lack of parallel WAL apply in
PostgreSQL itself, which limits restore throughput quite severely.

