
PGHoard: Tools for making PostgreSQL backups to cloud object storages - melor
http://blog.aiven.io/2016/04/postgresql-cloud-backups-with-pghoard.html
======
waffle_ss
How does this compare to WAL-E?

[https://github.com/wal-e/wal-e](https://github.com/wal-e/wal-e)

~~~
melor
Both do mostly the same thing, with some differences. The biggest one
currently is probably that WAL-E uses the PostgreSQL "archive_command" to send
incremental backups (WAL files) in complete 16 megabyte chunks, whereas
PGHoard uses real-time streaming with "pg_receivexlog", making the data loss
window much smaller in case of a disaster.

~~~
willlll
You can set archive_timeout to something like 1 minute to bound the window.

------
melor
PGHoard takes care of real-time WAL streaming, compression, encryption,
restoration and backup expiration, among other things. It is open source and
written in Python.

~~~
brudgers
Curious whether it backs up to other cloud storage providers in
vendor-neutral ways.

~~~
melor
Currently S3 (AWS + compatible), Google Cloud, OpenStack Swift, Azure
(experimental), local disk and Ceph (via S3 or Swift) are supported. More can
be added quite easily as the object storage logic is behind an extendable
interface.
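
To give a rough idea of what "behind an extendable interface" can look like, here is a minimal sketch of a storage backend abstraction with a local-disk implementation. The class and method names (`ObjectStore`, `LocalDiskStore`, `put`, `get`, `delete`) are hypothetical and chosen for illustration only; they are not taken from PGHoard's actual code.

```python
import abc
import os


class ObjectStore(abc.ABC):
    """Hypothetical minimal storage interface (not PGHoard's real class)."""

    @abc.abstractmethod
    def put(self, key: str, data: bytes) -> None:
        """Store data under the given key."""

    @abc.abstractmethod
    def get(self, key: str) -> bytes:
        """Return the data stored under the given key."""

    @abc.abstractmethod
    def delete(self, key: str) -> None:
        """Remove the object stored under the given key."""


class LocalDiskStore(ObjectStore):
    """Stores each object as a file below a root directory."""

    def __init__(self, root: str) -> None:
        self.root = root

    def _path(self, key: str) -> str:
        return os.path.join(self.root, key)

    def put(self, key: str, data: bytes) -> None:
        path = self._path(key)
        # Create intermediate directories so keys may contain slashes.
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as fp:
            fp.write(data)

    def get(self, key: str) -> bytes:
        with open(self._path(key), "rb") as fp:
            return fp.read()

    def delete(self, key: str) -> None:
        os.remove(self._path(key))
```

Under this assumption, adding another provider would mean writing one more subclass that implements the same three methods against that provider's API.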

Which vendor-neutral protocol are you interested in using?

~~~
merb
What happens when the storage (Swift or Ceph) is offline for some time?

~~~
oskari
PGHoard can archive PG's WAL segments in two modes: streaming directly using
pg_receivexlog or as an archive_command to archive complete segments.

When PGHoard is used in streaming mode it keeps reading new segments from PG
and stores them in compressed & encrypted form in a queue ready to be
uploaded. The segments will stay there until they can be uploaded.

When PGHoard is used as an archive_command, it handles the operation
synchronously, so PG won't actually remove or recycle the WAL segment in
question until the command completes.

Postgres will keep running normally in both cases, but the files will be
queued in different places, compressed or uncompressed. This may cause your
disk to fill up eventually, but PGHoard will trigger an alert after a
configurable number of upload failures.
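
The queue-and-retry behaviour described above can be sketched roughly as follows. This is only an illustration under assumptions, not PGHoard's implementation: `UploadQueue`, `alert_threshold` and the `alerts` list are made-up names, and a real alert would go somewhere more useful than a list.

```python
from collections import deque
from typing import Callable, Deque, List


class UploadQueue:
    """Hypothetical retry queue; names and behaviour are assumptions."""

    def __init__(self, upload: Callable[[str], None], alert_threshold: int = 3) -> None:
        self.upload = upload
        self.alert_threshold = alert_threshold
        self.pending: Deque[str] = deque()
        self.consecutive_failures = 0
        self.alerts: List[str] = []

    def enqueue(self, segment: str) -> None:
        self.pending.append(segment)

    def drain(self) -> None:
        """Upload queued segments in order; stop at the first failure."""
        while self.pending:
            segment = self.pending[0]
            try:
                self.upload(segment)
            except OSError as exc:
                self.consecutive_failures += 1
                if self.consecutive_failures >= self.alert_threshold:
                    self.alerts.append(f"upload of {segment} failing: {exc}")
                return  # keep the segment queued and retry on the next call
            # Only drop the segment from the queue once the upload succeeded.
            self.pending.popleft()
            self.consecutive_failures = 0
```

Calling `drain()` periodically keeps segments queued while the storage is offline and uploads them in order once it comes back; the queue itself still grows on local disk in the meantime.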

------
anarazel
Do you prevent segments from being removed while they're not yet received by
pg_receivexlog? wal_keep_segments or replication slots?

~~~
melor
A replication slot can be used by defining it in the pghoard.json
configuration. However, the slot needs to be created manually, and (this is
important!) removed manually once it is no longer needed. We've been planning
to add more automatic replication slot management to PGHoard.
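
For the manual part, a minimal sketch of the SQL involved, wrapped in small Python helpers. `pg_create_physical_replication_slot` and `pg_drop_replication_slot` are standard PostgreSQL functions (9.4 and later); the helper names and the slot name `pghoard_slot` are made up for the example, and actually running the statements (with psql or any client library) is left out.

```python
import re

# PostgreSQL slot names may contain only lowercase letters, digits and underscores.
_SLOT_NAME = re.compile(r"[a-z0-9_]+")


def _check_slot_name(slot: str) -> None:
    if not _SLOT_NAME.fullmatch(slot):
        raise ValueError(f"invalid replication slot name: {slot!r}")


def create_slot_sql(slot: str) -> str:
    """Return the SQL that creates a physical replication slot."""
    _check_slot_name(slot)
    return f"SELECT pg_create_physical_replication_slot('{slot}')"


def drop_slot_sql(slot: str) -> str:
    """Return the SQL that drops a replication slot that is no longer needed."""
    _check_slot_name(slot)
    return f"SELECT pg_drop_replication_slot('{slot}')"


if __name__ == "__main__":
    print(create_slot_sql("pghoard_slot"))  # hypothetical slot name
    print(drop_slot_sql("pghoard_slot"))
```

Dropping matters because PostgreSQL keeps retaining WAL for a slot that nothing consumes any more, which will eventually fill the disk.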

~~~
anarazel
Good. Without archiving or slots in place, you really can't rely on such
backups...

