> Its whole purpose in life is to automatically detect the version of PostgreSQL used in the existing PostgreSQL data directory, and automatically upgrade it (if needed) to the latest version of PostgreSQL.
In a small startup...
* If the data is mission-critical and constantly changing, PostgreSQL is a rare infra thing for which I'd use a managed service like AWS RDS, rather than just Debian Stable EC2 or my own containers. The first time I used RDS, the company couldn't afford to lose an hour of data (it could destroy confidence in an enterprise customer's pilot project), and without RDS, I didn't have the time/resources to be nine-9s confident that we could do PITR if we ever needed to.
* If the data is less-critical or permits easy sufficient backups, and I don't mind a 0-2 year-old stable version of PG, I'd probably just use whatever PG version Debian Stable has locked in. And just hold my breath during Debian security updates to that version.
(Although I think I saw AWS has a cheaper entry level of pricing for RDS now, which I'll have to look into next time I have a concrete need. AWS pricing varies from no-brainers to lunacy, depending on specifics, and specifics can be tweaked with costs in mind.)
RDS is very useful for companies which are big enough to employ, say, 2 programmers, but still too small to employ a DBA.
The hard part of running a database, in my experience, isn't setting up or running it. The hard part isn't even configuring backups.
The hard part is noticing that your backups have been broken for years, before you actually need to restore from them. Yes, yes, you know how to do this correctly. But you delegated it to the sysadmin, the sysadmin subtly broke the backup scripts, the scripts have been silently doing nothing for 18 months, and then the sysadmin got a new job.
This is the main value proposition of RDS: Your data will be backed up, your backups will be restorable, and most of your normal admin tasks can be performed by pushing a button.
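A minimal sketch of one way to notice the "silently doing nothing" failure early, assuming backups land as files in a single directory. The function names and the 26-hour threshold are made up for illustration, and a fresh file only shows that something was written, not that it can be restored.

```python
import os
import time


def newest_backup_age_seconds(backup_dir, now=None):
    """Age in seconds of the newest non-empty file, or None if there is none."""
    now = time.time() if now is None else now
    mtimes = [
        entry.stat().st_mtime
        for entry in os.scandir(backup_dir)
        # Ignore 0-byte files: a script that "ran" but wrote nothing
        # should not count as a backup.
        if entry.is_file() and entry.stat().st_size > 0
    ]
    return now - max(mtimes) if mtimes else None


def backups_look_stale(backup_dir, max_age_seconds=26 * 3600, now=None):
    """True if there is no usable backup or the newest one is too old."""
    age = newest_backup_age_seconds(backup_dir, now)
    return age is None or age > max_age_seconds
```

Run from cron or a monitoring job, a `True` result would be the cue to page someone long before a restore is needed.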
Exactly, on the backups. There was off-the-shelf open source PITR software that I looked at and could've configured, but I couldn't justify spending all the time to test that setup, given that (supposedly) RDS was rock-solid turn-key. There were other engineering and ops things that needed my time more.
In my startup (until our exit) I've run Postgres on my own, with streaming backups, daily backup tests (automatic restores and checks), offsite backups and (slow) upgrades.
Data wasn't mission critical and customers could live with 5min downtimes for upgrades once a year.
No problem for years. When we had hosted Mongo, we had more problems.
If I have the money, I'll use a managed database. But running Postgres in a startup up to $10m ARR / ~1TB of data doesn't seem like a problem.
A script creates a database instance from a container, restores the backup into it, runs some checks (e.g. table sizes are above some value, the audit table contains data up to the backup point, etc.), and sends out an email that everything looks OK.
Unzipping and decrypting the backup also confirms that encryption and zipping worked, and that you did not end up with a 0-byte backup file (because of permissions etc.).
The same goes, of course, for snapshots and backups in the cloud. I have several clients who had backup problems because of misconfiguration in AWS/GCP.
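The checks described above could look roughly like the sketch below. It assumes a gzip-compressed dump and assumes the row counts and the newest audit timestamp have already been queried from the restored instance. All function and parameter names are hypothetical, and decryption, container setup, and the email step are left out.

```python
import gzip
import os
import zlib
from datetime import datetime, timedelta


def archive_problems(path):
    """Return a list of problems found in a gzip-compressed backup file."""
    if not os.path.exists(path):
        return ["backup file is missing"]
    if os.path.getsize(path) == 0:
        return ["backup file is 0 bytes"]
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end so truncated or corrupt archives raise an error.
            while fh.read(1024 * 1024):
                pass
    except (OSError, EOFError, zlib.error) as exc:
        return [f"backup does not decompress: {exc}"]
    return []


def restore_problems(row_counts, min_rows, last_audit_at, backup_taken_at,
                     max_gap=timedelta(minutes=10)):
    """Compare facts read from the restored database against expectations.

    row_counts: {table: rows found after restore}
    min_rows: {table: smallest row count considered plausible}
    last_audit_at: newest audit-table timestamp, or None if the table is empty
    backup_taken_at: when the backup was taken
    """
    problems = []
    for table, minimum in min_rows.items():
        actual = row_counts.get(table, 0)
        if actual < minimum:
            problems.append(f"{table}: {actual} rows, expected at least {minimum}")
    if last_audit_at is None:
        problems.append("audit table is empty")
    elif backup_taken_at - last_audit_at > max_gap:
        problems.append(f"audit data ends at {last_audit_at}, too long before the backup")
    return problems
```

An empty list from both functions is the "everything looks OK" case; anything else is what the notification email should contain.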
Super. I have been thinking of doing something similar for my production DB systems as well, but was struggling to come up with an efficient way to test restoration. This gives me a starting point.
You will be very happy when restoration testing fails. Better to find out in a test than - as some of my clients have - when restoring a backup in an emergency.
While true, you'd think something as important as a DB would be worth some kind of nightly automated testing/verification of a restore procedure (or multiple).
In my experience bootstrapping a startup, RDS is really expensive. Definitely a nice thing, but if you're a one-person show trying to get to ramen profitable, RDS might not really make sense. If you are VC-backed, then go ahead!
Many things. Here's a list off the top of my head:
MySQL RDS acts like it's just regular MySQL minus SUPER privilege. So it will not accept various important inputs. For example, GTID state will be rejected. Even just pre-GTID log sequence numbers are barely supported. Various SQL SECURITY things are rejected. The suggestions for importing data using any of these features vary from scary nonsense (just ignore GTID numbers!) to absurd hacks (run sed on your mysqldump output to remove SQL SECURITY!).
The docs basically don't acknowledge that GTID matters in an RDS-to-or-from-non-RDS setup. The suggestions don't seem like they deserve to work. (Azure at least has some documentation for GTID, but it involves using fancy barely-documented APIs just to import your data.)
Replication will just break if you accidentally use a feature that RDS can't handle.
For something that could easily cost hundreds to thousands of dollars per month, I expected to be able to run an unmodified mysqldump and have RDS accept the output, process it correctly, and take my money. Nope, didn't happen.