
Ask HN: How do you back up your site hosted on a VPS such as Digital Ocean? - joeclef
I have a production site running on Digital Ocean. I'm looking for a cheap and reliable backup solution. What are your suggestions? Thanks.
======
dangrossman
Write a little program in your favorite shell or scripting language that

* rsyncs the directories containing the files you want to back up

* mysqldumps/pg_dumps your databases

* zips/gzips everything up into a dated archive file

* deletes the oldest backup (the one with X days ago's date)

Put this program on a VPS at a different provider, on a spare computer in your
house, or both. Create a cron job that runs it every night. Run it manually
once or twice, then actually restore your backups somewhere to ensure you've
made them correctly.
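
A minimal sketch of such a script, assuming MySQL and hypothetical paths (adjust everything to your setup):

    #!/bin/sh
    # Hypothetical paths and database name -- adjust for your site.
    SRC="/var/www"
    DEST="/backups"
    KEEP_DAYS=14
    DATE=$(date +%Y-%m-%d)

    # 1. rsync the directories containing the files to back up
    rsync -a --delete "$SRC/" "$DEST/files/"

    # 2. dump the database (assumes credentials in ~/.my.cnf)
    mysqldump --single-transaction mydb > "$DEST/files/mydb.sql"

    # 3. zip everything up into a dated archive
    tar czf "$DEST/backup-$DATE.tar.gz" -C "$DEST" files

    # 4. delete archives older than KEEP_DAYS days
    find "$DEST" -name 'backup-*.tar.gz' -mtime +"$KEEP_DAYS" -delete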

~~~
drinchev
Yep. Here's an example that I use to upload a dump to Dropbox.

I don't delete and/or gzip my oldest uploads though.

    
    
        #!/bin/sh
    
        # Timestamped name for this dump (GNU date; %3N = milliseconds)
        DATE=$(date +%d-%m-%Y@%H:%M:%S.%3N)
        DB_USER="qux"
        DB_PASS="foo"
        DB_NAME="bar"
        DROPBOX_TOKEN="baz"
    
        # Dump the database, then PUT it to Dropbox (the old v1 files_put
        # endpoint); curl appends the local filename since the URL ends in "/"
        /usr/bin/mysqldump -u${DB_USER} -p${DB_PASS} ${DB_NAME} > /tmp/${DATE}.sql
        /usr/bin/curl -H "Authorization: Bearer ${DROPBOX_TOKEN}" https://api-content.dropbox.com/1/files_put/backup/ -T /tmp/${DATE}.sql

~~~
gcr
Careful! If someone hacks your server, they now get your Dropbox account.

One alternative is to put these backups into S3 using pre-signed requests
rather than Dropbox. An S3 pre-signed request gives permission only to upload
files, perhaps only to a certain location in a certain bucket.

It's a bit harder to set up, but the shell script will look almost the same.
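
For example, if a trusted machine generates a presigned PUT URL ahead of time (the URL below is a placeholder), the upload step on the server is just:

    # PRESIGNED_URL was generated elsewhere (e.g. with an AWS SDK) and only
    # grants permission to PUT this one object until the signature expires.
    PRESIGNED_URL="https://my-bucket.s3.amazonaws.com/backups/db.sql?X-Amz-..."
    curl -T /tmp/backup.sql "$PRESIGNED_URL"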

~~~
meowface
You can actually set up app folders in Dropbox so that a particular API key
effectively chroots you to that folder. The attacker would only get the
backups.

~~~
jaffa214525
Which is literally the worst scenario. An attacker owns your box and now your
backups.

~~~
chatmasta
How do you avoid that with any backup service?

~~~
toomuchtodo
Push to an S3 bucket with upload-only credentials and versioning turned on.

Your master account (or superuser IAM account, if you're paranoid) gives you
read/write after 2FA login, but you could share your backup creds with the
world and never have your backups pulled out or overwritten.

Use S3 lifecycle rules to expire backup objects after X days. Data transfer in
is free, the operation requests are pennies per thousand, and only the
bandwidth (10 cents/GB) to retrieve the backups when you need to perform a
restore is expensive (even then, still very cheap).

Also, by storing in S3, you can back up and restore from anywhere.
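
A sketch of that setup with the AWS CLI (bucket name and retention period are placeholders):

    # Keep old versions when objects are overwritten or deleted
    aws s3api put-bucket-versioning --bucket my-backups \
        --versioning-configuration Status=Enabled

    # Expire backup objects after 30 days
    aws s3api put-bucket-lifecycle-configuration --bucket my-backups \
        --lifecycle-configuration '{"Rules":[{"ID":"expire-backups","Status":"Enabled","Filter":{"Prefix":""},"Expiration":{"Days":30}}]}'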

------
yvan
The simplest way for us is to use rsync; there's a service, more than a decade
old, that is just perfect for offsite backup:
[http://rsync.net/index.html](http://rsync.net/index.html)

We basically create a backup folder (our assets and a MySQL dump), then rsync
it to rsync.net. Our source code is already in git, so it's effectively backed
up on GitHub and on every developer's computer.

On top of that, rsync has very clear and simple documentation, so it's quick
to set up on any Linux distro.

~~~
rsync
Glad to hear it's working for you.

I hope that you know that your account, like all accounts at rsync.net, is on
a ZFS filesystem.

This is important because it means that inside your account, in the .zfs
directory, are 7 daily "snapshots" of your entire rsync.net account, free of
charge.

Just browse right in and see your entire account as it existed on those days
in the past. No configuration or setup necessary. Also, they are
immutable/readonly so even if an attacker gains access to your rsync.net
account and uses your credentials to delete your data, the snapshots will
still be there.

------
Kjeldahl
DigitalOcean has a droplet backup solution priced at 20% of the monthly cost
of your droplet. Doesn't get much easier than that, if you can afford it. For
a small droplet ($10/month) that's a full backup of everything for two bucks a
month.
[https://www.digitalocean.com/community/tutorials/understandi...](https://www.digitalocean.com/community/tutorials/understanding-digitalocean-droplet-backups)

~~~
batuhanicoz
It's always good to have your backup off-site. If something happens to
DigitalOcean servers, your backups will be gone too.

~~~
ryanSrich
I imagine DO has some contingency plan to prevent both your server and the
backup being down at the same time.

~~~
funkyy
Unless a bug in their system, a hack, or human error deletes all the droplets
from your account. And yes, I have heard stories like that from other hosting
companies, where a human error caused the account to be deleted and the data
removed.

~~~
j_jochem
DO does that too:
[https://news.ycombinator.com/item?id=11036554](https://news.ycombinator.com/item?id=11036554)

------
no_protocol
Whatever strategy you use, make sure you test the process of recreating the
server from a backup, so you know you will actually be able to recover. You'll
also have an idea of how long it will take, and you can create scripts to
automate the entire flow so you don't have to figure it all out while you're
frantic.

I use tarsnap, as many others in this thread have shared. I also have the
Digital Ocean backups option enabled, but I don't necessarily trust it. For
the handful of servers I run, the small cost is worth it. Tarsnap is
incredibly cheap if most of your data doesn't change from day to day.

~~~
budhajeewa
Is there any reason to not trust DigitalOcean's back up system?

~~~
no_protocol
If their entire site disappeared your backups would be gone too. You want to
be able to move to a different provider if necessary.

------
rsync
Some of our customers have already recommended rsync.net to you - let me
remind folks that there is a "HN Readers Discount" - just email us[1] and ask
for it.

[1] info@rsync.net

------
kumaraman
I use AWS S3 for this as the storage prices are so cheap, at $0.03 per GB. I
recommend a utility called s3cmd, which is similar to rsync in that you can
back up directories. I just have this set up with a batch of cron jobs which
dump my databases and then sync the directories to S3 weekly.
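
The cron jobs boil down to something like this (bucket and paths are placeholders):

    # Dump the database, then mirror the backup directory to S3
    mysqldump --single-transaction mydb | gzip > /backups/mydb-$(date +%F).sql.gz
    s3cmd sync --delete-removed /backups/ s3://my-backup-bucket/backups/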

~~~
AgentME
I use duply (a simpler CLI front-end to duplicity) for doing encrypted
incremental backups to S3.

The only annoying thing is that duplicity uses an old version of the boto S3
library that errors out if your signatures tar file is greater than 5 GB,
unless you add `DUPL_PARAMS="$DUPL_PARAMS --s3-use-multiprocessing "` to your
duply `conf` file. Took me days to figure that out.

~~~
aliencat
Duplicity is great! I use duply to back up my photos to Backblaze B2, which is
really cheap: $0.005/GB/month.

------
stevekemp
I see a lot of people mentioning different tools, but one thing you'll
discover if you need to restore in the future is that it is crucial to
distinguish between your "site" and your "data".

My main site runs a complex series of workers, CGI scripts, and daemons. I can
deploy them from scratch onto a remote node via Fabric & Ansible.

That means I don't need to back up the whole server "/" (although I do!).
Since I can set up a new instance immediately, the only data that needs to be
backed up is the contents of some databases, and for that I run an offsite
backup once an hour.

------
AdamGibbins
I use config management to build the system (Puppet in my case, purely due to
experience rather than strong preference) so it's fully reproducible. I push
my data with borg
([https://github.com/borgbackup/borg](https://github.com/borgbackup/borg)) to
rsync.net
([http://rsync.net/products/attic.html](http://rsync.net/products/attic.html))
for offsite backup.
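
A minimal borg workflow against a remote repo (host and paths are placeholders; rsync.net documents a special `--remote-path` for their borg binary, so check their docs):

    # One-time repo creation; repokey keeps encryption client-side
    borg init --encryption=repokey user@usw-s001.rsync.net:backups

    # Nightly archive, plus pruning of old ones
    borg create --stats user@usw-s001.rsync.net:backups::'{hostname}-{now}' /etc /var/www
    borg prune --keep-daily=7 --keep-weekly=4 user@usw-s001.rsync.net:backups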

------
xachen
www.tarsnap.com - it's pay as you go, encrypted, and super simple to use and
to script with cron.
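
The cron entry really is about that simple (paths are placeholders; note that % must be escaped in a real crontab):

    # Nightly encrypted, deduplicated archive
    tarsnap -c -f "backup-$(date +%Y-%m-%d)" /var/www /etc
    # Archives are deleted by name when no longer needed
    tarsnap -d -f "backup-2016-01-01"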

~~~
buro9
^ this.

Github takes care of code and config.

AWS S3 takes care of uploaded static files.

But Tarsnap takes care of my database backups.

The only thing to be aware of is that restore times can be very slow.

------
touch_o_goof
All automated, with one copy to AWS, one copy to Azure, and an scp copy that
goes to my home server. Rolling 10, with every 10th backup put into cold
storage. And I use a different tool for each, just in case.

------
bretpiatt
For a static site, put it in version control and keep a copy of your full site
and deployment code.

For a database-driven dynamic site, or a site with content uploads, you can
also use your version control via a cron job to upload that content. Have the
database journal out the tables you need to back up before syncing to your
DVCS host of choice.

If you're looking for a backup service to manage multiple servers with
reporting, encryption, deduplication, etc., I'd love your feedback on our
server product:
[https://www.jungledisk.com/products/server](https://www.jungledisk.com/products/server)
(starts at $5 per month).

------
billhathaway
Remember to have automated restore testing that validates restores are
successful and the data "freshness" is within a reasonable period of time,
such as last updated record in a database.

Lots of people only do a full test of their backup solution when first
installing it. Without constant validation of the backup->restore pipeline, it
is easy to get into a bad situation and not realize it until it is too late.
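
A sketch of the idea, with hypothetical table and column names: restore the newest dump into a scratch database, then check how old the newest record is.

    # Assumes MySQL credentials in ~/.my.cnf
    LATEST=$(ls -t /backups/*.sql.gz | head -n 1)
    mysql -e 'DROP DATABASE IF EXISTS restore_test; CREATE DATABASE restore_test;'
    gunzip -c "$LATEST" | mysql restore_test

    # Alert if the newest record is more than 24 hours old
    AGE=$(mysql -N -e 'SELECT TIMESTAMPDIFF(HOUR, MAX(created_at), NOW()) FROM restore_test.orders')
    [ "$AGE" -le 24 ] || echo "newest record is ${AGE}h old" | mail -s 'restore test failed' you@example.com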

~~~
shostack
Novice here. How would you go about testing the integrity and ability to
restore from the backup as part of an automated backup/restore pipeline?

------
darkst4r
[http://tarsnap.com](http://tarsnap.com) + bash scripts for mysqldump and
removing old dumps + cron

------
pmontra
On OVH I rsync to another VPS in a different data center. I pick the lowest
priced VPS with enough space. I also rsync to a local disk at my home. I would
do the same with DO.

OVH has a premium backup-by-FTP service, but the FTP server is accessible only
by the VPS it backs up. That's pretty useless, because in my experience, when
an OVH VPS fails, technical support has never been able to bring it back
online.

------
jasey
I used Duplicity[1] to back up to Amazon S3.

[1] [http://duplicity.nongnu.org/](http://duplicity.nongnu.org/)

[http://mindfsck.net/incremental-backups-amazon-s3-centos-usi...](http://mindfsck.net/incremental-backups-amazon-s3-centos-using-duplicity/)
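
A minimal duplicity invocation (bucket name is a placeholder; duplicity encrypts with GnuPG using PASSPHRASE):

    export AWS_ACCESS_KEY_ID=...       # placeholders
    export AWS_SECRET_ACCESS_KEY=...
    export PASSPHRASE=...

    # Incremental backup, with a fresh full backup every 30 days
    duplicity --full-if-older-than 30D /var/www s3+http://my-backup-bucket/www
    # Drop backup chains older than 90 days
    duplicity remove-older-than 90D --force s3+http://my-backup-bucket/www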

------
jenkstom
Backupninja. It handles backing up to remote servers via rdiff, so I have
snapshots going back as far as I need them. The remote server is at another
provider. As long as I have SSH login via key to the remote server enabled,
backupninja will install the dependencies on the remote server for me.

~~~
kzisme
How much does something like that cost to maintain? (Backups mainly, since I'm
assuming your DO droplet is ~$5/mo.)

------
Osiris
I use attic backup (there's a fork called borg backup). It runs daily to make
incremental backups to a server at my home.

For database, I use a second VPS running as a read only slave. A script runs
daily to create database backups on the VPS.

------
2bluesc
I use a daily systemd timer on my home machine to remotely back up the data on
my VPS. From there, my home machine backs up a handful of data from different
places to a remote server.

Make sure you check the status of backups; I send journald and syslog output
to Papertrail[0] and have email alerts on failures.

I manually verify the backups at least once a year, typically on World Backup
Day [1].

[0] [https://papertrailapp.com/](https://papertrailapp.com/) [1]
[http://www.worldbackupday.com/en/](http://www.worldbackupday.com/en/)
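
The timer side is a pair of user units along these lines (unit names and paths are assumptions):

    # ~/.config/systemd/user/vps-backup.service
    [Unit]
    Description=Pull nightly backup from the VPS

    [Service]
    Type=oneshot
    ExecStart=/usr/bin/rsync -a vps.example.com:/var/backups/ %h/vps-backups/

    # ~/.config/systemd/user/vps-backup.timer
    [Unit]
    Description=Daily VPS backup

    [Timer]
    OnCalendar=daily
    Persistent=true

    [Install]
    WantedBy=timers.target

Enable it with `systemctl --user enable --now vps-backup.timer`.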

------
spoiledtechie
I use [https://www.sosonlinebackup.com](https://www.sosonlinebackup.com).

Stupid simple and stupid cheap. Install, select directories you want backed
up, set it and forget it.

All for $7.00 a month.

------
stephenr
How is this at 70+ comments without a mention of rsync.net?

Collect your files, rsync/scp/sftp them over.

Read only snapshots on the rsync.net side means even an attacker can't just
delete all your previous backups.

------
aeharding
Because I use Docker Cloud, I use Dockup to back up a certain directory daily
to S3 from my DO VPS.
[https://github.com/tutumcloud/dockup](https://github.com/tutumcloud/dockup)

I just use a simple scheduled AWS lambda to PUT to the redeploy webhook URL.

I use an IAM role with put-only permissions to a certain bucket. Then, if your
box is compromised, the backups cannot be deleted or read. S3 can also be set
up to automatically remove files older than X days... also very useful.

------
geocrasher
I run a couple of Virtualmin web servers which do Virtualmin-based backups
(each website, with all its files/email/DBs/zones etc., is backed up into a
single file, very much like how cPanel does its account backups), and those
are rsynced (via cron job) to my home server, which runs two mirrored 1 TB
disks. A simple bash script keeps a few days of backups, plus a weekly backup
that I keep two copies of. Overall pretty simple, and it's free since I'm not
paying for cloud storage.

------
colinbartlett
The sites I host on DigitalOcean are all very simple Rails sites deployed with
Dokku. The source code is on GitHub, and the databases I back up hourly to S3
with a very simple cron job.

------
mike503
Bash script to dump all DBs locally and tar up any config files.

Then the script sends it to S3 using aws s3 sync. If versioning is enabled you
get versioning applied for free, and you can ship your actual data and
webdocs-type stuff up extremely fast; it's browsable via the console or tools.
Set a retention policy as you desire. The industry's best durability, and
nearly the cheapest too.
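
A sketch of that pipeline (names are placeholders):

    # Dump all databases and grab configs
    mysqldump --all-databases --single-transaction | gzip > /backups/all-$(date +%F).sql.gz
    tar czf /backups/etc-$(date +%F).tar.gz /etc/nginx /etc/mysql

    # Sync to a versioned bucket
    aws s3 sync /backups/ s3://my-backup-bucket/$(hostname)/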

------
kevinsimper
This is the same question I had [1], but just asked in "how can I outsource
this cheap" instead of "how can I do this cheap". I also use docker, so I
would only need to get a hosted database.

[1]
[https://news.ycombinator.com/item?id=12659437](https://news.ycombinator.com/item?id=12659437)

------
dotancohen
I see lots of great suggestions for backup hosts and methods, but I don't see
anybody addressing encrypting said backups. I'm uncomfortable with rsync.net /
Backblaze / etc having access to my data. What are some good ways to encrypt
these multiple-GB backups before uploading them to a third-party backup
server?

~~~
ing33k
Check this
[https://github.com/backup/backup](https://github.com/backup/backup)

~~~
dotancohen
That looks really nice, thanks. However, I've sworn off Ruby as installing and
maintaining the whole Gems stack is such a pain. I'd rather write a shell
script, or conjure up some Python.

~~~
dotancohen
Just to amuse myself, I ran `sudo gem install backup` on my CentOS 7 desktop.
The CPU has been pegged for over a minute, and I have no idea what it's doing.
It's not using much memory, and I'm not sure whether it is making, or hanging
on, network requests.

EDIT: After a few minutes of warming the CPU, Ruby failed, but at least with
an informative error message.

~~~
ing33k
As a matter of choice, I always use rvm to install Ruby. I've never faced any
issue related to installation.

------
extesy
I currently use
[https://github.com/backup/backup](https://github.com/backup/backup) on my
Digital Ocean instances, but
[https://github.com/bup/bup](https://github.com/bup/bup) also looks nice.

------
benbristow
What type of site is it?

~~~
jagger27
Important question. If it's a Wordpress site, then all you need to back up is
the theme and your MySQL db. If it's a static site then just use rsync or sync
to a git service.

~~~
fergbrain
And your /wp-content/uploads/ folder... it's probably worth backing up
everything in /wp-content/, since some plugins create additional directories.

------
moreentropy
I use restic[1] to make encrypted backups to S3 (a self-hosted Minio service
in my case).

I can't praise restic enough. It's fast, secure, easy to use and set up
(golang) and the developer(s) are awesome!

[1] [https://restic.github.io/](https://restic.github.io/)
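
A minimal restic setup against an S3-compatible endpoint (host, bucket, and paths are placeholders):

    export AWS_ACCESS_KEY_ID=...      # placeholders
    export AWS_SECRET_ACCESS_KEY=...
    export RESTIC_PASSWORD=...

    restic -r s3:https://minio.example.com/backups init
    restic -r s3:https://minio.example.com/backups backup /var/www /etc
    restic -r s3:https://minio.example.com/backups forget --keep-daily 7 --prune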

------
wtbob
I have duplicity set up, sending encrypted backups to S3. It works pretty
well, and is pretty cheap.

------
educar
If you use Docker to deploy, see cloudron.io. You can install custom apps, and
it takes care of encrypted backups to S3. It automates Let's Encrypt as well.

------
00deadbeef
I have BackupPC running on another system

[http://backuppc.sourceforge.net/](http://backuppc.sourceforge.net/)

------
bedros
I use borg backup to a backup drive formatted as btrfs, then use the btrfs
snapshot feature to create a snapshot after every backup.

------
voycey
I really rate Jungledisk, you can choose S3 or Rackspace Cloudfiles as your
storage medium, very much set it and forget it!

------
ausjke
There are many ways to back up, but I always encrypt my backups rather than
just copying them somewhere.

------
yakamok
I run a Python/shell program to rsync and collect what I want backed up into
one folder. I then compress it, GPG-encrypt it, and send it to my backup
server.
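
Roughly this pipeline (paths and host are assumptions; for cron, gpg needs --batch with a passphrase file instead of a prompt):

    # Gather, compress, encrypt, ship
    rsync -a /var/www /etc/nginx /srv/staging/
    tar czf - -C /srv staging \
      | gpg --symmetric --cipher-algo AES256 -o "/tmp/backup-$(date +%F).tar.gz.gpg"
    scp "/tmp/backup-$(date +%F).tar.gz.gpg" backup@backup.example.com:/backups/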

------
edoceo
I make archives and put them in S3.

Use pg_dump and tar, then just s3cp.

------
chatterbeak
Here's how we do it:

All the databases and other data are backed up to s3. For mysql, we use the
python mysql-to-s3 backup scripts.

But the machines themselves are "backed up" by virtue of being able to be
rebuilt with saltstack. We verify through nightly builds that we can bring a
fresh instance up, with the latest dataset restored from s3, from scratch.

This makes it simple for us to switch providers, and we can run our
"production" instances locally on virtual machines running the exact same
version of CentOS or FreeBSD we use in production.

------
X86BSD
I don't know what the OP is running OS-wise, but if it's any modern Unix
variant, it uses ZFS, and a simple ZFS send/receive would be perfect. There
are tons of scripts for that and for replication.

If you're not using a modern Unix variant with ZFS... well, there isn't a good
reason why you would be.
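
A send/receive replication, for example, is just this (pool/dataset names are placeholders):

    # One-time full replication to the backup host
    zfs snapshot tank/data@base
    zfs send tank/data@base | ssh backup@host.example.com zfs receive backup/data

    # Thereafter, send only the delta between snapshots
    zfs snapshot tank/data@$(date +%F)
    zfs send -i tank/data@base tank/data@$(date +%F) \
      | ssh backup@host.example.com zfs receive backup/data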

~~~
LeoPanthera
I am amused by your subtle-ish attempts to brand Linux as "not modern".

~~~
X86BSD
Interesting that you chose to assume Linux there. Your words, not mine. DO
offers other OS options than that which also don't have modern file systems -
OpenBSD, etc.

------
nwilkens
We have cheap, reliable storage servers at
[https://mnx.io/pricing](https://mnx.io/pricing) -- $15/TB. Couple our
storage server with R1Soft CDP (r1soft.com), Attic, rsync, or innobackupex,
etc.

You can also use [https://r1softstorage.com/](https://r1softstorage.com/) and
receive storage + R1soft license (block based incremental backups) -- or just
purchase the $5/month license from them and use storage where you want.

