
Arq 4 is out – Mac backup to S3/Glacier - michaelx
http://www.haystacksoftware.com/blog/2014/03/arq-4-is-out/
======
rsc
I discovered Arq a couple years ago and have been using it ever since. I
highly recommend it. The thing that sold me was that the file format is
documented and I was able to write code to read the backups directly. So even
if somehow Arq does disappear, I can still read my data.
[http://godoc.org/code.google.com/p/rsc/arq](http://godoc.org/code.google.com/p/rsc/arq)

~~~
mietek
Is this compatible with the latest version?

~~~
sreitshamer
Mostly. I need to update it a bit. I will ASAP.

------
rsync
As always, we're very happy to see that in addition to
S3/Glacier/DreamCloudStartupOfTheWeek, Arq supports plain old SFTP.

This means that, of course, it works perfectly with rsync.net and that we have
yet another chance to offer the HN readers discount, which you may email us to
find out about.

~~~
abcd_f
Please try and contain yourselves when you see "backup" on HN.

------
rglover
Quick endorsement: buy this. Arq has allowed me to be insanely careless about
backups and has never failed. It makes the whole backup thing cost-effective,
too, with Amazon being dirt cheap these days.

------
pudquick
Can someone using S3 Glacier as a backup destination please provide some
concrete costs? Even if no retrieval is involved, I'd just love to see: "I pay
$XYZ and am backing up ###GB worth of data."

I see the calculator floating around in the comments, but it's not formatted
for a backup use case (i.e. a set storage size with XYZ% churn for changed
files, or continuously expanding snapshots with aged-snapshot deletion).

I've always been curious about using it for backup storage. The retrieval
costs are expensive enough (in the time frames I'd care about for a personal
computer) that I think I'll still avoid that aspect for the time being.
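As a rough sketch of the "set storage size with churn" case, here is a hypothetical back-of-the-envelope model (the function and its parameters are mine, not Arq's or Amazon's), assuming the ~$0.01/GB-month Glacier storage price quoted elsewhere in this thread and ignoring request, transfer, and retrieval fees entirely:

```python
def monthly_storage_cost(base_gb: float, churn_pct: float,
                         retention_months: int,
                         price_per_gb: float = 0.01) -> float:
    """Steady-state monthly bill: the base data set plus churned
    old versions that are retained for `retention_months`."""
    churned_gb = base_gb * (churn_pct / 100.0) * retention_months
    return (base_gb + churned_gb) * price_per_gb

# Example: 500 GB with 5% monthly churn, old versions kept for 3 months.
print(f"${monthly_storage_cost(500, 5, 3):.2f}/month")  # $5.75/month
```

Real bills add per-request and early-deletion fees, so treat this strictly as a lower bound.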

~~~
jhart3333
It's cheap until you have to grab a bunch of stuff back all at once. Then they
have you by the short hairs.

~~~
digitalengineer
I think if your house/company burned down and you lost your main PC and the
external backup, you'd be glad to pay $0.10 for every gigabyte you can
retrieve.

~~~
saurik
Or potentially more than 20x that much, if you are downloading it quickly
(are you actually aware of the Glacier pricing model? Where did you obtain
$.10/GB?).

[https://news.ycombinator.com/item?id=7337181](https://news.ycombinator.com/item?id=7337181)

------
leejoramo
Arq has been part of my backup systems since version 2. This upgrade looks
very good, especially the expansion beyond relying solely on Amazon S3 to a
number of other services, including SSH.

Having used the old Glacier support, I expect the new S3 Glacier Lifecycle
approach to be much better.

I am wondering about when (if?) the open source arq_restore and format
documentation will be updated.

~~~
OWaz
Do you mind elaborating on what your total backup solution is? I've been
looking at Arq but am not sure what it does not do and what I would need to
fill in the gaps. Currently I rely on Time Machine backing up to three
different drives, keeping one of those drives at the office.

~~~
leejoramo
I will ignore what I do for server backups, and I will focus on my MacBook. I
run my web development business and my personal life off of this system.

The software I use: CrashPlan, Arq, DropBox, Carbon Copy Cloner.

CrashPlan: I have it backing up my data and configuration (/Users, /Library).
I have a bunch of regex restrictions to avoid backing up various files such as
caches, VMware images, etc. CrashPlan backs up to their cloud servers every 15
minutes. When I am at home (where I work from), it also backs up to a local
copy of CrashPlan on a server.

Arq: I am doing daily backups with Arq. These now go to Glacier for long-term
/ last-resort backups. It only backs up my /Users, with heavy restrictions on
which files.

DropBox: I have many of my documents stored in Dropbox with the PackRat
feature to keep copies of every version and deletion. I don't consider DropBox
to be backup by itself, but I often find it is much faster to find and restore
something via Dropbox than other methods. I also take care about the types of
data I put on Dropbox.

Carbon Copy Cloner: as I mentioned in another part of this thread, I think
SuperDuper is better for most people. However, I do use CCC's ability to
remotely do a bootable bare-metal backup to my home office server. When I
travel, I typically take an external backup drive with a current mirror of my
system.

I don't use Apple's Time Machine. I think it is a good choice for most home
users. As Apple has added more features to Time Machine, I do think about
adding it to my mix.

That covers most things. I do have some things under SVN or Git, which could
be considered another layer of backup.

Currently, the biggest pain point for me in backups is VMware images. I
currently have 4 Linux and 3 Windows images on this system, and they can
generate a huge amount of data needing to be backed up every time they are
used.

~~~
Stratoscope
> Currently, the biggest pain point for me in backups is VMWare images. I
> currently have 4 Linux and 3 Windows images on this system, and they can
> cause a huge amount of data needing to be backed up every time they are
> used.

This is where a sector-by-sector backup program shines.

I don't know what to recommend on the Mac, but on Windows, ShadowProtect is
pretty wonderful. It backs up only changed sectors - update 10MB in a 10GB
file and it only copies that 10MB - and it's insanely fast.

Even with a file-by-file backup, one thing you can do for VMware images is to
take a snapshot. After you take a snapshot, further changes to the VM don't go
into the large .vmdk file you just snapshotted, they go into a new,
potentially much smaller .vmdk file, so your next incremental backups may be
much smaller.
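Taking that snapshot can be automated as a pre-backup step. A minimal sketch using VMware's `vmrun` command-line tool (bundled with Fusion/Workstation); the .vmx path in the example is hypothetical:

```python
import datetime
import subprocess

def snapshot_name(day: datetime.date) -> str:
    # A dated label, e.g. "pre-backup-2014-03-12".
    return "pre-backup-" + day.isoformat()

def take_pre_backup_snapshot(vmx_path: str) -> None:
    # "vmrun snapshot" freezes the current .vmdk; later changes go into
    # a new delta file, so the next incremental backup stays small.
    name = snapshot_name(datetime.date.today())
    subprocess.run(["vmrun", "snapshot", vmx_path, name], check=True)

# take_pre_backup_snapshot("/Users/me/VMs/dev.vmx")  # hypothetical path
```

Old snapshots still need pruning eventually, since each delta file adds disk overhead inside the VM directory.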

~~~
ojilles
Why not just set up backups from inside the VMs, while having the base VM
image backed up somewhere (once) as well?

~~~
leejoramo
I have sometimes done this with Linux VM's that I use for local development.

Usually, the VM disk images fall somewhere between applications and
configuration files in backup importance, and are not as important as my work
and data files.

The problem with VMs is not just the quantity of data that needs to be backed
up, but the overall amount of data that needs to be evaluated. I think that
CrashPlan does a pretty good job of copying just the changed data in a disk
image, but it has to do a HUGE amount of processing on every backup. Therefore
VMs are hard to fit in with the remote versioned backups of CrashPlan and Arq.

I do back these up via Carbon Copy Cloner when I mirror the entire drive.

------
AdamGibbins
Wonderful piece of software - highly recommended. Awesome upgrade also, have
been waiting for S3 alternatives for a long time.

It's unfortunate a few things appear to be backwards: why can you include
wifi APs, yet not exclude them, despite the example suggesting you exclude
tethered devices?

Likewise, why can you email on success, but not on failure?

~~~
mjmsmith
Seriously? There's still no way to get notified if a backup fails?

~~~
sreitshamer
It emails on both success and failure.

------
egonschiele
Could you change the upgrade system on macs please? I got a message telling me
to install the latest update -- so I did. I was on a paid version of Arq 3,
and now I'm on a trial version of Arq 4 :-/

~~~
sreitshamer
Didn't the upgrade text explain that it was a paid upgrade? I tried to make
that as clear as I could.

If you want to stick with Arq 3 you can. Delete Arq 4. Download Arq 3
([http://www.haystacksoftware.com/arq/Arq_3.3.4.zip](http://www.haystacksoftware.com/arq/Arq_3.3.4.zip)).
Launch Arq 3. You'll have to find your old backup set under "Other Backup
Sets", select it, and click the "Adopt This Backup Set" button. Sorry for the
hassle.

~~~
northwoods
Agree that this should have been presented differently - something other than
the normal update mechanism, which becomes routine over time. This is a major
update with a fee associated - the 'upgraded' user is presented with an
'unlicensed' copy of Arq4 and continual purchase dialogs until they get out
their credit card and pay again.

------
tunesmith
I've regularly been curious about this, but Crashplan has a stable reputation
and seems much less expensive for large backups. For those who have researched
Crashplan, why did you choose Arq instead?

~~~
leejoramo
As I have replied to several questions in this thread saying that I use both
CrashPlan and Arq for remote backups, let me give my rationale.

CrashPlan: I have used it since soon after they first appeared for the Mac.
They have really strong compression and de-duplication to minimize and speed
up data transfers. I personally use their consumer and small-business
solutions. However, I also maintain their CrashPlan PROe enterprise backup for
several clients. The fact that they have a very strong enterprise product
gives me a great deal of trust in the quality of CrashPlan's work. I think
they have the best solution I have used for a notebook that is on the move. I
back up to both their remote servers and to my own home office server. Thus, I
have the option to quickly restore from my own local server or their much
slower remote servers, AND I can have CrashPlan's next-day service send me a
copy of my data on a disk drive. I do wish they could get rid of the Java
dependency in their Mac client software, since it is a RAM hog. CrashPlan
rates very well at preserving Mac OS X metadata.

Arq: I like the approach of backing up to Amazon S3, which I know is a very
reliable storage environment, and Glacier has made it dirt cheap for
last-resort archival backups. I like the fact that, at least through version
3, there has been an open-source restore tool hosted on GitHub. If Haystack
Software disappears, there are still options to restore. I believe Arq is one
of the very few Mac OS X remote backup systems that preserves ALL metadata.

I have used lots of backup software over 30 years. Every backup system has
failings and bugs, and the operator (normally me) is capable of making
mistakes. That is why I use multiple products for backup.

I am interested in exploring Arq's new features, especially SSH/SFTP, which
will allow me to self-host and may cause me to re-evaluate my overall backup
approach.

~~~
Serow225
I've talked to the CrashPlan folks recently, and they have been working on
native clients for a while now. They wouldn't tell me a release date of
course, but it's in the works :)

------
chimeracoder
Pretty awesome to see this. When I used a Mac for work, I used Arq and set it
up on my coworkers' computers (they were completely non-technical). It was
very easy to use.

I'm curious what backup tools people use on Linux if they want to back up
files on Glacier. I use git-annex[0] for certain files (it works well for
pictures and media). The rest of my backup process is a fairly rudimentary
(though effective) rsync script, but it doesn't use Glacier.

My current setup works fine for me, but I imagine there are better tools out
there.

[0] [https://git-annex.branchable.com/](https://git-annex.branchable.com/)

~~~
natch
I use my own script.

It tars a directory, naming the output with a hash of the original directory
name. Then it encrypts it with gpg and breaks it into small parts (100M) so I
can pace any needed Glacier restores so as not to break the bank. Then it runs
par2 on each part, to make it more likely that I can recover from any file
corruption. Then it uploads each part and the par2 files to an S3 bucket which
is set (via the S3 admin web dashboard UI) to automatically transition the
files to Glacier.

The shortcoming is that it's not a whole-system backup. It also doesn't do
differential backups, though that's not a problem for me because I organize
things such that old stuff rarely if ever changes. It's dirt cheap,
one-command simple, and feels pretty reliable... though I must admit I haven't
tested a restore!
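A sketch of that pipeline, for the curious. This is my reconstruction from the description above, not natch's actual script; the helper names and part size are illustrative, and the S3 bucket is assumed to already have a lifecycle rule that transitions objects to Glacier:

```python
import hashlib
import math
import subprocess
from pathlib import Path

PART_BYTES = 100 * 1024 * 1024  # 100M parts, to pace any future restores

def archive_name(directory: str) -> str:
    # Name the tarball with a hash of the original directory path.
    return hashlib.sha256(directory.encode()).hexdigest()[:16] + ".tar"

def part_count(total_bytes: int, part_bytes: int = PART_BYTES) -> int:
    # How many pieces `split -b 100m` will produce.
    return max(1, math.ceil(total_bytes / part_bytes))

def backup(directory: str, bucket: str, workdir: str = ".") -> None:
    tarball = Path(workdir) / archive_name(directory)
    subprocess.run(["tar", "-cf", str(tarball), directory], check=True)
    # Symmetric gpg encryption (prompts for a passphrase).
    subprocess.run(["gpg", "--symmetric", "-o", f"{tarball}.gpg",
                    str(tarball)], check=True)
    subprocess.run(["split", "-b", "100m", f"{tarball}.gpg",
                    f"{tarball}.gpg."], check=True)
    # Add par2 recovery data to each 100M part.
    for part in sorted(Path(workdir).glob(f"{tarball.name}.gpg.??")):
        subprocess.run(["par2", "create", str(part)], check=True)
    # Upload the parts plus their .par2 recovery files.
    for f in sorted(Path(workdir).glob(f"{tarball.name}.gpg.*")):
        subprocess.run(["aws", "s3", "cp", str(f), f"s3://{bucket}/"],
                       check=True)
```

The external tools (`tar`, `gpg`, `split`, `par2`, the AWS CLI) all have to be on `PATH`, and as with the original, a restore path is left as an exercise.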

~~~
warmwaffles
Do you mind sharing this script with us? I currently have a NAS that I back
up to, and a couple of external drives that the NAS backs up to (the most
important backups). This is all painfully manual, and I may write my own
script sometime.

------
michaelx
I've been using Arq since 2011 to back up my most important data to Amazon
S3+Glacier and can highly recommend it.

Today v4 was released, and it comes with new storage options (GreenQloud,
DreamObjects, Google Cloud Storage, SFTP a.k.a. your own server), multiple
backup targets, a unified budget across S3 and S3/Glacier, email
notifications, and many more clever features.

------
coffeecheque
I love the addition of SFTP, and I hope to buy the update soon.

Can anyone recommend an SFTP backup provider?

My Arq backups are designed for the worst case; I have other, local backup
options in case of failure. I was using Glacier, but I ran into Arq 3 sync
problems and I need to re-upload all my data. Glacier is very slow from where
I live; I assume SFTP will be a bit faster.

~~~
Spooky23
I'm not a customer anymore, but was very happy when I was. rsync.net is the
perfect service for this.

~~~
rsync
arq (or any SFTP based client, including sshfs) will work perfectly with
rsync.net.

The new-user-HN-discount is 10c per GB, per month, and there are no other
(traffic/bandwidth/usage) costs. Our platform is ZFS and there are 7 daily
snapshots for free.

We would be happy to serve you, as we've been serving thousands of users since
2001.

------
chmars
BTW, another great product from the same developers has recently become
available:

It's called Filosync, and it's like Dropbox but secure and with your own (or
Amazon's) servers. Check out
[http://www.filosync.com/](http://www.filosync.com/) for more information. And
be warned, it's pricey!

~~~
acjohnson55
My personal solution is to use the free Boxcryptor Classic [1] on top of
Dropbox. I've actually replaced my entire Documents folder with a symlink to
its version in my Boxcryptor instance (something I was leery of doing directly
in Dropbox, for privacy reasons). So now I get transparent encryption and sync
on both of my computers. It's been working amazingly well for about 6 months
now.

[1]
[https://www.boxcryptor.com/en/classic](https://www.boxcryptor.com/en/classic)
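The folder-replacement trick above generalizes to any encrypted or synced volume. A small sketch (the helper and example paths are mine and purely illustrative; Boxcryptor itself requires no such script):

```python
import os
import shutil

def relocate_with_symlink(folder: str, vault_dir: str) -> str:
    """Move `folder` into `vault_dir`, leaving a symlink at the old path
    so applications keep finding it in the usual location."""
    target = os.path.join(vault_dir, os.path.basename(folder))
    shutil.move(folder, target)   # shutil.move survives crossing filesystems
    os.symlink(target, folder)    # transparent redirect at the old path
    return target

# e.g. relocate_with_symlink(os.path.expanduser("~/Documents"),
#                            "/Volumes/Boxcryptor")  # hypothetical mount
```

Worth doing only after a backup exists, since a botched move leaves the folder in neither place.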

~~~
chmars
I did a test with BoxCryptor last year and was not too pleased. For enterprise
use with the necessary master key, it's expensive too, and you still have the
disadvantages of Dropbox. On the other hand, if you are a Dropbox user and
willing to pay for BoxCryptor, the solution works flawlessly.

~~~
acjohnson55
What do you mean by the disadvantages of Dropbox?

Also, keep in mind that Boxcryptor and Boxcryptor Classic are two different
products. If sharing files in an encrypted way with other people or having a
master key is part of your use case, Boxcryptor Classic is not a great choice.
But if you just need a way to store your own files in the cloud, with seamless
sync between machines, and without privacy concerns, Boxcryptor Classic has
worked very well for me.

------
dewey
Site is down/slow for me. Google Cache:
[http://webcache.googleusercontent.com/search?q=cache:http://...](http://webcache.googleusercontent.com/search?q=cache:http://www.haystacksoftware.com/blog/2014/03/arq-4-is-out/)

~~~
sreitshamer
Should be better now.

~~~
dewey
Way better. Thanks!

------
m_mueller
I like the idea of using Glacier as a backup solution for one's devices.
However, one thing worries me: looking at the Glacier pricing table[1], there
is a section 'Request pricing'. This looks to me like there is a price of 5.5
cents per 1,000 upload requests. Considering Arq will upload multiple times an
hour, this looks like it could amount to quite a bit. With two uploads per
hour I arrive at $5 per month, but there could be significantly more uploads.
Even $5 would already be a 50% price increase compared to the storage pricing
alone for 1TB of data.

Could anyone clarify whether my calculation is wrong?

[1]
[https://aws.amazon.com/glacier/pricing/](https://aws.amazon.com/glacier/pricing/)

~~~
dalore
2 uploads an hour = 48 uploads a day

48 a day * 31 days a month = 1,488 uploads a month

At 5.5 cents per 1,000 uploads, this would cost roughly 1.5 * 5.5 cents, so
about $0.0825 a month.

There would be data and storage costs but I'm just going by your upload
request calculations.
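The same arithmetic as a two-line sanity check (request price taken from the comment above; storage and transfer excluded):

```python
uploads_per_month = 2 * 24 * 31  # two uploads per hour, 31-day month
request_cost = uploads_per_month / 1000 * 0.055  # $0.055 per 1,000 requests
print(f"{uploads_per_month} uploads -> ${request_cost:.4f}/month")
# 1488 uploads -> $0.0818/month
```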

~~~
m_mueller
damn, I had a factor of 60 in there, for some reason I calculated 2 uploads
per minute instead of per hour. Alright then, seems to be a fair price.

------
tedchs
What I couldn't find is whether Arq encrypts the data client-side before
uploading it. This has prevented me from using several other backup tools.
Does anybody know?

~~~
jonpierce
Yes, it does. "All encryption is done before your data leave your computer" \-
[http://www.haystacksoftware.com/arq/index.php](http://www.haystacksoftware.com/arq/index.php)

------
DomBlack
How does this compare to say using Tarsnap (apart from cost)?

~~~
sintaxi
Tarsnap's privacy policy is pretty terrible. Tarsnap reserves the right to
share your information "at their sole discretion", even in cases where the
authorities don't go through due process. This is completely unacceptable to
me. The privacy policy for a service like this should be "warrant or GTFO".

[https://www.tarsnap.com/legal-why.html](https://www.tarsnap.com/legal-why.html)

------
greggman
Am I the only one who would like to be able to set up a time range when it's
allowed to use my bandwidth?

I'm traveling in hotels with shitty wifi most of the time. It's hard enough
to browse, so I'd really only like to back up while I'm sleeping. Also, in the
interest of not hogging all the bandwidth, I'd like it to stop by, say, 6am,
so that guests waking up can use the bad internet.

~~~
286c8cb04bda
_> Am I the only one that would like to be able to setup a time range when
it's allowed to use my bandwidth?_

You can. After you set up a target, open the Preferences window. Select the
target there and click the "Edit..." button.

The dialog that follows has an option to "Pause between [00:00] and [00:00]",
where [00:00] is a drop-down which lets you pick the top of any hour of the
day.

------
xxdesmus
Big fan; the inclusion of DreamHost's DreamObjects is a huge improvement too.
In most cases it is cheaper than Amazon's S3 storage. It's still more
expensive than Glacier... but DreamObjects doesn't have any of the crazy slow
retrieval times or costs that Glacier does.

------
chmars
Two questions for Arq users:

1\. In former versions, it was difficult to see what was actually part of the
backup. Has that changed?

2\. Is there any reliable way of calculating how much an Arq backup will cost?
Storage costs are easy to calculate but with Amazon S3, changes etc. are a
major cost factor.

------
willtheperson
Can someone tell me why I shouldn't be using BitTorrent Sync as a
multi-location backup plan?

In other words, can someone sell me on the idea of paying for AWS storage when
I have dirt-cheap storage around my house and even a remote location where I
can stuff a huge drive?

~~~
tlrobinson
If deleting/modifying a file at one location results in deletion/modification
of a file on the "backup", then it's not a backup.

~~~
rich90usa
Certainly. I remember learning early on that RAID != backup, and an easy
conceptualization of Sync is RAID 1 at the file/folder level.

I'd recommend that anyone using Sync for backup purposes take snapshots of
what they're syncing.

------
Goopplesoft
I was looking at Arq the other day and couldn't find if it had a bootable
backup feature (like CC Cloner). Anyone know something about this?

~~~
leejoramo
Arq does not do bootable backups. It is primarily for long-term, off-site,
versioned backups. Look at SuperDuper or CCC for a bootable bare-metal
backup.

I generally prefer SuperDuper for its simplicity, and recommend it to most
people.

CCC is not really harder to use, but it presents many more options, which
most people don't need and with which they may get into trouble. One great
feature of CCC is the ability to do a bootable backup to a remote volume. I
have my MacBook set up to back up to a server in this fashion. However, this
requires you to configure your remote server with _root_ SSH access via
certificates.

~~~
toomuchtodo
I pine for a Mac backup service that would let me back up from anywhere and,
in the event of failure, let me fire up a new Mac, point it at the service,
and boot to restore over the net.

~~~
leejoramo
The problem with bootable backups is that you need root access to do the kind
of low-level disk writing required. Thus, you need a very trusted environment.

That said, you can also use CCC and SuperDuper to back up to a disk image
file. This would not be directly bootable, but it can be copied to a new hard
drive, which would then be bootable. Backing up to remote disk images is much
slower than the direct drive-to-drive method.

------
pipek
Anyone using duplicity or Duplicati as an open-source alternative to Arq?

------
natch
The purchase link is incredibly well hidden. Anyone find it?

~~~
sreitshamer
That's not good! :) (It's my site) Here's the link:
[https://store.haystacksoftware.com/store?product=arq4](https://store.haystacksoftware.com/store?product=arq4)

~~~
floatingatoll
Had no trouble finding it earlier, but I just clicked the Store button in the
nav and skipped the product page entirely.

[http://www.haystacksoftware.com/arq/](http://www.haystacksoftware.com/arq/)
links to ?product=arq3 though, with the text "per computer for Arq 3".

Please consider integrating or linking
[http://liangzan.net/aws-glacier-calculator/](http://liangzan.net/aws-glacier-calculator/)
(as found in the HN comments somewhere around here); it's been invaluable when
talking with people about Arq today.

------
zeckalpha
But... Tarsnap...

~~~
kondro
I love Tarsnap, but $0.30/GB for Tarsnap backup is very different from
$0.01/GB for Glacier cold storage.

For my own Mac I back up to a sort-of off-site NAS with Time Machine, which
is fine for all but the worst-case, meteor-hits-the-city situations. However,
Glacier makes a perfect option for me to deal with that actual worst-case
scenario.
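To make the gap concrete, here is the storage-only math for a hypothetical 200 GB archive at the per-GB-month prices above (Tarsnap also bills for bandwidth, and Glacier adds request and retrieval fees, so this is only the headline comparison):

```python
size_gb = 200  # hypothetical archive size
tarsnap_monthly = size_gb * 0.30   # $0.30/GB-month
glacier_monthly = size_gb * 0.01   # $0.01/GB-month
print(f"Tarsnap ${tarsnap_monthly:.2f} vs Glacier ${glacier_monthly:.2f} per month")
```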

