Hacker News new | past | comments | ask | show | jobs | submit login
Backing up data like the adult I supposedly am (magnusson.io)
207 points by miked85 on Sept 20, 2020 | hide | past | favorite | 146 comments

The author uses a systemd timer to schedule their backups. For backups going to a remote host I prefer adding a little bit of variance to the execution time to avoid consistently hitting some hotspot.

From the timer I use to backup my server using Borg to rsync.net:

This will run the backup script every 24 hours with a random delay of up to 1 hour, so every 24.5 hours on average. This causes the job to nicely rotate around the day.

That’s really such a nice solution to the problem, nice.

Can you imagine not reading the docs to discover those options. So you spin up a database to save state about runs to implement the delay. And you need a dashboard to monitor the various parts of the system for debugging.

Or you read the docs

Or you prefix your command / script with sleep and a randomlly generated 1h value!

This works, but compared to using systemd it has the drawback that the range of possible times is anchored to the configured time in cron. The systemd timer example I gave causes the next cycle to start when the previous job finished.

So if it initially runs the script between 0:00 and 1:00 and the script takes 1.5 hours to finish then the next run will be between 1:30 and 2:30 the next day, instead of 0:00 and 1:00.

> So if it initially runs the script between 0:00 and 1:00 and the script takes 1.5 hours to finish then the next run will be between 1:30 and 2:30 the next day

Shouldn't it be between 1:30 ans 3:30? I'm just nitpicking, of course, that's a nice solution.

Yes, you are correct. The script ends no later than 2:30 and then there's a delay of up to 1 hour.

A loner straightforward solution - albeit not supported by schedulers like cron with start times denoted by fixed, absolute values - is to use non-recurring intervals.

Eg run a task at intervals of 86413 seconds

The snooze scheduler* has a random delay feature.

* https://github.com/leahneukirchen/snooze

It's why you see a bunch of cron jobs that start off with a random sleep. For example, certbot's cron:

    perl -e 'sleep int(rand(43200))' && certbot -q renew

Seems a bad idea in terms of the accuracy of your logs e.g. so you might not notice if the command you want to run is starting to run unusually long some days because of some error.

Another good reason to switch to systemd, IMO.

Replace sleep with a countdown logger.

I would guess that most workloads have some sort of slow time when it’s appropriate to schedule backups. Wouldn’t having backups rotate through the day potentially cause slowness during a more active time for users?

Whenever I use a scheduler I always use prime numbers wherever possible.

May I ask why?

Not OP, but one good reason to do that is to reduce probability and frequency of collision with other periodically running jobs.

Let's say you run your job on the hour, and you have a job running every 4h and another every 24h, without planning, because 4 divides 24, you have one in four chance of having them collide and having the 24h job running at the same time than the 4h job. If you add more 4h jobs, the probability that one of those 4h jobs collide each time with the 24h job increases.

The more jobs you have, the higher the probability that some will be divisors of others.

Using prime number for scheduling reduce the probability at a given time that those jobs collides. If you create a job every 5h and another every 23h, the 23h will collide with the 5h job every 115h. PPCM(5, 23) = 115.

Interestingly, this technique is used in nature by cicadas who developed long, prime-numbered, periodical life cycles to avoid gaining a predator that can sync up with the cicadas[0].

[0] https://www.cicadamania.com/cicadas/cicadas-and-prime-number...

And the number of teeth on two interacting gears is usually coprime, to distribute wear more evenly.

If you have a bunch of computers each on a fixed timer, and their clocks are synchronized (say, with NTP), then on the least common multiple of all of those timers you'll get a stampede of requests from all of the computers.

If you're on a sufficiently large network, that surge can cause failures. And a fixed retry policy will just cause the same stampede to recur on the retry intervals; you want to add jitter to ensure that you spread the load out.

> If you're on a sufficiently large network, that surge can cause failures.

The more prime numbers you use, the rarer a stampede affecting a certain percent of nodes will be. To an extent that makes a bigger network safer.

Adding more nodes with the same prime numbers means the peak load scales linearly with the network size. So wouldn't that mean that a "sufficiently large" network is no worse off than a small network?

Also a backup is a bulk upload that can easily run a thousand times slower than normal and still succeed. Even if every server triggers at once, that shouldn't inherently cause failures.

To appease the cicadas.

So that it backs up at a different time each day? For instance, if it is set to 11H, and you were starting at 4pm on a day, then the backup would run next at:

15h, 2h, 9h, 20h, 7h....etc. So you won't have the backup running at the same time everyday.

Personally speaking I would prefer that the backup runs at the same time everyday but some people don't.

Well, it's not that the number is prime. There are many numbers which don't back up at the same time every day that aren't prime, like 25hrs or 22hrs. Numbers which are more complex fractions of a full day will take longer to repeat.

Depending on a more exact statement of the goal, the best constant interval to avoid repeating parts of the day on nearby dates would be to multiply a day's length by the golden ratio; this would be about every 38.833 hours.

Since the golden ratio is irrational, you'll technically never repeat the same time. But if I remember correctly, it's the best number to space out the times uniformly throughout the day and also distantly between nearby days.

What's the downside of consistently hitting a hotspot?

It possibly makes the process unnecessarily slow. People tend to choose “round” numbers for their cronjobs. Probably most commonly minute 0 of a given hour for an hourly or daily job. Thus on e.g. 0:00 UTC there might be hundreds of clients running their backups.

I don't have a strict need to run my backups at a fixed point in time (e.g. within the night hours). By not hitting a hotspot I have a better chance of having a larger percentage of the targets bandwidth for my needs (both network as well as disk IO).

The random delay ensures that the job runs at a different point in time every day, with most of these points in time being expected to have a light load. If it accidentally hits a hotspot on one day it will be fine the next.

Unless you have a whole fleet doing the same thing, nothing.

With many people using mobile devices, a plug for PhotoSync enabling seamless photo backup and sync between iOS, Android, Windows, Mac, Linux, local NAS, iXpand local flash and cloud services. They have both subscription/rental and lifetime/ownership licenses, https://www.photosync-app.com

iOS storage management has improved with a user-visible "filesystem" and storage providers allowing edit-in-place, but there's still virtually no support for backup or rsync. The native iOS Files app is not a robust client for NAS storage. So far, the best option has been GoodReader (Russian devs) which implements robust sync (SMB, SFTP & more) within the app, along with optional in-app strong encryption that goes beyond iOS data protection. Unencrypted files are visible to other apps, https://www.goodreader.com/

Samsung's iXpand has built an ecosystem of iOS apps that support their custom protocol for iXpand flash drives via Lightning. Now that iPad enables access to local storage via USB-C, we need a similar ability to mount a ZFS drive, even if Apple won't provide this natively in iOS.

With a low-cost x86 SBC like Odroid H2+, an entry-level NAS can be constructed with Ubuntu ZFS and dual 3.5" drives.

"OS storage management has improved with a user-visible "filesystem" and storage providers allowing edit-in-place, but there's still virtually no support for backup or rsync. The native iOS Files app is not a robust client for NAS storage. So far, the best option has been GoodReader (Russian devs) which implements robust sync (SMB, SFTP & more) within the app, along with optional in-app strong encryption that goes beyond iOS data protection."

Thank you - this is very interesting.

Although I am a (casual) iphone user (my iphone has never seen my real name or my real phone number and has never touched rsync.net) I was not aware of the user-visible filesystem nor was I aware of "Goodreader".

Does this user-visible filesystem allow me to just copy over my entire music library (which is files and directories, and no knowledge of apple/itunes/ios) and then let itunes browse it, locally on the phone ? Or do I still need to do complicated import tasks ?

> Does this user-visible filesystem allow me to just copy over my entire music library (which is files and directories, and no knowledge of apple/itunes/ios)

Sort of. Each app has the equivalent of a folder. You could use the Files app to copy a directory tree of music files from a remote server, into the folder of a music-playing app (e.g. Flacbox, Nplayer). Then all the files are "local" to the app. This assumes the copy doesn't fail, e.g. several GB over a flaky network, as the Apple-native Files app can't resume a copy, and will abort on every duplicate.

> and then let itunes browse it, locally on the phone ?

The on-device music app from Apple conflates downloadable iTunes-purchased files with streaming Apple Music. It does not play local files AFAIK. But you can use several third-party apps to play music & video.

The challenge is keeping the device and remote server in sync :) Hence GoodReader, which does a decent job of file synchronization within the constraints of a non-Apple app. It can also play music and videos, but doesn't have all the bells & whistles of a dedicated media app like nPlayer. But then those apps don't have file sync.

While apps (e.g. media players or editors) can edit-in-place or view-in-place files within other apps (e.g. storage providers like GoodReader), this doesn't seem to support the ability to maintain a continuous mirror view / symbolic link on a directory tree from GoodReader into nPlayer.

Other apps for an iCloud-free iPhone: 2Do (CalDav sync), local scraper/search DevonThink 2 Go (Webdav sync), Secure Shellfish (local access to remote files via SSH), HereWeGo (formerly Nokia offline maps), iCatcher (podcasts), Codebook (password manager with native sync to Mac), Kiwix (offline wikipedia/stackoverflow).

What I found is that many old apps have "password protected folders". Some video apps, some photos apps, some comic readers. This was to keep certain photos/videos/comic private. Many of those app expose all those "private" files via the Files app entirely defeating the purpose of having a password on them.

In-app encryption can keep files opaque to Siri/Spotlight/Files. By default, Spotlight/Siri will index in-app content, which may (?) also be sent to iCloud.

Those files should be marked read-only for the owner and the pubic files world readable.

Does PhotoSync-app allow me to view photos from other devices inside the iOS Photos app? I would like the photos from my Android phone to be viewable in the iOS Photos app on my iPad.

Not personally tested, but PhotoSync should be able to copy the Android photos to an iOS Album via WiFi. You will want to keep the photos separated in an iOS album, to enable future synchronization. Otherwise, your iOS photos could get copied to the Android device.

> So far, the best option has been GoodReader


It's about flexibility. GoodReader supports both public clouds (Dropbox, OneDrive, Google Drive, SugarSync, Box) and open protocols (WebDAV, FTP, SFTP, AFP, SMB) for private storage. Then, after remote files have been synced to/from the iDevice, editing them in-place from other apps.

I don't use GoodReader, so can't say if it's exactly the same. But as of iOS 13, the Files app now supports directly connecting to external servers. SMB for sure, but can't find a definitive list of supported protocols.

You can also use Files to directly browse and access files from third party cloud services, although it requires you to download the service's app first[1].

[1] https://support.apple.com/guide/iphone/connect-external-devi...

Perhaps iOS 14 is better, but the iOS 13 Files app is not a great SMB client. It doesn't deal well with intermittent connectivity, e.g. remote hosts connected by VPN, or USB wired ethernet. When it gets confused, the only solution is to reboot the iPad, as restarting the Files app is not enough. Unlike GoodReader, the Files app does not have bidirectional sync with remote servers.

I came across the feature while using a beta of iOS 14, so can't speak to whether it's improved since iOS 13.

But my use case has involved connecting to a remote host over a ZeroTier[1] VPN, using an iPhone 6S. I haven't experienced the scenario you mentioned; most problematic issue I've had is sporadic latency spikes during folder navigation. I haven't really done any true stress testing of things, but it's handled transitions between wifi and cellular network without any noticeable issues.

It’s the same as in macOS, which hasn’t been good for years. Is it not just Samba? Did they write their own implementation of SMB?

No idea on implementation, good point that it's possibly reused code from macOS. But if various iOS apps have stable SMB clients, surely Apple can afford one? There's a possible business conflict between Apple's high prices for internal flash storage and their support for external storage.

Don’t forget Apple’s other conflict of interest: paid iCloud storage.

True, especially with the new Services bundles.

Wait so does that mean I can plug and SD card reader or other usb mass storage device into my phone and copy files directly without a third party app?

That was news to me until I pulled up that support article to add to my post, but looks like it!

Looks like there are two different USB adapters Apple sells, and the product page for this[1] one mentions it can be used for quite a few USB devices, including "USB peripherals like hubs, Ethernet adapters, audio/MIDI interfaces, and card readers for CompactFlash, SD, microSD, and more."

Biggest issue for both external HDDs and other devices appears to be power delivery, which can be alleviated by using a drive with external power or leveraging a powered USB hub between the phone/tablet and the USB device.

[1] https://www.apple.com/shop/product/MK0W2AM/A/lightning-to-us...

Yes, using Lightning-to-USB adapter, at least for apps which support the iOS 13+ Files app.

Several apps (nPlayer, InFuse, LumaFusion) also support Samsung iXpand lightning/usb flash drives natively. With native app support, files can be played directly from the flash drive, without copying to the iPhone.

Like the author I have also been very happy with Borg backup software ( https://www.borgbackup.org/ ).

The compression and de-duplication is very useful. A little bit of a learning curve to get everything up and running, but not too bad.

I have a question about Borg’s encryption. It’s stated [0] that if multiple clients update same repository, the server might be able to decrypt data.

Why is that the case and wouldn’t that make the encryption very weak? Simultaneous updates happen quite often.

Would restic have the same problem?


Update: The issue happens because Borg uses AES in the CTR mode (not AES GCM) and two clients could provide the same nonce. The server could then recover the plaintext from two cipher texts. This is the famous nonce reuse problem.

So Borg developers are not using established primitives for this use case. Also, I am not comfortable with the OpenSSL even though it’s got better since 2015. The libssl code base is a mess and buggy. On the other hand using the low level libcrypto library would expose developers to the crypto primitives with possibilities for errors for people not expert in cryptography.

Borg should consider ChaCha-Poly135 as in rclone (or at least AES-GCM).


> Why is that the case [...]?

This is explained in the "Encryption" section: https://borgbackup.readthedocs.io/en/stable/internals/securi...

The important part is the part about avoiding re-use of the AES CTR value.

> Simultaneous updates happen quite often.

Personally I created a dedicated borg repository per machine I want to backup, because that avoids sharing passphrases across machines. This comes with the drawback that I cannot deduplicate across machines, but that is acceptable to me, because the data is mostly unique-ish anyway. I only backup the user data, not everything (e.g. /bin/).

Yes, thanks!

I meanwhile read about it and updated my comment.

Would it be practical to rclone the output of the Borg into a cloud service using an rclone crypt remote? In my experience, rclone’s crypt remote is sluggish, even locally. I am not sure how the mount would work.

It’s unfortunate that we have to get the dedup from Borg and the encryption from rclone!

I never used rclone, but I can tell you that a borg repository basically is a number of encrypted blobs of up to 500 MB size that are never going to be modified again (only created and deleted) + a few small metadata files. It rsyncs quite well.

I can confirm that cloning a local Borg repository to a cloud system works. One of my systems uses Borg to make a local backup. Next, B2's command line tool synchronizes the local Borg repository to B2.

Since we are talking about Borg, I should plug my https://BorgBase.com service, which is offsite backup hosting, specialized for Borg only and offers additional feature, like append-only mode, quick setup of isolated repos and monitoring for stale backups.

Of course, as with all backups, you should make sure to run a test recovery. I ran into a situation with Borg where the encoding of one of my filenames caused that file and anything else behind it in the backup to be unrecoverable. Or at least I was never able to find a way to recover it.

I gave it the old college try to recover, using the different tools to try to access it (the fuse mount, the CLI), I tried all sorts of different settings for my locale. At the time I had at least 2 other backups of that so eventually I recovered from my primary backup. I was testing out Borg at the time.

I've ended up using Restic more recently, and it seems to be fine. Uses kind of a lot of memory in some situations though. Small AWS instances have issues. My primary backups still go via rsync though.

I also like Borg back up.

I wonder if I am missing something compared to restic?

"I wonder if I am missing something compared to restic?"

The biggest difference for us is that borg really requires a server side 'borg' binary to talk to, which we have built into rsync.net. restic, on the other hand, can just connect to any old SFTP endpoint.

This means we need to preserve some amount of backwards-compat and so we maintain borg0.x and borg1.x binaries in our environment (and eventually, borg2.x).

Borg is supposedly a bit faster, restic supports a ton of backends https://github.com/restic/restic/issues/1875 Also restic does not allow to backup unencrypted.

I haven't used borg. Only done some maintenance on restic backup jobs at work. Restic's command design is intuitive and the documentation is good. But Borg looks just fine in that regard as well.

Thank you !

I posted another question on Borg, in case you know the answer!

I found restic did not scale well with a large amount of files when I tried it a few years ago. Has this changed? Is Borg better?

I used to run Crashplan with near-continuous backup for my important files, and I'm still missing this.

I too used to use Crashplan until they discontinued their personal subscription and with it the option to backup to local drives.

Around the same time I tried Restic. That ran into difficulties (don't recall what anymore) so I switched to Borg.

Borg has been 100% reliable, including a full restore of /home after my laptop was stolen.

I don't know what you call large, but my personal restic backup runs weekly, ~250000 files, 2 TB, finishes in 7 hours saturating a 600 Mbps connection over S3 protocol. I'm very happy with restic.

Well I have about 2-3x as many files, but only ~100GB. The main concern was finding out which files to take backup of, it would take quite a long time.

Like I said I was hoping to get the 10-15 minute intervals I had with Crashplan.

If you want to find out how to backup your iPhone on Linux, I've also made a guide! It's actually kind of complicated, but it can be fully automated. I connect my iPhone to my Linux server in my room to do an incremental backup every night at 5 AM (it also fast charges at 1.8 amps over USB C). I then create ZFS snapshots every week, since the iPhone backup is an overwrite type.


Thanks for your self-hosting tutorials for iOS services! The next question is how to extract individual files from a backup, without needed an iDevice for a full restore. There are several commercial products sold for this purpose, but I've not yet seen OSS tools to parse iOS backups.

I use TineMachine to three separate idéntica disks: one at home, another one of the office, and and a third one at my parents place 1000 km away. Pretty anythingproof. I also have three other disks with three identical copies of my old archived stuff, in the same locations. Also all code repos are online (svn, git and hg), and I have most non-code stuff on Dropbox too. Restored entire machine from TimeMachine once when I upgraded the laptop, ideal experience. I’m not happy that Covid made me have the office disks at home now too, but otherwise, I feel pretty safe.

Is there a way to test if TimeMachine backups are uncorrupted? I backup using TimeMachine as well but as far as I can tell there is no way to verify a backup. I’m concerned that at some point my backup will get corrupted and I won’t know why. This happened to my iPhone backup to iTunes, luckily I had a iCloud backup.

It seems to be possible to check the integrity on the command line.


Restore it?

Why on earth would I restore an image back into my MacBook if I don’t know if it’s corrupt?

Use another computer or partition, of course.

Restore it in a VM.

Why not have functionality to check to see if the backups are readable? Instead of forcing users to spin up a vm which most can’t, and do testing etc. seems like a simple ask. Most credible backup solutions do this.

What data do you have on your personal machine that warrants such a setup? For me, I keep my laptop clean such that if anything goes wrong then no worries.

Photos are on Google Photos, use Dropbox/drive for data.

It's both my personal and my work laptop. Over 20 years of freelancing, contracting, consulting, starting startups... Probably my most visible thing available now would be http://www.viemu.com , although these days I'm most active at http://katoid.com (too little information there yet, but we're creating some amazing new tech for game analytics).

What happens if Google or Dropbox lock your account?

The problem isn't setting up backups, the problem is that you have backups because something can happen to your data which is 1) you are clumsy or 2) someone is malicious like a hacker or burglar and 3) physical damage like harddrive corruption, cosmic rays or fire.

Of these, I find 1) is by far the most common, and 2/3 isn't even close. The problem with any backup scheme created by myself is: if I couldnt' be trusted to maintain my data without deleting it, a sure as hell can't be trusted to set up my own backup scheme without screwing it up.

Example: I have have had regular backups to a NAS, which were then uploaded to an offsite server were I rotated the data in. But I screwed up a raid config thing on the NAS after a harddrive faliure, and didn't notice I lost a lot of data, which after a couple of years had also been removed on the offsite server.

Basically: to be a good backup solution for me it has to be idiot proof. Zero manual configuration (If there is a config file or command line anywhere, it's out). I want a gui tool that gets out of my way and has good defaults, and has such a huge disk area that it can have effectively write-only semantics. I want retention for deleted files. Currently I use iDrive which is pretty good and lets me back up parents computers and so on, in the same 2TB.

I solve this problem by buying a Synology (a popular, idiot proof NAS). Other trusted solution is QNAP.

Yes, it’s cool to set up a FOSS FreeNAS server with ZFS pools. However, one look at the forum posts of people losing data due to misconfiguration tells me it should be considered a toy project.

All of my important files live on the NAS as a source of truth. This means I passively make sure the data is there, every time I access something.

It’s backed up using Hyper Backup to the cloud, in encrypted format. I verify restoration once a quarter. I also keep a HDD around that I manually backup to once a quarter.

> I solve this problem by buying a Synology

My NAS was a Synology too (The one where I lost a lot of my data, despite Raid1 + offsite sync). No dry-run restoration from the cloud was my mistake.

How did it fail, if I may ask?

I don’t remember exactly. It was some small error that I turned into a much larger error when I tried to fix it.

What software do people like for backing up Windows desktops?

I really want something that ends with a full disk image that's easy to restore to a new device, runs backups on a schedule (and will run a while after the next boot if the computer is off at the scheduled time), writes the images to a unix system on the LAN (either directly, or by writing to SMB), and doesn't cost an arm and a leg.

I just went with Backblaze Personal. It's pretty much fire and forget.

Doesn't provide a perfect full disk image, but it does store everything I need. I've done one full restore from them (fried motherboard from a power surge) and it went as smoothly as I could expect.

This is not a great choice. The Backblaze client is extremely insecure—like, arbitrary remote root code execution insecure—and they seem to me to either not care or are too incompetent (or both) to be trusted.[0]

[0] https://twitter.com/zetafleet/status/1304664097989054464

I like BackBlaze, but their client has always been absolutely terrible. Pretty shocked to hear that it's this bad, I and just paid for another two years in advance.

Last I checked BackBlaze only kept deleted files for a max of 30 days, making it a non-starter for my needs. I'm not sure if that's still the case.

You can pay more for year long retention now

That's what they say in their offerings/docs, but in fact I had data kept far longer than the 30 days (after the laptop registered for Backblaze was already scrapped and not been running for months).


As someone else commented - you can set a personal key. It's encrypted at rest (and in flight), but obviously Backblaze have that key.

There's the option to set a password to encrypt it.

Well I should perhaps elaborate: does it offer end to end authenticated encryption with keys that never leave user’s device in an open source program?

Another point, I suppose that backblaze comes with dedup and compression?

Re: encryption, long-story short, the keys used to decrypt your data are stored in their data centers, but you can also encrypt those keys with a symmetric key which only you know. [0]

Re dedup/compression, it's a bit irrelevant because their plans are unmetered.

[0] https://help.backblaze.com/hc/en-us/articles/217664688-Can-y...

> [...] but you can also encrypt those keys with a symmetric key which only you know

...until you need to restore from backup. You then have to sign in on the Backblaze website and enter that key, the files you are trying to restore are then decrypted on their end, and bundled up and sent to you.

They say that the key is only ever in RAM, and only then briefly.

regarding unmetered, i gave up on backblaze as their network connection seems incredibly slow. i think asking for compression and dedup is very relevant with them

Yeah, their stock answer is "use more threads" but I could never use more than 30-50% of my upstream bandwidth. It doesn't help that the client is slow itself and seems to sometimes just stop backing up.

There are two go-to options, at least as far as /r/sysadmin and /r/datahoarder people are concerned - Veeam Endpoint Backup [1] or Macrium Reflect [2].

However, another option is to back up just the data and reinstall the OS + programs in case of a disaster. I've been set up this way for nearly a decade, now using Bvckup 2 [3] as a replicator. This is faster and lighter on the system and it creates backups that are readily accessible.

[1] https://www.veeam.com/windows-endpoint-server-backup-free.ht...

[2] https://www.macrium.com

[3] https://bvckup2.com

Hey, just wanted to come comment here to say that since reading this 2 days ago, I've setup bvckup2 to sync all my important stuff. This is exactly what I've wanted for a long time now. Thank you so much. It's really exciting.

Up until now I've been manually doing it via 7zip. But doing it manually is so unreliable that it doesn't even count. Or using Macrium, but it always felt like overkill.

Bvckup2's archive/keep eveything feature for handling deletions is really great!

Someone else in this thread mentioned Duplicati, which also looks really great. I might add that to my backup flow. I thought it was you, but now I can't find it anywhere in the thread. I guess they either delted or edited their comment.

For home use, I use Macrium and OneDrive. A good pattern I've found is to have a "clean" Windows 10 image (maybe with a few utilities), my personal data on OneDrive or a NAS and then something like PatchMyPC [1] to reinstall apps quickly.

I also have bvckup2 (worth buying almost for the amazing UI alone) but I use it more for syncing some folders to and from a NAS.

[1]: https://patchmypc.com/home-updater

Acronis True Image works well for me. Scheduling with notifications of success/failure. You can backup locally to whatever windows can attach to or to a cloud.

I've used to restore twice: same machine and new machine. Worked without an issue once the USB boot is created.

I think the cost is reasonable for 5 workstations.

Same here. I've set it to back up each night to the NAS locally and to their cloud.

Had a SSD die on me a few years ago, the primary disk. With no warning it just bricked itself. Thanks to Acronis my computer was running again less than an hour later.

Have also used it to restore documents and similar I accidentally deleted.

Another nice feature they have is their malware protection service. It detects programs modifying a large number of files in a relatively short amount of time, blocks them until you say if it's ok or not.

For my personal machines I've been using Veeam's free version. It's not as full featured as what I'd like (I have it set for nightly backups), but it seems to do the job alright. It offers to make a bootable flash drive for you at installation to make full restoration easier. I have it backing up over SMB to a FreeNAS box, but it doesn't look like the backup images are easily readable (the look to be some Veeam-specific format, but I didn't look to hard at them).

Oh, man. No, just no.

Arq 5 was OK.

Arq 6 was shipped in a state that wasn't suitable even for beta. It corrupted and destroyed backups created with previous versions, couldn't complete new backups, wasn't working in fresh installs, had no documentation, no development plan and very poor communication from the dev addressing all these issues. The backlash was so bad that they closed their Twitter account and locked up the Arq subreddit (only to claim later that it wasn't them, but Reddit itself, that did that).

A lot of people, me included, were expecting Arq 6 with a great deal of excitement only to witness one of the greatest dumpster fires in the recent history of ISVs. The news now is that they decided to just bury Arq 6 without trying to fix it and move on to Arq 7 - https://www.arqbackup.com/blog/next-up-arq-7/

I'm very sorry you had that experience. I feel really terrible about it. It's been an extremely stressful 5 months so far.

I tried to communicate the best I could about what we were doing -- a blog post, responding to all the reddit comments on the arqbackup subreddit that somebody else controls, answering thousands of emails. For at least a week I answered 300+ emails/day while simultaneously trying to diagnose and fix the issues people were experiencing.

At one point I deleted the Twitter account because I couldn't cope psychologically with all the hate and the personal attacks.

We set about immediately working to make Arq 6.3 "backward-compatible" with old Arq data (rather than import it into the new format, which failed unexpectedly for quite a few people).

A month into it we tried making a UI that's "native" (like Arq 5) and realized we like it better too. So, we missed our June 30 deadline of making Arq 6 backward compatible, and decided to just start over with a native UI.

We were going to ship that as Arq 6.3, but a few weeks ago realized that just shipping it as a point release would be way too disruptive. So it's going to be Arq 7. Arq 6 users of course will be upgraded to Arq 7 for free.

I know we screwed up. We're trying really, really hard to make it right. We promptly refunded every single purchase for which a refund was requested. It's not about the money. It's about trying to do the right thing.

I don't know what else to do at this point. If you have suggestions please let me know.

> bury Arq 6 without trying to fix it

I don't understand this. Arq 7 is the fix for the Arq 6 issues. It's free for Arq 6 users. We're not trying to bury anything. We've been really open about saying we screwed up and we're doing all we can to fix it.

Stefan, I realize that you've been under a massive amount of stress. I've been through really badly screwed up roll-outs myself, while being on a small team. I can relate.

The issue with the Arq 6 release was not that it was bad per se, but that there was no clear _public_ communication from you. This was twice as jarring because in recent years you've been making comments to the effect that it was no longer just you, but a team. So not hearing anything official for days, if not weeks following such a disastrous release cost you a great deal of goodwill. For every email you got, there were 10 people who didn't bother to send one.

The hate and personal attacks you were seeing were a side-effect of that. The rule of thumb for when you screw up is that you _must_ talk to people. Tell them, verbosely, what's happening on your end, what caused this, what you're doing to prevent the same from happening again. Talk like a chatterbox. As shallow as it may sound, this shows people that you are on top of things and it builds sympathy. All you have to do is demonstrate that you are feeling the pain and working to resolve it. Once there's a critical mass of users that are supportive of your recovery efforts, it will prevent others from turning into trolls and haters.

Talk to your users.

You weren't doing this, not in public. That was the main issue with Arq 6 release. Not that you screwed up.

OK. I'm sorry you got the impression that I didn't communicate. I answered a shitload of tweets and emails in the days/weeks following the launch. I apologized thousands of times. I scrambled to fix issues at the same time.

At some point I think I had some sort of breakdown and could no longer cope.

I'm trying to recover here. But every time somebody mentions Arq, someone seems to come along and make a comment like your above comment, which makes me want to throw in the towel frankly.

I bust my ass day after day to try to do the right thing because I believe that what you put out into the world is what you get back. I hope in the longer run that's true.

Support is an enormous time drain and one-on-one support, be it over email or Twitter, is worse yet.

You should really consider doing everything possible to move support from 1-to-1 interactions to 1-to-many. A couple of simple things will go a long way - open up a forum and add an FAQ page.

Right now you don't have a place on the website where users with issues can go. If someone runs into a problem, they will be looking for the problem description, not documentation. The only option is to either send an email, which is slow, or to use Twitter, since you appear to be responsive there. It's literally the least effective setup as far as managing the support load goes.

Once you have self-serve support options set up, you can funnel all support queries to the FAQ page first and to the official forum second. Redirect all Twitter and Reddit queries to the forum and answer them there. Do not engage in any support conversations on Twitter and Reddit at all. Chit-chat is OK, but no support talk. Keep an eye on what the actual FAQs are and populate the FAQ page based on that. In a matter of weeks you should see the time spent on support go down dramatically.

You'll get through this. Arq still has momentum and the vast majority of Arq users are loyal. They still wish Arq well. I know I do.

Interesting, thank you. I have been using Arq 5 without issues and was not aware of this.

Did you try Arq 7 at all? People have been happy about it so far. If you're willing to give it a try and provide feedback, I'd be very grateful.

As an Arq 5 user, should I wait until Arq 7 is officially released?

It's up to you. We hope to officially release Arq 7 by the end of the year. It seems to be very stable already according to the feedback we've gotten so far.

Thank you. I was mainly curious if the migration from 5 to 6 (or 7) was smooth now.

Seems to be file-based, which isn't what OP wants.

restic. It's fantastic and, more importantly, didn't let me down even on faulty hardware.

+1 for restic. Incredible tool.

Yep. One of the most thoughtfully designed backup tools in existence... and actively developed at that!

I've tried a variety over time, some of which seemed cool at the time, a couple of which I like.

- Unison was a command line tool, like a better rsync, designed for backups. But it feels quite kludgy doing backups/restores from the command line in Windows.

- DriveImageXML not only backed everything up, but created an XML file listing what was backed up where, so in the event you didn't have the software to restore stuff, you could still conceivably recover it (if you could write something to parse the XML and extract). It worked well, though it was slow. It got me through a few computer changes, and I had no problems restoring.

- Windows' built-in backup software worked really nicely until you upgraded and then wanted to restore something from a backup made in a previous version. Being able to restore is kind of critical, though.

- One of my employers required us to use a cloud backup solution which I won't name, but it was about as heavy as having McAfee Antivirus or similar on your system wasting most of the resources most of the time.

- When I backed up my last computer (Windows 7) prior to upgrading to Windows 10, I had a free copy of Acronis TrueImage just for that purpose, and so far it has seemed to work beautifully. I have zero problems recovering files from that backup.

- Currently I'm using Paragon Hard Disk Manager, which is fast and simple. So far it seems ok, but I haven't had to restore from it yet.

So with that experience, I'd probably look at Acronis and Paragon. My concern always though, is - when I need to recover these files, how do I do so if my computer doesn't have the software that created the backup? (Assume new computer or freshly reinstalled system.)

That's tricky and a spot where the things like Unison and DriveImageXML may still hold some value. If you can make a backup of your backup software separate from your backup so you can be sure to have it available to restore, then that might not be a concern. But it's always been a concern of mine - having a full backup of my data but in a format that I can't access.

I think there's one that can make a virtual hard disk as a backup that mirrors the real one, so without anything but the OS, you could boot and mount your backup as if it were the system disk, but I don't remember what it is.

I guess none of my suggestions will help :) but I sometimes run "Create a system image" from the Windows 7 backup and restore page that is still hanging around. It has an option to save to a network location.

I think that even though some pop-up messages tell you that the previous backup will be blown away, it actually is incremental to a certain extent, and the recovery tool in the installer sometimes does list multiple dates to restore from -- although I'm not sure if and how data retention can be controlled. Also disk encryption is removed on restore, and I think the backup is not encrypted at rest either; you need to keep it in an encrypted location to begin with.

For file-level backups, I'm using an rsync frontend, QtdSync, but I also had success with Borg running under Msys2's Python interpreter.

Windows system image backups with "physical" disks backing the storage (either locally-attached disks or via iSCSI) is actually reasonably nice. On later versions of Windows you can encrypt the backup with Bitlocker. Mounting prior backup generations via command line tools isn't too hateful. Bare metal restores of the entire system are very straightforward, too.

Using a network location is somewhat less useful. You lose Volume Shadow Copy, so it becomes a single-generation, full-backup-every-time solution. It's still easy to mount and to restore from, but marginally less useful.

It would figure that Microsoft announced (last year, I believe) that the feature is no longer being developed.

Windows already comes with a Backup and Restore. And it does both incremental backups and full disk images. I do both to my Synology nas every week. Maybe you’re not using the “Professional” version?

Macrium Reflect free to a Samba (Raspberry pi) share.

Veeam Agent for Microsoft Windows Free Edition

urbackup is proper client-server, multi-platform, does Windows well as both client and server, is trivial to get going and keep running, and is free.

I didn't find urbackup trivial to keep running. For some reason I can't figure out, it stopped being able to back up the system volume, and I can't figure out how to get it to at least back up the C drives either.

Other than that, it did seem like it fit my needs. :(

Interesting. Personally I went the other way round and stopped doing dedicated periodic backups altogether. Photos are on my computer, phone and the cloud. Code is in repos and the few documents I need in Dropbox. I use a local time machine disk that I plug in from time to time (which I reformat if I ever need to reinstall the machine.)

My main reason for ditching periodic backups (backblaze) was that even in situations where restoring a backup would be useful, I found it easier to just reinstall the OS and pull a few repos. Nice thing is that this forced me to automate the machine "setup" so I just have one script that installs my cli tools of choice and links the correct config files.

> half-assed rsync and shell script abomination

I don’t understand the author’s difficulties with a minimalist bash-wrapped rsync-based backup. You can even hardlink to unchanged files from a previous backup to save space.

This is how I wrap rsync: https://github.com/kaumanns/snapshot

And regarding file permissions: why not simply use an EXT4 backup drive instead of a FAT32 one? Non-rhetorical question.

My home network Raspberry has an HDD attached which gets fired up every couple days for a fresh snapshot of $HOME. The only thing I am missing is redundancy. And possibly encryption.

Getting an rsync wrapper to be robust takes some work. The wrapper script I use evolved over things I found while running it across ~200 hosts nightly for a couple of years. It started as one of those hardlink scripts but evolved into using zfs snapshots. My goal, though, was for it to be the ultimate in reliability: I wanted it to just work as much as possible, stay quiet unless the backup failed, and let me know when it did.

15 years later, nightly backups across maybe 300 machines, this is what I have:


This is very elegant! Thank you for sharing. I look forward to studying it a bit deeper.

I'm personally using restic[0] to create encrypted/de-duplicated backups. I use a local external drive and Backblaze B2 to push the snapshots to. There's no server to maintain.

The best thing about restic, in my opinion, is the ability to mount[1] snapshots on my machine using FUSE without explicitly extracting the backup to a local directory.

[0] https://restic.readthedocs.io/en/latest/index.html [1] https://restic.readthedocs.io/en/latest/050_restore.html#res...

edit: formatting
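For reference, the restic workflow described above looks roughly like this (the repository path and bucket name are placeholders; swap the local path for a b2: URL to target Backblaze B2):

```shell
# One-time: create an encrypted repository. A local disk path is shown;
# "b2:my-bucket:restic" (a hypothetical bucket) would target Backblaze B2.
export RESTIC_PASSWORD='correct horse battery staple'
restic -r /mnt/external/restic-repo init

# Each run: snapshot a directory. restic deduplicates and encrypts
# client-side, so repeated runs only store new chunks.
restic -r /mnt/external/restic-repo backup ~/Documents

# Browse any snapshot in place via FUSE -- no extraction needed.
mkdir -p /tmp/restic-mount
restic -r /mnt/external/restic-repo mount /tmp/restic-mount
```

The mount exposes snapshots as ordinary directories under the mount point, so you can cherry-pick individual files with normal tools.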

Restic has some failure cases where it claims that backup was successful, when in fact some of your data was not backed up. This is just about the most horrifying failure case imaginable.

For example, if you take a backup of a bunch of files which includes TrueCrypt containers, and then you modify the containers and take a new backup, it will not back up the new data. Instead, it will look at file metadata to erroneously conclude that the container has not changed.

Now, some people argue that this is not an issue, because you can use non default configuration of TrueCrypt and/or Restic to prevent this problem. But how would a Restic user know that they need to do this?

I don't want to become an expert in the internals of the backup software I'm using. I just want it to work -- or at least fail in predictable ways.
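For what it's worth, the non-default restic configuration alluded to above is the `--force` flag on `restic backup`, which makes restic re-read file contents instead of trusting mtime/size, at the cost of scanning everything; the repo and path below are placeholders:

```shell
# Re-read contents rather than trusting metadata, so in-place modifications
# (like the TrueCrypt containers above) are always picked up. Repo path and
# target directory are placeholders.
restic -r /mnt/external/restic-repo backup --force /path/to/containers
```

Whether that's acceptable depends on how much data you have, which rather underlines the commenter's point about needing to know the internals.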

Yup, restic is like git for backups. Really happy with it.

I have been using borg for a while and it has been a joy. But I was under the impression that you had to trust the server if you aren't mounting the remote repo, but are running borg on the server too (like the author). Before commenting that here, I had a quick read of the docs.

Turns out, I was wrong! [1]

> If you use encryption, all data is encrypted on the client before being written to the repository. This means that an attacker who manages to compromise the host containing an encrypted repository will not be able to access any of the data, even while the backup is being made.

borg is even more amazing than I thought.

[1] https://borgbackup.readthedocs.io/en/stable/quickstart.html#...
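A sketch of that setup, for the curious. The SSH host and repo path are placeholders (rsync.net works the same way); with repokey encryption, all data is encrypted client-side before it leaves the machine:

```shell
# Placeholder host/repo; encryption happens locally before upload.
export BORG_PASSPHRASE='correct horse battery staple'

# One-time: create the remote repository with repokey encryption.
borg init --encryption=repokey user@backup.example.com:backups

# Each run: create a timestamped archive; only new chunks cross the wire.
borg create --stats user@backup.example.com:backups::'{hostname}-{now}' ~/

# Prune old archives to a retention policy.
borg prune --keep-daily=7 --keep-weekly=4 --keep-monthly=6 \
    user@backup.example.com:backups
```

The `{hostname}-{now}` placeholders are expanded by borg itself, which makes the same script reusable across machines.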

I used to use Borg but I switched to restic, they're very similar but restic doesn't require a server (you can use BackBlaze B2).

I use syncthing to get all my data onto a single device, and restic to back everything up to Backblaze. Restic also encrypts and deduplicates, though I don't know if it's at the block level.

Currently I use node.js + rsync + duplicity to create two backups, and it works pretty well. First I create a backup on a local server via node.js and rsync, then upload it to the remote server via duplicity, which supports both encryption and compression. Both tasks run periodically and automatically, and back files up incrementally.

If, like me, you didn't click all the links in the article and were wondering where the 100GB for $18 pricing was (it wasn't on the main pricing page), it's at: https://www.rsync.net/products/attic.html

The older I get, the lazier I've gotten about backups. Now I just use OneDrive and iCloud. I know I can lose it all with one mistake, but it's a lot of convenience and it's cheap.

Does borg prevent an attacker from verifying the presence of a file? Content-based dedup schemes can have this problem, and I can't find anything in the docs saying it's addressed.

File or chunk.

I wish Backblaze supported Linux. On Mac/Windows it's impossible to beat Backblaze IMO. It saved my ass against disk failure twice already.

I'm planning to go to Backblaze from Carbonite (which was in turn preceded by a now-defunct Crashplan offering.)

Many times I've cursed at Carbonite's app for doing nothing when I want to back up, and popping up annoying when I don't.

I use restic pushing to Backblaze from Ubuntu Linux. Works perfectly.

You're pushing to B2 though. That'll cost a pretty penny for how much data I need to store. Come to think of it though maybe I could push to a disk attached to a Windows machine, which would then back up to Backblaze. Awkward, but seems like it'd work.

That's very likely the reason they've not bothered with a Linux client yet. I'd bet the typical Linux user has a hell of a lot more data than the typical Windows one.

I use B2/restic and only selectively backup things that are either irreplaceable or I'm willing to pay for. It costs me around 1 USD/month, which is only ~200GB. I have a lot more data than that on my NAS!

I used borg for a long time but shifted over to Restic (mostly because of single binary and win/mac/linux/bsd compatibility)

I should maybe have a closer look at borg. Just to learn what alternatives there are to my current restic + Backblaze B2 setup.

Just use tarsnap.

rsync.net: 100 GB for $18/year (no ingress/egress charge)

tarsnap: 100 GB would be $300/year (excluding ingress/egress)

Edit: I believe both support deduplication (borg and tarsnap), but no idea if one is superior to the other?
