From the timer I use to back up my server using Borg to rsync.net:
Can you imagine not reading the docs to discover those options? So you spin up a database to save state about runs to implement the delay, and you need a dashboard to monitor the various parts of the system for debugging.
Or you read the docs
So if it initially runs the script between 0:00 and 1:00 and the script takes 1.5 hours to finish then the next run will be between 1:30 and 2:30 the next day, instead of 0:00 and 1:00.
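A minimal sketch of such a timer (the unit name and values are illustrative, not the poster's actual config; `OnUnitActiveSec=` is what produces the drift described, since it counts from the timer's last activation):

```ini
# borg-backup.timer -- illustrative unit
[Unit]
Description=Daily Borg backup to rsync.net

[Timer]
# Re-run 24h after the unit was last activated. This is what makes the
# schedule drift: a late or long run pushes every subsequent run forward.
OnUnitActiveSec=24h
# Add a random delay of up to 1h on top, so a fleet of machines spreads out.
RandomizedDelaySec=1h

[Install]
WantedBy=timers.target
```

With `OnCalendar=` instead, the schedule re-anchors to wall-clock time each day and does not drift.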
Shouldn't it be between 1:30 and 3:30? I'm just nitpicking, of course; that's a nice solution.
E.g. run a task at intervals of 86413 seconds
perl -e 'sleep int(rand(43200))' && certbot -q renew
Let's say you run your jobs on the hour: one job every 4h and another every 24h. Without planning, because 4 divides 24, you have a one-in-four chance of a collision, with the 24h job running at the same time as the 4h job.
If you add more 4h jobs, the probability that one of them collides with the 24h job increases.
The more jobs you have, the higher the probability that some intervals will divide others.
Using prime numbers for scheduling reduces the probability that those jobs collide at any given time. If you create a job every 5h and another every 23h, the 23h job will only collide with the 5h job every 115h: LCM(5, 23) = 115.
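The collision period is just the least common multiple of the two intervals; a quick check (plain Python, illustrative):

```python
from math import gcd

def lcm(a, b):
    """Least common multiple: how often two fixed intervals coincide."""
    return a * b // gcd(a, b)

# 4h and 24h jobs started together collide every 24h -- i.e. on every
# single run of the 24h job.
print(lcm(4, 24))   # 24

# Coprime (e.g. prime) intervals push the collision far out:
print(lcm(5, 23))   # 115 -> the jobs only coincide every 115 hours
```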
Interestingly, this technique is used in nature by cicadas, which evolved long, prime-numbered periodical life cycles to avoid predators syncing up with them.
If you're on a sufficiently large network, that surge can cause failures. And a fixed retry policy will just cause the same stampede to recur on the retry intervals; you want to add jitter to ensure that you spread the load out.
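One standard recipe for this (my sketch, not something from the thread) is exponential backoff with "full jitter": each client picks a uniformly random delay within an exponentially growing window, so retries smear out instead of re-stampeding in lockstep.

```python
import random

def full_jitter_delay(attempt, base=1.0, cap=300.0):
    """Delay before retry number `attempt` (0-based): uniform in
    [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))

# Without jitter, every client would retry at exactly 1, 2, 4, 8... seconds
# and collide again; with full jitter the retries spread across each window.
for attempt in range(6):
    d = full_jitter_delay(attempt)
    assert 0.0 <= d <= min(300.0, 2.0 ** attempt)
```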
The more prime numbers you use, the rarer a stampede affecting a certain percent of nodes will be. To an extent that makes a bigger network safer.
Adding more nodes with the same prime numbers means the peak load scales linearly with the network size. So wouldn't that mean that a "sufficiently large" network is no worse off than a small network?
Also a backup is a bulk upload that can easily run a thousand times slower than normal and still succeed. Even if every server triggers at once, that shouldn't inherently cause failures.
15h, 2h, 9h, 20h, 7h, etc. So you won't have the backup running at the same time every day.
Personally speaking, I would prefer that the backup run at the same time every day, but some people don't.
Depending on a more exact statement of the goal, the best constant interval to avoid repeating parts of the day on nearby dates would be to multiply a day's length by the golden ratio; this would be about every 38.833 hours.
Since the golden ratio is irrational, you'll technically never repeat the same time. But if I remember correctly, it's the best number to space out the times uniformly throughout the day and also distantly between nearby days.
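A quick check of the numbers (plain Python; the "best spacing" claim is the low-discrepancy property of irrational rotations, stated here without proof):

```python
# Golden-ratio spacing: each run starts phi * 24h after the previous one,
# so the start times (mod 24h) spread out evenly rather than clustering.
PHI = (1 + 5 ** 0.5) / 2        # ~1.6180339887
INTERVAL_H = 24 * PHI           # ~38.833 hours between runs

starts = [(i * INTERVAL_H) % 24 for i in range(10)]
print(round(INTERVAL_H, 3))     # 38.833
print([round(s, 1) for s in starts])   # no two start times repeat
```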
I don't have a strict need to run my backups at a fixed point in time (e.g. within the night hours). By not hitting a hotspot I have a better chance of having a larger percentage of the targets bandwidth for my needs (both network as well as disk IO).
The random delay ensures that the job runs at a different point in time every day, with most of these points in time being expected to have a light load. If it accidentally hits a hotspot on one day it will be fine the next.
iOS storage management has improved with a user-visible "filesystem" and storage providers allowing edit-in-place, but there's still virtually no support for backup or rsync. The native iOS Files app is not a robust client for NAS storage. So far, the best option has been GoodReader (Russian devs), which implements robust sync (SMB, SFTP & more) within the app, along with optional in-app strong encryption that goes beyond iOS data protection. Unencrypted files are visible to other apps. https://www.goodreader.com/
SanDisk's iXpand has built an ecosystem of iOS apps that support their custom protocol for iXpand flash drives via Lightning. Now that the iPad enables access to local storage via USB-C, we need a similar ability to mount a ZFS drive, even if Apple won't provide this natively in iOS.
With a low-cost x86 SBC like Odroid H2+, an entry-level NAS can be constructed with Ubuntu ZFS and dual 3.5" drives.
Thank you - this is very interesting.
Although I am a (casual) iphone user (my iphone has never seen my real name or my real phone number and has never touched rsync.net) I was not aware of the user-visible filesystem nor was I aware of "Goodreader".
Does this user-visible filesystem allow me to just copy over my entire music library (which is files and directories, and no knowledge of apple/itunes/ios) and then let itunes browse it, locally on the phone ? Or do I still need to do complicated import tasks ?
Sort of. Each app has the equivalent of a folder. You could use the Files app to copy a directory tree of music files from a remote server, into the folder of a music-playing app (e.g. Flacbox, Nplayer). Then all the files are "local" to the app. This assumes the copy doesn't fail, e.g. several GB over a flaky network, as the Apple-native Files app can't resume a copy, and will abort on every duplicate.
> and then let itunes browse it, locally on the phone ?
The on-device music app from Apple conflates downloadable iTunes-purchased files with streaming Apple Music. It does not play local files AFAIK. But you can use several third-party apps to play music & video.
The challenge is keeping the device and remote server in sync :) Hence GoodReader, which does a decent job of file synchronization within the constraints of a non-Apple app. It can also play music and videos, but doesn't have all the bells & whistles of a dedicated media app like nPlayer. But then those apps don't have file sync.
While apps (e.g. media players or editors) can edit-in-place or view-in-place files within other apps (e.g. storage providers like GoodReader), this doesn't seem to support the ability to maintain a continuous mirror view / symbolic link on a directory tree from GoodReader into nPlayer.
Other apps for an iCloud-free iPhone: 2Do (CalDav sync), local scraper/search DevonThink 2 Go (Webdav sync), Secure Shellfish (local access to remote files via SSH), HereWeGo (formerly Nokia offline maps), iCatcher (podcasts), Codebook (password manager with native sync to Mac), Kiwix (offline wikipedia/stackoverflow).
You can also use Files to directly browse and access files from third party cloud services, although it requires you to download the service's app first.
But my use case has involved connecting to a remote host over a ZeroTier VPN, using an iPhone 6S. I haven't experienced the scenario you mentioned; most problematic issue I've had is sporadic latency spikes during folder navigation. I haven't really done any true stress testing of things, but it's handled transitions between wifi and cellular network without any noticeable issues.
Looks like there are two different USB adapters Apple sells, and the product page for this one mentions it can be used for quite a few USB devices, including "USB peripherals like hubs, Ethernet adapters, audio/MIDI interfaces, and card readers for CompactFlash, SD, microSD, and more."
Biggest issue for both external HDDs and other devices appears to be power delivery, which can be alleviated by using a drive with external power or leveraging a powered USB hub between the phone/tablet and the USB device.
Several apps (nPlayer, Infuse, LumaFusion) also support SanDisk iXpand Lightning/USB flash drives natively. With native app support, files can be played directly from the flash drive, without copying to the iPhone.
The compression and de-duplication is very useful. A little bit of a learning curve to get everything up and running, but not too bad.
Why is that the case and wouldn’t that make the encryption very weak? Simultaneous updates happen quite often.
Would restic have the same problem?
Update: The issue happens because Borg uses AES in the CTR mode (not AES GCM) and two clients could provide the same nonce. The server could then recover the plaintext from two cipher texts. This is the famous nonce reuse problem.
So the Borg developers are not using established primitives for this use case. Also, I am not comfortable with OpenSSL, even though it has gotten better since 2015. The libssl code base is a mess and buggy. On the other hand, using the low-level libcrypto library would expose developers to the raw crypto primitives, with possibilities for errors for people who are not experts in cryptography.
Borg should consider ChaCha20-Poly1305 as in rclone (or at least AES-GCM).
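To illustrate why nonce reuse in a CTR-style stream cipher is fatal (toy keystream built from SHA-256, not Borg's actual AES-CTR): XORing two ciphertexts that share a key and nonce cancels the keystream, leaving the XOR of the plaintexts, so knowing or guessing one plaintext reveals the other.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy CTR-style keystream: hash(key || nonce || counter) blocks.
    Stands in for AES-CTR purely to illustrate nonce reuse."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key, nonce = b"shared-key", b"reused-nonce"    # two clients, same state
p1 = b"backup chunk from client A"
p2 = b"backup chunk from client B"
c1 = xor(p1, keystream(key, nonce, len(p1)))
c2 = xor(p2, keystream(key, nonce, len(p2)))

# The keystream cancels out: c1 XOR c2 == p1 XOR p2.
# An attacker who knows p1 recovers p2 outright:
recovered = xor(xor(c1, c2), p1)
print(recovered)   # b'backup chunk from client B'
```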
This is explained in the "Encryption" section: https://borgbackup.readthedocs.io/en/stable/internals/securi...
The important part is the part about avoiding re-use of the AES CTR value.
> Simultaneous updates happen quite often.
Personally I created a dedicated borg repository per machine I want to back up, because that avoids sharing passphrases across machines. This comes with the drawback that I cannot deduplicate across machines, but that is acceptable to me, because the data is mostly unique-ish anyway. I only back up the user data, not everything (e.g. /bin/).
I meanwhile read about it and updated my comment.
Would it be practical to rclone the output of Borg into a cloud service using an rclone crypt remote? In my experience, rclone's crypt remote is sluggish, even locally. I am not sure how the mount would work.
It’s unfortunate that we have to get the dedup from Borg and the encryption from rclone!
I gave it the old college try to recover, using the different tools to try to access it (the fuse mount, the CLI), I tried all sorts of different settings for my locale. At the time I had at least 2 other backups of that so eventually I recovered from my primary backup. I was testing out Borg at the time.
I've ended up using Restic more recently, and it seems to be fine. Uses kind of a lot of memory in some situations though. Small AWS instances have issues. My primary backups still go via rsync though.
I wonder if I am missing something compared to restic?
The biggest difference for us is that borg really requires a server side 'borg' binary to talk to, which we have built into rsync.net. restic, on the other hand, can just connect to any old SFTP endpoint.
This means we need to preserve some amount of backwards-compat and so we maintain borg0.x and borg1.x binaries in our environment (and eventually, borg2.x).
I haven't used borg. Only done some maintenance on restic backup jobs at work. Restic's command design is intuitive and the documentation is good. But Borg looks just fine in that regard as well.
I posted another question on Borg, in case you know the answer!
I used to run Crashplan with near-continuous backup for my important files, and I'm still missing this.
Around the same time I tried Restic. That ran into difficulties (don't recall what anymore) so I switched to Borg.
Borg has been 100% reliable, including a full restore of /home after my laptop was stolen.
Like I said I was hoping to get the 10-15 minute intervals I had with Crashplan.
Photos are on Google Photos, use Dropbox/drive for data.
Of these, I find 1) is by far the most common, and 2/3 aren't even close. The problem with any backup scheme created by myself is: if I couldn't be trusted to maintain my data without deleting it, I sure as hell can't be trusted to set up my own backup scheme without screwing it up.
Example: I have had regular backups to a NAS, which were then uploaded to an offsite server where I rotated the data in. But I screwed up a RAID config on the NAS after a hard drive failure, and didn't notice I had lost a lot of data, which after a couple of years had also been removed from the offsite server.
Basically: to be a good backup solution for me it has to be idiot proof. Zero manual configuration (If there is a config file or command line anywhere, it's out). I want a gui tool that gets out of my way and has good defaults, and has such a huge disk area that it can have effectively write-only semantics. I want retention for deleted files. Currently I use iDrive which is pretty good and lets me back up parents computers and so on, in the same 2TB.
Yes, it’s cool to set up a FOSS FreeNAS server with ZFS pools. However, one look at the forum posts of people losing data due to misconfiguration tells me it should be considered a toy project.
All of my important files live on the NAS as a source of truth. This means I passively make sure the data is there, every time I access something.
It’s backed up using Hyper Backup to the cloud, in encrypted format. I verify restoration once a quarter. I also keep a HDD around that I manually backup to once a quarter.
My NAS was a Synology too (The one where I lost a lot of my data, despite Raid1 + offsite sync). No dry-run restoration from the cloud was my mistake.
I really want something that ends with a full disk image that's easy to restore to a new device, runs backups on a schedule (and will run a while after the next boot if the computer is off at the scheduled time), writes the images to a unix system on the LAN (either directly, or by writing to SMB), and doesn't cost an arm and a leg.
Doesn't provide a perfect full disk image, but it does store everything I need. I've done one full restore from them (fried motherboard from a power surge) and it went as smoothly as I could expect.
Another point, I suppose that backblaze comes with dedup and compression?
Re dedup/compression, it's a bit irrelevant because their plans are unmetered.
...until you need to restore from backup. You then have to sign in on the Backblaze website and enter that key, the files you are trying to restore are then decrypted on their end, and bundled up and sent to you.
They say that the key is only ever in RAM, and only then briefly.
However, another option is to back up just the data and reinstall the OS + programs in case of a disaster. I've been set up this way for nearly a decade, now using Bvckup 2  as a replicator. This is faster and lighter on the system and it creates backups that are readily accessible.
Up until now I've been manually doing it via 7zip. But doing it manually is so unreliable that it doesn't even count. Or using Macrium, but it always felt like overkill.
Bvckup2's archive/keep-everything feature for handling deletions is really great!
Someone else in this thread mentioned Duplicati, which also looks really great. I might add that to my backup flow. I thought it was you, but now I can't find it anywhere in the thread. I guess they either deleted or edited their comment.
I also have bvckup2 (worth buying almost for the amazing UI alone) but I use it more for syncing some folders to and from a NAS.
I've used it to restore twice: same machine and a new machine. It worked without an issue once the USB boot drive was created.
I think the cost is reasonable for 5 workstations.
Had a SSD die on me a few years ago, the primary disk. With no warning it just bricked itself. Thanks to Acronis my computer was running again less than an hour later.
Have also used it to restore documents and similar I accidentally deleted.
Another nice feature they have is their malware protection service. It detects programs modifying a large number of files in a relatively short amount of time, blocks them until you say if it's ok or not.
Arq 5 was OK.
Arq 6 was shipped in a state that wasn't suitable even for beta. It corrupted and destroyed backups created with previous versions, couldn't complete new backups, didn't work on fresh installs, had no documentation, no development plan, and very poor communication from the dev addressing all these issues. The backlash was so bad that they closed their Twitter account and locked up the Arq subreddit (only to claim later that it wasn't them, but Reddit itself that did that).
A lot of people, me included, were expecting Arq 6 with a great deal of excitement only to witness one of the greatest dumpster fires in the recent history of ISVs. The news now is that they decided to just bury Arq 6 without trying to fix it and move on to Arq 7 - https://www.arqbackup.com/blog/next-up-arq-7/
I tried to communicate the best I could about what we were doing -- a blog post, responding to all the reddit comments on the arqbackup subreddit that somebody else controls, answering thousands of emails. For at least a week I answered 300+ emails/day while simultaneously trying to diagnose and fix the issues people were experiencing.
At one point I deleted the Twitter account because I couldn't cope psychologically with all the hate and the personal attacks.
We set about immediately working to make Arq 6.3 "backward-compatible" with old Arq data (rather than import it into the new format, which failed unexpectedly for quite a few people).
A month into it we tried making a UI that's "native" (like Arq 5) and realized we like it better too. So, we missed our June 30 deadline of making Arq 6 backward compatible, and decided to just start over with a native UI.
We were going to ship that as Arq 6.3, but a few weeks ago realized that just shipping it as a point release would be way too disruptive. So it's going to be Arq 7. Arq 6 users of course will be upgraded to Arq 7 for free.
I know we screwed up. We're trying really, really hard to make it right. We promptly refunded every single purchase for which a refund was requested. It's not about the money. It's about trying to do the right thing.
I don't know what else to do at this point. If you have suggestions please let me know.
> bury Arq 6 without trying to fix it
I don't understand this. Arq 7 is the fix for the Arq 6 issues. It's free for Arq 6 users. We're not trying to bury anything. We've been really open about saying we screwed up and we're doing all we can to fix it.
The issue with the Arq 6 release was not that it was bad per se, but that there was no clear _public_ communication from you. This was twice as jarring because in recent years you've been making comments to the effect that it was no longer just you, but a team. So not hearing anything official for days, if not weeks following such a disastrous release cost you a great deal of goodwill. For every email you got, there were 10 people who didn't bother to send one.
The hate and personal attacks you were seeing were a side-effect of that. The rule of thumb for when you screw up is that you _must_ talk to people. Tell them, verbosely, what's happening on your end, what caused this, what you do to prevent the same from happening again. Talk like a chatter box. As shallow as it may sound, this shows people that you are on top of the things and it builds sympathy. All you have to do is to demonstrate that you are feeling the pain and working to resolve it. Once there's a critical mass of users that are supportive of your recovery efforts, it will prevent others from turning into trolls and haters.
Talk to your users.
You weren't doing this, not in public. That was the main issue with Arq 6 release. Not that you screwed up.
At some point I think I had some sort of breakdown and could no longer cope.
I'm trying to recover here. But every time somebody mentions Arq, someone seems to come along and make a comment like your above comment, which makes me want to throw in the towel frankly.
I bust my ass day after day to try to do the right thing because I believe that what you put out into the world is what you get back. I hope in the longer run that's true.
You should really consider doing everything possible to move support from 1-to-1 interactions to 1-to-many. A couple of simple things will go a long way - open up a forum and add an FAQ page.
Right now you don't have a place on the website where users with issues can go. If someone runs into a problem, they will be looking for the problem description, not documentation. The only option is to either send an email, which is slow, or to use Twitter, since you appear to be responsive there. It's literally the least effective setup as far as managing the support load goes.
Once you have self-serve support options set up, you can funnel all support queries to the FAQ page first and to the official forum second. Redirect all Twitter and Reddit queries to the forum and answer them there. Do not engage in any support conversations on Twitter and Reddit at all. Chit-chat is OK, but no support talk. Keep an eye on what the actual FAQs are and populate the FAQ page based on that. In a matter of weeks you should see the time spent on support go down dramatically.
You'll get through this. Arq still has momentum, and the vast majority of Arq users are loyal. They still wish Arq well. I know I do.
- Unison was a command line tool, like a better rsync, designed for backups. But it feels quite kludgy doing backups/restores from the command line in Windows.
- DriveImageXML not only backed everything up, but created an XML file listing what was backed up where, so in the event you didn't have the software to restore stuff, you could still conceivably recover it (if you could write something to parse the XML and extract). It worked well, though it was slow. It got me through a few computer changes, and I had no problems restoring.
- Windows' built-in backup software worked really nice until you upgraded and then wanted to restore something from a backup made in a previous version. Being able to restore is kind of critical though.
- One of my employers required us to use a cloud backup solution which I won't name, but it was about as heavy as having McAfee Antivirus or similar on your system wasting most of the resources most of the time.
- When I backed up my last computer (Windows 7) prior to upgrading to Windows 10, I had a free copy of Acronis TrueImage just for that purpose, and so far it has seemed to work beautifully. I have zero problems recovering files from that backup.
- Currently I'm using Paragon Hard Disk Manager, which is fast and simple. So far it seems ok, but I haven't had to restore from it yet.
So with that experience, I'd probably look at Acronis and Paragon. My concern always though, is - when I need to recover these files, how do I do so if my computer doesn't have the software that created the backup? (Assume new computer or freshly reinstalled system.)
That's tricky and a spot where the things like Unison and DriveImageXML may still hold some value. If you can make a backup of your backup software separate from your backup so you can be sure to have it available to restore, then that might not be a concern. But it's always been a concern of mine - having a full backup of my data but in a format that I can't access.
I think there's one that can make a virtual hard disk as a backup that mirrors the real one, so without anything but the OS, you could boot and mount your backup as if it were the system disk, but I don't remember what it is.
I think that even though some pop-up messages tell you that the previous backup will be blown away, it actually is incremental to a certain extent, and the recovery tool in the installer sometimes does list multiple dates to restore from -- although I'm not sure if and how data retention can be controlled. Also disk encryption is removed on restore, and I think the backup is not encrypted at rest either; you need to keep it in an encrypted location to begin with.
For file-level backups, I'm using an rsync frontend, QtdSync, but I also had success with Borg running under Msys2's Python interpreter.
Using a network location is somewhat less useful. You lose Volume Shadow Copy so it becomes a single generation full-backup-every-time solution. It's still easy to mount and to restore from, but marginally more useless.
It would figure that Microsoft announced (last year, I believe) that the feature is no longer being developed.
Other than that, it did seem like it fit my needs. :(
My main reason for ditching periodic backups (backblaze) was that even in situations where restoring a backup would be useful, I found it easier to just reinstall the OS and pull a few repos. Nice thing is that this forced me to automate the machine "setup" so I just have one script that installs my cli tools of choice and links the correct config files.
I don’t understand the author’s difficulties with a minimalist bash-wrapped rsync-based backup. You can even hardlink to unchanged files from a previous backup to save space.
This is how I wrap rsync: https://github.com/kaumanns/snapshot
And regarding file permissions: why not simply use an EXT4 backup drive instead of an FAT32 one? Non-rhetorical question.
My home network Raspberry has an HDD attached which gets fired up every couple days for a fresh snapshot of $HOME. The only thing I am missing is redundancy. And possibly encryption.
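The hardlink trick mentioned upthread can be sketched without rsync (illustrative Python: flat directories only, and it compares content, where rsync's `--link-dest` compares size/mtime):

```python
import filecmp
import os
import shutil

def snapshot(src, prev, dest):
    """Copy src into dest, hardlinking any file unchanged since the prev
    snapshot, so unchanged data costs no extra disk space.
    Sketch only: flat directory, no subdirs/permissions/symlink handling."""
    os.makedirs(dest, exist_ok=True)
    for name in os.listdir(src):
        s = os.path.join(src, name)
        d = os.path.join(dest, name)
        p = os.path.join(prev, name) if prev else None
        if p and os.path.exists(p) and filecmp.cmp(s, p, shallow=False):
            os.link(p, d)       # unchanged: share the inode with prev snapshot
        else:
            shutil.copy2(s, d)  # new or changed: real copy
```

rsync's `--link-dest=PREV` does the same across whole trees, which is what most bash-wrapped snapshot scripts build on.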
15 years later, nightly backups across maybe 300 machines, this is what I have:
The best thing about restic in my opinion is the ability to mount the snapshots using FUSE to my machine without actually explicitly extracting the backup to a local directory.
For example, if you take a backup of a bunch of files which includes TrueCrypt containers, and then you modify the containers and take a new backup, it will not back up the new data. Instead, it will look at file metadata to erroneously conclude that the container has not changed.
Now, some people argue that this is not an issue, because you can use a non-default configuration of TrueCrypt and/or Restic to prevent this problem. But how would a Restic user know that they need to do this?
I don't want to become an expert in the internals of the backup software I'm using. I just want it to work -- or at least fail in predictable ways.
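The pitfall above can be reproduced with a toy size+mtime change detector (illustrative only, not restic's actual implementation): if a file is rewritten in place at the same size and its timestamps are restored — TrueCrypt can preserve container timestamps — the metadata heuristic reports "unchanged" even though the content differs.

```python
import hashlib
import os
import tempfile

def fingerprint(path):
    """Metadata-only change check (size + mtime): the cheap heuristic
    backup tools use to decide whether to reread a file at all."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)

# Simulate a TrueCrypt-style container: rewritten in place, same size,
# timestamps restored afterwards.
path = os.path.join(tempfile.mkdtemp(), "container.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 1024)
st = os.stat(path)
meta_before = fingerprint(path)
hash_before = hashlib.sha256(open(path, "rb").read()).hexdigest()

with open(path, "r+b") as f:            # modify contents, same size
    f.write(b"\xff" * 1024)
os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))  # restore timestamps

meta_after = fingerprint(path)
hash_after = hashlib.sha256(open(path, "rb").read()).hexdigest()

print(meta_after == meta_before)   # True:  heuristic says "unchanged"
print(hash_after == hash_before)   # False: the content did change
```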
Turns out, I was wrong! 
> If you use encryption, all data is encrypted on the client before being written to the repository. This means that an attacker who manages to compromise the host containing an encrypted repository will not be able to access any of the data, even while the backup is being made.
borg is even more amazing than I thought.
File or chunk.
Many times I've cursed at Carbonite's app for doing nothing when I want to back up, and popping up annoyingly when I don't.
I use B2/restic and only selectively backup things that are either irreplaceable or I'm willing to pay for. It costs me around 1 USD/month, which is only ~200GB. I have a lot more data than that on my NAS!
tarsnap: 100 GB would be $300/year (excluding ingress/egress)
Edit: I believe both support deduplication (borg and tarsnap), but no idea if one is superior to the other?