I Nearly Lost All of My Data (kevq.uk)
233 points by kevq on Jan 19, 2019 | 255 comments

Hard to believe we are 129 comments into this and there are zero mentions of 'borg' in the comment threads ...

For those that don't know, borg is a backup utility[1] that has been called the "holy grail of backups"[2].

It takes your plaintext files and directories, chops them into gpg-encrypted chunks with encrypted, random filenames, and will upload (and maintain) them, with an efficient, changes-only update, to any SFTP/SSH capable server.

My understanding is that the reason people are using borg instead of duplicity is that duplicity forces you to re-upload your entire backup set every month or two or three, depending on how often you update ... and borg just lets you keep updating the remote copy forever.

[1] http://borgbackup.readthedocs.io/en/stable/

[2] https://www.stavros.io/posts/holy-grail-backups/

The crypto in Borg is bad. Even has its own page in the docs about how bad it really is. The tl;dr is "there is one AES key per repository and we use this with AES-CTR and trust the remote server to not fiddle with the counter".
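To make the counter worry concrete, here is a minimal sketch. It is not Borg's code: it uses a toy SHA-256-based keystream as a stand-in for AES-CTR (an assumption, so that it runs with only the standard library), and the key and messages are made up. It shows that if the same key and the same counter value are ever used for two different plaintexts, XORing the two ciphertexts cancels the keystream.

```python
import hashlib


def toy_keystream(key: bytes, counter: int, length: int) -> bytes:
    """Toy stand-in for AES-CTR: one hash block per counter value (illustration only)."""
    out = bytearray()
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(16, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def ctr_encrypt(key: bytes, counter: int, plaintext: bytes) -> bytes:
    # CTR mode: ciphertext = plaintext XOR keystream(key, counter).
    # Decryption is the same operation.
    return xor_bytes(plaintext, toy_keystream(key, counter, len(plaintext)))


key = b"hypothetical repository key"
p1 = b"first client's secret chunk."
p2 = b"other client's secret chunk."

# Same key AND same starting counter: the keystream is identical.
c1 = ctr_encrypt(key, 0, p1)
c2 = ctr_encrypt(key, 0, p2)

# The keystream cancels out; an observer learns p1 XOR p2 without the key.
leaked = xor_bytes(c1, c2)
```

Here `leaked` equals `xor_bytes(p1, p2)`, which is why any CTR-based scheme has to guarantee that a counter value is never reused under one key.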


Last time I checked there was still no work being done on improving performance. Multithreading has been on the agenda for years and will probably never come, much like improved crypto.

Like many open source projects, Borg is low on developers. The two points you mention are both known and on the agenda for this year. Since you seem to have experience with crypto, why not add a PR? It's all in Python and fairly readable.

When someone says something is the "holy grail of backups" and it is "gpg-encrypted" I assume it does that well, not it does it poorly right now but might improve in the near future.

Thus his post was valuable to me.

"Why not add a PR" is not a reasonable response to something like this. I'm looking to use a backup system, not create one. If something works so poorly that I'd have to start submitting PRs before using it, I'm far more inclined to go elsewhere.

>> Multithreading has been on the agenda for years and will probably never come, much like improved crypto.

> The two points you mention are both known and on the agenda for this year

This is a rather strange response to the above...

Borg isn't really low on developers, it has high churn, i.e. repels developers.

I'm not sure that's a fair summary. It's very clearly stated that there is authentication performed after encryption.

The MAC should prevent the remote server changing the counter or any encrypted bits.
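For anyone unfamiliar with the idea, this is roughly what authenticating after encrypting looks like. It is a generic encrypt-then-MAC sketch using Python's standard `hmac` module, not Borg's actual on-disk format; `seal`, `open_sealed`, and the 8-byte counter header are hypothetical names and layout chosen for illustration.

```python
import hashlib
import hmac
from typing import Tuple


def seal(mac_key: bytes, counter: int, ciphertext: bytes) -> bytes:
    """Prepend the counter and append an HMAC-SHA256 tag over counter + ciphertext."""
    header = counter.to_bytes(8, "big")
    tag = hmac.new(mac_key, header + ciphertext, hashlib.sha256).digest()
    return header + ciphertext + tag


def open_sealed(mac_key: bytes, blob: bytes) -> Tuple[int, bytes]:
    """Verify the tag before using the contents; raise if anything was altered."""
    header, ciphertext, tag = blob[:8], blob[8:-32], blob[-32:]
    expected = hmac.new(mac_key, header + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: blob was modified")
    return int.from_bytes(header, "big"), ciphertext
```

A server that flips a bit in the counter or in the ciphertext makes `open_sealed` raise. What a MAC alone does not address is two clients independently picking the same counter value, which is the multi-client case quoted below.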

> When the above attack model is extended to include multiple clients independently updating the same repository, then Borg fails to provide confidentiality

Which is why multiple clients are not supported within the same repository.

They absolutely are supported.

Let's also mention Vorta, an open source GUI for Borg to make it as easy to use as commercial backup tools. We just started translating into different languages. All input and PRs are welcome.


(I'm the original author, but Thomas, the current Borg maintainer is also very active.)

Thanks for sharing this! Vorta seems to be exactly what I was looking for. I can't seem to get it up and running on Mac OS.

Edit: It looks like Vorta doesn't obey Mac OS's Dark Mode so it looks like the app doesn't launch.

There seem to be a lot of open source projects named after things in Star Trek.

Any plans to have a Windows client of Vorta? [Edit: Saw it now: "Windows is currently not supported by Borg, but this may change in the future."]

Vorta can run on Windows (it's in Qt), but Borg can't (for now). They are working on it though.[1]

You could probably run it today using the Windows 10 Linux subsystem, but it's not fully tested and will need some small fixes. Maybe later this year.

If anyone is interested in working on this platform, just post at [1]. It's probably very doable.

1: https://github.com/borgbackup/borg/issues/936

Have you looked at Arq[1]? It deduplicates, encrypts, compresses, backs up to most major cloud providers and SFTP and Minio, is available for macOS and Windows. We've been improving it for 10 years. The data format is open and documented[2].

(Note: I work on Arq)

[1] https://www.arqbackup.com [2] https://www.arqbackup.com/arq_data_format.txt

You've probably heard this before but there are a lot of people who'd be interested in a cross-platform (NAS/Linux/BSD/Raspberry Pi) version with a command-line or web interface.

As is it's really best for workstations rather than the NAS/servers a lot of people who care about their data use (not just basement-dwelling data hoarders but also professionals like photographers and videographers).

I currently use Borg on my servers, and Arq on my client machines.

I'm quite sure if Arq was available in a command-line version for Linux/FreeBSD, i would use that instead. It has been nothing but rock solid on my client machines.

Restore through the UI is a bit wonky, but once it's started it works well.

What about Arq would make you switch from Borg? (And if you could email me at stefan @ arqbackup.com I'd love to follow up and learn more)

My biggest reason is probably that i prefer to use the same (well known / proven) tools across the line.

I don't think Arq brings something to the table that Borg/Restic/Duplicity doesn't already provide, besides a piece of software i'm familiar with, and trust. (and trust doesn't come easy when it comes to backup software!)

As i wrote, i already use Arq on my client machines instead of the mess that is "TimeMachine over wifi", but when it came to backing up my NAS and server/VPS, i had to look for something else, as Arq is not available on those platforms.

My use case would be backing up with Arq from a FreeBSD/Linux NAS/Server to a remote (networked at least) target, and in case of a full restore, i would use the command line as well.

I would then use Arq for Mac/Windows for "plucking" single file restores.

Thanks. Yes I’ve heard that before but not as clearly as you’ve written it here. I appreciate the feedback a lot.

Have you checked out duplicacy[1]? Its support for incremental backups is fairly unparalleled in my experience. It does not have the same limitation as duplicity.

[1] https://github.com/gilbertchen/duplicacy

Looks nice, but it is not open source.

I know what you mean. It is not a normal open source license and for commercial use it requires a $20 a year subscription per user. With that said, I believe all the core source code is viewable and auditable if you use the CLI version, which is what I use on my FreeNas system and there is no charge for personal use. I think this may meet many people's personal backup requirements.

restic was mentioned a few times and is a very similar project with very similar goals. They both have encryption as a first-class concept and are snapshot based with content-defined chunking. I'd hope most people would be using these newer generations of backup tools.
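For readers wondering what content-defined chunking buys you, here is a rough sketch. It is neither Borg's nor restic's algorithm: the rolling hash, `MASK`, `MIN_CHUNK`, and the `GEAR` table are hypothetical choices for illustration. Cut points are chosen from the data itself rather than at fixed offsets, so inserting a byte near the start of a file shifts only the chunks around the edit and the rest still deduplicates.

```python
import hashlib

# Hypothetical parameters for illustration; real tools pick their own.
MASK = (1 << 6) - 1   # cut when the low 6 bits are zero (about 1 in 64 positions)
MIN_CHUNK = 16        # never emit chunks shorter than this (except the last one)
# Deterministic pseudo-random table mapping each byte value to a 32-bit number.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:4], "big")
        for i in range(256)]


def chunk(data: bytes) -> list:
    """Split data at positions chosen by a rolling hash of the recent content."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Older bytes shift out of the 32-bit window after 32 steps.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if i + 1 - start >= MIN_CHUNK and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def store(repo: dict, data: bytes) -> int:
    """Add chunks to a dict keyed by SHA-256; return how many bytes were new."""
    new_bytes = 0
    for c in chunk(data):
        digest = hashlib.sha256(c).hexdigest()
        if digest not in repo:
            repo[digest] = c
            new_bytes += len(c)
    return new_bytes
```

Storing the same data twice adds nothing the second time, and storing a lightly edited copy should add only the chunks near the edit.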

My favorite benefit of restic is that it doesn't require the receiving end to have any special software: you can use restic to backup to S3 or BackBlaze B2 directly.

Same! I’ve been using Duplicity for most of my backups and had been considering switching to Borg, but passed when I found it couldn’t use B2 as a backend.

Now I’m using both Duplicity and Restic, for different backups.

Yup, though due to Australian internet I don't use restic's support for non-local backends (otherwise getting a snapshot list or doing a purge+clean is painfully slow) -- I have a sync job that runs every day that uses rclone to sync the restic backup store.

I've been using restic for a long time now. Its flexibility is amazing:

* Fully open-source, command-line implementation. Script as needed.
* Supports a ton of backends, including directories, SFTP, and many cloud services.
* Supports multiple backup operations in parallel. Create one repo for multiple machines.
* A repo is really an encrypted object store with a list of directory/file/metadata references (snapshots). If the same file exists in multiple directories/drives/machines, it is deduplicated.
* Supports tagging, which allows you to identify independent snapshots and manage them.
* Supports purging old data with conditions, i.e. keep the last 7 daily, 3 weekly, and 6 monthly.
* Can verify and repair repositories.
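The "keep 7 daily, 3 weekly, 6 monthly" style of purge rule from the list above can be sketched as follows. This is only an illustration of the idea, not restic's implementation; `select_keep` and its bucket definitions (calendar day, ISO week, calendar month) are assumptions.

```python
from datetime import datetime


def select_keep(snapshots, daily=7, weekly=3, monthly=6):
    """Return the snapshot times to keep: newest per day/week/month, up to each limit."""
    rules = [
        (daily, lambda t: t.date()),                  # one bucket per calendar day
        (weekly, lambda t: tuple(t.isocalendar())[:2]),  # (ISO year, ISO week)
        (monthly, lambda t: (t.year, t.month)),       # (year, month)
    ]
    keep = set()
    for limit, bucket_of in rules:
        seen = []
        for t in sorted(snapshots, reverse=True):     # newest first
            bucket = bucket_of(t)
            if bucket not in seen:
                if len(seen) == limit:
                    break                             # enough periods covered
                seen.append(bucket)
                keep.add(t)                           # newest snapshot in this period
    return keep
```

Everything not in the returned set is a candidate for purging. A snapshot kept by one rule (say, daily) can also satisfy another (weekly), so the total kept is often less than 7 + 3 + 6.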

I regularly use it to back up my home server to a backup drive and to Backblaze. I've also used it to back up the home directory of my development machine to replicate it on another machine.

Have you looked into Bareos?

I'm setting up backups and looked into most available solutions, including restic, borg, and duplicity, but Bareos by far seems the best one. It's just a little harder to set up, but totally worth it.

Apparently the one downside of Borg is that it has to walk the whole filesystem for backups, rather than picking up changes: https://github.com/borgbackup/borg/issues/325

According to a more up to date comparison[1] (recommends Arq), Borg is also slow because it processes sequentially.

[1]: https://www.lullabot.com/articles/backup-strategies-for-2018

> Apparently the one downside of Borg is that it has to walk the whole filesystem for backups, rather than picking up changes: https://github.com/borgbackup/borg/issues/325

This is true of most backup programs.
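To illustrate why the walk is hard to avoid: without a filesystem-level change feed, the only way to learn what changed is to stat every file and compare against what was recorded last time. A minimal sketch of that approach follows; `scan` and `changed_since` are hypothetical helpers, not any particular tool's code.

```python
import os


def scan(root):
    """Walk the whole tree and record (size, mtime_ns) for every file."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # do not follow symlinks
            state[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
    return state


def changed_since(previous, current):
    """Files that are new, or whose size or mtime differ from the previous scan."""
    return sorted(p for p, meta in current.items() if previous.get(p) != meta)
```

Only the files returned by `changed_since` need to be read and chunked again, but the cost of the walk itself grows with the number of files, changed or not.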

If you're using borg backup and would like to support the project, you can do so here: https://www.bountysource.com/teams/borgbackup

Took me forever to find that myself so just posting it here in the hopes that more people will support the developers.

Love borg, can't rate it enough. There's also a pretty great service that supports it, you might have heard of it? ;) https://rsync.net/products/attic.html

Not only is borg great, rsync.net has a super cheap plan for borg and attic that is vastly less expensive than other cloud storage. I have about 140GB backed up this way and couldn't be more happy with the toolchain.

I feel as though you are missing a "Disclaimer: ...", though I will be looking at borg. :-)

I've been using Borg (and its predecessor, Attic) for a while now and I'm very happy with it. Perhaps my only complaint would be that at one time they changed the hashing function, which exploded the size of the repo for a while (since content was getting stored twice until all the old chunks were rotated out). Duplicity managed to lose my data several times, but so far I haven't had any such bad experiences with Borg.

The best solution I found as an alternative to an off-site copy is to set up a USB drive and something like a Raspberry Pi in my car. When I pull into the garage, the house server senses it and syncs new data to it.

This can be supplemented further by having it auto sync to a computer in your office whenever you pull into work. Add in some monitoring so you can get a phone alert if any of the synced copies are more than a few days out of date, and you are almost golden.
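The monitoring part can be as simple as checking how old the newest file in each copy is. A minimal sketch, assuming each copy is reachable as a directory; `stale_copies`, the three-day threshold, and the copy names are hypothetical, and actually sending the phone alert is left out.

```python
import os
import time


def newest_mtime(root):
    """Most recent modification time (epoch seconds) of any file under root, or None."""
    newest = None
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            m = os.path.getmtime(os.path.join(dirpath, name))
            if newest is None or m > newest:
                newest = m
    return newest


def stale_copies(copies, max_age_days=3, now=None):
    """Return the names of copies whose newest file is older than max_age_days."""
    now = time.time() if now is None else now
    limit = max_age_days * 86400
    stale = []
    for name, root in copies.items():
        newest = newest_mtime(root)
        # An empty copy counts as stale: nothing has ever been synced to it.
        if newest is None or now - newest > limit:
            stale.append(name)
    return sorted(stale)
```

Run from cron with something like `stale_copies({"car": "/mnt/car", "office": "/mnt/office"})` (paths made up) and alert whenever the result is non-empty.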

That’s on-site way too often for my tastes. If your house burns down while your car is in the garage, there’s a good chance you’ll lose the backup too. Adding a computer in your office fixes that, but why not just send your backups to it over the internet instead of using this ad-hoc vehicular transport protocol?

Never underestimate the bandwidth of a station wagon full of tapes travelling down the highway...

Initial seeding of your offsite backup over the Internet would be a pain, but most people's valuable data change rate isn't high enough to significantly tax modern broadband connections. I sync a few gigs of data a week and I never even notice it happen.
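The arithmetic behind that is easy to check. A back-of-the-envelope sketch, where the data sizes and the 10 Mbit/s uplink are made-up numbers and protocol overhead is ignored:

```python
def upload_hours(gigabytes, megabits_per_second):
    """Hours needed to send `gigabytes` (decimal GB) over a link of the given speed."""
    bits = gigabytes * 8e9
    seconds = bits / (megabits_per_second * 1e6)
    return seconds / 3600
```

With those hypothetical numbers, `upload_hours(1000, 10)` is roughly 222 hours (about nine days) for a 1 TB initial seed, while `upload_hours(3, 10)` is 40 minutes for a 3 GB weekly delta.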

When I had my new 'Man Shed' (It's a large workshop really[1]) built at the end of my garden, I ensured I could extend my network to it. I have an old NAS with a couple of large drives in it, and I have scheduled backups that write to it from the main file server in the house.

It's not perfect, but it's way better than not having any offsite backups at all. I have a deep distrust of Cloud storage services so for me I think this setup is the best solution I'm going to get.


[1] I'm in the UK, so our typical 'sheds' are absolutely tiny - this one is not.

Distrust in cloud storage in which way? What are you afraid will happen? Which data are you afraid is going to end up in whose hands?

The data I really care about is encrypted right on my laptop, and I back up that encrypted data.

It's not so much that, but more of a control aspect. Once my backups are on someone-else's-computer I've lost control of them, and knowing my luck the day I need them will be the day the cloud provider has an outage, or even worse, goes offline.

There is no way the big cloud providers (AWS, GCP, Azure) go offline without a notice period (even medium-size ones like DO).

Use a solution that lets you easily switch from one provider to another, and switch when there is any warning that the one you use might be in trouble. In the same vein, make sure people in your family (or technically savvy friends) understand your system, are able to switch, and know to pay the cloud provider bills if you're not around, unless it is OK for your entire storage to disappear when you die, of course.

I use syncthing for syncing: when my Raspberry Pi joins a specific BSSID, the sync starts.

It's susceptible to macspoofing but I hope my Pi won't join a spoofed network.

Even if it does, isn't Syncthing traffic encrypted? Also, 'they' would need to spoof your other syncthing node for the first one to start syncing with it, and if they have that much info, you've already lost.

I use syncthing to sync (mostly important text stuff) between my two machines and my android phone. Once every day one of the machines uploads the changes via duplicati.

How do you power the USB drive whilst it's syncing in your car, while parked in the garage?

What if the drive gets stolen?

It is hooked up to a USB battery pack, which in turn is plugged into the accessory port on the car. The hard drive powers down after the data transfer is finished.

All data is encrypted before sending to it. If it gets stolen, well it is only one of the copies of the data (however it lives hidden under the dash, and is small enough that it looks like a piece of car equipment).

Do you ever worry about the Li-Ion pack having issues in the hot car?

I've never specifically used a Li-Ion pack in my car PC setup, storing it in the heat would worry me a bit, but surprisingly, operational temps in a car aren't bad: If he's using it after he drives it, the car has likely been cooled or heated for driver comfort, and it'll take time after he parks for the temperature in the car to return to "outdoor" levels.

My bigger concern is just storing it. It sounds like he’s using just a regular powerbank type battery which is li-ion. Letting it sit in a parking lot for a day in the summer then turning it on probably isn’t a very good thermal situation. I wonder if some sort of sealed lead acid might be better. Far more inconvenient for power management though as some sort of circuit would have to be created.

Wow what a clever/simple elegant solution. I was wondering how you set it up so it wouldn't drain your battery. Excellent work!

He said it syncs when he pulls in so presumably a 12v power supply

Ha! That sounds like such a neat idea that I'm now skeptical that it is indeed just an idea! Do you have a write up somewhere? That sounds really cool.

This is brilliant! I think I'll hack something together for this soon...

Does the RaspberryPi have enough CPU for disk encryption?

It doesn't need to, assuming you're using something like rclone [1], where the encryption is done on the "client," which in this case would be the GP's home server. The client does the encryption, treating the disk attached to the Raspberry Pi as an untrusted third-party disk. As long as you keep the decryption key safe (e.g., on a USB stick stored off-site), you should be good to go in the event that you need to pull data and decrypt it.

[1] https://rclone.org/

I've run Truecrypt on mine, and it works with small drives - 100gb or so. But it gets super hot, even with a heatsink. I wouldn't rely on it.

This was so I could use it to back up my Synology box to a set of external USB drives and retire the older Mac Mini I had been using. I'm still using the Mini...

In the case of the author, +1 for doing backups, but -1 for not establishing a chain of off-site copies (a safety deposit box at your bank, a relative's house that's some distance away, etc.).

You can do encryption with any amount of CPU power, it’s just slower if you have less. Backups are not typically a case where speed is extremely important.

I'm not sure about disk encryption, but I know that the Raspberry Pi didn't have the juice to cut it when Syncthing was running, calculating hashes for my files. I eventually gave up, hoping that some day Go would support the use of GPU hardware to hash combined with accelerated drivers for the Pi's graphics card.

RaspberryPis can emulate Mario Kart on an N64, run Minecraft on Linux and even Mathematica at decent speeds; I'm willing to bet big bucks they can handle symmetric key encryption (which does not require much power).

If a crappy NAS can run disk encryption, so can a Raspberry Pi. They're deceptively potent.

Eloquent except the Rpi has a terrible USB bus.

It got significantly better in the last generation. As a result the disk I/O and network are significantly faster. While the last 2 versions had GigE, the throughput almost doubled in the last generation.

> RaspberryPis can emulate Mario Kart on an N64

Yes they can, and Golden Eye. I have just played Mario Kart with the kid before bedtime. It stutters a little if you have too many karts on screen, but that’s not a problem when you’re winning.

I am trying to find a cheaper device between a NUC and a Pi.

Far fewer choices are available.

NUCs are expensive. I wonder when an 8-core ARM-based NUC will arrive on the market.

No idea if a NUC can be powered by a Xiaomi Mi 2i power bank, but a Raspberry Pi can be.


1. Gigabit Ethernet

2. Small form factor NUC or smaller.

3. 8+ cores

4. Very low power consumption

You should give the Odroid HC2 (or HC1) a look (https://www.hardkernel.com/shop/odroid-hc2-home-cloud-two/)

It's an 8-core ARM SBC, with 2GB RAM, Gigabit Ethernet on one USB3 bus, and a SATA connector on another USB3 bus. It can fully saturate spinning rust, though it might have problems with SSD drives. In any case, it will fully saturate your GigE link with even a slow hard drive.

It runs a standard Linux kernel, with the "driver" for the hardware added, which is why it requires a special build. Latest kernel version is 4.18. There is also an OMV build for it.

Mine uses around 4-8W, depending on how much the harddrive is being used. I've got a couple of them running as backup targets (Arq, Borg, etc) with a 4TB WD Red and Btrfs, and they've been rock solid.

If you need hardware AES, you should probably be looking elsewhere.

The 3rd and 4th are mutually exclusive. If you want the power consumption of a smartphone, you also have the computational speed of a smartphone, but that should be enough for many applications.

I have experimented with a very large number of devices between NUC and Pi.

While the best processors for such devices would be ARM processors (with Cortex-A76 or Cortex-A75), nobody offers adequate devices.

You can find only devices with obsolete slow ARM processors, e.g. RK3399, which are much faster than Pi but much slower than x86, or you can find development systems or modules with modern smartphone ARM processors, but those are much more expensive than a faster Intel NUC.

Therefore, for the moment the only sensible choices are devices with either Intel Gemini Lake processors or with Intel Y-Series processors.

The devices with Intel Y-Series processors are much faster, but they are also more expensive. While Zotac has presented at CES one such device, its price and time of availability are unknown.

For now, the only device available with Y-Series is "LattePanda Alpha". Considering that its price includes 8 GB RAM, 64 GB flash and the Windows licence (even if I wiped it and I installed Linux) it is cheaper than an i3 NUC where at the same price you have no memory.

It has everything a NUC has, except that it has one less USB port (it has 1 C + 3 A, while some NUCs have 1 C + 4 A). I have installed in its 2 M.2 connectors a NVMe SSD and a second Ethernet interface. The speed was excellent, because the Kaby Lake CPU was configured for 7 W TDP with 15 W TDP for the first 28 seconds. The included fan does not start unless you run a heavy workload and even then it is silent. The size is smaller than a NUC but larger than a pico-ITX board.

If you want something cheaper, or if you want, like me, both full-size DisplayPort & HDMI connectors (not DisplayPort on USB C), then you must choose a Gemini Lake device, i.e. either Zotac ZBOX PI335 Gemini Lake (with N4100, not to be confused with the obsolete ZBOX PI335 with N3350) or an ODROID H2.

The ODROID H2 is faster and possibly cheaper (depending on the memory you buy or you already have; ZBOX has soldered memory), but it is larger. ODROID H2 is NUC-sized while PI335 Gemini Lake is pico-ITX sized. ODROID H2 has 2 Ethernet, but PI335 Gemini Lake includes WiFi & BT and it comes in a closed case which will protect it in dusty environments (both have passive cooling). PI335-GK uses a 5 V power supply, so it should be easy to be powered from a power bank. The CPU TDP is configured for steady state 6 W and 15 W for the first 28 seconds. The idle CPU power is less than 2 W. I have not measured the total idle power of the computer. With the CPU halted, the power consumption should be much less, but I have not measured it.

Gemini Lake has a speed just a little less than a Snapdragon 845, but it is much faster than older ARM processors and than older Atom processors.

There's the Intel Up board. 4 Atom cores, gigabit ethernet, 2xUSB3, 2xUSB2, HDMI and the same form factor as a Pi. You can get expansion boards that break out more USB ports (not on a hub, independent bandwidth), native UART and I2C. I've been playing with them for onboard computing on a drone, so far it is working really well. Price is about $100 for the low spec ones.

Not OP, it'd be neat if it just uses FTP (or rsync) to transmit data which is already encrypted from the uploading-side...

With encrypted upload, you usually get a destination authentication thrown in for free, i.e. you're uploading to the right server :-)

1. yes 2. you would want to encrypt before sending to rpi anyway

How does this not kill your car battery?

That's a really cool solution.

There really isn't a story here, Kev. You blew a power supply on your Synology. You even said it yourself: "maybe this is a problem with the enclosures and the disks are fine".

If you'd put your disks into a replacement Synology unit, you would have been back online - config, data, and all - within a few minutes.

This - I thought of URE as he mentioned Raid5 but the disk size was ok (1tb).

But this scenario? Lost almost all his data? I once closed the terminal window I had spawned gparted from while it was resizing a partition... that was nerve-racking.

The Synology unit isn’t cheap and it’s for backups - surely it warrants a surge protector. Easy to say after the event however.

Agreed. There's also something janky about the power supplies on some Synologys:


And joy of joys, it's not a standard off-the-shelf part, and pray you're not outside the warranty period because the price of a spare is eye-watering.

Steve has posted a few other videos about failed Synologys, worth a watch before parting with your cash.

This is the reason I upgraded our Synology with a self-built NAS with six disks, a low-power Atom CPU and a really really good PSU connected to a UPS unit for extra security. Never save money with your NAS power solution.

Which OS and file system did you choose?

I recently did similar with an old server and Freenas, but I am still not sure if I have a safe (enough) system overall. I chose RAIDZ2 with 4 4TB WD Red drives, ECC RAM, seems to be working well so far.

FreeNas with ZFS, RAIDZ2, 6 x 4TB WD Red, 16 GB ECC RAM.

System has been running a year now without any bigger maintenance.

Many years ago I watched someone stuff the DVDRW (remember them?) which contained all his stuff into a work PC’s pioneer slot loader drive to get some music off it. We stood there and the drive went bzzt, clang, bzzt, clang then sped up way faster than it was supposed to go. This was followed by a large bang and bits of DVDRW coming flying out and then a crunching noise.

From this I learned about single points of failure.

Edit: also in the decade and a half since I learned that you should never trust a magic box, magic piece of software or magic container file system for backups. A plain file system you can just copy your shit back from is the closest thing to a guarantee. Also it’s cheaper to curate your data carefully than end up with 4TiB of crap you’re too scared to deal with on your hands.

> A plain file system you can just copy your shit back from is the closest thing to a guarantee.

These days, I'd suggest using a file system with some form of file checksum metadata. If one values the integrity of their data, bit rot is a thing.

Very true. Alternatively, SHA-sum all your data per directory. It’s easy enough to write scripts to validate that.
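A script along those lines might look like the sketch below. It assumes SHA-256 and an in-memory manifest; `build_manifest` and `verify` are hypothetical names, and saving the manifest to a file (for example as JSON) is left out.

```python
import hashlib
import os


def file_sha256(path, block_size=1 << 20):
    """Hash a file in blocks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = file_sha256(path)
    return manifest


def verify(root, manifest):
    """Return files that are missing or whose content no longer matches the manifest."""
    current = build_manifest(root)
    return sorted(p for p, digest in manifest.items() if current.get(p) != digest)
```

Build the manifest right after a backup, store it alongside the data, and run `verify` periodically. An empty result means every recorded file still hashes to the same value.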

But define 'plain file system' - is that NTFS? ext4? ZFS? FAT32? HFS? UFS? ISO9660? TAR? cpio? Some of these are more complicated than others, and all of them have catastrophic failure modes...

I use NTFS.

The catastrophic failure modes are present only through hardware and misuse issues on a mature file system.

> What are the chances of both a 4 disk RAID failing AND a USB drive at the same time?

Probably close to 100% if your house catches on fire.

Came to say the same thing. Or flood, tornado, earthquake, etc...

The chances of both failing get even higher when you factor in failure causes like "I was running old firmware and someone's ransomware exploited that."

Probably closer to 10%. House fires rarely completely destroy the house and cars are often somewhere else.

Wildfires for example often destroy homes, but generally allow most people to evacuate.

Their USB drive was plugged into their NAS. The probability (as they found out in TFA) is very high that a power surge will knock out most of their devices. A house fire wouldn't hesitate in destroying both. I think "close to 100%" is being too generous.
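The disagreement above is really about correlated failures, and a little arithmetic shows why they dominate. A sketch with made-up probabilities, assuming the shared event and the individual failures are independent of each other:

```python
def p_both_lost(p_a, p_b, p_common):
    """Probability of losing both copies in some period.

    p_a, p_b: chance each copy fails on its own.
    p_common: chance of one event (surge, fire, theft) that takes out both.
    All three are assumed independent; the values used are hypothetical.
    """
    return p_common + (1 - p_common) * p_a * p_b
```

With 3% per copy and a 1% chance of a shared event, `p_both_lost(0.03, 0.03, 0.01)` is about 1.09%, versus 0.09% if the copies really were independent. Two devices on the same socket are effectively one failure domain.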

If you don't have 3 copies of your data (with at least 1 offsite) then your data doesn't really exist.

I have to agree with the guy above. House fires often leave a significant bit of the house structure alone. I also feel that a decent number of scenarios where my house would burn down would be when I'm not there.

Kudos to Synology for having a process to recover on a Linux box if the Synology box craps the bed. I was reading this story thinking "this is a good lesson on why you don't buy an expensive SAN, unless you can afford to buy two".

But, the other lesson is: Backups. Sounds like backups were shut off 6 months prior.

The other lesson is: Monitoring. Backups were going to the USB drive, but that stopped working at some point. Unless you have some tested monitoring of your backups, you are likely to lose data.

Glad this story had a happy ending.

The nice thing about Synology storage stuff is that it's a nice GUI (seriously, they're about the only company I can think of that's doing appliance management right) on top of standard and battle-tested open source tools.

This was one of the reasons I was okay with paying their prices. Even if the device completely craps the bed, I'll be able to hook the drives containing absolutely normal LVM/btrfs volumes up to another machine and get my data out.

Btrfs is battle tested?


We use a box with 2x2TB mirror zpools. It runs Nas4free off a 128GB SSD. Initially we wanted to use FreeNAS but it needs 16GB of RAM, while Nas4free works fine with 8. It also does SMART monitoring, sending emails when something is up.

FreeNAS works fine with 8GB RAM, especially with such little data

On the btrfs Parity Raid issue: Synology uses btrfs on md RAID [1], so I believe it's more stable than the native btrfs implementation.

[1] https://www.synology.com/en-global/knowledgebase/DSM/tutoria...

It’s the reason I’m desperate to get my pro-photographer best mate off her old Drobo and on to a new Synology. Should be happening in the next few weeks, and not a minute too soon.

The Drobo is literally a black box to me. It either mounts and you can see disks, or it doesn’t. And if it doesn’t ... well. Yep. Right-o. I’ll try rebooting it, I guess.

It’s nice for running Docker in too, and gives a beginner a fairly gentle introduction. The boxes are also really nice in that you have a range of configuration options that let you get what you need. Don’t put big Seagate drives in it if it’s near where you sleep though, as they make a bit of noise as they grind.

What about that “hybrid raid” stuff they have? What is it?

It's just a particular LVM configuration with a nice UI https://superuser.com/a/1367697

> Sounds like backups were shut off 6 months prior.

Exactly. Onsite backup is not a backup, especially if it is directly connected to the primary data store.

These kinds of scenarios are why I built Relica [1]. It backs up to local disks, network drives, remote computers (on LAN or anywhere with a public IP address), your own cloud storage, or our own special formula we call the Relica Cloud: one upload, five independent cloud providers -- replicated in real-time.

And restores can use the open-source tool restic [2], so you don't have to be locked into Relica for accessing your data.

We're working on the ability to do byte-for-byte copies of a repository to other destinations to make data even harder to lose in these kinds of disaster scenarios, as well as a new UI to make it more pleasant and powerful.

Anyway, our goal is help make robust backup strategies like what this guy needs really, really painless, because I'm as paranoid as he should have been about losing data.

[1]: https://relicabackup.com

[2]: https://github.com/restic/restic

Can this also be used as a replacement for sync services such as Dropbox or SpiderOak ONE? In other words, can I sync accessible folders between computers in near-realtime, or is it only archival/backup storage?

No, Relica is not a sync service, because we do not want to sync your deletes -- we recommend using backup and sync together, because they serve two different functions.

Relica is archival software, but you can backup+archive your synced folders of course.
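The sync-versus-backup distinction can be shown with a toy model. This is only an illustration with dictionaries standing in for folders; it is not how Relica or any sync service is implemented, and `mirror` and `SnapshotStore` are hypothetical names.

```python
def mirror(source: dict) -> dict:
    """A sync tool makes the replica match the source, deletions included."""
    return dict(source)


class SnapshotStore:
    """A backup keeps each point-in-time copy, so a later delete cannot erase it."""

    def __init__(self):
        self.snapshots = []

    def backup(self, source: dict) -> None:
        self.snapshots.append(dict(source))

    def restore(self, name: str):
        # Search from newest to oldest for the last snapshot that still had the file.
        for snap in reversed(self.snapshots):
            if name in snap:
                return snap[name]
        return None
```

If a file is deleted by accident and both tools then run, the mirrored replica no longer has it, while `restore` still finds it in an earlier snapshot.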

>I don’t know what happened for sure, but I think it may have been a power surge that fried the boards on both the Synology and the USB, as they were plugged in to the same socket.

He didn't have a surge protector? Sweet Jesus. I don't plug my backed-up PC into anything not surge protected.

Surge suppression isn't all it's cracked up to be. I work in an MSP and the number of electrically fried components behind Enterprise grade UPSs is pretty astounding.

I'm not saying it's worthless, but it's not a silver bullet. You must plan for failure.

Yep, I've had a PSU get fried by a power spike even though it was plugged in through a surge protector. Like you say, I'm sure it helps, but it's not perfect.

Surge protection isn't a big thing in the UK these days. You can get arresters but it's really rare now to hear of computer damage as a result of mains spikes.

Yeah but don't you think having a raid array but not having a surge protector is missing the forest for the trees?

Power surges, brownouts etc. don't really exist in large parts of Europe. I also never unplugged my computer(s) in thunderstorms (some of which are on a UPS, most of which were simply plugged into mains, no surge protector, no UPS) and lost exactly zero computers to that.

One of the few things that actually routinely comes with real surge arresters (i.e. gas-discharge tubes, not the MOVs plastered all over the place) is anything connected to a telephone line, e.g. A/VDSL modems. On some old PBXs for analog lines these were even contained in field-replaceable modules.

Can confirm, I think it's over ten years since my last power glitch. I don't think I've ever even seen a surge protector; UPS I've seen a couple of times at people who run servers at home.

I re-read the article twice and didn’t see him mention he didn’t have a surge suppressor. Did I miss it or are you assuming that he didn’t have a surge suppressor?

As others have said, electronics can get fried by power surges while plugged into even commercial grade surge suppressors. Home grade surge suppressors provide even less protection.

He didn't mention having one either. It's weird not to mention anything at all about surge protection in a story about a power surge destroying important equipment. I'd assume if he had one he'd mention the state it was in after his equipment was destroyed.

You don't have lightning in the UK?

I wouldn't plug any PC into anything not surge protected...

I have never used a surge protector, and I don't think I've ever seen one in a home or office.

Much of Europe uses buried cables, so there's less risk from lightning strikes, and I think the higher line voltage (230V) reduces the effect of something like a dodgy elevator motor in the same building.

I have visited the US a few times, where I remember the lights would dim momentarily if the garbage compactor or the dryer were used.

It's a common scenario, unfortunately. Synology RAID is nice. External USB on the same power circuit is not a good idea. Actually, I'd look at least into a surge protector. But it's quite possible to have those blow up and still fry your hardware.

I'd run some batteries in the basement and run my Synology (or similar) off of those. Additionally, the USB Backup of backups should be in the garage or the attic, if at all possible. Also, a nice cloud-backup solution that is capable of delta-uploads is a very good idea (cover against fire in the house, or any other "catastrophic" failure).

If you don't have DC experience or if you didn't do much hardware, it's common to over-focus on software. And to be fair, vice-versa :)

The issue is probably the Synology PSU and nothing else. This is the third Synology NAS breaking down with intact disks I hear about this month, and it seems to be quite a common issue.[0]

[0] https://www.youtube.com/watch?v=K7ly8zde3dE

The usb backup backup really, really cannot be at the same site. I'm not saying you have to stick it in the cloud but it's got to be physically somewhere else.

I had a similar experience with a relative's Seagate NAS.

Except the ext filesystem was unreadable because it used a different page size. Required some shenanigans in userland but thankfully I was able to recover the data. Seemed like a software fault on the box.

The chassis had to be destroyed to remove the drive and it was interesting to see the warranty explicitly mention the customer was allowed to do this to recover their data.

Have you written up the solution or do you have some link? I have that same problem currently but haven't had time to look at it yet. Thanks!

You need one copy somewhere else! What if there's a break-in or the roof leaks or whatever? My low-tech solution is two external drives, one of which is at my Mom's place and gets swapped every month or two.

This solution is great because of its simplicity. Keeping a second drive on your desk at work is also great if mom is too far away.

Things that jumped out at me from this:

First, the USB external is probably OK except that its USB circuitry has taken a power hit. If it's a standard SATA drive inside that could probably be shucked and accessed. Counter: Some of these have drives that are no longer SATA but have a bunch of the USB connectivity built into the drive. At that point, you'd probably be looking at a few hundred $ of data recovery costs (yes, that little). Professional recovery of the RAID would be more expensive because pricing is often based on the number and capacity of the drives.

Second, RAID5? I know these were only 1TB drives, but be very wary of anything with only a single parity disk if you're looking at drives of 1TB or larger, particularly if they're sequentially-numbered drives from the same lot. With modern TB+ drives there's a not-insignificant chance of drive errors as you hammer the remaining drives to rebuild the array. If building one of these now, the price difference between a RAID5 of smaller disks and a RAID6 of larger ones is probably only a few dollars.

Third, if actually doing recovery the first thing you want to do is image the drives and work from the images. ddrescue is probably your simplest option there, but yes you're going to need a big chunk of drive space available.
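
To put a rough number on the rebuild risk raised in the second point, here is a minimal sketch. It assumes a spec-sheet unrecoverable read error (URE) rate of 1 per 10^14 bits, a figure often quoted for consumer drives but not stated anywhere in this thread, and it treats bit errors as independent, which real drives do not strictly obey. The 4-drive layout, function name and parameters are illustrative only.

```python
import math

def rebuild_ure_probability(surviving_drives: int, drive_bytes: float,
                            ure_per_bit: float = 1e-14) -> float:
    """Chance of at least one unrecoverable read error while reading
    every surviving drive in full during a RAID5 rebuild.

    Assumes independent bit errors at the quoted spec-sheet rate.
    """
    bits_read = surviving_drives * drive_bytes * 8
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# Hypothetical 4-drive RAID5 of 1 TB disks: three survivors read end to end
p_1tb = rebuild_ure_probability(3, 1e12)
# Same hypothetical layout with 8 TB disks
p_8tb = rebuild_ure_probability(3, 8e12)
print(f"1 TB drives: {p_1tb:.1%}, 8 TB drives: {p_8tb:.1%}")
```

Under those assumptions the 1 TB case comes out at roughly one in five. The spec-sheet rate is a conservative bound, so drives in practice may do considerably better than this estimate suggests.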

The fears of parity raid disk error rates are greatly overblown - this post does a good job of summing up the problem:


A bit of anecdata: I've been running raid5 and raid6 for years without issues on 4 and 8TB drives, scrubs come back successful every month despite claimed error rates, the drive deaths I've had have been sudden whole drive failures or write failures to a large portion of the drive.

The linked article is working through some of the numbers on a pure math basis but the real world is messy in ways that article doesn't consider.

Most notably, if a drive has failed there's a reason for it and a lot of the possible reasons will be shared with other drives in the array. Was there a manufacturing issue and all of the drives are from the same lot (pretty common). Is the RAID in a hostile environment (heat, vibration, bad power, etc.)? Heck, is the RAID normally very lightly used and going to have heat problems if it's under full load for hours (days?) during a rebuild?

There are also factors like how the RAID controller is going to handle another read error - will it drop a second disk if the drive reports a failure? For that matter, if an array drops to "degraded" due to an error are you going to immediately replace the drive or are you going to write it off as a one-time fluke and let the array rebuild? Do you keep a pool of spare drives around that you'll drop the failed one into after testing? I've regularly seen an array drop a drive due to something transient, then rebuild onto it and not have any more problems for years.

Even with a 2-drive failure in a RAID5 you're unlikely to lose much data - almost everything is likely still there on the disks unless there've been catastrophic failures (e.g. an array of the old "Deathstar" drives which were prone to head crashes). You just may need to do recovery which will generally mean imaging each of the drives and doing recovery based on working with those images instead of the original drives.

  First, the USB external is probably OK except
  that its USB circuitry has taken a power hit.
Maybe. But where do you think a USB external hard drive gets the 5v that is the sole power input to most SATA SSDs?

Here is what I do to hedge the risk of electrical failure (raid+usb on the same circuit), fire (both copies in same structure), malicious compromise of my machine and theft:

- My workstation has linux software raid-1 of 2x 6TiB drives (this provides robustness and uptime in case of single drive failures and ease of recovery).

- Another machine in my garage doing incremental daily backup pulls over the network. It is setup as multiple discrete hard drives thus partitioning single drive failures (low cost, the garage is a separate building, the host machine is an arm board that actually turns off HD PSU when not backing up, so hard drives are fairly isolated from power spikes).

- I make a monthly incremental backup (three sets) onto an external 6TB usb hard drive encrypted with luks. This drive spends 99% of its life powered down in a cabinet at my office at work. It is protected against theft, fire, electrical spikes, etc... by my employer.

- There is not a *ucking cloud anywhere in this picture. I can get access to my backups within ~1hr in worst case (round trip drive to office to pickup my drive).

You Kids need to learn how to take care of your shit - now get off my lawn!!

I use Arq Backup on my workstations, along with Resilio Sync.

The NAS holds all our media/documents/music/whatever, and where possible, this gets auto uploaded from workstations to the NAS, mostly through Resilio Sync, but also ChronoSync. A local Raspberry Pi (different building) acts as a node in the Resilio Sync setup, adding more redundancy.

The NAS backs up to a local USB drive nightly, as well as a remote (4km distance) Odroid HC2 with a WD Red drive. This device also runs Resilio Sync as a redundant node. All machines run Btrfs where possible, with smart monitoring, daily short smart tests, weekly long smart tests, monthly scrubs, and log monitoring emailed to my inbox every morning.

Finally i make yearly archive discs (100GB M-Disc) with the data from the past 12 months. I burn these in 2 copies, one is stored locally, the other is stored remotely. Along with these drives, i also maintain a couple of 4TB USB3 drives, which i freshen (nondestructive badblocks) yearly, and update. Again, one is stored locally, the other remotely with the M-discs.

Even with the above setup, there is a theoretical possibility of losing data, but as most data lives on both the NAS and the client machines, as well as a remote target, i would need to lose all 3 at the same time. The only irreplaceable data would be our family photos, and those are also stored on optical and magnetic media (spending 360/365 days powered off), adding at least a couple more layers of redundancy to the equation.

Over Christmas, I backed up all my stuff to a 3TB external hard drive that I took with me when I visited my parents and intentionally left the drive on a shelf at their house.

This is what I call a poor man's offsite backup.

I assume you either told them what it is or are reasonably confident they won't throw away the random thing on their shelf that neither of them know about.

I've seen a few cold backup setups like this, but I would worry both about the significant gap in coverage between the drive you left at Christmas and the next time you update your backups, and about the poor shelf life of mechanical hard drives not in operation.

As the saying goes, nobody wants a backup. What they want is a restore.

Very good point. As I pointed out in another comment in this thread, I had my backup disks encrypted with a password safely stored on the disk that was being backed up. Backups were just fine, but the restore depended on that password. Fortunately I had access to it via another means, otherwise: no restore. Sobering.

I've been wondering about this. How do you guys test your restore? Do you schedule a manual test restore once in a while? Do you automate the test?
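
One way to automate such a test, as a minimal sketch: restore into a scratch directory with whatever backup tool is in use, then compare SHA-256 hashes of every file against the live copy. The function names and the two directory arguments are hypothetical, and any file that changed after the backup ran will show up as a mismatch.

```python
import hashlib
from pathlib import Path

def tree_hashes(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    hashes = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large files do not fill memory
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            hashes[path.relative_to(root).as_posix()] = digest.hexdigest()
    return hashes

def verify_restore(source: Path, restored: Path) -> list[str]:
    """Return relative paths that are missing or differ in the restored tree."""
    src, dst = tree_hashes(source), tree_hashes(restored)
    return sorted(name for name in src if dst.get(name) != src[name])
```

An empty list from `verify_restore` means every source file came back byte for byte; anything else is a list of files to investigate.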

Something like that happened to me. I had several TB of data on a RAID10 array on an LSI MegaRAID controller. A nearby lightning strike took out the server. And the server manufacturer had gone out of business.

I had backups of the data itself. But I'd been doing lots of data massaging, and didn't have enough storage to keep copies of every step.

Anyway, so I bought a couple new servers. One to replace the dead one, to be setup with SQL Server. And a low-end one that would accept the controller from the old server. I just left the drives in their cage, and jury rigged power and data connections.

And it worked.

Couple of observations.

First of all, given the price of storage, for backups, I don’t think anything other than mirroring makes sense. Just get 2 big hard drives for your NAS and set them up for mirroring. In the event of a failure, you can read directly.

Next, you don’t have a full backup without offsite storage. Even if it wasn’t a power surge, there could be flooding or fire at a single location.

Always remember the basic 3 2 1 rule!
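
For anyone unfamiliar with it, the rule is usually stated as: at least 3 copies, on at least 2 kinds of media, with at least 1 offsite. A minimal sketch of checking a setup against it follows. The `Copy` class and the media labels are made up for illustration, and the example setup assumes the article's author also had an original copy on a workstation, which the thread does not confirm.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # e.g. "nas", "usb-hdd", "cloud", "optical"
    offsite: bool

def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """At least 3 copies, on at least 2 kinds of media, at least 1 offsite."""
    media = {c.medium for c in copies}
    return len(copies) >= 3 and len(media) >= 2 and any(c.offsite for c in copies)

# A NAS plus a USB drive on the same socket, as described in the article;
# the workstation original is an assumption made for this example.
article_setup = [
    Copy("workstation", "internal-hdd", offsite=False),
    Copy("synology", "nas", offsite=False),
    Copy("usb backup", "usb-hdd", offsite=False),
]
print(satisfies_3_2_1(article_setup))  # False: nothing is offsite
```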

Indeed. I'd rather have a bunch of RAID 1s around than a RAID 5 I have to worry about rebuilding. I sync my files to two different RAID 1 setups in two different cities. No hardware failure has ever been worrying.

I've always thought of a NAS as high speed local storage with a little redundancy in raid, but I'd never treat it as a backup solution. Only because then all my data is under one roof. Fire, earthquake, flood or other local event that's big enough and all my data is gone. I'm lazy so I haven't setup anything fancy but backblaze and dropbox do all my off site backup for me. It's very cheap. 10 years of backblaze would be cheaper than a NAS.

Power problems can be both stealthy and deadly.

Years ago I had a desktop with a 4-disk RAID 5 where the SSDs failed in quick succession, it's common. I lost some data-- or rather recovered files manually I think from a failed RAID and cold backup, switched to spinning disks, but after a while the new disks started beeping and generating RAID errors.

After much time and anxious guessing, I swapped the power supply and never had a problem since.

I recently was also thinking about how to organize my data & backups. I still have not really decided. Esp about the software. I collected lots of options here: https://github.com/albertz/wiki/blob/master/backup-software....

At the moment, I really like Perkeep (https://perkeep.org/). But I'm not sure whether this is a solution for everything.

On the hardware side, I also have not really decided. I want to build up my own NAS (custom hardware, no preconfigured thing), which should be quiet (if it is not doing anything, i.e. most of the time), as it will be in my home. Another NAS maybe at my parents' home. And then maybe some cloud storage.

Yes, I was also thinking about perkeep and giving it a spin. I was considering a RAID 1 array for redundancy.

Another thing I was considering was M-Disk[1], that can, supposedly, hold information for 1000 years. But I wonder what other people's experience with it is.

Alternatively, I understand normal Blu-Ray disks should be able to have data retention of about 40 years, and that sounds decent to me.

I was looking into Amazon glacier and/or google cold line, and the prices seem decent, but I do not like the fact that you have to pay monthly, even if a small sum, it's just one more thing to concern yourself with. I would like to prepay and know my stuff is up there for a couple of years.

Normal Dropbox/Google drive stuff is too expensive to store big amounts of data, so not worth it. Plus, you have a copy of it locally also, (at least in normal use cases)

Thinking about it, I think this would be a great idea for a startup. Cheap data storage for long term storage, with competitive prices, the ability to prepay, user-side encryption and a simple UI that grandma can use to drop photos. It should be able to guarantee that the information you desire will outlive you.

[1] https://en.wikipedia.org/wiki/M-DISC

> Thinking about it, I think this would be a great idea for a startup. Cheap data storage for long term storage, with competitive prices, the ability to prepay, user-side encryption and a simple UI that grandma can use to drop photos. It should be able to guarantee that the information you desire will outlive you.

As a consumer, I wouldn't trust a startup for such needs, because there's a likelihood that the startup would either raise prices, pivot to a different service offering, or shut down entirely.

Google Cloud Platform lets you make a manual prepayment [1], so that's an option worth considering. I know Google gets a lot of flak for shutting down consumer services, but I'm inclined to believe that they wouldn't shut down Nearline/Coldline without a significant amount of notice.

[1] https://cloud.google.com/billing/docs/how-to/manual-payment

> As a consumer, I wouldn't trust a startup for such needs, because there's a likelihood that the startup would either raise prices, pivot to a different service offering, or shut down entirely.

Fair enough. It's just that I think something in the field that can do these things would be filling a need. And all companies have to start _somewhere_

But I agree, getting a promise of long-term availability from a new company is a bit rich.

Thanks for the info about GCP - I looked mostly into AWS glacier and I know AWS did not have a prepay offer.

I settled on a SuperMicro passive Atom solution and a Fractal Design mini case, able to hold six disks. It is very quiet when you tune down the fans and keeps the disks just below 40°C which is optimal for me. We have the NAS in the same room we sleep in and it doesn't bother us.

Do you spin down disks? In my case WD Reds generate more noise than cpu and case fans when there is no workload on system.

No. Keeping them running uses less electricity than spinning them up with our workload. The disks are floating with rubber connectors in the case and the noise is nothing we can hear from the bed.


Thanks, I have node 304 with 3 HDDs, there is no vibration that's true, but I hear spinning plates when there is no other source of sound. It does not look like there is acoustic management available for wd red.

I have looked and looked since 1998, tried rsync, RAIDs, striped drives, ATA over Ethernet, lvm, ZFS, all sorts of things.

What finally clicked for me, is https://www.greyhole.net

It's like magic. Decide how much redundancy you want. Then just add drives to it. It balances files automatically across drives. You can have remote drives in the mix.

What it is not good at, is many small files. But for my use, media files and backups (tar archives) it's a breeze. And the files are stored as normal files on the drives it distributes to, so there is nothing complicated to dig into should disaster strike. (Not that it has happened to me.)

No affiliation, just finally in a Zen state of mind when it comes to my home NAS.

Next step - make sure all of that is backed up off site too, but that is another thing altogether...

It's a really tough sell for most people when you tell them they need to have at least two copies of their data if they care about it. The vast majority of people will rather "take their chances." Usually that means another $100-500 bucks for most people.

It was a tough sell for me too. After losing 6 years worth of photos, it was an easier sell!

I had all of my backups on a 2TB external drive, which worked great until the MFT got corrupted somehow. Suddenly all my eggs in one backup basket felt a bit silly. Fortunately I was able to recover all of it. I'm in the process of partitioning out a new backup system to avoid that in the future.

After thinking about this kind of data apocalypse for a bit I realised that the portion of my data that is mission critical is actually really small.

Throwing it on multiple clouds with version history isn't an issue.

(I'd recommend an O365 sub + duplicati...in theory you can push like 5TB to MS cloud).

And here's your recurring reminder that all cloud storage services fingerprint your uploaded files looking for TOS violations. If you haven't encrypted your backups first you are at the will of the service you are using.

I use Arq to backup, and it’s a lovely little program that isn’t a resource hog, and handles the encryption. I use it to back everything up, a copy to a removable drive, a copy that gets updated every day online, and another once a month online on a separate server. It seems to work well, and it really is painless, after the initial seeding.

Yip, I also use arq and love it. Everything backed up into my nas and Google drive. The only irritating thing is that there's no Linux client and I suspect my next laptop will be Linux based not a MacBook.

Adding on: Arq has an open source restore tool, which I think is important in the long term: https://github.com/arqbackup/arq_restore

Yeah duplicati encrypts by default.

Plus ironically all the content that would score me a TOS violation doesn't get backed up since it's easily replaceable...

Yes! The most important data I have fits inside 10MB, and even the "I'd like to keep this" stuff isn't huge.

Silly mistake I nearly made:

MacBook Pro in for repair incl. wiped disk. No problem, I have two external drives with regular TimeMachine backups, so go ahead.

Having received the laptop back, I plug in one of the disk drives and - oops, it's encrypted (with a good, safe, long password of random characters), which had previously been stored in the local login keychain, so that the disk drives have always just silently automatically mounted over the years, and I totally forgot that they were encrypted!

Fortunately, I had the password saved somewhere and access to it. Otherwise my backups would have been for nought (though of course they themselves had the disk password inside... well protected by itself.)

Is it weird that I don't have GBs of my own personal files/data? I get if you have movies/tv shows or whatever, that's different. I just don't know what personal data you could possibly have that would amount to GBs...

Photos and videos. All my email. As a musician I have about 40 gigabytes of personal recordings that I have used for auditions and pre-selection rounds in competitions. All in all it is about 800GB and growing.

Since my son was born the amount of photos and shitty quality videos that I backup is staggering.

Ah okay, fair enough! Guess I just don't really have that much personal data but it makes sense in your situation.

As a parent you go through a period of time - when your kid is young - that you take photos and videos of EVERYTHING that they do. It's silly now that i look back. Certainly it is important to capture moments, but to over-document everything is silly...now i take minimal photos/videos, and just be present in the moment. Oh, and my data accumulation has precipitously gone down since i've mellowed out. But, yeah family photos/videos add up like crazy.

I have 3 kids and still don't have more than a terabyte of data.

I currently have way too much data lying around for any metered cloud storage to be economical, so I'm currently subscribing to both Backblaze and Crashplan for their unlimited storage backup plans.

Crashplan killed off their consumer plan a while ago, so I ended up moving to their business plan at double the price. In my case even at that price it was still worth it considering the amount of storage I'm using.

Anyone aware of other services offering unlimited storage for a single user? I know Dropbox, Google Drive, and OneDrive all offer unlimited storage in their business plans, but they all require a certain number of users before the unlimited storage kicks in.

Crashplan ending their consumer product was such a shame. As far as I know it was the only consumer product supporting Linux. I am currently on their discounted business plan, but I still have no idea what to do after that discount runs out.

Your question - about data and transit volumes, and costs - would be answerable if you provided some actual numbers.

I'm not the poster, but I have local and remote backups of photographs.

When I shoot a model I will take 400-600 images in an hour, each about 25MB in size. The shoot I had last weekend resulted in approximately 18GB of RAW files and output JPG files.

In total I have just under 3TB of RAW, JPG, and other media files. (Sometimes I film shoots, or do some video work at the same time.)

That kind of volume is not huge, but still painful to upload remotely. It's also at the cusp of the kind of data you can back up to a cheap SAN-box locally. I currently have two toy NAS devices each with 2x4TB drives. If I want to bump my local capacity to 8TB, or similar, it'll get quite expensive.
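
To make "painful to upload" concrete, here is a back-of-the-envelope sketch using the figures above (400-600 images at about 25 MB each, a 3 TB archive). The 10 Mbit/s upstream speed is purely an assumption for illustration, since the poster's real connection is not given, and the function names are hypothetical.

```python
def shoot_size_gb(images: int, mb_per_image: float = 25) -> float:
    """Size of one shoot's RAW files in GB (decimal units)."""
    return images * mb_per_image / 1000

def upload_days(size_gb: float, upstream_mbit_s: float) -> float:
    """Days needed to upload size_gb at a sustained upstream rate."""
    seconds = size_gb * 8000 / upstream_mbit_s  # 1 GB = 8000 Mbit
    return seconds / 86400

low, high = shoot_size_gb(400), shoot_size_gb(600)
print(f"RAW per shoot: {low:.0f}-{high:.0f} GB")
# Assumed 10 Mbit/s upstream, saturated around the clock
print(f"3 TB archive: {upload_days(3000, 10):.0f} days of continuous upload")
```

At the assumed speed the initial seed of the full archive takes about four weeks of saturating the line, while a single shoot is an overnight job.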

Okay, so if this is a professional gig, then off site backups are a must, and raid on site is also. But for a pro photographer I'd have expected disk costs to be a necessary expense. In Australia we have 10TB disks for AUD 450. Double up for redundancy and add on chassis, this isn't trivial $s, but equally it's presumably 'worth it'.

For my non-pro and far more modest collection of around 300GB of images, I keep copies on three local machines, and one remote (family member) with sync changes being able to be done over home grade ADSL. With your volumes you could do off site sync via usb disk easily enough for new large ingests, and propagate smaller changes over the wire. Having a friend or family in the same city is very convenient compared to trying to hunt down the best all you can eat deal du jour, with no need to handle the regular t&c changes those services suffer. Good reciprocal opportunities too, of course.

I'm using restic with a Backblaze backend and I couldn't be happier.

A surge from a lightning strike near my home travelled over the cable line to kill a network switch and the WAN port on a firewall. (Strangely, the cable modem was spared.)

Everything was on a decent UPS... but I’d completely forgotten the cable line.

Most of the UPSs and surge protectors I've seen for home use include coax, ethernet, and/or telephone protection, too, e.g., https://www.apc.com/shop/us/en/products/APC-Performance-Surg...

Lightning is pretty rough.

You can completely air gap your network from the outside line by converting to fiber at some point (probably between your cable modem and your router, in a DOCSIS setup). Isn't foolproof, because lightning can induce current on your wires directly, but it'll help these kinds of scenarios.

- SSD + HDD in laptop, this latter for storage, because it actually tells me, if it's unhappy and about to die, unlike the m.2 ssd. The cost of this is to have an older laptop, in my case, a thinkpad x250.

- synced to home server, which, at this point, is a thinkpad x201 on an ultrabase with 2 disks - it has built-in ups, called a battery

- all of this synced to off-site rented server in Germany

- irreplaceable photos are on blu-ray in yearly archives

This covers lost laptop, burglary, house fire/flood, etc. To avoid problems with lost ssh keys, I have a few users on that rented server which can log in with a password, in case of emergency.

Synced how?

My current approach.

Personal media archive, Windows 10 Pro, Storage Space with Parity across 4 HDD in a sata jbod. This is a purely software RAID. Moving the drives (or part of them) to another Windows 10 system allows for seamless recovery.

This archive, as well as all personal computers, uses Backblaze for offsite backup (including versioning). Versioning is important in case of malware/accidents/buggy software. I don't consider any backup plan complete without this and being off-site (fire/theft)

For my business servers I use Tarsnap. (Off-site and versioning, 45 days)

Edit: Oh and everything is on UPS

RAID is not for data protection. It's for availability.

Data Storage is dirt cheap. Buy a 6tb for $100 every few months and throw important folders onto it. Place old ones at relatives homes for offsite.

Never worry about this again.

6TB drives are about £150 at the bottom end, average maybe £180. That's c. $200-240. £600 a year on back up hardware is pretty hefty.

My first thought is "why don't you keep a backup?" I really don't know what the moral of this story is beyond "don't keep one copy of all your data, especially in the same physical location."

Online storage is cheap. Bandwidth is cheap. There's a multitude of solutions in the comments if a consumer solution like Dropbox or Google Drive isn't good enough for you.

> Bandwidth is cheap.

Most ISPs in the US now have a 1TB datacap and charge ridiculous rates for overages.

Yikes, seeing as I just signed up with Backblaze and uploaded about 3TB the last week, I was a bit worried there for a minute.

AT&T gigabit customers are on unlimited plan for anyone else seeing this who is a customer, see https://www.att.com/esupport/article.html#!/u-verse-high-spe...

Are you really pushing 1TB/mo, though? That would be a full quarter of OP's lifetime of data. And in those cases where you are, could you not afford business class internet for a bit of a premium?

It's on top of your other bandwidth usage though. Family of 4 with Netflix/YouTube/Twitch can accumulate quite a bit. Still, using an online service at least for an important subset is a good idea.

3 Nest cameras and streaming in 4k (Netflix, YouTube, Amazon) use up quite a bit. Since I started tracking last April, 736gb a month has been our lowest. Most AAA video games are 40-80gb, and my partner and I both can work from home with always on VPN.

So downloading random Steam games to play or taking a few videos of the kids for family can easily result in an extra 150gb of incidental bandwidth.

Over Thanksgiving we hit 973gb with guests in the house. I reduced all the camera and Netflix quality settings the last week of November to avoid the penalty fee.
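
As a rough sketch of what a cap means for seeding a cloud backup, using numbers mentioned in this thread (a 1 TB cap, a 736 GB lowest month, a 3 TB initial upload): the function name is hypothetical, and it assumes the ISP counts uploads against the same cap as downloads, which varies by provider.

```python
import math

def months_to_seed(backup_gb: float, cap_gb: float, baseline_gb: float) -> int:
    """Months needed to upload backup_gb using only the headroom left
    under a monthly data cap after normal household usage."""
    headroom = cap_gb - baseline_gb
    if headroom <= 0:
        raise ValueError("no headroom under the cap")
    return math.ceil(backup_gb / headroom)

# 3 TB backup, 1 TB cap, 736 GB of everyday usage per month
print(months_to_seed(3000, 1000, 736))  # 12
```

With only 264 GB of headroom per month, the initial upload stretches to a year unless it is seeded some other way.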

Sounds like a UPS with the standard protections is going to be somewhere in your future... Just not somewhere a cleaning lady can touch it.

Most UPSs don't provide much more protection than a power strip. They have a cheap "surge" protector, if you look it's often just running the tower through a ferrite ring.

Generally the input is directly connected to the output, through the ferrite ring. Only in cases of a power outage does a relay trigger and switch to the batteries. But that doesn't prevent a surge from going through the UPS.

You can hear the relay trigger when you disconnect it from the wall power, the latency is high enough to allow plenty of damage.

There do exist UPSs without this problem, but they are more expensive, heavier, generate more heat, and consume more power. Look for double conversion UPSs that in the normal state go from AC -> DC -> AC.

Kinda weird that while computers are 100% DC, they are fed with 100% AC. I was pleasantly surprised that pretty much all ham radio stuff is DC fed. MUCH easier to do UPSs, solar, wind, or have multiple pieces of equipment share the same power supply. Imagine racks with a big power supply on top (and getting rid of all the AC -> DC conversion heat). I've actually seen these, but they were unfortunately cost prohibitive.

There isn't really a good excuse not to have an offsite backup, especially when services like Backblaze are $5 a month.

Also ... RAID is not a backup.

Backblaze is great. But their desktop backup solution for $5/month is not available beyond Windows and MacOS. Their B2 storage is via duplicity, but that quickly exceeds $5/month with typical storage cases.

Fair point. It would be great if Backblaze supported Linux with a desktop client.

So I guess the options would be desktop backup on the various computers that feed the current NAS solution.

No matter what your backup tech is, if you have it all in a single location, you don't truly have a reliable backup.

Synology allows syncing of your NAS to a similar Synology NAS at another location (Cloud station server). That's what I am doing. This doesn't prevent issues with data corruption, but it prevents these kinds of issues where your 'complete' NAS fails due to a possible surge.

The failure modes I consider most probable at my home are a whole-house power surge, and a loss of structure (fire, natural disaster). With a 1 Gbps (symmetric) FTTH connection, and a Backblaze subscription, that's about all the peace of mind I need!

I'm surprised it's not mentioned here but a lot of the Synology nas's have had power supply issues in addition to the prior Intel Atom issues. I swapped out four units twice each (1815+'s) and swore off their hardware permanently.

To be fair, at least with the Intel Atom C2000-series problem, I find it difficult to blame them. We had a network edge device from another vendor die due to the C2000 defect. I figure the best the vendor can do in that case is offer to replace/fix it, which the vendor in question did. Did Synology offer to fix their C2000 devices?

I use tarsnap for off-site backups. It has a reasonable price, even for some GB of data

Forgot the link:


Pretty happy with my solution. Keeps me safe from floods, ransomware, and losing google drive access:

- originals: 2TB google drive + 50GB icloud

- backup #1 (autosync from drive/icloud): 4TB internal drive

- backup #2 (cold storage): 4TB external drive

How often do you do the incremental backups on backup drive #2?

Some degree of cold storage would have reduced the OP's stress level when restoring their NAS mind you.

maybe every 2-3 months.

That’s part of the reason I keep my personal data backed up to two sets of NAS. In addition, it’s mapped as a Dropbox drive on one of them.

From my family's perspective it’s just one drive, but it’s replicated everywhere.

Me too, because I was storing it on IPFS. IPFS had a minor bug and I've lost all the references to the data.

You know, the data is still there, I just don't know which hashes correspond to which objects.

I use Synology cloud software to back up to both Google Photos and Backblaze. Way faster than Glacier, and costs me <$5/month (haven't reached 1TB yet).

After watching Steve from Gamers Nexus talk about their second Synology NAS failure in a single quarter, I instantly figured that OP was using a Synology product. Their stuff must be bad if I'm making that association.


Does anyone know an open source backup client that runs on Windows and supports SFTP?

Bitter lesson learned: invest a couple of dollars in a surge protector.

Coincidental timing for this article! I just wrote in my own blog yesterday about the level of importance I had placed upon the data I had stored on my PC [0]. (TLDR - Data I thought was unimportant ended up actually being the opposite right after I lost my external drive).

[0] - http://devan.blaze.com.au/blog/2019/1/20/the-folly-of-unimpo...

4 disk RAID what? 4 disk RAID 1 or RAID 10 or... It'd make a difference.

The article mentions that it's RAID 5:

> My Synology has 4 x 1TB disks in a RAID 5

TL;DR: a power surge killed my PSU and maybe my motherboard, causing me to freak out and btw I don't have any cold storage.

I'll bet this guy did not have a surge protector in front of his Synology PSU.

I've been working with computers for so long that total failure is not a probability, it's a certainty. Last year, in fact, my desktop burst into flames. I've had smoke come out of desktops several times before, but that was the first flamer.

The best thing I've found to enforce backup discipline on myself is to regularly migrate between machines that are regularly offline for more than 4 hours.

You will have mostly current backups of your project data and work data on all those machines you migrate from and towards.

The problem there then becomes destructive automation. You must avoid automating syncs with half-corrupted or fully corrupted instances of your work environment.

Also backups. Always backups.

He doesn't confirm that the drives are faulty, and it isn't clear that they are, as the data is accessible. Maybe some partition or boot data was scrambled, but that is repairable.

The NAS should also have a warranty of some kind, or the controller could be repaired for cheap.

He was never in any data peril, so just fix that and then add an offsite backup to the mix.

M-Discs, brother, they are cleared to last 1000+ years. Long after most people's cloud backups have turned to dust, my most treasured data will endure due to my use of these discs (which may well be unreadable from a hardware standpoint in 3019, but shit, hope they keep the schematics around).

Inspired by this post, I decided to document my backup strategy and shortcomings here: https://natalian.org/2019/01/20/Data_I_would_not_like_to_los...

tl;dr I'm trusting Apple.

Z ... F ... S.

And at least RAID-Z. RAID 5 and RAID 6 are now at failure-probability levels where your rebuild is likely to throw an error.
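To put a rough number on the rebuild claim, here is a back-of-the-envelope sketch. It assumes an unrecoverable read error (URE) rate of 1 per 10^14 bits, which is a figure commonly quoted on consumer drive spec sheets and not a measurement, and it assumes errors are independent. The function name and parameters are made up for illustration.

```python
import math

def rebuild_ure_probability(surviving_disks, disk_tb, ure_rate_per_bit=1e-14):
    """Chance of at least one unrecoverable read error (URE) while reading
    every surviving disk in full during a rebuild.

    ure_rate_per_bit is an ASSUMED spec-sheet style figure, not measured data.
    """
    bits_read = surviving_disks * disk_tb * 1e12 * 8  # decimal TB -> bits
    # 1 - (1 - rate)^bits, computed in a numerically stable way for tiny rates
    return -math.expm1(bits_read * math.log1p(-ure_rate_per_bit))

# The article's array: 4 x 1TB in RAID 5, so a rebuild reads 3 surviving disks.
print(f"4 x 1TB RAID 5: {rebuild_ure_probability(3, 1):.0%}")
# Same layout with larger (hypothetical) 8TB disks.
print(f"4 x 8TB RAID 5: {rebuild_ure_probability(3, 8):.0%}")
```

Under those assumptions the article's 4 x 1TB array comes out at roughly a 1 in 5 chance of hitting a URE during a rebuild, and the 8TB variant at well over 4 in 5. Real drives often do better than the spec sheet, so treat this as an upper-ish bound rather than a prediction.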

ZFS is great, but won't help if there's a power surge that fries all the disks at once. To prevent that you'll want to store snapshots remotely somewhere.

Exactly, my solution right now is to create encrypted backups of my raid-z2 FreeNAS instance (protected by a UPS) and then have a nightly cronjob copy the backup to a 4TB external hard drive using duplicacy (https://duplicacy.com). This drive gets rotated with another drive stored locally within the house, but in a fireproof and fairly waterproof location, then every 3 weeks I rotate the oldest drive out to a safety deposit box and bring the current drive in the safety deposit box back into the in-house rotation. Call me paranoid, but for me it's worth the peace of mind and ease of local restoration. Once you get in the routine it's really not that much work at all. :-)

True story: I've seen Raid5 not recovering after a simple power loss. Software was working but couldn't build it up. Fortunately, we didn't have to dig further than wiping it.

To prevent that you'll want to s̶t̶o̶r̶e̶ ̶s̶n̶a̶p̶s̶h̶o̶t̶s̶ ̶r̶e̶m̶o̶t̶e̶l̶y̶ ̶s̶o̶m̶e̶w̶h̶e̶r̶e̶.̶ use a surge protector

Don't forget to replace your surge protector on an annual basis (or more frequently if you have really dirty power). Also, you can have a nasty fault in your PSU that fries everything also (If anyone knows of a surge protector that sits between the power supply and the motherboard, I'd really like to get one).

> (If anyone knows of a surge protector that sits between the power supply and the motherboard, I'd really like to get one)

I think the best you can do is just get a top-quality power supply. SeaSonic will sell you a ridiculously overengineered box with a 12-year warranty for $160; it's guaranteed to have a longer useful lifetime than any other component in your computer, probably even including the case itself.

Just use a surge protector literally right next to your PSU.

Cheap whole-house surge protectors are pretty much useless. They will work a few times, but the MOVs will degrade and fail pretty quickly. Every time a big inductive load switches on or off, it's going to cause a voltage spike that trips the MOVs. Or the MOVs' trip voltage is so high (to avoid quick degradation) that they really don't provide any protection. A really good whole-house protector needs to use SADs (Silicon Avalanche Diodes) or a passive LC filter: Transtector, Thor Systems (SADs) or Pricewheeler (LC filter). These protectors can take many more surge hits than the ones you listed above.

Why whole-house surge protectors don't work: https://zerosurge.com/wp-content/uploads/2016/10/USTech.pdfE... with a whole-house protector it still would be a good idea to use point-of-use surge protectors, since the surge has to reach the whole-house protector before it can be clamped. Surges can reach your electrical devices faster than a whole-house protector can clamp. It only takes a few nanoseconds to destroy a microchip.

For instance, say you have a vacuum cleaner plugged into the same circuit your computer is connected to. The vacuum cleaner jams, causing a surge that will hit your computer first, since it is closer than the whole-house protector (down in your basement).

Source: comment section on https://www.youtube.com/watch?v=6PqO0aQaGDY

Sure. Encrypted ZFS with FreeBSD, have a power loss, goodbye data.

Should have read the manual though, it does tell you to make a backup of certain data ranges in case of encrypted ZFS for this specific case, so it's partly my fault.

That said, I've been using ZFS ever since, but on top of LUKS with Linux.

Do you have more information on this?

I've just rebuilt my home server/NAS, going from 2 x 3TB disks in a mirror with no encryption to 4 x 4TB disks in a striped mirror with native ZFS encryption on ZoL.

Most of my data is read-only media content: 90% read use, which is also extremely low IO-wise, and writes happen only when adding new media. I was thinking power loss would only risk losing data that is currently being written, so it would not corrupt the entire zpool?

If native ZFS encryption can lead to losing a zpool on power loss, I might look into buying a cheap used Dell R210 II, stick two 8TB drives in as a mirror, then put it in a colocation data center and use native zfs send/recv incremental snapshots for offsite. Looks to be cheaper than rsync.net's $30 per TB/month for ZFS when you've got 2-3TB+, and I can also do the initial sync locally. Still looking at $700+ a year after initial hardware costs.

So LUKS is more resilient than native ZFS encryption to power loss?

Encrypted ZFS with FreeBSD is not actually native ZFS encryption, it uses GELI to handle the encryption part (which may be why it's bad at handling power losses).

Ah, for some reason I was thinking that FreeBSD's implementation of OpenZFS had native encryption.

I just don't understand rolling your own backup drives in 2018 unless you're up to something nefarious. I rest peacefully knowing my data will never be lost by Dropbox, and the immediate backup means I don't have to mess with slow, periodic all-at-once backups.

And I just don't understand how people can trust just a single online service with their backups. They can be part of a plan, e.g. to quickly get data off-site, but I wouldn't trust a single one enough (case in point: Dropbox actually has managed to lose data in the past, plus the occasional security failure).

The risk of distributed data centers losing my stuff is exponentially less than that of situations like the one in this very article. And while a security professional might be able to set up better security on a home server than what cloud storage providers offer, I am not a security professional, and even if I were I'd probably still want to outsource that to other professionals and not deal with it.

If you build this kind of thing because you enjoy it, more power to you. But I see no valid argument for its practicality.

3-2-1 Backup should always be in your mind.

3 Copies

2 different Media (or vendors)

1 Offsite / Offline
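For anyone who wants to sanity-check their own setup against the 3-2-1 rule above, a minimal sketch follows. The `Copy` record, its field names and the example plan are all hypothetical, just one way to write the rule down.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Copy:
    name: str
    medium: str                # e.g. "ssd", "nas", "external-hdd", "cloud"
    offsite_or_offline: bool

def check_321(copies):
    """Return a list of 3-2-1 violations (an empty list means the plan passes)."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need 3")
    if len({c.medium for c in copies}) < 2:
        problems.append("all copies share one medium/vendor, need 2")
    if not any(c.offsite_or_offline for c in copies):
        problems.append("no offsite/offline copy")
    return problems

# Hypothetical "laptop plus Dropbox" plan, counting the laptop as a copy.
dropbox_only = [
    Copy("laptop", "ssd", False),
    Copy("dropbox", "cloud", True),
]
print(check_321(dropbox_only))
```

Note this only counts copies. It says nothing about whether a sync service would faithfully replicate a deletion to every copy at once, which is a separate failure mode.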

There are all kinds of things I could think of that may cause problems with your Dropbox plans, but what would keep me up at night is all the ways I could not think of.

never trust a single vendor, a single NAS, a single anything... NEVER

It's much cheaper to build your own than to use cloud storage, especially a secondary storage service like Dropbox. Even Backblaze B2 is $60/TB/year, whereas you can buy an 8TB HDD on eBay for ~$200, and if the drive lasts 5 years that works out to be $5/TB/year. Adding one or two redundant drives is still much cheaper.
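The arithmetic behind that comparison, as a quick sketch. The prices and the 5-year lifetime are the figures quoted in the comment. The helper name is made up, and the model ignores power, enclosures, and whatever it costs to get a copy offsite.

```python
def cost_per_tb_year(price_usd, capacity_tb, lifetime_years, drives=1):
    """Amortised cost of `drives` identical disks that each hold the same
    capacity_tb of data (i.e. extra drives are redundant copies)."""
    return price_usd * drives / (capacity_tb * lifetime_years)

B2_PER_TB_YEAR = 60.0  # figure quoted in the comment

for drives in (1, 2, 3):
    diy = cost_per_tb_year(200, 8, 5, drives)
    print(f"{drives} drive(s): ${diy:.2f}/TB/year vs ${B2_PER_TB_YEAR:.2f}/TB/year for B2")
```

So one drive is $5/TB/year and even three full copies come to $15/TB/year, assuming the drives really do last the five years.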

How do you deal with the off-site problem? That's why most people use cloud services -- as an off-site backup that they don't need to manage themselves. Yeah, you could just put everything on an 8TB hard-drive from eBay (not sure that's a great idea for important data) and stick it on a shelf -- but the chance that drive will be alive and not have bit-rotted by the time you need the data is smaller than most people would be comfortable with.

I guess that's a legitimate reason, though the tradeoff still doesn't seem worth it unless you have massive amounts of data

> I rest peacefully knowing my data will never be lost by Dropbox

This is still putting all of your eggs in one basket. What happens if I lose access to my account?

You could say the same of the encryption key for your local drive. Or your bitcoin password. Any secure system has a baseline key that you can't afford to lose. Just make sure you don't.

It's two baskets - dropbox and local.

It was several years ago and I can't find the link now, but Dropbox has previously had a bug that deleted the data in the account and then synced the deleted change back to all the client PCs. It was unrecoverable.

If you have 4TB of data saved at home you are a hoarder. Why do you have that much stuff? I don't have any data at home that I would miss very much if I lost it.

I was wondering the same thing. I don't think the average person has terabytes or even gigabytes of data stored at home. Every time I start a project I make a remote git repo. Music and photos are handled by cloud services... Not that I have photos I really care about, but I'm sure other people do. Documents and configuration files are backed up using Tarsnap, but that's 100MB at the most.

We had a similar debate at work, where I questioned the need for backup of my workstation, arguing that there's nothing on it that's not also in git, on a network share, in LastPass or the mail server. Apparently only two people out of a hundred, me and the CEO, do have important stuff stored solely on our workstations.

Calling people with 4TB of data hoarders may be a little unnecessary, but I do question how much of that data is ever going to be accessed in the future.

What? This is a wild assertion.

My photos alone are north of 4TB. That's DSLR, but not a crazy high-res one (to say nothing of people with video hobbies). I've always worked in small 2-8 person teams for companies that I've been heavily a part of, so that's a huge chunk too. But even discounting that data, I have quite a lot of projects that weigh in pretty heavily.

Yeah, I do have some datahoarding-type collections, because that's the sort of thing you end up doing when automation and total storage become commonplace, but just looking at "bytes I have created and will be lost forever if they vanish", I'm well north of 4TB. I think a lot of other people are too.

I don't mean this disparagingly, but if a person hasn't had any data-heavy hobbies, and has always been some kind of employee to a larger entity who manages data elsewhere, then yeah, your data footprint might be small. I imagine lots of HN regular types don't fit that mold, though.

(On the original link--I don't have much to add to the other comments here. But calling a 4-drive RAID5 setup robust in any sense is nuts. That's data loss waiting to happen, and probably made worse by thinking it is robust)

Family photos and videos can take up a lot of space. Also it sounds like the author's work is backed up to this NAS as well.

I wouldn't have that much to backup but some do, depending on their hobbies and work.

Photos, videos... Yeah I used to save that stuff. I realized I never looked at it again. So I just stopped. I literally don't worry about losing data at home. Work is different of course.

I think for some this might be the case but there are some photos people hold dear. Wedding photos, children growing up and other major life events. Each to their own.

Why? I have lots of PDFs, photos, video. It's a lot more convenient being a 'digital' 'hoarder' than a physical one (though I actually have way too many physical books as well).

I see some terrible backup strategies here.

1. Backups should not be on a single drive.

2. Backups without checksums will result in corruption.

3. Offsite is a must.

4. Unencrypted off site backup means someone already copied your data.

5. Encrypted offsite backups should have forward secrecy. So different keys for each file and keys file gets backed up encrypted.
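On point 2, a rough sketch of what checksums buy you: record a SHA-256 per file when the backup is taken, then re-hash later to spot silent corruption. This is a generic illustration using only the Python standard library, not the poster's software, and the function names are made up.

```python
import hashlib
from pathlib import Path

def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files don't need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def verify(root, manifest):
    """Return relative paths that are missing or whose contents have changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_file(target) != expected:
            bad.append(rel)
    return bad
```

Store the manifest somewhere other than the drive it describes. A checksum only detects damage, so you still need a second good copy to repair from.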

My backup strategy: File server runs zfs raidz with Daily/weekly/monthly snapshots on disk.

Snapshots get copied to 2 external drives, zfs mirrored.

Files get encrypted and uploaded to Backblaze using my custom software. Nothing fancy, just standard authenticated encryption (chacha20-poly1305) but with per-file key management and argon2.

> Encrypted offsite backups should have forward secrecy. So different keys for each file and keys file gets backed up encrypted.

Any references on PFS for backups? Was there no existing OSS backup solution that implements PFS?

I'm not sure why you'd want PFS for backups. The idea of backups is that you have a history (not just a simple mirror) and so having PFS intentionally renders older backups unusable (unless you're storing all the keys -- in which case you have somewhat defeated the point of PFS).

Now, PFS would allow you to handle key compromise by making future backups unreadable. But there are other solutions for this (such as upgradeable encryption).

Most encrypted backup solutions are really bad at protecting keys. Fixed IVs are OK for one file. Not OK for possibly millions of small files. That basically exposes your private key along with your backup.

How do you manage millions of keys, if you have millions of small files to be backed up? Would it be ok to have something between 1:all and 1:1?
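One common answer (not necessarily what the parent's custom software does) is to avoid storing millions of keys at all and derive each file's key from a single master key on demand. Below is a minimal sketch using HKDF-SHA256 (RFC 5869) cut down to one output block, standard library only. The file IDs and variable names are hypothetical, and this is an illustration, not audited crypto.

```python
import hashlib
import hmac
import os

def derive_file_key(master_key: bytes, file_id: bytes, salt: bytes) -> bytes:
    """HKDF-SHA256 (RFC 5869), limited to a single 32-byte output block."""
    # Extract step: PRK = HMAC(salt, input key material)
    prk = hmac.new(salt, master_key, hashlib.sha256).digest()
    # Expand step, first block only: T(1) = HMAC(PRK, info || 0x01)
    return hmac.new(prk, file_id + b"\x01", hashlib.sha256).digest()

master_key = os.urandom(32)  # the one secret you must not lose
repo_salt = os.urandom(16)   # not secret, stored alongside the backup

# Hypothetical file identifiers; anything unique per file would do.
key_a = derive_file_key(master_key, b"photos/2019/img_0001.raw", repo_salt)
key_b = derive_file_key(master_key, b"photos/2019/img_0002.raw", repo_salt)
print(key_a != key_b)
```

Every file gets its own key, so a fixed nonce is never reused under the same key, and there is only one secret to back up. The trade-off is that this gives no forward secrecy: anyone who gets the master key can re-derive every file key.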

