A few years ago I had a drive at home that was flipping bits, randomly corrupting my files. It inspired me to build a ZFS disk server and introduce redundancy in my home setup.

Much of this article reads as if this scenario won't happen, or as if drives handle it better on their own. But it happens. It happened to me. The drive did not "magically fix itself"; it got worse over time. With ZFS, if it happens again, I can be told where it happened and exactly which files are affected, and that's already better than what I got with that other disk, which didn't have ZFS.

Plus there are the ZFS tools: snapshots, send/receive, and scrub, which can check integrity while the system is running. Those are great features.

I've got a ZFS server here that regularly detects some small number of megs of incorrect data on each week's scrub. This week, it happens to be 4.28M. Every week, ZFS finds the correct copy and fixes it.
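For anyone wondering how "finds the correct copy and fixes it" can work at all, here is a toy sketch in Python. It is not ZFS code. It only assumes the general idea that a checksum for each block is kept apart from the block itself, so a scrub can tell which side of a mirror is wrong. SHA-256 is a stand-in checksum, and the `scrub` function and the two in-memory "disks" are made up for illustration.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Stand-in checksum; real pools may use a different algorithm."""
    return hashlib.sha256(block).hexdigest()


def scrub(mirror, expected):
    """Verify every copy of every block and repair bad copies in place.

    mirror   -- list of "disks", each a list of byte blocks (hypothetical model)
    expected -- list of checksums recorded when the blocks were written
    Returns the number of block copies that had to be rewritten.
    """
    repaired = 0
    for i, want in enumerate(expected):
        # Look for any copy whose contents still match the recorded checksum.
        good = next((disk[i] for disk in mirror if checksum(disk[i]) == want), None)
        if good is None:
            raise IOError(f"block {i}: no valid copy left on any disk")
        # Overwrite every copy that fails verification with the good one.
        for disk in mirror:
            if checksum(disk[i]) != want:
                disk[i] = good
                repaired += 1
    return repaired


blocks = [b"alpha", b"bravo", b"charlie"]
expected = [checksum(b) for b in blocks]

disk_a = list(blocks)
disk_b = list(blocks)
disk_b[1] = b"brXvo"  # simulate a silent bit flip on one side of the mirror

repaired = scrub([disk_a, disk_b], expected)
print(f"repaired {repaired} block copy/copies")
```

The point of the sketch is the `good is None` branch: with a plain two-way mirror and no independent checksum, a mismatch between the two sides tells you something is wrong but not which side to trust.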

I have no idea what the problem is with this server. There are no SMART failures or kernel messages indicating hardware failure, and the system doesn't hard-crash. The thing is, I don't actually have to care, because ZFS is actively taking care of the problem. Until one of the disks goes so bad that SMART or the kernel's SATA layer or ZFS can point me at it, I can just passively let ZFS continue protecting me.

If this were a RAID, the first risk is that the RAID system wouldn't have a scrub command at all. Some do, but not all. Without such a command, those on-disk ECCs the author heaps so much praise on won't help him. I've got the same ECCs backing my ZFS, and clearly the data is getting corrupted anyway, somehow.

Let's keep the author's context in mind, which is apparently that we're going to use motherboard or software RAID, since he's budgeted $0 for a hardware RAID card, so the chances are higher that there is no scrub or verify command.

If our RAID implementation does happen to have a scrub or verify command, it might be forced to just kick one of the disks out or mark the whole array as degraded, depending on where in the chain the corruption happened. If it does that, it'll take a whole lot longer to rewrite one of the author's cheap 3 TB disks than it took ZFS on my file server to fix the few megs of corrupted blocks.
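To put rough numbers on that, here is a back-of-the-envelope calculation. The 150 MB/s sustained write speed is purely an assumption for illustration, as is reading "4.28M" as MiB. It also only compares the amount of data rewritten, and ignores the time a scrub spends reading the pool.

```python
def rewrite_seconds(num_bytes: float, bytes_per_second: float) -> float:
    """Time needed to write num_bytes at a constant throughput."""
    return num_bytes / bytes_per_second


ASSUMED_THROUGHPUT = 150e6      # bytes/s; an assumption, not a measurement
FULL_DISK_BYTES = 3e12          # one 3 TB disk, rewritten end to end
REPAIR_BYTES = 4.28 * 2**20     # this week's 4.28M, treated as MiB

full_hours = rewrite_seconds(FULL_DISK_BYTES, ASSUMED_THROUGHPUT) / 3600
repair_seconds = rewrite_seconds(REPAIR_BYTES, ASSUMED_THROUGHPUT)

print(f"full-disk rewrite: about {full_hours:.1f} hours")
print(f"repairing the bad blocks: about {repair_seconds:.3f} seconds")
```

Under those assumptions the full rewrite comes to several hours, while rewriting only the damaged blocks is a fraction of a second of actual writing.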

And that's not all. I have a second anecdote, the plural of which is "data," right? :)

Another ZFS-based system I manage had a disk die outright in it. SMART errors, I/O timeouts, the whole bit. Very easy to diagnose. So, I attached a third disk in an external hard disk enclosure to the pained ZFS mirror, which caused ZFS to start resilvering it.

Before I go on, I want to point out that this shows another case where ZFS has a clear advantage. In a typical hardware RAID setup, a 2-disk mirror is more likely to be done with a 2-port RAID card, because they're cheaper than 4-port and 8-port cards. That means there is a very real chance that you couldn't set up a 3-disk mirror at all, which means you're temporarily reduced to no redundancy during the resilver operation. Even if you've got a spare RAID port on the RAID card or motherboard, you might not have another internal disk slot to put the disk in. With ZFS, I don't need either: ZFS doesn't care if two of a pool's disks are in a high-end RAID enclosure configured for JBOD and the third is in a cheap USB enclosure.

The point of having a temporary 3-disk mirror is that the dying disk wasn't quite dead yet. That means it was still useful for maintaining redundancy during the resilvering operation. With the RAID setup, you might be forced to replace the dying disk with the new disk, which means you lose all your redundancy during the resilver.

Now as it happens, sometime during the resilver operation, `zpool status` began showing corruptions. ZFS was actively fixing them like a trooper, but this was still very bad. It turned out that the cheap USB external disk enclosure I was using for the third disk was flaky, so that when resilvering the new disk, it wasn't always able to write reliably. I unmounted the ZFS pool, moved the new disk to a different external USB disk enclosure, re-mounted the pool, and watched it pick the resilvering process right back up. Once that was done, I detached the dying disk from the mirror and did a scrub pass to clear the data errors, and I was back in business having lost no data, despite the hardware actively trying to kill my data twice over.
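For reference, the sequence above maps roughly onto the following `zpool` subcommands. This is a sketch, not a transcript of what was actually typed: the pool name `tank` and the device names are hypothetical, and it assumes that "unmounted" and "re-mounted" correspond to `zpool export` and `zpool import`. The script only prints the commands and does not run anything.

```python
import shlex


def mirror_replacement_plan(pool, healthy, dying, new):
    """Return the zpool commands for the replace-via-3-way-mirror procedure.

    All four arguments are placeholders supplied by the caller.
    """
    return [
        # Attach the new disk to the existing mirror; this starts a resilver.
        ["zpool", "attach", pool, healthy, new],
        # Watch resilver progress and per-device error counters.
        ["zpool", "status", pool],
        # Take the pool offline before moving the disk to another enclosure.
        ["zpool", "export", pool],
        # Bring the pool back; the resilver is expected to carry on.
        ["zpool", "import", pool],
        ["zpool", "status", pool],
        # Once the resilver has finished, drop the dying disk from the mirror.
        ["zpool", "detach", pool, dying],
        # Final pass to verify everything and repair any remaining bad blocks.
        ["zpool", "scrub", pool],
    ]


plan = mirror_replacement_plan("tank", "ada0", "ada1", "da0")
for cmd in plan:
    print(shlex.join(cmd))
```

Check `zpool status` between steps before moving on; in particular, don't detach the old disk until the resilver onto the new one has completed.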

There are still cases where I'll use RAID over ZFS, but I won't pretend that ZFS has no real advantages over RAID. I've seen plenty of evidence that it does.

By the way, are you running ZFS on a Linux server? Or BSD? I want to set one up for myself too.


The first anecdote is about a TrueOS box — which previews what will become FreeBSD 12 — and the second is about a macOS Sierra box running OpenZFS on OS X.

Since TrueOS, O3X and ZoL are all based on OpenZFS, I expect that you will have the opportunity to replicate my experiences should you have disks that die. Now I don't know whether to wish you good luck or not. :)

I'm running it on FreeBSD. I am happy.
