Hacker News

The 'hidden' cost of traditional RAID for your home NAS:

I've learned the hard way that the 'R' in traditional RAID truly does stand only for "redundant" and not "reliable". Reliability in traditional RAID is predicated on complete, catastrophic failure of a drive, such that it is either working wholly and completely or failing wholly and completely.

In a traditional RAID, for any failure mode in which a drive or its controller starts to report bad data before total failure, the bad data is propagated like a virus to the other drives. The corruption returned by a failing drive is lovingly and redundantly replicated to the other drives in the RAID.

This is the advantage of ZFS (or BTRFS). Blocks of data are checksummed and verified and corruption isolated and repaired. Yay for reliable data.
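To make the self-healing idea concrete, here's a minimal Python sketch of per-block checksum verification and repair across mirror copies. This is an illustration only: SHA-256 stands in for ZFS's actual per-block checksums (fletcher4/SHA-256), and the function names are my own.

```python
import hashlib

def checksum(block: bytes) -> str:
    # SHA-256 as a stand-in for a filesystem's per-block checksum
    return hashlib.sha256(block).hexdigest()

def read_with_repair(copies: list, stored_sum: str) -> bytes:
    """Return the first mirror copy whose checksum matches the one stored
    in metadata, and rewrite any corrupt copies from the good one."""
    good = next((c for c in copies if checksum(c) == stored_sum), None)
    if good is None:
        raise IOError("all copies corrupt: unrecoverable")
    for i, c in enumerate(copies):
        if checksum(c) != stored_sum:
            copies[i] = good  # "self-heal" the bad mirror copy
    return good
```

The key point versus traditional RAID: the checksum lives in metadata, so the filesystem can tell *which* copy is bad instead of blindly replicating whichever copy it read first.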

I'm about to either buy a home NAS or build my own with an Atom mini-ITX board. I plan to expand my array in the future. So is there any configuration that gives me the best of both worlds, i.e. the expandability of traditional RAID with checksums to prevent replicated errors?

BTRFS allows you to add and remove drives and change RAID levels on a mounted filesystem, while being able to use differently sized drives efficiently. That alone wins me over. Some people still say BTRFS isn't ready for production, but I've had fewer problems with it than with ZFS (YMMV). Still, I don't care much, as I have multiple backups.

I decided to try BTRFS on my NAS because it didn't require rebuilding a kernel. The ability to add disks to an array and have it rebalance made it very appealing.

Unfortunately, the three-drive filesystem lasted two weeks before it became unmountable. The only thing that let me mount it was finally running btrfsck. I was left with 57 unrecoverable errors, and lots of lost data.

I would not recommend running BTRFS in RAID5 or RAID6 just yet. Stick with mirroring, if you want to use it, and rebalance to RAID5/6 later on when it's more stable.

Which kernel, and did you file a bug report?

e: To anyone not up on btrfs, its features are closely tied to the kernel version it's used with. For example, raid56 scrub and device replace, and recovery and rebuild code were not available prior to kernel 3.19.

I also believe the only way to use 5/6 modes before they were stable was to explicitly compile with them enabled. It wasn't just something you could accidentally do.

It was 4.2.5-1-ARCH, and I didn't file a bug report, no.

I didn't have much data to submit. No kernel panics, no useful error messages, nothing beyond it saying it wouldn't mount. One could read the tea leaves from the filesystem as it sat, but such data spelunking could take a while on an 8TB partition, and I wanted to get the disks back into use.

I didn't notice the corruption until after I had unmounted it, so scrubbing it wasn't an option.

Thanks, that is what I wanted to know. Not what I hoped to hear, but what are you gonna do?

I recently got myself a new home server and decided to use FreeBSD and ZFS, so the question is settled for me for the moment. I still hope Btrfs gets there in the not-too-distant future, though.

When I was looking to install BTRFS, I saw tons of warnings not to install RAID5 or RAID6 because the code is not finished yet. The other levels are fine.

The last time I checked, using the RAID5/6-like feature was not stable, yet. Have they made progress on that?

I have been saying this for a long time, and yet no one seems to care about client data corruption. Synology has only recently offered NASes with BTRFS, and they're at the expensive end of the range.

As far as I can tell, the only consumer NAS maker that offers BTRFS is Netgear.

Sadly it looks like Synology's implementation of BTRFS does not implement correction, only detection at this stage: http://forum.synology.com/enu/viewtopic.php?f=260&t=104519

No, except possibly paid Solaris from Oracle (which has a newer version of ZFS), but that would probably cost a lot more than buying more disks.

What I do is have my ZFS in two "layers" (each of them 4 disks in raidz2, i.e. resilient against any two failures), and replace a whole layer at a time. So I started with 4x500GB drives for 1TB of capacity. Then I added 4x1TB drives, total capacity 3TB. Then I replaced the 500GB drives with 2TB drives, total capacity 6TB (and throwing away the 500GB disks, so "losing" 1TB). I'm shortly going to replace the 1TB drives with 4TB drives in the same way.
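The capacity math in that upgrade path can be sanity-checked in a few lines of Python, assuming the usual raidz2 accounting of two disks' worth of parity per vdev:

```python
def raidz2_capacity(n_disks: int, disk_tb: float) -> float:
    # raidz2 spends two disks' worth of space on parity per vdev
    return (n_disks - 2) * disk_tb

# The upgrade history described above:
start      = raidz2_capacity(4, 0.5)                             # 4x500GB  -> 1 TB
after_add  = start + raidz2_capacity(4, 1.0)                     # +4x1TB   -> 3 TB
after_swap = raidz2_capacity(4, 2.0) + raidz2_capacity(4, 1.0)   # 500GB->2TB -> 6 TB
```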

I personally use 2 ZFS groups of 2 disks each, and back up one onto the other 'manually' [0]. I mostly get the advantage of both worlds.

I use an HP MicroServer Gen8; it's lovely hardware. For software I run NAS4Free from a USB stick in the internal port.

[0]: by 'manually' I mean that the machine that uses obnam for backups uses one group or the other as the destination, depending on the day.
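That day-based alternation could look something like the sketch below. The pool names are hypothetical and the actual obnam invocation is omitted; this only shows the destination selection.

```python
from datetime import date

# Hypothetical names for the two 2-disk ZFS groups
POOLS = ["tank-a", "tank-b"]

def backup_target(today: date) -> str:
    # Alternate the destination pool by day, so each group
    # always holds a reasonably recent copy of the data
    return POOLS[today.toordinal() % 2]
```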

Look at the Netgear 516. It's got BTRFS and is plug and play. It has a small plugin library and is super quiet, except for the first 20 seconds after it starts up. It holds 6 disks, and it has an Intel server processor and more reliable server RAM. QNAP and Synology probably have more features and plugins, so it has less in that department, but I was sold on the file system, and overall it does what I need. Support updates have been fast and perfect from what I've seen too.

If you go your own route, look at the SilverStone DS380B case. Pair it with a mini-ITX server board with the most SATA ports you can find, and maybe get some ECC RAM and a server-rated Intel processor. I wish I knew the best software route to go. I've considered Amahi in the past and of course FreeNAS. I suspect even a couple of commercial packages could be sweet to have.

I have had one for four years.

My setup: openSUSE (BTRFS with their Snapper is awesome)

Two drives that are redundant.

CrashPlan for off site backup

I also run my IRC client (WeeChat) on the box with glowing-bear.org as the front end, and this is the BEST thing ever. I run my printer from the server and have a personal RStudio and Jupyter Notebook server running on it. I love it.

What hardware failures have you survived through so far, and how was your setup able to cope?

If you haven't had a failed disk yet, consider yourself lucky! :-)

<----- Lucky (No Failures) I think I paid $50 for the chip and motherboard. Western Digital Green 1 TB drives.

Do you know whether you can use ZFS for the checksumming on top of md for the RAID?

This way you get the expandability of RAID with the checksumming and snapshots of ZFS.

According to the article the ONLY way to expand ZFS after the fact is to "replace every hard drive in the VDEV, one by one, with a higher capacity hard drive." If you have some better work-around that gives you both, you need to either explain it or link to an article that does.

My thinking was that expanding your underlying md RAID would be the same as replacing the initial disk ZFS sees with a bigger one, thus enabling easier expansion at the md level and presenting a "bigger disk" to the zfs vdev.

I haven't seen it done; it's just a theory, which is why I asked. I'm just not sure whether ZFS needs to see actual disks, or whether it can work on top of any block device, like an md RAID.
