
I come not to praise RAID-5... (Why RAID-5 isn't evil) - MPSimmons
http://www.standalone-sysadmin.com/blog/2012/08/i-come-not-to-praise-raid-5/
======
lmm
The problem is the stupid way raid systems (or at least linux md) handle these
inevitable UREs. Make a raid5 of 2TB disks, use it for a bit, and it's
virtually guaranteed there's one bad sector on each disk (or at least, you'd
get one URE on each disk when reading them all). Now have a drive fail. No
problem, you think, I'll replace that drive and rebuild. Put in the
replacement, kick off the rebuild process. Linux will hit a URE on one of the
remaining drives, kick that drive out of the array too, and refuse to continue
the rebuild. You can't even mitigate this by doing a weekly verify of your
disks, because if that verify happens to run into UREs on two separate disks,
bam, bye bye data.

This is not theoretical, it happened to me; maybe there are some magic
parameters that get around it, but I read the available tutorials; if I made a
mistake, others have probably made it. I asked a kernel dev how to solve this
problem and he suggested a cronjob that runs md5sum on each of the underlying
disk devices. I wish I was joking.

Fortunately, there is a simple solution: ZFS. A raidz1 will recover perfectly
from the same scenario (again, this is not theoretical, I've done it); you
will lose the particular blocks that suffered the UREs, but no others (and ZFS
can tell you which files were affected). And you can run "zpool scrub"
regularly to catch any sectors that've decayed before you lose a drive and
can no longer use parity to restore them.
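For anyone who hasn't used it, the workflow looks roughly like this (pool and device names here are just examples, not from the setup above):

```shell
# Create a 3-disk raidz1 pool (survives one full disk failure,
# plus isolated UREs on the survivors during resilver)
zpool create tank raidz1 /dev/sda /dev/sdb /dev/sdc

# Scrub periodically (e.g. weekly from cron) so decayed sectors
# get repaired from parity while the parity still exists
zpool scrub tank

# Show scrub results, checksum error counts per device, and the
# names of any files that were permanently damaged
zpool status -v tank
```

The key difference from md is that ZFS checksums every block, so it knows exactly which blocks are bad and can keep rebuilding past them instead of ejecting the whole disk.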

~~~
baruch
You're supposed to run md scrubs as well; if you don't, it's entirely possible
you have bad sectors sitting undetected on your disks.

See <http://en.gentoo-wiki.com/wiki/RAID/Software#Data_Scrubbing>
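For reference, an md scrub is triggered through sysfs (md0 here is just an example array name; many distros ship a cron job that does this for you):

```shell
# Ask md to read and compare all stripes of the array; "check" only
# counts mismatches, "repair" would also rewrite them
echo check > /sys/block/md0/md/sync_action

# Watch the scrub progress alongside normal resync status
cat /proc/mdstat

# After it finishes, see how many mismatched sectors were found
cat /sys/block/md0/md/mismatch_cnt
```

This still doesn't fix the failure mode described above (two UREs on different disks, or a URE during a degraded rebuild), but it does shrink the window in which a latent bad sector can ambush you.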

