
Has RAID5 stopped working? (2013) - xearl
http://www.zdnet.com/has-raid5-stopped-working-7000019939/
======
apenwarr
The analysis in this article is horrifically poor. Among other things, just
using the expected error rate as a pass/fail criterion is not a good idea.
Even if the "typical" RAID5 array doesn't fail, isn't it still bad if 1/10 or
1/100th of them fail because of randomness in the error distribution?
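
To make that concrete, here's a back-of-the-envelope sketch (Python; the
12 TB read and the one-URE-per-10^14-bits spec are the figures these articles
use, and the Poisson error model is my assumption):

    import math

    BER = 1e-14              # spec'd unrecoverable read errors per bit read
    bits_read = 12e12 * 8    # a rebuild that reads 12 TB (the usual example)

    expected_ures = BER * bits_read                # ~0.96
    # With a Poisson error model, an expectation below 1 still leaves
    # a large chance of hitting at least one URE on any given rebuild:
    p_at_least_one = 1 - math.exp(-expected_ures)  # ~0.62

    print(f"expected UREs per rebuild: {expected_ures:.2f}")
    print(f"P(at least one URE): {p_at_least_one:.0%}")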

There are also lots of ways to mitigate this problem even with a high error
rate. In particular, if you get a bad sector, don't just kick the disk
immediately out of the RAID; try recovering that one sector first, using the
other disks in the RAID. The drive's built-in sector remapping should then
bypass the newly-bad sector.
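
A minimal sketch of that recovery path (Python; the XOR step is the actual
RAID5 parity math, but the "stripe" here is a toy):

    from functools import reduce

    def xor_blocks(blocks):
        """RAID5 parity math: any one block of a stripe is the XOR of
        all the other blocks in that stripe (data plus parity)."""
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    # Toy stripe: three data blocks and their parity block.
    data = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
    parity = xor_blocks(data)

    # Say the sector holding data[1] goes bad: rebuild it from the
    # surviving members, then write it back to the failing disk so the
    # drive's firmware remaps the bad sector to a spare.
    recovered = xor_blocks([data[0], data[2], parity])
    assert recovered == data[1]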

I wrote about this in detail in 2008:
[http://apenwarr.ca/log/?m=200809#08](http://apenwarr.ca/log/?m=200809#08)

~~~
baruch
What you describe is what is known as disk scrubbing, and indeed, if you are
operating a RAID device and it doesn't do disk scrubs every so often, you are
doing it wrong. Linux mdadm has this feature as well, and in most
distributions that I'm aware of, disk scrubbing is performed once a month by
default.
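
For reference, a minimal sketch of triggering such a check by hand through
the md sysfs interface (Python, run as root; "md0" is an assumed device name,
and Debian-style distributions ship a monthly cron job that does effectively
this):

    ARRAY = "md0"  # assumption: substitute your array's device name

    # Ask the md layer to read every sector and verify parity ("check");
    # writing "repair" instead would also rewrite mismatched parity.
    with open(f"/sys/block/{ARRAY}/md/sync_action", "w") as f:
        f.write("check")

    # Scrub progress then shows up in /proc/mdstat.
    with open("/proc/mdstat") as f:
        print(f.read())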

In addition, reading the BER spec and immediately concluding that RAID stands
no chance is not quite correct. If you do your disk scrubs, your disks are
mostly clean and the likelihood of failure is not that great. I was
responsible for tens of thousands of disks in a storage system that did disk
scrubs; it occasionally found a bad sector[1], but I never once saw the
double failure that is so often warned about in the "RAID is dead" articles.

Please do your disk scrubs to avoid sectors falling off, and remember that
RAID is not backup: extra redundancy is still needed if you really care about
your data.

[1] I didn't have the presence of mind to compare the rate of error detection
to the official BER.

------
CHY872
OK, so one thing that is bugging me: this is totally a software-based
problem. At the moment, if I have a big hard disk, I'm almost certainly going
to have some kind of read error in at least a couple of sectors, which
practically means that I'll get some garbled bytes on some reads, and I hope
they land in a cache file or swap file, and not in my registry, programs,
etc.

So here, they're saying that if you have a RAID 5 and one disk fails, then
when a remaining drive hits such an error during the rebuild, it's suddenly
game over for the entire array: the rest of the data must be copied to a new
array, because the firmware says so. Apparently that's why RAID 5 is a bad
idea.

Surely the same argument can be applied to hard drives on their own? If
Windows checksummed every file before using it, and forced copying every byte
onto a new hard drive whenever an error was found, hard drives themselves
would have stopped working as a meaningful method of storage in 2009. But it
doesn't do that: you (likely) get some garbled data, and it's usually not a
problem. File systems get corrupted, etc. (unless you use something like
ZFS), and RAID 5 handling this in a particularly non-graceful way is just an
implementation detail.

So really, this is just a problem with the controller firmware forcing you to
make a whole new array. It's not a problem with RAID 5.

Or have I missed something?

I know RAID 5 is not a good choice in this modern age (cloud, big SANs, SSDs,
etc.), but surely the 'a sector will be bad, and you will not be able to
rebuild after a failure' problem could easily be fixed by the RAID controller
companies if their clients complained enough?

~~~
simplexion
No. Here is a good explanation: [http://www.zdnet.com/blog/storage/why-raid-5-stops-working-i...](http://www.zdnet.com/blog/storage/why-raid-5-stops-working-in-2009/162)

~~~
CHY872
I don't believe this contradicts what I've said at all. His claim is that
when one disk fails completely, we expect a sector or two on each of the two
remaining drives to produce a 'bad sector' error.

> So the read fails. And when that happens, you are one unhappy camper. The
> message "we can't read this RAID volume" travels up the chain of command
> until an error message is presented on the screen. 12 TB of your carefully
> protected - you thought! - data is gone. Oh, you didn't back it up to tape?
> Bummer!

So, at this point, we've got two hard drives containing millions of sectors,
of which one or two are bad, and the software claims that the whole thing is
broken and we have to find a new array?

As far as I know, RAID 5 is block-level; a block-level failure should only
destroy one block. All of the others (apart from the other dead ones) are
fine. This sort of thing happens in all scenarios with a single disk:
eventually an operating system will hit a bad sector which it will have to
deal with.

In other words, why does the RAID controller crap itself when it can't read
(with no recourse at all, according to these articles), when it could just do
what every other hard drive does and return 'sector unreadable'? Then the
operating system could just remap it, etc.

I know in some situations one would want to be notified of any minuscule
error, but it should be possible to ignore the warnings.
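
For illustration, a toy sketch of what that friendlier failure mode could
look like on a degraded array (Python; the stripe representation and function
names are made up):

    from functools import reduce

    def xor_blocks(blocks):
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    def degraded_read(stripes):
        """stripes: per-stripe lists of the surviving disks' blocks,
        with None standing in for a block lost to a URE."""
        out, bad = [], []
        for n, members in enumerate(stripes):
            if all(m is not None for m in members):
                out.append(xor_blocks(members))  # rebuild the dead disk's block
            else:
                out.append(None)  # this one stripe is unreadable...
                bad.append(n)     # ...but the rest of the array survives
        return out, bad

    blocks, lost = degraded_read([[b"\x01", b"\x02"], [None, b"\x03"]])
    print(lost)  # [1]: one lost stripe, not a lost array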

~~~
micro-ram
I think the assumed issues here are the long rebuild times and the
expectation that the stress of a rebuild will trigger a disk error on another
member, at which point the entire array of data would be taken offline and
possibly corrupted. I have built many RAID5/6 arrays and have rarely lost any
data, but I am leery now and tend to just stick with smaller RAID 1 or 10
arrays due to the large size of current disks. We really need native ZFS
(Btrfs?) on everything now. Data should be automatically distributed to
multiple disks, and the file system should be able to guarantee, via
checksums, that the data read is what I wrote.
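
The checksum idea, reduced to a toy (Python; real ZFS keeps checksums in
parent block pointers rather than next to the data, so this only shows the
principle):

    import hashlib

    def store(blocks):
        """Keep a SHA-256 digest alongside every block written."""
        return [(hashlib.sha256(b).digest(), b) for b in blocks]

    def load(stored):
        for digest, block in stored:
            if hashlib.sha256(block).digest() != digest:
                raise IOError("checksum mismatch: drive returned bad data")
            yield block

    disk = store([b"important", b"data"])
    print(list(load(disk)))  # raises instead of silently returning garbage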

~~~
venus
Are any consumer NASes that implement ZFS even available?

Sounds like ZFS's proactive sector sweeps across all managed drives would
handily solve the problem the article raises with conventional RAID.

~~~
mitchty
So I built a ZFS RAID NAS box with 6x3TB drives. I have it scrub every 2
weeks. So far, NOTHING has failed checksums in almost 2 years. So while the
maths behind a 3-4 TB drive returning incorrect data is, I'm sure,
technically correct, I've not seen issues.

If you want "proof" I can dump out zpool status/info/log/etc. to show I'm not
lying. Note the pool is ~50% in use, so it's not a great example. Also, it's
raidz2 (RAID6), so it's not a direct comparison. I also bought each drive
from a different lot, to hopefully ensure that if a drive failed I'd have
2-ish days to get a replacement.

------
shopinterest
In my anecdotal (not technical) experience, here are the issues with RAID5
failing (from losing 4 TB of data and 2 TB of backups):

- Consumer devices: the later Buffalo TeraStations (consumer NAS) and other
'low'-price devices for some stupid reason tied the RAID array to their own
hardware (via firmware), so if the unit fails, the array fails.

- The size and time to rebuild place great stress on the disks, potentially
causing a second drive loss. Rebuilding 250GB-500GB drives took 90 minutes or
so; a 2TB rebuild takes several hours (see the sketch after this list), with
every drive grinding to recover data for the whole time, and this is when you
are likely to lose more than the one drive, and your array with it.

- Consumer and SMB NAS vendors tend to buy their hard drives from the same
manufacturer, at the same time, so when one drive fails, the others are just
as likely to fail in the same period. I lost two 1TB backups when the drives
made a 'suicide pact'; when I opened them up, I saw they were literally made
on the same day. To be extra paranoid, I used to mix 2 or 3 different brands
of drives when I had my NAS.

- I recommend Synology units, and constantly monitoring the health of your
hard drives. Totally agreed that RAID1 should be the gold standard. I've used
several brands, and Synology was rock solid.
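
On the rebuild times above, a rough lower bound (Python; the ~100 MB/s
sustained rate is an assumption, and a rebuild under live load is slower):

    def rebuild_hours(capacity_tb, mb_per_s=100):
        """A rebuild must read (or write) every sector at least once."""
        return capacity_tb * 1e12 / (mb_per_s * 1e6) / 3600

    print(f"{rebuild_hours(0.5):.1f} h")  # ~1.4 h for a 500 GB drive
    print(f"{rebuild_hours(2.0):.1f} h")  # ~5.6 h for a 2 TB drive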

------
chadnickbok
Perhaps a better argument to make is that, since 2009, "Cloud Storage" has
taken off alongside Amazon's S3 service.

So while folks who really do need to reliably store 500GB of stuff might
still be looking at RAID, the 'average' end user now has a far better option
than complicated multi-disk setups.

~~~
kijin
The "average" end user also doesn't want to spend weeks uploading 1TB of files
to the cloud.

Imagine a typical American or Canadian "broadband" consumer internet
connection with 10Mbps down and 1Mbps up. If you keep your computer on 8 hours
a day, it will take just over 9 months to upload 1TB of data to the cloud at
that speed. A multi-TB RAID array would take years to back up. (This is why I
think Backblaze et al. can afford to offer "unlimited backup" for a flat
monthly fee. The customer's ISP is doing all the limiting for them!)
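
The arithmetic, for the skeptical (Python; decimal TB and a steady 1 Mbps are
assumed):

    bits = 1e12 * 8                    # 1 TB of data to upload
    uplink = 1e6                       # 1 Mbps upstream
    online_seconds_per_day = 8 * 3600  # computer on 8 hours a day

    days = bits / uplink / online_seconds_per_day
    print(f"{days:.0f} days, about {days / 30:.1f} months")  # 278 d, ~9.3 mo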

It was a major PITA when I first started using online backup services while in
Canada. (TekSavvy FTW!) Despite paying for one of their faster plans, I had to
keep my computer on for a couple of weeks 24/7 to make the initial backup.

~~~
jimmcslim
Is there an opportunity for a well-known backup vendor to have shopfronts in
malls that are wired into a high-speed backbone? Take them your external USB,
go have lunch in the food court, come back later when the backup is complete.

'YOUR PHOTOS BACKED UP TO CLOUD WHILE U WAIT'

~~~
gizmo686
You don't even need the high-speed backbone. You could back up the files to a
local drive. The latency might be bad, but trucks still have higher bandwidth
than the internet.

[http://what-if.xkcd.com/31/](http://what-if.xkcd.com/31/)
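
A quick sanity check on the truck's bandwidth (Python; one 1 TB drive and
overnight shipping are assumed):

    payload_bits = 1e12 * 8      # one 1 TB drive in the parcel
    transit_seconds = 24 * 3600  # overnight shipping

    effective_mbps = payload_bits / transit_seconds / 1e6
    print(f"{effective_mbps:.0f} Mbps")  # ~93 Mbps, vs. the 1 Mbps uplink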

~~~
jimmcslim
True, absolutely. But there's something more appealing to me about handing a
drive across a counter, knowing it will be backed up in a few hours (assuming
that is practical) and ready for me to confirm from back home via my
higher-download-speed broadband, rather than having the drive rattle across
the country in a truck, sit in a mail room, and finally get processed X
days/weeks later...

~~~
lessnonymous
So Mall Backup Co takes two copies immediately. One on their local node and
one that gets shipped back to base.

As soon as you get home you can access your backup (transparently) from the
mall node. Then when the drive gets back to base it does another verification
run and instructs the node that it can re-use that space.

------
Spooky23
RAID enhances online availability and performance. It isn't backup. So if you
are the consumer addressed in the article, the answer is no, you don't need a
RAID array, you need offline backup.

------
fleitz
No it isn't; you just need to do proactive maintenance on your array...
and/or back up your data.

If you use RAID5 on an array of any size without making backups, you will
lose data. Does he really think that the arrival of 1TB+ drives marks the
first time two drives have failed at once?

~~~
sard420
Back in the day, all 10 of your Seagate 120GB drives would die in the span of
2 months, sometimes with two or three popping off in the same day.

------
jrockway
Drives are so cheap these days that I wouldn't consider anything other than
RAID-1. For applications requiring cheaper storage or better density, I'd
consider application-level redundancy coding.

~~~
kayoone
RAID 1 doesn't offer any performance benefits, though. You could go with RAID
0+1 (or 10), but I believe read performance is still better in a RAID 5
configuration, while write performance is lower.

In the age of really fast SSDs, RAID 1 might be the better choice, unless you
really need a huge amount of fast storage.

~~~
jrockway
RAID 1 does have read performance benefits; you can read from both disks in
parallel. Write time is max(disk0, disk1).

~~~
kayoone
Oh really? I thought you'd go RAID 10 to get read benefits, and that that's
the main reason it exists in the first place. But it seems you are right,
though it depends on the RAID controller and drivers.

------
pmoriarty
Related:

[https://news.ycombinator.com/item?id=7830213](https://news.ycombinator.com/item?id=7830213)

