
> URE rate is documented in most handbooks or manuals on hard disks; it's usually about 10^-12, though this is a worst-case error rate and -13 or -14 is more realistic.

The kind of literature I'm looking for is based on real-world data, since spec sheets are generally marketing documents and therefore, at best, uninformative. This is particularly noticeable when a single number is used for a statistic, as in this case.

Let's take a current datasheet [1], which actually says 1 sector (unclear if that's 512 bytes or 4K, but let's assume 512) per 10E15, max, a full 3 OOMs better than the 10^-12 you mentioned. Read that way, that's 10^15 × 512 bytes = 512 PB which, at the highest listed transfer rate, would take almost 545k hours to read, corresponding to slightly above a 1.6% AFR for worst-case URE-caused failures.
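
For transparency, here's the back-of-envelope arithmetic behind those numbers; a minimal sketch, where the per-sector reading and the ~261 MB/s transfer rate are my interpretation of the datasheet and should be treated as assumptions:

    # Worst-case URE-driven AFR from the datasheet figure, read as
    # 1 unrecoverable sector per 10^15 sectors (512-byte sectors assumed).
    sector_bytes = 512
    bytes_per_ure = 1e15 * sector_bytes           # 512 PB between expected UREs
    transfer_rate = 261e6                         # bytes/s; assumed top listed rate
    hours_per_ure = bytes_per_ure / transfer_rate / 3600   # ~545k hours
    afr = 8760 / hours_per_ure                    # hours per year / hours per URE
    print(f"{hours_per_ure:,.0f} h per expected URE; worst-case AFR ~{afr:.1%}")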

I don't believe the spec sheet's 0.35% (average) AFR, but since real-world AFRs are in the low single-percent range, I'm sticking with the conclusion that read errors need not be considered separately from any other underlying cause of drive failure.

> Depends on your use case.

I'm not convinced that "waste" can ever depend on use case when examined under the narrow lens of redundancy. Regardless, it was a use case you, specifically, brought up, so the question remains open.

> Best practice is to test your failover regularly, correct

You're misunderstanding my assertion, which wasn't about regularity but about frequency. I made no statement as to regularity (with discrete failovers it may even be better to do them irregularly rather than regularly).

My assertion is that best practice dictates frequent failover.

> It's not that low

> risk of marking your array as dead if you get unlucky.

You still haven't actually quantified the risk. "Unlucky" and "possible" are too hand-wavy to engineer around.

Quantifying the risk, even if only as a very coarse approximation, is a prerequisite to avoiding FUD-based decisions/actions. Usually, the most visible consequence of the latter is the waste alluded to earlier, manifesting as higher cost and/or lower performance. The less visible consequence is attention and resources misallocated from something relatively likely (e.g. human error causing catastrophic data loss) to something much less likely (e.g. a single-sector read error causing catastrophic data loss).

[1] https://www.seagate.com/www-content/datasheets/pdfs/exos-x-1...




>Let's take a current datasheet [1]

Of a high-profile, high-quality enterprise disk, unlikely to appear in 99% of RAID setups. Congratulations.

Most RAID setups in the wild consist of low-quality enterprise or even desktop platters; high-quality setups are rare and expensive.

Especially considering that lots of consumers have a NAS or MyCloud equivalent at home, and SMBs don't buy those hard drives either, I don't think this is a valid comparison.

[https://www.seagate.com/files/www-content/product-content/na...]

[https://www.seagate.com/staticfiles/docs/pdf/datasheet/disc/...]

These drives have an OOM higher error rate on the datasheet and they're still not very common. Plus, the error rate is given in bits here, shaving off another 3 to 5 OOMs; calculating it out gives you a bad bit every 12 TB or so. Depending on how your RAID works, this can ruin the entire stripe or just the sector it hits.

12 TB isn't a lot: there is a good probability that a 3x4TB RAID5 array will hit such an error, and a 3x2TB RAID5 has a 50/50 chance of hitting one during a rebuild.
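
A rough sketch of that estimate, assuming one bad bit per 10^14 bits and independent errors (a 3-drive RAID5 rebuild reads the two surviving drives; the ~50/50 figure corresponds to the cruder whole-array estimate of 6 TB against one error per ~12.5 TB):

    import math

    def p_ure(bytes_read, bits_per_error=1e14):
        # Probability of at least one unrecoverable error while reading
        # bytes_read, with independent bit errors at the datasheet rate.
        return 1 - math.exp(-bytes_read * 8 / bits_per_error)

    print(f"3x2TB RAID5 rebuild (reads 4 TB): {p_ure(2 * 2e12):.0%}")  # ~27%
    print(f"3x4TB RAID5 rebuild (reads 8 TB): {p_ure(2 * 4e12):.0%}")  # ~47%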

>I'm not convinced that "waste" can ever depend on use case

It definitely can. Consider the use case of "office documents" vs. "high-performance video storage". With office documents, high throughput is not needed, so we can reduce storage costs by using RAID5 or 6, depending on array size. Video storage, for example for CCTV, requires more throughput, and RAID10 will give you better performance at the cost of some effective storage.

You can of course put your office documents on a RAID10, but that is wasted space. It is not necessary to maximize RAID performance there, and the low storage needs of office documents mean you can use RAID5 on small arrays fairly safely, RAID6 if you need more.

Waste is a matter of efficiency in all domains; in this case it means picking the RAID level that best balances storage efficiency, performance goals, and safety. Picking one that needlessly trades off one for another is wasteful or even dangerous.
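
To put numbers on that trade-off, the standard usable-capacity fractions per RAID level (textbook formulas, not tied to any particular vendor):

    # Usable fraction of raw capacity for an array of n drives.
    def raid5(n):  return (n - 1) / n   # one drive's worth of parity
    def raid6(n):  return (n - 2) / n   # two drives' worth of parity
    def raid10(n): return 0.5           # everything mirrored

    for n in (4, 6, 8):
        print(f"n={n}: RAID5 {raid5(n):.0%}, RAID6 {raid6(n):.0%}, "
              f"RAID10 {raid10(n):.0%}")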

>My assertion is that best practice dictates frequent failover.

I don't think I've ever met a sysadmin who insists that frequent failover due to failure is best practice. Regular testing, yes; regular failure, no. Failure is expensive; testing is not.

>You still haven't actually quantified the risk. "Unlucky" and "possible" are too hand-wavy to engineer around.

They're not, really; plus, you can easily quantify the risks, as you have done using datasheets. Though the moment your array isn't a single batch of exactly the same hard drive, these estimations become a lot harder.

You don't need to quantify the risk exactly; it is sufficient to know the margins of the risk, or even just whether you are getting into those margins, i.e. a risk model.

I don't need to know the exact and perfect probability of a read error on the array; I need to know whether the array is likely to experience one (with "likely" being >1%, or any other >X%, during a rebuild or during normal operation). This question is more easily answered and doesn't require more than estimations.

This trades cost efficiency for safety margins, meaning the array has a comfortably low actualized risk of failing. That is what I consider actual engineering.


> Of a high-profile, high-quality enterprise disk, unlikely to appear in 99% of RAID setups.

> Most RAID setups in the wild consist of low-quality enterprise or even desktop platters; high-quality setups are rare and expensive

> they're still not very common

More extraordinary claims, requiring extraordinary evidence. Even Barracuda Pro models (which do cost more, but not prohibitively so) have the same read error rate; I didn't choose that datasheet only because it has less data overall.

> Especially considering that lots of consumers have

I call "red herring" on this, since consumers also aren't going to have the kinds of choices we're discussing, nor read these forums, nor the original article (which is the context for this whole discussion).

> the error rate is given in bits here, shaving off another 3 to 5 OOMs

I'm not convinced of that, since the tables look identical between the drives. Maybe it's a sneaky marketing ploy, but maybe not. Ultimately, you need real-world data, which you consistently haven't provided.

Absent that, it seems as though you're relying on assumptions, and my original conclusion, based on the data that has been published by the likes of Google and Backblaze, stands.

> Waste is a matter of efficiency in all domains

You seem to have re-defined waste, so I can't really speak to it.

> I don't think I've ever met a sysadmin who insists that frequent failover due to failure is best practice.

I fear you are, again, misunderstanding. I didn't mention failure as a cause, although the term "failover" lends itself to confusion, having the substring "fail" in it. I used the term only in the sense of "switchover".

Perhaps I merely misunderstood you originally. You did initially state "A failover is still a failover and can reduce performance"; even assuming you meant failover-due-to-failure, that assertion is questionable in the context of justifying node-level redundancy on top of RAID-level redundancy, if switchover (not due to failure) is engineered to be frequent (or even continuous).

> you can easily quantify the risks, as you have done using datasheets

I'm neither agreeing nor disagreeing as to its ease; I'm asking you to actually perform this quantification, whose ease you assert, since that seems to be the basis of your point.

(Earlier, I just made some single-disk calculations based on the spec sheet, not array risks.)

> I need to know whether the array is likely to experience one (with "likely" being >1%, or any other >X%, during a rebuild or during normal operation). This question is more easily answered and doesn't require more than estimations.

Agreed. As I mentioned, coarse estimates (if based on real data) are plenty good enough. However, even a coarse estimate assigns some number to it.

Given your criterion above, what's the answer for "likely" being >1%, during a 300-hour rebuild? How did you arrive at that answer?
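
To be concrete about the shape of answer I'm asking for, here's the skeleton of such an estimate. The 100 MB/s rebuild read rate is a placeholder of mine; the two error rates are the competing spec-sheet figures from this thread:

    import math

    def p_ure(bytes_read, bytes_per_error):
        # Probability of at least one URE, assuming independent errors.
        return 1 - math.exp(-bytes_read / bytes_per_error)

    bytes_read = 300 * 3600 * 100e6   # 300 h rebuild at 100 MB/s ~= 108 TB
    for label, bpe in [("1 per 10^14 bits",    1e14 / 8),
                       ("1 per 10^15 sectors", 1e15 * 512)]:
        print(f"{label}: {p_ure(bytes_read, bpe):.2%}")

The four-OOM spread between those two answers (~99.98% vs. ~0.02%) is exactly why I keep asking for real-world data rather than spec-sheet numbers.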



