
Backblaze hard drive reliability stats for Q3 2016 - sashk
https://www.backblaze.com/blog/hard-drive-failure-rates-q3-2016/
======
pfarnsworth
With hard drive size increasing so quickly but hard drive transfer speeds
basically flat, I wonder if there are long-term implications for them with
respect to recovery from backup and downtime. For example, if a whole rack
goes down and they are on 32TB drives in the future, could it take a week or
more for their data to get back online?
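
A rough back-of-the-envelope sketch of that worry, in Python. The 200 MB/s
sustained rate is an assumption for illustration only, not a figure from
Backblaze, and a real recovery would also depend on network and rebuild
overhead that this ignores.

```python
def full_read_hours(capacity_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to read a whole drive end to end at a constant rate."""
    capacity_mb = capacity_tb * 1_000_000  # decimal units: 1 TB = 10^6 MB
    return capacity_mb / throughput_mb_s / 3600


# Assumed sustained sequential rate of 200 MB/s (hypothetical).
for size_tb in (4, 8, 32):
    print(f"{size_tb} TB: {full_read_hours(size_tb, 200):.1f} hours")
```

At that assumed rate a single 32TB drive takes a bit under two days to read in
full; a week would correspond to an effective rate of roughly 50 MB/s.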

~~~
dom0
Well, uh, yes.

No one really expects that rotational rust will get much faster, and in fact
history shows that, compared to the increase in density, the increase in
transfer rates is laughable at best. Between 1990 and today you are probably
looking at a 20 000 times increase in density, yet transfer rates only
increased by a factor of around 150-200. [In fact, from the early 1960s to
today it's only a factor of about 1000. There are quite possibly few
performance metrics that increased as slowly as disk transfer rate.]

Why is that?

Increasing density only marginally increases transfer speed: Most density
increases are achieved by packing more tracks onto the platter, while storing
more sectors per track plays a minor role. But a single R/W head can only read
a single track, not parallel tracks, hence speed only increases if you pack
more sectors into each track, not by increasing the number of tracks. That's
why between today and 10 years ago performance in desktop or server drives
only differs a little, compared to the capacity increase to 8+ TB in a 3.5"
drive.
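
A minimal Python sketch of the argument above, under an idealized model: one
full track passes under the head per revolution, every track holds the same
number of sectors (so zoned recording is ignored), and all the geometry numbers
are hypothetical.

```python
def sequential_mb_s(sectors_per_track: int, rpm: int, sector_bytes: int = 512) -> float:
    """Idealized sequential rate: one track read per revolution."""
    revolutions_per_second = rpm / 60
    return sectors_per_track * sector_bytes * revolutions_per_second / 1_000_000


def capacity_gb(tracks: int, sectors_per_track: int, surfaces: int,
                sector_bytes: int = 512) -> float:
    """Total capacity of the idealized drive in decimal gigabytes."""
    return tracks * sectors_per_track * surfaces * sector_bytes / 1e9


# Hypothetical geometries: baseline, 2x tracks, 2x sectors per track.
layouts = {
    "baseline": (100_000, 2_000),
    "2x tracks": (200_000, 2_000),
    "2x sectors/track": (100_000, 4_000),
}
for name, (tracks, sectors) in layouts.items():
    print(f"{name}: {capacity_gb(tracks, sectors, surfaces=2):.1f} GB, "
          f"{sequential_mb_s(sectors, rpm=7200):.1f} MB/s")
```

In this model doubling the track count doubles capacity and leaves the
transfer rate unchanged, while doubling sectors per track doubles both.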

More platters also don't help with transfer rate, because the alignment of all
heads on the actuator is fixed: at any given time only one platter and one R/W
head are used (locked to the track). [More platters can help reduce seek time
in certain scenarios though.]

More disks on the other hand...

~~~
hinkley
That's not entirely true.

Higher density means more data per track, not just more tracks per disk. You
get an entire track per revolution so a track with more data is more MBps. So
linear reads on a higher density drive are faster, and semi-linear accesses
(ie, reading two files that are next to each other) do get faster.

I remember reading a story about a guy who built a drive array with high
capacity 7200 RPM drives that got within 20% of the performance of the 10K RPM
setup they had, by partitioning the drives at the same capacity as the 10K
equivalent. The head only had half as many tracks to traverse, so worst case
access time was better, and the higher density made up for the lower RPMs.

~~~
baruch
Short stroking helps get more performance from the disk, at least in terms of
latency. The bandwidth change occurs because the density is fixed and there
are more blocks per track on the outer rings compared to the inner rings, so in
one revolution you can read more blocks and you don't need to change tracks so
often.

Your parent comment is right though, there are only small changes in bit
density on the track in recent years so the bandwidth is not improving by
much.

~~~
hinkley
So we went from 1TB to 8TB disks just by packing 3x as many tracks onto
drives? I find that hard to believe, and the benchmarks agree with me.

You don't double the write throughput on a disk by doubling the number of
tracks. You need more platters and/or sectors per track to do that.

------
meritt
We have about 20TB in an AWS S3 bucket we'd like to back up somewhere separate
from Amazon. Is there any chance of Backblaze offering ingestion from an
Amazon Snowball export
([https://aws.amazon.com/snowball/](https://aws.amazon.com/snowball/))?

~~~
atYevP
Yev from Backblaze here -> Believe Snowball is encrypted and fairly locked
down, not sure we could do that.

------
leejoramo
How are people using Backblaze's excellent hard drive reliability reports in
making purchasing decisions?

For example, when I search for HGST HMS5C4040ALE640 on Amazon I get a dealer
selling old, out-of-warranty drives as new.

[https://www.amazon.com/HGST-MegaScale-HMS5C4040ALE640-Coolsp...](https://www.amazon.com/HGST-MegaScale-HMS5C4040ALE640-Coolspin-Enterprise/dp/B01K3UV65O)

I get similar results with many of the other drives listed and with other
websites such as NewEgg.

~~~
ars
I did, I picked HGST because of their reports. No problems so far.

When you buy hard disks ONLY buy from Amazon or Newegg directly - never buy
from a 3rd party seller on their site. There is too much fraud with hard
disks in particular, and the risk of data loss makes it just too risky (unlike
other items).

~~~
discreditable
> When you buy hard disks ONLY buy from Amazon or Newegg directly - never buy
> from a 3rd party seller on their site.

Agreed. A few times I've purchased third-party disks and found (via SMART
data) that the drives were well used despite not being sold as such.
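
A minimal Python sketch of that kind of check, assuming the attribute-table
layout printed by `smartctl -A` from smartmontools. The sample text and its
values are made up for illustration, and not every drive reports attribute 9
as plain hours.

```python
import re
from typing import Optional

# Hypothetical sample in the attribute-table layout printed by `smartctl -A`.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       312
  9 Power_On_Hours          0x0012   097   097   000    Old_age   Always       -       21534
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       298
"""


def power_on_hours(smart_text: str) -> Optional[int]:
    """Return the raw value of attribute 9 (Power_On_Hours), or None if absent."""
    for line in smart_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "9":
            # Some drives append extra text to the raw value; keep the leading digits.
            match = re.match(r"\d+", fields[9])
            return int(match.group()) if match else None
    return None


hours = power_on_hours(SAMPLE)
if hours is not None:
    print(f"Power-on time: {hours} h (about {hours / 8766:.1f} years)")
```

On a real machine the text would come from running `smartctl -A` against the
device in question; a supposedly new drive showing thousands of power-on hours
is the red flag described above.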

~~~
chiph
Just bought some drives that were fulfilled via Amazon (but not sold by
Amazon.com LLC themselves). You've given me something to check when they
arrive.

------
trowawee
Looks like they're offline. Google cache link:
[https://webcache.googleusercontent.com/search?q=cache:DNdyZ1...](https://webcache.googleusercontent.com/search?q=cache:DNdyZ1o9QqwJ:https://www.backblaze.com/blog/hard-drive-failure-rates-q3-2016/+&cd=1&hl=en&ct=clnk&gl=us)

~~~
theandrewbailey
Backblaze always gives great stats with these. I upvoted the story before
looking at it.

~~~
trowawee
Agreed. I'm always interested in the results, even if their physical usage of
the drives is miles above anything I'm ever going to do with a hard drive.

------
porker
Anyone have a reason why the WD30EFRX has such poor reliability in their last
table? Granted it's a smaller sample size.

Only asking 'cause it's my main data hard drive...

~~~
devonkim
IIRC 3TB drives were among the newer drives commercially available at retail
when the flooding happened in Thailand and all retail drives were impacted
negatively. Furthermore, aggregate capacity is only one factor in the design
of a hard drive and most 3TB drives were made in a manner that reduces
reliability (more platters of same capacity or fewer platters with greater
individual platter capacity, can't quite remember which unfortunately). I
don't see why a manufacturer would put the newest technology into product
lines that are older so I'd presume that 3TB drives are among the greatest in
number of platters and the extra components contribute to the failure rate.

Among the least reliable drives I saw in previous reports were Seagate 3TB
drives (supposedly they had worse reliability than the legendary IBM
Deathstars) and after reading about how 3TB drives were designed across
manufacturers years ago during the flooding crisis I decided to avoid 3TB
drives entirely. Seems like my decision is finally getting some data to back
it up now in hindsight (no pun intended).

------
Fej
Backblaze is consistently a great service. Needs a Linux client though.

~~~
atYevP
Yev from Backblaze here -> have you checked out B2? We likely won't have a
Linux client for our backup service any time soon, but our B2 service has a
lot of integrators (like Cloudberry, Duplicity, HashBackup, etc.) that can
back up Linux machines; a lot of folks have been going that route.

~~~
liotier
> We likely won't have a Linux client for our backup service any time soon

Ok. But why? Technical obstacles such as having to deal with distribution
diversity - or is it a way of market segmentation?

~~~
atYevP
It's a mixture of a couple things. One is that we tend to run pretty lean and
our engineers are all booked up for the foreseeable future. Linux users are a
passionate community, but we can't quite justify the development time for a
market segment that is not very large. Additionally, because we run an
unlimited model, a lot of people would immediately sign up and back up their
Linux servers for $5/month and we'd sail out of business. We could address
that by putting in limits for those types of devices and only allowing certain
Linux builds, but that adds complexity and we want to keep the backup side of
the service very simple - which works for the vast majority of folks. So it's
a combination of a bunch of factors. We hoped that developing B2 APIs and CLIs
would give Linux folks something that they could use if they needed to have
offsite backups or archives and wanted to use our infrastructure cause we're
pretty neat. Long-winded answer, but TL;DR - small market segment, development
time/cost, possible abuse.

~~~
distances
It's worth considering though that the Linux compatibility is worth more than
the actual market share. Our company (with roughly 90% Mac / 5% Linux / 5%
Windows users) went with Crashplan to have a single backup solution for all of
the employees.

~~~
atYevP
Absolutely, and that makes perfect sense. Having one system in place
definitely beats out multiples. We know we can't be all things for all folks,
but Crashplan is great, no hard feelings ;)

------
avitzurel
Really surprising to me that some disks have a 4-5% yearly failure rate. Great
data.

------
caf
It's fantastic to see the confidence interval quoted in those latter tables (I
assume 95%?) - that's far more informative than just the mean failure rate.
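
A small Python sketch of why the interval matters. It uses the usual
failures-per-drive-year definition of an annualized failure rate and a crude
normal approximation to a Poisson count for the interval; this is an
assumption for illustration, not necessarily how Backblaze computes theirs,
and the counts below are hypothetical.

```python
import math
from typing import Tuple


def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Failures per drive-year, expressed as a percentage."""
    drive_years = drive_days / 365
    return 100 * failures / drive_years


def approx_interval(failures: int, drive_days: int, z: float = 1.96) -> Tuple[float, float]:
    """Rough ~95% interval, treating the failure count as Poisson.

    Normal approximation: count +/- z * sqrt(count). Poor for small counts.
    """
    drive_years = drive_days / 365
    half_width = z * math.sqrt(failures)
    low = max(0.0, failures - half_width)
    high = failures + half_width
    return 100 * low / drive_years, 100 * high / drive_years


# Hypothetical fleets with the same mean rate but very different sample sizes.
for failures, drive_days in ((10, 100_000), (1_000, 10_000_000)):
    rate = annualized_failure_rate(failures, drive_days)
    low, high = approx_interval(failures, drive_days)
    print(f"{failures} failures: {rate:.2f}% ({low:.2f}% to {high:.2f}%)")
```

Both hypothetical fleets have the same mean rate, but the small one has a much
wider interval, which is what the mean alone hides.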

------
arekkas
Why aren't you using tapes? Wouldn't that be more suited for backups - larger
and fewer failures? You probably don't have a lot of reads anyways?

~~~
atYevP
Yev from Backblaze here -> Tape tends to have much higher read times, and can
even be more expensive than hard drives in some cases. Since we have Backblaze
B2 the data needs to be highly available.

------
joshdance
[https://www.backblaze.com/blog/](https://www.backblaze.com/blog/) \- 404
right now?

~~~
atYevP
Yev from Backblaze here -> We're kicking a couple of servers. Lots of load from
the traffic so we're putting top men on it! TOP MEN!

~~~
atYevP
Back now!

------
Shivetya
Where do retired hard drives go? Do you have in house recycling or contract it
out?

~~~
andy4blaze
In the Backblaze case, they are securely wiped clean and then recycled.

------
swang
What does Backblaze do with all the removed HDs that still work but maybe have
a ton of cycles on them? Are they just recycled, or resold?

~~~
brianwski
Disclaimer: I work at Backblaze. We securely wipe the drives, then we sell
them to a "used hard drive reseller".

------
executive
404 File Not Found

Not so reliable I gather

~~~
atYevP
Yev from Backblaze here -> Yea we're kicking a server or two, it's been
loading slow from all the traffic so we put top men on it. TOP MEN!

