
Backblaze Hard Drive Stats for 2018 - sashk
https://www.backblaze.com/blog/hard-drive-stats-for-2018/
======
zachruss92
I really appreciate Backblaze opening up this data. While I don't purchase
HDDs often, I always refer to these reports when deciding.

They also open sourced their server chassis which is awesome!

It's nice to see a reduction of the failure rates as a whole. It looks like
the next few years will be some interesting times for the growth in storage
capacities for HDDs.

~~~
atYevP
Yev from Backblaze here -> Glad you're enjoying the stats!

~~~
HankB99
Please thank those responsible on my behalf (and all of the others who study
the numbers before purchasing drives.)

~~~
atYevP
I'll let Andy know - he might be around here somewhere :D

------
fpgaminer
Tangentially related. Whenever I get a new drive, I always do a "burn-in". A
program writes data to the whole drive and then reads it back (reproducible
random data).

Is there any real justification for doing this kind of test on a new drive?

Doing it takes quite a while, so I've been wondering lately if it's even worth
it. I've never found anything with it.
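For anyone curious what such a burn-in looks like, here is a minimal sketch. It assumes Python 3.9+ (for `randbytes`) and writes to an ordinary file rather than a raw device; the function names and the 1 MiB chunk size are illustrative choices, not any particular tool's behavior. A real test would also need to bypass the OS page cache, which dedicated tools such as `badblocks` take care of.

```python
import os
import random

CHUNK = 1 << 20  # 1 MiB per write; an arbitrary choice for this sketch


def write_pattern(path, total_bytes, seed):
    """Fill `path` with `total_bytes` of seeded pseudo-random data."""
    rng = random.Random(seed)
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(rng.randbytes(n))
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # push the data out of Python and OS buffers


def verify_pattern(path, total_bytes, seed):
    """Regenerate the same stream and compare chunk by chunk.

    Returns the byte offset of the first mismatching chunk, or None
    if everything read back matches what was written.
    """
    rng = random.Random(seed)
    offset = 0
    with open(path, "rb") as f:
        while offset < total_bytes:
            n = min(CHUNK, total_bytes - offset)
            if f.read(n) != rng.randbytes(n):
                return offset
            offset += n
    return None
```

Because the data comes from a seeded generator, nothing has to be kept in memory between the write pass and the read pass: the same seed reproduces the same stream.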

~~~
conbandit
If you've never found anything with it, why do you keep doing it (see: the
definition of insanity)?

~~~
cataflam
Not the parent, but probably because

1. It's well known that hard drives have a higher failure rate at the
beginning of their lives than in the middle (see bathtub curve [0]). So it's
not absurd to test them hard early on, before writing any useful data, and to
do an early RMA if needed.

2. The failure rate on drives is low enough that his methodology may be right
even if he never sees a single failure in his life. That doesn't make it
insane.

[0]
[https://en.wikipedia.org/wiki/Bathtub_curve](https://en.wikipedia.org/wiki/Bathtub_curve)

~~~
philliphaydon
I once bought a new 1TB drive when they were fairly new. Moved about 500GB of
data to the new drive. Checked it. Seemed fine. Turned the computer off and
went to bed.

Next day the HDD didn’t turn on. Completely dead. :( I’ve never had a failure
since but I back up everything now.

------
sigi45
I wonder if any of those companies talk to Backblaze about it. Like sending
drives back for inspection :)

I'm also curious why those companies wouldn't directly talk to Backblaze. I
read somewhere a blog post on how they bought specific drives online at a
sale.

~~~
atYevP
Yev from Backblaze here ->

> I read somewhere a blog post on how they bought specific drives online at a
> sale.

Yea, we used to buy drives wherever we could, but that was years ago. We're
larger now so we go through more established channels.

~~~
hinkley
Also no natural disasters taking out the manufacturers again, right?

~~~
atYevP
Can't really count on those NOT happening - but we are more prepared now ;)

~~~
hinkley
Yeah I mean that was the impetus for buying them off the shelves, wasn’t it?
It was a crisis management technique not a business plan :)

~~~
atYevP
Yes, that's right. It was one of those "better think quick" scenarios and we
did what we had to in order to stay in business!

------
b3lvedere
Thank you Backblaze! I love your reports.

What is your procedure/policy on which disks to use in the pods? Do you try
and maybe control the risk by using different harddisk brands in a single
storage pod? Or do you just not care, because there have never been 3 pods
dead at the same time? :)

Do you still use 17 data plus 3 parity shards?

~~~
evil-olive
Their Q3 2018 stats had a bit of info on the lifecycle of introducing new
disks:

[https://www.backblaze.com/blog/2018-hard-drive-failure-rates...](https://www.backblaze.com/blog/2018-hard-drive-failure-rates/)

> In Q3 we added 79 HGST 12TB drives (model: HUH721212ALN604) to the farm.
> While 79 may seem like an unusual number of drives to add, it represents
> “stage 2” of our drive testing process. Stage 1 uses 20 drives, the number
> of hard drives in one Backblaze Vault tome. That is, there are 20
> Storage Pods in a Backblaze Vault, and there is one “test” drive in each
> Storage Pod. This allows us to compare the performance, etc., of the test
> tome to the remaining 59 production tomes (which are running already-
> qualified drives). There are 60 tomes in each Backblaze Vault. In stage 2,
> we fill an entire Storage Pod with the test drives, adding 59 test drives to
> the one currently being tested in one of the 20 Storage Pods in a Backblaze
> Vault.

~~~
b3lvedere
Thank you for the info! Much appreciated.

------
peterwwillis
I'm interested in failure rate per IOPS. If the drive fails infrequently,
great, but if it's also the worst performing drive, screw that. I would rather
buy drives that perform as well as possible with the lowest failure rate.

~~~
walrus01
In my recent experience there is not a lot of speed difference anymore between
multiple manufacturers' 6TB to 12TB sized hard drives, when comparing between
two competing products in the same rpm class (5400 or 7200) and areal density.
That assumes a similarly sized RAM cache on the drive, and not something like
a drive with 64GB of SSD cache (hybrid drive).

~~~
peterwwillis
I mean more like benchmarked performance. Two drives with the same specs may
end up performing differently. I realize benchmarks are not entirely realistic
and tuning can affect the outcome, but if there's a clearly outsized
performance difference between two seemingly equivalent products, I want the
one that performs better and fails the least. So, aggregate random read and
write operations per second over failure rate, as a general spec. (this might
also expose flaws in the stats, if one drive model is getting predominately
more of a certain operation which results in more failures)

------
krob
HGST looks like the best, but they don't have the quantities of the Seagates,
which makes me wonder if these numbers are skewed :/

~~~
simcop2387
They're not going to be skewed. HGST disks are usually more expensive, so that
limits the quantity that they buy. The Seagate disks are usually cheaper but
have a slightly higher (except when it's a brand new line) failure rate. When
filling out a single server I go with the HGST disks because the premium price
and quality means fewer failures, but it's more cost effective to go for lots
of Seagate disks when you have more redundancy and can eat more failures.

~~~
level
I built a server a few years ago, and I determined it was more cost effective
for a drive to fail than to use HGST. It would be more inconvenient, but
having a drive fail on a home server with only 6 drives didn't seem very
likely anyway.

~~~
jacobolus
If you have a 2%/year failure rate per drive, then that leaves you with a
nearly 12% chance that at least 1 of 6 drives will fail in a given year. Or a
31% chance that at least 1 of 6 drives will fail within 3 years. Or a 52%
chance that at least 1 of 6 drives will fail within 6 years.
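The arithmetic behind those figures can be sketched as follows. This assumes drive failures are independent and that the 2% annual rate stays constant over the years, which the bathtub curve mentioned elsewhere in this thread suggests is only an approximation. The function name is made up for illustration.

```python
def p_any_failure(afr, drives, years=1):
    """Chance that at least one of `drives` fails within `years`.

    Assumes independent failures and a constant annual failure rate
    `afr` (given as a fraction, e.g. 0.02 for 2%).
    """
    p_all_survive = (1 - afr) ** (drives * years)
    return 1 - p_all_survive


for years in (1, 3, 6):
    pct = 100 * p_any_failure(0.02, drives=6, years=years)
    print(f"{years} year(s): {pct:.1f}% chance of at least one failure")
```

The results land close to the 12%, 31%, and 52% quoted above, within rounding.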

~~~
sangnoir
The question then is, would it be cheaper to replace that one drive or get the
more expensive disks with lower chances of failing?

~~~
Arn_Thor
Isn't there a cost to time and convenience too? Buying a new disk takes time,
as does rebuilding the RAID. And during that time you are vulnerable to
another drive failure which could be disastrous if you only have one drive
redundancy, especially during the very intensive rebuilding process.

------
linsomniac
Anything interesting in this one? I've stopped reading them because they all
seem to be "Seagates fail kind of a lot but we use them because reasons. HGST
doesn't fail a lot, but we also have statistically insignificant numbers of
them, so <shrug>."

~~~
metalliqaz
"because reasons" is an overly negative way to say "because they offer the
best value"

They can get large numbers of them cheap, and their system is good at
detecting and replacing bad drives, so why not use them?

~~~
linsomniac
In my head it wasn't negative, it was that there are a lot of reasons that I
don't think needed to be gone into. Offering value is one, having systems that
are designed to minimize the impact of failures is another. Others that came
to mind when I wrote that include: Being able to get them at the quantities
they need, having systems that reduce the _COST_ of failures (drive
replacement and RMA), cost of RMAing 10 drives is < 10x the cost of RMAing 1
drive, having drives available in the SIZES Backblaze wants. And there are
others I could speculate about but don't have as concrete information on
(vendor relationships, manufacturer relationships, marketshare, firmware
quality/suitability, temperature).

Maybe it's just my social/business groups, but "because reasons" doesn't have
to have a negative meaning. I use it as more a statement of fact: There are
reasons for this.

~~~
metalliqaz
hmmm, I always thought "because reasons" was sarcastic, as in the person would
list reasons but they are all bullshit. Of course, sarcasm can be impossible
to discern correctly on Internet forums and social media. I admit I could be
completely wrong about this.

~~~
nathanlv
Sarcasm tends to depend on context, tone, and delivery. It is particularly
difficult to interpret in written commentary such as blogs like this.

------
icelancer
My company uses Backblaze since I think it's a good product, but blogs like
this really cemented my choice. I appreciate their attention to detail and
publishing data openly.

~~~
atYevP
Yev from Backblaze here -> That's awesome to hear! I'm glad you're with us!
That's one of the nice side-benefits of this blog and one of the reasons we
adopted an "open" policy with the Storage Pods. The first time we published
that post it was because folks didn't believe we could store data so
inexpensively - so it's nice to hear that we're building some trust along the
way!

------
Rebelgecko
I wonder if they have stats on failure rate based on a drive's manufacturing
or installation date? It looks like there's a bit of a bathtub curve, and it
would be interesting to see if that's attributed to individual drives having a
tendency to fail quickly (if they're going to), or if drives are less likely
to crap out once their model has been manufactured for a few years.

------
linux2647
Is there anything similar for SSDs?

------
mikece
While I've heard lots of great things about the Backblaze reports, I've
noticed in comments on NewEgg, Amazon, etc that the SKUs mentioned in the
report frequently aren't available anymore. I've never had problems with WD
Red drives, though I don't purchase them all from the same vendor at the same
time, to make sure I get drives from different lots in case of a lot defect.

~~~
wmf
By the time you have accurate reliability data on any equipment it's obsolete.
Maybe this will change with the slowdown of Moore's Law/Kryder's Law.

------
humantiy
Curious if anyone knows how they calculate drive days. For example, the first
drive on their report (HGST 4TB) has a count of 50 but total drive days of
23069. If I multiply the 50 drives by 365 days it should be 18250, so I'm not
sure where the extra 4k days are coming from. Retired drives or something?

~~~
tzs
I think it is over the time they have had the drive, not over the reporting
interval. It's a measure of the age of the drives. Assuming their drives are
operating 24/7, that means that those particular 50 drives have been in
service an average of 461 days.

I'd expect on next year's report, those particular drives will show up as 49
drives with around 42000 drive days, assuming they aren't replaced by then.

~~~
humantiy
If that is the case, then wouldn't the Annualized Failure Rate be based off the
year total, not the drive days, if it is total days in service? For example the
drive count(50)/drive days(23,236) gives the AFR of 1.58% which matches
their numbers. The drive days figure is more than the total possible for that
year.
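For what it's worth, Backblaze's posts describe the annualized failure rate as failures divided by drive years, where drive years = drive days / 365. A minimal sketch using the numbers from this thread follows. The failure count of 1 is an assumption (it is not quoted here), chosen because it reproduces the 1.58% figure.

```python
def annualized_failure_rate(failures, drive_days):
    """AFR in percent: failures per drive-year of service."""
    drive_years = drive_days / 365
    return 100 * failures / drive_years


drive_count = 50
drive_days = 23069   # figure quoted earlier in this thread
failures = 1         # assumption: not stated in the thread

avg_days_in_service = drive_days / drive_count
afr = annualized_failure_rate(failures, drive_days)

print(f"average days in service per drive: {avg_days_in_service:.0f}")
print(f"AFR: {afr:.2f}%")
```

Under that assumption, the AFR depends only on failures and drive days. The drive count enters only through how many drive days accumulate.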

------
akulbe
I realize my comment is a tangent. I'm hoping folks might be understanding and
hopefully have some advice.

If you have terabytes to back up, are there still any backup services left
that'll let you ship them a drive for a faster initial backup?

~~~
brianwski
Disclaimer: I work at Backblaze so I'm biased. :-)

> If you have terabytes to back up, are there still any backup services left
> that'll let you ship them a drive for a faster initial backup?

Backblaze offers a "Backblaze Rapid Ingest Fireball" to allow you to ship us
60 TBytes of data on an appliance.

[https://www.backblaze.com/blog/introducing-backblazes-rapid-...](https://www.backblaze.com/blog/introducing-backblazes-rapid-ingest-service-fireball/)

If you only have 2 - 10 TBytes, I suggest you get a faster network connection,
or carry your laptop to a location (like your work place, or a library, or
your neighbor's house) with a fast connection and just upload it. You might be
surprised how easy and fast it is to upload a couple of TBytes nowadays. Using
the Backblaze Personal Backup with 30 threads, I can upload about 1 TByte
every 12 hours or so. So if you can leave your laptop at your workplace for 4
days you can upload 8 TBytes, then bring your laptop back home for the
incrementals.
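As a rough check on what "1 TByte every 12 hours" implies for your connection, here is a small sketch. It assumes decimal terabytes (10^12 bytes) and a perfectly sustained transfer with no protocol overhead; the function names are just for illustration.

```python
def required_mbit_per_s(terabytes, hours):
    """Sustained upload speed (Mbit/s) needed to move `terabytes` in `hours`."""
    bits = terabytes * 1e12 * 8
    return bits / (hours * 3600) / 1e6


def upload_days(terabytes, tb_per_12h=1.0):
    """Days needed at a given rate expressed as TB per 12 hours."""
    return terabytes / tb_per_12h * 12 / 24


print(f"{required_mbit_per_s(1, 12):.0f} Mbit/s sustained for 1 TB per 12 hours")
print(f"{upload_days(8):.0f} days for 8 TB at that rate")
```

So the 4-day estimate for 8 TBytes holds if the uplink can sustain a bit under 200 Mbit/s the whole time.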

~~~
post_break
It's been over a year, can you please ask someone at Backblaze to add a single
line of code into the snapshots so you can leave a comment? Right now if you
make a snapshot, the only info is the date and time. Not the data, what's in
the snapshot, literally anything about it. Please!

~~~
brianwski
Disclaimer: I work at Backblaze.

> please add ability to comment on snapshots

Actually, very very quietly as part of the 6.0 release (4 days ago), we now
allow you to "name your snapshot".

While this is not a comment, and you can't change the name LATER (so useless
to old snapshots), at least going forward you can put up to about 1,000
characters of description in the snapshot name.

~~~
post_break
Thank you!

------
samstave
Backblaze spent ~12 million dollars on 12TB Seagate drives (at full retail).

~~~
metalliqaz
Good thing they charge me $0.94 a month for my B2 storage. Gotta make that
budget from somewhere!

~~~
brianwski
Disclaimer: I work at Backblaze.

> Good thing they charge me $0.94 a month for my B2 storage.

We thank you for your business! :-) The absolute beauty of Backblaze B2 (or
Amazon S3 or Azure) is that we can build a storage system at scale, and sell
off all the little pieces of that. You win because you get a fair price on a
sliver, and we win because we add up 100,000 customers like you and make about
$1 million per year.

The very best business is where the customer and provider are happy with the
relationship.

~~~
leowoo91
AWS nerd here, I recently googled you guys and found out you have 1/4 the
bandwidth cost already. See you soon for my upcoming project.

~~~
metalliqaz
I like my B2. I also like that Backblaze is one of those companies that does
one thing and does it very well.

------
alinde
Would be interesting to also have metrics on failure per TB storage.

~~~
theandrewbailey
I'm not sure how that would be useful, since terabytes don't fail. When a
drive fails, it's effectively a brick with no terabytes.

~~~
alinde
I was thinking that one failure of a 100TB disk has a very different impact
than 10 failures of 1TB disks. It'd give some idea of how much data is lost
due to failures, no?

~~~
oliveshell
I suppose, but there’s no such thing as a single HDD that stores 100TB. The
biggest you can get currently are (I believe) 14TB helium-filled drives.

~~~
jsgo
My guess is they meant 10TB as it would be a more "equal" comparison:

1 10TB drive

10 1TB drives
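One way to put numbers on that comparison, as a sketch only: assume every drive has the same annual failure rate (2% here, purely as an example), that failures are independent, and that there is no redundancy. The function names are hypothetical.

```python
def expected_tb_lost(drive_count, tb_per_drive, afr):
    """Expected capacity (TB) on drives that fail in a year."""
    return drive_count * tb_per_drive * afr


def p_all_fail(drive_count, afr):
    """Chance that every drive in the set fails in the same year."""
    return afr ** drive_count


afr = 0.02  # example rate, not taken from the report

one_big = expected_tb_lost(1, 10, afr)
ten_small = expected_tb_lost(10, 1, afr)
print(f"1 x 10TB: {one_big:.2f} TB expected on failed drives per year")
print(f"10 x 1TB: {ten_small:.2f} TB expected on failed drives per year")
print(f"chance of losing all 10TB at once: "
      f"{p_all_fail(1, afr):.3g} vs {p_all_fail(10, afr):.3g}")
```

Under those assumptions the expected terabytes on failed drives is the same either way. What differs is the spread: one big drive loses everything or nothing, while many small drives lose a little more often.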

~~~
oliveshell
That makes way more sense. Didn’t think before posting!

------
chemmail
Reliability looks decent this year. All my Seagates are having tons of errors,
I think I'll stick to the other team from now on even though Seagates seem to
be getting better.

------
Arn_Thor
Odd they don't have the 6TB HGST on the list. Well, maybe not odd, but
annoying since I'm curious about that drive's performance especially

------
GNU_IS_UNIX
Do larger capacity drives have higher failure rates?

~~~
simcop2387
It's usually not so much that larger drives are inherently less reliable, but
that the larger drives are also the newest lines and so yields and
manufacturing issues are more likely to crop up than in the older lines that
have had issues sorted out.

------
mark-r
A perennial favorite, literally. Thank you.

~~~
atYevP
You're welcome!

------
arcaster
TIL - Seagate drives are still basically shit :)

