
Hard Drive Stats for 2019 - sashk
https://www.backblaze.com/blog/hard-drive-stats-for-2019/
======
LeifCarrotson
Interesting how the numbers carry over year-to-year in

[https://www.backblaze.com/blog/wp-content/uploads/2020/02/Bl...](https://www.backblaze.com/blog/wp-content/uploads/2020/02/Blog_3_year_Drive_Stats_Chart.png)

Some models are dwindling. Some are being tested. Others (like the Seagate and
HGST 12 TB) are increasing. Only thing that's really perplexing is why they
keep buying more and more of the high-failure-rate Seagate 12 TB drives. It
must be more than 3% cheaper to buy (and service!) a Seagate with a 3% chance
of failure than to buy an equivalent HGST with a 0.4% chance of failure. I
guess when you have 120,000 drives, easy hot-swap enclosures, and software to
handle it all that makes good sense! But as an individual consumer, even with
a Backblaze backup, it's definitely worth my time to spend a bit more on a
drive that's far more reliable than to save a few dollars on a Seagate.
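
A rough way to put numbers on that trade-off is to compare expected first-year cost per drive. The sketch below is only an illustration: the drive price and the per-swap service cost are hypothetical, and it assumes a failed drive is replaced at full price with no RMA credit. Only the 3% and 0.4% failure rates come from the comment.

```python
def expected_first_year_cost(price, afr, service_cost):
    """Purchase price plus the expected cost of replacing the drive once in year one.

    Assumes a failed drive is replaced at full price (no RMA credit) and that
    each swap costs `service_cost` in labor.
    """
    return price + afr * (price + service_cost)


def break_even_price(reference_price, reference_afr, candidate_afr, service_cost):
    """Price at which the less reliable drive matches the reference drive's expected cost."""
    target = expected_first_year_cost(reference_price, reference_afr, service_cost)
    # Solve price * (1 + candidate_afr) + candidate_afr * service_cost == target for price.
    return (target - candidate_afr * service_cost) / (1 + candidate_afr)


# Hypothetical inputs: only the two failure rates come from the comment above.
HGST_PRICE = 400.0
SERVICE_COST = 20.0
HGST_AFR = 0.004
SEAGATE_AFR = 0.03

seagate_break_even = break_even_price(HGST_PRICE, HGST_AFR, SEAGATE_AFR, SERVICE_COST)
required_discount = 1 - seagate_break_even / HGST_PRICE
print(f"break-even Seagate price: {seagate_break_even:.2f}")
print(f"required discount: {required_discount:.1%}")
```

With these made-up inputs the required discount comes out a bit under 3%, and it shrinks further if failed drives are replaced under warranty, which would push the replacement price toward zero.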

~~~
sixothree
> Only thing that's really perplexing is why they keep buying more and more of
> the high-failure-rate Seagate 12 TB drives.

I am guessing they RMA the drives and get replacements.

~~~
universenz
Your comment just sparked an interesting question in my mind: If a drive has
failed, until now I always imagined the drive was just trashed. But now that
you mention they are probably RMA'ing them, do you think that BackBlaze send
the RMA drives through a magnetic tunnel of some sort before they ship the
drives back to the manufacturer? Because otherwise, how do they ensure
potentially unencrypted customer files are not accessed during the
repair/refurbishment process?

~~~
anarazel
I'd hope that their data is all encrypted at rest. Compared to the bandwidth
of spinning disks, the cost of doing hardware assisted AES isn't big.

~~~
m4rtink
Yeah, I would expect any data reaching the drives to be encrypted by
Backblaze, with the key never reaching the disk.

You could even have keys per disk and wipe them when a disk fails.

Either way, you should be fine to RMA the drives as for an external observer
without the keys they just contain random noise.

------
alberth
Why do people use Amazon S3 when Backblaze B2 is 1/4 the cost of S3 and also
includes a CDN for free? You also get way faster access speeds with Backblaze
vs Amazon, since Amazon tiers its IO speeds.

[https://www.backblaze.com/b2/cloud-storage.html](https://www.backblaze.com/b2/cloud-storage.html)

~~~
cheeze
Last I checked, Backblaze still stores most data in 1 location, no?

So, durability of data (which to be fair doesn't matter for most s3 use
cases), and interop with literally everything else in AWS

Intelligent data tiering

Actual access control

Pre signed URLs

~~~
chocolatkey
I've combined cloudflare workers with backblaze to implement etags, signed
URLs, etc. Backblaze is part of CF's bandwidth alliance so your bandwidth fee
is zero. This makes for a very low monthly cost

~~~
SergeAx
Can you elaborate further about this setup? Is there an article or a FAQ topic
about it?

~~~
chocolatkey
Hi, I haven't had time to write up about this, however, I have dumped the
majority of the related code here for you and others who are interested in
this solution:
[https://gist.github.com/chocolatkey/a7ef0364e357629e9875521d...](https://gist.github.com/chocolatkey/a7ef0364e357629e9875521ddc8bc03a).
That should help you get started. It includes HMACSHA256 shared secret URL
signatures based on IP, expiry, and optional path scope restriction, caching,
ETAGs, sentry error reports, access to non-B2 data from a server w/ basic
auth, and more... URLs look like this:
[https://example.com/delivery/UNIQUE_ID/p-001.jpg?token=16fb4...](https://example.com/delivery/UNIQUE_ID/p-001.jpg?token=16fb4112744c20a1f71f2730072f9457612a90ce00cdb6a1ccdf639e4b31966e&expires=5e450200&scope=%2Fdelivery%2FUNIQUE_ID%2F)
. My B2 bucket is public, however the requested path is also hmac'd with a
secret known only to the CF worker to derive the path of the resources in the
bucket. It is optimized for my use case of serving EPUB data. I do not
guarantee it to be free of flaws, but it's worked well so far.

~~~
SergeAx
Thanks a lot!

------
UI_at_80x24
I live for these reports. Always insightful and professional. Thank-you SO
MUCH for publishing this data.

~~~
atYevP
Yev here -> You're welcome! The conversation's always fun :D

~~~
donmcronald
I barely had time to skim it, but I'm not sure I like how the ST12000NM0008
shows up in the table. I find it really hard to reason about what the real
failure rate could end up being on those drives. For example, you've got about
45 days average on each drive, so the failure rate is multiplied by roughly 8
to extrapolate the annualized failure rate. Doesn't that overstate the
estimated rate of failure since drives will tend to fail more often at the
start of their life?

I only guesstimated out of the table and didn't have time to look at the
actual data, so it's possible I misread something.
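
For reference, the extrapolation being described can be sketched as below. As far as I understand Backblaze's published definition, AFR is failures divided by drive-years, times 100. The drive and failure counts here are hypothetical; only the roughly 45 days of average service comes from the comment.

```python
def annualized_failure_rate(failures, drive_days):
    """AFR in percent: failures per drive-year of service, times 100."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    return failures / (drive_days / 365) * 100


# Hypothetical counts for a recently deployed model.
drives = 7000
avg_days_in_service = 45
failures = 10

drive_days = drives * avg_days_in_service
raw_failed_pct = failures / drives * 100
afr = annualized_failure_rate(failures, drive_days)

print(f"failed so far: {raw_failed_pct:.2f}% of drives")
print(f"annualized:    {afr:.2f}%")
print(f"multiplier:    {afr / raw_failed_pct:.1f}x")
```

The multiplier is just 365 / 45. Because the formula treats the failure rate as constant over the year, a model whose failures cluster in the first weeks would indeed look worse when extrapolated from a short observation window.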

------
magnat
Does anyone remember what is their definition of "drive failure"? Is it SMART
"failure imminent" report, single uncorrectable read error or complete data
loss for a whole disk? I recall reading about it in one of their previous
reports, but can't find it again.

EDIT: nevermind, found it.

"Backblaze counts a drive as failed when it is removed from a Storage Pod and
replaced because it has 1) totally stopped working, or 2) because it has shown
evidence of failing soon.

A drive is considered to have stopped working when the drive appears
physically dead (e.g. won’t power up), doesn’t respond to console commands or
the RAID system tells us that the drive can’t be read or written."

[https://www.backblaze.com/blog/hard-drive-smart-stats/](https://www.backblaze.com/blog/hard-drive-smart-stats/)

------
anonsivalley652
Ah yes, the reliable BackBlaze folks. That they've out-Googled Google in a
niche using mostly commodity infrastructure and kept their business alive for
so long is a testament to their ingenuity (I wonder how their operating costs
compare with AWS Glacier which has a theoretical advantage of unpowered
disks.). And the releasing of this proprietary operational business data is a
testament to their coolness factor.

It's a timely article as I'm looking at HC530's (WUH721414ALE6L4 /
WUH721414ALN6L4 (wiredzone carries it)) for a home FreeNAS box:

- any relatively-modern enterprise 4U 3.5" storage box with Xeon 4 cores or
so

- quieter, high-volume fan mod

- RAM: 64-128 GiB, beyond that isn't useful unless deduping

- NIC: X710-T4L 4x 10GbE copper NIC

- ZIL: mirrored pair of high-endurance, write-intensive, reliable SSD like
Optane 900p/905p 280-480GB

- L2ARC: striped pair of read-intensive/larger SSDs like the Gigabyte Aorus
Gen4 1 TB

This will fit nicely as my home NAS for a water-cooled dual EPYC virtualized
server/workstation build underway. I managed to get a single water block with
(3) G1/4 connections that will cool both CPUs and the VRM chokes/converters.

If anyone has better suggestions, please chime in.

~~~
tbrock
Why does someone need something like this? I ran a home Synology NAS at one
point but it wasn’t worth the trouble. Let others run and maintain those hot,
loud, power hungry disks.

~~~
anonsivalley652
Then you already made the mistakes of:

- conflating trouble for you with trouble for me, which it clearly isn't

- not owning your own data

- paying more to store it

- paying to access it

- giving up the ability to keep things that aren't worth storing on paid
clouds but cost little to keep on cheap drives

Furthermore, there are additional network costs such as AWS network charges
AND home ISP data limits.

And there are other uses, such as:

- backing-up VMs

- backing-up computers

- caching package and source code repos

- backing-up CCTV footage

- and whatever else comes along

------
newscracker
Slightly off topic: is anyone using B2 (which seems cheaper if you have more
than one computer for a certain amount of data) for personal data backups with
strong client side encryption across multiple platforms (Linux, Mac, Windows)?
If so, how do you handle it?

~~~
orhanhh
I use Arq on two macs and it works very well with B2.

~~~
zippergz
I use and like Arq also, but the OP asked for something that covers Linux,
which I believe Arq does not.

------
ksec
Looking at the data,

it seems they will soon reach 1000 PB / 1 EB.

The top 5 annualised hard drive failure rates are all from Seagate. All drives
from Hitachi and Toshiba have an AFR lower than 1%.

So basically don't buy Seagate.

~~~
AdamGibbins
> So basically don't buy Seagate.

Or do, because they're cheaper than the competition and modern systems can
handle failures.

~~~
antonyh
True, with enough redundancy it's fine, and if they have special terms with SG
such as free replacements and heavy discounts then it's little wonder they use
so many.

Myself though, for SoHo use, I'm willing to pay more for less stress because I
don't have the sheer volume of devices, and time not spent replacing drives is
time spent doing something useful instead of shuffling HDDs and rebuilding RAID
arrays. A 5% saving on a handful of drives is not worth it, but a 40% saving on
thousands makes them competitive.

------
mherrmann
Does anyone here have experiences with BackBlaze's B2 service for hosting
files? I'm considering switching to it from S3 because it is much cheaper. (I
need to transfer 2-3TB / month, usually in 2-3 bursts of worldwide
distribution).

~~~
atYevP
Yev from Backblaze here -> We're definitely more affordable and our
integrations
([https://www.backblaze.com/b2/integrations.html](https://www.backblaze.com/b2/integrations.html))
make it easy to get your data to us. We even have partnerships with companies
who can help transfer data from S3 into Backblaze B2!

~~~
syedkarim
How is Backblaze able to be _so_ much cheaper than the other, larger
competitors? I assume Amazon/Google/Microsoft have squeezed every last cent
from suppliers and also have highly cost-optimized staffing.

~~~
atYevP
Yev here -> great question! We are a bootstrapped company and we focus on
inexpensive storage
([https://www.backblaze.com/blog/vault-cloud-storage-architect...](https://www.backblaze.com/blog/vault-cloud-storage-architecture/)).
Because we've built a robust system that doesn't use a ton of
expensive components we can provide hot cloud storage (B2 Cloud Storage) and
computer backup at an affordable rate while still making decent margins. To
learn more about our business and decision making, we have a pretty cool
series of entrepreneurship blog posts that might be interesting to some:
[https://www.backblaze.com/blog/category/entrepreneurship/](https://www.backblaze.com/blog/category/entrepreneurship/)

~~~
ChrisSD
Reading about b2 pricing it says, you get "10GB of free storage, unlimited
free uploads, and 1GB of downloads each day". Doesn't that amount to
essentially free backups for (reasonable) personal use? Or am I missing
something?

~~~
jl6
I think even casual users tend to have more than 10GB of data these days.

~~~
ChrisSD
I don't. Although I can easily fill up a terabyte drive, little of that is my
own personal files that I need to keep if the drive blows up. Most of my stuff
is source code, documents/notes and some photos (with photos being the only
thing that takes up significant space). Almost everything else I can re-
download or rebuild from the original source as and when I need it.

------
donatj
I have made all my hard drive purchasing decisions based almost entirely on
these reports for the last couple years and have not been disappointed with
the results.

------
metalliqaz
I use Backblaze's massive infrastructure to store pictures of my keyboard.

~~~
atYevP
Is it a cool keyboard?

~~~
mherrmann
Are the keys very large?

------
exabrial
I still can't believe BackBlaze gives this data away for free. Seems like
something they should be selling to other cloud providers

~~~
Mirioron
Maybe they consider this report to be an ad for their services? The name
recognition this report gives them is probably quite valuable.

~~~
icelancer
It also pressures HDD companies to make better products and appear higher on
these lists, which is good for Backblaze.

------
robertoandred
I love Backblaze, but their log package in my Library folder has grown to
something like 10 gigs. Wish there was a way around that.

------
dleslie
Signed up for this a week ago. 45 days remaining to upload.

Hurray for Canadian internet.

~~~
UI_at_80x24
This may not apply to you, but at least 2 of the underdogs in the Canadian ISP
world (MNSi, & TekSavvy) have been rolling out Gigabit fiber.

I've got a 1Gb fiber pipe for 1/10th the cost that Cogeco was charging.

------
sillyquiet
Not a really related remark about this handy study, but is anybody else still
in real awe at how spoiled we are with regard to the sizes and speeds of HDs
nowadays? I mean the smallest capacity drive on their chart is 4 _Tera_ bytes.

~~~
zaat
Not feeling spoiled at all, not at all. Especially not with 2 to 3 percent of
failure rate. The failure rate I experienced in my workstation makes me worry
about not having raid 1 or 10. HDs for 9 TB in raid 10 are not that cheap.

But the bigger issue is that the warranty terms for HDs nowadays are down to 2
or 3 years, so this investment is short-lived. It also tells you something
about the manufacturers' own reliability estimates for their products.

~~~
AnIdiotOnTheNet
Can't say I agree with that sentiment. The fact that I can quite reasonably
have a 30TB usable RAID5 NAS array makes me feel pretty spoiled. Then again,
I'm old enough that my first HDD was 10MB.

~~~
chousuke
I'd be wary of making a RAID5 array with drives that big; you could easily
lose another drive from the I/O caused by a rebuild; though if you have
backups (you should) then it's probably an acceptable risk for non-critical
data.

~~~
AnIdiotOnTheNet
I'd agree with that. Even 2-disk redundancy these days is a bit dangerous when
you're talking about 14TB drives and 100+TB arrays. As is often stated: RAID
is not backup.

------
Siecje
I have about 10 TB of video files. I use BackBlaze for Windows but I would
like the files to be available on other computers and my phone in my local
network.

What can I use to do this and still keep offsite backups?

~~~
mrguyorama
I think their more premium plans offer sharing

------
ronnier
I have two ST12000VN0007 (VN) Seagate drives. The report shows the
ST12000NM0007 (NM) has a 3.32% failure rate. I wonder how closely related the
VN and NM models are.

~~~
war1025
If you look, that drive model is also the most highly used by far. I think
it's just a matter of the larger sample size / use time.

~~~
JohnJamesRambo
Surely it doesn’t matter when you have 10,000s of drives? Aren’t you already
at a large enough sample size? If it isn’t, what is the point of them
publishing this every year? I don’t know the math of the matter though.

~~~
war1025
Yea I don't know. I'm not big on statistics either. I just noticed that the
drives that did the worst were the ones that had the most usage overall.

~~~
labawi
That would probably be price ~ failure rate correlation.

------
S3raph
I'm a very happy customer, but please do something about your mobile app
(android) it's really horrible.

~~~
leokennis
I agree the mobile app (on iOS in my case) is at best an afterthought, and
most likely not even a high ranking afterthought.

However, out of curiosity...what would you imagine a better Backblaze mobile
app would do?

~~~
S3raph
For sure it is/should not be high priority, but releasing such an app in 2020
for sure does not reflect the great skills of the backblaze team. At least
show me some basic stats, account settings and invoices. You can only download
files from your buckets and that's it.. really?

------
rosstex
Semi-bummed my school partnered with another backup company, cause I'd love to
support BackBlaze.

~~~
atYevP
Yev here -> Thanks! Out of curiosity, does your school provide backup to all
the students?

~~~
rosstex
To all grad students and faculty:

[https://csguide.cs.princeton.edu/hardware/backup](https://csguide.cs.princeton.edu/hardware/backup)

------
voldacar
Seagate always seems to have much higher failure rates compared to
HGST/WD/Toshiba etc.

Does anyone here know the exact reason why? I assume there are enough people
on this site who have worked for them or a competitor :)

~~~
rasz
30% profit margin, why change anything?

------
throwaway17_17
Does anyone have opinions on or experience with using Backblaze as
personal-only cloud storage and offsite backup for smaller amounts of data
(under 30 TB)?

~~~
newscracker
> for smaller amounts of data (under 30 TB)

Did you mean to say 30 GB or 30 TB? Calling 30 TB a "smaller amount" seems
weird to me in 2020, especially for personal data. Perhaps it would be the
norm in a couple of decades. :)

FWIW, I have way under 1 TB of personal data to backup to different locations,
and I consider that to be relatively large.

~~~
throwaway17_17
I did mean 30 TB, I have approximately 12 TB of data currently between all of
my storage for video, audio, books, and games. However, I have been avoiding
doing a lot of conversions to digital media from my physical collections
because I'm just unsure of running a full blown archival server at home. I
would estimate if I converted my entire video library to 4k it would put me
somewhere over 10 additional TB. My comic books/manga and graphic novels,
upgraded to archival resolution would probably run over 10 TB as well. Then
there is the soon to be required ripping of PS2/PS3/WIIU roms when those
hardware units become less reliable for actual playing. So I think that 30 TB
of storage would do for the time being for me, but I think I will eventually
need more than that.

TL;DR I am a digital hoarder, so I've convinced myself I do in fact need 30+ TB
of storage.

------
generalpass
I have wondered about system downtime or time operating in a degraded state.

My understanding is other than mirrored, RAID configurations may take a long
time to rebuild on the larger drives and this is a contributing factor to why
the highest sales volume of drives has been 'stuck' at 4TB (thus the lower
$/GB price).

~~~
linsomniac
They don't use traditional RAID setups there. My understanding is they use a
proprietary data encoding and distribution, which is more accepting of
individual drive failures and reduces rebuild times. I believe I've heard they
use something more like erasure coding rather than RAID-5.

~~~
ddorian43
[https://www.backblaze.com/blog/reed-solomon/](https://www.backblaze.com/blog/reed-solomon/)

There are many open source libraries.
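
To illustrate the idea of erasure coding mentioned above, here is a deliberately simplified Python sketch. It uses a single XOR parity shard, which is the simplest erasure code and can rebuild only one lost shard. It is not Reed-Solomon and not Backblaze's implementation; Reed-Solomon generalizes the same idea to several parity shards. The function names are made up for this example.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data_shards):
    """Return the data shards plus one parity shard (all shards must be the same length)."""
    parity = reduce(xor_bytes, data_shards)
    return list(data_shards) + [parity]


def recover(shards):
    """Rebuild the single missing shard (marked None) by XOR-ing the survivors."""
    missing = [i for i, shard in enumerate(shards) if shard is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one missing shard")
    survivors = [shard for shard in shards if shard is not None]
    rebuilt = list(shards)
    rebuilt[missing[0]] = reduce(xor_bytes, survivors)
    return rebuilt


stored = encode([b"abcd", b"efgh", b"ijkl"])
damaged = list(stored)
damaged[1] = None  # simulate one failed drive
restored = recover(damaged)
print(restored[1])
```

Each shard would live on a different drive, so losing one drive only requires reading the survivors to rebuild it.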

------
jmnicolas
Any particular reason they don't use Western Digital drives?

~~~
volkl48
I will point out that HGST is owned by Western Digital and all their products
are being rebranded to WD.

------
gesman
So what do these mean:

smart_177_raw

smart_177_normalized

smart_233_raw

smart_235_normalized

???

~~~
thenewwazoo
They're S.M.A.R.T. attributes:
[http://www.cropel.com/library/smart-attribute-list.aspx](http://www.cropel.com/library/smart-attribute-list.aspx)

~~~
gesman
Thank you!

