
What Hard Drive Should I Buy? - nuriaion
http://blog.backblaze.com/2014/01/21/what-hard-drive-should-i-buy/
======
brokentone
Very cool that Backblaze continues to post things like this. Few people have
this experience. Of those who do, few (I assume) break it out to this level of
detail, actionable for others. Of those who have good experience and records,
most would consider it proprietary or just decide not to post. Kudos to
Backblaze.

~~~
atYevP
We're all on this crazy spinning ball together. Early on Backblaze made the
decision to stay out of the hardware business (it sounded lucrative after we
developed the v1 storage pod, but everyone was a software person, so we went
that route), so we like to pay our hardware findings forward. It's a good thing
to do!

~~~
Zancarius
I really like this detailed write up. A certain large company in Mountain View
suggested years ago that they had collected a great deal of metrics on
consumer grade hard disks performing in a data center capacity, but refused to
publish the data.

The fact that Backblaze isn't embarrassed to share this with the rest of us
gives me the warm fuzzies. Thanks for the hard work!

I recently purchased a new Seagate, and I'm feeling a bit of buyer's remorse.
That's not to say this article triggered the feeling completely (though it did
push me over the edge). Taking this information and the knowledge that recent
model Barracudas appear to have aggressive APM that loads/unloads the heads
after a minute or so of inactivity (unless disabled, but it doesn't persist
between reboots) AND seems to borrow from laptop-grade design... I think I'd
be better off replacing it. Again. ;)

I wonder if the poor longevity of Seagate drives is due to changes in quality
control or design? I have older ones that don't utilize the APM load/unload
features and are still running fine after many years. Then again, they were
also pre-flood drives.

I'd also be curious to know if there's a correlation with where the drive was
manufactured and its survival rate, though I suspect that might be a much more
difficult metric to collect.

~~~
AceJohnny2
> A certain large company in Mountain View suggested years ago that they had
> collected a great deal of metrics on consumer grade hard disks performing in
> a data center capacity, but refused to publish the data.

They did in 2007:

[http://static.googleusercontent.com/media/research.google.co...](http://static.googleusercontent.com/media/research.google.com/en/us/archive/disk_failures.pdf)

(first result of a search for "google hard drive report")

~~~
isomorphic
That report does not have data about individual manufacturers. Perhaps Google
considered that to be lawsuit bait, or perhaps they considered it to be
proprietary. Either way Backblaze's post has actionable information about
which manufacturer to buy with respect to reliability trade-offs. Google's
report does not.

~~~
unabridged
>Perhaps Google considered that to be lawsuit bait, or perhaps they considered
> it to be proprietary.

I don't think it has anything to do with being afraid of lawsuits (or every
consumer review company would be out of business) or worried about competition
getting that information. They most likely get some kind of volume pricing
discount (or even pricing based on experienced reliability) that is dependent
on them not releasing such data.

------
justin66
> The drives that just don’t work in our environment are Western Digital Green
> 3TB drives and Seagate LP (low power) 2TB drives. Both of these drives start
> accumulating errors as soon as they are put into production. We think this
> is related to vibration. The drives do somewhat better in the new low-
> vibration Backblaze Storage Pod, but still not well enough.

Another reason to avoid the WD Green 3TB: these drives aggressively put
themselves to sleep to save power. It's literally a matter of streaming a
video from disk and if the OS caches enough of the file, the drive will see
there haven't been any accesses in a few seconds and stop spinning.

The video will of course glitch when the cached data runs out and the drive
needs to spin up. Great design.

~~~
B-Con
Don't use the Green line for _anything_ performance related, even if it's
watching videos. The Green line is simply not intended for that. I think it's
designed to be an occasionally-accessed backup drive, or something like that.

The Black line is the performance-conscious line, look at the corresponding
equivalents in that line. IIRC the Blacks are 10-15% more expensive, but they
actually respond.

~~~
IgorPartola
I recently bought two Green drives for my NAS not knowing the difference. Do
your research :).

Having said that, the NAS is working out pretty well so far with ZFS running
on top of them.

~~~
erkkie
Like others have mentioned, please make sure you're using wdidle to regulate
head parking. Also might wanna read this:
[http://forums.freenas.org/threads/western-digital-red-with-t...](http://forums.freenas.org/threads/western-digital-red-with-tler-vs-green-with-wdidle-set-to-disabled-drives.11280/).
Greens are dangerous in a NAS context, so make sure you have backups.

~~~
IgorPartola
Odd. The link you posted directly contradicts your point. It says that Reds
are at that time unproven and that the Greens are the way to go for ZFS.
Basically it is saying that ZFS does not require TLER so that part is not an
issue. On top of that it is saying that the Green drives are both common and
well tested.

~~~
erkkie
I think I read it slightly differently then, the gist was that if wdidle is
properly used, the drives "should" be equivalent (since they supposedly share
the same hardware, just different firmware).

Reds are made for a NAS context, greens are not. You may or may not make the
difference go away by using wdidle. This does make greens inherently dangerous
(when compared to reds).

------
deltaqueue
From the article and throughout the comments here it seems Backblaze prefers
cheaper drives over a few percentage points of reliability. It would be
interesting to see some data showing the tradeoff, but I suspect it reveals
too much of their operation. At first glance it appears you can get a drive
with .9% failure rate (HGST 7K3000) for $127[1], and yet BB really likes the
WD Red, which has a higher failure rate (3.2%) and cost[2].

What might shed light without revealing too much is information about where
they source drives today (their sourcing coverage during the shortage was very
cool!). I suspect they're finding some nice bulk discounts somewhere.

[1] [http://www.amazon.com/Hitachi-Deskstar-7K3000-HDS723030ALA64...](http://www.amazon.com/Hitachi-Deskstar-7K3000-HDS723030ALA640-SATA-600/dp/B004D8KO7Y)

[2] [http://www.amazon.com/WD-Red-NAS-Hard-Drive/dp/B008JJLW4M/](http://www.amazon.com/WD-Red-NAS-Hard-Drive/dp/B008JJLW4M/)

(both seem to be market consumer prices)

~~~
brianwski
> What might shed light without revealing too much is information about where
> they source drives today

Backblaze employee here -> we are willing to buy from anybody, we have no
loyalty. Lowest price (for a particular drive model) always wins. Once per
month we ask about 20 common suppliers for their "best price". We have bought
from "B&H Photo Video", NewEgg, Amazon, etc among others. We're always willing
to add more possible vendors, but I think we drop you from the list if the
vendor bid prices don't even come close for 3 months - that means you don't
understand anything and you're wasting our time.

~~~
scholia
"Backblaze employee here"

CTO, unless that's changed....

~~~
brianwski
Guilty as charged. :-) CTO, head janitor, the company lived in my 1 bedroom
apartment's living room for 3 years and up to the first 9 employees.

------
hackinthebochs
Definitely wish I had seen this a couple months ago before I bought two 3TB
seagates. Although to be fair I was already pretty sure that seagates sucked
(it's good to see data backing that up) but getting two for $85 each was too
hard to pass up. I'm a sucker for a deal. I buy HDs in pairs now so I'm not
too worried about losing anything.

I am intrigued by backblaze's service though. A part of me feels like there
_must_ be a catch somewhere. I have a good 10TB I'd be happy to pay $5/month
to backup but somehow I feel like they'd pull a comcast and say their
"unlimited" claim doesn't apply to the 1% of users (or in this case maybe the
.001%).

~~~
stdbrouw
The catch is that the initial upload can take a while if your internet
connection has limited upstream – not something Backblaze can help but I'm
sure it has made some people think twice before having their external drives
mirrored in the cloud.
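
To put a rough number on "a while", here is a back-of-the-envelope sketch. The 10 TB figure is the data set mentioned upthread; the two upstream rates are illustrative assumptions, and protocol overhead, compression and deduplication are all ignored.

```python
def upload_days(data_bytes, upstream_bits_per_second):
    """Days needed to push `data_bytes` through a link of the given speed."""
    seconds = data_bytes * 8 / upstream_bits_per_second
    return seconds / 86_400  # seconds per day

# Assumed link speeds, chosen only for illustration.
days_slow = upload_days(10e12, 1.5e6)   # 10 TB over a 1.5 Mbit/s upstream
days_fast = upload_days(10e12, 100e6)   # 10 TB over a 100 Mbit/s upstream

print(f"1.5 Mbit/s: about {days_slow:.0f} days")
print(f"100 Mbit/s: about {days_fast:.1f} days")
```

On the slow link the initial upload is measured in years, on the fast one in days, which is why the upstream speed matters far more than anything else here.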

~~~
brianwski
One idea that works for some people is to carry your entire computer to a
"fast connection", for example your workplace. Plug it in, leave it running
for 24 hours to power through the initial upload, then carry your computer
home again for the incrementals. Just an idea.

~~~
arielweisberg
My experience has been that the upload speed is slow no matter what connection
you are on. They claim not to throttle, but I am skeptical that you will get
more than 1.5 megabit anywhere. I couldn't.

They actually ship a bandwidth test that goes to their datacenters. If I ran
multiple instances of the test at once I got 1.5 megabits for each instance.
It's not a bandwidth or capacity issue...

If they documented 1.5 megabits I wouldn't complain, but claiming one thing
and shipping a product that throttles, whether explicitly or implicitly, is
obnoxious.

I did a restore recently and was pleasantly surprised to get 9 megabytes/sec
down so that is much better and restore speed is a higher priority.

However if I had to upload the entire data set from scratch I would stop using
Backblaze.

~~~
brianwski
I work at Backblaze -> we absolutely do _NOT_ throttle.

Inherently each pod has some built in limitations, for example it has a 1
Gbit/sec network card, so you won't be able to exceed that. But many, many
customers get 100 Mbits/sec upload speed, I can show you our internal chart if
you like.

I'm glad you got a fast restore download, we have been BATTLING Comcast in
recent weeks, between 5pm and midnight downloading restores from Comcast has
been trickle slow. Some people claim this is due to Netflix traffic, but I
think some system admin somewhere is not doing their damn job. We think we
have finally figured out a work around just this morning...

------
cbr
Any stats on power consumption? Over 5 years the difference between a drive
that uses 6 Watts and one that uses 7 is 44kWh or about $5. Double that to
include cooling costs and saving a Watt should be worth something like $10 to
you, so a more expensive more efficient drive could be worth it. Do these
drives all use similar amounts of power?
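
For anyone who wants to check the arithmetic, a minimal sketch. The electricity price of $0.11/kWh is an assumption (only the rounded $5 result is given above), and the function names are made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # leap days ignored

def extra_energy_kwh(watts_delta, years):
    """Energy used by a constant extra draw of `watts_delta` watts."""
    return watts_delta * HOURS_PER_YEAR * years / 1000.0

def extra_cost_usd(watts_delta, years, usd_per_kwh=0.11, overhead_factor=1.0):
    """Cost of that energy; `overhead_factor` scales it to include cooling."""
    return extra_energy_kwh(watts_delta, years) * usd_per_kwh * overhead_factor

kwh = extra_energy_kwh(1, 5)                               # 6 W vs 7 W over 5 years
cost = extra_cost_usd(1, 5)                                # drive power only
cost_with_cooling = extra_cost_usd(1, 5, overhead_factor=2.0)  # "double that"

print(f"{kwh:.1f} kWh, ${cost:.2f}, ${cost_with_cooling:.2f} with cooling")
```

The `overhead_factor` of 2.0 is the "double it for cooling" guess; a different cooling estimate just changes that one parameter.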

~~~
ars
Cooling costs are an extra one third (and even that is only during the
summer), not even close to double.

~~~
gibybo
Do you have a source or at least an explanation for that? I've always thought
it took at least as much energy to cool something as it did to heat it
(because AC is not 100% efficient), but I've been pretty wrong about
thermodynamics in the past so I'd like to dive a bit deeper here.

Maybe this is only true for AC, and not water cooling being pumped to passive
radiators on the roof?

EDIT: Here:
[https://www.google.com/about/datacenters/efficiency/internal...](https://www.google.com/about/datacenters/efficiency/internal/),
Google is claiming a 'Power usage effectiveness' of 1.11 across their data
centers. This implies a cooling cost <= 11% of their server costs which I
found quite surprising.

~~~
ars
It's pretty easy to calculate. Just find a window A/C and check the BTU it's
rated for and the wattage.

Convert BTU to watts, then do wattage to power the A/C divided by the watts of
cooling power.

Then note that this is the worst-case scenario - normally it does better than
that except on the very hottest days of the year.
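
As a sketch of that calculation, using a hypothetical window unit rated at 10,000 BTU/h that draws 900 W (both numbers are assumptions for illustration, not taken from a real spec sheet):

```python
BTU_PER_HOUR_TO_WATTS = 0.29307107  # 1 BTU/h expressed in watts

def cooling_overhead(rated_btu_per_hour, input_watts):
    """Watts of electricity spent per watt of heat removed."""
    cooling_watts = rated_btu_per_hour * BTU_PER_HOUR_TO_WATTS
    return input_watts / cooling_watts

# Hypothetical unit: 10,000 BTU/h of cooling for 900 W of input power.
overhead = cooling_overhead(10_000, 900)
print(f"cooling overhead: {overhead:.0%} of the heat load")
```

With those assumed ratings the overhead comes out at a bit under a third of the heat load, in line with the figure given above.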

------
freshyill
I wonder if any of this actually applies to consumer-grade drives.

My wife's hard drive actually just died. It was 160 GB WD in a _black_ 2006
MacBook. The drive itself was a replacement from 2007 since the original drive
died just over a year into its life.

Stupidly, since her Time Machine backup was misbehaving, I reformatted it and
set it to start over. I spent the weekend recovering her data—with a lot of
success, so no big deal. At any rate, this machine is long past its expiration
date. It's time for a MacBook Air with an SSD, once the tax refund comes in.

~~~
atYevP
Yev from Backblaze here -> All of our drives are consumer-grade. We try to
avoid buying enterprise drives at all costs. These are all off-the-shelf
internals and in some cases...externals that were made internal! :)

~~~
chimeracoder
> We try to avoid buying enterprise drives at all costs.

Why is that? From what I can read on your website, it seems you would be the
correct use-case for enterprise drives, no?

~~~
atYevP
We designed our storage pods and software to work in spite of hard drive
failures, so paying a $50-$100 premium for a drive just to avoid possibly $5
in labor to replace it when it fails is not a good practice.

We go for the good cheap stuff, and that's how we maintain low prices for our
actual product, which is online backup. As long as the drives are reliable-
ish, and are inexpensive, that's what counts!

------
olov
"If the price were right, we would be buying nothing but Hitachi drives."

I don't understand why they don't. Are the Hitachi drives really _that_ much
more expensive so that it doesn't justify their _vastly_ longer lifespan? Even
if they can get "free" replacement disks during the warranty period, that has
a cost for them. And they mentioned that some replacement disks die even
faster.

I'm sure Backblaze has crunched all these numbers - would love to see them.
BTW thanks for sharing this data!

~~~
brianwski
Backblaze employee here - it is honestly just a spreadsheet that kicks out the
answer. Every month we ask 20 or so suppliers for the lowest price for each
drive type. If Hitachi are 10 percent more expensive but fail 10 percent less
often, that balances out and we buy Hitachi. But if it is 12 percent more
costly then we get the other brand. There is a tiny bit of free preference
leeway given to Hitachi because it means less hassle to our over worked
datacenter team...

~~~
olov
If I'm not reading it wrong then your data says that the Hitachi drives have
_half_ the Annual Failure Rate, or less, than the others (in your setup). Not
sure what this means in MTBF but the Hitachi's sure seem to be worth a whole
lot more, certainly 10, 20 or 30 percent more - no?

~~~
brianwski
I don't think the math works. I posted this above:

I think the calculation is replacing one drive takes about 15 minutes of work.
If we have 30,000 drives and 2 percent fail, it takes 150 hours to replace
those. In other words, one employee for one month of 8 hour days. Getting the
failure rate down to 1 percent means you save 2 weeks of employee salary -
maybe $5,000 total? The 30,000 drives costs you $4 million, so who cares about
$5k here or there?

The $5k/$4million means the Hitachis are worth 1/10th of 1 percent higher cost
to us. ACTUALLY we pay even more than that for them, but not more than a few
dollars per drive (maybe 2 or 3 percent more).

Moral of the story: design for failure and buy the cheapest components you
can. :-)
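
The same back-of-the-envelope math as a small sketch. The inputs are the figures quoted in this comment; the function name is made up for illustration.

```python
def replacement_hours(drive_count, annual_failure_rate, minutes_per_swap=15):
    """Labor hours per year spent swapping failed drives."""
    failures = drive_count * annual_failure_rate
    return failures * minutes_per_swap / 60.0

hours_at_2pct = replacement_hours(30_000, 0.02)
hours_at_1pct = replacement_hours(30_000, 0.01)

labor_saving_usd = 5_000       # rough salary saving quoted above
fleet_cost_usd = 4_000_000     # rough cost of 30,000 drives quoted above
premium_fraction = labor_saving_usd / fleet_cost_usd

print(f"{hours_at_2pct:.0f} h at 2% vs {hours_at_1pct:.0f} h at 1%")
print(f"justified price premium: {premium_fraction:.3%}")
```

Note this only counts swap labor; it leaves out anything else a failure might cost (rebuild time, warranty handling), which the comment does not put numbers on.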

~~~
olov
Ok, after converting to MTBF the numbers make more sense: An AFR of 0.9% means
a MTBF of 968947 hours (111 years). An AFR of 3.2% means a MTBF of 269346
hours (31 years).

I guess an MTBF of 31 years is plenty for your needs. Thanks again for sharing
the data.
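
For reference, a sketch of one conversion that appears to match these numbers. It assumes a constant failure rate (exponentially distributed lifetimes), so that AFR = 1 - exp(-8760 / MTBF); that model is an assumption, and the function name is illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # 24 * 365

def afr_to_mtbf_hours(afr):
    """MTBF in hours for an annual failure rate `afr` (0 < afr < 1),
    assuming a constant failure rate over the drive's life."""
    return -HOURS_PER_YEAR / math.log(1.0 - afr)

for afr in (0.009, 0.032):
    mtbf = afr_to_mtbf_hours(afr)
    years = mtbf / HOURS_PER_YEAR
    print(f"AFR {afr:.1%}: MTBF ~ {mtbf:,.0f} hours (~{years:.0f} years)")
```

The constant-rate assumption is exactly what makes the "111 years" figure look odd: it describes the failure rate while the drive is in its normal service life, not how long any single drive will last.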

~~~
im3w1l
I think the failure rate will go up in old age. I just don't see those drives
still working in 100 years.

------
mustafab
Too bad you still don't have a Linux client. Do you plan to support Linux
users anytime soon?

~~~
Nickoladze
Relevance?

~~~
blisterpeanuts
Well, it's off the subject, but it's still an interesting question and I was
wondering the same thing.

------
acd
Also see hardware.fr failure rates. It shows different data than Backblaze.

French hardware site, component failure rates. Google translate it to English:
[http://www.hardware.fr/articles/911-6/disques-durs.html](http://www.hardware.fr/articles/911-6/disques-durs.html)

So it's also important to take hard drive models into account.

Then there is the Google study Failure trends in large hard drive population
[http://static.googleusercontent.com/media/research.google.co...](http://static.googleusercontent.com/media/research.google.com/en//archive/disk_failures.pdf)

------
jader201
_> We are focusing on 4TB drives for new pods. For these, our current favorite
is the Seagate Desktop HDD.15 (ST4000DM000). We’ll have to keep an eye on
them, though. Historically, Seagate drives have performed well at first, and
then had higher failure rates later._

I'm a little surprised that they actually did the analysis to determine the
Seagates tend to fail more, yet they are still putting most (or at least,
quite a bit) of their faith in those.

Based on their own data, I would likely avoid those, or at least start leaning
more toward Hitachi and WD.

Or maybe the initial cost of those is so much better that it compensates for
any long-term expense.

~~~
gatehouse
You also have to consider that when the drives eventually fail, they will be
replaced with hard drives of the future -- which will presumably be cheaper
than the HDD of today. I.e. they depreciate quickly.

------
dnissley
Interesting. I had always avoided Hitachi Deskstars after having heard they
were nicknamed "Deathstars" for a reason. Perhaps that was once true, but
clearly it's not anymore.

~~~
yen223
I suspect those infamous disk failures were caused by faulty manufacturing
runs, rather than some inherent flaw in the design.

~~~
radiowave
IIRC, it was specific to the 38GB model, and was traced to a batch of bad
electronic components.

~~~
nknighthb
No, it was the whole 75GXP set (actually, according to Wikipedia, the 120GXP
and 180GXP were also affected to a far lesser extent).

The backlash was really nasty, though, because IBM's drives were something a
lot of us had come to trust and rely on, the 75GXP was kind of a flagship
drive, and IBM's handling of the issue was less than stellar. We felt
betrayed.

I never have bought another IBM (or Hitachi) drive. The reaction is visceral
and intractable.

~~~
radiowave
Ah yes. I think I had the jumper on mine set to limit the drive to 38GB.
Luckily, mine was OK.

------
csense
A better presentation of this data would show a failure rate for each brand
and month/year of purchase.

For extremely simple devices like resistors or incandescent light bulbs,
failure rate is relatively constant over the lifetime of device -- the chance
of a functioning resistor with 10 hours of use failing during the next hour is
the same as the chance of a functioning resistor with 1000 hours of use.

For complex devices with lots of interdependent parts, some of which are
mechanical, the failure rate changes over time. There's an "infant mortality"
or "lemon" phenomenon, where relatively new devices have higher defect rates
(because fabrication and shipping sometimes result in imperfections which
quickly cause failures), followed by a steep dropoff in failure rates (because
observing a device operate correctly for dozens of hours is strong evidence
that it doesn't suffer from a failure mode which often results in infant
mortality).

Then there may be an increase in failure rates later, especially with devices
that are partially or wholly mechanical (wear or damage type problems which do
not cause immediate failure, but make it easier for a failure to occur).

You need empirical data to be quantitative about this curve, and it sounds
like Backblaze has it, but their presentation in this article doesn't show it.
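
To make the shape concrete, a toy sketch of such a curve: a falling "infant mortality" term, a constant term, and a rising wear-out term, modelled here as Weibull hazards. Every parameter below is invented for illustration; none of it is fitted to Backblaze's data or anyone else's.

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate of a Weibull distribution at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def bathtub_hazard(t_hours):
    """Toy hazard: early defects + constant random failures + wear-out."""
    infant = weibull_hazard(t_hours, shape=0.5, scale=2000.0)     # decreasing
    random_failures = 0.00001                                     # constant
    wear_out = weibull_hazard(t_hours, shape=4.0, scale=60000.0)  # increasing
    return infant + random_failures + wear_out

early = bathtub_hazard(10)        # first hours of use
middle = bathtub_hazard(20_000)   # a couple of years in
late = bathtub_hazard(80_000)     # old age

print(f"early {early:.2e}, middle {middle:.2e}, late {late:.2e} failures/hour")
```

A shape parameter below 1 gives a falling hazard, exactly 1 gives the constant "resistor" case, and above 1 gives wear-out; whether real drives follow this curve is an empirical question.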

~~~
hga
As I recall one of the studies of a few years ago, the one based on
supercomputers, not Google's, showed there was very little infant mortality,
and wear clearly set in after roughly one year in service. The results were
quite striking, and nothing like the bathtub curve many expected and that you
sort of sketch out.

------
Fomite
This is a pretty textbook perfect application of survival/time-to-event
analysis. Any chance the data behind it could be made available for teaching
purposes?

~~~
atYevP
Yev from Backblaze -> Where/what do you teach? We're unsure about releasing
any _more_ information at this time, but we're not opposed to it. What would
it be used for?

~~~
Fomite
I'm a postdoc at the moment, but I've taught survival analysis before. It just
struck me as a particularly straightforward example with pretty clean data, a
visible separation of the different groups, etc.

Basically, lots of people use the "Iris" data set to learn either data
visualization or machine learning based classification. The same could be true
of the "Backblaze Hard Drive" data, but for survival analysis/time-to-event
statistics courses.
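
A minimal sketch of the kind of classroom exercise I have in mind: the Kaplan-Meier estimator, which handles drives that are still running (censored) when you stop observing. The durations and failure flags below are made up, not Backblaze data.

```python
def kaplan_meier(durations, failed):
    """Return [(time, survival_probability)] at each observed failure time.

    durations: time each unit was observed
    failed:    True if the unit failed at that time, False if it was still
               working when observation stopped (censored)
    """
    events = sorted(zip(durations, failed))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0
        removed = 0
        # Group all units that leave observation at the same time t.
        while i < len(events) and events[i][0] == t:
            deaths += 1 if events[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Made-up example: months in service, and whether the drive failed.
months = [6, 12, 12, 20, 30, 36]
did_fail = [True, False, True, True, False, False]
curve = kaplan_meier(months, did_fail)
for t, s in curve:
    print(f"month {t}: estimated survival {s:.3f}")
```

Running one curve per brand and comparing them is the "visible separation of the different groups" part.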

~~~
atYevP
Very cool! What's a good way to reach you, I see your handle doesn't have an
about section. My handle is my twitter handle if you would like to ping me and
we can chat a bit more!

~~~
Fomite
Email sent.

------
rythie
I wish Backblaze would provide some sort of Amazon S3 competitor, Amazon
always seems very overpriced.

~~~
atYevP
Hypothetically, what would you be looking for in an S3 competitor?
Specifically:

1\. How much data do you have?

2\. What would you use it for?

3\. How important is it if the API isn't the same as the S3 API?

4\. Any specific certification requirements?

5\. Would you have any SLA requirements?

6\. Specific performance metrics? Think Amazon S3 vs. Glacier.

7\. Are redundant data centers important?

Hypothetically!

~~~
rythie
I think the main problem is cost. Amazon's costs prevent startups like Everpix
from existing, as they do for video startups and various other startups that
require a lot of storage. I think Digital Ocean's disruption of the very
established VPS market is the model to follow.

Personally I'd like to have all my data in the cloud.

1\. 300GB growing at 25-50GB/year

2\. Backup / cloud storage

3\. Possibly, though I'd like SFTP access really

4\. no

5\. Don't lose my data, probably not down for more than 3-4 hours in one go.

6\. Enough to stream video to back to my box

7\. Not really, I'd assume the chance of data center being down is pretty
small.

Hitachi 500GB USB $2.4/month (based on $60 drive with 2 year warranty)

Backblaze $5/month [Unlimited]

Crashplan $5/month [Unlimited]

Amazon charge $22.80/month

Dropbox $50/month [500GB] (though I still need a local copy)

Digital Ocean instance $320/month

I think there is opportunity to come in at half of Amazon's price or less and
that could lead to a new set of start-ups that could build on that.

~~~
atYevP
Thanks for taking the time to answer! Interesting use-case!

------
staticshock
Interesting to see that kind of a difference between hitachi and western
digital, given that WD owns HGST. Are hitachi drives marketed as higher
reliability drives, or was the acquisition by WD simply too recent for the
quality of the two brands to "equalize"?

~~~
davis_m
Backblaze's data shows that the number of errors is largely related to the age
of the drive. Because the older drives are from before Hitachi was acquired by
WD, it is going to take a few more years for the brands to equalize if they do
combine the manufacturing of both lines.

------
DanBC
Do the Thailand floods make any difference to this report? How reliable were
those drives, and are the factories back up to full speed yet?

~~~
yen223
The factories have been at full speed for the past year now.

~~~
pstack
You wouldn't know it by the prices.

~~~
izzydata
The prices are pretty much where they were from before the flood. Hard drive
capacity has just gone up a ton. I bought a 1 terabyte before the flood for
$60 and that is pretty much where it is at now if not cheaper in some cases.

~~~
gwern
Yes, but the problem is, hard drive $/GB has not recovered to the pre-flood
trendline, even if prices have (4 years later) finally hit parity. The
trendline has been badly hurt by the floods, it's pretty amazing. (I did a
little tracking and was impressed how my predictions of the consequences were
still insufficiently pessimistic:
[http://www.gwern.net/Slowing%20Moore%27s%20Law#kryders-law](http://www.gwern.net/Slowing%20Moore%27s%20Law#kryders-law) )

~~~
muyuu
Economies of scale haven't recovered to pre-flood levels, because of the
crisis in developed countries and because SSD is quickly growing its share of
the market.

These are completely different technologies and their R&D spending and
amortisation is separate.

~~~
gwern
> because of the crisis in developed countries

You mean that crisis which started in 2007?

> because SSD is quickly growing its share of the market.

Aren't SSDs still like 5x more expensive on a per-GB basis?

~~~
muyuu
Yep to both.

Still, the crisis makes recovery harder for every market. And SSDs being 5x
more expensive just became acceptable as download speeds have kind of stalled
worldwide, and video resolution is not as demanding (compression also
improved). Past 100GB or so, the utility curve enters into diminishing returns
pretty strongly.

It used to be the case that media demanded exponentially more storage. Not
anymore. You look at the average size of movie torrents for instance, and they
have slowed down their growth drastically. The benefit of bigger files in
practical terms is not as apparent anymore.

The cloud has also taken over for ratpacking download-and-forget data.

SSDs are already the better option for a lot of people (and still on the rise)
in the throughput-latency-cost-space equilibrium. It's quite obvious just by
looking at the stuff OEMs are shipping.

~~~
gwern
This is all vague reasoning compared to a huge flood crisis screwing up the
supply chain for years. Why should I believe your claims about 'the cloud' and
'diminishing returns' rather than an amply documented crystal-clear cause?

~~~
muyuu
These are not claims you need to believe, it's overwhelmingly evident that
there have been changes in the consumer landscape towards less demand for
storage space above what they already have. People are not filling up their
drives as they used to.

The proof is the % share of SSD drives selling in retail. Regardless of the
massive difference in cost/GiB. More people prefer a faster 250GB drive over a
slower 4TB. Because 250GB is still plenty for the average consumer and speedy
computing wins the bargain.

~~~
gwern
It's evident - but it's a trend which has always been happening, returns are
always diminishing. Why are we attributing to this gradual effect the
sudden huge abrupt break in an exponential lasting years which we have
observed, rather than say, an abrupt unprecedented flooding crisis affecting
experience curves?

~~~
muyuu
I don't think the effect of the floods is lasting so much. It just so happened
that the disruption of SSD started to be noticeable around the same time of
the aftermath of the floods.

And by floods, I mean the speculative movements they caused rather than the
real effect on supply.

------
ChuckMcM
Great post, as you get bigger populations of drives you can get a lot more
visibility into their overall reliability. If there was one thing I could add
to the analysis, it would be to split out the drives by serial number and by
firmware. Sometimes you find that all of the 'problem' in a set of
problem drives is a single range of serial numbers.

We've had similar experiences with replacement drives, they are, by and large,
significantly less reliable than "new" drives.

And last bit, we've got Western Digital drives here (a mix of 2.0 and 3.0 TB
ones). They have been pretty solid performers for us.

------
polskibus
HGST (Hitachi) has been bought by Western Digital. One should be able to
expect a merge of their HDD lines (source:
[http://www.hgst.com/](http://www.hgst.com/)) .

Moreover, it seems that Deskstars are no longer manufactured (or have been
rebranded). [http://www.hgst.com/hard-drives/product-brands](http://www.hgst.com/hard-drives/product-brands)

------
undoware
Aside from the incredible usefulness of the data herein -- thanks, backblaze!
-- this is also the kind of marketing that I don't mind.

Backblaze got egg on its face yesterday on HN when someone's critical report
(rightly) got upvoted. Today they make up for it by giving us an interesting
and useful data chart.

I know it sounds weird, but for whatever reason I read this as demonstrating a
high level of corporate responsibility and attunement to customers. They could
have compensated by instead, say, dropping a few grand on buying journalists
and 'reviewers', like Microsoft does. But they didn't. Instead, they were
cool. To me that signals that they'll also take care of whatever problems
they've had recently. (Note: I have no affiliation with Backblaze.)

I'm currently not shopping for an online backup service but if that ever
changes, I now have a good feeling about Backblaze, and I hope that other
services take a similar approach to repairing customer relations when they are
fraught.

Shit happens, even to backup providers. It's how you respond that matters
most.

------
cordite
Interesting note on the WD 3TB Green.

I have one in my rig, and whenever I do any disk access, I always have to wait
about 5-8 seconds for it to spin up every other half hour. It seems to
aggressively turn off. I have my base system on an SSD, and my games and other
things on this 3TB drive.

I guess such spin up times would be unacceptable in backblaze's environment.

------
kylec
This makes me feel better about the 4 HGST 4TB drives I just bought
([http://www.newegg.com/Product/Product.aspx?Item=N82E16822145...](http://www.newegg.com/Product/Product.aspx?Item=N82E16822145912)).
They were the cheapest 4TB 7200RPM drives on Newegg by a non-trivial margin.

~~~
atYevP
<3 <3 <3 <3 <3 Hitachi drives. If we (Backblaze) could get them at a rate
within a few dollars of the other manufacturers at the 4TB level, we would
solely buy them.

~~~
reitzensteinm
Surely not? If a specific model happens to have a large defect rate, having
all your eggs in one basket could prove absolutely disastrous.

Spreading your drives out, even among inferior options, seems like the only
solid strategy.

~~~
atYevP
Normally diversification is absolutely the way to go, but once we've found a
drive with low failure rates across the board, we'd want to move mostly to
using that one drive type. We can always move over to another type of drive,
but there's a very real cost of having to swap drives out as they fail, so the
more reliable the drive, the more incentive to stick with it until it's no
longer reliable. At the moment, we're more concerned about price, so we buy a
wide-variety, but we have our favorites :)

~~~
justin66
> Normally diversification is absolutely the way to go, but once we've found a
> drive with low failure rates across the board, we'd want to move mostly to
> using that one drive type.

With regard to deployment, but also with regard to the drive reliability
numbers you guys are putting out there: do you worry that variation between
manufacturing runs is going to hose you? Do you find looking at the drives
that you're getting a good variety of hardware from multiple manufacturing
runs?

------
protomyth
In the last two years, I have stopped all buying of Seagate drives. We had a
rather large rash of failures in RAIDs and desktops (50% of about 50 drives).
Then the "every drive shipped by HP is failing" problems of the netbooks (70
total replace 50 drives) were also Seagate drives.

We are basically a WD house now.

------
desireco42
Is this to counter bad pr from yesterday's story about that guy whose files
you lost? Because it looks that way.

No matter what hd you use, if you are corrupting files, it's all the same.
Same with Evernote, if their sync is losing notes, and it is, everything else
is less important.

~~~
atYevP
Sean's issue was unfortunate. I think he posted the blog a little prematurely;
our support is in contact with him and we're trying to figure out what
happened. We've restored a lot of data over Backblaze's life (over 5 billion
files) and normally the .zip restores are rock solid, and we didn't have any
outages yesterday. We're trying to collect his logs and see exactly what broke
down. He said that he'd update his posts after the issue was resolved, so keep
an eye on it!

~~~
desireco42
I will, but it casts a serious shadow over your service :) however awesome the
other aspects are

~~~
atYevP
It happens. Computers are weird things, especially when networking is
involved. We always recommend having a local and an off-site copy of data.
That way you minimize risk. We're hoping we can get him back up and running
soon!

~~~
GigabyteCoin
>Computers are weird things...

That's not very encouraging, coming from a data storage company employee.

~~~
memracom
It might not be encouraging but it is honest and it is factual. No company,
however much you might wish it, can do magical things. At best, you can
leverage the laws of large numbers to do things which are apparently magical
to those people who do not have an understanding of maths, technology and
engineering. This is exactly how stage magicians accomplish their tricks. A
solid understanding of physics and high precision engineering.

~~~
GigabyteCoin
A data storage company losing data isn't confined to the realms of Magic.

It's incompetence, and lack of backups on their part.

The company that suggests you take backups of your data on their servers does
not keep backups of their data. How humorous.

The company prides itself on profiting by the skin of its teeth ($5/mo
for unlimited data storage and transfer), and there are issues that arise from
that severe cost cutting. Surprise, surprise.

------
jpalomaki
Posts like this are the reason why I trust Backblaze.

In general I'm very suspicious about the unlimited offerings. When they tell
the technical details I get the feeling they are up to task and not just
reselling S3 and hoping people pay but don't actually use the storage.

------
caycep
I'm just an amateur at statistics, but I would think this would be a useful
set to do generalized linear mixed models on, to see what factors could be
statistically significant (manufacturer, model, factory location, etc etc etc)

------
gnoway
As an administrator with an admittedly smaller sample size, this lines up with
my experience with the Seagate 1TB Barracuda ES.2, 15K.6/15K.7-era Cheetahs
and some normal consumer Barracudas as well. All of the nearline and
enterprise stuff had 5 year warranties, and over the past 5 years we've
replaced over 25%.

I've sworn off Seagate altogether, at least until they demonstrate a
commitment to producing more reliable drives. I will not willingly buy them
for the datacenter, and I won't buy them at any price for the home, RAID or no
RAID.

------
abcd_f
Interesting stats, but I wonder if the drive usage pattern at Backblaze is
somewhat different from that of a home user. In terms of how they seek, read
and write. Not that they put more mileage on them faster, but that they might
be doing something fundamentally different from how these drives are tested by
their manufacturer. Lots of bulk in-sequence writes or something else. The 5-7
times difference in failure rates between _leading_ manufacturers is frankly
hard to believe.

------
ck2
I'm a big fan of single platter drives, I just buy the biggest single platter
at the time.

Which is currently the WD 1gb blue. Very fast, very cool running.

~~~
krakensden
1GB? that seems... awful small.

~~~
baddox
And even 1TB, which is presumably what he meant, feels very small to me.

------
Dystopian
VERY interesting. I always avoided Hitachi drives in my deck for some reason -
always concentrated on WD and Seagates (personal experience has led me to lean
towards WD as well, as I had a bunch of those non-LP drives that they talk
about that like to conk out).

From what I read they look quite reliable over a fairly representative sample
(annual failure rate v. # of drives / TBs / years).

~~~
pixl97
> I always avoided Hitachi drives in my deck for some reason

Probably because Hitachi bought IBM's line of _Deathstar_ drives that were
notoriously unreliable. It appears they have done far better with the product.

~~~
brianwski
I've heard this rumor from everywhere, that the Deskstar drives were
unreliable. If you look at the reviews from NewEgg they were TERRIBLE for that
product. But I'm completely suspicious of where that reputation came from,
because the statistics just don't bear it out. Often "common wisdom" is wrong
- I suspect this is one of those cases.

~~~
pixl97
In 2000-2001 I was an admin for a day trading firm managing around 80
computers and servers. The drives worked great. They were really fast right up
till the moment of their clicking death. At around 6 months around 15 out of
30 had it occur.

------
chemmail
Shouldn't Toshiba be the drives to buy since they took over Hitachi's 3.5
drive factories when WD took over due to antitrust?
[http://www.wdc.com/en/company/pressroom/releases/?release=f8...](http://www.wdc.com/en/company/pressroom/releases/?release=f8697da0-111a-49e8-a484-5266d18526ab)

------
InclinedPlane
Back in 2007/2008 or so I bought a pair of Seagate 7200 500GB drives and got
bit hard by their extremely broken firmware. I haven't bought Seagate drives
since then. It's sad that their overall QC is still so terrible across the
board.

------
madads
Should consider what Jacob Appelbaum revealed in his CCC preso about hard
drives:
[http://youtu.be/vILAlhwUgIU?t=46m25s](http://youtu.be/vILAlhwUgIU?t=46m25s)

Cross out WD, Seagate, Maxtor and Samsung drives. Hitachi wins!

------
nwmcsween
The worst service and hard drives I have ever had were from Seagate. I had 3
separate hard drives RMA'ed _twice_ and all three failed within 3 months twice
over. To this day I will not buy Seagate.

------
benrapscallion
Along these lines, I have done a monetary analysis;
[http://perenniallycurious.com/centspergb.html](http://perenniallycurious.com/centspergb.html)

------
cocoflunchy
How does that compare to SSD failure rates? Are they much better?

~~~
KamiCrit
I dunno if SSDs have been around long enough for reliability testing. They
have also been advancing quite rapidly, so it is hard to stick with
something and expect the info to be relevant down the road. All I know is that
when it comes to SSD reliability, Intel is king.

------
bovermyer
I am very impressed by the transparency here, and appreciative of the data.
I'd never heard of Backblaze until now, but now I'll have to pay closer
attention.

------
goofygrin
> Seagate Barracuda Green (ST1500DL003), 1.5TB: 51 drives, average age 0.8
> years, annual failure rate 120.0%

I've got one sitting on my desk that Seagate sent me as an RMA return. Guess it
won't be going back into our RAID.

------
edward
Price per TB:
[http://edwardbetts.com/price_per_tb/](http://edwardbetts.com/price_per_tb/)

------
aroch
Heh...I, too, love 3TB WD REDs. At last count I have 40 of these racked up in
servers or NASes and they've worked quite well for me

------
izzydata
Should the larger capacity Seagate drives have a higher failure rate than
small ones? That chart seems counterintuitive.

~~~
unreal37
They said in the article that the 1.5TB drives were largely warranty
replacements, and they suspect they were refurbs.
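
For anyone puzzled by how a rate can exceed 100%: assuming the annual failure rate is failures divided by drive-years of service (my assumption about how the chart is computed), a young population with many failures easily goes past 100%. A minimal sketch, where the failure count of 49 is a hypothetical number picked to land near the chart's figure, not something taken from the article:

```python
def annual_failure_rate(failures, drive_years):
    """Failures per drive-year of service, as a fraction (1.2 == 120%)."""
    if drive_years <= 0:
        raise ValueError("drive_years must be positive")
    return failures / drive_years


# Treating 51 drives at an average age of 0.8 years as roughly
# 40.8 drive-years of service (a simplification).
drive_years = 51 * 0.8

# Hypothetical failure count, chosen only for illustration.
rate = annual_failure_rate(49, drive_years)
print(f"{rate:.1%}")
```

In other words, a rate above 100% just means the drives failed faster than one per drive-year, which is plausible if replacements keep dying in the same slots.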

------
happycube
Two things I never like to see together on a hard drive: "Seagate" and "Made
in China"

------
dshep
Dang! I just bought WD Green 2TB WD20EZRX drives (x4) for my NAS.

------
wnevets
The WD Reds only have 3 stars on newegg.com, interesting.

~~~
zanny
Thing is, most reviews are written immediately after receiving, and I'd rather
get a defective drive in the mail and just RMA it on day 1, with the confidence
that when I get a working drive it will last longer, than have a drive that
always works flawlessly out of the box (some 3TB Seagate model I know got like
4.5 stars) but usually fails in a year.

Though my current storage drive is a 3TB Seagate one that has been working
fine for 2 years with no SMART errors. Problem is I need to get the money to
build a more appropriate backup RAID for all my music / movies / games / etc,
but that would be a pretty penny. I've just been using old DVD-Rs (I mean, who
uses those anymore?) as a stopgap, with anything really important on my 1TB
external disks.

------
lae
If only I could find the Hitachi Deskstars for cheap.

------
halayli
Lesson from this: buy yourself a hitachi.

------
wil421
Neither. You should buy an SSD.

~~~
gwern
They're doing bulk storage where costs are a priority and performance
irrelevant since users are uploading/downloading over the Internet. What
possible use do they have for SSDs?

~~~
wil421
The title of the post is Which Hard Drive Should _I_ Buy?

Not which Hard Drive should Backblaze buy again.

As a consumer I wouldn't consider buying a HDD ever again. I would be more
interested in a post that details the reliability of SSD manufacturers/models.

~~~
Ecio78
There are still some people interested in HDDs, for example those who host an
n-terabyte home NAS and can't afford to spend thousands of dollars[1] to
fill it with SSDs.

[1] [http://www.zagg.com/community/blog/what-is-the-largest-ssd-s...](http://www.zagg.com/community/blog/what-is-the-largest-ssd-solid-state-drive-available/)

~~~
wil421
Hopefully in a few more years the prices will be reasonable for a TB of NAND
flash.

