What Hard Drive Should I Buy? (backblaze.com)
688 points by nuriaion on Jan 21, 2014 | 266 comments

Very cool that Backblaze continues to post things like this. Few people have this experience. Of those who do, few (I assume) break it out to this level of detail, actionable for others. Of those who have good experience and records, most would consider it proprietary or just decide not to post. Kudos to Backblaze.

We're all on this crazy spinning ball together. Early on, Backblaze made the decision to stay out of the hardware business (it sounded lucrative after we developed the v1 storage pod, but everyone here was a software person, so we went the software route instead), and ever since we like to pay our hardware findings forward. It's a good thing to do!

I really like this detailed write up. A certain large company in Mountain View suggested years ago that they had collected a great deal of metrics on consumer grade hard disks performing in a data center capacity, but refused to publish the data.

The fact that Backblaze isn't embarrassed to share this with the rest of us gives me the warm fuzzies. Thanks for the hard work!

I recently purchased a new Seagate, and I'm feeling a bit of buyer's remorse. That's not to say this article triggered the feeling completely (though it did push me over the edge). Taking this information and the knowledge that recent model Barracudas appear to have aggressive APM that loads/unloads the heads after a minute or so of inactivity (unless disabled, but it doesn't persist between reboots) AND seems to borrow from laptop-grade design... I think I'd be better off replacing it. Again. ;)

I wonder if the poor longevity of Seagate drives is due to changes in quality control or design? I have older ones that don't utilize the APM load/unload features and are still running fine after many years. Then again, they were also pre-flood drives.

I'd also be curious to know if there's a correlation with where the drive was manufactured and its survival rate, though I suspect that might be a much more difficult metric to collect.

> A certain large company in Mountain View suggested years ago that they had collected a great deal of metrics on consumer grade hard disks performing in a data center capacity, but refused to publish the data.

They did in 2007:


(first result of a search for "google hard drive report")

That's exactly what he's talking about; they didn't publish the data:

Failure rates are known to be highly correlated with drive models, manufacturers and vintages [18]. Our results do not contradict this fact. For example, Figure 2 changes significantly when we normalize failure rates per each drive model. Most age-related results are impacted by drive vintages. However, in this paper, we do not show a breakdown of drives per manufacturer, model, or vintage due to the proprietary nature of these data.

In AceJohnny2's defense, and after re-reading what I wrote, I can see how it was assumed I was referring to the data set as a whole rather than the exclusion of the manufacturers' data. But I'm certainly appreciative that you knew exactly what I meant. ;)

I can only assume that Google didn't want to jeopardize its options with the hard disk manufacturers. If their review stated that one disk is far better than another, then that might crucify that HD company.

I believe they acted out of that concern (or so I want to believe).

That report does not have data about individual manufacturers. Perhaps Google considered that to be lawsuit bait, or perhaps they considered it to be proprietary. Either way Backblaze's post has actionable information about which manufacturer to buy with respect to reliability trade-offs. Google's report does not.

>Perhaps Google considered that to be lawsuit bait, or perhaps they considered it to be proprietary.

I don't think it has anything to do with being afraid of lawsuits (or every consumer review company would be out of business) or worried about competition getting that information. They most likely get some kind of volume pricing discount (or even pricing based on experienced reliability) that is dependent on them not releasing such data.

Yeah, that's precisely what I intended to convey. I admit my comment didn't specify what I meant by unpublished data, but I appreciate that you inferred exactly what I meant.

Performance and longevity metrics are curious but they're almost useless without naming the manufacturers. To that extent, the data may as well go unpublished from a consumer's perspective on the merit that it--as you said--contains no actionable data. The Backblaze report does, and that's why I find it useful.

It's hard for a public, respected company to statistically bash the quality of another public, respected brand. It just ain't gonna happen, what with all the lawyers and money and connections.

Glad to have smaller entities which are less prone to being swayed.

I should've been more specific. I'm aware of the study and have seen it, but to my knowledge they refused to publish data regarding the manufacturers, probably for legal reasons.

It's curious that other replies gathered from my post the implied information that I was referring to a lack of specific/specified manufacturers, and you're the only one who took it completely literally. :)

Yes! To my knowledge, this is the first large-scale drive reliability study that names manufacturers.

The study you mentioned by that certain large company was tantalizing, because it also found large differences between manufacturers, but did not name them. The relative insensitivity to temperature (<= 40C or so IIRC) they found was helpful, however.

True. I remember when that study was first posted that it did have curious metrics that were surprising in their own right, but it was disappointing that nothing was correlated by manufacturer. I can understand why that might have been the case. The potential for litigation would have been enormous. ;)

From my perspective (speaking as a consumer), it's more useful to have information about the relative quality of each and their tradeoffs. AFAIK, Seagate produces among the fastest mechanical drives on the market, but it appears you exchange longevity for speed...

That would be interesting. I bet the older drives were probably from Taiwan/Japan, and the newer ones from Thailand/Vietnam.

From what I hear though, at least in Vietnam the plants are run extremely well. Japanese or Japanese-trained managers, and interestingly enough most of the workforce are women. They all meet some ISO specification of precision/reliability, I think. Actual conditions, I have no idea, since I've never actually been to one (and am not qualified to say).

I may be wrong, but it appears that many of the new model Seagates are Chinese labeled ones. I gather from some previous reading that their manufacturing process is slightly more variable than other locales with quality to match.

Also, you raise a fantastic point about the work conditions. We hear things about Apple's plants and other mobile device manufacturers, but I don't know of any study that includes hard drive makers and the worker conditions at those plants. Very good point.

Well don't despair about your Seagate purchase. We're still buying them. Plus, over 80% of the drives that we have in production are over 4 years old. You should be just fine :)

Just wanna say, there are a bunch of little businesses, like mine, that really depend on writeups like yours when making suggestions to customers. So, thank you.

We just had a spate of bad Seagate drives come through our shop. The drives should be under warranty but Seagate's making it a hassle. Some of them were less than six months old. Your writeup combined with the recent trouble means that we're dropping Seagate for our preferred HD brand.

You're welcome! We love sharing information, and if it can't have a direct impact on our business, there's no good reason not to do it. A rising tide floats all ships! Hopefully you can find a decent substitute for the Seagates you're recommending. And if you need to recommend an online backup service, I can suggest a good one ;-)

We have actually set up a couple of clients with Backblaze. :-) I've been following your guys' progress since early on.

Awesome! That's what we like to hear :)

It's really annoying to find out that a drive went dead in a few months, and then have it happen with the replacement drives too. Especially on laptops with only one drive. Of course backups allow you to restore key data, but you'll still have to set up the system again. I got a series of three Seagate failures and got really tired of it. Of course all the non-important data wasn't backed up, so it was an extra annoyance.

I really appreciate this, as someone who has had a number of SSD and HDD failures due to lackluster manufacturers (OCZ and Seagate, respectively).

It would be great if you could list power consumption numbers too next time. For me, that's also a big consideration for something (NAS) that is on 24/7 in the house.

Yev from Backblaze here -> Unfortunately we don't really have any publishable stats on that quite yet, but I can say that we prefer low-power drives so that we can fit more of them into the racks and not go over power limits.

I suppose it has indirect benefits for Backblaze too. For a start, publishing this useful information brings them to our attention (or refreshes our existing memory of the existence of their services) in a positive manner, and publishing honest information about drive reliability might make the manufacturers care that little bit more about the overall quality of their consumer-grade hardware (which means we all, including Backblaze, get better products to use).

Yes, if things get more reliable without going up in cost, we are happy. Though, it's true that the exposure is nice too.

Best way to get good attention from a techie audience IMO: provide useful information with enough background data that we can, to a certain extent, verify how you've derived your assertions. You've been useful to us while proving you know what you are talking about in your chosen domain, so on two counts we are more likely to trust you than the next fellow if we need services like those you are offering. And you've done the above without ramming your services into our faces in the aggressive manner many a marketing manager might ill-advisedly suggest.

We certainly want all of you as customers, but the truth is, there's a time and a place for overt marketing. Covert marketing is so much more interesting, it's stealthy! Plus the discussion sometimes leads to ideas that we use and better our product and storage pod design. It's a win/win and we absolutely love it.

We're glad you're enjoying the write-ups as well!

This is also excellent PR and attention acquisition. I've referred backblaze as a possible option to look into for friends looking for backup solutions on more than one occasion just due to being reminded about their existence from posts like this (and I've never used them personally).

Who could say no to some extra exposure? Plus we feel good about sharing this type of stuff. Not many other folks seem to be willing to do it, we can't fathom why. The discussions around it alone are interesting at the very least.

Good timing or pure coincidence?

Just yesterday, a discussion about how BackBlaze restores fail.


Coincidence. Sean is working with our support team as we speak. We had no other issues with restores during the time he was attempting them so we suspect it's something on his end. We'll collect logs and try to get him back up and running as quickly as possible!

Yeah, I've always been disappointed at the lack of statistical information that hosts provide about drive failures, especially failure correlation in RAID situations. When I'm trying to price out some servers on a typical bare metal host's website it is hard to evaluate different configurations, and they are the ones that have the most real-world information, sitting in their support system. Very few publish this kind of info, so I really appreciate the efforts of Backblaze.

It might because folks are afraid that their partners won't sell them drives. You know, if you're working directly with Hitachi or WD or Seagate and you publish stats saying they aren't great, they don't have much incentive to continue working with you, or at least not as cheerfully.

Since we buy off-the-shelf drives from distributors and online, we don't really feel the heat that much. We'd love to work directly with the manufacturers, but the minimum orders (over 10,000 drives) are not really feasible for us.

Definitely. I just downloaded it to evaluate as a replacement for Mozy, to ultimately consider evaluating it as a replacement for Mozy Pro for one of my clients.

> The drives that just don’t work in our environment are Western Digital Green 3TB drives and Seagate LP (low power) 2TB drives. Both of these drives start accumulating errors as soon as they are put into production. We think this is related to vibration. The drives do somewhat better in the new low-vibration Backblaze Storage Pod, but still not well enough.

Another reason to avoid the WD Green 3TB: these drives aggressively put themselves to sleep to save power. It's literally a matter of streaming a video from disk and if the OS caches enough of the file, the drive will see there haven't been any accesses in a few seconds and stop spinning.

The video will of course glitch when the cached data runs out and the drive needs to spin up. Great design.

You can disable the aggressive parking behavior ("Intellipark") with WD's own wdidle3.exe [1] or idle3-tools [2].

Disabling Intellipark was literally the first thing I did with each of the three WD Green drives I've owned since 2011 (got two but one failed early on and was replaced), so I can't really compare their performance with this setting on and off; however, I can say that I haven't noticed the drives being parked more aggressively than similar Samsung or Seagate drives. I used the official wdidle3.exe under FreeDOS for each drive.

[1] http://support.wdc.com/product/download.asp?groupid=609&sid=...

[2] http://idle3-tools.sourceforge.net/

Edit: changed "aggressive spin-down behavior" to "aggressive parking behavior". It actually isn't quite clear how disabling Intellipark affects disk spin-down behavior.

Edit 2: There's also wdantiparkd (http://www.sagaforce.com/sound/wdantiparkd/). I have not used it but it might help you if tuning the drive itself doesn't work.

I looked into WDC's tool for this a while back and it totally didn't work. That second one looks worth a try, thank you for that.

Don't use the Green line for anything performance related, even if it's watching videos. The Green line is simply not intended for that. I think it's designed to be an occasionally-accessed backup drive, or something like that.

The Black line is the performance-conscious line, look at the corresponding equivalents in that line. IIRC the Blacks are 10-15% more expensive, but they actually respond.

It's true that Green drives aren't intended for performance-related tasks, but it's wildly incorrect to imply that streaming video to a single user qualifies as a performance-related task in this context. Blu-ray video maxes out at what, 50 megabits per sec? WD Green drives from 2010 have no problem attaining sequential read speeds of 100 megabytes per second. They'll manage 50 megabytes per second even with random reads.

Also, I have streamed a lot of raw Blu-ray files from my WD Green drives over a gigabit network with absolutely zero issues.

B-con is confused. The behavior in question - the drive powering down right away - would actually occur less frequently if I were trying to do something "high performance." It's the fact that a single user playing mp3 or video sequentially is pretty laid back that means the drive gets a chance to shut off.

Of course, you wouldn't want to use these in a server, either. Or for anything, really.

edit: in response to your comment below, the reason you haven't seen glitching with your high-bitrate video is probably that it's high-enough quality video (lots of data) that the OS never caches enough of it to let the drive stop working and go to sleep. Or else maybe WD have altered the stock settings of these drives. They certainly should have.

edit again: if you've had them for "many years" then we aren't talking about the same hardware

I've had several of them in my home server for many years. It gets heavy use from 1 or 2 users (which as far as the server and drives are concerned is not "heavy use"), and has never had any issues streaming any content, from MP3s all the way up to raw Blu-ray files.

I am with you; I have a 1.5TB WD Green as my main non-SSD storage medium, and it performs great. I put everything on there: movies, music, photos, virtual machines, etc., and I've never had any kind of (overly) noticeable performance problems.

Thirded. My media server is 6 terabytes worth of WD Green drives and they don't exhibit the issue described with stops and starts and I've never done anything to shut the "intelligent parking" off.

I guess YMMV depending upon exact OS and software used to access the device, but most media playback software will run a big enough local cache that it is requesting chunks of data long before it absolutely requires them for playback to work without a hitch, giving the drive more than enough time to wake up if it has parked itself.

> I guess YMMV depending upon exact OS and software used to access the device, but most media playback software will run a big enough local cache that it is requesting chunks of data long before it absolutely requires them for playback to work without a hitch, giving the drive more than enough time to wake up if it has parked itself.

On a Windows 7 machine with lots of RAM and a 3TB WD Green secondary drive, Foobar 2000 music playback and VLC video playback both exhibit the problem behavior. I don't think whatever read-ahead caching those programs do is designed to accommodate the latency involved in getting the drive going again.

Also, consider how a random playlist of thousands of songs works - reading all the songs into RAM isn't really an option. Foobar 2000 definitely does some read-ahead caching, I've noticed that a few seconds of music will sometimes play at the beginning of a song and then pause, as the software blocks waiting for the disk to get going.

I believe you, I just haven't run into this myself and I guess it is correct that YMMV depending upon setup then.

All of my WD Green drives are sitting in a Linux server and exposed as samba shares to various computers (including some running Windows 7) and HTPC devices. My guess would be that this setup results in greedier read-ahead caching by the local OS because of the relative slowness and unreliability of accessing blocks via a network-share file versus what the OS considers to be a local file.

There's probably network layer (ie Samba) caching which would help in this situation, too.

This comment thread is almost entirely FUD.

> B-con is confused. The behavior in question - the drive powering down right away - would actually occur less frequently if I were trying to do something "high performance."

I'm not confused.

a) The drive will only stay on if the performance requirements are constant. But many performance use cases don't ensure that there is constant activity to the drive, so the drive may shut off.

b) The Black line consistently posts better raw benchmarks than the Green line. (The last one I looked at: http://www.legitreviews.com/western-digital-2tb-caviar-green...)

YMMV for your specific situation, obviously, but buyers should know that the Green drives aren't oriented at performance. If they give you what you want, great, but don't buy them and then be surprised when little performance hiccups happen.

There should be disclaimers about that when you buy such HDDs. I wouldn't consider watching videos a performance related activity. I say this is WD's failure at properly marketing their product lines.

I recently bought two Green drives for my NAS not knowing the difference. Do your research :).

Having said that, the NAS is working out pretty well so far with ZFS running on top of them.

Like others have mentioned, please make sure you're using wdidle to regulate head parking. Also might wanna read this http://forums.freenas.org/threads/western-digital-red-with-t... . Greens are dangerous in a NAS context so make sure you have backups.

Odd. The link you posted directly contradicts your point. It says that Reds are at that time unproven and that the Greens are the way to go for ZFS. Basically it is saying that ZFS does not require TLER so that part is not an issue. On top of that it is saying that the Green drives are both common and well tested.

I think I read it slightly differently then, the gist was that if wdidle is properly used, the drives "should" be equivalent (since they supposedly share the same hardware, just different firmware).

Reds are made for a NAS context, greens are not. You may or may not make the difference go away by using wdidle. This does make greens inherently dangerous (when compared to reds).

I've had 4 Green 3TB drives in a raidZ for less than a year. First few months were great, then I had to replace two of them a couple months ago due to data errors and another just started failing so today (literally as I write this) I'm replacing the other two.

Don't use Greens for anything that's on and could be active 24/7. The aggressive load cycling kills them.
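For a rough sense of why aggressive parking matters on a 24/7 box, here's a back-of-envelope sketch. The 300,000 load/unload cycle rating is a figure commonly cited for consumer drives, not taken from this thread, so treat it as an assumption and check your drive's datasheet:

```python
# Back-of-envelope: how quickly an aggressive idle timer (the WD Greens
# discussed here reportedly park after ~8 s of inactivity) can burn
# through a drive's rated load/unload cycles. The 300,000-cycle rating
# is an assumed typical consumer-drive figure, not a measured value.

RATED_CYCLES = 300_000

def days_to_exhaust(parks_per_hour: float, rated: int = RATED_CYCLES) -> float:
    """Days until the rated load/unload count is reached at a given park rate."""
    return rated / (parks_per_hour * 24)

# A lightly loaded 24/7 server that wakes the drive a few times a minute
# can easily park it dozens of times per hour.
for rate in (10, 30, 60):  # parks per hour
    print(f"{rate:>3} parks/hour -> {days_to_exhaust(rate):5.0f} days")
```

At 60 parks per hour the rating is gone in well under a year, which lines up with the failures described above.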

I bought 3 Green 2.5TB drives for a Synology NAS that's lightly used -- and all 3 have had SMART failures in the ~18 months that I've owned them. WD just sent me a refurb'ed Green 3TB drive, let's see if it holds out better.

I wish I had had access to this data before buying the Greens...

I have streamed hundreds of hours of high bitrate video (raw Blu-rays) from WD Green 3TB drives over my home gigabit network, and have never encountered this behavior. I couldn't be more pleased with the drives.

This has not been my experience. We run a bunch of GP drives, 1TB, 2TB, and 4TB, and I have not experienced them spinning down ever unless explicitly commanded to. They park heads for sure, but don't spin down. (These are normal GP, not AV-GP, never tried one of those.)

Is this the "head parking" thing? The Reds are supposedly the same drive, but don't have the head parking algorithm, with tweakable TLER to make them more RAID-friendly, and their rates are pretty acceptable.

That and spin down. Actually, the spin down might be worse.

I've never had good luck with WD drives, from the 1990s through the present. At first, I had no luck with Seagates, then they seemed to do pretty well up until recently, where now I'm seeing some radically bad results. I wish IBM would get back in the HD business.

Well, IBM's HD division was bought by Hitachi, and they're the ones coming out of this smelling of roses, so (subject to what WD do with Hitachi's drive division) your wish is sort-of granted.

Thanks to both you and zonk for pointing this out. I feel my warez will soon be safer on a Hitachi drive.

IBM sold their drive business to Hitachi -- which leads in these published reliability statistics by a large margin.

From the article and throughout the comments here it seems Backblaze prefers cheaper drives over a few percentage points of reliability. It would be interesting to see some data showing the tradeoff, but I suspect it reveals too much of their operation. At first glance it appears you can get a drive with .9% failure rate (HGST 7K3000) for $127[1], and yet BB really likes the WD Red, which has a higher failure rate (3.2%) and cost[2].

What might shed light without revealing too much is information about where they source drives today (their sourcing coverage during the shortage was very cool!). I suspect they're finding some nice bulk discounts somewhere.

[1] http://www.amazon.com/Hitachi-Deskstar-7K3000-HDS723030ALA64... [2] http://www.amazon.com/WD-Red-NAS-Hard-Drive/dp/B008JJLW4M/ (both seem to be market consumer prices)

> What might shed light without revealing too much is information about where they source drives today

Backblaze employee here -> we are willing to buy from anybody, we have no loyalty. Lowest price (for a particular drive model) always wins. Once per month we ask about 20 common suppliers for their "best price". We have bought from "B&H Photo Video", NewEgg, Amazon, etc among others. We're always willing to add more possible vendors, but I think we drop you from the list if the vendor bid prices don't even come close for 3 months - that means you don't understand anything and you're wasting our time.

With the volume you must buy in, why not buy direct from the manufacturers? I imagine they could supply you with their OEM pricing and product packaging, which seems like it would save money? Or is it a strategic reason like not getting stuck with one manufacturer?

I asked that question last time BB employees were here on HN. The answer was they just don't buy enough volume to qualify to buy direct from the manufacturers. It sounds like you must buy enormous quantities before the manufacturers will give you the time.

Unfortunately that's still the case (Yev here, from Backblaze). Minimum orders are around 10,000 units, and we're just not there yet. Thinking about starting a consortium though, so if anyone needs hard drives... ;-)

That is absolutely untrue. Talk to the big distributors, they will also be able to fix your incredibly unstable supply chain.

We currently work with Distributors. I was referring to the manufacturers themselves, like buying directly from Seagate/WD. We currently work with a few different distributors to get different types of drives.

"Backblaze employee here"

CTO, unless that's changed....

Guilty as charged. :-) CTO, head janitor, the company lived in my 1 bedroom apartment's living room for 3 years and up to the first 9 employees.

He's so modest.

BTW, I'm not sure B&H Photo Video deserves scare quotes.

If you've been doing video and/or photo stuff for a long time, you know them as a rock solid distributor: I've been buying video stuff from them since the middle '90s or so, camera stuff more recently (e.g. my first serious camera, vs. Amazon they had better selection with competitive pricing). Joel Spolsky was sufficiently impressed with this home town operation to do a fascinating write-up on them: http://www.inc.com/magazine/20090501/why-circuit-city-failed...

One other note on who to buy from: if you're just buying one or a few drives, Newegg has gotten really serious about packing. http://www.pregis.us/en-us/productsandservices/productsoluti... inflated padding inside a fitting cardboard box. I suspect this is about as good as the packaging Seagate requires to return a drive for warranty service. Don't know about B&H, but as of a couple of years ago Amazon had a horrible reputation for packing bare hard drives.

Ever run into counterfeit drives?

Not that we're aware of, no. We doubt it would pass our testing if it wasn't legit.

If systems are designed with the expectation that hardware can and will fail often, better reliability drives aren't worth the cost as long as cheaper drives are relatively comparable. In addition to cost savings, your system has better robustness when it is decoupled from hardware reliability.

For example, Google's Map Reduce paper has a section on fault tolerance that goes into detail about how they handle the issue of failing workers:


Deltaqueue just posted the prices and there is a $5.00 difference between the two. So you pay 4% more for 2.5% less annual failures. That sounds close enough to be worth paying more for less operational expense. Obviously, backblaze gets a better deal or they would be buying up the Hitachi drives instead.

The AFR is 2.3 percentage points less (0.9% vs 3.2%), which in this case means that a single unit of the inferior brand is 3.5 times more likely to die during a full year of use. I'd love to see their calculations that justifies buying non-Hitachi drives.
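The arithmetic behind that relative-risk claim, using the AFR figures quoted in this thread:

```python
# Relative annual failure risk implied by the AFRs quoted above:
# 0.9% (Hitachi 7K3000) vs 3.2% (WD Red).

afr_hitachi = 0.009
afr_red = 0.032

print(f"difference: {100 * (afr_red - afr_hitachi):.1f} percentage points")  # 2.3
print(f"relative risk: {afr_red / afr_hitachi:.2f}x")                        # 3.56x
```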

I think percentage points is the right metric here. Spending a lot of money to cut down the frequency of a rare occurrence doesn't make sense, even if you can cut it down by 100x.

"Rare" is the key here, thanks. An AFR of 3.2% is already a pretty damn long MTBF. Makes sense now!

I have experience with hundreds of T of data stores. My opinion is very high of Hitachi 1T and 3T Deskstars. The problem is that they are not generally available - there could be months when you just could not order them.

Are you taking into consideration when a drive fails it requires work to replace it?

This could be minimal and something that in terms of budgetary considerations might be negligible - but I'm not sure.

Backblaze employee here -> Yes, this gets a SMALL amount of allowance. The datacenter team begs us to buy the Hitachi drives even at twice the price, but it would bankrupt us. But if the Hitachis are only $2 or $3 more expensive per drive (including the failure rate in that calculation) then we're willing to buy them for the reduced hassle.

I think the calculation is replacing one drive takes about 15 minutes of work. If we have 30,000 drives and 2 percent fail, it takes 150 hours to replace those. In other words, one employee for one month of 8 hour days. Getting the failure rate down to 1 percent means you save 2 weeks of employee salary - maybe $5,000 total? The 30,000 drives costs you $4 million, so who cares about $5k here or there?
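That estimate, spelled out. The hourly cost is my own assumed fully loaded figure (chosen so two weeks of labor lands near the $5,000 mentioned), not a number from the thread:

```python
# Replacement-labor estimate from the comment above: 30,000 drives,
# 15 minutes per swap. HOURLY_COST is a hypothetical fully loaded
# employee cost, assumed here for illustration.

MINUTES_PER_SWAP = 15
DRIVES = 30_000
HOURLY_COST = 65  # $/hour, assumed

def labor_hours(afr: float) -> float:
    """Annual hours spent swapping failed drives at a given failure rate."""
    return DRIVES * afr * MINUTES_PER_SWAP / 60

h2, h1 = labor_hours(0.02), labor_hours(0.01)
print(f"2% AFR: {h2:.0f} hours/year, 1% AFR: {h1:.0f} hours/year")
print(f"savings at 1%: about ${(h2 - h1) * HOURLY_COST:.0f}/year")
```

Against a $4 million drive spend, the labor savings are indeed noise.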

> I think the calculation is replacing one drive takes about 15 minutes of work.

Is that really the true cost of replacement? I would think there is also the cost of dealing with the warranty and the testing and monitoring. Here is the quote from the blog post about unsuitable drives:

> When one drive goes bad, it takes a lot of work to get the RAID back on-line if the whole RAID is made up of unreliable drives. It’s just not worth the trouble.

I don't have the time to think about this fully, but it seems similar to calculating the present value of a future cash flow, because there are other costs beyond the first replacement effort:

> Their average age shows 0.8 years, but since these are warranty replacements, we believe that they are refurbished drives that were returned by other customers and erased, so they already had some usage when we got them.

It sounds like the total cost of a failed drive is actually 1.5x, because 50% of the replacement drives also fail.
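If that 50% figure held for every round of replacement (an assumption, since the thread only observes the first round), the expected number of swaps per original failure is a geometric series:

```python
# Expected physical swaps per original failure, assuming each warranty
# replacement independently fails with probability p. The 1.5x above
# counts only the first round; the full series sums to 1/(1-p).

def expected_swaps(p: float) -> float:
    """Sum of 1 + p + p^2 + ... = 1/(1-p) for 0 <= p < 1."""
    return 1 / (1 - p)

print(expected_swaps(0.5))  # 2.0
```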

You can probably cut the time for dealing with warranty down if you do it in bulk. They seem to have ~50 failing drives per month.

Yup. We used this paper as one of the proofs for why we could continue to rely on economy hardware.

The price we pay as consumers isn't going to be the same as that paid by Backblaze when they buy 100 drives at a time. The fact that they pick the WD Red implies that they're getting a better deal on those, otherwise they wouldn't settle for the higher failure rates.

EDIT: comments from BB employees below actually say they often buy off the shelf from NewEgg and Amazon.

They posted on their blog that they wouldn't get discount pricing until they hit something like a 10,000 drive order.

It's a darn shame. We would love to pay less for hard drives. Even at our current capacity, we're still a "small fish". And since we try to run lean we likely could buy 10,000 drives and keep inventory, but that doesn't necessarily work out for us in the long run as hard drive prices tend to fall monthly.

Yea, unfortunately we aren't large enough to work directly with the manufacturers. We do buy from distributors sometimes, but if we can get a cheaper deal online, we'll go that route every time.

Pity that Australian companies want to charge $60 more for a Hitachi drive over a Seagate.

Definitely wish I had seen this a couple months ago before I bought two 3TB Seagates. Although to be fair, I was already pretty sure that Seagates sucked (it's good to see data backing that up), but getting two for $85 each was too hard to pass up. I'm a sucker for a deal. I buy HDs in pairs now so I'm not too worried about losing anything.

I am intrigued by Backblaze's service though. A part of me feels like there must be a catch somewhere. I have a good 10TB I'd be happy to pay $5/month to back up, but somehow I feel like they'd pull a Comcast and say their "unlimited" claim doesn't apply to the 1% of users (or in this case maybe the 0.001%).

As a user with almost six terabytes of data, I can vouch for their service. I have never received any message asking me to limit the amount of data I upload. I do try to compensate, for the sake of my conscience, by recommending the service to people (with much less data) who previously lacked a backup strategy.


The catch is that the initial upload can take a while if your internet connection has limited upstream – not something Backblaze can help but I'm sure it has made some people think twice before having their external drives mirrored in the cloud.

One idea that works for some people is to carry your entire computer to a "fast connection", for example your workplace. Plug it in, leave it running for 24 hours to power through the initial upload, then carry your computer home again for the incrementals. Just an idea.

My experience has been that the upload speed is slow no matter what connection you are on. They claim not to throttle, but I am skeptical that you will get more than 1.5 megabit anywhere. I couldn't.

They actually ship a bandwidth test that goes to their datacenters. If I ran multiple instances of the test at once I got 1.5 megabits for each instance. It's not a bandwidth or capacity issue...

If they documented 1.5 megabits I wouldn't complain, but claiming one thing and shipping a product that throttles, whether explicitly or implicitly, is obnoxious.

I did a restore recently and was pleasantly surprised to get 9 megabytes/sec down, so that is much better, and restore speed is a higher priority.

However if I had to upload the entire data set from scratch I would stop using Backblaze.

I work at Backblaze -> we absolutely do NOT throttle.

Inherently each pod has some built in limitations, for example it has a 1 Gbit/sec network card, so you won't be able to exceed that. But many, many customers get 100 Mbits/sec upload speed, I can show you our internal chart if you like.

I'm glad you got a fast restore download, we have been BATTLING Comcast in recent weeks, between 5pm and midnight downloading restores from Comcast has been trickle slow. Some people claim this is due to Netflix traffic, but I think some system admin somewhere is not doing their damn job. We think we have finally figured out a work around just this morning...

That's definitely not true; on a 1 GigE connection I was uploading at well over 200 Mbit/sec.

Do you buy the same drive model when buying in pairs? I've anecdotally had 2 identical drives die of the same type (manufacturing batch) at the same time.

I do generally buy the same model. I keep the second one completely offline and do occasional manual backups. I'm hoping that keeping it completely offline freezes its likelihood of failure.

Any stats on power consumption? Over 5 years the difference between a drive that uses 6 Watts and one that uses 7 is 44kWh or about $5. Double that to include cooling costs and saving a Watt should be worth something like $10 to you, so a more expensive more efficient drive could be worth it. Do these drives all use similar amounts of power?
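
That arithmetic, sketched out (the ~$0.11/kWh electricity price is my own assumption based on typical US rates; the "double it for cooling" multiplier is the comment's):

```python
HOURS_PER_YEAR = 24 * 365  # 8,760

def extra_power_cost(extra_watts, years=5, price_per_kwh=0.11,
                     cooling_multiplier=2.0):
    """Cost of drawing `extra_watts` more, over `years`, with the
    comment's 'double it to include cooling' as the default multiplier."""
    kwh = extra_watts * years * HOURS_PER_YEAR / 1000
    return kwh * price_per_kwh * cooling_multiplier

# 1 W for 5 years is ~43.8 kWh; at $0.11/kWh, doubled for cooling,
# that's roughly $10, matching the comment's estimate.
print(round(extra_power_cost(1), 2))
```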

Cooling costs are an extra one third (and even that is only during the summer), not even close to double.

Do you have a source or at least an explanation for that? I've always thought it took at least as much energy to cool something as it did to heat it (because AC is not 100% efficient), but I've been pretty wrong about thermodynamics in the past so I'd like to dive a bit deeper here.

Maybe this is only true for AC, and not water cooling being pumped to passive radiators on the roof?

EDIT: Here: https://www.google.com/about/datacenters/efficiency/internal..., Google is claiming a 'Power usage effectiveness' of 1.11 across their data centers. This implies a cooling cost of at most 11% on top of their server power, which I found quite surprising.

It's pretty easy to calculate. Just find a window A/C and check the BTU it's rated for and the wattage.

Convert BTU to watts, then do wattage to power the A/C divided by the watts of cooling power.

Then note that this is the worst-case scenario; normally it does better than that, except on the very hottest days of the year.
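
Worked through with a typical window unit (the 5,000 BTU/h rating and 450 W draw are illustrative numbers I'm assuming, not figures from the thread):

```python
BTU_PER_HOUR_TO_WATTS = 0.293071  # 1 BTU/h is about 0.293 W

def cooling_overhead(btu_per_hour, input_watts):
    """Fraction of the heat load you pay again in A/C electricity.
    This is 1 / COP (coefficient of performance)."""
    cooling_watts = btu_per_hour * BTU_PER_HOUR_TO_WATTS
    return input_watts / cooling_watts

# A 5,000 BTU/h window unit drawing 450 W removes ~1,465 W of heat,
# so the overhead is ~0.31: the "extra one third" claimed above,
# not a doubling of the power bill.
print(round(cooling_overhead(5000, 450), 2))
```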

Hard drives are going to run hotter than the outside temperature even in the middle of the summer, so they are going to get quite a bit of natural cooling for free. The cooling they have to do is just a supplement to the cooling that happens naturally.

I wonder if any of this actually applies to consumer-grade drives.

My wife's hard drive actually just died. It was 160 GB WD in a black 2006 MacBook. The drive itself was a replacement from 2007 since the original drive died just over a year into its life.

Stupidly, since her Time Machine backup was misbehaving, I reformatted it and set it to start over. I spent the weekend recovering her data—with a lot of success, so no big deal. At any rate, this machine is long past its expiration date. It's time for a MacBook Air with an SSD, once the tax refund comes in.

Yev from Backblaze here -> All of our drives are consumer-grade. We try to avoid buying enterprise drives at all costs. These are all off-the-shelf internals and in some cases...externals that were made internal! :)

> and in some cases...externals that were made internal! :)

I remember reading about this a while back[1] (fun read!).

Have you guys noticed a big difference in life expectancy of the repurposed external drives or do they generally match up with their internal equivalents?

[1]: http://blog.backblaze.com/2012/10/09/backblaze_drive_farming...

Interestingly, we think they do as well if not better (http://blog.backblaze.com/2013/12/04/enterprise-drive-reliab...). We think it's because external drives are meant to be jostled around by Joe Schmo, so they do well in our storage pods, whereas internal drives and especially enterprise drives are meant to live in a perfectly Utopian environment.

Have you seen a different failure rate between the drives that were sold as internal vs. the drives that were sold as external and "shucked" ?

Not enough to measure; it turns out that once you get their case off they are mostly the same. The real difference is "enterprise" grade vs. "consumer" grade drives.

Do those quotation marks mean "enterprise labelled", such as the WD RE4-series? Are there good data on how they perform vs the consumer series? By good I mean like the data your company provided, samples of thousands of drives.

The MTBF is typically much higher than regular HDDs, and their reliability is probably at the level described for Hitachi (can't give actual stats though, my experience is limited to hundreds not thousands)

Those drives aren't designed for HDD farms though. They are meant to be used in environments where a failure is a big problem and paying a lot more for a bit more reliability is worth it. Like that 1U server you got in a datacenter with 2 disks in RAID1.

If you have thousands of disks replacing is part of the daily routine, and isn't an issue at all (as numbers suggest in article/comments)

That isn't why you buy enterprise drives. Or, at least, I've spent several Porsches worth of money that I could reasonably have spent on Porsches on enterprise drives, and that isn't why I bought enterprise drives.

I buy all enterprise drives. Not because I can't handle replacing a disk; I live within walking distance of the coresite santa clara location where most of my servers are, and actually kinda enjoy that sort of thing. (Yes, yes, I'm sick. But what of it?)

I pay double for 'enterprise grade' drives because more often than the consumer-grade drives, they fail clean.

That's the thing... what does it mean to have a drive "fail?" The vast majority of my failures with consumer grade drives just /degrade/ rather than outright failing. They get shittier and shittier over time. And yes, with sufficient software you can detect this and automatically fail them, but I don't have that. the "enterprise sata" stuff? More often than not, the things actually fail before they degrade to the point where I notice them causing problems with other things.

I buy enterprise grade, not because they last longer, (In fact, I see no evidence that they do) but because they tend to work or fail, whereas consumer grade drives exist on a continuum between "working" and "failing"

(Of course, even the enterprise stuff isn't 100%... but it's much better.)

Well, nowadays I buy them because that's the only way you get a 5 year warranty; I only build systems with 5 year design lifetimes (even for the parents; saves a lot of hassle).

Although I have seen some evidence of consumer grade drives "failing dirty". E.g. in 2002 I tried a couple of 5 year warranty Seagate Barracudas in a new machine; it didn't take long for one of them to outright fail a 4K portion of the disk. I actually wrote a little C program to recover everything but that one bit (and it was a file I could then recover the missing data from), and switched back to SCSI enterprise drives for my main system drives and haven't looked back (granted, those are as fast as I can buy, and that means SCSI enterprise). And that includes a couple of machines I built for a non-profit that are exactly what datphp describes.

And, yeah, even these best-of-the-best drives do fail; at least one of the Seagate Cheetah 10K drives I bought back in 2002 completely failed, I think both eventually, the second after 5 years had passed.

>I actually wrote a little C program to recover everything but that one bit (and it was a file I could then recover the missing data from),

'ddrescue' is a program I use in similar situations. (not to discourage you from writing your own; doing that sort of thing leads to a deeper understanding of what is going on.)

I was still stuck in Windows (Windows 2000), and it was quite a bit faster to write the program than find a solution on the net like that on a live CD, download it, etc. Having started on C in 1979 (sic) I knew it cold by then.

But, yeah, confirming exactly what was the problem was good, especially since it was extremely odd. All blocks readable except for those 8, although the drive knew it was in trouble.

It reminds me of problems others have reported, especially those using high level error checking file systems like ZFS, where the drive accidentally writes to the wrong location correct data correctly, so the internal CRC passes on reads. The firmware in drives is said to be getting to full OS complexity....

I can understand the cost/reliability tradeoff given a data storage mechanism that provides enough resiliency, but how do you provide that? Are the drives just single-drive FSes with secret sauce atop it, or commodity RAID-N with something doing resiliency atop that, or ?

> We try to avoid buying enterprise drives at all costs.

Why is that? From what I can read on your website, it seems you would be the correct use case for enterprise drives, no?

We designed our storage pods and software to work in spite of hard drive failures, so paying a $50-$100 premium for a drive just to avoid possibly $5 in labor to replace it when it fails is not a good practice.

We go for the good cheap stuff, and that's how we maintain low prices for our actual product, which is online backup. As long as the drives are reliable-ish, and are inexpensive, that's what counts!

That assumes that you don't think enterprise drives are just marketing ploy, to get execs to part with more of their money for no real benefit.

Any laptop with a mechanical hard drive makes my hair stand on end. In 2014, it's just an unnecessary liability.

When my girlfriend and I started getting serious a few years back, the first thing I did was to replace her aging MacBook Pro with a SSD MacBook Air. It was the only way the relationship could move forward.

Interesting, I bought my wife a Hitachi and it improved the relationship.

I see what you did there.

If you build good hardware...you build good hardware.

I love love LOVE the SSD drive in my laptop for performance reasons (I can't ever go back, seriously). Emotionally I'm also glad it doesn't have spinning parts anymore, but in the end your laptop can still get stolen or break or fail. No matter what, in 100 percent of all situations, you need a backup. (Yes, I work at Backblaze, but seriously, you don't have to use our product, JUST USE SOMETHING!!)

Yes, I also forced her to get a Dropbox account. That's how she knew the relationship was getting serious.

In 1999, I knew my new girlfriend was the right one for me when she was excited about one of those fancy "Airports" from Apple. That Christmas I bought her a Palm IIIx.

We've been married for a dozen years.

I helped my in-laws set up a new iMac last fall (an upgrade from their ancient Dell tower running XP) and made sure to install Dropbox. Sure they may never use it, but it's only a matter of time until I need to do some form of tech support, and it's going to make my life immeasurably easier.

Just signed up for BB :)


SSDs have a slightly higher failure rate than mechanical drives, and when failed are mostly unrecoverable. Here is a report of massive failure rates (>50%) with OCZ drives: http://lkcl.net/reports/ssd_analysis.html

Right now the safest thing to do is to store at multiple cloud providers, where they have proper measures to manage disk failures.

That's more or less exactly how my wife (then girlfriend) got this MacBook. Except replace "aging MacBook Pro" with "2004-era eMachines tower from Walmart". I gave it to her in 2008, when it was still a fairly decent machine, but that junky eMachine was already junk. I had actually forgotten about that old computer until I actually spotted it in some of the photos I was recovering over the weekend.

And for what it's worth, last Thursday, before all of this happened, I had brought up off-site backups, with Backblaze in mind.

"If the price were right, we would be buying nothing but Hitachi drives."

I don't understand why they don't. Are the Hitachi drives really that much more expensive so that it doesn't justify their vastly longer lifespan? Even if they can get "free" replacement disks during the warranty period, that has a cost for them. And they mentioned that some replacement disks die even faster.

I'm sure Backblaze has crunched all these numbers - would love to see them. BTW thanks for sharing this data!

Looking at PCPartPicker (no drive shucking) Hitachi drives are at least $.02 more per GB than similar (7200RPM >= 1TB) WD and Seagate drives. Which means a difference of ~$20 per TB of storage, which at 1 pod (180TB) every few weeks means they're saving something on the order of 3.5k every few weeks using the cheaper drives. How much the additional failure rate costs them would be wild speculation, between RMAs/warranties and labor there are lots of assumptions to make, so I'll stop there.


    Hitachi 0F10311	Deskstar 7K2000 (7200RPM) : $0.059/GB

    Western Digital WD30EFRX RED (5400RPM) : $.044/GB

    Seagate ST3000DM001	(7200RPM) : $.034/GB
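
Plugging those listed prices into the pod math (180 TB per pod, per the comment above), as a sketch:

```python
POD_TB = 180  # one Backblaze storage pod, per the comment

def pod_drive_cost(price_per_gb, tb=POD_TB):
    """Drive cost to fill one pod at a given $/GB."""
    return price_per_gb * 1000 * tb

hitachi = pod_drive_cost(0.059)   # Deskstar 7K2000
seagate = pod_drive_cost(0.034)   # ST3000DM001
# At these list prices the Hitachi premium is $4,500 per pod
# (the comment's "~3.5k" used the $0.02/GB minimum gap instead).
print(round(hitachi - seagate))
```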

Backblaze employee here - it is honestly just a spreadsheet that kicks out the answer. Every month we ask 20 or so suppliers for the lowest price for each drive type. If Hitachi is 10 percent more expensive but fails 10 percent less often, that balances out and we buy Hitachi. But if it is 12 percent more costly, then we get the other brand. There is a tiny bit of free preference leeway given to Hitachi because it means less hassle for our overworked datacenter team...

If I'm not reading it wrong then your data says that the Hitachi drives have half the Annual Failure Rate, or less, than the others (in your setup). Not sure what this means in MTBF but the Hitachi's sure seem to be worth a whole lot more, certainly 10, 20 or 30 percent more - no?

I don't think the math works. I posted this above:

I think the calculation is replacing one drive takes about 15 minutes of work. If we have 30,000 drives and 2 percent fail, it takes 150 hours to replace those. In other words, one employee for one month of 8 hour days. Getting the failure rate down to 1 percent means you save 2 weeks of employee salary - maybe $5,000 total? The 30,000 drives costs you $4 million, so who cares about $5k here or there?

The $5k/$4million means the Hitachis are worth 1/10th of 1 percent higher cost to us. ACTUALLY we pay even more than that for them, but not more than a few dollars per drive (maybe 2 or 3 percent more).

Moral of the story: design for failure and buy the cheapest components you can. :-)
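
That back-of-the-envelope labor estimate, sketched out (the $40/hour loaded labor rate is my assumption, not a Backblaze figure):

```python
def annual_replacement_hours(fleet_size, afr, minutes_per_swap=15):
    """Tech time per year spent swapping failed drives."""
    return fleet_size * afr * minutes_per_swap / 60

hours = annual_replacement_hours(30_000, afr=0.02)
print(hours)        # 30,000 drives at 2% AFR: 150 hours/year of swaps
print(hours * 40)   # ~$6,000/year at an assumed $40/hour loaded rate
```

Against a ~$4 million fleet, halving that labor bill is indeed noise, which is the commenter's point.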

Ok, after converting to MTBF the numbers make more sense: An AFR of 0.9% means a MTBF of 968947 hours (111 years). An AFR of 3.2% means a MTBF of 269346 hours (31 years).

I guess an MTBF of 31 years is plenty for your needs. Thanks again for sharing the data.
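
The AFR-to-MTBF conversion used above, assuming a constant failure rate (an exponential lifetime model):

```python
import math

HOURS_PER_YEAR = 8760

def afr_to_mtbf_hours(afr):
    """MTBF implied by an annualized failure rate, assuming a
    constant hazard: AFR = 1 - exp(-HOURS_PER_YEAR / MTBF)."""
    return -HOURS_PER_YEAR / math.log(1 - afr)

# Matches the figures above: ~111 years and ~31 years respectively.
print(round(afr_to_mtbf_hours(0.009) / HOURS_PER_YEAR))
print(round(afr_to_mtbf_hours(0.032) / HOURS_PER_YEAR))
```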

I think the failure rate will go up in old age. I just don't see those drives still working in 100 years.

The cost you see on Newegg probably isn't the cost Backblaze sees. They're getting bulk pricing, which WD is probably more eager to sell than Hitachi.

No, we are not getting bulk pricing. At least not really by much. We sometimes still "farm" where employees all visit Costco to buy drives RETAIL, we also buy from Amazon and NewEgg. Honestly, we just cannot seem to get good deals on drives! If anybody on earth can get these magically better prices, we are begging them to come forward, we promise to give you 99 percent of the cost savings. Beat our current suppliers by 1 penny per drive and you take all our business. We're buying about 800 drives per month, shut up and take our money! (Each drive is 4 TBytes)

Too bad you still don't have a Linux client. Do you think you'll support Linux users anytime soon?


Well, it's off the subject, but it's still an interesting question and I was wondering the same thing.

Also see hardware.fr failure rates. It shows different data than Backblaze.

French hardware site, component failure rates. Google translate it to english http://www.hardware.fr/articles/911-6/disques-durs.html

So it's also important to take hard drive models into account.

Then there is the Google study Failure trends in large hard drive population http://static.googleusercontent.com/media/research.google.co...

> We are focusing on 4TB drives for new pods. For these, our current favorite is the Seagate Desktop HDD.15 (ST4000DM000). We’ll have to keep an eye on them, though. Historically, Seagate drives have performed well at first, and then had higher failure rates later.

I'm a little surprised that they actually did the analysis to determine the Seagates tend to fail more, yet they are still putting most (or at least, quite a bit) of their faith in those.

Based on their own data, I would likely avoid those, or at least start leaning more toward Hitachi and WD.

Or maybe the initial cost of those is so much better that it compensates for any long-term expense.

You also have to consider that when the drives eventually fail, they will be replaced with hard drives of the future -- which will presumably be cheaper than the HDD of today. I.e. they depreciate quickly.

Interesting. I had always avoided Hitachi Deskstars after having heard they were nicknamed "Deathstars" for a reason. Perhaps that was once true, but clearly it's not anymore.

Apparently that was 2001 and they were IBM Deskstars at the time [1].

I guess this is why companies sometimes decide to rebrand/rename products.

[1] https://en.wikipedia.org/wiki/HGST_Deskstar#IBM_Deskstar_GXP...

If you avoid Hitachi due to that bad batch and Seagate due to their bad batch and WD just because, you're left with nothing.

To my understanding (see my other comment in this subthread for a bit more), neither Seagate nor WD in at least recent times (21st Century?) screwed up even close to that badly.

I suspect those infamous disk failures were caused by faulty manufacturing runs, rather than some inherent flaw in the design.

I worked in platter sputtering at IBM right out of College during the Death Star period. Basically, IBM had multiple parts manufacturing and sputtering facilities all over the world. The best disks/heads/motors always went in to the SCSI drives and the lesser parts went in to IDE drives. SCSI drives of that period were fairly bullet proof. Desktop IDE drives, not so much. Most of the San Jose and Munich production had higher yields and were mainly for SCSI and laptop drives. So it was less faulty manufacturing runs than the parts came from lesser production facilities, usually Singapore and the Philippines. They ended up closing the sputtering plant in the Philippines as they were never able to get their test yields up high enough...

I bought an IBM Deskstar and went through five returns over the course of a couple of years - across two different generations of drives! (The 75GXP and the 60GXP, I believe) I will say their warranty service was generally not bad... they seemed to get a lot of practice.

The interesting thing is that what you say strikes me as almost certainly true, in a narrow way: there were probably some manufacturing runs that were a lot better than others. The proportion of bad runs to good runs is significant.

IIRC, it was specific to the 38GB model, and was traced to a batch of bad electronic components.

No, it was the whole 75GXP set (actually, according to Wikipedia, the 120GXP and 180GXP were also affected to a far lesser extent).

The backlash was really nasty, though, because IBM's drives were something a lot of us had come to trust and rely on, the 75GXP was kind of a flagship drive, and IBM's handling of the issue was less than stellar. We felt betrayed.

I never have bought another IBM (or Hitachi) drive. The reaction is visceral and intractable.

Ah yes. I think I had the jumper on mine set to limit the drive to 38GB. Luckily, mine was OK.

Well, there was a period where a company that failed that badly apparently lost so much business it was forced to sell the division. The IBM Deathstars (which I missed buying by the most minuscule fraction of time, phew) resulted in a sale to Hitachi; Maxtor, I gather, went to Seagate.

A better presentation of this data would show a failure rate for each brand and month/year of purchase.

For extremely simple devices like resistors or incandescent light bulbs, the failure rate is relatively constant over the lifetime of the device: the chance of a functioning resistor with 10 hours of use failing during the next hour is the same as for a functioning resistor with 1000 hours of use.

For complex devices with lots of interdependent parts, some of which are mechanical, the failure rate changes over time. There's an "infant mortality" or "lemon" phenomenon, where relatively new devices have higher defect rates (because fabrication and shipping sometimes result in imperfections which quickly cause failures), followed by a steep dropoff in failure rates (because observing a device operate correctly for dozens of hours is strong evidence that it doesn't suffer from a failure mode which often results in infant mortality).

Then there may be an increase in failure rates later, especially with devices that are partially or wholly mechanical (wear or damage type problems which do not cause immediate failure, but make it easier for a failure to occur).

You need empirical data to be quantitative about this curve, and it sounds like Backblaze has it, but their presentation in this article doesn't show it.
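
The shape described above (early lemons, a low flat middle, then wear-out) is the classic bathtub curve. A Weibull hazard function can sketch either end of it; the shape and scale values below are purely illustrative, not fitted to any drive data:

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate h(t) = (shape/scale) * (t/scale)^(shape-1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

# shape < 1: hazard falls with age (infant mortality / lemons)
# shape > 1: hazard rises with age (mechanical wear-out)
for years in (0.5, 1, 3, 5):
    early = weibull_hazard(years, shape=0.5, scale=10)
    wear = weibull_hazard(years, shape=3.0, scale=10)
    print(f"{years}y  infant-mortality={early:.3f}  wear-out={wear:.4f}")
```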

As I recall one of the studies of a few years ago, the one based on supercomputers, not Google's, showed there was very little infant mortality, and wear clearly set in after roughly one year in service. The results were quite striking, and nothing like the bathtub curve many expected and that you sort of sketch out.

This is a pretty textbook perfect application of survival/time-to-event analysis. Any chance the data behind it could be made available for teaching purposes?

Yev from Backblaze -> Where/what do you teach? We're unsure about releasing any more information at this time, but we're not opposed to it. What would it be used for?

I'm a postdoc at the moment, but I've taught survival analysis before. It just struck me as a particularly straightforward example with pretty clean data, a visible separation of the different groups, etc.

Basically, lots of people use the "Iris" data set to learn either data visualization or machine learning based classification. The same could be true of the "Backblaze Hard Drive" data, but for survival analysis/time-to-event statistics courses.

Very cool! What's a good way to reach you? I see your handle doesn't have an about section. My handle is my twitter handle, if you would like to ping me and we can chat a bit more!

Email sent.

Oh, I'd love for you to just dump this somewhere with a CC-BY license (attribution required, free use/mix/share) -- a wonderful way of marketing Backblaze to up-and-coming statisticians and students everywhere ;-)

(Just have the required attribution include text like: "Backblaze is a service that provides unlimited backup (link)" ... etc.)

I can't see it harming you in any way -- your business is already wonderfully transparent -- if any mystery remains to be figured out by would-be competitors it would be in the data center/bandwidth area, not the price of hard disks. But it would be really interesting to play around with such data.

Maybe you looked at Schroeder and Gibson from Fast07, which looks at MTBF calculations. They have lots more data, but none released. Hard drive failures are an easy example of the use of the concept of hazard to examine probability distribution functions. I'm not teaching this now, but I sympathize with the difficulty of finding good datasets. Who knows? If you group data by pod, or even location in datacenter, there could be something interesting in exploratory analysis.

In my limited knowledge, would this be a generalized linear mixed model to look at factors influencing drive failure, or more of a Poisson thing?

You could use any number of models to model the time until drive failure: accelerated failure time models, a Poisson model, a Cox proportional hazards model, Kaplan-Meier curves...
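
For instance, a minimal Kaplan-Meier estimator over (age, failed?) records; the toy lifetimes below are invented, not Backblaze's data:

```python
def kaplan_meier(records):
    """records: list of (time, observed) where observed=True means the
    drive failed at `time`, False means it was still alive (censored).
    Returns [(time, survival_probability)] at each failure time."""
    records = sorted(records)
    n_at_risk = len(records)
    surv, curve = 1.0, []
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = sum(1 for tt, obs in records if tt == t and obs)
        removed = sum(1 for tt, obs in records if tt == t)
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
        i += removed
    return curve

# Toy drive lifetimes in years; False = still running when observed.
data = [(1, True), (2, False), (3, True), (4, True), (5, False)]
print(kaplan_meier(data))
```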

I wish Backblaze would provide some sort of Amazon S3 competitor, Amazon always seems very overpriced.

Hypothetically, what would you be looking for in an S3 competitor? Specifically:

1. How much data do you have?

2. What would you use it for?

3. How important is it if the API isn't the same as the S3 API?

4. Any specific certification requirements?

5. Would you have any SLA requirements?

6. Specific performance metrics? Think Amazon S3 vs. Glacier.

7. Are redundant data centers important?


I think the main problem is cost; Amazon's prices prevent startups like Everpix from existing, as they do for video startups and various other startups that require a lot of storage. I think Digital Ocean's disruption of the very established VPS market is the model to follow.

Personally I'd like to have all my data in the cloud.

1. 300GB growing at 25-50GB/year

2. Backup / cloud storage

3. Possibly, though I'd like SFTP access really

4. no

5. Don't lose my data; probably not down for more than 3-4 hours in one go.

6. Enough to stream video to back to my box

7. Not really, I'd assume the chance of data center being down is pretty small.

Hitachi 500GB USB: $2.40/month (based on a $60 drive with a 2-year warranty)

Backblaze: $5/month [unlimited]

Crashplan: $5/month [unlimited]

Amazon: $22.80/month

Dropbox: $50/month [500GB] (though I still need a local copy)

Digital Ocean instance: $320/month

I think there is opportunity to come in at half of Amazon's price or less and that could lead to a new set of start-ups that could build on that.

Thanks for taking the time to answer! Interesting use-case!

Interesting to see that kind of a difference between hitachi and western digital, given that WD owns HGST. Are hitachi drives marketed as higher reliability drives, or was the acquisition by WD simply too recent for the quality of the two brands to "equalize"?

Backblaze's data shows that the number of errors is largely related to the age of the drive. Because the older drives are from before Hitachi was acquired by WD, it is going to take a few more years for the brands to equalize if they do combine the manufacturing of both lines.

Do the Thailand floods make any difference to this report? How reliable were those drives, and are the factories back up to full speed yet?

The factories have been at full speed for the past year now.

You wouldn't know it by the prices.

The prices are pretty much where they were before the flood. Hard drive capacity has just gone up a ton. I bought a 1 terabyte drive before the flood for $60, and that is pretty much where it is at now, if not cheaper in some cases.

Yes, but the problem is, hard drive $/GB has not recovered to the pre-flood trendline, even if prices have (4 years later) finally hit parity. The trendline has been badly hurt by the floods, it's pretty amazing. (I did a little tracking and was impressed how my predictions of the consequences were still insufficiently pessimistic: http://www.gwern.net/Slowing%20Moore%27s%20Law#kryders-law )
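
A sketch of that trendline argument; the starting price and the ~40%/year pre-flood decline rate are rough assumptions of mine, not figures from the linked post:

```python
def projected_price(start_per_gb, annual_decline, years):
    """$/GB after `years` if prices keep falling `annual_decline` per year."""
    return start_per_gb * (1 - annual_decline) ** years

# Hypothetical: $0.05/GB in 2011, falling 40%/yr on the old trendline,
# would imply ~$0.006/GB four years later. If actual prices merely
# returned to ~$0.05/GB in the same window, the trendline gap is large
# even though nominal prices "recovered".
print(projected_price(0.05, 0.40, 4))
```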

Economies of scale haven't recovered to pre-flood levels, because of the crisis in developed countries and because SSD is quickly growing its share of the market.

These are completely different technologies and their R&D spending and amortisation is separate.

> because of the crisis in developed countries

You mean that crisis which started in 2007?

> because SSD is quickly growing its share of the market.

Aren't SSDs still like 5x more expensive on a per-GB basis?

Yep to both.

Still, the crisis makes recovery harder for every market. And SSDs being 5x more expensive just became acceptable as download speeds have kind of stalled worldwide, and video resolution is not as demanding (compression also improved). Past 100GB or so, the utility curve enters into diminishing returns pretty strongly.

It used to be the case that media demanded exponentially more storage. Not anymore. You look at the average size of movie torrents for instance, and they have slowed down their growth drastically. The benefit of bigger files in practical terms is not as apparent anymore.

The cloud has also taken over for ratpacking download-and-forget data.

SSDs are already the better option for a lot of people (and still on the rise) in the throughput-latency-cost-space equilibrium. It's quite obvious just by looking at the stuff OEMs are shipping.

This is all vague reasoning compared to a huge flood crisis screwing up the supply chain for years. Why should I believe your claims about 'the cloud' and 'diminishing returns' rather than an amply documented crystal-clear cause?

These are not claims you need to believe, it's overwhelmingly evident that there have been changes in the consumer landscape towards less demand for storage space above what they already have. People are not filling up their drives as they used to.

The proof is the % share of SSD drives selling at retail, regardless of the massive difference in cost/GiB. More people prefer a faster 250GB drive over a slower 4TB, because 250GB is still plenty for the average consumer and speedy computing wins the bargain.

It's evident - but it's a trend which has always been happening; returns are always diminishing. Why attribute the sudden, abrupt break in a years-long exponential that we have observed to this gradual effect, rather than to, say, an abrupt, unprecedented flooding crisis affecting experience curves?

I don't think the effect of the floods is lasting so much. It just so happened that the disruption of SSD started to be noticeable around the same time of the aftermath of the floods.

And by floods, I mean the speculative movements they caused rather than the real effect on supply.

It doesn't help that the price per GB for SSD's didn't really go down in the same period.

Great post, as you get bigger populations of drives you can get a lot more visibility into their overall reliability. If there was one thing I could add to the analysis, it would be to split the drives out by serial number range and by firmware. Sometimes you find that all of the 'problem' in a set of problem drives is a single range of serial numbers.

We've had similar experiences with replacement drives, they are, by and large, significantly less reliable than "new" drives.

And last bit, we've got Western Digital drives here (a mix of 2.0 and 3.0 TB ones) They have been pretty solid performers for us.

HGST (Hitachi) has been bought by Western Digital. One should be able to expect a merge of their HDD lines (source: http://www.hgst.com/).

Moreover, it seems that Deskstars are no longer manufactured (or have been rebranded). http://www.hgst.com/hard-drives/product-brands

Aside from the incredible usefulness of the data herein -- thanks, backblaze! -- this is also the kind of marketing that I don't mind.

Backblaze got egg on its face yesterday on HN when someone's critical report (rightly) got upvoted. Today they make up for it by giving us an interesting and useful data chart.

I know it sounds weird, but for whatever reason I read this as demonstrating a high level of corporate responsibility and attunement to customers. They could have compensated by instead, say, dropping a few grand on buying journalists and 'reviewers', like Microsoft does. But they didn't. Instead, they were cool. To me that signals that they'll also take care of whatever problems they've had recently. (Note: I have no affiliation with Backblaze.)

I'm currently not shopping for an online backup service but if that ever changes, I now have a good feeling about Backblaze, and I hope that other services take a similar approach to repairing customer relations when they are fraught.

Shit happens, even to backup providers. It's how you respond that matters most.

Interesting note on the WD 3TB Green.

I have one in my rig, and whenever I do any disk access, I always have to wait about 5-8 seconds for it to spin up every other half hour. It seems to aggressively turn off. I have my base system on an SSD, and my games and other things on this 3TB drive.

I guess such spin up times would be unacceptable in backblaze's environment.

This makes me feel better about the 4 HGST 4TB drives I just bought (http://www.newegg.com/Product/Product.aspx?Item=N82E16822145...). They were the cheapest 4TB 7200RPM drives on Newegg by a non-trivial margin.

<3 <3 <3 <3 <3 Hitachi drives. If we (Backblaze) could get them at a rate within a few dollars of the other manufacturers at the 4TB level, we would solely buy them.

Surely not? If a specific model happens to have a large defect rate, having all your eggs in one basket could prove absolutely disastrous.

Spreading your drives out, even among inferior options, seems like the only solid strategy.

Normally diversification is absolutely the way to go, but once we've found a drive with low failure rates across the board, we'd want to move mostly to using that one drive type. We can always move over to another type of drive, but there's a very real cost of having to swap drives out as they fail, so the more reliable the drive, the more incentive to stick with it until it's no longer reliable. At the moment, we're more concerned about price, so we buy a wide-variety, but we have our favorites :)

> Normally diversification is absolutely the way to go, but once we've found a drive with low failure rates across the board, we'd want to move mostly to using that one drive type.

With regard to deployment, but also with regard to the drive reliability numbers you guys are putting out there: do you worry that variation between manufacturing runs is going to hose you? Do you find looking at the drives that you're getting a good variety of hardware from multiple manufacturing runs?

If you're buying something to run for 5 years x 365 days x 24 hours, how big a percentage saving is worth it?

In the last two years, I have stopped buying Seagate drives entirely. We had a rather large rash of failures in RAIDs and desktops (50% of about 50 drives). Then the "every drive shipped by HP is failing" problems with the netbooks (50 drives replaced out of 70 total) were also Seagate drives.

We are basically a WD house now.

Is this to counter bad pr from yesterday's story about that guy whose files you lost? Because it looks that way.

No matter what hd you use, if you are corrupting files, it's all the same. Same with Evernote, if their sync is losing notes, and it is, everything else is less important.

Sean's issue was unfortunate. I think he posted the blog a little prematurely; our support is in contact with him and we're trying to figure out what happened. We've restored a lot of data over Backblaze's lifetime (over 5 billion files), and normally the .zip restores are rock solid, and we didn't have any outages yesterday. We're trying to collect his logs and see exactly what broke down. He said that he'd update his posts after the issue was resolved, so keep an eye on it!

I will, but it casts a serious shadow over your service :) however awesome the other aspects are

It happens. Computers are weird things, especially when networking is involved. We always recommend having a local and an off-site copy of data. That way you minimize risk, we're hoping we can get him back up and running soon!

> "We always recommend having a local and an off-site copy of data"

Sound advice.

I agree and thank you for quick response to my comment.

>Computers are weird things...

That's not very encouraging, coming from a data storage company employee.

It might not be encouraging but it is honest and it is factual. No company, however much you might wish it, can do magical things. At best, you can leverage the laws of large numbers to do things which are apparently magical to those people who do not have an understanding of maths, technology and engineering. This is exactly how stage magicians accomplish their tricks. A solid understanding of physics and high precision engineering.

A data storage company losing data isn't confined to the realms of Magic.

It's incompetence, and lack of backups on their part.

The company that suggests you take backups of your data on their servers does not keep backups of their data. How humorous.

The company that prides itself on profiting by the skin of its teeth ($5/mo for unlimited data storage and transfer), and there are issues that arise from that severe cost cutting. Surprise, surprise.

You've never had a problem solved by turning it off and on again?

Not with a real operating system like Linux, no. I have never truly solved a computer problem by rebooting a computer.

I sure hope that's not the modus operandi at BB.

70% of the time it works all the time? We haven't yet gotten the logs so we don't know what happened, but we take this stuff very seriously. At the end of the day we're a backup company, it's a serious business and even though we try to have fun and be entertaining with these blog posts and stats, our core job is to ensure folks don't lose data.

That dude's story yesterday concluded with switching from one $5/mo/unlimited provider to another. I'm guessing we'll see a post from him in another year bitching about Crashplan and then maybe he'll realize that $5/mo is too low a price to pay for an unlimited amount of data to backup reliably.

I agree. If he's only willing to pay $5 per month, the data can't be worth THAT much to him. I don't think it casts a "serious shadow" on anything. Backblaze seems like an awesome customer-focused company.

Posts like this are the reason why I trust Backblaze.

In general I'm very suspicious about the unlimited offerings. When they tell the technical details I get the feeling they are up to task and not just reselling S3 and hoping people pay but don't actually use the storage.

I'm just an amateur at statistics, but I would think this would be a useful set to do generalized linear mixed models on, to see what factors could be statistically significant (manufacturer, model, factory location, etc etc etc)
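Before reaching for a full mixed model, a natural first step is a per-factor rate comparison in the same annualized metric Backblaze reports (failures per drive-year). A minimal sketch below, where all the records are made-up toy numbers for illustration, not Backblaze's data:

```python
from collections import defaultdict

# Hypothetical per-drive records: (manufacturer, drive-years observed, failed?)
# -- toy numbers for illustration only, not Backblaze's data.
records = [
    ("seagate", 1.9, True), ("seagate", 2.0, False), ("seagate", 0.5, True),
    ("hitachi", 2.1, False), ("hitachi", 1.8, False), ("hitachi", 2.2, False),
    ("wd", 1.5, False), ("wd", 2.4, True), ("wd", 2.0, False),
]

def annual_failure_rates(records):
    """Failures per drive-year, grouped by one factor (here: manufacturer)."""
    years = defaultdict(float)
    failures = defaultdict(int)
    for factor, drive_years, failed in records:
        years[factor] += drive_years
        failures[factor] += int(failed)
    return {f: failures[f] / years[f] for f in years}

rates = annual_failure_rates(records)
for maker, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{maker}: {rate:.1%} annual failure rate")
```

With real per-drive records shaped like this (one row per drive, with model, firmware, factory, etc. as columns), the same data could be fed into a mixed model as the parent suggests.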

As an administrator with an admittedly smaller sample size, this lines up with my experience with the Seagate 1TB Barracuda ES.2, 15K.6/15K.7-era Cheetahs and some normal consumer Barracudas as well. All of the nearline and enterprise stuff had 5 year warranties, and over the past 5 years we've replaced over 25%.

I've sworn off Seagate altogether, at least until they demonstrate a commitment to producing more reliable drives. I will not willingly buy them for the datacenter, and I won't buy them at any price for the home, RAID or no RAID.

Interesting stats, but I wonder if the drive usage pattern at BackBlaze is somewhat different from that of a home user. In terms of how they seek, read and write. Not that they put more mileage on them faster, but that they might be doing something fundamentally different from how these drives are tested by their manufacturer. Lots of bulk in-sequence writes or something else. The 5-7 times difference in failure rates between leading manufacturers is frankly hard to believe.

I'm a big fan of single platter drives, I just buy the biggest single platter at the time.

Which is currently the WD 1gb blue. Very fast, very cool running.

1GB? that seems... awful small.

And even 1TB, which is presumably what he meant, feels very small to me.

Massive compared to the 10MB drive in the IBM PC/XT....

The fewer platters the better for vibration and noise as well. Though I wouldn't necessarily restrict myself to single-platter drives; 2 platters is generally fine.

It's really unfortunate that WD has completely halted expanding capacity on their Blue line. Meanwhile, my WD6400AAKS (640GB blue, 2 platter) keeps marching on.

Agree. Just bought another 3 yesterday.

At less than $60 each, you can't really beat the price/performance ratio, particularly on a RAID5 environment.

Any particular advantages of having a single platter?

Also, I assume you meant to say 1TB ^^

The logic is that fewer moving parts means a lower failure rate. It amounts to paying a premium for longer-lived drives. How much you value that is a matter of taste and means.

VERY interesting. I always avoided Hitachi drives in my deck for some reason - always concentrated on WD and Seagates (personal experience has led me to lean towards WD as well, as I had a bunch of those non-LP drives that they talk about that like to conk out).

From what I read they look quite reliable over a fairly representative sample (annual failure rate v. # of drives / TBs / years).

> I always avoided Hitachi drives in my deck for some reason

Probably because Hitachi bought IBM's line of Deathstar drives that were notoriously unreliable. It appears they have done far better with the product.

Yeah, I've always had a negative view of Hitachi since I had a lot of trouble with the Deathstar 75 GXPs. But Seagate bought Maxtor, and I never heard anything good about Maxtor aside from "they're the cheapest!", so I guess it goes both ways... I stopped hearing good things about Seagate right after they had to stop using their "silent" tech after some patent dispute, which IIRC coincided with the Maxtor acquisition...

I've heard this rumor from everywhere, that the Deskstar drives were unreliable. If you look at the reviews from NewEgg they were TERRIBLE for that product. But I'm completely suspicious of where that reputation came from, because the statistics just don't bear it out. Often "common wisdom" is wrong - I suspect this is one of those cases.

In 2000-2001 I was an admin for a day trading firm managing around 80 computers and servers. The drives worked great - really fast, right up until the moment of their clicking death. At around 6 months, about 15 out of 30 had failed that way.

The Deathstar drives were shipped during the Clinton administration. That's a long time ago to hold a grudge against a hard disk company, particularly since they all ship lemon models every few years.

I, too, have avoided Hitachi drives, but only for noise and heat reasons. Is it just me, or do they always jam 5 platters into their 3.5" drives? Historically, I lean towards WD.

Shouldn't Toshiba be the drives to buy, since they took over Hitachi's 3.5" drive factories as an antitrust condition of the WD acquisition? http://www.wdc.com/en/company/pressroom/releases/?release=f8...

Back in 2007/2008 or so I bought a pair of Seagate 7200 500GB drives and got bit hard by their extremely broken firmware. I haven't bought Seagate drives since; it's sad that their overall QC is still so terrible across the board.

Should consider what Jacob Applebaum revealed in his CCC preso about Hard drives: http://youtu.be/vILAlhwUgIU?t=46m25s

Cross out WD, Seagate, Maxtor and Samsung drives. Hitachi wins!

The worst service and hard drives I have ever had were from Seagate. I had 3 separate hard drives RMA'ed _twice_ each, and all three failed within 3 months both times. To this day I will not buy Seagate.

Along these lines, I have done a monetary analysis; http://perenniallycurious.com/centspergb.html
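The linked page isn't reproduced here, but the cents-per-GB arithmetic is easy to sketch. The prices and capacities below are placeholders, not figures from the linked analysis; only the calculation itself is the point:

```python
# Hypothetical (price in USD, capacity in GB) pairs -- placeholders,
# not figures taken from the linked page.
drives = {
    "1TB single-platter": (60.00, 1000),
    "3TB consumer": (110.00, 3000),
    "4TB consumer": (160.00, 4000),
}

def cents_per_gb(price_usd, capacity_gb):
    """Raw purchase cost per gigabyte, in US cents."""
    return 100.0 * price_usd / capacity_gb

for name, (price, cap) in sorted(drives.items()):
    print(f"{name}: {cents_per_gb(price, cap):.2f} cents/GB")
```

A fuller analysis would also amortize over expected drive lifetime, which is exactly where failure-rate data like Backblaze's comes in.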

How does that compare to SSD failure rates? Are they much better?

I dunno if SSDs have been around long enough for reliability testing. They've also been advancing quite rapidly, so it's hard to stick with something and expect the info to be relevant down the road. All I know when it comes to SSD reliability is that Intel is king.

I am very impressed by the transparency here, and appreciative of the data. I'd never heard of Backblaze until now, but now I'll have to pay closer attention.

Seagate Barracuda Green (ST1500DL003), 1.5TB - 51 drives, 0.8 years average age, 120.0% annual failure rate

I've got one sitting on my desk that seagate sent me as a RMA return. Guess it won't be going back into our RAID.

Heh...I, too, love 3TB WD REDs. At last count I have 40 of these racked up in servers or NASes and they've worked quite well for me

Should the larger capacity Seagate drives have a higher failure rate than the small ones? That chart seems counterintuitive.

They said in the article that the 1.5TB drives were largely warranty replacements, and they suspect they were refurbs.
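That also explains how the quoted chart row can show a 120% annual failure rate: Backblaze's AFR is failures per drive-year, so a pool with a short average service time can exceed 100%. A back-of-the-envelope check using the chart's 51 drives at 0.8 years average age (the failure count is inferred here, not stated in the thread):

```python
def annual_failure_rate(failures, drive_years):
    """Backblaze-style AFR: failures per drive-year of service."""
    return failures / drive_years

# The quoted chart row: 51 drives averaging 0.8 years ~= 40.8 drive-years.
drive_years = 51 * 0.8
# Roughly 49 failures over that window (inferred, not stated) would yield
# the chart's 120% figure -- an AFR over 100% just means drives failed
# faster than one failure per drive-year.
print(f"{annual_failure_rate(49, drive_years):.0%}")
```

That many failures in a pool of 51 drives is consistent with the parent's point that these were largely warranty replacements, with failed units swapped for (possibly refurbished) replacements that then failed in turn.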
