One Billion Drive Hours and Counting: Q1 2016 Hard Drive Stats (backblaze.com)
May 17, 2016

Since we're on the subject of hard drives, I hope this will be helpful or interesting:

It has been our experience that it is no longer possible to buy new, non-fraudulent drives from Amazon when the drive model is fairly recent but not brand new.

So, for instance, in early 2016 if you want to buy 4TB enterprise drives from Amazon, you will find them, they will be classified as brand new, and they will be sold by some big Amazon parts seller.

When you receive them they will be nice and shiny brand new wrapped - perfectly sealed - and when you spin them up, SMART stats will show 4000-6000 hours of use and that they are 2-3 years old.

This is almost universal and has been happening for at least 3-4 years. These sellers are selling the drives as brand new and they are anything but. When you complain, they will immediately exchange or refund - there's never a hassle there - and once in a while the seller will spout some bullshit about the drives being "new pulls" ... that is, drives they stripped from unsold servers/desktops.

Hope that helps.

Interesting. Just before I left Google they were looking to resell "old" drives from their machines. Basically after 3 years they had been depreciated, so the drives would be securely erased and sent to a reseller who would sell them (presumably marked as used) for a good price. However, those drives would have 20,000 - 30,000 hours on them not 4,000 - 6,000. When we decommissioned a bunch of Blekko servers the guys buying them were mostly interested in the drives (easy to resell) but again they had a lot of hours on them. Warranty replacements we had done over the years had a special label that said they were refurbs.

All of that to say I'm wondering who runs a drive for 250 days and then sells it. I could easily see a thousand hours being run as burn in to catch infant mortality failures but those are pretty much all gone after the first thousand hours.

At the few datacenters I worked at the drives were being crushed and put into bins for recycling. Burn in would only take a day or so. A thousand hours is a bit too long for a burn in cycle.

Unlikely, but the reseller gets new drives, uses them (selling cloud storage) while they sit as unsold inventory, then sells them as new.

Disclaimer: I work for WD/HGST in Enterprise Support.

Another problem we see a lot with 'refurb' purchases (especially SAS models) is that the drives can be OEM types made to order for use in a vendor's SAN/NAS enclosure; the OEMs often load up their own firmware and tweak the hardware design, so generic code may be incompatible and can't be installed. Buyers try these drives with their own hardware and when they don't work (at all or partially) they come to us for a firmware 'fix'.

Warranty and support issues for OEM drives are handled by the company for whom the drives are made, and they may not cover second-hand drives purchased through certain channels. A customer the other day had around $8K of 'refurb' drives that wouldn't work in their storage enclosure and we had to refer them back to their auction-site based supplier.

Amazon has become quite the cesspool of late. Everyone and their dog can register as a reseller there. And as long as the stink does not reach Amazon's spreadsheets, nothing happens about it.

> Amazon has become quite the cesspool of late.

Agreed, Fulfillment by Amazon opened the floodgates for dishonest sellers and reviewers. It's slowly turning into Alibaba.

I no longer trust anything I see on Amazon unless it's a well-known, name brand product being sold by Amazon themselves.

Even then, how do you know that it didn't come off a rejects pallet originally shipped overseas b/c the items were deemed "not fit" for sale in the US, then reshipped back in bulk to a private reseller?

Saddest part of this trend is it's not just Amazon. Newegg, BestBuy, and even Walmart, now have these "marketplaces" full of spurious sellers. Drop-shipped rejects, so-called refurbs and mis-represented descriptions abound.

edit: sorry if not from US, I and my experiences are from the US and tend to be US-centric.

Is there evidence of NewEgg doing this? I've seen proof of the others, but not NewEgg. I hope it's not true; I've always had good experiences with them for probably close to a decade now.

Newegg doesn't do this, but Newegg does allow third party sellers on their website so the third party sellers still may be selling garbage.

Newegg really needs to drop that shit, it's bad for business.

Agreed. It has sullied their otherwise stellar track record, IMO.

On another note: THANK YOU NEWEGG! For adding an "Apply" button to their side-bar search function so my page doesn't have to refresh EVERY! TIME! I! MAKE! A! SELECTION!

Even "sold by Amazon themselves" is something of a concern:

There's no differentiation in the warehouse between "item as sold by Amazon" and "item as sold by Joe's Hard Drives and Fulfilled By Amazon".

People have commented on this before, with returns and exchanges, unique identifiers on something that they bought back from themselves (can't remember the reasoning) not being present, and Amazon saying "well, all items with a given SKU get grouped together".

Same with laptop batteries: sell as a 5200mAh battery, label it as 5200mAh, but when you read the chip that's inside (similar to reading SMART) it's actually 4400mAh or something.

Some more info I wrote a while ago: http://lucb1e.com/rp/batstat/en/

Never ever buy 3rd party from Amazon.

Maybe if they have tens of thousands of perfect ratings and it is fulfilled by Amazon, worth the risk for non-mechanical stuff.

But even then, there is tons of counterfeit stuff on Amazon and eBay now - if it is a brand name and the price is too good, it is fake. Even if the price is okay it still might be fake.

China is getting amazing at making fakes. Even medication for humans and pets.

This is disturbing.

What happens once they are able to wipe SMART data? Is that even possible?

I can't remember the specific forum off the top of my head. But I remember reading several long threads covering complaints about one or two specific hard drive resellers. Apparently not only were new hard drives not actually new, but the SMART data looked suspicious. So I think that SMART data may not actually be completely trustworthy.

Edit: Here's the thread: https://forums.servethehome.com/index.php?threads/goharddriv... That forum also has some in depth reviews of white label hard drives.

Not just hard-drives, but a lot of electronics. Smartphones are a huge issue as well. There are sellers, selling counterfeit androids phones or refurbed phones as new. It's very disturbing.

Hey, have you thought about writing a blog/medium/github piece about how the "average" person can check this for themselves?
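One rough way to check, sketched in Python: the smartmontools command `smartctl -A /dev/sdX` prints the SMART attribute table, and attribute 9 (Power_On_Hours) is the hours-of-use figure discussed above. The sample output, the 100-hour threshold, and the function names below are illustrative assumptions, and raw-value formats vary by vendor.

```python
import re

# Illustrative excerpt in the style of `smartctl -A` output (not from a real drive).
SAMPLE_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       5214
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       37
"""

def power_on_hours(smartctl_text):
    """Return the raw Power_On_Hours value (attribute 9), or None if absent."""
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns; the raw value starts at column 10.
        if len(fields) >= 10 and fields[0] == "9":
            match = re.match(r"\d+", fields[9])
            return int(match.group()) if match else None
    return None

def looks_used(hours, threshold=100):
    """Flag a drive sold as new whose hour count exceeds an assumed threshold."""
    return hours is not None and hours > threshold

hours = power_on_hours(SAMPLE_OUTPUT)
print(f"Power-on hours: {hours}, looks used: {looks_used(hours)}")
```

To run this against a real drive, the sample text could be replaced with the output of `subprocess.run(["smartctl", "-A", "/dev/sda"], capture_output=True, text=True).stdout`, which usually needs root. A low hour count is not proof of a new drive, since, as discussed in this thread, SMART data may not be fully trustworthy.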

have you seen similar issues with brand new drives bought from legit VARs? i.e. the big new HGST drives.

we've had good luck but something about your username tells me you buy more than we do...

the worst luck we've had is the large 2.5" drives. those are basically just disposable.

No, the only issues we have seen is buying "new" drives on amazon that aren't the very latest, greatest 6-8TB drives.

If they are very new, they haven't had time to percolate through the supply chain and come out the other end as "drives we can pretend are new".

So again, as of May 2016, it is 4TB drives that we are seeing at great prices, sold as brand new (no confusion about this at all) on Amazon, and ... turns out they are not new.

I keep meaning to write a blog post about this with specific vendor names ...

Backblaze is such a great company. Not only do they offer almost unlimited backup for just $5 a month, they also publish extremely useful articles such as this one.

Good to see Seagate has improved the quality of its product significantly, but a 3.5% failure rate still seems rather high. How old are the drives in question?

Does anyone think that these reports have shamed manufacturers into decreasing failure rates?

When the FCC started doing reports on internet speeds, ISPs took notice and in the second report, an ISP's actual speeds better matched advertised speeds. It would be awesome if these Backblaze reports were likewise improving hard drives.


I'm actually wondering if one of the reasons that they're having trouble buying WDC drives is because WDC keeps trying to add a clause into the purchase agreement saying BackBlaze can't release information about WDC drive failure rates. Anyone from BackBlaze able to comment on that?

Brian from Backblaze here - so far none of the drive manufacturers have done anything like that. Part of it is that the manufacturers flatly refuse to let us purchase from them directly (we would need to order 10,000 of the same drive in a single purchase to buy direct). So the manufacturers have zero control. We always go to one of the hundreds of resellers or distributors out there (like CDW or Ingram Micro). The resellers just want to make money, so they don't care if we report drive stats.

Have you guys thought about becoming a reseller as another line of business? In the post it says you need about 1200 drives a month (if I read that right) so you could buy 10,000 from the manufacturer, and then sell the extra as Backblaze Approved(tm) or some such. I'd much rather buy a drive I know you guys have put through its paces, especially if I knew that purchase would help fund more data and research, by supporting your business.

I have suggested it. :-) There is no transparency when purchasing drives, in that we don't know what percentage of the price of the drive goes to the reseller (like CDW) and what percentage goes to the manufacturer (like Seagate). I proposed becoming a reseller just to cut CDW out and pocket the profit margin that CDW lives on.

DISCLAIMER: I have not run my brilliant idea past any lawyers, so it may not be legal. We have not done this (yet).

"Consider: Morningstar rates funds. Imagine: Backblaze rates drives." :P

> buy a drive I know you guys have put through its paces

Are you talking about some kind of burn-in at Backblaze for units sold to get past the infantile failure region? Or just Backblaze-as-drive-retailer but they specifically endorse this model because they have good experience with it?

WDC is likely not selling directly at such low qtys and if they do pricing would not be good.

Good to see Frontier wasn't fazed and actually went down. I'd imagine if you checked 2016 they'd be closer to 60%.

How's the restore process?

I've actually been backing up to Backblaze for about a year now but (knock on wood) haven't had to restore any data. That said, one of my drives has been acting up in the last few days so that moment may be at hand (though I also have it backed up to a second local drive).

In addition to what Yev has said, I'd say:

You don't have backups, unless you can do restores. So you have to practice doing restores.

For my servers at work, we're using btrfs snapshots and send/receive to the backup host. So restoring files is just going into the appropriate snapshot directory, and copying out the files of interest.

If your backup scheme is any more complicated than that, you need to practice it at least a few times per year so that it is completely familiar.

Hilarious story from the old days...

We were doing backups to QIC tape drives. At one point, there was a lightning storm. The servers were plugged into UPSs with power protection though.

However, when running a backup, I noticed that the tape drive sounded a little different. So I check one of the backup tapes... the tape drive would no longer switch over the tracks on the tape. So it was just overwriting the same track again and again. Corrupted backups. Worse yet: silently corrupted backups. No messages from the OS about a hardware problem.

That could have been bad news if it wasn't caught quickly.

> You don't have backups, unless you can do restores. So you have to practice doing restores.

100% - I worked hard to make sure that was in the Best Practices sent out to every person that signs up with Backblaze. Restoring is the most important part. So far we have over 200PB of backups, but the stat that I like even more is that we have restored over 10 Billion files.

I realize this is slightly off topic, but I want to nerd out for a moment re: your comment on hearing something wrong with the tape drive; a skill I always felt was under-recognized for how much of a "superpower" it gives you, that being how critical sound is for a good sysop. Broken AC belts, bad hard drive backplanes, boot cycles, all things I've run into where the sound was the cue; detecting an unalerted tape drive failure is the icing on the diagnostic case.

Audio is an incredibly rich feedback mechanism for all kinds of mechanical processes. And the fascinating part about it is that our brains process it so effortlessly. If the data your ears can analyze from your car were presented visually, perhaps as a scrolling FFT spectral graph or plots of a host of sensors, you'd never notice a momentary misfire or a tiny change in pitch. It would be complete data overload! But even untrained ears can pick out errant noises.

I had another incident like that earlier in my IT career.

I was a 'terminal room consultant' in college... back when we had serial terminals hooked to Unix systems. Part of the job was the care and feeding of a couple printers, a big ol' line printer (green bar paper) and a Printronix graphics printer (dot matrix, for printing out fancy lab reports you wrote up using troff).

So over time, from loading paper and clearing jams, I had accumulated hours and hours of hearing these two guys chatter as they went about their business.

At one point, I noticed that the Printronix printer sounded funny. Just off, in some way. So I call it in for maintenance, but they don't seem to care what an undergraduate punk thought about printer sounds.

Sure enough, a week later, I see it is down and taken apart for repairs.

Your ears, your nose, all your senses should be used for debugging and general investigation.

Here in the HGST EMEA lab, you will often find an engineer listening to a drive spin up with an induction pick-up and amplifier, muttering something like "Yep, this one's running firmware XYZ", or "Hmm, sounds like this one has the older, unmodified ramp".

> You don't have backups, unless you can do restores.

Bingo, I was just explaining this to someone yesterday. Testing the restores MUST be part of the backup strategy. If your db data is small enough to have it all in your test environment, I often try to test the restores by restoring to the test db and then using that db for the test environment until the next test restore.
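A test restore can be checked mechanically. The sketch below is a minimal illustration, not any vendor's tool: it compares a restored directory tree against the original by SHA-256 and reports files that are missing or differ. The function names and the two directory arguments are hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def compare_restore(original_dir, restored_dir):
    """Return (missing, mismatched) lists of relative paths in the restored tree."""
    original_dir, restored_dir = Path(original_dir), Path(restored_dir)
    missing, mismatched = [], []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            missing.append(relative.as_posix())
        elif sha256_of(source) != sha256_of(restored):
            mismatched.append(relative.as_posix())
    return missing, mismatched
```

Files edited after the backup ran will show up as mismatched, so the comparison is most meaningful against data that has not changed since the last backup.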

I had a drive fail a couple years ago and restored it with Backblaze. The one gotcha is it wasn't a bootable backup, so it was still a pain getting my system back to something approximating what it was before. Since then I've added a weekly bootable backup to a local USB drive. Not failsafe but good enough for my needs.

> I've actually been backing up to Backblaze for about a year now but (knock on wood) haven't had to restore any data.

Yev from Backblaze. Please test a restore. We have that as part of our best practices. Why? Hopefully you won't but if you DO need a restore, it's likely that you'll be in a heightened state of panic, so familiarizing yourself with the process beforehand makes it go more smoothly! It's pretty easy, though we are currently working on ways to streamline it even further :)

Brian from Backblaze here. Let me elaborate on what Yev said.

When customers are doing a restore, almost by definition they are not having a good day. For example, this could be somebody who just had a $1,200 laptop stolen, and now they might lose every photo they have of their child who died last year. Real example. :-( So they show up to the Backblaze website freaked out of their minds, and they FORGET THEIR PASSWORD or something silly and minor like that, and after guessing a few times our support guys get a flaming hot chat session with a person using more four letter words than not accusing Backblaze of not having their restore.

When we resolve it all (help them with the "recover password" feature) then we usually get a happy customer for life who apologizes for losing their temper earlier. I always find it amusing when they think they are the FIRST customer to ever lose their temper under such a stressful situation. Usually it isn't even the first one THAT DAY.

TL;DR - you only restore when you are freaking out. And that's Ok - your worst day is the day when Backblaze has to be the best.

Yev from Backblaze here. Let me elaborate on what Brian said.

A lot of our customers also restore just for fun or out of convenience. But yea, if you're one of the ones that's doing it out of necessity after a crash or theft, knowing how it works makes everything go a lot smoother.

I back up to a local external drive, and also to BackBlaze. External drives are so cheap these days that the ease of handling your restore yourself is worth it to me. BackBlaze is just an offsite back up so when the house is burning down, I can grab the kids instead of pictures of the kids.

My alternative: buy 2+ external drives and a fire/water proof box. Keep one drive unplugged and in the box when not backing up. Keep other n-1 drives elsewhere and swap with the local occasionally.

I have one at work, and another at my parent's house, more than 100 miles away. In case of a literal blaze and/or high water, even the local drive should be fine. If all drives are gone from a single catastrophe, I figure I (and everyone else) have more serious things to worry about, like what's for dinner.

Most fireproof boxes are not actually fireproof. In many fires they will just turn into an oven that bakes whatever is inside.

>> I have one at work, and another at my parent's house, more than 100 miles away.

How often can you swap them?

As much as I want. I usually swap the work one about every week or two; parent's house is a few months.

That's what I do. I've got local Time Machine backups (which I have restored from in the past). So far I just haven't had to use my Backblaze backups but I think I'll test that this weekend.

It should take you less than 20 seconds. Here is the procedure. First sign in here:

https://secure.backblaze.com/user_signin.htm type your username, password, and click "Sign In", then....

Look on the left nav for "View/Restore Files", browse to one file you know you changed recently, and check the checkbox by it (and maybe a few other files) and click "Continue With Restore". Done!

Within a few seconds you can go to "My Restores" (on the left side of the web browser) to download the restore!

I just assume that anybody who uses backblaze and is serious about their backups just uses it as a true DR. They keep a local backup which will be used for 99% of restores. BB is just for the case where you lose the backup as well. I honestly don't know how they do it at $5/user/month. They must have a ton of very light users to offset the whales.

> must have a ton of very light users to offset the whales

Brian from Backblaze here. Our new B2 product is priced at half a penny per GByte per month which accurately reflects how much it costs for us to store your data including a small profit for us.

So the $5/month is profitable for us up to a 1 TByte backup. We have about 25 customers with more than 50 TBytes in a $5/month backup, and yes, we lose money on them (which is FINE - they often recommend us to friends with less data). On the opposite end, we have about 20,000 customers with less than 20 GBytes backed up where we are massively profitable on those particular customers. Interestingly enough, my 84 year old father is in that demographic - no digital music, no digital movies, a few digital photos and a Quicken file. Last year he lost a hard drive, we restored his files from Backblaze. :-)

In between the 50 TByte and 10 GByte customers is a big bell curve with the bulk of our customers basically paying for their own backups.

A different way to think about it is the vast majority of people store files inside their laptops and are happy. The maximum size hard drive you can configure in a laptop today is probably about 1 TByte (the 2 TByte laptop size drives are just starting to appear), so by definition we're profitable on people like this. Technical people think everybody is like them and has a 16 TByte RAID array filled with all the Linux ISOs and movies. :-) But really most people have less than a TByte of personal data.
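The break-even arithmetic in this comment can be sketched as follows. The two prices come from the comment; treating the B2 price as the full per-GB cost and using decimal units (1 TB = 1000 GB) are simplifying assumptions.

```python
B2_COST_PER_GB_MONTH = 0.005   # "half a penny per GByte per month"
PLAN_PRICE_PER_MONTH = 5.00    # flat price of the consumer backup plan

def monthly_margin(stored_gb):
    """Plan price minus assumed storage cost for one customer, in dollars."""
    return PLAN_PRICE_PER_MONTH - stored_gb * B2_COST_PER_GB_MONTH

# Storage size at which the flat price exactly covers the assumed cost.
break_even_gb = PLAN_PRICE_PER_MONTH / B2_COST_PER_GB_MONTH
print(f"Break-even at about {break_even_gb:.0f} GB")

for label, gb in [("20 GB", 20), ("1 TB", 1_000), ("50 TB", 50_000)]:
    print(f"{label:>6}: margin ${monthly_margin(gb):+.2f}/month")
```

Under these assumptions a 50 TByte customer costs about $250 a month to store against $5 of revenue, while a 20 GByte customer costs about ten cents.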

That describes me as well. I figure Backblaze is a cheap extra pair of suspenders which is both DR and gives me an additional backup made in an independent way. I have pulled older versions of files off Backblaze a couple times when I've needed to but, now, Time Machine pretty much takes care of that.

I also keep a periodic backup in my office as well as various less systematic backups in various ways. I'd recommend anyone keep at least a couple of backups made with different methods. You never feel more exposed than when you need to use a backup and realize you only have one copy of all your data. Hope nothing else fails or you do something boneheaded with the restore.

I had a drive fail last fall and used the restore portion for the first time. I actually opted to use it over my local time machine backup and am glad I did. It was just a matter of selecting the files online, waiting for it to create the download, and then doing so. Much more painless than my other options.

The restore process on their consumer product is simple, but takes some time.

You logon or use their app, choose the folders/files you want to restore and await an email from them letting you know when the folders/files are available for download.

Or they will send you a thumb drive (or a bigger external), with your files on it. Obviously the use cases are different, one is "oops, I deleted a folder" and the other is "my machine is a heap of slag".

The only problem with Backblaze is that they are on OS X and Windows only. I'm using Linux so the only cheap backup is from CrashPlan, kudos to them that they support us.

I really appreciate these blog posts and have used the data they present when buying HDDs for personal use.

One thing that bothers me is that the data presented doesn't really take into account the age of the HDDs. For example, if a batch of HDDs of a particular model is 6 years old and has a failure rate of 12%, that really doesn't tell me much except that it's an old HDD.

What I'd like to know, for a given model, what the blended failure rate is after 3mo, 6mo, 12mo, 24mo etc of operational time. That would be a real apples-to-apples comparison.

http://bioinformare.blogspot.com.au/2016/02/survival-analysi... has graphs of survival rate over time broken down by manufacturer and model. Far more informative, you're right. I don't think Backblaze uses 6 year old drives. :-)

Thanks for sharing this. It's exactly what I was looking for.

tldr; It clearly shows that HGST has an overall superior survival rate over time. WD is a distant second and Seagate in third (although the Seagate ST4000DM000 model is exceptional and fares very well).

That graph is pretty dope! When it first came out we had fun diving in to it! Glad someone is using our data-set for fun stuff!

> What I'd like to know, for a given model, what the blended failure rate is after 3mo, 6mo, 12mo, 24mo etc of operational time. That would be a real apples-to-apples comparison.

You might actually be able to pull that from the raw data. We post it all here -> https://www.backblaze.com/b2/hard-drive-test-data.html in case you wanted to play around with it!
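As a rough sketch of how that could work: the published data is one row per drive per day, so an annualized failure rate (failures divided by drive-years, times 100) can be computed per model and age bucket. The rows below are made up, the age is taken from power-on hours (assumed here to come from the `smart_9_raw` column), and this gives a per-bucket rate, not a cumulative survival curve.

```python
from collections import defaultdict

AGE_BUCKETS_MONTHS = [3, 6, 12, 24]
HOURS_PER_MONTH = 730  # approximation: 8760 hours / 12

def bucket_for(power_on_hours):
    """Map power-on hours to an age bucket label."""
    months = power_on_hours / HOURS_PER_MONTH
    for limit in AGE_BUCKETS_MONTHS:
        if months <= limit:
            return f"<= {limit} mo"
    return f"> {AGE_BUCKETS_MONTHS[-1]} mo"

def afr_by_age(rows):
    """rows: iterable of (model, power_on_hours, failed) with one row per drive-day.

    Returns {(model, bucket): annualized failure rate in percent}.
    """
    drive_days = defaultdict(int)
    failures = defaultdict(int)
    for model, hours, failed in rows:
        key = (model, bucket_for(hours))
        drive_days[key] += 1
        failures[key] += failed
    return {
        key: 100.0 * failures[key] / (drive_days[key] / 365.0)
        for key in drive_days
    }

# Made-up example: 365 young drive-days with one failure, 730 older ones with none.
sample = [("MODEL-X", 100, 0)] * 364 + [("MODEL-X", 100, 1)]
sample += [("MODEL-X", 10_000, 0)] * 730
for key, rate in sorted(afr_by_age(sample).items()):
    print(key, f"{rate:.1f}%")
```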

Wow, did I read that right? They are filling 3+ vaults per month? A vault is 20 pods. Each pod is 480TB. They've mentioned in the past that they use 17+2 redundancy. So each pod should be about 429TB of added storage. At 60+ pods per month, they're adding 25,740TB per month?!

Yev from Backblaze ->

That's about right ;)
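The capacity estimate can be reproduced with a few lines of arithmetic. The comment above assumes 17+2 redundancy, while elsewhere in this thread the layout is described as 17 data shards plus 3 parity shards out of 20, so this sketch computes both. The constant names are illustrative.

```python
PODS_PER_VAULT = 20
POD_RAW_TB = 480
VAULTS_PER_MONTH = 3

def usable_tb(raw_tb, data_shards, parity_shards):
    """Raw capacity scaled by the fraction of shards that hold data."""
    return raw_tb * data_shards / (data_shards + parity_shards)

pods_per_month = PODS_PER_VAULT * VAULTS_PER_MONTH

for data, parity in [(17, 2), (17, 3)]:
    per_pod = usable_tb(POD_RAW_TB, data, parity)
    print(f"{data}+{parity}: {per_pod:.0f} TB per pod, "
          f"{per_pod * pods_per_month:,.0f} TB per month")
```

Either way the result is in the neighborhood of 25 PB of usable storage added per month.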

I wish some other companies that use massive amounts of hardware (Dropbox, Amazon, Google, Facebook, NSA) would release data like this. I love that Backblaze does it, but having an even larger sampling of data would be great.

Brian from Backblaze here. It completely, utterly blows my mind that we are the first and still only company to do this.

First of all, it is a great way for people to hear about our company, which sometimes leads to people buying our products and services.

Second, even just for the good of all humanity, why wouldn't you do this to promote the best drive manufacturers and heck, even help them learn how their drives are working in the field?

It boggles my mind that Facebook, Yahoo, Google, and Amazon (Amazon Web Services include S3) don't publish their drive failure stats. What could possibly be the reason they keep silent?

You guys are in the Do Well by Doing Good category on this, and unfortunately few companies in any business sector maintain this philosophy as they grow.

The boardroom is so divorced from the showroom that it's not even funny how big the disconnects can get.

Google has released some similar information in the past


Brian from Backblaze here. This Google study is EXCELLENT and extremely useful, but it leaves off the drive model numbers and manufacturers which means you cannot use it to help guide purchasing decisions.

It's still an awesome study and well worth reading and useful to a ton of people (like us at Backblaze when we were starting out).

Interesting that HGST do so well, yet are now part of WD, who do so badly.

I would have expected more absorption and rationalisation from the takeover, yet they still clearly look more like part of the IBM family than anything in WD's ranges.

IBM always had a pretty good rep for failure rates, aside from a couple of horror drives in the 90s. I wonder how they've managed to keep their rates markedly better than other makes, even under changes of ownership.

More to the point I wonder why WD haven't been able to improve rates as a result of taking over IBM/HGST.

Yev here -> From what we can tell and have been told they are run as completely different entities. Hopefully that quality spillover will indeed happen in the future, but for what it's worth WD isn't doing all too poorly, they just tend to be slightly more expensive for us than the Seagates at the moment.

The Chinese regulatory system had been preventing any actual integration of the two companies for some time. They finally agreed to allow it to proceed in late 2015, but sales and marketing have to remain separate for another 2 years.

After seeing in the Backblaze reports that HGST drives are the best, and then seeing that WD owns HGST and its IP, I bought WD shares. A reasonable dividend doesn't hurt either.

Having had many years of experience as a sysadmin, I have one word for those concerned with hard drive reliability - temperature. Heat is the enemy of hard drives (and other system components).

Sometimes heat causes a machine room meltdown, where machines start failing left and right. This is not uncommon. What usually happens is this - the company knows enough to put their web server etc. at a controlled colocation facility (which isn't always ideally controlled, but that's another story). But the developers would like a local file server, development source control server etc. There is a spare, windowless room in the office, and without much planning, a machine or two goes in. Then a machine goes in attached to a tape drive, which backs up those machines and the local desktops. Then more machines go in, then more. All these machines generate heat, so a room cooler is bought. But it drops condensation in a cup, and shuts off when the cup is full of water. So someone's job becomes to empty that cup every morning. But then summer comes, and on a particularly hot Sunday afternoon the condensation cup fills, the cooler shuts down, the outside temperature combines with the temperature of a closed, windowless room full of machines for hour after hour. Finally one machine has a component fail. Then another. Then e-mails and phone calls and panic starts flying.

I have seen this happen more than once, and have heard about it more. It always starts as an ad-hoc, unplanned, "temporary" solution for "unimportant" machines. But as time goes on, and machines are added, and business dependencies are formed, you have to start supporting a machine room that was never planned as a machine room.

> For Q1 2016 we are reporting on 61,590 operational hard drives ... [t]hese days we need to purchase drives in reasonably large quantities, 5,000 to 10,000 at a time.

To me, that's an astoundingly large number of hard drives. But I realise there are probably much bigger deployments. Does anybody happen to know just how many hard drives Amazon or Microsoft have for AWS/Azure?

I imagine maybe 10x more than this. The numbers actually aren't too shocking if you think of the mirroring of data that's required for maximum uptime (not even archival), and the occasional disk failure.

1,000 TB (1PB) can be easily handled across ~150 (6-7TB)HDDs for one copy, but 300-450 HDDs would be required for additional mirroring.

Largest tape cartridges out there are between 6 and 8.5 TBs, and cost around $22 per TB. That's only $22,000 per PB, and this is for high throughput cartridges like LTO7 or StorageTek Titanium. LTO5 is much cheaper.

Considering that the largest tech companies and major organizations routinely cut POs for several $100ks and are dealing with 100s of PBs of data across disk, tape, DVDs etc, it isn't outside the realm of possibility to have 300,000+ individual disks and tapes floating out there :)

Anecdotally, Google's file system Colossus uses Reed-Solomon 1.5x replication. So those 150 drives might only turn out to be in the low 200's.
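The drive counts being discussed follow from a simple formula: data size times storage overhead, divided by drive size. A minimal sketch, assuming a 6.5 TB drive as the midpoint of the 6-7 TB range and ignoring filesystem overhead and hot spares:

```python
import math

def drives_needed(data_tb, drive_tb, overhead):
    """Drives required to hold data_tb at a given storage overhead factor."""
    return math.ceil(data_tb * overhead / drive_tb)

DATA_TB = 1000    # 1 PB
DRIVE_TB = 6.5    # assumed midpoint of the 6-7 TB range

schemes = [
    ("single copy", 1.0),
    ("1.5x erasure coding", 1.5),
    ("2x mirroring", 2.0),
    ("3x mirroring", 3.0),
]
for name, overhead in schemes:
    print(f"{name:>20}: {drives_needed(DATA_TB, DRIVE_TB, overhead)} drives")
```

With these assumptions a single copy needs 154 drives and a 1.5x scheme needs 231, consistent with the "low 200's" estimate.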

And I remember reading a tweet from a Google engineer that they would be paged if their free storage dropped below 5PB.

I thought waking up at 3am, shambling to my desk and connecting to the VPN was bad. Imagine having to drive down to the datacenter and rack 200 hard drives.

Sweet, I appreciate that info :) I imagine 5PB fills up quite quickly for Google too!

At that scale, it's just a function of (number of ethernet cables) x (avg size of ethernet cables), rather than disk space in their data center, I'd imagine!

> 1,000 TB (1PB) can be easily handled across ~150 (6-7TB)HDDs for one copy, but 300-450 HDDs would be required for additional mirroring.

Mirroring is not used at that scale



That's all good, it would just increase the number of disks required to be purchased and the amount of electricity/cooling/floorspace to maintain them. It would add to the disk count, but not really affect cost per TB all that much.

How do cloud storage vendors guarantee triple-mirroring and uptime then? Lots of 2TB drives? Lies? :)

Does it say triple mirroring or triple redundancy? Take a look at the article from Backblaze for an overview of the math. In their case it might be called quadruple redundancy. 20 shards hold 17 shards of data with triple parity. Events destroying hard drives containing information about your data could happen three times, and you still wouldn't lose anything.

> 1,000 TB (1PB) can be easily handled across ~150 (6-7TB)HDDs for one copy

Yea, our latest storage pod (https://www.backblaze.com/blog/open-source-data-storage-serv...) has 60 drives at about 8TB a piece so we're pushing 480TB. Two pods are about a Petabyte, if you go up to the 16TB Hard Drives some of the manufacturers are testing, you can hit pretty close to 1PB in an enclosure, and Dropbox is actually doing that already with their "Diskotech" boxes (HN link -> https://news.ycombinator.com/item?id=11282948) - so folks are already getting more and more dense :D

Yev from Backblaze here -> They have a lot more :D

And yet, your quantities aren't enough to attract the attention of the WDC direct sales department?

> And yet, your quantities aren't enough to attract the attention of the WDC direct sales department?

We've just started heading down the direct path, but yea, it seems like the minimum order to work direct with the manufacturers is about 10,000 hard drives per order. We aren't quite at that capacity and we don't like to keep inventory since we try to run pretty lean. Also the price for hard drives tends to drop by a small percentage monthly, so for every month we have excess inventory we purchased earlier, that could potentially be money left on the table. So our orders tend to be smaller than their minimums. We're inching towards it though :)

That seems like a silly minimum for WDC to have, unless the discounting is phenomenal (50%+) compared to other vendors.

I'm asking myself, even if I'm a WDC rep that is selling hundreds of thousands of PC hard drives and having an excellent quarter/year, why would I turn away the business of a growing company?

Tape cartridges can be purchased in packs of 20 from any corporate vendor and no order is too small to attract a rep's attention. I've seen deals (25th to 75th percentile) from $10,000 to $200,000, and the min/max deal range is from $2,000 to $500,000

I imagine it has something to do with demand or their partnerships with their distributors. This is conjecture, but I'd think that in order to keep distribution deals/channels up they might have deals in place that stipulate any order over 10,000 goes direct and any order under goes to preferred distributors. No idea though, but that would make a smidge of sense.

>> how many hard drives Amazon or Microsoft have for AWS/Azure?

I'd like to know YouTube's totals with 500 hours of video being uploaded every minute...

Tough to say, but if this link is right: http://video.stackexchange.com/questions/8850/typical-size-o...

Then being conservative, we can say an hour of video is 1-2GB of data, so between 500 and 1,000 GB of data uploaded (just to YT) every minute.

Accounting for mirroring and compression, 525,949 minutes per year * 1 TB per minute = ~526 PB per year

Ironically this would be about 65,743 8TB HDDs, which is close to the number in this article :)
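If anyone wants to vary the inputs, the estimate above is easy to reproduce in a few lines of Python. The 0.5-1 TB per minute range and the 8TB drive size come from the comments above; the function names are made up for this sketch, and it ignores extra encodings and redundancy.

```python
import math

MINUTES_PER_YEAR = 525_949  # figure used in the estimate above

def yearly_upload_pb(tb_per_minute):
    """Petabytes uploaded per year at a constant ingest rate (1 PB = 1000 TB)."""
    return MINUTES_PER_YEAR * tb_per_minute / 1000

def drives_for(petabytes, drive_tb=8):
    """Whole drives needed to hold that much data, one copy, no overhead."""
    return math.ceil(petabytes * 1000 / drive_tb)

for rate in (0.5, 1.0):  # TB per minute, low and high ends of the guess
    pb = yearly_upload_pb(rate)
    print(f"{rate} TB/min -> {pb:.0f} PB/year -> {drives_for(pb):,} drives")
```

Rounding up to whole drives gives 65,744 at 1 TB per minute, one more than the truncated figure quoted above.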

They also store several copies of each video (different codecs and resolutions).

I had heard 1PB/day somewhere...

So my math is probably messed up here but:

- Price of buying a drive and a pod to house it: ~$0.036/GB [0]

- 1.84% of drives need replacing each year (warranty aside)

- They use 17:3 parity for redundancy (15% of storage)

So the hardware price of a GB should be something like $0.036 * (1/0.85) * 1.0184^(years). For 10 years, the hardware would cost $0.05/GB, or $0.0004/GB/month.

Power costs and DC space of course need to be taken into account but I still find it interesting that the hardware itself costs only ~10% of what they charge for B2.

[0]: https://www.backblaze.com/blog/open-source-data-storage-serv...
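For anyone checking the math, the formula above can be written as a small Python sketch. It only encodes the model from this comment (parity taking 15% of raw storage, replacement cost compounding at 1.84% per year); the function and variable names are hypothetical, and power, space and labor are left out.

```python
def hardware_cost_per_gb(base_cost, usable_fraction, annual_failure_rate, years):
    """Hardware cost per usable GB over `years`, per the rough model above."""
    cost_after_parity = base_cost / usable_fraction            # pay for parity shards too
    replacement_factor = (1 + annual_failure_rate) ** years    # compounding drive replacements
    return cost_after_parity * replacement_factor

YEARS = 10
total = hardware_cost_per_gb(
    base_cost=0.036,             # $/GB for drive plus pod, from [0]
    usable_fraction=0.85,        # 17 of 20 shards hold data
    annual_failure_rate=0.0184,
    years=YEARS,
)
per_month = total / (YEARS * 12)

print(f"total:     ${total:.4f}/GB over {YEARS} years")
print(f"per month: ${per_month:.5f}/GB")
```

This lands at roughly $0.05/GB over ten years, or about $0.0004/GB/month, matching the figures above.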

How did you choose 10 years?

Arbitrarily. I suspect after that point another storage technology will supersede hard drives.

From reading this, HGST >>> Seagate, right? I wasn't sure how the "weird" sourcing they had (cracking external drives open during the HDD drought) would affect things, but that was over in 2015, and the trend still seems to be HGST > Seagate.

Yev from Backblaze -> we're not really out to make a definitive "x is better than y" statement, that's just what's occurred in our environment. We've heavily favored HGST in the past, but lately we've been buying a lot of Seagate drives, mostly because for us their failure rates have dropped, and they tend to be more plentifully available and slightly less expensive. Since we built our system to handle hard drive failure, the failure rate is interesting, but the price is what we're more focused on (at least at the moment).

> I wasn't sure how the "weird" sourcing they had (cracking external drives open during the HDD drought) would affect things, but that was over in 2015

That was actually in 2011, all those hard drives have been out of our system for a while!

> We've heavily favored HGST in the past, but lately we've been buying a lot of Seagate drives, mostly because for us their failure rates have dropped, and they tend to be more plentifully available and slightly less expensive.

Are you buying bare drives, or shucking them from enclosures?

> Are you buying bare drives, or shucking them from enclosures?

We haven't had to shuck drives since 2012. And we only started doing that because regular drive prices went up so much due to the Thailand flooding. I believe all those drives have now been replaced as well. If you want a good write-up (along with links to past posts) this is a good "Hard Drive Farming" write-up -> https://www.backblaze.com/blog/farming-hard-drives-2-years-a...

Seagate had one horrifically bad, almost guaranteed to fail drive in the sample, the 3TB ST3000DM001. Many many people lost data when they died (according to reviews @ newegg and amazon).

It is currently the best GB/dollar drive at $70, so I was tempted to get it, but then I saw the Backblaze article.

People on /r/buildapc don't seem to believe it's a problem though: https://www.reddit.com/r/buildapc/comments/4jkpor/avoid_the_...

Well, they're entitled to their opinion. It's possible that the problem was fixed and that there are new drives with that model number that don't have the bug.

I personally wouldn't use one for anything more critical than a door stop. Been there, done that, lost some data.

Yeah, I personally wouldn't buy one at this point. It looks a lot like some sort of design or chronic manufacturing flaw. Seagate's version of the IBM Deathstar.

I had the massive misfortune of building out my first large storage box at home with those drives.

Thank god for RAID, because every single one of them and every single warranty replacement has now failed.

I bought HGST after that and have not seen a single bad sector.

Yeah, that was my takeaway. In particular, this graph makes for sobering viewing.


I'm regretting the last batch of Seagates I bought to replace HGSTs now.

Yev from Backblaze -> you're probably fine. Everything will fail eventually, just have a backup! We recommend Backblaze :P

Haha thanks, but I have my backups already sorted (I bought more disks and put them in a different location)

Consider additionally using RAID-1 to increase short-term stability with potentially untrustworthy drives. Not only do you not have to restore from backup as often, you get a nice read rate boost besides!

6-disk RAID6 sets with a hot spare per array. Adding extra disks to RAID-1 each Seagate kind of defeats the point by now, unfortunately.

That's an interesting idea in other circumstances, however. Thanks for sharing!

We applaud you :)

I wish you had a way to "seed" backups with removable media. I'm a backblaze user but initial backup (I have photo collections) is super slow. Probably not in your pricing model, though.

Yea, that's a bit tough for us to do. We'd have to have a way to tie that to your account from inside our datacenter and that's a bit complicated. Plus we'd have to find a way to charge for that to make up for the additional labor. That might be something we consider for B2, but it's a tougher sell for our personal/business product.

For B2, making an open source version of the amazon data transfer appliance would be cool.

I was about to buy some WD Reds for my home NAS but may reconsider in favor of the Hitachi NAS edition. Anyone have experience with the latter?

edit: just found this:


I love the work Backblaze does, and especially how much data they publish. Not only that, but their openness about their storage pod design helped me more fully understand the current limitations of backplanes as we move to more and more data. I have dreams of M.2 backplanes so I can skip the limitations of SATA... but I digress.

Thanks Yev and everyone else there, a shining example of how to build a company and a reputation at the same time. Keep it up.

My 5-year-old hard drive has been vibrating like crazy this year, also making the metal case panels vibrate, which makes a ton of noise. So I searched online, and found a solution where the HDD is suspended with cable ties, a little like this http://i.imgur.com/4MXB1IG.jpg

I guess I should just buy another one. I'm going to lose all my data any day now.

You're going to lose your drive any day now. Not necessarily because your drive is vibrating... just because you're always going to lose your drive any day now. Anything on only one drive can be on only zero drives in the blink of an eye. If that's a problem for you, best deal with it before it's a problem.

It is true that hard drives often give a surprising amount of warning. I don't think I've ever had a drive totally spontaneously fail on me. But it's still best not to count on that.

I once had a drive that somehow managed to corrupt its own firmware. Total garbage on bootup (nonsensical LBA numbers, returned model number was garbage, SMART info was totally corrupted, etc...).

Hard drives store firmware (executable code) on hidden sectors of the drive. The amount of firmware they store in flash is relatively small.

That may help explain how your drive's firmware got corrupted.

I've had it happen. It taught me a hard lesson about keeping backups.

> So I searched online, and found a solution where the HDD is suspended with cable ties, a little like this http://i.imgur.com/4MXB1IG.jpg

Yev from Backblaze here -> good hustle! That sounds a lot like our "anti-vibration sleeves" from our earlier pod versions: https://www.backblaze.com/blog/backblaze-storage-pod-vendors...

That lack of failure on the Seagate 6TB is impossibly impressive for Seagate.

Someone else must be making the drive for them or they have a new factory doing something different.

Wondering if data centres typically power down a drive after some period of no accesses? Seems like you could save a lot of power and gain some HD life that way - so long as a drive isn't frequently spinning up and down. Seems like the data access patterns could be monitored and less frequently accessed data moved to a 'low access frequency' HD bank(?)

Yev from Backblaze here ->

For us the drives are constantly spinning, we don't power them down. One of the reasons is we never know when a customer will want a restore, so we have the data available 24/7. That said with our Vaults (https://www.backblaze.com/blog/vault-cloud-storage-architect...) it's theoretically possible to power down entire cabinets and still have the data available, but we don't currently see a need to do that.

See the other comment: there is a lot of data that isn't accessed for years at a time, so you're essentially spinning disks for years to save a few seconds of spin up time.

I think he's right - energy is too cheap for this to be a commercially relevant factor to you, but if the true cost of energy were factored in (climate change, the human costs of the relatively dangerous mining and oil drilling industries, local pollution, etc) then I think that would tip the balance.

Oh well. OK, final suggestion, maybe do it and spin it as a public relations win? :)

With Backblaze B2 the data has to be available at all times, so powering them down isn't really feasible. For personal backup it might have been, though the manpower it would have taken might have also knocked it out of whack a bit.

Maybe energy is too cheap. I use BB and I'm kind of sad to hear that juice is wasted on my stuff I haven't restored for years.

I would suggest that you consider 'voting with your feet' so to speak. You are, after all, ultimately paying for that wasted energy with your Backblaze fee.

I'd really like to try Backblaze, but I use a Windows Home Server 2011 Atom-based system as a makeshift NAS to store all of our family photos. The Backblaze Personal Backup installer fails with an error because it is considered a "Server Operating System". I'm currently using CrashPlan with this system.

> For WDC, we sometimes get offered a good price for the quantities we need, but before the deal gets done something goes sideways and the deal doesn’t happen.

Sounds pretty interesting. What's really going on?

Java is responsible for 91%* of security attacks. Backblaze's code is native to Mac and PC and doesn't use Java. -- is this real?

Anyway I'm going to be a backblaze customer for $5 a month!

Isn't it for the applets?

I wonder what filesystem they are using. Or more generally, the whole software stack (what OS, etc) would be interesting to know.

Ah I found it on their blog: https://www.backblaze.com/blog/vault-cloud-storage-architect...

They use ext4:

> Each of the drives in a Vault has a standard Linux file system, ext4, on it. This is where the shards are stored. There are fancier file systems out there, but we don’t need them for Vaults. All that is needed is a way to write files to disk, and read them back. Ext4 is good at handling power failure on a single drive cleanly, without losing any files. It’s also good at storing lots of files on a single drive, and providing efficient access to them.

> Compared to a conventional RAID, we have swapped the layers here by putting the file systems under the replication. Usually, RAID puts the file system on top of the replication, which means that a file system corruption can lose data. With the file system below the replication, a Vault can recover from a file system corruption, because it can lose at most one shard of each file.

There seems to be an incomparably tiny sample size for about half of these... 1 out of 47 drives gives an 8.63% annual failure rate.

Yes, they mention this in the article.

"Failure rates with a small number of failures can be misleading. For example, the 8.65% failure rate of the Toshiba 3TB drives is based on one failure. That’s not enough data to make a decision."

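To see why a single failure swings the number so much, here is a small Python sketch of an annualized failure rate computed from drive days, which as I understand it is how Backblaze reports the figure. The drive-day count below is a hypothetical (47 drives running a full 91-day quarter), not a number from the article, so the result is close to but not exactly the quoted rate.

```python
def annualized_failure_rate(failures, drive_days):
    """Failures per drive-year of operation, as a percentage."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100

# Hypothetical small cohort: 47 drives in service for a 91-day quarter.
small_cohort_days = 47 * 91

one_failure = annualized_failure_rate(1, small_cohort_days)
two_failures = annualized_failure_rate(2, small_cohort_days)

print(f"1 failure:  {one_failure:.2f}%")
print(f"2 failures: {two_failures:.2f}%")
```

With so few drive days, one extra failure doubles the rate, which is why the small cohorts are not enough data to base a decision on.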

