Backblaze Drive Stats: 2018 Q3 Hard Drive Failure Rates (backblaze.com)
175 points by LaSombra on Oct 16, 2018 | 49 comments



A total of 704 petabytes of raw space, which I think is 600 petabytes of usable space.
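A rough check of that estimate, as a sketch only: it assumes the 17 data + 3 parity shard layout that Backblaze has described for its Vaults. The stats post itself doesn't state the ratio, so treat the shard counts and the function name below as assumptions.

```python
# Rough usable-capacity estimate from raw capacity.
# Assumption: 17 data shards + 3 parity shards per 20-shard stripe.
RAW_PB = 704         # raw petabytes, from the comment above
DATA_SHARDS = 17     # assumed
PARITY_SHARDS = 3    # assumed


def usable_capacity(raw_pb, data_shards, parity_shards):
    """Return the capacity left after erasure-coding overhead."""
    return raw_pb * data_shards / (data_shards + parity_shards)


usable_pb = usable_capacity(RAW_PB, DATA_SHARDS, PARITY_SHARDS)
print(f"{usable_pb:.1f} PB usable")
```

Under that assumption the result lands just under 600 PB, ignoring filesystem overhead and any capacity held in reserve.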

Looks like Backblaze will cross the exabyte mark very soon. They crossed 500 PB in February [1].

Yev, in case you're around, I was curious what your plan is for a potential future shortage of hard drive supply due to another disaster, political problem, etc. Perhaps just grow far enough ahead of your needs to have a buffer? I imagine you're now at a scale where you can't farm hard drives from Costco anymore.

Also what happens to all of those functional, but non-profitable 4TB drives?

[1]: https://www.backblaze.com/blog/wp-content/uploads/2018/01/50...


Yev from Backblaze here -> of course I'm around :D Good questions! We recycle the drives that get removed from service. As for future shortages, that's one of the reasons we took a small round back in 2012 (https://www.backblaze.com/blog/backblaze-raises-5-million-wh...). The majority of it was to spend on marketing experiments, but a little portion was to be used for buffer if an event like that ever happened again and hard drive prices spiked 300%. We're a lot larger now and do a pretty good job with forecasting and having rainy day funds, but we've also gotten to the point where we are now a bit less price sensitive, so we'd be able to handle the fluctuation. That said, if everything goes horribly wrong... https://www.backblaze.com/blog/backblaze_drive_farming/


Looking at 8TB drives on Amazon right now, external USB 3.0 8TB drives are $150 (e.g. STEB8000100). Bare 8TB drives (e.g. ST8000DM004) are $200 and up.

This isn't the same class of drive. Supposedly what's inside the STEB8000100 is an ST8000AS0002 5900 RPM archive drive. But that drive, which you can barely even find anymore, is over $200.

What's up with this pricing?


You might not be the right person to do anything about this, but I have a bookmarklet for hiding position:fixed elements when I'm trying to read something, and every time I scroll your blog it recreates the obnoxious share bar. Whyyyyyyy?

EDIT: PS- fun story! I enjoyed reading it except for all the social media buttons.


Hahahaha! Social sharing is great! Do not fear the share bar! Seriously though, sorry about that - not sure if we can do anything on our end as that's the way that widget works :( We might be able to add a "print" button so you can just print the post out, would that help?


It's not that I never share things, I'd just rather ctrl-C/ctrl-V than waste a bunch of screen space on every single webpage.

Looks like Firefox's reader view fixes it, I'll just need to get in the habit of using that instead of the Kill Sticky Headers bookmarklet as things catch on and circumvent it... https://alisdair.mcdiarmid.org/kill-sticky-headers/

I can only assume the thought process of whoever writes these social bars is "People are hiding our share buttons instead of clicking them! MAYBE IF WE SHOVE IT IN THEIR FACES REPEATEDLY THEY'LL LIKE THAT BETTER AND WANT TO SHARE!"

Gruber sums up my feelings nicely: https://daringfireball.net/2017/06/medium_dickbars


>Gruber sums up my feelings nicely: https://daringfireball.net/2017/06/medium_dickbars

Ironically, that site is perhaps more unreadable than a typical Medium page. It uses some hardcoded mobile format (?) that shows tiny text in the middle 20% of my 16:9 screen.


And you can't just double tap the text column to fit it to screen? That's odd, since it doesn't appear to set a maximum-scale or user-scalable=no on the viewport:

    <meta name="viewport" content="width=500, minimum-scale=0.45" />
#worksonmymachine


Heh, "medium dickbars", hell of a title, and I think that's EXACTLY what the social bar manufacturers are thinking. Unfortunately though, they're super useful :-/


> they're super useful

Useful for whom? The user, or website operator?

Snark aside, do the sharebars actually increase user engagement on your site?


The operator of course. They are definitely useful (at least on our end). We've found that making it easy to share the blogs increases the odds that folks actually boink the button and add a personalized message. Plus the share counting works as "social proof" for posts. If you read a blog post that 3 other people have shared you might abandon it before finishing. If 300 people have shared it, you'll likely be more prone to finish searching for why they thought it was so interesting.

That can be gamed though, but we keep ours "natural".


As a sysadmin I can't say how much I appreciate BB publishing these each year. It helps me make informed decisions on drive selection backed by data.


Yev from Backblaze -> Glad you like it! *Edit -> I was just at SpiceWorld in Austin and a lot of the sysadmins who were there enjoyed it quite a bit! I'm headed to JNUC in Minneapolis next week and last year we talked a lot about hard drive stats as well!


I always enjoy this site for anyone in the market. https://diskprices.com/?locale=us

FYI I have nothing to do with the site, just a user.


Sure, but at least for my locale, ALL links are to Amazon.


Do BB publish pricing of what they paid for drives? Sort of a TB/$/Failure rate. Those 12TB HGST drives must cost a ton


Disclaimer: I work at Backblaze.

> Do BB publish pricing of what they paid for drives? ... a TB/$/Failure rate

Thank you, that is EXACTLY the correct way to look at the failure statistics! So many people seem to sort the list by failure rate and think no matter what the cost, the lowest failure rate wins. For Backblaze, we just feed it back into the cost calculation. For example:

If a drive fails 1% more often but is 2% cheaper in total cost of ownership, we buy that drive. Now, total cost of ownership includes the physical space rental so more dense drives can be more expensive per TByte in raw drive cost because we can make some of that back up in physical space rental. Also, most drives seem to take about the same amount of electricity unrelated to how many TBytes are contained inside, so double the drive density is like saying it takes half as much electricity over its 4 - 5 year lifespan. Electricity is one of our largest datacenter costs, so we keep an eye on that also.
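To make that trade-off concrete, here is a minimal sketch of a per-TB cost of ownership calculation. Every number, field name, and default in it is hypothetical (these are not Backblaze's figures or its actual model), and it ignores warranty replacements, which would reduce the failure cost.

```python
from dataclasses import dataclass


@dataclass
class DriveModel:
    name: str
    capacity_tb: float
    price_usd: float
    annual_failure_rate: float  # fraction, e.g. 0.01 for 1%
    watts: float


def cost_per_tb(drive, years=5.0, usd_per_kwh=0.10,
                slot_usd_per_year=10.0, replace_labor_usd=20.0):
    """Hypothetical total cost of ownership per TB over the service life."""
    expected_failures = drive.annual_failure_rate * years
    # Each expected failure costs a replacement drive plus labor.
    failure_cost = expected_failures * (drive.price_usd + replace_labor_usd)
    # Power draw is per drive, regardless of how many TB it holds.
    kwh = drive.watts / 1000 * 24 * 365 * years
    power_cost = kwh * usd_per_kwh
    # Physical space is rented per drive slot, not per TB.
    space_cost = slot_usd_per_year * years
    total = drive.price_usd + failure_cost + power_cost + space_cost
    return total / drive.capacity_tb


# Made-up drives: the dense one costs more per raw TB and fails twice as often.
small = DriveModel("4TB", 4, 100.0, 0.01, 7.0)
dense = DriveModel("12TB", 12, 400.0, 0.02, 7.0)

for d in (small, dense):
    print(f"{d.name}: ${cost_per_tb(d):.2f} per TB over 5 years")
```

With these made-up inputs the denser drive still comes out cheaper per TB, because the per-slot power and space costs are spread over three times the capacity.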

But to answer your very first question, unfortunately we cannot release the price we paid for the drives due to the vendors requesting we don't disclose prices. But don't think we have some magically huge discount or anything. Most of the time we are literally paying about retail, maybe a 3% - 5% discount for buying in bulk orders of over 10,000 drives. But for reasons nobody at Backblaze can figure out, sometimes a bunch of new drives appears randomly at a really good discount price, then returns to the original price the next month. It might be some attempt along the supply chain to boost monthly or quarterly top line at the expense of profits, I don't know.


Cost adjusted by failure rate is definitely the best metric for a RAID. But when it's only one or two disks, and any failure means a giant hassle, it can be worth paying more to improve the odds.


When I'm just buying drives for my home machines, I'd pay much more for a lower failure rate because a dead drive means a lot of time and massive inconvenience plus some research and a trip to the computer store to buy a replacement.

I'd honestly pay double if I could be guaranteed to avoid all that.


There are some other downsides of course, but RAID1 basically gets you exactly that. Double price for much lower failure rate of the storage volume. Go with different manufacturers for the two drives to further reduce the failure rate at the probable cost of a little performance and space.
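A back-of-the-envelope sketch of why the mirror helps so much, assuming the two drives fail independently (the assumption that mixing manufacturers tries to make more realistic). The 2% failure rate and 7-day replacement window are hypothetical, and the formula is a small-probability approximation.

```python
def single_drive_loss(afr):
    """Chance per year of losing the volume with one drive."""
    return afr


def mirror_loss(afr, replace_days=7):
    """Approximate chance per year of losing a two-drive mirror.

    Either drive can fail first (2 * afr); the volume is lost only if
    the survivor also fails before the replacement is resynced.
    Assumes independent failures and small probabilities.
    """
    window = replace_days / 365
    return 2 * afr * (afr * window)


afr = 0.02  # hypothetical 2% annual failure rate
print(f"single drive: {single_drive_loss(afr):.4%} per year")
print(f"RAID1 mirror: {mirror_loss(afr):.6%} per year")
```

Drives from the same batch tend to fail together more often than this model allows, so the real improvement is smaller than the sketch suggests. RAID1 also doesn't protect against deletion, corruption, or theft, so it is not a backup.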


Yev from Backblaze here -> We don't typically post the cost of the drives, but one thing you can be sure of, is that we go for as good of a price-to-density ratio as we can! So if they were much more expensive than their counterparts we would likely avoid them until they came down enough for us to test them out!


No they don't.


$33,500 if you bought them from amazon right now, plus whatever it costs to ship 79 hard drives.


Disclaimer: I work at Backblaze.

> $33,500 if you bought them from amazon right now

About half of all of the money that flows into Backblaze from customers gets spent on the datacenter, and the largest component of that is drives. I think sales tax and property tax is the second largest datacenter component. It turns out you don't just owe property tax on buildings and land - if you own something of significant worth like $10 million worth of hard drives in a large pile, you owe the government "property tax" every month for owning that property.


Yes, called "tangible personal property tax" (even though it's used for business). Some states have it and some don't, with various exceptions and whatnot in all of them.


> if you own something of significant worth like $10 million worth of hard drives in a large pile, you owe the government "property tax" every month for owning that property.

If you're Intel, you potentially owe "property tax" every month on all the equipment in multi-billion dollar fabs. That's why those fabs only get built in states where the tax doesn't apply or where the state/county agrees to forgo it.


I had never thought about income on capital vs. income before Piketty made it the center of his theories, but I have since realised that a tax on capital is exactly what property tax is, and it is weird that it applies to only one type of capital.


Any plans on the European Datacenter?


Yev from Backblaze here -> yes but no ETAs at the moment!


That would be spectacular; I know this is obvious, but please do make sure that it actually can operate independently of the other locations and that end users have an easy way of controlling which region their data sits in.

There are legal reasons some companies might need to keep their data within the EU, and for others having multi-region backups is a checkbox to tick off on disaster preparations. Both are things the marketing department should use.


What does Backblaze do with all its older hard drives? Resell? Recycle?


Disclaimer: I work at Backblaze.

> What does Backblaze do with all its older hard drives? Resell? Recycle?

If the drive is still working (like you can read and write to it) when we cycle it out, we securely wipe it and then sometimes resell them in "bulk" to places like "Weird Stuff Warehouse" in the San Francisco area, or other places.

If the drive is not working, we physically destroy the hard drives (special equipment) and dispose/recycle the electronic waste of the carcass.


I think the Weird Stuff Warehouse closed down this past year. What are other examples of where you sell these cycled out but still operational drives?


> physically destroy the hard drives (special equipment)

Is that as fun a day as I'm picturing?


When you've got hundreds or thousands of drives to destroy it's not worth the man hours or consumables to cut them or drill holes in them. Even shooting them gets old quickly.

When I worked for a defense contractor we started one of our daily infosec briefings with a video of one of our shipyards destroying a few hundred hard drives. They put them all in a shipyard-size press brake. The end result was a very long line of C-shaped hard drives.

Anyone who deals with enough drives and doesn't have other business (like a shipyard) that they can borrow suitable equipment from will likely wind up buying an industrial shredder.[1]

[1] https://www.youtube.com/watch?v=sQYPCPB1g3o


Of course it depends on the volume you have to dispose of, but besides shredding there are also hydraulic "punchers", like:

https://www.youtube.com/watch?v=shiRKLw5qVI

Hint: any hydraulic press would do, albeit a tad bit slower.


What do you guys do with decommissioned but functional hard-drives?


Like Google, they probably wipe and maybe shred. I thought Google's drives were all shredded, but it turns out they do sell decommissioned drives if they can be verified as 100% wiped.

Google actually recently upgraded to using a robotic arm to manage the wiping and shredding of drives. [0]

[0] https://www.datacenterknowledge.com/google-alphabet/robots-n...


A little confusing: are you referring to shred the tool or some physical destruction process?

https://linux.die.net/man/1/shred


Physical destruction. Destruction of decommissioned drives by shredding, crushing, or passing a bolt through the drive is a standard practice for security-sensitive operators. Some IaaS operators I've worked with extend this to every component of systems that could ever hold customer or proprietary data in memory.



I wonder where I can purchase those.. I have a personal Plex Server that could use a few 4TB used drives that are cheap.


I'm betting they probably sell them to a bulk reseller of which there are many. If you're looking to buy from one they're out there but may or may not be looking for small customers.


Smart way of doing a wipe is to kill the encryption keys needed to access the data.


It's hard to 100 percent know your keys haven't been stolen. They are very small things, easily smuggled out in an ICMP packet, and must be in RAM of whatever machine the disk is in.

For many businesses, crushing the disk makes better business sense than taking the risk that they have historically lost encryption keys and are now about to hand their data away to an attacker.



Something seems to go wrong with WDC 6TB.


It's a relatively small sample though. 5 failed drives.
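To put a number on "small sample", here is a sketch that turns a failure count into an annualized failure rate with an exact Poisson (Garwood) 95% interval, using only the standard library. The drive-day figure is hypothetical, not taken from the report, and the rate formula is the usual failures per drive-year expressed as a percentage.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    return sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k + 1))


def _solve_decreasing(f, lo, hi, iters=100):
    """Bisection for the root of a function that decreases on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def failure_count_interval(k, alpha=0.05):
    """Exact interval for the Poisson mean given k observed failures."""
    ceiling = 10 * k + 50  # generous search bound for the upper limit
    upper = _solve_decreasing(
        lambda mu: poisson_cdf(k, mu) - alpha / 2, 0.0, ceiling)
    if k == 0:
        return 0.0, upper
    lower = _solve_decreasing(
        lambda mu: poisson_cdf(k - 1, mu) - (1 - alpha / 2), 0.0, ceiling)
    return lower, upper


def annualized_rate(failures, drive_days):
    """Failures per drive-year, as a percentage."""
    return failures / (drive_days / 365) * 100


failures = 5
drive_days = 36_800  # hypothetical: e.g. 400 drives over a 92-day quarter
low, high = failure_count_interval(failures)
print(f"AFR estimate: {annualized_rate(failures, drive_days):.2f}%")
print(f"95% interval: {annualized_rate(low, drive_days):.2f}% "
      f"to {annualized_rate(high, drive_days):.2f}%")
```

With five failures the interval on the count runs from roughly 1.6 to 11.7, so the upper bound on the rate is about seven times the lower one, whatever the drive-day total is.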


cool



