* My company installed about 4500 Seagate Barracuda ES.2 drives (500 GB, 750 GB and mostly 1 TB) between 2008 and 2010. These drives are utter crap: in 5 years we got about 1500 failures, much worse than the previous generation (250 GB to 750 GB Barracuda ES).
* After replacing several hundred drives, we decided to jump ship in 2010 and went with Hitachi (nowadays HGST). Of the roughly 3000 Hitachi/HGST drives used in the past 3 years, we had about 20 failures. Only one of the 200 Hitachi drives shipped between 2007 and 2009 failed. Most of the failed drives were 3 TB models, ergo the 3 TB HGST HUA drives are less reliable than the 2 TB, themselves less reliable than the 1 TB model (which is, by every measure, absolutely rock solid).
* Of the few WD drives we installed, we replaced about 10% in the past 3 years. Not exactly impressive, but not significant either.
* We replaced a number of Seagate Barracudas with Constellations, and these seem reliable so far; however, the numbers aren't significant enough (only about 120 used in the past 2 years).
* About SSDs: SSDs are quite a hit-and-miss game. We started back in 2008 with M-Tron (now dead). M-Tron drives were horribly expensive, but my main compilation server still runs on a bunch of these. Of all the M-Tron SSDs we had (from 16 GB to 128 GB), not one ever failed. They are 5 years old now, and still fast.
We've tried some other brands: Intel, SuperTalent... Some SuperTalent SSDs had terrible firmware, and the drives would crash under heavy load! They disappeared from the bus when stressed, but came back OK after a power cycle. Oh my...
So far, unfortunately, SSDs seem to be about as reliable as spinning rust. The latest generations fare better, and may actually best current hard drives (we'll see in a few years how they hold up in retrospect).
1. Infant mortality. Drives fail within the first couple of months of use.
2. The 3-year mark. This is where failures begin for typical workloads.
3. The 4-6 year mark. This is when you can expect the drives that haven't failed earlier to start failing. By this point, we're looking at a 33% failure rate.
Interesting that my experiences roughly match up with Chart 1.
My experience is with 10k and 15k RPM SAS drives. Slower 7200 RPM drives? No idea; I haven't used them in servers in a while. They seem more of a crapshoot to me. SSDs, thus far, are even more of a crapshoot; we don't use them in servers, and only hesitantly in desktops/laptops, and only Intel.
It is very disappointing how flaky and unreliable SSD devices have been, when their promise, thanks to the lack of moving parts, was just the opposite.
Back in 1999/2000 I had a habit of building some personal as well as commercial servers in datacenters with compact flash parts (plain old consumer CF drives) as boot devices with the goal of fault tolerance in mind. There was a price to be paid in that these devices needed to be mounted, and run, read-only.
But they ran forever. I never had one part fail. Just plain old CF drives mated directly to the IDE interface.
Now fast forward to 2013 and new servers we deploy for rsync.net have a boot mirror made of two SSDs ... things have gone well, but our general experience and anecdotal evidence from other parties gives us pause.
One thought: an SSD mirror, if it fails from some weird device bug or strange "wear" pattern, would fail entirely, since both members of the mirror are getting the exact same treatment. For that reason, when we build SSD boot mirrors, we do so with two different parts - either one current-gen and one previous-gen Intel part, or one Intel part and one Samsung part. That way, if there is some strange behavior or defect or wearing issue, they won't both experience it.
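A minimal sketch of that idea with ZFS (assuming ZFS is in play, with hypothetical device names; a real boot pool also needs partitioning and bootloader bits this glosses over):

$ # mirror two SSDs of *different* makes so one firmware/wear bug can't take out both
$ zpool create bootpool mirror /dev/ada0 /dev/ada1

The same reasoning applies under mdadm or gmirror; the point is only that the two members differ.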
If you followed up on your idea of a read-only root, like you did with the CF cards, and figured out a safe place for the logs, you could use the SSDs in the same mode. Why not go that route?
But if you stick with boot SSDs that are both read from and written to, using different makes sounds like a good strategy.
It was a huge win for uptime.
I'd echo the sentiment seen elsewhere in the comments about Seagate drives vs. Hitachi drives. Both for SATA and NL-SAS. Hitachi 1TB were rock solid compared to Seagate.
* Most consumer drives over 2TB have extremely poor reliability. Just check any Amazon or Newegg review (DOA and early mortality show up with noticeable frequency). Yes, I know reviews are not an accurate measure, but since there is no public information on drive failure rates, there is not really much else to go on.
* The reduction of manufacturer warranties since the Thailand floods. Surprise: they never changed them back to the original 3 years.
If you have a large array of disks, there is nothing really to worry about. If you have a small set of drives, spend a little extra and get the "Black" or RE drives with the 5 year warranty. Avoid any "Green" drive.
Check your S.M.A.R.T. data. Look at the head park number (Load_Cycle_Count, I think it is called; can't look it up now). If it is a six digit number, you are in trouble. For a server you want it to be on the same order as the number of power-ups. Anything else and you have to ask yourself "why?"
Edit: adding. The 1TB and smaller Greens were disasters; I ruined a lot of them. I was told the 2TB and up Greens didn't have the head park issue, but I spent part of last week replacing a storage unit populated with 2TB Greens when a spindle failed (>200 unrecoverable blocks) and found that some of the 2TB Greens were load cycling into the 200000 range, while others weren't running up the count. They were all identical models purchased at the same time. Maybe they had different firmware? I replaced them with REDs. They aren't supposed to park, and they won't try to recover a bad sector for more than a few seconds, so they don't hang your RAID when they get bad sectors.
I can second the >200 bad blocks. Sometimes they still work fine after running badblocks -w on them a few times and raising the timeout.
The number I have been finding to be high is Hardware_ECC_Recovered (values between 1036555546 and 2699460003); not sure if that's normal. I've also had two 1.5TB drives now end up with unrecoverable sectors. RAID recovers from that just fine, but I've been replacing them as it keeps recurring and is supposed to be a signal of failing disks. These 1.5TB drives are 3+ years old, and I've been thrashing them a bit lately. I'd have expected them to last longer, though.
$ for dev in /dev/sd?; do echo -n "$dev "; sudo smartctl -A "$dev" | awk '/Load_Cycle_Count/ {print $10}'; done
We recently had 3 servers have two drives each fail within hours of each other, with about two weeks between each of the 3 servers. These were 3 out of 4 servers that had been configured at the same time, with drives from the same delivery - clearly something had gone wrong.
Usually we try to mix drive types, but we didn't have enough suitable drives when we had to bring these up. Thankfully we do have everything replicated multiple times, and we very specifically avoided replicating things from one of the new servers to another.
When we brought them back online we got a chance to juggle drives around, so now they're nicely mixed in case we get more failures.
For my private setup, I've gone with a mirror + rsync to a separate drive with an entirely different filesystem + CrashPlan. Setups like that seem paranoid until you suffer a catastrophic loss, or a near loss...
My first big scare like that was a decade or so ago, when we had a 10 or 12 drive array of IBM Deathstars (Deskstars) that started failing, one drive after the other, about a week apart, and the array just barely managed to rebuild... It was particularly bad because the rebuilds slowed the array down so much that we were unable to complete a backup a day while running our service too, and taking downtime was unacceptable. So our backups lagged further and further behind while we waited for the next drive failure... Those were some tense weeks.
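A minimal sketch of the rsync leg of such a setup, assuming the mirror is mounted at /mnt/mirror and the separate drive at /mnt/backup (both paths hypothetical):

$ # one-way copy onto the independent drive; --delete prunes files removed from the source
$ rsync -aHx --delete /mnt/mirror/ /mnt/backup/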
What I am doing now, as soon as I start getting unrecoverable errors, is to replace the drives one at a time with whatever is the best cost/TB drive; a sketch of the mechanics follows below. Once all the drives have been upgraded, I can resize the array to the new minimum drive size.
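For illustration, here is roughly what that looks like with Linux md RAID (device names hypothetical; other RAID stacks have equivalents):

$ # swap out the drive throwing unrecoverable errors for the new, larger one
$ mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
$ mdadm /dev/md0 --add /dev/sdc1
$ # after the last member has been upgraded, grow the array to the new minimum size
$ mdadm --grow /dev/md0 --size=max

(followed by growing the filesystem on top, e.g. with resize2fs).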
Some of them were also crippled in firmware so you couldn't use them in RAID1 arrays, but this might have changed.
‣ In Linux you can manually ...
• check the power state of your drive using: hdparm -C /dev/sda
• manually spin down the drive (standby) using: hdparm -y /dev/sda (it will immediately spin up at the first attempt to read a sector)
‣ Or you can configure the drive's automatic standby (which also does not involve the OS)...
• hdparm -S n /dev/sda will configure the drive's standby timeout; the value encodes the time to spin-down on a non-linear scale (check the manpage)
• hdparm -B n /dev/sda will configure another type of power management which doesn't specify a fixed timeout, but rather a vendor-defined set of power saving measures on a scale of 1..254 (1: most aggressive power saving, 254: least; values of 127 and below permit spin-down)
The latter two options are handled internally by the drive and (as far as I know) are even stored in non-volatile memory, so they survive reboots.
(Edit: fixed my broken English ;-) )
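To make the above concrete, a few sample invocations (device name hypothetical; the -S encoding is from the hdparm manpage):

$ sudo hdparm -C /dev/sda      # report current power state
$ sudo hdparm -y /dev/sda      # force standby (spin down) right now
$ sudo hdparm -S 241 /dev/sda  # auto-standby after 30 minutes (241 = 1 unit of 30 min)
$ sudo hdparm -B 127 /dev/sda  # APM level 127: aggressive power saving, spin-down permitted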
At the very bottom, in small, grey text, you'll find "IntelliPower" defined as:
"A fine-tuned balance of spin speed, transfer rate and caching algorithms designed to deliver both significant power savings and solid performance. For each WD Green drive model, WD may use a different, invariable RPM."
Been running 3 of these in a RAID-5 NAS, no issues so far (not that that's any kind of indicator on a system which idles as a backup device all day)
1) select the make and model of drive you want
2) buy the same model of drive from multiple vendors, so you get different serial and build numbers... even if you're only buying two drives, buy each from a separate location or vendor.
3) mix up the drives so they don't all die together. Place stickers with the purchase date and invoice number on each drive to keep them straight.
This is all because when one drive goes, due to a defect or hitting a similar MTBF, other drives with a nearby serial or build number tend to die around the same time for similar reasons.
From owning hard drives over 8 or 9 generations of replacing or upgrading since the '90s, on all types of servers, desktops and laptops: the day you buy a new piece of equipment is the day you buy its death. Manage that death proactively, as it gets more and more tiring to deal with each time.
Drives have died for me both in 24/7 powered systems and through power cycles. Drives have reported intermittent failures for many months, but still lived for years without any actual data loss. The oldest drive I still have spinning is a 200G IDE containing the OS for my old OpenSolaris zfs NAS; must be getting on for 9 years.
I advise having a backup of every drive you own, preferably two. I built a new NAS last week, 12x 4TB drives in a raidz2 configuration; with zfs snapshots, it fulfills 2 of the 3 requirements for backup (redundancy and versioning), while I use CrashPlan for cloud backup (distribution, the third requirement). The nice thing about CrashPlan is that my PCs can back up to my NAS as well, so restores are nice and quick; pulling from the internet is a last resort.
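For anyone curious, a minimal sketch of that pool layout (hypothetical FreeBSD-style device names; adjust to your system):

$ # 12-drive raidz2: survives any two simultaneous drive failures
$ zpool create tank raidz2 da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10 da11
$ # recursive snapshot of every dataset, covering the versioning requirement
$ zfs snapshot -r tank@weekly-1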
Incidentally, about "consumer-grade drives", the last time I looked into this, I was led to believe that if it's SATA and 7200RPM (or less), there's no hardware distinction. It's just firmware. Consumer drives try very hard to recover data from a bad sector, while Enterprise/RAID drives have a recovery time limit to prevent them being unnecessarily dropped from an array (which will have its own recovery mechanisms). That's it.
There is a long feature reference that mentions things like higher RPM, better build quality, larger magnets, air turbulence control, dual processors, etc.
I'm no specialist in hard drives; I just remember reading this stuff when trying to figure out whether I needed it. In the end, for my small-scale corporate file server, I chose zfs raidz with consumer grade disk drives.
 Enterprise-class versus Desktop-class Hard Drives: http://download.intel.com/support/motherboards/server/sb/ent...
They even admit to the problem themselves at the end:
"Some hard drive manufactures may differentiate enterprise from desktop drives by not testing certain enterprise-class features, validate the drives with different test criteria, or disable enterprise-class features on a desktop class hard drives so they can market and price them accordingly. Other manufacturers have different architectures and designs for the two drive classes. It can be difficult to get detailed information and specifications on different drive modes."
That PDF tells me nothing interesting. It's marketing crap for clueless executives, not a technical analysis. (Given their absurd obsession with "Higher RPM" as some sort of defining characteristic, it's not even relevant to the statement I made in the first place.)
Certainly the old 9.1 GB SCSI disks that were so popular 10 years ago are well past the point of being worth powering on now.
But these drives will still be useful. What about, say, shipping them to NGOs located in Africa?
But there are other considerations:
* This would also result in a big pile of waste in Africa, as their recycling infrastructure is limited.
* They need food, shelter, stable politics and functional education before they can make any use of computers.
* They have limited energy supply. Low powered tablets / laptops are much more useful.
But it is possible that the logistics/shipping cost more than producing new hardware.
Also, cheaper, more energy-efficient hardware could be more cost-effective.
source: A friend of mine worked for an NGO in Burkina Faso
While most of the western world has huge freight ship docks that can load/unload and ship relatively cost-efficiently, Africa does not.
This means, the primary means to ship stuff there is via airplane, which is notoriously weight constrained, which in turn makes shipping bulk goods expensive.
This makes shipping newer, lighter tech even more cost effective than old hardware.
p.s. This is probably true for anything they need to import, not just tech.
Many countries on the continent appear to have rather robust shipping capacity for their economies.
I don't think that's a horrible idea, because I routinely donate old computer tech to non-profits, which do make use of it, usually to assemble computers for children. Granted, this has negligible cost to me, as we are in the same country.
When one has access to cheap tech from the Best Buy just around the corner, it is difficult to imagine how expensive it can be in third world countries. It may also be difficult to grasp how incredibly old some of the computers that do exist there are.
I can't believe that it would be so expensive to ask people and companies to donate their old "junk", fill up a container and ship it.
> This would also result in a big pile of waste in Africa, as their recycling infrastructure is limited.
That may be so. That much tech waste could also create the necessary conditions for a recycling industry to start. Assuming that it doesn't already.
> They need food, shelter, stable politics and functional education before they can make any use of computers.
Is that so? Can't computers help them achieve those goals?
> They have limited energy supply. Low powered tablets / laptops are much more useful.
Yes, many locations completely lack power, and ordinary desktop computers wouldn't work there. But I don't think that's true for all African countries.
While this is a cool idea, how much bandwidth would you need to boot at roughly the same speed you do today? Some SSDs have 500 MB/s (~4 Gbit/s) reads. You'd need multi-gigabit networking with almost zero latency for that to perform well.
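Rough back-of-the-envelope: a gigabit link moves at most ~125 MB/s, a quarter of that SSD's 500 MB/s, so even before latency you'd want 10GbE (or several bonded gigabit links). Loading, say, a hypothetical 2 GB boot image would take ~16 seconds over plain gigabit versus ~4 seconds from the local SSD.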
I suppose a smaller OS like Chrome OS would be perfect for that. Even if this worked on fiber, how would you boot over a cellular network? Aside from costing you a ton of money, it would take forever to download.
Hard drive space per dollar grows exponentially, and they're big weighty things. The window of time where it would be economical to reuse would be short, and value dubious.
My experience has been hit-and-miss: I've gotten a few drives replaced, had a few warranties expire. But pretty much every disk drive fails eventually.
Think about it - it's a commodity. If it lasted much longer than the warranty, they spent too much on robustness for the price.
Proper statistical analysis would help you there.
Yes, if you know the probability distribution. If you don't know the distribution, you cannot calculate your confidence, and thus cannot do a proper statistical analysis.
And, guess what, nobody knows the probability distribution of hard drive failures. That's exactly what they are trying to find out.
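To make that concrete with the numbers from the first comment above: 1500 failures out of 4500 drives over 5 years is, naively, 1500 / (4500 × 5) ≈ 6.7% annualized failure rate. But that single number tells you nothing about the shape of the distribution (infant mortality versus wear-out), which is exactly what you'd need to predict when the survivors will fail.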
Amazon could use some competition in this space, IMNHO.
80% of drives surviving after 5 years seems right; this is what we're seeing as well. The hardware is decommissioned faster than the drives fail.
I'm not sure that the information would be all that valuable anyway. Google's data-center environments, workloads and requirements are likely pretty different from your environment and requirements, so how useful would the information really be?
We are constantly looking at new hard drives, evaluating them for reliability and power consumption. The Hitachi 3TB drive (Hitachi Deskstar 5K3000 HDS5C3030ALA630) is our current favorite for both its low power demand and astounding reliability. The Western Digital and Seagate equivalents we tested saw much higher rates of popping out of RAID arrays and drive failure. Even the Western Digital Enterprise Hard Drives had the same high failure rates. The Hitachi drives, on the other hand, perform wonderfully.
In the second article they say that WD-RED is 2nd in reliability (WD-RED did not exist in 2011). I'm happy that I've got a cheap Hitachi Ultrastar. But who knows.
As a personal anecdote: WD-Green failure rates are huge here. 24/7 desktop machines, 240 drives; I've replaced at least 20 drives in the last 12 months.
We replaced the drives with a different brand and the 'failures' went away.
There is indeed a firmware setting for how long the drive spends checking for errors after it detects one, during which the drive doesn't respond. Sometimes this is too long for the RAID controller, so it drops the drive.
It used to be you could buy WD Caviar Black drives and tweak that timeout setting on the controller to effectively have WD RE drives (enterprise version of the WD Black drives). They removed that "feature" a few years ago.
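On drives that still expose SCT Error Recovery Control, the same timeout can be inspected and set with smartctl (device name hypothetical; whether the setting survives a power cycle is drive-dependent):

$ sudo smartctl -l scterc /dev/sda        # show the current read/write recovery timeouts
$ sudo smartctl -l scterc,70,70 /dev/sda  # limit both to 7.0 seconds (units of 100 ms)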
Odd question, but I've always been wondering: these things just seem to last forever.
I don't have any hardware to read my pile of 5.25" Atari 800 disks.
Also, no hardware RAID, battery, or cap.
Source: worked at Eye-Fi, built 2PB storage
It is not true that the pod team must remove the 4U server from the rack. It is slid out like a drawer (no tools required, takes maybe 10 seconds). The drive or motherboard is then replaced, then you slide the drawer back in. So the 4U server must slide 18 inches one way, but zero cables have to be unplugged or replugged when done. This only takes one technician and no "server lift", the drawer supports all the weight.
I'm not defending this design, just correcting a mistake. Backblaze frankly "makes do" with this design because nobody will step up and make anything that fits our needs better. The number 1 criterion is total system cost over the lifetime of the system, INCLUDING all the time spent on salaries of datacenter techs dealing with the pods. "Raw I/O performance" is not that important for backup, so trying to sell us an awesome EMC or NetApp that costs 10x as much and has 10x the raw I/O performance is not very compelling to us. But if you came up with a design letting our datacenter technicians replace a drive faster while not significantly increasing overall costs in another area, we SURELY would listen.
While I don't recommend them outright, we settled on 3U boxes from SuperMicro. http://www.supermicro.com/products/chassis/3u/837/sc837e26-r...
We somewhat affectionately dubbed them "mullets" as in business in the front, party in the rear.
They make 4U devices as well. Cost was about $1000. We added LSI MegaRAID 9280 controllers, about another $1500, and ran mini-SAS back to a controller node responsible for 4 JBODs.
1. you have to muck around with more firmware and sometimes reboot in order for changes to take effect
2. if a controller dies, you have to replace it with (almost) the exact same controller in order to read the data
3. Datacenters rarely lose power; take the HW RAID money and instead put servers on true A+B power feeds.
CPUs are so fast these days that they can easily handle in software all the "stuff" that HW raid used to do.
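For what it's worth, the software equivalent is a one-liner these days; a sketch with hypothetical device names:

$ # software RAID6 across 8 disks, parity computed on the host CPU
$ mdadm --create /dev/md0 --level=6 --raid-devices=8 /dev/sd[b-i]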
Their hardware design is specifically geared towards their use-case and I applaud them for knowing how to optimize for their use-case. I wouldn't use it for mine but only because it's not a good fit.
They can open-source the hardware because the real secret sauce is the software and the hardware open sourcing gives them a nice edge in marketing.
Edited to add: They've optimized for hardware purchase price and given up reliability (HW RAID, battery, cap), performance, and maintainability. The strange thing is the overall cost of the storage system is driven by power, not purchase price. Smarter RAID controllers, like I link above, let you manage power by spinning down disks as they are unused and thereby reducing your power draw. Can't do that with SW RAID that I've ever seen. Take a look at Amazon Glacier which I suspect is using this power-off strategy to drastically reduce their costs.
As for saving power by spinning down disks, it is likely to be useful to them and is completely possible even in SW RAID though it requires some managing to perform effectively.
There isn't much that is applicable directly to most other use-cases but if your data is mostly sitting idle and you only need occasional access to it the backblaze pod is a nice design. If you care about performance and do not deploy multiple pods with redundancy between them you are not likely to be happy with the result.
I've restored just a few files from Backblaze. While it's an "offline" operation where you choose the file, then get a notification when it's ready to be downloaded, it took only a handful of minutes.
It's not why I signed up with them, but it was delightful that it worked.
See also: http://en.wikipedia.org/wiki/Amazon_Glacier#Storage
So SpinRite may be handy, but throw the drive away after use.
Not everyone lives in the EU. In fact, the majority of people don't.
Outside your regulation-happy haven, warranty periods aren't random and do indicate durability under normal use.
Care to back that up with any real data instead of baseless consumer speculation relying on time travel?
Are there new paid spots on the first page of HN that I'm unaware of? (It would make sense, I guess, from a business perspective.)
TIA to anyone that can be of help on this,
cheerio, (and good luck to Backblaze, backblaze a path to a backblazing success!)
Don't like it? Don't vote for it.
I suspect the reason why people do "burn in" tests on hard drives is to make drives that suffer from early failure ("infant mortality" as described in the article) fail early enough that you can RMA them with the manufacturer. Apart from that, I don't think there's much you can do to improve your chances.
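A typical burn-in along those lines (assuming a Linux box; note that badblocks -w is destructive, wiping the drive, which must be unmounted):

$ sudo badblocks -wsv /dev/sdX   # write-mode test: four patterns across the whole disk, with progress
$ sudo smartctl -t long /dev/sdX # then a long SMART self-test before trusting the drive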
The article actually makes this point (about anecdote), but their data suggests that failure rates do rise substantially after three years.
The optical drives I've had, on the other hand, are actually unreliable. They all seem to break down after about four years, and I don't use them all that often!
There is an inherent bias in reviews, which is what makes the Backblaze report so interesting: they have less of a bias, though they don't report actual disk vendors and models, so you can only draw out the general trend rather than direct inferences.
I think this bias is most evident in the reduced warranty periods, compared to before, when 5 years was quite normal.
Can't seem to find relevant information on the website anywhere for this.
A little statistics is a dangerous thing.