> I have been authoring the various Drive Stats reports for the past ten years and this will be my last one. I am retiring, or perhaps in Drive Stats vernacular, it would be “migrating.”
Seriously. In addition to helping inform my purchasing decisions, these reports also taught me the most valuable lesson of data: that it can only ever inform most likely patterns, but never guarantee a specific outcome. Until you act upon it, the data has no intrinsic value, and once you act upon it, it cannot guarantee the outcome you desire.
Thanks for a decade of amazing statistics and lessons. Enjoy resilvering into retirement!
It's not a best practice, but for the last 10 years I've run my home server with a smaller, faster drive for the OS and a single larger disk for bulk storage that I choose using Backblaze Drive Stats. None of them have failed yet (fingers crossed). I really trust their methodology and it's an extremely valuable resource for me as a consumer.
My most recent drive is a WDC WUH722222ALE6L4 22TB, and looking at its stats in this report (albeit only a few months of data), along with the overall trend for WDC, gives me peace of mind that it should be fine for the next few years until it's time for the cycle to repeat.
I am becoming more and more convinced that hard drive reliability is linked to the batch more than to the individual drive models themselves. Often you will read online of people experiencing multiple failures from drives purchased from the same batch.
I cannot prove this because I have no idea about Backblaze's procurement patterns, but I bought one of the better drives in this list (ST16000NM001G) and it failed within a year.
When it comes to hard drives, or storage more generally, a better approach is to protect yourself against downtime with software RAID and backups, and pray that if a drive does fail it does so within the warranty period.
>Often you will read online of people experiencing multiple failures from drives purchased from the same batch
I'll toss in on that anecdata. This has happened to me several times. In all these cases we were dealing with drives with more or less sequential serial numbers. In two instances they were just cache drives for our CDN nodes. Not a big deal, but I sure kept the remote hands busy those weeks trying to keep enough nodes online. In a prior job, it was our primary storage array. You'd think that RAID6 + hot spare would be pretty robust, but 3 near-simultaneous drive failures made a mockery of that. That was a bad day. The hot spare started doing its thing with the first failure, and if it had finished rebuilding before the subsequent failures, we'd have been OK, but alas.
This has been the "conventional wisdom" for a very long time. Is this one of those things that get "lost with time" and every generation has to rediscover it?
Like, 25+ years ago I would've bought hard drives for just my personal usage in a software raid making sure I don't get consecutive serial numbers, but ones that are very different. I'd go to my local hardware shop and ask them specifically for that. They'd show me the drives / serial numbers before I ever even bought them for real.
I even used different manufacturers at some point when they didn't have non-consecutive serials. I lost some storage because the drives weren't exactly the same size even though the advertised size matched, but better than having the RAID and extra cost be for nothing.
I can't fathom how anyone that is running drives in actual production wouldn't have been doing that.
I had to re-learn this as well. Nobody told me. Ordered two drives, worked great in tandem until their simultaneous demise. Same symptoms at the same time.
I rescued what could be rescued at a few KB/s read speed and then checked the serial numbers...
It’s inconvenient compared to just ordering 10x or however many of the same thing and not caring. The issue with variety too is different performance characteristics can make the array unpredictable.
Of course, learned experience has value in the long term for a reason.
Nearly every storage failure I've dealt with has been because of a failed RAID card (except for thousands of bad Quantum Bigfoot hard drives at IUPUI).
Moving to software storage systems (ZFS, Storage Spaces, etc.) has saved my butt so many times.
Same thing I did, except I only wanted WD Red drives. I bought them from Amazon, Newegg, and Micro Center. Thankfully none of them were those nasty SMR drives, not sure how I lucked out.
Well to me the report is mostly useful to illustrate the volatility of hard drive failure. It isn't a particular manufacturer or line of disks, it's all over the place.
By the time Backblaze has a sufficient number of a particular model and sufficient time lapsed to measure failures, the drive is an obsolete model, so the report cannot really inform my decision for buying new drives. These are new drive stats, so not sure it is that useful for buying a used drive either, because of the bathtub shaped failure rate curve.
So the conclusion I take from this report is that when a new drive comes out, you have no way to tell if it's going to be a good model, a good batch, so better stop worrying about it and plan for failure instead, because you could get a bad/damaged batch of even the best models.
> I am becoming more and more convinced that hard drive reliability is linked to the batch more than to the individual drive models themselves.
Worked in a component test role for many years. It's all of the above. We definitely saw significant differences in AFR across various models, even within the same product line, which were not specific to a batch. Sometimes simply having more or fewer platters can be enough to skew the failure rate. We didn't do in-depth forensics on models with higher AFRs, as we'd just disqualify them and move on, but I always assumed it probably had something to do with electrical, mechanical (vibration/harmonics) or thermal differences.
My server survived multiple drive failures. ZFS on FreeBSD with mirroring. Simple. Robust. Effective. Zero downtime.
Don't know about disk batches, though. Took used old second-hand drives. (Many different batches due to procurement timelines.) Half of them were thrown out because they were clicky. All were tested with S.M.A.R.T. Took about a week. The ones that worked are mostly still around. Only a third of the ones that survived S.M.A.R.T. have failed so far.
I didn't discover ZFS until recently. I played around with it on my HP Microserver around 2010/2011 but ultimately turned away from it because I wasn't confident I could recover the raw files from the drives if everything went belly up.
What's funny is that about a year ago I ended up installing FreeBSD onto the same Microserver and ran a 5 x 500GB mirror for my most precious data. The drives were ancient but not a single failure.
As someone who never played with hardware raid ZFS blows my mind. The drive that failed was a non issue because the pool it belongs to was a pool with a single vdev (4 disk mirror). Due to the location of the server I had to shut down the system to pull the drive but yeah I think that was 2 weeks later. If this was the old days I would have had to source another drive and copy the data over.
IME heat is a significant factor with spindle drives. People will buy enterprise-class drives, then stick them in enclosures and computer cases that don't flow much air over it, leading to the motor and logic board getting much warmer than they should.
I have four of those drives mentioned and the one that did fail had the highest maximum temperature according to the SMART data. It was still within the specs though by about 6 degrees Celsius.
The drives are spaced apart by empty drive slots and have a 12cm case fan cranked to max blowing over them at all times.
It is in a tower though so maybe it was bumped at some time and that caused the issue. Being in the top slot this would have had the greatest effect on the drive. I doubt it though.
Usage is low and the drives are spinning 24/7.
Still I think I am cursed when it comes to Seagate.
With the added complication that the controller should be kept cool, but the flash should run warm.
The NVMe drives in my servers have these little aluminium cases on them as part of the hotswap assembly. They manage the temperature differential by using a conductive pad for the controller, but not the flash.
This. My new Samsung T7 SSD overheated and took 4T of kinda priceless family photos with it. Thank you Backblaze for storing those backups for us!
I missed the return window on the SSD so now have a little fan running to keep the thing from overheating again
>It's not a best practice, but for the last 10 years I've run my home server with a smaller, faster drive for the OS and a single larger disk for bulk storage that I choose using Backblaze Drive Stats. None of them have failed yet (fingers crossed). I really trust their methodology and it's an extremely valuable resource for me as a consumer.
I've also had multiple drives in operation over the past decade and didn't experience any failures. However, unlike you, I didn't use Backblaze's drive stats to inform my purchases. I just bought whatever was cheapest, knowing that any TCO reduction from higher reliability (at best, around 10%) would be eaten up by the lack of discounts on the "best" drive. That's the problem with n=1 anecdotes. You don't know whether nothing bad happened because you followed "the right advice", or you just got lucky.
Nobody should ever have peace of mind about a single drive. You probably have odds around 5% that the storage drive fails each cycle, and another 5% for the OS drive. That's significant.
And in your particular situation, 3 refurbished WUH721414ALE6L4 are the same total price. If you put those in RAIDZ1 then that's 28TB with about as much reliability as you can hope to have in a single device. (With backups still being important but that's a separate topic.)
Drive manufacturers often publish the AFR. From there you can do the math to figure out what sort of redundancy you need. Rule of thumb is that the AFR should be in the 1-2% range. I haven't looked at BB's data, but I'm sure it supports this.
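For anyone who wants to play with that math, here is a minimal sketch. It assumes independent failures and treats a whole year as a single exposure window with no replacement, which batch effects (discussed elsewhere in this thread) and real rebuild behavior both violate, so treat it as an illustration of the shape of the math, not a prediction. The AFR and group sizes are illustrative.

```python
# Back-of-the-envelope: chance of losing more drives than a group's parity
# can absorb within one year, assuming independent failures at a fixed AFR.
# Ignores rebuild windows and replacement; numbers below are illustrative.
from math import comb

def p_too_many_failures(afr: float, n_drives: int, parity: int) -> float:
    """P(more than `parity` of `n_drives` fail in a year at annual rate `afr`)."""
    return sum(comb(n_drives, k) * afr**k * (1 - afr)**(n_drives - k)
               for k in range(parity + 1, n_drives + 1))

for parity in (1, 2):
    p = p_too_many_failures(afr=0.015, n_drives=8, parity=parity)
    print(f"8-drive group, {parity} parity drive(s), 1.5% AFR: {p:.4%}")
```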
Note, disk failure rates and RAID or similar solutions should be used when establishing an availability target, not for protecting against data loss. If data loss is a concern, the approach should be to use backups.
You picked a weird place to reply, because that comment is just saying what "cycle" means.
But yes, I've done the math. I'm just going with the BB numbers here, and after a few years it adds up. The way I understand "peace of mind", you can't have it with a single drive. Nice and simple.
Not an expert but I’ve heard this too. However - if this IS true, it’s definitely only true for the biggest drives, operating in huge arrays.
I've been running a btrfs raid10 array of 4TB drives as a personal media and backup server for over a year, and it's been going just fine. Recently one of the cheaper drives failed, and I replaced it with a higher-quality NAS-grade drive. Took about 2 days to rebuild the array, but it's been smooth sailing.
The bit error rates on spec sheets don't make much sense, and those analyses are wrong. You'd be unable to do a single full drive write and read without error, and with normal RAID you'd be feeding errors to your programs all the time even when no drives have failed.
If you're regularly testing your drive's ability to be heavily loaded for a few hours, you don't have much chance of failure during a rebuild.
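To make that concrete, here is a quick sketch of what the spec-sheet numbers predict if you take them literally. The 1-per-1e14 and 1-per-1e15 URE figures are the commonly quoted consumer/enterprise spec values, not anything from Backblaze's data:

```python
# If the quoted unrecoverable-read-error rates were literally true, a single
# full pass over a large drive should hit errors regularly -- which is not
# what people observe, hence the skepticism about those analyses.
drive_tb = 12
bits_per_full_read = drive_tb * 1e12 * 8

for label, bits_per_ure in (("consumer spec, 1 per 1e14", 1e14),
                            ("enterprise spec, 1 per 1e15", 1e15)):
    expected = bits_per_full_read / bits_per_ure
    print(f"{drive_tb} TB full read @ {label}: ~{expected:.2f} expected UREs")
```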
I end up doing this too, but ensure that the "single data disk" is regularly backed up offsite too (several times a day, zfs send makes it easy). One needs an offsite backup anyway, and as long as your home server data workload isn't too high and you know how to restore (which should be practiced every so often), this can definitely work.
RAID doesn't cover all of the scenarios that offsite backup does, such as a massive electrical power surge, fire, flood, theft or other things causing total destruction of the RAID array. Ideally you'd want a setup that has local storage redundancy in some form of RAID plus offsite backup.
In fact, for home users backup is WAY more important than RAID, because your NAS being down for the duration of a restore is not that important, but data loss is forever.
For essential personal data you're right, but a very common use case for a home NAS is a media server. The library is usually non-essential data - annoying to lose, but not critical. Combined with its large size, it's usually hard to justify a full offsite backup. RAID offers a cost-effective way to give it some protection, when the alternative is nothing
For a number of people I know, they don't do any offsite backup of their home media server. It would not result in any possibly-catastrophic personal financial hassles/struggles/real data loss if a bunch of movies and music disappeared overnight.
The amount of personally generated sensitive data that doesn't fit on a laptop's onboard storage (which should all be backed up offsite as well) will usually fit on like a 12TB RAID-1 pair, which is easier to back up than 40TB+ of movies.
Same here, I use raid 1 with offsite backups for my documents and things like family pictures. I don't backup downloaded or ripped movies and TV shows, just redownload or search for the bluray in the attic if needed.
I think there's a very strong case to be made for breaking up your computing needs into separate devices that specialize in their respective niche. Last year I followed the 'PCMR' advice and dropped thousands of dollars on a beefy AI/ML/Gaming machine, and it's been great, but I'd be lying to you if I didn't admit that I'd have been better served taking that money and buying a lightweight laptop, a NAS, and gaming console. I'd have enough money left over to rent whatever I needed on runpod for AI/ML stuff.
Having to restore my media server without a backup would cost me around a dozen hours of my time. 2 bucks a month to back up to Glacier with rclone’s crypt backend is easily worth it.
How are you hitting that pricing? S3 "Glacier Deep Archive"?
Standard S3 is $23/TB/mo. Backblaze B2 is $6/TB/mo. S3 Glacier Instant or Flexible Retrieval is about $4/TB/mo. S3 Glacier Deep Archive is about $1/TB/mo.
I take it you have ~2TB in deep archive? I have 5TB in Backblaze and I've been meaning to prune it way down.
Edit: these are raw storage costs and I neglected transfer. Very curious as my sibling comment mentioned it.
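For a rough comparison, here is a sketch multiplying the per-TB list prices above by a ~2 TB library. It deliberately ignores request, retrieval, and egress fees, which matter a lot for the Glacier tiers:

```python
# Raw storage cost only, using the per-TB/month list prices quoted above.
# Retrieval/egress fees are excluded and can dominate for the archive tiers.
prices_per_tb_month = {
    "S3 Standard": 23,
    "Backblaze B2": 6,
    "S3 Glacier Instant/Flexible": 4,
    "S3 Glacier Deep Archive": 1,
}
library_tb = 2
for tier, price in prices_per_tb_month.items():
    print(f"{tier}: ~${price * library_tb}/month for {library_tb} TB")
```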
Yup, deep archive on <2TB, which is more content than most people watch in a lifetime. I mostly store content in 1080p as my vision is not good enough to notice the improvement at 4K.
> more content than most people watch in a lifetime
The average person watches more than 3 hours of TV/video per day, and 1 gigabyte per hour is on the low end of 1080p quality. Multiply those together and you'd need 1TB per year. 5TB per year of higher quality 1080p wouldn't be an outlier.
"In a survey conducted in India in January 2022, respondents of age 56 years and above spent the most time watching television, at an average of over three hours per day."
For China I'm seeing a bit over two and a half hours of TV in 2009, and more recently a bit over one and a half hours of TV plus a bit over half an hour of streaming.
Yes it includes ads, sports, and news.
Personally I don't watch a lot of actual TV but I have youtube or twitch on half the time.
If it's infrequently accessed data then yes, but for a machine that you use every day it's nice if things keep working after a failure and you only need to plug in a replacement disk. I use the same machine for data storage and for home automation for example.
The third copy is in the cloud, write/append only. More work and bandwidth cost to restore, but it protects against malware or fire. So it's for a different (unlikely) scenario.
I switched to TLC flash last time around and no regrets. With QLC the situations where HDDs are cheaper, including the cost of power, are growing narrower and narrower.
It really depends on your usage patterns. Write-heavy workloads are still better cases for spinning rust due to how much harder they are on flash, especially at greater layer depths.
Plus, SSDs apparently have a very dirty manufacturing process, worse than the battery or screen in your laptop. I recently learned this because the EU is starting to require reporting CO2e for products (mentioned on a Dutch podcast: https://tweakers.net/geek/230852/tweakers-podcast-356-switch...). I don't know how a hard drive stacks up, but if the SSD is the worst of all of a laptop's components, odds are the hard drive is better, and so one could decide between the two based on whether an SSD is actually needed rather than just tossing one in because it's cheap.
Probably it also matters if you get a bulky 3.5" HDD when all you need is a small flash chip with a few GB of persistent storage — the devil is in the details but I simply didn't realise this could be a part of the decision process
If this is really a significant concern for you, are you accounting for the CO2e of the (very significant) difference in energy consumption over the lifetime of the device?
It seems unlikely to me that in a full lifecycle accounting the spinning rust would come out ahead.
The figure already includes the lifetime energy consumption and it's comparatively insignificant. The calculation even includes expected disposal and recycling!
It sounded really comprehensive besides having to make assumptions about standard usage patterns, but then the usage is like 10% of the lifetime emissions so it makes a comparatively small difference if I'm a heavy gamer or leave it to sit and collect dust: 90% remains the same
> If this is really a significant concern for you
It literally affects everyone I'm afraid and simply not knowing about it (until now) doesn't stop warming either. Yes, this concerns everyone, although not everyone has the means to do something about it (like to buy the cleaner product)
Um, no. Not unless you're still running ancient sub-1TB enterprise drives.
It turns out that modern hard drives have a specified workload limit [1] - this is an artifact of heads being positioned at a low height (<1nm) over the platter during read and write operations, and a "safe" height (10nm? more?) when not transferring data.
For an 18TB Exos X18 drive with a specified workload of 550TB read+write per year, assuming a lifetime of 5 years[2] and that you never actually read back the data you wrote, this would be at max about 150 drive overwrites, or a total of 2.75PB transferred.
In contrast the 15TB Solidigm D5-P5316, a read-optimized enterprise QLC drive, is rated for 10PB of random 64K writes, and 51PB of sequential writes.
[2] the warranty is 5 years, so I assume "<550TB/yr" means "bad things might happen after 2.75PB". It's quite possible that "bad things" are a lot less bad than what happens after 51PB of writes to the Solidigm drive, but if you exceed the spec by 18x to get to 51PB written, I would assume it would be quite bad.
ps: the white paper is old, I think head heights were 2nm back then. I'm pretty sure <1nm requires helium-filled drives, as the diameter of a nitrogen molecule is about 0.3nm
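A quick sketch of the overwrite arithmetic above, using only the spec numbers quoted (550 TB/yr workload rating, 18 TB capacity, 5-year warranty):

```python
# Reproduces the "about 150 overwrites / 2.75 PB" figure from the quoted
# Exos X18 workload rating; purely arithmetic on the spec-sheet numbers.
capacity_tb = 18
workload_tb_per_year = 550
warranty_years = 5

lifetime_tb = workload_tb_per_year * warranty_years
full_overwrites = lifetime_tb / capacity_tb
print(f"Rated lifetime transfer: {lifetime_tb / 1000:.2f} PB")
print(f"Roughly {full_overwrites:.0f} full-drive overwrites over the warranty")
```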
Hopefully you have 2x of these drives in some kind of RAID mirror, such that if one fails, you can simply replace it and re-mirror. Not having something like this is risky.
That may be true for pools that never get scrubbed. Or for management that doesn't watch SMART stats in order to catch a situation before it degrades to the point where one drive fails and another is on its last legs.
With ZFS on Debian the default is to scrub monthly (second Sunday) and resilvering is not more stressful than that. The entire drive contents (not allocated space) has to be read to re-silver.
Also define "high chance." Is 10% high? 60%? I've replaced failed drives or just ones I wanted to swap to a larger size at least a dozen times and never had a concurrent failure.
If you're doing statistics to plan the configuration of a large cluster with high availability, then yes. For home use where failures are extremely rare, no.
Home use is also much more likely to suffer from unexpected adverse conditions that impact all the drives in the array simultaneously.
The disaster plan is always backup (away from the location) or out-of-house replication; RAID is NOT a backup, but part of a system to keep uptime high and hands-on work low (like redundant power and power supplies).
Disaster = Your DC or Cellar is flooded or burned down ;)
I've owned 17 Seagate ST12000NM001G (12TB SATA) drives over the last 24 months in a big raidz3 pool. My personal stats, grouping by the first 3-4 SN characters:
- 5/8 ZLW2s failed
- 1/4 ZL2s
- 1/2 ZS80
- 0/2 ZTN
- 0/1 ZLW0
All drives were refurbs. Two from the Seagate eBay store, all others from ServerPartDeals. 7/15 of the drives I purchased from ServerPartDeals have failed; at least four of those failures have been within 6 weeks of installation.
I originally used the Backblaze stats when selecting the drive I'd build my storage pool around. Every time the updated stats pop up in my inbox, I check out the table and double-check that my drives are in fact the 001Gs... the drives that Backblaze reports as having a 0.99% AFR. I guess the lesson is that YMMV.
I think impact can have a big influence on mechanical hard drive longevity, so it could be that the way the ServerPartDeals drives were sourced, handled or shipped compromised them.
I used to think these were interesting and used them to inform my next HDD purchase. I realized I only used them to pick a recently reliable brand; we're down to three manufacturers, and the stats are mostly for old models, so the main use is if you're buying a used drive from the same batch that Backblaze happens to have also used.
Buy two from different vendors and RAID or do regular off-site backups.
Mirrored raid is good. Other raid levels are of dubious value nowadays.
Ideally you use "software raid" or a file system with the capability to do scrubbing and repair to detect bitrot. Or have some sort of hardware solution that can do the same and notify the OS of the error correction.
And, as always, Raid-type solutions mostly exist to improve availability.
Backups are something else entirely. Nothing beats having lots of copies in different places.
...and many used Seagate drives have been resold as new in the last 3 years. They were used for crypto mining and then had their SMART parameters wiped back to "new" 0 hours usage.
Seagate has always been the "you get what you pay for" && high replacement availability option, at least since the Thai flood and ST3000DM001 days - they kept shipping drives. It was always HGST > Toshiba > Seagate in both price and MTBF, with WD somewhere in between.
There are coins like Chia that were built around "wasting disk space" proof-of-work rather than the "wasting {C/GPU|ASIC} cycles" PoW that Bitcoin uses (and Ethereum used to use.)
It's not Seagate's fault, but it would behove them to clamp down on such activity by authorised resellers.
After all, it's not just the buyer getting ripped off; it's also Seagate. A customer paid for a brand new Seagate drive and Seagate didn't see a penny of it.
When I started my current 24-bay NAS more than 10 years ago, I specifically looked at the Backblaze drive stats (which were a new thing at that time) to determine which drives to buy (I chose 4TB 7200rpm HGST drives).
My Louwrentius stats are: zero drive failures over 10+ years.
Meanwhile, the author (Andy Klein) of Backblaze Drive Stats mentions he is retiring, I wish him well and thanks!
PS. The data on my 24-drive NAS would fit on two modern 32TB drives. Crazy.
Backblaze is one of the most respected services in the storage industry; they've kept gaining my respect even after I launched my own cloud storage solution.
Although a minor pet peeve (knowing this is free): I would have loved to see an 'in-use' meter in addition to just 'the drive was kept powered on'. AFR doesn't make sense for an HDD unless we know how long and how frequently the drives were being used (# of reads/writes or bytes/s).
If all of them had a 99% usage through the entire year - then sure (really?).
Probably can't say too much, but I know that the I/O on these drives stays pretty consistently high. Enough so that Backblaze has to consider staying on smaller drives due to rebuild times and the fact that denser drives really don't stand up to as much abuse.
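For a sense of why rebuild time pushes back against ever-denser drives, here is a rough lower bound assuming a sustained sequential rate of about 250 MB/s (an assumption; real rebuilds share the spindle with live traffic and rarely sustain that):

```python
# Minimum time to read or write one drive end-to-end at a fixed rate.
# Real rebuild times are longer because rebuild I/O competes with live load.
def rebuild_hours(capacity_tb: float, mb_per_s: float = 250.0) -> float:
    return capacity_tb * 1e6 / mb_per_s / 3600

for tb in (4, 12, 22):
    print(f"{tb:>2} TB drive: at least ~{rebuild_hours(tb):.0f} hours")
```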
A company called TechEmpower used to run periodic web framework benchmarks and share the results using a nice dashboard. Not sure why they stopped doing these.
Yev from Backblaze here -> When we started this, we did so with the intent of sharing data and hoping that others would do the same. We see glimmers of that here and there, but it's still so fun for us to do, and we're expanding on it with Networking Stats and some additional content that's coming soon, including how these inform our infrastructure deployment. Fun stuff :)
Puget Systems has similar publications covering their experience building client systems, though not always in the same level of detail. They also have PugetBench to benchmark systems in real-world applications/workflows.
I had five Seagates fail in my Synology NAS in less than a year. Somebody suggested it was a "bad" firmware on that model, but I switched to WD and haven't had a single failure since.
This will probably jinx me, but I've had so many drives, many purchased on the cheap from Fry's Black Friday sales when I was a poor university student, and the two drives I've ever had fail since I started buying over twenty years ago were
1. catastrophic flood in my apartment when drive was on the ground
2. a drive in an external enclosure on the ground that I kicked by mistake while it was spinning
Did you purchase them all at the same time from the same store? I've had a batch of SSDs fail from the same vendor / mfg timeframe. I started ordering a couple here and there from different vendors where possible. So far I've been lucky to get drives that aren't from the same batches. I tend to buy Exos from Seagate and WD Gold though, so there's a bit of a premium tacked on.
No, that's the weird thing. Even the RMA models were failing. But sure enough, it wasn't just some incompatibility with the NAS because I tested them on PCs to confirm they were failing, and they were.
I had a similar experience. I ordered four EXOS drives three years ago and one of them came DOA. They had to send me three more drives before I got a working one. I’m amazed they’re all still happily humming away in a Synology.
What models? There's a big difference between the cheapest and the more pro models.
That said, my four 2TB Barracudas are still going fine after many years (10+). One failed, replaced with a green drive. Big mistake: that failed quickly and I went back to standard Barracudas.
I've had terrible luck with those drives, out of 12, 10 failed within a couple of years. Not a same batch issue as they were purchased over the period of about 6-8 months and not even from the same place.
Yet I've got Toshibas that run hot and are loud as heck that seem to keep going forever.
After a couple of failed hard disks in my old NVR, I've come to realize heat is the biggest enemy of hard disks. The NVR had to provide power to the PoE cameras, run video transcoding, and constantly write to the disk. It generated a lot of heat. The disks were probably warped due to the heat and the disk heads crashed onto the surface, causing data loss.
For my new NVR, the PoE power supply is separated out to a powered switch, the newer CPU can do hardware video encoding, and I use an SSD for first-stage writing with hard disks as secondary backup. The heat has gone way down. So far things have run well. I know constant rewriting on an SSD is bad, but the MTBF of the SSD indicates it will be a number of years before failing. It's an acceptable risk.
That seems like a very poor chassis design on the part of the NVR manufacturer. The average modern 3.5" high-capacity HDD doesn't generate that much heat. Even 'datacenter' HGST drives average around 5.5W and top out at 7.8W TDP under maximum stress. Designing a case that uses relatively low-rpm, quiet, 120 or 140mm 12VDC fans to pull air through it and cool six or eight hard drives isn't that difficult. In a mid-tower desktop PC case set up as a NAS with a low-wattage CPU, a single 140mm fan at the rear pulling air front-to-back is often quite enough to cool eight 3.5" HDDs.
But equipment designers keep trying to stuff things into spaces that are too small and use inadequate ventilation.
It was a combination of heat sources: the PoE cameras draw quite a bit of power, plus the video transcoding and the constant disk writes, all in a small, slim case. It ran very hot during summer.
It continues to surprise me why Backblaze still trades at a fraction of its peak COVID share price. A well-managed company with solid fundamentals, strong IP and growing.
Because they are bleeding money and they must sell stock to stay in business. Cool product, but I personally don’t want to buy something that doesn’t turn a profit and has negative free cash flow.
I feel very confident that in 30 years AWS, Azure and Google Cloud will still be operating and profitable.
I think there's a very small chance that Backblaze will be.
Nothing against them, but it's virtually impossible to compete long-term with the economies of scale, bundling and network effects of the major cloud providers.
Cloud providers, AWS in particular, use storage and transfer pricing as a means of lock-in to other products; they can never be cost-competitive with Backblaze, which has a thriving prosumer business.
Be aware that it's just a single server. It's not replicated across multiple hosts like in the case of google drive. So you definitely want a backup of that if it's your primary copy.
I'm assuming bloopernova is based in Europe, so latency should be fine. At least they asked for a Europe-based hoster (although that could also theoretically be for privacy reasons).
True enterprise drives ftw - even Seagate usually makes some very reliable ones. They also tend to be a little faster. Some people have complained about noise but I have never noticed.
They are noticeably heavier in hand (and supposedly most use dual bearings).
Combined with selecting based on Backblaze's statistics, I have had no HDD failures in years.
I'm not sure I follow you, are you really saying that your choice of Seagate was based on Backblaze's statistics? Maybe I'm missing something but aren't they the overall least reliable brand in their tables?
My home NAS drives are currently hitting the 5 years mark. So far I'm at no failures, but I'm considering if it's time to upgrade/replace. What I have is 5 x 4TB pre-SMR WD Reds (which are now called the WD Red Pro line I guess). Capacity wise I've got them setup in a RAID 6, which gives me 12TB of usable capacity, of which I currently use about 7.5TB.
I'm basically mulling between going as-is to SSDs in a similar 5x4TB configuration, or just going for 20TB hard drives in a RAID 1 configuration and a pair of 4TB SATA SSDs in a RAID 1 for use cases that need better-than-HDD performance.
These figures indicate Seagate is improving in reliability, which might be worth considering this time given WD's actions in the time since my last purchase, but on the other hand I'd basically sworn off Seagate after a wave of drives in the mid-2010s with a near 100% failure rate within 5 years.
Related - about a year ago or so, I read about a firmware related problem with some vendors SSDs. It was triggered by some uptime counter reaching (overflowing?) some threshold and the SSD just bricked itself. It’s interesting because you could carefully spread out disks from the same batch across many different servers, but if you deployed & started up all these new servers around the same time, the buggy disks in them later all failed around the same time too, when their time was up…
Wow, that’s a cool stat. I wonder if people will ever seriously use 16EB of memory in a single system and will need to change to a more-than-64-bit architecture or if 64 bit is truly enough. This has „640k ought to be enough for anybody“ potential (and I know he didn’t say that).
> Perhaps we don't need a single flat address space with byte-addressable granularity at those sizes?
History is filled with paging schemes in computers (e.g. https://en.wikipedia.org/wiki/Physical_Address_Extension). Usually people do this initially as it allows one to access more space without requiring a change of all software, it is an extension to an existing software paradigm, but once the CPU can just address it all as a single linear space, it simplifies architectures and is preferred.
Fair enough. I suppose once any scheme receives full native support it becomes indistinguishable from a flat address space anyway. What's a second partition if not an additional bit?
My pondering is less "why such a large address space" and more "why such a large native word size"? Extra bits don't come without costs.
As long as I'm asking ridiculous questions, why not 12-bit bytes? I feel like a 12/48 system would be significantly more practical for the vast majority of everyday tasks. Is it just due to inertia at this point or have I missed some fundamental observation?
Until Intel's Ice Lake server processors introduced in 2019, x86-64 essentially was a 48-bit address architecture: addresses are stored in 64-bit registers, but were only valid if the top 16 bits were sign-extended from bit 47 of the 48-bit address. Now they support 57-bit addressing.
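A small numeric illustration of that sign-extension rule (just arithmetic, not tied to any real MMU code): a 48-bit virtual address is "canonical" only if bits 48-63 all equal bit 47.

```python
# Canonical-address check for the 48-bit x86-64 scheme described above:
# bits 47..63 must be all zeros or all ones.
def is_canonical_48(addr: int) -> bool:
    top_bits = addr >> 47                      # bit 47 and everything above it
    return top_bits in (0, (1 << 17) - 1)      # all clear or all set

print(is_canonical_48(0x0000_7FFF_FFFF_FFFF))  # True  (top of the lower half)
print(is_canonical_48(0xFFFF_8000_0000_0000))  # True  (bottom of the upper half)
print(is_canonical_48(0x0000_8000_0000_0000))  # False (inside the canonical hole)
```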
True. However what I had in mind there was something along the lines of 48-bit integer and fp arithmetic as the "native" size with 96-bit as the much more limited extended form that 128-bit currently fulfills for x86. For the address space regular 48-bit pointers would address the needs of typical applications.
Extended 96-bit pointers could address the (rather exotic) needs of things such as distributed HPC workloads, flat byte addressable petabyte and larger filesystems, etc. Explicitly segmented memory would also (I assume) be nice for things like peripheral DMA, NUMA nodes, and HPC clusters. Interpreters would certainly welcome space for additional pointer tag bits in a fast, natively supported format.
Given the existence of things like RIP-relative addressing and the insane complexity of current MMUs such a scheme seems on its face quite reasonable to me. I don't understand (presumably my own lack of knowledge) why 64-bit was selected. As you point out addresses themselves were 48-bit in practice until quite recently.
There was a dashboard where the total storage at Google was tracked and they had to update it from 64 bits for this reason... about a decade or more ago.
If you google "supermicro 72 drive server" it's definitely a thing that exists, but these use double-length drive trays where each tray contains two drives. Meaning that you need a "whole machine can go down" software architecture of redundancy at a very large scale to make these useful, since pulling one tray to replace a drive will take two drives offline. More realistically the normal version of the same supermicro chassis which has 1 drive per tray is 36 drives in 1 server.
There are other less publicly well-known things with 72 to 96 drive trays in a single 'server' which are manufactured by Taiwanese OEMs for large-scale operators. The Supermicro is just the best visual example I can think of right now with a well-laid-out marketing webpage.
You don't LEGO-assemble rackmount servers. Chassis come with a figurative array of jet engines: 12V/0.84A-ish fans that generate a characteristic ecstatic harmony. They're designed, supposedly, to take 35°C air to keep the drives in front at 40°C and the GPUs at the back under 95°C.
Not a server per se, but you just take one 1U server and daisy-chain a lot of those JBOD chassis for the needed capacity. You can have 1080 disks in a 42U rack.
I wish there was a way to underspin (RPM) some of these drives to lower noise for non-datacenter use - the quest for the Largest "Quiet" drive - is a hard one. It would be cool if these could downshift into <5000RPM mode and run much quieter.
I wonder if that's even technically possible these days. Given the fact that the heads have to float on the moving air (or helium) produced by the spinning platter, coupled with modern data densities probably making the float distance tolerance quite small, there might be a very narrow band of rotation speeds that the heads require to correctly operate.
Check your FARM logs. It sounds like people who were using the drives to mine the Chia cryptocurrency are dumping large capacity drives as Chia's value has fallen.
Yev from Backblaze here -> you're welcome! Glad you like it, and we also wish they did! That's one of the reasons we started doing it, we wanted to know :D
It's a bit odd. HGST always fares very well in Backblaze stats, but I have actually had issues over the years in my own setup (Synology frames). Seagate has usually fared better.
They, ironically, got acquired by Western Digital. But the 'Ultrastar' line name is still alive, if that's what you're looking for. 'Deskstar' seems to be gone, though.
It's a wash. Modern mechanical HDDs are so reliable that the vendor basically doesn't matter. Especially if you stick with 'Enterprise'-tier drives (preferably with a SAS interface), you should be good.
Aside from some mishaps (that don't necessarily impact reliability) with vendors failing to disclose the SMR nature of some consumer HDDs, I don't think there have been any truly disastrous series in the past 10-15 years or so.
You're more likely to get bitten by supply-chain substitutions (and get used drives instead of new ones) these days, even though that won't necessarily lead to data loss.
Polite data viz recommendation: don't use black gridlines in your tables. Make them a light gray. The gridlines do provide information (the organization of the data), but the more important information is the values. I'd also right align the drive failures so you can scan/compare consistently.
Thank you for all these reports over the years.