The Hot/Crazy Solid State Drive Scale (codinghorror.com)
240 points by AndrewDucker on May 2, 2011 | 123 comments

A couple of additional points about my abysmal 8/8 failure rate:

  - These were in 8 different machines in 8 different locations.
  - 7 were in desktops, 1 was a laptop
  - All were running Windows
  - All purchased from NewEgg
  - Most of these were gifts for other people. 
(I may be slightly crazy, but I don't have 8 computers.)

After this saga, I've concluded that there are probably two root causes leading to high failure rates:

(1) Something about installing an after-market SSD in a desktop, probably related to power fluctuations, increases the likelihood of failure.

(2) NewEgg only offers a 30-day warranty on their SSDs. You can't even purchase a third-party extended warranty. So I suspect that you'll see a higher failure rate from NewEgg purchases than from a merchant who offers a 2- or 3-year warranty. (But NewEgg's prices are so damn good!)

I should also say that the SSDs that I have purchased with laptops - 2 from Dell (2008, 2009) and 1 from IBM (2009) - have not failed.

As for (1), unless you specifically go purchase a quality PSU, the PSU you get in a pre-built machine is usually utter trash as far as output quality is concerned. (This is probably a fault of the ATX standard, which frankly allows too much leeway. Many parts that could otherwise use the voltages provided by the PSU have to include their own power regulation circuitry simply because the direct PSU output is too noisy.) It's a good idea to build your own computer from Newegg or similar just to know the PSU you'll get.

Other than that, there's no difference if the SSD is plugged in by "a trained professional" or yourself.

Agreed. Never, ever skimp on the PSU, even for a budget build. It's a false economy (which, in the worst case scenario, could result in you replacing the PSU and every other piece of hardware too).

If you're not sure if your PSU is good or not, pick it up and see how heavy it is. If it has a fair bit of heft, it's probably good. If it feels light, it's definitely bad.

Did you verify the authenticity of the drives? It sounds odd that they were limited to a NewEgg warranty without a manufacturer's warranty.

Every SSD I see on Newegg mentions that it has a 3 year manufacturer's warranty.

Yeah, I think he may be confusing "return policy" and "warranty". Just about all SSDs (at least from the major companies like OCZ, Corsair, Intel) will have a 3 year warranty from the manufacturer.

(1) I think that's it. You could get a cheap oscilloscope card for looking at the power grid quality. It seems to be really bad there.

You could even make a script that analyzes whether the wave shape is too bad, and if so stores the surroundings of the power glitches. You can do that with a beagle board DSP, almost free.
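A rough sketch of what such a script might look like, in Python. Everything here is an assumption for illustration: the samples are synthetic rather than read from a real oscilloscope card, the function name `find_glitches` is made up, and the 50 Hz / 325 V reference waveform and the 20 V tolerance are arbitrary choices. It compares each sample to the expected waveform and keeps a window of indices around anything that deviates too far.

```python
import math


def find_glitches(samples, expected, tolerance, context=3):
    """Return (start, end) index windows around samples that deviate from
    the expected waveform by more than `tolerance` volts.

    `end` is exclusive, so samples[start:end] is the stored surrounding."""
    windows = []
    for i, (got, want) in enumerate(zip(samples, expected)):
        if abs(got - want) > tolerance:
            start = max(0, i - context)
            end = min(len(samples), i + context + 1)
            # Merge with the previous window when they overlap or touch.
            if windows and start <= windows[-1][1]:
                windows[-1] = (windows[-1][0], end)
            else:
                windows.append((start, end))
    return windows


# Synthetic data (assumption): a 50 Hz sine with 325 V peak, sampled at 1 kHz.
SAMPLE_RATE_HZ = 1000
expected = [325.0 * math.sin(2 * math.pi * 50 * n / SAMPLE_RATE_HZ) for n in range(40)]

samples = list(expected)
samples[10] += 60.0  # injected spike
samples[11] -= 45.0  # injected dip

glitches = find_glitches(samples, expected, tolerance=20.0)
for start, end in glitches:
    print(f"glitch window: samples {start}..{end - 1}")
```

With real hardware the hard part is getting a trustworthy reference waveform and sample clock; the window bookkeeping above is the easy bit.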

Maybe you shut down or power on a lot of computers at the same time, start engines, whatever. Your SSD failure rate does not make sense (15 days, come on!!).

These were in 8 different computers in 8 different people's houses (friends and colleagues).

So something specific to my environment is not a factor.

My experience has been different, knock on wood. My X25-M has been going for about a year now. It has two bad sectors, but nothing catastrophic. I back up all my files every day with duplicity anyway, though.

Is this really the standard case for SSD's? I know they have a certain number of writes, but didn't know the failure rate was this high. It seems strange that all the drives you got did not have warranties. On Newegg I'll sometimes buy parts that seem like really great deals. Eventually they either break or perform poorly. I haven't cheaped out on hardware since then. I just can't stand the hassle of bad hardware anymore.

Those drives didn't fail from running out of write cycles -- they simply wouldn't have had time to do that by now. They failed because SSD's are still quite new devices that haven't seen the decades of slow improvement that most other consumer hardware has.

Most SSD crashes are because the firmware managed to corrupt something, and thus you can fix most failed SSD's by reflashing their firmware, assuming the drive has a method of doing this without special equipment. (doing so would, of course, erase the entire drive)

The SSD I got from NewEgg had an extended warranty offer. The only time I've ever accepted one. http://www.newegg.com/Product/Product.aspx?Item=N82E16820227...

Offtopic, but you could get a third-party warranty from SquareTrade.

> All were running Windows

It's too much to hope for that Windows is the common factor here, isn't it?

Read this if you are concerned about TRIM and want to restore a SSD drive on OS X to a "new" state.

If you have an SSD on OS X, I discovered that in many cases you can "reset" the drive and regain performance. Basically, if your drive implements the "security erase" command, you can use the following procedure by taking the cover off the back of your Macbook, removing the hard drive, leaving the SATA cable in reach, and booting with the GParted Live CD.


The tricky part: You have to plug in the SSD drive >after< your machine boots, so it's not in "frozen" state. The other tricky part is that you first have to set a security password to do the erase. Then doing the secure erase also erases the password.

I know first-hand that this works on Crucial C300 RealSSD and I've seen lots of reports that this works with OCZ drives. This is also useful for doing new installs on an SSD if you've gotten a new machine. (My situation)

WARNING: DO NOT try to use a Crucial C300 SSD with an i5 or i7 2011 13" Macbook Pro. The problem is in the hardware, namely electromagnetic interference in the SATA cable when trying to use 6Gb/s SATA. Wrapping it with aluminum foil doesn't always work. Buying an upgraded SATA cable doesn't always work either. Buying a 2010 Macbook Pro instead always works. You have been warned.

Or you could just enable TRIM [1] on your 3rd Party SSD under Snow Leopard [2].

[1] http://www.groths.org/?page_id=322

[2] http://imgur.com/gKtL6

I've seen this, but I've also seen some reservations about whether this is safe. I think secure erase plus garbage collection will hold me until I get TRIM in OS X Lion.

If you have an Intel SSD you can use the tools on their site to do the same (intended for PC, but they boot fine on a Mac). No need to set a password but you still need to do the tricky power cycle thing where you disconnect the drive momentarily after power up. Easy on a Mac Pro so use that if you have access to one.

Personally, I think he's crazy.

I don't care how fast a computer is if I have to reinstall it from scratch every few months.

Sure, all my working files are safely on Dropbox (documents) or GitHub (code), but that doesn't help when I have to reinstall my operating system and software every three months!

HDDs still fail as well remember. However, that aside, if you offered me the price of my SSDs and a next day warranty replacement on my computer with a great backup solution to give up tabbed browsing or keyboard shortcuts I wouldn't take it. The efficiency from those things is just too painful to give up once you have tasted it. SSDs are like that. They can be seen as the carbon fibre sportscar with no airbags, but it's more than just a revving status symbol and the joy of movement, it's speeding up the device with which I think: for work and hobbies. Brin calls Google the third half of your brain, implying improvement and extension, the computer as a whole is much more than that for people who value the life of the mind. The investment then becomes fairly easy to justify I think.

You don't have to install from scratch with a decent backup.

(A complete Time Machine restore of my laptop over Firewire takes 2-3 hours. I sleep better at night, knowing this.)

Yes. Use SuperDuper to clone occasionally to a laptop-sized HD and you can probably get the restore time down to "how long will it take me to open up the machine and slide in this backup HD".

Such a backup might not be especially fresh, but once it's up you can sync over files from a fresher, automated backup.

Do you know a good one for Windows?

Acronis TrueImage (http://www.acronis.com/trueimage). Both backups and restores are amazingly fast -- in my case, 8 to 10 minutes for just under 60GB.

You sound like a desktop user, but a server environment, if done properly, is kind of the opposite.

Anything important is part of a pool.

Rebuilding, with cobbler + puppet, should take 10 minutes depending on your hardware and network, plus app files.

This is all trivial when weighed against the speed and to a lesser extent, power savings you get with SSDs.

Servers are, indeed, a different kettle of fish. But he seems to be talking about desktops (and laptops).

1. You can back up an entire partition.

2. Restoring a Windows machine with a lot of software is very annoying and takes a lot of time (clicking dumb "I agree" buttons, rebooting the computer, entering serials, etc.), but with Ubuntu or OS X it's a less annoying experience. A good package manager calculates the dependencies for your software. The only exception is when you need to install some proprietary software (less frequently than on Windows) or you need a bleeding-edge version.

I have a shell script that installs all the packages I need on a fresh Ubuntu install, but

1. I only tend to update it after running it and finding that something I was expecting to be installed wasn't (really I should switch to using puppet/cdist/similar for this)

2. I have to set up Firefox manually (downloading extensions etc)

Put your Firefox profile in a Dropbox directory?

If you're scripting it anyway, you could go one step further and just download all the XPIs to your extension directory, where Firefox will automatically pick them up. I also believe Firefox Sync will be able to sync extensions in the future.

True, a partition restore would be faster.

Agreed. I fancy though that he's imagining, or rather betting, the reliability of SSDs will improve. I bet that if his new purchase also fails after 12 months (or less) he'll be pretty pissed.

Other comments are assuming this is flash wearing out, but is this actually the flash wearing out or is it something else (buggy firmware, cheap memory cells, what have you)? The time-to-failure seems way too low for it to be excessive writes (15 days for an 80GB IBM!).

Actually, come to think of it, what is the failure mode for a drive when it's all worn out? Is it catastrophic data loss or is it just an unwritable drive?

I've read about a test done on a basic SD card, and apparently it just switched to read-only mode when worn out. We could reasonably hope that higher end devices would fail the same gentle way :)

Theoretically it should just be an unwritable drive since flash tends to only fail on erase. If the firmware is badly programmed or something other than the flash fails who knows what could happen though.

There is a "read disturbance effect" on MLC flash that causes wear on reads. MLC flash gets multiple bits per cell using several voltage levels.

MLC flash is also good for around 5K erase cycles (and about the same number of reads until you need to do an erase-and-rewrite). Multiplied out and properly wear-leveled this is not a big deal.
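To make the "multiplied out" part concrete, here is a back-of-the-envelope sketch in Python. It assumes ideal wear leveling (every block gets erased equally often), takes the 5K erase-cycle figure from above and the 80 GB capacity of the drives discussed in this thread, and plugs in a hypothetical 20 GB/day of host writes with a write amplification of 2. Those last two numbers are illustrative guesses, not measurements.

```python
def estimated_lifetime_years(capacity_gb, erase_cycles,
                             host_writes_gb_per_day,
                             write_amplification=1.0):
    """Years until the flash runs out of erase cycles, assuming ideal
    wear leveling spreads erases evenly across the whole drive."""
    # Total data the host can write before every block hits its limit.
    total_host_writes_gb = capacity_gb * erase_cycles / write_amplification
    days = total_host_writes_gb / host_writes_gb_per_day
    return days / 365.0


# 80 GB drive, 5K cycles, hypothetical 20 GB/day, write amplification 2.
years = estimated_lifetime_years(80, 5000, 20, write_amplification=2.0)
print(f"roughly {years:.1f} years of writes")
```

Under those assumptions the wear-out horizon is measured in decades, which is why failures after weeks or months point at something other than exhausted erase cycles.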

We might be seeing crappy wear leveling, or badly written firmware, or the need for more ECC bits (yes, MLC has these, and they are /not/ optional) than the EE types think they can get away with.

I've been writing flash file systems since 1991. They're fun as hell to work with.

> I've been writing flash file systems since 1991. They're fun as hell to work with.

Could you recommend any good resources for someone new to file system technology that is interested in their inner-workings? Thanks!

Practical File System Design: http://www.letterp.com/~dbg/practical-file-system-design.pdf

NT File System Design (Rajeev Nagar) -- you can find this used. I believe his web site had a PDF copy for a while, sans pictures.

My favorite book on transactions is Bernstein's Principles of Transaction Processing (may also be available as a PDF somewhere, from the author).

Read the v6 Unix file system code. That will date me, but it's /simple/ and it works. You can move up from there.

btrfs is neat. I haven't looked at the code.

I wonder how well the SMART stats would predict that failure -- or whether they were sudden unexplained failures.

If you run smartctl -A on your SSD device then for e.g. Intel you can see attribute 0xE9 or 233 -- media wear indicator, starting at 100% (mine's at 98% after 14 months, so this should hopefully indicate 2% of the wear out from writes). You can also see how many of the reserved blocks it's used (when it fails to reflash a cell when rewriting).
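If you want to pull that number out in a script, something like the following Python sketch could work. It parses the attribute table that smartctl -A prints; the sample text below is made up to resemble Intel output (attribute names and values are illustrative, not copied from a real drive), and the parser assumes the VALUE, WORST and THRESH columns are numeric, which may not hold for every drive.

```python
def parse_smart_attributes(text):
    """Parse the attribute table printed by `smartctl -A` into a dict
    keyed by attribute ID. Non-attribute lines are skipped."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have 10+ columns.
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[int(fields[0])] = {
                "name": fields[1],
                "value": int(fields[3]),   # normalized value, e.g. 098
                "worst": int(fields[4]),
                "thresh": int(fields[5]),
                "raw": " ".join(fields[9:]),
            }
    return attrs


# Hypothetical sample output, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0002   100   100   000    Old_age   Always       -       2
232 Available_Reservd_Space 0x0003   099   099   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0002   098   098   000    Old_age   Always       -       0
"""

attrs = parse_smart_attributes(SAMPLE)
wear = attrs[233]["value"]
print(f"media wear indicator: {wear}% remaining")
```

To run it against a live drive you would feed it the real output of smartctl -A for your device instead of SAMPLE.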

I've been following Jeff's SSD stories for a while now, and I came here today thinking maybe it was time for me to take the plunge and get one myself. But this latest post just frightens me.

Even if you can avoid catastrophe with regular, automated backups (I use two Time Machine drives myself), what about bitrot? If the SSD you've been diligently backing up over months has been slowly rotting, can you have any faith at all in the backups?

The general assumption with SSD is you have both a SSD and a traditional HDD. As long as you only have program and temp files on a SSD its loss can have fairly minimal impact. Especially if you schedule a full disk backup of your SSD weekly. Worst case, you lose a few OS/browser patches, ship back the SSD for a replacement drive and move on.

PS: If you want to get fancy you can set up a bootable partition on your HDD and then backup the SSD to that partition.

Is that really the general assumption, or is it the assumption amongst people who know enough about hardware to build their own systems? Plenty of system integrators (Dell, Apple, etc) sell configurations that are SSD-only. If the failure rates are really this high, it's a point of concern to say the least.

I keep regular Time Machine backups, Backblaze off-site backups, and much of my "current" work is done in Dropbox, so it's synched quickly, but a hard drive loss still means downtime and the loss of whatever I was working on. Of all the computing users I know in every day life, I am the most backed up amongst them. SSDs are rapidly going mainstream, and the impact of hardware failure for the "mainstream" user isn't something that should be marginalized.

What makes you say that's the general assumption? As far as I know the vast majority of machines shipping with SSDs are shipping with ONLY an SSD, not two drives. Are you saying Apple's (for example) assumption is you're going to carry around an external HDD with your MacBook to store your files on?

The general assumption (or rather disclaimer) is: make backups.

An earlier failure may even be an advantage here because that way you have less time to accumulate important data before learning your lesson...

A lot of people go with removing the cd drive and adding the ssd there. That's what I did and aside from a sata saturation firmware bug it went great. So fast.

If you're not using the SSD for some of your data files, you're losing out on a lot of the performance. I keep all my code working directories on the SSD as well. Compiles of even large projects just scream from an SSD.

That's what checksums are good for. You can use a file system that has them built in, or add them manually.
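For the "add them manually" route, a minimal sketch in Python: hash every file under a directory into a manifest, keep the manifest somewhere else, and later report files whose hash no longer matches. The function names are made up for this example. Note that it only tells you a file changed; telling silent corruption apart from a legitimate edit is up to you (for example by checking modification times).

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """Names from the manifest that are now missing or hash differently."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

Usage would be along the lines of building a manifest right after a known-good state, saving it with the backup, and running changed_files before trusting a restore.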

This post has some pretty decent hard stats. Not scientific, but less anecdotal than the linked article.


To be counted, the after-sales return had to be made directly through the merchant, which is not always the case since it is possible to return directly to the manufacturer; however, this represents a minority of cases in the first year.

- Maxtor 1.04% (against 1.73%)
- Western Digital 1.45% (against 0.99%)
- Seagate 2.13% (against 2.58%)
- Samsung 2.47% (against 1.93%)
- Hitachi 3.39% (against 0.92%)

Hitachi, which was first in the previous ranking, is plummeting! Western Digital retains its second place despite an increasing failure rate, while Maxtor takes first place.

More specifically the failure rate for 1TB drives:

- 5.76%: Hitachi Deskstar 7K1000.B
- 5.20%: Hitachi Deskstar 7K1000.C
- 3.68%: Seagate Barracuda 7200.11
- 3.37%: Samsung SpinPoint F1
- 2.51%: Seagate Barracuda 7200.12
- 2.37%: WD Caviar Green WD10EARS
- 2.10%: Seagate Barracuda LP
- 1.57%: Samsung SpinPoint F3
- 1.55%: WD Caviar Green WD10EADS
- 1.35%: WD Caviar Black WD1001FALS
- 1.24%: Maxtor DiamondMax 23

Hitachi is logically the least well placed, with two separate lines at the bottom! What about the 2 TB versions?

- 9.71%: WD Caviar Black WD2001FASS
- 6.87%: Hitachi Deskstar 7K2000
- 4.83%: WD Caviar Green WD20EARS
- 4.35%: Seagate Barracuda LP
- 4.17%: Samsung EcoGreen F3
- 2.90%: WD Caviar Green WD20EADS

Overall, the recorded failure rates are bad. They do not really make you want to entrust 2TB of data to these disks alone: mirroring would not be overkill for securing the data. Logically, 7200 rpm disks are less reliable than the 5400/5900 rpm ones, with almost 10% for the Western Digital model!

For the first time, we also include SSDs in this type of article. The failure rates recorded by manufacturer:

- Intel 0.59%
- Corsair 2.17%
- Crucial 2.25%
- Kingston 2.39%
- OCZ 2.93%

Intel stands out here with the most flattering failure rate. Among the few models that sold over 100 units, none shows a return rate above 5%.

Here are comparable numbers for SSD from another source [1] (this site is affiliated with an online computers shopping site).

Intel 0.3%, Kingston 1.2%, Crucial 1.9%, Corsair 2.7%, OCZ 3.5%.

[1] http://www.hardware.fr/articles/831-7/taux-pannes-composants...

Edit: and on the last page of the previous link there is a list of the current models sold between 10/01/2010 and 04/01/2011 with the worst reliability track record:

- 6.7%: OCZ Agility 2 120 GB
- 3.7%: OCZ Agility 2 60 GB
- 3.6%: OCZ Agility 2 40 GB
- 3.5%: OCZ Agility 2 90 GB
- 3.5%: OCZ Vertex 2 240 GB

Simplistically assuming a very high failure rate of 7%, the anecdote from Jeff's friend of 8 out of 8 failures has a probability of about 0.07^8, or about 1 in 2 billion. That's quite a lottery to win.
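For anyone who wants to check the arithmetic, a quick Python sketch. It makes the same simplistic assumption as above, namely that each drive fails independently with the same probability; the function name is just for illustration.

```python
def prob_all_fail(failure_rate, drives):
    """Probability that every one of `drives` independent drives fails,
    given the same per-drive failure rate."""
    return failure_rate ** drives


p_all = prob_all_fail(0.07, 8)
one_in = 1 / p_all
print(f"probability: {p_all:.3g} (about 1 in {one_in / 1e9:.1f} billion)")
```

That works out to roughly 1 in 1.7 billion, so "about 1 in 2 billion" is the right ballpark. The independence assumption is the weak point: a shared cause (same model, same firmware, same power problem) would make 8 out of 8 far less surprising.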

What this tells me is that there are probably some differences in the environment that you use your SSD in that can have disproportionate effects on its lifetime.

You assume that only 8 of Jeff's acquaintances bought SSDs. It might actually have been 8 failures out of 100 purchases, for example.

Did you read the article? I'm referring to this: "Portman Wills, [...] he went all in. He purchased eight SSDs over the last two years ... and all of them failed".

I stand corrected.

I admit that I tend to skim Jeff Atwood because his occasional tasty flakes of insight are thickly coated with delicious but useless fluff.

Jeff's anecdotal failure rates are scary, but don't line up with expectations - e.g. my Intel SSD has a 3yr manufacturer's warranty which would suggest failure rates more in line with these numbers.

Intel claims a 0.4% annual failure rate for their most-popular last-generation drive, the X25-M.


And people wonder why women have trouble getting into this industry.

Because... they like their disks to not fail?

did you not notice the article's comparison between SSD drives and "hot/crazy women?" it was highly offensive.

That was from a show, and I imagine the same goes for guys. I didn't find it sexist at all.

No, I'm with the OP on this. In the show, this kind of thing works because part of the schtick of the Barney character is that he's a sexist jerk. Everyone knows it, so you can riff off that to say things you can't in isolation.

Here, there's none of that context. Drives, like women, are irreparably either hot or crazy and the only question left to users (or men) is whether they're "worth it". That's just sexist, sorry. The only people who have an excuse for thinking it's not are the ones who get the sitcom reference.

Hmm, maybe it's because I can't see it out of context, but I sort of understand what you mean. I guess people who haven't seen the show don't know that Barney basically doesn't date, which makes this line even more nonsensical.

I'm not saying I disagree with you, but I think there is an impulse to label things sexist without addressing it, so I'm gonna ask a few questions.

If a woman wrote this about men, would it be sexist? (I expect you'll say yes.)

If a lesbian woman wrote this about other women, would it be sexist? (I feel like you're forced to say yes since you are committing to the idea being sexist independent of the sex of the speaker.)

Is it possible for a man like Barney to be honestly and accurately analyzing trends of the women in his life, given that he only ranks women shallowly?

Careful now. It's not necessarily sexist for Barney to _only care about looks,_ or at least: since we all care about looks to some degree, it is dangerous to imply that caring about looks is sexist. And if his analysis of his desires is based on his decision to only try for attractiveness, how is that analysis sexist instead of revealing the frailty of being so shallow?

So the comparison is basically that there are two orthogonal traits, one negative and one positive. It is not "Drives, like women"; it is "Drives have orthogonal traits, and evaluation of them therefore proceeds along the Barney Analysis."

I understand the argument, but this isn't about logic. Context matters. Sexism directed by men against women (well, "heteronormative" to use the jargon) is "worse" than the reverse because it exacerbates an existing situation.

It's easy for a typical man in our society to shrug off a comment equating his looks with his worth. It's much harder for a woman. "Should" it be? Of course not, but that's sort of the point: let's try laying off things for a while before demanding logical equality, OK?

A well put argument to establish a local subjectivity for practical effect.

The problem I have with it is that it is then used to marginalize _other_ local subjectivities like r/mensrights (which admittedly has its share of ludicrous opinions, do _not_ admit you don't care you were circumcised) for the purpose of serving the "most important" 'ism.

In reality there are a whole bunch of inequalities in a whole bunch of different directions, and it doesn't make sense just to target the one that _some_ people have decided is the most important. (In all honesty, for instance, I think racial inequality is a much bigger problem than sexism in our society now that women are becoming much better educated, whereas we still have a lot of 'bad part[s] of town.')

In other words, we are not escaping heteronormative forms until we can actually work with logical equality, and the quickest way through the forest is the straightest. Why not just admit the problem is with beauty and our valuation of it--and yes, men might be guilty of this more by percentage, but that's not the source of the problem--instead of with gender?

I'd be grateful if you could explain the issue, because I really don't see what the problem here is. I guess I don't understand what sexism is.

The way I see it, Barney only judges women by how hot/crazy they are (or he judges them by more than that, but those are the two primary factors in his judgement). Given that everyone judges everyone based on some traits, and these traits have different weights attached to them, why is it sexist for Barney to do that? He's just a shallow person, but not qualitatively different than anyone else (only quantitatively).

Also, he is not saying "this is how you should judge women", only "this is how I judge women". I would be fine with a woman saying that about a guy, or a woman about a woman. It wouldn't be a woman I'd like to date, but I don't consider it an attack on my gender as a whole.

so if I were to claim that unicorns are real, it would be okay, so long as I am quoting from a TV show?

it was objectifying a human, reducing her to a series of inputs and outputs, no better than a microwave oven. it would not have been acceptable whether the victim was male or female.

I'm sorry, but I genuinely don't understand. It was saying that Stinson would date someone even though she was crazy, as long as she was hot enough. Isn't that what humans do with everything, weigh the pros and the cons? How is it objectifying?

I didn't notice but now I realise that you are right. Would changing the article so that it didn't directly quote the show, instead referring to someone theoretical that the reader is attracted to, make it less distasteful? The article could credit the show without going into specifics.

I'm not sure if this reference is appropriate at all.

Imagine how the article would read if the quote was from a fictional female or gay character on the topic of what type of male they were attracted to. Just one data point, but it would make me wonder why that's there at all and what it adds to the discussion of SSD reliability.

I also thought that about SSDs.

Then I stumbled upon this on the Apple store when configuring your disk options: "Your MacBook Pro comes standard with a 5400-rpm Serial ATA hard drive. Or you can choose a solid-state drive that offers enhanced durability."

So, why do they say that? What SSD do they build in? Is that a particular special SSD with enhanced durability? Even more enhanced than regular harddisks (like they are claiming)?

There is indeed no such thing. What they can always say is that since there are no moving parts in an SSD drive, you don't have to fear screwing up your data by moving your laptop around while working. In that sense and in the hands of a careless laptop user, it would indeed be more reliable.

What I describe above is just an excuse they can (and probably do) use. The reality is that it's just marketing text to up-sell you something. I bet it works in many cases.

Surprising, though, that they use "enhanced durability" to upsell the hard disk and not "increased performance".

Reading that statement in English and not in my native Geek, that statement simply refers to the replacement of a more failure-prone device with a less failure-prone device.

SSDs have long been marketed as being more durable than mechanical HDDs, and are quite often used in environments subject to sudden shock or vibration.

The particular underlying marketing goes all the way back to the classic "solid state" marketing and the removal of vacuum tubes (valves) from computing decades ago, and this irrespective of the acceleration sensors and other schemes intended to reduce the number of head crashes with HDDs.

With LCDs, SD/CF/flash/downloads and now SSDs, the remaining non-solid-state devices are being removed from computing: HDDs are following the same path as Cathode Ray Tubes (CRTs), floppies and CD/DVD drives. As rugged as CRTs and HDDs have become, vacuum tubes and flying heads still do not deal with shock and vibration particularly well.

If you jostle your laptop, there's a chance that the drive heads might come into physical contact with the disk surface, gouging it and causing irreversible damage. SSD's have no moving parts and very high tolerance to shock and impact. So they are more durable in the sense that they can take more of a beating without losing your data.

Durability =/= reliability. SSDs are more durable (resistant to shock, etc.) because they have no moving parts. They are not necessarily more reliable (long-lived), as this article shows.

I had an SSD from SuperTalent last year. It didn't even last 2 months before massive filesystem corruptions started. Two weeks later it wasn't even recognized anymore by the SATA controller.

...And I have an SSD that came installed with my Macbook Pro almost a year ago and is still working without any hiccups. Single data points aren't all that valuable.

I had a batch of eight SuperTalent SSDs that randomly disappeared from the controller's view under heavy load. They were just fine after a reboot, though; most probably the firmware is buggy as hell and crashes. I returned them, and still no news. Bah...

The early SuperTalent disks had a bad firmware bug that made them disappear under heavy load. That was fixed some weeks later. I have had two of those since early 2009 with updated firmware, still running strong!

I've often wondered about this stuff. MLC SSDs are good for 10000 writes per 250k (iirc?) block. When you think about using an SSD for virtual memory, even with wear levelling, that's really not very much at all. Add in programs that do lots of small file writes to logs and so on, and I can imagine losing a lot of blocks fairly quickly - particularly because wear levelling will cause you to lose all of the blocks in a relatively small time space, as opposed to gradual failure.

I wonder if, as flash memory gets cheaper, we'll see SLC SSDs getting more popular.

It's actually good for 10,000 erases per block, so depending on how your file system works you could erase a block, then write to some of it, then write to another part of it, then write to it again, filling it up. Write amplification does occur for this reason and because the drive has to keep track of what it has on itself, but it's usually not more than a factor of 2.

In reality, every erase is followed by exactly one (amortized) program, so it doesn't matter which one you track.

Really? That seems too inefficient though... I'm not a huge expert on commercial SSDs, but when I've written my own flash drivers for embedded sensors I would always make use of multiple writes per erase.

More likely we'll just continue to see the firmware get smarter. SandForce controllers compress a lot of redundant data and only make writes when absolutely necessary.


I have been using an enterprise-class Intel SSD (X25-E I think) for about one year now in my workstation to help speed up the build/linking process of a large C++ project. This has made all the difference in development speed for me. The disk gets punished a lot, lots of file copies and read/write operations. To date I haven't had any problems (nor have I heard of any from colleagues who are using them in the same way). I wouldn't care much about data loss though since only temporary files are on there.

I have the same Intel SSD (32 GB only, though) and so far it works amazingly well as a startup/system volume (Windows 7). I hope they'll offer the 128GB version they promised as I'd definitely like to get one of those for a Linux-box.

My experience with regular consumer-grade hard disks indicates that regular backups of any data I want to keep are always a good idea. The X25-Ms mentioned in the article use MLC flash, and so might some of the other SSDs he has used up. Another potential factor could be heat, malfunctioning hardware, etc. - there is not enough data given to even guess what the problem could be.

If excessive writes are a problem, an idea I have had is to protect a large MLC disk by using union fs to write only to a smaller SLC. No idea how much overhead union fs introduces by itself, but to me it appears to be a good way to use SLC/MLC drives, keep down costs, and not rely too much on "magic" MLC firmware.

I must be remarkably lucky. I have seven SSDs, four of them are installed in computers I use often, and some more in datacenter servers. I have never had a failure. And the SSD that's in my home unix server has been doing its job for almost two years now.

I do make regular (image) backups from all my SSDs, so when one fails I can quickly replace it and restore the image. Apart from that I keep all my sourcecode and work-related stuff in a source code versioning control system hosted in a datacenter.

What SSDs are those?

Intel X25M, in my MacMini this one:

  Capacity:	80.03 GB (80,026,361,856 bytes)
  Model:	INTEL SSDSA2M080G2GC                    
  Revision:	2CV102HA
  Serial Number: CVPO0042012S080BGN
In my home unix server:

  smartctl 5.40 2010-10-16 r3189 [FreeBSD 8.1-RELEASE i386] 

  Model Family:     Intel X18-M/X25-M/X25-V G2 SSDs
  Device Model:     INTEL SSDSA2M080G2GC
  Serial Number:    CVPO0054037B080BGN
  Firmware Version: 2CV102HD
  User Capacity:    80,026,361,856 bytes
In my datacenter server:

  Device Model:     INTEL SSDSA2M080G2GC
  Serial Number:    CVPO951000ZJ080BGN
  Firmware Version: 2CV102HA
  User Capacity:    80,026,361,856 bytes

I'd like to know too.

One of the cool things about SSDs will be the huge amount of random I/O we can throw at them without noticeable responsiveness slowdown. SSDs will probably increase people's desire to back up because it will no longer be such a drain on I/O responsiveness.

My experience on SSDs, having used multiple drives myself and sold 100s of drives to tech savvy customers:

- Brand does matter - the failure rate on Intel drives is a lot lower than other brands.

- The system you put it in also makes a difference. For example, certain generations of MacBook Pros are just not happy with certain SSDs. We've had customers who had two or three failures with a certain brand / model / chipset, and then after changing to a different SSD they haven't had problems.

- The failure rate for SSDs overall is higher than that for HDDs, but lower than the failure rate for some other products (such as Graphics cards).

- Every SSD we have sold went into a custom built system or was installed afterwards into a laptop, so I don't buy the argument that problems are caused because it's installed by an end user and not at the factory.

- They are blindingly fast. In my opinion, it's better for 90% of users to put a new Intel SSD into their existing system, than it is to buy a new system.

Do we know what's killing them? Writes? Heat? Random loss of magic smoke?

Short answer: writes, yes.

Long answer is long and I'm most certainly not an expert. It's just a consequence of how NAND memory works. You can read more about this in the links below (one is to a review of a particular ssd drive, but a couple of sections explain how SSD works in great detail):

(1) http://en.wikipedia.org/wiki/Flash_memory#Memory_wear (2) http://www.anandtech.com/show/2614/3

Someone else pointed out that other flash devices just become unwritable when they "fail" due to writes, rather than suddenly becoming bricks. I'm skeptical that this is the same thing.

I've had an Intel X25-M for over a year now and it's been amazing performance-wise. In that same time span my 1TB data drive did die. The only problem I've had with the SSD is that about twice a month it will randomly freeze. The hard drive light is solid on, and I have to just reset my machine. Not a big deal though, because the SSD starts up so fast.

I have a similar anecdote: my 2TB Western Digital Green drive died after a few weeks, whereas my Intel X25-M G2 is still running fine after a year and a half.

For people who need reliability, SSDs are clearly not the wise choice. For performance, they are, IF you are willing to replace them frequently (at GREAT expense) and have downtime. Not to mention solid backups. The cons far outweigh the pros for me. I like stable measured performance. My computer isn't a hot rod, it's meant to be efficient.

Using a rotational HD does not excuse you from having solid backups. I've had rotational HDs fail in just months, it's not the norm but it does happen.

Since the manufacturer typically provides a 3 year warranty, replacement costs aren't actually that high.

No amount of hot is going to convince me to date someone with an STD. That would be...well, crazy. I want my SSD to be at least as reliable as my system memory (if not more). The article makes it sound like they are inherently diseased. If that's the case, no thanks.

It seems like it depends on the manufacturer. In this thread, somebody posted a comparison: http://news.ycombinator.com/item?id=2505883

Also note that mechanical hard drives fail a lot, too.

Does it depend on manufacturer / generation? My Intel, at least, has already lasted a bit longer.

Hmm this is bad. I was hoping SSDs would give us more speed in RAID0 setups for film/video editing where we need sustained speeds of over 1.2GB/s regularly (for one track 4k dpx only)... but if they fail a lot, it's too cost prohibitive.
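For what it's worth, the 1.2GB/s figure checks out on the back of an envelope. This is only a sketch under assumptions that are mine, not the parent's: a full-aperture 4096x3112 frame, 10-bit RGB packed into 4 bytes per pixel, 24 fps, file headers ignored, and a hypothetical 250 MB/s sustained per drive.

```python
import math

def stream_rate_bytes_per_sec(width, height, bytes_per_pixel, fps):
    """Raw data rate of an uncompressed image sequence, ignoring file headers."""
    return width * height * bytes_per_pixel * fps

def drives_needed(rate_bytes_per_sec, per_drive_bytes_per_sec):
    """Minimum RAID0 members if throughput scaled perfectly (it won't quite)."""
    return math.ceil(rate_bytes_per_sec / per_drive_bytes_per_sec)

# Assumed frame geometry and packing; adjust for your own material.
rate = stream_rate_bytes_per_sec(4096, 3112, 4, 24)

# Assumed sustained per-drive throughput; a placeholder, not a measured value.
members = drives_needed(rate, 250_000_000)

if __name__ == "__main__":
    print(f"{rate / 1e9:.2f} GB/s, at least {members} drives")
```

So under those assumptions one track is already several drives' worth of sustained throughput, which is why the failure rate of each member matters so much in a stripe.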

"A lot" is a very relative term. Many people (including myself) have SSDs running for over a year and they're ticking on just fine.

I've had good luck with SSD drives (Intel and Kingston) for the last two years. No failures. I use them in servers and laptops. I do rsync them to spindle-based drives hourly, so if they do fail, I've got a backup. Just my experience.

If they did fail, would you failover to the spindles (and would they keep up with the workload?), or use them to recover to new SSDs?

I use 2.5" laptop hard drives not older than 2 years. They last about 6 years.

And if a hard drive starts to go, sometimes you hear it failing. I actually backed up a 30GB hard drive where I had to wobble it up and down while it was being backed up.

2 x Intel SSDs here. No issues.

While the speed of these things can vary a lot, so does reliability - and that won't show up in a benchmark. I noticed that a lot of the mentioned SSDs are GSkill. Apparently those are cheap and unreliable.

I have had 3 different SSDs since early 2009 (64 & 128GB Supertalent Ultradrive ME + Intel X25-M 80GB) and the latter two are in heavy use every day. They have all worked fine to this day, but this article worries me a bit ;)

I'd like some slightly less anecdotal evidence before I conclude that all SSDs are doomed to an early death; we could be seeing some confirmation bias here. Maybe Jeff and his friends just had a run of bad luck.

Created a poll: http://news.ycombinator.org/item?id=2506138

(probably still won't be very useful, but at least it will be another data point)

Seems like the entire post was a lead in to the affiliate link at the bottom.

My first SSD died in three days. It is entertaining when an SSD dies, as you simply get a "no OS found" type error from your BIOS. Checked the drive and it appeared to be working, only it was completely empty.

Swapped it out, been fine since (about a year and a half now). Also no problems in my Mac.

But both machines run regular backups so that should an SSD fail, it's just a matter of imaging the new drive and replacing it.

This is really rather shocking to me. I always figured SSDs were as reliable as SD cards or other flash-based items like MP3 players or phones, where you practically never see a failure due to the memory going bad.

The alternative to SSDs is the conventional hard disk. But those also fail, so you need a full backup anyway. (Maybe not as often, but I have also had several hard disks fail without much warning.)

5 SSDs here, got the oldest 2.5 years ago. No failures so far. I've run them as system drives, DB storage and so on.

I'd think a reason for quick wear out may be swap memory. I've never put a swap file/partition on mine.

>> may be swap memory

That's very interesting. I've had a swap file on all of my SSD systems. What makes you think that could be a culprit?

Continuous writing to the same area on the drive can probably exhaust the spare cells and degrade performance. I don't know how the firmwares deal with 0 spare cells and further failures, though.

I have a swap partition on my SSD but I also have swappiness=0, so that it only writes to disk if there's actual memory pressure.
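If you want to check what your own box is set to, on Linux the current value is exposed at /proc/sys/vm/swappiness. A minimal sketch (the helper name is just something I made up; the path is the standard procfs location):

```python
from pathlib import Path

def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the kernel's vm.swappiness setting as an int (Linux only)."""
    return int(Path(path).read_text().strip())

if __name__ == "__main__":
    # Lower values make the kernel less eager to swap.
    print("vm.swappiness =", read_swappiness())
```

Changing it is a matter of `sysctl vm.swappiness=0` as root, plus an entry in /etc/sysctl.conf if you want it to survive a reboot.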

Modern SSDs use wear leveling to prevent this from happening. If you write to the same logical location a thousand times, the SSD's wear-leveling algorithm will write to a thousand different locations.

More info: http://en.wikipedia.org/wiki/Wear_leveling
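To make that concrete, here is a toy simulation. It is not how any real controller firmware works (those are proprietary and far more involved); it just assumes the simplest possible policy, "always write to the least-worn block that isn't holding the live copy", and compares it with a naive fixed mapping.

```python
def simulate_writes(num_physical_blocks, num_writes, wear_leveling=True):
    """Rewrite ONE logical block num_writes times; return per-block write counts."""
    if num_physical_blocks < 2:
        raise ValueError("need at least two physical blocks")
    wear = [0] * num_physical_blocks
    current = None  # physical block currently holding the live data
    for _ in range(num_writes):
        if wear_leveling:
            # Flash can't be overwritten in place, so pick the least-worn
            # block other than the one holding the current copy.
            candidates = [b for b in range(num_physical_blocks) if b != current]
            target = min(candidates, key=lambda b: wear[b])
        else:
            # Naive mapping: the logical block always lands on physical block 0.
            target = 0
        wear[target] += 1
        current = target
    return wear

if __name__ == "__main__":
    leveled = simulate_writes(100, 1000, wear_leveling=True)
    naive = simulate_writes(100, 1000, wear_leveling=False)
    print("max wear with leveling:", max(leveled))
    print("max wear without:", max(naive))
```

With 100 physical blocks, a thousand rewrites of the same logical block end up spread evenly instead of hammering one block a thousand times; with 1000 or more blocks, each one gets written once, which is the "thousand different locations" case.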

Having a personal infrastructure chaos monkey is maybe not a bad thing (apart from the expense etc).

This has to be advertisement :)

Went to the Vertex 3 page on NewEgg. There were at least 5 reviews on the very first page about it dying within 0 to 24h.

My cheap SSD in my 3yr old eeepc1000 is still kicking. It's now serving games to my Wii via a $5 enclosure that does NOT need external power. And the eeepc has a bigger and faster one that's also been working for 1yr+.

...too bad Asus used a mini e-pci interface that is as odd as the records you listened to when you were 15.

Shocker - cutting-edge new technology has a high failure rate! But +1 for mentioning Boob Job in an article about SSD drives.
