
The SSD Endurance Experiment: Casualties on the way to a petabyte - nkurz
http://techreport.com/review/26523/the-ssd-endurance-experiment-casualties-on-the-way-to-a-petabyte/
======
nkurz
I find it astonishing that none of the drives that died finished in a state
where their data was accessible. Some of these drives even intentionally
brick themselves at end-of-life:

    
    
      According to Intel, this end-of-life behavior generally   
      matches what's supposed to happen. The write errors 
      suggest the 335 Series had entered read-only mode. When 
      the power is cycled in this state, a sort of self-destruct 
      mechanism is triggered, rendering the drive unresponsive.
    

So you enter a read-only mode, and then on power cycle you self-destruct,
making the intact data inaccessible? In what circumstance is that possibly the
right choice?

    
    
      Intel says attempting writes in the read-only state could   
      cause problems, so the fact that Anvil kept trying to push 
      data onto the drive may have been a factor.
    

Oh, I see. Maybe it's to prevent the unwanted writes from damaging the drive.
Wait, what? The attempted writes can damage a drive that is in read-only mode?

~~~
rurounijones
I imagine it is to stop people trying to sell second-hand, almost-dead SSDs to
non-savvy customers who do not know about SMART etc.

You will notice that their Enterprise SSDs do not self-brick.

[EDIT] Read down for clarification regarding "almost-dead"

~~~
ggreer
My guess is that Intel couldn't coerce the SandForce controller into behaving
nicely. Of course if anyone asks them about the behavior, it's a feature, not
a bug. But really, selling almost-dead SSDs is not a big problem. These
endurance tests show that you'd have to overwrite a 240GB SSD every day _for 5
years_ before getting close to exhausting it. Drives will become obsolete
before they wear out.
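
Rough arithmetic behind that claim (a back-of-the-envelope sketch; the drive
size and write pattern are just the ones assumed above):

    # Hypothetical workload: one full overwrite of a 240GB drive per day.
    drive_gb = 240
    days = 5 * 365                      # five years of daily overwrites
    total_tb = drive_gb * days / 1000   # host writes in TB
    print(total_tb)                     # ~438 TB of host writes
    # That is in the "hundreds of TB" range; the first failures in this test
    # only showed up after roughly 700TB.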

Last I checked, Intel used their own controllers in their enterprise SSDs. Now
I wonder if their older consumer drives like the X25-M (which used in-house
controllers) become read-only instead of bricking themselves.

~~~
userbinator
_Now I wonder if their older consumer drives like the X25-M (which used in-
house controllers) become read-only instead of bricking themselves._

[http://www.xtremesystems.org/forums/showthread.php?271063-SS...](http://www.xtremesystems.org/forums/showthread.php?271063-SSD-Write-Endurance-25nm-Vs-34nm/page206)

The X25-M G1 died at ~883TB written (without TRIM, so in reality it would
handle much more), also by simply failing to be detected.

On the first page of that thread you see an X25-V (40GB) with _1.5PB_ written
and apparently it is _still alive_ as of earlier this year...

[http://www.xtremesystems.org/forums/showthread.php?271063-SS...](http://www.xtremesystems.org/forums/showthread.php?271063-SSD-Write-Endurance-25nm-Vs-34nm&p=5221960&viewfull=1#post5221960)

------
nisa
How would I monitor an SSD's SMART values? I'd like to know when the SSD is
about to brick...

These are the values for a Samsung 830 SSD:

    
    
        ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
          5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
          9 Power_On_Hours          -O--CK   097   097   000    -    13779
         12 Power_Cycle_Count       -O--CK   099   099   000    -    68
        177 Wear_Leveling_Count     PO--C-   096   096   000    -    118
        179 Used_Rsvd_Blk_Cnt_Tot   PO--C-   100   100   010    -    0
        181 Program_Fail_Cnt_Total  -O--CK   100   100   010    -    0
        182 Erase_Fail_Count_Total  -O--CK   100   100   010    -    0
        183 Runtime_Bad_Block       PO--C-   100   100   010    -    0
        187 Uncorrectable_Error_Cnt -O--CK   100   100   000    -    0
        190 Airflow_Temperature_Cel -O--CK   074   058   000    -    26
        195 ECC_Rate                -O-RC-   200   200   000    -    0
        199 CRC_Error_Count         -OSRCK   253   253   000    -    0
        235 POR_Recovery_Count      -O--C-   099   099   000    -    62
        241 Total_LBAs_Written      -O--CK   099   099   000    -    10945325283
    

Monitor all of them? Or just the wear-leveling count? Or watch only the
program-fail and runtime bad-block counts? What do these even mean?

~~~
ars
Check 177, 235 and 241. When the normalized value gets close to 0 (for any of
them), start to worry.

The ones I listed are the main wear indicators. For the others, I'd worry if
they had any value at all in the RAW column, even if the normalized value was
not close to 0. If that happens, carefully research the implications of
whichever attribute it was.

If this is a Linux server and you configure smartmontools correctly, it will
email you when a value gets close to its threshold.
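
For reference, a minimal sketch of that kind of check, assuming smartmontools
is installed, the script runs as root, and the device is /dev/sda (attribute
IDs and thresholds vary by vendor, so treat it as illustrative only):

    import subprocess

    DEVICE = "/dev/sda"              # adjust to your SSD
    WEAR_ATTRS = {177, 235, 241}     # the wear indicators mentioned above

    # Parse the attribute table from `smartctl -A`; the first field of each
    # attribute row is the ID and the fourth is the normalized VALUE column.
    out = subprocess.run(["smartctl", "-A", DEVICE],
                         capture_output=True, text=True).stdout

    for line in out.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].isdigit():
            continue                 # skip headers and non-attribute lines
        if not fields[3].isdigit():
            continue                 # some firmware prints non-numeric values
        attr_id, name, value = int(fields[0]), fields[1], int(fields[3])
        if attr_id in WEAR_ATTRS and value <= 10:
            print(f"WARNING: {name} (ID {attr_id}) is down to {value}")

The emailing part is smartd's job (the -m directive in smartd.conf), if I
remember the configuration correctly.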

~~~
yungchin
I just tried to find out about 235 (POR recovery count), and from what I
gather this counts any hard power cycles you've "inflicted" on your machine,
is that right?

I suspect that between 177 (wear leveling) and 241 (LBAs written), 177 may be
the better indicator, because 241 may not take write amplification into
account. There's a useful discussion of this from AnandTech:
[http://www.anandtech.com/show/6459/samsung-ssd-840-testing-t...](http://www.anandtech.com/show/6459/samsung-ssd-840-testing-the-endurance-of-tlc-nand)
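
To make that concrete (the numbers below are made up purely for illustration;
real write-amplification factors depend on workload and firmware):

    host_writes_tb = 10            # what attribute 241 reflects (host-side writes)
    write_amplification = 3.0      # assumed factor; small random writes can be worse
    nand_writes_tb = host_writes_tb * write_amplification
    print(nand_writes_tb)          # 30 TB actually programmed to flash, which is
                                   # the wear that attribute 177 tracks indirectly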

------
vxNsr
In light of recent revelations of manufacturers giving reviewers different
hardware than what they sell to the public, I think it's important to find out
where these drives came from. If TechReport just walked into a Best Buy (or
Amazon) and bought the six drives consumer-reports-style, then I'd be willing
to believe that this is a legitimate test of consumer hardware, but if they
were given the drives by the OEMs then there is a chance that some jiggering
was done to boost scores.

Though I guess in the long run if they do even half as well as the report
suggests we're probably in good shape.

------
toolslive
There is also:
[https://www.usenix.org/system/files/conference/fast13/fast13...](https://www.usenix.org/system/files/conference/fast13/fast13-final80.pdf)

and [http://www.tomshardware.com/reviews/ssd-reliability-failure-...](http://www.tomshardware.com/reviews/ssd-reliability-failure-rate,2923-4.html)

Together with this one, they should give a good picture of SSDs.

------
edwintorok
According to this older article, Intel SSDs were the only ones that reliably
stored data in a write-sync-powerloss scenario:
[http://lkcl.net/reports/ssd_analysis.html](http://lkcl.net/reports/ssd_analysis.html)

But if they also completely brick themselves at EOL then it doesn't seem like
a good choice, so ... are there any reliable SSDs??

~~~
personZ
_are there any reliable SSDs??_

For most rational definitions of reliable, absolutely. In this case
inexpensive MLC or even TLC drives withstood hundreds of TB or more of writes
before failing, clearly indicating their remaining lifespan the entire time.
In the common applications of these drives, they're unlikely to ever see tens
of TB of writes.

They can't last forever. It is odd that the Intel "bricks" itself, and that
sounds more like a fault than anything (I'm at a loss to explain how that
makes sense as a behavior, beyond maybe "hiding your data after you've lost
the ability to wipe it"), but again it was clearly communicating the entire
time that its death was imminent.

As an aside, that power test you linked was an extreme test of thousands upon
thousands of abrupt power cuts while a write-back cache was enabled and
populated, against a very small number of drives. It is not relevant to
consumer products (where few or no drives have power-loss protection), but
it's even odder in the enterprise space, where power assurance is at the rack
or even datacenter scale, not component by component.

~~~
edwintorok
Good point about "communicating the entire time that its death was imminent".

Regarding the power tests, I think the relevance is that the drives lie to the
OS about sync: they claim they completed it when the data is only in their
cache, and then hope they can write it to flash before power loss / before the
capacitor runs out. The cache wouldn't be a problem if it could be reliably
flushed. As it is, they will corrupt the integrity of anything that relies on
syncs / write barriers, like databases.
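
For context, this is roughly the contract that gets broken. A minimal sketch
(assuming a POSIX system and a hypothetical journal file): the application
treats the data as durable once fsync() returns, but if the drive acknowledged
the flush while the data was still only in its volatile cache, a power cut can
still lose it:

    import os

    # Classic durability pattern: write, then force the data to stable storage.
    fd = os.open("journal.log", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    os.write(fd, b"COMMIT txn 42\n")
    os.fsync(fd)   # the OS flushes its caches and asks the drive to flush its own
    os.close(fd)
    # A database now assumes this commit survives power loss. If the drive
    # acknowledged the flush with the data still only in volatile cache, that
    # assumption is wrong.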

------
jackalope
I'd like to see the same kind of test with identical drives mirrored (RAID 1
or another suitable way to precisely duplicate disk I/O). One of the things
I've always wondered is if two SSDs from the same lot are more likely to fail
at the same time. It's not unusual to have such an arrangement in a newly
deployed server. Longevity is (more than) nice, but simultaneous failure is
still a disaster, whenever it happens. Does it make sense to provision drives
that don't match exactly (in age, manufacturer, etc.) in order to avoid
potential issues?

~~~
Filligree
Yes, it does; this is common knowledge. Using drives with similar serial
numbers will seriously degrade the expected benefit of RAID mirroring.

------
argc
Samsung 840 Pro vs Corsair Neutron GTX? They both did well... which would you
buy?

~~~
wmf
Neither; those models are old. I have actually bought two EVOs, but today I'd
buy an EVO or an MX100:
[http://www.storagereview.com/it_s_game_over_for_most_consume...](http://www.storagereview.com/it_s_game_over_for_most_consumer_ssd_companies)

------
zamalek
How would HDDs fare in this test?

I'm an overly cautious person when it comes to data, so I'm still holding off,
but these results look promising enough to throw some of that caution to the
wind.

~~~
theandrewbailey
I've had a 512GB OCZ Vertex 4 for almost 2 years. Depending on the OS used,
you should be able to automatically put more valuable, fast-changing,
user-specific data on an HDD, while keeping your system and apps on an SSD. I
did that for Windows 7, and everything has been great.

~~~
wvenable
I'd be interested to know the best way to set that up. My next Windows PC will
have an SSD for the system drive and a regular hard drive for other storage. I
figure I can use junction points to map certain folders to the hard drive, but
I'm not sure what a good configuration would be.

~~~
theandrewbailey
I used an answer file during installation. I followed this:
[https://answers.microsoft.com/en-us/windows/forum/windows_7-...](https://answers.microsoft.com/en-us/windows/forum/windows_7-files/win7-how-do-i-move-user-folder-to-a-different/565f16a5-e5ed-43c9-8422-4f56aebb296e)

If you use Windows 8, I've heard that doing something like this causes massive
problems when updating Windows itself. It's been fine for Windows 7 though.

~~~
chrisdhal
Correct. I did this exact thing (used a junction point for my User directory)
and could not upgrade from 8 to 8.1. Moved it back to "normal" and it went
fine. This is a known "issue", so don't do junction points, just do the
traditional "Move This Folder" for each special folder (Documents, Downloads,
etc.)

------
tim333
I wonder if there's a way to get those SSD metrics, like wear indication and
reallocated sectors, on a MacBook? Not that I'm ever likely to write that much
data.

~~~
pixelglow
You could try
[http://binaryfruit.com/drivedx](http://binaryfruit.com/drivedx); it's been
pretty useful for me.

~~~
tim333
Thanks. I tried it on my 2013 macbook air. It said 96% remaining life after
350 hours of use for what it's worth.

------
Shivetya
In my environment, SSDs are either in RAID sets or mirrored like any
traditional drive. This is just common practice, so a drive bricking itself,
while odd, would not endanger the data.

Our systems vendor is making a big push for SSD-only systems for many reasons.
Speed, mainly, but the reduced costs for electricity and cooling are
apparently significant too. Then there is the form factor: they are going to
be much more space-efficient than spinning drives.

~~~
falcolas
> raid

and

> would not endanger data

in the same train of thought does not compute. RAID arrays do not protect your
data. Only backups will do that.

In fact, striped RAID arrays will tend to lose data faster than a drive by
itself: when it comes time to calculate parity, they'll find an overlooked
mistake and take out 2-3 more drives, writing off all the data on that cluster
at once.

~~~
rbanffy
The "R" in RAID is for "redundant". Unless we are talking about RAID-0, which
should not even be called "RAID" (AID, perhaps?), it does, indeed, protect you
from individual drive failure (how many and which drives can fail depends on
the RAID configuration) and, therefore, the whole array will be, on average,
more reliable than the least reliable of its component drives.

There is, however, a lesson I learned: never, ever build an array with drives
of the same maker, model, and batch, as they will have a tendency to fail at
the same time for the same reasons. I did not build that array (I'd never do
it that way), but it failed on my watch. Luckily, the first drive to go went a
week before the second and the third, and I had a plan B.

------
comatose_kid
I wonder if you can draw meaningful conclusions about the reliability of a
given brand of SSD with such a small sample size?

~~~
jws
_Given our limited sample size, I wouldn't read too much into exactly how
many writes each drive handled. The more important takeaway is that all of the
SSDs, including the 840 Series, performed flawlessly through hundreds of
terabytes. A typical consumer won't write anything close to that much data
over the useful life of a drive._

~~~
d64f396930663ee
> The more important takeaway

That's exactly the issue: you can't have a takeaway if you don't have a
reasonable sample size. Are these drives all in the top 5% of quality for
their respective brands? We'll never know unless we do a larger study.

~~~
mikeash
The odds of all six drives being in the top 5% by chance is one in 64 million.
So I think we can at least rule out that extreme with decent confidence.
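
The arithmetic behind that figure, assuming the six drives are independent
samples:

    p = 0.05 ** 6    # probability that all six drives land in the top 5% by chance
    print(1 / p)     # 64000000.0  ->  "one in 64 million"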

------
drzaiusapelord
I wonder what this means for shops like DigitalOcean and now Linode that sell
SSD-backed VMs. Those things must be constantly writing. Even in a RAID array,
the load is even and the drives should fail at around the same time, kind of
making RAID a little useless. I'm guessing they mix and match different
vendors and models.

------
puzzlingcaptcha
Out of curiosity, how heavily do you tax your consumer SSDs? I have a Samsung
840 in my desktop for the system and applications (while keeping media and big
files on a platter) and after approximately one year I am at 1.5TB of data
written. That seems to give me another 200 years before the wear indicator
reaches 0.
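
For what it's worth, that estimate is just a linear extrapolation, roughly:

    tb_per_year = 1.5                 # host writes observed in the first year
    years_left_estimate = 200         # what the wear indicator's trend suggests
    implied_endurance_tb = tb_per_year * (1 + years_left_estimate)
    print(implied_endurance_tb)       # ~300 TB of host writes before the indicator
                                      # hits 0, in the same ballpark as the hundreds
                                      # of TB the drives in this test handled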

~~~
alphapapa
Those numbers seem very strange to me. My main system is an oldish laptop
running Linux, and its primary partition is a 200 GB ext4 one that was
originally created over 5 years ago. Its lifetime writes figure (tune2fs -l)
shows 2.4 TB, and that includes /home. How would you manage 1.5 TB written for
just the system and apps in one year?
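
If you want to check the same counter on your own machine, something like this
should work (assuming an ext4 filesystem on /dev/sda1, root privileges, and an
e2fsprogs new enough to track the counter):

    import subprocess

    # ext4 keeps a running "Lifetime writes" counter, reported by `tune2fs -l`.
    out = subprocess.run(["tune2fs", "-l", "/dev/sda1"],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        if line.startswith("Lifetime writes:"):
            print(line)    # e.g. "Lifetime writes:          2468 GB"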

------
theandrewbailey
I wonder when I'll be able to buy SSDs that heat their cells to increase life
expectancy.

[http://arstechnica.com/science/2012/11/nand-flash-gets-baked...](http://arstechnica.com/science/2012/11/nand-flash-gets-baked-lives-longer/)

------
jotm
Nice. I bought 3 Samsung 470s back in the day because of the endurance tests
that showed they could withstand 400TB of writes.

Two of them are still going strong; one failed in very unusual circumstances
(and was probably recoverable by Samsung).

------
callesgg
Great article. Nice to see some real, fact-based stuff.

------
chmars
Recommendation for Mac users:

'Verify disk' in Disk Utility from time to time. I usually get error messages
and have to boot from the recovery partition to repair my system disk
(Command-R).

According to the local Genius Bar, that's normal to a certain degree. For one
MacBook Pro, I got the mainboard and the SSD repaired on warranty.

I am wondering if these issues are related to HFS+ or to the reduced
reliability of SSDs …

~~~
projct
I have experienced this on several spinning-rust Macs (with multiple drives
that pass SMART and several other types of tests) over the past decade, as
well as occasionally on SSDs. It's HFS+ or some other OS X issue.

------
n0body
this is the trouble with ssds, they just die. but they're fast before they do.
but it's ok because everyone has backups, right? i know i do, and raid1 just
in case.

on a side note, glad i paid the extra cost for the 840 pro!

~~~
ZoFreX
> this is the trouble with ssds, they just die

As opposed to hard drives, which are always so good about failing predictably
/s

My takeaway was that SSD failure is somewhat easy to predict. I'd take
predictable failure over unpredictable any day.

~~~
n0body
in my experience hdds have signs they're going to die soon: they go slow,
click a lot, get smart errors, things get corrupt, etc. whereas with flash
media it all seems fine, until it suddenly isn't.

not that this is a rule, i've seen hdds fail instantly for no reason as well,
and flash media throw a massive wobbler but still be recoverable. but in my
experience this is not the normal behaviour.

but who cares as long as you have backups and a contingency plan?

~~~
ZoFreX
Google's study[1] on spinning disk drives found that predicting failure from
SMART data is not very accurate.

[1]
[http://static.googleusercontent.com/media/research.google.co...](http://static.googleusercontent.com/media/research.google.com/en//archive/disk_failures.pdf)

------
elheffe80
I wonder if SpinRite could fix them... heh

------
rasz_pl
dupe
[https://news.ycombinator.com/item?id=7899336](https://news.ycombinator.com/item?id=7899336)

------
Achshar
Behind a registration wall. Anyone have a mirror?

Edit: My apologies, I mixed this up with another (YouTube/WSJ) article which
is behind a paywall. Consequence of opening multiple articles at once.

~~~
scrollaway
I don't see a reg wall here. But TLDR:

Even with only six subjects, the fact that we didn't experience any failures
until after 700TB is a testament to the endurance of modern SSDs. So is the
fact that three of our subjects have now written over a petabyte. That's an
astounding total for consumer-grade drives, and the Corsair Neutron GTX,
Samsung 840 Pro, and compressible Kingston HyperX 3K are still going!

~~~
NamTaf
The selection of brand comes down to larger-scale trends in failure, like what
OCZ went through with some of their Vertex drives. Those were firmware bugs,
rather than degradation of the NAND, that led to a more sudden bricking of the
drive and loss of data.

A properly functioning drive will last ages now. Not all manufacturers produce
the same rate of improperly functioning drives, and that's where the
discrepancy between products exists.

