
The SSD Endurance Experiment: Only Two Remain After 1.5PB - ferrari8608
http://techreport.com/review/27062/the-ssd-endurance-experiment-only-two-remain-after-1-5pb
======
tytso
The definition of wear-out is more than just the SSD declaring the cell bad,
or the SSD failing suddenly. A cell is technically declared worn out when the
chance that it suffers charge leakage after N months at temperature T
exceeds probability P. (Exactly what these parameters are is a secret
that the SSD vendors don't disclose. There are some standards, but the SSD
vendors don't necessarily use those standards when they make promises about
their product's wear endurance.)

So even though an SSD might last for 1.5 PB's worth of writes, there is no
guarantee that if you were to then put the SSD on a shelf and wait nine
months, the data would still be good. This is probably one reason why some
vendors' drives declare themselves dead after so many gigabytes' worth of
writes, even if the flash cells haven't "failed" yet. Otherwise users might
depend on the SSD's contents being retained, when in fact they might suffer
data loss.

But of course, this doesn't really matter much, because you treat all data
stored on SSDs as a cache, and do regular backups, RIGHT? :-)

~~~
bloat
This does not invalidate your point of course, but note that the article
describes one of the tests some of these drives have passed: write a large
file and power the drive off for a week and then check the file.

~~~
allegory
Try powering it off for 5 years while keeping it at 75°C and see if the data
is still there.

~~~
gambiting
Wouldn't most storage types, including regular HDDs and optical media, fail
this test though? I'm not talking about complete data loss, but some data
would be corrupted after such a long time.

~~~
userbinator
Magnetic media (hard drives, tape) will essentially retain data indefinitely
unless exposed to magnetic fields that are strong enough or until the Curie
point is reached, both of which are unlikely scenarios for long-term storage.
Even in cases of fire that destroys the external components of an HDD, if the
platters didn't get hot enough the data is still there:

[http://istcolloq.gsfc.nasa.gov/fall2008/presentations/peders...](http://istcolloq.gsfc.nasa.gov/fall2008/presentations/pederson.pdf)

Flash-based memory is different - unlike magnetic media which can be thought
to be _bistable_ , flash is inherently unstable ( _monostable_ ); the erased
state of a cell is lower energy so the electrons stored in a programmed one
are "under pressure", and due to tunneling effects, slowly leak out over time.

The consequence of this is that magnetic media will continue to store
information long after it's obsolete; I'm almost willing to bet that the data
on a modern HDD will still be there on the platters in 100+ years, even if the
rest of the drive becomes inoperable. Ditto for optical media such as pressed
CDs - in that case the bits are manifested _physically_ , and unless the
medium is degraded to the point where the bits are no longer distinguishable,
the data stays (theoretically, even a CD whose reflective layer has degraded
is still readable via SEM or other physical means, since the data is
physically pressed into it.) On the other hand, flash will slowly and
irreversibly erase itself over long periods of time, as each cell returns to
its non-programmed stable state.

~~~
Retric
This is not exactly true: the magnetic fields on an HDD migrate around the
disk over time and eventually become unreadable by the drive. In theory the
data remains recoverable for significantly longer than that, but it's not
'stable'. While historically not much of a problem, it's a larger issue as
HDDs keep increasing in capacity.

HDDs actually internally refresh data to avoid this issue, so they're much
better as 'hot' storage. Tape is designed to avoid most of these issues and
is much better for long-term storage.

"Magnetic media – such as floppy disks and magnetic tapes – may experience
data decay as bits lose their magnetic orientation. Periodic refreshing by
rewriting the data can alleviate this problem. "
[http://en.wikipedia.org/wiki/Data_degradation](http://en.wikipedia.org/wiki/Data_degradation)

------
userbinator
The main issue I have with this form of testing is that it's basically
measuring the _ultimate_ endurance characteristics of the flash - running
program/erase cycles until some piece of the flash becomes completely
unusable. The majority of the time the first failure will occur in a user data
block, but there's a nonzero chance that it's in a block mapping table or the
firmware itself, and that will definitely cause catastrophic failure. The
article seems to be implying that it's OK to write more data than the
manufacturer specifies, but this is not something anyone should ever be doing
in a real-world scenario, because retention is inversely proportional to
endurance and also (exponentially!) to temperature. A drive that retains data
for a week at 20°C may not be able to do so at 30°C or even 25°C.
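
To put rough numbers on the temperature point, here's a back-of-the-envelope
Arrhenius-style sketch (the activation energy and the one-week baseline are
illustrative assumptions, not vendor-published figures):

```python
# Rough Arrhenius acceleration factor for charge loss, to show how sharply
# retention shrinks with temperature. Values are illustrative assumptions.
import math

K_BOLTZMANN_EV = 8.617e-5    # Boltzmann constant, eV/K
ACTIVATION_ENERGY_EV = 1.1   # assumed activation energy for charge loss

def acceleration_factor(t_use_c, t_stress_c):
    """How much faster charge leaks at t_stress_c than at t_use_c."""
    t_use_k, t_stress_k = t_use_c + 273.15, t_stress_c + 273.15
    return math.exp((ACTIVATION_ENERGY_EV / K_BOLTZMANN_EV)
                    * (1.0 / t_use_k - 1.0 / t_stress_k))

baseline_days = 7.0  # hypothetical worn drive retaining data ~1 week at 20 °C
for temp_c in (25, 30, 40):
    days = baseline_days / acceleration_factor(20, temp_c)
    print(f"{temp_c} °C: ~{days:.1f} days of retention")
```

With those assumptions, the one-week figure drops to roughly three days at
25°C and under two days at 30°C.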

The 840 Pro's reallocated sector count appears to have started rising at
600TB, which is roughly 2400 P/E cycles, on average, of the whole flash - this
is not surprising and agrees with the typical endurance figure of 2K-3K for 2x
nm MLC.

I've never agreed fully with the reasoning behind MLC - yes, it's technically
twice the capacity for the same die area/price as SLC (or alternatively, half
the area/price for the same capacity), but it's also nearly _two orders of
magnitude_ less endurance/retention and requires far more controller
complexity for error correction and bad-block management. In a storage device,
I think reliability is more important than capacity - even with backups, no
one wants to lose _any_ data. The tradeoff doesn't make so much sense to me -
theoretically, you could buy an MLC SSD that wears out after a few years (thus
needing to replace it and copy the existing data over to the new one, along
with all the risks that entails, etc.), or for only twice as much, an SLC one that
probably won't ever need replacing.

A 256GB SLC SSD with 100K P/E cycle flash is conceivably good for 25PB and
5-10 years, or <1PB and over a century... i.e. you could probably use one for
archival if stored in a good environment. Part of me thinks the manufacturers
just don't want to make such long-lasting products, hence the strong
association of SLC to "enterprise" products. (And the much higher pricing of
SLC SSDs, more than the raw price of NAND would suggest.)
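
The arithmetic behind that 25PB figure is just capacity times rated P/E
cycles; a minimal sketch that ignores write amplification and
over-provisioning, so treat it as an upper bound:

```python
# Raw write endurance: capacity x rated P/E cycles (upper bound; ignores
# write amplification and over-provisioning).
capacity_gb = 256
pe_cycles_slc = 100_000
total_writes_pb = capacity_gb * pe_cycles_slc / 1_000_000  # GB -> PB
print(f"~{total_writes_pb:.1f} PB of raw write endurance")  # ~25.6 PB
```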

~~~
vidarh
The rationale is that most people will _never_ approach those kinds of
numbers of P/E cycles, and so people would rather pay for more space, or pay
less - even in many enterprise settings.

We have some cheapish SSDs in use for some of our high-traffic database
servers. We lost some drives that failed catastrophically, and the company we
bought them from "suggested" we might have worn them out, that maybe we didn't
have a reason to RMA them, and that perhaps we just ought to buy more
expensive enterprise models next time.

So we checked the SMART data, and after a year of what to us is heavy 24/7 use
with a large percentage of writes, we'd gone through less than 10% of the P/E
cycles.

(We did our RMA, and it was very clear that this was a problem with the
model/batch - all the failed drives were OCZ Vertex drives from when their
failure rate shot through the roof before the bankruptcy)

All our other SSDs are chugging along nicely; the oldest have suffered
through 3-4 years of heavy database traffic. I am just waiting for the oldest
ones to start failing.

At that rate it doesn't matter that they won't survive as long as SLC: we'll
end up replacing them with faster, higher-capacity models soon anyway - we
usually do on a 3-5 year cycle depending on hardware and needs - because it's
more cost-effective for us to upgrade regularly and increase our hosting
density. That helps us avoid taking more rack space, and colocation
space/power/cooling costs us more than the amortised cost of the hardware.

The consumer market is similar: Most people don't ever buy replacements for
failed drives - they buy a newer computer.

~~~
userbinator
_The rationale is that most people will never approach those kinds of numbers
of P/E cycles_

The flip side of that is most people could now have drives that don't cost all
that much more, but last _much_ longer. Most SLC tends to be rated for 100K
cycles and 10 years of retention; assuming a roughly inverse correlation, at
10K or 1K cycles the retention goes up considerably to a century or more.
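
A minimal sketch of that inverse-correlation heuristic, anchored on the
100K-cycle/10-year rating (a rule of thumb, not a vendor-specified model):

```python
# Heuristic: retention scales inversely with the fraction of rated P/E
# cycles actually consumed. Rule of thumb only, not a vendor model.
rated_cycles, rated_retention_years = 100_000, 10
for used_cycles in (100_000, 10_000, 1_000):
    est_years = rated_retention_years * rated_cycles / used_cycles
    print(f"{used_cycles:>7} cycles used -> ~{est_years:,.0f} years")
```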

_The consumer market is similar: Most people don't ever buy replacements for
failed drives - they buy a newer computer._

That is true, but the long-term implications are more subtle; the fact is that
most people don't back up, and quite a few of them keep the old drives (which
were still working when they were replaced) around as "backup", with the
implicit assumption that the data on them will likely still be there if they
ever want to find, e.g., an older version of some file they had. With flash
memory, this assumption no longer holds.

On a longer timescale, we've been able to "recover data" from stone tablets,
ancient scrolls and books, this being a very valuable source of historical
information; and most if not all of that data was probably never considered to
be worth archiving or preserving at the time. More recently, rare software has
been recovered from old disks ( [http://www.chrisfenton.com/cray-1-digital-
archeology/](http://www.chrisfenton.com/cray-1-digital-archeology/) ). Only
the default, robust nature of the media made this possible.

Despite modern technology increasing the amount of storage available, and the
potential to have it persist for a very long time, it seems we've shifted from
"data will persist unless explicitly destroyed" to "data will NOT persist
unless explicitly preserved", which agrees well with the notion that we may be
living in one of the most forgettable periods in history. It's a little sad, I
think.

~~~
darkmighty
The fact is, it wouldn't matter as-is even if the data took 10,000 years to
degrade from the platter itself. Most consumer hard drives these days are made
for laptops, which are probably used for less than 5 years on average. Even if
you consider external and desktop HDDs, that long a lifetime isn't much use:
the control electronics themselves fail fast, and the mechanics even faster.

It's an optimization rule of thumb: in an optimal trade-off for (e.g.) maximum
reliability per cost, the reliability of each element will tend to be similar
(more precisely, the derivative of reliability vs. cost will be equal across
elements, but that tends to imply the former) -- i.e. you improve the least
reliable part and sacrifice the most reliable one, even if the latter is very
reliable in absolute terms.
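
As a toy illustration of that rule (with completely made-up
reliability-vs-cost curves), spending a budget greedily on whichever component
currently offers the best marginal gain ends up roughly equalising the
marginal gains across components:

```python
# Toy illustration: greedy spending on the component with the highest
# marginal reliability gain roughly equalises marginal gains at the optimum.
# The curves below are invented purely for illustration.
import math

curves = {  # fraction of failures eliminated as a function of spend
    "platter_coating": lambda c: 1 - math.exp(-c / 50.0),  # hard to improve further
    "control_board":   lambda c: 1 - math.exp(-c / 10.0),  # cheap to improve
    "mechanics":       lambda c: 1 - math.exp(-c / 20.0),
}

spend = {name: 0.0 for name in curves}
step, budget = 0.5, 60.0
while budget > 0:
    gains = {n: f(spend[n] + step) - f(spend[n]) for n, f in curves.items()}
    best = max(gains, key=gains.get)
    spend[best] += step
    budget -= step

for name, f in curves.items():
    marginal = (f(spend[name] + step) - f(spend[name])) / step
    print(f"{name:15s} spend={spend[name]:5.1f} marginal gain={marginal:.4f}")
```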

------
Retric
I normally find it annoying when they run endurance tests like this using only
one drive of each brand and treat the results as particularly meaningful.
However, in this case I think the failures may suggest things about the
drives' underlying architecture, not just who picked the best sample from the bin.

~~~
rodgerd
> I normally find it annoying when they run endurance tests like this using
> only one drive of each brand and treat the results as particularly
> meaningful.

The main takeaway for me from this is less about the reliability of
individual drives, and more that SSDs _as a whole_ have moved into a space
where I don't really need to worry about them being significantly less
reliable than hard drives.

~~~
jychang
Well, they could still improve.

My Macbook's SSD died in a blaze of reallocated sectors and write failures
just last week, and that thing was just around 2 years old.

For comparison, I've only had 2 spinning hard drives fail on me (I've owned
around 25), and those were >10 years old and mostly decommissioned.

It could be a statistical fluke, and the sheer speed of SSDs means that even
now, a higher failure rate is acceptable, but SSDs in general don't have the
longevity of older magnetic hard drives.

~~~
jahewson
My MacBook's SSD lasted just six months before it failed, right before
printing a boarding pass for a transatlantic flight. Afterwards it turned out
Time Machine had been making corrupted backups, which wasn't fun; fortunately
there was no user data loss, only Applications. I've learnt my lesson about
SSDs: they don't give you any hints that they're about to fail, and once they
do, it's game over.

------
jcampbell1
These SSDs are failing at roughly 3000 write-out cycles. Traditional hard
drives can take 6 hours to write out, so doing a similar test would take ~2
years.
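
Quick arithmetic with those rough figures (both are estimates, not
measurements):

```python
# ~3000 full-drive writes at ~6 hours per HDD write-out.
full_drive_writes = 3000
hours_per_writeout = 6
years = full_drive_writes * hours_per_writeout / (24 * 365)
print(f"~{years:.1f} years of continuous writing")  # ~2.1 years
```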

Spinning disks are so freaking slow that you could never test the reliability
apples-to-apples. Any workload that wears out an SSD could never be run on a
traditional HD.

~~~
CPAhem
A very good point, but it would still be good to see a comparison, even if it
does take 2 years to run on a traditional HD.

------
joshvm
This is encouraging although even tests on early drives showed that an
'average' drive should last far longer than people need them for - purely
based on the number of allowed writes. 750TB? That's more data than my
department, an imaging research group, have on our cluster...

There are some more tests which are very hard to do because you need time and
a large sample size, for instance what's the data retention time for a typical
SSD?

As far as I can tell, nobody really knows because you'd need to leave the
drive off for probably more than a year - and as soon as you turn the drive
on, presumably you refresh the charge that's leaked out? Most of the time this
isn't a problem because almost everyone turns on their PC or laptop
weekly/monthly if not daily.

~~~
dzhiurgis
If you don't have much RAM, I suppose you'd hit these limits much faster, as
the OS would swap a lot of data onto the drive.

~~~
reitzensteinm
Two of the drives survived a year of continuous use, so even regular swapping
is trivial.

Say you're using your computer 12 hours a day, half of that doing tasks where
memory usage greatly exceeds your physical memory, and you're using 1/4 of the
bandwidth of the drive to swap in/out (of which 1/2 are writes).

Under this quite heavy scenario, the writes will catch up to this endurance
test after _32 years_.
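
Working through those assumptions (and treating the endurance test as one
year of continuous, full-bandwidth writing):

```python
# All inputs are the assumptions stated above.
hours_swapping_per_day = 12 * 0.5          # 12 h of use, half of it memory-bound
write_fraction_of_bandwidth = 0.25 * 0.5   # 1/4 of drive bandwidth, half of it writes
daily_fraction_of_test_pace = (hours_swapping_per_day / 24) * write_fraction_of_bandwidth
print(f"~{1 / daily_fraction_of_test_pace:.0f} years to match one year of the test")  # ~32
```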

The speed, price & capacity of these drives are (at least for now) improving
so quickly that by the time you wear one out, even with torturous use,
replacing it will be trivial.

Cost per gigabyte has halved over the period of this test. So even when taking
the drive that just died and using it for the most pathological use case
imaginable, you'd be buying a $200 drive today, replacing it with a $100 drive
in one year, $50 in two years, etc. Not a big deal.

~~~
fixedd
I just wish someone could tell me one I could buy that would last longer than
1.5 years in my laptop. I'm on #3 now :(

~~~
simoncion
The 100 "GB" OCZ Vertex LE in my laptop has 27,834 hours of power-on time.
This is the only drive in the system, and it houses several always-on
encrypted swap partitions. I'd be surprised if you could purchase a new one at
this time, though. :P

I can get back to you in a little more than a year about the 750 "GB" Samsung
840 EVO in the lady's laptop, and 6->8 months on the Crucial M4 SSD in my
gaming PC. :)

~~~
fixedd
My last to fail was an OCZ Vector :(

------
callesgg
So not entering read-only mode after end of life seems like a very dangerous
bug that is not really acceptable.

Sidenote: articles like this always scare the shit out of me. I have a
Kingston SSD that has been in my main server for almost 3 years now.

The SMART data seems to say that it is 100% fine, but as it has been on for
24*365*3 hours, that seems unlikely.

~~~
rocky1138
Yes, but your data is on more than one drive, yes? Presumably you have some
sort of RAID setup whereby, if the one drive were to truly fail, another would
pick up in its place.

~~~
rsync
What worries me about raid mirrors on SSDs is that a lot of SSD failures are
not due to a part failure, but rather, a pattern failure ... meaning, if you
subject this SSD to thus and such series of writes, then it fails.

So the worry is, if you mirror an SSD then you could (theoretically) inflict
the exact same pattern on them over their lifetime and they would fail
simultaneously.

That is why all of our SSD boot drives, which are indeed mirrors, are built
from two different SSDs ... either two different generations of Intel SSDs
(3xx and 5xx for instance) or one Intel and one equivalent Samsung. This way,
their behavior cannot become correlated...

~~~
vidarh
That should concern you for regular drives too.

The infamous IBM DeathStar problem was partly resolved with a firmware upgrade
that added wear levelling, because the crashes were found to be due to
material flaking off the platters when the head remained in the same location,
leading to dust in the drive that created a high risk of triggering head
crashes so bad they'd strip almost all the material off the glass platters.

There's a fun picture on Wikipedia:

[http://en.wikipedia.org/wiki/HGST_Deskstar#mediaviewer/File:...](http://en.wikipedia.org/wiki/HGST_Deskstar#mediaviewer/File:IBM75GXP_Failed_Disks.png)

We had an array of 10 of them in 2001, and when the first one failed we
weren't aware of the problem and didn't think much of it. Then the second
failed a week later. The third a week after that. And so on, almost like
clock-work until all of them were dead. At the time that array made up enough
of our capacity that we couldn't afford to just take it out of rotation until
we'd replaced the drives.

It's the event that taught me the hard way to always at a minimum mix batches,
and preferably models and/or brands (just mixing drives from different batches
would've been insufficient with the DeathStar, as far as I remember). As well
as to favour multiple smaller arrays... We never lost any data, and the array
remained available for the entire time period it took to cycle through all the
drives, though.

------
devindotcom
I kind of anthropomorphize the devices in tests like this, so it's a bit sad
to see the poor things made to run until their legs fall off. But it's nice to
know they run farther than expected.

~~~
Scuds
running a defragment operation on a few of these makes them emit some
electronic noise if you listen closely. :D

~~~
darkstar999
For real though, don't defrag an SSD. It's pointless and wears it out for no
reason.

~~~
dghughes
The old Windows defrag with the small blocks was like visual bubblewrap.

We need more system utilities that are fun to watch.

------
colordrops
The fact that they work better than spec indicates to me that we are still at
the forefront of this technology with good engineering effort behind it. Once
it matures and companies try to squeeze out every dollar, expect them to fail
a lot more and the MTBF to be less than advertised, similar to printers, etc.

------
scrollaway
It'd be nice to get some endurance testing on the SanDisk internal solid state
drives (eg. [http://www.amazon.com/Sandisk-SDSA5JK-064G-Module-Laptop-
Net...](http://www.amazon.com/Sandisk-SDSA5JK-064G-Module-Laptop-
Netobook/dp/B00IWNYRQ0))

I had one fail three days ago on a laptop that was barely a year old. I'm
not even sure what actually failed - debugging a broken SSD is a pain, and
when I mount it, it just freezes for ages, making the matter worse.

With SanDisk being the only real seller of those things there isn't exactly a
lot of competition, but I'd like to actually see how they hold up vs. regular
SSDs. There is a bit of a false expectation when you buy a laptop with one of
these: you expect the endurance of a regular SSD and may get something
awfully bad.

~~~
kabdib
Have seen quite a few Sandisk SSDs die over the past 3-4 years. Ditto, OCZ.

Yet to see an Intel or Samsung bite the dust. I'm sure it's only a matter of
time, but I won't buy any other brands.

~~~
simoncion
I have an OCZ Vertex LE that's still going strong after a little over 27,800
power on hours and ~18.6 TB of data written.

I would be shocked if "Judge reliability on drive models, rather than
manufacturers." was any less true in the SSD era than it was in the HDD era.

~~~
vidarh
You're right, sort of. Most of the time that's certainly the case. But OCZ QA
appears to have fallen off a cliff at some point, to the extent where it was
presumably a large factor in their bankruptcy, and it was a systemic problem
with most OCZ models.

A huge list of models have failure rates way above average, with a number of
them exceeding 5%, and some claiming that the failure rate for some of the
Octane and Petrol models exceeded 30%.

The best ones have been in line with other manufacturers, though.
Unfortunately the problems were so widespread that your odds of picking a
"safe" OCZ drive for a while were ridiculously bad, unless you were prepared
to wait for a year or two to get hard numbers on a model before buying.

The Vertex line has been a crapshoot, for example. Several of the Vertex 2
models had unreasonably high failure rates (5%-10% range). Your Vertex LE is
as far as I know based on a different design/controller than the regular
Vertex 2 and might have "escaped" the Vertex 2 problems. The Vertex 3 appears
to have been much better (and none of the Vertex 3's we have have failed).
Vertex 4 has been disastrously bad for us - every single one failed hard after
less than a year, to the point where even the SMART data is partially
corrupted on most of the drives (claiming they've been powered on for 75
million years, for example), which marked the end of buying OCZ drives for us.

------
kalleboo
I'm happy to see the high write longevity these drives are achieving, but it
frightens me a lot that the fail-safes, whereby they're designed to go into
read-only mode instead of just dropping off the controller and losing your
data, seem to be failing on all of them, even the Intel!

~~~
cesarb
In my experience with traditional HDDs, they often drop off the controller and
lose your data when they fail. Which means that, at least, SSDs are not worse
than HDDs in that regard. And they have the potential for a softer failure
mode (going read-only).

------
qwerta
There was a test at the Czech site diit.cz. The SSD survived several
overwrites, but after being left disconnected for several months without
power, it lost all its data. Apparently an SSD needs to repower its cells
periodically.

~~~
masklinn
Note that the techreport includes a component of leaving SSDs unpowered to
verify exactly that kind of issues. Although it's not a very long one.

------
higherpurpose
I wish they tested a Crucial drive, too. Crucial drives tend to have great
dollar/GB ratio.

------
Aoyagi
I wish the number of samples was much, much higher...

~~~
CPAhem
Yes, and that SSDs had been compared to normal spinning hard drives, too.

~~~
JoeAltmaier
Here's somebody else's data: [http://www.overclock.net/t/1284055/ssd-return-
rates](http://www.overclock.net/t/1284055/ssd-return-rates)

~~~
gonzo
2012

------
ck2
The Intel failed to reach the petabyte mark.

This is interesting because many datacenters use Intel drives.

~~~
daurnimator
Though it is noted they intentionally fail:

> The Intel 335 Series is designed to check out voluntarily after a
> predetermined number of writes. That drive dutifully bricked itself after
> 750TB

~~~
dragontamer
This is also the consumer drive instead of the enterprise drive.

------
TheLoneWolfling
I am concerned about the lack of read-only at EOL.

I, for one, would _much_ prefer slightly less longevity and better reliability
than vice versa.

I mean, I do backups, but backups only do so much.

~~~
chroma
Backups only do so much? Could you elaborate on what your backups don't do?

My backups allow me to recover from theft, hardware failure, accidental
deletion, and more. If my computer were to burst into flames right now, I
would only lose 30 minutes of work.

Even if SSDs reliably went read-only at the end of their service life, I would
still keep my existing backup strategy. There are so many ways to lose data
without disk failure.

~~~
TheLoneWolfling
I have a laptop, and as such tend to use it on-the-go.

As such, backups won't capture anything after the last time I had access to my
backup volume.

Also, again, I don't always have access to my backup volume.

I'm not saying "I wish that they would go read-only so I didn't have to back
up" - I am perfectly well aware of the need to back things up.

------
NoMoreNicksLeft
If an SSD is written to once, and kept powered on (and at a temperature in the
low 70s), while getting regular reads, how long can I expect this drive to
last?

If that drive is powered on, but kept at low temperatures (say near or below
freezing), does this help it survive longer?

What failure modes would occur in such an environment? Would it just be power
surges frying the thing, static electricity?

------
listic
They should have included the Samsung 850 Pro, which has been on sale since August.
[http://smile.amazon.com/s/ref=nb_sb_ss_c_0_7?url=search-
alia...](http://smile.amazon.com/s/ref=nb_sb_ss_c_0_7?url=search-
alias%3Dcomputers&field-keywords=850%20pro&sprefix=850+pro%2Caps%2C411)

~~~
OniBait
That would've been a bit hard to do, seeing as the endurance test has been
running for over a year. Even with SSDs, writing 1.5 petabytes takes a while.

------
gambiting
I've always been curious - I understand that memory cells have their own
durability, but how about the controllers that are used to transfer that data?
Do they wear out too? In fact, can a CPU fail after having exabytes of data
sent through it?

------
arenaninja
Does anyone know if there's a utility that monitors SSD health? I have a 512GB
SSD, which I'll probably keep for a while, but I don't like being in the dark
about how far along in its lifetime it is.

~~~
emddudley
Samsung drives come with Samsung Magician, which is a pretty nice utility.
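
For drives without a vendor utility, smartmontools can usually read the
wear-related SMART attributes. A minimal sketch (assumes smartctl is
installed, the drive is /dev/sda, and you have permission to query it;
attribute names vary by vendor, so the filter below is only a guess):

```python
# Dump SMART attributes via smartctl and keep the wear-related lines.
import subprocess

out = subprocess.run(["smartctl", "-A", "/dev/sda"],
                     capture_output=True, text=True, check=False).stdout
for line in out.splitlines():
    if any(k in line for k in ("Wear", "Percent", "Total_LBAs_Written")):
        print(line)
```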

~~~
AndyNemmity
And the only time I've lost a drive was due to a Magician firmware upgrade
that was wrong. I now avoid it out of general fear.

------
alecco
I can't find in the article whether those were sequential 1.5PB writes or
random small writes (i.e. < 4KB). If it's the latter, this article should be
flagged.

~~~
arihant
They link to a post with the details of the test, which mentions it
is sequential - [http://techreport.com/review/24841/introducing-the-ssd-
endur...](http://techreport.com/review/24841/introducing-the-ssd-endurance-
experiment)

~~~
alecco
Thanks!

------
abvdasker
Really excellent writing on these pieces. I lol'd at "dutifully bricked
itself". If only all tech writing were as colorfully engaging.

