
Power Failure Testing with SSDs - kustodian
http://blog.nordeus.com/dev-ops/power-failure-testing-with-ssds.htm
======
dboreham
I can see that the author put considerable time into testing and writing
the article, but nobody in the business of deploying production Postgresql
should ever have been using the various models of desktop SSD that were
tested, because the question of which SSDs are reliable has come up
repeatedly on the PG mailing lists. Several heavy PG users and developers have
already done this testing, and the universally accepted advice in that forum, for
years, has been to never use a drive that lacks power-fail protection.

Disclosure: we run Postgresql in production on Intel and Samsung "data center
grade" SSDs, and I participated in the aforementioned PG mailing list
discussions.

Updated: from [http://www.postgresql.org/docs/9.4/static/wal-reliability.ht...](http://www.postgresql.org/docs/9.4/static/wal-reliability.html)

"Many solid-state drives (SSD) also have volatile write-back caches."

and this thread: [http://www.postgresql.org/message-id/533F405F.2050106@benjam...](http://www.postgresql.org/message-id/533F405F.2050106@benjamindsmith.com)

and
[https://news.ycombinator.com/item?id=6973179](https://news.ycombinator.com/item?id=6973179)

~~~
kustodian
"...but nobody in the business of deploying production Posgresql should ever
have been using the various models of desktop SSD that were tested." this is
one of the reasons why this article was written. To show you how to test, to
see what you should use and what you shouldn't in production. When we did
these tests we already knew we shouldn't use those SSDs, we mostly did it just
for the sake of testing, to see if the tests we wanted to use make any sense
and to show others how to do it.
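
A minimal sketch of that kind of test (the record format and file names here are hypothetical; the article's actual scripts may differ): append sequence-numbered, checksummed records and fsync each one, log every acknowledged sequence number off-box, pull the plug mid-run, then check that everything the drive acknowledged is still intact:

```python
import os, struct, sys, zlib

REC = struct.Struct("<QI")  # sequence number + CRC32 of that number

def write_forever(path):
    """Append checksummed records, fsync each one, log the acked sequence."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    seq = 0
    while True:
        body = struct.pack("<Q", seq)
        os.write(fd, REC.pack(seq, zlib.crc32(body)))
        os.fsync(fd)  # once this returns, the record must survive power loss
        print(seq, flush=True)  # capture this on ANOTHER machine (ssh/netcat)
        seq += 1

def verify(path):
    """After the power cut: find the last record that survived intact."""
    last = -1
    with open(path, "rb") as f:
        while True:
            raw = f.read(REC.size)
            if len(raw) < REC.size:
                break
            seq, crc = REC.unpack(raw)
            if seq != last + 1 or zlib.crc32(struct.pack("<Q", seq)) != crc:
                break
            last = seq
    print("last intact record:", last)

if __name__ == "__main__":
    if sys.argv[1] == "--verify":
        verify(sys.argv[2])
    else:
        write_forever(sys.argv[1])
```

If `--verify` reports a lower sequence number than the last one logged remotely, the drive lost writes it had already acknowledged.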

Also, I know it may sound funny to use these SSDs in production, but when we
started with our game Top Eleven we didn't know that. When you have 5 people
working on a game, you don't have the time (or resources) to think about the
type of SSD; you order a server with an SSD, you get an SSD (you usually don't
even have a choice when you rent), and you use it. This is how it still works
with almost any server rental company.

~~~
dboreham
Ok, you didn't know something that's common knowledge in the industry - that's
reasonable. I'm sure there are plenty of things I don't know that I should.
But there are only three people in our company and we spend quite a bit of
time worrying about SSD reliability because our business depends on it :)

~~~
fulafel
Testing beats reasoning from specs.

------
t_tsonev
Am I the only one who thinks RAID controllers are a placebo and wouldn't trust
anything but ZFS?

 _What was even more interesting is that our SSDs are connected to the Dell
H700/H710 RAID controller which has a battery backup unit (BBU) which should
make our drives power failure resilient. RAID controller with BBU in case of a
power failure can hold the cached data until the power comes back, so that it
can flush it to the drives when the drives come back online._

~~~
coldtea
> _Am I the only one who thinks RAID controllers are a placebo and wouldn 't
> trust anything but ZFS?_

No, there are others that cargo-cult believe in ZFS too

\-- a hyped single vendor OS, without first-tier-support on Linux, and with
its own issues, compared to a industry wide standard, used for 3 decades in
the most demanding data-centers protocol and its implementations.

~~~
andor
_a hyped single vendor_

Who's that single vendor? FreeBSD, OpenIndiana, Joyent, Oracle?

_compared to an industry-wide standard_

Good luck exchanging your fried RAID controller for a "comparable" model

~~~
aexaey
> Good luck exchanging your fried RAID controller for a "comparable" model

If you are running Linux/*BSD/SmartOS, the best RAID controller is a JBOD
controller, i.e. one that exposes all connected disks as-is to the host
OS, with an in-kernel soft-RAID implementation providing the actual logic.

This approach gives you a more predictable system with no vendor-specific
idiosyncrasies or extra cache level to worry about (the BBU in OP's article); you
end up running RAID code that is peer-reviewed, fully integrated into the fsync()
path, and, surprisingly, you often get better performance too.

"True hardware RAID controllers", on the other hand, are nothing more than an
application-specific computer with its own (non-upgradeable and often
outdated) CPU, RAM, I/O and hard-to-upgrade proprietary software.

And if you buy into the view described above, then replacing a JBOD controller
for RAID use is exactly the same thing as replacing a JBOD controller for ZFS
use, by definition.
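
For illustration, a minimal sketch of that setup on Linux (device names are hypothetical, and this must run as root with mdadm installed): the controller only passes the disks through, and the kernel's md layer provides the RAID logic:

```python
import subprocess

# Four disks exposed as-is by a JBOD controller (names are illustrative).
disks = ["/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"]

# Create a RAID-10 array in the kernel's md layer. There is no controller
# cache or BBU in the write path, so fsync() durability depends only on
# the in-kernel md code and the drives themselves.
subprocess.check_call(
    ["mdadm", "--create", "/dev/md0", "--level=10",
     "--raid-devices=%d" % len(disks)] + disks
)
```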

------
devit
Seems pretty inexcusable to release SSDs that get corrupted on power off.

These drives are defective, and they should refund customers the price of the
SSD, plus some hefty compensation for the data loss they may have caused.

If you make a storage system and ack a write/sync, it had better be durably
written.

~~~
baruch
Most non-enterprise SSDs do not have an internal supercap or other power
protection mechanism and they are not intended for a server use-case and
shouldn't be used in such capacity.

An HDD will not hold data that is written to its write cache either so the
SSDs are well within the spec.

~~~
arielweisberg
That is not at all what the blog post is claiming nor is it the reality. It's
also not what the commenter you're responding to is complaining about.

The complaint is not about losing data in a volatile cache the complaint is
that drives will lose data even after the drive has claimed to have flushed
it's write cache after being given a write barrier.
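
For clarity, the contract being discussed, as a minimal Python sketch: once os.fsync() returns, the kernel has issued a cache flush to the drive and the bytes are supposed to be on stable media. The failing drives acknowledge that flush and lose the data anyway:

```python
import os

def durable_append(path, record):
    """Once os.fsync() returns, `record` must survive an immediate power cut."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, record)   # may sit in volatile caches; loss here is fair
        os.fsync(fd)           # flush/barrier sent to the drive; loss after
                               # this point means the drive broke the contract
    finally:
        os.close(fd)
```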

------
perlpimp
These are old SSDs (320-520). We are using the Intel DC S3500 & S3700 series
and haven't had any issues, through a few power outages in the past few years.

The 840 Pro is a consumer-grade drive: it has a lower guaranteed maximum write
capacity and larger unit-to-unit variance in quality. You should use
server-grade disks instead:
[http://www.samsung.com/global/business/semiconductor/minisit...](http://www.samsung.com/global/business/semiconductor/minisite/SSD/global/html/why/forDataCenter.html)
We use these in our analytics servers; they have stood the test of time many
times, without any issues. SSD tech evolves so fast that 3 years since a model
lineup's release seems like forever, especially in a write-intensive
environment (for example, writing slows reading, but by how much? DC-level
drives are way ahead here and very consistent, as we found in our tryouts).

~~~
kustodian
Yeah, I know these are old drives; we did this testing about a year and a
half ago, and when we started renting servers more than 5 years ago we
didn't think about which SSDs were in them. At that time we didn't have an
option to choose, nor time to think about the implications of different SSD
models; we used what we got. Later on, when we grew and started having
problems, we started investigating which models we should use. The main reason
for writing this article is to make people aware that they should test their
drives, and to show them how to do it.

~~~
acqq
What was the reasoning behind not testing the Samsung with "On On" cache and
barriers? Is it so much slower that it's not worth testing? Shouldn't barriers
allow the cache on the disk to "know better" how to organize writes, and still
be faster than with the cache turned off?

The "disk cache" is a disk hardware option (how it uses its own RAM), if I
understood correctly, and the barriers are just a filesystem (software) option.
I'd expect the performance penalty from the former to be much higher than from
the latter?

~~~
kustodian
The reason is that the performance penalty for barriers On + disk cache On was
much higher than for barriers Off + disk cache Off, at least that is how it
looked on our test system. Of course, that could be due to the fact that we
are using a RAID controller with 1GB of cache. If you are not using RAID,
"On On" would be a valid test.

------
Already__Taken
Can I just put a decent-sized capacitor on the SSD's power rail to provide the
half second of power needed to flush the cached state, and avoid these
problems? What is the capacitor in enterprise drives doing that triples the
price?

~~~
mrb
There is _nothing_ that makes SSDs with power loss protection inherently
expensive. As always, it's just market segmentation...

A large supercap in the 100-200mF range for an SSD is around $1 or $2. In fact
you can implement power loss protection with less capacitance with regular
tantalum caps like the Intel 320 did
([http://www.storagereview.com/intel_ssd_320_review_300gb](http://www.storagereview.com/intel_ssd_320_review_300gb)).
But drive manufacturers see the consumer market doesn't care about power loss
protection, so they decide to scrap the feature, which saves a buck or two,
and saves some PCB space.
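
Back-of-the-envelope numbers (my assumed values, not from any datasheet): the usable energy in a capacitor discharging from V0 down to the drive's minimum operating voltage Vmin is ½·C·(V0² − Vmin²), so a 150 mF supercap on a 5 V rail buys roughly the half second mentioned upthread at a 2 W flush load:

```python
C = 0.150   # farads: 150 mF supercap (middle of the 100-200 mF range above)
V0 = 5.0    # volts: SATA 5 V rail
Vmin = 3.0  # volts: assumed minimum the drive's regulator can work from
P = 2.0     # watts: assumed power draw while flushing the cache to NAND

usable_energy = 0.5 * C * (V0**2 - Vmin**2)  # 0.5 * 0.15 * (25 - 9) = 1.2 J
holdup = usable_energy / P                   # 1.2 J / 2 W = 0.6 s
print(f"{holdup:.2f} s of hold-up time")     # ~0.6 s, enough to flush the cache
```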

~~~
mavhc
Which is odd, because in my world SSDs in laptops will lose power, but SSDs in
servers are on a UPS, so they will be shut down gracefully.

~~~
jws
A laptop running down its battery should shut itself off gracefully. That
won't trigger the SSD power loss problems.

Usually it isn't a clean OS shutdown, but an orderly transition to some
variety of hibernation state, which should include flushing the drives.

~~~
drzaiusapelord
That's true until your battery ages a bit and the on-board battery monitoring
becomes inaccurate: it thinks there's 5% or so of power left when there is
none. I have more than one laptop that will just let the battery run out
because the on-board diagnostics think there's juice left when there isn't
any.

I think if you're selling laptops, you should worry about these kinds of
cases. Not to mention cases like Windows Updates running on battery, which
means the laptop can't hibernate after it has started those installs at
shutdown.

Standby/hibernate is still far from perfect. A fifty cent capacitor shouldn't
be a dealbreaker for ssd manufacturers.

~~~
Already__Taken
People who sell laptops want you to go and buy new laptops, not just replace
the battery. Why design around failed components? The answer you want is to
replace the battery, not to fork out for an expensive SSD option that's
unnecessary for 100% of the product's design life.

------
brongondwana
And yet again the takeaway is "buy Intel datacentre quality SSDs".

~~~
vegardx
It would be interesting to see them use Samsung SM843, which would be more
comparable to Intel S3500.

~~~
dboreham
We used Intel DC drives exclusively until one of them mysteriously failed;
thereafter we used the Samsung 845DC for new builds. So far they've performed
well.

------
DiabloD3
This is why I've been deploying Crucial M500, M550, and MX200 drives over the
years. They don't use full-scale supercap protection, but they're better than
the Samsung Pros and the few Intel drives without proper protection.

Generally, power loss will not affect these drives; I couldn't get them to
scramble existing data or damage the drive by unplugging them, or the computer
they were in, during heavy writes.

The M600DC is Crucial's full-scale DC model, which offers superior power-loss
protection.

Crucial drives are manufactured at the same joint Intel/Micron facility (using
technology from both companies) where Intel's current lineup of drives is
manufactured.

I agree with the article that S3500s have sufficient protection.

SanDisk also has a power loss protected drive, but the drives themselves don't
seem to be any good. I'm hoping SanDisk drives produced under Western
Digital's ownership will be much better.

------
scurvy
Anyone notice the theme in the drives that are not power-safe? They all use
Sandforce controllers.

------
acqq
What was with these "barrier" things? What was actually turned on and off?
Which filesystem exactly?

~~~
wyldfire
> CentOS 6.5 was used, SSDs were formatted with XFS and they were mounted into
> /mnt/ssd1 and /mnt/ssd2. XFS was used because that is our main file system
> for databases.

Write barriers are a mount option [1] [2].

[1] [http://linux.die.net/man/8/mount](http://linux.die.net/man/8/mount)

[2] [http://lwn.net/Articles/283161/](http://lwn.net/Articles/283161/)
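
Concretely, the two knobs the article toggles, as a sketch (device and mount point names are illustrative; assumes root on a CentOS-era system with hdparm installed):

```python
import subprocess

def run(*cmd):
    subprocess.check_call(cmd)

# Disk cache: a hardware setting on the drive itself, toggled with hdparm.
run("hdparm", "-W0", "/dev/sdb")             # disk cache off
# run("hdparm", "-W1", "/dev/sdb")           # disk cache on

# Write barriers: a filesystem mount option (XFS, as in the article).
run("mount", "-o", "nobarrier", "/dev/sdb1", "/mnt/ssd1")  # barriers off
# run("mount", "-o", "barrier", "/dev/sdb1", "/mnt/ssd1")  # barriers on
```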

