

“When all your engines stop, the flight is over” - anigbrowl
http://blog.theoldreader.com/post/56209408824/important-update/

======
xb95
Honestly, this doesn't sound like a bad batch of drives or the like -- sounds
like they weren't doing scrubbing on their RAID.

In case it's helpful, and for general knowledge dissemination:

What likely happened is that a drive "failed". This is usually when the RAID
card decides that a drive has had enough command errors that it fails the
drive. It may actually be fine, and just had a spate of bad responses. You
might try to online the drive again and let it rebuild, but that's debatable.

At any rate, so they replaced the drive. That's fine. But then, to rebuild the
RAID back to optimal state, it has to read all of the data off of the other
drives. Here's where a bad scrubbing policy bites you -- because if those
drives have any sectors that have gone bad or problems with the hardware,
those drives might fail as soon as the rebuild runs.

Scrubbing should be done regularly (weekly?). What it does is, in essence,
test every sector on all of the disks in the array to make sure that all of
them are still fully functional so that -- if there is a failure -- you're
pretty sure you can rebuild.

The downside of scrubbing is that, for better or worse, it exercises your
disks fairly heavily. And if you don't have a suitable trough period, you
might find you don't have the I/O bandwidth available to do it.

That said, you should do it if you're not.
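For Linux md software RAID, for example, kicking off a scrub boils down to
something like this (a sketch; the array name `md0` and the cron schedule are
placeholders):

```shell
# Start a scrub ("check") of every member disk of /dev/md0
echo check > /sys/block/md0/md/sync_action

# Watch progress
cat /proc/mdstat

# Schedule it weekly, e.g. from /etc/cron.d/md-scrub:
# 0 3 * * 0  root  echo check > /sys/block/md0/md/sync_action
```

Debian-based systems already ship a `checkarray` helper that mdadm's cron job
runs periodically, so you may only need to verify it's enabled.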

~~~
lmm
I'd recommend ZFS for working with these kinds of arrays; scrub is a command
that's easy to schedule regularly, and it will only use idle I/O bandwidth (it
helps that the RAID functionality is integrated with the whole filesystem).

If you have an infrequent scrub policy and hit bad sectors on rebuild, it can
detect checksum failures and mark specific files as corrupted, rather than
declaring all your disks defective. Traditional Linux md RAID behaviour is
particularly bad in this regard. Suppose you have a RAID 6 configuration,
haven't been scrubbing, and then have a single disk failure; by that point all
your disks will have a few random isolated bad sectors (i.e. sectors that will
URE when you attempt to read them). Since you still have one disk's worth of
parity, it's possible in principle to recover all your data with no downtime
(and with ZFS raidz2 this is what would happen). But with md RAID, as soon as
you hit those bad sectors during the rebuild it will consider those drives as
failing and kick them out of the array, and since every drive has at least one
bad sector, that makes it impossible to recover the array.
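For reference, the ZFS side of this is just (the pool name `tank` is an
assumption):

```shell
# Scrub the pool; runs in the background, yielding to normal I/O
zpool scrub tank

# Check progress, and list any files ZFS has flagged as corrupted
zpool status -v tank
```

`zpool status -v` reporting the affected file paths by name is what makes the
"mark specific files as corrupted" behaviour above possible.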

~~~
xb95
Well, scrubbing with md is pretty easy, too.

That said, ZFS sounds great, and I've been meaning to learn it sometime. The
big thing holding me off has been just -- well, how much time I've put into
learning md and such, and how I would find it hard to justify going back to
not knowing much (since this is my livelihood).

If you have any tips on "read this, it's a good intro" stuff (slanted to
Linux, as that's my cuppa), I'd welcome them.

~~~
lmm
I'm no expert, just a satisfied user; I got into it from the FreeBSD side, and
followed their tutorials, but I don't know how well they'd apply to Linux. In
general I like the Gentoo and Arch wikis, particularly for slightly lower-
level things like this - even if one is using a different distribution, they
tend to take the time to explain the concepts in a way that's mostly
distribution-independent.

I will say that ZFS felt very coherent and even - dare I say it - easy. I was
worried that merging layers that I'm used to thinking of as separate would
mean a loss of control, but I was able to put together all the layouts I
wanted. The various commands with a single interface felt a lot like using
lvm2 - different tools but under a unifying structure with the same parameters
- and a lot of what it does is lvm-like. So my best tip is to approach it more
as an LVM that can also do raid-like functionality and be mounted directly,
rather than as a raid system that includes volume management.

Sorry if that's not terribly helpful - as I said, I'm not an expert, but I
wanted to at least give you some kind of reply.

~~~
markeganfuller
+1 for the Arch wiki. Despite being a distro wiki, it has very nice, in-depth,
distro-independent pages. I use Debian but regularly read through the Arch
wiki when trying to do new stuff.

------
bigiain
Reminds me of this (from many years ago):

"But even today a 7 drive RAID 5 with 1 TB disks has a 50% chance of a rebuild
failure. RAID 5 is reaching the end of its useful life. "

[http://www.zdnet.com/blog/storage/why-raid-5-stops-working-i...](http://www.zdnet.com/blog/storage/why-raid-5-stops-working-in-2009/162)

~~~
tjoff
The failure characteristics and sizes of SSDs are not the same as those of
conventional hard drives, though, so you can't really apply much of that
article in an SSD context.

------
ck2
If you aren't using Intel SSDs in a server environment, you are going to have
a bad time.

They are relatively slow for home computer use, but for servers they're much
more reliable.

That said, this chart concerns me:

[http://www.ssdaddict.com/ss/Endurance_cr_20130122.png](http://www.ssdaddict.com/ss/Endurance_cr_20130122.png)

25nm Vs 34nm
[http://google.com/search?q=cache%3Ahttp%3A%2F%2Fwww.xtremesy...](http://google.com/search?q=cache%3Ahttp%3A%2F%2Fwww.xtremesystems.org%2Fforums%2Fshowthread.php%3F271063-SSD-Write-Endurance-25nm-Vs-34nm)

My first personal computer SSDs are going to be Samsung 830s from different
batches in RAID 1.

Also, when someone else builds your servers, you should query the SMART info
from the drives to make sure they aren't used SSDs.
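A rough sketch of that check with smartmontools (the device name is a
placeholder, and the exact attribute names vary by vendor):

```shell
# Dump SMART data and pull out the usage counters; a genuinely new
# drive should show near-zero hours and writes
smartctl -a /dev/sda | grep -Ei 'power_on_hours|lbas_written|wear'

# Attributes worth checking on an SSD:
#   Power_On_Hours                                 - total runtime
#   Total_LBAs_Written                             - cumulative writes
#   Wear_Leveling_Count / Media_Wearout_Indicator  - remaining endurance
```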

------
pja
SSDs still seem to have a bunch of nasty failure cases. Right now, for
production use I'm not sure I'd trust the things for reliable storage. As a
fast cache for spinning rust? Definitely. As my only live copy, even
duplicated in a RAID? Hmm.

Not even Intel SSDs are immune: One of the Debian developers has reported that
the SSDs shipped in the latest Thinkpads die if you try and construct an
encrypted filesystem on them. Somehow they corrupt themselves during the
initial write of random data to the disk.

(Interesting that these SSDs died whilst under high write load too: is this a
particular weak point for some reason?)

~~~
coldtea
> _SSDs still seem to have a bunch of nasty failure cases. Right now, for
> production use I 'm not sure I'd trust the things for reliable storage._

You should not trust anything for "reliable storage". That's what backups and
redundant drives are for.

------
ronilan
Losing data sucks, but the analogy is flawed.

“When all your engines stop, the flight is just starting.”

~~~
bdonlan
When your engines stop, you trim for best glide ratio, point yourself at an
open field, and fly the plane all the way to the ground. Failure to do so is a
great way to get yourself killed.

~~~
xb95
Nice to see a fellow pilot -- well, I'm still in the student phase, but I've
been doing lots of "glide, grass, gas!" lately, as my instructor likes to say.

Sadly, the number of open fields to aim for in the SF Bay Area is pretty
small. It's either hillsides, bay, or city. Oh, and a couple of thin dikes you
could land on. ("That's why nailing the centerline is so important. Someday
you might need to land on something like that dike wall -- and if you nail it,
you walk away fine.")

------
anigbrowl
BTW I should add that I'm still a fan of TOR and have no plans to switch away.
In fact, their upfront explanation of what went wrong and how has improved my
opinion of the project.

------
chiph
Sympathies. We had a similar thing happen at a previous job in the days of the
"deathstar" drives. Lost a drive, no biggie. Tell the DC guy to replace it.
Lost a second drive, told him to start running towards our cage. Lost a third
drive, uhh - how current are our backups?

Different job - had a developer accidentally run a where-clause-less delete in
production. Same net result. RAID and SANs are definitely not backup
solutions.

~~~
tetha
Very much so. For us, a RAID mostly buys time that we can use to move all the
important data off of it. It might be drastic, but it's safe and our data
isn't too big.

------
skore
If your service goes down, be sure to at least have some notion of what your
service does on the homepage. Right now, it's only an error report and I had
to skim through the blog for a while to figure out that it's a kind of Google
Reader replacement.

When your engines stop and you're getting lots of eyeballs, don't assume they
all know you.

------
achille
Storage is cheap compared to engineering labour. That's a massive migration
(from one DB to another), definitely not worth it to save that 300 GB.

It's still unclear how the storage failure was related to the migration. Was
the new engine/fs disruptive to the SSD?

~~~
Narkov
They mentioned pretty explosive growth so 300GB today may be 3TB next month
and 100TB in a few years.

------
coldtea
> _When all your engines stop, the flight is over_

Pedantic and off-topic as it is, this is incorrect. At least for airplanes.

When all your engines stop you continue to glide, and with a little skill and
luck you can even manage to land successfully.

~~~
tjoff
So, a glide to an emergency landing means that the flight is not over?

Similarly, just because a rebuild fails doesn't mean that the data is lost.
Just pop in the drives individually and fetch whatever you can from each of
them; it is seldom the case that exactly the same parts of them become
irretrievable at the same time, or that the drives stop functioning
completely. It's just more work than a regular rebuild.
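One way to do that per-drive salvage pass, assuming GNU ddrescue and a failing
member at /dev/sdb (both assumptions):

```shell
# Fast first pass: copy everything readable, skip scraping bad areas
ddrescue -n /dev/sdb /mnt/rescue/sdb.img /mnt/rescue/sdb.map

# Second pass: go back and retry only the regions recorded as bad
ddrescue /dev/sdb /mnt/rescue/sdb.img /mnt/rescue/sdb.map
```

The map file is what makes this resumable: each run picks up where the last
one left off instead of re-reading the whole disk.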

~~~
coldtea
> _So, a glide to an emergency landing means that the flight is not over?_

By the very definition of the word "flight", yes.

~~~
tjoff
Well, it depends on which definition of flight you use...

------
rxp
Ouch, sounds like somebody got hit by a really bad batch of drives. Always a
risk when you buy a bunch at the same time. :/

~~~
nbevans
In my experience, all SSDs are a bad batch unless they're Intel 520s or from
Samsung's high-end range.

A big mistake is to fill an SSD completely up with data. That's a huge no-no
unless it's an enterprise drive, which usually uses a totally different
design.

------
nknighthb
Either those drives are all from the same bad batch, or this isn't a drive
problem. I would be thinking either a cooling problem or flaky controllers,
and a drive swap in that case is not the solution.

~~~
akg_67
I doubt those drives were from a bad batch. As these were SSDs, it was most
likely write wear. Two SSDs should never be a RAID 1 pair; or, at least,
replace one SSD in the RAID 1 pair on a predetermined schedule instead of
replacing on failure only. They are going to wear at the same pace and will
fail at about the same time.

