Why you need STONITH (advogato.org)
17 points by morphics on Oct 22, 2013 | hide | past | favorite | 6 comments

Sadly, even simple concepts like STONITH are hard to get right. I believe it was GitHub that had an outage in which both database nodes shot each other: the network was extremely slow because of some fault (which caused the initial problem as well), so each node received the other's STONITH message at roughly the same time, long after both had timed out waiting for a response.
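A schematic toy (not a model of GitHub's actual setup; the node names and timing constants are made up) of how symmetric STONITH plus a slow network kills both nodes:

```python
HEARTBEAT_TIMEOUT = 5   # seconds before a node declares its peer dead (illustrative)
NETWORK_DELAY = 30      # the fault: messages crawl across the wire (illustrative)

class Node:
    def __init__(self, name):
        self.name = name
        self.alive = True

a, b = Node("db1"), Node("db2")

# Both nodes stop hearing heartbeats, and each fires a STONITH request
# at its peer while both shooters are still alive.
shots = []
for shooter, target in ((a, b), (b, a)):
    if shooter.alive:                       # true for both at fire time
        shots.append((HEARTBEAT_TIMEOUT + NETWORK_DELAY, target))

# Delivery is delayed by the slow network, but each in-flight shot still
# kills its target when it finally lands: both nodes go down.
for _arrival, target in sorted(shots, key=lambda s: s[0]):
    target.alive = False

print(a.alive, b.alive)  # False False -> total outage
```

The key detail is that both requests are in flight before either node dies, so neither shot is cancelled by its shooter's death.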

Distributed systems are hard.

One thing to keep in mind is that clustering will never make your app 100% available. The best it can do is add one "9" to your uptime (if you're already at 99.9%, it can get you to 99.99%, and so on). But one thing it should NEVER do is corrupt your data. If you have a service outage but your data was kept safe, the cluster still did its job.
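To put numbers on what "adding a 9" buys you, here is the downtime-per-year arithmetic:

```python
# Downtime per year at each availability level.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

for availability in (0.999, 0.9999):
    downtime = (1 - availability) * MINUTES_PER_YEAR
    print(f"{availability:.2%} uptime -> ~{downtime:.0f} minutes of downtime/year")
```

So going from 99.9% to 99.99% takes you from roughly 8.8 hours of downtime a year to under an hour.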

Is this for scenarios where there's no shared storage medium, so I/O fencing isn't a solution? I can see how, for a DRBD setup, you need some real way of ensuring only one node is up. It just feels like STONITH is a really ugly hack that would be better solved with other quorum mechanisms, even if that means adding a witness system.

Normally, when you have shared storage, you can use it as a quorum device (e.g., via an exclusive SCSI reservation). With something like DRBD, you still have the problem that the nodes can't see each other, but an outside application writing to the database served by the cluster can see both nodes -- and if each node brings up its shared IP address, some writes will go to one node and some to the other. Then neither node's database has all the current data (even if it isn't technically "corrupted").
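A toy illustration of that divergence (the keys, values, and alternating routing are made up for the sketch): two replicas that can't see each other both claim the service IP, and client writes land on whichever node each client happens to reach.

```python
node_a = {}   # each node's copy of the database after the partition
node_b = {}

writes = [("user:1", "alice"), ("user:2", "bob"),
          ("user:3", "carol"), ("user:4", "dave")]

# With the shared IP claimed by both nodes, some clients reach one node
# and some the other (alternating here, for simplicity).
for i, (key, value) in enumerate(writes):
    target = node_a if i % 2 == 0 else node_b
    target[key] = value

print(node_a)  # {'user:1': 'alice', 'user:3': 'carol'}
print(node_b)  # {'user:2': 'bob', 'user:4': 'dave'}
```

Neither copy is "corrupted" in the sense of bad blocks, but neither has all the current data, and the two histories have to be reconciled somehow after the partition heals.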

For some reason, I always get a bit disappointed when I read an article about STONITH and it doesn't begin with a pointer to the world's funniest joke (http://en.wikipedia.org/wiki/World's_funniest_joke). Now, I know that misplaced humor in technical documentation can go wrong sometimes, but this is one case where I think it can help make the concept really stick with the reader.

I was working on HA software in 1992. Specifically, I was working on the software from which Linux-HA copied all of its terminology and basic architecture. We ourselves were not the first, and often found ourselves copying things done even earlier at DEC, so I'm not complaining, but I want to make the point that this article from 2010 is actually a rehash of a much older conversation. As cute as the metaphor is, it gets two things seriously wrong.

(1) Fencing and STONITH are not the same thing. Fencing is shutting off access to a shared resource (e.g. a LUN on a disk array) from another possibly contending node. STONITH is shutting down the possibly contending node itself. They're quite different in both implementation and operational significance. Using the two terms as though they're interchangeable only sows confusion.
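A minimal model of that distinction (the class and method names are illustrative, not any real cluster API): fencing revokes a node's access to a shared resource, while STONITH powers off the node itself.

```python
class SharedLUN:
    """A shared disk resource with an access list."""
    def __init__(self, allowed):
        self.allowed = set(allowed)      # nodes permitted to do I/O

    def fence(self, node):
        self.allowed.discard(node)       # resource-level: cut off this node's access

    def write(self, node, data):
        if node not in self.allowed:
            raise PermissionError(f"{node} is fenced off")
        # ... perform the I/O ...

class Cluster:
    def __init__(self, nodes):
        self.powered_on = set(nodes)

    def stonith(self, node):
        self.powered_on.discard(node)    # node-level: kill the whole host

lun = SharedLUN(allowed=["node1", "node2"])
cluster = Cluster(["node1", "node2"])

lun.fence("node2")          # node2 keeps running but can no longer touch the LUN
cluster.stonith("node2")    # node2 is gone entirely; every resource it held is safe
```

Fencing is scoped to one resource and leaves the node running; STONITH is a blunt instrument that settles every contended resource at once.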

(2) You only need STONITH if you have the aforementioned possibly contending nodes - in other words, only if the same resource can be provided by/through either node. If the resources provided by each node are known to be different, as e.g. in any of the systems derived from Dynamo, then STONITH is not necessary.

To elaborate on that second point, the problem STONITH addresses is one of mutual exclusion. It might not be safe for the resource to be available through two nodes, because it could lead to inconsistency or because they can't both do a proper job of it simultaneously. As in other contexts, mutual exclusion is a useful primitive but often not the optimal one to use. In general it's better to avoid it by avoiding the kinds of resource sharing that make it necessary. That's why "shared nothing" is the most common model for such systems designed in the last decade or more, and they don't need STONITH unless they've screwed up by not fully distributing some component (such as a metadata server for a distributed filesystem).
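A sketch of why shared-nothing designs sidestep the problem (hash partitioning here is simplified to a bare modular hash with no replication, unlike real Dynamo-derived systems): each key is owned by exactly one node, so no two nodes ever contend for the same resource and there is nothing to mutually exclude.

```python
import hashlib

NODES = ["node0", "node1", "node2"]

def owner(key):
    # Stable hash, so every participant computes the same owner for a key.
    digest = hashlib.sha256(key.encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]

# Every key maps to one and only one node. A node failure makes that
# node's partition unavailable, but there is no scenario in which two
# nodes both believe they serve the same key -- hence no STONITH.
for key in ("user:1", "user:2", "order:17"):
    print(key, "->", owner(key))
```

The failure mode changes from "possible split-brain over a shared resource" to "partial unavailability of one partition", which is recoverable without shooting anyone.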
