

Within 3 days another power outage at Linode (Fremont) - jrnkntl
http://status.linode.com/2010/11/fremont-connectivity.html

======
jrnkntl
I don't understand what they did over the last few days if this is another
power outage.

In their posting about the outage on the 20th they said: "At this point all we
know is a severe lightning storm in the area caused a power outage and
redundant UPS systems failed."
<http://status.linode.com/2010/11/possible-power-outage-in-fremont.html>

Redundant UPS systems failed? And now it fails again? What kind of data center
are they running in Fremont?

~~~
dholowiski
I worked in a datacenter (serving many companies you've heard of) during a
catastrophic power failure that lasted almost 24 hours. It's kind of like a
plane crash - it's never one thing failing that causes the problem (that's
what redundancy is for) - it's that perfect chain of events, multiple 'once in
a lifetime' failures that causes it.

For example, a power outage occurs at the same time the UPS batteries are
being changed. The bypass fails and the diesel generators fail to kick in.
Circuit breakers blow everywhere, making it extremely difficult to get the
generators back online. This all happens in the middle of the night in a
winter storm (or lightning storm), which causes a further delay in response
time.

Been there, done that.

Edit: Also, bureaucracy, lack of documentation, and a manager CF added several
hours to the outage. Sometimes you just have to STFU and let the geeks fix the
problem.

~~~
jmcnevin
What you've described here would make an AWESOME movie. Granted, I'm not sure
how one would work "Datacenter" into the title and still sell many tickets.

~~~
rryyan
Cory Doctorow (of Boing Boing fame) wrote a sci-fi novelette about a
datacenter during a crisis:
<http://craphound.com/overclocked/Cory_Doctorow_-_Overclocked_-_When_Sysadmins_Ruled_the_Earth.html>

I personally didn't find this piece particularly great, but there's an entry
in the genre for you.

------
ghshephard
Every datacenter I've been in for more than a couple of years, over the last
15 years, has had a power outage (or two): AT&T, Qwest, Exodus, AIS (San
Diego), Layer 42 (San Jose), Media Temple (Los Angeles). It's like cable cuts
on your
circuit - they happen so reliably that you put these events into your business
plan. If you need network redundancy, you always have two (diverse) circuits.
If you need data center redundancy you always have two (diverse) data centers.
These events happen so reliably that the surprise is when they _don't_ happen,
not when they _do_ happen.

Plan for a minimum of one power outage every two to three years and you won't
be disappointed.

I feel for the HE guys - back-to-back power outages have got to be killing
them right now.

------
2timer
I host directly with HE in Fremont1 and just experienced the power outage.
I've had equipment there for 4+ years and this is the first power issue I've
had with them. HE isn't perfect, but up until now I've been perfectly happy
there. Yes, HE runs a fairly relaxed data center there for better or worse. HE
hasn't communicated about this outage--which I find very disappointing. I
would guess this power outage was a result of attempting to fix whatever broke
Saturday.

------
jread
Here are my own availability stats for Linode this year. We use Panopta to
monitor each data center. All are within the SLA:

* Dallas - 99.951%

* Newark - 99.969%

* London - 99.986%

* Fremont - 99.989%

* Atlanta - 99.995%
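
To put those percentages in more tangible terms, here is a rough Python sketch that converts each figure into implied hours of downtime and checks it against the 99.9% SLA quoted elsewhere in this thread. The 365-day measurement period is an assumption (the stats above only say "this year"), and the names `downtime_hours`, `SLA_PCT` and `PERIOD_HOURS` are made up for illustration.

```python
# Availability figures quoted above (percent uptime per data center).
availability = {
    "Dallas": 99.951,
    "Newark": 99.969,
    "London": 99.986,
    "Fremont": 99.989,
    "Atlanta": 99.995,
}

SLA_PCT = 99.9            # the 99.9% uptime SLA quoted in this thread
PERIOD_HOURS = 365 * 24   # ASSUMPTION: a full 365-day measurement period


def downtime_hours(pct, period_hours=PERIOD_HOURS):
    """Hours of downtime implied by an availability percentage."""
    return (100.0 - pct) / 100.0 * period_hours


for site, pct in availability.items():
    status = "within SLA" if pct >= SLA_PCT else "below SLA"
    print(f"{site:8s} {pct:.3f}%  ~{downtime_hours(pct):.2f} h down  ({status})")
```

Under that assumption even the lowest figure (Dallas) works out to only a few hours of downtime over the year, and the highest (Atlanta) to well under one hour.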

~~~
scalyweb
Thanks for sharing. Were you able to confirm the outages for each datacenter
with Linode, or do the stats include any false positives?

------
rbarooah
I found this a little disturbing (having just put a new app on a node there -
fortunately pre-production), but then I remembered that the last two places I
worked paid the premium to be hosted at 365 Main in SF, with its flywheels and
diesel generators etc, and that didn't turn out to be magic:

<http://www.365main.com/status_update.html>

The moral of the story for me is:

* These things are complicated

* Failures will happen

* You have to be prepared to deal with them

------
dotBen
I wouldn't host anything in Hurricane Electric @ Fremont for several reasons:

Back in 2008 an HE-based colo, "McColo", was shut down because it was hosting
a HUGE number of botnet controllers, spamming operations, and similar shady
operations. When it was shut off, some security firms saw a 50% drop in spam
going through their firewalls.

However it only happened after immense pressure on HE and other providers
involved by Google, Washington Post and all sorts of other players.

HE would have been aware because large IP blocks were being blacklisted (I
heard at one point all of HE Fremont's IPs were blocked by some of the more
extreme SBL lists) but they turned a blind eye and/or claimed Ts & Cs were not
being infringed.

More here <http://news.cnet.com/8301-1009_3-10095730-83.html>

I found that highly irresponsible, both in terms of the detriment to their
other colo customers who were sharing the BGP-level bandwidth and from a
wider 'being a good actor' perspective.

Given that they are also on a fault line and that good connectivity to Europe
is more important to me than Asia, I would prefer to host on the East Coast.

I'm actually in Linode's New Jersey and London Colos, and they are both
excellent.

~~~
updog
I'm not sure that is a bad thing, actually. Although I do not like spam and
botnets, at least they are willing to stand up for their customers and require
due process. Others will cave on bogus DMCA notices and assume that the
customer is guilty.

I would actually call pulling the plug without due process irresponsible.

Also, I question your claim about it happening only after "immense pressure."
Your own link praises them for their response, and some googling suggests
similar wording in all coverage I can find.

------
jojopotato
Folks in their irc channel are saying that the problem is with Hurricane
Electric.

------
thethimble
I bought my first Linode on the morning of the first power outage. Does this
kind of thing happen regularly there? I'm having regrets...

~~~
rbranson
This is the exception for sure. Word is that the problem is with the
datacenter their boxes are in (Hurricane Electric). Of course, ultimately this
could cause customer loss for them, so it becomes their issue.

------
jrnkntl
Resolved after 44 minutes of downtime (according to <http://wasitup.com>) for
my server.

------
revicon
Posting a reference to the previous outage thread here as it's the only place
I could track down their support IRC info. (Might need to dig deeper in the
Linode support docs)...

<http://news.ycombinator.com/item?id=1926368>

Actual IRC info...

Server: irc.oftc.net Channel: #linode

------
mrinterweb
These recent outages have been giving me a lot of stress. I manage 7 Linodes
at the Fremont data center. Fortunately these outages have been at non-peak
times, but if this happens during a peak time, I will need to rethink my
server architecture.

------
floodfx
Out of curiosity, I am wondering why someone would choose Linode over AWS?

~~~
dotBen
Nuanced, but different products.

AWS is a cloud product, with pros and cons -- instances can die at any time
and data (RAM and disk) won't persist, network is slightly strange with their
NAT IPs but in return you get a setup that lets you connect with big storage
(S3), potentially large clusters (the new GPU core product) etc.

Linode is a VPS which is just presenting you with an abstracted server. If the
instance or the hypervisor gets bounced, anything on disk is preserved.
Networking is normal (standard IP address) and everything runs as if it is a
bare metal server (more true for Xen-based VPSes like Linode than for
SolusVM).

------
cvg
I live down the street from this datacenter and my home experienced this same
power outage. Makes me wonder how they handle power failures. My guess is that
they don't.

------
dholowiski
Yikes, makes me happy I chose Atlanta for my project. I kind of figured it
would be a good idea to have my server a long way away from California, for
many reasons.

------
holdenc
I would really like a way to move my Linode out of the Fremont data center.
My Linode in Newark, New Jersey has been perfect for three years.

~~~
coyled
You can: just open a support ticket requesting a migration. They'll queue it
up for you, then you click a button to migrate at your leisure.

------
tszming
Linode's SLA: <http://www.linode.com/faq.cfm#what-is-your-sla>

~~~
dholowiski
" 99.9% uptime, or your lost time is refunded back to your account" - unless
my math is wrong that's 7.2 hours a month (for a 30-day month). Most SLAs are
terrifying if you do the math.

~~~
AlexMuir
It's 0.72 hours a month.

~~~
detst
Right. This is a nice little problem that can be easily checked by Google:

Search: 0.1% of 1 month in hours

Result: 0.1% of (1 month) = 0.730484398 hours
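
The same arithmetic can be checked in a few lines of Python. This is only a minimal sketch: the function name `allowed_downtime_hours` is made up for illustration, and the "average month" is assumed to be one twelfth of a 365.242199-day year, which appears to be what reproduces the figure above.

```python
HOURS_PER_DAY = 24


def allowed_downtime_hours(sla_pct, days_in_month):
    """Downtime budget, in hours, that an uptime SLA leaves per month."""
    return (1 - sla_pct / 100) * days_in_month * HOURS_PER_DAY


# A flat 30-day month, as in the comments above.
thirty_day = allowed_downtime_hours(99.9, 30)

# ASSUMPTION: an "average" month is 1/12 of a 365.242199-day year.
average_month = allowed_downtime_hours(99.9, 365.242199 / 12)

print(f"30-day month:  {thirty_day:.2f} hours")
print(f"average month: {average_month:.4f} hours")
```

Either way, a 99.9% SLA leaves roughly 0.72 to 0.73 hours, i.e. about 43 to 44 minutes, of permitted downtime per month.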

------
ddemchuk
To say this is annoying is an understatement.

