Rackspace Outage
18 points by suhail on Nov 3, 2009 | 28 comments
We're back up!


irc.freenode.com #slicehost - let's celebrate our down-time.

Alastair: We're currently experiencing a power outage in one of our datacenters, we're working as fast as possible to get the issue resolved and will update the status at status.rackspacecloud.com as soon as we have updates. I apologize for any inconvenience and thank you for your patience.

At least she didn't say "any inconvenience we may have caused you." Today is a rainy day in the cloud. :-'(

As of 12:35AM CST Rackspace Cloud engineers are seeing intermittent connectivity to our WC2 cluster in our Dallas - Fort Worth (DFW) data center. We are working to resolve the issue as quickly as possible and will update the status post accordingly.

If you have any questions or concerns please contact our support via live chat or at 1-877-934-0407 international +1.210.581.040.

UPDATE: As of 1:15am CST, Rackspace Cloud engineers are still working to address the current connectivity issues. We are making significant progress and we will post another update here shortly.

UPDATE: As of 1:30am CST, service has been restored to our WC2 cluster. We are going to continue to monitor the situation closely. Additional updates to follow.

It's as if Rackspace has lost its key value proposition. The first time this happened, many of us thought "oh, it's a one time thing." But now that so many of us pay more money for something that goes down every few months, it's starting to get annoying.

In some ways, I actually feel sorry for the team at Rackspace. This hurts them big time.

It's certainly disheartening. These have all been power-related so far. Given the amount of money they should have spent on power systems, there's no reason this should happen. I've been to Savvis' facilities and they know exactly when a battery in their UPSes starts to fail and how badly. Gensets get load tested regularly. Everything is fully redundant with instant failover. Yes, things can still go wrong, but they shouldn't be going wrong this often.

All sites are bound to go down sometime and sometimes

Yes, but when 100% uptime and fanatical support are key to your branding, even a little downtime undermines the expectations of current and prospective customers.

They set a high bar for themselves. By saying that they have the lowest downtime in the industry, more eyes are looking at them to see that they do as they say. That's what makes this such a nasty situation.

Just unfortunate this one was so soon after their last big outage (in July, iirc).

Slicehost had excellent service until they were acquired by Rackspace. Now they're on the second or third huge outage this year.

Rackspace copying Linode so hard they're even down-time compatible.

I have irate customers on the phone :-(

(I am not blaming Slicehost, but does "power outage" make any sense? Aren't there power backups, and presumably an alert system for when primary power goes down? If anyone has worked in or runs a hosting service, please enlighten me.)

No, a power outage never makes sense. All systems should be on online battery backup, with gas-powered backup generators for outages longer than 5 minutes. These systems should be tested periodically to make sure that they are all in good working order.

I think I'd rather give up 100% uptime in exchange for periodically tested systems and better reliability, as long as I know when my server MAY go down.

There is a Rackspace update; from reading it, this hit one of the fluke single points of failure in one of their power systems.

Considering how data centres are set up there isn't much they could have done to stop it :)

Looks like human error (during the servicing)

Well I guess they need that fanatical support.

Personally I would just say fire the support people and use the money to hire somebody that can actually deal with servers.

> Well I guess they need that fanatical support.

I'm not a Rackspace user :)

But yeh; someone is probably getting a tongue lashing. Human error is so easy; even for experienced engineers and especially with live electrical equipment... (been there, done that, got the scars)

Reminds me of Liquid Web, who copied Rackspace by calling it Heroic Support. If you need to be a fanatic or a hero, you're doing it wrong. I think I'll start a web host and call it "competent" support.

TC was down - http://www.techcrunch.com/2009/11/02/large-scale-downtime-at...

although I was a little more gentle with them on the phone than @arrington (check his stream)

He's a bit...harsh is putting it lightly.

After the article on Facebook game scams, he was seeming to move past his previous attitude. Now he just looks like a raving mad man. It's not Rackspace's fault if you left a single point of failure on your site by hosting in one datacenter. If you want better reliability, then pony up the cash. You're making enough from just one of those ads to do it.

I don't think you grok sarcasm

(ok so let me explain. we are running with the joke that it was scoble himself that took the servers down.)

All my sites are down too. I am not happy about this.

Ya my site's down too :( Statuses are being updated at:


We just invited a bunch of new users onto our site earlier tonight and I freaked out thinking they broke our server, until I saw this :)

This is probably a much more complex failure than just a genset not working or dead UPS batteries.

lol (to avoid crying) we just "launched" yesterday and we are down already: http://blog.split-the-bill.com/rackspace-cloud-is-down

Did you just link to an entry that's hosted on the rackspace cloud and inaccessible? Just me?

Actually Posterous seemed to be up but they were not processing emails (perhaps that bit is the bit hosted in the cloud?). I had to post via the web...

Ironic that we decided not to host our own blog so that we would have a working communications channel if our main site got taken out....

My slice managed to stay up... Phew. Good to see they're being vocal about it.

My slice's power source is hybrid, so once the electricity went down, it switched to the solar module. Unfortunately, it happened at night time, so this did not help much.

Everything breaks.

Personally it seems pretty good to get it fixed in an hour.

My demo is down too, hope this comes back soon. :(

