

Amazon cloud outage takes down Netflix, Instagram, Pinterest, & more - amnigos
http://venturebeat.com/2012/06/29/amazon-outage-netflix-instagram-pinterest/

======
meiji
From the article - "The outage underscores the vulnerabilities of depending on
the public cloud versus using your own data centers."

No it doesn't. It underscores the vulnerabilities of not understanding your
hosting and accepting the "no outages" slogans of ANY cloud. A single data
centre is always susceptible to outages like this, it doesn't matter who owns
it. If any of those sites had owned a single data centre that was hit by storm
damage, the impact would be the same. I know this is supposed to be the year
of the cloud backlash but even so...

~~~
adrianpike
I thought both Heroku & Netflix had fairly robust multi-AZ deployments, so I'm
hoping they share any of their learnings from this outage.

Either way, that quote is ridiculous.

~~~
datasage
Netflix does, but considering the time, running out of capacity after losing
a zone might have been more of an issue.
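
To put rough numbers on the capacity point, here is a minimal sketch. It assumes traffic redistributes evenly across the surviving zones, which real load balancing may not do, and the function names are made up for illustration.

```python
def max_safe_utilization(zones: int, zones_lost: int = 1) -> float:
    """Highest steady-state utilization per zone that still fits after a loss."""
    if not 0 <= zones_lost < zones:
        raise ValueError("zones_lost must be at least 0 and less than zones")
    return (zones - zones_lost) / zones


def utilization_after_loss(utilization: float, zones: int, zones_lost: int = 1) -> float:
    """Per-zone utilization once the survivors absorb the lost zones' load."""
    if not 0 <= zones_lost < zones:
        raise ValueError("zones_lost must be at least 0 and less than zones")
    # Total load is utilization * zones; it now lands on fewer zones.
    return utilization * zones / (zones - zones_lost)


if __name__ == "__main__":
    for zones in (2, 3, 4):
        print(zones, "zones: keep each under", round(max_safe_utilization(zones), 2))
    print("3 zones at 80% each, one lost:", round(utilization_after_loss(0.80, 3), 2))
```

Under that assumption, a three-zone deployment has to idle at roughly two thirds or less per zone to survive losing one. Running at 80% during peak hours would ask the two survivors for 120%.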

~~~
adrianpike
Definitely a possibility - I was actually watching Netflix when my phone
started rattling with all the alerts.

I was able to finish out the episode, so their CDN was working for the actual
media, but everything else was dead for me.

Another useless anecdote: A coworker was watching on his xbox, and it
apparently cut mid-stream for him.

------
Homunculiheaded
It's funny, the more I think about it, the more I think this is actually a
good reason to host on the cloud. From a technical standpoint it's terrifying
to see all these big players down at once. But what the average user likely
sees is "something is wrong with the internet". So rather than seeing that
your site X is down and being angry with you, users are likely to think "well
instagram is also down, oh and so is netflix, something big must be broken,
I'll check back later", the same way users don't blame you if the power goes
out.

~~~
nutjob123
Interesting thought. A couple of users may be empathetic because the actual
problem is somewhat visible, but I'm not sure that is a real benefit. It is
of course a negative perception when they see that youtube is up and then
perceive all the down sites as being less technically competent.

------
lubos
It's interesting how the AWS outage didn't take down Amazon.com.

~~~
heretohelp
That would be because Amazon.com doesn't use AWS.

~~~
zhoutong
No, it doesn't. Even the name servers of Amazon.com belong to UltraDNS and
Dynect, instead of their own Route 53.

~~~
sandfox
Route53 uses UltraDNS.

~~~
zhoutong
No, that's wrong.

> "I also wanted to clarify that Route 53 is an Amazon-built and operated
> service. It is not a re-branding of a third party DNS service. Over time
> you'll see various parts of Amazon move over to use Route 53."

[https://forums.aws.amazon.com/message.jspa?messageID=209251#...](https://forums.aws.amazon.com/message.jspa?messageID=209251#209251)

------
poppysan
If you are hosting on a server (as everyone is), it will, at some point, fail.
You have to choose a service that has minimal failure combined with quick
resolution times. I think AWS fits this description...

~~~
dangrossman
AWS has had far more failures than my servers at any data center ever have.
Running in 'the cloud', you're taking all the unavoidable points of failure
(power, network, hardware) and adding in a bunch of proprietary ones (all the
software that manages EC2, EBS, ELBs, internal routing between them, etc) that
have all failed spectacularly at least once already with hours- to days-long
resolution times.

~~~
ibejoeb
Yes, risk still exists, and the risk profile shifts a little, but I find it
to be a shift toward the better. Here's an anecdote:

I run applications on EC2 and RDS. I'm using Oracle. AWS has recently
introduced Multi-AZ Oracle, but I haven't enabled it yet. Before it was available,
though, I set up a poor-man's procedure that consists of running data exports
and dropping them on S3.

Now, when everything went to hell in the east, I lost an RDS instance. I
couldn't do point-in-time restore, and I couldn't snapshot (both are still
pending since 7 AM or so).

Luckily, I was able to spin up an RDS instance in the west, pull down the
latest data from S3, and do an import. I repointed my apps at the new
database, and now I'm back up.

The process took about 45 minutes. Setting up the backup scripts took about 20
minutes about 2 years ago. Now I'm just sitting on my hands waiting for the
AWS ops team to fix everything. This is work I'd normally be scrambling to do
myself. I'm quite happy to let those talented folks deal with it. When it's
all back up and running, I'll check integrity and consistency, and I might
have to restore some interim data, but for now I'm operational.

I'm sure there are worse scenarios, but the major outage last year and the
one in the past 24 hours were both quite easily mitigated.

There's something to be said for being part of a giant machine. AWS really is
utility computing, so even the small guys get the benefit by virtue of
standing next to the big guys.
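
For anyone who wants the shape of the export-and-restore procedure above, here is a rough sketch of just the decision logic: pick the newest export, then pick the first region that is still healthy. It makes no AWS calls. The `Export` type, the function names, the keys, and the region labels are all hypothetical, and the actual export, upload, and import steps are left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, Sequence, Set


@dataclass(frozen=True)
class Export:
    key: str            # object name of the dump in the backup bucket (made up)
    taken_at: datetime  # when the export finished


def latest_export(exports: Iterable[Export]) -> Export:
    """Return the most recent export, or fail loudly if there is none."""
    candidates = list(exports)
    if not candidates:
        raise LookupError("no exports available to restore from")
    return max(candidates, key=lambda e: e.taken_at)


def pick_region(preferred: Sequence[str], healthy: Set[str]) -> str:
    """Return the first preferred region that is currently healthy."""
    for region in preferred:
        if region in healthy:
            return region
    raise RuntimeError("no healthy region available")


def restore_plan(exports: Iterable[Export], preferred: Sequence[str],
                 healthy: Set[str]) -> Dict[str, str]:
    """Decide where to bring up a new database and which dump to import."""
    return {
        "region": pick_region(preferred, healthy),
        "import_key": latest_export(exports).key,
    }


if __name__ == "__main__":
    exports = [
        Export("exports/app-0628.dmp", datetime(2012, 6, 28, 23, 0)),
        Export("exports/app-0629.dmp", datetime(2012, 6, 29, 23, 0)),
    ]
    # "east" is down, so the plan falls through to "west".
    print(restore_plan(exports, preferred=["east", "west"], healthy={"west"}))
```

The point of keeping this separate from the provider calls is that the plan can be tested on a laptop long before the day it is needed.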

------
cocoflunchy
Instagram is still down! However Netflix & Pinterest seem to be back.

------
azarias
Google, MS, Rackspace etc. ought to take a good look at all the middle layer
libraries like boto, and support them to make it a matter of configuration to
switch cloud service providers.

This already works well for email providers.
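
A minimal sketch of what "a matter of configuration" could look like, assuming a small storage interface that each provider implements. This is not boto's API: `BlobStore`, `InMemoryStore`, `PROVIDERS`, and `store_from_config` are hypothetical names, and the in-memory class is only a stand-in for real provider backends.

```python
from typing import Callable, Dict, Protocol


class BlobStore(Protocol):
    """The only surface application code is allowed to depend on."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in backend; a real one would wrap a provider's client library."""

    def __init__(self, provider: str) -> None:
        self.provider = provider
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


# Registry mapping a config value to a backend factory (names are illustrative).
PROVIDERS: Dict[str, Callable[[], BlobStore]] = {
    "aws": lambda: InMemoryStore("aws"),
    "rackspace": lambda: InMemoryStore("rackspace"),
}


def store_from_config(config: Dict[str, str]) -> BlobStore:
    """Build the backend named in config["provider"]."""
    name = config["provider"]
    try:
        factory = PROVIDERS[name]
    except KeyError:
        raise ValueError(f"unknown provider: {name!r}") from None
    return factory()


if __name__ == "__main__":
    store = store_from_config({"provider": "rackspace"})
    store.put("hello.txt", b"hi")
    print(store.get("hello.txt"))
```

Switching providers then means changing one config value, at least for the features every provider has in common. Anything provider-specific still leaks through.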

------
Nux
Here's to keeping all the eggs in one basket!

------
samuel1604
Just use rackspace already, they hardly ever have an outage.

------
philip1209
If Pinterest is down, then there may be a net gain for the internet.

