Tell HN: Heroku is Down (update: recovering as of 10PM PST) - timr
======
michaelfairley
The AWS status page[1] is showing problems for EC2 East as of a few minutes
ago. This is a more widespread issue.

EDIT: Various non-Heroku EC2-East-based sites (e.g. Quora) seem to be down as
well, lending more evidence to this being an EC2/EBS outage.

1: <http://status.aws.amazon.com/>

~~~
ajasmin
Out of curiosity do we know if <http://status.aws.amazon.com/> is hosted on
AWS?

~~~
sprout
It's a key design constraint of status.aws.amazon.com that it not depend on
any AWS services. (Or so I've heard.)

------
bdesimone
Dear Heroku -- I know it's my job to make sure my site is available (/thread).
However, I think I speak for most enterprise customers when I say I will throw
money at your company the second you come up with a multi-zone/highly-available
offering.

~~~
ctrand
I spoke to one of the guys from Heroku recently and he said they are working
on it but he couldn't give me a date.

It couldn't come soon enough for us, but also for Heroku as AppFog seems to
have its foundations built on a multi-zone/region/provider architecture.

------
zeeg
Everyone should take a page from the book of Netflix right now. It's pretty
embarrassing to be anyone that's entirely down and can't do a thing about it
due to an EC2 outage.

How do you explain to your customers/users/etc that you were down and have
absolutely no control of when you will be back online? How can you explain it
to yourself?

~~~
harryh
Because this shit is really hard especially when you're trying to build a
product at the same time.

It's not like you can just wake up one day and say "I'm gonna go build a fully
fault tolerant distributed system that works across multiple data centers!"
and then you're done by the time you go to sleep.

Go actually talk to some Netflix engineers. They'll tell you the same thing.

~~~
arohner
Yes, you're absolutely right. However, Netflix is distributed across multiple
AZs, while Heroku has spent the last two years after their $212MM acquisition
in the same AZ.

That makes it sound like Netflix has a more reliable platform than the PaaS
company.

------
tg3
Their uptime percentage is 99.97%, but I'm having a hard time fighting the
recency effect that is telling me to get off the platform ASAP.
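
For scale, an uptime percentage can be converted into a downtime budget. This is a minimal sketch, assuming the 99.97% figure is measured over a 365-day year or a 30-day month (the measurement window is not stated above, and `downtime_budget_hours` is a hypothetical helper):

```python
def downtime_budget_hours(uptime_percent: float, period_hours: float) -> float:
    """Hours of downtime implied by an uptime percentage over a period."""
    return (1.0 - uptime_percent / 100.0) * period_hours

HOURS_PER_YEAR = 365 * 24   # assumed window: 8760 hours
HOURS_PER_MONTH = 30 * 24   # assumed window: 720 hours

yearly = downtime_budget_hours(99.97, HOURS_PER_YEAR)
monthly = downtime_budget_hours(99.97, HOURS_PER_MONTH)
print(f"per year:  {yearly:.2f} h")
print(f"per month: {monthly * 60:.1f} min")
```

Under those assumptions, 99.97% works out to roughly 2.6 hours of downtime per year, so a single multi-hour outage can use up most of a year's budget.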

~~~
reuven
I just got up (it's morning in Israel), and a client of mine in the US with a
major, mission-critical application was screaming (rightly so) that things are
down.

We're already looking into alternatives -- perhaps not leaving Heroku
altogether, but certainly not depending on them 100 percent. There's no way
that we can entrust the business to something that can just catastrophically
fail at any moment. I've been running my own servers for years, and they've
never had such unpredictable issues.

I increasingly have to think that a few servers, on different providers, with
the application deployed via Capistrano, will be more fault-tolerant than
Heroku. At least, it seems that way right now.

~~~
tkaemming
> There's no way that we can entrust the business to something that can just
> catastrophically fail at any moment.

Anything, including service providers, can catastrophically fail at any
moment. Fault-tolerant architectures are based on redundancy (including
infrastructure provider redundancy, as you mention), not on "guaranteed" SLAs.

~~~
wmf
Provider redundancy goes against the concept of PaaS IMO (ignoring the sci-fi
future where there are multiple 100% compatible providers). Heroku needs to
become internally redundant to really live up to its promise.

------
learc83
I've had so much more downtime with heroku/AWS than I ever had back when I was
running my sites on slicehost.

I also feel like I've let my admin skills deteriorate because I've been
dependent on heroku. Back when I was running everything myself, worst case
scenario I could set up a new VPS from a backup in another datacenter. Now if
heroku goes out I just have to twiddle my thumbs while I wait for updates.

~~~
toast76
...or you can keep on working knowing that your services will be magically
back online without you lifting a finger.

~~~
zeeg
except for any potential data loss that just happened

------
davidjohnstone
Here's the Heroku status site: <https://status.heroku.com/>

------
jasongullickson
Seems like someone posted a link to a project that made self-hosting a
previously Heroku-hosted site simple, but I can't find it now...

...would be cool if there was a Linux package (or distro) that you could boot
up and then just change your git remote to and have your app up-and-running on
your own hardware.

~~~
zrail
I made Dokuen a few weeks ago, maybe that's what you're thinking of? If you're
willing to live with a few warts, it's working pretty well for my personal
use. My blog is still on Heroku so you can't really read about it now, but you
can check out the code.

<https://github.com/peterkeen/dokuen>

~~~
whalesalad
I saw this a while ago and will actually be using it very shortly to deploy
3-4 internal apps on our own mini cloud. I love Heroku and can't stand all of
the open source "alternatives" like Cloud Foundry. Yours is exactly what I
wanted and I can't wait to really start using it and contributing via github!
Thanks again!

------
whalesalad
If Heroku moved off of AWS they'd have better uptime and lower prices.

~~~
ceejayoz
That'd depend where they moved to, wouldn't it?

~~~
whalesalad
At their size they need to be running and managing their own hardware. I'd use
them more if it wasn't hundreds of dollars a month to host a few apps that
might not be up when my customers are.

------
atlasom
Can we get this title renamed to AWS outage as the problem is not Heroku.

~~~
biot
It may be an AWS outage as the cause, but ultimately it's Heroku's problem.
They're the ones touting:

    "Erosion-resistant architecture.

     Heroku takes full responsibility for your app's health,
     keeping it up and running through thick and thin..."

The thick happened today, something eroded, and people are holding them to
their word: full responsibility for their app's health. Heroku could do load
balancing between multiple independent providers rather than be solely
dependent on [one region of?] AWS.
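
As a rough illustration of the multi-provider idea, here is a minimal sketch of priority-order failover (simpler than true load balancing). The provider names and health checks are hypothetical placeholders, not anything Heroku or AWS actually exposes:

```python
from typing import Callable, List, Optional, Tuple

# A health check is any zero-argument callable returning True when healthy.
HealthCheck = Callable[[], bool]

def pick_backend(backends: List[Tuple[str, HealthCheck]]) -> Optional[str]:
    """Return the first backend whose health check passes, or None if all fail."""
    for name, is_healthy in backends:
        try:
            if is_healthy():
                return name
        except Exception:
            # A check that raises is treated as an unhealthy backend.
            continue
    return None

def failing_check() -> bool:
    # Stand-in for a provider whose region is currently unreachable.
    raise ConnectionError("simulated regional outage")

backends = [
    ("provider-a-us-east", failing_check),  # hypothetical primary
    ("provider-b", lambda: True),           # hypothetical secondary
]
chosen = pick_backend(backends)
print(chosen)
```

The routing decision is the easy part; keeping data consistent across providers is what makes this difficult in practice, and this sketch does not address it.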

------
itos84
Another update from Amazon: "9:55 PM PDT We have identified the issue and are
currently working to bring effected instances and volumes in the impacted
Availability Zone back online. We continue to see increased API error rates
and latencies in the US-East-1 Region." Been thinking that maybe most startups
are seeing that cloud computing is the most reliable way to go, but today I'm
reconsidering having another type of backup server. Just hope there is no data
loss in the apps.

------
structAnkit
Pocket also tweeted that they are having issues due to Amazon's issues.

<https://twitter.com/pocket/status/213481664670732288>

------
vegasbrianc
Was at a Heroku "crash course" last week where they claimed they learned from
their major outage in March/April. Here is the link to the videos from the
conference:
<http://zurichtechtalks.tumblr.com/post/24670375315/heroku-intro-crash-course-zhgeeks>

------
ceejayoz
One of mine went down, and I'm seeing folks on Twitter saying the same.
Definitely something going on.

EDIT: Came back up at 12:42 AM ET.

------
reustle
A few of my us-east-1d machines are down.

~~~
base698
One of mine is as well. The load balancers also seem to be haywire.

------
jordanthoms
Related: Right now when I try to cat /proc/mdstat or use mdadm to look at my
RAID status, it just hangs. Seems it's trying to contact the EBS volumes and
it's just failing. Any way to actually view my RAID status?

~~~
dangrossman
It's probably best not to muck around with the RAID right now when the drives
it thinks are there aren't actually there. If it were me, I wouldn't touch
anything until Amazon fixes itself.

~~~
jordanthoms
It's more a theoretical question than anything else - Say this outage had gone
on for days, I'd have needed to be able to see which volumes have failed and
drop them from the array. How can I do that when I can't view the RAID status?
I have these problems even if I purposely detach a volume to test.

------
alphex
And yet we continue to throw everything on AWS...

You know there are OTHER data centers, right?

~~~
ceejayoz
Other datacenters can go down, too. In many cases, the complexity of running
an application across different platforms (say, AWS + Rackspace Cloud) might
not be worth it.

------
alanh
cloud.engineyard.com isn’t loading either.

EDIT: I meant literally the EngineYard website at that address. Some
EngineYard websites were up and some were down, no doubt based on region.

~~~
rjsamson
I'm hosted on EngineYard and my app is up and running just fine.

~~~
ctrand
are they on us-east-1?

------
aaronbrethorst
Anyone know if Amazon Fresh is hosted on EC2? I had the worst connectivity and
performance issues with their site earlier today...and now all of my sites are
down.

------
jbermudez5
So is Parse.com, they just announced it is AWS related.

------
mschonfeld
At this point, the internet may as well be dead to me.

------
option_greek
My AWS instances on us-east are unreachable :(

------
ctrand
It seems to be the elastic load balancers on AWS, can't blame Heroku this
time.

My love hate relationship with Heroku continues...

~~~
ceejayoz
It's not just ELB.

~~~
ctrand
It looks like Amazon are worse at reporting their outages than Heroku...

~~~
ceejayoz
During the previous outage (which wasn't AWS related), Heroku's status page
was down entirely (among other things, it relied on static assets from
heroku.com), so I can't say I agree with that.

~~~
ctrand
Yep, but they saw that flaw and addressed it and it's ok for the time being.
Amazon's has been and still is crappy.

------
talos
knocked out some other stuff? gothamist/chicagoist/laist/all those other blogs
are out.

------
breck
We have an unreachable instance in us-east-1b but others in that region are
reachable

~~~
michaelfairley
FYI: Your 1b is not the same as other people's 1b:
<http://aws.amazon.com/ec2/faqs/#How_can_I_make_sure_that_I_am_in_the_same_Availability_Zone_as_another_developer>

Various software used to hardcode 1a, so 1a received disproportionate load.
Now, everyone's a-e is randomized among the "true" a-e, meaning that even if
everyone hardcodes 1a, the load will still be evenly distributed.
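
The effect of per-account randomization can be simulated in a few lines. This is only an illustrative sketch: the physical zone names, the seeding by account ID, and the `zone_mapping` helper are assumptions, not a description of how AWS actually assigns the mapping.

```python
import random
from collections import Counter

PHYSICAL_ZONES = ["zone-1", "zone-2", "zone-3", "zone-4", "zone-5"]  # hypothetical names
LETTERS = ["a", "b", "c", "d", "e"]

def zone_mapping(account_id: int) -> dict:
    """Return a stable per-account shuffle of letter -> physical zone."""
    rng = random.Random(account_id)  # seeded so an account always sees the same mapping
    shuffled = PHYSICAL_ZONES[:]
    rng.shuffle(shuffled)
    return dict(zip(LETTERS, shuffled))

# Every simulated account hardcodes "a", yet the physical load spreads out.
load = Counter(zone_mapping(account_id)["a"] for account_id in range(10_000))
for zone in PHYSICAL_ZONES:
    print(zone, load[zone])
```

Each physical zone should end up with roughly a fifth of the accounts, which is also why one person's "1b" tells you nothing about which physical zone another person's "1b" is.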

------
jszielenski
I want a credit on my Heroku account. Paying $71/month for shit like this is
stupid.

~~~
cmelbye
Maybe for the database service, but dynos and workers are paid for by the
hour, aren't they?

~~~
zeeg
All services are billed based on time usage (it's not hourly, it's much more
granular), including the database.

That said, an outage is still an outage.

------
esente
Not sure if it's related, but www.pythonanywhere.com is also down.

~~~
ceejayoz
Doesn't look like it, their IP is owned by a German company.

------
fookyong
Google searches for "migrating off heroku tutorial" just spiked.

------
iamandrus
I'm getting a "Request limit exceeded" error in my EC2 panel.

------
PabloOsinaga
parse.com also down

~~~
matkiros
I confirm this. Was working on an app prototype and when I reloaded the page
it couldn't find it.

------
damian2000
quora.com is down

------
cardmagic
This is why <http://AppFog.com/> is investing in multiple IaaS and is not
being hit nearly as hard.

------
narrator
42floors.com is down.

------
mschonfeld
#AWSpocalypse

~~~
gojomo
...the sequel.

