

Heroku is down - jackmoore
http://www.heroku.com/?test

======
bdesimone
When will heroku have a multizone offering?

I know most sites hosted on heroku are pet projects with no real need for
that type of uptime, but this type of downtime makes it impossible to use them
as enterprise customers.

~~~
chao-
Agreed. We are finally scaling from "small scale production" to having actual
customers we want to take care of. Worse, we began a huge marketing push this
week and had been seeing promising results. Now we'll probably miss a boatload
of potential customers.

Heroku had been a joy to work with until now. I ask of you all (as most of you
are far more experienced in these matters than I), what is the traditional
practice to mitigate this sort of risk? Paying for hosting with two separate
companies?

EDIT: I understand the benefits of cloud hosting over hiring a sysadmin. At
the same time, I'm interested to learn about what possible solutions there
are. I'm at the point where I don't know what I don't know, and even the name
of a topic or technology would be a huge help.

~~~
kennystone
Heroku is really good. You will probably have more downtime than them unless
you hire a whole team of operations and admins and pay a lot of money for
hosting. It may also take a while to build out the deployment and development
tools they have.

~~~
wheels
Honestly, that's pious bullshit. Every time Heroku gets mentioned, people keep
saying that it'd do better on availability than you would yourself, but you'd
have to be borderline incompetent to rack up as much downtime on a more
traditional VPS / hardware host as the combined Heroku / AWS downtime.

We moved all but one of our Rails apps off of Heroku precisely because of the
frequent downtime -- or, rather, that was the last straw; there were other
issues, notably the difficulty in debugging production issues, that had us
already debating such. Heroku has gotten somewhat better, but it's still down
far more than anything else that we use. (And we have services spread across
Linode, Rackspace, AWS and the mentioned one app on Heroku.)

You'll have better uptime in most cases with a standard nginx / passenger
setup on a $20 VPS than you will with Heroku.

~~~
ricardobeat
They are at 99.97% uptime. It's not that easy to get to even 99.5%, just the
flakiness of a standard uplink will put you below that.

~~~
wheels
Huh? The flakiness of a standard uplink will have you down for 44 hours a
year? On what planet? With our non-AWS hosting providers we tend to see 2-4
hours of network issues _per year_. Heroku's uptime also hasn't historically
been anywhere near 99.97%. They were down for several _days_ last year in The
Great AWS Failure.
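
For reference, the arithmetic behind the figures traded above, as a minimal Python sketch. It assumes a 365-day year, and the function name is made up for illustration:

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into hours of downtime per year."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


# The two percentages quoted in this exchange.
for pct in (99.5, 99.97):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime -> {hours:.1f} hours down per year")
```

So 99.5% works out to roughly 44 hours a year, while 99.97% allows under 3 hours, which is in the same range as the 2-4 hours of network issues mentioned above.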

~~~
ricardobeat
This is in the context of hosting it yourself, not using a different
large-scale business provider.

~~~
rdl
I don't think anyone ever considers hosting it on a desktop in the closet on
DSL at home/office.

The competition for something like Heroku is EC2, VPS, or dedicated servers,
in commercial colocation facilities. A good hosting facility is going to be a
lot closer to 99.995% uptime for network and power to the box, but you can of
course screw up past that point on your own.

------
kitsune_
We've recently begun to move our entire infrastructure to external providers
(vps and cloud) because we felt that we couldn't guarantee a sufficiently high
level of quality and had to devote too much time and money to operations.

In over four months we've probably already had more downtime than in the past
5 years. Despite paying quite a lot for "redundant", clustered offerings.
There was one nasty bug in the hypervisor that killed the entirety of one of
our providers' vps infrastructure - for over a day.

Our experience with "the cloud" was a little bit better, but we aren't
entirely satisfied.

I don't know, maybe we should just order services from three different
providers and mirror our applications ourselves as a fail-over mechanism.

~~~
neilmiddleton
Heroku status is currently showing production uptime as 99.97%. That's not too
shabby, and definitely not as bad as you make out.

~~~
jsprinkles
He implied that Heroku is not his sole provider. I'm skeptical of that figure,
too, as it's probably something on the Heroku sales site (99.7% to the 30-byte
health check!), or it's manually updated, in which case it's probably rounded.

------
Cushman
They went down literally the moment our demo thing started. I swear this is
because of me.

~~~
jszielenski
Same here! This happened last year too! Annoying.

------
g0atbutt
As a newbie, I've really enjoyed deploying on Heroku. However these outages
really terrify me. Is there an easy way to host a redundant version of your
app outside of Heroku?

~~~
patio11
Ask a simple question, get a simple answer: No. This is no knock on Heroku,
either. No hosting solution can make that easy: it is an enterprise
requirement which implies six figures of investment and a dedicated ops team
with no newbies on it.

~~~
jsprinkles
I'm sorry, but that's completely untrue. I can't speak for using Heroku
specifically, but in the end hosting is just running an app. It's fairly
trivial to make an app multi-provider these days through a plethora of
methods. DNS and a low TTL is on the easiest-to-approach end if your app is
designed for it and aware of the complications, which implies thinking about
your database and other supporting architecture from the perspective of a
multihomed setup. If your app is neither idempotent nor designed to be multihomed,
it'll be harder, but six figures of investment is _insanely_ high even in that
awful case.

Perspective: I could throw an app on Rackspace and Amazon, with a replicated
database, for under $200, in about a day.
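
A rough idea of what the "DNS and a low TTL" end of that spectrum involves, as a hedged Python sketch rather than a tested recipe. It only covers the decision of which provider should be live. The function names and example URLs are hypothetical, and the actual DNS record update is left out because it depends entirely on the DNS provider's API:

```python
import urllib.request
from typing import Callable, Optional, Sequence


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError (4xx/5xx) and socket timeouts are OSError subclasses.
        return False


def choose_active(
    health_urls: Sequence[str],
    is_healthy: Callable[[str], bool] = http_ok,
) -> Optional[str]:
    """Return the first healthy endpoint in priority order, or None if all fail."""
    for url in health_urls:
        if is_healthy(url):
            return url
    return None


# Hypothetical usage, run from a machine outside both providers:
#   active = choose_active([
#       "https://app-on-provider-a.example.com/health",
#       "https://app-on-provider-b.example.com/health",
#   ])
#   ...then point the low-TTL DNS record at `active` via your DNS provider.
```

This says nothing about the harder part raised above, namely keeping the database and other state consistent across both providers, and clients may keep using a cached DNS answer until the TTL expires.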

~~~
joevandyk
One of the problems with using multiple providers is you can't use any of the
specific features of a provider.

For example, I really like AWS's security groups and ELBs. Those serve as my
firewall and my load balancer and SSL terminator.

Replicating the application to another service means configuring and testing
all that on my own.

If I use heroku and use their logging system, then replicating it to another
provider means I need to be an rsyslog expert.

I don't really want to be an expert on rsyslog, postgresql configuration,
floating IPs for HA LB, the best IO scheduler for file systems, etc. As
someone who is in charge of all the sysadmin duties, and is solely responsible
for writing all the business and db logic for several e-commerce sites, I want
to spend my time on writing code. Not fucking around with figuring out the
syntax for iptables.

~~~
rdl
Ultimately this is why it would be nice for the code/configuration to be
independent of the operations. I agree it is unreasonable to expect a random
developer to build AND MAINTAIN the entire stack for every project. PaaS makes
a lot of sense, especially as a starting place.

The solution is either to have a PaaS provider who ruthlessly eliminates
single points of failure (there isn't one, currently), or use some
standardized software system which can be operated by multiple independent
operators with nothing shared. Unfortunately, the only vendor-independent
infrastructure is the physical server, various forms of VPS, etc. -- it's all
at the IaaS level. As far as I know there's no PaaS type thing with a common
interface which arbitrary providers can operate, with some kind of marketplace
for users to pick operators independently from the technology.

~~~
T-hawk
_The solution is either to have a PaaS provider who ruthlessly eliminates
single points of failure (there isn't one, currently)_

This isn't possible by definition, right? The PaaS provider itself becomes the
single point of failure.

Rings a bit like a "Who created God?" argument. "What single entity can I use
to defend against failures by a single entity?"

~~~
rdl
Theoretically, sure, but it's an engineering and economic thing.

You can mitigate specific risks, and you try to prioritize those based on
cost, frequency, and severity. If there were a great redundant provider with
good authentication on accounts, a strong balance sheet and business, and sane
policies on managing accounts, you would be fairly safe using just that
provider. After all, you could always get a court order to cease providing
services, yourselves, like if you do something some troll has patented. It
kind of depends on your application, too -- if I were doing a wikileaks, a
bitcoin exchange or torrent site or some other legally at risk business, I'd
want country-level separation across multiple providers, at least as a cold
backup. Casual game for facebook or mobile, not really much of a concern.

------
switz
If status.heroku goes down due to 500 errors (overloading) use
<https://status-old.heroku.com/>

~~~
riffraff
the incident specific page is up often when the main page is not, in this case

<https://status.heroku.com/incidents/372>

------
grandalf
RedHat was timing its blog postings in anticipation of this:

<https://openshift.redhat.com/community/blogs/new-openshift-release-june-7-2012-instant-apps-new-windows-client-and-more>

~~~
benmccann
There's no big announcement there that I see. They release a blog post like
this at least once a week.

------
bretthoerner
<http://www.whoownsmyavailability.com/>

~~~
lemieux
What is it supposed to do?

~~~
yourcelf
I think it's similar to <http://isitchristmas.com/> . It's a simple answer
that's always the same, to drive the point home: you are always the one
responsible for your uptime, no matter whether you choose dedicated hosting,
the cloud, your own closet, etc. You can't outsource responsibility.

------
ralphleon
It's so scary to think that our entire startup almost died with this glitch.
Our enterprise customers use our product in the mornings... this happened at
9-fucking-am right during our peak hours.

Obviously it's our fault for not having a redundant system / for trusting
heroku. Though I still can't help but be a little pissed as I email 100 people
about how sorry I am for their service disruption. Heroku didn't email me.

~~~
joevandyk
You can subscribe to notifications on <http://status.heroku.com>.

~~~
ralphleon
Thanks so much! Though I can't imagine anyone who wouldn't want these
notifications...

------
mikejarema
Interesting to see that www.heroku.com runs on their own platform, and is just
as susceptible to platform downtime.

On the other hand, it's surprising to me that when heroku goes down, it goes
down _hard_, namely that all hosted sites including their own are
unavailable.

~~~
mjackson
Totally agree with this assessment. I don't know how their routing mesh works
exactly, but it seems that they're coupled too closely. Either everybody is
mostly working fine, or everybody's down.

------
bitsweet
What would really help in these situations is a failover to our maintenance
pages or some other static page we could provide instead of showing the Heroku
"Application Error".

This is probably harder than it seems, especially when the outage is related
to their routing infrastructure.

~~~
mcs
Well, there is an option to have error pages on your heroku domains, but that
is probably dependent on at least a minimum level of the routing layer
working.

<https://devcenter.heroku.com/articles/error-pages>

~~~
bitsweet
Yeah, I have that set up but it's not being used at the moment with this outage
- if full redundancy with multiple regions is a huge challenge, perhaps some
interim basic routing redundancy would go a long way so we could at least
display a branded error page during an outage.

------
garindra
Status update on <https://status.heroku.com/incidents/372> :

"We have confirmed widespread errors on the platform. Our engineers are
continuing to investigate."

~~~
jt2190
Also can follow @herokustatus on twitter:
<https://twitter.com/#!/herokustatus>

Edit: Also IRC #heroku

Edit 2: heh... I just noticed the "subscribe to notifications" link on the
incident page: <https://status.heroku.com/incidents/372>

------
andQ
<https://status-old.heroku.com/> seems to work

------
instakill
What's crap about these outages is that custom 500 error pages that you've
created aren't shown.

------
joshcrews
I checked 5 sites, all single dyno: 3 down, 2 up

~~~
joshcrews
update: all 5 down

~~~
bad_user
I have 4 apps on 20 paid dynos. All of them down.

------
splatcollision
Heroku: Awesome when it's up, but when it's down, it's really really down...

------
grandalf
Heroku is a great platform and has come a really long way -- but it could
really use some competition. Some of the main issues that would be resolved
quickly with competition are:

\- it would be possible to have the non-DB part of your heroku app be spread
across multiple availability zones.

\- worker pricing would be much lower or based on actual CPU cycles used.

\- add-on providers would be vetted more thoroughly before getting a spot in
the add-on store.

\- they'd reopen the #heroku irc channel for informal support

~~~
ricardobeat
you mean like AWS, EngineYard, appfog, dotCloud?

~~~
bdesimone
IaaS != PaaS

~~~
ricardobeat
Amazon excluded, those are PaaS.

------
fragsworth
Eventually, they need to offer multiple independent regions, so if you're
inclined (like Netflix does with AWS) you can develop a way to fail-over to
another region.

At least then, it won't be "Heroku is down", it will be "Heroku West is down"
or some such thing.

~~~
cardmagic
<http://AppFog.com> has multiple regions today: Ireland, Singapore and US,
adding Rackspace and HP soon. All free and backed by CloudFoundry

------
AntonTrollback
<http://www.nooooooooooooooo.com/>

------
jboggan
What's their usual outage duration? I'm supposed to demo a project in a few
hours.

~~~
wilfra
They keep a historical archive of incidents at
<https://status.heroku.com/past>

Last time I can recall this happening to us it was down for several hours.

~~~
jboggan
502 Bad Gateway . . . is there a page that displays the status of the status
page?

------
believeUme
Goes to show that the five nines is a bunch of crap. More like two nines.

~~~
patrickgzill
5 nines is 5 minutes of downtime per year. 4 nines is 52 minutes of downtime
per year.

See:
<http://en.wikipedia.org/wiki/High_availability#Percentage_calculation>
for a handy chart and follow along with any of your favorite *aaS providers!
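
To follow along without the chart, here is a small Python sketch that goes the other way: from minutes of downtime in a year to the availability achieved and the number of nines. It assumes a 365-day year, and the function names are illustrative only:

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525600, ignoring leap years


def availability_percent(downtime_minutes: float) -> float:
    """Yearly availability, in percent, for a given amount of downtime."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


def nines(downtime_minutes: float) -> int:
    """Count the whole nines of yearly availability (99.9% is three nines)."""
    if downtime_minutes <= 0:
        raise ValueError("downtime must be positive to count nines")
    unavailability = downtime_minutes / MINUTES_PER_YEAR
    return math.floor(-math.log10(unavailability))


# 5 minutes, 52 minutes, and a hypothetical two full days of outage.
for minutes in (5, 52, 2 * 24 * 60):
    pct = availability_percent(minutes)
    print(f"{minutes} min down -> {pct:.4f}% ({nines(minutes)} nines)")
```

Under these assumptions, two full days of outage in a year lands at about 99.45%, which is indeed only two nines.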

------
freditup
What kind of service guarantees does Heroku have?

------
RoyceFullerton
Confirmed. One of my apps is down. I hope they resolve it soon, but I don't
pay anything at the moment so I am not too upset.

------
zsherman
My apps are finally back up, how's everyone else doing?

~~~
kunalmodi
things seem to be working for me, but Heroku seems a little off still. I got
an error when I tried to view my app logs though, which is a little scary

------
pardner
still getting sporadic errors

------
namm
my demo site is still down.

------
asparagui
and it's back up....

just a glitch in the matrix. it occurs when they switch over local control.

~~~
wilfra
We are still down: <http://warsocial.com>

Edit: Came back for a second. Went down again. Came back. Went down.

------
trebuch3t
I was going to get details from status.heroku.com but that's down too.

~~~
pilap82
it's up sometimes :)

Potential Platform Issues 5m+ We have confirmed widespread errors on the
platform. Our engineers are continuing to investigate.

this is the message for Production and Development

