

Heroku down - tferris
https://status.heroku.com/

======
jaryd
Let's hope the simplicity of 'git push heroku' hasn't distracted most
developers (and customers of Heroku) from Werner Vogels' core tenet of
administration:

Everything fails all the time.
([http://technocation.org/files/videos/original/mysqlconf2008/...](http://technocation.org/files/videos/original/mysqlconf2008/2008_04_15_amazonKeynote.wmv))

This might separate the men from the boys (excuse the idiom) to see who has
adequately prepared a DR/redundancy plan for when the cloud fails.

Edit: Improved link to Vogels' MySQL keynote from which the quote originated.

~~~
dasil003
Personally I don't think it makes sense to build in redundancy if you're using
Heroku, maybe a fallback to a simple offsite "we're down page" but that's it.

My reasoning is that if you are on Heroku you should be small and leveraging
Heroku's management to the fullest. If you are big you are paying an
incredible premium to be on Heroku (thought experiment: when does the premium
start costing more than a full-time systems engineer?). There is no Heroku
equivalent from another company, at least nothing that's really turnkey in the
same way, so if you build redundancy you're throwing away the benefit of
Heroku's PaaS.

So in my mind if you need redundancy you need to be building your own system
stack. Whether it's co-located, dedicated, or cloud-based is a separate issue
because it's more or less commoditized. You build your system on a portable
platform and then redundancy is just a proportional amount of extra work.

~~~
jaryd
Thanks for the thoughtful reply. I think that you raise some valid points
here.

Regarding "If you are big you are paying an incredible premium to be on
Heroku": Sure, Heroku is great (or cheap) for development and small-time
deployment, however it is likely true that people get locked into it and
(hopefully) grow quickly before they can scale away from a PaaS solution. In
these cases, it is imperative to consider the marginal cost of
alternative/off-site failover vs. the cost of downtime. This is what I meant
by "separate the men from the boys". Those guys who are paying the incredible
premium to continue to host on Heroku likely -know- that they are doing so,
and have made that decision consciously.

While the situation is different, I am reminded of when AWS went down a while
back and the sites that had automated failover to alternative clouds really
stood out.

The bottom line is that no one solution is ever enough. Tiny apps can likely
get away with unexpected/unplanned downtime, however it ultimately
delegitimizes them in the eyes of their users (read: customers), and that is
universally bad.

~~~
dasil003
Sure, but my point is that if you determine you need 100% uptime and you are
currently on Heroku, you are going to need to build a complete non-Heroku
system to fail over to. At that point should you A) pick another high level
PaaS provider and maintain two very different stacks, each of which is
commanding a high premium, or B) build on a standard Linux stack that can be
deployed to multiple completely separate clouds?

~~~
rdl
Or C) Heroku should develop and deploy fully independent availability zones
and regions. Amazon hasn't yet done a great job of keeping AZs and even regions
independent but better connected than separate providers, but maybe Heroku
could do better (even if built on EC2).

~~~
dasil003
That would be a valuable service for Heroku to provide, but being better
connected also implies a higher likelihood of joint failure. Even if Heroku
does everything right, there is stampede failure risk (which is what happened
in the last major EC2 cross-zone outage). In any case, I think if you are
seriously pursuing 100% uptime, you can't outsource your redundancy.

~~~
rdl
"better connected" could just be a billing thing; they make sure there is
great transport (their own, transit providers, whatever) between AZs running
in separate ASes, and don't bill for inter-AZ traffic.

------
jbeynon
If a Heroku application's DNS is set up correctly then you can usually avoid
problems like this on the new Heroku stack (CEDAR). It's important to reread
the documentation on using custom domains but also be _VERY_ aware of using
naked domains (mysite.com) in your application. This isn't just a Heroku
problem, it's a DNS problem. Heroku have recently had a spate of routing
issues (usually DDoS attacks) which would have been largely negated by correct
DNS setup and either avoiding naked domains or using a DNS host that lets you
cname a naked domain to a host.

~~~
kilburn
Could you please elaborate on why exactly using (non-CNAMEd) "naked domains"
is a problem? What do "naked domains" and DDoS attacks have to do with each
other?

~~~
glenngillen
DNS only allows you to put IP addresses on the apex (read: bare/naked/etc)
domain A record. That means in the case of a DDoS or other problem affecting
one of those IPs you run the risk of having degraded redundancy. Some people
unfortunately only put one IP in, which means you're only one machine away
from going offline. A long way from what you'd want from the cloud.

DNSimple have created an ALIAS record type which gets around this problem
nicely, and Route 53 from Amazon takes a similar approach.
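
To make the single-IP risk concrete, here is a minimal sketch that counts the distinct IPv4 addresses a name resolves to. The function names and the `example.com` domain are made up for illustration, and the result only reflects what your local resolver returns at that moment, so treat it as a rough check rather than a full audit of a DNS zone.

```python
import socket


def apex_a_records(domain, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses that `domain` resolves to.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so the
    function can be exercised without network access.
    """
    infos = resolver(domain, 80, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


def redundancy_note(addresses):
    """Describe how much redundancy the list of A-record addresses gives."""
    if not addresses:
        return "no A records found"
    if len(addresses) == 1:
        return "single A record: one machine away from going offline"
    return f"{len(addresses)} A records: some redundancy at the apex"


if __name__ == "__main__":
    # "example.com" is a placeholder; substitute your own naked domain.
    ips = apex_a_records("example.com")
    print(ips, "->", redundancy_note(ips))
```

If this reports a single address for your naked domain, that is the "one machine away" situation described above.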

------
purephase
Interesting. I was just considering them over AWS for hosting. Current users,
how reliable is Heroku? While I appreciate openness, looking over that status
page does not inspire a lot of confidence.

~~~
robin_reala
Just checked: 99.94% for our main site, 99.93% for a promo site.
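
For a sense of scale, a quick sketch converting those percentages into hours of downtime. It assumes the figures are annualized over a 365-day year, which the numbers above do not actually state, so the output is only a rough translation.

```python
# Assumption: the uptime figure covers a full 365-day year.
HOURS_PER_YEAR = 365 * 24


def downtime_hours_per_year(uptime_percent):
    """Convert an uptime percentage into hours of downtime over one year."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


for pct in (99.94, 99.93):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime is roughly {hours:.1f} hours of downtime per year")
```

Under that assumption the two sites come out at about 5.3 and 6.1 hours of downtime a year.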

~~~
purephase
That's great. Better than our current provider. Thanks.

------
ChrisAnn
Excuse me, but why is this top of Hacker News?

Services go down sometimes, especially cloud ones. No need to make it harder
for them by voting it to the top of a major tech news site...

 _edit - spelling_

~~~
rdl
10-20% of apps developed by people on Hacker News probably use Heroku. It's
not just a random service people use; if there is an outage, service operators
will need to field end user inquiries, restore service, etc.

hn is probably not a replacement for a good offsite monitoring service, but
for me, I was browsing "why does my friend's new app not work; is he updating
it live?", switched tabs to hn, saw "heroku down", and all was clear.

~~~
melling
I think 90% of statistics are made up on the spot.

~~~
wyuenho
Around 18% of the people who responded to the poll said they are hosting on a
PaaS, which I assume is mostly Heroku.

<https://news.ycombinator.com/item?id=3466168>

~~~
melling
I did not respond but I use GAE. I think I felt better when I knew the number
was just made up.

~~~
rdl
I was going by other private surveys of specific subsets of the hn population.
Number of companies, it's 10-20%. Lots of smaller companies or earlier stage;
basically everyone who does Rails deployments (since that was originally
Heroku's strength).

GAE is the one very few people are using or want to continue using, especially
after the Google price hike.

I still prefer colo/managed hosting at scale, with EC2 for trials or surge
capacity, myself.

------
aen
The status page (<https://status.heroku.com/>) is also really slow now.

~~~
alexchamberlain
Probably not designed for the HN effect.

~~~
StavrosK
It's probably not running on Heroku (imagine the irony).

~~~
alexchamberlain
It's generally good practice to run your monitoring website from a different
datacenter (or at least a different server).

~~~
StavrosK
I meant "the irony of running a status page that's reporting when a platform
is down on the platform itself", not the irony that heroku isn't running its
status page on itself.

------
tferris
> ISSUE: We're experiencing a widespread outage affecting our HTTP routing

doesn't sound good

------
latchkey
I was wondering why <http://intercom.io> was down and now I come here and see
this at the top. Must be it. Update, they seem back now, but their site was
throwing errors a minute ago.

------
dedene
They're up again?

~~~
dchmiel
We're running as well.

------
numbdemon
Some services are accessible now, but some are not.

------
instakill
It's up for me

