

Heroku is down for the third time today - Janteh
http://status.heroku.com/#

======
ayb
I use Heroku for subscription software services, online retail stores, and
a phone ordering system for our staff.

Right now all of our sites are failing with 503 errors. Our store is down and
when one of our employees went to take a phone order they got a "Welcome to
your new app" message.

I've been a big evangelist of Heroku since we migrated over last year, but I'm
getting deeply concerned about the elevated error rate since every minute is
costing us money.

~~~
qeorge
Does Heroku have an SLA? (I could not find it)

At some point they're exposing themselves to serious risk. Rackspace had to
pay out ~$3MM (in free service credits) after an outage in 2009:

[http://www.networkworld.com/news/2009/070609-rackspace-
outag...](http://www.networkworld.com/news/2009/070609-rackspace-outage.html)

~~~
StavrosK
This is offtopic, but what's MM? What's the second M for?

~~~
jcsalterego
M = 1000 in Roman numerals, but the confusing bit is not reading them like
Roman numerals (2000) but rather interpreting them as one thousand thousands,
or one million.

~~~
StavrosK
Hmm, so who uses this? How is it better than just saying $3M?

~~~
marclove
This came up in a thread a while back
(<http://news.ycombinator.com/item?id=1483667>). Bottom line is that "MM"
comes from the banking/finance world. In banking, $3M actually means $3,000
and $3MM means $3,000,000.

~~~
StavrosK
I see, thank you. I prefer the SI, kilodollars, megadollars, etc.

------
vegashacker
It just occurred to me that you know you've made some pretty serious traction
as a startup when HN posts about your company no longer have something like
"(YC W08)" appended to the end.

~~~
Timothee
Thanks for pointing that out, because I had _completely_ forgotten that this
was the case. (I actually can't remember at all, but I figure that I knew that
from when they came out)

They did go a long way in a short period of time. Winter 2008 feels so close.

~~~
petercooper
Yeah, I'm in the same boat as you. I see successful, "big" companies mentioned
here with "YC-whatever" on the end and am blown away by which ones are YC
alumni!

------
gfunk911
It all depends on what the SLA says, but hypothetically, if they are down for
24 hours a year, that's 99.7% uptime, which isn't terrible.

Heroku had a 1-2 hour outage the week after we switched an app there last
year. My boss was freaking out, cursing about how they were unreliable, etc,
neglecting the following:

1. The timing was unfortunate, but that was the first outage in months.

2. We had had multiple outages on our Rackspace box that were our own fault,
due to bad server management.

In the long term you're likely better on Heroku, for small companies at least.

~~~
whirlycott1
Uh... 99.7% is ridiculously bad if you're doing anything that matters.

~~~
kes
I agree with you, but only in theory. I can't think of one thing that runs
100% non-stop.

Even in places like medicine or finance or security. Stuff breaks, things
fail. It's sad, but the reality is there.

~~~
jackowayed
Of course nothing will have 100.0 (repeating)% uptime. But 99.7% uptime means
it can be down for over 2 hours _every month_. Anything less than 99.9% uptime
(which means 3x less allowed downtime--a big difference) is probably
unacceptable, and if downtime costs you serious money, you're going to want
more decimal places.
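
The arithmetic in this subthread can be checked with a short sketch. The function names are made up for illustration, and a month is assumed to be one twelfth of a 365-day year.

```python
# Minimal sketch of the uptime arithmetic discussed above.
# Assumptions: a 365-day year and a month equal to one twelfth of it.

HOURS_PER_YEAR = 365 * 24
HOURS_PER_MONTH = HOURS_PER_YEAR / 12


def allowed_downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Hours of downtime that an uptime percentage permits over a period."""
    return period_hours * (100.0 - uptime_percent) / 100.0


def uptime_percent(downtime_hours: float, period_hours: float) -> float:
    """Uptime percentage implied by a given amount of downtime over a period."""
    return 100.0 * (1.0 - downtime_hours / period_hours)


if __name__ == "__main__":
    # 24 hours down per year, the hypothetical from earlier in the thread.
    print(f"24h/year down = {uptime_percent(24, HOURS_PER_YEAR):.2f}% uptime")
    # Monthly downtime budget at a few uptime levels.
    for pct in (99.7, 99.9, 99.99):
        hours = allowed_downtime_hours(pct, HOURS_PER_MONTH)
        print(f"{pct}% uptime allows {hours:.2f} hours down per month")
```

Under these assumptions, 99.7% works out to a little over 2 hours a month, and 99.9% allows one third of that.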

------
awt
I have an app running on Heroku. Interestingly, it caches itself using HTML 5
application cache, so most people won't even notice the site is down. Need to
make sure the background network ops are fault tolerant though.

~~~
davidamcclain
Interesting. Care to share what you're doing/what the heck that means?

~~~
awt
<http://motodiaryapp.com> -- of course if Heroku is down and it's not already
cached for you it won't load. This is the technology the site uses to allow
offline access: [http://www.whatwg.org/specs/web-apps/current-
work/multipage/...](http://www.whatwg.org/specs/web-apps/current-
work/multipage/offline.html)

~~~
gvb
That is really awesome. I just got back from playing with it between Chrome on
an old 800MHz P-III (very usable) and an Android (Nexus One). On the Nexus, I
went off-line (airplane mode), edited, and then went back on-line. MAGIC! My
edits showed up in my Chrome browser on the desktop.

My use case is that I want to use Google Docs (or equivalent) to keep notes
while on-line and off-line. MotoDiary ain't quite there yet, but it has the
hard part (IMHO), the on-line/off-line syncing. What is rough is text size and
fixed(?) edit box size on the Android. Also (obviously), it is diary-oriented
(single entry per day) rather than supporting multiple documents.

Google Docs are totally uneditable (?WTF!) on Android, never mind doing it
off-line and syncing.

There are some Apps that work better, such as GDocs. GDocs has been a mixed
bag, it allows me to edit off-line and sync docs, but has been iffy in terms
of success rate. It definitely isn't as smooth as my brief experience with
MotoDiary.

------
froggie
You have to give Heroku credit for selling major quantities of Kool Aid.
They've been pretty flakey for the past couple of months, and people are here
claiming that this is the first outage. Someone's even claiming that 99.7% is
a good record.

------
railsjedi
"Applications are fully restored." via <http://status.heroku.com/>

Downtime always sucks, but gotta give them credit for the way they kept everyone
in the loop and provided status along the way.

------
n-named
Make your error page prettier. You guys are capable of better design (after
seeing your pricing page).

------
aarongough
It's worth noting that this was not universal as far as I can tell.

I have 5 minute watchdogs on all of my 3 sites in production with Heroku, and
none of them pinged me. Given that I know the watchdogs work (regular testing
and previous incidents) I would have to conclude that not everyone was
affected.
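
For readers wondering what a 5 minute watchdog might look like, here is a minimal sketch using only the Python standard library. It is not the setup described above: the function names, the `print` alert and the `rounds` parameter are assumptions made for illustration, and a real watchdog would send email or a page instead of printing.

```python
# Minimal watchdog sketch: poll a list of URLs and alert on failures.
# All names here are hypothetical; only the standard library is used.
import http.client
import time
import urllib.request
from typing import Callable, Iterable, List, Optional


def is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # OSError covers URLError/HTTPError (e.g. a 503), DNS failures,
        # refused connections and timeouts.
        return False


def find_down_sites(urls: Iterable[str],
                    probe: Callable[[str], bool] = is_up) -> List[str]:
    """Return the URLs whose probe failed."""
    return [url for url in urls if not probe(url)]


def watch(urls: List[str],
          interval_seconds: float = 300,
          alert: Callable[[str], None] = print,
          probe: Callable[[str], bool] = is_up,
          rounds: Optional[int] = None) -> None:
    """Check every URL each interval; run forever when rounds is None."""
    completed = 0
    while rounds is None or completed < rounds:
        for url in find_down_sites(urls, probe):
            alert(f"DOWN: {url}")
        completed += 1
        if rounds is None or completed < rounds:
            time.sleep(interval_seconds)

# Example (placeholder URL): watch(["https://example.com/"], interval_seconds=300)
```

A status-only probe like this would miss a page that loads successfully but shows the wrong content, so checking for an expected string in the body is a reasonable extension.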

------
jread
We've been monitoring a heroku instance for the past 8 months. Our current
instance uptime is 99.953% (about 200 minutes of downtime). Of the 76 services
we monitor, Heroku is #64.

<http://cloudharmony.com/status>

------
snprbob86
The magic of cloud computing: As someone running an app on Heroku, I had no
idea. Luckily, I simply don't care.

Our app has a cyclic usage pattern and all is quiet right now. So rather than
freaking out about it, I'll just let someone at Heroku figure it all out.

It would suck if it happened during our busy period, but then again I could
say "We're working on it." and just assume the Heroku team will fix things
faster than I ever could have with my limited *nix admin skills.

~~~
jbail
How exactly is the fact that you didn't know about the outage "the magic of
cloud computing"?

I get that you're saying your users don't care/didn't notice, but I'm clearly
missing something because if I had an app on Heroku, I'd be a little nervous.
When the cyclic nature of your app swings back around and it's in regular use
again, this kind of outage might not be so magical.

~~~
snprbob86
Well technically, I was informed of it. I got email alerts and stuff, but I
was busy doing other things, so I didn't read them.

Users surely noticed, but Heroku definitely noticed before my users did.
They're quietly working on a solution and I can quietly go about my day. If my
users start complaining, I'll have time to talk to them; time I wouldn't have
if I was neck deep in log spew.

Having run apps on my own servers before, I know what a pain in the ass it is
to deal with downtime yourself. I'm not particularly good at it, so I
appreciate having experts take care of it for me.

~~~
absconditus
Having experts be responsible for dealing with problems is not unique to
"cloud computing".

~~~
Goosey
Not unique to it, but it is implicit in it. This matters. If you are at the
size where you can't have a dedicated staff monitoring your uptime 24/7, then
you are at the size where a cloud solution is going to be more responsive than
what you can afford.

------
itsnotvalid
Sooner or later most people realize that it is not that safe to rely on a
specific deployment system that you cannot directly control. It could be
dangerous to use a full stack that cannot easily be replaced without a decent
amount of effort.

Initial laziness now adds up.

------
boltofblue
Even if you hosted your own server and it was just serving one static file,
there are still services you depend on that could cause an outage.

Heroku so far has not had major outages.

And they will be learning from the current ones.

------
alexyoung
I host an app on there that I've been using all day and I didn't notice it go
down. I reckon I've got some kind of unplugged-TV poltergeist action going on.

------
aneth
I haven't seen an explanation for this, but it could be related to ec2 issues
today. I'm a heroku user. Downtime with any host always seems to happen with
bad timing; today it hit during a daily client call. However I'm not concerned about
heroku - yet... I think they have less downtime than I would have doing it
myself.

