
Heroku Is Down - jonstaab
https://status.heroku.com/
======
jonstaab
Is anyone else getting timeouts with their heroku postgres instances? My dynos
are healthy, but pg isn't. The status page doesn't mention anything about
that.

Edit: it seems the postgres instances are running on amazon ec2, so that's
probably why they're down.

Edit: they're failing over postgres instances now, so we should be back up
soon.

------
jrochkind1
My app is down, got alerted by my monitoring. (It's a hobby/side project app,
so not a big deal; I'd be freaking out if this were a money-making production
app. In this case, the first thing I did, before even checking heroku status,
was see if I had let my domain name expire... nope! phew).

The heroku outage message suggests the main problem is "dynos can not be
restarted", but in fact what happened to me was that at 3pm UTC Heroku tried
to do an automatic "daily restart" of my app, triggering an outage of my app.
I did not do anything else to ask for a restart.

The heroku stats/events page for my app shows ~300 critical errors, starting
at 3pm UTC, immediately following "Dyno restart: Daily restart" event. (~1
hour 40 minutes before now). All appear to be "H20 App boot timeout". So
perhaps the app has been down since then, although I would have thought my
monitoring would alert me before now.

If, when a reboot fails, it tries again and fails again (~150 times an hour),
and this is happening to a whole bunch of apps, especially as a consequence of
heroku's automated dyno cycling... I can see how it would make a bad problem
even worse and harder to recover from in heroku's infrastructure.
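
To make the retry-storm worry concrete, here's a rough sketch of the difference between retrying at a fixed interval and backing off. I have no idea how heroku's dyno manager actually schedules retries, so everything here is an assumption for illustration: the function names, the 24-second interval (which is just what ~150 attempts an hour works out to), and the base/cap values.

```python
import random

def restarts_per_hour_fixed(interval_seconds):
    """Attempts per hour when a failed boot is retried at a fixed interval."""
    return 3600 / interval_seconds

def backoff_delays(attempts, base=1.0, cap=300.0, rng=None):
    """Delays (seconds) for capped exponential backoff with full jitter.

    Attempt i waits a random time in [0, min(cap, base * 2**i)], so
    repeated failures spread out instead of hammering the platform.
    base and cap are made-up values, not anything heroku documents.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0, ceiling))
    return delays

# A retry every 24 seconds is 150 attempts an hour, per app.
fixed_rate = restarts_per_hour_fixed(24)

# With backoff, the upper bound on each wait doubles until it hits the cap.
delays = backoff_delays(10, rng=random.Random(0))
```

The jitter matters as much as the backoff: if a whole bunch of apps fail at the same moment, randomizing the wait keeps them from all retrying in lockstep.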

To heroku's credit... I'm not sure if I can remember this kind of widespread
outage _ever_ happening before. Against heroku's side... I'm not sure what a
heroku-deploying customer would be expected, by heroku, to do, to avoid
downtime here. "Have your whole thing runnable on some non-heroku stack" kind
of detracts from heroku's value proposition, which is you won't have to figure
out stuff like that, heroku will do it for you. If I was going to do the work
to ensure my app could be switched at any time to be deployed on some other
stack... I'd probably just use that some other stack instead of heroku, as
it's probably cheaper than heroku. The answer is probably "No matter how much
you are doing yourself, outages are possible. By using heroku you put it in
their hands, but outages are still possible."

 _update_ possibly related?
[https://status.aws.amazon.com/?date=2019-08-31](https://status.aws.amazon.com/?date=2019-08-31)
(in which case you might have been saved by having multi-region deployment on
heroku. But if it's only affecting us-east-1, you'd think heroku would have
noticed and said so? Could be unrelated, an odd coincidence. Or perhaps
there's affected heroku infrastructure on us-east-1 regardless of what region
you choose to deploy on heroku).

~~~
tetha
> To heroku's credit... I'm not sure if I can remember this kind of widespread
> outage ever happening before. Against heroku's side... I'm not sure what a
> heroku-deploying customer would be expected, by heroku, to do, to avoid
> downtime here.

This is the trade-off of lock-in into simple deployments.

It's largely the same everywhere: EKS/Beanstalk/maybe lambda + RDS + S3...
Azure has similar offerings, GCP has similar offerings, Heroku has similar
offerings.
You can get your app up, running, hosted for a dime - cost based on usage. And
you can do a lot of stuff with very little operational time invested. That's
why this is so great for small startups with 5 technical dudes.

However, you can't get out. You're committed to their abstractions for
relational databases, deployments, DNS, load balancing, ...

If you go ahead and build the things you need yourself on top of simple linux
VMs, you can have much more mobility across cloud providers. Give us a
terraform endpoint and some kind of centos image and we can probably host
there, speaking from my own work.

However, that's overall an expensive and non-trivial thing. Suddenly, you have
all the problems heroku has a solution for. Suddenly you need someone who
knows how to run a postgres cluster and how to handle failovers, or how not
to. Suddenly you need someone who knows how drives should be handled.

This is becoming a really big rift I'm overall starting to see. Applications
and deployments are becoming more and more trivial, but there aren't many
people understanding how to make the stack below it run reliably, with low
chances of data loss.

------
coder4life
Reddit also seems to be down

~~~
mooreds
Wonder if there's an underlying AWS issue.

Yup:

[https://status.aws.amazon.com/](https://status.aws.amazon.com/)

------
fareesh
I just poked one of my really old applications running ruby 1.8.7 and got a
bunch of errors about half an hour or so ago. The status page was fine then.

I hope it wasn't me that broke it lol

------
WalterSobchak
Sling TV is also down, which makes sense given the AWS outage.

------
MobileVet
We make moderate use, 0.5M MAU, of Heroku and AWS us-east-1 and appear to be up.

Thank goodness... Last thing I needed on a 3 day weekend was an outage.

------
davidjnelson
My dynos and postgres dbs are ok, thank goodness. Hopefully it’s been fixed
already.

------
senorsmile
reddit is responding for me about 20% of the time.

forum.wordreference.com is also completely down for me.

------
nafeydev
Man I wish literally all of the web ends up on AWS and then we get an outage.

