
GitHub Major Service Outage - DeepWinter
See https://status.github.com
======
drinchev
A bit weird. GitHub says they fixed it, but on the other hand CircleCI still
considers it an outage:

> Monitoring

> May 31, 2017 3:08 PM

> GitHub have declared the outage resolved and we are starting to see incoming
> GitHub hooks. Builds are being triggered again. However we are still seeing
> failures with the GitHub API. This continues to prevent our webapp from
> fetching data from GitHub. We are monitoring the situation and will ensure
> sufficient capacity for when their service resumes normal operations.

~~~
amorphid
Maybe fixing the problem is different than recovering? Like stopping blood
loss vs slowly replacing the blood.

------
runeks
Good thinking on GitHub's part not using github.com/github/status to host the
content of status.github.com. Amazon, take notice.

~~~
brian_herman
Were they doing this before amazon had that major outage?

~~~
imbriaco
Yes. They've been hosting it outside of their production infrastructure for
several years.

------
tyingq
Seeing an interesting thing where my github issue comments just posted are
apparently posted a short time in the future.
[http://imgur.com/a/eQSc9](http://imgur.com/a/eQSc9)

Doesn't seem to break anything, but it is a bit curious. May not be new
though...I just happened to notice it today.

~~~
i336_
I've noticed this behavior with a lot of services.

I can only chalk it up to something like clock drift between the processing
node and the database server.

Irritatingly I can't remember which site it was but I posted something
somewhere a couple days ago and immediately after hitting enter the site
marked what I'd said as submitted "a few seconds from now". I never fail to be
amused that the fuzzy time library being used has code specifically designed
to handle this edge case scenario. :D
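
A minimal sketch of how that can happen, assuming the timestamp is written with the database server's clock and rendered against the app server's clock. The function name and the three-second skew are made up for illustration; this is not how GitHub or any particular fuzzy time library does it.

```python
from datetime import datetime, timedelta, timezone


def fuzzy_age(posted_at: datetime, now: datetime) -> str:
    """Describe posted_at relative to now, in either direction."""
    seconds = (now - posted_at).total_seconds()
    if abs(seconds) < 45:
        amount = "a few seconds"
    else:
        amount = f"about {round(abs(seconds) / 60)} min"
    # A negative difference means the timestamp is ahead of the reader's clock.
    return f"{amount} ago" if seconds >= 0 else f"{amount} from now"


app_clock = datetime(2017, 5, 31, 15, 8, 0, tzinfo=timezone.utc)
db_clock = app_clock + timedelta(seconds=3)  # hypothetical: database runs 3 s fast

print(fuzzy_age(posted_at=db_clock, now=app_clock))  # a few seconds from now
```

Clamping negative differences to "just now" would hide the skew for comment ages, at the cost of treating past-only timestamps as a special case.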

~~~
epicide
I don't think it's really an edge case. Probably one of the main uses,
actually.

Sure, if you use it to show comment age, you shouldn't ever see it, but I'm
sure they fully support using it for countdowns, too.

EDIT: it's the 4th example under relative time for Moment.js
([https://momentjs.com/](https://momentjs.com/)).

~~~
i336_
I can totally agree about relative timestamping in both directions
(past+future) - my argument is more about the UX of situations where you're
canonically referring to a past event.

So as not to spam with my reply to a similar comment, I'll link it:
[https://news.ycombinator.com/item?id=14452335](https://news.ycombinator.com/item?id=14452335)

------
apeace
As others have said, Github's postmortems are always great.

But frankly, I'd rather they have better uptime. Every couple months is too
much. I pay them. My work pays them.

If their CEO is serious about zero downtime, how about he offers his paying
customers a credit for time they cannot access the service?

~~~
tchaffee
My problem with a credit is that it never even comes close to what I'm losing
in income. An ISP is an excellent example. I might get a $10 credit for 24
hours of downtime. I'm charging _slightly_ more per hour than that... /s

Maybe switch to bitbucket or other competition for a while?

~~~
scott_karana
If the price of the service working (or not working) is disproportionately
large compared to the price of your lost business, that's a problem at _your_
end: you needed to calculate the risk vs return for _redundancy_.

Eg, if your Internet costs $100/mo, but you'd lose $100/hour when it's down
during business hours, _buy a fallback connection from a competing ISP._ ;)

~~~
tchaffee
> a competing ISP

Wow! That actually exists in some places? ;-)

Infrastructure so often becomes a monopoly. I can't pay a competing bridge
service to drive to work quicker, I can't pay a competing gas company to
deliver gas via different pipelines to my house. And I can't pay a competing
electric company that uses different wires.

I actually am lucky enough to live in a city where there are many competing
high speed ISPs. But guess what? I've paid for fallback connections in the
past and when one goes down, the other goes down, so I go out to lunch and see
the guys working on the wires in the cabinet down the street. The wires that
both my ISPs share. I suppose I could get a satellite ISP? That latency. True
redundancy for infrastructure is actually very expensive in most cases.

------
detaro
"discussion":
[https://news.ycombinator.com/item?id=14451924](https://news.ycombinator.com/item?id=14451924)

------
peterjlee
Funny thing is Github sometimes makes more sales after an outage because
clients want to upgrade to the enterprise edition to host on their own
servers.

------
Jayakumark
A little funny, like a Silicon Valley episode: I saw the news from GitHub's CEO
yesterday saying their goal is zero downtime, and now it's down.

~~~
IanCal
Perhaps an issue with punctuation?

Goal: zero downtime.

vs

Goal zero: downtime.

~~~
pavement

      Works on contingency?
      No, money down!

------
saosebastiao
Business idea: github hosting failover. You'd probably need a modified git
client, but if you can't push/pull/whatever from github, it transparently
fails over to your service which will sync up with github once they've
recovered.

Even better idea: github should stop failing.
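
A rough sketch of the failover half of that idea, without a modified git client: try each configured remote in order and stop at the first push that succeeds. The remote names `origin` and `mirror` are hypothetical, and syncing the mirror back to GitHub after recovery is left out entirely.

```python
import subprocess
from typing import Callable, Optional, Sequence


def git_push(remote: str, branch: str) -> bool:
    """Run `git push <remote> <branch>`; True if git exits with status 0."""
    result = subprocess.run(
        ["git", "push", remote, branch], capture_output=True, text=True
    )
    return result.returncode == 0


def push_with_failover(
    remotes: Sequence[str],
    branch: str,
    push: Callable[[str, str], bool] = git_push,
) -> Optional[str]:
    """Return the first remote that accepted the push, or None if all failed."""
    for remote in remotes:
        if push(remote, branch):
            return remote
    return None


# Usage (assumes both remotes already exist in the local repository):
# accepted_by = push_with_failover(["origin", "mirror"], "master")
```

The `push` parameter is injectable so the fallback logic can be exercised without touching the network.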

~~~
emars
When expanded to see the monthly trend it shows 99.6% availability.

Serious question: are there enough people who would pay for that 0.4% to
support a business?
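
For scale, that 0.4% can be turned into hours. A rough calculation, assuming a 30-day month; the function name is just for illustration.

```python
def downtime_hours(availability_pct: float, days: int = 30) -> float:
    """Hours of downtime implied by an availability percentage over `days` days."""
    unavailable_fraction = (100.0 - availability_pct) / 100.0
    return unavailable_fraction * days * 24


print(round(downtime_hours(99.6), 2))  # 2.88
```

So the gap in question is a little under three hours a month.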

------
i336_
As of right now...

On the one hand, I see "Everything operating normally." at the top in green,
and no flags or alerts.

On the _other_ hand, the charts look good, but "App server availability" looks
interesting: the right edge of the chart is pretty much at 0%.

------
nadim
MEAN WEB RESPONSE TIME - 262ms

98TH PERC. WEB RESPONSE TIME - 1134ms

4.3x?

~~~
maxyme
And the 98th percentile is still faster than the 50th percentile of GitLab...

------
samgranieri
Apparently it's resolved. I'd like to read their postmortem on it. They write
those extremely well.

------
rodionos
github daily availability history:

[https://apps.axibase.com/chartlab/25f38b08/2/](https://apps.axibase.com/chartlab/25f38b08/2/)

------
thejosh
Looks like their cdn is having problems as well now, seeing timeouts when
trying to download archives.

~~~
r3bl
I was getting intermittent errors when I tried opening a certain project and
its wiki ~1 hour ago. Nothing big, and a couple of refreshes fixed it; just a
minor annoyance.

------
citrusui
The outage seems to be resolved as of 8:58 EDT

~~~
DeepWinter
Yep. Now I wonder what the issue was. Ghost in the shell?

------
gionn
Web looks fine, but repositories are not so responsive.

------
JohnHaugeland
feel free to make a better one

~~~
edoceo
It's called GitLab. Not 100% uptime but better (and constantly improving)

~~~
Filligree
It will take a long time before GitLab can even begin to regain my trust from
their missing-backups outage.

Problems are to be expected. But as great as it is that they had multiple
levels of backup, _none of them worked_. They hadn't even been tested.

~~~
jjawssd
Do you have any evidence that Github does any better?

~~~
Xylakant
They haven't lost my data so far and they had to restore the production
database at least once in their history. All circumstantial, but we'll have to
wait and see.

~~~
foxylion
They did not lose that much data, only a few hours.

------
runn1ng
Git access to repos seems to be working for me (pull, push).

------
JensRantil
I guess they just rolled out their new DNS infrastructure
([https://githubengineering.com/dns-infrastructure-at-github/](https://githubengineering.com/dns-infrastructure-at-github/)) :-P

