
GitHub was down - nanddalal
http://status.github.com/
======
Illniyar
Seems like things started to go down the drain somewhere around February 2020.

[https://www.githubstatus.com/uptime/kr09ddfgbfsf?page=2](https://www.githubstatus.com/uptime/kr09ddfgbfsf?page=2)

Wonder what the trigger was for the reliability hit - Actions went GA in Nov
2019, so it's something else (or possibly a combination of things)

~~~
kenhwang
COVID related work from home adjustments is my guess.

~~~
hnlmorg
That might be the trigger but it wouldn't be the root cause.

For example, web and backend servers for cloud services shouldn't be affected
by whether people are sat in one location or another. However, if those systems
require lots of maintenance to keep running and people are less available due
to COVID, then you'd see a rise in downtime. But that would mean it's not COVID
that's the problem; it's the amount of maintenance required, and COVID only
surfaced that problem.

~~~
kenhwang
Doesn't have to be on service side either. Could be that user patterns are
changing too, testing scenarios that aren't well worn on GitHub.

------
mindfreeze
I am trying to think when GitLab last went down. I have rarely, if ever, seen
downtime this bad with GitLab. I was looking at the history:

[https://status.gitlab.com/pages/history/](https://status.gitlab.com/pages/history/)

They do have some latency or slowness issues, but I couldn't find anything
like a whole-system outage.

One of the comments here reminded me of the 2017 incident:
[https://about.gitlab.com/blog/2017/02/10/postmortem-of-datab...](https://about.gitlab.com/blog/2017/02/10/postmortem-of-database-outage-of-january-31/)
They should have improved a lot by now, but I am still curious why such large
and frequent downtimes are happening to GitHub. Is it due to opening up
private repos to teams, and more perks, along with quarantine and WFH?

~~~
letientai299
The correct link for history (as of now) should be:

[https://status.gitlab.com/pages/history/5b36dc6502d06804c083...](https://status.gitlab.com/pages/history/5b36dc6502d06804c08349f7)

The link above results in a 500 error.

------
rvz
14 days ago, they went down [0]. And today it's happening again. Twice in less
than a month.

Another reminder to self host via solutions like GitLab or Gitea. [1]

[0]
[https://news.ycombinator.com/item?id=23675864](https://news.ycombinator.com/item?id=23675864)

[1]
[https://news.ycombinator.com/item?id=23676072](https://news.ycombinator.com/item?id=23676072)

~~~
dt3ft
I would choose self-hosting for small to medium size teams any day. I can't
fathom why people choose not to self-host at this scale. Your data. Your
control. Your network. Your infrastructure. Your responsibility. Are people
becoming more afraid of responsibility these days?

~~~
moooo99
> Are people becoming more afraid of responsibility these days?

I wouldn't say people have become more afraid; they just don't see the reason
to bother. I'd choose GitHub or GitLab for small teams at any time. I'd
probably even be fine with the free versions.

I see little to no reason for self-hosting in a small team. I cannot imagine a
performant server, bandwidth and the employee it takes to maintain it being
cheaper than the 10€/month/user for a hosted solution.

~~~
Symbiote
From zero servers to one requires an employee with the skills, but from 100 to
101 (my situation) does not. It requires less than 1% more employee, since
more is likely to be automated etc.

Many small companies already have VMs running other services.
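
The disagreement here is fixed cost versus marginal cost. A rough sketch of the
break-even arithmetic follows. Only the 10€/month/user figure comes from the
thread; the function name and both self-hosting costs are made-up placeholders
for illustration, not real quotes.

```python
import math


def break_even_users(self_host_monthly_cost, hosted_fee_per_user=10.0):
    """Smallest team size at which self-hosting costs no more than the hosted plan."""
    return math.ceil(self_host_monthly_cost / hosted_fee_per_user)


# ASSUMPTION: going from zero servers to one costs 2000 EUR/month
# (part of an admin's time plus hardware and bandwidth).
first_server = break_even_users(2000.0)

# ASSUMPTION: going from 100 servers to 101 costs 50 EUR/month,
# because the people and automation are already in place.
marginal_server = break_even_users(50.0)

print(first_server, marginal_server)
```

With those assumed numbers the first server only pays off for a large team,
while the marginal server pays off for a handful of users, which is roughly the
point both sides are making.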

------
thih9
Looks like they need some high level analysis on why these outages happen so
often.

At this point it feels like it’s no longer a series of accidents and that they
should improve something.

~~~
haik90
After today's downtime, we're finally starting to discuss (again) using our
own GitLab.

~~~
aaomidi
I love when people suggest this, and when someone asks "okay, who do we put in
charge of having better availability than GitHub?", no one has a good answer.

~~~
Symbiote
In your case, maybe the same people that keep microsoft.com online.

------
amirathi
Post mortem for the February events[1]. It was resource contention on their
primary MySQL cluster caused by an unintended config change.

There's a total of 34 incidents in 2020 so far[2]. I wonder if all are DB woes
or there are other factors at play (like move to Azure).

[1] [https://github.blog/2020-03-26-february-service-
disruptions-...](https://github.blog/2020-03-26-february-service-disruptions-
post-incident-analysis/)

[2]
[https://www.githubstatus.com/history](https://www.githubstatus.com/history)

------
anemic
Protip:

Always have an extra customer, like the flowershop downstairs. Let her borrow
your wifi in exchange for some office flowers. Now she is technically your
customer.

When your shit goes down and nothing works you can still write " _some_ of our
customers are experiencing issues" in the statuspage as the flowershop still
has wifi (hopefully).

------
quyleanh
I still don't understand people who always bring up Microsoft's acquisition.
Until there is an official statement, it isn't Microsoft's failure. Don't
blame them.

~~~
jaekash
You know if you use Microsoft products every day, and every day they let you
down, and every day you experience the worst most unintuitive design in the
world, and every day you have to deal with their reliability issues, and then
MS acquires github, and github starts to behave like everything else MS
touches ...

Clearly something about how MS runs is responsible for their past outcomes,
why is it a stretch to assume it is responsible for another similar outcome?

It is like saying we don't know the rotation of the earth is why the sun rose
this morning because we have not had an official investigation into the
matter.

~~~
quyleanh
It's just your own feelings. I haven't had any issue with Microsoft products
(and yes, I use their services every day in my working life). All I can see
from you is a guess full of sentiment and negative thoughts. So as I said,
don't blame them until there is an official statement.

~~~
jaekash
> It's just your own feelings.

Nope. GitHub is down. I don't feel it is down, it is down. Office 365 falls
apart when two people try to edit a spreadsheet at once. Outlook does not allow
you to unreject a meeting invite. Windows won't report DLL errors over a
PowerShell terminal. Azure takes 10 minutes to delete a VM instance. Azure
DevOps is so poorly designed that nobody can figure out how to find a repo
without someone explaining it to them first. I can go on, but none of this is
my feelings; these are actual things that people have to put up with from MS
every day.

~~~
majkinetor
You seem to have problems with online stuff. I don't use cloud and have no
such problems.

~~~
jaekash
I prefer my option which is to just not use MS products but to each their own.

------
aspectmin
Is there historical data of Github Uptime/Downtime? CSV format or other?

I'd love to do some analysis on how things were pre, vs post the acquisition
(and trends in availability)

~~~
dtech
They have some self-reported data [1]. You could scrape that and transform it.

Eyeball analysis suggests it started somewhere between December 2019 and
February 2020, and rapidly went downhill starting in April.

[1]
[https://www.githubstatus.com/uptime?page=2](https://www.githubstatus.com/uptime?page=2)
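
If you do scrape it, the transform step could look something like the sketch
below. It assumes you have already turned the page into a list of (start, end)
outage intervals. The function name and the sample intervals are made up for
illustration; they are not real GitHub incidents.

```python
from datetime import datetime


def uptime_percent(incidents, period_start, period_end):
    """Uptime percentage over [period_start, period_end) given outage intervals."""
    total = (period_end - period_start).total_seconds()
    # Clip each outage to the period and sort by start time.
    clipped = sorted(
        (max(start, period_start), min(end, period_end))
        for start, end in incidents
        if end > period_start and start < period_end
    )
    down = 0.0
    cur_start = cur_end = None
    for start, end in clipped:
        if cur_end is None or start > cur_end:
            # Gap before this outage: close out the previous merged interval.
            if cur_end is not None:
                down += (cur_end - cur_start).total_seconds()
            cur_start, cur_end = start, end
        else:
            # Overlapping outages are merged so downtime is not double-counted.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        down += (cur_end - cur_start).total_seconds()
    return 100.0 * (total - down) / total


# HYPOTHETICAL sample data: two overlapping outages on one day.
day_start = datetime(2020, 7, 13, 0, 0)
day_end = datetime(2020, 7, 14, 0, 0)
sample = [
    (datetime(2020, 7, 13, 0, 0), datetime(2020, 7, 13, 1, 0)),
    (datetime(2020, 7, 13, 0, 30), datetime(2020, 7, 13, 1, 30)),
]
print(uptime_percent(sample, day_start, day_end))
```

Running this per month, before and after the acquisition date, would give the
kind of trend comparison asked about above. Self-reported incident windows may
not match what users actually experienced, so treat the result as a lower bound
on downtime.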

------
mullikine
I have noticed a pattern: when I generate markdown from org-mode with the
'text' language selected for highlighting and push it, GitHub hangs like
crazy. I don't think I'm crazy in thinking it might be me. I push frequently
to my blog and am starting to notice a correlation.

I export this into the below markdown.

    
    
        #+BEGIN_SRC text -n :f "translate-shell -s fr -t en" :async :results verbatim code
          I learned some French so that I can talk to
          you during tennis. I hope I know enough so you
          will not get bored.
        #+END_SRC
    

When I get a page build failure it's usually my fault for creating .

This is the markdown which was pushed to my blog. The 'Page Build failure'
messages take a long time to arrive to my inbox and I can see that the page
build is hanging.

    
    
        {{< highlight text "linenos=table, linenostart=1" >}}
        I learned some French so that I can talk to
        you during tennis. I hope I know enough so you
        will not get bored.
        {{< /highlight >}}

------
gitgud
Can't wait for the article about this outage, what will it be?

\- Auto-scaling issue

\- DDOS

\- DNS error

\- Datacenter outage

Any other possible problems?

~~~
darkwater
\- Kubernetes control plane screw-up

------
kchoudhu
Second time in what, a week? What is going on at Microsoft?

~~~
jaekash
What is going on that people expect something better from Microsoft? Really
this is quite on par with the quality they deliver. The only surprising thing
is that people are surprised by this.

~~~
ip_addr
I wasn't aware of that reputation, besides Windows ME, that is.

------
echelon
Well, there goes my night. I was waiting on a build triggered by Github
actions and was wondering what was up.

I guess this is my sign to get some sleep.

Microsoft needs to slow things down and focus on stability. This really isn't
good. I need these weekend and late night hours for my side hustle. I already
have enough trouble as is, I don't need an injection of additional difficulty.
(That's just my frustration; I can't imagine what y'all are all going
through.)

They're making some very frustrating choices lately. Their redesign broke
READMEs with tables (which now require horizontal scrolling), and they don't
seem to care about all the repos they impacted.

Pull it together, Microsoft.

~~~
jaekash
Why are you not using an alternative?

~~~
echelon
> Why are you not using an alternative?

I don't need two CI/CD pipelines. Nor do I have the time to build something
like that. I need to spend my time working on the core product.

The risk here is that my deploy SLA is tied to Github. I accepted this risk as
the cadence of my deploys tends to be once to twice a day on average and
Github is usually available (many nines).

I made a choice, and now that choice is biting me. Now I'll evaluate if I
should spend the time and effort to migrate off it. If this is the last of it
and Microsoft makes a commitment to not break things, I'll likely stay as the
effort to move is nonzero. If this begins to happen every month, on the other
hand...
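
To put rough numbers on "many nines", here is a small sketch of the downtime
budget arithmetic. The function names and the 30-day period are my own
assumptions; nothing here reflects GitHub's actual SLA terms.

```python
def allowed_downtime_minutes(availability_percent, period_days=30):
    """Minutes of downtime permitted in the period at a given availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - availability_percent) / 100.0


def availability_after_outage(outage_minutes, period_days=30):
    """Availability percentage for the period after a given amount of downtime."""
    total_minutes = period_days * 24 * 60
    return 100.0 * (total_minutes - outage_minutes) / total_minutes


# Budget at a few common targets over an assumed 30-day month.
for target in (99.0, 99.9, 99.99):
    print(target, round(allowed_downtime_minutes(target), 1))

# HYPOTHETICAL: a single two-hour outage in the same 30 days.
print(round(availability_after_outage(120), 2))
```

Three nines over 30 days leaves about 43 minutes, so a single multi-hour outage
already blows through that month's budget on its own.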

~~~
jaekash
> I don't need two CI/CD pipelines.

Maybe my question was unclear, but I was not asking why you are not using two
services. I was asking why you are not using an alternative, and I meant
instead of GitHub, not in addition to GitHub.

> If this is the last of it and Microsoft makes a commitment to not break
> things, I'll likely stay as the effort to move is nonzero.

Commitment without liability is not a commitment. It is just empty words. And
I don't see Microsoft making any commitments with liability on their part for
a free service.

~~~
echelon
> why are you not using two services

Simplicity. Microsoft has a marketplace where they bundle a bunch of services
together where developers already are. I didn't have to go elsewhere to look.
It's a real competitive advantage they're building. That said, I may begin to
look elsewhere.

> And I don't see Microsoft making any commitments with liability on their
> part for a free service

It's not a free service. I pay roughly $50/mo for the features I use.

> And I don't see Microsoft making any commitments with liability on their
> part for a free service.

I imagine they take their SLAs very seriously. Especially in light of this
string of outages. If they don't, they're going to lose customers and good
will.

~~~
jaekash
> I imagine they take their SLAs very seriously.

What are the liabilities on them for failing to meet their SLA? Taking things
seriously is not a liability. Specifically in this case, what is their
liability to you?

------
onyb
That moment when you submit a very long comment on GitHub, and realise that it
is down. :(

------
zhdc1
I received a 500 error when I went to GitHub, and the first thought on my mind
was to check Hacker News.

I wasn't disappointed.

In all fairness, GitHub has more or less been fairly reliable, minus
whatever has been going on over the last week.

------
noble_pleb
Seems to me like a Deja-Vu after I answered this[1] on quora yesterday.

[1] [https://www.quora.com/Github-now-allows-unlimited-private-
re...](https://www.quora.com/Github-now-allows-unlimited-private-repos-with-
unlimited-collaborators-Why-should-one-stay-with-Bitbucket-now/answer/Prahlad-
Yeri)

------
mundanevoice
Move off Rails now, GitHub; it is clearly not working for you. It took you this
far, now go move to something that doesn't sh*t the bed twice every month.

~~~
brobinson
Rails is (apparently still) a popular target for haters, but for projects at
Github's scale it's rarely a code logic/framework-level blunder that's taking
the service down. It's generally a cascade of failures in things like multiple
database systems, auto-scaling, dns/caching, etc.

~~~
mundanevoice
There are projects bigger than GitHub that do all of these things and still
don't fail as much as GitHub. Honestly, both of us are speculating. You can't
say for sure if it's not Rails and I can't say if it's not anything else.

> for projects at Github's scale it's rarely a code logic/framework-level
> blunder that's taking the service down. It's generally a cascade of failures
> in things like multiple database systems, auto-scaling, dns/caching, etc.

This is just guessing. You can't honestly tell what is causing the issue.

~~~
brobinson
Not guessing, postmortems are regularly posted on this site and elsewhere.
These kinds of failures are generally not someone using some "magic" (or
whatever other pejorative term) feature of a framework.

~~~
mundanevoice
Agreed. Do we have postmortems for the number of spectacular crashes Github
had this year?

------
niffydroid
GitHub is probably still more reliable than Bitbucket, which has weekly
disruptions (small, but you do get a performance impact)

------
NiceWayToDoIT
On June 29th they also had an outage (lasting 2 hrs). Does anyone know what
the cause was back then?

------
dmpetrov
Why does this usually happen on weekends? The only time that I have for coding
:)

------
svntid
yet again - I never experienced anywhere close to this number of outages before
Github was swallowed by Micro$oft

------
ijelliti
now we are talking about GitHub availability report!! happy Monday

------
jaekash
again ... well done microsoft. well done. I really don't get why people keep
using it, gitlab seems objectively better to me.

~~~
ishansharma
Out of curiosity, what could Microsoft have done to cause this?

I'm not aware of the acquisition details. But did they make GitHub switch to
Azure or make significant changes to infrastructure?

I've seen several people blaming Microsoft for GitHub outages and am trying to
understand why this is a common thing.

~~~
Illniyar
When Microsoft took over, the rate of new features increased dramatically.

Adding new features fast often entails tradeoffs in other areas, such as
reliability. Which seems to be what happened at GitHub.

~~~
iddqd
I haven’t used their new features but am heavily affected every time they go
down. I wonder if they are making the right trade offs here.

------
tmsh
deleted

~~~
rimliu
You are bad at trolling.

