
GitHub is degraded/down - juancampa
https://githubstatus.com/
======
hellepardo
Is it just me, or has Github's quality of service been continually degrading
over the past several months? What is going on internally? Is this because of
the Microsoft acquisition? Increased usage? An internal transition to Azure?

...is it time to move away from Github?

~~~
ddevault
If you're looking for somewhere else, SourceHut has had no unplanned outages
in 2020, despite being kept online by an army of one. The software and
infrastructure are just more resilient and better maintained. Our ops guide is
available here:

[https://man.sr.ht/ops/](https://man.sr.ht/ops/)

It's also the highest performance software forge by objective measures:

[https://forgeperf.org/](https://forgeperf.org/)

Full disclosure: I am the founder of SourceHut.

~~~
erikpukinskis
Sidebar, I just want to say, you are one of the few people I’ve observed doing
actual “modern” web development.

When most people talk about “modern web” or modern anything in software they
think it means “using all the latest tools”.

That often means things like ES6 and Webpack, which have nice surfaces, but
which create nightmares under the hood.

That’s the opposite of what modern architecture was. It was about embracing
the constraints of materials. Given the properties of concrete, what is the
limit of what you can do with it? Go there, and no further. And don’t cover it
up, just finish the dang slab and get on with the rest of the house.

ES6 means transpiling, which means webpack, which means a massive machine of
hidden complexity, which if you’re lucky exposes a nice smooth surface where
everything is arrow functions and named exports. And if you’re unlucky, it’s
a flimsy piece of cardboard over the nightmare underneath.

You (SourceHut) seem to be building a UI that actually takes note of _how the
browser is_. And you are trying to push the big numbers... how reliable your
service can be, how many endpoints can one person maintain, while letting the
materials of the web (forms, urls) dictate the details.

That’s true modernism.

So, bravo. I’m glad to see you out in the world. It takes courage to step
outside of the norm and I’m rooting for you.

~~~
SparkyMcUnicorn
Just wanted to interject that browsers (other than IE 11) have over 98%
coverage for ES6 without transpiling.

------
robertakarobin
I don't think it's just GitHub. I noticed everything seemed slow but thought
it was just my ISP. Turns out from looking at downdetector.com there seems to
be a large spike in reported problems across many sites and providers, all
occurring at the same time.

~~~
MattGaiser
Maybe the cloud providers are running out of capacity?

~~~
EForEndeavour
Maybe The Cloud is full.

~~~
dijit
Azure is! (Or was? [0])

[0]:
[https://www.theregister.co.uk/2020/03/24/azure_seems_to_be_f...](https://www.theregister.co.uk/2020/03/24/azure_seems_to_be_full/)

------
amirathi
GitHub is having major availability issues these past three months. 4
incidents in Feb, 4 in March and 6 in April so far.

Source:
[https://www.githubstatus.com/history](https://www.githubstatus.com/history)

I look forward to reading the root cause analysis promised by Nat in Feb:
[https://twitter.com/natfriedman/status/1233079491204804608](https://twitter.com/natfriedman/status/1233079491204804608)

------
MiroF
Azure is also refusing to allocate me capacity - I'm wondering if this is a
general MSFT outage?

~~~
coldpie
Here in Minnesota, several coworkers and I are having trouble with our ISP.
And a website we host is (apparently) being DDoSed. And now GitHub's down, and
you're reporting some Azure issue. Is something going on...?

~~~
snazz
No, I don't think so (and I'm also in Minnesota). I'm going to guess that the
increased load is just pushing services over the edge.

Edit: Also, interestingly enough, I am now reliably hitting Cloudflare's ORD
(Chicago) datacenter instead of MSP. If you visit
[https://snazz.xyz/cdn-cgi/trace](https://snazz.xyz/cdn-cgi/trace) (or the
same path on any other Cloudflare-backed website), what comes after colo= for
you?
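
To check from a terminal instead of a browser, something like this sketch should work. It assumes `curl` is available and that the trace output has one key=value pair per line with a lowercase `colo` key; `colo_of` is just a name made up for this example.

```shell
# colo_of: read Cloudflare trace output on stdin and print the value of the
# colo= line (the datacenter code). Hypothetical helper, not a Cloudflare tool.
colo_of() {
    sed -n 's/^colo=//p'
}

# Example (needs network access):
#   curl -s https://snazz.xyz/cdn-cgi/trace | colo_of
```

The network call is left as a comment so the function can be tried on saved output first.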

~~~
judge2020
You can test Cloudflare at
[https://cloudflare-test.judge.sh/#snazz.xyz](https://cloudflare-test.judge.sh/#snazz.xyz)

~~~
snazz
So ORD is probably just higher-capacity, which is why free plan users like me
are getting routed to it?

~~~
judge2020
Most likely. The other/normal DC is probably either overloaded or has part of
its hardware under maintenance.

------
SirensOfTitan
In my experience, nothing is correlated more with downtime than code changes.
Github has been pushing a lot more features since the Microsoft acquisition,
and has felt down a lot more often since then.

There has been one major outage a month the last several months, with
sprinklings of little outages (I recall webhooks down quite a lot). The
cadence of these outages is rapidly changing my perception of Github as a
reliable service.

------
jacobra2
At my organization, we saw an uptick in timeouts spanning vendors that started
at the same time - 6:55am PST. Makes me think there's an internet-wide event
occurring right now.

~~~
exabrial
Which providers are you seeing issues with? I'm curious if we can corroborate.

~~~
jacobra2
I was referring to WorldPay and Adyen (payment providers), but also saw issues
directly with GitHub.

------
bwhitty
CenturyLink ISP is having issues:
[https://downdetector.com/status/centurylink/](https://downdetector.com/status/centurylink/)

------
bob1029
Today marks the 3rd time I've broached the topic w/ management of getting the
self-host enterprise option... Compared to (public) GitHub's problems this
year so far, our AWS EC2 instances are orders of magnitude more reliable.
Sure, the internet can still go down, but my VPN into us-east-1 from Texas has
been unbroken for weeks now.

At this point I'd almost prefer to pull it all in-house and manage it myself
so the entire team doesn't have to lose a whole day of productivity over all
this. I am so glad we moved away from using GH Actions for builds because we
would be absolutely hosed right now on supporting our customers.

~~~
dijit
I'm not sure whether the irony is lost on you, but people generally prefer
services like GitHub (and, similarly, EC2) precisely _because_ they're not
on-prem and because, when they're down, there are hundreds of talented
engineers working to resolve the issue.

Unfortunately, anecdotal experience here is spotty. Services I run for my team
are several nines higher in terms of actual availability (note: I did not say
"uptime"); conversely, other internal services at my company are many times
less reliable than services like GitHub.

Given that people prefer external hosting for the reasons I mentioned, for
many I think pulling it in-house is unappealing.
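
For a sense of scale on what each extra nine means, here is a rough sketch of the arithmetic, assuming a 365-day year and counting any unavailability as downtime. `downtime_minutes` is a hypothetical helper, not something from any monitoring tool.

```shell
# downtime_minutes NINES: allowed downtime per year, in minutes, at the given
# number of nines (2 = 99%, 3 = 99.9%, ...). Assumes a 365-day year.
downtime_minutes() {
    awk -v n="$1" 'BEGIN { printf "%.1f\n", (365 * 24 * 60) / (10 ^ n) }'
}

downtime_minutes 3   # 99.9% -> 525.6 minutes, a bit under 9 hours
```

Each additional nine divides the allowance by ten, which is why "several nines" is a large gap.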

~~~
bob1029
Does the GH Enterprise option not provide some degree of support with initial
setup and configuration? Is the enterprise support more or less responsive
than the public channels? I would expect those same engineers are also
responsible for maintaining the enterprise offering.

Also, does isolation of a private GH instance from the public instances not
provide some degree of added reliability considering the potential for DDoS or
simply extreme load?

I absolutely grant you the IT infrastructure concerns. BUT Amazon is our
vendor on that. GitHub provides the software. It's not like I'd be standing up
a new series of physical hosts to run an on-prem GitHub built from source and
managing all of the hell around that. This would be simply putting a
GH-provided image on an EC2 instance and making sure we have frequent snapshots.

~~~
CodeAlong
My typical response from GitHub Enterprise support has been fantastic over the
years. I'll also note that after GitHub Actions/Packages came out on
GitHub.com, the lead time for support responses did increase substantially for
non-critical tickets due to the increased support burden for those new
services. My most recent tickets have been answered promptly, so they must
have figured out the staffing issues.

You should be aware that the actual architecture of GitHub Enterprise isn't
truly highly available (1 active and 1 or more standby instances) unless you
are a large enough customer to get clustering mode. That means you will likely
need to take downtime to implement upgrades, since they typically require
rebooting the VM.

------
exabrial
There are larger problems I think... our call center is having trouble with
Five9 in San Francisco.

I also tried setting up an EC2 instance in Northern California, and just
pinging it from Kansas City via Google Fiber showed 37% packet loss.
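
For anyone who wants to compare numbers, a minimal sketch for pulling the loss percentage out of `ping` output. It assumes the summary line contains the text "% packet loss", as the common Linux and macOS `ping` implementations print it; `packet_loss` and the host name are placeholders made up for this example.

```shell
# packet_loss: read ping output on stdin and print the loss percentage
# from the summary line. Hypothetical helper for illustration.
packet_loss() {
    sed -n 's/.* \([0-9.][0-9.]*\)% packet loss.*/\1/p'
}

# Example (needs network access; the host is a placeholder):
#   ping -c 100 ec2-host.example.com | packet_loss
```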

~~~
bwooceli
Same with Five9 from the KC metro, and also when accessing Azure resources. No
issues on my Google Fiber. But we have agents all over the Midwest, all having
issues.

------
super3
Created [https://gitbackup.org](https://gitbackup.org) for this very reason.

~~~
Gehinnn
Where do you store all that data?

~~~
super3
Should only be 2-3 PB. Storing it all on
[https://tardigrade.io](https://tardigrade.io)

~~~
jlmorton
Only 3 petabytes? At $0.01/GB/Month, it seems that works out to about
$360,000/year in storage costs alone. Cheers, but you're paying for this just
as a public service?
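
The arithmetic behind that estimate, as a quick sketch. It assumes decimal units (1 PB = 1,000,000 GB) and a flat rate with no volume discount; `yearly_storage_cost` is a name made up for this example, and the price is passed in cents to stay in integer math.

```shell
# yearly_storage_cost PETABYTES CENTS_PER_GB_MONTH
# Prints the yearly cost in whole dollars. Assumes 1 PB = 1,000,000 GB.
yearly_storage_cost() {
    gb=$(( $1 * 1000000 ))
    cents_per_month=$(( gb * $2 ))
    echo $(( cents_per_month * 12 / 100 ))
}

yearly_storage_cost 3 1   # 3 PB at $0.01/GB/month -> 360000
```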

~~~
super3
There is discount pricing over 100 TB. It's a tech demo, so Tardigrade.io
sponsors it. Pretty powerful demo on easily storing large amounts of data
especially when Github goes down.

------
vortico
Why not make two versions of GitHub, one free and one paid, having
professional-level uptime and support? Is that the idea behind GitHub
Enterprise ([https://github.com/enterprise](https://github.com/enterprise))?

~~~
snazz
Yes. Enterprise has an SLA (when GitHub hosts it) and an option to self-host.

~~~
vortico
Oh okay, I didn't know GitHub had an option to host Enterprise. (It's way
over budget for me right now, but just curious.)

------
_abattoir
Who would have guessed that handing control of a company to Microsoft would
make it less reliable.

------
snazz
Git pushes are going really slow for me, but they do work after a few tries.
If you normally use a credential cache and it's asking you for your password
again, hit control-C and retry the operation. That seems to have gotten my
last few pushes to work.
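
If you'd rather not babysit the retries, here is a rough sketch of a retry wrapper. `retry` and `RETRY_DELAY` are names made up for this example; the remote and branch in the usage line are placeholders too.

```shell
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Re-run COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# RETRY_DELAY (seconds between attempts, default 5) is a made-up knob.
retry() {
    max=$1; shift
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$(( attempt + 1 ))
        sleep "${RETRY_DELAY:-5}"
    done
}

# Example (needs network access):
#   retry 5 env GIT_TERMINAL_PROMPT=0 git push origin master
```

Setting `GIT_TERMINAL_PROMPT=0` should make Git fail instead of stopping at a password prompt, so the loop can move on to the next attempt without a manual control-C.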

~~~
dpau
Got my push to go through. Unfortunately, downloading packages from
[https://codeload.github.com](https://codeload.github.com) was timing out

------
throwanem
CircleCI's having a lot of trouble, too. Intermittent 500s, and even the
images on the 500 page don't load.

Seems like something bigger might be going on.

~~~
dwwoelfel
What usually happens with CircleCI after a GitHub outage is that they're hit
with a flood of GitHub webhooks when GitHub comes back online. Then Circle
starts to slow down under the load of all of the new jobs that have been
queued.

One usually causes the other, so we shouldn't infer anything larger from that
datapoint.

~~~
throwanem
In this case, CircleCI and Github were down at the same time, and seemed to
come out of it around the same time too. I don't know what to infer from that
either.

------
sbr464
I noticed it being more of a CenturyLink issue.

------
imedadel
I think my biggest nightmare is SO, GH, and HN all being down at the same
time.

Edit: A nightmare is an exaggeration. But it would slow down work.

~~~
caleb-allen
Reddit and Rocket League servers were degraded simultaneously yesterday, and
if I recall correctly they both host on Google Cloud (I may be mistaken
there). My paranoia about a high-impact state-sponsored cyberattack has been
high!

~~~
imedadel
Yup. Google Cloud seems to be having frequent problems lately.

However, I feel like GitHub being down AND Stack Overflow would slow down
work. And HN is a good place to know what's going on.

~~~
caleb-allen
Oh for sure. Reddit and Rocket League being down just mangled my relaxation
time, GH and SO would have much worse material consequences.

------
ohsik
I knew this would be here lol

------
random_savv
As a huge fan of JetBrains products (I use WebStorm, PyCharm, DataGrip in
roughly equal parts) I am considering:

[https://www.jetbrains.com/space/](https://www.jetbrains.com/space/)

Anybody else got experience with that and/or TeamCity?

~~~
fastest963
We use TeamCity and can't say I have any complaints. We utilize the tagging
feature pretty heavily in our deployments and source all artifacts from
TeamCity builds. The builds also post success or failure to Gerrit.

------
EvanAnderson
I feel compelled to repeat my earlier sarcasm:

If only there was some kind of distributed version control system. /s

I don't feel bad being an "old fogie" who demands local copies of my
dependencies.
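
For the record, keeping such a local copy can be as simple as the sketch below. It assumes plain Git over HTTPS; `mirror_repo` is a name made up for this example, and the URL and path in the usage line are placeholders.

```shell
# mirror_repo URL DIR
# Create a bare mirror of URL in DIR on the first run; refresh it afterwards.
mirror_repo() {
    if [ -d "$2" ]; then
        git -C "$2" remote update --prune
    else
        git clone --mirror "$1" "$2"
    fi
}

# Example (needs network access):
#   mirror_repo https://git.example.com/myrepo /srv/mirrors/myrepo.git
```

Run from cron or by hand, this keeps every branch and tag available locally even when the upstream host is unreachable.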

~~~
throwanem
Github is as much or more a collaboration tool as a Git upstream. The Git part
is easy to distribute. The collaboration tool, not so much.

And to anticipate the likely next argument - no, mailing lists are _not_ a
better tool for collaboration. They are distributed, sure. In so saying, I
have exhausted the list of their virtues.

~~~
EvanAnderson
The "we deploy production directly from Github" pathology is the one I'm
talking about. Not being able to "collaborate" for a brief outage is fine. Not
being able to deploy code isn't.

~~~
derefr
You gotta CI/CD from _somewhere_. Whatever that somewhere is, it can go down.

You can, of course, override your CI/CD and do a manual deploy. It's not a
matter of "can." It's a matter of not fully understanding all the checks the
CI/CD system does to the code, and all the build steps for prod builds, and
therefore not having the _confidence_ to deploy to prod without CI/CD holding
your hand. (Which I don't at all blame devs in bigcorps for not knowing;
release management is its whole own _thing_, and division of labor means that
at sufficient scale it doesn't make sense to learn about it.)

~~~
strenholme
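Well, the place I run my CI tests is on a VMware virtual machine, e.g.

    #!/bin/bash -e
    # Fetch a fresh copy of the repository (-e above aborts on any failure)
    git clone https://git.example.com/myrepo
    cd myrepo
    # Run the project's test script
    sh do.tests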

If the tests are automated enough to run with every Git checkout, they can
easily enough be run by hand too.

The real world script I use is a little more complicated, because the code
base in question is two decades old so I need to make some changes to how the
code looks to the tests so that the tests can run.

I once worked for a company which had a series of tests which took eight to
ten hours to run. Running those tests with every single Git checkin was out of
the question; we instead used cron to run the tests every night.

------
c1t1z3n0n3
Our nonprofit offers both privacy-friendly cloud storage and Git hosting with
GitLab. We also offer GitLab hosting for private instances.
[https://git.stealthdrop.cloud](https://git.stealthdrop.cloud) and
[https://my.stealthdrop.cloud](https://my.stealthdrop.cloud). Our services are
free, and we are a 501(c)(3), so donations are deductible.

