
Gitlab is down - aao
https://gitlab.com/users/sign_in
======
ploggingdev
I get the impression that Gitlab focuses a little too much on releasing new
features at a rapid pace with every release. Maybe they should spend more time
on running their infrastructure reliably and getting their engineering
practices up to speed. The recent events will raise a red flag with
enterprises who might be potential Gitlab customers, and that directly affects
their bottom line.

~~~
cptskippy
I questioned their engineering competence during their last outage and the
general responses I got were very dismissive. Gitlab was able to spin the
last disaster into a publicity stunt with the live streaming, but I feel like
that was a fluke. Bro coding and "openness" will only garner you so much
goodwill with paying customers who are more concerned with availability.

The unavailability of a code repository might be more of an inconvenience for
a single developer, but in an enterprise environment with teams of coders,
being able to quickly disseminate code changes can be critical. The
unavailability of a source repository becomes a huge liability and a waste of
man hours.

~~~
xena
If you are relying on something for your business that you paid nothing for,
you get your money's worth.

~~~
cptskippy
That excuse doesn't really work because Gitlab offers a premium paid option
for repositories hosted on Gitlab.com.

~~~
Perihelion
.com is free -- paying customers/CE users weren't affected by the outage.

~~~
cptskippy
> .com is free

Yes, and Gitlab.com Bronze Support is a premium service offered on the
Gitlab.com platform. It is not the hosted or self hosted premium offerings.

> paying customers/CE users weren't affected by the outage.

That is incorrect. If you pay for Bronze Support you are definitely affected
by this outage. Hosted and Self-hosted customers are unaffected.

------
johnnycarcin
Unfortunately, events like this are one of the reasons I never was able to go
all in on Gitlab. When I first started trying it out the performance was not
great (it is pretty good these days though) and they just seemed to have more
of these smallish events. I realize Github also has problems but it feels like
they are much less frequent. I don't have any data to back that up though.

Either way, went self hosted recently so now I only worry about my server
haha.

~~~
tdkl
Self-hosted is one of the main selling points of Gitlab software.

~~~
johnnycarcin
For sure and that's one of the reasons I think they have a chance of at least
getting to the same level as Github. I've actually worked at two fortune 500
companies that are using the self hosted Gitlab which to me says a lot, I just
wish the hosted option was a bit more... stable.

------
amingilani
Question: with a strict CI/CD in place, as well as a staging server, how can
these problems be so common for Gitlab?

Isn't this exactly what CI is supposed to prevent?

Not blaming Gitlab for bad practices or anything, I'm just curious.

~~~
michaelt
Even with a staging server, things can pass testing but fail in production if
the staging environment provides an imperfect simulation of the production
environment - and that's almost inevitable.

For example, your staging environment servers should be connecting to a
different database with a different password. If the password's right in the
staging config but wrong (or missing) in the production config, things that
work in staging can fail in production.
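
A minimal sketch of how a missing production value could be caught before a deploy, assuming each environment's settings are available as a plain dictionary. The key names, host names, and values below are all made up for illustration. Note that this only detects a missing or empty setting; a password that is present but wrong would still only show up when connecting.

```python
# Hypothetical required settings; every name and value here is made up.
REQUIRED_KEYS = ("db_host", "db_user", "db_password")


def missing_settings(config: dict) -> list:
    """Return the required keys that are absent or empty in config."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


def check_all(environments: dict) -> dict:
    """Map each environment name to its missing keys, omitting clean ones."""
    problems = {}
    for name, config in environments.items():
        missing = missing_settings(config)
        if missing:
            problems[name] = missing
    return problems


environments = {
    "staging": {"db_host": "staging-db.internal", "db_user": "app", "db_password": "s3cret"},
    "production": {"db_host": "prod-db.internal", "db_user": "app", "db_password": ""},
}

print(check_all(environments))  # {'production': ['db_password']}
```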

~~~
justinlaster
> things can pass testing but fail in production if the staging environment
> provides an imperfect simulation of the production environment

Your staging environment should match production, or it's not really staging
at that point. It doesn't have to match it in _size_, just structure and
process. Ignoring data loss, if you can't quickly switch staging to production
it's not really staging. It's just a dorky test environment masquerading as a
stage environment. It's also surprisingly not that difficult (the variation of
difficulty depends on the type of data you're interacting with, and how
isolated it needs to be) to "forward" a slice of real-world traffic to your
staging environment and monitor it for some duration of time.
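
One way the "slice" could be chosen, as a rough sketch: hash a per-request identifier and mirror only requests that fall into a small bucket range. The function name, the request ID format, and the 5% figure are assumptions for illustration, and the actual forwarding (proxy or load balancer configuration) is not shown.

```python
import hashlib


def mirror_to_staging(request_id: str, percent: float) -> bool:
    """Deterministically pick roughly `percent` of requests by hashing the ID."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # bucket in 0..9999
    return bucket < percent * 100


# Hypothetical request IDs, only to show the approximate share selected.
sample = [f"req-{i}" for i in range(10_000)]
mirrored = sum(mirror_to_staging(r, 5.0) for r in sample)
print(f"{mirrored} of {len(sample)} requests would be mirrored")
```

Hashing rather than random sampling means the same request ID always gets the same decision, which makes a mirrored request easier to trace afterwards.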

> For example, your staging environment servers should be connecting to a
> different database with a different password.

Handled by proper CI/CD pipelines. It's completely irrelevant to deploying new
features; configuration for production-specific users/passwords happens on the
sysadmin/devops side of things.

------
richardboegli
It was back 5 minutes ago:
[https://twitter.com/gitlabstatus/status/829685770084548608](https://twitter.com/gitlabstatus/status/829685770084548608)

------
jameskegel
I'm rapidly starting to question my use of mid-tier web services. Who else is
operating like this? CI/CD, Staging, downtime playbooks, backup playbooks, all
of this or any combination of it would have been a good idea. Folks, I just
want to work without my tools failing so that I can go home and think about
something else.

~~~
shakna
GitLab.com is a testing platform for their Enterprise version.

If you want to guarantee reliability you need to pay for hosting or self-host.

Otherwise, there are quite a few competitors in this market with 99.99%+
guarantees.

~~~
scaryclam
If this is even remotely true, they need to put that on the front page, in
really big letters.

~~~
tdkl
The letters are F, R, E and another E.

~~~
scaryclam
It's nice that you can spell and all, but bitbucket is also free and we don't
see this type of issue there. Or github, or gmail, or google analytics,
or...well, you get the point I hope. Free is not a synonym for unreliable, so
if a company wants me to sign up to their product and _doesn't_ tell me it's
unstable, I'm not going to be terribly impressed if it fails. A clear notice
on the homepage would sort this out. Heck, they can even link over to the
enterprise edition for anyone who doesn't want to take the risk.

~~~
shedam
Please search for "github outage" on the HN search engine, for example... or
"bitbucket outage"... paid is not a synonym for reliable either...

------
overcast
Gitlab is praised for being transparent in everything they do, so where is the
backup infrastructure policy that they should now have in place? I'd like to
see that situation proven resolved before we discuss rewriting their front end
with Vue.js and any other new deployments.

~~~
YorickPeterse
It's still being written:
[https://gitlab.com/gitlab-com/www-gitlab-com/merge_requests/...](https://gitlab.com/gitlab-com/www-gitlab-com/merge_requests/4779).
In the meantime, quite a bit of work is underway; see the issues mentioned in
[https://gitlab.com/gitlab-com/www-gitlab-com/issues/1108](https://gitlab.com/gitlab-com/www-gitlab-com/issues/1108)
for more info.

------
shakna
Status: Investigating

[https://twitter.com/gitlabstatus](https://twitter.com/gitlabstatus)

~~~
shakna
Should be all good again.

Updates:

Our Redis cluster is currently experiencing a split brain, we are looking into
the problem

Split brain is fixed for Redis cluster we are currently investigating the
cause

We're performing a hard restart of our Unicorns, this may lead to an increase
in HTTP 500 errors

Deployment finished and GitLab.com is available again. Apologies for the bumpy
ride.

------
chernoby
Unsuccessful deployment, I guess.

We will be deploying 8.17.0 EE RC1 to [http://GitLab.com](http://GitLab.com)
shortly, no downtime is expected

------
Kiro
> Our Redis cluster is currently experiencing a split brain, we are looking
> into the problem

------
koolba
After the backup catastrophe postmortem, have they published anything about
what they're doing differently now?

i.e. here's how we're handling Postgres WAL archiving and logical backups,
Redis RDB/AOF backups, etc?

~~~
mrwilhelm
[https://gitlab.com/gitlab-com/www-gitlab-com/merge_requests/...](https://gitlab.com/gitlab-com/www-gitlab-com/merge_requests/4779/diffs)

------
thatonecoderguy
I understand gitlab to be a test bed. But at least this time they didn't
delete the wrong directory. I know, I'm late to the party since service has
been restored.

------
foobazzy
status.gitlab.com is also not loading. Gitlab pages hasn't been working
properly for me (redirects to a 404) [1]

[1]: [https://gitlab.com/gitlab-org/gitlab-pages/issues/43](https://gitlab.com/gitlab-org/gitlab-pages/issues/43)

~~~
mdekkers
Running your status page from the same infrastructure is a really bad idea.

~~~
tvmalsv
Doesn't look like they are, to me. Their main domain's IP address is owned by
Microsoft (so, using Azure?), but the status page IP is in a block owned by
Digital Ocean.

Edit: I could have sworn I refreshed the page before replying to make sure
someone else hadn't already responded, and I didn't see your comment
jschulenklopper. Scary how similar they are lol.

------
LeonidBugaev
Hey, where is the live video of this incident? :sarcasm:

------
amingilani
Me too. Came here to find this, wasn't disappointed.

------
philtar
Welp. There goes gitlab. It was nice while it lasted.

