Gitlab is down (gitlab.com)
36 points by aao 1 hour ago | 19 comments





I get the impression that Gitlab focuses a little too much on shipping new features at a rapid pace with every release. Maybe they should spend more time on running their infrastructure reliably and getting their engineering practices up to speed. The recent events will raise a red flag with enterprises that might be potential Gitlab customers, and that directly affects Gitlab's bottom line.



Unfortunately, events like this are one of the reasons I never was able to go all in on Gitlab. When I first started trying it out the performance was not great (it is pretty good these days though) and they just seemed to have more of these smallish events. I realize Github also has problems but it feels like they are much less frequent. I don't have any data to back that up though.

Either way, I went self-hosted recently, so now I only worry about my server haha.



It came back 5 mins ago: https://twitter.com/gitlabstatus/status/829685770084548608



Question: with a strict CI/CD in place, as well as a staging server, how can these problems be so common for Gitlab?

Isn't this exactly what CI is supposed to prevent?

Not blaming Gitlab for bad practices or anything, I'm just curious.



> Not blaming Gitlab for bad practices or anything, I'm just curious.

On the contrary, the backup snafu was caused by a series of bad practices. If that's how backups are handled I wouldn't be surprised if the rest of the testing infrastructure has issues as well. Heck, I'd be surprised if it didn't!

Particularly because a solid testing infrastructure works in tandem with your backup processes by restoring recent backups.

Nothing tests new code better than running it on a production restore and nothing validates backups better than using them on a regular basis for testing.
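To make the "restore it to validate it" idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for a real production database (an assumption for the sake of a runnable example; it is not how Gitlab does anything), and the `projects` table and function names are made up for illustration. The point is that a backup only counts once it has been opened as a fresh database and queried.

```python
import sqlite3
import tempfile
from pathlib import Path


def make_backup(source: sqlite3.Connection, backup_path: Path) -> None:
    """Write a full copy of the source database to backup_path."""
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)  # sqlite3's online backup API (Python 3.7+)
    finally:
        target.close()


def verify_backup(backup_path: Path, expected_rows: int) -> bool:
    """Open the backup as a fresh database and check that it is usable."""
    restored = sqlite3.connect(backup_path)
    try:
        # PRAGMA integrity_check yields a single row ('ok',) on success.
        (status,) = restored.execute("PRAGMA integrity_check").fetchone()
        # 'projects' is a hypothetical table used only for this sketch.
        (rows,) = restored.execute("SELECT COUNT(*) FROM projects").fetchone()
        return status == "ok" and rows == expected_rows
    except sqlite3.DatabaseError:
        # Covers corrupt files as well as backups that lack the expected table.
        return False
    finally:
        restored.close()


def demo() -> bool:
    """Create a small 'live' database, back it up, and verify the copy."""
    live = sqlite3.connect(":memory:")
    live.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT)")
    live.executemany(
        "INSERT INTO projects (name) VALUES (?)", [("a",), ("b",), ("c",)]
    )
    live.commit()
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "backup.db"
        make_backup(live, path)
        return verify_backup(path, expected_rows=3)
```

An empty or truncated backup file fails `verify_backup` instead of sitting unnoticed until the day it is needed, which is the whole argument for running restores routinely.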



Even with a staging server, things can pass testing but fail in production if the staging environment provides an imperfect simulation of the production environment - and that's almost inevitable.

For example, your staging environment servers should be connecting to a different database with a different password. If the password's right in the staging config but wrong (or missing) in the production config, things that work in staging can fail in production.
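One cheap guard against the missing-password case is to validate every environment's config in the same check, before anything is deployed. A rough sketch, assuming configs are plain dictionaries; the key names and hostnames here are hypothetical:

```python
# Hypothetical key names; substitute whatever your application actually reads.
REQUIRED_KEYS = ("DB_HOST", "DB_NAME", "DB_PASSWORD")


def missing_settings(config: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or blank in this config."""
    return [key for key in REQUIRED_KEYS if not config.get(key, "").strip()]


def check_all(environments: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Map each environment with problems to its list of missing keys."""
    problems = {}
    for name, config in environments.items():
        missing = missing_settings(config)
        if missing:
            problems[name] = missing
    return problems


# Made-up example values: staging is complete, production lacks the password.
staging = {"DB_HOST": "db.staging.internal", "DB_NAME": "app", "DB_PASSWORD": "s3cret"}
production = {"DB_HOST": "db.prod.internal", "DB_NAME": "app"}

problems = check_all({"staging": staging, "production": production})
# problems == {"production": ["DB_PASSWORD"]}
```

This only catches a missing or blank value. A password that is present but wrong still needs an actual connection attempt against the production database at deploy time.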



> Isn't this exactly what CI is supposed to prevent?

CI is only a facilitator. If their test coverage or test quality isn't as good as it could be, it won't make much difference. Also, if the outage is due to load, I'm not sure how much load testing they would do as part of CI. Having CI and writing automated tests is something everyone seems to agree is a good idea in theory, but in my experience hardly anyone does it well because writing features always trumps writing tests. I am not talking about Gitlab specifically, I know absolutely nothing about their setup, only in general.

True story: I am involved with a startup that offers cloud-based storage/reporting of test results (https://www.tesults.com), and my colleague emailed the CTO of Gitlab just yesterday to offer a promotion on a plan. Very odd indeed to see this story on HN the next day!



I'm rapidly starting to question my use of mid-tier web services. Who else is operating like this? CI/CD, Staging, downtime playbooks, backup playbooks, all of this or any combination of it would have been a good idea. Folks, I just want to work without my tools failing so that I can go home and think about something else.



GitLab.com is a testing platform for their Enterprise version.

If you want to guarantee reliability you need to pay for hosting or self-host.

Otherwise, there are quite a few competitors in this market with 99.99%+ guarantees.
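For a sense of scale, the arithmetic behind a "99.99%" figure is short. A small sketch (the function name is mine, and how any given provider actually measures availability is set by its own SLA terms):

```python
def downtime_budget_minutes(availability_percent: float, period_days: float = 365.0) -> float:
    """Minutes of downtime allowed in the period at the given availability."""
    minutes_in_period = period_days * 24 * 60
    return minutes_in_period * (1 - availability_percent / 100)


yearly = downtime_budget_minutes(99.99)                     # about 52.6 minutes per 365-day year
monthly = downtime_budget_minutes(99.99, period_days=30)    # about 4.3 minutes per 30-day month
three_nines = downtime_budget_minutes(99.9)                 # about 525.6 minutes, i.e. under 9 hours
```

So a 99.99% guarantee leaves under an hour of downtime for a whole year.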



Status: Investigating

https://twitter.com/gitlabstatus



Should be all good again.

Updates:

Our Redis cluster is currently experiencing a split brain, we are looking into the problem

Split brain is fixed for Redis cluster we are currently investigating the cause

We're performing a hard restart of our Unicorns, this may lead to an increase in HTTP 500 errors

Deployment finished and GitLab.com is available again. Apologies for the bumpy ride.



Unsuccessful deployment, I guess.

We will be deploying 8.17.0 EE RC1 to http://GitLab.com shortly, no downtime is expected



After the backup catastrophe postmortem, have they published anything about what they're doing differently now?

For example, how they're handling Postgres WAL archiving and logical backups, Redis RDB/AOF backups, etc.?



Not yet.



> Our Redis cluster is currently experiencing a split brain, we are looking into the problem
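For anyone unfamiliar with the term: a split brain is when a network partition leaves two groups of nodes each believing it is in charge. The usual defence is a majority rule, where a side may only promote a new primary if it can reach more than half of the cluster. A generic sketch of that rule follows; it is not Redis's or Gitlab's actual implementation, and the function names are made up:

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """True if this side of a partition holds a strict majority of the cluster."""
    return reachable_nodes > cluster_size // 2


def sides_allowed_to_elect(partition_sizes: list[int]) -> list[int]:
    """Given the size of each partition, return the sizes that may elect a primary."""
    cluster_size = sum(partition_sizes)
    return [size for size in partition_sizes if has_quorum(size, cluster_size)]


five_node_split = sides_allowed_to_elect([3, 2])   # only the 3-node side qualifies
even_split = sides_allowed_to_elect([2, 2])        # neither side has a majority
```

Because at most one side can ever hold a strict majority, two primaries cannot be elected at once. The cost shows up in the even split: no side qualifies, so the cluster stops accepting a new primary until the partition heals, which is why odd cluster sizes are usually preferred.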



status.gitlab.com is also not loading. Gitlab Pages hasn't been working properly for me (it redirects to a 404) [1].

[1]: https://gitlab.com/gitlab-org/gitlab-pages/issues/43



Running your status page on the same infrastructure as the service it reports on is a really bad idea.



Me too. Came here to find this, wasn't disappointed.



Welp. There goes gitlab. It was nice while it lasted.




