Git is distributed but all the things around it that we really need aren't
You can set up git to push to multiple remotes automatically
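A minimal sketch of one way to do this, using `git remote set-url --add --push` (the remote URLs below are placeholders, and the demo runs in a throwaway repo). One gotcha: the first `--add --push` replaces the implicit push URL, so you have to re-list the original push destination explicitly too.

```shell
set -e
# Demo in a throwaway repo; URLs are placeholders, not real mirrors.
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git remote add origin git@github.com:example/repo.git
# The first `set-url --add --push` replaces the implicit push URL,
# so add the original push target back explicitly as well:
git remote set-url --add --push origin git@github.com:example/repo.git
git remote set-url --add --push origin git@gitlab.com:example/repo.git
# Both URLs should now be listed as "(push)" destinations,
# and a single `git push` will update both mirrors.
git remote -v
```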
Nobody is actually using git in distributed mode
Did I forget anything?
The importer can still get better but we're importing about 500 projects per hour at peak https://www.dropbox.com/s/hr3ndcmu21aehmk/Screenshot%202019-...
I'm generally very pleased with GitLab. I'm delighted that it's open source.
Please let me know if this issue addresses the problem you encountered, would love to make sure we're prioritizing it. https://gitlab.com/gitlab-org/gitlab-ce/issues/20780
No idea when these are going to be deployed. 12.2, 12.3, 12.5... But I'll keep checking in and trying the import every few weeks.
I've opened an issue with some improvement suggestions, like surfacing when we are rate limited and allowing gradual access to the data while the import is going on: https://gitlab.com/gitlab-org/gitlab-ce/issues/66525
See also https://gogs.io/
Can anyone recommend an issue tracker that is distributed with the repo?
$ git issue add "foo doesn't work"
Created issue ece5591: foo doesn't work
$ git issue comment ece5591 "the problem might be with bar"
Created comment 0191fa1 on issue #10.
$ git issue comment --reply 0191fa1 "or with baz"
Created comment f98e783 on issue #10.
$ git issue show ece5591
foo doesn't work -- ece5591 Your Name <your@email>
the problem might be with bar -- 0191fa1 Your Name <your@email>
or with baz -- f98e783 Your Name <your@email>
$ git issue push
Pushed issues ece5591.
$ git issue pull
Pulled issues 3ebdc7e, 24cdb90.
If the issues are just another source artifact alongside the code in the same git history, a lot of that information flows the other direction. Instead of a magic command in a commit note closing an issue, the issue closing shows up in the diff of the commit itself. Figuring out which issues are still open on a given branch (what isn't in our release branch yet?) is a simple command on that branch. Figuring out which issues changed between branches (what's finished in this feature branch that hasn't been merged to release yet?) is an ordinary diff operation.
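Those branch queries can be sketched with plain git, assuming a hypothetical convention (not a real tool): one file per issue under issues/, each carrying a "status: open" or "status: closed" line. The demo builds a throwaway repo to show both queries.

```shell
set -e
# Throwaway demo repo with a release branch and a feature branch.
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email you@example.com && git config user.name You
git checkout -qb release
mkdir issues
printf 'status: open\nfoo does not work\n' > issues/1.md
git add . && git commit -qm "open issue 1"
# On the feature branch, the fix and the issue closing land together.
git checkout -qb feature
printf 'status: closed\nfoo does not work\n' > issues/1.md
git commit -qam "fix foo; close issue 1 in the same commit"
# Which issues are still open on release?
git grep -l 'status: open' release -- issues/
# What got closed on feature but hasn't been merged to release yet?
git diff release...feature -- issues/ | grep '^+status: closed'
```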
In terms of documenting the state and progress of an individual branch in a complex project, there can be a lot of magic in having issues tracked directly inside code branches. It's a fascinating ideal for code documentation to have issue comments, changes, and workflows side by side with the code that changed with them.
That said, issue trackers are sadly not ideal code artifacts for a lot of reasons, including that it is easy to forget that issues aren't just for coders, but also stakeholders/product owners/PMs/etc.
My personal solution is similar to sibling comments, but more structured - using vimwiki in a subdirectory inside the repo, and committing updates alongside the relevant code changes.
And the meta-meta follow on.
Linux repos and ISO download sites have TONS of mirrors, sometimes contributed by companies, but also a lot of universities, government and public institutions.
Why is one of the largest core open source repositories centralized and private? Why are we so dependent on AWS, Google and others for large chunks of Internet services to be accessible?
There is a deeper question here than just the outages: why do they affect so many people?
Centralized infrastructure tends to be less complex and therefore cheaper. Economies of scale drive people into using a centralized monolith and then network effects do the rest.
If you want a decentralized system to sustainably compete with a centralized one, the decentralized system must provide tangible and obvious surplus value compared to the centralized one; otherwise it is not economical.
In reality that's often not the case: usability is often worse, or the infrastructure favors centralization (e.g. asymmetric broadband internet connections).
Just speculating, but I think distributed systems are inherently more complex. Serving multiple mirrors is easy; hosting across a set of providers and keeping things in sync gets harder.
It's a problem, and besides building completely open source systems that are designed to run as distributed instances, I think it's a pretty hard thing to fix.
> There is a deeper question here than just the outages: why do they affect so many people?
The answer to this seems so obvious to me that perhaps I'm missing something: outside the centralized services, the solutions are fragmented (i.e., the decentralized solutions are used by a lot of people in aggregate, but no single one of them is large, by definition).
I use my own Linux virtual server (as it happens, not in AWS although I have no anti-AWS beef) to host my private websites and my git repos. There are a million services that do this kind of hosting and people use all kinds of solutions like this. You will never hear about my private git repo being down because only I care when that happens.
Last night I wrote a little spreadsheet for myself showing how much time I spend doing various things. In particular, I for one do NOT need to read HN more often
I expect similarly poor behavior around testing GitHub Actions, because it's so convenient to only run workflows through the service.
It is important, though, that it's possible to run without the CI/CD. If that weren't available I would complain because of the risk of lock-in.
A Radicle project contains a git repository, plus the associated issues and proposals.
^ Neat project
Individuals and teams who already have what they need in their local repos can continue to work through an outage of the VCS part of GitHub, but at the points where they need to collaborate (merging each other's changes, issue tracking, etc.) the workflows break down. Yes, you could share changes in a more distributed manner, or work around the outage in other ways, but in reality people will stop and wait for the central repo to be available again. Also, pulling changes from my repo to yours directly, to route around the downed service, doesn't solve the issue tracker or CI manager also being down. The clue is in the name: git HUB.
That said, the issue is similar to how people perceive aviation accidents: a jumbo going down takes out a lot of people, but statistically, across all air journeys, that's far fewer deaths than the equivalent car journeys would cause. I suspect that everyone being a bit inconvenienced for a while when GitHub/GitLab/some other centralised service has issues like today's doesn't add up to as much as the many small individual inconveniences people would experience over time with their own local instances of, say, GitLab. That's especially true once you include the routine maintenance time, which largely doesn't exist when using GitHub, or managed GitLab instead of self-hosted. The outage just feels bigger than all the little problems because everyone is affected at once.
So the UX point is moot. From there, it is just a matter of how much you want to pay for availability (multi-zone replication, etc.).
All kinds of stuff broke, including "the entire internet" for people who were made to use 8.8.8.8 by their ISPs. E-shops that load jQuery from Google stopped working for no real reason whatsoever, people got lost because they couldn't navigate in the real world anymore, etc.
Since then I try to flag all this repetitive nonsense about Slack/GitHub going down for a few minutes, to no avail. It keeps coming back to the first page of HN. :D
> didn't make any news here either
Denied by the government, by any chance? That would explain why it didn't hit the news.
Also what was fun was everyone being blocked from all services that use reCAPTCHA.
When was the last time ALL freenode servers were offline? I know about occasional netsplits but those do not affect all of their servers and freenode itself is still operational.
These modern cloud services built on the Linux ecosystem are doing rolling outages instead. I'm amazed people keep making them dependencies. Maybe their local systems, built on similar technology, went down more often; maybe this is an improvement. I'd still put mine on an OpenBSD or OpenVMS cluster if it were business-critical with lots of money on the line. I want it staying up, up, up. :)
VMS clusters were (and modern equivalents are) awesome, but they have a different speed of innovation / development.
It is a trade off - perfect uptime, but that new feature could take a year, or 99.9% uptime, but it can be written and deployed tomorrow.
Generally, on top of code reviews, one can get a high return with minimal labor and hardware from a combination of Design-by-Contract, contract/spec/property-based testing, low-false-positive static analysis, and fuzzing with the contracts left in as runtime checks. That's my default recommendation.
In Github's case, they also might have access to both closed and open tools that Microsoft Research makes. MSR is awesome if you've never checked them out. Two examples applicable to system reliability and security:
Plus some of their other tools in various states of usability:
The Github monopoly is driven by its users.
HOW COULD YOU FIND NOTHING?! Not even the file I copied it from? A simple grep search would've had no issue and would've been faster!
- Gitlab (sorta OSS, managed, self-hosting)
- Gitea (OSS, self-hosting)
- Gogs (OSS, self-hosting)
- Bitbucket (managed)
- Azure Repos (managed)
- Google Cloud Repositories (managed)
To be honest, GitLab has come a long way. And things like search, which people point out, don't work well on GitHub either: its code search is super broken and often skips results or finds nothing where a git grep would drown you in results.
Uh, I've been using this for quite some time in GitLab and never found it lacking. Curious: what do you find missing there?
Problem 2 with Atlassian is that their portfolio is entirely made up of products they bought up over the years, with wildly different coding and UI/UX standards and histories.