This is one of my nightmares... I have 800+ repos on GitHub and I dread the thought of some weird machine learning algorithm somewhere deciding that my account should be banned.
A lot of the stuff I care about on there is backed up elsewhere, but not everything. I even have a tool for this: https://datasette.io/tools/github-to-sqlite - which exports issues, issue comments etc to a SQLite database via the GitHub API.
I do at least trust GitHub not to delete everything, but to instead put my account in some kind of soft-deleted state - and I'm reasonably confident I could get it reinstated via my network. But still, scary.
Why do I have everything on GitHub? Because I genuinely do trust them - they have a 15+ year track record of NOT breaking, and I know that my repos are backed up to three different continents automatically.
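For anyone curious what "issues in a SQLite database" can look like, here is a minimal sketch using only Python's standard library. It is not how github-to-sqlite itself works; the table layout and the `save_issues` name are assumptions made up for illustration, and the input is assumed to be a list of issue dicts with the `number`, `title`, `state` and `body` fields the GitHub REST API returns.

```python
import json
import sqlite3


def save_issues(conn, repo, issues):
    """Store issue dicts in a local SQLite table (hypothetical schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS issues (
            repo TEXT NOT NULL,
            number INTEGER NOT NULL,
            title TEXT,
            state TEXT,
            body TEXT,
            raw TEXT,
            PRIMARY KEY (repo, number)
        )
        """
    )
    rows = [
        (repo, i["number"], i.get("title"), i.get("state"), i.get("body"), json.dumps(i))
        for i in issues
    ]
    # INSERT OR REPLACE makes re-runs idempotent: a newer export overwrites older rows.
    conn.executemany("INSERT OR REPLACE INTO issues VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Keeping the full JSON in the `raw` column means fields you did not think to extract today are still there later.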
I'm pretty sure Microsoft getting acquired is a considerably lower probability event than the other risks you're mitigating. If not, you might want to assess your risk tolerance.
Acquisition was just an example, and true, it's not likely with Microsoft. More likely glitches with Microsoft are that it sells Github, or there's a change in management priorities or personnel.
I was really addressing the larger case of depending too much on any company for this sort of thing, not Github specifically.
It's a tradeoff between concentrating your risks in a very low-risk-probability monoculture or distributing your risks across a large statistical sample of (generally) much higher risk individual providers (smaller companies shut down or get acquired at much higher rates, and are frequently still high value targets for attackers even though they have much smaller budgets to spend on security and defense).
Distributing your risk among a large number of vendors is not necessarily better, and even if it's "just backups" can still multiply your risk of ideally non-public information getting disclosed.
Denial of service risks exist but are rare. The myriad "Stripe is evil" complaints, for example, tend to trace back to "the account in question was doing something a rational unaffiliated observer would predict as a likely account cancellation driver for Stripe." We don't know the circumstances here. If the circumstances were benign, it will likely be corrected; if not, we may never know the details.
EDIT ADDED (since HN is rate limiting this): self-hosting doesn't eliminate risks, it just changes them. You still face denial of service risks from loss of power, fire, natural disaster, theft, component failures, etc. If there were a zero risk path, everyone would take it.
We tend to prefer risks we feel we have "control over" (e.g. self-hosting), but at the incident rates of most other corporate hosting risks you're into the territory of "an earthquake took out both your house and your rented storage locker."
A third way is to run essential services yourself rather than relying on other companies. This is my preference.
EDIT to reply: "self-hosting doesn't eliminate risks, it just changes them."
Absolutely true, but from my observations over the past couple of decades, the overall risks are lower with self-hosting than with having someone else do it. The tradeoff is that self-hosting is more work. What's right for you depends on your particular needs.
This is why I never use any OAuth/SAML authentication from places like Google, Microsoft, or Facebook when registering with a site.
Always use email address and password (with 2FA where possible) to ensure you have ultimate control over your base account.
If Google chose to just ban my account, the downstream ramifications of me losing access to linked sites that use that account for auth would be very damaging.
This assumes you're using a consumer Gmail account, I guess? If so, yeah, you're SOL. But I think as long as you have your own domain you should be fine, even if you hook it up to Google Workspace for email, since you can always switch it to another provider. (And yeah, I realize this is not something civilians are likely to be able to figure out / do...)
Not necessarily. Speaking for myself, I like the convenience of SSO, and I generally trust Google to keep my account secured more than a random web site (if they even offer OTP!). As long as I know there's an escape hatch.
Same here. I have my own domain set as the main email address of my personal Google account (I have in fact deleted the Gmail one altogether, as I noticed that otherwise some websites would still pick the Gmail address over the custom domain one when using SSO). Not many people seem to know you can do this without Google Workspace, or that you can have a Google account without a Gmail address.
If I ever lose access to my Google account I can just set a password through the "forgot my password" feature (I've noticed this is often also the only way to set a password if you create an account through SSO with some websites, but it does work). The convenience of SSO with an escape hatch as you say.
Here's a term you need to start saying to management of your company: "software supply chain risk".
github just became a business risk and a software supply chain risk.
I recently worked for a very large company that was 100% all in on github.
And now it's clear that in an instant a single developer, or perhaps the entire organisation, can be finished as far as github goes.
Any CTO would now be negligent to have an organisational dependency/risk on github.
The future has to be companies getting out of the cloud, getting out of vendors who can inflict major business damage in an instant with no recourse on the basis of entirely unknowable decisions.
Okay but usually businesses deal with this not by having no dependencies and doing everything themselves. Instead they establish business relationships that ensure they do have a recourse, such as through contractual obligations.
Likewise, if the concern is that Github automatically bans you without a recourse, then my preferred solution is that they do introduce recourse. This could even be a paid option: I think it's entirely reasonable for them to ask an administration fee for the appeal process, just to cover the costs of having a human look at what's going on and providing a good explanation.
Judging by some of the reactions on this thread, GitHub would do well to make an official post explaining the situation, as well as the processes and mitigations in place for users affected by such suspensions. Regardless of the technicalities behind making backups or the decentralized nature of git, GitHub is a trusted service and needs to maintain that trust.
The tech industry has a very strong belief that if they have to explain anything they do with rule enforcement, it will permit bad people to manipulate the process to avoid being punished for being bad people (as defined by the tech industry). The solution is to put everyone in constant fear of somehow angering something in a tech company (human or algorithm) so they'll reduce themselves down to a very narrow set of behaviors that probably won't (but aren't guaranteed not to) upset the enforcers (human or algorithm) of secret rules. Don't like it? Go build your own tech industry.
We'll hopefully have a real alternative to these centralized platforms when Radicle[0] gets closer to core feature parity with GitHub. For many, this will be sometime this year, for others it might take another year or two.
It's really scary to have a single platform be able to take away what is essentially your portfolio, resume, social profile, webhost (github pages) and collaboration platform all in one go, with no explanation or recourse available. It shouldn't be that way.
We're inching towards a Radicle v1.0 release that will provide some stability and a basic feature-set for hosting and collaboration in a fully "sovereign" way. If you're interested in helping out or learning more, feel free to drop me an email: cloudhead@radicle.xyz or come by our Zulip[1].
That - you can't ban an account without providing clear, direct and irrefutable evidence of something of a serious nature, such as pornography, hate speech or terrorism.
This should be applicable to every online service provider.
The automatization of moderation is legitimately one of the scariest problems with the modern internet, but because its abuse and mistakes inherently affect a tiny minority, nobody cares.
And obviously companies will fall back to automated tools rather than pay for human reviewers.
I don't really know how this should be dealt with.
...or maybe it shouldn't.
And we should just accept that a random minority will have their livelihoods destroyed.
After all, it wouldn't be the first time society disregarded injustice to the few for the convenience of the many.
There have been at least a few truly decentralized efforts along the lines of GitHub. There is almost no interest. I also feel that just going with one or two other companies is better than exactly one monopoly but we can do much better.
For some reason if I say p2p on HN it's not appreciated.
The issue is more that p2p is always touted as the solution and yet there have been almost zero actually useful solutions developed. There are some fundamental issues preventing decentralized networks from being a generic solution, and until that's sorted out, it doesn't seem to add to a conversation to bring up p2p as a solution.
Centralization is just too convenient for most people. Most people - technical or otherwise - won't accept the slightest quality-of-life (edit: and network size/activity) hit in exchange for the greater robustness of decentralized options, regardless of the application.
Note how the vast majority of people who use or interact with cryptocurrencies converge around a few centralized options. Even a ton of the technical people who develop cryptocurrency-related systems do.
The quality of life on the Fediverse is pretty good, in fact it's better than on Twitter because the default app has a lot of features from the long defunct custom clients. What people won't accept is a social network their friends aren't using.
I was thinking about this recently. What does "truly decentralized" Github look like? In particular, I'm curious about decentralized code review and so on. Probably not technically difficult or mind-blowing, but I'm curious nevertheless.
> But if you get banned from Github, can't you just pick different remote and keep rolling?
If you have a local copy. I imagine that if you have 800+ repositories on Github, there'd be at least a few you probably didn't have installed anymore, because you knew you could always get them back from Github.
PRs are useful, and so is the discussion system. These things are on par with other platforms that are pretty much equally barrier-free.
Any of the workflow automation stuff might be more difficult to replace, but I haven't had an especially good experience with it anyway. I don't know about everyone else, but it's been pretty unreliable and fragile. Even the Azure integration, which ought to be first-class, is pretty opaque. I really wanted to lean into that stuff, but for now, I'm happy running my own.
Git is already decentralized. So no, we don't need a fediverse equivalent. Crazy idea, but what if you hosted your code on your infrastructure, like the "old days"? Or, move to an independent forge, or stay where you are. The concern isn't about GitHub directly, but that everyone storing all their source code in one place is concerning.
Git is flexible. I'm kind of surprised at how ubiquitous GitHub is given how easy it is to host your own repositories and set up a mailing list to accept patches and discussion on.
Self-hosting is painful and not a real solution for most people. Mailing lists also have terrible UX. I'd rather risk Github nuking all of my 50 or so repos than work over a mailing list.
As much as I love doing things myself, the reality is that I spend wayyyy more time doing things myself than I save, and time is my main constraint. Getting booted off of GitHub would be a pretty minor inconvenience compared to the time sink of doing it myself, especially if I botch something and lose data.
> Git is flexible. I'm kind of surprised at how ubiquitous GitHub is given how easy it is to host your own repositories and set up a mailing list to accept patches and discussion on.
The UX you're proposing is far worse than what GitHub provides for less effort, and anyway GitHub isn't just "repo hosting + code review", you also get web hooks, CI, user/org management, code search (such as it is), a web interface (as well as mobile app), etc. And you don't have to teach people how to use it effectively.
I can't tell if you're being ironic, but Git is already federated. Like, GitHub should be just another remote, and which remote happens to be the source of truth is a policy decision. The sticky parts are the value-add, non-Git parts like GitHub Actions, the issue tracker, wiki, static hosting, access management etc. But most of those are orthogonal and federating them (perhaps as git repos themselves) seems like a good idea.
(disclaimer: work for MS, not on GitHub, also hi again superkuh)
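To make the "just another remote" point concrete, here is a rough sketch of the commands that would make one `git push` go to two hosts. The helper name `dual_push_commands` and the URLs are hypothetical; `git remote set-url --add --push` is the real git subcommand. The primary URL is listed explicitly because once any push URL is configured, git stops pushing to the fetch URL by default.

```python
def dual_push_commands(primary_url, mirror_url, remote="origin"):
    """Build git commands so that pushing to `remote` updates both hosts."""
    return [
        # First push URL: replaces the implicit "push to the fetch URL" behaviour.
        ["git", "remote", "set-url", "--add", "--push", remote, primary_url],
        # Second push URL: every later `git push` now goes to both.
        ["git", "remote", "set-url", "--add", "--push", remote, mirror_url],
    ]
```

Each list can be handed to `subprocess.run(cmd, check=True)` from inside the repository. Which of the two hosts counts as the source of truth stays a policy decision, as the comment above says.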
Putting the source somewhere else: easy, git push and change the URL in your doc.
Moving all the issues: hopefully someone has already written a script to fetch the issues and convert them.
Moving your CI: just rewrite your GitHub Actions workflows away from those horrible YAML commands. This is just going to be a bit more expensive than the 100% free GitHub plan.
Getting the community of millions of users that can report issues, propose PRs and star your repository. Good luck with that.
The network effect of the community is the barrier to the change. Unless you do proprietary software, then you don't care.
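On the "convert the issues" step, a sketch of what such a script's core might look like, assuming the issues have already been fetched as dicts in the shape the GitHub REST API returns (`number`, `title`, `state`, `user.login`, `html_url`, `body`). The function name and the Markdown layout are made up for illustration; a real migration would also need comments, labels and attachments.

```python
def issue_to_markdown(issue):
    """Render one issue dict as a standalone Markdown document."""
    author = (issue.get("user") or {}).get("login", "unknown")
    # The API returns a null body for issues with no description.
    body = issue.get("body") or "_No description._"
    lines = [
        f"# #{issue['number']}: {issue['title']}",
        "",
        f"- State: {issue.get('state', 'unknown')}",
        f"- Author: {author}",
        f"- Original: {issue.get('html_url', 'n/a')}",
        "",
        body,
    ]
    return "\n".join(lines) + "\n"
```

Plain Markdown files can be committed to the repo itself, so the issue history travels with the code to whatever host comes next.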
> Getting the community of millions of users that can report issues, propose PRs and star your repository. Good luck with that.
Oh my stars! The network effect of "the community" is the worst part of Github. Your life and your project will be much better without whiny entitled people spamming the same issues expecting free support, and endless "i got an error" bug reports.
I really wish gitea offered a paid plan for individual users so that there was a good alternative to Github available for personal private git hosting without having to self host.
This was a good reminder for me to request an export of my Github account.
I have no intention of leaving and don't expect any issues myself, but it made me realize I have a lot of stuff there that I don't have local backups of anymore.
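Separately from GitHub's own account export, a local backup of the git data can be kept with mirror clones. Below is a small sketch that only builds the commands; the `mirror_commands` name, the URLs and the directory layout are assumptions, while `git clone --mirror` is the real git option that copies all refs.

```python
from pathlib import Path


def mirror_commands(clone_urls, backup_dir):
    """Build one `git clone --mirror` command per repository URL."""
    commands = []
    for url in clone_urls:
        # Use the last path segment as the directory name, e.g. "demo.git".
        name = url.rstrip("/").rsplit("/", 1)[-1]
        if not name.endswith(".git"):
            name += ".git"
        commands.append(["git", "clone", "--mirror", url, str(Path(backup_dir) / name)])
    return commands
```

Run each command with `subprocess.run(cmd, check=True)`, and refresh an existing mirror later with `git remote update` inside it. Note that two repos with the same name under different owners would collide in this layout, and that mirrors cover git data only, not issues or wikis.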
The way it got banned represents something that could happen to any one of us - getting seen by an MS employee on the front page of HN is something that can only happen to Chris Wanstrath.
Moving off github is a good idea anyway, since they took away source code search for non-signed-in-users. I experience that annoyance every time I need to find the source of an error string in someone's github project.
>Moving off github is a good idea anyway, since they took away source code search for non-signed-in-users. I experience that annoyance every time I need to find the source of an error string in someone's github project.
It could be done in part for cynical registration metrics reasons, but I'm guessing search is probably costly and this helps reduce automated activity. I also imagine they might face issues with bots searching for certain patterns to identify vulnerabilities and leaked credentials. Why is it unacceptable for you to register an account to make your debugging process easier?
I use several devices, oftentimes don't keep cookies, and now that 2FA has been mandated for all accounts (even ones whose sole value is a ticket to source code search), logging in takes a flow-breaking context switch. In fact, it takes more time to access github search than it does to clone the repo and grep locally. Although I am aware of the passkey workaround, I am not a huge fan of installing browser extensions with access to my credentials, and I don't want to have the equivalent of my_passwords.txt on every single one of my devices.