And, I think it remains the case that the difficult part of decentralizing GitHub is not working out how to share repositories using p2p bandwidth, but instead working out how to use p2p for the social features of the site, like comments and pull requests.
Secure Scuttlebutt has some potential there, but as a gossip protocol it doesn't have a way for people you don't know/follow to e.g. send you a PR.
edit: added gitlab enterprise link (plus formatting)
For the actual code behind a PR, I would just set up a git daemon and reference my own repo. With appropriate CORS, the code can even be cloned in a browser.
Though I just stumbled upon it; I have no idea how it works or what its features and limitations are.
Looks like it's abandoned, but it's a start at using ETH and IPFS.
If you have a tight-knit group of friends, getting everyone to swap over to Mastodon (and slowly bring their friends) isn't that difficult. That's how the social effect works whenever there's a mass exodus of users from Platform A --> Platform B (like what happened with Myspace-->Facebook, Digg-->Reddit, etc.). It isn't impossible - just unlikely.
You can see some statistics here: https://mnm.social/
GitHub is a center for activities: PRs, issues, notifications. It's only a matter of convenience. To be successful you have to create an equally (or more) convenient decentralised alternative.
Do ad hock networks lead to fetlock-in?
A hock is a joint in the lower end of a quadruped's hind leg, between the knee and the fetlock. It's in an analogous position to the ankle in humans.
Of course 'hock' can also mean 'to sell, especially to a pawn broker'.
The word you want is 'hoc', which is Latin for 'this'. You may be familiar with the 'post hoc' fallacy: assuming that because event A was followed by event B, event B must have been caused by event A. The full name of the fallacy is 'post hoc, ergo propter hoc', which means 'after this, therefore because of this'.
The Latin word 'ad' simply means 'to' or 'for'. So 'ad astra' means 'to the stars', and an 'ad hominem' argument literally directs your argument 'to the person' and their faults.
So finally we arrive at the combination 'ad hoc', which means 'to the purpose'. Ad hoc things are done in a purely pragmatic manner, for a particular purpose. They are often short-lived arrangements which end once their purpose has been met or is made moot.
Oh, and you didn't make this mistake, but I'm going to blather about it anyway. 'etc.' is short for 'et cetera', it is not spelled 'ect.', nor 'excetera', nor is the usage 'and etc.' valid. 'et cetera' means literally 'and the rest'--meaning anything, not necessarily just the Professor and Maryanne.
The ad hoc I was referring to was: https://en.wikipedia.org/wiki/Wireless_ad_hoc_network
And I am aware of "etc." ;)
Do you believe that internally GitHub is a single big server? And that this big server only has a single HDD, a single stick of RAM, and a single core? Even then I'm pretty sure we could argue there's some separation there. That doesn't mean the interface isn't centralized.
Same goes for Bitcoin. We can all access the same data, yet it's decentralized.
Git != Github
The problem with Git decentralization is that you need a publicly accessible endpoint (meaning there is an eventual central host). Even when you are using email for patch files, your email becomes the central point of failure (even if you host that yourself). As far as fetching goes, this is why git.kernel.org exists.
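To be fair, the email workflow itself needs no shared git host at all; the mail server is only the transport. A sketch with local paths standing in for the email hop (all names here are made up):

```shell
set -e
work=$(mktemp -d)

# "Upstream" repo controlled by a maintainer.
git init -q "$work/upstream"
git -C "$work/upstream" -c user.email=m@example.com -c user.name=maintainer \
    commit -q --allow-empty -m "initial commit"

# A contributor clones it and makes a change.
git clone -q "$work/upstream" "$work/contrib"
echo "hello" > "$work/contrib/file.txt"
git -C "$work/contrib" add file.txt
git -C "$work/contrib" -c user.email=c@example.com -c user.name=contributor \
    commit -q -m "add file.txt"

# format-patch produces the file that git send-email would mail out...
git -C "$work/contrib" format-patch -1 -o "$work/outbox"

# ...and the maintainer applies it with git am; no shared host involved.
git -C "$work/upstream" -c user.email=m@example.com -c user.name=maintainer \
    am "$work/outbox"/*.patch
git -C "$work/upstream" log --oneline -1
```

The "outbox" directory is where the email transport would sit; everything before and after it is purely local.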
Git behaves like an mp3-sharing website with multiple mirrors, whereas gittorrent behaves like magnet URLs. This protocol effectively removes the need for remotes.
Not sure what you mean by this. The article mentions and compares itself to Github throughout.
> The problem with Git decentralization is that you need a publically accessible endpoint (meaning there is an eventual central host).
I disagree slightly here. It's completely possible to build a network of remotes that don't all reference a single central origin (e.g. teammates referencing each other's local repos over authenticated connections, possibly on a LAN, etc.). This gets messy and is hard to administer securely, but Git is more than capable. Also, the challenges of doing this securely in Gittorrent with private repos seem similar to those of using SSH remotes between individual dev machines.
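For example, here's the origin-less setup sketched with local paths standing in for what would be SSH URLs between teammates' machines (names are invented):

```shell
set -e
work=$(mktemp -d)

# Two peers, each with their own repo; neither is "the" origin.
git init -q "$work/alice"
git -C "$work/alice" -c user.email=alice@example.com -c user.name=alice \
    commit -q --allow-empty -m "alice's work"
git init -q "$work/bob"
git -C "$work/bob" -c user.email=bob@example.com -c user.name=bob \
    commit -q --allow-empty -m "bob's work"

# Each adds the other as a remote; over a LAN these would be
# ssh://alice-laptop/... style URLs instead of local paths.
git -C "$work/alice" remote add bob "$work/bob"
git -C "$work/bob" remote add alice "$work/alice"

git -C "$work/alice" fetch -q bob
git -C "$work/bob" fetch -q alice

# Alice can now merge or cherry-pick Bob's branches directly.
git -C "$work/alice" branch -r
```

Nothing here distinguishes a "server" from a "client"; every repo is both.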
In terms of the article talking about "hubs" (Github specifically), Gittorrent purports to replace the need for Github, but only fills one of the uses of a hub. A hub like Github serves two purposes:
1. a central origin, i.e. the same use git.kernel.org serves
2. a searchable/discoverable network, e.g. the same use the central thepiratebay.org search engine serves for Torrents. Decentralising this seems like a hard problem.
* Issue tracking. It's a bit hard to export these. I imagine it'd be a bit nicer to export if issues were actually stored inside git as trees. You could easily make the issues frontend use git as the storage backend. If users don't have an account, maybe they could issue a pull-request to the issue tracker?
* Notifications. This one's a bit harder. Web-hooks?
* Merge requests - github uses refs internally (I think). That wouldn't be too hard to standardize. If somebody adds commits to their pull/merge request, then they just have to push the updated ref to the repo they're submitting to.
* Auth - this is the hardest part. GitHub provides authentication & permissions for pushing. The issue with allowing merge request submission from anybody (which you could do with a server side hook) is that now people can ddos you by submitting HUGE pull requests constantly. If you could make that safe, in theory you could get away without needing user account federation.
* There is also the big security issue that you can't actually use server side hooks for this stuff since libgit2 can be used to bypass it (when pushing using libgit2, it doesn't trigger server side hooks).
* Oh, and this entire time I've been thinking about the email & username as being unspoofable. But you can easily spoof them. I guess you need federated user accounts after all.
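The "issues as trees" idea above can be sketched with git's plumbing commands; the refs/issues namespace is just a convention I'm making up here:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Store the issue text as a blob, wrap it in a tree, and commit it
# under a dedicated ref namespace, outside the normal branches.
echo "Title: crash on startup" > issue-1.md
blob=$(git hash-object -w issue-1.md)
tree=$(printf '100644 blob %s\tissue-1.md\n' "$blob" | git mktree)
commit=$(git -c user.email=t@example.com -c user.name=tracker \
    commit-tree -m "open issue 1" "$tree")
git update-ref refs/issues/1 "$commit"

# The issue now travels with the repo and can be fetched/pushed
# like any other ref.
git show refs/issues/1:issue-1.md
```

A frontend would just read and write these refs, and issue history comes for free since each update is a new commit.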
Edit: it's untrusted interactions that are hard to decentralize
All in all, I think that git is easily decentralized if you know you can trust all the actors. It's untrusted git that's harder to decentralize.
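The ref-push flow for merge requests mentioned above might look like this; refs/merge-requests/* is a hypothetical convention (GitHub stores PRs under refs/pull/<n>/head in a similar way):

```shell
set -e
work=$(mktemp -d)

# Target project, as a bare repo standing in for a public endpoint.
git init -q --bare "$work/project.git"

# A contributor prepares a change...
git init -q "$work/contrib"
git -C "$work/contrib" -c user.email=c@example.com -c user.name=contrib \
    commit -q --allow-empty -m "proposed change"

# ...and pushes it to a dedicated ref namespace instead of a branch.
git -C "$work/contrib" push -q "$work/project.git" HEAD:refs/merge-requests/1

# Updating the request is just pushing the same ref again
# (a fast-forward here, since the new commit extends the old one).
git -C "$work/contrib" -c user.email=c@example.com -c user.name=contrib \
    commit -q --allow-empty -m "address review"
git -C "$work/contrib" push -q "$work/project.git" HEAD:refs/merge-requests/1

# The maintainer sees the proposal without the contributor needing
# write access to any branch.
git ls-remote "$work/project.git"
```

Standardizing that namespace is the easy part; as noted above, deciding who is allowed to push into it is where the hard auth questions start.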
I don't think "pull requests" are necessary for such a system. Even if you like the pull request workflow (I personally find it tedious and cumbersome), it can be achieved via existing mechanisms like personal email, mailing lists, IRC, pastebin, etc. Trying to define one workflow and baking it into a protocol sounds like a recipe for bloat. Note that git already has built-in support for email.
Regarding authentication and permissions: I don't think git is any different from other mutable collection of files in this regard. For example, I push my git repos to IPFS just like any other files, and use IPNS to point at the latest version. IPNS uses keypairs to limit access, which is a pretty standard technology with known practices for things like revocation, indirection, etc.
Notifications are just another 'mutable collection of files', e.g. an RSS feed, so solvable in the same way.
I wouldn't necessarily expect people to switch from Git to Fossil. I certainly haven't. But I found it fascinating and figured others might as well.
Git by itself doesn't have issue tracking, wikis, etc. Those are part of Github/lab/bucket.
Should the git project seek to add this as part of the repo and protocol itself? Or should we be making new protocols and repositories for this meta-project/code data?
I haven't tried Fossil yet, but I understand it has a lot of this stuff built into its protocol/repository format.
git-ssb (Git over Secure-Scuttlebutt):
Secure Scuttlebutt (https://www.scuttlebutt.nz/)
I was thinking of using high-frequency (inaudible) sounds for peer discovery when in the same room, like Chromecast does when the phone and the Chromecast are not on the same local network. But simply scanning the network seems to work just as well, apparently.
P2P data access isn't about free storage (we're not all Linus ;) http://www.webcitation.org/6P8EBZqQX ): it's about decoupling data access from device access. Web addresses like `github.com/foo` make requests to a specific machine (github.com), in the hope that (a) that machine still exists and (b) it will respond with the data we were after.
Content addressing (bittorrent, ipfs, dat, etc.) lets us request the specific data we're after. If the original host is still serving that data, it can fulfil the request for us; i.e. the existing Web setup will still work. The advantages are those situations where the Web model would break down. For example if the host changes its address without a redirect, if their layout/navigation changes, or if they disappear but someone else just-so-happens to have it (even us!).
Many people focus too much on that last scenario. I think it's nice that with something like IPFS I can have a few flaky boxes serving my files, including a laptop that's often suspended/offline, and I don't need to care about failover, load balancing, etc. Plus I don't need to do anything specific to support offline usage from a local cache: those with the files can just serve themselves.
I'd like to know how that works out too. I suppose in the end it's up to you to make sure you keep a base copy of your project so it is always available, even if it has near-zero popularity.
1) We set up something like the Internet Archive, spidering and caching the network, run using donated bandwidth. The hosting could be centralized like the Archive or p2p itself (donated by peers).
2) A system like FileCoin's where you pay a tiny amount of money to ensure other peers will always host your repos on the network.
3) A straight up centralized, for-profit service where you pay a company a monthly fee to host your data on the network, as people already do with GitHub itself.
It's like paying a centralized service, except you don't have any recourse (legal or financial) if the people hosting it decide to stop.
Which features would we miss?
Maybe we could add those features with something similar to git, some intelligent scripts?
Maybe they could be solved in a federated way like Mastodon?
There are the incidental contributors who do not necessarily have any affinity with git. For example, I've written a plugin for JOSM (OpenStreetMap editor), and a plugin for Inkscape. Both are hosted on GitHub, and any user willing to report a bug can do so there (and they do) without having to learn developer tools.
for me it's the random people browsing github by tags and opening issues on my project.
They already have a competent P2P twitter clone, as an example app. If it can do that it can probably do a github.
Maybe git-donkey or git-gnutella instead?
Overall though I think a federated system that also handles issues is better, something more akin to Fossil.
However I think the main problem is almost always existing users. Git has arguably been falling behind in terms of features for quite a while now, but its momentum makes it hard for a lot of devs (including me!) to switch to a different vcs.
It’s all I’ve ever really used, and I’ve never seriously considered alternatives. What features in other VCSs are interesting?
GNU arch was an early DVCS but didn't seem to gain much traction. It was eventually superseded by Bazaar, which is much slower than git but was better at tracking history, e.g. across renames. I think git has improved in that area, and Bazaar has been getting less popular.
Fossil is built on an sqlite database and has a built-in Web UI and issue tracker, which is fine for those who want that but violates the "do one thing" principle (e.g. if you wanted to use some other issue tracker).
Darcs keeps track of patches and their dependencies, unlike git and mercurial (which track snapshots and their diffs). This approach supposedly makes life easier, and makes the darcs CLI quite usable. One problem is that some of the merge algorithms can take an exponential amount of time to complete; this has been getting addressed in recent years, but many users seem to have jumped ship to git anyway.
Pijul is similar to darcs, but breaks compatibility in order to solve the exponential merge problems. It looks very promising, but is still very new.
All in all, I'd just like a distributed scm that doesn't feel like it was designed for building Linux with 10,000 other people. It is... frustratingly sophisticated at times.
My projects aren't big enough to have their own sites, but I do push my repos to IPFS and give links at http://chriswarbo.net/projects/repos (actually using IPNS, which should reduce regeneration but is still experimental and flaky ;) )
If other people prefer to seek an alternative to GitHub, why is this upsetting to you?
Isn’t it a good thing that people can choose to move to other services?
In a world where companies are expected to have growth every quarter, a project like GitHub is one bad quarter away from Microsoft making major changes, or abandoning it, or shutting it down completely.
As we’ve seen with product after product, when companies are expected to have growth every quarter, they will sometimes (often?) make radical changes to the software. In the case of GitHub, if Microsoft decided it needed far more profit coming in from GitHub, it is entirely reasonable to expect they may decide to shift it into a service which is only usable for their large corporate partners, or they may decide the social features need to go, or they may decide to stop allowing public repos, etc...
Microsoft has shown many, many times in the past that they're more than willing to throw out open standards and attempt to force adoption of their own proprietary standards.
It is entirely reasonable to be at least a tad skeptical given their history and their business model.