

Ask HN: Why is GitHub not considered distributed? - sidcool

I have been using GitHub for the past few months, but I could not connect the dots on the distributed part of it.  The recent GitHub outage raised many questions about it not being distributed.  Someone quoted 'There is no d in GitHub'. Please enlighten me.
======
brudgers

       Github != Git
    

There is a type mismatch. Github is a web service with a RESTful API. Git is a
tool for building distributed databases with an emphasis on availability
rather than consistency.

If you have a project under Git source control and the only version is on
Github then the project uses a remote repository but is not distributed. If
you have a local repository in addition to one on Github then your project is
distributed and in the event of a network partition (for example if Github
goes down or your ISP goes offline) then the Git database is still available
because you have access to the local repository.

Now suppose you have six developers on the project and each has a local
repository and uses the repository on Github as a remote repository. If Github
goes down, then the network is partitioned in a way that keeps the developers
from seeing each other's work...though each developer still has their local
repository available. The important point is that Git doesn't require a hub
and spoke configuration. The six developers can each have access to the five
repositories of the other developers plus Github. Though it could be a bit
messy for systematic merging to achieve reasonable consistency, such a
configuration would be robust in the face of partition.
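
As a rough sketch of that hub-less layout, the script below sets up three repositories that list each other as remotes and exchange a commit directly. The names alice, bob, and carol, the temporary directory, and the demo identity are all assumptions for illustration; on real machines the remote paths would be SSH or other network URLs rather than local directories.

```shell
#!/bin/sh
# Sketch: three developers fetch from each other directly, with no hub.
# alice/, bob/ and carol/ are hypothetical stand-ins for three machines.
set -e
work=$(mktemp -d)

# Identity used only for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# One developer starts the project; the others clone from her.
git init -q "$work/alice"
echo "hello" > "$work/alice/README"
git -C "$work/alice" add README
git -C "$work/alice" commit -q -m "Initial commit"
git clone -q "$work/alice" "$work/bob"
git clone -q "$work/alice" "$work/carol"

# Drop the clone-time "origin" and register every peer under its own name.
for me in alice bob carol; do
    git -C "$work/$me" remote remove origin 2>/dev/null || true
    for peer in alice bob carol; do
        [ "$me" = "$peer" ] || git -C "$work/$me" remote add "$peer" "$work/$peer"
    done
done

# Carol commits; Bob picks the commit up straight from her repository.
echo "from carol" >> "$work/carol/README"
git -C "$work/carol" commit -q -am "Carol's change"
branch=$(git -C "$work/carol" symbolic-ref --short HEAD)
git -C "$work/bob" fetch -q --all
git -C "$work/bob" merge -q --ff-only "carol/$branch"

bob_head=$(git -C "$work/bob" rev-parse HEAD)
carol_head=$(git -C "$work/carol" rev-parse HEAD)
echo "bob:   $bob_head"
echo "carol: $carol_head"
```

With six developers the same loop gives each repository five peer remotes, which is where the merging gets messy.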

A perhaps more practical possibility is to have a nearby Git server as a
remote repository as well as Github and perhaps another server out on the
interwebs...say on AWS...to provide redundancy against partition.
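
A minimal sketch of that kind of redundancy, assuming three bare repositories on local disk stand in for Github, the nearby server, and the AWS box. The remote names github, nearby, and aws and the demo identity are made up for the example; real remotes would use network URLs.

```shell
#!/bin/sh
# Sketch: push the same history to several remotes, then recover from one
# of them after another disappears. All names here are hypothetical.
set -e
work=$(mktemp -d)

# Identity used only for the demo commit (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Three bare repositories stand in for the hosted copies.
for server in github nearby aws; do
    git init -q --bare "$work/$server.git"
done

# The working repository on a developer's machine.
git init -q "$work/project"
echo "hello" > "$work/project/README"
git -C "$work/project" add README
git -C "$work/project" commit -q -m "Initial commit"
branch=$(git -C "$work/project" symbolic-ref --short HEAD)

# Register each server as a remote and push the branch to all of them.
for server in github nearby aws; do
    git -C "$work/project" remote add "$server" "$work/$server.git"
    git -C "$work/project" push -q "$server" "$branch"
done

# Simulate one server going away, then recover the history from another.
rm -rf "$work/github.git"
git clone -q "$work/aws.git" "$work/recovered"

project_head=$(git -C "$work/project" rev-parse HEAD)
recovered_head=$(git -C "$work/recovered" rev-parse HEAD)
echo "project:   $project_head"
echo "recovered: $recovered_head"
```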

There are really two strong features of Github which Git facilitates.
Internally, it provides a team with a remote backup. Externally, it provides
an organization with content distribution. However Github's centralized nature
[it even has the word "hub" in it] makes it something less than a content
distribution network and when it goes down it serves to illuminate the reasons
that large, long-running open source projects mirror their content.

~~~
sidcool
Thanks for the detailed explanation, @brudgers. It helped!

------
andrewchambers
Github is a central server; it is not distributed at all. Things like issue
tracking and the wiki won't work if the server goes down. Git, on the other
hand, is distributed because the central server is not required for most git
operations to work, as long as you have alternate methods of sending patches
to each other.
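
One such alternate method is exchanging patch files, for example over email. The sketch below uses `git format-patch` and `git am` for that; the directory names alice, bob, and outbox and the demo identity are assumptions for illustration, and copying the file between directories stands in for whatever transport you actually use.

```shell
#!/bin/sh
# Sketch: move a commit between two repositories as a patch file,
# without any server. alice/ and bob/ are hypothetical stand-ins.
set -e
work=$(mktemp -d)

# Identity used only for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/alice"
echo "hello" > "$work/alice/README"
git -C "$work/alice" add README
git -C "$work/alice" commit -q -m "Initial commit"
git clone -q "$work/alice" "$work/bob"

# Bob commits and exports his latest commit as a patch file.
echo "from bob" >> "$work/bob/README"
git -C "$work/bob" commit -q -am "Bob's change"
git -C "$work/bob" format-patch -q -1 -o "$work/outbox"

# Alice applies the patch file as a commit in her own repository.
git -C "$work/alice" am -q "$work"/outbox/*.patch

subject=$(git -C "$work/alice" log -1 --format=%s)
alice_tree=$(git -C "$work/alice" rev-parse 'HEAD^{tree}')
bob_tree=$(git -C "$work/bob" rev-parse 'HEAD^{tree}')
echo "alice now has: $subject"
```

The commit IDs on the two sides can differ because the committer timestamp is set when the patch is applied, but the resulting file contents are the same.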

The fact that git works even when Github is down works in Github's favor
really nicely and makes server outages less noticeable.

~~~
sidcool
Oh, that makes sense. Can we pull and push between different machines running
git without a git server?

~~~
andrewchambers
You can test this pretty easily by just running an SSH server on your own
machine and doing git clone ssh://sidcool@localhost/home/sidcool/myrepo

~~~
sidcool
Do I need to expose any SSH keys for that? (Sorry if this is a noob question;
I am new to Mac and Unix.) Thanks!

~~~
andrewchambers
Just give it a test and see what errors you get. You can even play with just
cloning local repositories without ssh.

"git clone /path/to/localrepo newclone". You should be able to push and pull
between local git repos on your system too. It will help to learn about
setting up remotes as well.

"git remote add otherfolder /path/to/other/repo" "git pull otherfolder master"
(name whichever branch you want to pull; master is just the usual default).
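
Put together as a throwaway script, that experiment could look like the sketch below. It assumes a scratch directory from `mktemp`, a demo identity, and a made-up branch name `from-newclone`; none of those come from the commands above.

```shell
#!/bin/sh
# Sketch: clone, pull and push between two repositories on one machine.
set -e
work=$(mktemp -d)

# Identity used only for the demo commits (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/localrepo"
echo "hello" > "$work/localrepo/README"
git -C "$work/localrepo" add README
git -C "$work/localrepo" commit -q -m "Initial commit"
branch=$(git -C "$work/localrepo" symbolic-ref --short HEAD)

# Clone by filesystem path; no server or SSH involved.
git clone -q "$work/localrepo" "$work/newclone"

# Commit in the clone, then pull it into the original via a named remote.
echo "from newclone" >> "$work/newclone/README"
git -C "$work/newclone" commit -q -am "Change made in newclone"
git -C "$work/localrepo" remote add otherfolder "$work/newclone"
git -C "$work/localrepo" pull -q --ff-only otherfolder "$branch"

# Pushing works too, but by default Git refuses to update the branch that
# is checked out in a non-bare repository, so push to a separate branch.
git -C "$work/newclone" push -q origin "$branch:from-newclone"

pulled=$(git -C "$work/localrepo" rev-parse HEAD)
pushed=$(git -C "$work/localrepo" rev-parse from-newclone)
clone_head=$(git -C "$work/newclone" rev-parse HEAD)
echo "pulled: $pulled"
echo "pushed: $pushed"
```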

~~~
sidcool
Sure, will give it a try. Thanks!

------
rdc12
GitHub is a code hosting provider that happens to use git (which is
distributed), but the service they provide is run centrally, and hence when
their infrastructure fails, the service is not available (searching, tickets,
etc.).

~~~
sidcool
Thanks, rdc12! What would an ideal Git service provider be like? Are there
distributed alternatives? Is GitLab distributed?

~~~
rdc12
Well, ideal is subjective. GitLab appears to also be a centralized model, but
with an option to self-host.

For a distributed solution with similar features and purpose, you would
probably be looking at a federated model.

~~~
sytse
Most people self-host GitLab. We would love an option to federate
[http://feedback.gitlab.com/forums/176466-general/suggestions...](http://feedback.gitlab.com/forums/176466-general/suggestions/5097708-implement-cross-server-federated-merge-requests);
merge requests welcome.

