

GitHub forking has one big flaw (2011) - tutuca
http://zbowling.github.io/blog/2011/11/25/github/

======
jondubois
I think it's only fair that the original repo should be the most promoted one.
Usually, the original author has put a lot of thought and effort into coming
up with the idea and turning it into a popular open source project.

You don't want to create an environment in which forkers can easily steal
credit from the original author(s). It takes a lot of passion and goodwill for
someone to start a new open source project. I think they deserve some credit.

If an 'owner' no longer feels up to the task of managing their project, GitHub
lets them transfer it to someone else. That's what happened with ExpressJS and
it worked out fine.

~~~
tessierashpool
_I think it's only fair that the original repo should be the most promoted
one. Usually, the original author has put a lot of thought and effort into
coming up with the idea and turning it into a popular open source project._

OP brings up the example of a project where the original repo shouldn't be the
most promoted one, because it's been abandoned, while plenty of other repos
are still alive. I blogged about the same thing earlier this month and used
the same example.

if you haven't encountered this problem, you will. it is absolutely a problem
every developer _will_ encounter. it's only a matter of time.

somebody starts a great project, doesn't have time to keep it alive, and the
community fractures because GitHub has no way to differentiate between
"original repo" and "canonical repo."

not going to rant about this because I already did in a blog post:
[https://www.pandastrike.com/posts/20150610-thought-
experimen...](https://www.pandastrike.com/posts/20150610-thought-experiment-
github-community-view)

 _If an 'owner' no longer feels up to the task of managing their project,
GitHub lets them transfer it to someone else. That's what happened with
ExpressJS and it worked out fine._

this is a ludicrous statement. the Express.js transfer of ownership was a
ridiculous fiasco full of angry drama, hurt feelings, and core developers
resigning from the project.

documented here: [http://gilesbowkett.blogspot.com/2014/07/the-bizarre-
bazaar-...](http://gilesbowkett.blogspot.com/2014/07/the-bizarre-bazaar-who-
owns-expressjs.html)

also, the idea that you can solve this problem by having the original owner
transfer ownership doesn't make any sense. the whole problem is that the
original owner isn't paying attention at all, doesn't care in the first place,
and wouldn't know who to transfer ownership to, if they did care.

it happens all the time.

~~~
jondubois
>> this is a ludicrous statement. the Express.js transfer of ownership was a
ridiculous fiasco full of angry drama, hurt feelings

True, I should rephrase; from the consumer's point of view it turned out fine
:p The project is still healthy.

About ExpressJS, I did read something about one of the main developers not
even being aware that the transfer was happening until the last minute. I also
heard that there might have been money involved and it probably wasn't a fair
process. There are a lot of ethical dilemmas there. A transfer of ownership
doesn't have to be this nasty though.

------
tenfingers
Also of note: currently, if you "fork" using git and push the result into a
github repository, there's no way to re-attach it to the original ancestor or
even hint github about it.

The problem is compounded by the fact that doing _anything_ related to the
ancestor, such as pull requests, or even just diffs, will not be possible.

I submitted a feature request to the github folks years ago, but nothing has
really happened (the only suggestion I got was to delete the repository and
fork it again).

Not that it's hard: you could determine the ancestor and different lineages
just using the hashes of the commits upon the first push to github. You could
also do it completely offline, it wouldn't matter.
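
A minimal sketch of that idea, assuming the commit hashes of each repository have already been collected (for example with `git rev-list --all`). The function name `group_by_lineage` and the repository names are hypothetical, and this is not how GitHub actually links forks. It groups repositories that share at least one commit hash, so a repository pushed from a plain local clone still lands in the same lineage as its ancestor:

```python
from collections import defaultdict


def group_by_lineage(repo_commits):
    """Group repositories that share at least one commit hash.

    repo_commits maps a repository name to the set of commit hashes it
    contains. Returns a sorted list of sorted name lists, one per lineage.
    """
    # Union-find forest: every repository starts as its own lineage.
    parent = {name: name for name in repo_commits}

    def find(name):
        # Walk up to the representative, shortening the path on the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    first_owner = {}  # commit hash -> first repository seen containing it
    for name, commits in repo_commits.items():
        for commit in commits:
            if commit in first_owner:
                # Shared commit: merge the two lineages.
                parent[find(name)] = find(first_owner[commit])
            else:
                first_owner[commit] = name

    groups = defaultdict(list)
    for name in repo_commits:
        groups[find(name)].append(name)
    return sorted(sorted(members) for members in groups.values())
```

Because any shared commit is enough to link two repositories, the grouping is transitive: a fork of a fork is attached to the original even if it only overlaps with the intermediate repository.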

There's also quite a number of forks available on github which aren't really
visible because of that. I know that for some of my own projects and smaller
projects that I checked, a code search would actually reveal _many_ non-linked
repositories. And I also know why: I often don't fork on github (why would I
if I know nothing about the project yet?), I just shallow clone locally.
Forking on github doesn't serve any purpose until you actually change the
code, which oftentimes has already been done locally.

~~~
kpcyrd

        git clone git@github.com:original/repo.git
        # make some commits
        git remote rename origin upstream
        # fork the repo on github
        git remote add origin git@github.com:your/repo.git
        git fetch origin
        git push -u origin master

~~~
zimbatm
This would be perfectly fine if all the project data was stored into git.
Unfortunately github issues, comments, project metadata, webhook setup, team
membership, ... are not part of the repository.

~~~
TazeTSchnitzel
Not to mention past and outstanding pull requests.

------
nathankleyn
Amusingly, Bitbucket has since removed a lot of their useful fork information
after a redesign that took place sometime after this article was published
(2011) [1].

One approach to this problem, as the article mentions, is to list forks by
popularity. But what would that mean? If it's the number of "stars", few
people curate their starred lists to keep them up to date. It would have to be
some kind of rolling popularity measure, perhaps the number of unique users
who've cloned a repository in the last month or something.
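
A rough sketch of such a rolling measure, assuming a hypothetical log of `(repo, user, day)` clone events is available; per-user clone data like this is an assumption for illustration, and the names `unique_cloners` and `rank_forks` are made up:

```python
from datetime import date, timedelta


def unique_cloners(clone_events, today, window_days=30):
    """Count distinct users per repository who cloned within the window.

    clone_events is an iterable of (repo, user, day) tuples, where day is a
    datetime.date. Repositories with no clones in the window are omitted.
    """
    cutoff = today - timedelta(days=window_days)
    users_by_repo = {}
    for repo, user, day in clone_events:
        if cutoff < day <= today:
            users_by_repo.setdefault(repo, set()).add(user)
    return {repo: len(users) for repo, users in users_by_repo.items()}


def rank_forks(clone_events, today, window_days=30):
    """Order repositories from most to fewest recent unique cloners."""
    counts = unique_cloners(clone_events, today, window_days)
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(counts, key=lambda repo: (-counts[repo], repo))
```

Counting distinct users rather than raw clones keeps one busy CI job from inflating a fork's rank, and the window lets an abandoned original fall behind an active fork over time.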

[1]: [https://bitbucket.org/site/master/issue/5009/list-of-
project...](https://bitbucket.org/site/master/issue/5009/list-of-project-
forks-gives-very-little)

------
qznc
It affects only one part of the rant, but I wonder why Github considers it
necessary to publicly fork a project. Often, I want to push a single fix. I
would like to

    
    
        git clone git@github.com:foo/bar.git
        # fix locally
        git commit
        git push  # creates pull request

~~~
ghthor
While that would be convenient, I have no idea how it would work. You're asking
to push to something you don't have rights to modify.

Git evolved with a pull workflow because the problem it was made to solve was
the pull workflow of Linus and the kernel. This inherently means you must
self host your changes while they're being reviewed and accepted.

~~~
carussell
... which GitHub totally messes up, by the way. You have to go ahead and
create a (superfluous) on-GitHub fork to file a pull request. Not a problem if
the maintainers know how to use git and are willing to pull from you without
using the GitHub UI, but there are tons of people whose only exposure to git
is through GitHub, and it stops there.

As I wrote to an acquaintance earlier this week while venting about GitHub
(and the condescending remarks you're liable to get from people who equate it
with git and will assume that a tendency to stay off the former means you're
unfamiliar with the latter):

"Coming from a background where wiki pages would be hosted on wikis and
submitting [code] changes for review is as simple as a) creating a patch and
b) attaching it for review, as I look at all the unnecessary (>3x) overhead
that GitHub imposes and all the people who don't have a problem with it and
feel that it's good and proper and normal, I feel like I'm in crazytown."

Further reading: Mozillians' comments on Gregory Szorc's post "Please Stop
Using MQ"[1]. Pay particular attention to everything that Gijs has to say.

1. [http://gregoryszorc.com/blog/2014/06/23/please-stop-using-
mq...](http://gregoryszorc.com/blog/2014/06/23/please-stop-using-mq/)

------
caboteria
Github can make some changes as administrative actions even though there's no
UI to do it. For example, I have a project that was originally a fork but
became the upstream when the fork went offline. Github was able to "break" the
link from my fork to the dead upstream so mine became the upstream.

It's not as easy as DIY but it's just a support request.

------
IshKebab
I've noticed this too. Loads of github projects have dozens of forks with
identical Readmes. The only way to work out the difference between them is to
look at the commit messages on the Network tab.
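
One way to make that comparison less manual, sketched under the assumption that the commit hashes of the upstream and of each fork are already known (for example from `git rev-list --all` in each clone). The function name `summarize_forks` and its inputs are hypothetical:

```python
def summarize_forks(upstream_commits, fork_commits):
    """Describe each fork by how its commit set differs from upstream's.

    upstream_commits is a set of commit hashes; fork_commits maps a fork name
    to its own set of hashes. Returns (name, ahead, behind) tuples, where
    ahead counts commits only the fork has and behind counts commits only
    upstream has. Forks with the most unique commits come first.
    """
    rows = []
    for name, commits in fork_commits.items():
        ahead = len(commits - upstream_commits)
        behind = len(upstream_commits - commits)
        rows.append((name, ahead, behind))
    return sorted(rows, key=lambda row: (-row[1], row[2], row[0]))
```

Forks that report zero commits ahead carry nothing of their own, which is exactly the "identical Readme" case, so they can be filtered out before anyone has to read commit messages.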

------
musically_ut
I wrote a Chrome/Firefox extension to address the problem of finding "notable"
forks of original repos. These are often community-supported versions of the
original which would have been hard to find otherwise: Lovely Forks ~
[https://github.com/musically-ut/github-forks-
addon](https://github.com/musically-ut/github-forks-addon)

------
ryanbrunner
I agree with the idea that "not all forks are considered equal" insofar as
GitHub should do a better job of surfacing notable forks, rather than a fork
that fixes a small environmental issue specific to a single person, or ones
that don't make notable changes at all.

In terms of not elevating the root to special status, I disagree. Recognizing
one particular repository as canonical is a feature, not a bug. As much as git
itself doesn't place any special significance on a particular repository, the
culture of open source development does. Linus Torvalds can certainly say that
his Linux repository doesn't hold special status, but that's just not true
beyond a technical level. It's useful to be able to say "this is the _main_,
supported version of this library."

~~~
MereInterest
It is good to have a canonical version, but the canonical version and the root
version may be entirely different. The author brought up this point, talking
about a project that he had started, which was later forked. That fork has
many more features, and should be considered the canonical version, but isn't.

~~~
lsaferite
So, what are your choices to elect the canonical one then?

First project pushed is canonical and can pass the baton?

Seems better to drop the idea of a root or canonical version totally. Linking
forked projects together with hashes for the network graph seems like a good
idea personally.

------
amelius
> I want to add a feature or fix a bug but I want to share the changes
> upstream or with other people that may find it actually useful. I fixed a
> bug for my own environment. These changes may break everyone else and so
> nobody should probably use this fork. I want to lock a version of a project
> away in a safe place that I know won’t change or break later and may use it
> as a point to send changes back up later. (This is partially due to some
> design issues with git submodules.) I want to experiment. My changes are
> probably interesting but not ready for primetime, but if it works out it may
> become something fruitful.

Isn't that what "branches" are for?

~~~
IshKebab
You can't make a branch on a project you don't have write access to.

------
btown
Solution: Just make sure your fork has better SEO than the root! Work hard at
promoting your fork, have blogs link to you as the best thing since sliced
bread, etc. After all, since SEO is all about grassroots efforts, it's not at
all like you're kowtowing to a central entity's policies and trying to work
around their ridiculous restrictions. Because that would be counter to the
distributed nature of Git and FOSS in general, right?

Oh, wait.

/s

