

When GitHub kills Open Source - rsaarelm
http://t-machine.org/index.php/2012/01/13/2012-the-year-of-uncollaborative-development-or-when-github-kills-open-source/

======
pilif
For every contributor to an open source project in the old days, there might
be fifty failed forks on github, sure. But for every five failed forks, there
will be one that thrives and whose commits get accepted back.

What you are seeing is both an explosion in contributions and a permanent log
of every failed contribution ever. This greatly affects your perception.

Back in the old days it was infinitely harder to provide and apply a useful
patch, so it wasn't done in nearly the same frequency. Contributions were
limited to a small circle of people motivated and skilled enough to climb the
huge hurdle.

Nowadays, creating and submitting a patch is trivial, so the hurdle is much
smaller. Hence you will get many, many more people trying to contribute, and,
because of how github works, every attempt is also visible to the public for
all eternity.

At least in my case, none of the patches I sent in in the old days that were
not accepted are visible anywhere. Heck, most of the time you'd have a
really hard time even finding the patches that were accepted.

Github is far from killing open source. Quite to the contrary. But as
visibility increases and hurdles get torn down, you might have to adjust your
perception of reality.

~~~
turbulence
I think you have to read the article again, because you are talking about
something quite different.

~~~
pilif
Not necessarily. If a project doesn't merge a proposed patch, the patch could
simply be deemed inappropriate for the project's chosen direction.

So if a fork doesn't get merged upstream, I see this as a failed fork.

If an upstream stops working on their project and stops accepting patches, the
outlined problems can happen, but just look at any random sourceforge project
not updated since 2006. In the old days, there was practically no way for
other contributors to get back on track, but it wasn't logged for eternity
either - the project just died.

Today, Github at least provides a chance to get back on track, but, again,
your perception might be altered by the fact that on Github you don't just see
the successful reboots, but also all the failed ones.

------
jaggederest
As someone who has dealt with this, it's not as big a deal as you might think.
Most future forks are based on older forks, so all the person at the end of
the line has to do is fast forward onto the end of the branch. One FF merge,
push, end of story.
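
A minimal sketch of the fast-forward case, assuming a throwaway `upstream` repository and one `fork` that only adds commits on top of it. The directory names, file names, and demo identity are made up for illustration.

```shell
set -eu
work=$(mktemp -d)
# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "upstream" stands in for the original project.
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
echo "v1" > "$work/upstream/file.txt"
git -C "$work/upstream" add file.txt
git -C "$work/upstream" commit -q -m "initial commit"

# "fork" stands in for a contributor's fork that adds one commit on top.
git clone -q "$work/upstream" "$work/fork"
echo "fix" >> "$work/fork/file.txt"
git -C "$work/fork" commit -q -am "fix a bug"

# Upstream has not diverged, so catching up is a pure fast-forward.
git -C "$work/upstream" remote add fork "$work/fork"
git -C "$work/upstream" fetch -q fork
git -C "$work/upstream" merge -q --ff-only fork/master
```

`--ff-only` makes git refuse instead of creating a merge commit, so it doubles as a check that the histories really have not diverged.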

When you have bifurcations, you can do an octopus merge - git is _really_ good
at resolving these things. Very little human effort is needed except where
multiple revisions change the exact same line in different ways.
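
As a rough illustration of the octopus case, here is a sketch with three hypothetical contributor branches (`alice`, `bob`, `carol`) that each touch a different file. It assumes no two branches edit the same lines; when they do, the octopus strategy gives up and you fall back to merging one branch at a time.

```shell
set -eu
work=$(mktemp -d)
# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "base"

# Three hypothetical contributor branches, each touching a different file.
for name in alice bob carol; do
  git checkout -q -b "$name" master
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "work from $name"
done

# Naming several branches at once makes git use the octopus strategy.
git checkout -q master
git merge -q -m "octopus merge of alice, bob and carol" alice bob carol
```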

In addition to this, most patches that people submit are quite small. Even if
you have 200 people submitting patches, the odds are that most of them fall
into two categories: people fixing the same bug, and people working on
completely different sections of code. Neither is a substantial problem to
merge.

I think I can count on one hand the number of times I had to do any nontrivial
merge work on patches from contributors... And you're pretty delighted to do
it - it means they fixed something that _really_ matters.

~~~
adamrg
(as the author of the original post)

In theory, yes. And I never used to worry about this. But over time, in
practice, it's been a bigger problem than I think you're giving credit to.

e.g. ...

"Even if you have 200 people submitting patches, the odds are that most of
them fall into two categories: people fixing the same bug, and people working
on completely different sections of code. Neither is a substantial problem to
merge."

IME ... in practice, this is a HUGE problem. Because every one of those
developers fixes the bug in a slightly different way.

The longer time goes on without the original Author fixing it, the worse it
gets. And the cost to them - or anyone! - of sifting through the "100
variations on bug fix #123" becomes greater and greater.

Usually, you want to cherrypick individual lines and characters from 5-10 of
the best "solutions" to the bug.

If you'd avoided the "100 alternative fixes", then those "improved" solutions
would have been built on the "basic" solutions - and merging would be easy.

But because you've got to this massively-forked scenario, all of the patches
have been written independently and incompatibly.

~~~
moe
_But because you've got to this massively-forked scenario_

That may be true for the 0,0001% of projects that have so many contributors
that they need dedicated release managers and such anyway.

The remaining 99,9999% projects are just grateful for github making
contribution so easy that they're now receiving patches at all.

------
pixelcort
It's not that hard to do this:

    git remote add other_user_who_forked git://github.com/other_user_who_forked/project_name.git

from within your checkout of your fork. In one of the projects on GitHub that
I've forked, we are merging between each other without the original project
owner even being involved.
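
A self-contained sketch of that fork-to-fork workflow, using local directories in place of GitHub URLs. The repository names `original`, `mine`, and `theirs` are placeholders for illustration.

```shell
set -eu
work=$(mktemp -d)
# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the original project.
git init -q "$work/original"
git -C "$work/original" symbolic-ref HEAD refs/heads/master
echo "core" > "$work/original/core.txt"
git -C "$work/original" add core.txt
git -C "$work/original" commit -q -m "initial commit"

# Two forks, each with an independent change.
git clone -q "$work/original" "$work/mine"
git clone -q "$work/original" "$work/theirs"
echo "feature A" > "$work/mine/a.txt"
git -C "$work/mine" add a.txt
git -C "$work/mine" commit -q -m "add feature A"
echo "feature B" > "$work/theirs/b.txt"
git -C "$work/theirs" add b.txt
git -C "$work/theirs" commit -q -m "add feature B"

# From inside my fork: add the other fork as a remote, fetch it, merge it.
cd "$work/mine"
git remote add other_user_who_forked "$work/theirs"
git fetch -q other_user_who_forked
git merge -q -m "merge other fork" other_user_who_forked/master
```

The original repository is never touched; the two forks exchange work directly.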

~~~
ge0rg
And this does not even need GitHub support. I'm using remote repositories to
maintain projects with >10 contributors without much effort.

------
yummyfajitas
I don't get it.

 _If A disappears with merges pending … then B/C/D find they have 3 distinct
codebases, and no way within GitHub to do a simple cross-merge.

Now, the situation is not lost – if B, C, and D get in contact (somehow) and
negotiate which one of them is going to become “the primary SubAuthor”
(somehow), and they issue manual patches to each other’s code (surprisingly
tricky to do on GitHub)..._

If B, C and D get in contact via, I dunno, github messages, and pick a primary
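subauthor, it's very easy to issue manual patches. If I'm B:

    # Register the other two forks as remotes (URLs elided).
    git remote add C ...
    git remote add D ...
    # Pull C's work into my master, then publish it to my GitHub fork.
    git pull C master
    git push github master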

I agree github might not have a button for this, but I'm pretty sure most
github users are comfortable with the git command line.

------
xdissent
I really did feel the same way as the author for a long time, but I haven't
yet seen any of my fears manifest in practice. The vast majority of forks die
without fanfare after serving some singular purpose. People who want to
contribute code do so more easily than ever. People who fight over ownership
of open source projects are just jerks like they've always been. There may be
more of them, or more of them are more visible now that we all use Github, but
I consider this a trivial downside of an otherwise remarkable ecosystem.

~~~
jaggederest
I've seen smooth transitions between _de facto_ ownership of projects a ton,
but never a bitter divide where both ends are actively maintained.

One pathological example is delayed_job, which has changed 'leadership' a few
times over the years. It's still pretty easy to look at the 'network' graph
and choose the endpoint you want... or just use the published gem.

~~~
xdissent
I've led a project that I took over from another guy, who had taken it over
from yet another guy, and the original guy even took leadership back after a
while. No one missed a beat. If you're involved in this community, you
are most likely capable of tracking down the "correct" fork.

Unfortunately, the "published gem" part is the one that has given me the most
trouble historically. But now that pretty much everyone uses bundler for gems
this should be a nonissue - you can even specify a branch of a fork that you'd
like to build.

------
acdha
The author has this completely backward: this is fundamentally a social
problem - the difference is that with GitHub it's actually visible. Anyone old
enough to remember the pre-DVCS era should remember chasing down patches in
bug trackers, blog posts, etc. and maintaining local forks — with the
requisite terror-inducing periodic gigantic merges. Now we've lost all of the
manual labor in that process and made it easy for anyone who wants to do
things the right way to do so – it's still possible to waste your
collaborators' time if you really want to but before it was almost a
requirement of the process.

As a minor point of craft, this also illustrates an area where more training
is needed: the problems described are most common when someone makes a fork
and keeps every single commit in a single branch. Using feature branches – and
it'd be awesome if Github started encouraging that with the fork & edit model
– makes most of the listed problems far more manageable.
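
One way the feature-branch habit might look in a fork, as a sketch only. The branch names `fix-typo` and `add-logging` and the file names are invented for the example.

```shell
set -eu
work=$(mktemp -d)
# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/fork"
cd "$work/fork"
git symbolic-ref HEAD refs/heads/master
echo "base" > app.txt
git add app.txt
git commit -q -m "state of upstream when forked"

# One branch per change, each started from the untouched master.
git checkout -q -b fix-typo master
echo "typo fixed" > typo.txt
git add typo.txt
git commit -q -m "fix typo"

git checkout -q -b add-logging master
echo "logging" > logging.txt
git add logging.txt
git commit -q -m "add logging"

# master still matches upstream; each branch carries exactly one commit.
git checkout -q master
git log --oneline master..fix-typo
git log --oneline master..add-logging
```

Because master never accumulates local commits, each branch can be offered, merged, or dropped independently of the others.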

------
babarock
When you hear Linus speak about his workflow when working on the kernel, he
always mentions his "Web of Trust" concept. I think the problem is not
inherent to github, but rather to the idea we have when we foolishly think of
the possibilities of combining git and a social network.

The truth is, programming is still very much about people, and you need to
trust the people in order to pull their code. Trusting the people goes beyond
trusting the code. If you give me great code, then disappear or decide to make
an unmergeable fork, it will harm my project as described in the article.

On the other hand, if I get to know the people behind the pull requests, learn
to talk to them and get them to be more involved in the project, then the
risks exposed can be easily circumvented.

------
iamwil
Actually, if you fork from the main branch, you can still pull commits from
other collaborators--though I don't know if you can send pull requests to
other people, haven't tried. But it is doable. There is nothing to stop you
from making another remote branch that tracks another person's repo and share
code that way.

------
alexchamberlain
I don't agree that GitHub is killing open source. However, the author has a
point. It is hard(er) to merge into other forks, which is a shame since git is
so good at this.

I'm not criticising GitHub, their software is great, but in the next iteration,
they should consider addressing this.

------
DasIch
Most Open Source projects die. That's why CPAN, PyPI etc. are able to have
such a huge number of packages: a significant part, if not most, has no
documentation, tests, or support, or is dead, all of which in practice is more
or less the same.

"In the old days" you didn't notice it as much because those projects just
disappeared but with Github they don't, in fact they're all over the place.

I'm not sure if this is a problem or one big enough worth caring about but in
any case Github isn't the problem.

It would be nice if authors could "archive" or "abandon" repositories which
could be filtered out on searches by default and be displayed less dominantly
on profiles.

------
6ren
To be fair, the usual consequence for a project that loses its Author is to
die.

It seems that github could facilitate the migration of an "ownerless" project
to a designated fork - including facilitating the selection of who has the
designated fork. Just support for the informal process outlined in the
submission.

It's interesting that Linus deliberately avoided having a "designated fork" in
git, but instead made them all equal, and you just pull from who you trust. Of
course, in his case, _his_ fork was the socially designated one, so this was
not a problem he experienced or had to solve.

------
obtu
The commit graph (gitk --all if you are using plain decentralised git, GitHub's
network graph for the convenient everyone-github-knows online version) makes
it quite obvious which author is good at reviewing and integrating patches.
With a little bit of side-channel communication, a deficient maintainer is
easy to replace.

Also, someone who is late at merging patches won't have a lot of difficulty
catching up. If they did no divergent work at all, it's just a matter of
picking the best integrator and fast-forwarding.
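
For anyone without gitk handy, a text-only sketch of the same idea. The `integrator` branch is a hypothetical stand-in for whichever fork has been merging patches.

```shell
set -eu
work=$(mktemp -d)
# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "base"

# A hypothetical integrator branch that has picked up someone's patch.
git checkout -q -b integrator master
echo patch > patch.txt
git add patch.txt
git commit -q -m "merge contributed patch"
git checkout -q master

# Text equivalent of gitk --all: every branch, drawn as a graph.
graph=$(git log --graph --oneline --all --decorate)
echo "$graph"

# If master did no divergent work, it is an ancestor and can fast-forward.
if git merge-base --is-ancestor master integrator; then
  git merge -q --ff-only integrator
fi
```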

------
AdrianRossouw
I don't think projects faltering out due to the bus factor [1] not being taken
into account is github's fault.

Why not set up a team repository and give multiple people commit access? Team/project
accounts should probably be more of a standard feature of open source projects
on github, once things get beyond a certain point.

[1] <http://en.wikipedia.org/wiki/Bus_factor>

------
riosatiy
Wow, no offense dude but this was a really crappy post. This problem has
existed for all eternity, just as other people are stating: there are a lot
more projects and people who contribute, and more transparency about them,
than before.

And choosing that title... It just seems you are writing one of those "Look at
me! I am writing something controversial" articles.

~~~
chimeracoder
Welcome to HackerNews! Since you're a new user (green name), a friendly
explanation of why you seem to be getting downvoted: At HN, we try and
encourage respectful discourse, even when we dislike or disagree with what is
being said. If you'll look at the other articles, other people seem to agree
with you that the article is poorly written and that the title is sensational,
but they aren't being downvoted because the way they phrase those complaints
comes across as less insulting or _ad hominem_.

~~~
riosatiy
Excuse me, I will use better phrasing next time. Thank you for the friendly
explanation.

------
potomak
It's a little bit extreme but I understand your point of view. Anyway I think
GitHub helps open source more than it kills it.

------
aliguori
I think the author is missing something that makes Open Source work that few
people appreciate. It's a fundamentally lossy development model. A certain
number of patches/features end up in /dev/null for any Open Source project.

You can think of each "fork" as a new start-up trying out a new idea. But
instead of reinventing the entire world, they get to start with a functioning
product. The vast majority of these start-ups will fail but the ability to
experiment (and fail) with forking is fundamentally what makes Open Source
development better (at least IMHO) than proprietary development.

A lot of people look toward Open Source development thinking that there's a
lot of wasted development and that that's a problem worth solving, but that's
like the government trying to make 100% of businesses successful.

------
robot
This has nothing to do with github. Github is what it is as the name suggests,
it's a convenient hosting platform for git projects. The fix/merge issues are
between developers and have been around since open source first started. They're
people issues, not github issues.

------
powertower
Here is a question:

How do you handle merging someone else's patch into your dual-licensed project?

The GitHub hosted code is GPL, but your other code license is for commercial
projects (you charge a fee for the NON-GPL license).

Obviously, the patch is based on the GPL project, but you don't have copyright
on that patch (to be able to merge it into the non-GPL codebase).

Do you ask the contributor to give you the copyright?

What if it's a simple bug fix that's only a few characters?

What if the contributor says no?

Is there a way to make this happen smoothly?

~~~
desas
1. Yes.
2. See <http://www.softwarefreedom.org/resources/2007/originality-requirements.html>
3. If it can only be written one way then it's not copyrightable (IANAL).
4. Canonical and others require that you fax/email a signed document to them,
e.g. <http://www.oracle.com/technetwork/community/oca-486395.html>

------
cykod
This reminds me of the famous Churchill quote: "It has been said that
democracy is the worst form of government except all the others that have been
tried."

Most of the article is true, but GitHub is also leagues above anything else
out there, and certainly leagues above the mailing list with hand-crafted
patches by de-demonizing forks and turning projects more into a meritocracy.

There is certainly room for improvement, but I think it's a step in the right
direction.

------
wavetossed
The very fact that both forks are available on github means that you can check
out both forks, then merge changes locally. After that, you can use the merged
code to create a new github project that is not a github fork of the original
ones.

If you really have a tangled web of failed forks, this is the way to fix it by
starting afresh with a merger of the best forks.

~~~
turbulence
From your comment I see you have not gone through the "fun" of merging 3+
projects with varying degree of changes.

------
astrodust
What would go a long way towards fixing this is having an organization plan
that's free, but only allows public repositories. That way the code could be
entrusted to more than a single individual as it is now. Commit rights are one
thing, but having ultimate control over the repository is usually limited to
one person.

------
timkeller
Ha! If anything GitHub is doing more to keep Open Source development alive and
healthy than any other company.

------
omarqureshi
I fail to see a better alternative unfortunately.

The only way that you can stop this is by having multiple maintainers for a
project so that projects don't just die if the main maintainer is hit by a
bus.

And yes, this crappy, albeit well known situation is not just specific to
Github.

------
keeran
Was this written (conceived) before the new pull request mechanism & UI were
introduced?

How can the dead end of an ignored patch submission be better than what we
have now?

------
av500
reading that makes me wonder how open source projects existed at all _before_
GitHub...

GitHub exposes the once private forks that people had lying around on their
HDDs, so I count that as a plus.

As for developing open source in a collaborative way, that goes much further
than just a git infrastructure, there's mailing lists, patch reviews, roadmap
discussion etc... exactly like in ye olde days

~~~
adamrg
(as author of original post)

Agreed. But GitHub did a lot more than just that - across the board it removed
the barriers to collaboration (I used to run a few projects on SourceForge,
and contribute to others; the ease of GitHub was like a breath of fresh air).

It got people excited and feeling free and able to collaborate.

...and so (I suspect) we're today _less tolerant_ of unexpected barriers to
collaboration. GitHub gets you hooked, then makes it extremely difficult to
manage the "handover" part of a project (something that SF - for all its
failings - handled pretty well).

The projects that die this way may well never have existed without GitHub in
the first place - but that's not an excuse to just kill them off under a
burden of maintenance crud.

~~~
hunvreus
I think your post fails to recognize a few things. I am not going to point out
the various other arguments, but I think you need to acknowledge that,
first, there are many more people contributing to OSS nowadays than
during the "SF days". Moreover, with the acceleration of online
collaboration in all its forms, we are overwhelmed with new trends that are
sometimes hard to interpret. I genuinely believe that these trends tend to
self-regulate over time and that users, in the end, learn to better
leverage the tools they are introduced to.

------
6ren
google cache:
[http://webcache.googleusercontent.com/search?q=cache:http://...](http://webcache.googleusercontent.com/search?q=cache:http://t-machine.org/index.php/2012/01/13/2012-the-year-of-uncollaborative-development-or-when-github-kills-open-source/&strip=1)

------
biafra
What does this have to do with git_hub_? Isn't this a "problem" with git?

I don't think it is a problem at all, because without git (or hg, bazaar etc.)
or github we wouldn't even have such a thriving open source and open
development community. Collaboration didn't get harder with DVCS; it got
easier.

~~~
darb
His issues sound more like issues with open source project governance. Github
has just made it easier to contribute, and thus it is becoming clear that good
open source projects have good governance. The owner of a project on github is
not the only one who can merge pull requests, they can add multiple
collaborators on a project...

If anything it does highlight the need for more people to form collectives
around projects and use the organisation tools to own the projects...

------
ighost
I'm tired of this alarmist tone.

