
Open source license usage on GitHub - hodgesmr
https://github.com/blog/1964-open-source-license-usage-on-github-com
======
andrewguenther
Just a friendly reminder about what it means to not have a license in your
repository. You can read about it here:
[http://choosealicense.com/no-license/](http://choosealicense.com/no-license/)

Essentially, by not including a license, you default to standard copyright
protections and "you retain all rights to your source code and that nobody
else may reproduce, distribute, or create derivative works from your work."
However, GitHub's ToS say that "by setting your pages to be viewed publicly,
you agree to allow others to view your Content. By setting your repositories
to be viewed publicly, you agree to allow others to view and fork your
repositories."[1] If you go into the glossary you can see that the definition
for fork "allow[s] you to freely make changes to a project without affecting
the original."[2]

Since I'm not a lawyer, I'm not going to do any deeper analysis other than
direct quotation here, but I can say that I personally like to submit pull
requests with the MIT license to projects I wish to use which do not include a
license, as well as a link to
[http://choosealicense.com/](http://choosealicense.com/) before I will use
them in my own project.

[1] [https://help.github.com/articles/github-terms-of-service/#f-...](https://help.github.com/articles/github-terms-of-service/#f-copyright-and-content-ownership)

[2] [https://help.github.com/articles/github-glossary/#fork](https://help.github.com/articles/github-glossary/#fork)

~~~
Alupis
I might be in the odd camp here, but I think Github/Bitbucket et al should
force the user to pick a license when creating a new repo.

Even if there's an option that says "proprietary", I'm OK with that...

It would get rid of all the ambiguity of stumbling upon a very wonderful
project but being unsure if you can use it due to no explicit license.

~~~
andrewguenther
I wrote a paper on this in grad school. A few of these services (I cannot
recall which and I don't have a copy of the paper handy) actually state in
their ToS that you _must_ include an OSI approved license in your repo.

~~~
willnorris
Google Code has always been exclusively for open source projects. From
[https://code.google.com/p/support/wiki/FAQ#Hosting_Your_Open...](https://code.google.com/p/support/wiki/FAQ#Hosting_Your_Open_Source_Project_on_Google_Code)

> __Can I use Google Code to host projects that aren't open source?__

> Nope. Open source projects only.

------
JoshTriplett
There are several entire communities of projects, such as Ruby or node.js,
which by policy put all of their myriad projects on GitHub. Those communities
have thousands or tens of thousands of repositories each, which make up a
significant part of the 80000 repositories shown in the main graph in this
article, and they're almost all MIT. So I'm wondering how much of the huge set
of MIT repositories on GitHub comes from that handful of communities, and
how much from random contributors.

I'd also suggest that there's a significant overlap between people who choose
copyleft licenses and people who avoid proprietary hosting services.

So while I _do_ think this data is significant, I think it represents the
GitHub community, not the broader FOSS community.

~~~
arfar
>I'd also suggest that there's a significant overlap between people who choose
copyleft licenses and people who avoid proprietary hosting services.

Very true, I quite like that some have adopted the "Don't Fork Me"[1] flag for
the corner of their software pages. Nice cheeky little stab

[1]
[http://librecmc.org/librecmc/wiki?name=github](http://librecmc.org/librecmc/wiki?name=github)

~~~
detaro
It is strange to see an argument like "They didn't like our change to the
license descriptions to a more political one". You want everything to be
copyleft, including your infrastructure, that is fine and consistent. Say so,
act accordingly and encourage others to do so, but that is just petty.

~~~
quadrangle
The license description at choosealicense.com actually _misrepresents_ the
GPL. The GPL doesn't actually require that the source of modified versions be
released to the original author. It requires that the source be released to
the users who get the software. It is a downstream license designed for the
freedom of the users and only incidentally works incompletely as a license to
promote sharing the code back.

Everything about the design of the GPL including the very wording of the
license is explicitly about blocking software from becoming proprietary. The
choosealicense.com description is intentionally designed to avoid giving any
acknowledgement of the GPL's actual design and intention because they don't
want people to value the things the GPL is about.

~~~
zerocrates
> The license description at choosealicense.com actually misrepresents the
> GPL. The GPL doesn't actually require that the source of modified versions
> be released to the original author.

Where are you seeing a description that says sources must go to the author? I
only see

> requires anyone who distributes your code or a derivative work to make the
> source available under the same terms

and

> When distributing derived works, the source code of the work must be made
> available under the same license.

Those both seem accurate, if maybe a little less clear than they might be.

------
kibwen
I was curious how this handled repositories that are offered under more
than one license, so I tried this out on the Rust repository:

    
    
      "license": {
        "key": "apache-2.0",
        "name": "Apache License 2.0",
        "url": "https://api.github.com/licenses/apache-2.0"
      }
    

Rust is offered under both MIT and Apache 2, and has corresponding LICENSE-MIT
and LICENSE-APACHE files in its root directory. I'm assuming that Github
simply searches alphabetically for all files that begin with LICEN[CS]E and
eagerly terminates upon finding a single match.
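
The guessed behavior can be sketched in a few lines of Python. This is only an illustration of the assumption above (sort the file names, stop at the first LICEN[CS]E match); the function name `first_license_file` and the file list are hypothetical, and this is not GitHub's or Licensee's actual code.

```python
import re

# Assumed pattern from the guess above: names beginning with LICENSE or LICENCE.
LICENSE_NAME = re.compile(r"^LICEN[CS]E")


def first_license_file(filenames):
    """Return the first matching file name in alphabetical order, or None."""
    for name in sorted(filenames):
        if LICENSE_NAME.match(name):
            return name  # stop at the first match; later matches are never seen
    return None


# Hypothetical root directory listing for a dual-licensed project.
root = ["README.md", "LICENSE-MIT", "LICENSE-APACHE", "Makefile"]
print(first_license_file(root))  # prints LICENSE-APACHE
```

Under this assumption LICENSE-APACHE sorts before LICENSE-MIT, which would explain why only Apache 2.0 is reported.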

~~~
yaeger
That would be very bad for projects that not only use dual licenses but
especially if one of them is a commercial license. If you just checked there
and found the apache and think "Great, I can put that into my closed source
tool" you might be in for a surprise when the author contacts you and tells
you that the apache is only for non commercial use and there is a second, paid
license for commercial use available that you should have taken.

~~~
delsalk
Surely doing that would require them to rewrite the Apache license (thus
making it no longer the apache license) otherwise there's no legal way of
enforcing the non-commercial use?

------
frik
It's good to see MIT and GPLv2 in the top 3, with BSD* and LGPL* in the top
10.

The MIT license is also known as the _X11 License_:
[https://www.gnu.org/licenses/license-list.en.html#X11License](https://www.gnu.org/licenses/license-list.en.html#X11License)

In the age of the _secure boot_ functionality of _UEFI_ (the BIOS replacement),
which can hinder or outright prevent the installation of alternative operating
systems, and of locked _firmware_ on common smartphone and tablet hardware,
the source code license has strategic value, especially for high-profile open
source projects like operating systems. Examples: Android, based on GPLv2
Linux, and OSX/iOS and many routers, based on *BSD operating system source
code, all of which ship on locked-down hardware.

------
pointfree
I'm pleasantly surprised to see that AGPLv3 is at least in the 10th spot.

------
reiz
Very interesting blog post. I wouldn't have thought that so few projects on
GitHub provide a proper license! I'm working on
[https://www.versioneye.com](https://www.versioneye.com) and we currently
track more than 500K open source projects in package managers. I just did
a quick lookup in our database about Ruby licenses. Currently we have licenses
for 56803 Ruby projects (RubyGems) and 49842 of them are MIT! That means 87%
of all Ruby projects that provide a license at all provide an MIT license! I
will do a couple more queries and write a blog post about this!
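
The share quoted above can be checked with a quick calculation. A minimal sketch in Python; the function name `mit_share` is made up for illustration, and the two counts are the ones given in this comment.

```python
def mit_share(mit_count, licensed_count):
    """Fraction of licensed projects that use the MIT license."""
    return mit_count / licensed_count


licensed = 56803  # RubyGems projects with a known license, as quoted above
mit = 49842       # of those, projects under the MIT license

print(f"{mit_share(mit, licensed):.1%}")  # prints 87.7%
```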

------
igravious
Well. Thanks for the reminder. I always mean for the code I put on Github to
be shared but I rarely seem to plonk a license file in the project root
folder. (Does that mean I subconsciously _don't_ want to share???)

I just GPLv2'd
[https://github.com/igravious/clearsilver_ruby](https://github.com/igravious/clearsilver_ruby)
and
[https://github.com/igravious/clearsilver_ebuild](https://github.com/igravious/clearsilver_ebuild)
Nothing earth-shattering but it's a start I hope...

~~~
__david__
> Does that mean I subconsciously _don't_ want to share???

No, it's just a reminder that releasing software is kind of a pain. You have
to write docs, you have to choose a license, you have to organize things, tag
your repo, maybe make a .tar file if your project is big enough (or it's
expected by your audience).

> Nothing earth-shattering but it's a start I hope...

Everyone has to start somewhere! Nice job.

~~~
igravious
Thanks :)

------
sergiotapia
I'm really happy that MIT is in the lead by such a wide margin. It's the best
license in my opinion.

~~~
cpach
That’s interesting indeed! I get the feeling that the GPL has lost a lot of
mindshare over the last fifteen years or so.

~~~
tedunangst
As open source software itself has gained mindshare, more people have become
aware that there's more to it than just Linux, er sorry, GNU/Linux, and those
parts aren't all GPL. Anybody developing for the web was probably using some
combination of perl, php, python, ruby, node, etc. and those parts, i.e., the
parts they're directly exposed to, aren't GPL.

~~~
cpach
Good point. And also, the type of software will influence the choice of
license. E.g. when Facebook released the React library under the BSD license
they most probably decided that the potential loss in competitive advantage
would be unimportant. They could have gone with the GPL and then been able to
offer a commercial license as well, but that would likely just be a
distraction from Facebook’s core business. For a company’s main product
offering it’s a different equation.

------
quadrangle
Too bad that choosealicense.com is so carefully designed to maintain neutral
appearance while actively discouraging copyleft licenses (relegate GPL to last
option, describe it as only about sharing rather than about preserving
freedom, make it outdated GPLv2 — since they like GPLv2 loopholes better than
closing the loopholes etc.)

------
agwa
> To detect what license, if any, a project is licensed under, we used an open
> source Ruby gem called Licensee to compare the repository's LICENSE file to
> a short list of known licenses

They should also look in COPYING, which is the conventional place to declare
that a project is licensed under the GPL. The GPL percentage would likely get
a boost if they did this.

~~~
JoshTriplett
I checked Licensee, and it does look in COPYING:
[https://github.com/benbalter/licensee/blob/master/lib/licens...](https://github.com/benbalter/licensee/blob/master/lib/licensee/project.rb)

~~~
rmc
But not LICENCE

~~~
benbalter
It does!
[https://github.com/benbalter/licensee/blob/master/lib/licens...](https://github.com/benbalter/licensee/blob/master/lib/licensee/project.rb#L56)

------
tracker1
I prefer the shorter ISC license to BSD or MIT... It's the default from npm's
init, and just feels better than putting more verbiage than what's really
needed.

------
eliotfowler
I'd like to see them do some analysis and attempt to include licenses in
READMEs. That seems to be VERY common among projects on GitHub.

~~~
reiz
I once wrote a license crawler for VersionEye which recognises the most common
licenses in README files on GitHub. But I didn't crawl the whole of GitHub! Only
projects which are submitted to package managers and did not provide license
info on the package manager. I'm using it, for example, to complete the license
information for RubyGem projects without a license on RubyGems.
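
To illustrate the kind of README matching described here, a minimal sketch in Python follows. The pattern list and the function name `detect_licenses` are assumptions made for illustration, not the actual VersionEye crawler, and a real crawler would need many more patterns.

```python
import re

# Hypothetical patterns for a few common licenses, checked in this order.
LICENSE_PATTERNS = [
    ("MIT", re.compile(r"\bMIT licen[cs]e\b", re.IGNORECASE)),
    ("Apache-2.0", re.compile(r"\bApache licen[cs]e,? (?:version )?2\.0\b", re.IGNORECASE)),
    ("GPL-2.0", re.compile(r"\bGPL ?v?2\b", re.IGNORECASE)),
    ("BSD", re.compile(r"\bBSD licen[cs]e\b", re.IGNORECASE)),
]


def detect_licenses(readme_text):
    """Return the labels whose pattern appears in the README text, in list order."""
    return [label for label, pattern in LICENSE_PATTERNS if pattern.search(readme_text)]


# Hypothetical README content.
readme = "## License\n\nReleased under the MIT License."
print(detect_licenses(readme))  # prints ['MIT']
```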

------
vitno
If I start a project on GitHub, I generally put a license in the LICENSE file
(at their repo creation). If I don't start it on GitHub and later add it, I
generally just put the license in the README(.md, 'cause GitHub).

This should definitely crawl the README as well.

------
ksk
>Here at GitHub, we're big fans of open source,

When will GitHub be open sourced?

~~~
Xylakant
I'd downvote you twice if I could. Open Source is not and never was about not
making money and it was never about giving away what makes the core of your
business. Github shares a lot of code and projects that are useful to the open
source community [1] and at the same time provides a free service that is
tremendously useful for a lot of open source projects. I feel like they're
doing their part of sharing, attacking them because they're a profitable
company is just cheap.

[1] see [https://github.com/github](https://github.com/github),
[https://github.com/libgit2/libgit2](https://github.com/libgit2/libgit2)

~~~
adrusi
Licensing their entire infrastructure would probably not impact their profit.
There are already very capable competitors to Github, the also-proprietary
Bitbucket, and the Free Gitlab most notably.

It's not so much their technology that keeps them the market leader, their
tech isn't really anything special (not saying it's poor quality or trivial).
It's their network momentum that makes them the perennial favorite.

If they released their entire source code under GPLv3 and started accepting
pull requests, they would almost certainly be fine. Bitbucket couldn't make
much use of their code unless Atlassian followed suit and released _their_
source under the terms of GPLv3 as well.

They also of course sell a self-hosted version of github to enterprise
customers, who now _could_ give them the finger and just build the application
from source, but if they wanted to do that, they would just use Gitlab. The
enterprise customers are paying for the support package that comes with the
software.

Free software isn't about not making money. It _is_ about "giving away" the
core of your business. Not because we want to run ourselves out of business,
but because we have an ethical obligation to preserve our users' freedoms to
run the software however they want, to study and modify its implementation,
and to redistribute it regardless of whether it was modified. Or at least many
of us believe we do.

Part of the reason the GPL exists is to help your users without endangering
your business.

~~~
sytse
Thanks for mentioning GitLab! Giving away the core of your business is harder
than giving away the fringes. At GitLab we would love to have only one free
version. But we need an open-core business model (with a proprietary
Enterprise Edition) so that the enterprises pay for licensing. Making money
only on support is very hard, especially when you ship a stable and
easy-to-use product.

~~~
mikekchar
Please don't take this as criticism, because I don't mean it that way. I just
want to point out the possibility that the situation you find yourself in is
based on your choices, not necessarily on an inability to create a free
software business model. I have not looked at GitLab specifically, if the
following does not apply to how you do business, then I apologize. But for
interest's sake, consider the following.

Most software companies work on a "push" based model. They have BAs/PGMs who
decide what work needs to be done, make plans and get development in motion.
The developers implement the features and after a certain period of time, the
features are finished. The business then sells licenses to the software based
on the features that exist in the product.

In this kind of business model, you have many problems. First, you have a
fairly large up front investment which you need to recover. The initial
software development is paid for by the company and you need to find someone
who is willing to reimburse those costs. Selling support is untenable because
if the product requires a lot of support (meaning that the cost of the support
contract is worth it), nobody will use the product. So instead the company
must restrict the use of the product to paying customers. In this system, the
no-cost, open source version is a loss leader that brings new customers. Such
a system may actually be antithetical to free software developers because
there is a huge incentive for the company to introduce vendor lock-in so as to
push people to the paying version.

However, "push" models are not the only model you could use for development.
While they are comparatively rare, "pull" models also exist. In a "pull"
model, you avoid doing work unless someone pays you to do it. While it might
be bug fixes, it is more likely features. Organizations are willing to pay for
specific features because they get a defined benefit for a single up-front
cost. As long as there are several customers who are willing to pay for
development, the cost is shared across those customers.

In such a system, it is the company's actions rather than their IP which is
valuable. In fact it would be detrimental to restrict people from getting
access to all the features because you lose potential customers. Also in such
an environment, it is very important that nobody makes a proprietary version
of the software to unfairly compete with you, so licenses like the GPL are
incredibly useful.

As I said, examples of this type of model are few and far between. It is
unfortunate because I think there is a lot of potential. If you got this far
and are interested in one example of a successful "pull" model business, I
highly recommend reading:
[http://www.oreilly.com/openbook/opensources/book/tiemans.htm...](http://www.oreilly.com/openbook/opensources/book/tiemans.html)

Cygnus was eventually acquired by Red Hat for about $700 million in Red Hat
stock. I can't find numbers to back this up, but from memory I believe they
had sales of over $100 million per quarter at the time (the best place to look
for how Cygnus impacted Red Hat after the acquisition is to look at the
quarterly reports from the year 2000 on).

I hope that proves interesting. Whether or not GitLab could transition to such
a business model is obviously a different question. It is a completely
different way of doing business and I don't think you could just tell people
to change overnight. However, if you are really serious about exploring
options for moving to a more free software oriented approach, I hope the above
offers some clues for how to start.

~~~
sytse
Thanks for the advice, it was interesting reading about Cygnus, I was not
aware of their story. At GitLab we already use both the pull and the push
model. Since we started there have been people contributing to GitLab (more
than 700!) and we encourage more contributions:
[http://feedback.gitlab.com/forums/176466-general/status/7964...](http://feedback.gitlab.com/forums/176466-general/status/796455)

We also do sponsored development where organizations pay for adding new
features.

People and organizations love contributing and sponsoring new user facing
features. But other work is not contributed. Release management, dependency
updates, security investigations and design updates don't receive much love.
Few people want to do the work and few organizations want to sponsor it.

I don't say pull driven software can't be secure (FreeBSD) or that it can't be
well designed. But with GitLab we were not happy with the rate of progress on
these fronts. So we adopted an open core model to pay people to help with
this. This meant our free software (that has the same design and security as
the proprietary one) advanced at a much more rapid pace.

We're sure that staying 100% pull driven would not have created such a
quality product as GitLab is today. We respect other organizations working
with other models (Linux, Rails, etc.) but we're very happy with the choice we
made for GitLab.

------
kijin
Is there any way to select a license for an existing GitHub repository,
without adding or modifying any file at a particular location in the tree?

Also, it's difficult to find out the license for any given repository while
casually browsing or searching. I'm often looking for things that come with
(or without) a particular license, e.g. "I can't use any GPL components in my
current project." It would be really nice if the license was displayed
prominently, just like the programming language used.

------
rmc
Seems to look for the file "LICENSE". Some people spell it "LICENCE"

~~~
benbalter
It does!
[https://github.com/benbalter/licensee/blob/master/lib/licens...](https://github.com/benbalter/licensee/blob/master/lib/licensee/project.rb#L56)

~~~
rmc
Yes, turns out I was mistaken!

------
XiZhao
For another alternative:

[https://tldrlegal.com/api/license](https://tldrlegal.com/api/license)

is available w/ default REST pagination/query params for content on TLDRLegal

------
fiatjaf
Ok, I just added a lot of LICENSEs from the web interface to a lot of repos
and I'm not ashamed.

------
HeadlessChild
All those Linux "dot-files" and similar stuff take up quite a lot of space on
GitHub.

------
seth_gillette
So some intern put together a list of all the licenses?

