

Finding Unlicensed Repos on Github - niggler
http://aniggler.tumblr.com/post/48445125606/finding-unlicensed-repos-on-github

======
jcr
Your script is a great step in the right direction, but unfortunately, having
a file specifically named "LICENSE" is not universal. It's almost as common to
see a file named "COPYING" and plenty of other alternatives are used.

More importantly, any license stated in the actual source files supersedes the
additional "LICENSE" file. There is also the headache of binary files such as
images, where verifying the license is just painful.

I wish it were as easy as running a script. It would be a good idea if github
enforced some convention to make sure the free repos they provide for open
source really are licensed under an OSI-approved license.

~~~
niggler
"It would be a good idea if github enforced some convention to make sure the
free repos they provide for open source really are licensed under an
OSI-approved license."

I wholeheartedly agree. Every once in a while you stumble upon a landmine:
<https://github.com/stephen-hardy/xlsx.js/issues/8>

------
geraldcombs
The license specified in a LICENSE or COPYING file may not match the rest of
the source code. Debian's devscripts package has a utility called
"licensecheck" that will scan a directory and report the license used by each
file. Chromium has a wrapper called "checklicenses.py" that checks the output
of licensecheck against a list of incompatible licenses. If someone were to
take the next step of letting you point checklicenses.py at a remote
repository the world would be a better place.
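
To make the per-file idea concrete, here is a rough sketch of a header scan. This is not licensecheck or checklicenses.py, and it does not reproduce their rules; the marker table, the function names, and the 30-line header window are all assumptions made up for illustration.

```python
import re
from pathlib import Path

# Hypothetical marker table; real tools use far more thorough rules.
LICENSE_MARKERS = {
    "MIT": re.compile(r"Permission is hereby granted, free of charge", re.I),
    "Apache-2.0": re.compile(r"Licensed under the Apache License, Version 2\.0", re.I),
    "GPL": re.compile(r"GNU General Public License", re.I),
}


def detect_license(text, head_lines=30):
    """Return the first license whose marker appears in the file header, else None."""
    header = "\n".join(text.splitlines()[:head_lines])
    for name, pattern in LICENSE_MARKERS.items():
        if pattern.search(header):
            return name
    return None


def scan_tree(root, suffixes=(".py", ".js", ".c", ".h")):
    """Map each source file under root to its detected license (or None)."""
    report = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            report[str(path)] = detect_license(path.read_text(errors="replace"))
    return report
```

Files that come back as None are the ones that would need a manual look, and any file whose result differs from the top-level LICENSE or COPYING file is a mismatch worth flagging.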

~~~
mh-
is checklicenses.py suitably licensed for that? :)

------
tjaerv
The term you actually want is "license-free software":
<http://en.wikipedia.org/wiki/License-free_software>

There's some more to be found regarding the terminology here:
<http://ar.to/2010/12/licensing-and-unlicensing>

------
graue
This is a really important problem. I can't count the number of times I have
seen something promising on GitHub only to notice on closer inspection that it
has no license attached at all. Or, almost as bad, it just says "License: MIT"
at the end of the readme with no link or actual license text, which I doubt
(IANAL) is legally meaningful.

Sometimes I respond by filing an issue, titled "License?", where I gently
suggest applying the MIT or Apache license or something similar. Usually
that's what the authors intended, they just forgot, and are receptive to
adding it. But I still kind of hate to be That Guy. Even though it's
important, it feels like I am trying to correct someone's grammar. I wish more
people would step up and be That Person so I wouldn't have to be.

I do generally use MIT-LICENSE.txt on my repos (and some other variations on
older ones) so I agree that a slightly more general script would be nice if we
are to solve this programmatically.

~~~
niggler
"But I still kind of hate to be That Guy. Even though it's important, it feels
like I am trying to correct someone's grammar. I wish more people would step
up and be That Person so I wouldn't have to be."

I can assure you that this is probably worse than anything you've done:
<https://github.com/stephen-hardy/DOCX.js/issues/1>

" Many people assume that code on github is open source, but that is far from
the truth. In fact, the Microsoft Office Extensible File License exemplifies
Open Source Trolling: each clause an insult to the diligent readers'
intellect.

The real problem with all code under the license is that it muddies the water.
Microsoft could use your code as a reason to take legal action against others
who genuinely try to innovate and use DOCX as a data format. You hurt the
community far more than you help by releasing pseudopen source code. Ever
think that those lawyers might want you to do this so that they can go after
others later on?

@stephen-hardy I think you are a reasonable person, and I might be niggling a
bit, but neither of us want to see innovation stifled by myriads of lawsuits
because one person's effort to release code created a miasma around a beloved
software product. Let this be a clarion call, and please share with your
coworkers and superiors: unless the code can be released in a proper open
source format, it's better that you don't release it. "

------
orta
We have this issue in the obj-c community with CocoaPods. One of the best
choices we've made lately is to refuse libraries that do not have a license.
Definitely wish that github would make you put some kind of license on a repo
if you are going to make it public.

~~~
niggler
Is that an automated process or do people manually inspect proposed libraries?

~~~
orta
We have a declarative Ruby file that must include a license
(<https://github.com/CocoaPods/Specs/blob/master/ARAnalytics/1.2/ARAnalytics.podspec>),
which doesn't have to be OSS (because we support closed source libraries)
but is nearly always OSS.

So we have a CI linter that checks every library.
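
For anyone curious what such a check could look like, here is a minimal sketch. It is not the actual CocoaPods linter; it only assumes that a podspec declares its license with an assignment such as `s.license = 'MIT'`, and the function name is hypothetical.

```python
import re

# Matches an uncommented assignment like `s.license = 'MIT'` or
# `spec.license = { :type => 'MIT' }` at the start of a line.
LICENSE_ATTR = re.compile(r"^\s*\w+\.license\s*=\s*\S", re.MULTILINE)


def podspec_declares_license(podspec_text):
    """Return True if the podspec text contains a license assignment."""
    return bool(LICENSE_ATTR.search(podspec_text))
```

A CI job could run this over every submitted spec and reject the ones where it returns False. It says nothing about whether the declared license is accurate.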

------
mkelley
I agree - GitHub being the great repository it is for Open Source projects, it
really wouldn't be a bad idea to have some sort of reminder to users when
creating a repo to add a file detailing the license the code is being released
under.

------
CodeCube
I really wish github had an automated tool to add a license file (by easily
choosing from a list of existing licenses, of course). I always neglect to
include a license on my projects ... and then procrastinate doing it
afterwards.

~~~
niggler
" automated tool to add a license file (by easily choosing from a list of
existing licenses, of course)"

I see the merits of that (it would be nice to see that option in the "Add
Repository" page), but I worry that licenses would then be set by autopilot
(without actually considering whether the license is applicable), creating
even more problems.

------
Zolomon
Cool stuff! But you forgot to license your code. ;)

~~~
niggler
Added :) See, even for small things it's easy to forget. Hopefully someone
makes a script to remind you to license gists.

~~~
Zolomon
If I had the energy for it today, I would add to your script so it fetches the
repository owner's e-mail address and sends them a reminder, or adds an issue
to the project!
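
The issue half of that could be sketched roughly as below. It assumes the GitHub REST endpoint for creating issues (`POST /repos/{owner}/{repo}/issues`) and a personal access token; the function name and the issue wording are made up, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_license_issue_request(owner, repo, token):
    """Build (but do not send) a request that would open a "License?" issue."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues"
    payload = {
        "title": "License?",
        "body": (
            "I could not find a license for this project. "
            "Would you consider adding one, such as MIT or Apache?"
        ),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"token {token}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` with a real token would file the issue, so anything automated should be rate-limited and probably reviewed by a human first.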

------
gsiener
Quick plug for License Audit: <http://licenseaudit.pivotallabs.com>

We developed that at Pivotal Labs since we need to pay attention to licenses
while working w/ our clients. Just connect your repos, add licenses to your
whitelist, and get updates if you're not in compliance. Feedback welcome!

------
rubbingalcohol
This is a great post and a good crack at addressing the problem. Your post
raising awareness of the importance of clear licensing is probably a more
valuable contribution than your script itself. I've lost track of how many
times I had to pass up on a good project on Github because of an unclear
licensing situation.

~~~
niggler
"I've lost track of how many times I had to pass up on a good project on
Github because of an unclear licensing situation."

A situation many of us have experienced. Until today, I thought I was alone in
my concerns regarding licensing.

" your script itself"

It's a gist for a reason. If I truly thought it was the best starting point
for a proper "license niggler", I would have made it a proper repo :) This
fits my particular licensing scheme (only using a LICENSE file).

------
NathanKP
You should also check for the existence of a package.json file at the root,
which is the Node.js style. These files can contain licensing information for
Node modules and projects, and personally I think this is a better pattern
than making a separate LICENSE file.
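
A small sketch of that check, assuming the common shapes of the field: a plain `"license"` string, the older `{"type": ...}` object, or the older `"licenses"` array. The function name is just for illustration.

```python
import json


def license_from_package_json(text):
    """Return the declared license name from package.json text, or None."""
    data = json.loads(text)
    lic = data.get("license")
    if isinstance(lic, str):
        return lic
    if isinstance(lic, dict):
        return lic.get("type")
    # Older packages sometimes list several licenses; report the first one.
    licenses = data.get("licenses")
    if isinstance(licenses, list) and licenses:
        first = licenses[0]
        return first.get("type") if isinstance(first, dict) else str(first)
    return None
```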

~~~
niggler
I don't disagree, but that's only applicable for node.js code. There's no
tradition of using package.json for fortran or C.

Although I do like the overall theme of developing a language-agnostic way of
indicating licenses (because, as also mentioned by jcr, checking for LICENSE
doesn't cut it).

------
nevir
You should check for any root level file with all caps LICENSE in it. Other
common examples:

LICENSE.txt, LICENSE.md, MIT-LICENSE, MIT-LICENSE.txt, etc.
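
Something like this sketch would cover those names. Treating COPYING the same way is an extra assumption based on jcr's comment, and the function name is hypothetical.

```python
def find_license_files(root_filenames):
    """Return root-level names containing all-caps LICENSE or COPYING."""
    markers = ("LICENSE", "COPYING")
    return [
        name for name in root_filenames
        if any(marker in name for marker in markers)
    ]
```

Feeding it the output of `os.listdir(repo_root)` gives the candidate files; an empty list means the repo has no obvious license file at the root.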

~~~
niggler
A general tool would do that :) I try to use LICENSE for my projects.

