

GitHub Recommended Repos Contest - pjhyett
http://contest.github.com

======
evdawg
"GitHub must be allowed to use the code commercially without restriction,
regardless of the license chosen."

I'm not too crazy about this. Contests can be good and bad at the same time.
"Here guys, solve this very hard problem for us for free and we'll take it off
your hands and we'll completely ignore your licensing wishes and use it
without any restrictions whatsoever."

I envision a lot of "Well if you don't want to participate, DON'T!", but the
point of the matter is, I see GitHub as taking advantage of a warm and
supportive open-source community. The license of the submitted code should
apply to everyone, including GitHub!

~~~
defunkt
Free? If you win, you get a lifetime GitHub account and a tasty bottle of
bourbon.

We're trading you the prizes for a commercial license.

~~~
chaqke
I'd recommend you try Rittenhouse's 23-year-old and Black Maple Hill's
21-year-old.

~~~
defunkt
I'm a big Black Maple Hill fan. Will have to try Rittenhouse - thanks!

------
icefox
Part of the challenge of the Netflix contest was the dataset. At 101 million
rows of data you couldn't toss your simple algorithm at the problem. But with
GitHub's data at only 440,237 rows... I am tempted to toss it into my Netflix
code just to see how fast the recommendations are generated!

~~~
icefox
Some other info on the data:

    user_id 1-56554  (i.e. 16 bits) <- yah
    repo_id 1-123344 (i.e. 17 bits)
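
To check those bit widths, here is a minimal sketch. The maximum ids are the ones quoted above; the `pack`/`unpack` helpers are hypothetical and only illustrate how a (user, repo) pair could be stored in a single integer key.

```python
# Maximum ids as quoted in the comment above.
MAX_USER_ID = 56554
MAX_REPO_ID = 123344

def bits_needed(max_id: int) -> int:
    """Number of bits required to represent every id from 1 to max_id."""
    return max_id.bit_length()

user_bits = bits_needed(MAX_USER_ID)   # 2**15 <= 56554 < 2**16
repo_bits = bits_needed(MAX_REPO_ID)   # 2**16 <= 123344 < 2**17
pair_bits = user_bits + repo_bits

# Hypothetical helpers: pack a (user_id, repo_id) pair into one integer key.
def pack(user_id: int, repo_id: int) -> int:
    return (user_id << repo_bits) | repo_id

def unpack(key: int) -> tuple[int, int]:
    return key >> repo_bits, key & ((1 << repo_bits) - 1)

print(user_bits, repo_bits, pair_bits)
```

Note that a packed pair needs 33 bits under these assumptions, so it would not fit in a 32-bit integer.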

------
michaelfairley
It'd be nice if there were a leaderboard (as Netflix had) so we could see
others' progress (to see if our implementations stand a chance, or if we're
wasting time).

Edit: Scratch that. It was hiding up top
(<http://contest.github.com/leaderboard>)

~~~
slig
I see that you didn't read the whole page carefully:
<http://contest.github.com/leaderboard>

------
alextp
I wonder if one can brute-force the results.txt file. To the GitHub folks: do
as Netflix did and publish the comparison with one data set (the validation
set) and, after the contest is over, publish the comparisons with a test set
that is only used once. That way nobody can brute-force anything.
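
A rough sketch of that validation/test idea, not GitHub's actual scoring. It assumes the held-out answers are a mapping from user id to one hidden repo id, and that a submission maps each user id to a list of guessed repo ids; `split_holdout` and `score` are hypothetical names.

```python
import random

def split_holdout(holdout, seed=0, validation_fraction=0.5):
    """Split held-out answers into disjoint validation and test sets by user."""
    users = sorted(holdout)
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    rng.shuffle(users)
    cut = int(len(users) * validation_fraction)
    validation = {u: holdout[u] for u in users[:cut]}
    test = {u: holdout[u] for u in users[cut:]}
    return validation, test

def score(guesses, answers):
    """Count users whose hidden repo appears among that user's guesses."""
    return sum(1 for user, repo in answers.items()
               if repo in guesses.get(user, ()))

# Toy data: 10 users, each with one hidden repo id (made up for illustration).
holdout = {user: user * 10 for user in range(1, 11)}
validation, test = split_holdout(holdout)

# During the contest only score(submission, validation) would be published;
# score(submission, test) would be computed once, after the deadline.
```

Because the leaderboard never reveals anything about the test half, repeatedly probing it with submissions tells an entrant nothing about the final ranking.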

~~~
schacon
Figuring out the answers is actually not that hard - we're using data that is
publicly available through our API. In the rules we state that your entry will
be disqualified if you do this sort of thing, though.

------
alextp
GitHub folks: the repos numbered

59337 95472 80221 73599 24616

have no descriptions in the repos.txt file but have had their languages
computed. Is it because they are private? Anyway, you should probably remove
them.

