
Stack Exchange Machine Learning Contest - moserware
http://blog.stackoverflow.com/2012/08/stack-exchange-machine-learning-contest/
======
Homunculiheaded
If you're someone who's interested in ML/data mining but hasn't had a chance
to put your ideas to any hard/interesting problems, I strongly recommend a
Kaggle contest. It's one thing to plug some data into a random forest and go
"oh cool, I guess that did okay" and entirely another to see how you compare
against other competitors.

One of the biggest challenges I've found in implementing ML projects is that I
don't have a great sense of when I've really gotten the most information out
of the data. I'm not particularly competitive, but the contest format is great
for this. When you see that a solution you'd normally be happy with ranks in
the lower half of the leaderboard, you're really pushed to improve it.

This leads you to learn your tools and algorithms better. For a couple of
contests I took seriously, I ended up learning tons about R, spent most of my
nights reading academic papers on various newer techniques, and also read
through a few books. On top of all that, you really should spend time reading
up on how past winners have won, which gives a bunch of practical insight into
approaching different ML problems.

In the contest I tried hardest in I actually placed terribly after the final
results were calculated, but looking over what went wrong, I was amazed to see
how far my understanding of ML had actually progressed. I'd say a month of
serious competing is easily worth a semester-long grad class.

~~~
mej10
I am really interested in ML, but have only recently been diving into it. I
have watched all of the videos for Andrew Ng's Coursera course (and done most
of the programming exercises), but just looking over some of the Kaggle
contests I think I would be quickly out of my depth.

Would I be wasting my time attempting these with such a basic level of
knowledge?

~~~
Homunculiheaded
If you read through the past winners you'll find that in many cases a very
simple model will win. I believe one of the winners that posted a blog post
had pretty much the background you describe.

When I started I was in a similar position to you and just wanted to see if I
could even tread water with some of the really knowledgeable members of the
community. I ended up placing in the top 5 for one of the contests I was in
(with btw a really simple model).

They usually give you some starter code in either R or Python that produces
the results for a benchmark. Start there, then use cross-validation to see if
you can beat that benchmark, and if you do, submit. It's very addictive and
you'll come away knowing a lot more than you started with.
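
If it helps, here's a rough sketch of that loop in Python (the starter code could just as easily be R). The file name, column names, metric, and benchmark number below are all hypothetical, just to show the shape of the workflow:

    # A rough sketch of the cross-validate-then-submit loop, assuming a typical
    # Kaggle setup: a train.csv, a "target" column, and a published benchmark.
    # File name, columns, metric, and benchmark value are all hypothetical.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    BENCHMARK_AUC = 0.65  # hypothetical score of the organizers' starter model

    train = pd.read_csv("train.csv")
    X = train.drop(columns=["target"])
    y = train["target"]

    model = RandomForestClassifier(n_estimators=200, random_state=0)

    # 5-fold cross-validation gives a local estimate of out-of-sample
    # performance, so you know whether you beat the benchmark before
    # spending one of your daily submissions.
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"CV AUC: {scores.mean():.4f} +/- {scores.std():.4f}")

    if scores.mean() > BENCHMARK_AUC:
        print("Beats the benchmark locally -- worth a submission.")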

~~~
mej10
Awesome, thanks for the info. I am checking out some of the benchmarks now.

Why do you think it is that simple models often win? Is it because the experts
aren't participating, or is there a lot more low-hanging fruit than I
previously thought? Or is it just that simple models are easier for humans to
use and reason about, and thus easier to get right?

~~~
numlocked
I work at Kaggle.

In many cases where simple models win, there's some insight into the data that
the winner found - engineered a new feature, or noticed a pattern and
appropriately tuned a particular method. Where those insights exist, they
often overshadow any gains by super-sophisticated ML techniques.
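
To make that concrete, here's a toy sketch of what "engineered a new feature" might look like in practice. The file and column names are invented, and the specific ratio is only an example of the kind of insight involved:

    # Toy illustration: a hand-engineered feature (a ratio of two raw columns)
    # fed to a plain linear model. File name, column names, and the feature
    # itself are hypothetical; the point is the insight, not these fields.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    train = pd.read_csv("train.csv")

    # The "insight": spend per visit is more predictive than either raw column.
    train["spend_per_visit"] = train["total_spend"] / (train["num_visits"] + 1)

    X = train[["total_spend", "num_visits", "spend_per_visit"]]
    y = train["target"]

    model = LogisticRegression(max_iter=1000)
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"CV AUC with the engineered feature: {scores.mean():.4f}")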

~~~
mej10
Makes sense. Thanks!

------
rm999
Looks like a cool contest, I may check it out. What bothers me about modeling
contests (I've taken part in several; it's my field) is they often reward
putting 90% of your effort into extracting relatively small performance gains.
For one thing, it's not a realistic operating environment: there are usually
many factors more important than pure performance, like upkeep, cost, speed,
etc. This is why the Netflix contest's winning models couldn't go into
production. The other issue I have is that people with other commitments (like
a job) don't really stand a chance; it's usually very time-consuming to go
from fifth place to first.

~~~
Homunculiheaded
As someone who went from the top 5 to somewhere in the 60s in one contest, and
having reviewed the results of past contests, I believe a lot of those small
tweaks for slight gains in leaderboard score end up penalizing the contestant
through overfitting. I saw a similar complaint to yours in a couple of forums,
but I do believe that more often than not those small performance gains on the
leaderboard actually hurt final scores.

Additionally, for contests like the Heritage Health Prize [0], I believe the
target of an RMSLE below 0.4 is not considered achievable (I came across this
in the forums but never verified it), so even if the contestants just inch
past 0.4 it would still be something impressive.

0\. <https://www.heritagehealthprize.com/c/hhp/leaderboard>
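
For anyone unfamiliar with the metric, RMSLE is just root mean squared error computed on log-transformed values; a quick sketch with toy numbers (not contest data):

    # Root mean squared logarithmic error, the Heritage Health Prize metric.
    # The predicted/actual values below are toy numbers, not contest data.
    import numpy as np

    def rmsle(predicted, actual):
        predicted, actual = np.asarray(predicted), np.asarray(actual)
        return np.sqrt(np.mean((np.log1p(predicted) - np.log1p(actual)) ** 2))

    print(rmsle([2.0, 0.0, 5.0], [3.0, 0.0, 4.0]))  # ~0.20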

~~~
rm999
It's not about small tweaks; it can be substantial additions to a model that
improve its actual, out-of-sample performance. A popular method in these
contests is ensembling, which involves building many sub-models and combining
their scores into a single ensemble model. The Netflix winner used ~100 sub-
models in their ensemble, but the vast majority of the predictive power came
from just three of those sub-models (can't find the source now).

~~~
Homunculiheaded
Ah, I think I see what you are saying: essentially that the time it takes to
build and tune the blending method and model selection for a 100+ model
ensemble gives you only a slightly better prediction than an appropriately
chosen, reasonably performant model, at a large cost in both computation and
human labor?

What I was addressing was the issue that some users on Kaggle seemed
frustrated that people were essentially submitting models with small parameter
tweaks in order to marginally boost leaderboard scores. To those complaints I
would argue that overfitting is its own punishment.

Thanks for the clarification!

------
msellout
Does any other profession have a Kaggle? Imagine a more general contest: build
my company a tool that increases our market value by X%; we'll give the
winners $Y and a job interview. The expected value of participating is $Y/n,
where n is the number of participants.

It's like the opposite of a professional organization. I suppose the
libertarians approve. It drives down the cost of labor and therefore might
make the market more efficient. Yet I'm suspicious.

I'd like to propose a counter-organization. Analysts can band together and
offer a contest. We collaborate to create a tool that gives your company an X%
increase in value. Companies bid for the rights to that tool. I'd expect that
the value to the laborer would be greater than $Y/n. I guess that just
described a consulting company.

Perhaps the situation is not so unique. Art also provides much value in the
act of production and many organizations hold art contests similar in design
to Kaggle competitions. Open-source software often doesn't even have a
competition sponsor.

It'd be ludicrous to imagine holding a contest to offer the best legal advice
or diagnosis. I'm not saying that I agree with the restrictions that the
American Medical Association has placed over the ability to attend medical
school, but the free market is harsh enough competition.

Kaggle does promote the value of the field as a whole. I worry that it
commoditizes rather than professionalizes.

------
cletus
It's nice to see this kind of contest but the topic just sets me off on a
much-needed rant.

The moderator situation on Stackoverflow is getting out of control. I see a
Q&A site as having three main groups:

1\. People who ask questions;

2\. People who answer questions; and

3\. People who edit/moderate questions.

Even 2+ years ago there was a lot of lip service paid to the value of (3). I
disagreed then and it's only been reaffirmed by subsequent events. To be
clear: it's not that I think these functions have no value, it's that they
are, at best, _secondary_ to content creation.

The problem is that these roles, without diligent oversight, attract the wrong
kinds of people (e.g. [1], [2], and a scandal a few years ago about an admin
blacklist that I can't seem to find right now).

Take this question from Stackoverflow: _Database development mistakes made by
application developers_ [3], a question I spent some time answering and that
people seemed to appreciate the answer to (based on comments and 1000+
upvotes). It is closed as "not constructive". This is hardly a unique
phenomenon. We've all seen many interesting questions posted here that are now
closed or locked and who knows how many have been deleted.

The kind of person you end up with is overly pedantic and a real stickler for
an arbitrary set of rules.

Editors/moderators are the bureaucrats of the Internet.

As Oscar Wilde said, “The bureaucracy is expanding to meet the needs of the
expanding bureaucracy.” [4]. These sorts of people just invent work for
themselves in the absence of anything to do.

Joel needs to make some changes to Stackoverflow. It's rapidly going the way
of the old Usenet days when anything interesting gets shot down and anything
else gets closed and the OP lambasted for not having found the 17 previous
duplicates. Not good.

The biggest problem I see is an extreme interpretation of what is
"subjective". "What language should I learn?" is an obviously subjective
question. In the absence of any concrete criteria, it's hard to give a useful
answer.

But consider a question like "What are the pros and cons of Sinatra vs Rails?"
This sort of question (IMHO) absolutely has value as someone experienced with
both could enumerate the relative merits of each in a pretty objective fashion
without making an absolute determination. This is something that absolutely
could have value to anyone evaluating Ruby Web frameworks.

So, back to this post: what are the odds of any particular question being
closed? It seems to be positively correlated with how much time has passed
(since SO's inception) and how interesting the question is.

[1]: <http://www.nbcnews.com/technology/technolog/wikipedia-admins-face-gauntlet-scrutiny-889502>

[2]: <http://www.searchenginepeople.com/blog/most-notorious-wikipedia-scandals.html>

[3]: <http://stackoverflow.com/questions/621884/database-development-mistakes-made-by-application-developers>

[4]: <http://www.goodreads.com/quotes/130452-the-bureaucracy-is-expanding-to-meet-the-needs-of-the>

~~~
codinghorror
> But consider a question like "What are the pros and cons of Sinatra vs
> Rails?" This sort of question (IMHO) absolutely has value as someone
> experienced with both could enumerate the relative merits of each in a
> pretty objective fashion without making an absolute determination. This is
> something that absolutely could have value to anyone evaluating Ruby Web
> frameworks.

I guess, but Zookeepers could also potentially talk about "What are the pros
and cons of Gorillas vs Sharks?"

<http://blog.stackoverflow.com/2011/08/gorilla-vs-shark/>

> Database development mistakes made by application developers

This is a _discussion_ , not a question. The entire text of said "question"
is, quite literally, "What are common database development mistakes made by
application developers?" If it can have infinite answers, is it really a
question?

<http://stackoverflow.com/questions/621884/database-development-mistakes-made-by-application-developers>

Great post, indeed, but it belongs on your blog.

One of the biggest misconceptions about Stack Exchange is this idea that
discussion is, in and of itself, a net good to the world -- and therefore we
are monsters for not allowing discussion. I do not believe this to be true.
There is, and will always be, an infinity of discussion. Like Jay Leno once
said about Doritos, "type all you want, we'll make more". If something can be
had in infinite amounts, what is its value?

Stack Exchange supports only the minimal subset of discussion necessary to get
practical, useful answers to specific questions. The goal is _not_ discussion,
but science-in-the-small. Back up your claims. Show us references. Show us
data. Share your specific experiences.

Otherwise you end up with Quora, a system where everything is a discussion,
and all answers are opinions. Thus they can only be evaluated based on how
famous the poster is, or how compelling a yarn they can spin.

Nothing against Robert Scoble and great storytelling (I used to work with Joel
Spolsky, after all), but I've seen where that system leads. Given a choice,
I'll always take tiny science. You should too.

~~~
grey-area
_What are common database development mistakes made by application
developers?_

It is a question, an open question intended to provoke debate and teach about
a subject; in effect, it's a request for an FAQ. Now perhaps SO is not intended
for that sort of question, and that is of course for SO to decide.

I suppose the reason many people come to SO to _read_ questions is that they'd
like to learn about a subject area, and the reason many people come to _write_
answers is that they'd like to teach a little about a subject. This sort of
open-ended question offers the opportunity for someone to answer questions the
asker didn't even know they had - should I add an index to my db, and if so,
when? Should I use natural keys? etc. - and, in effect, to unask all those
questions they would otherwise have asked while groping their way to
familiarity with the subject. It functions as an FAQ for that particular
subject, to prevent beginners from making the same mistakes and asking the
same questions over and over.

So that sort of question can be very useful for someone starting out, for the
kind of person your site targets. Maybe that sort of question belongs on some
other site though, a sort of training site rather than a question/answer site,
or maybe SO should just expand to encompass that sort of FAQ function?

I'm not so convinced that for this category of question there is a clear line
between 'db x breaks when I do y, what should I do?', 'Do I need to use db
transactions in db x?', and 'What are the common db mistakes?', or that one
sort of question/answer is rational and another narrative - is the division
really that clear? Are there not many, many borderline questions which solicit
opinion (of someone who knows more about the subject) and yet are useful for
others too? Are not many of these smaller 'science-in-the-small' questions
actually answerable in many different ways, each of which may be somewhat
valid and none of which is 'right' in some categorical way?

For example this question, which is equally open-ended, remains open (rightly
so I think as it could be a useful discussion :)

<http://stackoverflow.com/questions/327199/what-will-we-do-after-access?lq=1>

~~~
codinghorror
> or maybe SO should just expand to encompass that sort of FAQ function?

Some of the tag wikis do this:

<http://stackoverflow.com/tags/java/info>

<http://stackoverflow.com/tags/c%23/info>

So Cletus' post might make more sense as the tag wiki for, say:

<http://stackoverflow.com/tags/database/info>

~~~
grey-area
Yes, it could, although the wikis don't contain the alternative questions or
the dialogue which make Q&A sessions involving several people who know the
subject well useful for other readers. For many questions (particularly open
ones) there are many answers and no right answer for all circumstances, so
there is no clear division between this question and its answers and other,
longer, more specific ones as to being opinion or fact - they're mostly a mix
of both.

Because people will inevitably continue to ask and answer these questions, and
many similar, more specific yet still open ones, and will see the resulting
debate as useful on any QA site, it might be good to have a more structured
way of moving them to an FAQ section on SO without destroying all the ad-hoc
relations and rewards that users have built up using your QA format (i.e. not
turning them into a wiki, which doesn't really suit them and loses all the
attribution, comments, discussion, etc.).

It feels a little draconian at present sometimes when useful answers are
marked as 'trivial' or 'not constructive' when they clearly are constructive,
but are constructive in a direction SO didn't anticipate.

~~~
rhizome
I have to wonder if the original plan was to leverage the question answering
as a wiki generator, but if so they should put pointers on closed questions.
I'm not sure whether I have a problem with how SO is conducting themselves,
since it could simply be that they're opinionated about what constitutes valid
content and aren't afraid to leave the other stuff to other sites. That is,
I'm not sure Zawinsky's Law can be extended to GYOFB situations.

------
ricardobeat

        probabilityOfClosing = (question) ->
            text = question.text.toLowerCase()
            return (text.length / (text.indexOf('jquery') + 2)) / 100

~~~
usea
I realize this was in jest, but your algorithm says that the shorter a post,
the less likely it is to be closed. A post of zero length has a 0% chance of
being closed.

Also, the result is not bounded to 0..1.

------
impendia
Amusing side note: I clicked on their job ad, apparently they score 10/12 on
the "Joel Test", which according to the link indicates "serious problems".

------
finnw
Sounds easier than winning the Loebner Prize, and yet there is more cash on
offer.

I just hope the winning entry will prompt the developers to remove that stupid
filter[1] that prevents you from referring to the Halting Problem in question
titles.

[1]: <http://meta.stackoverflow.com/questions/107989/using-the-word-problem-in-titles>

------
dotborg2
All those people who helped generate data for the machine learning algos may
now feel fooled, like suckers.

------
drudru11
Why is the prize so small when the economic benefit could be much larger?

