

Data for 2009 + 2010 March Madness. Can your algorithm predict the tourney? - danger
http://blog.smellthedata.com/2010/03/2009-and-2010-march-madness-data.html

======
aneesh
We had a (small) Hacker News fantasy league for March Madness last year -- the
only rule was that your picks had to be by some algorithm which you shared
after everyone made their picks. I'd be happy to set up one for this year if
there's enough interest.

~~~
danger
That sounds fun. I think the rule should be a bit more hardcore, though: the
predictions have to come from raw data, i.e., no meta-algorithms that use
information about seeding or expert predictions. But if somebody wanted to
gather, say, play-by-play data and use that, it'd be OK.

~~~
aneesh
Ok, I've created a group called "HN" on Yahoo! Sports.

Here's the link: <http://y.ahoo.it/mVPMVA8X>

------
CytokineStorm
How is one supposed to run any sort of machine learning algorithm with only
two seasons of data? I could understand throwing the stats from the last 15-20
seasons into Weka and seeing what it said about 2010, but seriously, how useful
is only two seasons' worth of data going to be?

~~~
dwine
The data there has the scores from ~5000 games played over the course of each
season, and the model he links to also seems quite reasonable to me:
<http://blog.smellthedata.com/2009/03/data-driven-march-madness-predictions.html>

Don't think of it as two data points. Think of it as two data sets.
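
To make that concrete, here is a rough sketch of the kind of thing one season of game scores lets you fit: one strength rating per team, chosen so that the difference between two ratings approximates the score margin. This is not the model from the linked post, just a plain least-squares illustration. The games, team names, and the `fit_ratings` / `predict_winner` helpers are all made up for the example, and it ignores home-court advantage.

```python
# Hypothetical toy data: (home_team, away_team, home_points, away_points).
# These games are invented for illustration; they are not from the linked data.
GAMES = [
    ("A", "B", 80, 70),
    ("B", "C", 75, 65),
    ("A", "C", 90, 70),
    ("C", "A", 60, 78),
    ("B", "A", 68, 77),
    ("C", "B", 66, 74),
]


def fit_ratings(games, steps=2000, lr=0.05):
    """Fit one rating per team so that rating[x] - rating[y] approximates
    the margin by which x beat y, using gradient descent on squared error."""
    teams = sorted({team for game in games for team in game[:2]})
    rating = {team: 0.0 for team in teams}
    for _ in range(steps):
        grad = {team: 0.0 for team in teams}
        for home, away, home_pts, away_pts in games:
            # Error between the predicted margin and the observed margin.
            err = (rating[home] - rating[away]) - (home_pts - away_pts)
            grad[home] += err
            grad[away] -= err
        for team in teams:
            rating[team] -= lr * grad[team] / len(games)
    return rating


def predict_winner(rating, x, y):
    """Pick whichever team has the higher fitted rating."""
    return x if rating[x] >= rating[y] else y


ratings = fit_ratings(GAMES)
print(sorted(ratings, key=ratings.get, reverse=True))
print(predict_winner(ratings, "A", "C"))
```

Every game contributes one equation, so a season with thousands of games constrains a few hundred team ratings fairly well, even though it is "only" one season.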

------
rohanseth
Great idea! Looking forward to seeing your predictions.

