

Invite HN: The Algorithmic NCAA Bracket Challenge - jraines
http://www.jeremyraines.com/post/18

======
colins_pride
We're doing something similar to this at our office this year. The wrinkles
are that it isn't a standard tournament bracket point structure, and nobody is
sharing their algorithms.

The point structure is that each person picks 8 of the 64 teams. For every
game that one of those 8 teams wins, the person gets that team's seed added to
their point total. This makes team selection a very interesting exercise: pick
a high-seeded team (a low seed number) and be fairly confident of getting a
few points, or pick a lower-seeded team and risk getting no points, but
perhaps win the whole thing with that one team.
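
As a rough sketch of that scoring rule (the team names, seeds, and win counts
below are made up for illustration, and `pool_score` is just a hypothetical
helper name):

```python
def pool_score(picks, wins):
    """Score one entry in the pick-8 pool.

    picks: dict mapping team name -> seed (1-16)
    wins:  dict mapping team name -> number of tournament games won
    Each win by a picked team is worth that team's seed number.
    """
    return sum(seed * wins.get(team, 0) for team, seed in picks.items())


# Hypothetical example: a 1 seed that wins four games, a 12 seed that wins two.
picks = {"Top Seed U": 1, "Cinderella State": 12}
wins = {"Top Seed U": 4, "Cinderella State": 2}
print(pool_score(picks, wins))  # 1*4 + 12*2 = 28
```

Under this rule, two wins by a 12 seed (24 points) are worth more than a 1 seed
winning all six of its games (6 points), which is where the tension comes from.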

I bought a copy (price: one Starbucks coffee) of 24 seasons' worth of data
from a dependable source (he knows I'll poison his next coffee if he made a
mistake). Now the issue I'm grappling with is that he and I both know we have
the same data, so a game theory element is added.

For those who are interested, it turns out that 3 seeds and 6 seeds have
historically done fairly well, and I've cooked up some rationalizations that I
like for why this might be the case, but then we'll just see what happens this
time around.

~~~
jraines
Cool.

If you want to enter this one as well, you don't _have_ to share your
algorithm.

I just think it will be more interesting if at least the winner does share.

------
tlrobinson
Any suggestions on where we can get relevant data to feed to our algorithm?
(Preferably in a machine-readable format; I'd rather not scrape ESPN.com or
something.)

Also, the password to the jottit page you link to isn't "hackernews" like it
says.

~~~
jraines
password should be fixed.

I haven't found a great site for scrapable stats. I'm using
<http://msn.foxsports.com/cbk/stats> which is at least a step up from ESPN's
stats.

If anyone has a better source, I'd be very interested as well. And if anyone
wants to create one -- I guarantee it will make money from sports touts,
especially if you niche down to one sport and charge less than statfox.com.

~~~
jackowayed
pass still isn't working for me.

~~~
jraines
OK - it's open for anyone now. If not, I've tried again to make the password
hackernews

~~~
jackowayed
nope. maybe jottit is working off of a stale pass or something?

~~~
jraines
OK - screw jottit then. It's set to 'anyone can view and edit' and I've
changed the password to hackernews like 5 times.

Can someone recommend a similar site?

In the meantime -- don't even worry about posting the algorithm. If your
bracket starts to kick ass, just email me with your description or a link to
your personal blog post about it, and we'll have a writeup of the top
performers. Or -- easiest solution yet -- just post it as a message in the
Yahoo! group.

~~~
samueladam
<http://couch.it/>

Built on CouchDB by the way.

------
rms
My algorithm: pick the team with the highest seed. In the case of a tie,
alphabetical order by city name.
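
A minimal sketch of that rule, assuming "highest seed" means the best seed (the
lowest seed number) and that teams are listed in bracket order. The `Team`
tuple, the function names, and the cities in the example are only for
illustration:

```python
from typing import NamedTuple


class Team(NamedTuple):
    city: str
    seed: int


def pick_winner(a: Team, b: Team) -> Team:
    # Lower seed number wins; on a tie, the city that sorts first alphabetically.
    return min(a, b, key=lambda t: (t.seed, t.city.lower()))


def play_bracket(teams):
    # Adjacent pairs meet each round; assumes len(teams) is a power of two.
    while len(teams) > 1:
        teams = [pick_winner(teams[i], teams[i + 1])
                 for i in range(0, len(teams), 2)]
    return teams[0]


# Hypothetical four-team example with two 1 seeds meeting in the final.
field = [Team("Boston", 1), Team("Denver", 2),
         Team("Austin", 1), Team("Chicago", 2)]
print(play_bracket(field).city)  # Austin beats Boston on the alphabetical tie-break
```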

~~~
aneesh
That's probably the best strategy without putting much effort, but I bet some
enterprising HN user will come up with a more complex strategy that has a
higher expected value.

~~~
yters
Just do principal component analysis on past data, then run a linear
regression on the top components. That'll give you the best predictors based
on the past.

~~~
aneesh
The hardest part of that is actually finding all the relevant historical data
and getting it into a good form. Anyway, I'm not about to spend hours on
picking a bracket, so I'll leave that to someone else.

------
jraines
Mine is going to be based on the "Wages of Wins"-emphasized stats of FG%, FT
Attempts, Turnovers, and Offensive Rebounds.

If you want to get a little more hardcore, check out HN user lance's post
about using Logistic Regression Markov Chains for your bracketology:

<http://blog.weatherby.net/2009/03/using-lmrc-to-pick-ncaa-tournament.html>

~~~
lanceweatherby
The Logistic Regression Markov Chain (LRMC) model developed at Georgia Tech to
predict results of NCAA tournament games does a good job at picking both
upsets and late-round winners. It's a bit soft in the middle rounds.

------
gfunk911
Algorithm: If ("lower seed's Pomeroy ranking" + 0.0085) > "higher seed's
Pomeroy ranking", the lower seed wins; otherwise the higher seed wins.

<http://gist.github.com/80978>

Feel free to steal my code

UCLA beats Gonzaga and Memphis beats Louisville, Memphis beats UCLA for title.
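
The threshold rule stated at the top of this comment can be sketched as
follows. This is not the code from the gist, just one reading of the sentence:
it assumes the Pomeroy number is a rating where higher is better, and the
function name and example ratings are made up.

```python
MARGIN = 0.0085  # the bonus added to the lower seed's number in the rule


def pick_winner(higher_seed_rating, lower_seed_rating, margin=MARGIN):
    """Return "lower" if the lower seed is picked, otherwise "higher".

    Assumption: both arguments are ratings on the same scale, where a
    larger value means a stronger team.
    """
    if lower_seed_rating + margin > higher_seed_rating:
        return "lower"
    return "higher"


# Hypothetical ratings: a near-even matchup flips to the lower seed,
# a clear gap stays with the higher seed.
print(pick_winner(0.950, 0.945))
print(pick_winner(0.950, 0.900))
```

With these made-up numbers the first call picks the lower seed and the second
picks the higher seed.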

------
ganley
The cleverest dead-simple algorithm I've seen is to favor the team whose head
coach has the higher salary. The year I first heard of it, it did pretty
well.

------
anon_1111111
www.kenpom.com has all the stats you need sons

~~~
jraines
Looks very interesting; I'm definitely going to look into this.

------
grandalf
here's my algorithm:

1) first assume that no upsets will occur

2) then apply the upset rule to all brackets and to the final games.

upset rule:

- teams from the east beat teams from the north and midwest (2 seed points)

- teams from the south beat teams from the west (1 seed point)

YMMV

------
te_platt
Thanks for setting this up. Should be fun. You'll get my algorithm once the
tournament starts :)

------
Raphael
I went with alphabetical order. Xavier FTW!

~~~
jraines
Funny -- this got downvoted to 0 and is perfect through 5 games

~~~
Raphael
Sweet!

