Invite HN: The Algorithmic NCAA Bracket Challenge (jeremyraines.com)
27 points by jraines on Mar 17, 2009 | 25 comments

We're doing something similar to this at our office this year. The wrinkles are that it isn't a standard tournament bracket point structure, and nobody is sharing their algorithms.

The point structure is that each person picks 8 of the 64 teams. For every game that those 8 teams win, the person gets the winning team's seed added to their point total. This makes the team selection a very, very interesting exercise: pick a high-seeded team and be very confident of getting a few points, or pick a lower-seeded team and risk getting no points, but perhaps win the whole thing with that one team.
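A minimal sketch of that scoring rule, to make the trade-off concrete. The function name `pool_score`, the team names, and the game results are all made up for illustration; only the rule itself (add the team's seed for every win by a picked team) comes from the description above.

```python
from typing import Dict, Iterable


def pool_score(picks: Dict[str, int], game_winners: Iterable[str]) -> int:
    """Score one entry in the office pool.

    picks        -- maps each picked team name to its seed (hypothetical data)
    game_winners -- the winning team of every game played so far
    """
    total = 0
    for winner in game_winners:
        # A win by a picked team adds that team's seed; other wins add nothing.
        total += picks.get(winner, 0)
    return total


# Hypothetical entry: one 1 seed and one 12 seed (a real entry would have 8 teams).
picks = {"Alpha State": 1, "Beta Tech": 12}

# Hypothetical results: Alpha State wins three games, Beta Tech wins one.
winners = ["Alpha State", "Alpha State", "Beta Tech", "Gamma U", "Alpha State"]

print(pool_score(picks, winners))
```

Under this rule a 1 seed that wins all six of its games earns 6 points, while a 12 seed that wins a single game earns 12, which is why the long shots are tempting.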

I bought a copy (price: one Starbucks coffee) of 24 seasons' worth of data from a dependable source (he knows I'll poison his next coffee if he made a mistake). Now the issue I'm grappling with is that he and I both know we have the same data, so a game theory element is added.

For those who are interested, it turns out that 3 seeds and 6 seeds have historically done fairly well, and I've cooked up some rationalizations that I like for why this might be the case, but then we'll just see what happens this time around.


If you want to enter this one as well, you don't have to share your algorithm.

I just think it will be more interesting if at least the winner does share.

Any suggestions on where we can get relevant data to feed to our algorithm? Preferably in a machine-readable format; I'd rather not scrape ESPN.com or something.

Also, the password to the jottit page you link to isn't "hackernews" like it says.

I've been following my own algorithm for ranking teams since the start of the NBA season. USA Today has the best computer readable format that I could find.

password should be fixed.

I haven't found a great site for scrapable stats. I'm using http://msn.foxsports.com/cbk/stats which is at least a step up from ESPN's stats.

If anyone has a better source, I'd be very interested as well. And if anyone wants to create one -- I guarantee it will make money from sports touts, especially if you niche down to one sport and charge less than statfox.com

pass still isn't working for me.

OK - it's open for anyone now. If not, I've tried again to make the password hackernews

nope. maybe jottit is working off of a stale pass or something?

OK - screw jottit then. It's set to 'anyone can view and edit' and I've changed the password to hackernews like 5 times.

Can someone recommend a similar site?

In the meantime -- don't even worry about posting the algorithm. If your bracket starts to kick ass, just email me with your description or a link to your personal blog post about it, and we'll have a writeup of the top performers. Or -- easiest solution yet -- just post it as a message in the Yahoo! group.


Built on CouchDB by the way.

My algorithm: pick the team with the highest seed. In the case of a tie, alphabetical order by city name.
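For anyone who wants a baseline to beat, here is a rough sketch of that rule. It assumes "highest seed" means the smallest seed number (a 1 seed outranks a 2 seed); the `Team` type, the function name, and the city names are invented for the example.

```python
from typing import NamedTuple


class Team(NamedTuple):
    city: str
    seed: int


def pick_winner(a: Team, b: Team) -> Team:
    """Pick the better seed; on equal seeds, the city that sorts first alphabetically."""
    # Assumption: a smaller seed number means a higher seed.
    # Sorting on (seed, city) applies the alphabetical tie-break automatically.
    return min(a, b, key=lambda t: (t.seed, t.city.lower()))


# Hypothetical matchups.
print(pick_winner(Team("Springfield", 3), Team("Shelbyville", 14)).city)
print(pick_winner(Team("Springfield", 1), Team("Shelbyville", 1)).city)
```

The tie-break matters once teams from different regions meet, since that is when two teams can carry the same seed.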

That's probably the best strategy without putting much effort, but I bet some enterprising HN user will come up with a more complex strategy that has a higher expected value.

Just do principal component analysis on past data. That'll give the best predictors (based on linear regression) based on the past.

The hardest part of that is actually finding all the relevant historical data & getting it into a good form. Anyway, I'm not about to spend hours on picking a bracket, so I'll leave that to someone else.

Mine is going to be based on the "Wages of Wins"-emphasized stats of FG%, FT Attempts, Turnovers, and Offensive Rebounds.

If you want to get a little more hardcore, check out HN user lance's post about using Logistic Regression Markov Chains for your bracketology:


The Logistic Regression Markov Chain developed at Georgia Tech to predict results of NCAA tournament games does a good job at picking both upsets and late round game winners. A bit soft in the middle rounds.

Algorithm: if (lower seed's Pomeroy ranking + 0.0085) > (higher seed's Pomeroy ranking), the lower seed wins; otherwise the higher seed wins.
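A sketch of the rule above, not the poster's actual code. It assumes the "Pomeroy ranking" is a numeric rating where bigger is better (the 0.0085 offset only makes sense on that kind of scale), and the function and parameter names are made up. The ratings in the example are invented, not real kenpom numbers.

```python
UPSET_MARGIN = 0.0085  # the constant quoted in the rule


def pick_by_pomeroy(lower_seed_rating: float,
                    higher_seed_rating: float,
                    margin: float = UPSET_MARGIN) -> str:
    """Return which side the rule picks: 'lower seed' or 'higher seed'.

    Assumption: ratings are numeric and a larger value means a stronger team.
    """
    if lower_seed_rating + margin > higher_seed_rating:
        return "lower seed"
    return "higher seed"


# Hypothetical ratings: close enough for the margin to flip the pick...
print(pick_by_pomeroy(0.900, 0.905))
# ...and far enough apart that it does not.
print(pick_by_pomeroy(0.900, 0.950))
```

Run once per game, feeding each round's picks into the next, this fills out a whole bracket.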


Feel free to steal my code

UCLA beats Gonzaga and Memphis beats Louisville, Memphis beats UCLA for title.

The cleverest dead-simple algorithm I've seen is to favor the team whose head coach has the highest salary. When I first heard of this, it did pretty well that year.

www.kenpom.com has all the stats you need sons

Looks very interesting, I'm definitely going to look into this.

Thanks for setting this up. Should be fun. You'll get my algorithm once the tournament starts :)

here's my algorithm:

1) first assume that no upsets will occur

2) then apply the upset rule to all brackets and to the final games.

upset rule:

- teams from the east beat teams from the north and midwest (2 seed points)

- teams from the south beat teams from the west (1 seed point)


I went with alphabetical order. Xavier FTW!

Funny -- this got downvoted to 0 and is perfect through 5 games

