

Netflix Prize 10% barrier broken - pie
http://www.research.att.com/~volinsky/netflix/bpc.html

======
aswanson
...and I eat crow: <http://news.ycombinator.com/item?id=393667>

Haskell wins: <http://news.ycombinator.com/item?id=394528>

------
larryfreeman
Pretty exciting. I've been working on the prize in my spare time and got to
#162 (0.8838 RMSE).
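
For reference, the leaderboard metric is root mean squared error over predicted ratings. A minimal sketch of how it's computed (toy made-up ratings, not Netflix data):

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    assert len(predicted) == len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual))
                     / len(predicted))

# Toy example with hypothetical 1-5 star ratings.
preds = [3.8, 2.1, 4.5, 3.0]
truth = [4, 2, 5, 3]
print(round(rmse(preds, truth), 4))  # → 0.2739
```

Lower is better; the 10% target meant beating Cinematch's RMSE of 0.9525 with 0.8572 or less.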

The algorithms involved and the solutions from past years have been very
interesting. I highly recommend that people check out the forum posts on
<http://netflixprize.com>

The winning submission was made by a combination of the top 3 teams. In my
opinion, if they win it, it's well deserved! It will be very interesting to
see Pragmatic Theory's method published.

Teams BellKor and Chaos Theory published their methods when they won the 2008
Progress Prize.

~~~
crocowhile
I thought one of the caveats of the prize was that the winner was not supposed
to disclose the algorithm they used, wasn't it?

~~~
scscsc
On the contrary, they are supposed to publish it.

------
SapphireSun
Wow! I wonder, though, how they account for overfitting of the data. Is it a
real solution or a statistical anomaly? I ask because it seems that the
progress over the last year or so has been small increments within the 9-10%
range.

~~~
Eliezer
I'm also worried about this. I believe that Netflix has a separate data set,
not used in the previous reports of mean squared error, which validates the $
prize. I also believe that the teams have been using the Netflix-reported
squared errors on the standard test data to combine their estimates. If so, in
a month we're going to learn that the Prize has not actually been won yet.
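
The concern above can be illustrated with a toy simulation (synthetic data, not the actual Netflix sets, and `model_a`/`model_b` are hypothetical predictors): choosing a blend weight by minimizing error on a small "quiz" sample is guaranteed to look at least as good on that sample as either model alone, while the fresh holdout gets no such guarantee. With one tuned parameter the gap is usually tiny, but it grows as more choices are tuned against the same quiz feedback.

```python
import math
import random

random.seed(0)

def rmse(pred, actual):
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual))
                     / len(actual))

n = 200
truth = [random.uniform(1, 5) for _ in range(n)]
# Two imperfect "models": truth plus independent noise.
model_a = [t + random.gauss(0, 0.9) for t in truth]
model_b = [t + random.gauss(0, 0.9) for t in truth]

quiz, test = range(0, 50), range(50, n)  # small quiz set, larger holdout

def blend(w):
    return [w * a + (1 - w) * b for a, b in zip(model_a, model_b)]

# Choose the blend weight by grid search on the quiz subset ONLY,
# mimicking tuning against leaderboard feedback.
best_w = min((i / 20 for i in range(21)),
             key=lambda w: rmse([blend(w)[i] for i in quiz],
                                [truth[i] for i in quiz]))

quiz_rmse = rmse([blend(best_w)[i] for i in quiz], [truth[i] for i in quiz])
test_rmse = rmse([blend(best_w)[i] for i in test], [truth[i] for i in test])
print(best_w, round(quiz_rmse, 3), round(test_rmse, 3))
```

Because the grid includes w=0 and w=1, the tuned blend can never score worse on the quiz subset than either individual model does there.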

~~~
paraschopra
But I wonder whether the effect of an unknown dataset slowly seeps into the
algorithms as they are tweaked, especially when hundreds of algorithms are
competing to fit that data.

It might be akin to the fact that if you correlate more than 25-30 sets of
25-30 random numbers, you will find at least one statistically significant
correlation.
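
A small simulation of that effect (pure noise, nothing to do with the Netflix data): correlate 30 sets of 30 random numbers pairwise, and a bunch of pairs clear the usual p < 0.05 bar by chance alone.

```python
import math
import random
from itertools import combinations

random.seed(1)

n_sets, n_obs = 30, 30
data = [[random.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_sets)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Critical |r| for n=30, two-tailed alpha=0.05, is about 0.361.
R_CRIT = 0.361
n_pairs = n_sets * (n_sets - 1) // 2  # 435 pairs
spurious = sum(1 for x, y in combinations(data, 2)
               if abs(pearson(x, y)) > R_CRIT)
print(spurious, "of", n_pairs, "pairs look 'significant'")
```

At the 5% level you'd expect roughly 22 of the 435 pairs to come out "significant" even though every series is pure noise.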

~~~
bravura
They are not granted access to that data set, so achieving a 10% RMSE
improvement on the test set solely by chance would be statistically unlikely.

------
davidalln
And so it begins. It will be interesting to see what other teams will do now
with such a tight deadline.

------
froo
Congratulations to team BellKor's Pragmatic Chaos. Now if only they could
spend some of their time researching how to make a page that doesn't look like
it was coded by an 11-year-old MySpace user.

An example from the source

    
    
      <br>
      <br>
      &nbsp;&nbsp;&nbsp;&nbsp;<big><big><big><big>&nbsp;&nbsp;&nbsp;</big></big></big></big><big><big><big><big>&nbsp;&nbsp;</big></big></big></big><big><big><big><big>&nbsp;
    
      </big></big></big></big><br>
      &nbsp;&nbsp;&nbsp; <br>

~~~
redorb
I dare say, if you can do what they are doing ~ your not to worried about
perfect html or design.

~~~
smanek
or grammar, apparently ;-)

("your not to" => "you're not too")

~~~
johnnybgoode
I had _exactly_ the same thought, word for word, when I read the comment, but
I wasn't, um, _inspired_ enough to post it and take the downvotes. ;)

