Most interesting to me though was that it finally gave a clear description of how the contest worked. I've never looked into getting involved with the contest and the past articles I've read have rather glossed over the format and the way in which they run tests and judge entrants.
It'd be fascinating to know how much each percentage point increases sales.
1) It is only perceptible after someone is a customer.
2) It is not strongly perceptible (people suck at precise measurements -- that is why we have computers tell us when the root mean squared error is 9.63% instead of just looking at our slice of 4 movie recommendations and saying "HOT DAMN that is better than last week -- you gave Saving Private Ryan 3.5 stars -- a strong improvement over your last rating of it at 3, which didn't quite represent my interest.")
3) It only affects a portion of the customer base. Netflix is a service for delivering movies, not for rating them. The feature is doubtless useful to many and of intense interest to a few, but there are equally doubtlessly many Netflix customers who don't even know it exists.
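(For anyone curious, the RMSE metric the contest scores on is simple to compute yourself; a minimal sketch, where the specific ratings are just made-up example data:)

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

# Hypothetical predictions vs. a user's actual star ratings
print(rmse([3.5, 4.0, 2.0, 5.0], [3.0, 4.0, 2.5, 5.0]))  # about 0.354
```

The contest leaderboard ranks entrants by how much they reduce this number relative to Netflix's own Cinematch baseline.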
Now, there might be some bonus from the perception of better accuracy (or, for that matter, from the geek cred) that running the Netflix Prize confers... but that wouldn't be tied to the objective reality of whether the algorithm actually improved or not.
It is likely that A/B testing the call to action on the signup button would move the needle a lot more, for a lot less work. (Please don't stone me too harshly.)
If this weren't their worry, the recommendation algorithm would be counterproductive: by giving consumers more movies they like to watch, Netflix increases its own postage costs, etc.
Amazon's algorithm is actually quite simple (at least as much as can be extrapolated from the outside, and confirmed by interviews with the people who wrote it), but they do a lot of work to place recommendations in context. I think that's one of the critical places to focus when looking for raw sales boosts.
The ability to predict that, say, movie ABC would score a 3 from customer X really isn't very helpful for anything.
As a Netflix customer, I want them to suggest things that I'm likely to rate 5, and perhaps to warn me away from things that I'd consider a 1 or 2. I want to ignore 3 and 4 movies, so it doesn't matter if the algorithm can predict them well. It also doesn't matter a lot if it is able to ferret out every possible 5; as long as it gets a substantial portion, everything is great.
I played a bit with this contest when it first started, along with a friend. In just one night, I was able to beat Netflix's own algorithm by enough to get us onto the leaderboard (temporarily). And the algorithm I used was nothing fancy, just correct scaling and normalizing using means and standard deviations: high school math. Unfortunately, neither of us knew anything about algorithms for collaborative filtering, so we quickly reached a plateau.
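Since I've been asked about it before: the idea was roughly to express each rating as a z-score within that user's own history, average those per movie, and rescale back to the target user's mean and spread. A rough sketch from memory (the data layout and function names here are just illustrative, not what we actually ran):

```python
import statistics

def movie_zscores(ratings_by_user):
    """Average each movie's rating expressed as a z-score within each
    rater's own history, so harsh and generous raters are comparable."""
    z_sums, z_counts = {}, {}
    for ratings in ratings_by_user.values():
        mu = statistics.mean(ratings.values())
        sigma = statistics.pstdev(ratings.values()) or 1.0  # guard against zero spread
        for movie, r in ratings.items():
            z_sums[movie] = z_sums.get(movie, 0.0) + (r - mu) / sigma
            z_counts[movie] = z_counts.get(movie, 0) + 1
    return {m: z_sums[m] / z_counts[m] for m in z_sums}

def predict(user_ratings, movie_z):
    """Rescale a movie's average z-score back onto one user's own
    mean and standard deviation to get a predicted star rating."""
    mu = statistics.mean(user_ratings.values())
    sigma = statistics.pstdev(user_ratings.values()) or 1.0
    return mu + sigma * movie_z
```

For example, a user whose ratings average 4.0 would be predicted a 4.0 for a movie whose average z-score is 0 (i.e., a movie people rate at their personal average). It really is just high school statistics, which is why it plateaus fast.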
What it does do (assuming it works) is modify that bottom line (profit) in a number of ways:
1. Given the same $1M expenditure on product development, nets better results.
2. Free advertising (reducing expenses of that department) with all this indirect marketing.
Additionally, a better algorithm helps them secure their position in the market, which is really more of a strategic move and hard to measure.