

"It's not you, it's me"... says the netflix algorithm to the data-set - thinksketch
http://www.thinksketchdesign.com/2009/10/29/design/algorithm-design/its-not-you-its-me-says-the-netflix-algorithm-to-the-data-set

======
akamaka
"No one seems to question the dataset".

Actually, this has already been pointed out in papers written on the subject.
If you want to learn about the Netflix prize, read the papers from the teams
that solved it, and don't waste your time on self-important bloggers who want
attention.

~~~
thinksketch
akamaka, I can understand how my blog might come across as self-important - I
know I can be over the top, but that's part of the idea (see the tag line of
my blog).

The thing is, I'm writing for a general audience from the point of view of a
general audience. I'm a designer, not a programmer. And though I've read lots
of articles in Wired and on tech blogs about the Netflix algorithm, I haven't
heard much discussion about viable alternatives to the five-star system. Yes,
there are articles talking about how the rating system is faulty, and they
talk about how to best "work around" the faults of the five-star system. But
I'm trying to brainstorm alternatives - scrap the system entirely and build an
algorithm on something else. I'm sure the ideas are out there; I'm just
indignant that, as a general audience, we haven't heard about them yet. I'm
getting great feedback already. Thanks, greatly appreciated.

For such a little UI pattern, the five-star system has an enormous influence
on how we see the internet, and its effects have a tangible impact offline as
well - for example, restaurant traffic influenced by Yelp reviews. I've heard
a lot of people question how five-star reviews influence the range of products
that we're exposed to. It seems like an important question to ask. And look,
now we've got a good brainstorm going. Maybe think of my post as a challenge
and a request for a detailed article from someone who knows their stuff about
rating systems and how they affect our everyday life. Cheers -
~~~
akamaka
Hey, thanks for taking the time to reply. I realize that my comment probably
comes across as being quite personal, but it more reflects my frustration at
the lack of insight into the Netflix Prize that exists in the blogosphere.

Here are some prime examples of people making lots of noise without any data
or science to back it up:

<http://anand.typepad.com/datawocky/2008/03/more-data-usual.html>

<http://scienceblogs.com/cortex/2009/08/netflix.php>

There are actually very few people who have made genuine contributions toward
winning the Netflix prize, as can be seen in the winning team's final
publications. They only list about a half-dozen key papers as references.

Anyways, I apologize for directing my comments specifically at you. I totally
agree with your basic point, and this is a problem I've been spending a lot of
time thinking about myself. My personal view is that explicit rating systems
should be totally eliminated, in favor of using data gathered automatically,
without asking the user to provide a subjective rating. I don't have any
evidence to prove that's better either, mind you. :)

------
aarongough
YouTube recently admitted that they have a similar problem with their rating
system
(<http://youtube-global.blogspot.com/2009/09/five-stars-dominate-ratings.html>).

The raw data simply doesn't show a decent distribution of ratings. People are
far more likely to simply give something 1 star or 5 stars.

I would say that it's far more meaningful to use a simple thumbs up/thumbs
down system...
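
To make that concrete, here is a rough sketch of how you might measure the
polarization and collapse stars into thumbs. The function names, the threshold
of 4 stars, and the sample list are all my own assumptions for illustration;
none of this is real YouTube data or their method.

```python
from collections import Counter


def extreme_share(ratings):
    """Return the fraction of ratings that are either 1 or 5 stars."""
    counts = Counter(ratings)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts[1] + counts[5]) / total


def to_thumbs(rating, threshold=4):
    """Collapse a 1-5 star rating into thumbs up (True) or down (False).

    The cutoff of 4 is an arbitrary assumption, not an established rule.
    """
    return rating >= threshold


# Made-up sample ratings, purely illustrative.
sample = [5, 5, 1, 5, 3, 5, 1, 5, 4, 5]
share = extreme_share(sample)
thumbs = [to_thumbs(r) for r in sample]
```

If most of the ratings already sit at the extremes, the thumbs version loses
very little information compared to the stars.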

~~~
sp332
They need multiple dimensions for rating. Production values, humor content,
information content, etc.

~~~
roc
It would be interesting to see a thumbs-up/thumbs-down rating combined with an
optional tag classification system.

I don't think the average user will ever really tag anything. But as long as
there's a sufficient quantity of interested users who _would_, I think you
could get a lot closer to statistically guessing _why_ people might
like/dislike certain videos.
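
A minimal sketch of what I mean, assuming votes arrive as (video, liked) pairs
and a minority of users have supplied tags. The names `tag_like_rates`,
`votes` and `tags_by_video` are hypothetical, and the data is invented.

```python
from collections import defaultdict


def tag_like_rates(votes, tags_by_video):
    """Return, for each tag, the fraction of thumbs-up votes on videos with it.

    votes: iterable of (video_id, liked) pairs, where liked is a bool.
    tags_by_video: dict mapping video_id to a set of tags (may be incomplete).
    Votes on untagged videos are ignored.
    """
    ups = defaultdict(int)
    totals = defaultdict(int)
    for video_id, liked in votes:
        for tag in tags_by_video.get(video_id, ()):
            totals[tag] += 1
            if liked:
                ups[tag] += 1
    return {tag: ups[tag] / totals[tag] for tag in totals}


# Invented example data: video "c" was never tagged by anyone.
votes = [("a", True), ("a", True), ("b", False), ("c", True)]
tags_by_video = {"a": {"funny"}, "b": {"funny", "tutorial"}}
rates = tag_like_rates(votes, tags_by_video)
```

A tag whose like rate sits well above or below the overall average is at least
a hint about the _why_, though with few votes it would be mostly noise.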

~~~
thinksketch
I like this idea a lot. I also think that once you have some tags, you could
extrapolate which movies are likely to fall into those tag categories by
seeing which movies people browsed on the site in the same session. You can
get this information if you offer one additional step beyond flipping through
thumbnails - such as watching a preview. Then with each browsing session you
collect data you can use to map genres.
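
Roughly, the idea could be sketched like this, assuming each session is just a
list of the movies someone previewed together. Everything here (the
`propagate_tags` name, the `min_votes` cutoff, the sample sessions) is a
made-up illustration, not how Netflix does it.

```python
from collections import Counter, defaultdict


def propagate_tags(sessions, known_tags, min_votes=2):
    """Suggest tags for untagged movies from movies previewed alongside them.

    sessions: list of lists of movie ids previewed in one browsing session.
    known_tags: dict mapping movie id to a set of tags.
    min_votes: how many co-browsing "votes" a tag needs (arbitrary assumption).
    Returns a dict mapping each untagged movie to its suggested tags.
    """
    votes = defaultdict(Counter)
    for session in sessions:
        movies = set(session)  # ignore repeat previews within a session
        for movie in movies:
            if movie in known_tags:
                continue  # only guess tags for movies that have none
            for other in movies:
                for tag in known_tags.get(other, ()):
                    votes[movie][tag] += 1
    return {
        movie: {tag for tag, n in counter.items() if n >= min_votes}
        for movie, counter in votes.items()
    }


# Invented sessions: "x" is an untagged movie.
sessions = [["alien", "blade_runner", "x"], ["alien", "x"], ["amelie", "x"]]
known_tags = {
    "alien": {"sci-fi"},
    "blade_runner": {"sci-fi"},
    "amelie": {"romance"},
}
suggested = propagate_tags(sessions, known_tags)
```

Here "x" picks up "sci-fi" because it keeps showing up next to sci-fi movies,
while the single session with "amelie" isn't enough to suggest "romance".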

