I competed in this, and found it somewhat fun. It was like an X Prize for computer science. I knew little about this area of research or work, so when I first saw the data, which was simply movie IDs and user IDs with dates and ratings, and NO other information, I couldn't see how any predicting could be done. I spent each morning commute thinking it over, surprised that no movie metadata or user demographics had been released. Just meaningless IDs, plus the date and value of each rating a given user gave.
What was great was that the more I thought about it, the more I started thinking: OK, if that is all you have, what can you do with it? I started realizing you can look for other users to compare a given user to, find similarities between movies based on their ratings, look at how similar users rated a particular movie, etc.
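To make the "compare a user to other users" idea concrete, here is a minimal sketch in Python. It is not the code I wrote back then, just an illustration: the toy ratings and the names RATINGS, pearson and predict are all made up for the example, and it assumes a 1-5 rating scale. It scores how similarly two users rated the movies they have in common, then predicts a rating from what similar (and dissimilar) users gave that movie.

```python
from math import sqrt

# Hypothetical toy data: user ID -> {movie ID: rating on an assumed 1-5 scale}.
RATINGS = {
    "u1": {"m1": 5, "m2": 4, "m3": 1},
    "u2": {"m1": 4, "m2": 5, "m3": 2, "m4": 5},
    "u3": {"m1": 1, "m2": 2, "m3": 5, "m4": 1},
}


def pearson(a, b):
    """Pearson correlation of two users over the movies both have rated."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0  # not enough overlap to say anything
    mean_a = sum(a[m] for m in common) / len(common)
    mean_b = sum(b[m] for m in common) / len(common)
    cov = sum((a[m] - mean_a) * (b[m] - mean_b) for m in common)
    var_a = sum((a[m] - mean_a) ** 2 for m in common)
    var_b = sum((b[m] - mean_b) ** 2 for m in common)
    if var_a == 0 or var_b == 0:
        return 0.0  # one user gave identical ratings; correlation undefined
    return cov / sqrt(var_a * var_b)


def predict(ratings, user, movie):
    """Predict a rating as the user's mean plus the similarity-weighted
    deviations of other users who rated the movie."""
    own = ratings[user]
    user_mean = sum(own.values()) / len(own)
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or movie not in theirs:
            continue
        sim = pearson(own, theirs)
        other_mean = sum(theirs.values()) / len(theirs)
        num += sim * (theirs[movie] - other_mean)
        den += abs(sim)
    if den == 0:
        return user_mean  # no usable neighbors; fall back to the user's mean
    return user_mean + num / den


if __name__ == "__main__":
    print(predict(RATINGS, "u1", "m4"))
```

In this toy data u1 agrees with u2 and disagrees with u3, so u2 loving m4 and u3 hating it both push the prediction for u1 upward. The same trick works the other way around, comparing movies by who rated them, if you swap the roles of users and movies.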
At some point, after writing my own code, I did some research, learned of cohort analysis, etc., and found I'd reinvented the same stuff as others (I'd come up with my own correlation algorithms, etc.). For some reason, I found all of that super fun. In the end, I did beat their algorithm, but not by enough to win.