I made a website that rates the latest movies by analyzing social media (buzzscale.com)
114 points by buzzscale on Feb 8, 2013 | 36 comments

Interesting idea; I'd be interested to see a full correlation between Buzzscale scores and other ratings sites (IMDB, Rotten Tomatoes). Based on a quick analysis of the top movies on the front page (vs. Rotten Tomatoes user ratings) it seems to be pretty well correlated. It seems to do a bad job identifying "eh" movies (e.g., Gangster Squad: BS-80 RT-64 and Hansel & Gretel: BS-76 RT-64). Buzzscale scores also seem to be more uniform in nature: pretty much everything is 70-85.

Any thoughts on "ratings effects"? E.g., a tendency to either gush about a movie or pan it, with nothing in between? A lack of nuanced reviews? (No one goes on Twitter to say, "It was whatever.")

It does seem like "gushers" and "panners" tend to overcrowd reviews, but I was curious enough to do a quick Twitter search. About half of these results are "ok" tweets about movies/TV shows, so maybe there are enough people going to Twitter to say "whatever": https://twitter.com/search?q=watched%20was%20ok

Looks interesting. fflick.com did this some time back and was acquired by Google.

A feature suggestion: "Enter your email to get a weekly notification for movies with a buzzscore above 80" (where 80 is editable). An extension to this could be "email me movies that are popular for their comedy, acting, or story."

Looks good, but it needs to assess sentiment rather than just buzz (and it needs to grade buzz a lot better).

Gangster Squad, for example, has an 80 buzz score, but it got between terrible and mediocre reviews. Django has an 85 buzz score, but got almost universally great reviews. Django will make roughly 4 to 5 times as much money as Gangster Squad, and was blanketed just about everywhere in the media 24/7, so it's very unlikely Gangster Squad has anywhere close to as much actual buzz as Django.

You're right, but your comment made me think of something Andy Warhol once said:

"Don't pay any attention to what they write about you, just measure it in inches"

I'd train a linear regression on the text features to approximate the box office returns. That would be a way to build a rating system - the value of the function is the rating itself.
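A minimal sketch of that idea, not the site's method. It assumes a single hypothetical text feature (the share of words in a movie's tweets that appear in a small positive-word list) and fits a one-variable least-squares line against box office figures. The word list and the helper names `positive_share`, `fit_line`, and `rate` are invented for illustration; a real model would use many features and real returns.

```python
# Sketch only: the lexicon below is a made-up stand-in for real text features.
POSITIVE_WORDS = {"great", "loved", "amazing", "fun"}


def positive_share(tweets):
    """Fraction of all words in the tweets that appear in POSITIVE_WORDS."""
    words = [w.strip(".,!?").lower() for t in tweets for w in t.split()]
    if not words:
        return 0.0
    return sum(w in POSITIVE_WORDS for w in words) / len(words)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("need at least two distinct feature values")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def rate(tweets, slope, intercept):
    """The fitted function's value is used directly as the rating."""
    return slope * positive_share(tweets) + intercept
```

Fitting would pair each released movie's `positive_share` with its known gross, then `rate` scores a new movie from its tweets alone.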

Exactly. I'd love to see this with sentiment analysis (ideally done by humans instead of support vector machines)

The data from people using Senti to rank movie sentiment has been interesting: https://senti.crowdflower.com

I am not sure if this site provides an accurate aggregate view of public opinion on movies, but if it does then I have no desire to know what the public likes. I also don't know how this is better than the other "what does everyone think?" ratings available on IMDB or Rotten Tomatoes.

The real beauty of all the social data now available is that a service like this could theoretically provide ratings based on the sentiment of people whose opinions I follow or who have similar tastes to me. I don't care how much the average person loves Pitch Perfect; it's just not something I want on my list of potential movies. If I were matched through my social interactions to people whose ratings are relevant, that would likely be obvious.

I love seeing new software in this space and I hope you develop it further, but you're going to need to dive much deeper into the social data available if you want to really make a disruption.

I had an idea about this recently, and would love for someone to further develop it. I already implemented a proof of concept, but won't have time to build it to a complete product any time soon.

The (simplified) idea for recommending movies was:

1. Crawl a torrent site (say, The Pirate Bay) for as many torrents as you can find. Extract the title, description, magnet link and category for each torrent.
2. Use the DHT network to find out which IP addresses are downloading which torrents.
3. Use these (torrent, ip) tuples to build a recommendation based on "People who download X also download Y".

This way you can try to find movies that are liked (or actually, downloaded) by people with similar taste to you. The quality of this recommender is quite impressive in my opinion.
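Step 3 above can be sketched as a plain co-occurrence count. This is an illustration under assumptions, not the commenter's proof of concept: the function names are hypothetical, and peers are represented by opaque identifiers rather than real IP addresses.

```python
from collections import Counter, defaultdict


def build_index(pairs):
    """Map each peer to the set of torrents it was seen downloading."""
    by_peer = defaultdict(set)
    for torrent, peer in pairs:
        by_peer[peer].add(torrent)
    return by_peer


def also_downloaded(pairs, torrent, top_n=3):
    """Rank other torrents by how many peers downloaded them along with `torrent`."""
    counts = Counter()
    for torrents in build_index(pairs).values():
        if torrent in torrents:
            counts.update(torrents - {torrent})
    return [title for title, _ in counts.most_common(top_n)]
```

Raw co-occurrence counts favour torrents that are popular with everyone, so a fuller version would probably normalise by each torrent's overall download count.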

Email me at arno at vanlumig dot com if you want to see the proof of concept. I won't post a link here because the performance is quite bad and I'm sure the server won't be able to handle any significant load.

I will make the data (torrent metadata and data about who downloaded what) available as soon as I've anonymised the IP addresses. Mail me if you're interested in that data as well.

Looks great and useful. I wanted to share it, but you are missing Open Graph tags, so a Facebook share just includes your link. Please add og meta tags (description, title, image, etc.) to make sharing easy; this is pretty important.

See here for more details: http://ogp.me

I haven't seen Billy Crystal's Parental Guidance, but I am surprised that the film is ranked #8 for "Best Special Effects".

You should also display the average buzz per week for, let's say, the first 5 weeks. You could also display how many weeks a movie held a certain buzz level (e.g., 5000 mentions). As it currently stands, movies that have been out longer seem bound to have a higher buzz than the brand-new releases.
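A rough sketch of those two figures, assuming buzz is available as a hypothetical mapping from calendar day to mention count (the site's actual data model isn't described in this thread):

```python
from collections import defaultdict
from datetime import date


def weekly_average_mentions(release, daily_mentions, weeks=5):
    """Average daily mentions for each of the first `weeks` weeks after release.

    Weeks with no data are reported as None rather than zero.
    """
    buckets = defaultdict(list)
    for day, count in daily_mentions.items():
        week = (day - release).days // 7
        if 0 <= week < weeks:
            buckets[week].append(count)
    return [
        sum(buckets[w]) / len(buckets[w]) if buckets[w] else None
        for w in range(weeks)
    ]


def weeks_at_or_above(weekly_averages, threshold):
    """Number of weeks whose average met the given buzz level."""
    return sum(1 for avg in weekly_averages if avg is not None and avg >= threshold)
```

Comparing movies week by week from their own release dates avoids rewarding older releases for simply having had more time to accumulate mentions.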

It seems to me that what you are measuring is how well marketed a movie is, not how good it is on any scale. It would be interesting to compare your results against box office results.

Also: the hottest and sexiest movies #6: Ice Age??? :)


This looks like a great way to find movies. I was a bit confused at first about what the different colors (blue, red, grey) meant. After looking at the markup I came to the conclusion that blue meant positive and red meant negative. A bit later I found the explanation hidden behind the "Click to Learn More". There's no visual indication that this is a link, so I found it easy to miss. Perhaps it would be good to move the explanations somewhere not hidden behind a click, or make the place to click more obvious.

I'd also recommend choosing different colors. When scanning the list, I want the hottest movies to jump out, while the less-good movies fade into oblivion. As it stands now, red is the dominating color, and my eyes are naturally drawn to the worst movies. If you want to stick with the blue, I'd recommend the same dark blue for "hot" movies, a lighter blue for mediocre movies, and the gray for the worst movies.

Nice idea, though; seems to work impressively well.

Thanks, this is great feedback.

Care to share the technical implementation?

I feel like I've read the comment I'm about to make somewhere else about something, so this isn't necessarily my original idea: the buzz is from social media currently, yes? And it works as a great indicator after the movie comes out, as people are tweeting/facebooking/etc. about it. But what happens in a little while when the "buzz" dies down? Or is the idea to just give you a quick glimpse of currently releasing movies, and not to be a ranking database of all movies ever made?

Just FYI, you're violating Twitter's API display guidelines. https://dev.twitter.com/docs/embedded-tweets

I'm curious, what Twitter display guidelines is the author breaking?

When you display a tweet embedded in your page, you have to display it exactly as they do, which takes up a truly ridiculous amount of space and is intentionally prohibitive.

And as a sidenote, Twitter would very much prefer if the API would just disappear, as they get non-linear benefits from users going to twitter.com instead of accessing its services through a third party. Internally, it is spoken of as a liability.

> Internally, it is spoken of as a liability.

I doubt that their API is spoken of as a liability internally, because if it were, then Twitter.com itself would be a liability. Why? Twitter.com itself is powered by their API [1].

[1] http://engineering.twitter.com/2010/09/tech-behind-new-twitt...

It is spoken of internally as a liability, and the executives want nothing more than to get rid of it for many reasons, not the least of which being their current investors care only about user numbers, and don't seem to count third-party users.

Trust me on this.

Thanks, I didn't know about this. I'll fix it.

Just thought you should know. If you haven't heard from anyone, it probably isn't a big deal.

Nice! I just had a conversation with a friend about what new game he should play. Perhaps a nice extension of your platform is to get social media buzz ratings for games.

I want to do this if I am successful in marketing this website and people enjoy using it. Once I've perfected the scoring algorithm, classification, website design/UI and data sourcing (to incorporate reviews & forums), I plan on expanding to video games, TV shows and books.

Eventually it would be cool to cover material goods (cellphones, appliances, etc.) so that consumers have other users' opinions to leverage when trying to find something to buy. Imagine being able to compare cars based on what people are writing across the web about each car's durability, performance, and comfort when shopping for one!

But right now I'm working on the website by myself and have a ton of work to do.

This is great. My wife and I are always trying to decide what is worth watching on Amazon, Netflix, etc. Added to my bookmarks.

You might like Rotten Tomatoes' RSS feeds for new and "Certified Fresh" DVD releases:


Nice! I've been wanting to do something similar for a while but have been busy working on other side projects. Glad you got to ship it! Out of curiosity, what APIs are you using?

I use the Rotten Tomatoes API to get the movies and automatically create them as blogs in WordPress using the WordPress XML-RPC interface. I use the Clarabridge API for classification and sentiment, and Facebook's Graph Search API and the Twitter API for the data. And duct tape to hold it all together :).

Hahaha, love it! Just like MacGyver :)

This is great. Already bookmarked.

One little thing though: please add the fullscreen button on these trailers, it would be much more enjoyable to watch them.

Excellent idea. I'm already telling people about it.

Great job. Any plans to open it up as a web API?

Awesome idea, the best place to know the best movies.
