

Presidential Candidate Sentiment Analysis using R - stathack
http://stathack.wordpress.com/2012/10/08/presidential-candidate-sentiment-analysis/

======
mweatherill
The fundamental problem with social media sentiment analysis is that it is
based upon the assumption that tweets are representative of the population. It
ignores the issue that a large percentage of tweets come from spam bots.

As soon as you start measuring sentiment, there is an incentive for an
interested party to try to sway the results. For example, I could use a spam
network to publish opinionated tweets just to obtain a headline in the media
(e.g., "95% of tweets agreed with X"). The media coverage gives legitimacy to
the manipulated result.
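
To put rough numbers on that, here is a minimal sketch in Python. Every figure in it is made up for illustration: I assume 1,000 genuine tweets split evenly, plus a hypothetical spam network that posts 9,000 uniformly positive ones.

```python
def positive_share(positive, total):
    """Fraction of tweets counted as positive."""
    return positive / total

# Hypothetical figures, chosen only to illustrate the arithmetic.
organic_total = 1000      # genuine tweets
organic_positive = 500    # genuine tweets that are positive
bot_tweets = 9000         # spam tweets, all positive

before = positive_share(organic_positive, organic_total)
after = positive_share(organic_positive + bot_tweets,
                       organic_total + bot_tweets)

print(f"{before:.0%} -> {after:.0%}")  # prints "50% -> 95%"
```

Under those assumptions an evenly split audience turns into a "95% agreed" headline, even though no real person changed their mind.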

I wonder how many companies have set goals around this sort of sentiment.
"Let's increase positive sentiment by 10%". An ethically challenged consultant
could easily manipulate results to meet the stated goal.

There is still a place for sentiment analysis, and that is in targeting
individuals for follow-up. Assuming that the user can be matched to a real
consumer, this avoids the issues of statistical manipulation, though users can
still exploit the "complain and get something free" loophole.

~~~
burke
Never mind that even with spam bots filtered out, Twitter users are not nearly
representative of the voting base.

~~~
gklitt
This is an important point to keep in mind. I was surprised that the analysis
seems to suggest a slightly more favorable view of Obama's performance than
Romney's on Twitter, given the general consensus in the media that Romney won.
It's probably because the young, plugged-in Twitter crowd leans to the left.

------
skennedy
When you know the right pieces to the puzzle, amazing things can be created.
Thanks for taking the time to show even more cool tools for future tinkering!

------
rezrovs
Comparing the use of positive and negative words doesn't take sarcasm into
account.
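
A minimal sketch of the problem in Python, assuming the score is simply positive-word count minus negative-word count. The word lists and example tweets below are my own tiny stand-ins, not the lexicon or data used in the article.

```python
import re

# Tiny illustrative word lists (hypothetical; real lexicons are far larger).
POSITIVE = {"great", "good", "love", "win", "strong"}
NEGATIVE = {"bad", "terrible", "hate", "lose", "weak"}

def score(text):
    """Return (number of positive words) - (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos - neg

sincere = "Strong debate performance, great answers"
sarcastic = "Oh great, another strong answer that dodged the question"

print(score(sincere), score(sarcastic))  # prints "2 2"
```

Both tweets get the same score because the counter only sees "great" and "strong", not the tone around them.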

------
hartej
nice

