

Sentiment Analysis in Python - abromberg
http://andybromberg.com/sentiment-analysis-python/

======
languagehacker
Cool post, Andy. NLTK is a lot of fun, but it's not necessarily a production-
ready solution -- for instance, scaling it out to other languages may pose
some problems with respect to UTF-8. NLTK's real purpose is more for pedagogy,
and your blog post is a nice addition to teaching people Python and
computational linguistics at the same time.

You might be interested in checking out pattern (
<http://www.clips.ua.ac.be/pages/pattern> ). It has a heuristic approach to
sentiment analysis built right in that might be worth comparing your features
against.
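
For anyone unsure what a "heuristic approach" means here, a minimal sketch of a lexicon-based scorer follows. This is not pattern's implementation or API: the word list, the weights, and the `polarity` function are all made up for illustration, and negation is only applied to the word immediately after it.

```python
import re

# Toy lexicon invented for this sketch; real lexicons are far larger
# and this one is NOT taken from pattern or any other library.
POLARITY = {"good": 0.7, "great": 0.8, "fun": 0.5,
            "bad": -0.7, "slow": -0.3, "terrible": -1.0}
NEGATIONS = {"not", "never", "no"}


def polarity(text):
    """Average the polarity of known words.

    A negation word flips the sign of the very next token only.
    Returns 0.0 when no word from the lexicon appears.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = []
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
            continue
        if token in POLARITY:
            score = POLARITY[token]
            scores.append(-score if negate else score)
        # Negation only reaches the immediately following token.
        negate = False
    return sum(scores) / len(scores) if scores else 0.0
```

Scores from something like this could be compared against a trained classifier's labels on the same reviews to see where hand-written rules and learned features disagree.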

Finally, as far as classification goes, Python is pretty all right, but can be
a tad slow working through large amounts of data. I've found that text
classification at scale is best left to an external library, with Python doing
feature extraction and managing the data pipeline. In the past I've built out
feature sets with Python and then passed them to TADM (
<http://tadm.sourceforge.net/> ). The advantage of TADM is that, being written
in C++, it's meticulously optimized. Of course, you have fewer modeling
options available to you. That's just one example; there are plenty of these
kinds of services written in Java, too, for instance.
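
A rough sketch of that split, with Python doing only feature extraction and serialization, might look like the following. The `label feature:count` line format is a generic sparse layout chosen for illustration; it is an assumption and not TADM's actual input format, so check the documentation of whichever tool you hand the file to. The function names are hypothetical too.

```python
import io
import re
from collections import Counter


def extract_features(text):
    """Bag-of-words features: lowercase word tokens mapped to counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def write_sparse(examples, out):
    """Write one 'label feat:count ...' line per (label, text) example.

    The layout is a generic sparse format assumed for this sketch,
    not the input format of any particular classifier.
    """
    for label, text in examples:
        feats = extract_features(text)
        pairs = " ".join(f"{name}:{count}"
                         for name, count in sorted(feats.items()))
        out.write(f"{label} {pairs}\n")


# Usage: write to an in-memory buffer; a real pipeline would open a file.
buffer = io.StringIO()
write_sparse([("pos", "great great movie"),
              ("neg", "slow boring movie")], buffer)
```

The external trainer then reads that file, and Python never has to hold the model or run the optimization loop itself.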

Thanks for a good read!

~~~
abromberg
Thanks for the comment! I'll definitely check out Pattern and TADM and see
what I can do with them.

------
ismaelc
Great to see our good friend Jacob Perkins, aka Streamhacker, referenced several
times in the article. The man's a genius and an inspiration. He turned Text
Processing into a successful, self-sustaining API "project":
<http://streamhacker.com/2013/02/27/monetizing-textprocessing-api-mashape/>

------
bbayer
That was a great read. As I understand it, OP's main concern here is accuracy.
What about performance? NLTK is a good starting point, but it is generally
considered slow. I'd really like to hear about runtime performance for the
same functionality in Python and R.

~~~
jkldotio
I think NLTK is pure Python because it's supposed to work as a teaching tool
as well. Perhaps PyPy can come into play for that, although in real-world
situations I've only seen marginal gains from PyPy so far.
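
One way to check this on your own workload is to time the same pure-Python code under each interpreter. The sketch below is a stand-in: the token-counting workload and the function names are invented for illustration and are not NLTK code, so the numbers say nothing about NLTK itself.

```python
import sys
import timeit
from collections import Counter


def count_tokens(docs):
    """Pure-Python stand-in workload: whitespace-tokenize and count words."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def time_workload(repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    docs = ["NLTK is a lot of fun but can be slow on large corpora"] * 2000
    return min(timeit.repeat(lambda: count_tokens(docs),
                             number=1, repeat=repeats))


if __name__ == "__main__":
    # Run the same script with `python` and with `pypy` and compare.
    print(sys.implementation.name, sys.version.split()[0],
          f"{time_workload():.4f}s")
```

Swapping `count_tokens` for your actual feature extraction or classification call gives a like-for-like comparison.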

------
aidos
I really enjoyed this. It's a good practical example that's helped me to make
sense of a lot of the words that fly around in this space.

------
jnazario
great writeup, and great analysis. i looked at this about a year ago, wanted
to do some playing in NLTK and figured Twitter data was a place to learn to do
sentiment analysis. my bright idea - movie reviews and tasting - has been done
to death :)

really great writeup, thanks for this.

------
alokv28
Really good stuff. Any other good links on sentiment analysis using Python?
For example, to analyze tweets?

