
Sentiment analysis on Twitter using word2vec and keras - ahmedbesbes
http://ahmedbesbes.com/sentiment-analysis-on-twitter-using-word2vec-and-keras.html#.WWu1sV3rWjo.hackernews
======
williamsmj
This post is a great word2vec/keras intro, but it doesn't do one thing you
should _always_ do before you break out the neural networks: try to solve the
problem with traditional machine learning to establish a performance baseline
and confirm the data is predictive.

78.9% accuracy on a sentiment classification of tweets with no neutral class
is actually slightly _worse_ than you get if you do this in scikit-learn with
plain old bag of words and Logistic Regression:
[https://github.com/williamsmj/sentiment/blob/master/sentimen...](https://github.com/williamsmj/sentiment/blob/master/sentiment.ipynb).
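
A minimal sketch of the baseline described above: bag-of-words features plus logistic regression in scikit-learn. The four toy tweets and labels are placeholders for a real sentiment corpus.

```python
# Baseline sketch: bag-of-words + logistic regression (toy data).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great movie, loved it", "terrible service, never again",
         "what a fantastic day", "this is awful"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# CountVectorizer builds a sparse term-count matrix; LogisticRegression
# fits a plain linear classifier on top of it.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.score(texts, labels))  # training accuracy on the toy data
```

On a real corpus you would of course score on a held-out test split, not the training data.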

~~~
ahmedbesbes
Yes, I'm totally aware of that. I'm currently working on a slightly modified
version of the code that will include CNN and hopefully better accuracy.

~~~
throwaway_bob
I think you are missing the parent's point. A CNN is a more complicated model,
not a simpler one; it is better to try simple linear classifiers on bags of
words or bags of bigrams or trigrams before breaking out the more complicated
neural models. Note that you can do this easily with VW or FastText.

~~~
ahmedbesbes
The aim of my post is not to make a prediction based on a standard algorithm.
My goal is to show that DL techniques make better predictions. What I wrote
was a first draft of the algorithm, and I've not made my point yet,
unfortunately. I'm working on a better solution now.

Could you suggest improvements to the DL algorithm I wrote?

Thank you.

~~~
williamsmj
Again, you're missing the point. If your goal is to show that DL techniques
make better predictions, you need to _begin_ by answering the question "better
than what?"

------
v1n337
Nice post. Thanks for sharing.

I have to point out, though, that it's a bit dangerous to measure classifier
accuracy as the percentage of correctly classified samples when you have no
idea how the test data is skewed in favor of one class vs. the other (this
applies to binary classification, and generalizes to multi-class problems too).

It's always much better to report performance as the F1 score[1] or to just
examine a confusion matrix of the predictions[2].

[1]
[https://en.wikipedia.org/wiki/F1_score](https://en.wikipedia.org/wiki/F1_score)
[2]
[https://en.wikipedia.org/wiki/Confusion_matrix](https://en.wikipedia.org/wiki/Confusion_matrix)
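
Both metrics are one call in scikit-learn; on the toy labels below, precision and recall are each 3/4, so F1 is 0.75.

```python
# F1 score and confusion matrix on a toy set of true/predicted labels.
from sklearn.metrics import confusion_matrix, f1_score

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]

print(f1_score(y_true, y_pred))        # 0.75

# Rows are true classes, columns predicted: [[TN, FP], [FN, TP]].
print(confusion_matrix(y_true, y_pred))
```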

------
gattilorenz
Also worth mentioning the "competitor" of word2vec: Stanford GloVe
[https://nlp.stanford.edu/projects/glove/](https://nlp.stanford.edu/projects/glove/)

Haven't had the opportunity to measure the difference in quality, and I've
mostly used word2vec until now (with vectors I've trained myself after
lemmatization and PoS-tagging of a corpus), but the fact that GloVe provides
pretrained models from Twitter, Wikipedia, and so on is pretty nice.
~~~
b_ttercup
GloVe is great. Simpler and faster, with a very small trade-off in quality.
word2vec has an advantage in that you can produce document vectors with only a
small change in the network architecture. Tf-idf weighted word vector
averages will probably be the best you can do using GloVe.
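
A sketch of that tf-idf weighted averaging, with made-up 3-d vectors and idf weights standing in for real pretrained GloVe vectors and idf values fit on a corpus:

```python
import numpy as np

# Toy stand-ins for pretrained GloVe vectors and corpus idf weights.
vectors = {"good": np.array([1.0, 0.0, 0.0]),
           "movie": np.array([0.0, 1.0, 0.0]),
           "plot": np.array([0.0, 0.0, 1.0])}
idf = {"good": 2.0, "movie": 1.0, "plot": 1.5}

def doc_vector(tokens):
    # Weighted average: each word vector scaled by its idf, normalized by
    # the total weight so document length doesn't matter.
    words = [w for w in tokens if w in vectors]
    weights = np.array([idf[w] for w in words])
    stacked = np.array([vectors[w] for w in words])
    return (weights[:, None] * stacked).sum(axis=0) / weights.sum()

print(doc_vector(["good", "movie"]))  # [2/3, 1/3, 0]
```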

------
ppod
Word2Vec is not deep learning. I think the neural network it uses has a single
hidden layer. It's much simpler than deep-learning approaches and all the
neural network is doing is dimensionality reduction in a way that can be done
with SVD for similar results.
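
The SVD route sketched with numpy on a toy word-word co-occurrence matrix (the counts and vocabulary are invented for illustration): truncating to the top-k singular directions gives low-dimensional word vectors, much like the embeddings word2vec learns.

```python
import numpy as np

# Toy symmetric co-occurrence counts for a 4-word vocabulary.
vocab = ["king", "queen", "man", "woman"]
cooc = np.array([[0, 3, 2, 0],
                 [3, 0, 0, 2],
                 [2, 0, 0, 3],
                 [0, 2, 3, 0]], dtype=float)

U, S, Vt = np.linalg.svd(cooc)
k = 2
embeddings = U[:, :k] * S[:k]  # each row is a k-dimensional word vector
print(embeddings.shape)  # (4, 2)
```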

~~~
v1n337
You're right. It's a shallow network with a single hidden layer, whose
weights are extracted as the vector-space representation of each word,
learned from its context.

