
Show HN: Analyzes your social media comments to get psychometrics on personality - testitouter
http://emotize.co/?testt
======
Nadya
"Okay?" is all I can think. I would try to provide more information, such as
how people can _use_ these results.

I found it was accurate for me - but these sorts of things tend to be broad
enough to seem vaguely accurate for anyone.

 _> natural language processing and sentiment analysis of your last 25 reddit
comments!_

Perhaps expand on this or give a page on how you're doing the
processing/analysis?

FWIW: [http://i.imgur.com/xOY3owP.png](http://i.imgur.com/xOY3owP.png)

~~~
testitouter
Sure thing! It creates a machine learning classifier from a corpus of data
using a unique algorithm which allows for a higher accuracy rate on natural
language dialogue. The machine learning and text classification algorithm I
employ is my own derivative of Naive Bayes, with added tokenization, named
entity recognition, and Laplace smoothing. I have plotted the true/false
positive rates of each of these machine learning classifiers on a ROC curve,
along with the AUC for each.

Here are the categories which Emotize’s text classifying and machine learning
algorithms classify the user's chat input into:

Mood and Sentiment Polarity Classifier: True / False

Personality Analysis Conducted with the Five-Factor Model (FFM):

Openness to experience: inventive/curious vs. consistent/cautious

Conscientiousness: efficient/organized vs. easy-going/careless

Extraversion: outgoing/energetic vs. solitary/reserved

Agreeableness: friendly/compassionate vs. analytical/detached

Neuroticism: sensitive/nervous vs. secure/confident

Mood and Sentiment Polarity Classifier

Since Emotize’s sentiment analysis is polar (either Negative or Positive),
each prediction is either correct or incorrect. For 77.9% of the corpus
phrases, the text classifying algorithm categorized the corpus data correctly.
The corpus that this text classification algorithm was trained on for use on
Emotize was introduced by Pang/Lee and contains 3,800 phrases. About 1/3 of
them (1,280 corpus sentences) were excluded from the classifier's training
set, and the ROC curve was calculated with those held-out sentences. Because
their true labels were known, the accuracy rate could be calculated by
comparing the predicted outcomes against the true ones. This data illustrates
that Emotize’s sentiment analysis machine learning algorithm has an accuracy
rate of 77.9%, which is 1.1 percentage points below the average human accuracy
rate at detecting sentiment in positive/negative texts, which is 79%.
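
For readers unfamiliar with the approach, here is a minimal sketch of a plain
multinomial Naive Bayes classifier with Laplace (add-one) smoothing, scored on
held-out sentences. It is not Emotize's actual code: the class name, the regex
tokenizer, and the toy sentences are all assumptions made for illustration, and
it leaves out named entity recognition and anything specific to my derivative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Hypothetical multinomial Naive Bayes; alpha=1.0 is Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        self.totals = {c: sum(self.word_counts[c].values()) for c in self.classes}
        return self

    def log_scores(self, text):
        n_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.doc_counts[c] / n_docs)
            denom = self.totals[c] + self.alpha * vocab_size
            for tok in tokenize(text):
                if tok in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[c][tok] + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


def accuracy(model, texts, labels):
    """Fraction of held-out examples whose predicted label matches the true one."""
    hits = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return hits / len(labels)


# Toy data standing in for the real corpus (made up for this sketch).
train = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("dull plot and terrible pacing", "neg"),
    ("a boring terrible mess", "neg"),
]
held_out = [
    ("a great and moving story", "pos"),
    ("terrible and dull", "neg"),
]

model = NaiveBayes().fit(*zip(*train))
held_out_accuracy = accuracy(model, *zip(*held_out))
```

The smoothing term keeps a word that appears in only one class from driving the
other class's probability to zero, which matters a lot with short comments.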

The raw, unparsed, and unorganized survey data used partly as the corpus data
during the training process for this algorithm is available here in the form
of a ZIP file.

Personality Analysis Conducted with the Five-Factor Model (FFM) Classifier

The corpus for the 5-factor psychometric machine learning algorithms was
collected through a survey of 1,741 participants on the Emotize website.
Emotize was able to build the 5-factor psychometric machine learning algorithm
from this data. As in the Mood and Sentiment Polarity Analysis, 1/3 of the
classifier’s original training corpus was used for the ROC curve.
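
To make the ROC/AUC step concrete, here is a rough sketch of how a ROC curve
and its AUC can be computed from held-out scores and true labels. The function
names and the example scores are assumptions for illustration, not values from
Emotize.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points.

    labels are 1 (positive) or 0 (negative); a higher score means the
    classifier considers the example more likely positive. The threshold is
    swept from the highest score down to the lowest.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC points, using the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up classifier scores for six held-out examples.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
area = auc(roc_points(scores, labels))
```

An AUC of 0.5 means the scores rank positives no better than chance, and 1.0
means every positive is scored above every negative.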

