

Show HN: What rapper are you? (my first NLP project in Python/Bottle) - sjaakkkkk
http://www.whatrapperareyou.com

======
sjaakkkkk
I just finished this website a few days ago and am looking for some feedback.

The program has analysed 300 east coast and 300 west coast lyrics and used the
words in them to train a Naive Bayes classifier with the Natural Language
Toolkit (NLTK) for Python. It then lets you enter a Twitter name, loads that
account's latest 174 tweets, and runs that bag of words through the classifier
to tell you whether you tweet like west coast or east coast rappers rapped in
the nineties.
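
For anyone curious how this kind of classifier works, here is a minimal sketch
in plain Python. It is only an illustration, not the site's code: the site uses
NLTK, while this is a hand-rolled multinomial Naive Bayes with add-one
smoothing, and the function names and toy "lyrics" are made up.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into a bag of word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(labeled_texts):
    """Count word occurrences and documents per label from (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in labeled_texts:
        word_counts.setdefault(label, Counter()).update(tokenize(text))
        doc_counts[label] += 1
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Return the label with the highest Naive Bayes log score."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)  # log prior
        for word in tokenize(text):
            if word in vocab:  # skip words never seen in training
                # Add-one smoothing keeps unseen (word, label) pairs above zero.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy, made-up training lines standing in for the 600 real lyrics.
TOY_LYRICS = [
    ("compton sunshine lowrider cruising", "west"),
    ("palm trees and sunshine in compton", "west"),
    ("brooklyn subway cold winter nights", "east"),
    ("queens bridge and brooklyn streets", "east"),
]

word_counts, doc_counts = train(TOY_LYRICS)
print(classify("cruising through compton in the sunshine", word_counts, doc_counts))
```

With the toy data above, the example tweet is classified as "west".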

I finally got NLTK to work online after trying different hosting companies.
Decided to learn Bottle to build the web version of the program and it was a
breeze, which was awesome as I am really a beginner in programming.

Loved getting my first real project done and working without quitting
before it was finished. Extended my knowledge of jQuery and CSS a bit and
overall had a lot of fun.

I wanted to ask you guys what you think of it and if you have any suggestions
both content and design wise.

I found that after getting it to work I was kinda lost in the design. It's
funny: when the code works it works, but designing seems to be a never-ending
job of improving and not being satisfied.

------
lambtron
thanks for sharing, i think this is a fun, light-hearted application! i would
be interested in seeing a percentage of how much of an east coast rapper i am
(60% east coast / 40% west coast) just to see the breakdown. also, it would be
cool to show some of my tweets that feature some of those keywords, to remind
me (as i couldn't recall any tweets that had those words).

aside from that, great work!

~~~
sjaakkkkk
Thanks for the comment, means a lot! Good suggestions as well. The percentage
one might be too hard as I'm using some basic NLTK libraries, but the one
showing tweets I'll definitely look into!
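
On the percentage idea: NLTK's NaiveBayesClassifier has a prob_classify
method that returns a probability distribution over the labels, so it may be
less work than it sounds. The underlying step is just normalising the
per-label scores. A rough sketch, assuming you already have one log score per
label (the numbers below are made up, and to_percentages is a hypothetical
helper, not an NLTK function):

```python
import math

def to_percentages(log_scores):
    """Normalise per-label log scores into percentages that sum to 100."""
    highest = max(log_scores.values())
    # Subtract the maximum before exponentiating to avoid underflow
    # when the log scores are large negative numbers.
    weights = {label: math.exp(score - highest)
               for label, score in log_scores.items()}
    total = sum(weights.values())
    return {label: 100.0 * weight / total for label, weight in weights.items()}

# Made-up log scores, as a Naive Bayes classifier might produce for one account.
example = {"east": -1204.1, "west": -1204.5}
for label, pct in to_percentages(example).items():
    print(f"{label}: {pct:.0f}%")
```

One caveat: with around 1000 word features, Naive Bayes tends to be very
confident, so real accounts may come out close to 100/0 more often than 60/40.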

Furthermore, the words listed are not necessarily words you used; they are
just the 50 most differentiating words between east and west coast.
'compton' is used way more in west coast lyrics than east, for example. It's
just there to show some fun differences unrelated to your personal twitter
stream. Lastly, the classifier looks at not 50 but 1000 words in total when
classifying your tweets. Hope this explains it a bit.
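
To illustrate what "most differentiating" means here, one simple way to rank
words is by how lopsided their smoothed frequency is between the two coasts.
NLTK offers show_most_informative_features on its Naive Bayes classifier for
this; the sketch below is a plain-Python stand-in, with a hypothetical function
name and made-up counts, not the site's code.

```python
from collections import Counter

def most_differentiating(east_counts, west_counts, top_n=3):
    """Rank words by the ratio of their smoothed frequencies on each coast."""
    vocab = set(east_counts) | set(west_counts)
    # Add-one smoothing so a word missing on one coast does not divide by zero.
    east_total = sum(east_counts.values()) + len(vocab)
    west_total = sum(west_counts.values()) + len(vocab)
    ranked = []
    for word in vocab:
        east_rate = (east_counts[word] + 1) / east_total
        west_rate = (west_counts[word] + 1) / west_total
        ratio = max(east_rate / west_rate, west_rate / east_rate)
        side = "east" if east_rate > west_rate else "west"
        ranked.append((ratio, word, side))
    # Most lopsided first; ties broken alphabetically for stable output.
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return [(word, side, round(ratio, 1)) for ratio, word, side in ranked[:top_n]]

# Made-up word counts for illustration only.
east = Counter({"brooklyn": 40, "compton": 1, "money": 30})
west = Counter({"compton": 45, "brooklyn": 2, "money": 28})
print(most_differentiating(east, west))
```

With these counts, 'compton' ranks first as a west coast word and 'money' ranks
last because both coasts use it about equally.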

------
bdr
It's telling me "That ain't no real twitter user bro!" which is false (and
kind of lame).

~~~
sjaakkkkk
I'm sorry to hear that. When Tweepy gives an error I let it output that
sentence, as almost every time it was a "Twitter user does not exist" error. I
checked your twitter and it's working for me, so it might be because of an
exceeded daily twitter call limit or something on your network. I noticed that
when I was developing it -- after a day of refreshing and iterating it would
give that error too.

Thanks for pointing out the flaw, I'll try to implement some more descriptive
error messages for different errors.
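
A rough sketch of what that could look like, mapping each kind of failure to its
own message. The exception classes here are hypothetical placeholders: Tweepy's
real exception names differ between versions, so they would need to be swapped
in for whatever the installed version raises. fetch_tweets is also a made-up
stand-in for the real Twitter call.

```python
class UserNotFound(Exception):
    """Hypothetical: the requested account does not exist."""

class RateLimited(Exception):
    """Hypothetical: the API call quota has been used up."""

class ProtectedAccount(Exception):
    """Hypothetical: the account's tweets are private."""

# One message per known failure, plus a generic fallback.
MESSAGES = {
    UserNotFound: "That ain't no real twitter user bro!",
    RateLimited: "Twitter is rate limiting us right now, try again in a bit.",
    ProtectedAccount: "That account is protected, so its tweets can't be read.",
}
FALLBACK = "Something went wrong talking to Twitter, please try again."

def describe_error(exc):
    """Return the user-facing message for an exception."""
    for exc_type, message in MESSAGES.items():
        if isinstance(exc, exc_type):
            return message
    return FALLBACK

def fetch_tweets(username):
    """Stand-in for the real Twitter call; always fails for this demo."""
    raise RateLimited(username)

try:
    fetch_tweets("bdr")
except Exception as exc:
    print(describe_error(exc))
```

This way only a genuine "user not found" error shows the joke message, and
everything unexpected falls through to the generic one.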

