

How to build a model to distinguish tweets about Apple and apples - polemic
http://stackoverflow.com/questions/17352469/how-can-i-build-a-model-to-distinguish-tweets-about-apple-inc-from-tweets-abo

======
nlh
Meta comment since the SO thread covers the discussion on the actual question:

Does the tone of the asker bother anyone else? The "I don't want you to teach
me, I want you to do it for me" attitude feels to me like the antithesis of the
spirit of this corner of the tech community.

His first and only SO question, he's putting up a bounty, and it just feels
like a finance/hedge fund guy saying "Damn, this is tough. Let's see if I can
bribe the nerds to do it for me instead."

I know there are places for that - elance, oDesk, etc. Just feels like SO is
the wrong forum for that kind of attitude, as interesting as the question
might be.

~~~
thomasjoulin
Sure, and he's not the only one like that on SO. However, if the question is
interesting, I see no reason why it shouldn't be answered or upvoted. SO is
not about one guy asking the community to do his job, but more like archiving
a bunch of programming questions and their solutions. This thread will benefit
more people than this guy with a bad attitude.

~~~
nlh
Very good attitude and perspective. Thank you!

------
pallandt
Side note: possibly for a project/homework involving online reputation
management. While in college, we were also tasked with doing something very
similar for an international competition. Given this and the overall tone of
the request for 'help', I'd caution everyone familiar with NLP to be wary of
actually giving a fully coded solution.

~~~
blauwbilgorgel
Good catch. I'll just post my results. I classified it with a simple algo[1]
using the normalized frequency distributions of unigrams and bigrams in the
50-sample training set.

    
    
      Company ( 0.0567236272499 )
      iPhone = Eye Phone = Illuminati Phone. Siri spelled backwards is Iris, thats a part of the Eye. Apple is Illuminati. They're watching you. 
    
      Company ( 0.208253968254 )
      Apple caught testing offline Dictation for iOS 7 http://idb.tc/1d2GKph 
    
      Company ( 0.0578323858427 )
      RT @jonnyevans_cw: Why #Apple really, really doesn't need a shopkeeper to lead its retail chain http://shar.es/AhsUK  via @computerworld 
    
      Company ( 0.0242924384131 )
      The Best Music Streaming Apps For Your iPhone #Apple #iPhone http://bit.ly/14xwinU  
    
      Company ( 0.0249022556391 )
      Apple May Be Working on Self-Adjusting Noise-Cancelling Headphones http://on.mash.to/12hFDMu  via @mashable 
    
      Company ( 0.488585099111 )
      Samsung Continues Ad Campaign against Apple's iPhone in Iceland 
    
      Company ( 0.179605263158 )
      the creator of the iPhone 
    
      Company ( 0.0764348527178 )
      I Used To Hate Apple, And Now I'm A Giant Sell-Out http://bit.ly/IGL0h5  #archivesWeek in Review | YSL Chief Executive to Apple, Bec Astley Clarke, Fashion Sweatshirts, Esteban Cortazar http://bit.ly/128MPf5  via @BoFI've been single since Apple was just a fruit. 
    
      Fruit ( 0.727605245395 )
      I be up so high trying to get a piece of that apple pie 
    
      Fruit ( 0.438529121875 )
      I want to eat healthy I really do. But I just found a whole apple pie in my fridge. 
    
      Fruit ( 0.662778904665 )
      Apple banana and a cup of milo 
    
      Fruit ( 0.0214817448669 )
      Apple I look like a human heart. Mango I look like a stomach. Grapes I look like eyes. Banana I don't like this game. 
    
      Fruit ( 0.0815735543081 )
      An apple potato and onion all taste the same if you eat them with your nose plugged. 
    
      Fruit ( 0.229706852 )
      PSYCHOLOGY Test Choose 1 among the fruits below: APPLE MANGO GRAPES PEAR BANANA 
    
      Fruit ( 0.00714560473592 )
      Today's shake is spinach, avocado, apple orange banana blueberry with Chia seed and protein feelin the energy 
    
      Fruit ( 0.238171611868 )
      Banana Bread topped with Apple Maple Syrup and Yoghurt 
    
      Fruit ( 0.184395490353 )
      Mid-PM Snack Apple blueberries frozen mixed berries plain @Alpro_UK soya yoghurt, coconut flakes & agave nectar pic.twitter.com/4188sHosA5 
    
      Fruit ( 0.0788802216228 )
      Green spinach kale apple banana garlic ginger Orange lemon ginger cayenne #juicing http://instagram.com/p/bbT6L3yZyU/  
    

[1] Using a library for this, such as NLTK's naive Bayes classifier
(http://nltk.org/_modules/nltk/classify/naivebayes.html) or scikit-learn's
SVMs (http://scikit-learn.org/stable/modules/svm.html), would probably be
more accurate and faster to implement. Also, I didn't cross-validate on the
training set; I evaluated on new samples instead.
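
For anyone curious what "normalized frequency distributions of unigrams and
bigrams" can look like in practice, here is a rough sketch using only the
Python standard library. It is not the code that produced the scores above.
The tokenizer, the four toy training tweets, the function names, and the
choice to score a tweet by summing the normalized frequencies of its n-grams
are all assumptions made for illustration.

```python
import re
from collections import Counter


def ngrams(text):
    """Return the lowercased unigrams and bigrams of a tweet."""
    # Assumed tokenizer: runs of letters, digits, '#', '@' and apostrophes.
    tokens = re.findall(r"[a-z0-9#@']+", text.lower())
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def train(samples):
    """Build one normalized n-gram frequency table per label.

    samples is an iterable of (text, label) pairs.
    """
    counts = {}
    for text, label in samples:
        counts.setdefault(label, Counter()).update(ngrams(text))
    model = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        # Normalize so each label's frequencies sum to 1.
        model[label] = {gram: n / total for gram, n in counter.items()}
    return model


def classify(model, text):
    """Return (label, score) for the label with the highest summed frequency."""
    grams = ngrams(text)
    scores = {
        label: sum(freqs.get(gram, 0.0) for gram in grams)
        for label, freqs in model.items()
    }
    # Ties (including the all-zero case) go to the first label in the model.
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy training set, far smaller than the 50 samples in the question.
TRAINING = [
    ("Apple unveils new iPhone at launch event", "Company"),
    ("Apple stock rises after iOS update", "Company"),
    ("Baked an apple pie with cinnamon", "Fruit"),
    ("Apple banana smoothie for breakfast", "Fruit"),
]

model = train(TRAINING)
print(classify(model, "I want a slice of apple pie"))
print(classify(model, "New iPhone released by Apple today"))
```

Because "apple" shows up in both classes, the deciding evidence comes from
the surrounding words and bigrams such as "apple pie" or "new iphone". A
scheme like this has no smoothing and no notion of confidence, which is one
reason a proper naive Bayes or SVM implementation is likely the better choice.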

