
The accuracy is over 95%

Citation needed...

How can you draw this conclusion at this point in the process? I'm genuinely curious about the filtering scheme that lets you extract information from such a noisy data stream.




This isn't scientific research, so I didn't compute standard deviations, t-statistics, etc. But I did pull a few hundred tweets from our database and count how many we had classified wrongly. That's where the number comes from. The filtering scheme is very simple: classify only if we're confident. There are many tweets containing "voted", but we kept only the ones we had strong confidence in and threw away the rest. For the complete set of keywords used for filtering, please feel free to email.
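
To make "classify only if we're confident" concrete, here is a rough sketch of what such a filter could look like. It is not the actual code or keyword list (that is only available by email): the patterns, the ambiguity markers, and the names `classify` and `wilson_interval` are all made up for illustration. The second function shows one standard way to put an approximate 95% interval around an accuracy measured on a hand-checked sample, which was not done for the number above.

```python
import math
import re

# Hypothetical patterns: the real keyword list is not given in this thread.
CONFIDENT_PATTERNS = [
    re.compile(r"\bi (just )?voted for (?P<choice>\w+)", re.IGNORECASE),
]

# Hypothetical markers of tweets that mention voting without reporting a vote.
AMBIGUOUS_MARKERS = ("?", "if i voted", "should have voted", "rt @")


def classify(tweet):
    """Return the stated choice in lowercase, or None to abstain."""
    lowered = tweet.lower()
    # Throw the tweet away as soon as anything makes it ambiguous.
    if any(marker in lowered for marker in AMBIGUOUS_MARKERS):
        return None
    for pattern in CONFIDENT_PATTERNS:
        match = pattern.search(tweet)
        if match:
            return match.group("choice").lower()
    # Contains no confident pattern: abstain rather than guess.
    return None


def wilson_interval(correct, total, z=1.96):
    """Approximate 95% Wilson score interval for a sampled accuracy."""
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half
```

Abstaining on everything doubtful trades coverage for precision: most tweets containing "voted" are dropped, but the ones that survive are much more likely to be labeled correctly. Feeding the hand-counted correct and total figures into `wilson_interval` would give a range for the accuracy rather than a single number.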



