Is there a good English wordlist with common words, for free download?
2 points by pramodbiligiri 1776 days ago | 3 comments
We are doing entity extraction for documents specific to a domain. Unfortunately, our domain-specific index contains many common English words, and we would like to take them out or weight them much lower.

I'm trying to choose between WordNet, Google Ngrams (too big!), and Moby Wordlist from Sheffield University. Any suggestions?
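
To make the "take them out or weight them much lower" idea concrete, here is a rough sketch. It assumes the index can be treated as a mapping from term to score, which may not match the real index. The function name `downweight_common` and the `penalty` parameter are made up for illustration.

```python
def downweight_common(term_scores, common_words, penalty=0.1):
    """Return a copy of term_scores with common words scaled by `penalty`.

    term_scores: dict mapping term -> numeric score (assumed index shape).
    common_words: iterable of words from whichever wordlist gets chosen.
    penalty: multiplier for common words; 0.0 effectively removes them.
    """
    # Compare case-insensitively so "Protein" matches "protein".
    common = {w.lower() for w in common_words}
    return {
        term: score * penalty if term.lower() in common else score
        for term, score in term_scores.items()
    }


# Hypothetical scores, purely for illustration.
scores = {"the": 10.0, "kinase": 4.0, "Protein": 3.0}
adjusted = downweight_common(scores, ["the", "protein"], penalty=0.5)
```

With `penalty=0.0` the common words drop to zero, which is close to removing them outright. A value between 0 and 1 keeps them in the index but ranks them lower.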

Look at the file named "eign" in the GNU troff distribution. I use it as a "stop word" list and it seems to work pretty well.
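
Roughly how a list like that can be used as a stop-word filter, as a sketch only. It assumes the list is plain text with one word per line, which is worth checking against your copy. The helper names and the `"eign"` path in the usage comment are hypothetical.

```python
def parse_stop_words(lines):
    """Build a lowercase stop-word set from an iterable of text lines.

    Assumes one word per line; blank lines are skipped.
    """
    return {line.strip().lower() for line in lines if line.strip()}


def remove_stop_words(tokens, stop_words):
    """Keep only the tokens that are not in the stop-word set."""
    return [t for t in tokens if t.lower() not in stop_words]


# Usage, assuming a local copy of the list saved as "eign" (hypothetical path):
# with open("eign", encoding="utf-8") as f:
#     stop_words = parse_stop_words(f)
# filtered = remove_stop_words(["The", "kinase", "and", "binds"], stop_words)
```

Lowercasing both sides means capitalized words at the start of a sentence are filtered too.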

Oh, I had never heard of this. It looks good for extremely common short words, but the version I have has only 133 words. I'm looking for something in the range of a few thousand words at least.
