
Scouring the Web to Make New Words ‘Lookupable’ - hvo
http://www.nytimes.com/2015/10/04/technology/scouring-the-web-to-make-new-words-lookupable.html?ref=business
======
afarrell
I feel like this article commits a serious oversight by failing to reference
[http://www.urbandictionary.com/](http://www.urbandictionary.com/) and the
problem it raises for this enterprise: How does one determine whether a word is
actually used in communication or whether its existence is merely a result of
trolling?

To take one example, I seriously doubt that the entry "wolfbagging" refers to
a genuine sexual activity that people engage in, but that does not mean it is
not used in some metaphorical or referential sense. After all, "rainbow
parties" were never a phenomenon in the 90s outside the minds of the media and
jumpy parents, but that is probably a phrase that has entered common knowledge
in the U.S.

One cannot use ridiculousness or physical possibility as a test. One has to
look at a sufficiently massive corpus of text to observe the word being used
in the wild. The reddit corpus might be sufficient, but even then, that will
only let you decide that the word exists within a certain subculture.
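
As a rough illustration of that corpus test, here is a minimal Python sketch. It assumes the corpus is already available as (source, text) pairs; the function names and the thresholds `min_uses` and `min_sources` are hypothetical choices for the example, not anything the article or a real dictionary project describes. Counting distinct sources as well as raw hits is one simple way to discount a single troll repeating a coinage.

```python
import re


def word_attestation(documents, word):
    """Return (total_uses, distinct_sources) for `word`.

    `documents` is an iterable of (source, text) pairs, e.g. a username
    and one comment. Matching is case-insensitive and whole-word only.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    total = 0
    sources = set()
    for source, text in documents:
        hits = len(pattern.findall(text))
        if hits:
            total += hits
            sources.add(source)
    return total, len(sources)


def looks_attested(documents, word, min_uses=3, min_sources=2):
    """Hypothetical rule of thumb: enough uses, by enough different people."""
    total, n_sources = word_attestation(documents, word)
    return total >= min_uses and n_sources >= min_sources


# Toy corpus, invented purely for illustration.
sample = [
    ("alice", "That word is perfectly cromulent."),
    ("bob", "A cromulent answer, cromulent indeed."),
    ("carol", "Nothing to see here."),
]
print(word_attestation(sample, "cromulent"))
print(looks_attested(sample, "wolfbagging"))
```

Even a check like this only says the word is attested inside whatever community the corpus came from.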

~~~
wingerlang
This is my only issue with Urban Dictionary: it feels like a lot of it is just
random words that people shoehorn sexual activities into.

------
mojoe
I really like this concept -- there have been many studies on how language
shapes thinking (for example,
[http://www.sciencedirect.com/science/article/pii/S0010028501...](http://www.sciencedirect.com/science/article/pii/S0010028501907480),
[http://www.nature.com/scientificamerican/journal/v304/n2/ful...](http://www.nature.com/scientificamerican/journal/v304/n2/full/scientificamerican0211-62.html),
[http://www.pnas.org/content/104/19/7780.short](http://www.pnas.org/content/104/19/7780.short)).
I would be willing to bet that having a greater number of descriptive English
words will be beneficial in the long run.

------
_0ffh
Well, I think it's a good thing, so long as these new words are cromulent.

