
Word Embeddings: Past, Present, and Future - tim_sw
http://w4nderlu.st/teaching/word-embeddings
======
visarga
Great slideshow. Word embeddings are already indispensable in NLP, but I'm
wondering whether it is right to assign just one unique vector per word.
Instead, each word should have a vector that depends on context, for
disambiguation and nuance.

I see inferring word meaning as a process similar to object recognition in
vision: we have to infer the meaning because a word by itself stands for a
family of meanings. Yet most of the time it's one word, one embedding, even in
word2vec, which is the standard method used in papers.

Of course, word sense disambiguation is an old research topic and there are
many methods. The translation task is probably the ultimate word-sense
disambiguation application: it shows how much work remains after word
embedding in order to understand the meaning of words. If words had unique
meanings, we could do translation with simple dictionary substitutions.
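To make the dictionary-substitution point concrete, here's a toy sketch (the
tiny English-French vocabulary is made up for illustration) of how a
one-entry-per-word table collapses senses:

```python
# Toy illustration: why word-for-word dictionary substitution fails when
# one surface form covers several senses. Vocabulary is hypothetical.
en_fr = {"the": "la", "bank": "banque", "river": "rivière", "account": "compte"}

def substitute(sentence):
    """Replace each word with its single dictionary entry."""
    return " ".join(en_fr.get(w, w) for w in sentence.split())

# "bank" as financial institution: the one entry happens to fit.
print(substitute("the bank account"))  # la banque compte
# "bank" as river bank should be "rive", but the table has exactly
# one entry per word, so the wrong sense is forced.
print(substitute("the river bank"))    # la rivière banque
```

The failure here is the same one a single embedding per word bakes in: both
senses of "bank" share one representation.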

~~~
nl
Sense2vec is a thing; spaCy has a demo:
[https://demos.explosion.ai/sense2vec/?word=natural%20languag...](https://demos.explosion.ai/sense2vec/?word=natural%20language%20processing&sense=aut)

For many downstream tasks it doesn't seem to improve things, though. NLP is
hard: lots of things should work but don't, because of sparse training data.
As this slide deck says, most sentences occur only once.

~~~
Jack000
I thought spaCy only did part-of-speech disambiguation rather than sense
disambiguation, i.e. it would be confused if a word has multiple meanings,
all of which are nouns.

~~~
nl
This demo uses spaCy's POS tagging to build word embeddings that disambiguate
multiple senses. The paper explains it:
[https://arxiv.org/abs/1511.06388](https://arxiv.org/abs/1511.06388)
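The core preprocessing trick in that paper can be sketched in a few lines:
merge each token with its POS tag before training, so the word2vec vocabulary
itself separates senses that differ by part of speech (the tagged sentence
below is hand-written for illustration, not actual spaCy output):

```python
# Sketch of sense2vec-style preprocessing (arXiv:1511.06388): concatenate
# each token with its POS tag so a downstream word2vec model learns a
# separate vector per (word, tag) pair. Tagged input is hand-written here.

def tag_tokens(tagged_sentence):
    """Turn (word, pos) pairs into word|POS vocabulary items."""
    return [f"{word}|{pos}" for word, pos in tagged_sentence]

sentence = [("I", "PRON"), ("saw", "VERB"), ("her", "PRON"), ("duck", "NOUN")]
print(tag_tokens(sentence))
# ['I|PRON', 'saw|VERB', 'her|PRON', 'duck|NOUN']

# Trained on such tokens, word2vec treats duck|NOUN and duck|VERB as
# distinct vocabulary items. Senses sharing a POS (e.g. two noun senses
# of "bank") still collapse, which is the limitation raised above.
```

The paper also merges named-entity labels and noun chunks the same way, but
the POS case is the simplest to see.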

------
luisjgomez
Does anyone have a preprint/draft link for this citation:

Alessandro Lenci, Distributional models of word meaning, 2017

Doesn't appear to be released yet:
[http://www.annualreviews.org/doi/abs/10.1146/annurev-linguis...](http://www.annualreviews.org/doi/abs/10.1146/annurev-linguistics-030514-125254)

~~~
w4nderlust
I had the chance to read a preprint; it's a great survey and will be
published soon.

------
natch
Video:

[https://www.youtube.com/watch?v=AsGf8cV4hqg](https://www.youtube.com/watch?v=AsGf8cV4hqg)

------
paradite
Side note: you may not want to open this on mobile data. It was downloading
at 1 MB/s and I had to force-close the browser immediately.

Edit: Not sure how big it is in total, though. Maybe someone not on mobile
can share.

~~~
w4nderlust
I edited the page. Now there's a download button and a preview button. If you
click the preview button, the JavaScript document viewer embedding the PDF is
loaded; otherwise it isn't.

