The interesting thing about word2vec is that it is an unsupervised method that builds a vector to represent each word, in a way that makes it easy to find relationships between them.
Yes, I agree that the applications for word vectors are not made as clear as they should be. One direct application is as the first layer of a neural network [1], which could be part of either a 1-dimensional convolution or a recurrent neural network. Using pre-trained word vectors is a form of transfer learning and allows for much more predictive models with smaller amounts of training data.
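To make the "first layer" idea concrete, here's a rough sketch with Keras (the embedding matrix is just a random placeholder here; in practice you'd fill it from your pre-trained word2vec/GloVe vectors, and the layer sizes are arbitrary choices for illustration):

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    vocab_size, embedding_dim, max_len = 20000, 300, 100
    embedding_matrix = np.random.rand(vocab_size, embedding_dim)  # stand-in for pre-trained vectors

    model = keras.Sequential([
        layers.Input(shape=(max_len,)),
        # Pre-trained word vectors as a frozen first layer (transfer learning)
        layers.Embedding(
            vocab_size, embedding_dim,
            embeddings_initializer=keras.initializers.Constant(embedding_matrix),
            trainable=False,
        ),
        # 1-D convolution over the sequence of word vectors
        layers.Conv1D(128, 5, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])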
Take the famous example of [king] and [queen] being close neighbors in vector space after generating the word vectors ("embedding"). If you then use these vectors to represent the words in your text, a sentence about kings will also add information about the concept of queens, and vice versa. To a far lesser degree, such a sentence will also add to your knowledge of [ceo], and, further down, [mechanical engineer]. But it will not change the system's knowledge of [stereo].
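If you want to poke at this yourself, gensim's downloader makes it easy. A quick sketch (the pre-trained Google News model; fair warning, the download is large):

    import gensim.downloader as api

    wv = api.load("word2vec-google-news-300")

    # king and queen sit close together in vector space
    print(wv.similarity("king", "queen"))

    # the famous analogy: king - man + woman ~= queen
    print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))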
Thanks, yeah, I get that, but I think I'm lacking the imagination for what to do with that, in terms of how to build something useful and user-friendly out of it.
Essentially they are useful for comparing the semantic similarity of pieces of text. The text could be a word, phrase, sentence, paragraph, or document. One practical use case is semantic keyword search where the vectors can be used to automatically find a keyword's synonyms. Another is recommendation engines that recommend other documents based on semantic similarity.
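For longer pieces of text, the crudest (but surprisingly useful) baseline is to average the word vectors and compare by cosine similarity, roughly like this (assuming wv is a loaded gensim model as in the example above):

    import numpy as np

    def phrase_vector(phrase, wv):
        # Average the vectors of the in-vocabulary words
        words = [w for w in phrase.lower().split() if w in wv]
        return np.mean([wv[w] for w in words], axis=0)

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    query = phrase_vector("cheap flights to paris", wv)
    doc = phrase_vector("low cost airline tickets", wv)
    print(cosine(query, doc))  # higher score = more semantically similar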
Are you sure it lets you find synonyms? I was under the impression that word2vec only tells you how similar words are, which is different from them being synonyms. E.g. red is like blue in the word2vec sense, but it's not a synonym.
Technically yes. It will find words which are used in similar contexts, such as synonyms, antonyms, etc. However, in practice, word2vec plus clustering does a good job of finding synonyms [1].
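The direct "synonym-ish" lookup is just a nearest-neighbour query in the vector space; the results mix synonyms with merely related (or opposite) terms, which is where the clustering and filtering come in. Again assuming wv is loaded as above:

    # Nearest neighbours of "purchase"; expect things like "buy",
    # "purchases", "buying" mixed in with merely related terms
    print(wv.most_similar("purchase", topn=5))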
I was very pleased to find this out when I first started studying word embeddings (the abstract principles behind word2vec). Essentially it comes down to words that tend to appear alongside the same verbs and objects ending up semantically close.