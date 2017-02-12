Hacker News
Tomáš Mikolov on Word2vec and AI research at Microsoft, Google, Facebook [audio] (rare-technologies.com)
132 points by tim_sw on Feb 12, 2017 | 6 comments



Great podcast.

The world owes a big THANK YOU to Tomáš Mikolov, one of the creators of Word2Vec[0] and fastText[1], and also to Radim Řehůřek, the interviewer, who is the creator of gensim[2].

The number of software developers and researchers in industry and academia who rely on the work of these two individuals is large and growing every day.

[0] https://code.google.com/p/word2vec/

[1] https://github.com/facebookresearch/fastText

[2] https://radimrehurek.com/gensim/


Interesting podcast. I wish they had talked more about how he came up with word2vec, and how he went from the initial idea, which came from the Czech language (I think that's what he said), to the word2vec we all know.

I also liked how conversational this podcast was, like the interviews on the Talking Machines podcast. I look forward to future episodes.


I think most people who have worked in computational linguistics or natural language processing for a while would have some idea of words existing in some kind of vector space.

The things that still amaze me are that meaningful vector operations work (Queen - Woman + Man ~= King), that you only need 300 dimensions (or sometimes even fewer!), and that it is possible to build this vector space so "easily".


So, what is this word2vec, after all?


After you feed some documents into word2vec, it gives you a vector representation of each word (1). Words that are close to each other in this space tend to be similar, for example. Famously, it's possible to do neat computations, e.g. king - man might result in a vector that's close to the one for queen.

(1) This representation is learned, essentially, by trying to predict a word from the surrounding words in a sentence, or the other way around.
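
To make footnote (1) a bit more concrete, here is a rough sketch of how the two prediction setups turn a sentence into training examples. The word2vec papers call them CBOW (context predicts the word) and skip-gram (the word predicts its context). The function names, the toy sentence and the `window` value are made up for illustration, and this only builds the examples; it does not train anything.

```python
def context_windows(tokens, window=2):
    """Yield (center_word, context_words) for every position in a sentence."""
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        yield center, left + right


def cbow_examples(tokens, window=2):
    # CBOW-style: predict the center word from its surrounding words.
    return [(context, center) for center, context in context_windows(tokens, window)]


def skipgram_examples(tokens, window=2):
    # Skip-gram-style: predict each surrounding word from the center word.
    return [(center, ctx)
            for center, context in context_windows(tokens, window)
            for ctx in context]


# Hypothetical toy sentence, just to show the shape of the examples.
sentence = "the king rules the country".split()
cbow = cbow_examples(sentence, window=1)
skipgram = skipgram_examples(sentence, window=1)

print(cbow[1])       # (['the', 'rules'], 'king')
print(skipgram[:3])  # [('the', 'king'), ('king', 'the'), ('king', 'rules')]
```

A real implementation would feed pairs like these into a small neural network and keep the learned input weights as the word vectors.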


You need to compute king - man + woman since, in a well-trained model, the king - queen vector is similar to the man - woman vector.

These gender-based vector differences can be explained as capturing syntactic rather than semantic information, but semantic relations (Rome:Italy -> Paris:France) can also work.
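
For anyone who wants to see the king - man + woman arithmetic spelled out, here is a minimal sketch in plain Python. The vectors are hand-made toy values, not learned by word2vec, and the `analogy` helper is a hypothetical name. Skipping the query words when picking the nearest neighbour is an assumption borrowed from how analogy lookups are commonly done.

```python
import math

# Hand-made toy vectors (NOT learned by word2vec).
# Dimensions are roughly: (royalty, maleness, food).
vectors = {
    "king":  [0.9,  0.8, 0.0],
    "queen": [0.9, -0.8, 0.0],
    "man":   [0.1,  0.8, 0.0],
    "woman": [0.1, -0.8, 0.0],
    "apple": [0.0,  0.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def difference(a, b):
    """Element-wise a - b."""
    return [x - y for x, y in zip(a, b)]


def analogy(positive, negative):
    """Add the positive vectors, subtract the negative ones, and return the
    closest remaining word by cosine similarity (query words are skipped)."""
    dims = len(next(iter(vectors.values())))
    target = [0.0] * dims
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    candidates = [w for w in vectors if w not in positive and w not in negative]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# king - man + woman
print(analogy(positive=["king", "woman"], negative=["man"]))  # queen

# The king - queen offset points the same way as the man - woman offset.
offset_similarity = cosine(
    difference(vectors["king"], vectors["queen"]),
    difference(vectors["man"], vectors["woman"]),
)
```

With real learned vectors the match is only approximate, which is why the result is picked by nearest neighbour rather than by exact equality.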




