
In general, the individual coordinates of a word embedding (and hence a sentence embedding) have no semantic meaning, at least not in any "normal" sense.

The n-dimensional space in which the embeddings live is constructed so that similarity between two vectors/embeddings (typically cosine similarity) carries semantic meaning, e.g. query or word similarity — not so that any single coordinate does.
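To make that concrete, here's a minimal sketch using made-up 4-dimensional toy vectors (real embeddings have hundreds of dimensions): related words point in similar directions, even though no individual coordinate means anything on its own.

    import numpy as np

    # Toy 4-dimensional "embeddings" (invented for illustration only).
    embeddings = {
        "king":  np.array([0.8, 0.1, 0.6, -0.2]),
        "queen": np.array([0.7, 0.2, 0.5, -0.1]),
        "apple": np.array([-0.3, 0.9, 0.0, 0.4]),
    }

    def cosine_similarity(u, v):
        """Cosine of the angle between two vectors: 1.0 = same direction."""
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    # Semantically related words end up pointing in similar directions.
    print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high
    print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low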

A few more details are here: https://www.quora.com/Do-the-dimensions-of-Word2Vec-have-an-...




My favorite paper for introducing the idea is this oldie but goodie: http://lsa.colorado.edu/papers/dp1.LSAintro.pdf

The math behind Word2Vec and GloVe is different, but most word embedding models have been shown to be equivalent to some version of matrix factorization (e.g. skip-gram with negative sampling implicitly factorizes a shifted PMI matrix). If you're comfortable with linear algebra, the SVD formulation makes it relatively easy to get an intuitive grasp of what the dimensions of an embedding really "mean": each one is a latent direction of co-occurrence variance, not a human-readable feature. See the sketch below.
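Here's a rough sketch of that SVD view — an LSA-style factorization of a tiny made-up word-by-context co-occurrence matrix (not Word2Vec or GloVe itself; vocabulary and counts are invented for illustration):

    import numpy as np

    # Tiny made-up word-by-context co-occurrence matrix
    # (rows: words, columns: context features).
    vocab = ["king", "queen", "apple", "banana"]
    counts = np.array([
        [10.0,  8.0, 0.0, 1.0],
        [ 9.0, 10.0, 1.0, 0.0],
        [ 0.0,  1.0, 9.0, 8.0],
        [ 1.0,  0.0, 8.0, 9.0],
    ])

    # Truncated SVD: counts ≈ U_k @ diag(S_k) @ Vt_k.
    U, S, Vt = np.linalg.svd(counts, full_matrices=False)
    k = 2                               # keep the top-k latent dimensions
    word_vectors = U[:, :k] * S[:k]     # each row is a k-dim word embedding

    # Each latent dimension is a weighted mix of contexts, not a
    # human-readable feature on its own.
    for word, vec in zip(vocab, word_vectors):
        print(word, vec.round(2))

Words with similar co-occurrence patterns ("king"/"queen", "apple"/"banana") land close together in the reduced space, which is essentially the construction described in the LSA paper linked above.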



