
A Word Is Worth a Thousand Vectors - astrobiased
http://technology.stitchfix.com/blog/2015/03/11/word-is-worth-a-thousand-vectors/
======
est
As a native Chinese speaker, I find this comes very naturally.

pork = pig + meat

So 2015 is the year of the ram/sheep/goat; in Chinese, these literally
mean:

ram = male ∪ caprinae

sheep = wool ∪ caprinae

goat = mountain ∪ caprinae

Basically, word composition is pretty common in analytic languages like
Chinese, but it's kind of a new idea in fusional languages like English.
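
For anyone who wants to try this kind of composition with word vectors, here is a minimal sketch using gensim; the pretrained model file named below is an assumption (the public Google News vectors), and any word2vec model in this format would do:

```python
# A minimal sketch of "pork = pig + meat" style composition with gensim.
# The model file is an assumption; substitute any pretrained word2vec model.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

# Add the vectors for "pig" and "meat" and look at the nearest neighbors.
print(vectors.most_similar(positive=["pig", "meat"], topn=5))
# If the compositional intuition holds, "pork" should rank near the top.
```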

~~~
perdunov
This is super-cool. I am trying to start learning Chinese on Coursera:
[https://www.coursera.org/learn/learn-chinese](https://www.coursera.org/learn/learn-chinese)

Learning a language that is based on different principles is an enormous brain
exercise. Also, remembering the characters is a challenge.

------
imh
I've always wondered about doing this in non-flat spaces. Like if I add the
"7100 miles west" vector to the "California" point, I get Turkmenistan. If I
add "7100 miles west" again, I get back near "California." Similarly, adding
the "not" vector twice might get you back where you started in a word
embedding. Does anyone know of work on this? It could be tricky
because "7100 miles west" lives in the tangent space to the space "California"
lives in, but that in itself could be an interesting thing to study in the
context of words.
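
Here's a toy sketch of that wrap-around intuition, using longitude on a circle of latitude as the curved space; the numbers are rough stand-ins for the "7100 miles west" example:

```python
# Toy illustration: on a curved space, applying the same "go west" step
# twice can bring you back near where you started. The space here is just
# longitude on a fixed circle of latitude.

def step_west(lon_deg, step_deg):
    """Move step_deg of longitude westward, wrapping around the circle."""
    return (lon_deg - step_deg) % 360

california = -120.0   # rough longitude of California
step = 180.0          # "halfway around" plays the role of "7100 miles west"

once = step_west(california, step)   # roughly the far side of the globe
twice = step_west(once, step)        # back to the start
print(once, twice == california % 360)   # 60.0 True
```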

~~~
jeremysalwen
Take a look at some of the compositional models here under publications:
[http://www.socher.org/](http://www.socher.org/).

Here is the demo webpage for the sentiment analysis system:
[http://nlp.stanford.edu/sentiment/](http://nlp.stanford.edu/sentiment/)

------
madcowherd
Wondering how this differs from the SemanticVectors package? Will have to look
into word2vec further.

~~~
juxtaposicion
It's my first time seeing the package, but looking over the docs it appears
to implement LSA. The major difference is that word2vec dramatically
outperforms LSA on a variety of tasks. My experience has been that the
vector representations from LSA can be underwhelming and perform poorly. I
can't comment on the Random Projection and Reflective Random Indexing
techniques SemanticVectors implements.

This link is about document distances but still compares the other techniques
nicely:
[http://datascience.stackexchange.com/questions/678/what-are-some-standard-ways-of-computing-the-distance-between-documents](http://datascience.stackexchange.com/questions/678/what-are-some-standard-ways-of-computing-the-distance-between-documents)
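
For concreteness, here's a rough sketch of building both representations side by side in gensim on a toy corpus. The corpus and parameters are made up; a real comparison needs much more data and a proper benchmark:

```python
# Build an LSA model and a word2vec model over the same toy corpus.
from gensim.corpora import Dictionary
from gensim.models import LsiModel, Word2Vec

texts = [["striped", "shirt", "cotton"],
         ["striped", "dress", "silk"],
         ["plain", "shirt", "cotton"]]

# LSA: truncated SVD over the bag-of-words term-document matrix.
dictionary = Dictionary(texts)
bow_corpus = [dictionary.doc2bow(text) for text in texts]
lsa = LsiModel(bow_corpus, id2word=dictionary, num_topics=2)

# word2vec: vectors trained to predict words from their local contexts.
w2v = Word2Vec(texts, vector_size=10, window=2, min_count=1, seed=0)
print(w2v.wv.most_similar("shirt", topn=2))
```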

~~~
madcowherd
Sorry, I should have specifically mentioned how it differs from random
indexing/projection. I was immediately reminded of a similar inference example
using random indexing/projection.

[https://code.google.com/p/semanticvectors/wiki/PredicationBasedSemanticIndexing](https://code.google.com/p/semanticvectors/wiki/PredicationBasedSemanticIndexing)

------
madsravn
Very exciting stuff. I love how you can take simple building blocks and create
something elegant and fun with them.

However, why are there words more similar to "vacation" than "vacation" itself?

~~~
juxtaposicion
Thanks! The word 'vacation' is just removed from the list since it's exactly
what we're looking for.

~~~
Bill_Dimm
It's not removed from the list -- it is second from the bottom. madsravn's
question is a good one.

~~~
TomAnthony
The one in the list includes a period after it, so I believe it is just a case
of slightly dirty data.

~~~
Bill_Dimm
Good observation -- I missed that (obviously). They seem to be using data from
the word2vec project, so I would guess that it is intentional rather than a
lack of cleaning.
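
If their pipeline tokenizes on whitespace (an assumption on my part), a token like "vacation." falls out naturally:

```python
# How "vacation." can become its own vocabulary entry: whitespace
# tokenization keeps trailing punctuation attached to the word.
text = "We booked a vacation. The vacation was great"
tokens = text.split()
print(sorted(set(tokens)))
# ['The', 'We', 'a', 'booked', 'great', 'vacation', 'vacation.', 'was']
```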

------
nl
I don't understand how the item matching is working. Do they have textual
descriptions of each item (including colors and patterns), or are they somehow
building vectors for the images and then doing cross-modal vector
calculations?

If it's the first option, then generating those descriptions seems an
important thing to mention.

If it's the second, then it's a pretty significant result! I've seen some
papers that indicate some possibilities in that area, but never anything
working as well as this.

~~~
juxtaposicion
The item vectors are generated from the text: customers and stylists write
text about, say, the stripes or the maternity fit, and word2vec associates
this text with the item in question. How that happens is covered in the next
section, on summarizing documents (a 'document' here is the collection of all
text about an item).

So we don't do any fancy deep learning from the images themselves, although
this is on the horizon :)
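
One simple way to get from a document to an item vector, sketched below, is to average the word2vec vectors of everything written about the item. Whether the post uses plain averaging or something fancier isn't stated here, so treat this as an assumption; the corpus and parameters are stand-ins:

```python
# Sketch: turn "all text about an item" into an item vector by averaging
# the word2vec vectors of its words.
import numpy as np
from gensim.models import Word2Vec

item_text = "soft striped maternity top with three quarter sleeves".split()

# Hypothetical model trained on all customer and stylist text.
model = Word2Vec([item_text], vector_size=50, min_count=1, seed=0)

item_vector = np.mean(
    [model.wv[word] for word in item_text if word in model.wv], axis=0)
print(item_vector.shape)  # (50,)
```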

~~~
nl
Colors would be trivial to do, at least... "The Dress" notwithstanding.

------
sinwave
Am I wrong that the title seems to imply something negative about word
vectors? But the article is super pumped about them!

~~~
juxtaposicion
It's just a play on the phrase 'a picture is worth a thousand words' :) The
word vectors themselves contain sophisticated relationships that seem almost
miraculous; we're definitely not negative on them.

(Speaking as one of the authors of the post)

------
hmate9
By simple algebra, we can now prove that a picture is worth 1,000,000 vectors.

------
Dewie
> If the gender axis is more positive, then it's more feminine; more negative,
> more masculine.

Reminds me of my Java textbook: the example was to model some employee[1] and
its gender was a `boolean`: `false` for man, `true` for woman. Of course that
was just an intermediate example before they showed off the `enum` solution.

[1] Because hey, Java OOP + CRUD business application == match made in
heaven as an example, apparently.

