
A Magician of Chinese Poetry - prismatic
http://www.chinafile.com/nyrb-china-archive/magician-of-chinese-poetry
======
kevinwang
I wonder if it would be feasible to have the words of two languages living in
the same word2vec vector space. Then, to get a feel for a word in a poem in
the other language, you could see what combination of words from your own
language it amounts to.
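
For concreteness, here's a rough Python sketch of what I mean; the vectors
and word list below are random placeholders standing in for a real bilingual
embedding, not output from any actual model:

    import numpy as np

    d = 100
    en_words = ["empty", "mountain", "voice", "light", "moss"]
    en_vecs = np.random.randn(len(en_words), d)  # stand-in English word vectors
    zh_vec = np.random.randn(d)                  # stand-in vector for one Chinese word

    # Solve min_w ||en_vecs.T @ w - zh_vec||^2: the weight of each English
    # word in the least-squares "summation" that best reconstructs the
    # Chinese word's vector.
    w, *_ = np.linalg.lstsq(en_vecs.T, zh_vec, rcond=None)

    for word, weight in sorted(zip(en_words, w), key=lambda t: -abs(t[1])):
        print(f"{word}: {weight:+.2f}")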

~~~
vtange
What if one language has a word that simply has no equivalent in another
language? Wouldn't that word be 'off the charts'?

Chinese has many characters that are used almost exclusively in names or for
approximating sounds when transliterating other languages. If you try to put
Chinese characters and English words into one vector space, you're going to
have a lot of Chinese characters that carry no meaning of their own.

~~~
gwern
> What if one language has a word that simply has no equivalent in another
> language? Wouldn't that word be 'off the charts'?

No. It'll still be mapped onto the word2vec vectors, but it will take on
bizarre, extreme, and highly unstable values due to maximum-likelihood
training. It'll associate the untranslatable word with all sorts of random
other words like 'cheese' simply because 'cheese' happened to be used within a
few sentences of it, and the mapping will vary unpredictably with different
hyperparameters, random seeds, datasets, implementations, etc.

This often happens. For example, in a simple logistic regression, suppose you
have 10 datapoints and 1 variable that separates them perfectly; what will
maximum likelihood do? It won't give you a sensible odds ratio like 10, it'll
give you an OR of ∞, because an infinite OR maximizes the probability of the
observed perfect separation: the likelihood of separation becomes exactly 1,
and any finite OR would imply a likelihood slightly smaller than 1. (A
logistic regression using any kind of regularization or Bayesian prior will
behave much better.) You can see the same thing in image-classification CNNs
when you give them a photo of something not in the training set: they just
come up with whatever is the closest match, for the flimsiest and most
superficial of reasons. Or if you use a NN for time-series prediction and
extrapolate the prediction forward, it'll follow an exact, extreme trend,
because it's not reflecting any of the real uncertainty.
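
To see the separation problem concretely, here's a toy sklearn sketch (the
data is made up for illustration, and C=1e9 merely approximates unpenalized
maximum likelihood):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    X = np.arange(10, dtype=float).reshape(-1, 1)
    y = (X.ravel() >= 5).astype(int)           # one variable separates the classes perfectly

    ml = LogisticRegression(C=1e9).fit(X, y)   # ~unregularized ML: coefficient blows up
    reg = LogisticRegression(C=1.0).fit(X, y)  # an L2 penalty keeps it finite

    print(np.exp(ml.coef_[0, 0]))              # absurdly large odds ratio
    print(np.exp(reg.coef_[0, 0]))             # a sensible one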

This is why being able to ask a NN about its uncertainty is so useful: the
regular NNs will just confidently predict bullshit because that's what you
trained them to do. There's some nifty stuff about this in "Uncertainty in
Deep Learning", Gal 2016: http://mlg.eng.cam.ac.uk/yarin/blog_2248.html
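
The MC-dropout trick from that thesis is easy to sketch: leave dropout on at
test time and treat the spread of repeated stochastic forward passes as a
crude uncertainty estimate. (The model and shapes below are arbitrary toy
choices, not anything from the thesis:)

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(),
                          nn.Dropout(p=0.5), nn.Linear(64, 1))

    def mc_predict(model, x, n_samples=100):
        model.train()                  # keeps dropout active, unlike model.eval()
        with torch.no_grad():
            samples = torch.stack([model(x) for _ in range(n_samples)])
        return samples.mean(dim=0), samples.std(dim=0)

    mean, std = mc_predict(model, torch.randn(1, 8))
    print(mean.item(), std.item())     # a large std is the net saying "not sure"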

~~~
kevinwang
Why would an untranslatable word take on bizarre values? If there are enough
training examples of the untranslatable word, why couldn't it actually have a
sensible mapping in the vector space?

p.s. thanks for the reference to that thesis. I've been wondering about
uncertainty in predictions for a side project to predict athletic performance
that I'm working on, so this is very timely.

~~~
gwern
word2vec is trying to predict the co-occurrence of words; if there is nothing
translatable about a word, its co-occurrences with the other language's words
are effectively random. When you fit a maximum-likelihood model to random
stuff and force it to attribute pattern to the noise, it's not going to come
up with any sensible mapping.

~~~
kevinwang
But just because it's "untranslatable" from, say, Chinese to English doesn't
mean we can't see many Chinese-Chinese co-occurrences of the word, so this
Chinese word would still have a vector of significance relative to other
Chinese words, right?

That is: if you ran word2vec on a Chinese corpus, this word would be
represented adequately in the vector space.

Wouldn't it then make sense, if you could find a proper mapping from one
language's space to the other (I don't know if this part is realistic, but
maybe by finding correspondences between already-translated words), that this
word would not land in a bizarre spot when expressed as a combination of
vectors from the other language?
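
The "mapping" I have in mind is something like the linear translation matrix
of Mikolov et al. 2013: fit W on already-translated seed pairs, then apply it
to everything else. A sketch with placeholder vectors (nothing here is a real
embedding):

    import numpy as np

    d = 300                             # embedding dimensionality
    zh_seed = np.random.randn(5000, d)  # stand-in vectors of Chinese seed words
    en_seed = np.random.randn(5000, d)  # stand-in vectors of their English translations

    # Least-squares solution of min_W ||zh_seed @ W - en_seed||^2
    W, *_ = np.linalg.lstsq(zh_seed, en_seed, rcond=None)

    def nearest_english(zh_vec, en_vocab_vecs, k=10):
        """Map a Chinese vector into English space; return nearest-neighbor indices."""
        v = zh_vec @ W
        sims = en_vocab_vecs @ v / (np.linalg.norm(en_vocab_vecs, axis=1)
                                    * np.linalg.norm(v) + 1e-9)
        return np.argsort(-sims)[:k]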

------
moogleii
Great example of how language and grammar affect how one thinks, and vice
versa.

------
vorg
For those interested...

    
    
      《鹿柴》 ("Deer Park", Wang Wei)
      空山不见人， Empty mountain, no one in sight,
      但闻人语响。 yet human voices are heard.
      返影入深林， Returning light enters the deep forest,
      复照青苔上。 shining once more on the green moss.

------
onlywei
I was named after this guy.

------
chris_st
Wow, that's beautiful.

------
ktRolster
Chinese poetry inspired me to start learning Chinese.

