

Putting Google to the Test in Translation - seer
http://www.nytimes.com/interactive/2010/03/09/technology/20100309-translate.html?ref=technology

======
Jun8
The Systran system is quite old; check out how Hofstadter thrashed it some
years ago in his "Le Ton beau de Marot"
(<http://en.wikipedia.org/wiki/Le_Ton_beau_de_Marot>), an awesome book on
machine translation and literature in general, BTW. Looking at these, I
thought that (i) Google totally blows the competition out of the water in
French, Spanish, and German (I even found the German translation slightly more
readable than the human one: "uneasy dreams" sounds better than the human's
"anxious dreams"); but (ii) the Russian and Arabic translations are not very
readable. Too bad they didn't provide a comparison in Chinese.

Many old-school linguists scorn the no-model, statistical approach to
machine translation that Google adopts (my advisor was one of these), opting
for more semantically based models instead. I have yet to see results
comparable to Google's (i.e. in unconstrained domains) from this school.

~~~
sundarurfriend
How can we know about the linguistic models that Google uses? Are these from
some papers published by Googlers?

~~~
Jun8
For a simple introduction to their approach you can read Peter Norvig's
explanation of how Google performs spell checking
(<http://norvig.com/spell-correct.html>) or watch his video lecture
([http://www.catonmat.net/blog/theorizing-from-data-by-peter-n...](http://www.catonmat.net/blog/theorizing-from-data-by-peter-norvig-video-lecture/)).
You can get links to their results for the NIST 2005 evaluation from here:
[http://googleresearch.blogspot.com/2006/04/statistical-machi...](http://googleresearch.blogspot.com/2006/04/statistical-machine-translation-live.html)
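
To give a flavor of the statistical idea, here is a toy sketch in the spirit of the spell-checking essay linked above. It is not Google's code and not Norvig's original either: the tiny `CORPUS`, the names `edits1` and `correct`, and the tie-breaking rule are all assumptions made up for illustration. It only looks at candidates one edit away and picks the one seen most often in the corpus.

```python
from collections import Counter

# Toy corpus (an assumption for illustration); a real system would count
# words over a very large text collection.
CORPUS = "the quick brown fox jumps over the lazy dog the dog barks".split()
COUNTS = Counter(CORPUS)
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string one edit (delete, transpose, replace, insert) away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in LETTERS]
    inserts = [l + c + r for l, r in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct(word):
    """Pick the most frequent known word within one edit, else return the input."""
    if word in COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in COUNTS]
    if not candidates:
        return word
    # Ties on frequency are broken alphabetically so the result is deterministic.
    return max(candidates, key=lambda w: (COUNTS[w], w))
```

The point is that there is no grammar or dictionary of rules here, only counts from data; calling `correct("teh")` simply asks which known word near "teh" was seen most often.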

------
patrickas
There is something strange: in the Arabic article both the human and the
Google translation say "[..] soccer great Diego Maradona", whereas in the
original Arabic there is no mention of anything resembling "great". Was the
human translator influenced by the Google result? Or is Diego Maradona always
referred to as "soccer great Diego Maradona" in the English press, and Google
is picking up on that?

Edit: I suppose it is the latter, since the human translator was not trying to
have a correct reference translation for testing the engines. He totally
skipped translating the words " in South Africa ". Imho, this makes the
results much less interesting.

Also, in the French translation it is interesting that "Le premier soir"
translates as "The first evening" if you type it into Google without the rest
of the sentence, but becomes "The first night" after you put it in context.

~~~
sundarurfriend
> I suppose it is the latter since the Human translator was not trying to have
> a correct reference translation for testing the engines. He totally skipped
> translating the words " in South Africa ".

Each human translation also has the source below it, which ranges from old
classics to relatively recent publications. So, yes, the human translators
were not trying to have a correct reference translation, which imho makes it
only more interesting.

> becomes "The first night" after you put it in context.

which makes me suspect this sentence happened to be one of the inputs used to
train the translator as well.

------
tokenadult
The comparisons with Yahoo and with Bing are quite interesting.

------
yaroslavvb
The Russian version confuses two engines because of hyphenation (perhaps due
to line-wrapping in the original article).

