
Google Translate might not be as good as people thought - robin_reala
http://groups.google.com/group/shibboleth-users/browse_thread/thread/123bd2d82822a3a7?pli=1
======
user24
Most SMT systems are trained using the proceedings of the European Court - as
it's a huge corpus of multilingual documents which all have the same meanings.

This is a large factor as to why non-European languages typically don't fare
that well in Statistical Machine Translation systems. The corpuses (corpii?)
aren't as large for other languages.

Subject-broadness is another problem. Early information retrieval systems were
trained (and many still are) using the Wall Street Journal as the corpus.
Which means they work great for searching on the topic of big business, but
not so great on getting apple pie recipes, as WSJ doesn't talk much about that
;)

~~~
woodson
There are some research projects (in rather early stages) that develop SMT
techniques for translation to languages without big parallel corpora
(essentially by bootstrapping such corpora, assisted by active learning). This
could be of particular importance to keep smaller languages from disappearing,
otherwise less and less works in that language will be available (yes, I'm
aware that there are many people who consider language death a good thing).

~~~
user24
Sounds fascinating; links? research? Applicability to lost languages like
Linear A?

------
kylec

        Please apologize for your stupidity.
    

I wish I had the gall to say this to people

~~~
techiferous
I couldn't resist: <http://pleaseapologizeforyourstupidity.com>

~~~
thenduks
There needs to be a place to _actually apologize_, toss a textarea in there
and show previous apologies below :)

~~~
techiferous
What an _awesome_ idea. I'm adding that to my to-do list.

------
david_p
So far, I'm guessing :

    
    
      lumber = log
      vomit = throw
      insult to father's stones = thrown in superclass
      wind, pole, and dragon = maybe "try,catch,finally" or "if,then,else"
      goat-time = WTF ?

~~~
hasenj
Wind in Japanese is 'kaze' (風), and the kanji character (according to
rikaikun) has a second meaning: method.

<http://translate.google.com/#en|ja|wind>

Edit:

The kanji for dragon seems to have a reading: "ryuu" and that "sound" (if not
written in kanji) has several meanings, one of which is also "method"

[http://jisho.org/words?jap=ryuu&dict=edict](http://jisho.org/words?jap=ryuu&dict=edict)

"goat" seems to be "yagi", and putting that in jisho.org, one of the results
is related to "night shift", so maybe a nightly process? nightly build?

[http://jisho.org/words?jap=yagi&dict=edict](http://jisho.org/words?jap=yagi&dict=edict)

~~~
donw
I can't imagine anybody using '風' for 'method' for anything relating to
software. It's usually written in either katakana (メソッド), or the mathematical
term for function is used instead (関数).

Really wish I could see the original message...

~~~
hasenj
could it then be just a general "method", as in "the way to do a certain
thing"?

~~~
donw
Kind of, although I think that 'in the style of' is a more accurate
translation. For example, you might see '日本風のステーキ' for 'Steak, in the Japanese
style.'

------
sfphotoarts
I have a Russian friend who used my computer to check her vkontakte (Russian
facebook), she's not used to Chrome and it auto-translated the page. At first
she didn't notice because she's equally capable in English, then she started
to giggle at the word pancake (which is a poor translation), but she said that
on the whole it did a really good job. I guess some languages are easier to
do. Anecdotal I know, but it's good enough for me to use Russian websites to
read photo comments.

------
avar
Another interesting thing about Google Translate is that people sometimes
successfully troll it by amusing the "submit a better translation" feature,
which Google ostensibly uses without much checking.

I'd point out an example, but Google's engineers probably read this site, and
the examples I have are too valuable for my personal amusement to give up :)

~~~
rdela
I love amusing the "submit a better translation" feature.

~~~
avar
Thanks :) That was a fun grammar error.

------
njharman
Did people think it was that good? IME, it translates normal text to and from
many languages into something that provides the gist if lacking in the
nuances. Which, IMHO, is flippin fantastically great even in the face of the
odd horrible translation.

~~~
waterlesscloud
I was recently typing in some phrases from a 19th century book in French
(which I'd found on Google Books), and it was interesting to watch the
translation morph as I typed and provided more context. It wasn't perfect
translation in the end, but I was still pretty impressed.

------
xtacy
Google Translate sort of works on the premise that at least the input text
isn't confusing.

~~~
othello
If by not confusing you mean not ambiguous, then that's rarely the case in
Japanese, which is a very context-sensitive language.

The original Japanese text is probably not at fault here.

------
techiferous
When I worked in Germany some coworkers were trying to figure out the English
word for Betriebshof (bus depot) by using Altavista's Babelfish, which claimed
it was "yardyard yard".

------
bherms
We did an experiment in my societies and culture class where we tested
translations between several different languages using several different
services. What's really interesting to see is when you translate between a
language and back again, or go through multiple channels to origin--eg,
English->French->German->English. We still have a long way to go before
machines can provide perfect translations, but that's part of the fun, right?
Pushing the boundaries and finding new inventive ways to solve the problem of
allowing anyone in the world to communicate with anyone else by bringing down
the language barrier. We're closer today than we've ever been, and we'll be
closer tomorrow, and so on.
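
A minimal sketch of that kind of chained experiment, assuming toy word-by-word tables in place of real translation services (the `word_table` helper and all three tables are made up for illustration; real French would also write "l'avocat"). The drift comes from French "avocat" meaning both "lawyer" and "avocado":

```python
def chain_translate(text, steps):
    """Apply each translation step in order; return every intermediate result."""
    results = [text]
    for step in steps:
        results.append(step(results[-1]))
    return results


def word_table(table):
    """Hypothetical stand-in for a translation service: word-by-word lookup."""
    return lambda text: " ".join(table.get(word, word) for word in text.split())


# Toy tables, each keeping only one sense per word.
en_fr = word_table({"the": "le", "lawyer": "avocat"})
fr_de = word_table({"le": "die", "avocat": "Avocado"})
de_en = word_table({"die": "the", "Avocado": "avocado"})

# English -> French -> German -> English
for stage in chain_translate("the lawyer", [en_fr, fr_de, de_en]):
    print(stage)
```

Comparing the first and last entries of the returned list shows how far the text drifted over the round trip.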

~~~
jamesbkel
You may enjoy this

<http://translationparty.com/>

~~~
scotty79
This is amusing: <http://translationparty.com/#7966687>

------
robk
Many SMT systems are based on European Court, but Google's had people working
on acquiring parallel texts for years now and has one of the largest corpora
of parallel texts in existence digitally, as far as I know.

Quality is roughly proportional to the logarithm of the volume of unique text
available. Thus, there's a rough rule of thumb: for every doubling of the
corpus for any language, quality increases by a few points (on a well known
scale of translation quality).
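
As a rough sketch of what that doubling rule implies, assuming a simple log-linear model (the function name, baseline score, and gain per doubling below are illustrative placeholders, not measured values):

```python
import math


def estimated_quality(corpus_size, base_size, base_score, gain_per_doubling):
    """Hypothetical model: the score rises by a fixed amount per corpus doubling."""
    doublings = math.log2(corpus_size / base_size)
    return base_score + gain_per_doubling * doublings


# Illustrative numbers only: start at 1M sentence pairs with a score of 20,
# and assume 2 points gained per doubling.
for size in (1_000_000, 2_000_000, 4_000_000, 8_000_000):
    print(size, estimated_quality(size, 1_000_000, 20.0, 2.0))
```

Under this model each extra point gets more expensive: going from 1M to 2M pairs buys the same gain as going from 4M to 8M.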

The general assumption is that over time this statistical technique, along
with the growing data acquisition of Google, will approach human quality.

But you can assume Google's tried to acquire tons of available parallel texts.
Book translations, government (any multilingual gov't is great, Canada for
ex), religion, etc. Sky's the limit.

------
mech4bg
I know the initial post was tongue in cheek, but Google Translate seems to be
worse than 1999-2000 era Babelfish at times. I often (try to) use it to double
check my German before sending off an email and it inevitably fails
dramatically and I have to instead check individual words on a decent site
like dict.leo.org. It seems weird that a small site like that obliterates
Google for word accuracy.

------
torial
For a great discussion / informal research on comparing online translation
tools (primarily Google, Bing, Yahoo) on quality, see:
<http://www.tcworld.info/index.php?id=175>

The interesting finding was that when brand identity was hidden from the
graders of the translations, Bing's and Yahoo's quality scores rose.

------
rimantas
I was having fun with it when translating from one non-English language to
another. The fun part is that it translates Lang1->English->Lang2, and when it
does not know how to translate word X from Lang1 to English, it sometimes just
picks a similar English word and then translates that to Lang2. Often the
result is hilarious.
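
A minimal sketch of the Lang1->English->Lang2 pivot, assuming toy one-word lexicons (both tables and the `pivot_translate` helper are made up for illustration; real systems use statistical models over phrases). It shows the related failure where the English pivot keeps the wrong sense of a word:

```python
# German "Schloss" can mean "castle" or "lock"; this toy table keeps only "lock".
DE_TO_EN = {"Schloss": "lock"}
EN_TO_FR = {"lock": "serrure", "castle": "château"}


def pivot_translate(word, src_to_pivot, pivot_to_tgt):
    """Translate via an intermediate (pivot) language; unknown words pass through."""
    pivot = src_to_pivot.get(word, word)
    return pivot_to_tgt.get(pivot, pivot)


# The French output follows whatever sense the English pivot picked.
print(pivot_translate("Schloss", DE_TO_EN, EN_TO_FR))
```

If the source text was about a castle, the French side never gets a chance to recover, because only the English word is passed along.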

------
superk
I'm just surprised that any engineer can be so lacking in foundational
English. Are there any programming languages that have a vocabulary that is
_not_ English?

------
phreeza
Does anyone with knowledge of Japanese have a clue what was actually being
asked, and what went wrong?

~~~
MikeMacMan
I have no idea. I don't know where 'goat time' could come from or why there
would be a dragon reference. I've seen a lot of bad translations, but this one
is so bizarre that I kind of doubt its authenticity.

------
ephesus
The funniest part is that Nate's response is almost equally bad Japanese.

~~~
tjarratt
Could you offer a translation? I'm curious how it relates, going the other
direction.

~~~
auxbuss
According to Google Translate <drum roll>:

Mr. Matsumoto, Hi. This is Nate.

Google translator, the incompetence which is vulgar. (Laughs) Make sure to
email me directly. You can write in Japanese. I will help you.

Sadly, no goats.

~~~
delackner
Far from "nearly equally" as bad, at least Nate's text is easily interpreted
in the intended way. His grammar may be very stiff and incorrect, but for a
non-native speaker who probably doesn't live in Japan, give him some slack.

That said, I did enjoy his construction
「あなたは日本語で書くことができます。私はあなたを助けるつもりです。」Trying to preserve the jerky tone: "You are
able to write japanese! It is _my_ intention to rescue _YOU_."

------
abdelazer
I look forward to someone re-captioning All Your Base with this text.

------
9ec4c12949a4f3
Maybe trolls are over-riding the suggest improved translation feature...

------
labboy
Not perfect, plenty of delayed reaction, but at least gives a flavor and some
easier than previous access to good stats on what gets blocked and why

------
js2
You know what would have been neat? The readers of HN working together to try
to reverse engineer the original meaning. There's an attempt at that by a
couple readers. But is that what gets voted to the top of this thread? No, an
inane comment sits at the top with 58 points. Disappointing.

------
devmonk
松本武 - At often, the goat-time install a error is vomit.

Agreed.

松本武 - To how many times like the wind, a pole, and the dragon?

I don't know... 5, maybe 7?

松本武 - Install 2,3 repeat, spank, vomit blows

Was that a Windows install?

松本武 - goat-time see like the wind, pole, and dragon?

Goat-time... is that like happy hour?

松本武 - This insult to father's stones?

Ow... don't bring my father's stones into this.

松本武 - JSP error handler with wind, pole, dragon with intercourse to goat-time?

I knew Oracle was taking Java downhill, but whoa... a dragon? No way.

松本武 - Or chance lack of skill with a goat-time?

Would you care to participate in a game of skill?

松本武 - Please apologize for your stupidity.

I'm sorry.

松本武 - There are a many thank you

You are welcome!

