
Peeking into the architecture used for Google's Neural Machine Translation - eternalban
http://smerity.com/articles/2016/google_nmt_arch.html
======
Smerity
Hey HN, I'm author of the article. If there's anything confusing, I'd be happy
to help :)

The GNMT architecture is likely to keep popping up in Google's papers due to
the amount of engineering that's been dedicated to making it scale well. Other
than chewing through massive machine translation datasets, from single to
multiple language pairs, it's also been trained over the entirety of Reddit
for a conversational model.

Just this week a new paper[1] was released that uses the GNMT architecture to
teach a machine translation system to translate between language pairs it has
never been trained on, by bridging from knowledge of other language pairs.
It's all pretty fascinating stuff :)

[1]: [https://arxiv.org/abs/1611.04558](https://arxiv.org/abs/1611.04558)
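
Roughly, the multilingual trick in [1] is to tell one shared model which
language to produce by prepending an artificial target-language token to the
source sentence - a toy sketch (the token format here is illustrative, not
the paper's exact preprocessing):

    # One shared model handles every language pair; the desired target language
    # is signalled by a token prepended to the source sentence.
    def tag_for_target(sentence, target_lang):
        return "<2{}> {}".format(target_lang, sentence)

    print(tag_for_target("Hello, how are you?", "es"))  # -> "<2es> Hello, how are you?"
    print(tag_for_target("Hello, how are you?", "ja"))  # -> "<2ja> Hello, how are you?"

Because every pair shares the same encoder/decoder parameters, the model can
bridge and produce passable translations for pairs it never saw explicit
parallel data for.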

~~~
zackmorris
Great article, thank you for getting to the crux of the paper. Now that you
have a base case working:

* Could you convert the parameters that govern the creation and interconnect of each layer into Lisp and feed the whole thing into a genetic algorithm to evolve even better translators? (Roughly the kind of loop sketched after this list.) Using GAs to evolve NNs was the new hotness in the late '90s and I don't know whatever happened to it. It would take on the order of 10,000 times more computation to simulate all the generations, but hey, it's Google.

* How hard would it be to generalize the GNMT for solving other types of problems where the output translation is the solution to a problem or what to do next? While reading the article, it occurred to me that the hidden layer S could be thought of as the meaning of the input data. If S is smaller than the input data, then you could treat it like a word feeding into another GNMT to process paragraphs, chapters, etc. I guess this would hinge on whether sentences have any relation to one another. In any case, would love to hear your thoughts on something like that.
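
To make the first idea concrete, here's a minimal sketch of the kind of GA
loop I mean - pure toy code, with a made-up fitness stand-in where the
(hugely expensive) train-and-evaluate step would really go:

    import random

    LAYER_SIZES = [128, 256, 512, 1024]

    def random_genome():
        # A genome is just a list of hidden-layer sizes, e.g. [512, 256, 1024].
        return [random.choice(LAYER_SIZES) for _ in range(random.randint(2, 8))]

    def fitness(genome):
        # Stand-in: in reality you'd train a model with this architecture and
        # return its validation score, which is what makes this so expensive.
        return len(genome) - 0.001 * sum(genome)

    def crossover(a, b):
        cut = random.randint(1, min(len(a), len(b)) - 1)
        return a[:cut] + b[cut:]

    def mutate(genome, rate=0.2):
        return [random.choice(LAYER_SIZES) if random.random() < rate else size
                for size in genome]

    population = [random_genome() for _ in range(20)]
    for generation in range(50):
        population.sort(key=fitness, reverse=True)
        parents = population[:10]  # keep the fittest half
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(10)]
        population = parents + children

    print("best layer sizes found:", max(population, key=fitness))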

I'm just an armchair hobbyist toiling my career away building CRUD apps, but I
really feel that if we can get beyond the learning curve (no pun intended) and
get to basic building blocks like this, then engineers will run with it and
scale it very quickly to a passable implementation of hard AI.

------
lucaspiller
Does anyone know of any good primers on building your own machine translation
system? My wife is from a small European country, and the machine translation
systems I've tried usually end up with incomprehensible results. My theory is
that there isn't anything especially hard about the language; it's just that,
since the number of users is small, not as much effort goes into training for
it - so the results are poor. As an experiment I'd like to try building my
own, but I don't know where to begin.

~~~
nl
[https://www.tensorflow.org/versions/r0.11/tutorials/seq2seq/...](https://www.tensorflow.org/versions/r0.11/tutorials/seq2seq/index.html#neural-translation-model)

If the European country she is from is in the EU, you can probably get a
parallel corpus from the EU parliamentary translations. I believe there are
other EU research projects that release useful corpora as well.
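
Once you have aligned source/target files (Europarl-style corpora ship one
file per language, aligned line by line - the file names below are just
examples), loading sentence pairs is the easy part; a rough sketch:

    # Read a sentence-aligned parallel corpus: one file per language, where
    # line N of each file is the same sentence in the two languages.
    def load_parallel(src_path, tgt_path, max_pairs=None):
        pairs = []
        with open(src_path, encoding="utf-8") as src, open(tgt_path, encoding="utf-8") as tgt:
            for s, t in zip(src, tgt):
                s, t = s.strip(), t.strip()
                if s and t:  # skip empty lines so the alignment stays clean
                    pairs.append((s, t))
                if max_pairs and len(pairs) >= max_pairs:
                    break
        return pairs

    # pairs = load_parallel("europarl-v7.de-en.de", "europarl-v7.de-en.en", max_pairs=100000)

The tutorial above should cover the rest (vocabularies, bucketing, training).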

~~~
xbmcuser
Sadly, one of the largest multi-language translation corpora can't be used
for copyright reasons: the user-made subtitle files that are shared all over
the internet. Every week a few hundred episodes of different shows are
released in different countries and translated by users into many languages.
Now that I think about it, an app that could transcribe the words spoken in a
TV show and then translate them using Google Translate would be a huge hit
with the manga crowd. Though Japanese hasn't been added yet.
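
For anyone curious what using those files would even look like, pulling the
text out of an .srt file is trivial (toy sketch below; the genuinely hard
part is aligning the same episode's subtitles across languages, which this
doesn't attempt):

    import re

    # Strip cue numbers, timestamp lines and blanks from an .srt subtitle
    # file, leaving just the spoken text. File names are illustrative.
    TIMESTAMP = re.compile(r"\d\d:\d\d:\d\d,\d\d\d --> \d\d:\d\d:\d\d,\d\d\d")

    def srt_text(path):
        lines = []
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                line = line.strip()
                if not line or line.isdigit() or TIMESTAMP.match(line):
                    continue
                lines.append(line)
        return lines

    # english = srt_text("episode01.en.srt")
    # spanish = srt_text("episode01.es.srt")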

~~~
nl
It's actually not at all clear if they couldn't be used. Things like Google's
word2vec model are derived from copyrighted data and seem to be fine.

Perhaps distribution of the subtitles is illegal under copyright law, but I
suspect the trained model would be fine.

It would be an interesting case anyway.

