Hacker News
Neural Machine Translation and Sequence-to-sequence Models: A Tutorial (arxiv.org)
110 points by tim_sw on Mar 7, 2017 | 6 comments

This is a good paper for anyone interested in how modern Machine Translation works at the level of detail you might get from a well-written text in a college-level CS course (which is what I believe this is from). The paper starts with a background on statistical machine translation and then goes through the newer approach of sequence-to-sequence learning for translation, including word replacement and attention mechanisms. It's a good overview.
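The attention mechanism the comment mentions can be sketched in a few lines. This is a minimal dot-product (Luong-style) attention over encoder states, written with NumPy; the function name and the toy vectors are my own illustration, not code from the paper:

```python
import numpy as np

def attention(decoder_state, encoder_states):
    """Dot-product attention: weight each encoder state by its
    similarity to the current decoder state, then take the
    weighted sum as the context vector."""
    scores = encoder_states @ decoder_state        # one score per source position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                       # softmax -> attention distribution
    context = weights @ encoder_states             # weighted sum of encoder states
    return weights, context

# Toy example: 3 encoder states (source positions), hidden size 4
enc = np.array([[1., 0., 0., 0.],
                [0., 1., 0., 0.],
                [0., 0., 1., 0.]])
dec = np.array([0., 5., 0., 0.])  # decoder state most similar to position 1
w, ctx = attention(dec, enc)      # w peaks at source position 1
```

Real NMT systems learn the encoder/decoder states and often a scoring network on top, but the core weighted-sum idea is just this.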

But if you are looking for a higher-level introduction that covers the same big ideas in ~10 minutes for a more general audience, here's my take: https://medium.com/@ageitgey/machine-learning-is-fun-part-5-...

This is one of my favorite high-level introductions to this area. Most people I have shared it with understand it even without deep technical knowledge. Great material!

Adam, I've been using Machine Learning Is Fun Part 1 at work, to introduce non-technical business leaders to supervised machine learning concepts. Thanks for the great series!

Stephen Merity of Metamind has a nice visual tutorial here as well: http://smerity.com/articles/2016/google_nmt_arch.html

I was lucky enough to study in the same lab as the author while he was doing his doctorate. Graham has a real talent for explaining complicated concepts in a way that's easy to understand. He's also strongly committed to putting as much of his code and data as he can online so that anyone can play around with it, including people who aren't academics.

If you're interested in a more introductory talk, I gave one a few weeks ago that covers the basics of Deep Learning and how TensorFlow works internally.

