
Attention and Augmented Recurrent Neural Networks - ZeljkoS
https://distill.pub/2016/augmented-rnns/
======
make3
an oldie but a goodie.

See also the lectures from the Oxford Deep NLP course:
[https://github.com/oxford-cs-deepnlp-2017/lectures](https://github.com/oxford-cs-deepnlp-2017/lectures)

~~~
woliveirajr
I've been doing NLP with neural networks for some time, specifically for
categorizing information and authorship attribution.

What I've seen so far is that neural networks, in this area, are mostly being
used as classifiers, and sometimes older classifiers (an SVM, for example)
perform just as well.

How you extract features (bag-of-words? TF-IDF? Doc2Vec? FastText?) is what
makes the difference in my tests... And some research indicates that
extracting features with semantic information gives stronger results.

~~~
make3
The more useful aspect of neural nets in NLP is that you can train complex
end-to-end models that end up performing better at their task than any
composite pipeline.

An example of this is neural translation with the "Attention Is All You Need"
paper ([https://arxiv.org/abs/1706.03762](https://arxiv.org/abs/1706.03762)),
which is the state of the art in translation.
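The core operation of that paper fits in a few lines. This NumPy sketch
implements the scaled dot-product attention formula from the paper,
Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V; the shapes are made up
for illustration:

```python
import numpy as np

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # numerically stable softmax over the keys axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V               # weighted sum of value vectors

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))   # 2 queries, dimension 4 (illustrative)
K = rng.normal(size=(5, 4))   # 5 keys
V = rng.normal(size=(5, 8))   # 5 values, dimension 8
print(attention(Q, K, V).shape)  # (2, 8): one output vector per query
```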

Another example is Microsoft's R-NET
([https://www.microsoft.com/en-us/research/publication/mrc/](https://www.microsoft.com/en-us/research/publication/mrc/)),
which is the state of the art at question answering
([https://rajpurkar.github.io/SQuAD-explorer/](https://rajpurkar.github.io/SQuAD-explorer/))
([https://research.fb.com/downloads/babi/](https://research.fb.com/downloads/babi/))

Another cool example of a complex end-to-end network is Stanford's Dynamic
Coattention Network
([https://arxiv.org/abs/1611.01604](https://arxiv.org/abs/1611.01604))
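The coattention idea itself is simple enough to sketch. This is a rough NumPy
illustration, not the paper's exact architecture (the shapes and variable
names are made up): build an affinity matrix between document and question
encodings, then normalize it in both directions so each sequence attends over
the other:

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
D = rng.normal(size=(6, 4))  # document encoding: 6 words, hidden size 4
Q = rng.normal(size=(3, 4))  # question encoding: 3 words, hidden size 4

L = D @ Q.T                  # affinity of every document/question word pair
A_q = softmax(L, axis=0)     # attention over document words, per question word
A_d = softmax(L, axis=1)     # attention over question words, per document word

C_q = D.T @ A_q              # question-aware summaries of the document
# coattention context: attend the question plus those summaries back onto
# the document, yielding one context vector per document word
C_d = np.concatenate([Q.T, C_q], axis=0) @ A_d.T
print(C_d.shape)             # (8, 6)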

~~~
woliveirajr
Thanks, I'll read them!

