
Natural Language Processing: The Age of Transformers - ilnmtlbnm
https://blog.scaleway.com/2019/building-a-machine-reading-comprehension-system-using-the-latest-advances-in-deep-learning-for-nlp/
======
baalimago
>Next we shall take a moment to remember the fallen heroes, without whom we
would not be where we are today. I am, of course, referring to the RNNs -
Recurrent Neural Networks, a concept that became almost synonymous with NLP in
the deep learning field.

XLNet ([https://arxiv.org/abs/1906.08237](https://arxiv.org/abs/1906.08237))
is in essence a recurrent neural network: it uses a transformer which
recurrently carries context between successive segments. But it's true that
the gated RNNs, such as AWD-LSTM and GRU, are giving way to the superior
transformer architectures.
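To make the recurrence concrete, here is a minimal sketch of segment-level
recurrence (in PyTorch, purely illustrative, not XLNet's actual code):
hidden states from the previous segment are cached without gradient and
reused as extra attention context for the next one.

    import torch
    import torch.nn as nn

    d_model = 16
    attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=2)

    def forward_segment(segment, memory):
        # Attend from the current segment over [memory + segment].
        context = segment if memory is None else torch.cat([memory, segment], dim=0)
        out, _ = attn(query=segment, key=context, value=context)
        # Cache the new hidden states without gradient: the "recurrence".
        return out, out.detach()

    memory = None
    stream = torch.randn(4, 10, 1, d_model)  # 4 segments of length 10, batch 1
    for segment in stream:
        out, memory = forward_segment(segment, memory)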

That's my only complaint though, excellent theoretical introduction.

Although, if anyone wants to actually implement a transformer, beware that
you want to have an 8+ GB GPU available, or be prepared to use cloud
computing (Google Colab is free, for now). Training neural networks is still
quite hardware dependent.
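For example, a quick local check before committing to a training run (a
minimal sketch using PyTorch's CUDA utilities):

    import torch

    # Report whether a CUDA GPU is available and how much memory it has.
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB")
    else:
        print("No GPU found; consider Google Colab or a cloud instance.")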

~~~
karambahh
Scaleway (where the author of this post works, as I do) is a cloud service
provider with a pretty interesting GPU instance: a 16 GB NVIDIA Tesla P100
at 1€ per hour.

------
sgt101
If anyone wants to use these tools practically, I urge you to have a good
look at this paper:
[https://www.aclweb.org/anthology/P19-1439/](https://www.aclweb.org/anthology/P19-1439/)

My takeaway: pretraining achieves excellent paper results, but robust
application is hard. There is still quite a way to go down this road for
fault-intolerant users and applications.

~~~
Jack000
I'm not an expert, but after playing with some pre-trained transformers I
think they are mostly good at the exact thing they're trained for. E.g.
GPT-2 is great for text generation, but if you try to use it for, say,
translation, it will tend to add imagined details not in the source text.
Similarly, BERT is great at sequence- and token-level classification but
quite bad at text generation.
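A minimal sketch of that contrast, assuming the Hugging Face transformers
library and its stock pre-trained checkpoints:

    from transformers import pipeline

    # GPT-2: open-ended generation is the task it was trained for.
    generate = pipeline("text-generation", model="gpt2")
    print(generate("The bridge wasn't quite right.", max_length=40)[0]["generated_text"])

    # BERT: predicting masked tokens (and, after fine-tuning,
    # classification), but it is not built for free-form generation.
    unmask = pipeline("fill-mask", model="bert-base-uncased")
    print(unmask("Paris is the [MASK] of France.")[0]["token_str"])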

------
bra-ket
Previous discussion on “Attention is all you need”:
[https://news.ycombinator.com/item?id=15938082](https://news.ycombinator.com/item?id=15938082)

Recent work by Jakob and team:
[https://ai.google/research/people/author37567/](https://ai.google/research/people/author37567/)

The Image Transformer is particularly interesting.

~~~
cjauvin
Also, recent discussion about a very good intro article about Transformers:
[https://news.ycombinator.com/item?id=20773992](https://news.ycombinator.com/item?id=20773992)

~~~
kuu
I was about to share it too, really GREAT intro :)

------
macando
"This spring, the Elon-Musk-founded AI research lab OpenAI made a splash with
an AI system that generates text. It can write convincing fake reviews, fake
news articles, and even poetry."

You can check it out for yourself.
[https://talktotransformer.com/](https://talktotransformer.com/) Pretty
awesome.

~~~
3pt14159
For the lazy:

<human>The bridge wasn't quite right. There were small sparks coming off the
incomplete sections of material hanging just to the right of the traffic
lights. Gary wasn't worried, however. He walked up
carefully.</human><computer> "This is it, guys!"

A huge explosion rocked the bridge, and a fireball engulfed the entire bridge.
"Shit! Shit!"

It seemed as though the fire was just blowing out, and the bridges were back
in place. Gary was a little concerned, but it wasn't a problem at the time. At
least it was under control.

He walked forward, peering over the top. "This is terrible. I'm not leaving
the village tonight! How about a break for some water and a drink?"

When he looked down, the large chunk he had just taken off of the bridge was
gone. As soon as he realized it was gone, he turned back to the village to
check to see what had happened. Not sure what to do, he continued down the
hill, heading toward the village at all times. The entire town was completely
surrounded, and everyone was either inside or had</computer>

Not bad, but it's strange that it autogenerated text with multiple spaces at
the end of sentences. Also, it is far more dramatic than I would have guessed.

~~~
noobiemcfoob
It's not just "not bad." It's scary. It makes enough coherent sense from
sentence to sentence that I doubt my mom (missing a few marbles) would
notice that it doesn't make a whole lot of sense overall. Combine this with
some official-looking logos and a request for money for a fine.

Determining whether a government or other institutional request is actually
official is going to get much harder.

~~~
sean2
I don't see the danger here though; how is this scarier than what a couple
of guys in Nigeria could concoct to fool your mom? Any English speaker can
still put together a much more coherent and official-sounding institutional
request.

------
mark_l_watson
Good explanation of transformers, and the history leading up to them. I look
forward to the next installment covering BERT.

As someone who spent a lot of time trying to manually code up solutions to
anaphora resolution (pronoun coreference), BERT seemed like a small miracle
to me. As a side comment: I love that getting training data for BERT is so
cheap: take any text source, randomly remove words, and the target output
is predicting the removed words.
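Roughly, in code (a simplified sketch; BERT's actual recipe also sometimes
keeps or randomly replaces the chosen tokens instead of always masking
them):

    import random

    def make_mlm_example(tokens, mask_token="[MASK]", mask_prob=0.15):
        # Turn any token sequence into a BERT-style training pair:
        # ~15% of tokens are masked, and the targets are the original
        # tokens at those positions.
        inputs, targets = [], []
        for tok in tokens:
            if random.random() < mask_prob:
                inputs.append(mask_token)
                targets.append(tok)    # the model must predict this
            else:
                inputs.append(tok)
                targets.append(None)   # no loss at unmasked positions
        return inputs, targets

    print(make_mlm_example("any text source will do".split()))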

------
ArtWomb
Conversational AI is much closer than we think. Neural sequence-to-sequence
models are successful in domain-specific settings. But in the context of
chit-chat dialogue systems, the responses lack humanity, undoubtedly because
the models don't comprehend our world. Transfer learning alleviates some of
that awkwardness.

If anyone's interested in further experiments on their own, there is now a
unified Python framework for dialogue models ;)

[https://parl.ai/about/](https://parl.ai/about/)

~~~
abhishek0318
Really? Most of the chatbots in production are rule-based systems.

~~~
macawfish
The transformer can also be used at different levels of abstraction, e.g. to
do interesting stuff with knowledge graphs. I think the transformer
architecture is about to make things very, very interesting.

[https://arxiv.org/pdf/1904.02342.pdf](https://arxiv.org/pdf/1904.02342.pdf)

------
fellahst
Great blog!

