
Visualizing Transformer Language Models: Illustrated GPT-2 - nafizh
https://jalammar.github.io/illustrated-gpt2/
======
somebodythere
My understanding of state-of-the-art language models is limited, but it seems
like a lot has been invested in models that generate their output one token
at a time, with no edits.

Have there been any experiments with models that can re-read and edit their
work before spitting out entire sentences, paragraphs, or documents?
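For readers unfamiliar with the scheme the question refers to, here is a minimal sketch of one-token-at-a-time (autoregressive) decoding. The `toy_model` function is a hypothetical stand-in, not a real language model; the point is only that each token is appended and never revised.

```python
def toy_model(context):
    # Hypothetical stand-in for a language model: deterministically
    # picks the next token given the tokens generated so far.
    continuation = ["models", "generate", "text", "<eos>"]
    return continuation[min(len(context) - 1, len(continuation) - 1)]

def decode(prompt, max_len=10):
    tokens = list(prompt)
    while len(tokens) < max_len:
        nxt = toy_model(tokens)  # choose one token from everything so far
        tokens.append(nxt)       # committed: earlier tokens are never edited
        if nxt == "<eos>":
            break
    return tokens

print(decode(["language"]))
# ['language', 'models', 'generate', 'text', '<eos>']
```

The commenter is asking about alternatives to this loop, where a model could revisit and revise tokens it has already emitted before finalizing its output.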

~~~
p1esk
[https://arxiv.org/abs/1811.04454](https://arxiv.org/abs/1811.04454)

------
ReDeiPirati
This is now my second-favorite read on GPT-2, after this one:
[https://blog.floydhub.com/gpt2/](https://blog.floydhub.com/gpt2/).

