Visualizing Transformer Language Models: Illustrated GPT-2 (jalammar.github.io)
64 points by nafizh on Aug 12, 2019 | 3 comments

My understanding of state-of-the-art language models is limited, but it seems like a lot has been invested in models that generate their output one token at a time, with no edits.

Have there been any experiments with models that can re-read and edit their work before spitting out entire sentences, paragraphs, or documents?
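
To make "one token at a time, no edits" concrete, here is a rough sketch of that kind of decoding loop. It is only an illustration under heavy assumptions: TOY_MODEL and generate are made-up names, and the toy table conditions on the previous token only, whereas GPT-2 conditions on the whole prefix with a transformer. The point is just the shape of the loop: each sampled token is appended and never revisited.

```python
import random

# Hypothetical toy "model": next-token probabilities given only the last token.
# A real model like GPT-2 would compute this distribution from the full prefix.
TOY_MODEL = {
    "<s>": {"the": 1.0},
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 1.0},
    "dog": {"sat": 1.0},
    "sat": {"</s>": 1.0},
}


def generate(model, max_tokens=10, seed=0):
    """Sample tokens left to right; earlier tokens are never edited."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    for _ in range(max_tokens):
        dist = model[tokens[-1]]              # distribution over the next token
        choices, weights = zip(*dist.items())
        next_token = rng.choices(choices, weights=weights)[0]
        tokens.append(next_token)             # append only, no going back
        if next_token == "</s>":
            break
    return tokens


print(generate(TOY_MODEL))
```

A model that could re-read and edit would need some extra step after (or interleaved with) this loop that is allowed to change entries already in tokens, which is exactly what this loop never does.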

This is now the second-best piece I've read on GPT-2, after this one: https://blog.floydhub.com/gpt2/.

