Visualizing Transformer Language Models: Illustrated GPT-2 (jalammar.github.io)
64 points by nafizh on Aug 12, 2019 | hide | past | favorite | 3 comments

My understanding of state-of-the-art language models is limited, but it seems like a lot has been invested in models that generate their output one token at a time, with no edits.

Have there been any experiments with models that can re-read and edit their work before spitting out entire sentences, paragraphs, or documents?
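The one-token-at-a-time scheme described above can be sketched as a greedy decoding loop. This is a toy illustration, not GPT-2 itself: the bigram table is a hypothetical stand-in for the model's learned next-token distribution, and `next_token`/`generate` are made-up names.

```python
# Minimal sketch of autoregressive (one-token-at-a-time) decoding.
# The bigram table is a hypothetical stand-in for a transformer's
# learned next-token distribution.

def next_token(context, bigrams):
    """Return the most likely next token given only the last token."""
    return bigrams.get(context[-1], "<eos>")

def generate(prompt, bigrams, max_tokens=10):
    """Greedily append one token at a time; earlier tokens are never revised."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        tok = next_token(tokens, bigrams)
        if tok == "<eos>":
            break
        tokens.append(tok)
    return tokens

# Toy transition table standing in for a trained model.
bigrams = {"the": "cat", "cat": "sat", "sat": "down"}
print(generate(["the"], bigrams))  # ['the', 'cat', 'sat', 'down']
```

Note that once a token is emitted it is frozen; the loop only ever extends the sequence, which is exactly the "no edits" property the comment asks about.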

This is now my second-favorite read on GPT-2, after this one: https://blog.floydhub.com/gpt2/
