Visualizing Transformer Language Models: Illustrated GPT-2 (jalammar.github.io)
64 points by nafizh on Aug 12, 2019 | hide | past | favorite | 3 comments

My understanding of state-of-the-art language models is limited, but it seems like a lot has been invested in models that generate their output one token at a time, with no edits.

Have there been any experiments with models that can re-read and edit their work before spitting out entire sentences, paragraphs, or documents?
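The one-token-at-a-time scheme described above can be sketched as a greedy decoding loop. This is a toy illustration, not GPT-2 itself: the bigram table is a hypothetical stand-in for the model's learned next-token distribution, and `next_token`/`generate` are made-up names.

```python
# Minimal sketch of autoregressive (one-token-at-a-time) decoding.
# The bigram table is a hypothetical stand-in for a transformer's
# learned next-token distribution.

def next_token(context, bigrams):
    """Return the most likely next token given only the last token."""
    return bigrams.get(context[-1], "<eos>")

def generate(prompt, bigrams, max_tokens=10):
    """Greedily append one token at a time; earlier tokens are never revised."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        tok = next_token(tokens, bigrams)
        if tok == "<eos>":
            break
        tokens.append(tok)
    return tokens

# Toy transition table standing in for a trained model.
bigrams = {"the": "cat", "cat": "sat", "sat": "down"}
print(generate(["the"], bigrams))  # ['the', 'cat', 'sat', 'down']
```

Note that once a token is emitted it is frozen; the loop only ever extends the sequence, which is exactly the "no edits" property the comment asks about.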

This is now my second-favorite read on GPT-2, after this one: https://blog.floydhub.com/gpt2/
