Hacker News
Visualizing Transformer Language Models: Illustrated GPT-2 (jalammar.github.io)
64 points by nafizh 63 days ago | 3 comments



My understanding of state-of-the-art language models is limited, but it seems like a lot has been invested in models that generate their output one token at a time, with no edits.

Have there been any experiments into models that can re-read and edit their work, before spitting out entire sentences, paragraphs, or documents?
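For context, the "one token at a time, no edits" generation the comment describes can be sketched with a toy greedy decoding loop. The bigram table below is a hypothetical stand-in for a real model's next-token logits; the point is just the loop shape: each step appends a token and never revises earlier ones.

```python
# Toy stand-in for a language model: a fixed bigram logit table.
# (Hypothetical values, chosen only to illustrate the decoding loop.)
vocab = ["<s>", "the", "cat", "sat", "</s>"]
bigram_logits = {
    "<s>": [0, 5, 1, 0, 0],   # "<s>" most strongly predicts "the"
    "the": [0, 0, 5, 1, 0],   # "the" -> "cat"
    "cat": [0, 0, 0, 5, 1],   # "cat" -> "sat"
    "sat": [0, 0, 0, 0, 5],   # "sat" -> "</s>"
}

def generate(max_len=10):
    tokens = ["<s>"]
    for _ in range(max_len):
        logits = bigram_logits[tokens[-1]]
        # Greedy decoding: pick the highest-scoring next token.
        best = max(range(len(logits)), key=logits.__getitem__)
        # Append-only: earlier tokens are never revisited or edited.
        tokens.append(vocab[best])
        if vocab[best] == "</s>":
            break
    return tokens

print(generate())  # -> ['<s>', 'the', 'cat', 'sat', '</s>']
```

A model that could "re-read and edit" would need a different loop, one allowed to insert or replace tokens anywhere in the sequence rather than only appending at the end.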



This is now my second-favorite read on GPT-2, after this one: https://blog.floydhub.com/gpt2/.



