
Single Headed Attention RNN: Stop Thinking with Your Head
https://arxiv.org/abs/1911.11423
======
Some key quotes:

 _We also achieve state-of-the-art on WikiText-103 - or do we? This work has
undergone no intensive hyperparameter optimization and lived entirely on a
commodity desktop machine that made the author's small studio apartment far
too warm in the midst of a San Franciscan summer. The final results are
achievable in plus or minus 24 hours on a single GPU as the author is
impatient. The attention mechanism is also readily extended to large contexts
and requires minimal computation. Take that Sesame Street._

 _Many fight against the homogenization of language by dividing and conquering
as they did in the Tower of Babel era (see: Javascript frameworks)._

 _Perhaps the researchers give up on machine learning and instead decide to
pursue a musical career themselves? Who am I to dictate what they do in this
alternate timeline?_

