Theoretically, HMMs are "models of the world", and transformers trained on HMM output are approximating the belief-state updates an HMM performs in the "forward" algorithm.
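To make that concrete, here's a minimal sketch of one belief-state update from the forward algorithm, with a hypothetical 3-state HMM (the transition matrix `T` and emission matrix `E` are made up for illustration, not taken from the article):

```python
import numpy as np

# Hypothetical 3-state, 2-symbol HMM (illustrative numbers only).
T = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])  # T[i, j] = P(next state j | current state i)
E = np.array([[0.9, 0.1],
              [0.5, 0.5],
              [0.1, 0.9]])       # E[i, k] = P(emit symbol k | state i)

def belief_update(belief, obs):
    """One normalized step of the forward algorithm: predict with T,
    reweight by the likelihood of the observed symbol, renormalize."""
    b = (belief @ T) * E[:, obs]
    return b / b.sum()

belief = np.ones(3) / 3          # start from a uniform belief
for obs in [0, 0, 1, 1]:
    belief = belief_update(belief, obs)

print(belief)  # a probability distribution over the 3 hidden states
```

The claim, as I read it, is that a transformer predicting the HMM's output well must implicitly track something equivalent to this `belief` vector.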
Something seems sus about how the linear projection ended up in exactly the same shape as their prediction. Also, their projection seems to stay the same shape throughout training. Typically, projections look like they "spin around" as they move from a random point cloud to the separated shapes, but I haven't run experiments on transformers, and it's unclear what they mean by "projection".
Yes, the projection is plausibly why it looks like a simplex/triangle: a belief state over 3 hidden states is a probability distribution, so it lives on the 2-simplex, and any linear projection of that set is a triangle.
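A quick sketch of that geometric point, assuming nothing about the article's actual projection (the matrix `P` below is an arbitrary illustrative choice): any distribution over 3 states is a convex combination of the three one-hot beliefs, so a linear projection sends the whole set into the triangle spanned by the three projected one-hot vectors.

```python
import numpy as np

# Where the three one-hot ("certain") beliefs land under a fixed linear
# projection to 2D; these vertices are an arbitrary equilateral choice.
P = np.array([[0.0, 1.0],
              [-np.sqrt(3) / 2, -0.5],
              [np.sqrt(3) / 2, -0.5]])  # row i = 2D image of basis vector i

rng = np.random.default_rng(0)
beliefs = rng.dirichlet(np.ones(3), size=1000)  # random points on the 2-simplex
points = beliefs @ P                            # linear projection to 2D

# Every projected belief is a convex combination of the three vertices,
# so the whole cloud necessarily sits inside the triangle.
print(points.shape)
```

So the triangle shape alone isn't surprising; the interesting question is whether the *interior* structure (the fractal arrangement of belief states) is really there.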
Another individual seems to have asked the same question in the comment section of that article, and after a lot of back and forth they wrote a follow-up article with the author: