Clearly it’s not an exact-attention transformer - perhaps some sort of sparse/approximate attention, or a recurrent-transformer-ish thing like RWKV?
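For what it’s worth, here’s a toy sketch of the distinction being speculated about (it assumes nothing about what Magic actually built): exact softmax attention materializes an n x n score matrix, while an RWKV-style linear-attention recurrence carries a fixed-size state forward token by token, which is one way to get “long term memory” without quadratic cost.

```python
# Illustrative only: contrasts exact attention (quadratic in sequence length)
# with a recurrent linear-attention update (constant-size state), the rough
# idea behind RWKV-style models. This is NOT Magic's actual architecture.
import numpy as np

def exact_attention(Q, K, V):
    # Standard softmax attention: the (n x n) score matrix is what makes
    # very long contexts expensive.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def recurrent_linear_attention(Q, K, V):
    # Linear attention rewritten as a recurrence: a fixed-size state (S, z)
    # is updated per token, so memory doesn't grow with context length.
    d = Q.shape[-1]
    phi = lambda x: np.maximum(x, 0) + 1e-6   # simple positive feature map
    S = np.zeros((d, d))
    z = np.zeros(d)
    out = []
    for q, k, v in zip(phi(Q), phi(K), V):
        S += np.outer(k, v)   # accumulate key-value associations
        z += k                # running normalizer
        out.append((q @ S) / (q @ z))
    return np.array(out)

n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, n, d))
print(exact_attention(Q, K, V).shape)             # (8, 4), O(n^2) compute
print(recurrent_linear_attention(Q, K, V).shape)  # (8, 4), O(n) with fixed state
```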
Their twitter announcement[0] does say it’s a novel architecture they’re calling a “Long Term Memory Network”. But who knows what that actually means.
[0] https://twitter.com/magicailabs/status/1666116949560967168