Hacker News
You Only Cache Once: Decoder-Decoder Architectures for Language Models (arxiv.org)
3 points by reqo 26 days ago | 1 comment



Basically, it is a hybrid model like Jamba.



