
Are those good or bad?



FlashAttention is an amazing improvement over the previous state of the art. The others are still highly experimental, but seem like they'll at least contribute significant knowledge to whatever ends up surpassing the Transformer (assuming something does).
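For anyone curious why FlashAttention is such a big deal: the core trick is computing attention over key/value tiles with a running ("online") softmax, so the full N x N score matrix never has to be materialized in slow memory. Here's a rough NumPy sketch of that idea (the shapes, tile size, and function name are mine for illustration, not the actual CUDA kernel, which also fuses everything and handles the backward pass):

  import numpy as np

  def tiled_attention(q, k, v, tile=128):
      # Illustrative online-softmax attention; not the real FlashAttention kernel.
      n, d = q.shape
      out = np.zeros_like(q)
      m = np.full(n, -np.inf)   # running row-wise max of the scores
      l = np.zeros(n)           # running softmax denominator
      for start in range(0, k.shape[0], tile):
          kt = k[start:start + tile]             # (t, d) key tile
          vt = v[start:start + tile]             # (t, d) value tile
          s = q @ kt.T / np.sqrt(d)              # (n, t) scores for this tile only
          m_new = np.maximum(m, s.max(axis=1))   # updated running max
          p = np.exp(s - m_new[:, None])         # unnormalized tile probabilities
          scale = np.exp(m - m_new)              # rescale earlier accumulators
          l = l * scale + p.sum(axis=1)
          out = out * scale[:, None] + p @ vt
          m = m_new
      return out / l[:, None]

  # Matches naive softmax(q @ k.T / sqrt(d)) @ v up to float error:
  rng = np.random.default_rng(0)
  q, k, v = (rng.standard_normal((256, 64)) for _ in range(3))
  s = q @ k.T / np.sqrt(64)
  p_ref = np.exp(s - s.max(axis=1, keepdims=True))
  ref = (p_ref / p_ref.sum(axis=1, keepdims=True)) @ v
  assert np.allclose(tiled_attention(q, k, v), ref, atol=1e-6)

Same output as standard attention, but each tile's scores live only briefly, which is what lets the real kernel keep everything in on-chip SRAM.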



