

ashvardanian 11 months ago | on: Building Meta's GenAI infrastructure

Nope. Moreover, simulating it even with AVX-512 is quite an experience. Been postponing it for 2 years now... But first of all, you need to choose the version of float8 you want to implement, as the standards differ between GPU vendors.
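To make the "choose the version" point concrete, here is a minimal scalar sketch that decodes the same byte under two common float8 layouts. It assumes E5M2 with IEEE-style infinities and NaNs (bias 15) and E4M3 in the "finite only" flavor, where just the all-ones pattern is NaN (bias 7). The helper names `decode_fp8`, `decode_e5m2`, and `decode_e4m3` are hypothetical, and other vendor variants use different biases and special-value rules.

```python
import math


def decode_fp8(byte: int, exp_bits: int, man_bits: int, bias: int,
               ieee_specials: bool) -> float:
    """Decode one 8-bit pattern laid out as sign | exponent | mantissa."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1
    exp = (byte >> man_bits) & exp_max
    man = byte & man_max

    if ieee_specials and exp == exp_max:
        # IEEE-style: all-ones exponent encodes infinity or NaN.
        return sign * math.inf if man == 0 else math.nan
    if not ieee_specials and exp == exp_max and man == man_max:
        # "Finite only" style: a single NaN pattern, no infinities.
        return math.nan
    if exp == 0:
        # Zero and subnormals: no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def decode_e5m2(byte: int) -> float:
    # Assumption: 5 exponent bits, 2 mantissa bits, bias 15, IEEE specials.
    return decode_fp8(byte, 5, 2, 15, True)


def decode_e4m3(byte: int) -> float:
    # Assumption: 4 exponent bits, 3 mantissa bits, bias 7, single NaN pattern.
    return decode_fp8(byte, 4, 3, 7, False)


if __name__ == "__main__":
    for b in (0x38, 0x3C, 0x7B, 0x7E):
        print(f"0x{b:02X}: e5m2={decode_e5m2(b)}  e4m3={decode_e4m3(b)}")
```

The same byte 0x38 reads as 0.5 under E5M2 and as 1.0 under E4M3, so data written with one convention cannot be read with the other without conversion.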

janwas 11 months ago

We use it in gemma.cpp [1]. This hybrid of E5M2 and E4M3 decodes to bf16 in ~14 instructions, so we can do that on the fly during dot products.

[1]: github.com/google/gemma.cpp
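For a rough idea of why an 8-bit float can decode to bf16 in a handful of operations, here is a scalar sketch. It is not the gemma.cpp hybrid format, whose layout is not described in this thread; it assumes plain E5M2 (bias 15, IEEE-style specials), and the names `e5m2_to_bf16_bits` and `bf16_bits_to_float` are hypothetical. Because bf16 has 8 exponent bits and 7 mantissa bits, every E5M2 value is exactly representable, so decoding is only masks, shifts, and an exponent re-bias.

```python
import struct


def e5m2_to_bf16_bits(byte: int) -> int:
    """Return the 16-bit bf16 pattern for an E5M2 byte (assumed layout)."""
    sign = (byte & 0x80) << 8          # move sign to bit 15
    exp = (byte >> 2) & 0x1F           # 5 exponent bits, bias 15
    man = byte & 0x03                  # 2 mantissa bits

    if exp == 0x1F:
        # Infinity (man == 0) or NaN: all-ones bf16 exponent, keep mantissa.
        return sign | 0x7F80 | (man << 5)
    if exp == 0:
        if man == 0:
            return sign                # signed zero
        # Subnormal: value is man * 2**-16, which is normal in bf16.
        top = man.bit_length() - 1     # position of the leading one (0 or 1)
        frac = man - (1 << top)        # bits below the leading one
        return sign | ((127 - 16 + top) << 7) | (frac << (7 - top))
    # Normal: re-bias the exponent (127 - 15 = 112) and widen the mantissa.
    return sign | ((exp + 112) << 7) | (man << 5)


def bf16_bits_to_float(bits: int) -> float:
    """bf16 is the upper half of an IEEE float32, so pad with 16 zero bits."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


if __name__ == "__main__":
    for b in (0x3C, 0xC0, 0x7B, 0x01):
        bits = e5m2_to_bf16_bits(b)
        print(f"0x{b:02X} -> bf16 0x{bits:04X} = {bf16_bits_to_float(bits)}")
```

A vectorized version would apply the same masks and shifts to many bytes at once; the subnormal and special-value branches are what add to the instruction count.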

danielhanchen 11 months ago

Congratulations on gemma.cpp!!
