Hacker News
ashvardanian 11 months ago | on: Building Meta's GenAI infrastructure
Nope. Moreover, simulating it even with AVX-512 is quite an experience. Been postponing it for 2 years now... But first of all, you need to choose the version of float8 you want to implement, as the standards differ between GPU vendors.
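To make the "which float8" point concrete, here is a minimal sketch of scalar decoders for the two layouts named in this thread, E5M2 and E4M3. It assumes the OCP-style conventions (E5M2 with IEEE-like infinities and NaNs, bias 15; E4M3 with no infinities, a single NaN mantissa pattern, bias 7). Other vendor variants use different biases and special values, so treat the function names and parameters as illustrative, not as any library's API.

```python
import math

def decode_fp8(byte, exp_bits, man_bits, bias, ieee_specials):
    """Decode one 8-bit float (sign | exponent | mantissa) to a Python float.

    ieee_specials=True: an all-ones exponent means inf/NaN (E5M2 style).
    ieee_specials=False: only all-ones exponent AND mantissa is NaN (E4M3 style).
    """
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    max_exp = (1 << exp_bits) - 1
    max_man = (1 << man_bits) - 1

    if exp == max_exp:
        if ieee_specials:
            return sign * math.inf if man == 0 else math.nan
        if man == max_man:
            return math.nan
    if exp == 0:
        # Subnormal: no implicit leading 1.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

def decode_e5m2(byte):
    return decode_fp8(byte, exp_bits=5, man_bits=2, bias=15, ieee_specials=True)

def decode_e4m3(byte):
    return decode_fp8(byte, exp_bits=4, man_bits=3, bias=7, ieee_specials=False)
```

The same byte means different things under each layout, which is why the format has to be fixed first:

```python
decode_e4m3(0x44)  # 3.0
decode_e5m2(0x44)  # 4.0
```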
janwas 11 months ago
We use it in gemma.cpp [1]. This hybrid of E5M2 and E4M3 decodes to bf16 in ~14 instructions, so we can do that on the fly during dot products.
[1]: github.com/google/gemma.cpp
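As a rough illustration of "decode on the fly during dot products", the sketch below keeps weights as 8-bit floats and expands each one to a bf16 bit pattern only at the point of use. It is not the gemma.cpp decoder: the thread does not describe the hybrid format's layout, so plain OCP-style E4M3 is swapped in here, and the scalar Python stands in for what would be SIMD code. All function names are hypothetical.

```python
import struct

def e4m3_to_bf16_bits(byte):
    """Decode one E4M3 byte to the 16-bit pattern of the equal bf16 value.

    Every finite E4M3 value fits exactly in bf16 (8 exponent bits,
    7 mantissa bits), so truncating the float32 encoding loses nothing.
    """
    sign = (byte >> 7) & 1
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return (sign << 15) | 0x7FC0  # quiet NaN
    if exp == 0:
        magnitude = man * 2.0 ** -9  # subnormal
    else:
        magnitude = (1 + man / 8) * 2.0 ** (exp - 7)
    f32_bits = struct.unpack(">I", struct.pack(">f", magnitude))[0]
    return (sign << 15) | (f32_bits >> 16)  # bf16 = top half of float32

def bf16_bits_to_float(bits):
    """Widen a bf16 bit pattern back to a float by zero-filling the low bits."""
    return struct.unpack(">f", struct.pack(">I", bits << 16))[0]

def dot_fp8_weights(weights, activations):
    """Dot product of E4M3-encoded weights (bytes) with float activations.

    Weights are decoded one at a time inside the loop; the accumulator is a
    Python float (double precision), standing in for a wider accumulator.
    """
    acc = 0.0
    for w, x in zip(weights, activations):
        acc += bf16_bits_to_float(e4m3_to_bf16_bits(w)) * x
    return acc
```

For example, the bytes `0x38, 0x40, 0xC4` encode 1.0, 2.0 and -3.0:

```python
dot_fp8_weights(bytes([0x38, 0x40, 0xC4]), [1.0, 0.5, 2.0])  # -4.0
```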
danielhanchen 11 months ago
Congratulations on gemma.cpp!!