
The formats supported in the tutorial are the OCP microscaling formats, including mxfp4 and mxfp8, as well as NVIDIA’s nvfp4 format. These matrix multiplications are accelerated by fifth-generation tensor core instructions on CUDA devices with compute capability 10.
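For anyone unfamiliar with what "microscaling" means here, a rough pure-Python sketch of the idea behind mxfp4 may help: a block of 32 values shares one power-of-two scale, and each element is stored as a 4-bit E2M1 float. This is not the tutorial's code and not how the tensor cores do it. The function names are made up for illustration, rounding is simplified to nearest-value with ties going to the smaller magnitude, and it does not cover nvfp4.

```python
import math

# Magnitudes representable by an FP4 E2M1 element (the sign bit is separate).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 32  # number of elements that share one scale
E2M1_EMAX = 2    # exponent of the largest E2M1 value (6.0 = 1.5 * 2**2)


def quantize_mxfp4_block(block):
    """Hypothetical helper: return (shared_exponent, quantized_elements)."""
    assert len(block) == BLOCK_SIZE
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 0, [0.0] * BLOCK_SIZE
    # frexp gives amax = m * 2**e with 0.5 <= m < 1, so floor(log2(amax)) = e - 1.
    _, e = math.frexp(amax)
    shared_exp = (e - 1) - E2M1_EMAX
    # Clamp to the exponent range of an 8-bit power-of-two (E8M0) scale.
    shared_exp = max(-127, min(127, shared_exp))
    scale = 2.0 ** shared_exp
    quantized = []
    for x in block:
        # Saturate to the largest E2M1 magnitude, then pick the nearest value.
        mag = min(abs(x) / scale, E2M1_VALUES[-1])
        nearest = min(E2M1_VALUES, key=lambda v: abs(v - mag))
        quantized.append(math.copysign(nearest, x))
    return shared_exp, quantized


def dequantize_mxfp4_block(shared_exp, quantized):
    """Hypothetical helper: multiply each element by the shared scale."""
    scale = 2.0 ** shared_exp
    return [q * scale for q in quantized]
```

Values that already sit on the E2M1 grid after scaling round-trip exactly; everything else picks up rounding error, which is the accuracy trade-off these formats make for 4-bit storage.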

Blog post: https://developer.nvidia.com/blog/openai-triton-on-nvidia-bl...


