I believe these are SIMD. Tensor cores require the MMA family of instructions. Ask me how I know. :)

https://github.com/m4rs-mt/ILGPU/compare/master...lostmsu:IL...

Good article: https://alexarmbr.github.io/2024/08/10/How-To-Write-A-Fast-M...
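
To make the SIMD vs. MMA distinction concrete, here is a minimal sketch (not taken from either link) of what going through the MMA path looks like from CUDA C++ using the `nvcuda::wmma` API. It assumes a Volta-or-newer GPU and compiling with something like `nvcc -arch=sm_70`. The kernel name `wmma_tile` and the all-ones test data are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <mma.h>

using namespace nvcuda;

// Hypothetical kernel: one warp computes a single 16x16 tile, C = A * B,
// with half-precision inputs and float accumulation.
__global__ void wmma_tile(const half* a, const half* b, float* c) {
    // Fragments are opaque, warp-wide register tiles, not per-thread SIMD lanes.
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);
    wmma::load_matrix_sync(a_frag, a, 16);  // 16 = leading dimension
    wmma::load_matrix_sync(b_frag, b, 16);

    // The whole warp cooperates on this call; this is the step that is
    // expected to compile down to the warp-level matrix (MMA) instructions.
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);

    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}

int main() {
    half *a, *b;
    float *c;
    cudaMallocManaged(&a, 256 * sizeof(half));
    cudaMallocManaged(&b, 256 * sizeof(half));
    cudaMallocManaged(&c, 256 * sizeof(float));

    // Illustrative data: all ones, so every output is a sum of 16 ones.
    for (int i = 0; i < 256; ++i) {
        a[i] = __float2half(1.0f);
        b[i] = __float2half(1.0f);
        c[i] = 0.0f;
    }

    wmma_tile<<<1, 32>>>(a, b, c);  // exactly one warp
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Check every element against the value the all-ones inputs imply.
    bool ok = true;
    for (int i = 0; i < 256; ++i) ok = ok && (c[i] == 16.0f);
    printf("%s\n", ok ? "all elements match" : "mismatch");

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return ok ? 0 : 1;
}
```

The point of the sketch is that there is no per-element multiply in the kernel: the warp issues a single matrix operation on whole tiles. Ordinary packed or vectorized arithmetic on `half2` and similar types stays on the regular SIMD units, which is the case I think applies here.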



