Show HN: UNet diffusion model in pure CUDA (github.com/clu0)
8 points by clu0 on June 28, 2024
Hi HN!

I was inspired by Andrej Karpathy's llm.c (https://github.com/karpathy/llm.c), and wrote a full diffusion model training loop in CUDA. I learnt a lot about CUDA from Simon Boehm's Matmul blog (https://siboehm.com/articles/22/CUDA-MMM).

There is still a lot of room for optimization: the model currently runs at 45% of the speed of PyTorch with torch.compile.

I'm curious about any thoughts or CUDA tips for convolutions.



