Hacker News
Show HN: Richard – A CNN written in C++ and Vulkan (no ML or math libs) (github.com/robjinman)
184 points by rjinman on March 16, 2024 | hide | past | favorite | 24 comments
This started out as a personal effort to learn more about machine learning. It's currently a CLI app where you give it a JSON file specifying your network architecture and hyperparameters and point it to your training data, then invoke it again in 'eval' mode with some data it's not seen before and it will try to classify each sample.
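Roughly, a config might look like this (a simplified sketch rather than the exact schema — see the example configs in the repo for the real format):

```json
{
  "hyperparams": { "learnRate": 0.01, "batchSize": 32, "epochs": 10 },
  "network": [
    { "type": "convolutional", "filters": 8, "kernelSize": [3, 3] },
    { "type": "maxPooling", "regionSize": [2, 2] },
    { "type": "dense", "size": 10 }
  ]
}
```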

I don't see many other people using Vulkan for GPGPU, and there may be many good reasons for that, but I wanted to try something a bit different.

I've made every attempt to make the code very clean and readable and I've written up the math in documentation/math.pdf, so hopefully this is a useful learning resource for anyone interested in how neural nets work under the hood.

I'll be continuing to add new features over the coming months.




Interestingly, I have a similar project with C# for the higher-level pieces, and Direct3D 11 GPU API instead of Vulkan: https://github.com/Const-me/Cgml

> I don't see many other people using Vulkan for GPGPU, and there may be many good reasons for that

I think the main one of these reasons is nVidia’s contributions to the ecosystem. It’s much easier to call functions from first-party libraries like cuDNN or cuBLAS than it is to write and dispatch compute shaders to multiply those matrices yourself.

However, for use cases where hardware costs are substantial compared to software development costs, using Vulkan or D3D can be a reasonable thing to do. nVidia is too greedy: the EULA of their drivers forbids using GeForce GPUs in data centres. For people who need GPGPU on their servers, that paragraph of text sometimes makes nVidia hardware an order of magnitude more expensive than AMD hardware.


What a great project! One of my favorite things to see is when people implement as much as possible themselves. It really makes a big difference to have control over what is going on. And it is great for educational purposes of course.

I'm definitely going to be looking at this. Hope you had fun making it.


> I don't see many other people using Vulkan for GPGPU, and there may be many good reasons for that, but I wanted to try something a bit different.

A big reason is that C++ for Vulkan (SPIR-V) isn't quite there, while CUDA does C, C++20 (minus modules currently), Fortran, Haskell, Java, C#, Julia, Python JITs, and anything else that might target PTX.

SPIR was a late response after Khronos realized the world wasn't willing to be stuck with C for GPGPU, and the tooling is still catching up to PTX.


Very nice project, congratulations! Have you done any performance comparisons with tensorflow or pytorch?


[flagged]


One of the main reasons (the other being personal learning) people write these minimal, dependency-free implementations is precisely speed, so it's a fair question. If the author has other motivations that's fine, but it's very interesting to see how fast you can get once you strip out some of the overhead of the common frameworks.


The performance comparison would be interesting to me.

I'm curious to know how close we can get to these frameworks by directly using Vulkan for GPGPU.

Especially as an indicator of the feasibility of rewriting core components like cuFFT and cuDNN in Vulkan.


The comparison really, seriously, doesn't make any sense.

It's like asking if anyone has benchmarked gnuCOBOL on OS/2 against LLVM ML IR.

You might be interested in llama.cpp's Vulkan backends; that's more of an apples-to-apples comparison.


You can implement an idiomatic CNN in PyTorch or TF and compare their performance to this implementation. It's a perfectly reasonable comparison.


People make CNN’s in both. There’s classes for them, too. It’s reasonable to ask how a readable implementation compares to an optimized one.

You shouldn’t be so quick to accuse people of wrongdoing. It helps to try to understand where they’re coming from. In this case, evaluating multiple styles of implementation.


PyTorch and Tensorflow are arguably the two most practical neural network frameworks in the profession.


They both let you build and train a model of nn layers. Tensorflow via keras and pytorch via torch.nn.Module.

How is that not functionally the same?


You won't learn anything, and attempting to derive any information from it is likely to be extremely unfair to Vulkan, which defeats the stated purpose of the comparison.

PyTorch and TensorFlow have been developed and fine-tuned over many years by large teams of expert engineers. They incorporate numerous performance optimizations, including advanced memory management, kernel fusion, and auto-tuning of hyperparameters.

Richard, as a single-person educational project, simply cannot match this level of optimization. The scope and feature set of the projects are vastly different.

PyTorch and TensorFlow are comprehensive ecosystems that include not just the core computational graphs, but also high-level APIs, pre-trained models, data loading and augmentation pipelines, distributed training support, and more.

PyTorch and TensorFlow are often evaluated on large, complex models with millions of parameters, trained on massive datasets. Richard's examples, as shown in the README, use much smaller models and datasets.

I wish I could think of a good analogy off the top of my head...I've only gotten as far as testing the viability of carbon fiber tires for a highway-bound semi via someone's hobby pinewood derby car where they lovingly crafted novel carbon fiber tires from scratch. No one did anything wrong and both are achievements. It's just a category error.


I do implementations from scratch/paper myself. I know the value of this. I do my own personal research using graph neural networks recreationally. I was replying to someone who thought the two ubiquitous libraries people use for almost identical functionality were somehow different.


Not sure what you mean, maybe they weren't either.


Lots of C++ inference code out there but this does training as well - impressive.


That reminds me of a small project I did to classify drawings of cats and dogs 10 years ago with machine learning https://jeena.net/catdog but no neural networks back then, just things like Canny edge detector, Hough transform, k-nearest neighbor, etc.


Very cool. I always find doing these kinds of toy projects is a great way of dipping your toes in the deep end of a new subject.


Huh. I wanted to do the same for many years, but you have actually done it first. Cool!


Can you comment:

How well does the Vulkan API fit neural-net primitives [matmul / relu / backprop / tensor arrays]?

Also, do you think the in-browser WebGPU API, the successor to WebGL, is a good API for NNs?


The short answer to the first question is… not great.

There are no Vulkan constructs corresponding to those things. Vulkan lets you run compute shaders written in GLSL, HLSL, or any language for which there exists a compiler that outputs SPIR-V. Vulkan doesn’t have its own shader language. Deciding how to break the problem down into separate shader programs each with different workgroup sizes is a challenge.

And the end result performs very poorly compared with tensorflow. However, I haven’t yet put much effort into optimisation, so I expect there are some easy performance gains to be made.

I don’t know anything about WebGL or WebGPU.


> Vulkan lets you run compute shaders written in GLSL, HLSL, or any language for which there exists a compiler that outputs SPIR-V.

Context: I'm interested in writing GPU code as well (for ML too, but also for custom physics simulation). I also think that compute shaders are the way to go, but my intuition says that OpenCL would be simpler than Vulkan.

Question: Have you considered alternatives to Vulkan's compute shaders? (e.g. OpenCL instead of Vulkan, or other ways of running GPU code entirely, such as rust-gpu)


I liked the original post title that left me momentarily confused going "why is CNN writing articles in C++?????"


What is a CNN?


A Convolutional Neural Network, which is particularly processing-intensive (convolutions over large numbers of inputs, large matrix multiplications) but quite effective compared to other machine learning approaches.



