
Support for high-level backends almost always comes down to C++ either way. That goes for NPUs (which are really fragmented anyway, i.e. there is no uniform API), and also for JAX's TPU backend (which IIRC uses XLA).



C++ stings less when it's only the low-level machinery under the hood, as long as you, as an application author, don't have to write code in it.

I haven't done an in-depth look, but most matrix math accelerators (e.g. AMD, Intel and Apple) seem to provide C/C++/Python APIs for describing the computations; the code executing on the NPU is not compiled from user C++ code.

Apparently, e.g. in Intel's stack, there's a custom runtime compiler in the accelerator software stack that consumes this kind of IR (intermediate representation): https://docs.openvino.ai/2022.3/openvino_ir.html & https://github.com/intel/linux-npu-driver

And from the user's POV, AMD doesn't seem too different: https://ryzenai.docs.amd.com/en/latest/devflow.html
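
As a rough illustration of that flow, here's a minimal sketch of what using such a stack tends to look like from Python, assuming OpenVINO's Python API; the model path, the "NPU" device name and the input shape are placeholders for illustration, not taken from the linked docs:

    # Minimal sketch (assumptions: OpenVINO Python API, a pre-exported IR model,
    # and an "NPU" device plugin being available on the machine).
    import numpy as np
    from openvino.runtime import Core

    core = Core()

    # The IR (.xml + .bin) was produced ahead of time by the model converter;
    # no user C++ is involved in describing the computation.
    model = core.read_model("model.xml")  # placeholder path

    # The vendor driver/runtime compiles the IR for the accelerator here.
    compiled = core.compile_model(model, device_name="NPU")

    # Run inference on a dummy input (placeholder shape for illustration).
    input_tensor = np.zeros((1, 3, 224, 224), dtype=np.float32)
    result = compiled([input_tensor])
    print(result)

All the user-visible code is Python; the compilation to NPU instructions happens entirely inside the vendor's runtime.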


Yes, but this is not what I am getting at.

"Will be interesting to see if other high level accelerator supporting languages like Chapel or Futhark or JAX end up getting NPU backends, it might give them a nice boost over the proprietary C++ inspired language."

As you say, the GPU (or NPU, TPU, ...) doesn't run C++ or anything derived from it. The "runtime" (~backend) will usually emit some kind of hardware-dependent format (or again, an IR) like SPIR-V, PTX, etc.

But the backend itself is usually written in C++ (for performance reasons), and there is really no way to get around that. Interacting with it from Python (or JAX) is a usability win, but there is zero difference in functionality. I.e. there is no proprietary C++-inspired language in play here, hence no way to get a boost.
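
To make that division of labour concrete, a minimal JAX sketch (the function and arrays are made up for illustration, and the lowering API is from recent JAX versions): jax.jit traces the Python function into an IR, and the C++ XLA backend then compiles that IR for whatever device is available.

    import jax
    import jax.numpy as jnp

    # A made-up numeric function for illustration.
    def f(x):
        return jnp.sin(x) @ jnp.cos(x).T

    x = jnp.ones((4, 4))

    # Tracing/lowering happens in Python; the result is IR text, not C++ source.
    lowered = jax.jit(f).lower(x)
    print(lowered.as_text())   # StableHLO/HLO for the traced computation

    # Compiling that IR to device code is done by the XLA backend (written in C++).
    compiled = lowered.compile()
    print(compiled(x))

The Python layer only builds and hands over the IR; everything performance-critical lives in the C++ backend either way.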


Right. I was focusing more on the CUDA-kernels-on-NPU line of thought from patrikthebold's message and its alternatives, which, as you say, is not a reality now either.

In the JAX-style implementation scenario, the compiler part of JAX is the better inspiration, maybe along the lines of this case study of a path tracer running on a TPU: https://blog.evjang.com/2019/11/jaxpt.html - I don't think Chapel or Futhark would adopt the same approach as such, but it's at least some kind of existence proof of a compiler targeting the hardware from a high-level language for non-machine-learning code.
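
For flavor, a toy sketch in the same spirit (illustrative only, not code from the linked post; the ray/sphere setup is made up): ordinary non-ML numeric code written against jax.numpy, which jax.jit will compile for whatever backend (CPU, GPU, TPU) is present.

    import jax
    import jax.numpy as jnp

    # Toy non-ML kernel: intersect a batch of rays with a unit sphere at the origin.
    @jax.jit
    def hit_sphere(origins, directions):
        # Solve |o + t*d|^2 = 1 for t via the quadratic formula
        # (assumes normalized direction vectors).
        b = 2.0 * jnp.sum(origins * directions, axis=-1)
        c = jnp.sum(origins * origins, axis=-1) - 1.0
        disc = b * b - 4.0 * c
        t = (-b - jnp.sqrt(jnp.maximum(disc, 0.0))) / 2.0
        return jnp.where(disc >= 0.0, t, jnp.inf)

    origins = jnp.tile(jnp.array([0.0, 0.0, -3.0]), (8, 1))
    directions = jnp.tile(jnp.array([0.0, 0.0, 1.0]), (8, 1))
    print(hit_sphere(origins, directions))  # ~2.0 for rays that hit the sphere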



