> Looks like this is more about adapting existing CUDA code to work with AMD. So it makes sense as the path of least resistance, I suppose.
It isn't about that. Vulkan isn't flexible enough to allow it. It would need true pointer support first, for example. GLSL/HLSL is also a much worse programming language than modern C++.
As described by Brecht Van Lommel (Blender and Cycles developer):
> Vulkan has limitations in how you can write kernels, in practice you can’t currently use pointers for example. But also, GPU vendors will recommend certain platforms for writing production renderers, provide support around that, and various renderers will use it. Choosing a different platform means you will hit more bugs and limitations, have slower or no access to certain features, are not likely to see library support like OSL, etc.
> Our strategy for Cycles X is to rely on the GPU vendors to support us and provide APIs that meet our requirements. We want to support as many GPUs as possible, but not at any cost.
This doesn't sound right to me. Vulkan 1.2 has support for pointers through an extension[1], and that's getting more widely available (certainly a lot more cards than can run ROCm). There's also support coming down the pike for buffer device address[2], which basically lets you use pointers to refer to resources when submitting work to the GPU instead of having to create bindings.
If an open source project of this scope came to me for advice, I would absolutely 100% recommend they base their stuff on Vulkan. It's getting very good quite quickly; there's basically nothing holding you back any more if you're willing to use the latest extensions. There is a tendency in open source to move more slowly though, and I understand that.
I agree with you that GLSL/HLSL is a bad programming language, but you can still get your work done in it. I'm personally excited about rust-gpu, but that is not ready yet.
> Pointers are an abstract type - they have no known compile-time size. Using a pointer type as the argument to the sizeof() operator will result in an undefined value.
> Pointer-to-integer casts must not be used.
> Integer-to-pointer casts must not be used.
> Pointers must not be compared for equality or inequality
The first one in that list is a quite big limitation, affecting a significant number of scenarios…
(And yes, you cannot actually make a compliant OpenCL 1.2 implementation on top of Vulkan; Vulkan has too few features for that purpose.)
Thanks for the detailed explanation, that makes sense and is useful information. My understanding was that they were more like real pointers, but I see that is not the case.
You'd recommend Vulkan Compute over "CUDA" though for a path tracer? (I don't assume Cycles has any hybrid rendering in it, so there isn't much value in the traditional pipeline)
So it depends a lot on the goals. CUDA is a very good developer experience, and Vulkan compute shaders are (at the moment) a very bad one. But if the goal is to ship real compute on a wide variety of devices, Vulkan is pretty close to the only game in town.
I also expect the language support and tooling to get a lot better. The fact that it's based on SPIR-V means it doesn't lock you into any one language, it's possible to develop good tools that run on top of it. That's happening to a large extent in the machine learning space with IREE, much less so here though.
I should also say, I'm not disagreeing with the strategy Blender has chosen, specifically getting vendors to support tuned ports to their hardware. I'm just saying that these specific criticisms of Vulkan (like lacking pointers) aren't really valid.
If I have learned anything from Khronos APIs, it is that if the Internet were a Khronos standard instead of an IETF one, we would only have IP as the standard, while everyone else would have to come up with TCP, UDP, HTTP, ... as extensions.
Language support and tooling are never better than the alternatives.
Blender decided to support just the platforms which have a C++ target device language.
This encompasses CUDA, Metal[1] (one of the reasons why it’s much more usable than Vulkan), ROCm HIP, and oneAPI[2].
[1] Metal’s Shading Language is C++14 with a handful of limitations, the biggest one is no lambdas
[2] Vulkan uses a restricted SPIR-V dialect without pointers notably. OpenCL and oneAPI use a separate one which _does_ support them. However, AMD[3] and NVIDIA do not implement SPIR-V in their OpenCL drivers.
Migrating back down to GLSL/HLSL from that makes absolutely no sense. The options above, using C++ as a device language, allow much more code sharing with a CPU backend.
Portability is also a good story, the adaptation is mostly in the glue layer, if your GPU vendor is _not_ AMD and has a proper software stack.
Indeed, and a reason why Khronos only adopted SPIR-V and is now slowly embracing C++.
They took a hard beating from proprietary APIs that have long moved away from the "C is the best" approach, and now they are playing catch-up with the rest of the world, which is cozy in its mature C++ tooling for GPGPU.
However, MS failed to capitalise on it and bailed out (too early). It has been limping along, effectively dead, for years, and was finally acknowledged as deprecated in VS2022.
>My super 3D graphics, then, runs only on /dev/crt1, and X windows runs only on /dev/crt0. Of course, this means I cannot move my mouse over to the 3d graphics display, but as the HP technical support person said “Why would you ever need to point to something that you’ve drawn in 3D?”
>Of course, HP claims X has a mode which allows you to run X in the overlay planes and “see through” to the graphics planes underneath. But of course, after 3 months of calls to HP technical support, we agreed that that doesn’t actually work with my particular hardware configuration. You see, I have the top-of-the-line Turbo SRX model (not one, but two on a single workstation!), and they’ve only tested it on the simpler, less advanced configurations. When you’ve got a hip, forward-thinking software innovator like Hewlett-Packard, they think running X windows release 2 is pretty advanced.
Except in this case the question makes sense: you don't have malloc on the GPU (it only allocates from a fixed array), and without dynamic memory allocation you could just use indices into an array any time you would use a pointer on the CPU. This also makes sense if you transfer your data structure between GPU and CPU, since you don't need to translate pointers.
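To make the index idea concrete, here is a minimal sketch in plain C++ (not GPU code). The `Node` layout and the helper names are made up for illustration. Children are stored as indices into the same array, so a byte-for-byte copy of the array stays valid at any address, which is the property you want when moving data between CPU and GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical node type: children are referenced by index into the same
// array instead of by pointer, so the array can be copied to another
// address space without fixing up any addresses.
struct Node {
    float value;
    std::int32_t left;   // index of the left child, or -1 if none
    std::int32_t right;  // index of the right child, or -1 if none
};

// Sum all values in the tree rooted at `root`, using an explicit stack
// rather than recursion.
float sum_tree(const std::vector<Node>& nodes, std::int32_t root) {
    float total = 0.0f;
    std::vector<std::int32_t> stack;
    if (root >= 0) stack.push_back(root);
    while (!stack.empty()) {
        const std::int32_t i = stack.back();
        stack.pop_back();
        assert(static_cast<std::size_t>(i) < nodes.size());  // bounds check
        const Node& n = nodes[static_cast<std::size_t>(i)];
        total += n.value;
        if (n.left >= 0) stack.push_back(n.left);
        if (n.right >= 0) stack.push_back(n.right);
    }
    return total;
}

// Build a three-node tree, copy the array, and traverse the copy.
// The std::vector copy only stands in for a host-to-device transfer.
float example_sum_after_copy() {
    const std::vector<Node> nodes = {
        {1.0f, 1, 2},    // root, children at indices 1 and 2
        {2.0f, -1, -1},  // leaf
        {4.0f, -1, -1},  // leaf
    };
    const std::vector<Node> copy = nodes;
    return sum_tree(copy, 0);
}
```

With real pointers in `Node`, the copy would still point into the original array. The trade-off is that every access needs the base array in scope, which is part of what the pointer complaints in this thread are about.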
You share the same pointers between CPU and GPU. Pointers in structures for example are one of the use cases.
As an even more unusual option, see cudaHostRegister (on CUDA), which makes a CPU buffer that was not allocated through CUDA accessible from the GPU without a copy, at the same address[1]. This isn’t especially fast, because accesses have to go through the PCIe bus (on Tegra, this mechanism is even more attractive), but it’s very useful.
Your address space is unified between CPU and GPU, with the same pointers used between both.
[1] Host allocations registered through that mechanism get the same address on both sides only on recent GPUs. Allocations made through CUDA still share the same address in both worlds on GPUs where this is unsupported.
So the hip, forward-thinking HP technical support person was right after all, decades ago! Makes me wonder why they're not still in business. Such nice hardware! Too bad about the software.
Among other things: C++ on the GPU, with lots of common code between the two worlds.
It’s one of the core reasons why CUDA took off. You could adapt your code instead of a full rewrite, even mixing code within the same file. Types are checked at compile time too.
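A rough sketch of what that sharing tends to look like in practice. The `HD` macro name and the small `Vec3`/`Ray` helpers are made up for illustration. Under nvcc (which defines `__CUDACC__`) the functions are compiled for both host and device; under a plain C++ compiler the macro expands to nothing and the same file builds for the CPU backend.

```cpp
#include <cassert>

// HD expands to CUDA's function qualifiers only when compiled by nvcc;
// a regular C++ compiler sees an empty macro. The name is hypothetical.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

struct Vec3 {
    float x, y, z;
};

HD inline Vec3 operator+(Vec3 a, Vec3 b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z};
}

HD inline Vec3 operator*(Vec3 a, float s) {
    return {a.x * s, a.y * s, a.z * s};
}

HD inline float dot(Vec3 a, Vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

struct Ray {
    Vec3 origin;
    Vec3 direction;
};

// Point at parameter t along the ray; usable from CPU code and, when built
// with nvcc, from kernels as well.
HD inline Vec3 point_at(const Ray& r, float t) {
    return r.origin + r.direction * t;
}
```

The math lives in one place and is type-checked once, and only the launch and memory-management glue differs per backend.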
It would be nice if I could use function pointers in HLSL. It would let me pass pointers to distance functions around. Doing anything complicated with signed distance fields is a bit harebrained without them.
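For anyone wondering what that would look like, here is a small sketch in plain C++ rather than HLSL. All names (`Point`, `DistanceFn`, `march`, and the two example fields) are made up for illustration. The marcher takes the distance function as a pointer, so the same loop works for any field.

```cpp
#include <cassert>
#include <cmath>

struct Point {
    float x, y, z;
};

// A signed distance function maps a point to its distance from a surface.
using DistanceFn = float (*)(Point);

// Unit sphere centred at the origin.
float sphere_sdf(Point p) {
    return std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z) - 1.0f;
}

// Horizontal plane at y = -1.
float plane_sdf(Point p) {
    return p.y + 1.0f;
}

// Sphere tracing: step along the ray by the distance the field reports.
// Returns the ray parameter of the hit, or -1 if nothing was hit.
// `dir` is assumed to be normalised.
float march(DistanceFn sdf, Point origin, Point dir,
            int max_steps = 64, float eps = 1e-4f, float max_dist = 100.0f) {
    float t = 0.0f;
    for (int i = 0; i < max_steps; ++i) {
        const Point p{origin.x + dir.x * t,
                      origin.y + dir.y * t,
                      origin.z + dir.z * t};
        const float d = sdf(p);
        if (d < eps) return t;          // close enough to the surface
        t += d;
        if (t > max_dist) return -1.0f; // ray left the scene
    }
    return -1.0f;
}
```

Calling `march(sphere_sdf, ...)` or `march(plane_sdf, ...)` reuses the same loop. Without function pointers, the usual workaround is a switch on an ID or one copy of the marcher per field.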