> Build userspace NVMe drivers and storage applications with CUDA support
> libnvm can be linked with CUDA programs, enabling high-performance storage access directly from your CUDA kernels. This is achieved by placing IO queues and data buffers directly in GPU memory, eliminating the need to involve the CPU in the IO path entirely.
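To make the "queues in GPU memory" part concrete, here's a rough CUDA-style sketch of the idea. It is not libnvm's actual API: the struct layout, the names (NvmeCmd, sq, sq_db), and the assumption that the controller's doorbell register has already been mapped into the GPU's address space on the host are all illustrative.

```cuda
// Sketch only: an NVMe submission queue lives in GPU memory, and a device
// function fills in a (simplified) 64-byte submission entry and rings the
// BAR-mapped doorbell. How the doorbell register gets mapped for device
// access is assumed to be handled elsewhere (e.g. vendor P2P machinery).
#include <cstdint>

struct NvmeCmd {               // simplified NVMe submission queue entry
    uint32_t cdw0;             // opcode (bits 7:0) and command id (bits 31:16)
    uint32_t nsid;             // namespace id
    uint64_t rsvd;
    uint64_t mptr;             // metadata pointer (unused here)
    uint64_t prp1;             // physical region page 1: destination buffer
    uint64_t prp2;
    uint32_t cdw10;            // starting LBA (low)
    uint32_t cdw11;            // starting LBA (high)
    uint32_t cdw12;            // number of logical blocks - 1
    uint32_t cdw13, cdw14, cdw15;
};

__device__ void submit_read(NvmeCmd* sq,              // queue in GPU memory
                            volatile uint32_t* sq_db, // mapped doorbell register
                            uint32_t tail, uint32_t qsize,
                            uint64_t lba, uint16_t nblocks,
                            uint64_t prp_of_gpu_buffer)
{
    NvmeCmd cmd = {};
    cmd.cdw0  = 0x02u | (tail << 16);  // opcode 0x02 = NVM Read; reuse tail as cid
    cmd.nsid  = 1;
    cmd.prp1  = prp_of_gpu_buffer;     // data lands straight in GPU memory
    cmd.cdw10 = (uint32_t)lba;
    cmd.cdw11 = (uint32_t)(lba >> 32);
    cmd.cdw12 = nblocks - 1u;          // 0-based block count

    sq[tail] = cmd;                    // write the entry into the GPU-resident queue
    __threadfence_system();            // ensure the entry is visible before the doorbell
    *sq_db = (tail + 1) % qsize;       // ring the submission queue doorbell
}
```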
No. DirectStorage is mostly about addressing the major problems that make disk IO so much slower on Windows than Linux: cloning io_uring and bypassing some of the overhead in the filesystem stack. Optionally, DirectStorage can enable offloading decompression to the GPU, and I think it also optionally supports using P2P DMA to transfer data directly from the SSD to VRAM. Having IO commands actually originate from the GPU is way out of scope, because you can't safely let a userspace application directly construct and submit commands to a drive that holds a filesystem managed by the OS.
Just curious, are there exploits involving the GPU in the wild? It seems there is a lot of surface area for attack when emitting command buffers to the GPU. Not that I know how to get at system critical resources via the GPU, but there are some clever bastards out there.
All modern GPUs (or at least both I've worked on professionally) have MMUs and some context privilege level in hardware (i.e. so contexts can't change the MMU state themselves to escape). If that is busted, all bets are off, just as they are if the CPU has issues in the same area.
The security model of the GPU itself doesn't really allow any level of security within a context - if you can write shader code (e.g. massage the glsl/hlsl into doing what you want, or just upload a binary shader of your own), you can already read anything mapped into that context. I think there are some systems that add stricter validation onto those shaders to disallow things like out-of-bounds reads, and bugs there can cause issues if any security relies on that - e.g. browsers sourcing shaders from webgl, or anything where untrusted sources share the same context as potentially private info. But that validation is built on top of the GPU driver's model and presents a restricted model to its clients. This would just be another thing they disallow.
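To make that concrete, here's a toy sketch of the same point in CUDA rather than a graphics shader (all names here are made up): within one context any mapped memory is fair game, and the only defence a validating layer can add is clamping or rejecting the access before it reaches the hardware.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void unchecked_read(const float* buf, size_t index, float* out) {
    // Within one context, nothing distinguishes "your" buffer from any other
    // mapped allocation: this read works for any address the context can see.
    *out = buf[index];
}

__device__ float clamped_read(const float* buf, size_t n, size_t i) {
    // What a validating layer (WebGL-style robust access) effectively inserts:
    // clamp or zero out-of-bounds indices before they reach the hardware.
    return (i < n) ? buf[i] : 0.0f;
}

int main() {
    float *buf, *out;
    cudaMalloc(&buf, 16 * sizeof(float));
    cudaMalloc(&out, sizeof(float));
    unchecked_read<<<1, 1>>>(buf, 3, out);  // in-bounds here, but the kernel
    cudaDeviceSynchronize();                // itself would not stop index 10000
    return 0;
}
```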
This "Driver in the GPU" will just be code ran in that context already, so I don't see it as increasing the attack surface. It may be an issue if people start wanting finer grain permissions within a context, however.
My understanding is that with current-generation GPUs a crash/timeout can happen. Again, you can do pretty much anything with shaders, so invalid accesses, infinite loops, and other things that halt execution are perfectly possible. It doesn't mean much in the way of security.
But they should be detected and that context killed (or reset in a way that may lose the current job's state) - all other contexts should keep their state. Task switching is often a lot cruder than on CPUs, though, so during that timeout other tasks may be blocked - other renders (including the window system) will be wedged until it's reset. This means a bad actor with access to the GPU can try a denial-of-service-style attack, throwing constant bad tasks at the GPU and causing it to wedge repeatedly, which can certainly be annoying.
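On the compute side you can see this behaviour from the host pretty easily. A hedged CUDA sketch (it assumes the GPU is also driving a display so the driver's watchdog is armed; on a headless or compute-only GPU the kernel simply hangs):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void spin_forever(volatile int* flag) {
    // Intentionally hostile workload: the flag is never set, so this never
    // returns - exactly the kind of thing the timeout/reset mechanism catches.
    while (*flag == 0) { }
}

int main() {
    int* flag;
    cudaMalloc(&flag, sizeof(int));
    cudaMemset(flag, 0, sizeof(int));

    spin_forever<<<1, 1>>>(flag);
    cudaError_t err = cudaDeviceSynchronize();  // blocks until the watchdog fires

    if (err != cudaSuccess) {
        // Typically cudaErrorLaunchTimeout: this context is now poisoned and
        // every further call in this process fails, but other processes'
        // contexts (and the rest of the system) keep running.
        printf("context lost: %s\n", cudaGetErrorString(err));
        cudaDeviceReset();
    }
    return 0;
}
```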
I'm not well-versed in GPU architecture. Can someone help out with a quick gloss? I read that something is being pushed down from CPU/host territory into GPU territory, but I'm not quite clear what the deltas are.
Typically the application wants to draw stuff on the GPU. To do this it calls into the GPU driver, which builds a big memory buffer in some implementation-defined format that instructs the GPU what to do, and then the driver instructs the GPU to execute that.
However, applications these days want to do some of the work to determine which commands to do on the GPU (e.g. culling).
So the work here is to let the application build a command buffer on the GPU in a standardized format, and have the driver convert that into the implementation-defined format on the GPU itself. This pushes a lot of the driver's complexity from the CPU into code running on the GPU.
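As a rough sketch of the kind of work being pushed onto the GPU - written here as a CUDA kernel for brevity rather than a compute shader, with made-up types (Object, visible) and a DrawCommand layout modelled on the indirect-draw structs graphics APIs use - the GPU culls objects and compacts the survivors into a command list that an indirect draw (or a device-generated command path) consumes directly:

```cuda
#include <cstdint>

struct DrawCommand {            // consumed later by an indirect draw call
    uint32_t vertexCount;
    uint32_t instanceCount;
    uint32_t firstVertex;
    uint32_t firstInstance;
};

struct Object {                 // hypothetical per-object data for culling
    float center[3];
    float radius;
    uint32_t vertexCount;
    uint32_t firstVertex;
};

__device__ bool visible(const Object& o, const float planes[6][4]) {
    // Sphere-vs-frustum test: reject if the object is entirely behind any plane.
    for (int p = 0; p < 6; ++p) {
        float d = planes[p][0] * o.center[0] + planes[p][1] * o.center[1] +
                  planes[p][2] * o.center[2] + planes[p][3];
        if (d < -o.radius) return false;
    }
    return true;
}

__global__ void cull_and_emit(const Object* objects, uint32_t numObjects,
                              const float (*frustumPlanes)[4],  // 6 planes, xyzw
                              DrawCommand* outCommands, uint32_t* outCount) {
    uint32_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numObjects) return;

    const Object& o = objects[i];
    if (!visible(o, frustumPlanes)) return;

    // Compact surviving objects into a tightly packed command list that the
    // GPU itself (or the driver's GPU-side code) can consume without the CPU
    // ever seeing which objects passed the cull.
    uint32_t slot = atomicAdd(outCount, 1u);
    outCommands[slot] = DrawCommand{ o.vertexCount, 1u, o.firstVertex, 0u };
}
```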