
> I think newer GPUs have dedicated DMA engines for copying to GPU memory that isn't mapped into the host address space but I'm not really familiar with these drivers.

Not even newer: it's been a pretty common feature of GPUs for a couple of decades now.



Does anyone still rely on CPU writes with WC? My impression is that it's kind of obsolete these days.


It's also heavily used in non-OSS drivers.

DMA blocks can only do so much when the (often older, but still widely used) APIs don't map well to the required synchronization. They may allow the user to immediately free and/or reuse the buffer used to pass in data (which likely requires a synchronous CPU copy to a staging area before the API function returns), or they may require direct memory mapping of resources. Routing either of those through the cache is often wasteful: it's unlikely any line will be re-used before it's flushed anyway, then handed over to the GPU's DMA block to do whatever asynchronously.
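A minimal sketch of that synchronous staging copy (all names here are hypothetical, not any real driver's API; a real driver would suballocate the staging slot from WC-mapped, DMA-visible memory rather than use plain malloc):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical staging slot; a real driver would carve this out of
 * a pool of WC-mapped memory visible to the GPU's DMA engine. */
struct staging {
    void  *ptr;
    size_t size;
};

/* Sketch of an upload call whose API contract lets the caller free or
 * reuse `data` as soon as we return: we must copy it out synchronously,
 * then the GPU can DMA from the staging copy asynchronously. */
static struct staging *upload_buffer(const void *data, size_t size)
{
    struct staging *s = malloc(sizeof(*s));
    if (!s) return NULL;
    s->ptr = malloc(size);      /* stand-in for a WC staging allocation */
    if (!s->ptr) { free(s); return NULL; }
    s->size = size;
    memcpy(s->ptr, data, size); /* synchronous CPU copy */
    /* ...here the real driver would enqueue an async GPU DMA from
     * s->ptr; the caller's buffer is no longer needed after return. */
    return s;
}
```

The point of the sketch is the ordering: the copy happens before the function returns, so the caller's buffer is immediately reusable, and everything after it is asynchronous.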

And there are also non-device-driver use cases: I've seen image processing libraries intentionally skip the cache when they know they won't be touching the data again for some time and the data set itself is large enough. I assume other users exist; I just see these because I work on GPUs, and images are a big source of large data sets.

WC allows these to avoid clobbering the cache while still having at least some chance of using the memory/PCIe bus efficiently.
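One common way user-space code gets this effect is with non-temporal (streaming) stores, which write-combine instead of allocating cache lines for the destination. A minimal sketch, assuming x86-64 with SSE2, 16-byte-aligned buffers, and a size that is a multiple of 16:

```c
#include <emmintrin.h>  /* SSE2: _mm_load_si128, _mm_stream_si128 */
#include <stddef.h>

/* Copy `size` bytes using non-temporal stores: the destination bypasses
 * the cache and the stores are write-combined into full-line bursts.
 * Assumes 16-byte-aligned dst/src and size % 16 == 0. */
static void copy_nontemporal(void *dst, const void *src, size_t size)
{
    __m128i       *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;

    for (size_t i = 0; i < size / 16; i++) {
        __m128i v = _mm_load_si128(&s[i]); /* normal cached load */
        _mm_stream_si128(&d[i], v);        /* non-temporal store */
    }
    _mm_sfence(); /* make the streamed stores globally visible */
}
```

This is the image-processing trick mentioned above: for a large buffer you won't read back soon, streaming stores avoid evicting useful cache lines while still filling write-combining buffers well enough to use the bus effectively.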


Yes, it's still used in the Intel 3D drivers in Mesa (at least).



