Google Posts Experimental Linux Code for “Device Memory TCP” (phoronix.com)
8 points by mfiguiere 11 months ago | 4 comments



As a "how HW works" noob, how is this different from DMA?


If I understand this correctly, it's RDMA for GPU/accelerator memory.


1) If you read the Nvidia GPUDirect RDMA page closely (https://docs.nvidia.com/cuda/gpudirect-rdma/index.html), you can find this paragraph:

"Traditionally, resources like BAR windows are mapped to user or kernel address space using the CPU’s MMU as memory mapped I/O (MMIO) addresses. However, because current operating systems don’t have sufficient mechanisms for exchanging MMIO regions between drivers, the NVIDIA kernel driver exports functions to perform the necessary address translations and mappings."

This change addresses that problem in an accelerator-agnostic way.

2) GPUDirect RDMA has the limitation that the two devices must share the same PCI Express root complex.

This change eliminates that limitation: the data transfer happens at the lowest level of the PCIe tree, so you can put smart NICs and GPU/TPU/accelerators behind a PCIe switch and avoid saturating the PCIe root complex's bandwidth.

(From https://lore.kernel.org/dri-devel/20230710223304.1174642-1-a... "Advantages")


Nifty. Thanks for the deep info - makes a lot of sense.



