"Traditionally, resources like BAR windows are mapped to user or kernel address space using the CPU’s MMU as memory mapped I/O (MMIO) addresses. However, because current operating systems don’t have sufficient mechanisms for exchanging MMIO regions between drivers, the NVIDIA kernel driver exports functions to perform the necessary address translations and mappings."
This change addresses this problem in an accelerator-agnostic way.
2) GPUDirect RDMA has the limitation that two devices must share the same PCI Express root complex.
The change eliminates this limitation: the data transfer takes place at the lowest level of the PCIe tree, so you can isolate SmartNICs and GPUs, TPUs, or other accelerators behind a PCIe switch without saturating the bandwidth of the PCIe root complex.
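To illustrate what "lowest level of the PCIe tree" means for a given pair of devices, here is a minimal Python sketch that finds the deepest upstream component two devices share, based on their Linux sysfs device paths. It is not part of this change. The helper names and the example paths are hypothetical, and the sketch looks only at topology: it does not say whether the platform actually routes peer-to-peer traffic at that point.

```python
from pathlib import PurePosixPath
from typing import List, Optional


def upstream_chain(sysfs_path: str) -> List[str]:
    """Return the path components from the host bridge down to the device."""
    parts = PurePosixPath(sysfs_path).parts
    # The host bridge entry in sysfs is named "pciDDDD:BB"; keep it and
    # everything below it, dropping the leading "/sys/devices" part.
    for i, part in enumerate(parts):
        if part.startswith("pci"):
            return list(parts[i:])
    raise ValueError(f"no PCI host bridge in {sysfs_path!r}")


def lowest_common_upstream(path_a: str, path_b: str) -> Optional[str]:
    """Return the deepest component shared by both devices, or None."""
    common = None
    for a, b in zip(upstream_chain(path_a), upstream_chain(path_b)):
        if a != b:
            break
        common = a
    return common


# Hypothetical topology: a NIC and a GPU behind the same PCIe switch,
# whose upstream port is 0000:01:00.0.
NIC = "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:00.0/0000:03:00.0"
GPU = "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:04:00.0"

if __name__ == "__main__":
    print(lowest_common_upstream(NIC, GPU))
```

If the result is a bridge or switch port rather than the host bridge entry (a name starting with `pci`), the two devices sit under a common switch, which is the layout described above. A result of `None` means the devices hang off different host bridges.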