The moving-data-around cost is conventional wisdom in GP-GPU circles.

Is it changing, though? Not only does PCIe bandwidth keep doubling with each generation, but CPU-GPU memory coherence is a thing now.
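
For a rough sense of scale, here is a back-of-the-envelope sketch. The bandwidth figures are rounded theoretical per-direction numbers for an x16 link (my assumption, not measurements), and `transfer_ms` is just an illustrative helper that ignores latency, protocol overhead, and pinned-vs-pageable memory effects:

```python
# Rounded theoretical per-direction bandwidth of an x16 link, in GB/s.
# These are assumed round numbers; real-world throughput is lower.
PCIE_X16_GBPS = {"3.0": 16, "4.0": 32, "5.0": 64}

def transfer_ms(size_gb, bandwidth_gbps):
    """Idealized time in ms to move size_gb over a link (no latency, no overhead)."""
    return size_gb / bandwidth_gbps * 1000.0

# Time to copy 1 GB host-to-device at each generation's ideal rate.
for gen, bw in PCIE_X16_GBPS.items():
    print(f"PCIe {gen} x16: {transfer_ms(1.0, bw):.1f} ms per GB")
```

Each generation halves the idealized copy time, but even the best case is still tens of milliseconds per GB, so whether the cost matters depends on how much compute you do per byte moved.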

I guess it depends on your target: 8x H100s across a PCIe bridge are going to have quite different costs than an APU (and APUs have gotten quite powerful, not even mentioning the MI300A).