The moving-data-around cost is conventional wisdom in GP-GPU circles.
Is it changing though? Not only do PCIe interfaces keep doubling in performance, but CPU-GPU memory coherence is a thing.
I guess it depends on your target: 8x H100s across a PCIe bridge is going to have quite different costs vs an APU (which have gotten to be quite powerful, not even mentioning MI300a)
Is it changing though? Not only do PCIe interfaces keep doubling in performance, but CPU-GPU memory coherence is a thing.
I guess it depends on your target: 8x H100s across a PCIe bridge is going to have quite different costs vs an APU (which have gotten to be quite powerful, not even mentioning MI300a)