If you're running simulations, or continuously working through big matrices on CPUs, you can saturate the memory controller pretty easily. If you know what you're doing, your FPU or vector units are saturated at the same time, so the "whole system" becomes the bottleneck while it tries to keep itself cool.
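To make the memory-controller point concrete, here is a minimal sketch of a STREAM-style "triad" loop in C. It is only an illustration under my own assumptions: the function names `triad` and `triad_bandwidth_gbs` are made up for this example, the timing uses the coarse standard `clock()`, and the number it reports is a rough estimate rather than a proper benchmark.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* STREAM-style triad: a[i] = b[i] + s * c[i].
   Two loads and one store per element with almost no arithmetic,
   so on large arrays the loop is limited by memory bandwidth. */
static void triad(double *a, const double *b, const double *c,
                  double s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + s * c[i];
}

/* Hypothetical helper: runs `reps` triad passes over arrays of n doubles
   and returns a rough effective bandwidth in GB/s.
   Returns -1.0 if allocation fails or the run is too short to time. */
static double triad_bandwidth_gbs(size_t n, int reps)
{
    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    if (!a || !b || !c) {
        free(a); free(b); free(c);
        return -1.0;
    }
    for (size_t i = 0; i < n; i++) {
        b[i] = 1.0;
        c[i] = 2.0;
    }

    clock_t t0 = clock();
    for (int r = 0; r < reps; r++)
        triad(a, b, c, 1.0 + r, n);   /* vary s so each pass does real work */
    clock_t t1 = clock();

    /* Read one result through a volatile so the stores are not discarded. */
    volatile double sink = a[n / 2];
    (void)sink;

    double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
    /* Counts 2 reads + 1 write per element; write-allocate traffic is ignored. */
    double bytes = 3.0 * (double)n * sizeof(double) * reps;

    free(a); free(b); free(c);
    return secs > 0.0 ? bytes / secs / 1e9 : -1.0;
}
```

Calling something like `triad_bandwidth_gbs((size_t)1 << 25, 10)` from `main` (three arrays of about 256 MiB each, well beyond cache) and running one copy per core is usually enough to see the per-core figure drop as the memory controller saturates.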
Games move that kind of data at the beginning and don't stream much new data after the initial textures and models are loaded. If you are working on HPC with GPUs, you may need to constantly stream new data in to the GPU while streaming the results out. This is why datacenter/compute GPUs have multiple independent DMA engines.