GPUs generally don't come close to saturating x8 3.0 lanes, unless you have a very specific workload (like the new 3dmark bandwidth benchmark AMD used to demo PCIe 4.0).
Games don't do nearly enough asset streaming to use a lot of bandwidth, since the amount of assets used at the same time is limited by VRAM size, and most stuff is kept around for quite some time. Offline 3D renderers like Blender Cycles IIRC just upload the whole scene at once and then path tracing happens in VRAM without much I/O. For buttcoin mining, people literally use boards with tons of x1 slots + risers. No idea how neural nets behave, but would make sense that they also just keep updating the weights in VRAM.