I think the whole system price basically ends up as a wash, with TR4 motherboards costing at least ~$300 and needing a ~$100 cooler; I'm assuming these consumer chips will continue to come with bundled coolers. The 3950X also draws 75W less than the 1950X, so you can probably save a few bucks on the power supply, and of course on your electric bill over time.
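For a rough sense of scale, here's a back-of-envelope sketch of the electricity savings (the load hours and $0.12/kWh rate are hypothetical, and TDP deltas don't map exactly to wall draw):

```python
# Rough yearly savings from the ~75W draw difference.
# Assumptions (hypothetical): the full delta shows up at the wall,
# the box runs loaded 8 hours/day, electricity costs $0.12/kWh.
delta_watts = 75
hours_per_day = 8
price_per_kwh = 0.12

kwh_per_year = delta_watts / 1000 * hours_per_day * 365
print(f"{kwh_per_year:.0f} kWh/year -> ${kwh_per_year * price_per_kwh:.2f}/year")
# -> 219 kWh/year -> $26.28/year
```

Not huge, but it compounds with the cheaper cooler and PSU.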
The performance comparison will be interesting though. The 3950X should be quite a bit faster than the 1950X when it's not bottlenecked by memory bandwidth, but of course the 1950X still has twice the memory channels, slightly offset by the Zen 2 memory controller supporting higher-frequency RAM. So which one is better will depend heavily on workload. I suspect that for a developer workstation the 3950X would be the better performer; most compilation workloads are not very sensitive to bandwidth.
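To put rough numbers on that trade-off: theoretical peak bandwidth is channels × MT/s × 8 bytes per transfer. The specific RAM speeds below are hypothetical picks:

```python
# Theoretical peak memory bandwidth: channels * MT/s * 8 bytes per transfer.
def peak_gb_s(channels, mt_per_s):
    return channels * mt_per_s * 8 / 1000

# Hypothetical configs: fast dual-channel RAM on Zen 2 vs the 1950X
# running at its stock-supported speed.
print(f"3950X, 2ch DDR4-3600: {peak_gb_s(2, 3600):.1f} GB/s")  # 57.6
print(f"1950X, 4ch DDR4-2666: {peak_gb_s(4, 2666):.1f} GB/s")  # 85.3
```

So even with much faster RAM, quad channel keeps roughly a 50% peak-bandwidth edge; the question is just how many workloads actually hit it.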
Yeah, the platform cost is higher. I put together a build you could do for ~$1,500 now; it was ~$1,600 when I made it. The biggest feature I'm interested in is the availability of PCIe lanes: I want them for adding a 10G NIC later, and eventually two GPUs as well (host & guest).
If you don't need those features, you're completely correct about the 3950X.
My biggest problem with virtualization is USB. I have a libvirt setup with GPU passthrough that works great, but I've been unable to pass through a USB controller of any sort; it always winds up in an IOMMU group with a bunch of other PCIe devices. And ordinary forwarding with SPICE or something isn't really sufficient for what I'd like to set up...
It's technically a security risk, but take a look at the ACS override patch that's out there. It forcefully splits up the IOMMU groups so the hypervisor can do the passthrough, but it does mean that the devices that were in the same group before can technically see each other's DMA and other traffic on the bus. For anything other than a shared host it's pretty much fine, though.
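If you want to see what's grouped together before (and after) reaching for the patch, the groups are all exposed under /sys. A minimal sketch for the host, assuming Linux with lspci installed:

```python
#!/usr/bin/env python3
# List each IOMMU group and the PCI devices in it, to see which
# devices would share DMA visibility if one of them is passed through.
import subprocess
from pathlib import Path

groups = Path("/sys/kernel/iommu_groups")
for group in sorted(groups.iterdir(), key=lambda p: int(p.name)):
    print(f"IOMMU group {group.name}:")
    for dev in sorted((group / "devices").iterdir()):
        # lspci -nns <slot> prints the device name with vendor/device IDs
        desc = subprocess.run(["lspci", "-nns", dev.name],
                              capture_output=True, text=True).stdout.strip()
        print(f"  {desc}")
```

On a kernel carrying the ACS override patch, booting with something like pcie_acs_override=downstream,multifunction and rerunning this should show the formerly merged groups split apart.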
This should be doable on desktop Ryzen. I currently have one GPU (x16) + NVMe SSD (x4) + 10G NIC (x4 from the chipset). The x16 slot can be split x8/x8 for dual GPU.
My single port Intel X520 achieves line rate through x4 chipset lanes just fine.
GPUs generally don't come close to saturating an x8 PCIe 3.0 link, unless you have a very specific workload (like the new 3DMark bandwidth benchmark AMD used to demo PCIe 4.0).
Games don't do nearly enough asset streaming to use a lot of bandwidth, since the amount of assets used at the same time is limited by VRAM size, and most stuff is kept around for quite some time. Offline 3D renderers like Blender Cycles IIRC just upload the whole scene at once, and then path tracing happens in VRAM without much I/O. For buttcoin mining, people literally use boards with tons of x1 slots + risers. No idea how neural nets behave, but it would make sense that they also just keep updating the weights in VRAM.
Except this is PCIe 4.0 vs PCIe 3.0, so each gen4 lane carries double the bandwidth, making x8 gen4 equivalent to x16 gen3; no, x8 PCIe 4.0 will not bottleneck anything. Unfortunately GPUs don't support PCIe 4.0 yet, but that's not a problem for integrated 10GbE.
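Rough numbers, using approximate per-lane throughput after 128b/130b encoding overhead:

```python
# Usable PCIe throughput per direction, per lane:
# gen3 runs 8 GT/s with 128b/130b encoding; gen4 doubles the rate.
GEN3_LANE = 8 * 128 / 130 / 8   # ~0.985 GB/s per lane
GEN4_LANE = 2 * GEN3_LANE

print(f"x4 gen3: {4 * GEN3_LANE:.1f} GB/s")   # ~3.9, plenty for a NIC
print(f"x8 gen3: {8 * GEN3_LANE:.1f} GB/s")   # ~7.9
print(f"x8 gen4: {8 * GEN4_LANE:.1f} GB/s")   # ~15.8, same as x16 gen3
print(f"10GbE:   {10 / 8:.2f} GB/s")          # 1.25 line rate
```

So x8 gen4 matches x16 gen3, and even x4 gen3 has roughly 3x headroom over a 10GbE link, which is also why the X520 mentioned above hits line rate through chipset lanes.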
Yes, the Navi card of course supports gen4, and even Vega 20 did too, at least in the original Instinct variant (most places on the internet say gen4 was cut from the consumer Radeon VII card).