Civ 6 really doesn't utilize cores as much as one would think. I mean it'll spread the load across a lot of threads, sure, but it never seems to actually... use them much? E.g. I just ran the Gathering Storm expansion AI benchmark (late game map completely full of civs and units - basically worst case for CPU requirements and best case for eating up the multicore performance) on a 7950X 16 core CPU and it rarely peaked over 30% utilization, often averaging ~25%. 30% utilization means a 6 core part (barring frequency/cache differences) should be able to eat that at 80% load.
Whether the bottleneck is memory bandwidth (2x6000 MHz), unoptimized locking, small batch sizes, or something else it doesn't seem to be related to core count. It's also not waiting on the GPU much here, the 4090 is seeing even less utilization than the CPU. Hopefully utilization actually scales better with 7, not just splits up a lot.
> 16 core CPU and it rarely peaked over 30% utilization, often averaging ~25%. 30% utilization means a 6 core part (barring frequency/cache differences) should be able to eat that at 80% load.
As a rule I wouldn't be surprised if 90% of the stuff Civ 6 is doing can't be parallelized at all, but then for that remaining 10% you get a 16x speedup with 16 cores. And they're underutilized on average but there are bursts where you get a measurable speedup from having 16 cores, and that speedup is strictly linear with the number of cores. 6 cores means that remaining 10% will be less than half as fast vs. having 16 cores. And this is consistent with observing 30% CPU usage I think.
If only 10% of the workload can be parallelized, then the best case parallelization speed up is only 10%. That doesn’t line up with the GP’s claim that Civ6 benefits from more cores.
My rule is more like I’d be willing to bet even odds that this could be sped up 100x with the right programmers focused on performance. When you lack expertise and things work “well enough” that’s what you get. Same for games or enterprise software.
That's what we get in a market dominated more by the need to release before the competition rather than taking some time to optimize software. If it's slow, one can still blame the iron and users who don't upgrade it.
https://i.imgur.com/YlJFu4s.png
Whether the bottleneck is memory bandwidth (2x6000 MHz), unoptimized locking, small batch sizes, or something else it doesn't seem to be related to core count. It's also not waiting on the GPU much here, the 4090 is seeing even less utilization than the CPU. Hopefully utilization actually scales better with 7, not just splits up a lot.