I am a software engineer, so I am fairly knowledgeable about computers in general, but this specific question continues to bother me. Why does a processor from 2015 at 2.5GHz run slower than a processor from 2022 at 2.5GHz? What should I look at specifically? Is the difference reported somewhere?
In general: how can I tell when I need to replace my processor with a new one (without needing to manually compare the new and old one...)?
I think looking at the GHz and number of cores is not enough anymore.
Modern cores use "speculative out-of-order execution": the hardware looks ahead in the instruction stream and, every clock cycle, considers various combinations of future instructions it might be able to execute. That takes a huge number of transistors and burns some extra power. So although most of the basic ideas were known by the late 90s, adding more transistors in every generation lets the core do more and more in parallel.
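A hedged sketch of what the out-of-order hardware exploits: both functions below compute the same sum, but the first forms one long chain where every add depends on the previous one, while the second splits the work into four independent chains the core can run in parallel. The function names and the unroll factor of 4 are made up for illustration.

```c
#include <stddef.h>

long sum_serial(const long *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];               /* each add waits on the previous one */
    return s;
}

long sum_unrolled(const long *a, size_t n) {
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) { /* four independent dependency chains */
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)           /* leftover elements */
        s0 += a[i];
    return s0 + s1 + s2 + s3;
}
```

On a wide out-of-order core the second version can retire several adds per cycle because the chains don't wait on each other; the first is limited to one add per cycle of add latency no matter how wide the core is.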
Also, faster and larger caches mean fewer stalls waiting on memory.
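To illustrate why caches matter so much, here is a hedged sketch: both functions below add up the same matrix, but row-major order touches consecutive bytes, so every loaded cache line is fully used, while column-major order strides a whole row's worth of bytes per step and, on a matrix this size, misses constantly. The names and the size `N` are illustrative.

```c
#include <stddef.h>

#define N 1024

long sum_row_major(long m[N][N]) {
    long s = 0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            s += m[i][j];        /* stride of one element: cache friendly */
    return s;
}

long sum_col_major(long m[N][N]) {
    long s = 0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            s += m[i][j];        /* stride of N elements: many cache misses */
    return s;
}
```

Same instruction count, same result, very different wall-clock time on a big matrix; that gap is exactly what bigger and faster caches shrink.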
Also, modern cores are better at predicting branches, so they can start executing instructions past a branch before knowing which way it will go. If the core guessed wrong, it has to undo the results of those instructions, which adds a lot of complexity: every side effect that might need to be canceled has to be tracked.
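A hedged sketch of what the predictor is up against: the two functions below count the same thing, but the first takes a data-dependent branch every iteration (on random input the predictor guesses wrong about half the time, and each mispredict flushes speculative work), while the second turns the comparison into arithmetic so there is no branch to predict. Names are illustrative.

```c
#include <stddef.h>

size_t count_branchy(const int *a, size_t n, int threshold) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] >= threshold)        /* unpredictable on random data */
            count++;
    return count;
}

size_t count_branchless(const int *a, size_t n, int threshold) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (a[i] >= threshold); /* comparison yields 0 or 1, no branch */
    return count;
}
```

On sorted or mostly-uniform data the branchy version is fine, because the predictor learns the pattern; it's random data that exposes the cost of a mispredict.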
Also, SIMD parallelism has gotten much better. Some modern cores can do 8 floating-point operations per cycle using AVX2 or Neon. Older SIMD systems had very limited instruction sets, but you can do a lot with modern ones: x86 SIMD instructions can process 32 bytes at a time, and with a great deal of cleverness you can do some byte-stream operations in less than one cycle per byte. See https://arxiv.org/abs/2010.03090
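A minimal sketch of that 32-bytes-at-a-time claim, assuming an AVX2-capable compiler and CPU: each `_mm256_add_ps` below adds 8 floats in one instruction, with a scalar loop handling the tail (and the whole array when AVX2 isn't available). The function name is made up for illustration.

```c
#include <stddef.h>
#ifdef __AVX2__
#include <immintrin.h>
#endif

void add_floats(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
#ifdef __AVX2__
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);  /* load 8 floats (32 bytes) */
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
#endif
    for (; i < n; i++)           /* scalar tail, or fallback path */
        out[i] = a[i] + b[i];
}
```

In practice compilers will often auto-vectorize a loop this simple at `-O3` without the intrinsics; writing intrinsics by hand is for the cases (like the byte-stream tricks in the linked paper) where the compiler can't find the parallelism itself.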
GPUs generally do 32 parallel floating point operations per core per cycle, with hundreds of cores.
Also, main memory is gradually getting slightly faster and wider.
Lastly, more cores are good. Back when the most cores you could get was 4, it was barely worth writing parallel software, because all the locking slowed things down almost as much as the 4 cores sped things up. But high-end Xeons can have 40+ cores, which makes the hassle of writing parallel code worthwhile. And GPUs have thousands of cores, so it's worth a lot of complication to make use of them.
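A hedged sketch of parallel code that avoids the locking trap mentioned above, using POSIX threads: each thread reduces its own slice into a private partial sum, so the only synchronization is the final `pthread_join`. The names, the thread count, and the slicing scheme are all illustrative.

```c
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 4

struct slice { const long *a; size_t lo, hi; long sum; };

static void *sum_slice(void *arg) {
    struct slice *s = arg;
    long total = 0;
    for (size_t i = s->lo; i < s->hi; i++)
        total += s->a[i];
    s->sum = total;              /* private result: no lock needed */
    return NULL;
}

long parallel_sum(const long *a, size_t n) {
    pthread_t tid[NTHREADS];
    struct slice sl[NTHREADS];
    size_t chunk = n / NTHREADS;
    for (int t = 0; t < NTHREADS; t++) {
        sl[t].a = a;
        sl[t].lo = (size_t)t * chunk;
        sl[t].hi = (t == NTHREADS - 1) ? n : (size_t)(t + 1) * chunk;
        pthread_create(&tid[t], NULL, sum_slice, &sl[t]);
    }
    long total = 0;
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        total += sl[t].sum;      /* combine once per thread, after the join */
    }
    return total;
}
```

The design point is that synchronization happens O(threads) times, not O(elements) times; a version that took a mutex around a shared accumulator on every element would demonstrate exactly the slowdown the paragraph describes.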