Per-clock performance is THE metric. Apple can't sustain those peak clocks for more than a few seconds before throttling down. Once both chips are running at a sustainable 2-2.5GHz, the IPC starts mattering a lot.
If Apple can't sustain those clock speeds for long, that's reflected in the benchmark result. Benchmarks and real-world performance are the only metrics which matter in the end.
And higher clock speed doesn't proportionally improve either real world metrics or benchmark results, so "benchmark score divided by clock speed" is a useless metric.
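To make the "doesn't scale proportionally" point concrete, here's a rough sketch with a toy model. Everything in it is an assumption for illustration: the function names, the split between clock-bound work and fixed memory stall time, and all the numbers are made up, not measurements of any real chip.

```python
# Toy model (hypothetical numbers): part of the runtime scales with core clock,
# part of it (memory stalls) does not.

def runtime_seconds(clock_ghz, compute_gcycles=10.0, memory_seconds=2.0):
    """Runtime = clock-bound work / clock + a fixed memory-bound component."""
    return compute_gcycles / clock_ghz + memory_seconds

def score(clock_ghz):
    """Benchmark-style score: higher is better, inversely proportional to runtime."""
    return 1000.0 / runtime_seconds(clock_ghz)

def score_per_ghz(clock_ghz):
    """The 'benchmark score divided by clock speed' figure under discussion."""
    return score(clock_ghz) / clock_ghz

if __name__ == "__main__":
    for ghz in (2.0, 2.5, 3.0, 3.5):
        print(f"{ghz:.1f} GHz  score={score(ghz):6.1f}  score/GHz={score_per_ghz(ghz):5.1f}")
```

In this model the score rises with clock but by less than the clock ratio, so score/GHz drops as the clock goes up even though nothing about the core itself changed.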
The CPU peaked at 14 watts in multicore Geekbench. That's close to the peak CPU power consumption of the entire M1 chip in devices many times larger than an iPhone.
GeekerWan had it throttling by 200-300MHz when simply running SPECint/SPECfp. It essentially throttles down to the same speed as the iPhone 14 at slightly higher wattage.
For mobile devices, real-world peak CPU performance hasn't gotten much better than my aging iPhone 12 because most of the extra performance has come at the expense of heat/power.
I assume this comment is just here to supply various vaguely CPU-related technical info in case someone is curious, not to argue against my point? Because if it's the former then that's fine, but if it's the latter it doesn't really hit the mark.