
I agree that 2^32 is better, but the reason is not that division is too expensive: after standard optimisation, the OP's algorithm does not need any division at all.

Clearly, choosing a base type that is fast on the machine is key. But I'd expect even a 64/32 divide with a constant operand to be optimised into something reasonably fast on most systems. If you have a counterexample, it would be interesting to hear.

A system without a fast divide likely doesn't have floats either, so all arithmetic would be fixed point, and 32x32->64 multiplication would be a pretty common operation.




I just meant that the compiler in your example is computing x / 1000000 as follows:

   *q = (x * 1125899907) >> 50;
This works out OK because there's a 32x32->64 multiply instruction, arm64's umull in this case. If the target lacked one but did have a 32/32->32 division instruction, the compiler might have emitted that instead of hand-rolling the same scheme out of 32x32->32 multiplications.

Note that it's technically an arithmetic approximation that just happens to give the exact quotient for every 32-bit x: 1<<50 obviously isn't divisible by 1000000 (a power of two is divisible only by smaller powers of two), but the rounding error is small enough that it never changes the truncated result.


I see. However, 32x32->64 multiply has been common on 32-bit processors since the mid-80s. It would be interesting to know whether anybody is still building new products on 32-bit processors that lack it.



