
I agree that 2^32 is better, but the reason is not that division is too expensive: after standard optimisation, the OP's algorithm does not need any division at all.

Clearly, choosing a base type that is fast on the machine is key. But I'd expect even a 64/32 divide with a constant operand to be optimised into something reasonably fast on most systems. If you have a counterexample, it would be interesting to hear.

A system without a fast divide likely doesn't have floats either, so all arithmetic would be fixed point, and 32x32->64 multiplication would be a pretty common operation.




I just meant that the compiler in your example is computing x / 1000000 as follows:

   *q = (x * 1125899907) >> 50;
This works out OK because there's a 32x32->64 multiply instruction, arm64's umull in this case. If the target lacked one but did have a 32/32->32 division instruction, the compiler might have emitted that instead of hand-rolling the same scheme out of 32x32->32 multiplications.

Note that it's technically an arithmetic approximation that just happens to give the exact quotient for every 32-bit x: 1<<50 obviously isn't divisible by 1000000 (a power of two is divisible only by smaller powers of two), but the rounding error is small enough that it never changes the truncated result.


I see. However, 32x32->64 multiply has been common on 32-bit processors since the mid-80s. It would be interesting to know whether anybody is still building new products on 32-bit processors that lack it.



