
Swapping two numbers in an array using XOR doesn't use less memory - whtshtf
https://blog.stjepan.io/why-swapping-two-numbers-in-a-list-with-xor-doesnt-require-less-memory/
======
billconan
This was one of the interview questions I got for an NVIDIA position. The
second part of the question asked why you would use XOR in actual code (or
why XOR is faster than using assignments).

I didn't know. The answer (as I faintly remember) was that the ALU comes before
the memory logic in the CPU pipeline, so if the swap can be done with
arithmetic alone, the instructions can leave the pipeline early and execution
is faster.

Also, the article doesn't mention compiler flags. Was optimization on?
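
For reference, here is a minimal sketch in C of the two swaps being compared. The function names `swap_tmp` and `swap_xor` are made up for illustration and are not taken from the article. The `i == j` guard in the XOR version is there because XOR-ing an element with itself zeroes it.

```c
#include <stddef.h>

/* Swap arr[i] and arr[j] through a temporary variable. */
void swap_tmp(int *arr, size_t i, size_t j)
{
    int tmp = arr[i];
    arr[i] = arr[j];
    arr[j] = tmp;
}

/* Swap arr[i] and arr[j] with three XORs and no named temporary.
 * Guard: when i == j, the first XOR would set the element to 0. */
void swap_xor(int *arr, size_t i, size_t j)
{
    if (i == j)
        return;
    arr[i] ^= arr[j];
    arr[j] ^= arr[i];
    arr[i] ^= arr[j];
}
```

Both versions still load from and store to the array; the XOR version only avoids naming a temporary in the source.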

~~~
whtshtf
>I didn't know. The answer (as I faintly remember) was that the ALU comes
before the memory logic in the CPU pipeline, so if the swap can be done with
arithmetic alone, the instructions can leave the pipeline early and execution
is faster.

This makes perfect sense, and it occurred to me earlier, but I couldn't get
either gcc or clang to do as you said :) As far as I know, there is no way of
getting such behavior using "regular" C.

>The second part of the question asked why you would use XOR in actual code
(or why XOR is faster than using assignments).

From the benchmarks I performed, XOR-ing seems to be definitely slower, but
NVIDIA folks claiming that it is faster makes me wonder if I made a mistake
somewhere.

>Also, the article doesn't mention compiler flags. Was optimization on?

Nope. It seems to me that passing -O3 makes the XOR approach behave
equivalently to the approach with a tmp variable
(https://godbolt.org/z/N_zGjv) :)
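
One way to check this kind of claim locally, as a rough sketch: put both variants in one file and compare the assembly at different optimization levels. The file name `swap.c` and the function names below are hypothetical, and this is not necessarily the code behind the godbolt link. Here the XORs are done on local copies, which the optimizer can see are distinct objects, so it is free to simplify them.

```c
/* swap.c (hypothetical file name)
 * Compare the generated assembly with, for example:
 *   gcc -O0 -S swap.c -o swap_O0.s
 *   gcc -O3 -S swap.c -o swap_O3.s
 */
#include <stddef.h>

/* Load both elements into locals, swap via a temporary, store back. */
void swap_tmp_local(int *arr, size_t i, size_t j)
{
    int a = arr[i], b = arr[j];
    int tmp = a;
    a = b;
    b = tmp;
    arr[i] = a;
    arr[j] = b;
}

/* Load both elements into locals, swap via three XORs, store back. */
void swap_xor_local(int *arr, size_t i, size_t j)
{
    int a = arr[i], b = arr[j];
    a ^= b;
    b ^= a;
    a ^= b;
    arr[i] = a;
    arr[j] = b;
}
```

Because `a` and `b` are separate locals, this XOR variant also behaves correctly when `i == j`. Whether the two functions compile to identical code depends on the compiler and flags, so it is worth diffing the two `.s` files rather than assuming.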

