You are write on the time issue (but it does work if a and b are the same) The r...

ori_b · on July 1, 2009

If you've hit memory, you've already lost. Running instructions will basically be line noise compared to that. Similarly, function call overhead will probably dwarf actually doing the xors.

However, with a good register allocator and inlined functions, your point becomes even stronger. The compiler can simply remove all instructions associated with the naieve swapping version, and simply record that before the swap %eax holds b, and after it, %eax now holds a. (Even the most naieve of code generators will do this sort of thing if the values in the swap don't fall outside of a basic block). The xors are far harder to optimize.