
(x / 2) or (x >> 1)? - pykello
http://stackoverflow.com/questions/10681375/which-is-better-option-to-use-for-dividing-an-integer-number-by-2
======
Smerity
In general, you're not smarter than your compiler[1].

Even if you're smarter than your compiler it's usually not worth the
readability trade-off.

These sorts of optimisations only ever make sense if you're doing something
incredibly intensive and you've tested it on real world data. This is rarely
the case and the code suffers.

[1]: "Source Code Optimization" => [http://www.linux-kongress.org/2009/slides/compiler_survey_felix_von_leitner.pdf](http://www.linux-kongress.org/2009/slides/compiler_survey_felix_von_leitner.pdf)

~~~
hrktb
Perhaps thinking about it as an optimisation and not a standard operation is
not the best way to look at it.

For example, 99% of the time I want a binary shift and not a /2. In a
dynamically typed language, 201/2 will give you 100.5 when you'd prefer to
keep it an integer.

~~~
pwaring
"In a dynamically typed language 201/2 will give you 100.5"

Not necessarily. Python gives 100, PHP gives 100.5, Perl gives 100.5, Ruby
gives 100.

Edit: Versions are Python 2.7.3, PHP 5.3.10, Perl 5.14.2 and Ruby 1.8.7, all
as shipped with Ubuntu Precise.

~~~
archangel_one
Python 3 gives 100.5. Python 2 gives 100 (or 100.5 if you use "from __future__
import division").

~~~
fjh
But using >> for division is still a bad idea. If you want integer division,
use //.

------
chrisbennet
Not long ago, I was doing something that needed to be fast, so I tried the old
"x>>1" trick and timed it against x/2. Much to my surprise, "x/2" was
faster (which is why you should always time things instead of assuming).

The takeaway is _not_ that "x/2 is faster than x>>1"; it's that you
shouldn't assume some code is faster - always time it.

~~~
to3m
Nobody would make this argument if, during testing, it transpired that reading
files over a 100Mbit network connection was faster than reading files from the
local SSD. Sure, one should always measure rather than pontificate. And yes,
in this hypothetical case, one has measured, and it truly turns out it's
quicker... to read over the network. But one should always consider: do the
results make any sense whatsoever, or do they suggest that further
investigation might be warranted?

And similarly, it would be an unusual CPU indeed - so unusual in fact that
you'd probably already know that you were working on it, and would mention it
as part of this crazy story of the time that divisions were faster than shifts
- that managed to mess up single-bit shifts so badly that they were slower
than an integer division.

Perhaps in a higher-level language, one might find that x>>1 were faster than
x/2, for some reason unrelated to the relative speeds of the two operations.
But if one is working in C or C++ (and I assume this is the case, if one is
bothering with stuff like replacing divides with shifts), and yet the shift is
still the slower, then this is most unexpected, and probably evidence that
something has gone wrong.

(The division algorithm is inherently more complicated than a shift, and so
divisions are always going to take more time than a shift. You can expect a
shift to take 1 cycle on a modern CPU. A division would take more like 20. It
will probably be the slowest integer instruction possible, and it might not be
accessible from every pipeline. Sample division latencies for vaguely modern
CPUs: 20+ cycles on x86, ~35 cycles on MIPS, ~20 cycles on POWER, ARM doesn't
seem to have it. Shifts, on the other hand - always 1 cycle. ARM even gives
them away for free.)

~~~
chrisbennet
I just noticed your response. The machine I timed it on was an older Lenovo
laptop with some sort of dual core Intel processor - indeed not an "unusual"
CPU.

In the end, the results are what matter even if you (or I) don't understand
why. For all I know, the compiler parallelized some of the code with divisions
and didn't for the shift-using code. The thing is, you can't just look at what
"should" be fastest, and my experience was a good reminder of that.

In the old days, one optimization technique we used was to "unroll loops",
i.e. instead of:

    for (int i = 0; i < 8; i++) a[i] += b[i];

We'd do this:

    // Unrolled loop
    a[0] += b[0];
    a[1] += b[1];
    a[2] += b[2];
    ...
    a[7] += b[7];

The unrolled loop takes fewer clock cycles according to the book. Great -
unless the larger "faster" code no longer fits in the CPU cache, in which case
it's actually slower. The only way to tell is to measure it.

Here is the code. Substitute ">>1" for the "/2" portions and compare.

    // pBuffer contains a single horizontal line in a bitmap.
    // destBuffer will contain the first derivative convolution,
    // i.e. if you graph it, it would look like peaks at the
    // edges of things in the image.
    void Convolution1D_FastFirstDerivative(int* pBuffer, int sizeOfBuffer,
                                           int* destBuffer)
    {
        int lastPos = sizeOfBuffer - 1;
        int isum = 0;

        // Special case: start of buffer, where the filter "hangs off the edge"
        isum = -pBuffer[0] / 2;            // * -0.5
        //isum += pBuffer[0] * 0;          // * 0
        isum += pBuffer[1] / 2;            // * 0.5
        destBuffer[0] = isum;

        for (int i = 1; i < sizeOfBuffer - 1; i++)
        {
            isum = -pBuffer[i-1] / 2;      // * -0.5
            //isum += pBuffer[i] * 0;      // * 0
            isum += pBuffer[i+1] / 2;      // * 0.5
            destBuffer[i] = isum;
        }

        // Special case: end of buffer, where the filter "hangs off the edge"
        isum = -pBuffer[lastPos-1] / 2;    // * -0.5
        //isum += pBuffer[lastPos] * 0;    // * 0
        isum += pBuffer[lastPos] / 2;      // * 0.5
        destBuffer[lastPos] = isum;
    }

------
jejones3141
They don't have the same results if x is signed (-1 >> 1 gives you -1; -1 / 2
typically gives you 0). A compiler worth its salt will optimize x / 2
appropriately. For unsigned, it will emit the shift; for signed, it will
generate (x + 1) >> 1.

~~~
jrmuizel
(x+1)>>1 isn't equivalent to x/2. It gives the wrong result for positive
numbers like (x=1).

Compilers actually generate (x+(x>>31))>>1 for signed division by 2. Which on
x86 is three more instructions and requires an additional register.

Using x>>1 makes sense if you want floored division instead of truncating
towards zero. Floored division is better if you want to reduce precision of
the input or something like that and don't want to bias towards zero.

------
jimmyswimmy
Most disappointing was the fact that the top comment focused on x=x/2 vs x/=2,
the latter of which I believe is more confusing than not. In any event, none
of the three options ought to make a difference when you have the luxury of a
compiler, which can choose the "best" way to do it.

When you're programming a microcontroller YOU are often the compiler and
therefore need to know the relative clock times for each command. I remember
one chip where the division command could take 100 cycles while a shift was 2.
Big difference there, and you'd definitely pick the shift.

One of the later comments there focused on actually decompiling the resulting
code. I was surprised that the compiler did not produce the same output.

------
mihaifm
That's what I don't like about stackoverflow, this type of "viral" questions
that get a lot of attention while the good ones often get lost in the noise.

~~~
MindTwister
Not only that, it's also completely irrelevant for 99% of developers. My bet
is that most of the (currently) 167 upvotes did not come from people who had
this specific problem.

~~~
ralfd
The Shift vs Division question was also already asked many times on
Stackoverflow in the past.

~~~
tych0
It's a bad question, though. Either you know the answer, or the peephole
optimizer will do a better job than you will, and therefore you should use
what's most readable. I agree with GP that these kinds of questions are
unfortunate noise at stackoverflow.

------
JoeAltmaier
Many folks commented on the difference between

    
    
      x /=2;
    

and

    
    
      x = x / 2;
    

I object to the combined operator (x /= 2) because it is ill-conceived as a
language feature. It only works for the half-dozen cases the language writer
chose; I can't extend it to my own operations. Why would I want to? Let's
assume instead we define '@' as a symbol for 'LHS' (Left-Hand-Side). It now
looks like this:

    
    
      x = @ / 2;
    

How does that add value? Imagine 'x' is
'*(int*)(aRec[RecIndex(p->handle)]->itemCountRef)'.

I know, you could do that already with /= 2; but consider we wanted to do

    
    
      x = x / (x+1);
    

No help in C. But with '@' it becomes

    
    
      *(int*)(aRec[RecIndex(p->handle)]->itemCountRef) = @ / (@+1);
    

Much clearer than the alternative

    
    
      *(int*)(aRec[RecIndex(p->handle)]->itemCountRef) = *(int*)(aRec[RecIndex(p->handle)]->itemCountRef) / (*(int*)(aRec[RecIndex(p->handle)]->itemCountRef)+1);
    

Sure, you can add declarations of intermediate values in C now. But I'm not
sure that adds clarity in the same way.

~~~
bonzoesc
> How does that add value? Imagine 'x' is
> '*(int*)(aRec[RecIndex(p->handle)]->itemCountRef)'

That's a gnarly code smell and a Law of Demeter[1] violation. I suspect you
could find a better way to express that operation without requiring that a
single line of your code know so much about so many different things (aRec,
aRec members, the return value of RecIndex, p, aRec member -> itemCountRef).

1: <http://c2.com/cgi/wiki?LawOfDemeter>

~~~
Munksgaard
I think you're missing the point.

~~~
bonzoesc
Am I? The language doesn't need to make affordances for bad code, bad code
needs to be refactored into better code.

~~~
JoeAltmaier
Oh for heaven's sake, use some other complex reference then.

------
Jabbles
Opening "Hacker's Delight", which you (as a C++ programmer) surely have in
your collection next to "Code Complete 2", we find:

"_Apparently, many people have made the mistake of assuming that a shift
right signed of k positions divides a number by 2^k_"

Surprisingly, the top few answers do not give the correction for negative
numbers, which in "Hacker's Delight", is to add 2^k-1 to negative numbers
before shifting.

Obviously, this ignores optimization and readability arguments, which although
important, are probably less important than correctness.

------
haberman
I answered almost exactly the same question on Stack Overflow once, except the
asker was comparing (x / 2) to (x * 0.5). Someone had previously advised him
to use the latter, on the theory that multiplication is cheaper than division.
I disassembled both to show that (x * 0.5) is absurdly less efficient:
[http://stackoverflow.com/questions/895574/what-are-some-
good...](http://stackoverflow.com/questions/895574/what-are-some-good-code-
optimization-methods/1354688#1354688)

------
borkabrak
In my work, it's generally better to optimize for clarity rather than
performance. I suspect this is true more and more as compilers grow in
sophistication. In this vein, use:

x = x / 2

To DIVIDE, use the DIVISION operator.

------
zafka
It really depends on what you are doing. I do embedded work on a 40 MIPS PIC
DSP. When I want to smooth A/D readings, I collect 32 of them by adding each
one as its interrupt fires, and when I reach a count of 32 I use:

    smooth_value = sum >> 5

This is a case where shifting is worthwhile.

------
hakboy11
It's 2012, dammit. x/2

------
Scotchy
I often find myself using 1 << Y when I mean 2^Y, just because it's faster to
write and generally accepted in all languages, versus finding pow in whatever
language I am currently writing in...

Edit: fixed

~~~
exDM69
You're computing 2^Y there.

pow is usually a floating point operation; that's why it's slow. Modern CPUs
may actually do a floating point pow quickly, but it's still expensive to move
integers from general purpose registers to floating point registers.

------
chj
I prefer x/2, the operator precedence is much clearer.

------
darkstalker
If you're using floating point, x * 0.5 is faster

~~~
dtparr
Given that the question indicates that bit-shifting x would result in a valid
answer, I think we can rule out floating point.

------
xxiao
I think these are the same these days, thanks to smarter compilers.

