
Nearly All Binary Searches and Mergesorts are Broken (2006) - brudgers
http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html?m=1
======
ScottBurson
I would argue that the bug is not in the algorithm -- the bug is in languages
that don't detect integer overflow by default. Arbitrary-precision integers,
as are the default in Lisp and Python, are ideal, but at the very least the
implementation ought to throw an exception. Yes, occasionally one specifically
wants arithmetic modulo 2^32 or some other word size, and languages should
provide that as an option, but it shouldn't be the default.
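
For illustration (a minimal sketch, not from the comment): in Java, Math.addExact
gives you the throw-on-overflow behavior on request, just not by default.

    // Checked midpoint: Math.addExact throws ArithmeticException on
    // overflow instead of silently wrapping to a negative value.
    static int midpoint(int low, int high) {
        return Math.addExact(low, high) / 2;
    }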

~~~
RogerL
"shouldn't be" is in the eye of the beholder.

None of my input is in the range where a+b generates an overflow. I
"shouldn't" have to pay for something I am not using - overflow detection. If
_you_ want to pay that price, fine, use a BigNum library and pay it.

I'm not saying I'm right, I'm pointing out the thinking behind the 'only pay
for what you ask for' type languages.

edit: more context is in order. This is a trivial example of numeric issues.
Try writing numeric code using IEEE floating point. There are many concerns to
keep in mind. While we can envision schemes that would make it safer to write
naive code, I doubt the average person is willing to accept a frame rate of 5
in the name of "safety". Any real language design is going to make trade-offs
in this sort of thing, and any given line drawn in the sand will inconvenience
somebody.

~~~
Retric
Overflow detection at the CPU level is almost free. If common languages added
it by default, you'd quickly get HW speed-ups.

~~~
exDM69
> Overflow detection at the CPU level is almost free.

Yes, detecting the overflow is free, but reacting to it is expensive. If you
do care about it, you'll fire off some kind of trap handler, or at least do a
data-dependent branch, which has a performance hit with pipelining.

It's definitely not something that should be enabled for every integer
operation for all languages.
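
As a rough sketch of the data-dependent branch in question, a checked add in
Java looks something like this (OpenJDK's Math.addExact is essentially this
sign-bit test):

    // Every checked add implies a branch like this one.
    static int checkedAdd(int a, int b) {
        int r = a + b;
        // Overflow happened iff a and b share a sign and r has the opposite sign.
        if (((a ^ r) & (b ^ r)) < 0) {
            throw new ArithmeticException("integer overflow");
        }
        return r;
    }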

~~~
Demiurge
In what instances would you want to ignore integer overflow, or when is the
expense of ignoring it less than the expense of handling it? I can't think of
any.

~~~
detrino
Are you overloading "expense" here to mean something other than runtime
performance?

~~~
Demiurge
Yes, any expense; for example, a satellite crashing on the White House and
starting WW3.

------
doctorpangloss
If you're wondering, this post did, by itself, have a meaningful impact on
implementations of binary search. For example:

[https://github.com/mono/mono/blame/88d2b9da2a87b4e5c82abaea4...](https://github.com/mono/mono/blame/88d2b9da2a87b4e5c82abaea4e5110188d49601d/mcs/class/corlib/System/Array.cs#L861)

When I need easy-to-read and well-commented algorithms code, I often visit the
Mono codebase. Here, you can see a fix for the exact bug reported.
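
For context, the shape of the fix (sketched here in Java; the linked Mono code
is C#, but the change is the same idea):

    // Compute the midpoint from the difference instead of the sum;
    // high - low cannot overflow when 0 <= low <= high.
    int mid = low + ((high - low) / 2);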

------
im3w1l
>This bug can manifest itself for arrays whose length (in elements) is 2^30 or
greater (roughly a billion elements). This was inconceivable back in the '80s,
when Programming Pearls was written, but it is common these days at Google and
other places

I only have a basic understanding of BASIC, but I think the program from
Programming Pearls[0] uses 16-bit integers, and so fails even earlier.

[0]
[http://www.it.iitb.ac.in/~deepak/deepak/placement/Programmin...](http://www.it.iitb.ac.in/~deepak/deepak/placement/Programming_pearls.pdf)
page 42(47 of pdf)

~~~
abecedarius
It depends on dialect, but usually variables defaulted to float. I _think_ his
code would be OK on arrays that'd fit in 64K (and it'd fail by loss of
precision rather than overflow).

FWIW when I coded binary search back in the elder days I checked my C against
Bentley's pseudocode but took care about overflow; I didn't judge this
difference a bug in the pseudocode, but the sort of consideration necessary in
rendering it to C. (Although I was being more careful than usual.) Since this
was before the net or open source were part of professional life, I didn't get
to see other renditions to find out they were wrong.

~~~
im3w1l
He uses cint, which forces the value to an int.

Ninjaedit: cint is used for the midpoint, and is then assigned to either the
upper or lower limit. After 2 or more iterations, both upper and lower limits
can be ints.

~~~
abecedarius
_After_ the sum and division.

------
dtdt
The latter two suggested fixes are also incorrect for arrays of length >
INT_MAX where the goal element is close to the end, though there won't be any
out-of-bounds array access.

For example, when low is 0x70000002 and high is 0x90000000,

    
    
      int mid = (low + high) >>> 1;
    

and

    
    
      mid = ((unsigned int)low + (unsigned int)high) >> 1;
    

will both have mid as 1 due to unsigned int overflow.
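
A quick way to check those numbers (a throwaway Java sketch; note that
0x90000000 is already negative as a signed 32-bit int):

    int low  = 0x70000002;          //  1,879,048,194
    int high = 0x90000000;          // -1,879,048,192 as a signed int
    int mid  = (low + high) >>> 1;  // the sum wraps to 2, so mid == 1
    System.out.println(mid);        // prints 1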

~~~
hofstee
The article assumes the use of a signed integer; by passing in values that
can't be represented with a signed integer, you can't assume the same
solutions hold.

------
jmount
And a lot of quicksorts are also broken. [http://www.win-
vector.com/blog/2008/04/sorting-in-anger/](http://www.win-
vector.com/blog/2008/04/sorting-in-anger/)

------
scriptedfate
tl;dr - if you're adding two numbers in your algorithm (like finding the
midpoint in a binary search), make sure their sum can be represented in the
resultant type.

Basically: "watch out for overflows"
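
One sketch of what "make sure their sum can be represented" can look like in
practice (a hypothetical Java fragment): do the addition in a wider type.

    // Widen to long so the intermediate sum always fits; the result is
    // guaranteed to fit back in an int because it can't exceed high.
    int mid = (int) (((long) low + (long) high) / 2);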

~~~
brudgers
The bug is subtle. It does not arise out of engineering failure, but rather is
a byproduct of engineering success: a durable solution being used beyond any
time frame anticipated in the specification. It's of a kind with Y2K or Y2038.

------
Retric
_This bug can manifest itself for arrays whose length (in elements) is 2^30 or
greater (roughly a billion elements)._

Of course, if you have ~2^31 elements it's going to fail anyway, so it's not
really buying you much. Instead you're much better off using a long int, or at
least an unsigned int, for large arrays.

PS: int mid = (low + high) >>> 1; might work assuming overflows are ignored,
but you might end up with the same bug.

~~~
PhantomGremlin
_it 's not really buying you much_

I strongly agree. There's a certain elegance in fixing the bug, but it has
little practical significance. If you're processing arrays with a billion or
more elements, you need to be doing that in a 64-bit environment. In fact you
_are_ in a 64-bit environment, because your data won't even fit into a 32-bit
address space.

End of story.

Edit: just to be pedantic, and to forestall arguments, when I say "64-bit
environment", that means pointers, array indexes, etc. are all 64 bits. Using
32-bit ints just doesn't work.

------
mark-r
Slightly related, someone once did a challenge to see how many readers could
correctly code a binary search: [https://reprog.wordpress.com/2010/04/19/are-
you-one-of-the-1...](https://reprog.wordpress.com/2010/04/19/are-you-one-of-
the-10-percent/)

------
gweinberg
If you're using an int as the index of your array, and your array already has
more than MAX_VALUE/2 elements, you've got a much more serious problem coming
in the future, when your arrays are twice as big as they are now.

------
bkin
Not directly available in C, but wouldn't RCR[0] fix this appropriately, on
x86 at least?

[0]
[http://docs.oracle.com/cd/E19620-01/805-4693/instructionset-...](http://docs.oracle.com/cd/E19620-01/805-4693/instructionset-117/index.html)

~~~
pbsd
I suppose it would. However, RCR is a notoriously slow microcoded instruction
in modern x86. You'd be better off with something like

    
    
        add rax, rbx      ; lo + hi (carry flag holds the overflowed bit)
        setc cl           ; capture the carry into cl
        shrd rax, rcx, 1  ; shift right by 1, pulling the carry in as the new top bit
    

Or, better, the generic way to do it without carries: ((a ^ b) >> 1) + (a &
b).
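
To unpack that identity a bit (a Java sketch, assuming non-negative indexes):
bits set in both operands contribute whole units to the average, bits set in
only one contribute half, and no intermediate sum is ever formed.

    // a + b == 2*(a & b) + (a ^ b), so for non-negative a and b:
    //   (a + b) / 2 == (a & b) + ((a ^ b) >> 1)
    // and nothing can overflow along the way.
    static int average(int a, int b) {
        return (a & b) + ((a ^ b) >> 1);
    }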

------
LASR
I've been asked about this 'bug' in almost every interview question about
binary search, merge sort, etc., with the expected fix being the exact same
fix described in the post.

Isn't this just common knowledge?

~~~
userbinator
_Isn 't this just common knowledge?_

It could be now, but 9 years ago, when that article was published, it wasn't.

------
hofstee
Yep. They cover this in intro CS classes at CMU to this day.

