
Nearly All Binary Searches and Mergesorts Are Broken (2006) - olalonde
https://research.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
======
Animats
As I was saying over at [1], if you're going to do program proofs, you have to
prove absence of overflow. It's the cause of many program bugs.

[1] https://news.ycombinator.com/item?id=12146049

~~~
olalonde
Oh yeah, I actually saw the link in your comment!

At the risk of sounding stupid, I had a question about the following solution:

    
    
        int mid = (low + high) >>> 1;
    

I understand that a right shift is like dividing by 2, just as in decimal
shifting the digits right is like dividing by 10 (654/10 = 065). But why can
we just ignore the low + high overflow issue? Is it because the overflow ends
up in the sign bit and a logical right shift doesn't care about what that bit
represents? If unsigned ints were used instead, I suppose that trick wouldn't
work?

~~~
psyklic
low and high are assumed to be non-negative and at most 2^31 - 1. So each has
a zeroed high bit, which, for a signed int, is reserved to indicate a negative
value.

When low and high are summed, the high bit of the result might be set. For a
signed int this is an accidental overflow, and the result would be falsely
interpreted as a negative number. However, if you interpret the same bits as
an unsigned int, it is the correct sum.

The Java logical shift operator >>> does not retain the sign when shifting.
So, it will always shift a 0 into the high bit, making the result a positive
signed int.

In most C implementations, >> on a signed operand is an arithmetic shift,
which retains the sign bit in the result. So in C we would need to cast low
and high to unsigned first; otherwise the high bit is copied rather than
shifted out.
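psyklic's explanation can be checked directly. A minimal Java sketch (the class and method names are mine) that compares the broken signed midpoint with the `>>>` version:

```java
public class MidpointDemo {
    // Broken midpoint: low + high wraps past Integer.MAX_VALUE, and
    // signed division then sees a negative number.
    static int midBroken(int low, int high) {
        return (low + high) / 2;
    }

    // Fixed midpoint: the sum still wraps, but >>> shifts a 0 into the
    // high bit, i.e. it treats the bit pattern as the unsigned sum.
    static int midFixed(int low, int high) {
        return (low + high) >>> 1;
    }

    public static void main(String[] args) {
        int low = 1_500_000_000, high = 1_500_000_000;
        System.out.println(midBroken(low, high)); // -647483648 (wrapped)
        System.out.println(midFixed(low, high));  // 1500000000 (correct)
    }
}
```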

------
trav4225
Hmm, I always figured it was just assumed that, unless otherwise specified,
most well-known implementations fail with sufficiently large numbers.

~~~
gfody
Yep, this is just vintage click-bait... you can bet more than sorting routines
will fail with numbers approaching 2^63. It doesn't matter, because if you're
running quicksort on a list of 2^64 entries then something else will go wrong
before your (l+r)/2 has a chance to overflow.

~~~
colejohnson66
If you're trying to sort 2^63 items at once, you have other problems.

------
kuharich
Prior discussions:
http://news.ycombinator.com/item?id=621557
http://news.ycombinator.com/item?id=1130463
http://news.ycombinator.com/item?id=3530104
https://news.ycombinator.com/item?id=9857392
https://news.ycombinator.com/item?id=6799336

------
wandering2
> I was shocked to learn that the binary search program that Bentley proved
> correct and subsequently tested in Chapter 5 of Programming Pearls contains
> a bug.

Huh? How did he _prove_ it correct, then?

~~~
mnarayan01
On an idealized machine with arbitrarily large integers, I assume.

~~~
haimez
How else do you "prove" trivial algorithms other than by removing the
realities of an actual runtime environment?

I'm pretty sure you're right though. Integers must have been defined as
infinite in the proof for it to have been a proof at all.

~~~
mnarayan01
If you sufficiently specify the runtime environment, the proof can still work:
since the array is composed of ints of several bytes each, many C
compilers/machine architectures dictate a maximum array size (e.g. 2^30
four-byte ints in a 32-bit address space) such that low + high can't
overflow.

~~~
haimez
Touché, and fair enough, although you would expect them to call out that
limitation, since it's both critical and highly non-general. Byte arrays are
comparable, small, and common enough that the proof should at least have
included an exclusionary statement.

------
kartickv
At least in a memory-safe language you'll get an IndexOutOfBoundsException
rather than memory corruption, which can be exploited in an attack.

------
legulere

        int mid = (low + high) >>> 1;
    

I don't get how this fixes anything, the overflow still occurs.

Why not simply

    
    
        int mid = (low/2) + (high/2) + (low & high & 1);
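For what it's worth, that formula does produce the right midpoint even at the extremes; the final term adds back the 1 that truncation drops when both operands are odd. A quick Java check (class and method names are mine):

```java
public class HalvedMidpoint {
    // Halve before adding, so the sum can never exceed the int range;
    // (low & high & 1) restores the 1 lost when both halves truncate.
    static int mid(int low, int high) {
        return (low / 2) + (high / 2) + (low & high & 1);
    }

    public static void main(String[] args) {
        System.out.println(mid(3, 5));                                 // 4
        System.out.println(mid(Integer.MAX_VALUE, Integer.MAX_VALUE)); // 2147483647
    }
}
```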

~~~
abecedarius
More expensive.

Here's another approach: represent the range as a (low, length) pair instead
of (low, high). The update becomes either length /= 2 or low += length/2,
length -= length/2. IIRC this was what I did when I adapted Bentley's code to
C 25 years ago, mainly because of overflow. It never occurred to me to check
if standard libraries got it wrong.
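A sketch of this (low, length) formulation, assuming a standard three-way-compare search (class and method names are mine); no low + high sum is ever formed, so there is nothing to overflow:

```java
public class RangeSearch {
    // Binary search keeping (low, length) instead of (low, high).
    // low + length/2 stays within the array's index range by
    // construction, so the classic (low + high)/2 overflow can't occur.
    static int search(int[] a, int key) {
        int low = 0, length = a.length;
        while (length > 0) {
            int half = length / 2;
            int mid = low + half;
            if (a[mid] == key) return mid;
            if (a[mid] < key) {          // keep the part above mid
                low = mid + 1;
                length -= half + 1;
            } else {                     // keep the part below mid
                length = half;
            }
        }
        return -1;                       // not found
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 7, 9};
        System.out.println(search(a, 7)); // 3
        System.out.println(search(a, 2)); // -1
    }
}
```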

------
Retric
All of those proposed solutions still fail on valid arrays that are too large.
We've changed to 64-bit OSes, and now 32-bit indexes just don't cut it.

~~~
haimez
Not in Java (arrays can only be indexed by a signed 32-bit integer), and the
C++ solution works exactly the same way with the full 64-bit data types. That
much should be obvious, and if you think you might legitimately exceed two
billion elements in the array you're sorting, you'd be a fool to rely on
someone else's sort implementation.

~~~
Retric
Java uses shorts in some JavaCard implementations, so it's clearly just a
legacy limitation which will eventually be removed.

As for C++, arrays of 2+ billion elements are not uncommon; for int arrays
that's just a few gigs of RAM.

------
Cpoll
> int mid = low + ((high - low) / 2);

Why this instead of the more intuitive `low/2 + high/2` ?

Edit: Nevermind, I get it. The reason is instructive: In the case where high
and low are odd numbers, the "more intuitive" way is off by one.
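The off-by-one is easy to see with two odd bounds (a tiny check; the values and method names are mine):

```java
public class OddBounds {
    // Each signed division throws away a 0.5, so with two odd
    // operands the halves sum to one less than the true midpoint.
    static int naive(int low, int high) {
        return low / 2 + high / 2;
    }

    // Subtracting first keeps the intermediate small and exact.
    static int safe(int low, int high) {
        return low + (high - low) / 2;
    }

    public static void main(String[] args) {
        System.out.println(naive(3, 5)); // 3, but the midpoint of [3, 5] is 4
        System.out.println(safe(3, 5));  // 4
    }
}
```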

------
Iv
A more interesting version of that bug, I find, and the reason why I almost
never use unsigned values in my algorithms, is that when you compute indices
in a non-trivial way, any intermediate value that goes negative will break
your program.
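This scenario is really a C/C++ one, since Java has no unsigned integer types, but Java 8's unsigned helpers can illustrate the same wraparound (a contrived sketch, names mine):

```java
public class UnsignedUnderflow {
    public static void main(String[] args) {
        int i = 0;
        int j = i - 1; // an intermediate index that "goes negative"
        // Read as a signed int this is -1; read as an unsigned 32-bit
        // value it is 4294967295, a wildly out-of-bounds index.
        System.out.println(j);                         // -1
        System.out.println(Integer.toUnsignedLong(j)); // 4294967295
    }
}
```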

------
tomsmeding
> It is not sufficient merely to prove a program correct; you have to test it
> too.

No, this just indicates that you didn't prove it rigorously enough :p

