
Nearly All Binary Searches and Mergesorts Are Broken (2006) - rargulati
https://research.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html?m=1
======
dang
Previous discussions at
[https://hn.algolia.com/?query=Nearly%20All%20Binary%20Search...](https://hn.algolia.com/?query=Nearly%20All%20Binary%20Searches%20and%20Mergesorts%20Are%20Broken&sort=byDate&dateRange=all&type=story&storyText=false&prefix&page=0)

------
sillysaurus3
I wish it were worth a lot to be good at systems programming these days. It
seems like the huge salaries go to senior Rails devs. I spent about ten years
getting really good at this stuff (the (low + high)/2 line jumped right out at
me), but nowadays it feels like being really good at Trivial Pursuit.
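For anyone who hasn't seen the bug, a minimal Java sketch (the index values are hypothetical, but legal for a large array):

```java
class MidpointBug {
    // The article's broken midpoint: low + high can exceed
    // Integer.MAX_VALUE and wrap negative.
    static int brokenMid(int low, int high) {
        return (low + high) / 2;
    }

    // One of the article's fixes: with 0 <= low <= high, the
    // subtraction cannot overflow.
    static int fixedMid(int low, int high) {
        return low + ((high - low) / 2);
    }

    public static void main(String[] args) {
        int low = 1_500_000_000, high = 1_600_000_000;
        System.out.println(brokenMid(low, high)); // -597483648 (wrapped)
        System.out.println(fixedMid(low, high));  // 1550000000
    }
}
```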

It's interesting how much things have changed in the last decade. I wonder
what next decade's "Rails" will be? Could it be possible to predict?

~~~
sikan
I wish it were worth a lot to be good at assembly these days. It seems like the
huge salaries go to C++ devs. I spent about ten years getting really good at
this stuff, but nowadays it feels like being really good at Trivial Pursuit.

Edit: Did not mean to "mock" the OP, just trying to show that you can apply
that thought to almost anything in the programming world by replacing the
technology/language names.

~~~
Gracana
Are you mocking the parent poster? Why?

~~~
sillysaurus3
Nah, I thought it was a good point. It's easy to forget that we either change
or become obsolete. It's one thing to know it abstractly, but it's hard to
make any lifestyle changes, especially if it involves switching to a job in a
completely new domain.

[http://thecodist.com/article/the_programming_steamroller_wai...](http://thecodist.com/article/the_programming_steamroller_waits_for_no_one)
is a good essay on this.

EDIT: previous discussion:
[https://news.ycombinator.com/item?id=7204515](https://news.ycombinator.com/item?id=7204515)

~~~
sikan
That's an amazing article. Thanks for sharing.

~~~
ryandrake
Yes and no. My response to that article is "the more things change, the more
they stay the same." There is no steamroller. The fundamentals are not
changing. So much of what is touted (even here on HN) as new and innovative is
little more than a re-hashed version of what we were already using 5, 10, 20,
30 years ago, just prettier. Sure, the syntax is always changing a little, and
the frameworks and tools are evolving, but at a fundamental level the job of a
programmer is little different now than it was in any of those eras. I have no
doubt that a competent programmer from then, if picked up and plopped in front
of a MacBook in 2017, could do a little reading up and perform proficiently in
most programming jobs today. Probably more proficiently, because (1) they've
seen it all before, including the bugs and pitfalls, and (2) I'd argue
programming is much easier today than it has ever been.

EDIT: I _swear_ I did not read the top comment in the (newly) linked HN
discussion of that article before I wrote my response. I agree completely.

~~~
sanderjd
I really wonder about this. It seems true to me that it's easier now and
anyone who could do it back when it was harder could do it now.

But then, I wonder if it just seems that way to me because I'm a product of
"now", which makes me more proficient in how we do things now, which makes it
seem easier. It's possible neither is easier, that they are just different,
and for everyone, the other seems harder than the one they already know.

I'm not trying to argue that this is the case: I could see it being either way
and I sincerely wonder which way it is.

------
maxton
That line from the C/C++ "fix" is an atrocity; `low`, `mid`, and `high` should
never have been declared as signed integers in the first place, since array
indices are never negative. It's unfortunate that in Java there is no other
option than to use signed ints.

~~~
SAI_Peregrinus
For C99, the correct type for an array index is `size_t`. `unsigned int` isn't
guaranteed to be big enough, while `size_t` is guaranteed to be large enough to
index any object on the target architecture.

~~~
pishpash
Here's some code you can compile that shows the problem:

    
    
      #include <stdio.h>
      #include <stdint.h>
      
      int main(void) {
        /* %zu is the C99 conversion specifier for size_t */
        printf("size_t bytes: %zu\n", sizeof(size_t));
      
        size_t high = SIZE_MAX;
        size_t low = high - 1;
        size_t mid_correct = low + (high - low) / 2;
        size_t mid_incorrect = (low + high) / 2;
      
        printf("low: %zu\n", low);
        printf("high: %zu\n", high);
        printf("low+high: %zu\n", low + high);
        printf("(low+high)/2 -- incorrect: %zu\n", mid_incorrect);
        printf("low+(high-low)/2 -- correct: %zu\n", mid_correct);
        return 0;
      }
    

On my 64-bit machine, I get:

    
    
      size_t bytes: 8
      low: 18446744073709551614
      high: 18446744073709551615
      low+high: 18446744073709551613
      (low+high)/2 -- incorrect: 9223372036854775806
      low+(high-low)/2 -- correct: 18446744073709551614

~~~
SAI_Peregrinus
This is a great illustration of the actual bug.

------
CamperBob2
Programmer: "The bug is in the choice of data type. Use unsigned ints to index
arrays."

Computer scientist: "The bug is in the language. Signed integer overflow
behavior should have been defined in such a way as to guarantee correct
functionality in cases such as this."

Me: "Use int64s for this sort of thing. It's still broken, but I'll be retired
or dead before anyone notices."

Engineer: "The bug is in the documentation. The program is correct but should
have been specified for use with element counts no greater than INT_MAX / 2."

Mathematician: "A solution exists."

Manager: "Ship it."

~~~
kazinator
Mathematician: "I conjecture that a solution exists."

------
e12e
I'm surprised about the suggested solution for java:

    
    
      int mid = (low + high) >>> 1;
    

I suppose things like this are why the ">>>" (unsigned shift) operator exists -
but it's a bit odd when the value it works on is considered signed by the
language, and implemented as two's-complement signed in memory.

What's interesting to me is that this allows the sum to overflow; as far as I
can tell it would fail with the ">>" (signed shift) operator - just like the
original code fails when simply dividing by 2.

Guess it shows Java's "systems language" roots - one might expect there to be a
way to be alerted to overflow when working with signed integers, but the
solution here is to use a special operator to "fix" the problem.

Maybe it's just me, but it's a solution that would feel more at home in
assembler than I personally think it does in Java.
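Concretely, here is a small sketch of what ">>>" buys you (the index values are hypothetical):

```java
class UnsignedShiftDemo {
    public static void main(String[] args) {
        int low = 1_500_000_000, high = 1_600_000_000;
        int sum = low + high;          // overflows: wraps to -1194967296
        System.out.println(sum / 2);   // -597483648: signed division of the garbage
        System.out.println(sum >> 1);  // -597483648: signed shift, same garbage
        // The wrapped bit pattern, read as unsigned, is still low + high,
        // so the unsigned shift recovers the true midpoint:
        System.out.println(sum >>> 1); // 1550000000
    }
}
```

Note this only works because the true sum fits in 32 unsigned bits, which requires both indices to be non-negative.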

~~~
bradleyjg
I agree it would be very surprising to see that line in a java codebase. >>>
seems more like an answer to a java trivia question than something you'd come
across on a regular basis.

~~~
e12e
Thinking a bit more about this, I think what feels off about it to me is the
invisible "type gymnastics" - it's something that might be a bad idea, but
would feel more natural, in assembly (where a byte is whatever you decide it
represents at any given moment). The individual numbers are signed ints; the
sum overflows, is treated as an "unsigned two's-complement binary", and is
shifted down to a signed int...

It feels like a subtle subversion of Java's admittedly strange type system for
numbers (a mix of raw integer types and boxed numbers is never going to be
pretty...).

------
pishpash
So what about this? It's from the latest glibc.

[https://fossies.org/dox/glibc-2.25/stdlib-bsearch_8h_source...](https://fossies.org/dox/glibc-2.25/stdlib-bsearch_8h_source.html)

~~~
justin66
The problematic line is:

    __idx = (__l + __u) / 2;

__idx, __l, and __u are all the same type, and the addition of __l and __u
could overflow, resulting in a nonsense value being assigned to __idx.

~~~
pishpash
Seems like it's still an open ticket:

[https://sourceware.org/bugzilla/show_bug.cgi?id=2753](https://sourceware.org/bugzilla/show_bug.cgi?id=2753)

------
tzs
> int mid = low + ((high - low) / 2);

That may fix things in the case of searching a sorted array, but binary search
can be used more generally than that. I think that fix might not work for some
of the more general applications of binary search.

For instance, suppose f(n) is an increasing function from the signed integers
to the signed integers, with f(a) < 0 and f(b) > 0, and you want to find an n
in (a,b), if such n exists, such that f(n) = 0. Binary search on [a, b] is a
reasonable approach.

If a < 0 and b > 0, then that mid computation could overflow on the
subtraction.
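One fix that tolerates endpoints of either sign (a sketch of one option, not something the article proposes) is to do the averaging in 64 bits:

```java
class SignedRangeMid {
    // The sum of two ints always fits in a long, and the average
    // lies between low and high, so the cast back to int is safe.
    static int mid(int low, int high) {
        return (int) (((long) low + high) / 2);
    }

    public static void main(String[] args) {
        // Here high - low overflows, so low + (high - low) / 2 breaks,
        // but the widened average is fine:
        System.out.println(mid(-1_600_000_000, 1_500_000_000)); // -50000000
    }
}
```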

~~~
rntz
I'm tempted to suggest

    
    
        (low / 2) + (high / 2)
    

but then I have to think about rounding error. Ugh. I guess you could write a
bunch of nested `if`s to handle the parity errors, assuming you know how your
language rounds when dividing negative numbers. (I wouldn't be surprised if C
leaves that "implementation-defined".) If you're really searching an arbitrary
range, maybe just use bigints. Then at least you can stop worrying about
overflow altogether.

~~~
pishpash
That won't get you the middle; these are integer divisions.

~~~
rntz
Yes, that's why I said: "but then I have to think about rounding error. Ugh."
Because integer division rounds. (Or truncates, if you prefer.)

~~~
kazinator
Rounding error can be taken care of:

[https://news.ycombinator.com/item?id=14908117](https://news.ycombinator.com/item?id=14908117)
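For reference, one overflow-free formula that also handles the rounding (a sketch, and not necessarily the approach in the linked comment) uses the carry/carry-less decomposition a + b = 2(a & b) + (a ^ b):

```java
class FloorMid {
    // (a & b) holds the carry bits of a + b, and (a ^ b) the carry-less
    // sum. Since 2(a & b) is even, floor((a + b) / 2) equals
    // (a & b) + ((a ^ b) >> 1). Java's >> is an arithmetic shift, which
    // floors, and no intermediate value can overflow.
    static int mid(int a, int b) {
        return (a & b) + ((a ^ b) >> 1);
    }

    public static void main(String[] args) {
        System.out.println(mid(-3, 4));                        // 0 == floor(1/2)
        System.out.println(mid(1_500_000_000, 1_600_000_000)); // 1550000000
    }
}
```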

------
jtchang
I don't think that is broken; the algorithm is sound conceptually. I was
almost hoping for some theory by which all mergesorts and binary searches
could be improved.

~~~
TotallyHuman
Agreed. This was a very underwhelming article.

------
slolean13
(high - low) works well if we consider this from a pointer point of view!

------
pagade
Won't this line have similar issue?

    
    
      return -(low + 1);  // key not found.

------
KirinDave
A more modern version of this is: nearly all discussion of sorting that claims
computers can't sort in better than O(n log n) is wrong, and has been for some
time.

Edit: I'm quite surprised I'm being modded down, given the magnitude of what
I'm implying. I suspect someone doesn't understand what I'm saying.

~~~
rntz
The O(n log n) bound only holds for comparison-based sorts. It's not a matter
of time; it's an assumption/precondition of the proof.

~~~
KirinDave
Right, but a generalization of discrimination-based sorts didn't exist until
earlier this decade. So it was a reasonable statement to misinterpret, or to
relegate to "say you have a 30m-character string you want to sort as quickly
as possible" interview questions.

I certainly didn't realize how general discrimination-based sorts were. Many
people I've talked to outside of the Haskell community don't know about them
at all. I'm still working through the papers; the base of that chain is like
86 pages long!

This search fallacy has been corrected. I'm mentioning one that hasn't.

~~~
rntz
Huh, is this the stuff you're talking about:

    
    
        http://www.diku.dk/hjemmesider/ansatte/henglein/papers/henglein2011a.pdf
        http://www.diku.dk/hjemmesider/ansatte/henglein/papers/henglein2011c.pdf
        https://www.youtube.com/watch?v=sz9ZlZIRDAg
        https://hackage.haskell.org/package/discrimination
    

I didn't know about this until I read your reply and googled; I thought you
were just obliquely referring to radix sort or something like that.

~~~
KirinDave
Yes that is it. And it's really a variation on radix sort as well. They're all
in the same general family.
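For anyone who hasn't run into a non-comparison sort before, counting sort is the simplest member of that family (a sketch in Java, illustrative only - discrimination generalizes the same idea to richer key types). It runs in O(n + k) for n keys in [0, k), so the Ω(n log n) comparison lower bound simply doesn't apply:

```java
class CountingSortDemo {
    // Sorts n keys in the range [0, k) in O(n + k) time without
    // ever comparing two keys to each other.
    static int[] countingSort(int[] a, int k) {
        int[] count = new int[k];
        for (int x : a) count[x]++;      // histogram the keys
        int[] out = new int[a.length];
        int i = 0;
        for (int v = 0; v < k; v++)      // emit each key count[v] times
            while (count[v]-- > 0) out[i++] = v;
        return out;
    }

    public static void main(String[] args) {
        for (int x : countingSort(new int[]{3, 1, 4, 1, 5}, 10))
            System.out.print(x + " ");   // 1 1 3 4 5
    }
}
```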

I can't believe I got down voted for this thread. What does it take?

I guess I should just post it.

~~~
justin66
Given how few of those posting tonight appeared to understand the import of
what Joshua Bloch wrote, I wouldn't take it personally. I'm slightly shocked
by it, but I wouldn't take it personally...

~~~
KirinDave
I did post it, I even said provocatively, "O(n) general sort" in the title.
Sadly the point gods were not kind.

------
sriram_iyengar
Classic programming defect!

------
ScottBurson
The bug isn't in your algorithm; the bug is in your language, which doesn't
provide arbitrary-precision integer arithmetic by default.

~~~
thebooktocome
Big integer by default is a terrible idea. Just look at Python 3.

~~~
__s
Python 2 was bigint by default. Not even going to deconstruct your argument
past that.

~~~
thebooktocome
I didn't say it wasn't...?

~~~
__s
I didn't say you didn't say it wasn't

------
jmull
* ...for arrays with element counts over a billion, if you choose to use signed 32-bit integers as array indexes.

Not that it isn't a potential problem, but it's a narrow issue.

