

Using invariants in Binary Search code - RiderOfGiraffes
http://www.solipsys.co.uk/new/BinarySearchSummary.html?HN

======
willvarfar
I think that searching all lg(n) items even if found in the first comparison
is a performance killer. Binary search has poor cache locality of reference.

Normally you'd shrug off performance concerns, but the point of a binary
search is to be fast.

Having said that, its like C's sort vs std::qsort - you get a big speed-up
from the language mechanics and maybe that can swallow it all?

~~~
RiderOfGiraffes

      > I think that searching all lg(n) items
      > even if found in the first comparison is
      > a performance killer.
    

I hear what you say, and that's an important consideration when translation
this to the real world. You can get the same short-cut performance by adding a
single line:

    
    
        int find(int a[], int size, int val) {
            if (size<=0) return -1;
            int lower = 0, upper=size;
            while (upper-lower>1) {
                int m = lower+(upper-lower)/2;
              
         ->     if (val==a[m]) return m;       <- Inserted line
    
                if (val<a[m]) upper = m;
                else          lower = m;
            }
            return (val==a[lower]) ? lower : -1;
        }
    
      > Binary search has poor cache locality of reference.
    

Absolutely, and for that reason there are many variants that can be devised,
and then tuned for different cache considerations. I remember writing a
version that measured the system cache performance every hour or so and
adapted its parameters accordingly.

------
RiderOfGiraffes
I've been asked to write something more complete and more substantial on this
topic, so I thought I'd make an earlier short piece available to see what
people pick up on, and hence help direct my focus.

Think of this as the first step of "Release Early ..."

