
Linear vs. Binary Search - signa11
https://schani.wordpress.com/2010/04/30/linear-vs-binary-search/
======
franzb
Of particular interest:

Binary Search _eliminates_ Branch Mispredictions
[http://www.pvk.ca/Blog/2012/07/03/binary-search-star-
elimina...](http://www.pvk.ca/Blog/2012/07/03/binary-search-star-eliminates-
star-branch-mispredictions/)

~~~
corysama
*When it compiles to CMOV instructions.

~~~
pkhuong
I had two points in that post. The first, obvious, one is that binary search
can be micro-optimised to combine decent algorithmic properties with an
enviable constant factor.

The second one is that, when linear search outperforms binary search, it does
so by breaking out of the search, which needs conditional branches. Binary
search, despite its bad reputation, has conditional branches that are easily
converted to conditional moves or masks; even a loopy implementation is
amenable to trip count prediction (it's a function of the log of the size of
the array). If we must avoid mispredicted branches, binary search is
intrinsically a better option than linear search.
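A minimal sketch of the branchless binary search pkhuong describes (the function name and lower-bound semantics are my framing, not lifted from the post); the data-dependent select is the part compilers can lower to CMOV:

```c
#include <assert.h>
#include <stddef.h>

/* Branchless lower-bound sketch: each iteration halves the window,
 * and the only data-dependent operation is the update of `base`,
 * which compilers typically emit as a conditional move rather than
 * a branch.  The loop itself runs a fixed ~log2(n) iterations, so
 * its trip count is trivially predictable.  Assumes n >= 1 and an
 * array sorted ascending. */
static size_t lower_bound(const int *a, size_t n, int key)
{
    const int *base = a;
    while (n > 1) {
        size_t half = n / 2;
        /* no early exit, no unpredictable branch: CMOV candidate */
        base = (base[half] < key) ? base + half : base;
        n -= half;
    }
    return (size_t)(base - a) + (*base < key);
}
```

Whether the ternary actually becomes a CMOV depends on the compiler and flags, which is corysama's caveat above.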

~~~
dmpk2k
Wouldn't cache line misses dominate? Linear search benefits from prefetch.

~~~
ltta
Modern architectures really necessitate a good understanding of the
instruction pipeline and caches to squeeze out the best performance.

If I remember correctly, Python's hashtables are initialized with 8 buckets
that are searched linearly, switching to a real hashtable implementation once
they grow past that size.

I have worked with the L4 microkernel where sooo much emphasis was put on
keeping instruction and data footprints as small as possible every time the
kernel is entered in order not to dirty i- and d-caches.

And I have also seen game engine developers do amazing things in this regard.
An interesting development in the gaming space is data-oriented design, which
deviates from OOP for, among other things, performance and parallelization.
See [http://www.slideshare.net/mobile/cellperformance/data-
orient...](http://www.slideshare.net/mobile/cellperformance/data-oriented-
design-and-c) (though I don't agree with the three "lies" mentioned, I do
like the data-centric approach).

------
mhewett
>> ...use linear search if your array is below around 64 elements in size,
binary search if it’s above.

Back around 1980, when I first looked into this, the generally accepted cutoff
was 25 rather than 64. I don't think people were unrolling loops in the tests
back then, so it's hard to tell whether loop unrolling or changes in CPU
architecture is the greater factor in moving the cutoff point from 25 to 64.
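The cutoff the thread is discussing can be folded into a hybrid routine; a sketch, with the threshold as a tunable constant (64 is just the article's measured number; 25 would be the circa-1980 one):

```c
#include <assert.h>
#include <stddef.h>

#define LINEAR_CUTOFF 64  /* the article's measured break-even point */

/* Hybrid search: bisect while the window is large, then finish with
 * a simple linear scan once it shrinks below the cutoff.  Returns
 * the index of the first element >= key (n if none), assuming `a`
 * is sorted ascending. */
static size_t hybrid_search(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;
    while (hi - lo > LINEAR_CUTOFF) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    for (; lo < hi; lo++)          /* small remaining window: scan */
        if (a[lo] >= key)
            break;
    return lo;
}
```

The right constant is machine- and compiler-dependent, which is exactly why the quoted figure has drifted over the decades; it has to be re-measured per target.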

~~~
JoeAltmaier
Something to do with the ratio between instruction cache size and cost of a
jump to the pipeline?

~~~
TrainedMonkey
I suspect memory bus width and latency also plays a role.

------
exDM69
This is a bit of a contrived example because, when just searching for an
integer in a large array, memory bandwidth is going to be the bottleneck.

When dealing with small arrays (like the example of 0-250 integers, i.e. less
than 1 kilobyte), the figures are around 20-120 nanoseconds. In other words,
the difference between the best and the worst case is around one main memory
reference ("cache miss").

While this is somewhat interesting, the lesson to take home should be that
most of the time, smart memory usage is more important than what the CPU
executes. Warming up the caches (e.g. __builtin_prefetch) is as efficient an
optimization as rewriting and unrolling the entire loop and should be done
first. If the caches are warm and the search is still the bottleneck, then
it's time to consider the other optimizations.
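A sketch of the cache-warming idea, assuming GCC/Clang's `__builtin_prefetch` (other compilers get a no-op) and a 64-byte cache line:

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__)
#define PREFETCH(p) __builtin_prefetch((p), 0, 3)  /* read, keep resident */
#else
#define PREFETCH(p) ((void)0)  /* no builtin: fall back to doing nothing */
#endif

/* Touch each cache line of the array once before the timed search,
 * so a later search measures the loop, not cold-cache misses. */
static void warm_cache(const int *a, size_t n)
{
    size_t stride = 64 / sizeof *a;  /* one prefetch per 64-byte line */
    for (size_t i = 0; i < n; i += stride)
        PREFETCH(&a[i]);
}
```

Prefetching is only a hint, not a guarantee, and hardware prefetchers already handle simple sequential scans well — so measure before and after.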

~~~
alphonse23
I also think he should have covered the time it would take to sort the array.
A linear search doesn't need sorted input, but a binary search does. He put
his cutoff at around 64 elements, but if you include the time binary search
needs to sort the array first, the cutoff would be higher than that -- and it
would also introduce a lot of other complexities to consider in a production
system.

~~~
lgeek
I really doubt there's any data size for which sort + binary search is going
to be faster than linear search. The former has higher complexity (so it's
going to be slower for large inputs) and more expensive operations (so it will
be slower for small inputs).

You can also think about it as a sort of reductio ad absurdum: assume that the
array is already sorted and you use a sorting algorithm with O(n) complexity
for already-sorted data. In this ideal case you'd need n (for sorting) + log n
(binary search) predictable operations. For linear search you'd only need n
operations of the same type. In practice you'd need n log n operations for
sorting, I don't think you can avoid unpredictable branches, and instead of
just loading elements you'll sometimes move them around.

------
desdiv
_int middle = (min + max) >> 1;

This will work for n < MAX_INT/2. For larger n, middle must be calculated as
min + ((max - min) >> 1), which is one subtraction per iteration more._

Making middle an unsigned int should make (min + max) >> 1 work for all int
values.

~~~
tromp
That's not guaranteed to work.

If min + max overflows the signed int range, that's already undefined
behavior; and even when the sum is merely negative, right-shifting a negative
signed int is implementation-defined. Instead, you could do the addition in
unsigned arithmetic:

    
    
        int middle = (int)(((unsigned)min + (unsigned)max) >> 1);
    

Still, this relies on the original indices falling in the non-negative range
of signed ints.
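For completeness, the subtraction form quoted from the article sidesteps the question entirely, since the difference of two valid non-negative indices can't overflow; a sketch:

```c
#include <assert.h>
#include <limits.h>

/* Overflow-proof midpoint: min + (max - min) / 2 never leaves the
 * [min, max] range, so no unsigned casts are needed.  Assumes
 * 0 <= min <= max, i.e. both are valid array indices. */
static int midpoint(int min, int max)
{
    assert(0 <= min && min <= max);
    return min + (max - min) / 2;
}
```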

