I think that searching all lg(n) items even if found in the first comparison is a performance killer. Binary search has poor cache locality of reference.
Normally you'd shrug off performance concerns, but the point of a binary search is to be fast.
Having said that, it's like C's qsort vs C++'s std::sort - you get a big speed-up from the language mechanics and maybe that can swallow it all?
> I think that searching all lg(n) items
> even if found in the first comparison is
> a performance killer.
I hear what you say, and that's an important consideration when translating this to the real world. You can get the same short-cut performance by adding a single line:
int find(int a[], int size, int val) {
    if (size <= 0) return -1;
    int lower = 0, upper = size;
    while (upper - lower > 1) {
        int m = lower + (upper - lower) / 2;
        if (val == a[m]) return m;      /* <-- the inserted early-exit line */
        if (val < a[m]) upper = m;
        else lower = m;
    }
    return (val == a[lower]) ? lower : -1;
}
> Binary search has poor cache locality of reference.
Absolutely, and for that reason there are many variants that can be devised, and then tuned for different cache considerations. I remember writing a version that measured the system cache performance every hour or so and adapted its parameters accordingly.
I agree that an early exit would be better, but it's hardly a "killer".
Firstly, assuming a uniform distribution over the items searched for, about 50% of them will still require a full search, and about 75% will exit at most one iteration early. 100% of searches for missing items will require a full search.
Secondly, the last couple of items are likely to be cache hits anyway, since adjacent array entries are in adjacent memory.
I've been asked to write something more complete and more substantial on this topic, so I thought I'd make an earlier short piece available to see what people pick up on, and hence help direct my focus.
Think of this as the first step of "Release Early ..."