
Dyalog '18: Sub-Nanosecond Searches Using Vector Instructions [video] - jodrellblank
https://www.youtube.com/watch?v=paxIkKBzqBU
======
jodrellblank
It's about how Dyalog APL uses SSE/AVX instructions to speed up searching, in
cases like:

\- given a short vector of digits, test each character in a long string to see
if it's a member of digits.
\- given a short vector of bucket-boundaries (0, 25, 50, 75, 100), test each
number in a long vector to see which interval it would fall into.
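
To make the two cases concrete, here is a rough scalar sketch in Python. It only shows what is being computed, not how Dyalog computes it; the function names `member_mask` and `interval_index` are made up for illustration, and the interval result is assumed to be "how many boundaries are less than or equal to the value".

```python
from bisect import bisect_right

def member_mask(text, digits):
    """For each character of text: 1 if it occurs in digits, else 0."""
    lookup = set(digits)  # hypothetical helper structure, not Dyalog's
    return [int(ch in lookup) for ch in text]

def interval_index(values, boundaries):
    """For each value: the number of boundaries <= value.

    Assumes boundaries is sorted in ascending order.
    """
    return [bisect_right(boundaries, v) for v in values]

print(member_mask("a1b22c", "0123456789"))
# [0, 1, 0, 1, 1, 0]
print(interval_index([10, 25, 60, 100], [0, 25, 50, 75, 100]))
# [1, 2, 3, 5]
```

A value below the first boundary gets 0 under this convention, and a value equal to a boundary is counted as belonging to the interval that starts there.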

They pack multiple small integers into one vector register and use it to do 16
simultaneous binary searches, write code that avoids branching and branch
misprediction, use the crc32 CPU instruction for hashing, and apply other
tricks, to average under 1ns per "item in vector" membership test.
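
As a sketch of the "avoid branching" idea only: a binary search can be written so that it always runs the same number of steps and turns each comparison into arithmetic instead of an if/else. The version below is my own illustration in Python, not code from the talk. The name `branchless_search` and the padding with infinity are assumptions, and Python itself gets no speed benefit from this; the point is that every step is a compare plus an add, which is the shape that can be applied to many lanes of a vector register at once.

```python
def branchless_search(boundaries, x):
    """Return the number of boundaries <= x (boundaries sorted ascending).

    The loop count depends only on len(boundaries), never on x, and the
    comparison result is added as 0 or 1 rather than used to pick a branch.
    """
    # Pad the table to length 2**k - 1 so every search takes exactly k steps.
    n = 1
    while n - 1 < len(boundaries):
        n *= 2
    padded = list(boundaries) + [float("inf")] * (n - 1 - len(boundaries))

    pos = 0
    step = n // 2
    while step:
        # int(True) == 1, int(False) == 0: move right by step, or stay put.
        pos += step * int(padded[pos + step - 1] <= x)
        step //= 2
    return pos

bounds = [0, 25, 50, 75, 100]
print([branchless_search(bounds, v) for v in [-1, 10, 25, 60, 100]])
# [0, 1, 2, 3, 5]
```

With five boundaries the table is padded to seven entries, so each lookup is exactly three compare-and-add steps regardless of the value being searched for.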

