True. A gperf alike hash table (with special key indices) works only ok with compile-time known strings. But a simple CMP-alike perfect hash table with double indirection would be worthwhile to test against.
I called it Hanov after http://stevehanov.ca/blog/index.php?id=119
Problem is there that you have to calc a hash for each string, which is only fast with __builtin_crc/_mm_crc32_u64
A hash alone isn't going to help much; he has both a 1-char string in the set and to top it off, most of the other strings begin with (a distinct from the 1-char string one) the same string. There are some tricks (overpopulate the hash table with short string) but hashing is tricky here.
Source: 11 years of string matching work. :-) We built a multiple hash table string matcher in the early days of Hyperscan (it's gone now).
I called it Hanov after http://stevehanov.ca/blog/index.php?id=119 Problem is there that you have to calc a hash for each string, which is only fast with __builtin_crc/_mm_crc32_u64