Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

True. A gperf alike hash table (with special key indices) works only ok with compile-time known strings. But a simple CMP-alike perfect hash table with double indirection would be worthwhile to test against.

I called it Hanov after http://stevehanov.ca/blog/index.php?id=119 Problem is there that you have to calc a hash for each string, which is only fast with __builtin_crc/_mm_crc32_u64



A hash alone isn't going to help much; he has both a 1-char string in the set and to top it off, most of the other strings begin with (a distinct from the 1-char string one) the same string. There are some tricks (overpopulate the hash table with short string) but hashing is tricky here.

Source: 11 years of string matching work. :-) We built a multiple hash table string matcher in the early days of Hyperscan (it's gone now).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: