That is what I immediately thought as well. The problem though is that a perfect hash is still going to be O(key length) for the hash function, which they are trying to avoid.
In theory, if they used a regex it would be using a state machine to do the matching, which should have similar performance to the trie- only O(k) in the worst case. But from what I understand regex libraries don't actually build the state machine, they use backtracking so the performance is no longer O(k).
I'm surprised they couldn't find a performant regex library that already existed that uses state machines. It should have similar performance to the trie. But in reality, the performance is affected more deeply by things like memory access patterns and the performance of specific arithmetic operations, so it's impossible really to speculate.
I wonder how much faster it gets when you shorten the maximum internal key length. If you re-map everything to like "cf-1", "cf-2", .. "cf-;" then you'd only need to hash like 4 characters.
In theory, if they used a regex it would be using a state machine to do the matching, which should have similar performance to the trie- only O(k) in the worst case. But from what I understand regex libraries don't actually build the state machine, they use backtracking so the performance is no longer O(k).
I'm surprised they couldn't find a performant regex library that already existed that uses state machines. It should have similar performance to the trie. But in reality, the performance is affected more deeply by things like memory access patterns and the performance of specific arithmetic operations, so it's impossible really to speculate.