You've traded hashing for a memory access per byte. I did something similar in the 'optimized-trie' program for Rust: https://github.com/benhoyt/countwords/blob/master/rust/optim...

It ended up being slower. But it is very much impacted by cache effects. So the next step here would indeed be to figure out how to reduce cache misses. But it's non-trivial.
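To make the trade-off concrete, here is a rough sketch of what "a memory access per byte" looks like. This is not the code from the linked program. The names `Trie`, `add`, `get`, and `count_words` are made up for illustration, and the flat 256-entry child table per node is an assumption chosen for simplicity.

```rust
// Hypothetical sketch of a byte-indexed trie word counter.
// Not taken from the linked repository; layout and names are assumptions.
struct Trie {
    // children[node * 256 + byte] holds the child node index; 0 means "absent"
    // (node 0 is the root, which is never anyone's child).
    children: Vec<u32>,
    // counts[node] is how many words ended at that node.
    counts: Vec<u64>,
}

impl Trie {
    fn new() -> Self {
        Trie { children: vec![0; 256], counts: vec![0] }
    }

    fn add(&mut self, word: &[u8]) {
        let mut node = 0usize;
        for &b in word {
            // One table lookup per input byte. Each load depends on the
            // previous one, so a cache miss here stalls the whole walk.
            let slot = node * 256 + b as usize;
            let mut next = self.children[slot] as usize;
            if next == 0 {
                // Allocate a fresh node with an empty child table.
                next = self.counts.len();
                self.children[slot] = next as u32;
                self.children.resize(self.children.len() + 256, 0);
                self.counts.push(0);
            }
            node = next;
        }
        self.counts[node] += 1;
    }

    fn get(&self, word: &[u8]) -> u64 {
        let mut node = 0usize;
        for &b in word {
            let next = self.children[node * 256 + b as usize] as usize;
            if next == 0 {
                return 0;
            }
            node = next;
        }
        self.counts[node]
    }
}

// Lowercase each whitespace-separated word and count it in the trie.
fn count_words(text: &str) -> Trie {
    let mut trie = Trie::new();
    for word in text.split_ascii_whitespace() {
        trie.add(word.to_ascii_lowercase().as_bytes());
    }
    trie
}
```

With this layout every node costs 1 KiB of child pointers (256 entries of 4 bytes), so a large vocabulary spills out of cache quickly. That is the kind of effect I mean. Whether a denser node layout would fix it is something you'd have to measure.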
Like, maybe your idea works. Maybe. I don't know. You'd have to try it. But it's not clear to me that it will. And it doesn't support "is far from what an experienced C programmer would write if performance was paramount" IMO.