This specific case (the large switch table replacing a lookup) is such an esoteric corner case that I think it would be hard to convince the Clang developers to spend much more than a trivial amount of effort on fixing it.
The associated posts are interesting, so thanks to the author for that. I think it's more likely to see that clang/gcc do some value-range analysis on lookup tables (as suggested in the first post). I can see a decent amount of general value in that case.
If anything, Clang should probably recognize `switch` statements whose cases all have the form `case N: return M;` (i.e. that mimic a lookup table) and convert them back into lookup tables, after which they can be optimized by a dedicated LUT-aware pass.
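To make the shape concrete, here's a minimal sketch (all values made up for illustration) of the `case N: return M;` pattern and the equivalent lookup-table form such a pass could produce:

```c
/* A switch where every case just returns a constant. */
int classify_switch(unsigned c) {
    switch (c) {
        case 0: return 2;
        case 1: return 5;
        case 2: return 5;
        case 3: return 8;
        default: return 0;
    }
}

/* The equivalent lookup-table form: a bounds check plus one load. */
static const int classify_lut[4] = { 2, 5, 5, 8 };

int classify_table(unsigned c) {
    return c < 4 ? classify_lut[c] : 0;
}
```

The two functions are behaviorally identical; the point is that once the first is rewritten as the second, any later table-aware optimization applies uniformly.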
> (the large switch table replacing a lookup)
It's not that esoteric: maintaining switches is more intuitive and flexible than dealing with rigid lookup-table logic.
Although it is a common case, changing the lookup to use an explicit array is a simple fix for code that relies on this optimization. Even non-consecutive ranges are often straightforward to handle with C array designators (standardized for 25 years now!).
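For example, a sparse, non-consecutive mapping can be written directly with C99 designated initializers; unspecified slots are zero-initialized automatically (the keys and values here are hypothetical):

```c
/* Sparse mapping from a byte-sized key to a value. Entries not
   listed are implicitly zero, so the default case is free. */
static const int cost_table[256] = {
    [3]   = 10,
    [7]   = 25,
    [42]  = 99,
    [200] = 1,
};

int cost(unsigned char op) {
    return cost_table[op];  /* one load, no branches */
}
```

This keeps the table readable even when the keys have gaps, without writing out 256 entries by hand.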
This issue doesn't require large switch tables in order to show up. Even if you have 4 cases and the rest of them are default'ed, Clang 18 optimizes that to a lookup table, while Clang 19 does the (potentially) inefficient compare-and-branch approach: https://godbolt.org/z/Y6njP8j38
This whole investigation started because I was writing some Rust code with a couple of small `match`es, and for some reason they weren't being optimized to a lookup table. I wrote a more minimal reproduction of that issue in C++ and eventually found the Clang regression. Since Rust also uses LLVM, `match`es suffer from the same regression (depending on which Rust version you're using).
So it's not a Clang regression per se, it's an issue with the LLVM core? Clang is just a frontend, and Rust AFAIK does not use it at all. If you run LLVM 18's `opt` on bitcode generated by Clang 19 and then compile it, does it also generate the same bad assembly?
> So it's not a Clang regression per se, it's an issue with the LLVM core?
Yes.
> If you run LLVM 18's `opt` on bytecode generated by Clang 19 and then compile it, does it also generate the same bad assembly?
No. If you pass the LLVM IR bitcode generated by Clang 18 to Clang 19, then the assembly is good.
I called it a 'Clang regression' in the sense that the way in which I discovered and tested this difference in performance was via Clang. So from a typical user's perspective (who doesn't care about the inner workings and distinct components of Clang), this is a 'Clang regression'.