It should be pointed out that compilers can identify some implementations but miss many common idiomatic ones that may be superior if you do not have a popcount() instruction. I am often surprised by the algorithms they can't see. You will need to check the code generation. This is true for many of the bit-twiddling instructions available on modern CPUs.
As for why someone might care, most bit-twiddling intrinsics cannot be used in "constexpr" functions in C++ (though this is starting to change). Finding a C++ code implementation of bit-twiddling that is reliably converted to the underlying instruction allows you to use the same constexpr implementation in both compile-time and run-time contexts with correct code generation.
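To make that concrete, here is a minimal sketch of a single constexpr implementation used in both contexts. The name `popcount_portable` is made up for illustration, and the loop (clear the lowest set bit until nothing is left) is just one candidate pattern. Whether a given compiler and optimization level turns it into a single instruction is something you have to verify in the generated assembly.

```cpp
#include <cstdint>

// Hypothetical helper: portable constexpr popcount (needs C++14 or later
// because of the loop inside a constexpr function).
constexpr int popcount_portable(std::uint64_t x) {
    int n = 0;
    while (x != 0) {
        x &= x - 1;  // clear the lowest set bit
        ++n;         // one iteration per set bit
    }
    return n;
}

// Compile-time use: evaluated entirely by the compiler.
static_assert(popcount_portable(0) == 0, "no bits set");
static_assert(popcount_portable(0xFFu) == 8, "eight bits set");

// Run-time use: same function, ordinary call. Whether this becomes a
// hardware popcount is up to the optimizer and the target flags.
inline int count_flags(std::uint64_t flags) {
    return popcount_portable(flags);
}
```

The point of the design is that there is one definition to maintain, rather than a built-in for run time plus a separate hand-written version for constant expressions.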
Does this imply that it might be better to use the "worse" or "naïve" implementation, if you know that the compiler will identify it and do something architecture-appropriate?
Yes. Surprisingly, many naive implementations you might assume would be identified aren't. You are far more likely to write a naive implementation that is not identified than one that is. I've spent a few weekends testing many different implementations to better understand the limitations of the compiler at identifying bit-twiddling instructions and it still doesn't work nearly as well as you might hope.
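For anyone who wants to repeat that kind of experiment, a sketch of two candidates to compare might look like this. The function names are invented for the example; the first is the bit-at-a-time loop most people would write first, the second is the classic branch-free parallel-sum version. Both compute the same result, so the only open question is what each compiler emits for them, and that has to be checked per compiler, version, and flags.

```cpp
#include <cstdint>

// Candidate 1 (hypothetical name): test one bit per iteration.
constexpr int popcount_shift(std::uint64_t x) {
    int n = 0;
    for (; x != 0; x >>= 1) {
        n += static_cast<int>(x & 1u);  // add the low bit, then shift it out
    }
    return n;
}

// Candidate 2 (hypothetical name): parallel bit summation (SWAR).
constexpr int popcount_swar(std::uint64_t x) {
    x = x - ((x >> 1) & 0x5555555555555555ULL);                            // 2-bit sums
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);  // 4-bit sums
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;                            // 8-bit sums
    return static_cast<int>((x * 0x0101010101010101ULL) >> 56);            // add the bytes
}

// Both candidates must agree before their codegen is worth comparing.
static_assert(popcount_shift(0xF0F0u) == popcount_swar(0xF0F0u), "mismatch");
```

Compile each function at your usual optimization level, with and without a target flag that enables the instruction, and look at the assembly for each.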
Also, not every compiler will identify a specific bit-twiddling implementation. Clang seems to identify more of these idiomatic implementations than GCC in my experience, but that is anecdotal. The set of code patterns they can identify is not documented anywhere that I've seen, though it must be in the compiler source code somewhere.
I strongly disagree with the sibling comments. It is better and more clear to call the compiler built-in than do some magic incantation that gets the compiler to optimize it and hope that future compilers don’t regress.
This does not address the real-world case where built-ins will not compile in some valid code contexts. It is not "better" to do something that literally doesn't work. No one is doing this for fun; it is a workaround for a well-known limitation in current compilers.
Because it is an unfortunate limitation on built-ins that reduces their utility, fixing this is on the roadmap for most popular compilers.
On top of that, the popcount built-in will always compile even if there is no instruction for it, as it can generate fallback code. It actually does so for a naive invocation of gcc or clang for x64, as the original x64 ISA did not include the popcnt instruction. You need to pass some arch that supports the instruction, or -mpopcnt to explicitly enable it. Handling all those builtins properly is tedious.
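A minimal sketch of what that looks like in practice, assuming GCC or Clang (the wrapper name `popcount_builtin` is made up; `__builtin_popcountll` is the real built-in for `unsigned long long`):

```cpp
#include <cstdint>

// Hypothetical wrapper around the GCC/Clang built-in. This compiles for any
// target; the compiler chooses between the hardware instruction and
// fallback code depending on the enabled target features.
inline int popcount_builtin(std::uint64_t x) {
    return __builtin_popcountll(static_cast<unsigned long long>(x));
}

// Example invocations to compare (the file name is a placeholder):
//   g++ -O2 -S file.cpp            // baseline x86-64: expect fallback code
//   g++ -O2 -mpopcnt -S file.cpp   // popcnt explicitly enabled
```

The result is the same either way; only the generated code differs, which is why the flags are easy to forget.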
Assuming they are correct that the 'worse on paper' is the only one detected (I don't know if this is true or not) that would be the logical conclusion, yes.