The function in the article is faster for > 8-bit values when there are more than eight ones in the value being checked. Also, I suppose theirs is better than K&R because it always executes in the same number of cycles.
Thanks for sharing that, love the simplicity behind it which is more intuitive than the example I had gone with.
I think the reason why I chose that particular one was to show how thinking about a problem from an entirely different angle can yield great (and rewarding) results.
Ideally, you'd want your compiler to use the hardware population count instructions though. So this might be one of the places where a simpler algorithm might win out because the compiler can recognize it.
In fact that is one of the follow-ups I want to have out of this. Have architecture-specific optimisations (popcnt on x86) either manually through something like the `asm` crate, or by having a simpler algorithm that the compiler can optimise away.
#include <stdio.h>
main( argc, argv ) int argc; char *argv[];
{
}The function in the article is faster for > 8-bit values when there are more than eight ones in the value being checked. Also, I suppose theirs is better than K&R because it always executes in the same number of cycles.