
> "For example, as of this writing, the first two google hits for popcnt benchmark (and 2 out of the top 3 bing hits) claim that Intel’s hardware popcnt instruction is slower than a software implementation that counts the number of bits set in a buffer, via a table lookup using the SSSE3 pshufb instruction."

I believe that my benchmark is one of those top hits. My description is at http://www.dalkescientific.com/writings/diary/archive/2011/1... . I wrote "My answer is that if you have a chip with the POPCNT instruction built-in then use it. I still don't have one of those chips, but I know someone who does, who has given me some numbers."

My own code's logic is "if POPCNT exists then use it, otherwise test one of a few possibilities to find the fastest, since the best choice depends on the hardware."
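A minimal sketch of that selection logic in C, for illustration only. It is not the code from my benchmark: every function name here is hypothetical, the two software candidates (a Kernighan bit-clearing loop and a 64-bit SWAR reduction) are stand-ins for "a few possibilities", and the hardware check assumes GCC or Clang on x86, where `__builtin_cpu_supports("popcnt")` and the `target("popcnt")` attribute are available. On other compilers or architectures the sketch falls straight through to timing the software candidates.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature: count the set bits in a byte buffer. */
typedef size_t (*popcount_fn)(const uint8_t *buf, size_t len);

/* Software candidate 1: clear the lowest set bit until the byte is zero. */
static size_t popcount_kernighan(const uint8_t *buf, size_t len) {
    size_t total = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t b = buf[i];
        while (b) { b &= (uint8_t)(b - 1); total++; }
    }
    return total;
}

/* Software candidate 2: SWAR reduction on 64-bit words, byte loop for the tail. */
static size_t popcount_swar(const uint8_t *buf, size_t len) {
    size_t total = 0, i = 0;
    for (; i + 8 <= len; i += 8) {
        uint64_t x;
        memcpy(&x, buf + i, 8);  /* avoids unaligned-access problems */
        x = x - ((x >> 1) & 0x5555555555555555ULL);
        x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
        x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
        total += (size_t)((x * 0x0101010101010101ULL) >> 56);
    }
    return total + popcount_kernighan(buf + i, len - i);
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_POPCNT_CHECK 1
/* Hardware candidate: the target attribute lets the compiler emit POPCNT
   in this one function without building the whole file with -mpopcnt. */
__attribute__((target("popcnt")))
static size_t popcount_hw(const uint8_t *buf, size_t len) {
    size_t total = 0, i = 0;
    for (; i + 8 <= len; i += 8) {
        uint64_t x;
        memcpy(&x, buf + i, 8);
        total += (size_t)__builtin_popcountll(x);
    }
    for (; i < len; i++) total += (size_t)__builtin_popcount(buf[i]);
    return total;
}
#endif

/* Time a candidate on a sample buffer; returns elapsed clock ticks. */
static double time_candidate(popcount_fn fn, const uint8_t *buf, size_t len) {
    volatile size_t sink = 0;  /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (int rep = 0; rep < 100; rep++) sink += fn(buf, len);
    (void)sink;
    return (double)(clock() - start);
}

/* If POPCNT exists, use it; otherwise time the software candidates
   on this machine and return the fastest one. */
static popcount_fn choose_popcount(void) {
#ifdef HAVE_POPCNT_CHECK
    __builtin_cpu_init();
    if (__builtin_cpu_supports("popcnt")) return popcount_hw;
#endif
    static uint8_t sample[4096];
    memset(sample, 0xA5, sizeof sample);

    popcount_fn candidates[] = { popcount_kernighan, popcount_swar };
    popcount_fn best = candidates[0];
    double best_time = time_candidate(best, sample, sizeof sample);
    for (size_t i = 1; i < sizeof candidates / sizeof candidates[0]; i++) {
        double t = time_candidate(candidates[i], sample, sizeof sample);
        if (t < best_time) { best_time = t; best = candidates[i]; }
    }
    return best;
}
```

A caller would run `choose_popcount()` once at startup and keep the returned function pointer. The timing loop is deliberately crude; a real benchmark would use a larger buffer, more repetitions, and a higher-resolution clock.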

I now have a machine with a hardware POPCNT, and a version with inline assembly. I should rerun the numbers...



