Intel's PEXT and PDEP instructions are the #1 innovation in bit manipulation. They're basically bitwise gather and bitwise scatter, some of the most flexible operations I've ever personally used.
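To make the gather / scatter picture concrete, here is a rough software model of the two operations in portable C. The names pext_soft and pdep_soft are made up for this sketch; on real hardware you would reach for the BMI2 instructions themselves, and this loop is only meant to show what they compute, not how fast they do it.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of PEXT (bitwise gather): collect the bits of src that
   sit under set bits of mask and pack them into the low bits of the result.
   pext_soft is a hypothetical name for this sketch. */
static uint64_t pext_soft(uint64_t src, uint64_t mask) {
    uint64_t out = 0;
    uint64_t bit = 1;                       /* next output position */
    while (mask) {
        uint64_t lowest = mask & (0 - mask); /* lowest set bit of mask */
        if (src & lowest) out |= bit;
        bit <<= 1;
        mask &= mask - 1;                   /* clear that mask bit */
    }
    return out;
}

/* Software model of PDEP (bitwise scatter): take the low bits of src and
   deposit them, in order, at the positions of the set bits of mask.
   pdep_soft is a hypothetical name for this sketch. */
static uint64_t pdep_soft(uint64_t src, uint64_t mask) {
    uint64_t out = 0;
    uint64_t bit = 1;                       /* next input position */
    while (mask) {
        uint64_t lowest = mask & (0 - mask);
        if (src & bit) out |= lowest;
        bit <<= 1;
        mask &= mask - 1;
    }
    return out;
}

/* Gather two nibbles out of a word, then scatter them back. */
void pext_pdep_demo(void) {
    uint64_t packed = pext_soft(0xABCD, 0x0F0F);   /* picks nibbles D and B */
    assert(packed == 0xBD);
    assert(pdep_soft(packed, 0x0F0F) == 0x0B0D);   /* back under the mask */
}
```

Scattering what was just gathered gives back exactly the masked bits of the original word, which is why the pair works so well for packing and unpacking bit fields.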
Popcount is... ridiculously standard. Even GPUs offer 64-bit popcount. Bit-reversal is surprisingly useful in my experience as well (ARM and GPUs offer single-cycle bit-reversal). So I'm surprised to hear that RISC-V doesn't have popcount as standard.
Bit-reversal is great because an adder's carry only propagates in one direction, from the least-significant bit toward the most-significant bit. So all of the "least-significant set bit" tricks can be turned into "most-significant set bit" tricks rather easily: bit-reverse, apply the trick, then bit-reverse back. x86 is missing out on bit-reverse, while all other platforms (GPUs, ARM, and Power9) seem to have it.
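As a small illustration of that inversion, here is a sketch in portable C. bitrev64 is a plain software bit-reversal standing in for the single instruction that ARM and GPUs provide, and the function names are my own for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Software stand-in for a hardware bit-reverse: swap adjacent bits, then
   pairs, nibbles, bytes, 16-bit halves, and finally the 32-bit halves. */
static uint64_t bitrev64(uint64_t x) {
    x = ((x >> 1)  & 0x5555555555555555ULL) | ((x & 0x5555555555555555ULL) << 1);
    x = ((x >> 2)  & 0x3333333333333333ULL) | ((x & 0x3333333333333333ULL) << 2);
    x = ((x >> 4)  & 0x0F0F0F0F0F0F0F0FULL) | ((x & 0x0F0F0F0F0F0F0F0FULL) << 4);
    x = ((x >> 8)  & 0x00FF00FF00FF00FFULL) | ((x & 0x00FF00FF00FF00FFULL) << 8);
    x = ((x >> 16) & 0x0000FFFF0000FFFFULL) | ((x & 0x0000FFFF0000FFFFULL) << 16);
    return (x >> 32) | (x << 32);
}

/* Classic carry-based trick: x & -x isolates the least-significant set bit. */
static uint64_t lowest_set_bit(uint64_t x) {
    return x & (0 - x);
}

/* Same trick aimed at the other end: reverse, isolate, reverse back. */
static uint64_t highest_set_bit(uint64_t x) {
    return bitrev64(lowest_set_bit(bitrev64(x)));
}

void bitrev_demo(void) {
    assert(lowest_set_bit(0x50) == 0x10);
    assert(highest_set_bit(0x50) == 0x40);
}
```

With a single-cycle bit-reverse the "most-significant" version costs only two extra instructions over the "least-significant" one.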
I'm also of the opinion that multiply-xor-bitreverse-multiply is a very powerful hashing tool. Multiplication by an odd number is a bijection (mod 2^64), xor with a fixed value is a bijection, and bit-reversal is a bijection, so a multiply-xor-bitreverse-multiply cycle maps every 64-bit input to a unique 'random'-looking output across the full 64-bit space. x86 fans don't get bit-reversal, but bswap can be used to largely the same effect for hashing.
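Here is a minimal sketch of one such cycle in C, together with its inverse to show that nothing is lost. The three constants are arbitrary picks of mine (the only requirement is that the two multipliers are odd), and the names mxrm_hash, mxrm_unhash and inv_odd are invented for this example. I make no claim about the statistical quality of these particular constants.

```c
#include <assert.h>
#include <stdint.h>

/* Arbitrary illustrative constants; the multipliers only need to be odd. */
#define MUL1 0x9E3779B97F4A7C15ULL
#define XORK 0xD6E8FEB86659FD93ULL
#define MUL2 0xBF58476D1CE4E5B9ULL

/* Portable bit-reversal (a single instruction on ARM and GPUs). */
static uint64_t bitrev64(uint64_t x) {
    x = ((x >> 1)  & 0x5555555555555555ULL) | ((x & 0x5555555555555555ULL) << 1);
    x = ((x >> 2)  & 0x3333333333333333ULL) | ((x & 0x3333333333333333ULL) << 2);
    x = ((x >> 4)  & 0x0F0F0F0F0F0F0F0FULL) | ((x & 0x0F0F0F0F0F0F0F0FULL) << 4);
    x = ((x >> 8)  & 0x00FF00FF00FF00FFULL) | ((x & 0x00FF00FF00FF00FFULL) << 8);
    x = ((x >> 16) & 0x0000FFFF0000FFFFULL) | ((x & 0x0000FFFF0000FFFFULL) << 16);
    return (x >> 32) | (x << 32);
}

/* Multiplicative inverse of an odd number mod 2^64 via Newton iteration.
   inv = a is already correct to 3 bits; each step doubles the precision. */
static uint64_t inv_odd(uint64_t a) {
    uint64_t inv = a;
    for (int i = 0; i < 5; i++) inv *= 2 - a * inv;
    return inv;
}

/* One multiply-xor-bitreverse-multiply cycle. */
static uint64_t mxrm_hash(uint64_t x) {
    x *= MUL1;
    x ^= XORK;
    x = bitrev64(x);
    x *= MUL2;
    return x;
}

/* Undo each step in reverse order; this only works because every step
   is a bijection on 64-bit values. */
static uint64_t mxrm_unhash(uint64_t h) {
    h *= inv_odd(MUL2);
    h = bitrev64(h);
    h ^= XORK;
    h *= inv_odd(MUL1);
    return h;
}

void mxrm_demo(void) {
    assert(mxrm_unhash(mxrm_hash(12345)) == 12345);
}
```

The bit-reversal matters because multiplication only pushes information upward, from low bits to high bits. Reversing between the two multiplies lets the second multiply mix the high bits back into what used to be the low end.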
Popcount is ridiculously standard as many US government computer contracts require chips that support that operation. Also popcount is very useful if you are doing cryptanalysis.
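For anyone who hasn't run into it, popcount just counts the set bits in a word, and one common use is Hamming distance between two bit strings. A tiny portable sketch in C follows; the function names are mine, and in practice you would use a compiler builtin or the hardware instruction rather than this loop.

```c
#include <assert.h>
#include <stdint.h>

/* Count set bits by repeatedly clearing the lowest one (Kernighan's loop).
   Runs once per set bit; a hardware popcount does this in one instruction. */
static int popcount64(uint64_t x) {
    int n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hamming distance: the number of bit positions where a and b differ. */
static int hamming_distance(uint64_t a, uint64_t b) {
    return popcount64(a ^ b);
}

void popcount_demo(void) {
    assert(popcount64(0xFF) == 8);
    assert(hamming_distance(0xB, 0xE) == 2);   /* 1011 vs 1110 */
}
```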