
RISC-V Bitmanip Extension [pdf] - ncmncm
https://raw.githubusercontent.com/riscv/riscv-bitmanip/master/bitmanip-0.90.pdf
======
CalChris
The bext and bdep instructions are the same as x86_64 BMI2 PEXT and PDEP.
These are essentially gather/scatter instructions for bits and generally have
a latency of three cycles on x86_64. I've never used them. I wonder whether
Intel C uses them for anything but intrinsics.
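
For anyone who hasn't met these operations, here is a minimal software sketch of the gather and scatter semantics described above. The function names `bext_ref` and `bdep_ref` are made up for illustration, and the bit-at-a-time loop is only a reference model, not how hardware or any library implements them.

```c
#include <stdint.h>

/* Gather (PEXT-style): collect the bits of src selected by mask
 * and pack them into the low bits of the result.
 * bext_ref is a hypothetical name used only for this sketch. */
uint64_t bext_ref(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    int out = 0;                      /* next free bit position in result */
    for (int i = 0; i < 64; i++) {
        if ((mask >> i) & 1) {
            if ((src >> i) & 1)
                result |= (uint64_t)1 << out;
            out++;
        }
    }
    return result;
}

/* Scatter (PDEP-style): take the low bits of src and deposit them
 * at the positions selected by mask.
 * bdep_ref is a hypothetical name used only for this sketch. */
uint64_t bdep_ref(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    int in = 0;                       /* next bit of src to consume */
    for (int i = 0; i < 64; i++) {
        if ((mask >> i) & 1) {
            if ((src >> in) & 1)
                result |= (uint64_t)1 << i;
            in++;
        }
    }
    return result;
}
```

With these definitions, `bext_ref(0xABCD, 0x0F0F)` gathers to `0xBD`, and `bdep_ref(0xBD, 0x0F0F)` scatters back to `0x0B0D`.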

But ARMv8 has the more prosaic Bitfield Operations, which fit in 32-bit
instructions and are single cycle on A75. I use these a lot.
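
As a rough illustration of what the ARMv8 bitfield operations do, here is a C sketch modeled on the unsigned extract (UBFX) and insert (BFI) forms. The helper names `ubfx32` and `bfi32` are hypothetical, and the sketch assumes `lsb + width <= 32` with `width >= 1`.

```c
#include <stdint.h>

/* Mask with the low `width` bits set (width in 1..32). */
static uint32_t low_mask32(unsigned width)
{
    return width >= 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

/* Extract `width` bits of x starting at bit `lsb`, zero-extended.
 * Hypothetical helper modeled on UBFX. */
uint32_t ubfx32(uint32_t x, unsigned lsb, unsigned width)
{
    return (x >> lsb) & low_mask32(width);
}

/* Insert the low `width` bits of src into dst at bit `lsb`,
 * leaving the other bits of dst unchanged.
 * Hypothetical helper modeled on BFI. */
uint32_t bfi32(uint32_t dst, uint32_t src, unsigned lsb, unsigned width)
{
    uint32_t field = low_mask32(width) << lsb;
    return (dst & ~field) | ((src << lsb) & field);
}
```

For example, `ubfx32(0xABCD1234, 8, 8)` yields `0x12`, and `bfi32(0xFFFFFFFF, 0, 8, 8)` yields `0xFFFF00FF`.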

~~~
ncmncm
This kind of instruction, in general, is not often emitted by compilers under
normal circumstances, but can be essential to get good performance in e.g.
encoding or decoding H.265 or AV1, or decompressing the best compression
formats.

What makes these formats hard to do fast with regular code is that they may
have been designed to work through dedicated hardware; or they just really
pack in the bits.

------
ncmncm
Poster speaking...

The original title on this was "RISC-V Bitmanip Extension proposal is an
education" because, besides a list of proposed instructions and precise
definitions, the new draft explains in some detail how they are useful for
real programs, and in many cases shows how they can be implemented with
minimal additional gate count.

It's an education because many of these operations will be unfamiliar to most
readers, and may suggest to them new ways to solve problems. Some of the
instructions are also implemented in recent x86 cores, sometimes in an AVXx
extension, but Intel's assembly language mnemonics offer little hint at how
powerful they are, and its reference documentation sheds little more light.

So, previous drafts looked like a huge bolus of largely doubtful instructions
that seemed likely to bloat the RISC-V spec but be little used. Now we can see that
most need only an extra gate here and there on such existing subunits as
barrel shifters and multipliers, yet open up whole new vistas of operations
they could be used for. The more powerful ones turn a whole family of O(N)
operations (N the word size or number of 1s or 0s) to O(1). In turn, they are
building blocks for fundamental signal analysis and encoding algorithms that
may then run up to N times faster.
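
To make the O(N) versus O(1) point concrete, here is a small sketch using population count as the example (my choice of example, not one taken from the draft). The first loop costs one iteration per bit of the word; the second costs one iteration per set bit. A hardware count instruction replaces either loop with a single operation. The function names are made up for illustration.

```c
#include <stdint.h>

/* O(word size): test every bit of the word in turn. */
unsigned popcount_naive(uint64_t x)
{
    unsigned n = 0;
    for (int i = 0; i < 64; i++)
        n += (unsigned)((x >> i) & 1);
    return n;
}

/* O(number of 1s): each step clears the lowest set bit. */
unsigned popcount_kernighan(uint64_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;   /* drop the lowest set bit */
        n++;
    }
    return n;
}
```

With GCC or Clang, `__builtin_popcountll(x)` computes the same result and can compile to a single instruction on targets that have one.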

The document is maintained at [https://github.com/riscv/riscv-bitmanip/](https://github.com/riscv/riscv-bitmanip/).

~~~
paulrpotts
Interesting stuff, thanks. I am a big fan of Henry S. Warren's book _Hacker's
Delight, 2nd ed._ and have used his algorithms several times - for example,
when I needed a 64-bit divide operation on a 32-bit microcontroller with
single-precision floating-point only.

------
JohnJamesRambo
I'm out of the loop, can someone tell me why Hacker News loves RISC so much?

~~~
AceJohnny2
RISC-V is an _open_ and flexible CPU ISA, championed by the guy who invented
RISC [1]. I recommend this talk [2] by him on the history of CPU
architectures, complexity, and why the time is ripe for RISC-V when x86 and
ARM dominate the industry (with some MIPS and PowerPC to round things off).

Note that the flexibility aspect means that it's relatively easy to _design_ a
chip that uses RISC-V (for which the OP article proposes an extension). Designing
chips is still out of reach of most hackers, but it does offer hope of a more
accessible landscape of general-purpose CPUs, DSPs, GPUs, MCUs, etc.

[1]
[https://en.wikipedia.org/wiki/David_Patterson_(computer_scie...](https://en.wikipedia.org/wiki/David_Patterson_\(computer_scientist\))

[2] [https://californiaconsultants.org/wp-content/uploads/2018/04...](https://californiaconsultants.org/wp-content/uploads/2018/04/CNSV-1806-Patterson.pdf)

~~~
JohnJamesRambo
Thank you!

------
childintime
Is it an option on RISC-V to transparently emulate opcodes on cores that
implement only a subset of an extension? This extension in particular seems to
be a good candidate.

~~~
mrob
RISC-V is only an instruction set architecture, not a microarchitecture. It
describes how things work from a software point of view, but it does not
require any specific implementation, and it does not require anything to be
fast. You can implement instructions with microcode or software if you want
to.

~~~
childintime
> You can implement instructions with microcode or software if you want to.

Note that I'm specifically not talking about microcode, nor emulation in
software.

So I gather the answer is "no", and the instructions not directly implemented
_must_ be implemented in microcode instead of trapping them to a library.

~~~
garmaine
No, your original assumption is correct. It is explicitly stated (though I
can't remember where) that it is acceptable for a compliant RISC-V core to
trap and emulate instructions that it doesn't natively support.

------
dooglius
Nice to see that they are using the industry-standard CRC32 polynomial, in
contrast to Intel/x86, who made up their own.
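
For reference, here is a minimal bitwise sketch that computes a reflected CRC-32 with either polynomial under discussion. As far as I know, the x86 SSE4.2 `crc32` instruction computes CRC-32C (the Castagnoli polynomial) rather than the IEEE 802.3 polynomial used by Ethernet and zlib. The function name `crc32_bitwise` and the macro names are made up for this sketch, and a real implementation would use tables or hardware instructions instead of a per-bit loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected (LSB-first) forms of the two polynomials. */
#define CRC32_IEEE_POLY 0xEDB88320u  /* IEEE 802.3, reflected 0x04C11DB7 */
#define CRC32C_POLY     0x82F63B78u  /* Castagnoli, reflected 0x1EDC6F41 */

/* Bit-at-a-time reflected CRC-32 with initial value and final XOR
 * of 0xFFFFFFFF. crc32_bitwise is a hypothetical name for this sketch. */
uint32_t crc32_bitwise(const void *data, size_t len, uint32_t poly)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++) {
            /* If the low bit is set, shift and XOR in the polynomial. */
            crc = (crc >> 1) ^ (poly & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

On the standard check string `"123456789"` (9 bytes), this gives `0xCBF43926` with `CRC32_IEEE_POLY` and `0xE3069283` with `CRC32C_POLY`.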

~~~
childintime
Some industry-standard CRCs are not recommended, as there are much better ones
(see research done by Koopman). So perhaps Intel is justified?

~~~
mrob
>research done by Koopman

Good overview of the research:

[https://users.ece.cmu.edu/~koopman/roses/dsn04/koopman04_crc...](https://users.ece.cmu.edu/~koopman/roses/dsn04/koopman04_crc_poly_embedded.pdf)

Optimal CRC polynomials:

[https://users.ece.cmu.edu/~koopman/crc/index.html](https://users.ece.cmu.edu/~koopman/crc/index.html)

Note that the best CRC depends on the length of the message it's protecting.

------
inetsee
I noticed an odd behavior with this link; it doesn't open the pdf in my
browser (Firefox), and it doesn't ask if I want to save the pdf file. It just
downloads the pdf file without any notification. I don't recall ever seeing an
invisible download like this before.

~~~
klodolph
You can do this with the header:

    Content-Disposition: attachment

That's the _correct_ way to do it. What's actually happening is that the wrong
content type is being returned:

    Content-Type: application/octet-stream

This is the lazy way to do it.

------
amluto
dang, etc: the “is an education” part of the title is inappropriate, and [pdf]
should be added.

~~~
ncmncm
I see [pdf] there already.

How is "is an education" inappropriate?

~~~
wyldfire
Presumably the objection is that it's not the original title. It's also odd
wording; I think "is interesting to read" might be clearer.

~~~
CalChris
Clickbaity. It's kind of like, but not quite at the level of, _You won't
believe what the RISC-V bit manipulation extension does!_ Why not just cite
the document's name? Bore me. Please.

In fairness to the doc, it's a _rationale_. It should educate.

