
Reading bits in far too many ways - makmanalp
https://fgiesen.wordpress.com/2018/02/19/reading-bits-in-far-too-many-ways-part-1/
======
userbinator
_The first major decision to make is whether fields are packed MSB-first or
LSB-first_

Unfortunately in many cases like image and video codecs, that decision has
been made for you already --- and not in the direction that's most efficient
for a CPU (but of course is pretty much immaterial for hardware.) I suspect
"It’s easy to fool yourself into thinking that one variant is better than
another because it looks prettier" was what drove those decisions.

The big-endian/MSB-first order requires extra "bit movement" if you are
maintaining a bit buffer in a register, since the bits of interest are always
going to reside in the topmost bits, but you often need them in the lowest
bits to use them. With little-endian/LSB-first, the bits of interest are
already in the lowest position, and naturally "fall off the end" as you shift
right and replenish into the upper bits. IMHO another reason to prefer little-
endian: it's very logically consistent.

~~~
ghusbands
Your reasoning about which end has interesting bits has no basis in how CPUs
actually do any of their work. The byte-ordering does not fundamentally affect
how operations are implemented.

We humans often find little-endian confusing because we write numbers big-
endian. If you have the number 258 (0x102) in memory as a 32-bit little-endian
value (which most computers use), it is hex 02 01 00 00, binary 00000010
00000001 00000000 00000000. If you bit-shift that string one place to the right
without care, you end up with 00000001 00000000 10000000 00000000, hex
01 00 80 00, which is 8,388,609.

To fix it, write your digits in little-endian order too, so the first number
is 01000000 10000000 00000000 00000000. Then the shift operations match
expectations.
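
The arithmetic above can be checked with a small C sketch. The helper names
(load_le32, naive_shift_right1, shift_demo) are made up for illustration; the
"naive" shift treats the four bytes as one string of digits written left to
right, which is the careless reading described above.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit value from 4 bytes stored in little-endian order. */
static uint32_t load_le32(const uint8_t b[4])
{
    return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
           (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* Shift the bytes right by one bit as if they were one long binary string
 * written left to right: the low bit of each byte carries into the top bit
 * of the NEXT byte in memory. */
static void naive_shift_right1(uint8_t b[4])
{
    uint8_t carry = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t next = (uint8_t)(b[i] & 1);
        b[i] = (uint8_t)((b[i] >> 1) | (carry << 7));
        carry = next;
    }
}

/* Reproduces the numbers from the comment. */
static void shift_demo(void)
{
    uint8_t b[4] = {0x02, 0x01, 0x00, 0x00};   /* 258, little-endian */
    assert(load_le32(b) == 258);
    naive_shift_right1(b);                      /* bytes: 01 00 80 00 */
    assert(load_le32(b) == 8388609u);           /* not what we wanted */
    assert((UINT32_C(258) >> 1) == 129);        /* the real 258 >> 1 */
}
```

A shift on the integer itself never sees the byte order, so 258 >> 1 is 129 on
any machine; the surprise only shows up when shifting the written-out bytes.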

~~~
userbinator
_Your reasoning about which end has interesting bits has no basis in how CPUs
actually do any of their work. The byte-ordering does not fundamentally affect
how operations are implemented._

Yes it does. Consider an MSB-first ordering, and you want to extract 4 bits to
use as an integer or whatever: in the bit buffer, that would be "aaaaxxxx...",
where "a" are the bits you're interested in. You'll need to copy from the bit
buffer to the "work" register, then shift right in order to put them in the
right place. Furthermore, to "eject" those bits, you need to shift _left_ and
insert from the lower bits, i.e. the bit buffer becomes "xxxx..." in its most
significant bits.

With LSB-first, "...xxxxaaaa", the bits do not need any shifting --- they're
already in the right place.
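
A rough C sketch of the two layouts, assuming a 64-bit bit buffer held in a
register and a field width n between 1 and 63. The function names are
hypothetical and refilling the buffer from memory is left out.

```c
#include <assert.h>
#include <stdint.h>

/* LSB-first: the next field sits in the lowest n bits ("...xxxxaaaa"). */
static uint64_t lsb_peek(uint64_t buf, unsigned n)
{
    assert(n >= 1 && n <= 63);
    return buf & ((UINT64_C(1) << n) - 1);   /* mask only, no shift */
}

static uint64_t lsb_consume(uint64_t buf, unsigned n)
{
    assert(n >= 1 && n <= 63);
    return buf >> n;        /* used bits fall off the bottom */
}

/* MSB-first: the next field sits in the highest n bits ("aaaaxxxx..."). */
static uint64_t msb_peek(uint64_t buf, unsigned n)
{
    assert(n >= 1 && n <= 63);
    return buf >> (64 - n); /* shift down to right-align the field */
}

static uint64_t msb_consume(uint64_t buf, unsigned n)
{
    assert(n >= 1 && n <= 63);
    return buf << n;        /* eject used bits off the top */
}
```

Note that the LSB-first peek still needs a mask to isolate the field, so it is
not entirely free; what it avoids is the extra right shift.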

~~~
ghusbands
Just no. CPUs don't have an endianness on internal registers and operations -
it only applies to memory access (and occasionally some simple conversion
operations).

~~~
BeeOnRope
They are talking about _bit-endianness_ here, not the usual _byte-endianness_,
when they mention MSB-first and LSB-first.

The bit-endianness you choose greatly impacts how your bit-reading loop will be
implemented. In particular, if you want to _use_ some bits, you want them in
the low bits: if you write the value 42 into a 7-bit field, and then later I
give you back (42 << 56) or something, you are going to be very confused: you
expect to get back 42 (i.e., the return value equals 42, or to be pedantic the
7-bit field should be right-aligned in the uint32_t or uint64_t return value).
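
As a minimal sketch of that expectation, here is a toy round trip in C. The
BitWord type and the put_bits/get_bits names are hypothetical, and the whole
"stream" is assumed to fit in a single LSB-first 64-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Toy bit container: one 64-bit word, fields packed LSB-first. */
typedef struct {
    uint64_t bits;   /* first field written occupies the lowest bits */
    unsigned count;  /* number of valid bits currently stored */
} BitWord;

/* Append a width-bit field above the bits already stored. */
static void put_bits(BitWord *w, uint64_t value, unsigned width)
{
    assert(width >= 1 && width <= 63 && w->count + width <= 64);
    assert(value < (UINT64_C(1) << width));
    w->bits |= value << w->count;
    w->count += width;
}

/* Remove the oldest width-bit field and return it right-aligned. */
static uint64_t get_bits(BitWord *w, unsigned width)
{
    assert(width >= 1 && width <= 63 && width <= w->count);
    uint64_t value = w->bits & ((UINT64_C(1) << width) - 1);
    w->bits >>= width;
    w->count -= width;
    return value;
}
```

Writing 42 with put_bits(&w, 42, 7) and reading it with get_bits(&w, 7) gives
back 42 itself, not a copy of it parked somewhere in the upper bits.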

~~~
ghusbands
Oh my. You're right, I entirely misunderstood. How embarrassing. I wish I
could delete my comments.

------
jrochkind1
This article is awesome. I don't work in this kind of thing at all, and I feel
like it's a good exercise for my brain.

------
zmodem
Part 2:
[https://news.ycombinator.com/item?id=16419774](https://news.ycombinator.com/item?id=16419774)

------
cpach
Fun small educational project: If you want to learn how to work with units
smaller than a byte, implement Base64 from scratch. It’s really fun!
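
For anyone who wants a starting point, here is one possible encoder in C, a
sketch rather than a reference implementation. It uses the standard Base64
alphabet with '=' padding; the function name and the accumulator approach are
just one way to do it, and the caller is assumed to provide a large enough
output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode n bytes from src into dst as a NUL-terminated Base64 string.
 * dst must hold at least 4 * ((n + 2) / 3) + 1 chars.
 * Returns the number of characters written, not counting the NUL. */
static size_t base64_encode(const uint8_t *src, size_t n, char *dst)
{
    size_t out = 0;
    uint32_t acc = 0;    /* bit accumulator, newest byte in the low bits */
    unsigned nbits = 0;  /* number of not-yet-emitted bits in acc */

    assert(dst != NULL);
    for (size_t i = 0; i < n; i++) {
        acc = (acc << 8) | src[i];
        nbits += 8;
        while (nbits >= 6) {             /* emit every complete 6-bit group */
            nbits -= 6;
            dst[out++] = B64[(acc >> nbits) & 0x3F];
        }
    }
    if (nbits > 0)                       /* leftover 2 or 4 bits, zero-padded */
        dst[out++] = B64[(acc << (6 - nbits)) & 0x3F];
    while (out % 4 != 0)                 /* pad to a multiple of 4 chars */
        dst[out++] = '=';
    dst[out] = '\0';
    return out;
}
```

The 6-bit groups are taken MSB-first from each byte, which is why the
accumulator shifts new bytes in at the bottom and reads groups from the top.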

~~~
wmu
Huffman coding would be even better: there you have to deal with variable
bitfield widths.
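
To give a flavour of what "variable widths" means, here is a toy decoder in C.
The three-symbol prefix code ('a' = 0, 'b' = 10, 'c' = 11) and the function
name are invented for illustration, and the input is assumed to be packed
LSB-first into one 64-bit word; a real Huffman decoder builds its code from
symbol frequencies and usually uses lookup tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a toy prefix code: 'a' = 0, 'b' = 1,0, 'c' = 1,1 (first code bit
 * is read first, from the lowest bit of `bits`). Writes a NUL-terminated
 * string into out (capacity cap) and returns the number of symbols. */
static size_t decode_toy(uint64_t bits, unsigned nbits, char *out, size_t cap)
{
    size_t n = 0;
    assert(nbits <= 64 && cap >= 1);
    while (nbits > 0 && n + 1 < cap) {
        if ((bits & 1) == 0) {           /* 1-bit code */
            out[n++] = 'a';
            bits >>= 1;
            nbits -= 1;
        } else {                         /* 2-bit code */
            if (nbits < 2)
                break;                   /* truncated input */
            out[n++] = (bits & 2) ? 'c' : 'b';
            bits >>= 2;
            nbits -= 2;
        }
    }
    out[n] = '\0';
    return n;
}
```

The number of bits consumed per symbol is only known after looking at the
bits, which is the part Base64's fixed 6-bit groups never make you handle.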

~~~
zmodem
In fact, that appears to be where the author is coming from. Part 2 mentions
techniques coming from the Kraken Huffman decoder
([http://www.radgametools.com/oodlecompressors.htm](http://www.radgametools.com/oodlecompressors.htm))

------
ahartmetz
Almost everything on this guy's blog is amazing if you are interested in
hardware, optimization and that kind of thing.

