
End-of-buffer checks in decompressors - jsnell
https://fgiesen.wordpress.com/2016/01/02/end-of-buffer-checks-in-decompressors/
======
derf_
_...if after swapping buffers around, our input cursor hits the mark again, we
just hit the refill path again to generate more zeroes. (This is why the
copying code uses memmove)._

This misses one more optimization: if you consume bits by shifting them off
the end of your word, then you are shifting in zeros in their place. If your
semantics are that all reads past the end of the buffer return zeros, then
once you have hit that end, you can just set bitcount to LOTS_OF_BITS (an
arbitrarily large constant).

That is, you pretend your 32-bit word has a lot more than 32 bits in it. Then
you're extremely unlikely to ever hit the refill code again. For extra points,
you can make the constant something that can easily be represented as an ARM
immediate, like 0x40000000.

We do this in the bitreaders for Theora, Opus, and Daala (thanks to Simon
Hosie for contributing the idea to Theora back in the day).

~~~
userbinator
_you can just set bitcount to LOTS_OF_BITS (an arbitrarily large constant)._

You don't even have to explicitly set it if you just let it go below 0 and
automatically wrap around to FFFFFFFF.

------
userbinator
I really enjoy reading such articles that focus on the details of a topic that
at first would seem trivial and boring, but actually turns out to be far more
interesting.

 _If we start getting too close to the end of our “real” input buffer, we can
just copy the remaining bytes over to a small temp buffer_

You can also just make the input buffer slightly larger that required, so the
"padding" is always present at the end and no copying/extra logic is required
at all.

