I suppose the online list of hacks at the 1st reference in the OP [1] makes more sense, but I've got a fondness for the book "Hacker's Delight", which I've owned and browsed for many years now [2]. And checking on that, I see there was a 2nd edition issued 10 years ago. Hmm, might need to pick that up.
There are quite a lot of bit hacks on the web, but Hacker's Delight is where it took off for me. It was a massive eye-opener: what was possible, and even better, doing it the old-fashioned way.
The second book is very, very heavy on division and as such it's not really as much 'fun' as the original, however I'd still recommend it!
I shared the original Hacker's Delight with a work colleague who had a mathematical bent, and he contacted Henry Warren with a possible addition for the forthcoming second edition: Morton curves (https://en.wikipedia.org/wiki/Z-order_curve), which are extremely simple to calculate, much more so than the space-filling curve given in Hacker's Delight. Despite the author's interested response to us, it did not go into the second edition. My colleague was very disappointed. Me too. Morton curves are just interleave-the-bits, so they would have slotted in so well.
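For the curious, the 2D interleave is only a handful of mask-and-shift steps; a minimal sketch (the masks are the standard "binary magic numbers", the function names are mine):

    #include <stdint.h>

    // Spread the low 16 bits of v so there is a 0 between every bit.
    static uint32_t spread_bits(uint32_t v) {
        v &= 0x0000FFFF;
        v = (v | (v << 8)) & 0x00FF00FF;
        v = (v | (v << 4)) & 0x0F0F0F0F;
        v = (v | (v << 2)) & 0x33333333;
        v = (v | (v << 1)) & 0x55555555;
        return v;
    }

    // x occupies the even bit positions, y the odd ones.
    static uint32_t morton2d(uint16_t x, uint16_t y) {
        return spread_bits(x) | (spread_bits(y) << 1);
    }

For example, morton2d(3, 1) interleaves 0b11 and 0b01 into 0b0111 = 7.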
Sadly there won't be a third edition, as the author died. I found out when I contacted him to let him know his website was down (again!). His family let me know he had gone.
> The second book is very, very heavy on division and as such it's not really as much 'fun' as the original, however I'd still recommend it!
What do you mean by this? Did they merely add a ton of information on division? Or did they take out the 'fun' bits from the original and replace them with division stuff?
The last time I reached for Hacker's Delight for "productive" use it was specifically for this chapter (I was aware of space-filling curves from some years in the games industry but had never implemented one), and I remember being disappointed at the lack of depth and options compared to the rest of the book.
I have the second edition; back around 2010 I made a bit of a splurge on books. Super handy when you need it, which in my case has only been one time. I don't spend much time on low-level code nowadays.
Funnily enough just spent a day tracking down a weird problem in an embedded system - some event timestamps were getting corrupted in a weird way. The pattern wasn't obvious until, in desperation, I dumped them in hex and found the second MSB was toggling between 0 and 1 (whereas it should have been part of a count sequence). That told me exactly where to look - where the count was re-assembled from four bytes - and I found this upstream (paraphrased):-
Having spent a lot of time digging into C code and doing microcontroller programming professionally, I have learned a certain disdain for the practice of using minimal whitespace in expressions, and the related practice of minimising LOC using C's very dense expression syntax (unary dec/inc operators, the fact that assignment is itself an expression, etc).
The dense expressions syntax is pretty much a holdover from the time C was invented by Dennis Ritchie for the purpose of porting Unix; when it had to be typed on a painfully slow electromechanical teletype. This is also why most Unix commands are 2-4 characters long.
But it serves very little meaningful purpose today. It's important to remember that you might be able to turn 3 lines of code into 1, but it'll still be the same amount of machine code. And it will probably be harder to read, harder to change, and harder to reason about. And it can be a lot easier to miss some edge case or even a typo like in this case. I have seen such a silly number of off-by-one errors in C code due to some subtle misuse of unary increment/decrement that I stopped using those operators altogether. They don't improve your software in any meaningful way.
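To make the point concrete, a contrived sketch (hypothetical copy routines): both versions compile to the same machine code with any optimizing compiler, but one is much easier to scan for mistakes.

    #include <stddef.h>

    // Dense: classic idiomatic C, everything happens in one expression.
    void copy_dense(char *dst, const char *src, size_t n) {
        while (n--)
            *dst++ = *src++;
    }

    // Expanded: each side effect on its own line.
    void copy_expanded(char *dst, const char *src, size_t n) {
        while (n > 0) {
            *dst = *src;
            dst = dst + 1;
            src = src + 1;
            n = n - 1;
        }
    }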
I've done the >> 0 thing to make it very clear what's going on, but I hadn't considered the * 8 construction. That's actually easier to comprehend at a glance because the numbers become less "magic". (8 and 16 are obvious to me, but 24 always has me second-guessing myself somewhere in the back of my mind)
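Something like this is what I mean, assuming little-endian byte order for the example (the function name is made up):

    #include <stdint.h>

    // Reassemble a 32-bit count from four bytes, writing every shift out
    // explicitly (including the redundant "<< (0 * 8)") so the pattern is
    // obvious at a glance.
    static uint32_t u32_from_bytes(const uint8_t b[4]) {
        return ((uint32_t)b[0] << (0 * 8)) |
               ((uint32_t)b[1] << (1 * 8)) |
               ((uint32_t)b[2] << (2 * 8)) |
               ((uint32_t)b[3] << (3 * 8));
    }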
Somewhat oddly even in C++, where bool is a distinct type, g++ doesn't have any warning for implicit bool to int conversion, although clang's clang-tidy "linter" tool does.
Nope, code was clean with -Wall on arm-9 compiler (i.e. gcc).
Interesting thought now you say that (no warning) - only found it because of seeing the pattern (apropos the article) and from that having a good idea what was causing it (knowing the int was assembled a byte at a time).
If I had a criticism of modern compilers, it's the blizzard of uninteresting warnings ("strncmp takes const char star, did you really mean to pass it unsigned char star") that make people want to not use -Wall.
While it's based on C, a large chunk of the content is simply a great introduction to number representations in other bases such as binary and hexadecimal etc. Fundamental knowledge and very well explained, and a few insights I've not seen before.
Thanks, I wanted to post it after it was finished and revised. I also wanted to add more sections, but somebody was faster than I expected (thanks for posting it, tod!). I suppose getting to HN will bring some valuable feedback in the first place.
I will probably add some more content in the coming weeks.
Over the years I've found that "top performer" is not purely about skill; often it is *just* about motivation, the desire to move stuff ahead, to get things done, to improve the current state of affairs.
Like, rarely stuff is so hard that it requires some outstanding skills.
Like you say, it really depends on what your job is. For me, those bit algos were my jam for a few years. I was moving data between 4 different CPU types, across 3 different network transports depending on the project, and between 4 different OS's. You need to know your bits when moving data between different arch types and across different transport layers. These days, most of that same sort of thing I did then? I would drop it into a JSON text stream and call it a day.
I think it's fundamental because understanding it goes hand-in-hand with understanding how computers actually work. When I interview applicants, I usually ask one or two questions about bitwise operations for this reason -- an engineer who knows how machines work at a very low level is valuable even if their work is not low level.
I think if you work with UUIDs, it's worth learning about bitwise stuff. So backend developers in general. It's not a day-to-day skill, but it is one I've used at just about every job in the last decade to elegantly and efficiently solve hard problems regarding idempotency and randomness.
A UUID is a bunch of bit fields packed together: the version/variant, and depending on the version, fields for the datetime, MAC address, node ID, etc. You'd use bit ops to both construct and extract the fields.
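For illustration, a sketch of both directions (field offsets per RFC 4122; the helper names are mine, not from any particular library):

    #include <stdint.h>
    #include <stdio.h>

    // The version is the high nibble of byte 6 of the 16 big-endian bytes.
    static unsigned uuid_version(const uint8_t uuid[16]) {
        return (uuid[6] >> 4) & 0x0F;
    }

    // Stamp a random 128-bit value as a version-4, variant-1 UUID.
    static void uuid_set_v4(uint8_t uuid[16]) {
        uuid[6] = (uuid[6] & 0x0F) | 0x40;  // version = 4
        uuid[8] = (uuid[8] & 0x3F) | 0x80;  // variant = 10xxxxxx
    }

    int main(void) {
        uint8_t u[16] = {0};
        uuid_set_v4(u);
        printf("version %u\n", uuid_version(u));  // prints "version 4"
        return 0;
    }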
Or if you work directly with hardware, as I do. I use bitwise stuff constantly. Or if you're doing certain obscure kinds of highly optimized mathematical operations.
Bitwise NOT (~) gives you a bijection between negative and nonnegative integers in two’s complement representation (which negation doesn’t).
This is for example exploited by the return value of Java’s binarySearch() function, which returns the (nonnegative) index of the search key when found, or else the (negative) bitwise NOT of the index where the key would have to be inserted [0]. In other words, it combines a nonnegative int value plus a flag into one int, while making the flag easily testable (< 0) and the value easily flippable (~). Strangely, the API doc doesn’t mention bitwise NOT, but instead expresses it as numeric negation minus one (which is equivalent, as TFA explains).
[0] As opposed to C’s bsearch(), which only returns a position when the key was found.
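A sketch in C of the same encoding trick (not Java's actual implementation):

    #include <stddef.h>

    // Returns the index of key if found, otherwise the bitwise NOT of the
    // index where key would have to be inserted. "Not found" is simply r < 0,
    // and the insertion point is recovered with ~r.
    int binary_search(const int *a, size_t n, int key) {
        size_t lo = 0, hi = n;
        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;
            if (a[mid] < key)      lo = mid + 1;
            else if (a[mid] > key) hi = mid;
            else                   return (int)mid;
        }
        return ~(int)lo;  // e.g. ~0 == -1 when key belongs at the front
    }

With a sorted array {10, 20, 30}, searching for 20 returns 1, and searching for 25 returns ~2 == -3.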
Great article. One potential improvement is not to call the Two's complement MSB a 'sign bit'. It implies that it only stores a sign, which is how 'signed magnitude' works, with all its downsides like having two representations for 0 etc. The beauty of Two's complement is that everything works exactly like the unsigned representation, with the exception of the MSB. In Two's complement the MSB contributes either 0 or -2^(N-1). So for 8-bit signed numbers, the most significant bit contributes either 0 or -128. Everything else, subtraction, hardware adders etc, works the same. Signed numbers are basically numbers with a flexible number line, and the first bit represents where this number line starts. I personally remember the MSB for signed integers as the 'origin bit'.
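Concretely, for 8-bit two's complement: 0b10000001 is -128 + 1 = -127, while 0b01111111 is 0 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 127.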
Oh definitely. Some of this goes back to my 6502 assembly days when there was no hardware multiply instruction. So to multiply by 40, for example, I would shift left 3 bits, store the result, shift left 2 more bits, and add the stored result.
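In C, the same trick looks something like this (a sketch, not the original 6502 code):

    #include <stdint.h>
    #include <stdio.h>

    // Multiply by 40 with shifts and an add: 40*x = 32*x + 8*x = (x << 5) + (x << 3).
    static uint32_t mul40(uint32_t x) {
        uint32_t t = x << 3;     // x * 8, stored
        return (t << 2) + t;     // (x * 8) * 4 + (x * 8) = x * 40
    }

    int main(void) {
        printf("%u\n", mul40(7));   // prints 280
        return 0;
    }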
Similarly, a fast divisibility test (we’ll assume we’re dividing n by some odd prime p):
1. Shift the bits of n right so that there is a 1 in the last position (since p is odd, stripping factors of 2 from n doesn't change whether p divides it).
2. If n = p then p∣n, if n < p then p∤n, otherwise continue.
3. Subtract p and go back to step 1.
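In C, the three steps above might look like this (a sketch, assuming n > 0 and p an odd prime):

    #include <stdbool.h>

    // True if the odd prime p divides n, using only shifts, comparisons and subtraction.
    bool divides(unsigned n, unsigned p) {
        for (;;) {
            while ((n & 1) == 0) n >>= 1;  // step 1: strip trailing zeros
            if (n == p) return true;       // step 2: n == p  ->  p | n
            if (n < p)  return false;      //         n <  p  ->  p does not divide n
            n -= p;                        // step 3: subtract p and repeat
        }
    }

For example, divides(35, 7) walks 35 -> 28 -> 14 -> 7 and returns true.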
(One of my ADD habits during meetings is to find the prime factors of phone numbers or anything else very long. I do something similar with the numbers in decimal, but for this, I’ll subtract multiples of the potential divisor to get 0s at the end of the number in decimal. I remember co-workers puzzling over a piece of paper with my notes I left behind in a conference room, trying to figure out what the numbers represented and how the process worked.)
Depends on the situation. The compiler is smart, but in a way it's also dumb. It's very good at recognizing certain patterns and optimizing them, but not all patterns are recognized, and thus not all optimizations are applied, let alone consistently applied.
One thing to consider is that the compiler can't simply replace a division by just a right shift for signed variables (it will round towards -inf for negative numbers), so even today there's a tiny bit of benefit of explicitly using shifts if you know that the number can't be negative (and the compiler can't prove it) or you don't care about the rounding (https://godbolt.org/z/vTzYYxqz9).
Of course that tiny bit of extra work is usually negligible, but might explain why the idiom has stuck around longer than you might otherwise expect.
I only have a little grey in my beard so far, but like all optimizations it heavily depends on context. My broad rule of thumb is that if you only care what the code does, you should let the compiler figure it out. If you care how the compiler accomplishes that goal, you should specify that rather than hoping things don't silently break in the future. This is a fairly common thing in crypto and systems code.
Nowadays you just write 'divout = divin/8192' and assume the compiler is going to do the right thing (and very possibly do something deeper than "divin>>13" at the assembler level).
Makes me wonder who pays attention to this sort of thing these days :)
I do! When optimizing code that must run obscenely fast, I look at the assembly the compiler spits out to make sure that it can't be improved on, speed-wise.
> I’ve found significant code in c/c++ where bitwise operations are done for things like division etc by shifting a certain way.
Oh, yes. I used to do that sort of thing frequently because the time savings was significant enough. As you say, though, compilers have improved a great deal since then, so it's not generally needed anymore.
If stupid bit tricks like that aren't necessary, they shouldn't be used. They do bring a readability/mental load cost with them.
If x is signed and happens to be negative, x/16 will round one way, and x>>4 another. x/16 can still be implemented more performantly than a general division with unknown (or even known but non power of two) denominator, but it will be marginally slower than a plain shift. It depends on which semantics you desire.
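A quick illustration of the rounding difference (assuming an arithmetic right shift, which mainstream compilers use for signed types, though it's implementation-defined in C):

    #include <stdio.h>

    int main(void) {
        int x = -20;
        printf("%d\n", x / 16);   // -1: division truncates toward zero
        printf("%d\n", x >> 4);   // -2: arithmetic shift rounds toward -infinity
        return 0;
    }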
Heads up: the gray_code function does not return the results shown in the table of correspondence above it.
I suspect that this is because both the table and the code used were sourced from Wikipedia, and they correspond to different Gray codes. The table is for the BRGC, but the implementation isn't.
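For reference, the binary reflected Gray code that the table presumably shows is the usual one-XOR encoder; a minimal sketch:

    // Binary reflected Gray code: each increment of x flips exactly one bit.
    unsigned gray_encode(unsigned x) {
        return x ^ (x >> 1);
    }

For example, gray_encode(5) is 0b101 ^ 0b010 = 0b111, matching the familiar 000, 001, 011, 010, 110, 111, 101, 100 sequence.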
I see several people sharing the Stanford bithacks link, so I'll throw in a slightly less well-known resource that I found particularly instructive. Basically, a collection of the lemmata we can prove about fixed-length sequences of bits and the fun algorithms that can be built atop those results.
For many years I have recommended "Code" by Charles Petzold, published by Microsoft. It's a great read for the bathroom, a road trip, or over a spaghetti dinner. It covers, bottom up, how computers work, starting out with - well, what the heck are numbers - and works up from there.
The bitwise XOR operator (^) is a binary operator that compares the corresponding bits of two operands and returns a new value where each bit is set to 1 if the corresponding bits of the operand are different, and 0 if they are the same.
You may also think about XOR in the following way:
Any “1” in B flips (inverts) the corresponding bit in A.
It’s like a vectorized NOT operation for single bits. Also works the other way (xor is commutative, A^B === B^A). This way of thinking is helpful when you see expressions like:
X ^ (1 << 3)
X ^ 8
X ^ 0b1000
Which means “X with bit 3 flipped”. (3 means 4th from the right, as in …3210).
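A quick, compilable sketch of that reading:

    #include <stdio.h>

    int main(void) {
        unsigned x    = 0xA5;        // 1010 0101, bit 3 is 0
        unsigned mask = 1u << 3;     // 0000 1000
        printf("0x%02X\n", x ^ mask);           // 0xAD: bit 3 flipped to 1
        printf("0x%02X\n", (x ^ mask) ^ mask);  // 0xA5: XOR with the same mask flips it back
        return 0;
    }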
I'm trying to find some good resources or some book that covers modern C.
For everything else, from Python to Golang, Rust, and everything in between, there are tons of quality books, and even their own documentation is pretty decent in some cases. But not so with C.
If you read this and have something, please share. Just C (not much interested in C++)
Just do a search on HN itself for "Modern C" and you will find a lot of discussions and recommendations.
That said, you can get started with any C book (the K&R ANSI C book is a good one to start with) since it is not a "huge" language and you can get to "newer" features incrementally.
I totally intuitively understand bitwise operators in C because I grew up with assembler and all this was my second nature.
I’m afraid this article is unnecessarily complicated. The things that are eventually illuminated are really trivially simple, it’s just that they are explained in a complicated way.
I thought this was going to be ridiculous but it's actually really good. It's clear enough that you could extend it down to showing how logic gates work.
The only thing missing is a little endian discussion; it assumes big endian (network byte order), and that may be confusing for all those x86 users out there.
You are the one that brought up CPU architectures. Except for the now-aging S390/z, which will probably last in a few installations forever, and the 5 people that pay to run POWER workstations, the world is little endian.
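For anyone who hasn't poked at this, a minimal sketch that shows the byte order of whatever host it runs on:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint32_t x = 0x01020304;
        const uint8_t *p = (const uint8_t *)&x;
        // Little-endian machines print "04 03 02 01"; big-endian ones print "01 02 03 04".
        printf("%02X %02X %02X %02X\n", p[0], p[1], p[2], p[3]);
        return 0;
    }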
FTA: “Based on the above picture, another important observation is that to represent the number 1078 in binary, we need at least ten memory cells (bits) for it (look at the most significant power of 2 used, which is 10)”
As the picture shows, we need 11 bits, with powers ranging from zero to ten, both included (under the implicit assumption that we want to represent all integers between 0 and 1078. If all we want to represent is 1078, we can do with one or, pedantically, zero bits)
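Concretely: 1078 = 1024 + 32 + 16 + 4 + 2 = 0b10000110110, which occupies the eleven bit positions 2^0 through 2^10.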
Like bluesnowmonkey says, the concept you're looking for is called popcount, for population count. It's also called the Hamming weight. Wikipedia has 5 different example implementations under that latter name. Many C/C++ compilers have it available as an intrinsic as well, like __builtin_popcount or __popcnt. It's also std::popcount in C++20.
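A minimal sketch using the GCC/Clang intrinsic mentioned above:

    #include <stdio.h>

    int main(void) {
        unsigned x = 0xF0F0F0F0u;
        // GCC/Clang built-in; MSVC has __popcnt, C++20 has std::popcount.
        printf("%d\n", __builtin_popcount(x));   // prints 16
        return 0;
    }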
Wow! Just 5 instructions. This is almost 4 times faster than my previous best solution:
    // 19 instructions, does not use intrinsics
    int countbits(unsigned x) {
        unsigned n;
        // Subtract floor(v/2), floor(v/4) and floor(v/8) from each nibble v,
        // leaving every nibble holding its own population count.
        n = (x >> 1) & 0x77777777;
        x = x - n;
        n = (n >> 1) & 0x77777777;
        x = x - n;
        n = (n >> 1) & 0x77777777;
        x = x - n;
        // Sum adjacent nibbles into bytes, then sum the four bytes via the
        // multiply and read the total out of the top byte.
        x = (x + (x >> 4)) & 0x0F0F0F0F;
        x = x * 0x01010101;
        return x >> 24;
    }
The graphics are impressive. Specifically, I've never seen someone fold a "bit surface" in half like that. I've also never seen a ladder view. Assuming Andrei Ciobanu himself posted this, how did you come up with those views? Bored on a Friday night?
I am not sure if this is all that original; I am sure others have seen it before. For me it clicked when I was redoing the graphics.
What's more interesting is that the symmetry comes from numbers in general and the way we represent them. As an exercise, if you use base 3 and plot more numbers, you will also see hidden patterns.
Articles like this should make us value more highly the love and effort it takes to train people, because there is no shortage of "love" and effort (and money) for training AI models.
I don’t understand why bit masking and manipulation is so popular when it makes the code impossible to read. I like using Ruby and its pack and unpack methods.
Well one reason is that bit operations are significantly faster than other methods, it’s one reason for example, why compilers will turn a divide by a constant into bit manipulation.
Using a scripting language’s string packing is hundreds of times slower than doing a couple of bit operations, and if there’s a memory allocation involved, it can be thousands of times slower. If you’re curious about this, I recommend doing some profiling to find out how fast your favorite algorithms are when using C bitwise operators compared to Ruby string packing.
FWIW, having the code be readable is mostly a familiarity problem that you can resolve by practicing more bitwise operations, if you want. If you spend your days in Ruby, then there might not be strong reasons to, but if you’re curious and want to improve, you might have fun playing in C or C++. I spend my days mostly in CUDA, and using bit manipulation is par for the course, failure to use the best tricks can result in much lower compute throughput and much higher power consumption.
C23 finally introduces binary literals, e.g. you can now write "0b11110000" instead of "0xf0" for situations where binary is more readable (but with a bit of 'training', hexadecimal works just as well).
Also, bitwise operations are essentially "SIMD for bits" and important down at the machine code level, so it makes a lot of sense to have that same functionality in higher-level languages (unfortunately not all common bitwise instructions - like rotations - made it into high-level languages).
It takes some getting used to but the operations by themselves are readable, it is just that what you do with them can be complex. It is like arithmetic. People usually don't have a problem with addition, subtraction, multiplication and division, but it doesn't mean they won't have a hard time dealing with complex equations.
Packing and unpacking are just some of the things you can do with bitwise operations. They are a common problem, and they are often tricky, so having an API for that makes a lot of sense. But if what you need to do is not packing and unpacking, I find it harder to understand the unpack -> string manipulation -> pack workflow than bitwise operations, it also tends to be more verbose and slower.
Do Ruby's pack and unpack routines let you access a field smaller than a byte? What about a field that is split across a byte boundary?
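For what it's worth, pulling out a field that straddles a byte boundary is a couple of shifts and a mask in C; a sketch with a hypothetical 5-bit field starting at bit 6:

    #include <stdint.h>
    #include <stdio.h>

    // Extract a 5-bit field starting at bit 6, i.e. spanning the boundary
    // between byte 0 and byte 1 (bytes taken as little-endian here).
    static unsigned get_field(const uint8_t *bytes) {
        unsigned v = (unsigned)bytes[0] | ((unsigned)bytes[1] << 8);
        return (v >> 6) & 0x1F;   // shift the field down, mask to 5 bits
    }

    int main(void) {
        uint8_t data[2] = { 0xC0, 0x05 };  // bits 6..10 hold 0b10111 = 23
        printf("%u\n", get_field(data));   // prints 23
        return 0;
    }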
I agree that bit masking and manipulation is hard to read and can be misused by ego-driven programmers. But the "popularity" is because bit masking and manipulation are both fundamental to computer science and an absolute necessity in certain situations.
I just do not understand why this document is so large - there just is not enough complexity in bitwise operators to warrant this giant multi chapter document. If anything I feel it will be confusing because it’s so much text being used to describe so little actual logic.
This long article is a great example of why I believe GPT has a strong future.
The article spends several pages to explain hexadecimal, bases, etc. Probably some fraction of the audience already knows that. In that case, the article should automatically adapt and skip that section. Some people will react negatively to the mathematical formulation, preferring the intuitive section that follows instead.
But this is a static blog post, so it doesn't know how to adapt. Imagine the same post, but with some toggles/sliders (you can come up with even more sophisticated mechanisms): I indicate my level of competency, and the article re-writes itself to match my knowledge. Skip the boring math, show a lot of examples, etc. Or the opposite. The point is that it adapts to you. GPT (or some future variant) is really good at doing this.
GPT isn't needed. Just hyperlinks, like the ones at the top of the document, paired with the ability to expand sections either in place or in an adjacent view. State the prerequisite knowledge, then give a way to skip past the explanations, or have the explanations hidden by default. Permit further expansion up to the limit of what the author cares to provide; more can be added later.
[1] https://graphics.stanford.edu/~seander/bithacks.html
[2] https://en.wikipedia.org/wiki/Hacker's_Delight