Hacker News new | past | comments | ask | show | jobs | submit login
Bit Twiddling Hacks (2005) (stanford.edu)
127 points by e-sushi on July 3, 2016 | hide | past | web | favorite | 24 comments

Just a fair warning as someone who has made extensive use of this page in the past: many of these hacks often end up slower these days on modern CPUs than naive approaches. Especially the no-branch and no-multiply math ops.

As always, be sure to performance test your code if you're getting into obfuscating your algorithms to boost speeds.

That said, there are still plenty of real gems in there. And on that note, here's one of my favorite bit-twiddling hacks:




(these functions are for RGB555, but you can easily extend these to RGB888 ops.)

One place the branchless techniques can come in handy is with SIMD (or simulated SIMD by packing several small ints into longer ones.) A branchless algorithm that's 50% slower can pay for itself if you can process four elements at a time.

Agreed that performance testing is absolutely necessary, preferrably with something that can tell you not only where exactly the processor is spending its time but also whether it's spending its time computing or waiting on memory etc, whether branches are being predicted well...

I would suggest Hacker's Delight too; the Chapter 2, Basics, is freely available at the website [1]. Pretty sure that the linked algorithm is also there (don't have a copy right now so can't check).

[1] http://hackersdelight.org/

These hacks have actually been modeled and generalized into a mathematical construct called Mixed Boolean Arithmetic [1][2]. With a few matrix operations, you can actually generate an arbitrary number of hacky-looking (or obfuscated, as that was the actual intention of the authors of [1]) expressions from any initial, simple expression.

[1] http://link.springer.com/chapter/10.1007/978-3-540-77535-5_5 (and I deeply apologize I can't find the free PDF version, though sci-hub may help here)

[2] http://blog.quarkslab.com/what-theoretical-tools-are-needed-...

>> and I deeply apologize I can't find the free PDF version, though sci-hub may help here)


Use https://sci-hub.ac/10.1007/978-3-540-77535-5_5 in order to prevent browsers complaining about certificate mismatch.

I would suggest a great resource about that:


I have that book. Pretty good. If you want to save some money, buy the DRM free PDF copy on informit.com but wait for one of their 40-50% off sales which happen near major holiday. I don't see one for July 4th though.

My favorite is the elementary x&-x (isolate least significant set bit). This is basically a priority arbiter using the carry chain, and in hardware (HDL), since the carry chain is already heavily optimized, is faster than priority arbiters using basic gates.

Next favorite is "compress selected bits", because I figured out the inverse, which Henry Warren then published as "expand" in the 2nd edition: http://www.hackersdelight.org/hdcodetxt/expand.c.txt

Coincidently Hacker's Delight 2e is available on Safari Books Online for any other fellow subscribers.


Seconding the Hacker's Delight book.

I think you'll probably enjoy it more if you do low-level programming, but there's a lot of cool mind-expanding stuff in there--the treatment of Hilbert Curves is quite fun.

I wrote that algorithm! A year later I found Hacker's delight, a wonderful book that had a much "faster" version of the same algorithm (less ops).

For anyone interested in this kind of things, you should go read that book. It actually is a delight.

Which algorithm are you referring to?

I'd guess the linked one at the anchor #NextBitPermutation.

Anchor has since been removed, so the gp comment will look odd from here on out.


The quick and dirty version is still useful (portable, works for all values) for the case of 1/2 word size unsigned versions of min and max:

    unsigned max(unsigned short a, unsigned short b)
        unsigned q = a - b;
        unsigned r = ((~q) >> 16);
        return r + b;

Does the right shift limits this to certain word sizes?

Site seems down. Here is the google cache:


There are some neat tricks there about interleaving bits from two operands. But what about from 3?

If you are referring to Morton's interleaving, then it's quite simple. You can compute the third operand's code and OR it with the others using the appropriate shift. Something like: (z << 2) | (y << 1) | x

The author's name seems to be recursive...

Thanks, we've updated the link.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact