
Implementation Challenge: A count leading zeroes function - ingve
http://foonathan.github.io/blog/2016/02/11/implementation-challenge-2.html
======
jhallenworld
This is not a good way to implement clz. There are nice branch-free methods
for these:

    
    
        /* Count no. trailing zeros */
    
        int ntz(unsigned x)
        {
            /* Set all trailing zeros to ones */
            /* and clear every other bit */
            x = ~x & (x - 1);
            /* Now get population count */
            x = x - ((x >> 1) & 0x55555555);
            x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
            x = (x + (x >> 4)) & 0x0F0F0F0F;
            x = x + (x >> 8);
            x = x + (x >> 16);
            return x & 0x3F;
        }
    
        /* Count leading zeros */
        /* floor(log2(x)) is 31 - nlz(x) for x > 0, assuming 32-bit unsigned */
    
        int nlz(unsigned x)
        {
            /* Set every bit to the right of the highest set bit */
            x |= (x >> 1);
            x |= (x >> 2);
            x |= (x >> 4);
            x |= (x >> 8);
            x |= (x >> 16);
            /* Invert: only the original leading zeros remain set */
            x = ~x;
            /* Population count */
            x = x - ((x >> 1) & 0x55555555);
            x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
            x = (x + (x >> 4)) & 0x0F0F0F0F;
            x = x + (x >> 8);
            x = x + (x >> 16);
            return x & 0x3F;
        }
    

(these are from Hacker's Delight)
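The functions above silently assume that `unsigned` is 32 bits wide. As a rough sketch of the same technique with that assumption spelled out via `uint32_t`, something like the following should work. The names `popcount32`, `nlz32`, `ntz32` and `ilog2_32` are made up for this illustration and are not from the book:

```c
#include <stdint.h>

/* Branch-free population count of a 32-bit word. */
static unsigned popcount32(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums  */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums  */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums  */
    x = x + (x >> 8);
    x = x + (x >> 16);
    return x & 0x3Fu;                                 /* total, 0..32 */
}

/* Number of leading zeros; returns 32 for x == 0. */
static unsigned nlz32(uint32_t x)
{
    /* Smear the highest set bit into every lower position. */
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    /* After inverting, only the original leading zeros are set. */
    return popcount32(~x);
}

/* Number of trailing zeros; returns 32 for x == 0. */
static unsigned ntz32(uint32_t x)
{
    /* Trailing zeros become ones, every other bit is cleared. */
    return popcount32(~x & (x - 1u));
}

/* floor(log2(x)) for x > 0; yields -1 for x == 0. */
static int ilog2_32(uint32_t x)
{
    return 31 - (int)nlz32(x);
}
```

Both counting functions return 32 for a zero input, so unlike `__builtin_clz` they need no special case for zero.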

------
jstimpfle
Is this really a useful application of function overloads? Integer promotion
is tricky as it stands. Overloading, instead of having the user signal intent
explicitly, might increase the likelihood of errors and make them harder to find.

~~~
foonathan
Even if you named the functions differently (clz8/16/32/64), you would still
need to match them to the appropriate __builtin_clz version.

~~~
jstimpfle
The matching itself is not a problem. Arguably it's required for portability,
but it's very straightforward. Obvious implementation, all in one place,
practically no maintenance cost, very unlikely to break.

But think about all the callers of these functions. There are usually more
callers, at disparate places, and context will be less narrow there. It's
important not to conceal what is intended to happen at the call site. If these
functions are overloaded and at some point, due to unexpected integer promotion
or a crazy typedef, the wrong overload is picked, you will have a hard time
debugging.
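To make the "explicit names" option concrete, here is a minimal sketch of what such wrappers might look like. It assumes GCC or Clang (for `__builtin_clz` and `__builtin_clzll`), a C11 compiler, and an `unsigned int` of at least 32 bits. The constants `UINT_BITS` and `ULLONG_BITS` are hypothetical helpers, and returning the full width for a zero input is a choice made here, since the builtins themselves are undefined for 0:

```c
#include <limits.h>
#include <stdint.h>

/* Widths of the types the builtins operate on. */
enum {
    UINT_BITS   = (int)(sizeof(unsigned int) * CHAR_BIT),
    ULLONG_BITS = (int)(sizeof(unsigned long long) * CHAR_BIT)
};

_Static_assert(sizeof(unsigned int) * CHAR_BIT >= 32,
               "sketch assumes unsigned int has at least 32 bits");

/* Each wrapper handles 0 itself, then subtracts the extra high bits
   that the wider builtin operand contributes to the count. */

static int clz8(uint8_t x)
{
    if (x == 0) return 8;
    return __builtin_clz((unsigned int)x) - (UINT_BITS - 8);
}

static int clz16(uint16_t x)
{
    if (x == 0) return 16;
    return __builtin_clz((unsigned int)x) - (UINT_BITS - 16);
}

static int clz32(uint32_t x)
{
    if (x == 0) return 32;
    return __builtin_clz((unsigned int)x) - (UINT_BITS - 32);
}

static int clz64(uint64_t x)
{
    if (x == 0) return 64;
    return __builtin_clzll((unsigned long long)x) - (ULLONG_BITS - 64);
}
```

With this shape the width is visible at the call site: `clz8(v)` always counts within 8 bits, whereas with an overloaded `clz` an expression such as `v << 1` on a `uint8_t` has type `int` and would quietly select the wider version.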

~~~
foonathan
And I've shown one way to do the matching in an automated way.

How you wrap those implementation details is up to you. I agree, overloading
might have problems.

------
chli
Thanks for posting this. I faced that exact same problem this week, but with
the following extra constraints:

- Not everybody uses GCC, especially in the embedded world where CLZ is most
useful

- It must be portable to a Cortex-M0+ that doesn't have a CLZ instruction

Like me, you will probably end up here:

http://embeddedgurus.com/state-space/2014/09/fast-deterministic-and-portable-counting-leading-zeros/
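For a core without a CLZ instruction and a compiler without the GCC builtins, one possible shape is a short binary search finished by a small lookup table. The following is only a sketch in plain C under those assumptions, not necessarily the code from the linked article. The names `clz_nibble` and `clz32_portable` are made up here:

```c
#include <stdint.h>

/* Leading zeros of a 4-bit value, indexed by that value.
   Entry 0 is 4 because no bit is set. */
static const uint8_t clz_nibble[16] = {
    4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0
};

/* Portable count of leading zeros; returns 32 for x == 0. */
static unsigned clz32_portable(uint32_t x)
{
    unsigned n = 0;

    /* Narrow down by halves: if the upper part is empty, count it
       and shift the lower part into its place. */
    if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }

    /* The top nibble now holds the highest set bit, if there is one. */
    return n + clz_nibble[x >> 28];
}
```

This always performs three comparisons and one table load, so the cost does not depend on the input value, and the table takes only 16 bytes of ROM.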

------
dewster
This article is a poster child for what's wrong with modern HW & SW.
Cortex-M0+ has no CLZ?!? WTF were they thinking? And all the bugfixes around
wrappers around enigmas are a joke. Computing shouldn't be this arcane.

