
8-bit number to binary string - Audiophilip
http://gynvael.coldwind.pl/n/c_cpp_number_to_binary_string_01011010
======
acqq
The big constants used in the formula

    
    
        void to_bin(unsigned char c, char *out) {
          *(unsigned long long*)out = 3472328296227680304ULL +
            (((c * 9241421688590303745ULL) / 128) & 72340172838076673ULL);
        }
    

are much more obvious when they remain in hex:

0x3030303030303030 -- the byte array of '0's encoded as ASCII

0x8040201008040201 -- the magic

0x0101010101010101 -- only 0 or 1 to be added to the appropriate array element

Therefore, more readable:

    
    
        void to_bin(unsigned char c, char *out) {
          /* endian dependent, works on x86 */
          *(unsigned long long*)out = 0x3030303030303030ULL +
            (((c * 0x8040201008040201ULL) >> 7) & 0x0101010101010101ULL);
        }
    
    

The function is of course endian dependent (I'd add a comment about that fact
in it) and the explanation of the quoted function on the page starts at "If
you shift the the constant by 7 bits to the right." The page is a result of an
edit where the older function (that ends with + c / 128;) was explained first.

Still, nice.
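
For anyone worried about the endian dependence, here is a minimal sketch of a variant that does not rely on byte order. The name to_bin_portable is made up for illustration and is not from the article. The arithmetic is unchanged; only the final store is replaced by writing the eight bytes one at a time, least significant byte first, which is the order a little-endian store produces.

```c
#include <stdint.h>

/* Hypothetical name, not from the article. Same multiply/shift/mask as above. */
void to_bin_portable(unsigned char c, char *out) {
    /* After the shift and mask, bit i of c is the lowest bit of byte (7 - i). */
    uint64_t v = 0x3030303030303030ULL +
        (((c * 0x8040201008040201ULL) >> 7) & 0x0101010101010101ULL);
    /* Emit the least significant byte first, as a little-endian store would. */
    for (int i = 0; i < 8; i++) {
        out[i] = (char)((v >> (8 * i)) & 0xFF);
    }
}
```

This probably gives up some of the speed, since the single 64-bit store becomes eight byte stores, but the output no longer depends on the machine.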

~~~
seiji
Towards the bottom of the article it says:

 _the last thing I did was changing the constants to decimal - they look way
more scary this way ; >_

because, you know, programming should be innately difficult and confusing so
we can laugh at others who understand less than we do.

Never underestimate how many times code (any code) posted online will be
copy/pasted throughout the world everywhere from school projects to production
systems.

~~~
mikeash
The whole thing is whimsy. You'd never want to use this approach at all in a
production system. Using decimal constants instead of hex is the least of the
sins here.

There should be room in programming for whimsy. This stuff is fun, and play is
healthy. If somebody takes this thing seriously and copies the code into a
production system, that's their own fault, not the author's.

~~~
seiji
My job is basically fixing the code and processes of corporate programmers who
copy/paste examples from websites into production systems. There's little
understanding of how the code works and almost no reading of related
documentation. _If it compiles, it ships._

It's like if bridge designers didn't really know mechanics and just
copy/pasted minecraft designs into the real world. Sure, it's not the fault of
minecraft designers, but we can help kill fewer people by making better
examples.

~~~
mikeash
I don't think that analogy really works. If bridge designers were doing this,
there would be massive outcry and we'd put a stop to it. We allow it to happen
with (terrible) programmers only because the harm they do is pretty limited,
and nobody's getting killed because of it.

~~~
seiji
_there would be massive outcry and we'd put a stop to it._

Massive security breaches? Massive fraud? Ineffective nationalized
cybercommands?

Just because we can't see the world burning with our eyes doesn't mean lead in
the air isn't giving us brain damage.

~~~
mikeash
These are all because of people copy/pasting code from the internet?

------
personjerry
Awesome. I'm always amazed by bitmagic like this. Today I saw one on
stackoverflow and spent a good deal of time failing to understand it:

[https://stackoverflow.com/questions/33532045/how-to-swap-fir...](https://stackoverflow.com/questions/33532045/how-to-swap-first-2-consecutive-different-bits)

And of course I'm reminded of the fast inverse square root:

[https://en.wikipedia.org/wiki/Fast_inverse_square_root](https://en.wikipedia.org/wiki/Fast_inverse_square_root)

------
nialo
Why isn't this faster than the "standard" loop based version? The disclaimer
at the start says it's not, but I would have thought that 4 64bit math
operations would be much faster than a loop with 8 steps and comparisons and
so on.
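
For reference, the loop-based version being compared against presumably looks something like the sketch below. The name to_bin_loop is made up, and the exact loop used in the article or in any benchmark may differ.

```c
/* Hypothetical baseline: one shift, mask and store per bit, most significant bit first. */
void to_bin_loop(unsigned char c, char *out) {
    for (int i = 0; i < 8; i++) {
        out[i] = (char)('0' + ((c >> (7 - i)) & 1));
    }
}
```

A compiler is free to unroll a loop like this, so the "8 steps and comparisons" may not survive into the generated code.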

~~~
cnvogel
First and foremost: YES, I KNOW! This is completely senseless
microbenchmarking and micro-optimization. I ONLY LOOK AT THIS FOR FUN.

That being said. The 64bit multiplication is faster, and also constant in
time. I've measured this using the timestamp counter which doesn't necessarily
count actual CPU clock cycles, but possibly an integer multiple of them (it
always seems to output numbers divisible by 8 for me).

    
    
        (...)
        100 loop:01100100(152) 64bit:01100100(48)
        101 loop:01100101(128) 64bit:01100101(48)
        102 loop:01100110(160) 64bit:01100110(48)
        103 loop:01100111(128) 64bit:01100111(48)
        104 loop:01101000(192) 64bit:01101000(48)
        105 loop:01101001(88) 64bit:01101001(48)
        106 loop:01101010(152) 64bit:01101010(48)
        107 loop:01101011(128) 64bit:01101011(48)
        108 loop:01101100(160) 64bit:01101100(48)
        (...)
    
    

That's on an _Intel(R) Core(TM) i5 CPU M 450 @ 2.40GHz_ and the cycles include
the rdtsc, a few movs, and the call to the respective function.

[https://github.com/vogelchr/bitstring_via_64bit_mult](https://github.com/vogelchr/bitstring_via_64bit_mult)

~~~
nocsaer1
I couldn't resist so I tried a table look-up (16 elements) version which was
about twice as fast as the multiply version.

Intel Core i5-3210M @ 2.50GHz
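
No code was posted for this, so the following is only a guess at what a 16-element table version might look like: one entry per nibble, two lookups per byte. The names NIBBLE_BITS and to_bin_table are made up.

```c
#include <string.h>

/* Hypothetical 16-entry table: the 4-character binary form of each nibble. */
static const char *const NIBBLE_BITS[16] = {
    "0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
    "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"
};

/* Writes 8 characters to out; does not add a terminating NUL. */
void to_bin_table(unsigned char c, char *out) {
    memcpy(out, NIBBLE_BITS[(c >> 4) & 0x0F], 4); /* high nibble first */
    memcpy(out + 4, NIBBLE_BITS[c & 0x0F], 4);    /* then the low nibble */
}
```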

~~~
renox
Funny, I read only the first part of the sentence, 'I tried a table look-up',
and immediately thought: WHAT ABOUT THE CACHE?

Then I continued to '16 elements'. Oh, OK then; next time I'll try to read the
whole sentence first.

------
rjmunro
Rather than add 0x3030303030303030, you could OR with it because you know
there will never be a carry. This may save cycles on some architectures.
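
A quick way to convince yourself: after the mask every byte is either 0 or 1, and 0x30 has its lowest bit clear, so adding and ORing give the same byte. A small sketch, with made-up helper names bits_add and bits_or that return the packed value instead of storing it:

```c
#include <stdint.h>

/* Hypothetical helper: the packed value using the original addition. */
uint64_t bits_add(unsigned char c) {
    return 0x3030303030303030ULL +
        (((c * 0x8040201008040201ULL) >> 7) & 0x0101010101010101ULL);
}

/* Hypothetical helper: the same value using OR in place of the addition. */
uint64_t bits_or(unsigned char c) {
    return 0x3030303030303030ULL |
        (((c * 0x8040201008040201ULL) >> 7) & 0x0101010101010101ULL);
}
```

The two agree for every 8-bit input; whether OR is actually cheaper depends on the target.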

------
anewhnaccount2
This is an example of SIMD Within A Register. There are some more neat
examples here: [http://aggregate.org/MAGIC/](http://aggregate.org/MAGIC/)

------
seiji
Here's a fancier, slightly less-magic, version. How to compile is left as an
exercise for the reader (hint: Haswell or newer, requires BMI2 instruction
set):

    
    
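        #include <immintrin.h> /* _pdep_u64, needs BMI2 */
        #include <stdint.h>
        #include <stdio.h>

        void toBinary(uint8_t byte) {
            /* "00000000" in ASCII, written as a constant to avoid a cast from a string literal */
            uint64_t zeroes = 0x3030303030303030ULL;
            /* pdep puts bit i into byte i; bswap puts bit 7 first in memory on little-endian */
            uint64_t resultStr =
                zeroes | __builtin_bswap64(_pdep_u64(byte, 0x0101010101010101ULL));

            printf("hello: %.*s\n", 8, (char *)&resultStr);
        }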

~~~
acz
This really reminds me of the old demoscene diskmag Hugi article about adding
16 bit colors (without MMX).

[http://www.hugi.scene.org/online/hugi18/cobrad.htm](http://www.hugi.scene.org/online/hugi18/cobrad.htm)

------
est
For Python 2:

    
    
        >>> import struct
        >>> f1=lambda c: struct.pack('<Q', 3472328296227680304 + (((c * 9241421688590303745) // 128) & 72340172838076673))
        >>> f1(1)
        '00000001'
        >>> f1(2)
        '00000010'
        >>> f1(5)
        '00000101'
        >>> f2=lambda c: struct.pack('<Q', (((c * 0x8040201008040201) // 128) & 0x0101010101010101))
        >>> f2(1)
        '\x00\x00\x00\x00\x00\x00\x00\x01'
        >>> f2(2)
        '\x00\x00\x00\x00\x00\x00\x01\x00'
        >>> f2(101)
        '\x00\x01\x01\x00\x00\x01\x00\x01'
    

Use "<" for little-endian, as on x86.

------
zamalek
> x86 is little endian - that's why it's backwards

This can be read the wrong way, although it is correct: the author is
referring to the final byte sequence and not the input bit sequence.
Endianness applies to bytes, not bits.

------
eru
Nice little problem. Thanks for sharing!

------
rlonstein
Cool but I want to say I've seen similar, google turns up:
[http://www.asmcommunity.net/forums/topic/?id=28498](http://www.asmcommunity.net/forums/topic/?id=28498)

------
nwmcsween
I recommend reading over this:
[http://0x80.pl/articles/convert-to-bin.html](http://0x80.pl/articles/convert-to-bin.html)

btw all this SWAR stuff should be aligned before applying.

------
jjnoakes
Be careful with alignment. On architectures where it matters, the caller may
pass in a "char* out" which is not suitably aligned, and doing an "unsigned
long long" store into that address may fault.
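
One common way around this is to build the value in a local variable and copy it out with memcpy, which makes no alignment assumption about the destination. A minimal sketch, with the made-up name to_bin_memcpy:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name. Still endian dependent, like the original. */
void to_bin_memcpy(unsigned char c, char *out) {
    uint64_t v = 0x3030303030303030ULL +
        (((c * 0x8040201008040201ULL) >> 7) & 0x0101010101010101ULL);
    /* memcpy works for any destination address, aligned or not. */
    memcpy(out, &v, sizeof v);
}
```

Compilers typically turn a fixed-size memcpy like this into a single store where the target allows unaligned access, so it usually costs nothing on x86.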

------
amelius
Feature request: add a parameter that specifies the base into which the
function rewrites the number.

