

Torvalds:  More bitwise tricks - SkyMarshal
https://plus.google.com/102150693225130002912/posts/7bKRjV92snH

======
po
Google Plus is still a very strange place.

Here you have an exceedingly technical thread with some of OSS's brightest minds
working on a little puzzle and it's interspersed with litter. You have the
sincere (yet misplaced) _thank you for working on OSS_ , the clueless guy, and
the _isn't it cool we're talking about such hardcore topics_ guy. Every one of
those comments makes me cringe a little.

This is why geeks have yet to give up mailing lists. The hurdle they put up is
actually an effective barrier for the unmotivated. Some conversations are
meant to be had in a quiet place amongst people who are similarly inclined.
Google misleads you into thinking that Plus is that place, but it just isn't.
Perhaps it could work if there were a way for the community to sort the
relevant from the irrelevant, but as it stands it's just too tempting a
megaphone for people to jump in front of. Sometimes adding a little friction
to the 'Post' button is a good thing.

~~~
diminish
An open discussion on a very low level technical implementation with a
heterogeneous crowd on Google+ seems to be the dawn of a new era; open source
programming becomes public in full sunlight the same way books or mathematics
became public after the invention of printing. Compare it to mailing lists
or earlier forms of engineering discussions behind closed doors. Many
outsiders all around the world are now working hard on Sunday learning C or
with their gcc compilers trying to solve Linus's bitwise calculation
optimisation problem and they previously had nothing to do with kernel
programming. Their results may go into the next Linux kernel release and that
is exciting for a puzzle.

And unlike you, I find it a nice thing that everyone++ is invited to think
about Linus's question on Google+ (which is better than Facebook, LinkedIn or
Pinterest for this purpose):

'...so it's not a generic "count number of bits" (or even a generic "count
number of bytes"). So I was hoping somebody would come up with something
simple like (x & mask1) + (x&mask2)>>shift that just got the four
possibilities right.'

~~~
po
I love that Linus does his thinking out in the public, but that's nothing new.
Google Plus isn't enabling anything that he didn't do before, it's just being
exposed to a new audience. Frankly, I love that this new audience is seeing
his writing. I love being able to watch smart people doing their thing and I
hope they inspire others.

But I'm afraid you've totally missed my point. What I don't love is the
_commentary_ being mixed in with _the game_. This is why when you have a panel
discussion, you don't hand a microphone to everyone in the auditorium. When
you go to a concert, you don't hand everyone an instrument. People have
varying levels of contextual awareness and/or self control: it's not really
their fault for throwing in their $0.02. I'm just pointing out that Google
Plus is a type of forum and I'm still not sure what kind of conversation is
supposed to go on there. This kind of discussion feels unnatural to me.

~~~
diminish
yes i agree about g+, though i feel the same about facebook, linked in and
disqus about any type of discussion. only on HN; i mostly feel "these are the
right people here answering"

what linus does is more like an experiment, not on what g+ is but on what it
can become. i pray the experiment will be successful, but i think it won't.

~~~
po
Nobody tries to have these kinds of discussions on Facebook because it's
obviously the wrong place (my mom doesn't care what I think of bit-shifting).
Disqus is fine because it's in the context of the blog post. People have to
seek it out. Twitter is fine because conversations are not globally visible,
they are filtered by @replies to relevant parties.

Google Plus was seeded with Google employees and they are the most
enthusiastic users so it is filled with these kinds of discussions.

------
Erwin
If you like this sort of bit-level puzzle, this is a nice book:
http://www.amazon.com/Hackers-Delight-Henry-S-Warren/dp/0201914654/
-- probably the only book at such a low level I remember buying in many
years.

------
chmike
Can someone explain the problem he is trying to solve with this piece of
code? I don't understand.

~~~
zokier

        no bytes: 00000000 -> 0
        one byte: 000000ff -> 1
        two bytes: 0000ffff -> 2
        three bytes: 00ffffff -> 3
    

I think that is what he is trying to accomplish.
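To pin down the mapping in the table above, here is a minimal reference sketch. It uses a plain loop, so it is a specification to test faster versions against rather than a candidate solution. The function names `mask_bytes_ref` and `check_table` are made up for illustration, and it assumes the mask is always contiguous from the low byte, as in the four cases listed.

```c
#include <assert.h>
#include <stdint.h>

/* Reference only: count how many low-order bytes of the mask are set.
 * Assumes the mask is one of 00000000, 000000ff, 0000ffff, 00ffffff
 * (or ffffffff), i.e. contiguous from the low byte. */
static unsigned mask_bytes_ref(uint32_t mask)
{
    unsigned n = 0;

    while (mask & 0xff) {   /* stop at the first zero byte */
        n++;
        mask >>= 8;
    }
    return n;
}

/* Check the four cases from the table. */
static void check_table(void)
{
    assert(mask_bytes_ref(0x00000000u) == 0);
    assert(mask_bytes_ref(0x000000ffu) == 1);
    assert(mask_bytes_ref(0x0000ffffu) == 2);
    assert(mask_bytes_ref(0x00ffffffu) == 3);
}
```

Any branch-free variant can be compared against this function over the same handful of inputs.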

------
alexbell
Login required on my iPhone's browser. Can someone post the contents please?

~~~
hansbo
"More bitwise tricks..

So my quest to calculate the hash and the length of a pathname component
efficiently continues. I'm pretty happy with where I am now (some changes to
the code have happened, if you actually want to see the current situation you
need to check out the kernel mailing list post), but finding the number of
bytes in the final mask bothers me.

Using an explicit loop is out - the branch mispredicts kill it. And while at
least modern Intel CPU's do quite well with just using the bit scan
instructions ("bsf") to find where the first NUL or '/' was in the word, that
sucks on some older CPU's.

So I came up with the following trick to count the number of bytes set in the
byte mask:

    
    
      /* Low bits set in each byte we used as a mask */
      mask &= ONEBYTES;
      /* Add up "mask + (mask<<8) + (mask<<16) +... ":
      same as a multiply */
      mask *= ONEBYTES;
      /* High byte now contains count of bits set */
      len += mask >> 8*(sizeof(unsigned long)-1);
    

and I'm wondering if anybody can come up with something that avoids the need
for that multiply (and again - conditionals don't work, the mispredict costs
kill you).

Because that multiply isn't free either."

EDIT: Thanks for the tips, I'm new here
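For anyone who wants to try the multiply trick from the quoted post, here is a self-contained sketch. The definition of `ONEBYTES` as `~0UL / 0xff` (a 0x01 in every byte of an `unsigned long`) is an assumption, since the post does not show it, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition: 0x01 repeated in every byte of an unsigned long. */
#define ONEBYTES (~0UL / 0xff)

/* Count the set bytes in a mask whose bytes are each 0x00 or 0xff. */
static unsigned long count_mask_bytes(unsigned long mask)
{
    /* Keep only the low bit of each set byte: 0xff -> 0x01. */
    mask &= ONEBYTES;
    /* Multiplying by ONEBYTES adds mask + (mask<<8) + (mask<<16) + ...;
     * each byte holds at most sizeof(unsigned long), so nothing carries
     * into a neighbouring byte and the top byte ends up with the total. */
    mask *= ONEBYTES;
    return mask >> 8 * (sizeof(unsigned long) - 1);
}

/* Walk through masks with 0, 1, 2, ... low bytes set. */
static void check_all_lengths(void)
{
    unsigned long mask = 0;

    for (size_t n = 0; n <= sizeof(unsigned long); n++) {
        assert(count_mask_bytes(mask) == n);
        mask = (mask << 8) | 0xff;
    }
}
```

The sketch works for both 32-bit and 64-bit `unsigned long` because both the constant and the final shift are derived from the type's size.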

~~~
6ren
The * have gone away (HN is treating it as markdown for italics). You can put
spaces around it to prevent this (e.g. the first * in the first comment is
treated literally; the last * in that comment isn't).

The easiest solution is to indent the code by two spaces (HN renders it
literally then).

------
Tichy
It requires a log in to see?

------
pbsd
Quickest way (without multiplication) is divide and conquer, just like the
usual bit popcount:

    
    
      static inline int f(const u64 m)
      {
        const u64 ones = 0x0101010101010101ULL;
        const u64 b64 = m & ones;
        const u32 b32 = b64 + (b64>>32);
        const u16 b16 = b32 + (b32>>16);
        const  u8  b8 = b16 + (b16>> 8);
        return b8;
      }
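To compile the function above outside the kernel, the `u64`/`u32`/`u16`/`u8` types need definitions. The sketch below assumes `<stdint.h>` equivalents for them and renames the function to `count_shift_add` (a made-up name); the logic is otherwise unchanged, with explicit casts where the narrowing happens.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for the kernel's fixed-width types. */
typedef uint64_t u64;
typedef uint32_t u32;
typedef uint16_t u16;
typedef uint8_t  u8;

/* Fold the per-byte flags in halves: 8 bytes -> 4 -> 2 -> 1. */
static inline int count_shift_add(const u64 m)
{
    const u64 ones = 0x0101010101010101ULL;
    const u64 b64 = m & ones;                    /* 0 or 1 in each byte */
    const u32 b32 = (u32)(b64 + (b64 >> 32));    /* each byte now 0..2 */
    const u16 b16 = (u16)(b32 + (b32 >> 16));    /* each byte now 0..4 */
    const u8  b8  = (u8)(b16 + (b16 >> 8));      /* low byte now 0..8 */
    return b8;
}

/* Check masks with 0 through 8 low bytes set. */
static void check_shift_add(void)
{
    u64 mask = 0;

    for (int n = 0; n <= 8; n++) {
        assert(count_shift_add(mask) == n);
        mask = (mask << 8) | 0xff;
    }
}
```

Each step depends on the result of the previous one, so the three shift+add pairs cannot overlap.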

~~~
_sh
But this is slower than with multiplication, as Linus explains:

    
    
      the simple shift+add version is all totally serialized
      and nothing can be done before the previous operation
      ends: as a result the three adds and three shifts will
      inevitably take 6 cycles (the original P4 had that 
      double-pumped ALU, but not for shifts). That's already
      slower than almost any multiply.
    

Edit: oops, I forgot your '(without multiplication)' qualifier. Yes, your way
is likely the quickest without multiplication.

------
DiabloD3
I'm not sure if I understand the problem. Without a loop, is he only counting
a fixed number of bytes (ie, 4 or 8 maximum)?

------
Alind
Why does this guy always behave like a god?

~~~
pjscott
Gods behave in many ways. Some are jealous and quick to anger, and others are
inhumanly patient. Some forbid alcohol to their followers, and others get
drunk and beat up giants. Some descend to earth in the form of a bull to
conduct illicit sexual liaisons with mortals, and others lack reproductive
organs altogether. However, without exception, one thing they all have in
common is that they never conduct polite discussions of bitwise arithmetic
tricks. _And yet that is exactly what I see here._

Next time you decide to #include <linus/stdflame.h>, please check to make sure
it actually applies to the situation.

~~~
tptacek
What is annoying about this comment is that it compels me to go back and read
every other comment you've written. Please write less gracefully next time.

