
The SIMD usage is mainly for two things.

In the lexer, it is used for block comment skipping. It will find the end of block comments 16 characters at a time (on both PPC and x86).
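
A rough sketch of the block-comment case in C, assuming an x86 target with SSE2 (the PPC path is not shown). The name `skip_block_comment` and the approach of scanning for `/` and then checking for a preceding `*` are illustrative assumptions, not copied from the lexer itself. `__builtin_ctz` is a GCC/Clang builtin. Without SSE2 the function falls back to a plain byte loop.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper: buf[0..len) is the text right after the opening
// slash-star. Returns the index one past the closing star-slash, or len
// if the comment is unterminated.
static size_t skip_block_comment(const char *buf, size_t len)
{
    size_t i = 1; // a closing '/' needs a '*' before it, so index 0 cannot close
    for (;;) {
#if defined(__SSE2__)
        // Skip whole 16-byte chunks that contain no '/' at all.
        const __m128i slash = _mm_set1_epi8('/');
        while (i + 16 <= len) {
            // Unaligned load, so buf + i needs no particular alignment.
            __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
            // One bit per byte: bit k is set when chunk[k] == '/'.
            unsigned mask =
                (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, slash));
            if (mask != 0) {
                i += (size_t)__builtin_ctz(mask); // index of the first '/'
                break;
            }
            i += 16;
        }
#endif
        // Scalar scan: handles the tail, and the whole buffer without SSE2.
        while (i < len && buf[i] != '/')
            i++;
        if (i >= len)
            return len;   // no terminator found
        if (buf[i - 1] == '*')
            return i + 1; // one past the closing '/'
        i++;              // a '/' that does not close the comment
    }
}
```

For the text ` hello */ rest`, the closing `/` is at index 8, so the function returns 9.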

During line number computation, it will also find newlines 16 characters at a time.
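
The newline search can be sketched the same way. This is a minimal illustration assuming SSE2 and the GCC/Clang builtin `__builtin_ctz`; the function name `newline_offsets` and its output-array interface are made up for the example. It records the offset of every `\n`, from which a line number for any position follows by counting the offsets before it.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: stores the offset of each '\n' in buf[0..len) into
   out[0..cap) and returns the total number of newlines found (which may
   exceed cap, in which case the extra offsets are not stored). */
static size_t newline_offsets(const char *buf, size_t len,
                              size_t *out, size_t cap)
{
    size_t n = 0, i = 0;
#if defined(__SSE2__)
    const __m128i nl = _mm_set1_epi8('\n');
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        /* Bit k of mask is set when byte k of the chunk is '\n'. */
        unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, nl));
        while (mask != 0) {
            size_t k = (size_t)__builtin_ctz(mask); /* lowest set bit */
            if (n < cap)
                out[n] = i + k;
            n++;
            mask &= mask - 1; /* clear that bit and look for the next one */
        }
    }
#endif
    /* Scalar loop: the tail, or the whole buffer when SSE2 is unavailable. */
    for (; i < len; i++) {
        if (buf[i] == '\n') {
            if (n < cap)
                out[n] = i;
            n++;
        }
    }
    return n;
}
```

Chunks without a newline cost one load, one compare, and one mask test, which is where the 16-at-a-time speedup would come from.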

This could actually (nowadays) be done 32 characters at a time on newer processors, but isn't.




This is flat-out fascinating. Thanks.


64 characters with AVX-512? =)


I didn't check to see if the instructions exist, but possibly :)

You do start to hit two issues, though, as you increase the size of the skipping:

1. Alignment.

2. If the average block comment/line is < 64 characters, you may lose more time performing the instruction and then counting the trailing zeros in the result to find the place it ended.

I have no numbers to back up whether this matters, of course :)


AVX-512 does not seem to have PMOVMSKB, which is how I assume it is being done with SSE2. There are other ways to skin that cat, but it's unclear whether they have any advantage over using AVX2 with VPMOVMSKB.
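
For comparison, a 32-bytes-at-a-time search with AVX2 might look like the sketch below. The function name `find_newline_avx2` is hypothetical, and the code only uses the wide path when the compiler defines `__AVX2__` (for example with `-mavx2`); otherwise it compiles to the scalar loop. `_mm256_movemask_epi8` is the intrinsic for VPMOVMSKB, and `__builtin_ctz` is a GCC/Clang builtin.

```c
#include <stddef.h>

#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical helper: returns the index of the first '\n' in buf[0..len),
   or len if there is none. */
static size_t find_newline_avx2(const char *buf, size_t len)
{
    size_t i = 0;
#if defined(__AVX2__)
    const __m256i nl = _mm256_set1_epi8('\n');
    for (; i + 32 <= len; i += 32) {
        /* Unaligned 32-byte load. */
        __m256i chunk = _mm256_loadu_si256((const __m256i *)(buf + i));
        /* 32-bit mask, one bit per byte of the chunk. */
        unsigned mask =
            (unsigned)_mm256_movemask_epi8(_mm256_cmpeq_epi8(chunk, nl));
        if (mask != 0)
            return i + (size_t)__builtin_ctz(mask);
    }
#endif
    /* Scalar loop: the tail, or everything when AVX2 is not enabled. */
    for (; i < len; i++)
        if (buf[i] == '\n')
            return i;
    return len;
}
```

When most lines are shorter than 32 bytes, the loop usually stops in its first iteration, so the wider load buys little.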


I posted a patch here: https://gist.github.com/dberlin/9867614

It adds AVX2 and SSE4.2 instruction support. It makes no discernible difference performance-wise that I can find :)


Heh, awesome! I'll try it out today.


I'm curious how much of an effect this has.



