
Yes, a data-dependent branch serializes execution on the GPU's stream processing units (this is actually not true for AMD). Yes, operating on strings on the GPU sometimes sucks, and we don't always do it because of this. But we are ignoring certain aspects. Long strings are usually dictionary encoded in our database (no one picks this by hand; an optimizer finds the best cascading compression scheme and imposes it). Dictionaries are sorted, so each dictionary value's key follows the sort order of the values. So guess what: you can already do comparisons and equality checks on much smaller data sizes. A long string can be encoded in 8, 4, 2, sometimes even 1 byte. Many times we have to do string comparisons against strings that we have not encoded, i.e.

select * from table where column1 = "some awesome text here"

In this case we actually do a comparison between hashes of the data. Hashing is cheap and fast, and it makes comparisons on the GPU a breeze. So, long story short, we have no data-dependent branching. We do this by never using certain statements inside kernel code.
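A minimal sketch of the hash comparison described above, in plain Python rather than GPU code. The comment does not say which hash function is used, so FNV-1a (64-bit) is an assumption here, and the names fnv1a_64, column1 and matches are hypothetical:

```python
# Hypothetical sketch: equality filter on an unencoded string column via hashes.
# FNV-1a is an assumed stand-in; the original comment does not name the hash.

FNV_OFFSET = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV_PRIME = 0x100000001B3        # standard 64-bit FNV prime
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(text: str) -> int:
    """Return the 64-bit FNV-1a hash of the UTF-8 bytes of text."""
    h = FNV_OFFSET
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * FNV_PRIME) & MASK64
    return h


column1 = ["foo", "some awesome text here", "bar", "some awesome text here!"]

# Hash every row once, hash the literal once, then compare fixed-width integers.
column_hashes = [fnv1a_64(s) for s in column1]
literal_hash = fnv1a_64("some awesome text here")

# 1 where the row matches the literal, 0 otherwise.
matches = [int(h == literal_hash) for h in column_hashes]
print(matches)
```

Every row does the same work regardless of string length or content, which is the property that matters on the GPU. Two different strings can share a hash; how that case is handled is not stated in the comment, so this sketch ignores it.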

The use of "if" is expressly forbidden at Blazing for any GPU code, and its use is punished viciously (said individual usually has to be the one who captures meaningful input from one of the 80 log files of our 80-GPU cluster).

Editing to mention: the way you can encode a long string down to 1 byte is by doing dictionary compression and then bit packing the keys. On the GPU, the way you do this is by getting the max key (the min key is always 0); you can then store each key in 1 byte (if max(key) <= 255).
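A rough sketch of sorted dictionary encoding plus key packing, again in plain Python instead of GPU code. The sample data and the helper key_width_bytes are hypothetical, and packing to whole bytes is a simplification of real bit packing:

```python
from bisect import bisect_left

# Hypothetical sample column; real columns would hold long strings.
values = ["pear", "apple", "fig", "apple", "pear", "kiwi"]

# Sorted dictionary: key i stands for the i-th smallest distinct value,
# so comparing keys gives the same answer as comparing the strings.
dictionary = sorted(set(values))
key_of = {v: i for i, v in enumerate(dictionary)}
keys = [key_of[v] for v in values]


def key_width_bytes(max_key: int) -> int:
    """Smallest of 1, 2, 4 or 8 bytes that can hold max_key (min key is 0)."""
    for width in (1, 2, 4, 8):
        if max_key < (1 << (8 * width)):
            return width
    raise ValueError("key does not fit in 8 bytes")


width = key_width_bytes(max(keys))
packed = b"".join(k.to_bytes(width, "little") for k in keys)

# Range predicate column < "kiwi": look up the bound once in the dictionary,
# then compare small integers instead of strings.
bound = bisect_left(dictionary, "kiwi")
less_than_kiwi = [int(k < bound) for k in keys]
print(width, len(packed), less_than_kiwi)
```

With four distinct values the max key is 3, so each row takes one byte no matter how long the original strings are.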




The whole point of making things fast is to remove all the if statements. People complain that their current if-heavy algorithm can't run on GPUs, therefore GPUs suck. Of course, the algorithms then need to be reworked to have fewer branches.
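To illustrate the kind of rework meant here, a small sketch of turning a data-dependent branch into arithmetic. It is plain Python with hypothetical function names, so only the shape of the rewrite carries over to GPU kernels; the Python interpreter still branches internally:

```python
def branchy_sum(xs, threshold):
    """Sum the elements above threshold using a data-dependent branch."""
    total = 0
    for x in xs:
        if x > threshold:
            total += x
    return total


def branchless_sum(xs, threshold):
    """Same result, but the comparison becomes a 0/1 multiplier.

    Every element goes through identical instructions; the predicate
    only changes the value that is added.
    """
    total = 0
    for x in xs:
        total += x * (x > threshold)  # a Python bool multiplies as 0 or 1
    return total


data = [5, -2, 9, 0, 7, 3]
print(branchy_sum(data, 3), branchless_sum(data, 3))
```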



