Hacker News new | past | comments | ask | show | jobs | submit login

Exactly. Consider this example:

  a = f(z);
  b = g(z);
  v = x > y ? a : b;
Assuming computing the two function calls f() and g() is relativelly expensive, it becomes a trade-off whether to emit conditional code or to compute both followed by a select. So it's not a simple choice, and the decision is made by the compiler.





This is a GPU focused article.

The GPU will almost always execute f and g due to GPU differences vs CPU.

You can avoid the f vs g if you can ensure a scalar Boolean / if statement that is consistent across the warp. So it's not 'always' but requires incredibly specific coding patterns to 'force' the optimizer + GPU compiler into making the branch.


It depends. If the code flow is uniform for the warp, only side of the branch needs to be evaluated. But you could still end up with pessimistic register allocation because the compiler can’t know it is uniform. It’s sometimes weirdly hard to reason about how exactly code will end up executing on the GPU.

f or g may have side effects too. Like writing to memory. Now a conditional has a different meaning.

You could also have some fun stuff, where f and g return a boolean, because thanks to short circuit evaluation && || are actually also conditionals in disguise.


Side effects will be masked, the GPU is still executing exactly the same code for the entire workgroup.



Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: