nosferalatu123's comments

A lot of the myth that "branches are slow on GPUs" exists because, way back on the PlayStation 3, they were quite slow. The PS3 used NVIDIA's RSX GPU; a branch was documented as costing six cycles IIRC, but it always measured slower than that to me. That was even for a completely coherent branch, where all threads in the warp took the same path. Incoherent branches were slower still because, on top of the six cycles for the IFEH instruction, the GPU would have to execute both sides of the branch. I believe that was the origin of the "branches are slow on GPUs" myth that continues to this day. Nowadays GPU branching is quite cheap, especially for coherent branches.


If someone says branching without qualification, I have to assume it’s incoherent. The branching mechanics might have lower overhead today, but the basic physics of the situation is that throughput on each side of the branch is reduced to the percentage of active threads. If both sides of a branch are taken, and both sides are the same instruction length, the average perf over both sides is at least cut in half. This is why the belief that branches are slow on GPUs is both persistent and true. And this is why it’s worth trying harder to reformulate the problem without branching, if possible.
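
A back-of-the-envelope model of that arithmetic, as a sketch only: it assumes a 32-wide warp, one issue slot per instruction, and that a divergent branch runs each side in turn with the inactive lanes masked off. The function name and its parameters are made up for illustration, and the model ignores real-hardware details such as branch overhead and reconvergence.

```python
WARP_SIZE = 32  # assumed warp width; this varies by hardware


def branch_utilization(lanes_taking_if, if_len, else_len, warp_size=WARP_SIZE):
    """Fraction of lane-slots doing useful work across an if/else.

    Hypothetical model: a side of the branch is issued once if any lane
    takes it, and lanes on the other side sit idle while it runs.
    """
    lanes_taking_else = warp_size - lanes_taking_if
    issued = 0  # instruction slots the warp spends
    useful = 0  # lane-instructions that actually produce a result
    if lanes_taking_if > 0:
        issued += if_len
        useful += lanes_taking_if * if_len
    if lanes_taking_else > 0:
        issued += else_len
        useful += lanes_taking_else * else_len
    return useful / (issued * warp_size)


coherent = branch_utilization(32, 10, 10)        # every lane takes the "if" side
divergent = branch_utilization(16, 10, 10)       # even split across both sides
one_stray_lane = branch_utilization(31, 10, 10)  # a single lane diverges
print(coherent, divergent, one_stray_lane)
```

In this model any split with equal-length sides lands at 50% utilization, even when only one lane diverges, while the fully coherent case stays at 100%.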


Coherent branches are "free", but the extra instructions increase register pressure. That's the main reason dynamic branches are avoided, not that they are inherently "slow".


Author here. My background is game development. Now I work on 3D modeling software. I wouldn't say I know "higher level math", because that seems like a very deep well...

What would be the best way to define "transform" here? What I mean is something that can be applied to a point, like a linear map. So translation, and/or rotation, and/or scaling, and/or skewing, are all things that can be done by this "transform" in 3D.

In computer graphics these are often expressed as 3x3 or 4x3 (or sometimes 4x4) matrices. But a "translation+quaternion" can also be a transform, or just a quaternion (a unit quaternion can be used to rotate points for example). So I'd be happy to use a better definition for "transform" than 'given a transform T that can be applied to a point' but I'm not quite sure what the best definition would be.
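
To make the "translation+quaternion" case concrete, here is a minimal sketch of applying such a transform to a point. The function names, the (w, x, y, z) component order, and the rotate-then-translate convention are assumptions chosen for illustration, not taken from any particular engine or library.

```python
import math


def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) rotating by `angle` radians about `axis`."""
    ax, ay, az = axis
    n = math.sqrt(ax * ax + ay * ay + az * az)
    s = math.sin(angle / 2) / n  # normalizes the axis as part of the scale
    return (math.cos(angle / 2), ax * s, ay * s, az * s)


def quat_rotate(q, p):
    """Rotate point p by unit quaternion q = (w, v): p' = p + w*t + v x t, t = 2 (v x p)."""
    w, x, y, z = q
    px, py, pz = p
    # t = 2 * cross(v, p)
    tx = 2 * (y * pz - z * py)
    ty = 2 * (z * px - x * pz)
    tz = 2 * (x * py - y * px)
    # p' = p + w * t + cross(v, t)
    return (px + w * tx + (y * tz - z * ty),
            py + w * ty + (z * tx - x * tz),
            pz + w * tz + (x * ty - y * tx))


def apply_transform(translation, rotation, p):
    """Apply T to a point: rotate first, then translate (an assumed convention)."""
    rx, ry, rz = quat_rotate(rotation, p)
    tx, ty, tz = translation
    return (rx + tx, ry + ty, rz + tz)


# A quarter turn about +Z followed by a translation of (1, 2, 3).
q = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
moved = apply_transform((1.0, 2.0, 3.0), q, (1.0, 0.0, 0.0))
print(moved)
```

The quarter turn takes (1, 0, 0) to roughly (0, 1, 0), and the translation then moves it to roughly (1, 3, 3), up to floating-point rounding. A 3x3 or 4x4 matrix encoding the same rotation and translation would map the point to the same place.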


Honestly, this is a chance to be creative and informative. You could have provided background and motivation for transforms: a "why we are here in the first place." There's actually no need to be formal or to worry about good definitions.

You could have provided your own perspective for why transforms exist and why they need to be used. That would have involved explaining why a point or a vector needs to be transformed into another vector, in the context of 3D graphics.

Or why transformations in 3D space work so well and reliably.

You basically elaborated on the how, like a simple technician, while neglecting the fundamental context of your study. That's great for recording some notes and for helping to reinforce a memorization requirement and routine. But going down the deep wells of abstract subjects like the intimidating ideas of advanced mathematics calls for some unusual approaches to looking at things standing in front of you.

Engineers know how things work. And mathematicians know why those things work. The smarter person sees through the black boxes.


I meant a "transform" as a linear map. I'm using the word as it is used in computer graphics (my background), so it's something that translates, rotates, scales, etc. other things (such as points). That is often a 3x3 or 4x4 matrix, although it can also be a vec3 translation and a quaternion, or just a quaternion. I think "Transform" is clear in the context of computer graphics, but I see what you mean about it being vaguely defined in my blog post.

