
An easy way to think about it is that GPUs are massively hyperthreaded. When one thread hits a data stall, another thread takes over the ALU resources until it too stalls, and so on through many, many threads before the scheduler cycles back to the original. But data stalls are very long, and if there isn't enough ALU work queued up across the other threads to cover the stall, you'll end up back on the first thread still waiting for its data anyway.
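A toy back-of-the-envelope model of that trade-off (all numbers here are made up for illustration, not tied to any real GPU): each thread alternates a few ALU cycles with a long memory stall, and the scheduler switches to another ready thread on every stall. ALU utilization is then roughly the total compute available across threads divided by the length of one compute-plus-stall window:

```python
# Toy model of GPU latency hiding, not a real simulator.
# Each thread repeatedly does `compute` ALU cycles, then stalls
# `stall` cycles waiting on memory. With round-robin switching,
# the ALU stays busy only if the other threads collectively have
# enough compute queued up to cover one thread's stall.

def alu_utilization(num_threads, compute=4, stall=400):
    # One full iteration of a thread spans compute + stall cycles.
    # Across that window, num_threads * compute cycles of ALU work
    # are available; utilization caps out at 1.0.
    window = compute + stall
    return min(1.0, num_threads * compute / window)

for t in (1, 8, 64, 128):
    print(f"{t:4d} threads -> {alu_utilization(t):.1%} ALU busy")
```

With these made-up numbers (4 compute cycles, 400-cycle stall), you need on the order of a hundred resident threads before the ALU stops going idle, which is why a kernel with too little arithmetic per memory access can't hide latency no matter how it's scheduled.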

If you want to understand low-level GPU architecture, https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-... is a great intro.
