
The CPU for complex stuff and GPU for simple number crunching is a really popular narrative. It's true in one sense and nonsense in another, depending entirely on the software being run.

If your program has one or two threads and spends all its time doing branchy control flow, a CPU will run it adequately and a GPU very poorly. If the program has millions of mostly independent tasks, a GPU will run it adequately and a CPU very poorly. Those are the two limiting cases, though, and quite a lot of software sits somewhere in the middle.

The most concise distinction to draw between the hardware architectures is what they do about memory latency. We want lots of memory, which means it ends up far from the cores, so you have to do something while you wait for accesses to it. Fast CPUs use branch predictors and deep pipelines to keep the cores busy; fast GPUs keep a queue of coroutines ready to go and pick a different one to step along when waiting for memory. That's roughly why CPUs have a few threads - all the predictor and wind-back logic is expensive. It's also why GPUs have many - no predictor or wind-back logic, but you need a lot of coroutines ready to go to keep the latency hidden.
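That latency-hiding trick can be sketched in a few lines. This is a toy model, not any real scheduler: the latency, access counts, and round-robin policy are all made up for illustration.

```python
MEM_LATENCY = 10  # cycles a memory access takes in this toy model

def run(num_threads, accesses_per_thread=4):
    """Cycles to finish all work when one core round-robins between
    resident threads, switching away whenever a thread stalls on memory."""
    ready_at = [0] * num_threads            # cycle each thread can next issue
    remaining = [accesses_per_thread] * num_threads
    cycle = 0
    while any(remaining):
        runnable = [i for i in range(num_threads)
                    if remaining[i] and ready_at[i] <= cycle]
        if not runnable:                    # everyone is waiting on memory
            cycle = min(ready_at[i] for i in range(num_threads)
                        if remaining[i])
            continue
        i = runnable[0]                     # issue one compute cycle, then
        remaining[i] -= 1                   # this thread waits on its access
        ready_at[i] = cycle + 1 + MEM_LATENCY
        cycle += 1
    return cycle

print(run(1))   # one thread: stalls dominate the runtime
print(run(11))  # enough threads resident: the stalls overlap and vanish
```

With one thread the core idles through every memory wait; with latency-plus-one threads there is always someone ready to issue, so the core never idles even though each individual access is just as slow.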

Beyond that, there's nothing either CPU or GPU can do that the other cannot. They're both finite approximations to Turing machines. There are some apparent distinctions from the hosted OS abstraction where the CPU threads get "syscall" and the GPU threads need to DIY their equivalent but the application doesn't care. Threads on either can call fprintf just fine. It makes a bit of a mess in libc but that's alright (and done in LLVM already).




Thanks, that is a good summary.

> If the program has millions of mostly independent tasks, a GPU will run it adequately and a CPU very poorly.

Now I would say:

CPU - optimized to execute threads with independent instruction streams and data use.

GPU - optimized to execute threads with common instruction streams and data layouts.

CPUs

As you noted: optimizing conditional branches is one reason CPU cores are more complex and larger.

CPUs also handle the special tasks of being the overall “host”: I/O, etc.

---

GPUs

One instruction stream over many cores greatly reduces per core footprints.

Both sides of conditional code are often taken, by different cores, so branch prediction is also dispensed with.

(All cores in a group step through all the instructions of both the if-true and else-false clauses, but each core only executes one branch and is inactive for the other: independent per-core execution, without independent code branching. When all cores in a group do take the same branch, the untaken clause can be skipped.)
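A minimal sketch of that masked execution in plain Python (the lane values, the abs() example, and the `simt_abs` name are all just illustrative; real hardware does this with per-lane predicate masks, not loops):

```python
def simt_abs(lanes):
    """One instruction stream over several lanes: the group walks through
    BOTH sides of the branch; a mask decides which lanes commit results."""
    mask = [x < 0 for x in lanes]   # lanes for which the if-branch is taken
    out = list(lanes)
    # if-true clause: every lane steps through it, only masked lanes commit
    for i in range(len(lanes)):
        if mask[i]:
            out[i] = -lanes[i]
    # else clause: the complementary lanes commit (here: keep the value)
    for i in range(len(lanes)):
        if not mask[i]:
            out[i] = lanes[i]
    return out

print(simt_abs([-3, 5, -1, 2]))  # every lane stepped through both clauses
```

Each lane spends time in both clauses but only writes during the one its mask enables, which is why divergent branches cost roughly the sum of both sides.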

Both CPU and GPU can swap between threads or thread groups, to keep cores busy.

Both rely on a memory hierarchy, from small and fast per-core storage out to large shared storage (registers, cache levels, RAM).



