Yes, GPUs provide massive parallelism. An NVIDIA RTX 4090 has 16384 "CUDA cores". Whatever these CUDA cores are, they must be much, much smaller than a CPU core. They do computations, though, and CPU cores do computations too. Why do CPU cores need to be so much larger that a CPU with more than 64 cores is rarely heard of, while GPUs have thousands of cores?
Read a little bit about vector instructions and you'll see what I meant in the previous comment. A CPU supports many, many niche instructions; it's way more flexible. A GPU is mostly just trying to multiply the largest arrays possible as fast as possible, so the architecture ends up very different. I don't think there's a quick way to grasp this without reading more about computer architecture and instruction sets, but you seem interested, so dive in :)
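To make the "flexible scalar work vs. bulk array work" contrast a bit more concrete, here's a rough sketch in Python using NumPy (purely illustrative; NumPy runs on the CPU, but its array operations are dispatched to vectorized kernels, which is the same data-parallel idea a GPU pushes to thousands of lanes):

```python
import numpy as np

a = np.arange(8, dtype=np.float32)
b = np.arange(8, dtype=np.float32)

# Scalar style: one multiply at a time, with branching and loop
# overhead on every iteration. This is the kind of irregular,
# control-flow-heavy work a big CPU core is built for.
out_scalar = np.empty_like(a)
for i in range(len(a)):
    out_scalar[i] = a[i] * b[i]

# Data-parallel style: one operation applied across the whole array.
# This is the shape of work GPUs (and CPU SIMD units) are built for:
# the same instruction over many elements at once.
out_vector = a * b

print(np.allclose(out_scalar, out_vector))  # → True
```

The second form is the one that scales to thousands of tiny cores: there's no per-element decision making, so each "core" can be stripped down to little more than an arithmetic unit.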