That's why such workloads gets a near linear scaling when using hyper-threads, unlike workloads like LLMs which are memory bandwidth bound.
That's why such workloads gets a near linear scaling when using hyper-threads, unlike workloads like LLMs which are memory bandwidth bound.