
Intel defends AVX-512 against critics who wish it to die a 'painful death' - blackcat201
https://www.pcworld.com/article/3571956/intel-defends-avx-512-against-critics-who-wish-it-to-die-a-painful-death.html
======
formerly_proven
"Intel defends fragmented ISA obviously meant to put market segmentation into
binaries just like they always wanted to in the good ol' Itanium days by doing
their usual 'up to 300X performance increase with this simple trick!' schtick"

Also let's not gloss over the hilarity of trying to compete against "literally
a grid of ALUs connected to memory" with x86 cores tuned for per-thread
performance for a delightfully parallel task such as "multiply these two
billion tensors".

HPC as a market at least makes some sense, since HPC applications feast on
double precision FLOPS and largely don't care about SP, the raison d'etre of
GPUs.

------
captainbland
I find this argument that AVX-512 is only really useful for HPC kind of odd.
Isn't it going to be useful in all the same kinds of domains that are
traditionally quite SIMD friendly like video encoding, image manipulation and
such? There must be a lot of professionals doing these kinds of tasks on their
laptops and not necessarily wanting to add a discrete GPU or going to the
cloud to get reasonable performance.

~~~
ohazi
It makes more sense when you read Torvalds' initial argument.

AVX-512 draws so much power that (especially in laptops) they have to lower
the base clock of the entire CPU. This doesn't matter too much if you're doing
HPC, because lowering the clock 20% but gaining 200% in throughput is still a
net win.

But the computer can't change the clock speed instantly, so there's a fairly
large penalty for just executing some AVX-512 instructions and then
immediately trying to go back to other tasks (which is what happens if you try
to use AVX-512 to implement memcpy). Or if you're trying to simultaneously use
other parts of that core for other tasks, they'll be slower.
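
To put rough numbers on that trade-off, here is a minimal sketch, not a measurement. It reads "20% lower clock, 200% more throughput" as a clock scale of 0.8 and a 3x per-cycle gain on the vectorized part, and it assumes the core sits at the reduced clock for the whole run. Both the function names and that simplification are illustrative assumptions, not anything from Intel or Torvalds.

```python
def speedup(p, clock_scale, vector_gain):
    """Overall speedup vs. running everything at full clock without AVX-512.

    p            -- fraction of the work (by baseline time) that vectorizes
    clock_scale  -- reduced clock as a fraction of normal (0.8 = 20% lower)
    vector_gain  -- per-cycle throughput multiplier on the vectorized part
    Assumption: the core stays at the reduced clock for the whole run.
    """
    time_scalar = (1.0 - p) / clock_scale            # scalar part just gets slower
    time_vector = p / (clock_scale * vector_gain)    # vector part: slower clock, wider ops
    return 1.0 / (time_scalar + time_vector)


def break_even_fraction(clock_scale, vector_gain):
    """Vectorized fraction at which the speedup is exactly 1.0."""
    return (1.0 - clock_scale) / (1.0 - 1.0 / vector_gain)


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.3, 0.5, 1.0):
        print(f"p={p:.1f}  speedup={speedup(p, 0.8, 3.0):.2f}")
    print("break-even p:", round(break_even_fraction(0.8, 3.0), 3))
```

Under those assumptions the break-even point is about 30% of the work being vectorizable. Below that, the slower clock on everything else costs more than the wide instructions save, which is the memcpy case in a nutshell.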

So back to whether it'll speed up video encoding... Perhaps, but the original
complaint was that "when is it appropriate to use AVX-512" already has a lot
of asterisks on a Xeon in a data center... It's going to have even more
asterisks in a power constrained laptop, which suggests that bringing it to
laptops may not be the best option, given the availability of other options.

Apparently it's difficult for Intel to look at AVX-512 objectively. The fact
that their HPC customers love it doesn't imply that it's well designed or
appropriate for laptops.

~~~
Matthias247
As someone who works on server/data-center software for workloads with very high
utilization I can concur that it seems hard to judge where AVX-512 would end
up having a net positive or negative impact.

E.g. all that my server might be doing is running a webserver/proxy. Now 50%
of requests might be using TLS, which requires crypto operations that could
potentially benefit from AVX-512. Other stuff might be using compression or
erasure-coding mechanisms that could benefit too. However all of those
operations are very short and work on a few kB worth of data. Then they are
interleaved with lots of other operations which would benefit more from the
higher frequency.

So am I now better off having AVX-512 active or not? That's only something that
benchmarks for the actual workload could tell.
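
Before benchmarking, the question can at least be framed with a toy per-request model. This is a sketch under loud assumptions: after a vector burst the core is assumed to stay at the reduced clock for some hold window, and the `hold` parameter, the function names, and all the figures are hypothetical placeholders, not measured values. It also ignores the stall during the frequency transition itself.

```python
def request_time_baseline(t_vec, t_scalar):
    """Time per request at full clock with no AVX-512 (arbitrary time units)."""
    return t_vec + t_scalar


def request_time_avx512(t_vec, t_scalar, clock_scale, vector_gain, hold):
    """Time per request when the vectorizable part uses AVX-512.

    t_vec, t_scalar -- baseline time of the vectorizable / scalar parts
    clock_scale     -- reduced clock as a fraction of normal (hypothetical)
    vector_gain     -- per-cycle gain on the vector part (hypothetical)
    hold            -- time the core stays downclocked after the burst (hypothetical)
    """
    vec = t_vec / (clock_scale * vector_gain)
    slow_all = t_scalar / clock_scale
    if slow_all <= hold:
        # The scalar work finishes before the clock recovers.
        scalar = slow_all
    else:
        # Part of the scalar work runs slow, the rest at full clock.
        scalar = hold + (t_scalar - hold * clock_scale)
    return vec + scalar


if __name__ == "__main__":
    # Short crypto burst, long scalar tail, long hold window.
    print(request_time_baseline(10, 90),
          request_time_avx512(10, 90, 0.8, 3.0, 1000))
    # Same request if the clock recovered instantly.
    print(request_time_avx512(10, 90, 0.8, 3.0, 0))
```

With a short vector burst and a hold window longer than the scalar work, the modeled request gets slower overall. With an instant recovery it gets faster. Which case a real server is in depends on numbers only a benchmark of the actual workload can supply.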

