So they implemented AVX512 on the Xeon server parts in microcode? That seems cra...

smitherfield · on May 31, 2017

It's fairly common practice with bleeding-edge vector instructions. The reasoning (assuming that is what's happening here) is that a theoretically minor performance regression (the cost of splitting one AVX512 instruction into two AVX2 operations in microcode) is usually far preferable to a CPU exception when a binary containing AVX512 instructions runs on a server. It also means you don't need a $15000 chip to test your AVX512 code.
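To make the "1x AVX512 to 2x AVX2" idea concrete, here is a rough conceptual model in Python. It is not how any real microcode works. It only shows that a lane-wise 512-bit operation gives the same result when done as two independent 256-bit halves. The function names and the choice of a 32-bit integer add are my own assumptions for illustration.

```python
# Conceptual model only: plain lists stand in for vector registers.
# All names here are hypothetical, not real intrinsics or microcode.

LANES_512 = 16        # a 512-bit register holds 16 x 32-bit lanes
LANES_256 = 8         # a 256-bit register holds 8 x 32-bit lanes
MASK32 = 0xFFFFFFFF   # keep each lane to 32 bits (wrapping add)


def add_256(a, b):
    """Lane-wise wrapping add over 8 lanes (stand-in for one 256-bit op)."""
    assert len(a) == len(b) == LANES_256
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def add_512_native(a, b):
    """Lane-wise wrapping add over all 16 lanes in a single step."""
    assert len(a) == len(b) == LANES_512
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def add_512_split(a, b):
    """The same add, done as two 256-bit halves and concatenated."""
    assert len(a) == len(b) == LANES_512
    lo = add_256(a[:LANES_256], b[:LANES_256])
    hi = add_256(a[LANES_256:], b[LANES_256:])
    return lo + hi
```

The split works for this add because no lane depends on any other lane. Operations that move data across lanes (shuffles, reductions) would need more than a simple two-way split, so the cost of emulating them would presumably be higher.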