Comparing AVX2 vs AVX512 in this case may be somewhat misleading. In .NET, even if you use 256-bit-wide vectors, it will still take advantage of AVX512VL whenever available to fuse chained operations into masked instructions, vpternlogd, etc.[0] (plus standard operations like stack zeroing, struct copying, string comparison, element search, and others can use the full width)[1]
So to force true AVX2, the benchmark would have to be run with `DOTNET_EnableAVX512F=0`, which I assume is not the case here.
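For anyone wanting to rerun it that way, here is a rough sketch of how the variable could be scoped to a single run. The `run_without_avx512` helper name and the `dotnet run -c Release` invocation are my own assumptions for illustration; I don't know how the original benchmark is actually launched.

```shell
# Hypothetical helper: runs the given command with DOTNET_EnableAVX512F=0
# set only for that one process, so the rest of the shell session is unaffected.
run_without_avx512() {
  env DOTNET_EnableAVX512F=0 "$@"
}

# Example usage (assumes the benchmark is a regular dotnet project):
#   run_without_avx512 dotnet run -c Release
```

Running the same command once with the wrapper and once without should give a like-for-like AVX2 vs AVX512 comparison on the same machine.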
[0]: https://devblogs.microsoft.com/dotnet/performance-improvemen...
[1]: https://devblogs.microsoft.com/dotnet/performance-improvemen...