
Please confirm 2nd gen Intel bugginess in YMM overlapping - Georgi_Kaze
Hi,
after some tests I found, by accident, a bug in how the Intel i5-2430M executes overlapping YMM register reads/writes. One fellow on CodeProject confirmed that an Intel 3rd gen CPU executes the same test code fine, so my question is:
Does this bug occur in other Sandy Bridge CPUs as well?

Please shed some light on this issue!

http://www.codeproject.com/Questions/1109847/Do-Intel-rd-gen-and-next-execute-this-code-in-a-bu

https://software.intel.com/en-us/forums/intel-moderncode-for-parallel-architectures/topic/625067
======
Georgi_Kaze
The testcode (C source and executable) is on Intel's forum:
[https://software.intel.com/sites/default/files/managed/2a/09...](https://software.intel.com/sites/default/files/managed/2a/09/buggy_AVX_compile.zip)

During decompression of 'alice29.txt', on i5-2430M, it fails:

D:\Tsubame\buggy_AVX_compile>Nakamichi_Tsubame_YMM_PREFETCH_4096_Intel_15.0_64bit_SSE41.exe
alice29.txt

Nakamichi 'Tsubame', written by Kaze, based on Nobuo Ito's LZSS source,
babealicious suggestion by m^2 enforced, muffinesque suggestion by Jim Dempsey
enforced.

Note: Conor Stokes' LZSSE2(FASTEST Textual Decompressor) is embedded, all
credits along with many thanks go to him.

Limitation: Uncompressed 8192 MB of filesize.

Current priority class is HIGH_PRIORITY_CLASS.

Allocating Source-Buffer 0 MB ...

Allocating Target-Buffer 32 MB ...

Allocating Verification-Buffer 0 MB ...

Compressing 152,089 bytes ...

-; Each rotation means 64KB are encoded; Done 100%

NumberOfFullLiterals (lower-the-better): 4

NumberOf(Tiny)Matches[Tiny]Window (4): 157

NumberOf(Short)Matches[Tiny]Window (8): 52

NumberOf(Medium)Matches[Tiny]Window (12): 11

RAM-to-RAM performance: 11 KB/s.

Compressed to 73,071 bytes.

Source-file-Hash(FNV1A_YoshimitsuTRIAD) = 0x1366,78ee

Target-file-Hash(FNV1A_YoshimitsuTRIAD) = 0x8cec,be70

Decompressing 73,071 (being the compressed stream) bytes ...

RAM-to-RAM performance: 1152 MB/s.

Verification (input and output sizes match) OK.

Verification (input and output blocks mismatch) FAILED!

...

------
brudgers
I believe the Hacker News ranking algorithms often penalize submissions
without links. The link might receive more attention if it is submitted
directly. It is ok to add a comment with additional information once the
thread has been created.

Good luck.

~~~
Georgi_Kaze
I didn't get what you are saying! Do you mean the buggy behavior should be
explained with code examples in here as well?

~~~
brudgers
Sorry for not being clear.

Submissions to Hacker News (this site) are made with the "Submit" link.

If there is text in the box labeled "text" then the URL in the box labeled
"url" is ignored. To submit a URL, leave the "text" field empty.

The reason to do this is that this site, Hacker News, ranks submissions
without a link lower.

After the submission has been created, additional information can be provided
in a comment.

