
IEEE 754 specifies correctly rounded results for +, -, *, / and sqrt, but it allows the transcendental functions to be off in the least significant bit (creating a fast, correctly rounded algorithm for these is apparently quite hard).

At one job, we were running our test suite on both Pentium 4 and Athlon machines, and got into trouble because the exp() function returned different values roughly once every 10,000 inputs. We changed the regression tests to add a fuzz factor to the floating-point comparisons.
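
A minimal sketch in C (not our actual test code) of what such a fuzz factor can look like: a comparison that accepts a small relative error instead of requiring bit-exact equality. The tolerance values and the sample exp() outputs below are made up for illustration.

    #include <math.h>
    #include <stdio.h>

    /* Treat two doubles as equal if they differ by at most a small
       relative tolerance (with an absolute floor for values near zero).
       The 1e-12 / 1e-300 tolerances are illustrative, not what we shipped. */
    static int nearly_equal(double a, double b, double rel_tol, double abs_tol)
    {
        double diff  = fabs(a - b);
        double scale = fmax(fabs(a), fabs(b));
        return diff <= fmax(rel_tol * scale, abs_tol);
    }

    int main(void)
    {
        /* Hypothetical exp() results from two CPUs, one ulp apart. */
        double a = 2.718281828459045;
        double b = nextafter(a, 3.0);

        printf("bitwise equal: %d\n", a == b);                            /* 0 */
        printf("fuzzy equal:   %d\n", nearly_equal(a, b, 1e-12, 1e-300)); /* 1 */
        return 0;
    }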




Your point about IEEE transcendentals is a good one, and it ties into the so-called table-maker's dilemma. But this isn't just an IEEE issue, since we're talking about specific CPU-level instruction sets. I'd be surprised if AMD wasn't precision-compatible with Intel on x87 and SSE at the lowest level. Were you writing the x87/SSE instructions in assembly or with intrinsics, so you're sure the same instruction sequence was generated in both cases of the example you mentioned?


The same binary gave different outputs depending on what CPU you ran it on.


Interesting, thanks for the confirmation! I'll have to try to replicate that.


Yeah, I've encountered this before while diffing the outputs of two versions of the same thing (like a testbench comparing behavioral code vs. netlist vs. C simulation).

In fact, this is one of the challenges in doing relational databases on GPU hardware. With relational databases you want the result of the same query to be identical across every run. Otherwise, relational databases are a near-perfect use case for GPU-style parallelization, i.e. running large numbers of parallel, read-only, lockless operations without cross dependencies (for the most part, for most relatively simple queries).
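
A minimal sketch in C (made-up data, not from any real database engine) of why that reproducibility is hard: floating-point addition isn't associative, so a sequential aggregate and a GPU-style pairwise (tree) reduction over the same rows can produce different sums.

    #include <stdio.h>

    /* Sequential left-to-right sum, like a single-threaded aggregate. */
    static double sum_sequential(const double *x, int n)
    {
        double s = 0.0;
        for (int i = 0; i < n; ++i)
            s += x[i];
        return s;
    }

    /* Pairwise (tree) sum, roughly the order a parallel reduction uses. */
    static double sum_pairwise(const double *x, int n)
    {
        if (n == 1)
            return x[0];
        int half = n / 2;
        return sum_pairwise(x, half) + sum_pairwise(x + half, n - half);
    }

    int main(void)
    {
        /* Values of very different magnitudes make the effect visible. */
        double x[8] = { 1e16, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, -1e16 };

        printf("sequential: %.17g\n", sum_sequential(x, 8));
        printf("pairwise:   %.17g\n", sum_pairwise(x, 8));
        return 0;
    }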



