
Fixing Mass Effect black blobs on modern AMD CPUs - Macha
https://cookieplmonster.github.io/2020/07/19/silentpatch-mass-effect/
======
gruez
>We fuzzed that function and found out that it does not give consistent
results between Intel and AMD CPUs when SSE2 instructions are used.

I'm sad this wasn't investigated further. Was one of the implementations not
standards compliant?

~~~
cwzwarich
Matrix inversion requires taking the reciprocal of a determinant. I don't have
the D3D binary to disassemble it, but chances are that they used the
RCPSS/RCPPS instructions to get an approximate reciprocal. The precision of
the approximation is specified. Both Intel and AMD satisfy the specification,
but with different results.

~~~
Const-me
Here’s Intel versus AMD relative error of the RCPPS instruction:
http://const.me/tmp/vrcpps-errors-chart.png (AMD is Ryzen 5 3600, Intel is
Core i3 6157U).

Over the complete range of floats, AMD is more precise on average, 0.000078
versus 0.000095 relative error. However, Intel has 0.000300 maximum relative
error, AMD 0.000315.

Both are well within the spec. The documentation says “maximum relative error
for this approximation is less than 1.5*2^-12”, in human language that would
be 3.6621E-4.

Source code that compares them by creating 16GB binary files with the complete
range of floats:
https://gist.github.com/Const-me/a6d36f70a3a77de00c61cf4f6c17c7ac
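
For a quick sanity check without generating the 16GB files, here's a minimal
sketch along the same lines (C++ with SSE intrinsics, sampling just the [1, 2)
binade rather than the complete range, so it's not the full comparison from
the gist):

    #include <immintrin.h>
    #include <cstdio>
    #include <cmath>

    int main() {
        float worst = 0.0f;
        // Sample the [1, 2) binade as a quick check rather than all floats.
        for (float x = 1.0f; x < 2.0f; x += 1e-6f) {
            float approx = _mm_cvtss_f32(_mm_rcp_ss(_mm_set_ss(x))); // hardware RCPSS estimate
            float exact  = 1.0f / x;                                 // correctly rounded reciprocal
            float rel    = std::fabs(approx - exact) / exact;
            if (rel > worst) worst = rel;
        }
        std::printf("max relative error: %g (spec bound: %g)\n",
                    worst, 1.5 * std::pow(2.0, -12));
        return 0;
    }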

~~~
creato
Funny, I think this graph is from a thread where I got roasted for arguing
that differences in CPU implementations meant that Intel (or anyone else
really) would need to be careful about shipping SIMD software on processors
they don't test.

~~~
Dylan16807
Rightfully so. Because they were replacing the SIMD code with non-SIMD code
that was similarly untested, with the side effect of crippling performance on
their competitor's chips. That's a really bad kind of "being careful".

~~~
creato
I'm not aware of instructions that have this implementation-defined behavior
risk in non-SIMD code, and you can be sure that Intel has tested this. If
Microsoft had shipped a non-SIMD version of D3DXMatrixInverse, this bug likely
could not have existed (or it would have been caught on the developer's
machines).

In this case (games), they probably made the right tradeoff despite the bug.
But in general? I don't want to rehash the argument, I'm just really, really
glad I'm not the one flipping that switch (or subject to the hordes of HNers
who think I'm a monster for not flipping it).

------
frabert
Just a few months ago I was replaying this game and looking around for fixes
for this bug. This is incredible, and _very_ welcome. Thank you for the
amazing write-up, it's super interesting.

~~~
Macha
Just to clarify, I am not the original author; I'm not sure if they read HN.

~~~
masklinn
The "Rafael" mentioned in the article seems to have answered a comment above.

------
de6u99er
It's completely off topic, but articles like this one are the reason why I
love HackerNews so much and don't tell anyone about it.

~~~
Impossible
If you love HackerNews, why wouldn't you tell anyone about it?

~~~
AdrianB1
HN is such a nice place that I feel bad telling people about it, only for them
to come and ruin it like many Internet forums I've seen in the past. Of all
the people I know, there are maybe 5 who would have a place here, and most of
them are already long-time readers.

~~~
Eupolemos
Yep, I never link to HN on social media.

I'll talk about it in person with people I know at work, or I'd mention it on
my old blog - but I don't want to risk exposing it to the "unwashed masses".

But maybe I'm being too cautious - it is pretty in-depth here. That alone
might dissuade most of the shitposters. And the karma requirements for
downvoting seem sufficient.

------
badsectoracula
Apparently one common fix for this, according to some Reddit post, is to use
frame capping. It might be that I am too biased against delta timing in games
(i.e. using the time delta between frames for animations) because I've seen
_way_ too many games break using it, and it is the reason I run any game older
than ~7 years with frame capping enabled. But given the description on the
site, my guess at what is really happening is this: since the game uses mostly
baked lighting, they sample the environment lighting to apply to the entities,
and then - since the sampled positions, be they probes or lightmap lumels, are
few and in fixed locations - they interpolate them as the entities move, using
the time delta between frames to make that interpolation look smooth. Then as
the game runs on faster hardware, it ends up with smaller time deltas, which
in turn breaks things, because time deltas are evil :-P.

Though that is obviously a guess (that this is what happens, not that time
deltas are evil, that is a fact :-P) - I've done something similar in the past
for getting light on dynamic entities from a static environment, so some
things did click.
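
To illustrate the kind of failure being described (a hypothetical sketch, not
code from Mass Effect - the names and constants are made up): per-frame
smoothing scaled by the time delta behaves differently depending on the frame
rate, and with very small deltas the increments can round away entirely in
single precision.

    #include <cstdio>

    // Hypothetical per-frame light smoothing scaled by the frame delta.
    float smoothTowards(float current, float target, float dt) {
        const float rate = 8.0f;                 // arbitrary tuning constant
        return current + (target - current) * rate * dt;
    }

    int main() {
        float at30fps = 1000.0f, at2000fps = 1000.0f;
        const float target = 1001.0f;
        // Simulate one second of wall-clock time at two different frame rates.
        for (int i = 0; i < 30;   ++i) at30fps   = smoothTowards(at30fps,   target, 1.0f / 30.0f);
        for (int i = 0; i < 2000; ++i) at2000fps = smoothTowards(at2000fps, target, 1.0f / 2000.0f);
        // The results differ: the blend factor depends on the frame rate, and
        // at the high frame rate the tiny per-frame increments eventually
        // round away in single precision, so the value stalls short of the target.
        std::printf("30 fps: %f, 2000 fps: %f\n", at30fps, at2000fps);
        return 0;
    }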

~~~
colonwqbang
Just out of curiosity, what is the problem with delta timing and what is the
alternative?

~~~
wtallis
It's hard to make your game physics simulation numerically stable and
consistent across a wide range of timesteps, unless you make it really
complicated. Things like collision detection are often implemented in a way
that only works for a certain range of object speeds and timesteps—and that's
just one of many glitches that commonly result from running the physics
simulation at an unintended frequency. Dividing things by unexpectedly tiny
numbers can produce surprisingly big results.

The correct solution is to run the simulation on a different thread from the
rendering, so that the simulation can be run at an appropriate frequency and
the rendering can proceed at whatever framerate the user's hardware is capable
of. The more commonly used "solution" is for the game to run the simulation
and rendering on the same thread, and cap the framerate as a way to indirectly
cap the simulation update frequency. Occasionally you find a game that merely
_assumes_ that the framerate won't go over 60Hz, and if your monitor is
faster, the game itself runs in fast-forward.
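
Threading is one way to do it; a common single-threaded variant of the same
decoupling is the fixed-timestep accumulator loop. A minimal sketch, with
hypothetical update/render hooks (not any particular engine's API):

    #include <chrono>

    // Hypothetical hooks; a real engine would provide these.
    void updatePhysics(double stepSeconds);
    void render(double blendFactor);

    void gameLoop() {
        using clock = std::chrono::steady_clock;
        const double kStep = 1.0 / 60.0;   // the simulation always advances at 60 Hz
        double accumulator = 0.0;
        auto previous = clock::now();

        for (;;) {
            auto now = clock::now();
            accumulator += std::chrono::duration<double>(now - previous).count();
            previous = now;

            // Advance the simulation only in whole, constant steps.
            while (accumulator >= kStep) {
                updatePhysics(kStep);
                accumulator -= kStep;
            }

            // Render as fast as the display allows; the leftover fraction of a
            // step can be used to blend between the last two physics states.
            render(accumulator / kStep);
        }
    }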

~~~
hu3
Interesting but:

1) How do you deal with hardware that can't run the simulation at the
appropriate frequency?

2) How do you keep simulation and animation smooth and linear over time when
facing processing oscillations, if not by using time deltas?

3) Is there a graphics/processing-demanding game that doesn't use time deltas?

~~~
johncolanduoni
Usually a constant time step is used, and you throttle the rate at which you
run physics ticks to make that match wall clock time. It’s common for game
physics engines to also run iterative integrators for a fixed number of steps
per tick instead of to convergence, which reduces variation in work per tick.
If the system can’t keep up, you start dropping physics frames.

You keep animation smooth by decoupling it from physics; the output of your
physics engine will generally include at least linear and angular velocities
you can use for interpolation. This kind of thing is necessary anyway if
you’re running your physics simulation on a server and have to communicate to
the renderer over the network.
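
To make the interpolation point concrete, here's a small sketch (illustrative
names only, not from any particular engine) of a renderer extrapolating a
body's position from the last physics state and its velocity:

    // The renderer extrapolates from the most recent physics state using its
    // velocity, so motion stays smooth even though physics ticks at a fixed rate.
    struct BodyState {
        float position[3];
        float velocity[3];   // part of the physics engine's output
    };

    // secondsSinceTick: time elapsed since the last physics tick (< one tick).
    void renderedPosition(const BodyState& s, float secondsSinceTick, float out[3]) {
        for (int i = 0; i < 3; ++i)
            out[i] = s.position[i] + s.velocity[i] * secondsSinceTick;
    }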

------
dmos62
Super tangential and subjective, but I found the writing in the Mass Effect
games subpar. I know a lot of people think the opposite. I've spent a fair
amount of time looking for games with writing I like, but it seems to be an
uphill battle. Someone suggested not playing games for the writing, which
somewhat makes sense and has led me to do more reading, which I've been
thoroughly enjoying.
I replaced looking for games with looking for books and find it more
rewarding. That said, for the purposes of a survey, if nothing more, what game
stories did you enjoy?

To start it off, Kingdom Come: Deliverance was a big deal for me (though not
sure if I especially liked the story or just that it didn't get in the way),
and Planescape: Torment is a favorite too (still remember the wonder with
which I explored it the first time).

~~~
silveroriole
The older I get the more I find all games to have horrible writing. Even games
I used to love like ME are painful on replay. Most conversations boil down to
“I don’t want to do a thing.” “But I’m Commander Shepard and I’m picking the
blue option.” “OK, I will do the thing now!” Recent games with ‘good writing’
(pillars of eternity, divinity, MGS5) have all left me completely cold and
wondering if I’m even playing the same game as reviewers are.

Maybe I can recommend games with a strong atmosphere instead. Like Morrowind,
Deadly Premonition, Sleeping Dogs, Kentucky Route Zero, Silent Hill 2/3. Nier
Automata if you like anime (which I don’t, so I didn’t find it as thrilling as
many game reviewers seemed to).

~~~
Sharlin
Have you tried Planescape: Torment? The original Deus Ex (the prequels are
definitely lackluster)? VtM: Bloodlines? Life is Strange? Dishonored?

~~~
blaser-waffle
+1 for VtM: Bloodlines. There is a review by a dude who goes by sseth (that's
with 2 x s) that does a good job explaining why it's great.

------
peter_d_sherman
">Since PIX does not “take screenshots” but instead _captures the sequence of
D3D commands and executes them on hardware_ , we can observe that executing
the commands captured from an AMD box results in the same bug when executed on
Intel."

Mass Effect bugs aside, this is interesting!

Before this article, I never knew that DirectX (D3D) commands could be proxied
from PC to PC; I think that's a great capability!

Also, if that's the case, and apparently it is, then it would seem like you
could do something like X-Windows/X11 but for PCs running Windows over a
network by proxying D3D commands. And if Microsoft wants to be proprietary
about that, then the same thing could probably be done with open source
software using OpenGL commands, that is, proxying them over a network
connection to get an X-Windows-like effect - if I am understanding the
underlying technology correctly, or am I mistaken?

~~~
pjmlp
That is just one example among many of why most AAA studios favour proprietary
APIs.

Khronos just does specifications and then lets its partners come up with the
actual tooling, which means that you end up with OEM-specific SDKs, most of
them very thin in capabilities.

------
edem
I just finished Mass Effect (for the 342345th time) and I only encountered
this on Ilos. I thought that it was just a temporary artifact on my machine
and I haven't seen it since.

------
ficklepickle
I really enjoyed this article! I'm not even a game dev, but I still found the
article very approachable and engaging. Thanks!

------
YetAnotherNick
Am I correct in understanding that there is a bug in `D3DXMatrixInverse`, or
is it that some assumption is wrong?

~~~
ndepoel
No, it's not necessarily a bug but rather multiple systems working as
designed, yet coming together to produce incorrect results.
`D3DXMatrixInverse` makes use of hardware-implemented fast math routines by
design, for performance reasons. These implementations may differ depending on
the CPU model, but they are all valid as long as they stay within the
precision bounds that the instruction set documentation specifies.

What has happened here is that a new implementation of these fast math
routines appeared that returned results that were unexpected by the game
engine and the engine was not robust enough to deal with these variations.
This is not too surprising, as these AMD CPUs did not exist yet when the game
was developed, so QA could not have tested the game's compatibility with them.

The solution was to divert calls to `D3DXMatrixInverse` to another matrix
inversion routine that makes use of more accurate floating point math, which
produces identical results on all tested hardware.
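
As a rough illustration of the difference (this is not SilentPatch's actual
code, just the two approaches in miniature): the approximate-reciprocal path
is fast but vendor-dependent, while a real divide is correctly rounded under
IEEE 754 and therefore bit-identical everywhere.

    #include <immintrin.h>

    // Fast path: RCPPS returns an estimate with up to ~3.66e-4 relative error,
    // and the exact bits of that estimate differ between CPU vendors.
    __m128 scaleRowApprox(__m128 row, float det) {
        return _mm_mul_ps(row, _mm_rcp_ps(_mm_set1_ps(det)));
    }

    // Precise path: the divide is correctly rounded per IEEE 754, so every
    // compliant CPU produces bit-identical results.
    __m128 scaleRowPrecise(__m128 row, float det) {
        return _mm_div_ps(row, _mm_set1_ps(det));
    }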

------
treefry
It’s an awesome investigation and well written!

------
shmerl
How does Wine handle it on AMD CPUs?

~~~
sascha_sl
It doesn't. You generally install the real D3DX runtimes to run games on Wine,
and there it would then do the same thing on AMD CPUs.

~~~
nwallin
For the past 3 years or so, there has been a native d3d9 implementation in
Mesa which is used by Wine. There's also vkd3d, which is a Wine implementation
of DX12 on top of Vulkan.

I don't think the real Direct3D binaries are used by default anymore, unless
you go out of your way to configure it that way.

~~~
kevingadd
D3DX is a user-space library, though, separate from D3D itself. So if WINE or VKD3D
ship their own open-source version of D3DX9, you could use that, but you could
also use the original Microsoft version of the .dll - the individual numbered
versions (d3dx9_31.dll, etc) exist to facilitate using the same version that
your game was compiled and tested against. The D3D shader compiler
(D3DCompiler) dlls are still versioned in this way for games to link against
as well. If you look inside the SxS directory for a current installation of
Google Chrome, you'll find a D3DCompiler_47.dll sitting in there that they
deploy instead of relying on whatever the OS has available.

------
bzb3
Buying AMD GPUs has been in my experience a terrible gamble. I still remember
when I had to wait months to be able to play GTA V until they released working
drivers.

~~~
throwaway8941
Same here. AMD is still a much better choice for me since I spend 99% of my
time on Linux, but the GTA V situation was something else. I ran into the same
easily reproducible crash, it was reported by me and many other users, and we
waited something like 8 or 9 months for it to be fixed.

~~~
simion314
Do you know if the problem was in the driver or in the game? There are some
games that worked by accident and years later fail to work on newer
driver/Windows versions. I had such a bug with Oblivion, Nvidia and Win7, but
for some reason I could play the game under Wine on Linux.

~~~
paulmd
Not the person you were responding to, but most of AMD's GPU troubles come
down to the Windows driver being hot garbage. They opened the Linux driver
enough that others can do a lot of the work for them. Things that don't work
under the Windows driver usually work great under Linux.

Radeon Tech Group's in-house software support has been abysmal since the days
they were called ATI. It's been a chronic problem for both their drivers and
their GPGPU ecosystem; NVIDIA can afford more and better engineers to develop
libraries that support the ecosystem and to make sure that everything works
properly on their hardware. AMD's greatest successes have come when they get
the open-source community to maintain and develop something for them.

Yes, AMD is operating on a much smaller budget but in the end it doesn't
matter too much to the consumer when they can't play Overwatch for 9 months
because AMD has a driver bug that causes "render target lost" errors leading
to competitive bans for repeated disconnects, or... whatever the fuck happened
with Navi.

Part of what you are paying for when you buy a graphics card is the ongoing
software support, and AMD has always fallen flat on their face into a dumpster
of rusty used syringe needles in that department.

~~~
account42
> but most of AMD's gpu troubles come down to the windows driver being hot
> garbage.

That definitely needs some evidence to back it up. In my experience, most game
rendering code is hot garbage that has been hammered just enough to work on
the tested platforms (read: mostly nVidia).

> They opened the linux driver enough that others can do a lot of the work for
> them.

While there are outside contributions, most work on radeonsi (OpenGL) and the
amdgpu kernel driver is done by AMD employees. The AMD Linux driver is better
because it has less legacy code, can share more work with other drivers, can
benefit from users who are more used to filing detailed bug reports and test
development builds, and yes because users and other interested parties (Valve,
Red Hat, ...) can contribute fixes for their pet issues - but it is still AMD
doing most of the work.

For Vulkan on Linux with AMD graphics the most popular driver is entirely
community developed. But AMD's Vulkan driver also works from what I hear.

------
cletus
So the last PC I built was with the Intel 9700 and that was ~2 years ago. At
that time Ryzen was pretty good but didn't have the proven track record or
price-performance it does now.

Now, in terms of pure price-performance, I think I'd want to buy Ryzen, but...
this sort of stuff is what scares me off AMD. I just want my crap to work.
Reading some of the comments here, one commenter suggested there's an
instruction that computes an approximate reciprocal (used when inverting a
matrix) where both Intel and AMD are standards compliant but the instruction
produces different results on each chip.

Of course I don't know if this is true or not, but saving $200 on a PC build
is just not something that justifies (to me) dealing with this kind of issue
or, worse, potentially dealing with issues like these.

I buy NVidia for pretty much the same reason.

~~~
qes
> saving $200 on a PC build is just not something that justifies (to me)
> dealing with kind of issue

This kind of issue... you mean a visual glitch with limited scope and
available workarounds in a video game that's more than a decade old?

Yes, what a serious issue. /s

~~~
paulmd
Or the Destiny 2 bug (broken RDRAND implementation). Or the segfault bug
(manufacturing error with broken uop cache). etc etc. There have been a fair
number of teething problems affecting AMD users.

They are not incorrect that there is a certain turnkey nature to using Intel,
and certain merits to using a core that has been basically only incrementally
refined for the last 10 years.

And yes, Intel has processor errata too, but AMD had to work through some
major ones because it was a brand new architecture. They also chose not to
take corrective action for some rather major ones - the fix for the segfault
bug should have been disabling the uop cache or doing a recall; instead they
just let people go on thinking the intermittent crashes they experienced
(including in Windows) were software-related. They entirely declined to patch
the Ryzen Take A Way bug, which leaks metadata about page table layouts and
breaks KASLR on their processors, leaving users even more vulnerable to
Spectre v2. etc.

