
GameCube Emulation and Pixel Processing Problems - delroth
https://dolphin-emu.org/blog/2014/03/15/pixel-processing-problems/
======
hibikir
Since the Dolphin team puts playability above accuracy, the emulator can keep
moving the goalposts towards perfect emulation, while still remembering that
an unplayable emulator does nothing to preserve the source material.

Compare this to the MAME situation, where most early 3d arcade games are
completely unplayable, and will remain like this for the foreseeable future,
because the way computers are going, there is no way in hell we'll be able to
emulate those old, custom graphics cards with full accuracy and full speed
using only general purpose CPUs.

~~~
magila
Last time I messed around with MAME pretty much everything which was playable
ran at full speed, even stuff like STV, Namco System 22, and Seattle based
games. This was on my 2600K running at 4GHz, which isn't exactly the latest
and greatest.

From what I understand the most demanding thing to emulate isn't the 3D
hardware anyways, it's the high clock rate CPUs. Most of the latest 3D
hardware supported by MAME is implemented with ASICs which have a relatively
high level interface. This allows the emulation to be pretty highly optimized
since the low-level details aren't visible to the game software anyways. CPUs
are a whole different story though. There you are stuck emulating individual
instructions which gets pretty hairy when the CPU you're emulating can do 200+
MIPS.
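
To make the per-instruction cost concrete, here is a minimal sketch of a fetch-decode-execute loop. The three opcodes (`li`, `add`, `jnz`) and the tuple encoding are hypothetical, invented for illustration; they do not correspond to any CPU that MAME emulates. The point is only that every guest instruction costs several host operations, which adds up quickly at 200+ MIPS.

```python
# Toy interpreter for a made-up machine (hypothetical opcodes, not a real CPU).
def run(program, max_steps=1000):
    regs = [0, 0, 0, 0]
    pc = 0
    steps = 0
    while pc < len(program) and steps < max_steps:
        op, a, b = program[pc]              # fetch and decode
        steps += 1
        if op == "li":                      # load immediate: regs[a] = b
            regs[a] = b
        elif op == "add":                   # regs[a] += regs[b], wrapping at 32 bits
            regs[a] = (regs[a] + regs[b]) & 0xFFFFFFFF
        elif op == "jnz":                   # jump to index b if regs[a] is non-zero
            if regs[a] != 0:
                pc = b
                continue
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
    return regs, steps

# Count register 0 down from 5 by repeatedly adding 0xFFFFFFFF (-1 mod 2**32).
COUNTDOWN = [("li", 0, 5), ("li", 1, 0xFFFFFFFF), ("add", 0, 1), ("jnz", 0, 2)]
regs, steps = run(COUNTDOWN)
```

Real emulators typically replace a loop like this with a recompiler where they can, but the guest instructions still have to be handled one by one.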

------
heydenberk
It's interesting that this has ramifications for the emulated console(s)
within the console. On one hand, it's an amazing technological achievement to
emulate a system well-enough that it can emulate another. On the other hand,
however, it's staggeringly inefficient. Emulating a system is inefficient, but
it makes sense because it prevents the need to keep all kinds of hardware on
hand. Emulating a system that emulates a system compounds the inefficiency and
is unnecessary, but is a really cool achievement.

~~~
delroth
Fun fact: Dolphin emulates some N64 games better than current PC N64 emulators
do. For example, Mario Tennis (N64) is considered very difficult to emulate,
but the official Nintendo emulator running in Dolphin has almost no problem
with it!

~~~
heydenberk
I'd have to imagine this is because the game executed by the virtual console
has been simplified and improved by Nintendo since they have direct access to
the source.

~~~
derefr
Or, possibly, that each Virtual Console ROM ships with a set of shims or
plugins to the emulator, to add extra logic and workarounds specific to each
game.

Which is pretty much exactly how NES and SNES cartridges worked, come to think
of it—except that the console they were patching was hardware, so they had to
add new physical chips to do it. (Speaking of, I've always wondered why no
console just ships with some FPGAs inside that are free for each game to
program on startup.)

~~~
simcop2387
It'd be doable to have an FPGA there, but it'd have to be an SRAM backed FPGA,
which will make it more expensive from what I've seen. Otherwise they're
usually flash based which will wear out after a while. It'd also likely suffer
a similar fate to the random processors in the playstation systems where they
barely got any use in games to a serious extent since they were only on one
system.

------
Sarkie
I bloody love all content on this blog, the hex math fail, the mobile drivers,
always the content is great. Off to bed to read this on my tablet. Thanks for
the link

------
joevandyk
Why is floating point math faster than integer? (Seems like integer should be
simpler)

~~~
neobrain
Basically, what kevingadd said.

As a matter of fact, AMD GPUs are using the floating point ALUs to perform
integer math (note: this might have changed with their GCN architecture).
However, given that the mantissa of IEEE floats is just 24 bits long, the ALU
also can just handle 24 bits, which is not sufficient for full 32 bit math.
Hence, for "true" integer arithmetic, the ALUs need to be double-pumped or
emulated via floats (i.e. the dirty tricks which we avoided need to be done in
the driver instead - but at this stage it can actually be done reliably, even
if it's still ugly).
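
As a rough illustration of the 24-bit limit, the sketch below rounds Python numbers to IEEE 754 single precision (23 stored significand bits plus one implicit bit) and checks which integers survive the round trip. The helper name `to_float32` is made up for this example, and it says nothing about how any particular GPU's ALU is actually built.

```python
import struct

def to_float32(x):
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

LIMIT = 2 ** 24  # every integer up to here fits in a 24-bit significand

below = to_float32(LIMIT - 1)   # still exact
above = to_float32(LIMIT + 1)   # not representable, rounds to a neighbour
```

Above `2 ** 24` only every second integer is representable, so a float datapath alone cannot carry full 32-bit integer results.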

I assume the situation is similar for Nvidia GPUs. Either way both vendors
said their GPUs aren't designed for optimal integer performance - so that's
why we expect the performance drawbacks of integer usage to become less and
less of an issue in the future.

~~~
foxhill
i'd like to see a source for that claim, regardless, pre-GCN GPUs had double
precision support that wasn't purposefully knee-capped. 52 bits of mantissa
would have been plenty.

as for nvidia knee-capping integer arithmetic on their GPUs.. i terribly doubt
that is the case - pointer arithmetic (and hence memory access) requires
integer operations, and i've seen very little evidence to suggest that there
are any artificial issues with it.

~~~
neobrain
There's no public source for the claim, it's what an AMD engineer told me via
private e-mail (and I don't want to publish private mails for obvious
reasons).

That said, there's a "fast" path on AMD GPUs for shader code which only
requires 24 bits of integer precision. Those are actually exactly enough for
GameCube/Wii GPU emulation, however I'm not sure if their shader compiler
properly optimizes our code to use that path.

------
MaxGabriel
Hah! I hadn't played Twilight Princess for many years before playing
it on Dolphin, so I actually thought Midna was supposed to have those lava-
arms!

Being on a mac, I'm tied to OpenGL so I'm hoping this doesn't hurt me too
much.

~~~
archagon
Do you have a Windows license lying around? Bootcamp might be worth it: I
almost get 2x performance in Windows compared to OSX!

~~~
MaxGabriel
I do actually, for windows 8. But I don't think they give you a disk image,
making it a pain to install.

~~~
archagon
You can actually download the ISO from a Microsoft CDN, but I don't have a
link on me right now.

~~~
pbhjpbhj
You can get ISOs for MS Windows 7, eg links at
[http://bratnin.narod.ru/Windows_7.html](http://bratnin.narod.ru/Windows_7.html),
from msft.digitalriver.com servers.

You can also get an upgrade application from [http://windows.microsoft.com/en-
US/windows-8/upgrade-product...](http://windows.microsoft.com/en-
US/windows-8/upgrade-product-key-only) to move Win7 up to Win8.

But AFAIK you can't get a MS Windows 8 ISO to download from Microsoft itself?

~~~
archagon
That's what I meant: I'm pretty sure digitalriver.com is Microsoft's own CDN,
used for App Store and MSDN downloads. You can even verify the hashes on an
actual Microsoft subsite — possibly here? [http://msdn.microsoft.com/en-
us/subscriptions/downloads/](http://msdn.microsoft.com/en-
us/subscriptions/downloads/)

------
Buge
I don't understand those overflow equations.

In the first equation (it admits it's wrong), I plug in 1 and get 0.00390625.

In the second equation, I plug in 1 and get -0.992188.

I think the answer is supposed to still be 1, because there should be no
overflow until it is 256.

I thought maybe I was misunderstanding and the equation isn't just to handle
overflows but is supposed to add 1 then handle overflows. So I plugged in 0
into the equations and they both outputted 0. So they aren't trying to add 1.

Wouldn't the correct equation be frac(value / 256.0) * 256.0 ?
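
For reference, a small sketch of the formula proposed in the line above, assuming `value` is an integer-valued float (not the fraction-out-of-256 encoding described in the reply below) and taking `frac(x)` to mean `x - floor(x)`. The function names are invented for this example. It is compared against the single integer operation it stands in for.

```python
import math

def wrap8_float(value):
    """frac(value / 256.0) * 256.0, with frac(x) = x - floor(x)."""
    scaled = value / 256.0
    return (scaled - math.floor(scaled)) * 256.0

def wrap8_int(value):
    """The same wraparound using an actual integer operation."""
    return value & 0xFF

one = wrap8_float(1.0)       # stays 1.0, no overflow yet
zero = wrap8_float(256.0)    # wraps to 0.0
```

In double precision every step here is exact, because dividing and multiplying by 256 only changes the exponent. That guarantee does not automatically carry over to lower-precision GPU floats.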

~~~
anon4
The value is encoded as a fraction out of 256. So 1 is 1/256, up to 255 being
255/256.

It's still not completely correct, some values passed through that function
come back as +/- 1e-16 and 255/256 becomes 0. Showing us again how floats are
bloody hard to work with for the average programmer.

Edit: saw neobrain's comment, please disregard my post. Still, shouldn't there
be a round in there too?

~~~
neobrain
"Still, shouldn't there be a round in there too?"

I don't think so, since that particular code was just meant to emulate integer
overflows (in contrast to the limited decimal precision of integers). If you
were to emulate the precision as well, it would likely need an additional
round around everything indeed, i.e. something like round(value - 2.0 *
round(0.5 * value * (255.0/256.0)) * 256.0).

If anything, this discussion shows that it's getting annoyingly complex to
find the correct formula though, especially if all corner cases are supposed
to be handled correctly. Oh right, and the real fun begins when you try to
emulate 24 bit integers, for which the proposed method doesn't work at all
because floats only have 23 bits of mantissa :)
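
One way to see the 24-bit problem, sketched under the assumption that values are held as integer-valued single-precision floats and wrapped with a `frac`-style formula: the sum of two 24-bit values can need 25 bits, so it is rounded before the wraparound is even applied. The helper names below are hypothetical.

```python
import math
import struct

def f32(x):
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

def add_wrap24_float32(a, b):
    """Add two values and wrap at 2**24, rounding to float32 after each step."""
    total = f32(f32(a) + f32(b))            # the low bit can be lost here
    scaled = f32(total / 16777216.0)
    return f32((scaled - math.floor(scaled)) * 16777216.0)

small = add_wrap24_float32(5, 7)            # small values behave as expected
wrong = add_wrap24_float32(0xFFFFFF, 2)     # float path
right = (0xFFFFFF + 2) & 0xFFFFFF           # integer path
```

Here the float path returns 0.0 while true 24-bit integer arithmetic gives 1, because 2**24 + 1 has no single-precision representation.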

Chances are there are simpler ways to emulate this stuff, I really don't know.
Would be interesting to hear from GPU driver developers how integers are
emulated within the driver via floats if hardware does not support integers :)

------
pauldacheez
My body is not ready for the amount of performance complaints this'll cause on
the forums.

At least I have an excuse to buy a GTX 780 now.

~~~
ginko
Replacing several FP operations with a single integer OP can only improve
performance.

~~~
tasty_freeze
It depends on the context. In the case of a GPU, where floating point resources
are plentiful and integer operations are limited, your statement is false.

~~~
Jasper_
Depends on the GPU, too. Some of the more popular mobile GPUs don't even have
integer ALUs anymore.

------
anonymousab
Very cool. I have to wonder if they've ever been contacted by Nintendo over
this though - many recent-ish games have been in a playable state for several
years now and you'd think functional Wii emulation would attract negative attention.

~~~
neobrain
As far as I can tell, none of the developers have been approached by Nintendo
so far. And I guess by now they should have noticed us in some way.

EDIT: Fixed incorrect double-negative

~~~
Guvante
They have certainly noticed, but likely won't do anything about it. IANAL but
I am not even sure if emulation of this kind can be considered illegal. And if
I were in their shoes I would much prefer a group of random people create a
playable open source emulator of my old hardware. If they ever decide to leave
the hardware business it makes going down that path themselves much easier.

~~~
keeperofdakeys
As long as you are doing black-box reverse engineering. If you tried to
disassemble the gamecube or wii software, then you likely are breaking the
law.

A similar situation comes up with Gnash, the GNU adaptation of flash. They
require developers to have never installed flash, which requires signing the
EULA, which includes a clause about reverse-engineering the program.

~~~
cookiecaper
Disassembly is actually explicitly legalized for the purposes of reverse-
engineering. It's just distributing any software that circumvents copy
protections that's illegal, whether it's the result of a disassembly or not.

People avoid disassembly in clean-room implementations out of an abundance of
caution. If it's evident that the logic was ripped from disassembled
executables, you'll have a harder time defending against patent claims or
frivolous copyright claims.

IANAL, but this is my understanding.

------
mintplant
On a related note, the branch browser feature of the Download Page [1] is
completely broken, displaying only "Branch list" in place of what I assume
should be an actual branch list.

[1] [https://dolphin-emu.org/download/branches/](https://dolphin-
emu.org/download/branches/)

~~~
delroth
Yeah, we recently moved away from Google Code to GitHub and updating the
website hasn't been in my priorities. We moved from a "branches in the main
repository" model to a more classic, GitHub style, "everyone has his fork"
model, so tracking branches is not really even possible anymore.

I'll probably just remove this link from the downloads page.

