
How Diablo Was Reverse-Engineered Without Source Code [video] - bane
https://www.youtube.com/watch?v=5tADL_fmsHQ
======
jchw
I really appreciate the rise of YouTubers who are fairly knowledgeable in
retro gaming and computing niches. David Murray (of The 8-bit Guy) and Clint
Basinger (of Lazy Game Reviews) are two other YouTubers who consistently put
out interesting content about retro stuff. Of course they all have their areas
of expertise and interest, and sometimes things are vague or occasionally
outright incorrect, but I think the entertainment value more than makes up for
it.

This video is really cool, though. It’s full of interesting anecdotes, and I
enjoy the visual content showing things like looking at the binary in Ghidra.

Even though the answer does just boil down to discovering debug symbols and a
version with debug assertions enabled, I still recommend watching as they go
above and beyond, especially near the end when introducing some of their own
work.

(Edit: fixed spelling error in Clint’s surname.)

~~~
Nition
I'll recommend GameHut[1], a channel run by Jon Burton[2] who worked on a lot
of classic titles himself. The clickbaity titles and thumbnails are a little
unfortunate, but there's a lot of interesting content on how things were
programmed, with all the details broken down. e.g. 'Mickey Mania's
"Impossible" 3D Chase'[3].

[1]
[https://www.youtube.com/channel/UCfVFSjHQ57zyxajhhRc7i0g/vid...](https://www.youtube.com/channel/UCfVFSjHQ57zyxajhhRc7i0g/videos?view=0&sort=da&flow=grid)

[2]
[https://en.wikipedia.org/wiki/Jon_Burton](https://en.wikipedia.org/wiki/Jon_Burton)

[3] [https://www.youtube.com/watch?v=nt-AxAqlrOo](https://www.youtube.com/watch?v=nt-AxAqlrOo)

~~~
jonny_eh
> The clickbaity titles and thumbnails are a little unfortunate

They bring views, which helps him make more videos.

~~~
naikrovek
One day enough people will see through clickbait titles that they won't work
anymore. At least I hope so. It may be happening already, I don't know.

If that happens, it will be yet another example of optimization for the short
term at the expense of the long term, and people still will fail to understand
why that tradeoff is bad. Sure it's great for current-day you, but future you
will pay, somehow.

~~~
kadoban
How will the future "you" be paying? Not really clear what you mean. I don't
see the short vs long tradeoff.

~~~
naikrovek
There are many people who do not understand that optimization for the short
term is paid for in the long term. Far too many. I don't understand how people
do not understand that, and I recognize that, apparently, most people do not.

All decisions have (or are) trade-offs. Cause & effect. Since a cause cannot
have an effect in the past, all causes have effects in the future, be it one
nanosecond in the future or one thousand years in the future. That is to say,
all decisions have consequences, and those consequences are always in the
future.

If you optimize for the short term, you are, by definition, not optimizing for
the long term. So, short term optimization necessitates long term de-
optimization.

Examples:

Spend money now and have none for retirement (short term benefit), or save
money now, and have it during retirement (long term benefit)?

Pollute the air now, earning maximum profit at the expense of air cleanliness
in the future, or pay a lot of money today to find cleaner ways to operate,
and have clean air in the future?

Fully invest in clickbait headlines today, earning ad impressions today, and
risk clickbait fatigue later, or find a longer term strategy to retain readers
in the long-term?

I can't readily think of an example decision that optimizes for short term
gain that does not also optimize for long term loss.

My problem with this is that the short term is indeed very short, and the long
term is very long. Short term optimization has long term consequences, and I
would prefer short term consequences over long term consequences.

~~~
xamolxix
> If you optimize for the short term, you are, by definition, not optimizing
> for the long term. So, short term optimization necessitates long term de-
> optimization.

I don't think "short term optimization necessitates long term
de-optimization" follows. Both can be optimal.

> Spend money now and have none for retirement (short term benefit), or save
> money now, and have it during retirement (long term benefit)?

Spend money now on something that increases your chances of getting more money
(easy example: education?) and benefit both now and at retirement.

------
dwrodri
TL;DW: There were versions of the game released that still had debug symbols.

~~~
jchw
To be more particular:

\- The Japanese PlayStation port contained debug symbols

\- A Windows debug binary with all assertions intact was found

It is my understanding that this combination allowed for a better
understanding than just symbols alone.

(Could be a little fuzzy on the details; I watched this video last night, not
just now.)

~~~
mewmew
To give further background, the Devilution team has primarily relied on these
resources:

1\. The Japanese PlayStation port with debug symbols contained in
`DIABPSX.SYM`. (see [1]).

Example debug info of the Cathedral dungeon generation algorithm:

    
    
      // address: 0x801259D0
      // line start: 612
      // line end:   624
      void DRLG_L1Floor__Fv() {
       // register: 19
       register int i;
       // register: 20
       register int j;
       // register: 3
       register long rv;
      }
    

2\. The debug release of the PE executable, which contained assert strings
(see [2]).

Example assert string:

    
    
      "plr[myplr].InvGrid[i] <= plr[myplr]._pNumInv"
    

3\. The Rich header of the PE executable, which details the exact version of
the original compilers and linkers used to build `Diablo.exe` (see [3,4]).

Example information recovered from the Rich header of `Diablo.exe`:

    
    
      Id  Build  Count  Name       Description
       0      0    155  Unknown    [---] Number of imported functions (old)
       1      0    229  Import0    [---] Number of imported functions
       6   1668      1  Cvtres500  [RES] VS97 (5.0) SP3 cvtres 5.00.1668
       2   7303     29  Linker510  [IMP] VS97 (5.0) SP3 link 5.10.7303
       3   7303      1  Cvtomf510        VS97 (5.0) SP3 cvtomf 5.10.7303
       4   8447      2  Linker600  [LNK] VC++ 6.0 SP3,SP4,SP5,SP6 link 6.00.8447
      48   9044     72  Utc12_2_C  [---] VC++ 6.0 SP5 Processor Pack
      19   9049     12  Linker512        Microsoft LINK 5.12.9049
    

4\. Discovery of the original set of compiler flags used to build `Diablo.exe`
(see [5]).

Primarily "/O1" was used, but there are also peculiarities such as the use of
both Microsoft Visual Studio 6 and Microsoft Visual C++ 5 for linking the
game.

5\. The heartfelt dedication of a team of people. GalaXyHaXz did the initial
heavy lifting and succeeded in the tremendous task of getting the decompiled
source code of Diablo 1 compiling with the original toolchain. Later on she
released the project as open source and a community of open source
collaborators formed. Most of us had never met in real life prior to joining
the project, which goes to show that there is strength in online collaboration
that transcends both culture and borders.

6\. The Beta release and the Alpha4 release of Diablo 1 have also proved
invaluable resources for cross-validation, as the compiler optimization level
was not set to release mode for these binaries.
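
A rough illustration of point 3, not the team's actual tooling: the sketch
below decodes a Rich header in Python. It assumes the layout described in
write-ups such as [4]: a `DanS` marker and three padding dwords, followed by
(comp.id, count) pairs, all XORed with the key that follows the `Rich` marker,
with the product Id in the high 16 bits of comp.id and the build in the low 16
bits. The function names and the XOR key are made up for illustration; the two
sample rows come from the table above.

```python
import struct

# "DanS" read as a little-endian dword; the stored value is XORed with the key.
DANS = struct.unpack("<I", b"DanS")[0]


def parse_rich_header(data: bytes):
    """Return a list of (product_id, build, count) tuples, or [] if not found."""
    end = data.find(b"Rich")
    if end < 0 or end + 8 > len(data):
        return []
    (key,) = struct.unpack_from("<I", data, end + 4)

    # Walk backwards in 4-byte steps until a dword decodes to "DanS".
    start = end - 4
    while start >= 0:
        (word,) = struct.unpack_from("<I", data, start)
        if word ^ key == DANS:
            break
        start -= 4
    else:
        return []

    entries = []
    # Skip the marker plus three padding dwords (16 bytes), then read pairs.
    for off in range(start + 16, end, 8):
        comp_id, count = struct.unpack_from("<II", data, off)
        comp_id ^= key
        count ^= key
        entries.append((comp_id >> 16, comp_id & 0xFFFF, count))
    return entries


def encode_rich_header(rows, key):
    """Hypothetical helper: build a synthetic Rich header for experiments."""
    out = struct.pack("<4I", DANS ^ key, key, key, key)
    for prod_id, build, count in rows:
        out += struct.pack("<II", ((prod_id << 16) | build) ^ key, count ^ key)
    return out + b"Rich" + struct.pack("<I", key)


# Usage: round-trip two rows from the table (Id, Build, Count).
rows = [(6, 1668, 1), (4, 8447, 2)]
blob = b"\x00" * 128 + encode_rich_header(rows, key=0x1A2B3C4D)
decoded = parse_rich_header(blob)
```

Mapping the decoded Id and Build values to tool names (for example build 8447
to the VC++ 6.0 linker) still needs a lookup table like the one behind the
listing above.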

Interestingly, in the process a number of bugs in the original implementation
of Diablo 1 were discovered. These have been documented in the source code of
Devilution with `// BUGFIX: foo` comments, and have also been detailed in [6].

To track the progress of the project, the "Binary identical functions"
milestone has been used in tandem with an assembly diffing tool developed in
Rust (see [7,8]).
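
To give a feel for what "binary identical" means, here is a minimal Python
sketch. It is not the Rust tool from [8], which diffs disassembly; this one
only compares raw bytes. The function table of (name, offset in original,
offset in rebuilt, size) and the names `f` and `g` are hypothetical.

```python
def function_match_report(original: bytes, rebuilt: bytes, functions):
    """Map each function name to True if its bytes match in both images.

    `functions` is an iterable of (name, original_offset, rebuilt_offset, size).
    """
    report = {}
    for name, orig_off, new_off, size in functions:
        a = original[orig_off:orig_off + size]
        b = rebuilt[new_off:new_off + size]
        # A short slice means the range ran past the end of an image.
        report[name] = len(a) == size and a == b
    return report


def progress(report):
    """Fraction of functions that are byte-identical (0.0 if report is empty)."""
    return sum(report.values()) / len(report) if report else 0.0


# Usage: "g" differs only in how `xor eax, eax` is encoded (33 C0 vs 31 C0).
original = bytes([0x55, 0x8B, 0xEC, 0xC3, 0x33, 0xC0, 0xC3])
rebuilt = bytes([0x90, 0x55, 0x8B, 0xEC, 0xC3, 0x31, 0xC0, 0xC3])
report = function_match_report(
    original, rebuilt, [("f", 0, 1, 4), ("g", 4, 5, 3)]
)
```

A real comparison would presumably also have to ignore address operands that
legitimately move between builds, which is one reason to diff at the assembly
level rather than on raw bytes.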

Anecdotally, it was an incredible moment when we first managed to run the
cross-platform port of Diablo 1 (DevilutionX, see [9]) natively on Linux and
succeeded in playing a multiplayer game connecting our computers in Korea and
Denmark. It is equally thrilling to see the modding and porting community
picking up the torch and already succeeding in porting Diablo 1 to Nintendo
Switch!

The main reason for conducting this bit of software archeology is to preserve
the classic title that is Diablo 1 for generations to come, to revive it for
modern hardware platforms, and to make it more mod-friendly in the age of open
source software.

Happy coding! \- The Devilution Team

P.S. the project README explicitly states that to play the game, you still
need to have access to the original game assets released on the Diablo 1 CD.
To acquire a legal copy, please refer to
[https://www.gog.com/game/diablo](https://www.gog.com/game/diablo)

P.P.S. for the verification process, there have been proposals that are
ambitious at the level of PhD research (see [10]) and that made us feel warm
and fuzzy <3. In the end, many of the techniques outlined were discussed mostly
on a design level and some were included as proofs of concept, but most of the
work in reverse engineering Diablo 1 came from the tender labour of a team that
cares for Diablo 1 the way you would your firstborn child.

[1]:
[https://github.com/diasurgical/scalpel/blob/master/psx/_dump...](https://github.com/diasurgical/scalpel/blob/master/psx/_dump_/_dump_merge_c_src_/diabpsx/source/drlg_l1.cpp)

[2]: [http://diablo1.se/notes/debug.html](http://diablo1.se/notes/debug.html)

[3]:
[https://github.com/diasurgical/devilution/issues/111#issueco...](https://github.com/diasurgical/devilution/issues/111#issuecomment-426059660)

[4]:
[http://bytepointer.com/articles/the_microsoft_rich_header.ht...](http://bytepointer.com/articles/the_microsoft_rich_header.htm)

[5]:
[https://github.com/diasurgical/devilution/issues/111](https://github.com/diasurgical/devilution/issues/111)

[6]:
[https://github.com/diasurgical/devilution/issues/64](https://github.com/diasurgical/devilution/issues/64)

[7]:
[https://github.com/diasurgical/devilution/milestone/3](https://github.com/diasurgical/devilution/milestone/3)

[8]: [https://github.com/diasurgical/devilution-comparer](https://github.com/diasurgical/devilution-comparer)

[9]:
[https://github.com/diasurgical/devilutionX](https://github.com/diasurgical/devilutionX)

[10]:
[https://github.com/diasurgical/devilution/issues/171](https://github.com/diasurgical/devilution/issues/171)

~~~
brankoB
Super noob question, but I'm trying to wrap my head around how one could figure
out the source code just based on things like debug info and assert strings. I
watched the video and am staring at your examples and I just don't understand
how you go from those to actual source code.

~~~
sudomakeup
In the disassembly we can see a bunch of fine-grained operations, but the
meaning behind them is opaque. For example, we see two array access operations
but it's not clear what they do. They might look like this:

    mov al, [array + ebx]

Consider the assert string from point 2: "plr[myplr].InvGrid[i] <=
plr[myplr]._pNumInv"

From this we can see what the variables were named in the source code.
Assuming "plr" = player and "InvGrid" = inventory grid, we can deduce that one
of the array access operations gets the current player and the other gets an
item from the inventory grid.
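
As a sketch of that first step (finding the assert strings and pulling names
out of them), something like the following Python would do. The helper names
and the "contains a comparison operator" heuristic are my own assumptions, not
how the Devilution team did it.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")
COMPARISON = re.compile(r"<=|>=|==|!=|<|>")


def ascii_strings(data: bytes, min_len: int = 6):
    """Return runs of printable ASCII at least `min_len` bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def candidate_asserts(data: bytes, min_len: int = 6):
    """Keep strings that look like C expressions (heuristic: has a comparison)."""
    return [
        s
        for s in ascii_strings(data, min_len)
        if COMPARISON.search(s) and IDENT.search(s)
    ]


def identifiers(expr: str):
    """Return the distinct identifiers in `expr`, in order of first appearance."""
    seen = []
    for name in IDENT.findall(expr):
        if name not in seen:
            seen.append(name)
    return seen


# Usage on a tiny made-up blob that embeds the assert string from point 2.
blob = (
    b"\x00\x01"
    + b"plr[myplr].InvGrid[i] <= plr[myplr]._pNumInv"
    + b"\x00Diablo\x00\x02abc\x00"
)
found = candidate_asserts(blob)
names = identifiers(found[0])
```

The recovered names (`plr`, `myplr`, `InvGrid`, `_pNumInv`) can then be matched
by hand against the array accesses in the function whose code refers to that
string.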

------
cmcginty
I've been watching all of Modern Vintage Gamer's content for about 9 months
now. This guy puts out some really nice videos with high production quality.
There was a great video about game disk copy-protection a few weeks back.
Awesome stuff, and very well researched.

Some other really good content out there is Game Historian, RetroRGB, My Life
in Gaming, and Wrestling with Gaming. If you're into hardware restoration then
Retro Man Cave has a lot of content too.

~~~
joshschreuder
The DF Retro series on the Digital Foundry channel is great too.

And RetroAhoy on the Ahoy channel.

------
buserror
The original Mac version of Diablo also came with debug symbols. At the time I
wrote some sort of 'companion cheater map' thing that piggybacked onto the game
for doing various nefarious things. It was called 'Dieblo'!

I also had to reverse engineer a whole lot of things, but at the time with
Macsbug and a little bit of patience (and an OS with no memory protection
whatsoever) it was pretty trivial work.

------
IlegCowcat
It's much easier to do reverse engineering on programs compiled with old
compilers. Nowadays compilers are really good at optimizing shit, which means
making the assembly code more complex, using new instructions, etc.

------
TravHatesMe
I love open-source mods/remakes of legacy games. It requires technical
expertise and exceptional patience to build something so large in scale and
complexity. The disgusting amount of work needed to accomplish this, compounded
by the fact that it's free for everyone, makes me envious of their passion. A
technical feat that is as admirable as it is impressive.

------
paxys
Doesn't "reverse engineered" imply that source code wasn't used?

~~~
talaketu
Well, source code wasn't used, was it?

Anyway, I think "reverse engineering" is a broad category that includes
disassembly techniques. "Clean room reverse engineering" is a stricter idea.

------
m463
I wonder why Blizzard didn't just release Diablo as open source.

I assume there are no strong-minded folks at the top like John Carmack.

Id Software went through a little turmoil when he wanted to release the Doom
source, but it sort of highlights his unshakable confidence.

~~~
davesmith1983
They may not have the original source. They lost quite a lot of the original
StarCraft assets, and they were only found later, by luck, in someone's garage
or loft well over a decade later IIRC.

I have a lot of code myself (mostly Perl scripts, very old .NET and Java code)
for old systems that are deployed that weren't in any sort of source control
until recently.

This may surprise some, but until 2006-2007 most source control systems were
terrible by today's standards.

I've been through the torture of working sometimes with AccuRev and Microsoft
SourceSafe, and tbh you are better off just zipping up your source code and
putting a date and time on the archive name. Even more modern systems like Team
Foundation Server are painful.

------
ekianjo
Semantic question, but doesn't "reverse engineered" already mean you have no
access to source code?

~~~
mbel
Yeah. On the other hand probably most of us had a chance to work with code
that needed some reverse engineering to be understood ;)

------
hd4
So is Devilution considered the current best port by the Diablo community?
When I last checked there were at least 3-4 others.

------
vagab0nd
Ah, the good old mpq files... I remember extracting sound/music from Warcraft
3 and using them as ringtones.

------
kowdermeister
Aha! So game developers do write tests after all :)

------
apo
This sounds less like reverse engineering and more like decompiling.

~~~
mproud
Yeah, whether this was truly reverse engineering was hotly debated on Reddit.

~~~
davesmith1983
Well it is by most definitions tbh. Tbh the people on Reddit aren't the
brightest.

------
yosoyalejandro
Amazing video ;)

