
Statically Recompiling NES Games into Native Executables with LLVM and Go (2013) - wooby
http://andrewkelley.me/post/jamulator.html
======
Animats
All that work, and then he hits the expected killer problem: self-modifying
code. Getting past that without an interpreter is going to be tough. Not
impossible, because the self-modifying code isn't that dynamic in the places
he found it. It's not like it's reading external input and compiling it. There
are a limited number of cases to be handled.

It's amusing seeing this in a machine which gets its code from a ROM.

~~~
Perseids
Which is one of the reasons I hope WebAssembly becomes the de-facto standard
for binary code distribution: It preserves more structure of the original
program and allows less of the lowest level shenanigans (it provides a clear
and protected control flow, has a small instruction set, no dynamic code
generation, and no instruction misalignment / dual-alignment).

~~~
zokier
Wait, WASM disallows all dynamic code gen? That sounds really disappointing.
How do JITs work on WASM?

~~~
dangerbird2
They don't now. The spec will almost certainly allow some form of dynamic code
generation/loading in the future.

[http://webassembly.org/docs/future-features/#platform-
indepe...](http://webassembly.org/docs/future-features/#platform-independent-
just-in-time-jit-compilation)

------
dogprez
NES cartridges could also carry their own hardware. Don’t like the NES's
built-in sound hardware? You could ship your own inside the cartridge. I think
Castlevania 3 notoriously shipped some custom hardware in its cartridge.
Some Famicom cartridges came with an FM synth chip.

~~~
mistercow
I believe this was even more common on the SNES. The most well-known example
is the Super FX chip, which was used in Star Fox and Yoshi's Island among
others, but there were actually a ton of these things:
[https://en.wikipedia.org/wiki/List_of_Super_NES_enhancement_...](https://en.wikipedia.org/wiki/List_of_Super_NES_enhancement_chips)

------
convivialdingo
Wonderful, detailed deep-dive into static compilation and emulation with an
approachable start. Good job.

------
nickcw
Interesting read, shame about the dynamic code problems.

I wonder whether something like this might be better attempted in rpython
which will build a JIT compiler for you.

[http://rpython.readthedocs.io/en/latest/](http://rpython.readthedocs.io/en/latest/)

------
ericfrederich
_This means that they test some condition, and then either transfer control
flow to the next instruction, or to a different label. This means that we can
mark the possible branch address and the next address as instructions._

I hope all of these NES ROMs were coded well. Seems like you could do
something like...

    
    
      if False:
          jump_to_data_address
    

... then he'd be interpreting data as code.
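
A rough sketch of the concern, not the article's actual code: the tracer below marks both the branch target and the fall-through address as instructions, as the quoted passage describes. The opcode table is a tiny hand-picked 6502 subset, and `mark_code`, `ROM`, and the byte layout are hypothetical names and data chosen for illustration.

```python
# Tiny, illustrative subset of 6502 opcodes: opcode -> size in bytes.
SIZES = {
    0xA9: 2,  # LDA #imm
    0x09: 2,  # ORA #imm
    0xF0: 2,  # BEQ rel
    0xD0: 2,  # BNE rel
    0x60: 1,  # RTS
}
BRANCHES = {0xF0, 0xD0}
RTS = 0x60


def mark_code(rom, entry):
    """Return the offsets a naive static tracer would treat as instructions."""
    code, work = set(), [entry]
    while work:
        pc = work.pop()
        if pc in code or pc < 0 or pc >= len(rom):
            continue
        op = rom[pc]
        size = SIZES.get(op)
        if size is None or pc + size > len(rom):
            continue  # unknown opcode or truncated instruction: stop this path
        code.add(pc)
        if op == RTS:
            continue  # control flow ends here
        work.append(pc + size)  # fall-through address
        if op in BRANCHES:
            offset = rom[pc + 1]
            if offset >= 0x80:  # operand is a signed 8-bit displacement
                offset -= 0x100
            work.append(pc + size + offset)  # branch target
    return code


ROM = bytes([
    0xA9, 0x01,        # 0: LDA #$01  (zero flag is now clear)
    0xF0, 0x01,        # 2: BEQ +1    (never taken at run time; target is offset 5)
    0x60,              # 4: RTS
    0x09, 0xFF, 0x60,  # 5-7: data bytes that happen to decode as ORA #$FF, RTS
])

marked = sorted(mark_code(ROM, 0))  # offsets 5 and 7 are data, yet get marked
```

Because the tracer does not know the branch can never be taken, the data bytes at offsets 5 and 7 end up marked as code alongside the real instructions.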

~~~
dezgeg
This actually happens surprisingly often in 6502 code, because an unconditional
jump (JMP) always takes three bytes, while a conditional branch takes two. So
if the value of some condition flag is known to be constant (e.g. ORing with a
nonzero constant always clears the zero flag), it saves a byte to use an
always-taken conditional branch in place of an unconditional jump.
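
To make the byte counting concrete, here is a small sketch. The target address and branch offset are made-up values; the opcodes (`0x4C` for JMP absolute, `0xD0` for BNE) are the standard 6502 encodings. It also checks by brute force that ORA with a nonzero immediate can never leave the zero flag set.

```python
# JMP $8010: opcode + 16-bit little-endian address (address is an arbitrary example)
JMP_ABS = bytes([0x4C, 0x10, 0x80])
# BNE +14: opcode + signed 8-bit offset (offset is an arbitrary example)
BNE_REL = bytes([0xD0, 0x0E])


def zero_flag_after_ora(a, imm):
    """Zero flag as set by ORA #imm with accumulator value a (8-bit)."""
    return ((a | imm) & 0xFF) == 0


# For every possible accumulator value, ORA #$01 leaves the zero flag clear,
# so a BNE placed right after it is always taken.
bne_always_taken = not any(zero_flag_after_ora(a, 0x01) for a in range(256))

bytes_saved = len(JMP_ABS) - len(BNE_REL)
```

The catch is range: a branch can only reach targets within -128 to +127 bytes of the following instruction, so the trick only works for nearby destinations.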

------
hayd
Aside, it's surprising to me that there isn't a (maintained) Java/JVM
implementation on top of LLVM...

~~~
legulere
Reflection is hard to support with ahead-of-time compilation, yet Java code
often makes heavy use of it for (de-)serialization.

~~~
Reason077
_" Reflection is hard to support with ahead of time compilation"_

Not true at all. There's nothing in reflection and serialization[1] that is
hard to implement with an AOT compiler.

The real problem with AOT-compilation in Java is that there are many
optimisations, like devirtualization, that can only be done (or done much more
easily/effectively) at runtime.

You can get some of this back with whole-program/closed-world optimisation,
but in reality that's highly impractical for Java. Many Java programs are very
large and people want the ability to update them quickly and easily, often
without even restarting let alone recompiling.

An LLVM-based Java runtime could use a hybrid AOT/JIT model, however. By
recompiling parts of the program as needed at runtime based on profiler data,
you'd get the fast startup of AOT combined with the high performance of JIT.

Don't get any crazy ideas about beating Hotspot's performance, though.

[1] one exception is classes/methods that are defined at runtime using
dynamically generated bytecode. But that's pretty rare.
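
As a loose illustration of why devirtualization is easier at runtime, here is a sketch of a guarded call site (a monomorphic inline cache). It is not how HotSpot or any particular JVM implements it; `CallSite`, `Shape`, and `Square` are hypothetical names. The point is that the specialization relies on types actually observed while the program runs.

```python
class Shape:
    def area(self):
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class CallSite:
    """One call site for obj.area(), specialized to the last receiver type seen."""

    def __init__(self):
        self.cached_type = None
        self.cached_fn = None
        self.hits = 0
        self.misses = 0

    def call(self, obj):
        if type(obj) is self.cached_type:  # guard: cheap exact-type check
            self.hits += 1
            return self.cached_fn(obj)     # direct call, no method lookup
        # Slow path: look the method up and re-specialize for this type.
        self.misses += 1
        self.cached_type = type(obj)
        self.cached_fn = type(obj).area
        return self.cached_fn(obj)


site = CallSite()
results = [site.call(Square(2)) for _ in range(3)]
```

An ahead-of-time compiler has to prove that only one receiver type can reach the call, or fall back to a virtual call; a runtime can simply observe it and keep the guard as a safety net.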

~~~
pjmlp
Still, there are plenty of commercial options to AOT compile Java, and OpenJDK
is getting its free variant with Java 9 for Linux x64, with Windows and OS X
already supported on the Java 10 development branch.

Oracle Labs is also pushing forward its research on a meta-circular JVM, which
makes use of AOT compilation, although it restricts it to a "native Java"
subset.

[http://cr.openjdk.java.net/~jrose/metropolis/Metropolis-
Prop...](http://cr.openjdk.java.net/~jrose/metropolis/Metropolis-
Proposal.html)

~~~
amaranth
OpenJDK's AOT compiler, when you also allow it to include the JIT and still
recompile hotspots, is about 5% slower than just using the JIT. The only
reasons to use it are startup times, ease of distribution (no JVM to ship,
sort of), and for use in places that don't allow JITs (iOS).

~~~
pjmlp
OpenJDK's AOT is also in its early days.

Have you ever profiled Excelsior JET, Aonix, Jamaica, or IBM WebSphere Real
Time, for example?

Their products are almost as old as Java, so I imagine they are worth their
price.

~~~
amaranth
I believe those products prohibit doing public benchmarking so even if I had I
couldn't tell you the results. :)

~~~
pjmlp
However, if they were really bad, they would have already disbanded the teams,
because companies would just use the gratis JDK with pure JIT.

Regarding OpenJDK, according to the Java Languages Summit talk, the Java 10
dev branch already has better optimizations than version 9.

------
emerged
I had a similar project idea for ages. Instead of LLVM, it would statically
recompile into C macros or C++ inline function calls for each opcode. Then it
would compile the output and run the game.

Emulation is just plain fun.
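
A minimal sketch of that per-opcode idea, under heavy assumptions: the macro names (`OP_LDA_IMM` and so on), the `L_xxxx` label scheme, and the `0x8000` base address are all hypothetical, and the linear sweep assumes the input bytes are pure code, which real ROMs do not guarantee.

```python
# opcode -> (hypothetical C macro name, instruction size in bytes)
MACROS = {
    0xA9: ("OP_LDA_IMM", 2),  # LDA #imm
    0x09: ("OP_ORA_IMM", 2),  # ORA #imm
    0x8D: ("OP_STA_ABS", 3),  # STA abs
    0x60: ("OP_RTS", 1),      # RTS
}


def recompile(rom, base=0x8000):
    """Translate a run of 6502 bytes into one C macro invocation per instruction."""
    lines = []
    pc = 0
    while pc < len(rom):
        name, size = MACROS[rom[pc]]
        if size > 1:
            # Operands are little-endian; print them as 2 or 4 hex digits.
            operand = int.from_bytes(rom[pc + 1:pc + size], "little")
            width = (size - 1) * 2
            args = f"0x{operand:0{width}X}"
        else:
            args = ""
        # Label every instruction so emitted jumps could target it.
        lines.append(f"L_{base + pc:04X}: {name}({args});")
        pc += size
    return "\n".join(lines)


c_source = recompile(bytes([0xA9, 0x01, 0x8D, 0x00, 0x20, 0x60]))
```

The generated text would then be handed to a C compiler together with a header defining each macro, which is where the emulated registers and memory accesses would actually live.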

~~~
taocp
[http://mp2.dk/blog/blog/2014/04/14/practical-and-portable-
bi...](http://mp2.dk/blog/blog/2014/04/14/practical-and-portable-binary-
recompilation/)

~~~
emerged
The compiler tends to treat static variables very differently from locals. I
was holding out for efficient local functions which can be force-inlined to
keep the emulated registers as actual system registers. Sadly std::function
doesn't play well with force inline on all platforms.

------
k__
lol, it just came to me that clang is an abbreviation for C language. I always
imagined it to be the sound metal makes when you slam it together, because
bare metal programming etc.

~~~
SyneRyder
Though for what it's worth, the official pronunciation is the klang metal
sound:

[http://lists.llvm.org/pipermail/llvm-
dev/2008-July/015629.ht...](http://lists.llvm.org/pipermail/llvm-
dev/2008-July/015629.html)

------
zsmizzle
Dupe: submitted 1560 days ago -
[https://news.ycombinator.com/item?id=5838326](https://news.ycombinator.com/item?id=5838326)

~~~
vog
Please note that duplicate submissions are _not discouraged_ on HN in general,
especially not after such a long time.

However, thanks for sharing the link to the previous discussion!

~~~
DanBC
Although clicking the |past| link takes you to this page:
[https://hn.algolia.com/?query=Statically%20Recompiling%20NES...](https://hn.algolia.com/?query=Statically%20Recompiling%20NES%20Games%20into%20Native%20Executables%20with%20LLVM%20and%20Go&sort=byDate&dateRange=all&type=story&storyText=false&prefix&page=0)

------
mohaine
This should have a 2013 added to the title.

