
Fergulator - NES emulator, written in Go - bconway
https://plus.google.com/102388027951815318627/posts/G7kNAAJcKZU
======
tptacek
I found Go to be really productive for expressing machine emulators; the
language is deliberately amenable to working with ints at bit level, the
packaging system is effective and mostly stays out of the way, it has a
flexible "switch", there's just enough abstraction so that it's easy to swap
different components (memories, etc) in and out, and, of course, once you
start running the things, goroutines make it easy to step machines as
coroutines.
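
A minimal sketch of the "step machines as coroutines" idea, assuming a hypothetical `machine` type and `lockstep` helper (neither comes from Fergulator): each machine runs in its own goroutine and executes exactly one step whenever the driver sends it a tick.

```go
package main

import "fmt"

// machine is a hypothetical emulated device with a cycle counter.
type machine struct {
	name   string
	cycles int
}

// run executes one step each time a tick arrives, then signals done.
func (m *machine) run(tick <-chan struct{}, done chan<- struct{}) {
	for range tick {
		m.cycles++ // stand-in for executing one instruction
		done <- struct{}{}
	}
}

// lockstep starts one goroutine per machine and steps them in turn.
func lockstep(machines []*machine, steps int) {
	ticks := make([]chan struct{}, len(machines))
	done := make(chan struct{})
	for i, m := range machines {
		ticks[i] = make(chan struct{})
		go m.run(ticks[i], done)
	}
	for s := 0; s < steps; s++ {
		for _, t := range ticks {
			t <- struct{}{} // resume this machine for one step
			<-done          // wait until the step has finished
		}
	}
	for _, t := range ticks {
		close(t) // lets each goroutine exit its loop
	}
}

func main() {
	ms := []*machine{{name: "cpu"}, {name: "ppu"}}
	lockstep(ms, 3)
	for _, m := range ms {
		fmt.Println(m.name, m.cycles)
	}
}
```

Unbuffered channels keep the handoff strictly alternating, so only one machine runs at a time, which is the coroutine-like behavior described above.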

~~~
laumars
With regard to the switch statement, I've read on Stack Overflow[1] that
function tables actually outperform switch statements when there are several
cases. So I tend to avoid switches in performance-critical routines.

It will be interesting to see how 1.1 performs in relation to this.

[1] <http://stackoverflow.com/questions/9928221/table-of-functions-vs-switch-in-golang>

~~~
m0th87
This is pretty strange if it's true; shouldn't a switch (especially one that
simple) compile into a jump table? Otherwise what's the point? Does Go just
linearly search through case conditions in a switch?

EDIT: Missed the link:
<https://groups.google.com/forum/#!msg/golang-nuts/IURR4Z2SY7M/R7ORD_yDix4J>

Seems like a reasonable argument, but then again I don't see why they even
bothered adding a switch given those constraints.

~~~
ANTSANTS
Dispatching a table of functions should have roughly similar performance to
dispatching a large switch table, yes, but implementing opcodes as a table of
functions allows you to restructure your bytecode interpreter in a way that
significantly improves performance. Here's a much better explanation than I
could give of the dispatch-per-opcode technique, how it works, and why it's
faster (tl;dr it's all about branch prediction):
<http://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables/>

That article explains how to implement the technique for C programs using a
"computed goto"/"labels as values" gcc extension. While Go lacks that feature,
dispatching a table of functions as the last statement in each opcode
implementation should yield a similar result. As long as Go supports "tail-
call optimization" in the trivial case of "functions that take no parameters
and return nothing calling similar functions" it should work just fine.
Googling suggests that Go does not support TCO, but at least this program
didn't explode the stack:

EDIT: Yes, it does.

    
    
      package main

      import "fmt"

      var acc byte

      // main calls itself forever; each call adds a stack frame.
      func main() {
      	fmt.Println(acc)
      	acc++
      	main()
      }
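
The "dispatch the next opcode as the last statement of each handler" structure described above could be sketched as follows. The `vm` type, the opcodes, and the `next` helper are all hypothetical. Since each handler makes a real call, every instruction adds a stack frame unless the compiler eliminates tail calls, so this is only suitable for short programs.

```go
package main

import "fmt"

// vm is a hypothetical bytecode machine used only for illustration.
type vm struct {
	code []byte
	pc   int
	acc  int
}

// Hypothetical opcodes.
const (
	opHalt = iota
	opInc
	opDouble
)

// handlers is filled in init rather than in a package-level
// initializer, because each handler refers back to it through next.
var handlers [3]func(*vm)

func init() {
	handlers[opHalt] = func(m *vm) {} // stop: no further dispatch
	handlers[opInc] = func(m *vm) { m.acc++; next(m) }
	handlers[opDouble] = func(m *vm) { m.acc *= 2; next(m) }
}

// next fetches the following opcode and calls its handler.
// Each call nests inside the previous one, so the stack grows
// by one frame per executed instruction.
func next(m *vm) {
	op := m.code[m.pc]
	m.pc++
	handlers[op](m)
}

// run executes code until opHalt and returns the accumulator.
func run(code []byte) int {
	m := &vm{code: code}
	next(m)
	return m.acc
}

func main() {
	fmt.Println(run([]byte{opInc, opInc, opDouble, opHalt}))
}
```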

~~~
sevenelfen
> I'm not familiar with the implementation of Go, but at least this program
> didn't explode the stack:

It does, it's just that Go is so slow printing to the console that it would
take years to run out of stack space. If you redirect to null it will use up
all your memory and swap space in a few minutes:

./main > /dev/null

In most other languages this same code would run for a short time and then
abort after exhausting the stack. This is the best behavior since algorithms
that use unbounded memory are where you certainly must handle out of memory
errors and set limits; using too much stack space is an error that should be
caught quickly not postponed. Go on the other hand uses a growable stack, so
the code you gave will use up all available memory and swap before finally
crashing.

Go uses a growable stack so that programs can use many goroutines on 32-bit
machines. This is bad for performance due to extra checks on every function
call to see if the stack needs to be grown or shrunk, the overhead to actually do
that, and less efficient use of cpu data cache. It makes it complicated to
call functions from any other language. It seems like any modern language
should work best on 64-bit and make trade-offs for 32-bit, not the other way.

~~~
cmccabe
You are mistaken. You don't need extra checks to determine when to grow the
stack. You just need to leave some unmapped space after the stack and
correctly handle a SIGBUS error. There is no extra overhead above what the
operating system is already doing.

Growable stacks aren't about getting optimal performance on 32 bit
architectures. That is explicitly a non-goal of Go. They're about minimizing
memory consumption for goroutines which don't use very much stack space, which
is expected to be most goroutines. You can't have hundreds of thousands of
goroutines if you have a high fixed amount of memory per goroutine.
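
As a rough illustration of "many small goroutines", here is a sketch that keeps `n` goroutines alive at the same time before letting them finish. The `spawn` helper is made up for this example, and the count of 10,000 is arbitrary; actual memory use per goroutine depends on the Go version.

```go
package main

import (
	"fmt"
	"sync"
)

// spawn starts n goroutines that all block until released,
// so all n exist simultaneously, then counts how many ran.
func spawn(n int) int {
	var wg sync.WaitGroup
	release := make(chan struct{})
	results := make(chan int, n) // buffered so no goroutine blocks on send

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-release // park here with a small stack
			results <- 1
		}()
	}

	close(release) // wake every goroutine at once
	wg.Wait()
	close(results)

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(spawn(10000))
}
```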

As for your argument that growable stacks make it harder to determine program
correctness, it seems like nonsense to me. I could make the same arguments
about heap space, but nobody thinks a low fixed limit on heap sizes is a great
idea. If you want to test your program under low memory conditions, try
mlocking a lot of memory and then running your program. Alternately you could
try something involving cgroups or virtual machines.

------
kodablah
The juxtaposition between this and <https://github.com/pcwalton/sprocketnes>
is neat.

E.g. <https://github.com/pcwalton/sprocketnes/blob/master/audio.rs> compared
to <https://github.com/scottferg/Fergulator/blob/master/audio.go> or
<https://github.com/pcwalton/sprocketnes/blob/master/disasm.rs>
compared to
<https://github.com/scottferg/Fergulator/blob/master/disassembler.go>

~~~
pcwalton
Ew, my audio code. :( I hacked the SDL audio bindings together with no regard
to proper Rust idioms; hence the unsafe everywhere. The Go code reads far
better.

The disassembler is nicer though—it demonstrates macros and traits well. The
macros and AddressingMode trait help avoid duplicating the instruction decode
logic between the CPU interpreter and disassembler, with no overhead at
runtime.

~~~
ferg
The audio bindings are due to the excellent Go-SDL package. Never mind my own
fork; that was just to remove some unneeded stuff to fix compilation on OS X.

------
GhotiFish
He should consider making the emulator able to read .fcm input logs from a
TAS.

These runs usually serve as good tests of an emulator's compliance,
particularly the runs that were verified on the actual console.

Though... there are some cartridges that have random (read: not pseudo-random)
behavior, and can't actually be tested (or TASed at all).

------
eruditely
Where does one go to learn how to create an emulator? I'm interested in
picking up C/Go (probably the latter nowadays) while trying to make an
emulator. Is there a process people adhere to or do they seriously just figure
it out?

~~~
fiatpandas
I'm just beginning with emulators and I wrote my first one to emulate CHIP8 a
few months ago. I'm now working on a gameboy emulator. Anyways, having no
experience with assembly, processor instructions, bitwise operations, etc, I
started with this tutorial:

<http://www.multigesture.net/articles/how-to-write-an-emulator-chip-8-interpreter/>

Also, good reference: <http://en.wikipedia.org/wiki/CHIP-8>

I think that tutorial is quite good, and I emerged with a solid understanding of
what exactly emulators/interpreters do, and what it means to emulate a certain
device. CHIP8 is very simple, so moving to gameboy is actually a very big
leap. The gameboy's instruction set (you'll know why that is a crucial piece
of information after completing the CHIP8 tutorial) is much larger, so it
requires a lot more work and understanding.
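
To give a flavor of why the instruction set matters, here is a small sketch of a CHIP-8 fetch/decode/execute step. The `chip8` struct and `step` method are my own naming for illustration, and only three opcodes are handled (1NNN jump, 6XNN load, 7XNN add); a real interpreter needs the rest of the set plus timers, display, and input.

```go
package main

import "fmt"

// chip8 holds just enough state for this sketch.
type chip8 struct {
	mem [4096]byte // 4 KB of memory
	v   [16]byte   // registers V0..VF
	pc  uint16     // program counter
}

// step fetches one two-byte big-endian opcode and executes it.
func (c *chip8) step() {
	op := uint16(c.mem[c.pc])<<8 | uint16(c.mem[c.pc+1])
	c.pc += 2

	x := (op >> 8) & 0xF // register index from the second nibble
	nn := byte(op)       // low byte as an immediate value

	switch op & 0xF000 {
	case 0x1000: // 1NNN: jump to address NNN
		c.pc = op & 0x0FFF
	case 0x6000: // 6XNN: VX = NN
		c.v[x] = nn
	case 0x7000: // 7XNN: VX += NN (wraps on overflow, no carry flag)
		c.v[x] += nn
	}
}

func main() {
	c := &chip8{pc: 0x200} // programs conventionally start at 0x200
	copy(c.mem[0x200:], []byte{0x60, 0x05, 0x70, 0x03})
	c.step()
	c.step()
	fmt.Println(c.v[0], c.pc)
}
```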

Let me know if you have any questions

~~~
jimmaswell
I have a 90%-working chip-8 emulator in JS. I never got around to fixing an
instruction that waits for user input, though.
<http://luna.thehorseplace.us/pr/chip8.html>

------
xyproto
Best part is that if you've installed Go and set up the GOPATH already,
downloading and compiling fergulator is just a

go get github.com/scottferg/Fergulator

away.

~~~
mseepgood
However, SDL, SDL_image and GLEW must be installed with header files.

