
LaiNES – Cycle-accurate NES emulator in around 1000 lines of code - mmphosis
https://github.com/AndreaOrru/LaiNES
======
glandium
This makes me wonder. The 6502 in the NES ran at 1.79MHz.

According to
[https://en.wikipedia.org/wiki/Instructions_per_second](https://en.wikipedia.org/wiki/Instructions_per_second)
, a modern i7 processor handles north of 100,000 MIPS.

According to
[https://en.wikipedia.org/wiki/Transistor_count](https://en.wikipedia.org/wiki/Transistor_count),
the 6502 had 3510 transistors.

At 100,000 MIPS, a modern CPU would have a budget of ~56k instructions to
process one cycle of that CPU, or about 16 instructions per transistor.

So it would seem it might now be possible to simulate those old processors, at
the transistor level, in real time. Is anyone aware of experiments in this
domain? If not really useful, that sounds like an interesting fun (side)
project.
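
For anyone who wants to check the arithmetic, here is a minimal sketch of the budget calculation. The constants are just the figures quoted above (the 100,000 MIPS number is a rough Wikipedia figure, not a measurement), and the function names are made up for illustration.

```cpp
// Figures quoted in the comment above; treat them as rough estimates.
constexpr double kHostInstrPerSec = 100000.0 * 1e6;  // 100,000 MIPS
constexpr double kNesCpuHz        = 1.79e6;          // 6502 clock in the NES
constexpr int    kTransistors     = 3510;            // 6502 transistor count

// Host instructions available per emulated 6502 clock cycle.
constexpr double instr_per_emulated_cycle() {
    return kHostInstrPerSec / kNesCpuHz;
}

// The same budget spread evenly over every transistor.
constexpr double instr_per_transistor() {
    return instr_per_emulated_cycle() / kTransistors;
}
```

This only bounds the raw instruction budget; it says nothing about memory access patterns or how many host instructions one transistor update really costs.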

~~~
mattbee
[http://www.visual6502.org/](http://www.visual6502.org/) does it in the
browser & flashes transistor states at you!

And if you liked that you'll probably also enjoy
[http://www.megaprocessor.com/progress.html](http://www.megaprocessor.com/progress.html)

~~~
vincnetas
Also: Introducing the MOnSter 6502
[http://www.evilmadscientist.com/2016/6502/](http://www.evilmadscientist.com/2016/6502/)

------
raldi
What does "cycle-accurate" mean? The README assumes the reader already knows;
Wikipedia via Google is totally unhelpful: "A cycle-accurate simulator is a
computer program that simulates a microarchitecture on a cycle-by-cycle
basis."

~~~
0xcde4c3db
The traditional way of coding a console emulator was to figure out the time
interval to the next interrupt in units of CPU cycles, emulate enough
instructions to cross that threshold, then emulate the interrupt and hook
other routines (redrawing the screen, filling sound output buffers, reading
input, etc.) off of those events (see e.g. Marat Fayzullin's classic Emulator
HOWTO [1]). This approach runs a lot of stuff just fine because it does
synchronize to the most important events, but can cause problems. For example,
"well-behaved" code generally only writes to graphics registers or sprite
tables during blanking periods, as writing during active display is usually
undefined behavior. Some code breaks the rules. Sometimes this is done
intentionally to do cool effects with the hardware. Other times it's a side
effect of a bug that wasn't caught because there are coincidentally no
symptoms with the timing of the actual hardware. But then you plug it into an
emulator with only roughly accurate timing and it blows up.
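
A minimal sketch of that traditional loop, to make the timing slop concrete. `ToyCpu` and `run_until_interrupt` are hypothetical names, not taken from any real emulator, and the "CPU" just replays a list of instruction costs.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CPU core: step() "executes" one instruction
// and returns how many CPU cycles it took.
struct ToyCpu {
    std::vector<int> costs;  // per-instruction cycle costs (must be non-empty)
    std::size_t next = 0;    // index of the next instruction to run

    explicit ToyCpu(std::vector<int> c) : costs(std::move(c)) {}

    int step() {
        int cycles = costs[next % costs.size()];
        ++next;
        return cycles;
    }
};

// Run whole instructions until the interrupt deadline is crossed.
// Returns the overshoot in cycles, i.e. how late the interrupt is serviced.
int run_until_interrupt(ToyCpu& cpu, int cycles_until_interrupt) {
    int elapsed = 0;
    while (elapsed < cycles_until_interrupt) {
        elapsed += cpu.step();
    }
    return elapsed - cycles_until_interrupt;
}
```

Everything hooked off the interrupt sees time at instruction granularity, and nothing in between is synchronized, which is where rule-breaking code goes wrong.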

In reality, the clocks for the various components don't necessarily run at the
same rate, or even at integer multiples of the CPU clock. You can have a
situation where, for example, there are 3.5 clock cycles on the graphics
hardware for every one CPU clock cycle. For a lot of the classic systems, this
happened because a single higher master clock is divided down for each
component.

A "cycle-accurate" emulator is one that operates as if the emulated state of
all hardware were updated on every tick of the master clock. This wasn't
generally done in the past because it was far too slow ~20 years ago when
emulation of classic consoles and computers really took off.
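
As a rough sketch of the idea (not how LaiNES or any particular emulator is structured), here is a toy machine stepped one master-clock tick at a time. The dividers are the commonly documented NTSC NES ones (master clock divided by 12 for the CPU and by 4 for the PPU); `ToyMachine` and its members are made-up names.

```cpp
#include <cstdint>

// Toy model: every component derives its clock from one master clock.
struct ToyMachine {
    static constexpr int kCpuDivider = 12;  // master ticks per CPU cycle (NTSC NES)
    static constexpr int kPpuDivider = 4;   // master ticks per PPU cycle (NTSC NES)

    std::uint64_t master = 0;
    std::uint64_t cpu_cycles = 0;
    std::uint64_t ppu_cycles = 0;

    // Advance the whole machine by exactly one master-clock tick.
    void tick() {
        ++master;
        if (master % kPpuDivider == 0) ++ppu_cycles;  // PPU state would update here
        if (master % kCpuDivider == 0) ++cpu_cycles;  // CPU state would update here
    }
};
```

A real emulator would do actual work in those two branches; the point is only that every component observes every other component's state at master-clock resolution.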

More sophisticated hardware doesn't necessarily have any single master clock
in this sense, so it doesn't make much sense to talk about a "cycle-accurate"
emulator of a modern PC, for example.

[1]
[http://fms.komkon.org/EMUL8/HOWTO.html](http://fms.komkon.org/EMUL8/HOWTO.html)

~~~
colanderman
I played around with this in my own NES emulator, which works roughly the way
you describe. I found that, at least for major-brand titles
(Mario/Zelda/Metroid/Kirby), cycle accuracy actually _doesn't_ matter.
Everything's based off PPU/APU/mapper interrupts.

In fact, doubling (or more) the CPU clock _enhanced_ some of the games I
tried. Animations in Kirby's Adventure became more smooth (e.g. the Spark
ability). Screens full of enemies in Metroid ran with no slowdown. The glitchy
first scanline of status overlays in various games cleared up. I haven't yet
observed negative effects in major-party titles.

I had designed my emulator as an experiment: treat the NES as an abstract
specification, rather than a concrete implementation. Turns out that a lot of
games seem to have actually been designed following the same principle. It was
a wonderful feeling to see these games as I imagine they were intended to be
experienced, as if I opened a letter from the game designers left unopened for
thirty years.

~~~
makomk
Bear in mind that sometimes, people want to do tricks that depend on the
slowdowns and glitches that happen on real hardware.

~~~
colanderman
I'm sure. That wasn't the point of my experiment.

------
tlack
People always say that terse code is hard to read and understand. I'd say I
just learned a lot from a quick skim of the concise source here. If it had
been structured with tons of white space and split over dozens of
files/folders/modules, I'd have no chance of understanding it at a glance.

~~~
qwertyuiop924
Terse code is easy to read. Gratuitously compact code isn't.

The rule of thumb is that if it's short but in no way obfuscated, it's terse.
If it's short, and impossible to read because of it, it's gratuitously
compact.

~~~
posterboy
It is in multiple files, but still very terse, e.g. using single-letter
variables.

    
    
       /* CPU state */
       u8 ram[0x800];
       u8 A, X, Y, S;
       u16 PC;
       Flags P;
       bool nmi, irq;
    

I like it.

~~~
andreaorru
I'm the author.

The reason for the single letters is that those are the actual names of the
6502 registers.

Glad you like it overall. :)

~~~
posterboy
accumulator xchange ... y, something stacky perhaps?

program counter

flagPort

non maskable interrupt, interrupt request

I'm sure you knew that though, it's common microprocessor nomenclature.

~~~
Two9A
Specifically on 6502-derived processors, you have the following registers:
Accumulator, X and Y indexes, Stack Pointer, Program Counter, Flags. Of these,
only the Program Counter is double-width.

Additionally, the zeroth "page" of memory (the lowest 256 bytes) is
accessible with a dedicated addressing mode which saves space and time in the
program, and can be used as a form of cache.
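
To illustrate the difference, here is a small sketch of how an emulator might compute the effective address for the two modes. `ToyBus`, `zero_page_addr` and `absolute_addr` are hypothetical names for illustration only.

```cpp
#include <array>
#include <cstdint>

using u8  = std::uint8_t;
using u16 = std::uint16_t;

// Hypothetical flat 64 KiB address space.
struct ToyBus {
    std::array<u8, 0x10000> mem{};
    u8 read(u16 addr) const { return mem[addr]; }
};

// Zero-page mode: one operand byte; the high byte of the address is always 0x00.
u16 zero_page_addr(const ToyBus& bus, u16 operand_pc) {
    return bus.read(operand_pc);
}

// Absolute mode: two operand bytes, low byte first (little-endian).
u16 absolute_addr(const ToyBus& bus, u16 operand_pc) {
    u16 lo = bus.read(operand_pc);
    u16 hi = bus.read(static_cast<u16>(operand_pc + 1));
    return static_cast<u16>(lo | (hi << 8));
}
```

On the 6502, `LDA` with a zero-page operand is 2 bytes and 3 cycles, versus 3 bytes and 4 cycles for the absolute form, which is the space and time saving in question.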

------
qwertyuiop924
That is very cool.

I'm always kind of in awe of this sort of thing. Maybe I should try to do it.
Should take some of the awe away.

But I should probably focus on sucking less, first.

~~~
bluedino
Try making a small virtual machine first -
[https://github.com/tekknolagi/carp/blob/master/README.md](https://github.com/tekknolagi/carp/blob/master/README.md)

~~~
tekknolagi
Whoa, that's my project. Wouldn't recommend looking at that one, but
definitely take a look at its successor (linked on the README).

------
rounce
First I'd like to say this project represents a fantastically impressive
achievement regardless.

However, putting this repo through `cloc` reveals the HN title to be rather
misleading.

EDIT: I had initially read through {cpu,apu,ppu,gui,joypad,mapper}.{cpp,h} and
noticed that the mental tally I was taking had run well over 1000LOC. In my
haste I quickly cloned the repository and ran cloc against the current dir
_which massively inflated the result_ (Doh!). See author's comment below for a
more sensible figure.

~~~
andreaorru
Sorry, but that's really not fair. You are running that on all the folders
inside the src/ directory (and the README?), including blargg's libraries that
I'm using. I'm not counting that. Go ahead and include those if you want. It's
code I didn't write, and bigger than the rest of the emulator! I don't think
that's representative.

CPU and PPU implementations tend to be on the order of thousands of lines
-- they are around 200 and 300 lines respectively in LaiNES. In fact, most of
the code is in the GUI that I could easily strip away if this was a
competition. And this wasn't written to be small - it was written to be
simple. It also came out small, but that's incidental.

Here's how I counted the lines and how I decided on the description for the
repository, which by the way has been catapulted from totally unknown to
worldwide attention overnight, and is now the object of unexpected, ruthless
scrutiny that I couldn't foresee.

    
    
      [andrea@manhattan src]$ rm -rf boost nes_apu Sound_Queue.*
      [andrea@manhattan src]$ cloc .
            24 text files.
            24 unique files.                              
             1 file ignored.
    
      github.com/AlDanial/cloc v 1.70  T=0.03 s (780.3 files/s, 63170.2 lines/s)
      -------------------------------------------------------------------------------
      Language                     files          blank        comment           code
      -------------------------------------------------------------------------------
      C++                             11            210            110           1163
      C/C++ Header                    12             87              7            285
      -------------------------------------------------------------------------------
      SUM:                            23            297            117           1448
      -------------------------------------------------------------------------------

~~~
rounce
I agree with this figure, please see my edited comment above.

Still though, my gripe with the HN title still stands: to say this is ~1000
LOC is a bit rich. It's still bloody small so why try to shoehorn it into this
category?

> And this wasn't written to be small - it was written to be simple. It also
> came out small, but that's incidental.

I think that's why this is pure gold! Because it's _simple_, it's easy to
understand, I would have been hopping with glee if this had been available to
me as a teenager, instead of having to read tons of articles/textfiles of
varying quality with lots of trial, error and head-scratching. Its size is
beside the point, and why the title is still - in my opinion - misleading. I
can't help but feel the very people that would benefit the most from this
might possibly be put off that they're going to be presented with some
indecipherable demo comp entry.

Either way keep up the good work.

~~~
andreaorru
No problem, we are cool. I see your point.

I believe there is still a lot of room for improvement in terms of accuracy,
clarity and code size. Note that this repository is more than 3 years old.
Maybe this will motivate me to improve it even further.

~~~
rounce
I think that'd be pretty cool to see. I'd be especially interested in how
quickly the returns in size diminish given your starting point. Likewise, how
much it would have to change architecturally as it gets smaller. Sounds like
a _slightly_ more extreme form of the game Shenzhen I/O.

While reading through the repo, one thing I kept thinking was that it would be
nice, IMO, to decouple everything from the GUI so that it was a little
'flatter'. So `main` would call `NES::run()` (or something), and `NES` would
leverage GUI. GUI would just be drawing stuff. (In my head at least,) it feels
like that way it might be easier to mentally partition things, for those using
it as a learning project, as `NES` would be responsible for the ownership and
interop that GUI is doing now. I'll add it to my todo list and perhaps in
god-knows-when I'll fork it and do this if you haven't gotten round to it :)
Having said that it's inspired me to finish my first Go project which was a
GameBoy emulator. Was started mainly to deep-dive the language, but I think it
could be useful in a similar way if cleaned up and documented for folks.
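
Roughly what I have in mind, as a sketch only. None of these types exist in LaiNES: `Frontend`, `CountingFrontend` and this `NES` class are hypothetical, and the emulation step is left as a stub.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical interface: the frontend only draws frames and reports quit requests.
struct Frontend {
    virtual ~Frontend() = default;
    virtual void draw(const std::vector<std::uint32_t>& frame) = 0;
    virtual bool quit_requested() = 0;
};

// Hypothetical top-level object: owns the emulation state and drives the frontend.
class NES {
public:
    explicit NES(Frontend& fe) : frontend_(fe), frame_(256 * 240, 0) {}

    // Runs until the frontend asks to quit; returns the number of frames produced.
    int run() {
        int frames = 0;
        while (!frontend_.quit_requested()) {
            run_frame();
            frontend_.draw(frame_);
            ++frames;
        }
        return frames;
    }

private:
    void run_frame() { /* CPU/PPU/APU stepping would go here */ }

    Frontend& frontend_;
    std::vector<std::uint32_t> frame_;  // one 256x240 frame of pixels
};

// Stub frontend for experimenting: "draws" a fixed number of frames, then quits.
struct CountingFrontend : Frontend {
    int frames_left;
    int frames_drawn = 0;

    explicit CountingFrontend(int n) : frames_left(n) {}

    void draw(const std::vector<std::uint32_t>&) override {
        ++frames_drawn;
        --frames_left;
    }
    bool quit_requested() override { return frames_left <= 0; }
};
```

With that split, `main` would just construct a frontend, hand it to `NES`, and call `run()`; swapping the SDL GUI for a headless stub like the one above would not touch the emulation code.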

------
anonbanker
Any plans for a libretro port?

------
ggggtez
Maybe if you inline entire functions...

~~~
userbinator
Functions which are only "called" once, which I think is a perfectly
reasonable thing to do.

