
A way to do this more directly, at a deeper level, and without dealing with any assembly syntax is to compile down to machine code for a straightforward machine (I really like MSP430 for this) and run it in an emulator, preferably one you write yourself.

Having had the pleasure of writing a couple emulators and a couple compilers at this point: the emulator is easier than the compiler, by a lot. Relative to parsing and emitting code, emulating a straightforward architecture is easy.

This is a great post!



I agree completely, and I would add that the compiler doesn't even need to compile down to a system/architecture that actually exists. You can make your own simple 'architecture' with a minimal list of instructions that wouldn't require much more than a loop and some switch statements to emulate, and you're free to design it in whatever way makes your life easier. For an architecture that's designed to be targeted by C, you likely don't even need registers and could just use memory operands for everything.
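To make that concrete, here is a minimal sketch in Python of what such a made-up machine could look like. Everything in it is an assumption for illustration: the five opcodes, the (opcode, a, b, c) tuple format, and the `run` function are invented here, not taken from any real architecture. Every operand is a memory address, so the only CPU state is the program counter.

```python
# Hypothetical memory-only instruction set; all names are invented for this sketch.
SET, ADD, SUB, JNZ, HALT = range(5)

def run(program, memory_size=16):
    """Execute a list of (opcode, a, b, c) tuples against a flat memory."""
    mem = [0] * memory_size
    pc = 0  # index of the next instruction; the only "register" in this design
    while True:
        op, a, b, c = program[pc]
        pc += 1
        if op == SET:      # mem[a] = constant b
            mem[a] = b
        elif op == ADD:    # mem[a] = mem[b] + mem[c]
            mem[a] = mem[b] + mem[c]
        elif op == SUB:    # mem[a] = mem[b] - mem[c]
            mem[a] = mem[b] - mem[c]
        elif op == JNZ:    # jump to instruction b if mem[a] is non-zero
            if mem[a] != 0:
                pc = b
        elif op == HALT:
            return mem
        else:
            raise ValueError(f"unknown opcode {op}")

# Sum 5 + 4 + 3 + 2 + 1 into mem[0], counting mem[1] down to zero.
program = [
    (SET, 0, 0, 0),   # 0: total = 0
    (SET, 1, 5, 0),   # 1: counter = 5
    (SET, 2, 1, 0),   # 2: one = 1
    (ADD, 0, 0, 1),   # 3: total = total + counter
    (SUB, 1, 1, 2),   # 4: counter = counter - one
    (JNZ, 1, 3, 0),   # 5: loop back to 3 while counter != 0
    (HALT, 0, 0, 0),  # 6: stop and return memory
]
result = run(program)
print(result[0])
```

A compiler targeting something like this never has to allocate registers: each variable gets a memory slot and each expression becomes a few three-address instructions.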


There's also a reasonable chance that if you're interested in compiler design and making an architecture you're the sort of person who would enjoy learning a hardware description language. With a cheap FPGA dev board you could then implement your toy architecture and run your compiled programs on that. This adds substantial difficulty to the project, but is damn fun.


Yet another approach which I have used in Fur (https://github.com/kerkeslager/fur) is to compile to a low(er)-level programming language (in the case of Fur, I'm compiling to C).

These approaches aren't just for toy languages. Java/C#/Python standard implementations compile to byte code that runs on virtual machines. Glasgow Haskell Compiler compiles to C--, and a bunch of languages compile to JavaScript.


The approach taken at my compiler course was one that I consider the best one for toy compilers, but it does mean having to deal with Assembly.

Do the generation to some kind of byte code as you are suggesting, but using an instruction format that can be used as macro calls in macro assemblers.

Then just write the macros for each bytecode. It doesn't matter if the register usage is bad; the goal is just to have a plain native executable that runs.
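As a rough sketch of the compiler side of that idea (in Python, with made-up mnemonics like PUSHI and PRINT that are purely hypothetical), the code generator only has to print one macro invocation per bytecode. The macro definitions themselves would be written once, in whatever syntax your macro assembler uses, and are not shown here.

```python
def to_macro_calls(bytecode):
    """Render (mnemonic, operand...) tuples as one macro invocation per line."""
    lines = []
    for mnemonic, *operands in bytecode:
        args = ", ".join(str(o) for o in operands)
        # rstrip() drops the trailing space for operand-less instructions
        lines.append(f"    {mnemonic} {args}".rstrip())
    return "\n".join(lines)

# Hypothetical stack bytecode for "print 2 + 3".
bytecode = [("PUSHI", 2), ("PUSHI", 3), ("ADD",), ("PRINT",)]
listing = to_macro_calls(bytecode)
print(listing)
```

The resulting text is fed to the assembler together with the macro file, so the compiler never has to know the native instruction set.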


> is to compile down to an emulation of a straightforward machine (I really like MSP430 for this)

What exactly do you mean by this? Direct object code emission?

> the emulator is easier than the compiler, by a lot

Yeah, definite +1 there. For most simple machines it really is not much more than a loop with a switch on the opcode where the cases update the CPU state.


Yes: just spit out opcodes.
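For a sense of how little that involves, here is a sketch in Python that walks an expression tree and appends raw bytes. The opcode values and the stack-machine encoding are invented for the example (a real target such as MSP430 has its own encoding, which this does not attempt to reproduce).

```python
# Hypothetical one-byte opcodes for an invented stack machine.
OP_PUSH, OP_ADD, OP_MUL = 0x01, 0x02, 0x03

def emit(node, out):
    """Append code for an expression tree: an int, or an (op, left, right) tuple."""
    if isinstance(node, int):
        if not 0 <= node <= 0xFF:
            raise ValueError("this sketch only encodes one-byte constants")
        out += bytes([OP_PUSH, node])  # opcode followed by its immediate operand
    else:
        op, left, right = node
        emit(left, out)    # operands first (postfix order) ...
        emit(right, out)
        out.append({"+": OP_ADD, "*": OP_MUL}[op])  # ... then the operator
    return out

# 2 + 3 * 4
code = bytes(emit(("+", 2, ("*", 3, 4)), bytearray()))
print(code.hex())
```

No assembler or text format sits in between: the compiler's output is the byte string the emulator loads.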


Any reference(s) where I can read more about writing emulators?


Here is one I found quickly: https://fms.komkon.org/EMUL8/HOWTO.html

Writing an emulator is almost as much fun as writing a compiler. And when you are done, you will have another level of appreciation of how computers work.


Wirth came up with a simple RISC architecture for his compiler book. There's an emulator[1] for it, written by Peter De Wachter. Michael Schierl adapted Peter's emulator code into Java and JS. You can run it here:

http://schierlm.github.io/OberonEmulator/

I sent a bunch of patches to rework the in-browser emulator a couple weeks ago. If you don't know C, I recommend reading through the JS emulator's source. (View source should suffice—it's all unminified vanilla JS; there's no opinionated JS framework involved.)

With it running in the browser, the emulator frontend treats the web platform as its widget toolkit. The code to interface with that is in webdriver.js[2] and takes about 1000 lines of code. The CPU and memory operations themselves are implemented in risc.js[3] and only take about 1/3 that.

To follow along with instruction fetching/decoding/execution, you'll need to understand the ISA. There's a good 3-page overview linked from projectoberon.com under the title "RISC Architecture"[4]. A more in-depth description of the design is also available[5].

I have some tentative work on a machine-code-level debugger online[6]. It's unfinished, so it comes with no documentation and the toolbar icons are missing (there are tooltips, though). You can play with it if you feel like watching the registers and flags change while stepping through machine instructions.

1. https://github.com/pdewacht/oberon-risc-emu/

2. https://github.com/schierlm/OberonEmulator/blob/master/JS/we...

3. https://github.com/schierlm/OberonEmulator/blob/master/JS/ri...

4. https://www.inf.ethz.ch/personal/wirth/FPGA-relatedWork/RISC...

5. https://www.inf.ethz.ch/personal/wirth/FPGA-relatedWork/RISC...

6. https://www.colbyrussell.com/staging/aubergine/emu.html?imag...



