
Show HN: Writing my first interpreter in C for fun - jhedwards
https://github.com/incrediblesound/huo
======
naveen99
Sometimes I wish for compiler tutorials that progress feature by feature,
starting with simple features, rather than starting with complicated things
like parsing expressions with multiple opcodes (opcodes being implemented as
function calls, so it gets even more complicated)...

For example, a sequence like this:

1\. variable assignment

2\. commands of one parameter

3\. if / branch statements

4\. much later: nested expressions, function calls, multiple stacks,
environments

Basically, starting with an abstraction of three-parameter opcodes, and then
slowly building features up the abstraction layers, as well as slowly building
down the optimization layers, step by step.
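
To make the "three-parameter opcode" starting point concrete, here is a minimal sketch in C. The instruction names (`OP_SET`, `OP_ADD`, `OP_SUB`, `OP_JNZ`, `OP_HALT`), the fixed-size variable array and the `sum_to_three` demo are all assumptions made up for illustration; they cover only assignment, arithmetic and a branch, with no expressions or parsing yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-parameter instruction set; every name is illustrative. */
enum op { OP_SET, OP_ADD, OP_SUB, OP_JNZ, OP_HALT };

struct instr {
    enum op op;
    int a, b, c;   /* meaning depends on op, see run() */
};

#define NUM_VARS 8

/* Aborts if a variable index is outside the variable array. */
static void check_var(int idx)
{
    assert(idx >= 0 && idx < NUM_VARS);
}

/* Runs prog against vars and returns the number of instructions executed. */
size_t run(const struct instr *prog, size_t len, int vars[NUM_VARS])
{
    size_t pc = 0, steps = 0;
    while (pc < len) {
        const struct instr *i = &prog[pc];
        steps++;
        switch (i->op) {
        case OP_SET:   /* vars[a] = b, where b is a literal; c is unused */
            check_var(i->a);
            vars[i->a] = i->b;
            pc++;
            break;
        case OP_ADD:   /* vars[a] = vars[b] + vars[c] */
            check_var(i->a); check_var(i->b); check_var(i->c);
            vars[i->a] = vars[i->b] + vars[i->c];
            pc++;
            break;
        case OP_SUB:   /* vars[a] = vars[b] - vars[c] */
            check_var(i->a); check_var(i->b); check_var(i->c);
            vars[i->a] = vars[i->b] - vars[i->c];
            pc++;
            break;
        case OP_JNZ:   /* if vars[a] != 0, jump to instruction b; c is unused */
            check_var(i->a);
            pc = vars[i->a] != 0 ? (size_t)i->b : pc + 1;
            break;
        case OP_HALT:
            return steps;
        }
    }
    return steps;
}

/* Demo program: adds 3 + 2 + 1 into vars[1] with a countdown loop. */
int sum_to_three(void)
{
    int vars[NUM_VARS] = {0};
    const struct instr prog[] = {
        { OP_SET, 0, 3, 0 },   /* v0 = 3 (counter)       */
        { OP_SET, 1, 0, 0 },   /* v1 = 0 (running total) */
        { OP_SET, 2, 1, 0 },   /* v2 = 1 (step)          */
        { OP_ADD, 1, 1, 0 },   /* v1 = v1 + v0           */
        { OP_SUB, 0, 0, 2 },   /* v0 = v0 - v2           */
        { OP_JNZ, 0, 3, 0 },   /* loop while v0 != 0     */
        { OP_HALT, 0, 0, 0 },
    };
    run(prog, sizeof prog / sizeof prog[0], vars);
    return vars[1];
}
```

Later stages of such a tutorial could then add a parser that produces these instructions, rather than starting from the parser.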

~~~
segmondy
You are looking for this: "Let's Build a Compiler"
[http://www.stack.nl/~marcov/compiler.pdf](http://www.stack.nl/~marcov/compiler.pdf)

~~~
naveen99
Not exactly. More like an expanded version of chapter 5 of SICP:
[https://mitpress.mit.edu/sicp/full-text/book/book-
Z-H-31.htm...](https://mitpress.mit.edu/sicp/full-text/book/book-Z-H-31.html)

I will put it on GitHub if I ever get around to writing one myself.

I am thinking of something targeting or including a toy version of QEMU.

------
Dangeranger
Hey, this is great work! What resources would you recommend for someone else
who wants to build their own toy language for fun and practice?

~~~
cel1ne
The classic way would be to start out with LEX/FLEX and YACC.

~~~
zzzcpan
Forget about lex, flex, bison, yacc and all that. It's way easier and more
important to start with a hand written recursive descent parser.
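
For anyone wondering what "hand written recursive descent" looks like in practice, here is a minimal sketch in C for integer arithmetic. The grammar, the `struct parser` type and the `eval` entry point are assumptions chosen for illustration, not code from any project in this thread. Each grammar rule becomes one function, and operator precedence falls out of which function calls which.

```c
#include <ctype.h>
#include <stdlib.h>

/* Illustrative grammar:
 *   expr   := term   (('+' | '-') term)*
 *   term   := factor (('*' | '/') factor)*
 *   factor := NUMBER | '(' expr ')'
 */
struct parser {
    const char *cur;  /* next unread character */
    int ok;           /* cleared on the first error */
};

static long parse_expr(struct parser *p);

static void skip_ws(struct parser *p)
{
    while (isspace((unsigned char)*p->cur)) p->cur++;
}

/* Consumes ch if it is the next non-blank character. */
static int match(struct parser *p, char ch)
{
    skip_ws(p);
    if (*p->cur != ch) return 0;
    p->cur++;
    return 1;
}

static long parse_factor(struct parser *p)
{
    if (match(p, '(')) {
        long v = parse_expr(p);
        if (!match(p, ')')) p->ok = 0;   /* missing closing parenthesis */
        return v;
    }
    if (!isdigit((unsigned char)*p->cur)) { p->ok = 0; return 0; }
    char *end;
    long v = strtol(p->cur, &end, 10);
    p->cur = end;
    return v;
}

static long parse_term(struct parser *p)
{
    long v = parse_factor(p);
    while (p->ok) {
        if (match(p, '*')) {
            v *= parse_factor(p);
        } else if (match(p, '/')) {
            long d = parse_factor(p);
            if (d == 0) { p->ok = 0; break; }   /* parse error or division by zero */
            v /= d;
        } else {
            break;
        }
    }
    return v;
}

static long parse_expr(struct parser *p)
{
    long v = parse_term(p);
    while (p->ok) {
        if (match(p, '+')) v += parse_term(p);
        else if (match(p, '-')) v -= parse_term(p);
        else break;
    }
    return v;
}

/* Returns 1 and stores the result in *out, or returns 0 on any error. */
int eval(const char *src, long *out)
{
    struct parser p = { src, 1 };
    long v = parse_expr(&p);
    skip_ws(&p);
    if (!p.ok || *p.cur != '\0') return 0;   /* error or trailing input */
    *out = v;
    return 1;
}
```

This version evaluates while it parses; building an AST instead would mean returning nodes from the same three functions.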

~~~
textmode
Q: Why do those programs exist? Lex and yacc originally come from Bell Labs.
Why wouldn't they use what you recommend instead?

And when I look at the credits for lex and yacc today I see some very capable
programmers. Why do these programs exist? I may be biased somewhat as I use
them regularly for small tasks, but my question is an honest one.

Also, someone else mentioned SWI-Prolog. I think gprolog is just as easy and
is noticeably faster. Curious what HN readers think are the advantages of
SWI-Prolog.

~~~
pjmlp
Because many don't know any better and are stuck in a '70s parser-tools mindset.

Tools like ANTLR[0], SableCC[1], JavaCC[2], MPS[5] and approaches like
attribute grammars[3] and parsing expression grammars[4] are much better
suited to modern needs.

I don't know gprolog, but SWI has GUI tooling and compilation, and I was
already using it in 1997, so it is a very sound toolchain.

[0] - [http://www.antlr.org/](http://www.antlr.org/)

[1] - [https://github.com/SableCC/sablecc](https://github.com/SableCC/sablecc)

[2] -
[https://javacc.java.net/doc/JJTree.html](https://javacc.java.net/doc/JJTree.html)

[3] -
[https://wiki.haskell.org/Attribute_grammar](https://wiki.haskell.org/Attribute_grammar)

[4] -
[https://en.wikipedia.org/wiki/Parsing_expression_grammar](https://en.wikipedia.org/wiki/Parsing_expression_grammar)

[5] - [https://www.jetbrains.com/mps](https://www.jetbrains.com/mps)

~~~
91891879181
I've tried ANTLR and it was a horrible experience compared to Menhir: being
stuck with LL(n) causes a lot of unnecessary refactoring, and the syntax for
anything out of the ordinary is arcane and IMO not well documented.

Give me the 70's tools any day.

~~~
groovy2shoes
Menhir is hardly a 70s tool...

~~~
99189198191
This is true of course. But it is ocamlyacc compatible and uses a separate
lexer.

I'd at least say that it is using the lex/yacc _approach_, which I actually
like.

------
melling
Here are my compiler resources:

[https://github.com/melling/ComputerLanguages/blob/master/com...](https://github.com/melling/ComputerLanguages/blob/master/compilers.org)

------
weberc2
Very cool. Knowing little about programming language implementation, I wonder
what it would take to make a compiler. For so simple a language, I would think
it wouldn't be much harder, no?

~~~
felixangell1024
Depends on what you compile to. Writing a register allocator may be a little
tricky (if you compile to a VM, assembly, etc.), though if you compile to
something like C, it's very easy. LLVM is another alternative which is also
relatively simple to learn.

~~~
andars
Spilling every variable to memory isn't too bad if you just want it to work
and aren't worried about performance.
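
As a rough illustration of spilling everything, here is a sketch in C of a code generator for `dst = lhs + rhs` that keeps every variable in a stack slot and only uses `%rax` as a scratch register. It assumes x86-64 with AT&T syntax, 8-byte slots below `%rbp`, and that the function prologue reserving the stack space is emitted elsewhere. `slot_offset` and `emit_add` are made-up names.

```c
#include <stddef.h>
#include <stdio.h>

/* Stack offset of variable number var, assuming 8-byte slots below %rbp. */
static int slot_offset(int var)
{
    return -8 * (var + 1);
}

/* Writes assembly for dst = lhs + rhs into buf: load lhs from memory, add rhs
 * from memory, store the result back. Returns what snprintf returns. */
int emit_add(char *buf, size_t size, int dst, int lhs, int rhs)
{
    return snprintf(buf, size,
        "    movq %d(%%rbp), %%rax\n"
        "    addq %d(%%rbp), %%rax\n"
        "    movq %%rax, %d(%%rbp)\n",
        slot_offset(lhs), slot_offset(rhs), slot_offset(dst));
}
```

No register allocator is needed because no value lives in a register across statements; the cost is a load and a store around every operation.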

------
ndesaulniers
Now to make it self hosting. ;)

------
sdegutis
Nice! If you're looking for another fun challenge, make your language compile
into bytecode and execute that instead of an AST. :)
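
A minimal sketch in C of what that step could look like: a post-order walk flattens a tiny AST into an array of instructions, and a loop with an operand stack runs the array. The node layout, the `BC_*` opcodes and the `compile`/`execute` names are assumptions for illustration and are not taken from huo.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical AST node: a number, or a binary operation on two subtrees. */
struct node {
    char op;                       /* '+', '*', or 0 for a number literal */
    long value;                    /* used when op == 0 */
    const struct node *lhs, *rhs;  /* used when op != 0 */
};

enum { BC_PUSH, BC_ADD, BC_MUL };

#define CODE_MAX 64
#define STACK_MAX 32

struct chunk {
    long code[CODE_MAX];
    size_t len;
};

static void emit(struct chunk *c, long word)
{
    assert(c->len < CODE_MAX);
    c->code[c->len++] = word;
}

/* Post-order walk: operands first, then the operator. */
void compile(const struct node *n, struct chunk *c)
{
    if (n->op == 0) {
        emit(c, BC_PUSH);
        emit(c, n->value);
        return;
    }
    compile(n->lhs, c);
    compile(n->rhs, c);
    emit(c, n->op == '+' ? BC_ADD : BC_MUL);
}

/* Executes the flat instruction array using an operand stack. */
long execute(const struct chunk *c)
{
    long stack[STACK_MAX];
    size_t sp = 0;
    for (size_t pc = 0; pc < c->len; pc++) {
        switch (c->code[pc]) {
        case BC_PUSH:              /* the next word is the literal to push */
            assert(sp < STACK_MAX);
            stack[sp++] = c->code[++pc];
            break;
        case BC_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case BC_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        }
    }
    assert(sp == 1);               /* exactly one result is left */
    return stack[0];
}
```

For the tree `2 + (3 * 4)` this produces `PUSH 2, PUSH 3, PUSH 4, MUL, ADD`, so the recursion happens once at compile time instead of on every evaluation.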

~~~
chrisseaton
ASTs are fighting back these days! I work with a system at Oracle for going
straight from AST to JIT, with no bytecode, and it's proving to be very fast
and very easy.

~~~
mungoman2
That is super cool! Are there any public resources available about this?

~~~
chrisseaton
The key paper: [http://lafo.ssw.uni-
linz.ac.at/papers/2013_Onward_OneVMToRul...](http://lafo.ssw.uni-
linz.ac.at/papers/2013_Onward_OneVMToRuleThemAll.pdf)

There are implementations of Ruby, JS, R, Python, etc., using it.

I have a blog about the Ruby effort:
[http://chrisseaton.com/rubytruffle/](http://chrisseaton.com/rubytruffle/)

------
stevekemp
That's a nice set of readable code. I made a minor change just to add in
floats instead of ints, which wasn't too hard to accomplish thanks to your
structure.

(Though my changes are ugly).

If/When you get to implementing functions things will get really useful :)

~~~
jhedwards
I just realized reading your comment that I did implement functions but I
forgot to document them! I will update the readme immediately.

If you look in the main file, huo.c, you will see something called store_defs.
That function takes the ASTs of defined functions and stores them in a key-val
type store. If someone invokes a user-defined function I just grab the AST,
replace the variables with the values they passed in and execute it. That code
is in process_defs and is invoked by execute.
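
For readers who have not opened the repository, here is a minimal sketch in C of the general idea described above. It is not the code in huo.c: the node layout, the fixed-size table and the names `define_function` and `call_function` are invented for illustration. It also swaps in a simpler technique: rather than rewriting the stored AST with the passed values, it looks each parameter up in an argument array during evaluation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical AST: a number, a parameter reference, or a binary operation. */
enum kind { NUM, PARAM, BINOP };

struct node {
    enum kind kind;
    long value;                    /* NUM: the literal; PARAM: argument index */
    char op;                       /* BINOP: '+' or '*' */
    const struct node *lhs, *rhs;  /* BINOP operands */
};

struct def {
    const char *name;
    const struct node *body;
};

#define DEFS_MAX 16
static struct def defs[DEFS_MAX];  /* tiny key-value store: name -> body */
static size_t ndefs;

/* Stores a function body under its name; returns 0 when the table is full. */
int define_function(const char *name, const struct node *body)
{
    if (ndefs == DEFS_MAX) return 0;
    defs[ndefs].name = name;
    defs[ndefs].body = body;
    ndefs++;
    return 1;
}

/* Evaluates a body, reading each PARAM from args. */
static long eval_node(const struct node *n, const long *args)
{
    switch (n->kind) {
    case NUM:
        return n->value;
    case PARAM:
        return args[n->value];
    case BINOP: {
        long l = eval_node(n->lhs, args);
        long r = eval_node(n->rhs, args);
        return n->op == '+' ? l + r : l * r;
    }
    }
    return 0;
}

/* Looks the function up by name and runs it; returns 0 if it is undefined. */
int call_function(const char *name, const long *args, long *out)
{
    for (size_t i = 0; i < ndefs; i++) {
        if (strcmp(defs[i].name, name) == 0) {
            *out = eval_node(defs[i].body, args);
            return 1;
        }
    }
    return 0;
}
```

A linear scan is enough for a handful of definitions; a hash table would be the obvious next step if lookups ever showed up in a profile.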

~~~
stevekemp
Wonderful news. I managed to gloss over that :)

------
ckaygusu
Nice work.

You may want to take a look at Bison/Flex. While they don't suit everyone's
tastes, it's good to know how to work with them. For example, the Bash parser
is written using those tools.

------
analog31
Nice work. What would it take to make it into a REPL? I haven't looked closely
at the code yet, so maybe it's close to being there.

~~~
jhedwards
That's a good idea. I don't think it would be hard: you just get the
interpreter to run in a loop and read from keyboard input. I'll put that
on the list for upcoming features :-)
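
The loop itself might look something like this minimal sketch in C. `eval_line` is a stand-in assumption (it just sums the integers on a line) where the real interpreter's parse-and-execute call would go, and `repl` is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in evaluator (an assumption): sums the integers on one line. */
long eval_line(const char *line)
{
    long total = 0;
    char *end;
    for (;;) {
        long v = strtol(line, &end, 10);
        if (end == line) break;    /* no more numbers on this line */
        total += v;
        line = end;
    }
    return total;
}

/* Read-eval-print loop; returns the number of lines evaluated. */
int repl(FILE *in, FILE *out)
{
    char line[256];
    int count = 0;
    for (;;) {
        fputs("> ", out);          /* prompt */
        fflush(out);
        if (!fgets(line, sizeof line, in)) break;   /* EOF or read error */
        fprintf(out, "%ld\n", eval_line(line));
        count++;
    }
    return count;
}
```

Calling `repl(stdin, stdout)` from `main` gives an interactive prompt that ends on end-of-file (Ctrl-D on most Unix terminals). Any state that should survive between lines, such as defined functions, has to live outside the per-line evaluation.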

