
Let’s Build a Simple Interpreter, Part 1 - rspivak
http://ruslanspivak.com/lsbasi-part1/
======
amelius
> “If you don’t know how compilers work, then you don’t know how computers
> work. If you’re not 100% sure whether you know how compilers work, then you
> don’t know how they work.”

What nonsense. If you don't know how quantum mechanics works, you don't know
how transistors work. If you don't know how transistors work, you don't know
how computers work, etcetera.

~~~
mozumder
Most electrical engineers do go through quantum mechanics in their modern
physics classes, in order to understand how transistors work...

~~~
ChuckMcM
I was thinking that same thing, and because I was minoring in CS during my EE
degree I also got to do the compilers class with the rest of the CS majors
(and a class on interpreters).

But I get that you can understand a lot about computation without
understanding compilers. Just as you can understand a lot about circuits
without understanding quantum mechanics.

------
jussij
Here is a link to the source code for a _C-like interpreter_ that I wrote many
years ago:

[http://www.zeusedit.com/tools.html](http://www.zeusedit.com/tools.html)

Is it any good? Not really. There are many far better interpreters out there
to choose from.

But it was my _first and last_ ever interpreter, written some 20+ years ago,
written when I was fresh out of university.

Was it hard to write?

I did not think so and I only had an engineering degree ;)

Now let's try to write a compiler. Suddenly things get far more difficult!!!

~~~
yoklov
A compiler isn't actually that much more difficult than an interpreter.
Extending an interpreter to compile instead of interpret is not that much
code, and the code won't be that difficult or complex either.

The hard part is making the generated code perform well, which requires much
more work.

------
JoeAltmaier
Interpreters are fun, but still too complicated. There's a simpler form, which
is called 'interplets' around here. It has a parser, then directly generates
an object instance for every semantic element e.g. variable, control statement
etc. The objects get linked, and they have an 'execute' method. Voila! Runs
about 10X as fast as an interpreter.
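
A minimal sketch of that idea in Python, assuming the parser has already run and the objects are linked by hand. The class names (`Num`, `Var`, `Add`, `Less`, `Assign`, `While`) and the `env` dictionary are made up for illustration, not taken from any real 'interplet' implementation:

```python
# Hypothetical node classes: every semantic element is an object
# with an execute() method, and the objects link to each other.

class Num:
    def __init__(self, value):
        self.value = value

    def execute(self, env):
        return self.value


class Var:
    def __init__(self, name):
        self.name = name

    def execute(self, env):
        return env[self.name]


class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def execute(self, env):
        return self.left.execute(env) + self.right.execute(env)


class Less:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def execute(self, env):
        return self.left.execute(env) < self.right.execute(env)


class Assign:
    def __init__(self, name, expr):
        self.name, self.expr = name, expr

    def execute(self, env):
        env[self.name] = self.expr.execute(env)


class While:
    def __init__(self, cond, body):
        self.cond, self.body = cond, body

    def execute(self, env):
        while self.cond.execute(env):
            for statement in self.body:
                statement.execute(env)


# Object graph a parser might build for: i = 0; while i < 5: i = i + 1
program = [
    Assign("i", Num(0)),
    While(Less(Var("i"), Num(5)), [Assign("i", Add(Var("i"), Num(1)))]),
]

env = {}
for statement in program:
    statement.execute(env)
print(env["i"])  # 5
```

Running the program is just calling `execute` on the top-level objects; there is no separate dispatch loop over opcodes or tokens.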

~~~
arethuza
Well, if you want a _really_ simple interpreter there is always the option to
go Reverse Polish (e.g. Forth/PostScript).

Of course, this makes the implementation simple while arguably pushing some of
the work onto the programmer using the language - but RPN languages can be
rather fun in a Lisp kind of way.
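
To show how little is needed, here is a rough sketch of a Reverse Polish evaluator in Python. It only handles numbers and four arithmetic operators; the function name `eval_rpn` and the `OPS` table are my own, and a real Forth or PostScript does far more:

```python
import operator

# Operator table: each token maps to a two-argument function.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(source):
    """Evaluate a whitespace-separated Reverse Polish expression."""
    stack = []
    for token in source.split():
        if token in OPS:
            # Operands come off the stack in reverse order.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


print(eval_rpn("3 4 + 2 *"))  # 14.0
```

There is no grammar and no precedence handling: the programmer supplies the evaluation order, which is exactly the work being pushed onto them.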

~~~
Someone
For beginners, I would go for a recursive descent interpreter (just made up
that term) that implements a 'little language'. That is something on the
complexity of a shell without looping or variables or of parsing argv:

    
    
      - read a line
      - do a switch on the first word of the line
      - inside each switch, make up your own syntax, parse the
        remainder of the line, and execute it or issue an error
        message and abort.
        Feel free to cheat by using scanf to parse integers.
    

Iteration 2 can add printing the line number in error messages.

Iteration 3 can add if/then and looping: keep the lines in an array, and add
'goto' to the list of commands, and hack the jump into the interpreter loop by
adjusting a global variable from inside the 'goto' statement.

Iteration 4 can add the 'let' command and variables. Naming them A through Z
for simplicity is an option.

Then, add simple assignments of the form

    
    
      let a = x <op> y
    

where x and y are literals or variables. For now, don't bother with
expressions with multiple operators.

Next, add statements of the form

    
    
      y = f(x)
    

That gives you a sort-of 70's Basic that can be quite useful and fun to use.

Only then would I start introducing grammars, the word 'parse', etc.
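
Roughly, iterations 1 through 4 could look like the Python sketch below. The exact syntax is my own guess at one way to do it: `print x`, `goto n`, `if x <op> y goto n` and `let a = x [<op> y]`, with 1-based line numbers. The `y = f(x)` form is left out, and the "global variable" for the jump is the local `pc`:

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "<": operator.lt, "=": operator.eq}


def run(lines):
    """Run a list of source lines; return everything 'print' produced."""
    variables = {}
    output = []
    pc = 0  # index of the next line to run; 'goto' simply overwrites it

    def value(token):
        # A token is either an integer literal or a variable name.
        if token.lstrip("-").isdigit():
            return int(token)
        return variables[token]

    while pc < len(lines):
        words = lines[pc].split()
        lineno = pc + 1
        pc += 1
        if not words:
            continue
        cmd, args = words[0], words[1:]
        try:
            if cmd == "print":
                output.append(value(args[0]))
            elif cmd == "let":
                # let a = x        or        let a = x <op> y
                if len(args) not in (3, 5) or args[1] != "=":
                    raise ValueError("expected: let a = x [<op> y]")
                if len(args) == 3:
                    variables[args[0]] = value(args[2])
                else:
                    variables[args[0]] = OPS[args[3]](value(args[2]),
                                                      value(args[4]))
            elif cmd == "goto":
                pc = int(args[0]) - 1
            elif cmd == "if":
                # if x <op> y goto n
                if OPS[args[1]](value(args[0]), value(args[2])):
                    pc = int(args[4]) - 1
            else:
                raise ValueError("unknown command " + cmd)
        except (IndexError, KeyError, ValueError) as err:
            # Iteration 2: report the line number, then abort.
            raise RuntimeError(f"line {lineno}: {err}") from err
    return output


program = [
    "let i = 1",
    "print i",
    "let i = i + 1",
    "if i < 4 goto 2",
]
print(run(program))  # [1, 2, 3]
```

No tokenizer, no grammar, no tree: each line is split on whitespace and the first word picks the branch.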

~~~
sklogic
And what's the point of such a complex approach, if a proper interpreter (or,
even better, compiler) is much simpler?

~~~
Someone
Complex? It's building up from almost nothing, and shows how simple an
interpreter can be.

If you start with a grammar, you lose half your audience within 5 minutes
(numbers made up).

If you start with a program that reads a text file and plays notes for lines
containing do, re, and mi, part of that lost audience will get drawn in,
complete the scale, add 'fart' commands, etc.

Also, smart kids may figure out they can easily add single-statement loops on
their own, by adding a case 'repeat' that scans for the number of iterations,
removes the 'repeat' and that number of iterations from the line, and then
calls the 'processALine' code.

It's "learn by playing" vs "learn through study".
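
Something like this Python sketch, where "playing" a note is faked by appending it to a list (no audio), each line holds one command, and `process_line` is my stand-in for the 'processALine' code:

```python
NOTES = {"do", "re", "mi"}


def process_line(line, played):
    """Handle one line; 'played' collects the notes instead of sounding them."""
    words = line.split()
    if not words:
        return
    cmd = words[0]
    if cmd in NOTES:
        played.append(cmd)  # stand-in for actually making a sound
    elif cmd == "repeat":
        # repeat <n> <rest of line>: strip 'repeat' and the count,
        # then feed the remainder back into this same function n times.
        count = int(words[1])
        rest = " ".join(words[2:])
        for _ in range(count):
            process_line(rest, played)
    else:
        raise ValueError("unknown command: " + cmd)


played = []
for line in ["do", "repeat 2 re", "mi"]:
    process_line(line, played)
print(played)  # ['do', 're', 're', 'mi']
```

Because the remainder goes back through the same function, `repeat 2 repeat 2 mi` works too, without anyone having planned for nesting.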

~~~
sklogic
You do not have to do such clumsy things to build from nothing. Split the
input by whitespace, look up word definitions in a table (linked list is ok),
push the addresses of the found words to the output stream as call
instructions or execute immediately if marked so. That's it, a minimal direct
threaded Forth compiler is done.
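
As a rough illustration only: Python cannot emit real call instructions, so in this sketch a list of Python functions stands in for the threaded code, and a dict stands in for the word table. The word set, the `again` immediate word, and the integer-literal fallback are all assumptions added to make the example runnable:

```python
def add(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)


def mul(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)


def dup(stack):
    stack.append(stack[-1])


def again(code):
    # Immediate word: runs during compilation and re-emits the previous call.
    code.append(code[-1])


# Word table: name -> (function, immediate flag).
WORDS = {
    "+": (add, False),
    "*": (mul, False),
    "dup": (dup, False),
    "again": (again, True),
}


def compile_forth(source, words):
    """Turn source text into a list of callables (the 'threaded code')."""
    code = []
    for token in source.split():
        if token in words:
            fn, immediate = words[token]
            if immediate:
                fn(code)         # executed now, at compile time
            else:
                code.append(fn)  # stands in for emitting a call instruction
        else:
            # Not in the table: assume an integer literal.
            n = int(token)
            code.append(lambda stack, n=n: stack.append(n))
    return code


def run(code):
    stack = []
    for fn in code:
        fn(stack)
    return stack


print(run(compile_forth("2 3 + dup *", WORDS)))  # [25]
```

The compile loop is a split, a table lookup, and an append, which is the whole point being made above.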

------
aikah
I like this article:

[http://lisperator.net/pltut/](http://lisperator.net/pltut/)

It's in JavaScript and does a good job of teaching how to write interpreters
without regex tricks. There is even a compiler tutorial at the end.

------
ufo
I am a huge fan of using functional programming languages to implement
interpreters. Algebraic data types make working with syntax trees much more
pleasant.
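
For anyone who hasn't seen the style: the sketch below approximates it in Python with frozen dataclasses and a `Union`, which is only an imitation of real algebraic data types with pattern matching (no exhaustiveness checking, for one). The node names `Num`, `Add`, `Mul` are illustrative:

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


# The "sum type": an expression is exactly one of these variants.
Expr = Union[Num, Add, Mul]


def evaluate(expr: Expr) -> int:
    """One case per variant, mirroring a pattern match."""
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Add):
        return evaluate(expr.left) + evaluate(expr.right)
    if isinstance(expr, Mul):
        return evaluate(expr.left) * evaluate(expr.right)
    raise TypeError(f"not an expression: {expr!r}")


tree = Add(Num(1), Mul(Num(2), Num(3)))  # 1 + 2 * 3
print(evaluate(tree))  # 7
```

In an ML-family language the three classes collapse into a single type declaration and `evaluate` into one `match` expression, which is where the pleasantness comes from.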

~~~
sklogic
And yet, it's a lot of boilerplate to write recursive AST walkers manually.
ADTs are not enough, some (ideally, compile-time) reflection is also needed.

~~~
pjmlp
That is why for prototyping it is much better to use tools like JavaCC, ANTLR,
attribute grammars,... than doing everything from scratch.

~~~
sklogic
None of these helps with implementing AST visitors. I was rather talking about
things like Scrap Your Boilerplate or Nanopass.

~~~
pjmlp
They are able to generate part of the AST visitor boilerplate.

For example, JJTree on JavaCC.

~~~
sklogic
It's not very useful - it's generating only one visitor structure, while you'd
need many for different passes. E.g., a visitor for extracting bound and free
identifiers lists would be different from a visitor for lowering various
syntax sugar.
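
To make that concrete, here is a hand-written sketch of two such passes in Python over a tiny lambda-calculus AST with `let` as sugar. The tuple encoding (`("var", name)`, `("lam", param, body)`, `("app", fn, arg)`, `("let", name, value, body)`) is invented for the example. Both functions walk the same shapes but do different work at each node, which is also where the manual boilerplate comes from:

```python
def free_vars(node):
    """Pass 1: collect the free identifiers of an expression."""
    kind = node[0]
    if kind == "var":
        return {node[1]}
    if kind == "lam":
        return free_vars(node[2]) - {node[1]}
    if kind == "app":
        return free_vars(node[1]) | free_vars(node[2])
    if kind == "let":
        # let x = value in body: x is bound in body only.
        return free_vars(node[2]) | (free_vars(node[3]) - {node[1]})
    raise ValueError("unknown node kind: " + kind)


def desugar(node):
    """Pass 2: lower 'let x = v in b' to '(lambda x. b)(v)'."""
    kind = node[0]
    if kind == "var":
        return node
    if kind == "lam":
        return ("lam", node[1], desugar(node[2]))
    if kind == "app":
        return ("app", desugar(node[1]), desugar(node[2]))
    if kind == "let":
        return ("app", ("lam", node[1], desugar(node[3])), desugar(node[2]))
    raise ValueError("unknown node kind: " + kind)


# let x = y in x z
tree = ("let", "x", ("var", "y"), ("app", ("var", "x"), ("var", "z")))
print(sorted(free_vars(tree)))  # ['y', 'z']
print(desugar(tree))
```

Lowering the sugar should not change which identifiers are free, so `free_vars(desugar(tree))` gives the same set as `free_vars(tree)`.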

~~~
pjmlp
Ah ok, but you could also do the multiple passes at once, at least in some
cases.

I am talking about prototyping, not going with a state-of-the-art implementation.

~~~
sklogic
Doing multiple passes at once is a (premature) optimisation. For prototyping
you'd rather split the process into as small and simple passes as possible.

