

Let’s Build a Simple Interpreter, Part 3 - rspivak
http://ruslanspivak.com/lsbasi-part3/

======
w0utert
It's going to be interesting to see how this series expands to more complex
concepts of interpreter design, such as grammars, ASTs, etc. To be honest,
right now I'm thinking the approach taken to teach the basics of parsers and
interpreters is detrimental to understanding how these things work in the real
world.

IMO, it's important to learn about concepts such as regular expressions/finite
state machines for lexers, grammars, lookahead, recursive descent, etc. for
parsers first, before diving into code and writing a ghetto-lexer and a linear
parser with hard-coded rules and trying to build from there. At some point
when you really want to parse and interpret Pascal (which is what the first
article in the series mentions as the ultimate goal), you will have to forget
about everything the naive code did, and throw away all the code, because
you're not going to be parsing any 'real' Pascal code the way the toy examples
work right now.

Not saying this to talk down the author's effort, because it's an interesting
topic and I assume all these things will be addressed in later posts. I just
wanted to point out to people who want to learn about parsers and
interpreters that the approach the toy interpreter takes in its current form
will not scale to a real interpreter that can parse something like Pascal code.
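
To make the "regular expressions for lexers" point concrete, here is a minimal sketch of a regex-driven tokenizer in Python. It is not the article's lexer (that one is hand-written); the token names and the `tokenize` helper are assumptions picked for illustration.

```python
import re

# Hypothetical token table: (name, regex). Order matters, MISMATCH goes last.
TOKEN_SPEC = [
    ("INTEGER", r"\d+"),
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]

# One combined pattern with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Yield (kind, value) pairs; integers are converted to int."""
    for match in MASTER.finditer(text):
        kind = match.lastgroup
        value = match.group()
        if kind == "SKIP":
            continue  # whitespace carries no meaning here
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {value!r}")
        if kind == "INTEGER":
            value = int(value)
        yield kind, value
```

Supporting a new operator then means adding one row to the table rather than another hand-written branch, which is roughly the scaling argument being made above.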

~~~
hajile
The hardest part for most people is understanding how these things even work
in the first place. Use simple methods to teach the concepts. Sure you throw
away those simple methods, but by understanding the core concepts, the learner
can grasp the more advanced methods.

------
userbinator
Recursive descent is a very elegant and simple algorithm, which is quite easy
to understand even without going deep into language theory. Deriving
precedence climbing from it is also not difficult:

[https://www.engr.mun.ca/~theo/Misc/exp_parsing.htm#more_clim...](https://www.engr.mun.ca/~theo/Misc/exp_parsing.htm#more_climbing)

Precedence climbing is an even more elegant and simpler solution to multiple
precedence levels, allowing things like this:

[https://news.ycombinator.com/item?id=8558822](https://news.ycombinator.com/item?id=8558822)
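
For anyone who wants to see the idea in code, here is a rough sketch of precedence climbing in Python that evaluates as it parses. The operator table, the `^` operator and all function names are assumptions for illustration; they are not taken from the article or from the linked pages.

```python
import operator
import re

# Hypothetical operator table: symbol -> (precedence, right_associative, function).
OPERATORS = {
    "+": (1, False, operator.add),
    "-": (1, False, operator.sub),
    "*": (2, False, operator.mul),
    "/": (2, False, operator.truediv),
    "^": (3, True, operator.pow),
}


def tokenize(text):
    # Integers, or any single non-space character (bad ones are rejected later).
    return re.findall(r"\d+|\S", text)


def parse_atom(tokens, pos):
    """Parse an integer or a parenthesised expression; return (value, new_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        value, pos = parse_expression(tokens, pos + 1, 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1
    if tok.isdigit():
        return int(tok), pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")


def parse_expression(tokens, pos=0, min_prec=1):
    """Consume operators whose precedence is at least min_prec."""
    value, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in OPERATORS:
        prec, right_assoc, func = OPERATORS[tokens[pos]]
        if prec < min_prec:
            break
        # Left-associative operators require a strictly higher level on the right.
        next_min = prec if right_assoc else prec + 1
        rhs, pos = parse_expression(tokens, pos + 1, next_min)
        value = func(value, rhs)
    return value, pos


def evaluate(text):
    tokens = tokenize(text)
    value, pos = parse_expression(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return value
```

The whole precedence hierarchy lives in one table and one loop, instead of one function per precedence level as in plain recursive descent.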

------
varlock
I'm reading "Programming: Principles and Practice Using C++" and one of the
first programs illustrated is actually a calculator. Being new to parsers in
general, I love to see how two different approaches work: in the book the
author implements the code by creating a formal (BNF-like) grammar first
whereas in this article the author draws diagrams first.

~~~
rednab
In case you didn't know yet, a BNF grammar always is a graph. So the diagram
is just a different way of depicting the same thing.

For example, the (E)BNF¹ for the graph in the article is:

    expr = term { ("+" | "-") term }

where the curly braces indicate zero-or-more repetition.

This actually is important because if your grammar can be written down as a
graph it means it is a _context-free_ grammar².

Parsing is a wonderful subject to study thanks to the depth of the subject and
the breadth of texts available explaining it. Have fun!

¹) [https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_F...](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form)

²) [https://en.wikipedia.org/wiki/Context-free_grammar](https://en.wikipedia.org/wiki/Context-free_grammar)
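
As a rough illustration of how the `expr = term { ("+" | "-") term }` rule above turns into code, here is a small Python sketch: the `{ ... }` repetition becomes a `while` loop. It assumes `term` is a plain integer, and the `Parser` class and `calc` helper are made-up names, not the article's code.

```python
import re


def tokenize(text):
    # Integers, or any single non-space character.
    return re.findall(r"\d+|\S", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Return the current token, or None at end of input."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def term(self):
        # term = integer (an assumption made for this sketch)
        tok = self.peek()
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"expected integer, got {tok!r}")
        self.pos += 1
        return int(tok)

    def expr(self):
        # expr = term { ("+" | "-") term }
        result = self.term()
        while self.peek() in ("+", "-"):  # the { ... } repetition
            op = self.peek()
            self.pos += 1
            right = self.term()
            result = result + right if op == "+" else result - right
        return result


def calc(text):
    parser = Parser(tokenize(text))
    result = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return result
```

With this sketch, `calc("3+3+9")` evaluates to 15, while `calc("3*3")` raises `SyntaxError` because `*` is not part of the grammar.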

~~~
varlock
I honestly didn't know! Thanks for the pointers!

------
lispm
Lisp / CLOS version of the code:

[https://gist.github.com/lispm/cb1e1c9fc75ad34624ed](https://gist.github.com/lispm/cb1e1c9fc75ad34624ed)

    
    
        CL-USER 72 > (calc)
    
        calc> 3+3+9
        15
        calc> 3*3
    
        Error: parse error getting next token
          1 (abort) calc toplevel
          2 Return to level 0.
          3 Return to top loop level 0.
    
        Type :b for backtrace or :c <option number> to proceed.
        Type :bug-form "<subject>" for a bug report template or :? for other options.
    
        CL-USER 73 : 1 > :c 1
    
        calc> 10
        10
        calc> 
        NIL

------
fijal
It's of course a bit far-fetched, but it would not be particularly hard to
convert the interpreter to RPython and get performance/JIT for free.

