Hacker News new | past | comments | ask | show | jobs | submit login

The "scheme in scheme" section is cheating, imo. I like to imagine language implementation on a line between:

- Implement JS in one line with `eval(input)`

- Hand-crafted asm compiler with manual lexing, parsing, codegen, etc

This article is far closer to `(eval input)` than to a real interpreter. The entire front-end is pushed onto the native implementation, and even the tree-walking interpreter just mostly delegates to the host implementation.

I'm not saying that it's useless; it's a very elegant demonstration of how a tree-walking interpreter works at the very simplest level: for each node, evaluate their children, with literals evaluating to themselves and symbols resolving according to their environment. It neatly demonstrates the core principles, but fails to demonstrate any of the parts about language implementation that's actually interesting.

You don't get details (that somebody ultimately has to write) about memory layout, GC implementation, continuations, etc. This is all punted to the host implementation.




While you are right that there are a lot deeper topics, the point of this primer was to be a primer, not a language design piece. It was also to show off one of the more powerful pieces of Lisp/Scheme: that metacircular evaluators (or, as you say, tree walking interpreters) are trivially easy to write partly because of how the language works in a way that is fairly unique to those languages.

If you read SICP, chapter 4 is basically a long-form introduction to the tree walking interpreter I introduce in the document, and chapter 5 gets into how to write a register VM with the lower level details you describe. However chapter 4 is where the most important unlocks of demystifying how programming languages work. Much design of programming languages in lisp has happened within metacircular evaluators first, and the efficient implementation later. Within a few tweaks, you can change your language to a logic programming language like Prolog, or switch from lexical scoping to dynamic scoping. This is power. And it isn't possible in most other programming languages. SICP was recently ported to Javascript, and it's an amazing conversion, but there's an extra step for parsing the source text which turns it into something that doesn't look like javascript anymore, whereas a lispy evaluator looks just like lisp.

Again, this is a primer. Showing off that you can do such a thing in 30 lines of code is something that few primers which are less than 30 pages would dare to do. It's possible to demonstrate in lisp/scheme partly because of lisp's language choices. And it's hopefully useful: I know when I wrote my first metacircular evaluator at the end of reading The Little Schemer, it demystified programming languages for me in a huge way. I hope a bit of that demystification shows up here for others too.


But does it create a good springboard for language design?


Depends on what you mean by language design. The interesting parts of modern language design, imo, is around type systems, invariants that the compiler manages, concurrency models, verification, memory models, etc.

The 30 lines or so just demonstrates on a very high level how you walk a tree. Sure, it's a good demonstration, but if you handed those 30 lines and their accompanying explanation to an intro undergrad class they would still have very little idea about how to go about building interpreters beyond a high-level idea of recursively evaluating subtrees, which I would argue is probably the least interesting part about building interpreters. It's a foundational concept, yes, but not much beyond that.

In my mind a good springboard for language design would be a fast tour through a bunch of interesting languages, of which scheme would be one of many. Haskell, Rust, C, Scheme, Erlang, Smalltalk, OCaml, etc all combined would be a wonderful exploration of both the history and major debates around how programmers write and think about code.

A good springboard around the implementation of languages would be to actually go through and build an interpreter for scheme in C. This can cover lexing, a simple recursive descent parser, tree-walking interpreter, mark-and-sweep GC, TCO detection, CPS transformations, etc.

The natural follow-up is then building the mid-backend targeting a bytecode VM. This will also include the major code analysis areas like tree-shaking, data-flow analysis, dependency analysis, etc. Targeting a real ISA is less interesting imo because at that point you're dealing more with the peculiarities with x64/ARM/whatever than with general code analysis techniques, but there's still valuable lessons to be learnt here.

Anyways, all the long tangent to say that tree-walking interpreters are such a small part of language design and implementation that I personally don't see the value in talking about it too much. Yes, it's elegant, yes, it's fundamental, but the concept is simple and not really that interesting imo.


A tree walking intrepreter can be an excellent way to prototype type systems, concurrency models, verification, etc.

Again, this is a tutorial, for newcomers, which happens to include a tree walking interpreter. I think you've pushed the goalpost pretty far back just because a reasonably interesting one was hit at all.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: