Hacker News new | past | comments | ask | show | jobs | submit login
Show HN: “Statements and State”, the next chapter of my book on interpreters (craftinginterpreters.com)
141 points by munificent on June 2, 2017 | hide | past | favorite | 50 comments

I'll add my usual comment in discussions about interpreter/compiler books: I'd really love to read a modern book or in-depth tutorial on creating a statically-typed functional language, with discussions on pattern matching, type inference, tail call elimination, ADTs, etc.

There are so many great books out there on how to create a lisp, or a typical mutable object-oriented language, but with one notable exception (that's unfinished), there are no approachable online tutorials/books out there that I've found on building compilers or interpreters for functional languages. Only academic papers, a textbook, and one or two books from 20-30 years ago.

Pierce's Types and Programming Languages is a great textbook that covers all of this material, but from an extremely detailed and formal academic perspective. It would be great to see more approachable tutorials or short books online to complement Pierce's text.

I've started writing my own in-depth tutorial on this subject using Scala as the implementation language, but would love to see other tutorials/books as well.

> I'd really love to read a modern book or in-depth tutorial on creating a statically-typed functional language, with discussions on pattern matching, type inference, tail call elimination, ADTs, etc

I'm interested in that too. I've heard good things about Appel's book [1], but haven't read it yet.

> There are so many great books out there on how to create a lisp, or a typical mutable object-oriented language

For what it's worth, I haven't found much about creating object-oriented languages. There's a lot on Lispy dynamicall-typed languages, but I haven't seen much on things like method dispatch, vtables, inheritance, etc.

One of the reasons I'm writing this book is to try to cover that. Even if you don't like OOP languages, I think it's worth knowing more about how they work under the hood since they are so prevalent in the industry.

> I've started writing my own in-depth tutorial on this subject using Scala as the implementation language

Interesting! Is it online yet?

[1]: https://www.cs.princeton.edu/~appel/modern/ml/

"Modern Compiler Implementation" isn't too bad, but it is a general purpose introductory compiler text rather than being dedicated to functional languages. It does briefly touch on subjects like closure conversion and type inference, but doesn't give working code.

Appel's other compiler book, "Compiling with Continuations", goes into more depth while falling short of being a complete tutorial on writing a compiler for ML.

You might find http://esumii.github.io/min-caml/index-e.html interesting. While not as well-presented as munificent's current effort, it does present a working functional language compiler as a tutorial.

Essentials of Programming Languages doesn't cover all of this, but it's a pretty good start (at least on the interpreter side).

What's the notable exception?

Write you a Haskell by Stephen Diehl:


I like this book a lot - especially making it available to read free in HTML. It's good to see practical information about parsing and grammar rather than plainly theoretical - although there's room and a for need both.

Another similar read is Vidar Hokstad's blog series - Writing a compiler in Ruby [1] which was first submitted here 9 years ago! [2]

[1] http://hokstad.com/compiler

[2] https://news.ycombinator.com/item?id=319337

I vividly remember learning about recursive descent parsing in my compilers class and thinking, "Wow, I can use this everywhere!"

Of course in addition to executing an AST you can also generate CODE for it...

> Of course in addition to executing an AST you can also generate CODE for it...

Yes! In the third part of the book [1], we'll do exactly that. The C interpreter's parser generates bytecode as it parses. I haven't written the text yet (that's the hard part...) but you can see the code for it here [2].

[1]: http://www.craftinginterpreters.com/a-bytecode-virtual-machi...

[2]: https://github.com/munificent/craftinginterpreters/blob/mast...

That will be awesome.

Recursive descent is still a great way to implement parsers.

"Recursive descent" is an ancient African expression meaning "I can't write grammars".

Most production compilers still depend on recursive descent because (a) it isn't that much harder compared to the generative approach, and (b) it is the best way to work in real error recovery or to support IDE services.

I haven't had a chance to dive into this book, but I just want to say that Game Programming Patterns[0] is one of the finest programming books I have ever read, 100% worth reading even if you never intend to create a game. Very readable, funny, and full of wisdom. You are the man!

[0] http://gameprogrammingpatterns.com/

Yes, I can second this! Game Programming Patterns is one of my favorite programming books of the last few years.

Mine too! Though some have accused me of bias in that regard...

Thanks for taking your time to commit to this, I was wondering if you had an RSS feed rather than the email subscription option? I tried a few feed readers and couldn't discover RSS or ATOM feeds present. (Although I may have missed it as on my phone at present)

I don't. My blog has one, but the hand-rolled static site generator I use for the book doesn't do that. It's pretty minimal.

My impression (for better or worse) is that few people use RSS these days, so it didn't seem like a great use of my time to get it set up.

If you don't want to join the mailing list for whatever reason (though it is quite low traffic), you could follow me on twitter too (@munificentbob). I tweet whenever a new chapter is up.

Ah I see, if you use Jekyll or Hugo they will generate the feeds for you so I'm assuming it's something else?

Personally I've found that RSS is more popular than ever, especially in the tech space as the number of news sources available these days is so large partly due to the number of individual blogs out there and services like feedly along with wonderful apps like reeder.

> I'm assuming it's something else?

Yes, I hand-rolled a little build script tailored specifically to the book.

Hey that's neat that you rolled your own build for your needs though.

Yeah, it was a deliberate choice to hack together something simple.

Otherwise, I'd spend countless hours tinkering with frameworks and libraries and never actually write the damn book.

This is really a wonderful book. Nystrom's writing is clear and entertaining. I look forward to every new chapter.

I am loving this book so far, always wanted a resource with such details and a 1:1 session like feeling.

Also, to enforce an extra layer of learning, I am writing the interpreter in Go. (book uses Java and C)

The "satisfactory" part is, I've started applying the learning to other problems (writing a transpiler, completely different from the book's context), despite being in the middle of the book.

to enforce an extra layer of learning

That's a very interesting approach. Is that something you usually do when learning, or you just wanted to use Go for this? Anything else you can say about the practice?

This is great writing: very clear and easy to understand!

I feel that dynamic variable lookup is a mistake, though -- it's just so painful to have to wait until runtime to discover you've made a typo. Is supporting mutual recursion really important enough to offset this pain?

Yeah, it's a... choice is the best way to put it. It has upsides and downsides.

I do think supporting mutual recursion is very important. People expect it to simply work, especially in an object-oriented language. It does work out of the box in JavaScript, Python, Ruby, and Lua. (And note that also none of those will detect access to undefined variables until runtime.)

Early versions of my main hobby language Wren did detect this error statically but it ended up interacting poorly with the REPL and I became convinced it didn't provide enough value to justify the static checking.

The main thing that tipped the scales for me was that Scheme works this way. That seems like a pretty reasonable path to follow for a dynamically typed language.

I don't think dynamic variable lookup should be required for mutual recursion. There are plenty of static languages that support mutual recursion.

This topic is covered in a sidebar in the chapter: allowing the top level to be list of statements instead of declarations means that each function definition must happen in order. Otherwise you can end up with troublesome code like this:

    function f() { g() }
    var x = read_user_input()
    function g() { print x }
Static languages typically don't allow the calls to f() and read_user_input() -- maybe that's the right answer here. There are other answers that come to mind as well. But, either way, I don't think we should perpetuate the mistake of dynamic variable lookup.

Ah, this is solved in OCaml using the 'and' keyword:

    let rec even n =
        match n with
        | 0 -> true
        | x -> odd (x-1)
    and odd n =
        match n with
        | 0 -> false
        | x -> even (x-1)
Although personally I've never liked the idiom where everything at the top-level is a statement. Even in OCaml I usually define a 'main' function, and call it at the bottom of the file:

    let main args = do stuff...
    let () = main Sys.argv

Yes, more specifically, the "let rec" is what begins the series of mutually recursive definitions. The "rec" is a clue that this is specifically for supporting this exact recursive case.

> Although personally I've never liked the idiom where everything at the top-level is a statement.

For statically-typed languages, I'm not a huge fan of it either. But for dynamically-typed ones, I think it works out OK.

Typography and layout is also very pleasing.

Thank you! I put more time into this than most people probably realize.

any reason for not putting this book on github you would hopefully get stars and various encouragements and it would be more transparent to the whole community


I should probably link to it from the site...

Beutiful! What tools do you use for generating HTML ?

A pretty simple hand-cobbled-together chunk of Python:


It's more complex than the build script[1] for my previous book[2], but still pretty simple. There's a real luxury in writing a program that literally only has to run on one single set of input data. A whole lot of things get easier.

[1]: https://github.com/munificent/game-programming-patterns/blob...

[2]: http://gameprogrammingpatterns.com/


> Jokes aside, i am VERY MUCH looking forward and EXCITED to see that the next chapter is on implementing a bytecode virtual machine.

The next chapter will be on adding control flow to the Java interpreter. The next part, which is still a handful of chapters away, will begin on the bytecode VM.

The repo is here https://github.com/munificent/craftinginterpreters so you could have looked without the need for 'comedic' guessing.

It contains exactly zero getX/setX methods.

Man, what happened to sense of humor in Hacker News? Has it been deprecated in the current version?

I think you'd have to use actual humor for that. Instead you wrote the programmer equivalent of 'BUTTS LOL' (hilarious all-caps included) except longer.

Umm... You should understand that the use of all-caps in "COBOL", "ADD" and "GIVING" wasn't gratuitous.

Historically, programming languages (early 1950s, COBOL, FORTRAN, etc), not only had acronyms as their names (thus the names in all caps) but the keywords themselves (and thus most of the source code) was written (and supposed to be written) as all-caps: "GOTO", "IF", "THEN", "GOSUB", etc.

Actually, historically only one representation of letters was available and was conventionally presented with uppercase letters. Once lowercase was available, there was a split between languages/implementations that wanted things written in all caps, ones that didn't care how you wrote it but would internally convert to all caps, and ones which were case insensitive (there were newer languages, too, which accepted or requires lowercase for keywords and were case sensitive, but I'm mostly talking about languages which predated having lowercase available and how they coped with it.)

The gratuitous thing is the dumb, tedious and unfunny crack not your COBOL capitalization (although you capitalized other things).

Seriously, take a moment to look at the text and the code. Beside the lengthy verbiage in the text explaining the motivation and choice, the code goes out of its way to be explicit, succinct and clear. All you had to say about it was 'BUTTS', sight unseen. Comedy gold, right there. Wouldn't it be easier to just say you got it wrong?


I thought this would be a book in political statement interpretation. Would be really interested in that! Any suggestions?

Really appreciate the attempt. Thanks! Yet, I don't think that will be an answer.

A very simple example of political speak: Merkel basically said last Sunday: "We can't trust our overseas partners anymore. I've seen that the last few days. We must take our fate in our own hands."

Yet what she meant was: "Trump will let you down. I'm strong enough to defy him. If you want a strong leader that listens, come to me."

Another example is how the G7 fought just about phrasing of a final statement for day and night before. What does the result mean? Who won what? How will that influence decision of other people in the political landscape?

Really curious to learn about that.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact