The path to implementing a programming language

kazinator · on July 18, 2023

> [A tree interpreter] is really only used for extremely simple languages and small DSLs.

I don't agree; maintaining a tree interpreter for the entire language, in parallel with a compiler, allows you to write the compiler and library in that same language, without having a bootstrapping problem (needing an existing installation of the language to rebuild the language). You can bootstrap from some popular language that is readily available everywhere, like C.

You also have the language features specified in three ways: documentation, compiled semantics, interpreted semantics. If two out of three agree, the third is probably wrong.

I find it a great boon in the TXR project that if I'm chasing some problem where something in the library is miscompiled, all I have to do is delete that module's compiled file to have that module be tree-interpreted.

There is also this trick. If in a file that is being compiled we suspect that we have some top-level (expr ...) that is being mishandled by the compiler, we can change it like this:

  (eval '(expr ...))

Now when the compiler sees this, it will compile it to a single function call to eval, which passes the literal code to it. When the compiled file is loaded, then (expr ...) is tree-interpreted, while everything else is compiled.

The tree interpreter can be your bootstrapper and life jacket; don't throw it away too casually.

PartiallyTyped · on July 18, 2023

I agree with this. Property-based testing, where parity between the interpreter and the compiler is checked on arbitrarily complex trees, also seems like a great idea to be honest.
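A minimal sketch of such a parity check, assuming a toy arithmetic language invented for illustration. It uses a hand-rolled random tree generator from the standard library rather than a real property-testing library, and every name here (`interpret`, `compile_tree`, `run`, `check_parity`) is hypothetical:

```python
import random

# Hypothetical toy language: an expression is an int or a tuple (op, left, right).
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def interpret(tree):
    """Tree interpreter: walk the AST directly."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    return OPS[op](interpret(left), interpret(right))

def compile_tree(tree, code=None):
    """Compiler: flatten the AST into stack-machine instructions."""
    if code is None:
        code = []
    if isinstance(tree, int):
        code.append(("push", tree))
    else:
        op, left, right = tree
        compile_tree(left, code)
        compile_tree(right, code)
        code.append(("apply", op))
    return code

def run(code):
    """Execute compiled instructions on an operand stack."""
    stack = []
    for kind, arg in code:
        if kind == "push":
            stack.append(arg)
        else:
            b = stack.pop()
            a = stack.pop()
            stack.append(OPS[arg](a, b))
    return stack.pop()

def random_tree(rng, depth):
    """Generate a randomly nested expression tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randint(-9, 9)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def check_parity(trials=200, seed=0):
    """Return the first tree where the two implementations disagree, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        tree = random_tree(rng, depth=6)
        if interpret(tree) != run(compile_tree(tree)):
            return tree
    return None
```

A failing tree returned by `check_parity` is a ready-made bug report: with documentation as the third opinion, it tells you which of the two implementations to suspect.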

amusingimpala75 · on July 18, 2023

Also reading Crafting Interpreters[1] is a great intro to language implementation.

[1]https://craftinginterpreters.com

seeknotfind · on July 18, 2023

Deciding on interesting or useful semantics and how the language will be used is the most important decision. If you are writing a language, I'd recommend saving parsing, optimization, and code generation for later. Interpreting a hand-coded AST at first will give you the quickest path to playing with your ideas.
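A minimal sketch of what that can look like, assuming a made-up expression language whose node shapes are purely illustrative. The program is written directly as nested tuples, so no lexer or parser is needed to start experimenting with semantics:

```python
# Hypothetical node shapes, all assumptions for illustration:
#   ("num", n)  ("var", name)  ("add", a, b)  ("mul", a, b)
#   ("let", name, value_expr, body_expr)  ("if", cond, then_expr, else_expr)

def evaluate(node, env=None):
    """Evaluate an AST node in an environment mapping names to values."""
    env = {} if env is None else env
    tag = node[0]
    if tag == "num":
        return node[1]
    if tag == "var":
        return env[node[1]]
    if tag == "add":
        return evaluate(node[1], env) + evaluate(node[2], env)
    if tag == "mul":
        return evaluate(node[1], env) * evaluate(node[2], env)
    if tag == "let":
        _, name, value_expr, body = node
        # Bind in a copy so the binding is visible in the body only.
        inner = dict(env)
        inner[name] = evaluate(value_expr, env)
        return evaluate(body, inner)
    if tag == "if":
        _, cond, then_expr, else_expr = node
        # Only the chosen branch is evaluated.
        return evaluate(then_expr if evaluate(cond, env) else else_expr, env)
    raise ValueError(f"unknown node: {tag!r}")

# The "program" is plain data: let x = 2 + 3 in (if x then x * 10 else 0)
program = ("let", "x", ("add", ("num", 2), ("num", 3)),
           ("if", ("var", "x"),
            ("mul", ("var", "x"), ("num", 10)),
            ("num", 0)))
result = evaluate(program)  # 50
```

Changing a semantic rule, such as how `let` scopes its binding, is then a few lines in `evaluate` instead of a change that ripples through a grammar and a code generator.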

tines · on July 18, 2023

Yep, and that's why every language concept should start off as a lisp :)

packetlost · on July 18, 2023

I'll second this. Lisps really are great for prototyping language semantics.

smitty1e · on July 18, 2023

Why not start at Episode IV with a set of JSON documents representing how your new language would have parsed if one were less of a slacker?

That way one can focus on deciding upon a complete set of language components ahead of fretting over such pesky matters as syntax.
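A rough sketch of that idea, assuming an invented JSON shape for the "already parsed" program. The `type`, `op`, `name`, and `args` keys are illustrative assumptions, not any standard AST format:

```python
import json

# Hypothetical JSON shape (an assumption, not a standard):
#   {"type": "num", "value": 1}
#   {"type": "binop", "op": "+", "left": ..., "right": ...}
#   {"type": "call", "name": "max", "args": [...]}
SOURCE = """
{"type": "call", "name": "max", "args": [
  {"type": "binop", "op": "*",
   "left": {"type": "num", "value": 6},
   "right": {"type": "num", "value": 7}},
  {"type": "num", "value": 40}
]}
"""

BINOPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
BUILTINS = {"max": max, "min": min}

def eval_json(node):
    """Evaluate one JSON AST node by dispatching on its "type" field."""
    kind = node["type"]
    if kind == "num":
        return node["value"]
    if kind == "binop":
        return BINOPS[node["op"]](eval_json(node["left"]), eval_json(node["right"]))
    if kind == "call":
        return BUILTINS[node["name"]](*[eval_json(arg) for arg in node["args"]])
    raise ValueError(f"unknown node type: {kind!r}")

ast = json.loads(SOURCE)
value = eval_json(ast)  # max(6 * 7, 40) == 42
```

The set of `type` values doubles as the inventory of language components; a parser written later only has to produce the same documents.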

danielvaughn · on July 18, 2023

I dove into language design a couple of years ago; this would have saved me quite a bit of time. I'll add that I still haven't found a good resource that walks you through transpilation/code-gen.

macintux · on July 18, 2023

Related from last month: Designing a language without a parser

https://news.ycombinator.com/item?id=36591079

samsquire · on July 18, 2023

Thank you for this article.

I'm a beginner to programming language implementation and design, but here's what I learned. What I want to do with this comment is encourage you to start work on your programming language and just "Do Something©", anything! You might have always dreamed of creating a programming language. You can indeed try that! Have faith that you can do something; even if it's simple or incomplete, at least you learned something and gained another skill.

I don't want to be trapped by the idea that building your own programming language is impossibly difficult and that it will never be used so what's the point.

It's so worthwhile.

You can still do something! Effort doesn't have to be wasted! Go and try to write a simple virtual machine: it's just a for loop over instructions that manipulate memory. I wrote a non-bytecode compiler, which just uses List<String> for instructions and HashMap<String, String> for instruction arguments.
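A minimal sketch of such a machine, loosely mirroring the layout described above with a list of opcode strings and a parallel list of string-to-string argument maps. The opcodes (`set`, `add`, `jump_if_less`) and argument keys are made up for illustration, and the loop is a `while` over a program counter rather than a plain `for` so that jumps work:

```python
def run(instructions, arguments):
    """Run a program given as opcode strings plus a parallel list of string dicts."""
    memory = {}
    pc = 0  # program counter: index of the next instruction
    while pc < len(instructions):
        op = instructions[pc]
        args = arguments[pc]
        if op == "set":
            memory[args["dst"]] = int(args["value"])
        elif op == "add":
            memory[args["dst"]] = memory[args["a"]] + memory[args["b"]]
        elif op == "jump_if_less":
            if memory[args["a"]] < memory[args["b"]]:
                pc = int(args["target"])
                continue  # skip the normal pc increment
        else:
            raise ValueError(f"unknown instruction: {op!r}")
        pc += 1
    return memory

# Sum the integers 1..5 with a loop.
instructions = ["set", "set", "set", "set", "add", "add", "jump_if_less"]
arguments = [
    {"dst": "i", "value": "0"},
    {"dst": "total", "value": "0"},
    {"dst": "one", "value": "1"},
    {"dst": "limit", "value": "5"},
    {"dst": "i", "a": "i", "b": "one"},          # i = i + 1
    {"dst": "total", "a": "total", "b": "i"},    # total = total + i
    {"a": "i", "b": "limit", "target": "4"},     # loop while i < limit
]
final_memory = run(instructions, arguments)  # final_memory["total"] == 15
```

Strings everywhere are slow and untyped, but they make the instruction stream trivial to print and inspect, which matters more in a first attempt.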

Andreas Kling built a browser and operating system and Terry Davis built an operating system. Their examples show that someone can in fact learn a lot and do a lot.

I don't want to endlessly design things OR only write implementations. I think you can write lots of ideas down AND spend time implementing things and getting your keyboard busy.

EDIT: What I mean by this is that I want to get something working end-to-end; it doesn't have to be finished on the first attempt, as long as something goes through the entire compiler. I think many projects start on lexing and parsing, try to perfect that, and then never get to code generation or interpretation.

I wrote this incomplete JIT compiler in C, which has a simple, undesigned frontend that resembles JavaScript. ANF is my intermediate representation.
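For readers who haven't met ANF (A-normal form): every intermediate result gets a name, so each operation only has atomic operands. The sketch below is a generic illustration in Python under assumed tuple-shaped expressions; it is not taken from the linked compiler, and `to_anf` is a hypothetical name:

```python
import itertools

def to_anf(expr):
    """Flatten a nested expression into named steps with atomic operands.

    Expressions are ints, variable-name strings, or (op, left, right) tuples.
    Returns (bindings, result) where each binding is (temp, op, left, right).
    """
    counter = itertools.count()
    bindings = []

    def atomize(e):
        if isinstance(e, (int, str)):
            return e  # numbers and variable names are already atomic
        op, left, right = e
        left_atom = atomize(left)
        right_atom = atomize(right)
        name = f"t{next(counter)}"
        bindings.append((name, op, left_atom, right_atom))
        return name

    result = atomize(expr)
    return bindings, result

# (a * 2) + (b - 1) becomes:
#   t0 = a * 2
#   t1 = b - 1
#   t2 = t0 + t1
bindings, result = to_anf(("+", ("*", "a", 2), ("-", "b", 1)))
```

Because each binding is a single operation on names or constants, it maps almost one-to-one onto machine instructions, which is one reason the form is convenient ahead of code generation.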

https://github.com/samsquire/compiler

I wrote a multithreaded imaginary assembly language that sends integers between threads through mailboxes, though it gets nowhere near LMAX Disruptor performance.

I think you should avoid spending too much time on your parser or lexer; use the Kaleidoscope LLVM tutorial to learn how to write recursive descent parsers and move on to code generation. I did mine with switch statements. The more you actually write parsers the easier it gets, but at first, when you have no clue, you CAN just read someone else's implementation. Understand it, then write your own to your own design. If you get analysis paralysis and worry about making a mistake or a suboptimal decision, don't let that stop you: do something suboptimal, but actually do something.
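A minimal sketch of a recursive descent parser for arithmetic, with one method per grammar rule. It is not taken from the Kaleidoscope tutorial; the grammar, the tuple-shaped AST, and all names are assumptions for illustration:

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):
        # expression := term (("+" | "-") term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        # term := factor (("*" | "/") factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        # factor := NUMBER | "(" expression ")"
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return int(token)
        if token == "(":
            node = self.expression()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")

def parse(text):
    """Parse an arithmetic expression into nested (op, left, right) tuples."""
    parser = Parser(tokenize(text))
    tree = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree

tree = parse("1 + 2 * (3 - 4)")  # ("+", 1, ("*", 2, ("-", 3, 4)))
```

Precedence falls out of which rule calls which, and the `while` loops make `-` and `/` left-associative. Once this shape is familiar, extending it is mostly a matter of adding rules.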

I rushed through my compiler to get to the code generation step because my goal was code generation.

My dream: parallel and concurrent language that combines threads and coroutines with efficient interthread communication similar to LMAX Disruptor and allows writing of efficient pipelines that are serialisable like Temporal.io.

danielvaughn · on July 18, 2023

I dove into language design a couple of years ago to try and build my own transpiled DSL. At first I tried reading about it, and quickly became overwhelmed. Took a step back and just started writing code that seemed like it would do what I wanted.

I ended up writing a parser that built an AST, even though I didn't know what an AST was at the time. It's a huge boost to your confidence when you get around to reading the literature and it describes exactly the solution you came up with, like "yes I'm on the right track and I'm not as stupid as I thought I was."

johnea · on July 18, 2023

Just say No!

tines · on July 18, 2023

So many projects that I've started and ideas that I've had have died without ever seeing the light of day because I couldn't say no to the urge to implement "the perfect language" for them. I'm forcing myself to just use existing tools, and I'm finally getting things done again. The problem with language implementation is that it's very fun, and it feels like progress. But for 99.999% of people and projects, it's a distraction, an advanced form of yak shaving. You must choose what you really want to work on, because you can't work on everything.