Thanks to the author for an enjoyable intellectual journey, it's a wonderful series of articles.
Having tried my hand at following the Make a Lisp project ¹, implementing an interpreter in a couple of languages, it's fascinating to see the process of designing the language and compiler from the ground up, discussing various tradeoffs for simplicity, reasons for internal data structure, etc.
In fact, I think I'm learning more about C99 than Lisp - there's something about creating a language (or just following along, like me) that by necessity leads you to the foundations of computing/programming, the concepts that make up the basis of all languages. It's a perfect (meta)subject for an educational project.
Critique: the TOC is only shown in first post and not clickable, makes it hard to navigate the series. It would also be nice to have the TOC repeated in each post.
Question: I saw a chapter about unary functions. In some interpreters let bindings are explained as "sugar", or syntactic extensions on top of function calls. For instance `let` is not present in the core of scheme [1]. So in this way one could transform a let binding from:
(let ((a 1) (b 2)) (+ a b))
to:
(((lambda (a) (lambda (b) (+ a b))) 1) 2)
through:
(define-syntax let
  (syntax-rules ()
    [(_ ((x e) ...) b1 b2 ...)
     ((lambda (x ...) b1 b2 ...) e ...)]))
... then you wouldn't need to compile let bindings, just function application, which some people call "desugaring".
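To make the desugaring concrete, here is a small sketch of such a pass over s-expressions represented as nested Python lists (a stand-in for the macro expander; the function name and representation are illustrative, not from the series):

```python
# Sketch: desugar (let ((a 1) (b 2)) (+ a b)) into ((lambda (a b) (+ a b)) 1 2).
# S-expressions are nested Python lists; symbols are strings.

def desugar(expr):
    """Recursively rewrite let-forms into immediate lambda applications."""
    if not isinstance(expr, list):
        return expr
    if expr and expr[0] == "let":
        _, bindings, body = expr
        names = [name for name, _ in bindings]
        values = [desugar(value) for _, value in bindings]
        return [["lambda", names, desugar(body)]] + values
    return [desugar(e) for e in expr]

print(desugar(["let", [["a", 1], ["b", 2]], ["+", "a", "b"]]))
# → [['lambda', ['a', 'b'], ['+', 'a', 'b']], 1, 2]
```

After this pass, the compiler only ever sees lambda and application, which is the point of the question below.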
I was wondering if this kind of desugaring is used in practical lisp compilers or not... without knowing too much about lisp compilers, I imagine one reason to not go this way is that it could potentially generate way more code than compiling higher level concepts like `let` directly.
Yes, real compilers do handle `((lambda (x) ...) ...)` as well as a `let`, because as a Lisp programmer you want to be able to define and use new macros freely, without having to worry too much about low-level optimizations. Kent Dybvig gave a talk on this sort of thing called the "macro-writer's Bill of Rights". It's a similar concept to the recent talk about what optimizations need to be guaranteed for "zero-cost abstraction" in Rust and C++.
Most Lisp compilers do exactly as you suggest. Generally, Lisp is implemented on a small core of maybe 20-30 functions, with everything else as macros. If you're curious about how Lisps are implemented, I recommend checking out this series: https://www.youtube.com/watch?v=Wa81OJnlsoI
That's really not true of Common Lisp implementations. These have many more than 20-30 functions, and they all have to be there because one can obtain them as values and pass those values around (and, in the case of standard generic functions, add new methods for them). There may be specific expansions for calls to particular standard functions when the types of the arguments are known, or when particular keyword and optional arguments are or are not present.
It's one of the features of Common Lisp that things that might be buried in the compiler in other languages are surfaced and made available to the user. This means a great deal of the language's standard functionality can be implemented as if it were in "user space". Macros, readtables, method combinations, setf expanders and generalized places, for example.
IMO, if Common Lisp were to be extended (it probably won't be, at least officially, as there's no money for another whack at a standard), I think it would be best if even more "internal" features were surfaced this way and made available for user extension.
Re: lambda: it's important to realize that this series is building up incrementally bigger features from nothing. This Lisp compiler won't have a full implementation of closures or even labeled procedures for some time. So right now we're getting used to name binding and using what we have.
While it's possible to generate the same code I did even when rewriting let to lambda, you'll probably need some kind of strength-reduction pass afterward.
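Such a pass would recognize an immediate lambda application and turn it back into a direct-binding form the compiler handles cheaply. A sketch (again over nested-list s-expressions; names are illustrative, not the series' code):

```python
# Sketch: "strength-reduce" ((lambda (x ...) body) v ...) back into a let,
# so the backend can emit direct name bindings instead of a closure call.

def reduce_apply(expr):
    if not isinstance(expr, list):
        return expr
    expr = [reduce_apply(e) for e in expr]  # rewrite subexpressions first
    head = expr[0] if expr else None
    if (isinstance(head, list) and head and head[0] == "lambda"
            and len(head[1]) == len(expr) - 1):
        _, params, body = head
        bindings = [[p, v] for p, v in zip(params, expr[1:])]
        return ["let", bindings, body]
    return expr

print(reduce_apply([["lambda", ["a", "b"], ["+", "a", "b"]], 1, 2]))
# → ['let', [['a', 1], ['b', 2]], ['+', 'a', 'b']]
```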
And anyway, implementing 'letrec' in C/Asm is really not that hard: allocate cells, assign names to them, compute the values. Compare that with the machinery of making closures (flat or linked?) and function invocation.
I personally prefer to have 'letrec' as a primitive and implement non-recursive 'let' on top of it by checking whether any newly bound names are referenced from the bound values.
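That check is just a free-variable scan over the bound values. A minimal sketch (nested-list s-expressions, illustrative names):

```python
# Sketch: a (letrec ...) can be compiled as a plain (let ...) when none of
# the newly bound names occur in any of the bound values.

def references(expr, names):
    """True if expr mentions any of the given names."""
    if isinstance(expr, str):
        return expr in names
    if isinstance(expr, list):
        return any(references(e, names) for e in expr)
    return False

def is_plain_let(bindings):
    names = {name for name, _ in bindings}
    return not any(references(value, names) for _, value in bindings)

print(is_plain_let([["a", 1], ["b", ["+", "a", 1]]]))  # False: b's value uses a
print(is_plain_let([["a", 1], ["b", 2]]))              # True: non-recursive
```

(A real compiler would also have to respect shadowing inside nested binders, which this sketch ignores.)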
I'm not surprised by that. The parent suggested (or I thought that they did) that this was actually the definition of let*. And my point was to show that it does not behave like let* but rather like plain let.
I don't have any questions, but I just started reading it a few days ago, and I'm re-implementing it in Nim (which is an interesting experience). One issue I ran into right away was that my code from step 1 was segfaulting, and the issue was that I accidentally called `mprotect` before copying the memory into the buffer. Easy mistake to make, so you might want to add a warning for people who aren't just copy-pasting the code.
You're more than welcome to your opinion, but I've explicitly outlined that those are non-goals of this series. Maybe eventually I'll get to caching or optimization.
Seems like a good read. But it should explain, or at least say it will cover, how this differs from C, since C makes a distinction between statements and expressions, and why that matters. Don't just throw a bone into the sky and assume we know it will later become a space station.
What? The let way of variable bindings is standard just about everywhere. Why replace it with something that doesn't visually distinguish binding pairs? To my lispy eyes it is harder to navigate than
(let ((a 1) (b 2))
(+ a b))
and harder to implement as a syntax rules macro :)
I have gone the other way. When I first learned macros I spent loads of time trying to make paren-free versions of the standard Scheme forms, and now I can't even be bothered to do the Racket-ism of surrounding my binding clauses with [].
The parens of let let me easily distinguish the binding clauses, and let me navigate them much more easily in Emacs. The lets macro seems like the opposite of that: every odd element except the last is an identifier, and every even element is that identifier's value.
Fewer parens. It was just a trivial example to illustrate the macro.
(lets (a b) (list 1 2) c (+ a 3) (d e f) (some-func a b c) (* d e f))
It destructures automatically. The equivalent traditional macro might look like
(destructuring-bind* [[(a b) (list 1 2)] [c (+ a 3)] [(d e f) (some-func a b c)]] (* d e f))
which is unreadable. And yeah, normally it's indented properly and yada-yada, and people use editor plugins like paredit to make it not such a pain to type, but still. It adds up.
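For what it's worth, the odd/even pairing such a lets macro relies on is easy to sketch (a Python stand-in for the expander logic; names illustrative):

```python
# Sketch: split a flat lets form into (pattern, value) pairs plus a body,
# mirroring (lets (a b) (list 1 2) c (+ a 3) ... body).

def parse_lets(forms):
    *clauses, body = forms
    assert len(clauses) % 2 == 0, "lets needs pattern/value pairs"
    pairs = [[clauses[i], clauses[i + 1]] for i in range(0, len(clauses), 2)]
    return pairs, body

pairs, body = parse_lets([["a", "b"], ["list", 1, 2],
                          "c", ["+", "a", 3],
                          ["d", "e", "f"], ["some-func", "a", "b", "c"],
                          ["*", "d", "e", "f"]])
print(pairs)
print(body)
```

Each pair then becomes one destructuring clause, which is exactly the expansion shown above.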
¹ https://github.com/kanaka/mal