Reader Macros in Common Lisp (2014)

sph · on Nov 20, 2022

Great timing. It's my turn to write a Lisp, and I was just thinking about implementing reader macros this morning :-)

My goal is to create the barest-metal Lisp OS (basically a Lisp REPL with full ring-0 access as the OS) with an asm reader macro to write low-level assembly opcodes you can jmp into.

    (defn add1 (x)
      (declare x int64)
      #asm(
        mov rax, %x
        add rax, 1
      ))

(Just thought of this syntax on the spot, I'm a few dozen hours away from that still.)

moonchild · on Nov 20, 2022

Why a reader macro? Just use s-expressions, like sbcl and ccl.

guenthert · on Nov 20, 2022

Indeed. That would ease making other tools, e.g. a super-optimizer, which generate or modify assembly code.

User23 · on Nov 20, 2022

I’m guessing #ASM does the assembly at read or load time?

Does it desugar to something like this?

    (assemble '((mov rax %x) (add rax 1)))
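
A minimal sketch of how such a read-time desugaring could look in Common Lisp, under loud assumptions: `#!` stands in for the proposed `#asm` (it is one of the dispatch sub-characters reserved for users), the body is already written as s-expressions, and `assemble`, `read-asm`, `*asm-readtable*` and `read-asm-from-string` are hypothetical names, not anything from sph's project. The stand-in `assemble` only returns its argument.

```lisp
;; Hypothetical stand-in: a real assembler would emit machine code here.
(defun assemble (instructions)
  instructions)

;; Reader macro function for #!: read the following list of instructions
;; and wrap it as (assemble (quote <instructions>)).
(defun read-asm (stream subchar arg)
  (declare (ignore subchar arg))
  (list 'assemble (list 'quote (read stream t nil t))))

;; Install the macro in a private copy of the standard readtable,
;; so the global readtable is left untouched.
(defparameter *asm-readtable* (copy-readtable nil))
(set-dispatch-macro-character #\# #\! #'read-asm *asm-readtable*)

(defun read-asm-from-string (string)
  "Read one form from STRING with the #! syntax enabled."
  (let ((*readtable* *asm-readtable*))
    (read-from-string string)))
```

With these definitions, `(read-asm-from-string "#!((mov rax %x) (add rax 1))")` returns the form shown above, so the rewriting happens at read time and any assembling would happen whenever that form is later evaluated.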

Anyhow I think this is a great project. I hope you share it! I’ve often thought bootstrapping an assembler with lisp macros would be an interesting way to build a compiler from the very ground up. Assembler macros are nothing new, but I’ve never heard of a homoiconic implementation.

rst · on Nov 20, 2022

The original LISP's "Lisp Assembly Program" was rather like this -- see Appendix C in the Lisp 1.5 manual, from 1962: https://www.softwarepreservation.org/projects/LISP/book/LISP...

timonoko · on Nov 20, 2022

1970's calling. Been there. Done that: https://news.ycombinator.com/item?id=33517092#33517797

sph · on Nov 20, 2022

Thanks for the inspiration. The fact that you've already done it won't stop me.

Got more details/links to study?

timonoko · on Nov 20, 2022

https://github.com/timonoko/nokolisp_addons/blob/master/asse...

Keyword is the line

     (dos-eval '(debug < TO.DEB > NUL))

("mapc" and "map" had their parameters in wrong order, which annoys me even today)

jlarocco · on Nov 20, 2022

Out of curiosity, why not write a Scheme or Common Lisp compiler? Or just a subset?

sph · on Nov 20, 2022

I want an operating system from first principles. To write the smallest OS with the smallest language and the least amount of restriction.

The goal is being able to replace every single part of the runtime/interpreter if you wish, but the OS itself starts with the fewest functions/features possible. cons, lambda, and the other usual suspects, a way of running actual native machine code, the homoiconicity of LISP, a runtime-editable image, packed into a UEFI binary.

You could build yourself a Scheme or a Common Lisp on top of it if you wish.

jlarocco · on Nov 21, 2022

> You could build yourself a Scheme or a Common Lisp on top of it if you wish.

No offense, but if it were Common Lisp I would take a look, but I don't have much interest in learning a one off language.

Sounds like you're having so much fun, though, so good luck!

macoovacany · on Nov 20, 2022

https://movitz.common-lisp.dev/

dgan · on Nov 20, 2022

so I started learning common lisp lately. It's insane how one can combine very high level language, with down-to-metal compilation. I think that's the only language allowing it? Absolutely impressive. Feels like writing python but compiling to native

krastanov · on Nov 20, 2022

Julia has similar capabilities, probably because it was heavily inspired by lisps. You can even modify Julia's compiler from within Julia. I often write code at a python level of abstraction and then use julia introspection to check the machine code that was generated.

celeritascelery · on Nov 21, 2022

So if I understand correctly, in Julia you programmatically look at generated machine code? Is there a way to modify it, or is just for making sure some optimizations were applied?

eigenspace · on Nov 21, 2022

We can modify the code at a few different levels. The easiest level is our untyped intermediate representation. The next easiest level is to modify things at the level of the LLVM code which is basically one step above assembler, and almost always better to work on than direct machine code (also machine code can be embedded in LLVM code if you need to). You can also use https://github.com/YingboMa/AsmMacro.jl if you like.

We are also working out interfaces to make it easier to programmatically work on our typed IR through a technique and set of interfaces known as "abstract interpretation".

adgjlsfhk1 · on Nov 21, 2022

@code_native just lets you look at generated code, but Julia also uses macros frequently to give the compiler hints about how to compile your code. Some examples are @inbounds, which disables bounds checks; @fastmath, which is the local version of C/Fortran's -ffast-math; and @simd, which lets the compiler assume it can re-order loops (it will do so anyway if it can prove you won't notice). If you need more fine-grained control (which is very rare) you can also emit LLVM IR (or direct assembly) directly.

shele · on Nov 21, 2022

Some abstractions are costly if the compiler doesn't optimize them away, so one use is to check if that happens. So one iterates changing the Julia code, not the machine code mostly.

vitiral · on Nov 20, 2022

Forth is even more "down to the metal" than lisp, but yes I agree it is quite impressive.

vitiral · on Nov 20, 2022

Okay let me get this straight...

Are you saying that I don't need to use parentheses as much in lisp? Instead I could define a few "reader macros" and then immediately have a different syntax?

Say if I wanted to change

    (if (test-clause) (action1)
    (action2))

To

    if (test-clause) do (action1)
    else (action2)

Is this actually possible inline in lisp, meaning I just need to import a library that implements the if-reader-macro?

kazinator · on Nov 21, 2022

Yes; for instance see the CMU infix.cl file from the 1990s:

https://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/...

It has test cases at the bottom, like checking that f(a)*=g(b) expands to (setf (f a) (* (f a) (g b))).

The module installs the syntax under the #i dispatch characters; you wrap the infix expression with #I( ... )

fiddlerwoaroof · on Nov 21, 2022

This works in many Common Lisp implementations: https://readable.sourceforge.io

vitiral · on Nov 21, 2022

Cool stuff, thanks.

sph · on Nov 20, 2022

For this concept of user-defined languages/DSLs, look no further than Racket.

timonoko · on Nov 20, 2022

Apropos. Are there reader macro definitions for "(" and ")" ?

Or is Gödel rolling in his grave for such insolent heresy?

stassats · on Nov 20, 2022

Sure, what's so special about it?

(get-macro-character #\() => SB-IMPL::READ-LIST

(get-macro-character #\)) => SB-IMPL::READ-RIGHT-PAREN (that one just signals an error, the read-list one picks up the closing #\))

timonoko · on Nov 20, 2022

In other words : You cannot redefine ")" because it cannot be undefined while reading the new definition.

thelopa · on Nov 20, 2022

That’s completely incorrect. The parent comment has it right. In normal usage, the meaning of #\) is entirely determined by the implementation for the #\( reader macro. This happens because the reader macro for #\( keeps consuming characters until it has located AND consumed the matching #\). The reader would only run the #\) reader macro if it encountered a #\) that didn’t have a corresponding #\(. That’s what they meant when they said “it signals an error”. Specifically, it signals that there are unbalanced parentheses.

If you redefine the #\) reader macro, you could use it in a top-level context. That’s probably not a good idea since it’s easy to accidentally have too many #\) when closing a deeply nested expression and thus accidentally invoke the macro.

Aside from signaling an error, the #\) reader macro is also useful because it changes the rules when reading symbols. Basically, if you write (+ foo bar), the existence of the #\) macro helps the reader know that you’re referencing the “bar” symbol rather than the “bar)” symbol.

Generally, when people define new balanced-pair syntax for Common Lisp, (such as a #\{ macro for hash tables) they will follow the same pattern and define a corresponding reader macro for the closing side that always signals an error for all the same reasons.
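
A minimal sketch of that balanced-pair pattern, assuming a `{key value ...}` syntax for hash tables. The names `make-table`, `read-brace-table` and `*brace-readtable*` are made up for illustration; reusing the `#\)` function for the closing character is the same idiom the CLHS uses in its `read-delimited-list` example.

```lisp
;; Hypothetical helper: build an EQUAL hash table from alternating keys and values.
(defun make-table (&rest keys-and-values)
  (let ((table (make-hash-table :test #'equal)))
    (loop for (key value) on keys-and-values by #'cddr
          do (setf (gethash key table) value))
    table))

;; Reader macro function for {: READ-DELIMITED-LIST keeps reading objects
;; until it has consumed the closing }, then we return a MAKE-TABLE call.
(defun read-brace-table (stream char)
  (declare (ignore char))
  (cons 'make-table (read-delimited-list #\} stream t)))

;; Work on a private copy of the standard readtable.
(defparameter *brace-readtable* (copy-readtable nil))
(set-macro-character #\{ #'read-brace-table nil *brace-readtable*)
;; Give } the same function as ), so it is a terminating macro character
;; (the 2 in {:b 2} ends before it) and a stray } is handled like a stray ).
(set-macro-character #\} (get-macro-character #\)) nil *brace-readtable*)
```

With `*readtable*` bound to `*brace-readtable*`, the text `{:a 1 :b 2}` reads as the form `(make-table :a 1 :b 2)`; the closing brace is consumed by the function for the opening one and never reaches its own reader macro.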

Edit: also, as others have pointed out, you seem to be mistakenly assuming that the redefinition takes effect mid-way through reading the expression. That’s not how CL works. CL cleanly separates the process of executing code into a few distinct phases. First, the reader reads an entire expression (“form” in lisp terms). Then, that form is macroexpanded (traditional macros, not reader macros!) as needed before (optionally) being compiled and then executed.

The change to the read table would happen during the execution phase — well ordered after the original characters for that form are out of the picture.

You COULD force a change to the readtable mid-way through reading a form using the #. reader macro, but that definitely gets into chainsaw-juggling territory.

timonoko · on Nov 20, 2022

Ok. I understood my error in about 5.3 seconds but let it stay, because it was funny, and would trigger somebody to generate a wall-chart of explanations.

Anyways already in 1982 I was diabolically opposed to this kind of shit. Read and Print should be as simple as possible and always one-to-one. When you print something to disk, that is what you get when reading, no additional adjustment needed.

If you want reading macros, for example, you make your own Read. Better Read could even be in standard package "Additional-macros-for-common-lisp".

TeMPOraL · on Nov 20, 2022

> If you want reading macros, for example, you make your own Read. Better Read could even be in standard package "Additional-macros-for-common-lisp".

It makes little sense to reimplement (and maintain over time) the whole Read if you only care about changing some small aspect of it. Instead, you can consider standard Read mechanism from CL to be extensible - reader macros are plugins/hooks/customization points/whatever you want to call it. You can maintain 1:1 Read/Print compatibility easily by defining/overriding a matching printer method, which is the Print side plugin/hook/customization point/whatever.
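
A small sketch of such a matched pair, assuming a made-up `#?(x y)` syntax for a hypothetical `point` class. `read-point` and `*point-readtable*` are illustrative names only, and `#?` is one of the dispatch sub-characters left free for users.

```lisp
(defclass point ()
  ((x :initarg :x :reader point-x)
   (y :initarg :y :reader point-y)))

;; Read side: #?(x y) becomes a POINT instance at read time.
(defun read-point (stream subchar arg)
  (declare (ignore subchar arg))
  (destructuring-bind (x y) (read stream t nil t)
    (make-instance 'point :x x :y y)))

;; Install the syntax in a private copy of the standard readtable.
(defparameter *point-readtable* (copy-readtable nil))
(set-dispatch-macro-character #\# #\? #'read-point *point-readtable*)

;; Print side: emit exactly the text that READ-POINT accepts.
(defmethod print-object ((p point) stream)
  (format stream "#?(~S ~S)" (point-x p) (point-y p)))
```

Printing a point with `prin1-to-string` and reading the result back under `*point-readtable*` gives a new point with the same coordinates, which is the one-to-one property asked for above.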

The only thing that could be simpler than this would be some magic that lets you automatically derive a read macro and a printing method from a simple declaration, but for that you'd have to sacrifice Turing completeness.

In my experience, reader macros are just freaking people out for unconscious and irrational reasons. I can tell because they still freak me out a little, even though I'm conceptually fine with them.

creepycrawler · on Nov 20, 2022

> Aside from signaling an error, the #\) reader macro is also useful because it changes the rules when reading symbols.

This is wrong. See http://www.lispworks.com/documentation/lw50/CLHS/Body/02_ad....

thelopa · on Nov 20, 2022

> A macro character is either terminating or non-terminating. The difference between terminating and non-terminating macro characters lies in what happens when such characters occur in the middle of a token. If a non-terminating macro character occurs in the middle of a token, the function associated with the non-terminating macro character is not called, and the non-terminating macro character does not terminate the token's name; it becomes part of the name as if the macro character were really a constituent character. A terminating macro character terminates any token, and its associated reader macro function is called no matter where the character appears. The only non-terminating macro character in standard syntax is sharpsign.

http://www.lispworks.com/documentation/lw50/CLHS/Body/02_add...

That is the system I was alluding to. It’s been a few years since I’ve done anything with CL and so it seems my memory was slightly off.
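
The quoted rule can be checked from the REPL. A short sketch, with `read-standard` and `non-terminating-p` as hypothetical helper names; both use only standard syntax.

```lisp
(defun read-standard (string)
  "Read one object from STRING with standard syntax.
Returns the object and the index of the first character not read."
  (let ((*readtable* (copy-readtable nil)))
    (read-from-string string)))

(defun non-terminating-p (char)
  "True if CHAR is a non-terminating macro character in standard syntax."
  ;; A readtable argument of NIL designates the standard readtable;
  ;; the second return value is the non-terminating flag.
  (nth-value 1 (get-macro-character char nil)))

;; # in the middle of a token stays part of the token:
;;   (read-standard "foo#bar")  reads the single symbol FOO#BAR
;; ) in the middle of a token ends the token:
;;   (read-standard "foo)bar")  reads FOO and stops at index 3
```

So it is the terminating flag in the readtable entry for `#\)`, not the behaviour of its reader macro function, that keeps `bar)` from being read as one symbol.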

creepycrawler · on Nov 20, 2022

Sure you can.

    CL-USER> (set-macro-character #\) (get-macro-character #\())
    T
    CL-USER> )format t "Hello")
    Hello
    NIL

aidenn0 · on Nov 20, 2022

No, it can totally be redefined. Reading happens before evaluation

sph · on Nov 20, 2022

It would be a noop. A reader macro that takes an s-expr bounded by parentheses, and returns an s-expr, literally is just the identity function^Hmacro.

stassats · on Nov 20, 2022

It's not. Reader macros don't take s-expressions, they take characters.
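
A minimal sketch of what that means in practice, assuming a made-up `[...]` syntax that returns its raw contents as a string. `read-raw-text` and `*raw-readtable*` are illustrative names.

```lisp
;; Reader macro function for [: it receives the input stream and pulls
;; characters off it one at a time; nothing is parsed as an s-expression.
(defun read-raw-text (stream char)
  (declare (ignore char))
  (with-output-to-string (out)
    (loop for c = (read-char stream t nil t)
          until (char= c #\])
          do (write-char c out))))

;; Install the syntax in a private copy of the standard readtable.
(defparameter *raw-readtable* (copy-readtable nil))
(set-macro-character #\[ #'read-raw-text nil *raw-readtable*)
```

Under `*raw-readtable*`, the text `[mov rax, 1 (((]` reads as the string `"mov rax, 1 ((("`, commas and unbalanced parentheses included, which is why a reader macro can accept input that is not valid s-expression syntax at all.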