Great timing. It's my turn to write a Lisp, and I was just thinking about implementing reader macros this morning :-)
My goal is to create the barest-metal Lisp OS (basically Lisp REPL with full ring-0 access, the OS) with an asm reader macro to write low level assembly opcodes you can jmp into.
I’m guessing #ASM does the assembly at read or load time?
Does it desugar to something like this?
(assemble '((mov rax %x) (add rax 1)))
Anyhow, I think this is a great project. I hope you share it! I’ve often thought bootstrapping an assembler with Lisp macros would be an interesting way to build a compiler from the very ground up. Assembler macros are nothing new, but I’ve never heard of a homoiconic implementation.
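For what it's worth, here is a minimal sketch of how that desugaring could look in Common Lisp. Everything in it is an assumption for illustration: `#A` is already taken by arrays in standard syntax, so the sketch uses `#!` (reserved for users) instead of #ASM, and `assemble` is a hypothetical stand-in that just returns its instruction list. The reader macro runs at read time but only builds the `(assemble '(...))` form; the assembling itself would happen whenever that form is evaluated.

```lisp
;; Hypothetical stand-in: a real ASSEMBLE would emit machine code.
(defun assemble (instructions)
  instructions)

;; Dispatch reader macro function: called with the stream, the
;; sub-character (#\!) and an optional numeric argument.
(defun read-asm (stream subchar arg)
  (declare (ignore subchar arg))
  ;; Read the next form (the instruction list) and wrap it in a
  ;; quoted ASSEMBLE call.
  (list 'assemble (list 'quote (read stream t nil t))))

;; Install the macro in a copy of the standard readtable so the
;; global readtable stays untouched.
(defvar *asm-readtable* (copy-readtable nil))
(set-dispatch-macro-character #\# #\! #'read-asm *asm-readtable*)

(defun read-asm-example ()
  (let ((*readtable* *asm-readtable*))
    (read-from-string "#!((mov rax %x) (add rax 1))")))
```

Calling `(read-asm-example)` returns the form `(assemble '((mov rax %x) (add rax 1)))` without evaluating it.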
I want an operating system from first principles. To write the smallest OS with the smallest language and the least amount of restriction.
The goal is being able to replace every single part of the runtime/interpreter if you wish, but the OS itself starts with the fewest functions/features possible. cons, lambda, and the other usual suspects, a way of running actual native machine code, the homoiconicity of LISP, a runtime-editable image, packed into a UEFI binary.
You could build yourself a Scheme or a Common Lisp on top of it if you wish.
So I started learning Common Lisp lately. It's insane how one can combine a very high-level language with down-to-the-metal compilation. I think that's the only language allowing it? Absolutely impressive. Feels like writing Python but compiling to native code.
Julia has similar capabilities, probably because it was heavily inspired by lisps. You can even modify Julia's compiler from within Julia. I often write code at a python level of abstraction and then use julia introspection to check the machine code that was generated.
So if I understand correctly, in Julia you programmatically look at generated machine code? Is there a way to modify it, or is it just for making sure some optimizations were applied?
We can modify the code at a few different levels. The easiest level is our untyped intermediate representation. The next easiest level is to modify things at the level of the LLVM code which is basically one step above assembler, and almost always better to work on than direct machine code (also machine code can be embedded in LLVM code if you need to). You can also use https://github.com/YingboMa/AsmMacro.jl if you like.
We are also working out interfaces to make it easier to programmatically work on our typed IR through a technique and set of interfaces known as "abstract interpretation".
@code_native just lets you look at generated code, but Julia also uses macros frequently to give the compiler hints about how to compile your code. Some examples are @inbounds, which disables bounds checks; @fastmath, which is the local version of C/Fortran's -ffast-math (Julia's own global flag is --math-mode=fast); and @simd, which lets the compiler assume it can reorder loop iterations (it will do so anyway if it can prove you won't notice). If you need more fine-grained control (which is very rare) you can also emit LLVM IR (or inline assembly) directly.
Some abstractions are costly if the compiler doesn't optimize them away, so one use is to check whether that happens. So one mostly iterates on the Julia code, not the machine code.
Are you saying that I don't need to use parentheses as much in Lisp? Instead I could define a few "reader macros" and then immediately have a different syntax?
Say if I wanted to change
(if (test-clause) (action1)
(action2))
To
if (test-clause) do (action1)
else (action2)
Is this actually possible inline in lisp, meaning I just need to import a library that implements the if-reader-macro?
That’s completely incorrect. The parent comment has it right. In normal usage, the meaning of #\) is entirely determined by the implementation for the #\( reader macro. This happens because the reader macro for #\( keeps consuming characters until it has located AND consumed the matching #\). The reader would only run the #\) reader macro if it encountered a #\) that didn’t have a corresponding #\(. That’s what they meant when they said “it signals an error”. Specifically, it signals that there are unbalanced parentheses.
If you redefine the #\) reader macro, you could use it in a top-level context. That’s probably not a good idea since it’s easy to accidentally have too many #\) when closing a deeply nested expression and thus accidentally invoke the macro.
Aside from signaling an error, the #\) reader macro is also useful because it changes the rules when reading symbols. Basically, if you write (+ foo bar), the existence of the #\) macro helps the reader know that you’re referencing the “bar” symbol rather than the “bar)” symbol.
Generally, when people define new balanced-pair syntax for Common Lisp (such as a #\{ macro for hash tables), they will follow the same pattern and define a corresponding reader macro for the closing side that always signals an error, for all the same reasons.
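A minimal sketch of that pattern, assuming a made-up `{key value ...}` syntax for `equal` hash tables (the names `read-hash-table` and `*brace-readtable*` are hypothetical). The #\{ function consumes everything up to and including the matching #\}, and #\} simply reuses the standard handler for #\), so it terminates tokens and complains when it shows up unmatched:

```lisp
(defun read-hash-table (stream char)
  (declare (ignore char))
  ;; READ-DELIMITED-LIST reads forms until it has found AND consumed
  ;; the closing brace.
  (let ((items (read-delimited-list #\} stream t))
        (table (gensym "TABLE")))
    ;; Expand into code that builds the table when evaluated.
    `(let ((,table (make-hash-table :test 'equal)))
       ,@(loop for (key value) on items by #'cddr
               collect `(setf (gethash ,key ,table) ,value))
       ,table)))

;; Work on a copy of the standard readtable.
(defvar *brace-readtable* (copy-readtable nil))
(set-macro-character #\{ #'read-hash-table nil *brace-readtable*)
;; Give } the same reader macro function as ) in the standard
;; readtable: a terminating macro that errors when unmatched.
(set-macro-character #\} (get-macro-character #\) nil)
                     nil *brace-readtable*)

(defun brace-example ()
  (let ((*readtable* *brace-readtable*))
    (eval (read-from-string "{\"a\" 1 \"b\" 2}"))))
```

Because #\} is a terminating macro character here, the `2` in `2}` is read as the number 2 rather than as part of a symbol named `2}`, which is the same job #\) does for `bar)`.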
Edit: also, as others have pointed out, you seem to be mistakenly assuming that the redefinition takes effect midway through reading the expression. That’s not how CL works. CL cleanly separates the process of executing code into a few distinct phases. First, the reader reads an entire expression (“form” in Lisp terms). Then, that form is macroexpanded (traditional macros, not reader macros!) as needed before (optionally) being compiled and then executed.
The change to the read table would happen during the execution phase — well ordered after the original characters for that form are out of the picture.
You COULD force a change to the readtable mid-way through reading a form using the #. reader macro, but that definitely gets into chainsaw-juggling territory.
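To make the phase separation concrete, here is a small sketch (the `*events*` variable and the function name are made up for illustration, and it relies on `*read-eval*` having its default value of T). The `#.` form runs while the reader is still consuming the text; the rest of the form only runs later, when the finished form is evaluated:

```lisp
(defvar *events* '())

(defun phase-order-example ()
  (setf *events* '())
  (let ((form (read-from-string
               "(progn (push :run *events*) #.(progn (push :read *events*) 42))")))
    ;; Reading is done: the #. part has already executed, so *EVENTS*
    ;; is (:READ) and FORM is (PROGN (PUSH :RUN *EVENTS*) 42).
    (eval form)
    ;; Only now has the ordinary PUSH run.
    (reverse *events*)))
```

`(phase-order-example)` returns `(:READ :RUN)`: the read-time side effect comes first even though it appears later in the text.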
Ok. I understood my error in about 5.3 seconds but let it stay, because it was funny, and would trigger somebody to generate a wall-chart of explanations.
Anyway, already in 1982 I was diametrically opposed to this kind of shit. Read and Print should be as simple as possible and always one-to-one. When you print something to disk, that is what you get when reading, no additional adjustment needed.
If you want reading macros, for example, you make your own Read. Better Read could even be in standard package "Additional-macros-for-common-lisp".
> If you want reading macros, for example, you make your own Read. Better Read could even be in standard package "Additional-macros-for-common-lisp".
It makes little sense to reimplement (and maintain over time) the whole Read if you only care about changing some small aspect of it. Instead, you can consider standard Read mechanism from CL to be extensible - reader macros are plugins/hooks/customization points/whatever you want to call it. You can maintain 1:1 Read/Print compatibility easily by defining/overriding a matching printer method, which is the Print side plugin/hook/customization point/whatever.
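As a rough sketch of that pairing (the `point` class, the `#[x y]` syntax and all names are invented for illustration; `#[` is one of the dispatch characters reserved for users): one reader macro on the Read side, one `print-object` method on the Print side, and the printed text reads back as an equivalent object.

```lisp
(defclass point ()
  ((x :initarg :x :reader point-x)
   (y :initarg :y :reader point-y)))

;; Read side: #[x y] builds a POINT at read time.
(defun read-point (stream subchar arg)
  (declare (ignore subchar arg))
  (destructuring-bind (x y) (read-delimited-list #\] stream t)
    (make-instance 'point :x x :y y)))

;; Print side: emit exactly the syntax the reader macro accepts.
(defmethod print-object ((p point) stream)
  (format stream "#[~S ~S]" (point-x p) (point-y p)))

(defvar *point-readtable* (copy-readtable nil))
(set-dispatch-macro-character #\# #\[ #'read-point *point-readtable*)
;; Make ] behave like ) so it terminates tokens such as the 2 in "2]".
(set-macro-character #\] (get-macro-character #\) nil)
                     nil *point-readtable*)

(defun point-round-trip (text)
  (let ((*readtable* *point-readtable*))
    (prin1-to-string (read-from-string text))))
```

With these definitions, `(point-round-trip "#[1 2]")` gives back the string `"#[1 2]"`.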
The only thing that could be simpler than this would be some magic that lets you automatically derive a read macro and a printing method from a simple declaration, but for that you'd have to sacrifice Turing completeness.
In my experience, reader macros are just freaking people out for unconscious and irrational reasons. I can tell because they still freak me out a little, even though I'm conceptually fine with them.
> A macro character is either terminating or non-terminating. The difference between terminating and non-terminating macro characters lies in what happens when such characters occur in the middle of a token. If a non-terminating macro character occurs in the middle of a token, the function associated with the non-terminating macro character is not called, and the non-terminating macro character does not terminate the token's name; it becomes part of the name as if the macro character were really a constituent character. A terminating macro character terminates any token, and its associated reader macro function is called no matter where the character appears. The only non-terminating macro character in standard syntax is sharpsign.
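The difference is easy to see at a REPL with the standard readtable. A small sketch (the two helper function names are made up): sharpsign in the middle of a token just becomes part of the symbol name, while single quote, a terminating macro character, ends the token.

```lisp
(defun token-with-sharpsign ()
  ;; # is non-terminating: the whole string is one token.
  (multiple-value-bind (symbol position) (read-from-string "foo#bar")
    (list (symbol-name symbol) position)))

(defun token-with-quote ()
  ;; ' is terminating: the token ends after "foo" and the quote is
  ;; left unread, so POSITION points at it.
  (multiple-value-bind (symbol position) (read-from-string "foo'bar")
    (list (symbol-name symbol) position)))

;; The second value of GET-MACRO-CHARACTER is true for
;; non-terminating macro characters.
(defun non-terminating-p (char)
  (nth-value 1 (get-macro-character char (copy-readtable nil))))
```

Here `(token-with-sharpsign)` returns `("FOO#BAR" 7)`, `(token-with-quote)` returns `("FOO" 3)`, and `non-terminating-p` is true for `#\#` but false for `#\'`.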
My goal is to create the barest-metal Lisp OS (basically Lisp REPL with full ring-0 access, the OS) with an asm reader macro to write low level assembly opcodes you can jmp into.
(Just thought of this syntax on the spot, I'm a few dozen hours away from that still.)