Nice to see some ideas from Newspeak being played with. Another idea which comes to mind whenever I see 'low-level Smalltalk' is the Id object model[1], where objects are opaque blobs of memory preceded by a pointer to a vtable. Vtables are themselves objects, with a bootstrapping dance to 'tie the knot'.
Id ties the metastability knot the AMOP way, by explicitly branching in the base case. The only instance of avoiding the metaobject recursion through the cache that I’ve encountered before this article is Tony Finch’s unimplemented(?) cobj[1].
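For concreteness, here is a minimal C sketch of both ideas; the names (oop, obj_alloc, bind, send) and the toy method dictionary are illustrative, not Piumarta's or Finch's actual API. An object pointer points just past a hidden vtable pointer, the vtable of vtables is made its own vtable, and bind branches explicitly when it reaches that base vtable instead of recursing through another send:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef void *oop;                     /* an object reference */
    typedef oop (*method_t)(oop rcv, oop arg);

    /* The hidden vtable pointer sits one word before the object. */
    #define VT(obj) (((oop *)(obj))[-1])

    struct vtable {                        /* payload of a vtable object */
        const char *name[8];               /* toy method dictionary */
        method_t    fn[8];
        int         n;
    };

    static oop vtable_vt;                  /* vtable of vtables: the base case */

    static oop obj_alloc(oop vt, size_t size) {
        oop *p = calloc(1, sizeof(oop) + size);
        p[0] = vt;
        return (oop)(p + 1);
    }

    /* Plain lookup, used only at the base of the metaobject tower. */
    static method_t base_lookup(oop vt, const char *msg) {
        struct vtable *v = vt;
        for (int i = 0; i < v->n; i++)
            if (strcmp(v->name[i], msg) == 0) return v->fn[i];
        return NULL;
    }

    static method_t bind(oop rcv, const char *msg);

    static oop send(oop rcv, const char *msg, oop arg) {
        method_t m = bind(rcv, msg);
        return m ? m(rcv, arg) : NULL;
    }

    /* "lookup" installed as an ordinary method on vtables.
       (Function/data pointer casts are nonportable but conventional.) */
    static oop vt_lookup(oop rcv, oop arg) {
        return (oop)base_lookup(rcv, (const char *)arg);
    }

    /* Binding is itself a message send to the vtable, except for the
       explicit branch at the base case that cuts the recursion. */
    static method_t bind(oop rcv, const char *msg) {
        oop vt = VT(rcv);
        if (vt == vtable_vt)
            return base_lookup(vt, msg);
        return (method_t)send(vt, "lookup", (void *)msg);
    }

    static oop hello(oop rcv, oop arg) { (void)arg; puts("hello"); return rcv; }

    int main(void) {
        /* Tie the knot: the vtable of vtables is its own vtable. */
        vtable_vt = obj_alloc(NULL, sizeof(struct vtable));
        VT(vtable_vt) = vtable_vt;
        struct vtable *vv = vtable_vt;
        vv->name[0] = "lookup"; vv->fn[0] = vt_lookup; vv->n = 1;

        oop point_vt = obj_alloc(vtable_vt, sizeof(struct vtable));
        struct vtable *pv = point_vt;
        pv->name[0] = "hello"; pv->fn[0] = hello; pv->n = 1;

        oop point = obj_alloc(point_vt, 2 * sizeof(long));
        send(point, "hello", NULL);        /* prints "hello" */
        return 0;
    }

Because everything above the base case goes through send, a vtable with a different "lookup" method changes dispatch for its instances, which is where the metaobject flexibility comes from.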
> We believe an implementation written entirely in the language it implements has some engineering benefits. The first is that a programmer does not need to know any other language in order to modify any part of their implementation.
Scheme48 was bootstrapped in a restricted Scheme in which only those closures that could be stack-allocated were allowed. The implementors called this restricted Scheme PreScheme, and it was pretty readable.
The difference here is that Zero Feet uses the same language and implementation framework for all code, but with capabilities and restrictions chosen per module, rather than fixed in a sub-language.
Indeed ZF relies on stack allocation, but this is done "optimistically" and for all code.
"The compiler can be similar to the compiler we proposed in I don't want to go to Chel-C, which merely replaces each bytecode instruction with a set of native instructions. "
I'm just wondering...with modern CPUs, isn't this just a version of subroutine threading [1] that doesn't take advantage of modern pipelined CPU hardware and also blows up instruction caches? Why not go for subroutine threading?
Co-author here; the first goal is to work at all, not to be awfully fast. But the branches that appear would most likely be due to inline caching, which are very predictable in practice; [1] claims a 95% hit rate for a simple inline cache. And given that there is little to inline, this scheme probably does approximate subroutine threading.
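For illustration, here is a toy of the portable end of that spectrum; the three-instruction VM below is hypothetical, not the proposal's instruction set. A template compiler would paste each handler's body (or a native call to it) inline per bytecode; the C approximation here, sometimes called call threading, dispatches one indirect call per instruction, which is the same shape as subroutine threading without emitting native code:

    #include <stdio.h>

    typedef struct { long stack[16]; int sp; } vm_t;

    /* One handler per bytecode instruction. */
    static void op_push1(vm_t *vm) { vm->stack[vm->sp++] = 1; }
    static void op_add(vm_t *vm)   { vm->sp--; vm->stack[vm->sp - 1] += vm->stack[vm->sp]; }
    static void op_print(vm_t *vm) { printf("%ld\n", vm->stack[--vm->sp]); }

    typedef void (*op_t)(vm_t *);

    int main(void) {
        /* The "compiled" program for 1 + 1, then print. */
        op_t program[] = { op_push1, op_push1, op_add, op_print };
        vm_t vm = { .sp = 0 };
        for (size_t i = 0; i < sizeof program / sizeof *program; i++)
            program[i](&vm);
        return 0;
    }

True subroutine threading would emit the sequence of native call instructions instead of looping over the table, letting the CPU's return-address predictor handle the dispatch.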
I was simply assuming that the average instruction in a Lisp implementation would expand to a longer sequence of native instructions than the average instruction in a translation of C code. The closest thing that comes to mind is the CLISP VM, which has fairly heavyweight instructions. The proposal doesn't give me an idea of how fine-grained the proposed instructions would be, or I must have missed it if it does.
Indeed it doesn't mention the instruction set. The closest comparison would be Self, which only has instructions to perform message sends and to read particular VM registers; these boil down to calls and pushes respectively. That part of the semantics of the proposed language is closer to its Smalltalk heritage than to Lisp; the expression orientation, the syntax and, for lack of a better term, the "nesting" are the Lisp influence. For example, where Smalltalk provides variable declarations at the start of a method, we intend to provide LET/LETREC forms instead.
I would agree that CLISP bytecode would be heavier; this approach (a "template compiler"?) has been used as a baseline JIT, e.g. in Jalapeño for the JVM, with some success, but being a baseline JIT it is not intended to be that fast either.
This discussion brings to mind Niklaus Wirth's approach to evolving a compiler: the self-compilation performance should remain near constant even as the size of the code base grows.
Putting aside the considerable differences between Pascal/Modula/Oberon era languages and those of today, the primary purpose of any compiler has not changed: it is to read a source language, analyze it and produce executable machine code.
> Implementations which are written in another language and use an interpreter also have a very large difference in performance between code written in the language used for the implementation, and the language being implemented.
There's no need to write things like this in the current decade. LuaJIT has been able to provide performance comparable to its implementation language, in important cases, since 2.0.
It achieves this with a very different philosophy from the one being proposed here, one which in fact requires considerable assembler for each instruction set.
It's certainly not the only way to get the job done, other comments have pointed to the various ways Scheme has tackled this problem.
What isn't the case is that implementing a runtime for an interpreted language in a 'systems language' requires that code written in the interpreted language be slower than comparable code in the systems language, let alone that a "very large difference in performance" is inherent to the approach.
If LuaJIT is the example, is Lua being interpreted? I know that there is a bytecode interpreter hand-coded in assembly, but isn't the JIT used for hot code?
Edit: perhaps this is about a missing word like "primarily", i.e. that performance in such an implementation is primarily bound by interpretation speed.
LuaJIT is written in another language, C with assembler intrinsics, and uses an interpreter.
These are the two requirements of the sentence I quoted.
There's no need for the Zero Feet proposal to say something untrue to justify their approach. I'm not imputing malice because I see none, but moderating that sentence or just removing it would strengthen their case.
This counter-argument also requires that the interpreter design substantially affect runtime performance. Generally, when there is both a JIT compiler and an interpreter, my understanding is that one attempts to spend most cycles running in compiled code. Does LuaJIT not do that?
HotSpot also uses an interpreter, but its performance comes from the C2 compiler, for example. Quoth Cliff Click [1]:
> if you are spending any amount of time beyond e.g. 5% in the interpreter / stage-0 JIT you need to adjust your JIT'ing strategy. Not saying "no-gain" in a stage-0-only, but definitely should not be a super high payoff if there's a stage-1 following you.
Well yes, of course. The question is whether one can create a reasonably ergonomic Lisp that transpiles to Rust. I'd tentatively imagine the borrow checker could make this difficult.
There is an SRFI for a chain macro which works like that, but the problem is that we'd need to figure out, for each call, whether to transpile it to a function call or a method call.
And I agree with you that that S-expression notation is very readable, at least for those of us who are used to putting the parentheses in the correct spots.
That's definitely not the hard part of targeting Rust. I don't really understand why people keep thinking it's a good idea; trying to generate borrow-checker-compliant code sounds like a nightmare.
This exact comment was asked on Reddit some time ago - I mean the exact same characters. [1]
Nonetheless I'll provide the same answer; the compiler technology should be orthogonal to how it is implemented, so it should absolutely be possible. The proposal of using (monomorphic) inline caches is only the bare minimum to avoid crashing.
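As a standalone sketch of what a monomorphic inline cache does (the types, the lookup function and the call-site struct here are hypothetical, not the proposal's runtime): each call site remembers the last receiver type and the method it bound to, so a send to a same-typed receiver costs one compare plus an indirect call, and only a miss pays for the full lookup:

    #include <stdio.h>

    typedef int type_t;                    /* hypothetical type tag */
    typedef void (*method_t)(void *rcv);

    typedef struct { type_t type; } obj_t;

    static void meow(void *rcv) { (void)rcv; puts("meow"); }
    static void bark(void *rcv) { (void)rcv; puts("woof"); }

    /* Stand-in for the full (slow) method lookup. */
    static method_t lookup(type_t t) { return t == 0 ? meow : bark; }

    typedef struct {                       /* one cache per call site */
        type_t   cached_type;
        method_t cached_fn;
    } ic_t;

    static void send(ic_t *ic, obj_t *rcv) {
        if (ic->cached_fn == NULL || rcv->type != ic->cached_type) {
            ic->cached_type = rcv->type;   /* miss: rebind and patch the site */
            ic->cached_fn   = lookup(rcv->type);
        }
        ic->cached_fn(rcv);                /* hit: one compare + indirect call */
    }

    int main(void) {
        ic_t site = {0, NULL};
        obj_t cat = {0}, dog = {1};
        send(&site, &cat);                 /* miss: binds meow */
        send(&site, &cat);                 /* hit */
        send(&site, &dog);                 /* miss: rebinds to bark */
        return 0;
    }

Extending the single cached entry to a small table of (type, method) pairs gives a polymorphic inline cache.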
Or maybe just testing their new engine against a live group. People have suggested (disclosed?) this has happened on HN before, which is why I thought of it.
[1] https://www.piumarta.com/software/id-objmodel