It's probably worth a few words specifically about the included Lisp, how it works, and how many rules it breaks! The reason for writing the Lisp was to allow me to create an assembler to replace NASM; I was not concerned with sticking to 'the Lisp way', or whatever your local Lisp guru says. No doubt I will have many demons in hell awaiting me...
First of all, there is no garbage collector, by choice. All objects used by the Lisp are reference-counted objects from the class library. Early on, the Lisp was never going to have a problem with cycles because it had no way to create them, but as I developed the assembler I decided to introduce two functions, push and elem-set, that could create cycles. The efficiency advantage in coding the assembler made me look the other way. There are now other ways a cycle can be created, such as naming an environment within its own scope, but again this was too good an efficiency feature to miss out on. So you do have to be careful not to create cycles; think about how your code works.
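To illustrate the failure mode in a language people can run, here is a CPython sketch with the cycle collector disabled, leaving only reference counting (as in ChrysaLisp): an acyclic object is freed the instant its count reaches zero, while a self-referencing one leaks.

```python
import gc
import weakref

class Node:
    """A plain refcounted object; CPython frees it the moment its count hits zero."""
    def __init__(self, name):
        self.name = name
        self.next = None

gc.disable()   # turn off the cycle collector: reference counting only

freed = []

a = Node("a")
weakref.finalize(a, freed.append, "a")   # fires when the object is destroyed
del a
# no cycle, so dropping the last reference freed it immediately
assert freed == ["a"]

b = Node("b")
b.next = b                               # a cycle: the object keeps itself alive
weakref.finalize(b, freed.append, "b")
del b
# the self-reference pins the count above zero; with no tracing GC it leaks
assert freed == ["a"]
```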
No tail recursion optimization! There is a single looping function provided in native code, while; every other looping construct builds on this primitive. There are also two native primitives, some! and each!, that provide generic iteration over a slice of one or more sequences, calling a function on the grouped elements. Standard some and each are built on these, but other constructs can also be built on them and gain the advantage of machine-coded iteration. I try to stick to a functional approach in my Lisp code and manipulate collections of things in a functional way with operations like map, filter, reduce, each etc. I've not found the lack of tail recursion a problem.
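A Python sketch of the same trade-off: the tail-recursive form needs TCO to scale, while the explicit loop (the while-primitive style) and the functional reduce style both run in constant stack space.

```python
from functools import reduce

# Tail-recursive form: correct, but each call adds a stack frame,
# so without TCO it overflows on large inputs.
def sum_to_rec(n, acc=0):
    if n == 0:
        return acc
    return sum_to_rec(n - 1, acc + n)   # tail call, not optimized away

# The same logic as an explicit loop: constant stack space.
def sum_to_loop(n):
    acc = 0
    while n > 0:
        acc += n
        n -= 1
    return acc

assert sum_to_rec(100) == sum_to_loop(100) == 5050
# a million iterations is fine for the loop, fatal for the recursion:
assert sum_to_loop(1_000_000) == 500_000_500_000

# the functional style (map/filter/reduce) also avoids deep recursion:
assert reduce(lambda acc, x: acc + x, range(1, 101), 0) == 5050
```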
All symbols live in the same environment: functions, macros, everything. The environment is a chain of hash maps. Each lambda gets a new hash map pushed onto the environment chain on invocation, and dereferenced on exit. The env function can be used to return the current hash map and optionally resize the number of buckets from the default of 1. This proves very effective for storing large numbers of symbols and objects for the assembler, as well as for creating caches. If you do this, make sure to setq the symbol you bind to the result of env back to nil before returning from the function, else you will create a cycle that can't be freed.
defq and bind always create entries in the top environment hash map. setq searches the environment chain to find an existing entry and sets that entry or fails with an error. This means setq can be used to write to symbols outside the scope of the current function. Some people don't like this, but used wisely it can be very powerful. Coming from an assembler background I prefer to have all the guns and knives available, so try not to shoot your foot off.
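A minimal Python model of the environment semantics described above (the class and method names are my own, not ChrysaLisp's actual API): defq creates an entry in the top map, while setq walks the chain to find an existing binding.

```python
class Env:
    """A chain of hash maps; names here are illustrative, not ChrysaLisp's API."""
    def __init__(self, parent=None):
        self.vars = {}        # the 'top' hash map for this scope
        self.parent = parent  # next map in the chain

    def defq(self, sym, val):
        # defq/bind always create the entry in the top map
        self.vars[sym] = val

    def setq(self, sym, val):
        # setq searches the chain for an existing entry, or fails with an error
        env = self
        while env is not None:
            if sym in env.vars:
                env.vars[sym] = val
                return
            env = env.parent
        raise NameError(f"setq: {sym} is not bound")

    def get(self, sym):
        env = self
        while env is not None:
            if sym in env.vars:
                return env.vars[sym]
            env = env.parent
        raise NameError(sym)

outer = Env()
outer.defq("x", 1)
inner = Env(parent=outer)   # pushed when a lambda is invoked
inner.defq("y", 2)          # new binding, local to inner
inner.setq("x", 42)         # walks the chain and mutates outer's binding
assert outer.get("x") == 42
assert inner.get("y") == 2
```

This is exactly the "guns and knives" property: setq deliberately reaches outside the current scope.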
There is no cons, cdr or car stuff. Lists are just vector objects, and you use push, cat, slice etc. to manipulate elements. Also, an empty list does not evaluate to nil; it's just an error.
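In Python terms, the vector-style operations look roughly like this (push/cat/slice are mapped onto their closest Python analogues; the actual ChrysaLisp signatures may differ):

```python
# push, cat and slice mapped onto their closest Python analogues
xs = [1, 2, 3]
xs.append(4)                 # push: mutate in place, extend at the end
ys = xs + [5, 6]             # cat: concatenate into a new vector
assert ys == [1, 2, 3, 4, 5, 6]

head, rest = ys[0], ys[1:]   # slicing takes over car/cdr's role
assert head == 1
assert rest == [2, 3, 4, 5, 6]
assert ys[1:4] == [2, 3, 4]  # arbitrary slices, not just "rest of list"
```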
Function and macro definitions are scoped and visible only within the scope of the declaring function. There is no global macro list. During macro expansion the environment chain is searched to see if a macro exists.
The code doesn’t look like Lisp at all. Is this some sort of lower assembler language that Lisp is implemented on top of?
Have a read through the docs/ASSIGNMENT.md file to see how this all fits together.
Same with (def-enum) etc.
I am going to have to write a document that clearly states where the boundaries between the Lisp, the C-Script and the VP layers lie. I realise it's not that obvious sometimes.
Anything in particular that sticks out to you? Lack of recursion?
Related, on the TAOS operating system:
2015 https://news.ycombinator.com/item?id=9806607 (with lots of comments by original contributors)
https://github.com/vygr/ChrysaLisp/blob/1808d2db54cdda378aae... sure looks plausibly like a Lisp dialect to me.
Is your objection to the execution VM (all the files ending in .vp)?
2. No tail recursion
3. There is no cons, cdr or car stuff. Lists are just vector objects and you use push, cat, slice etc to manipulate elements.
Only this much is radical enough to say that it isn't a Lisp, just something disguised in Lisp syntax. Lisp is as much about semantics as about syntax. The power of Lisp to do amazing things comes from the language's capabilities. If you cannot reasonably translate powerful programs from Lisp textbooks into it, then is it a Lisp?
(IIUC the "garbage collector" is refcounting; I'm not sure what "All objects used by the Lisp are reference counted objects from the class library." means in practice, and specifically I'm not sure if that means "you basically have a GC in lisp as long as you don't make cycles".)
All objects used by the Lisp are instances of classes from the class library, ie that stuff in class/ folder.
These instances are ref counted objects, as they are derefed and they drop to 0 refs they get deinited and freed.
The Lisp is constructed from these, for example Lists in the Lisp are an instance of the class/vector, numbers are an instance of class/num, the environment is a chain of class/hmap etc.
So yes, you do have GC if you don't make cycles, and you do have to care about not doing that just as you would in C++.
Clojure doesn't because it's a Lisp in Java, and Java doesn't. Back when Clojure was just starting to get some attention I watched a video of Rich doing a demo for a user group. Tail call optimization was one of the first questions. He gave them an answer that sounded like he had given it many times before.
I can't find it, sorry.
I keep hearing this and I have no idea where it comes from. x86_64 doesn't either; Clojure has recur, and prefers consistent var semantics to silently special-casing some tail calls. There's no particularly good reason the compiler couldn't just do it.
x86_64 instruction set gives the programmer complete control over the organization of memory: the stack, use of registers, calling conventions and so on.
It has few safety features.
x86_64 doesn't specifically support tail calls, but it makes tail calls possible by way of jump instructions being able to go anywhere in the address space. An instruction in the middle of one function can branch to an instruction in the middle of another function, and without doing anything with the registers or stack.
The JVM byte code doesn't allow such a thing.
The JVM supports iteration. Therefore local tail calling is possible, because it's syntactic sugar for local control transfers. That's presumably why Clojure can have recur.
I'm going to rephrase my own argument because it doesn't appear you interacted with it: what part of your argument doesn't work for x86_64? "One would need to compile the code in complex ways: function calls would no longer map directly to CALL/RET instructions." Sure! That's arguably what makes it an optimization. Who cares? CPUs can loop, and so can the JVM.
I would maybe see your point if Clojure didn't already know how to do tail calls (and so would need to implement the allegedly complicated compilation step), but as I have pointed out several times: it already has `recur`, which lets you call a function without the JVM thinking there is a function call going on.
That one needs to annotate it manually in Clojure just means that the compiler is explicitly instructed to generate a loop-like construct instead of a function call.
The advantage in Clojure is that these loops are explicitly marked, which IMHO improves readability.
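What recur compiles down to can be sketched in Python: the tail self-call becomes re-assignment of the loop parameters plus a jump back to the top of an ordinary loop, with no new stack frame.

```python
# (loop [n n, acc 1] ... (recur (dec n) (* acc n))) hand-compiled to Python:
def factorial(n):
    acc = 1
    while True:                    # the loop head is the recur target
        if n <= 1:
            return acc
        n, acc = n - 1, acc * n    # recur = re-bind the loop vars...
        # ...and fall through to the top: no new stack frame is pushed

assert factorial(10) == 3628800
assert factorial(1) == 1
```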
Many Lisps don't bother to implement tail recursion optimization, because they support the more general tail call optimization - by adjusting/reusing the current stack frame and using a JMP instruction.
So, like Clojure.
Clojure has cons, car, cdr which you can use in the classic Lisp sense. The only thing you CAN'T do is have a dotted pair or dotted list in the last cons cell.
One might argue that you cannot surgically alter data structures in Clojure. But you can! It's just that the function doing the alteration returns an entirely new data structure with the alteration in place -- but without the expected inefficiency of a copy operation. If you have an array of a billion elements, and alter one of the elements, you get back a new array with the alteration. But it is not a COPY of the entire array (with the expected time required to copy). The original array without the alteration also still exists -- but you don't have twice the memory usage now that there seem to be two slightly different arrays with these billion elements. Yet accessing or altering any element in the array has close to the performance you would expect of an actual array implementation.
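A toy sketch of path copying, the idea behind this (Clojure's real persistent vectors use 32-way tries plus tails and other tricks; this 2-way tuple trie just demonstrates the sharing): updating one element copies only the O(log n) spine, and the untouched half of the tree is literally the same object in both versions.

```python
# A persistent 8-element "array" as a binary trie of immutable tuples.

def build(items, depth):
    if depth == 0:
        return items[0]
    half = len(items) // 2
    return (build(items[:half], depth - 1), build(items[half:], depth - 1))

def get(node, index, depth):
    for level in reversed(range(depth)):
        node = node[(index >> level) & 1]   # walk one bit of the index per level
    return node

def assoc(node, index, depth, value):
    # path copying: rebuild only the spine down to the slot, share the rest
    if depth == 0:
        return value
    bit = (index >> (depth - 1)) & 1
    child = assoc(node[bit], index, depth - 1, value)
    return (child, node[1]) if bit == 0 else (node[0], child)

depth = 3                        # 2**3 = 8 elements
v1 = build(list(range(8)), depth)
v2 = assoc(v1, 5, depth, 99)     # "alter" element 5

assert get(v1, 5, depth) == 5    # the original is untouched
assert get(v2, 5, depth) == 99   # the new version sees the change
assert v2[0] is v1[0]            # the untouched half is shared, not copied
```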
So, it doesn't have them in the classic Lisp sense. Conses are just pairs. Using them as such isn't exotic. (cons 1 2) being an error in Clojure isn't a minor thing, it's very unique compared to other Lisps. It has a very different definition of cons.
Clojure could have done the same, but Rich Hickey deliberately chose not to, instead providing a special 'recur' construct for tail self-calls. But this doesn't make Clojure not a Lisp, it only makes it not a Scheme.
I'm not sure why it matters what the JVM supports. x86_64 doesn't do tail recursion either; it's the compiler's job to translate. Let's separate the `recur` vs the function name itself for a minute: if Clojure automatically took a function body with a tail-position recursion using the name of the function itself and translated it to a loop, would you agree that's tail recursion optimization?
Imagine two routines that call each other. A calls B, which calls A, which calls B, and the recursion continues until the real answer is calculated, then returns and unwinds the stack. With real tail calls, there is no stack expansion. When A calls B, the original stack frame of A is overwritten to become the new call frame for B, and then A does a JUMP into the right code in B, which eventually does the RETURN instruction (or recursively calls A).
The above tail recursion macros achieve local tail calls by transforming to a tagbody with go. This is offered in the form of a tlet macro whose syntax is like labels. Write your mutually tail recursive functions as labels, then change labels to tlet to try it with this.
tlet is based on argtags: a form of tagbody whose labels take arguments. These arguments perform a re-assignment of local variables from parameters that accompany the goto transfer.
Tail calls among top-level functions are supported by a complementary facility called deftail, which uses a combination of non-local dynamic control transfers and a dispatch trampoline.
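The label-plus-arguments transformation described above can be sketched in Python (all names invented): each label becomes a case in a dispatch loop, and a go with arguments becomes state re-assignment followed by a jump back to the dispatcher.

```python
# Mutually tail-recursive labels compiled to a dispatch loop.
def parity(n):
    label, arg = "even?", n          # entry label and its argument
    while True:                      # the "tagbody"
        if label == "even?":
            if arg == 0:
                return "even"
            label, arg = "odd?", arg - 1    # (go odd? (- n 1))
        elif label == "odd?":
            if arg == 0:
                return "odd"
            label, arg = "even?", arg - 1   # (go even? (- n 1))

assert parity(7) == "odd"
assert parity(100_000) == "even"    # no recursion depth limit in sight
```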
Other languages supporting tail recursion on the JVM do that. I don't know enough about Clojure to say whether that is how it does it, though.
A calls B calls C calls A calls B ... repeat until ... one of the functions solves the problem and does a return. Then the stack is unwound. But with TCO all of the tail calls simply re-use the current stack frame and JUMP to the next function.
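Where the runtime can't reuse frames (as on the JVM), a trampoline gets the same flat-stack behaviour: each tail call returns a thunk instead of calling, and a driver loop invokes the thunks one at a time. A Python sketch:

```python
def trampoline(fn, *args):
    """Drive a computation whose tail calls are returned as thunks."""
    result = fn(*args)
    while callable(result):   # a callable result means "tail call pending"
        result = result()     # perform it; the stack never deepens
    return result

def is_even(n):
    if n == 0:
        return True
    return lambda: is_odd(n - 1)    # tail call A -> B, deferred as a thunk

def is_odd(n):
    if n == 0:
        return False
    return lambda: is_even(n - 1)   # tail call B -> A, deferred as a thunk

# 100,000 bounces between the two functions with a flat stack:
assert trampoline(is_even, 100_000) is True
assert trampoline(is_odd, 3) is True
```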
I honestly don't know how they would do it otherwise, really. I actually liked the `recur` syntax of Clojure when I was playing around with it a few years ago, though.
This reminds me of the UCSD p-System of the early 1980's.
To port the OS all you had to port was the very small p-Machine emulator (PME). Once you had a bootable PME, everything else, the entire OS, utilities, and all applications instantly came along for the ride, without even being recompiled.
I can only imagine a modern OS taking an approach like this, but HotSpot-JITing everything to native code.
Check the Burroughs B5000 (now Unisys ClearPath), the AS/400 (now IBM i), z/OS (now IBM Z), the Xerox PARC workstations, Lilith, Ceres, watchOS, Garmin devices and plenty of others.
p-System never did JIT. Remember this was for machines with 64 K of memory. Not 64 MB. But 64 K! The p-Code was much more compact than native code. Interpreting most code was plenty fast for most things -- even on the slow (under 10 MHz) clock speeds of the era. Critical operations could be written in assembler. The p-System had its own assembler for native code.
JVM is a whole other world. Adding GC is a whole new dimension in complexity. But decades of research have gone into the JVM making it the amazing runtime platform it is today.
Adding a GC to bytecode systems, is what Xerox PARC did with their Interlisp-D, Smalltalk and Mesa/Cedar workstations.
The CPUs were microcoded and as part of the boot process they would load the respective hardware interpreter for the environment being booted into the workstation.
A similar idea was explored in one of Oberon's implementations, where instead of using straight native code like on the Ceres workstation, the modules would have a compact representation, JITed into native code on module load.
See section 7, Machine-Independent Mobile Code on http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90....
Taos articles link at the bottom, Byte, IEEE, Edge etc.
Chris was far ahead of the curve - a heterogeneous, load-balancing, multi-processor, multi-tasking OS, with Object-Oriented Virtual Processor, byte code, load-time translation, support for multiple CPU types (including RISC/CISC/transputer and LE and BE), and tiny footprint message-passing kernel, when most people used MS-DOS 3.3 or DR-DOS 5.0, with Windows 95 still a gleam in Bill Gates's beady eye! Nearly 30 years on and he is still ahead of the curve!
p.s. no one has mentioned it yet - try a Google search for "edge tao spyfish". That was 1995!! Not released, but still very interesting reading material.
The system can build itself from source, after a clean, on my 2014 MacBook in under 0.4s. It's not exactly a standard benchmark, but it's not slow considering that's a Lisp-like interpreter doing all the compiling and assembling! Footprint at the moment for everything is 158KB.
The snapshot.zip file contains 3 separate snapshots for the 3 ABIs supported currently.
Could this be run in VMWare? Or is it specifically bound to a host OS?
I have a small abstraction over the host OS via the sys/pii functions; these just use the jump table you can see in the main.c file. Any host OS should be able to compile main.c with little or no changes, but eventually there will be a native filesystem available from VP/Lisp.
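As a hedged illustration of the jump-table idea (the slot layout and pii_* names here are invented, not ChrysaLisp's actual ABI): the portable side calls host services only through indexed slots in a table that the host shim fills in.

```python
import os

# The host shim (main.c's role) populates a table of function pointers;
# slot numbers and "pii_*" names below are invented for illustration.
HOST_ABI = [
    os.open,    # slot 0: "pii_open"
    os.read,    # slot 1: "pii_read"
    os.write,   # slot 2: "pii_write"
    os.close,   # slot 3: "pii_close"
]
PII_WRITE, PII_CLOSE = 2, 3

def host_call(slot, *args):
    # the portable side only ever calls through the table, so porting
    # means re-populating HOST_ABI, not touching any of the callers
    return HOST_ABI[slot](*args)

r, w = os.pipe()
assert host_call(PII_WRITE, w, b"hi") == 2   # two bytes written via the table
host_call(PII_CLOSE, w)
assert os.read(r, 2) == b"hi"
os.close(r)
```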
I will be doing direct bare metal hardware support, and there are things happening on that front but I can't talk about that here.
But during my experimentation stage it was very useful to have it running hosted, and I'll keep the hosted version running going forward as it provides a nice dev environment.
I do have a plan to get a version done that could boot under VMWare etc, but I've only got so much time on my hands and have to spread myself over the whole codebase as best I can. :)