ChrysaLisp (github.com/vygr)
121 points by ngcc_hk on Jan 13, 2020 | 61 comments



Thanks again folks for the interest. You catch me yet again when I'm at work, but I'll try to answer questions in an hour or so once I get home. Regards to all. Chris


Just a quick comment till I can get home. There are three main layers in use: the more Lisp-like layer in apps/ and cmd/, the lowest-level VP assembler, and an assembler with stack variables and an expression compiler that I call C-Script. This is my hobby and a follow-up to Taos, which I did back in the 90's. See the 'Virtual Processor' page on Wikipedia for links to articles and various scans from back then. Got to head into the car now.....


Taken from the documentation:

Lisp

It's probably worth a few words specifically about the included Lisp, how it works, and how many rules it breaks! The reason for doing the Lisp was to allow me to create an assembler to replace NASM; I was not concerned with sticking to 'the lisp way', or whatever your local Lisp guru says. No doubt I will have many demons in hell awaiting me...

First of all, there is no garbage collector, by choice. All objects used by the Lisp are reference counted objects from the class library. Early on, the Lisp was never going to have a problem with cycles because it had no way to create them, but as I developed the assembler I decided to introduce two functions, push and elem-set, that could create cycles. However, the efficiency advantage in coding the assembler made me look the other way. There are now other ways that a cycle can be created, by naming an environment within its own scope, but again this was too good an efficiency feature to miss out on. So you do have to be careful not to create cycles; think about how your code works.
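To illustrate the kind of cycle meant here (a hedged sketch; the function names come from this description and the exact syntax may differ):

    ;a list pushed onto itself holds a reference to itself,
    ;so its refcount can never drop to 0 and it is never freed
    (defq l (list 1 2 3))
    (push l l)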

No tail recursion optimization! There is a single looping function provided in native code, while; every other looping construct builds on this primitive. There are also two native primitives, some! and each!, that provide generic access to iterating over a slice of one or more sequences while calling a function on the grouped elements. Standard some and each are built on these, but they also allow other constructs to be built that gain the advantage of machine-coded iteration. I try to stick to a functional approach in my Lisp code, and manipulate collections of things in a functional way with operations like map, filter, reduce, each etc. I've not found the lack of tail recursion a problem.
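That functional style looks something like this (a sketch using only the operations named above; argument orders are assumptions):

    (defq doubled (map (lambda (x) (* x 2)) (list 1 2 3)))  ;-> (2 4 6)
    (defq total (reduce + doubled))                         ;argument order assumed
    (each print doubled)                                    ;call print on each element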

All symbols live in the same environment, functions, macros, everything. The environment is a chain of hash maps. Each lambda gets a new hash map pushed onto the environment chain on invocation, and dereferenced on exit. The env function can be used to return the current hash map and optionally resize the number of buckets from the default of 1. This proves very effective for storing large numbers of symbols and objects for the assembler as well as creating caches. Make sure to setq the symbol you bind to the result of env to nil before returning from the function if you do this, else you will create a cycle that can't be freed.
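The cache pattern described above then looks roughly like this (a hedged sketch; the bucket count is arbitrary and the exact calling convention is assumed):

    (defq cache (env 101))  ;current env hash map, resized to 101 buckets
    ;... intern many symbols and objects into cache ...
    (setq cache nil)        ;break the cycle before the function returns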

defq and bind always create entries in the top environment hash map. setq searches the environment chain to find an existing entry and sets that entry or fails with an error. This means setq can be used to write to symbols outside the scope of the current function. Some people don't like this, but used wisely it can be very powerful. Coming from an assembler background I prefer to have all the guns and knives available, so try not to shoot your foot off.
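For example, a minimal sketch of the scoping rule being described:

    (defq counter 0)                            ;entry in the current environment
    ((lambda () (setq counter (+ counter 1))))  ;setq finds and sets the outer entry
    ;counter is now 1; setq on a symbol with no existing entry is an error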

There is no cons, cdr or car stuff. Lists are just vector objects, and you use push, cat, slice etc to manipulate elements. Also, an empty list does not evaluate to nil; it's just an error.
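A sketch of working with lists-as-vectors (argument orders are assumptions based on this description):

    (defq l (list 1 2 3))
    (push l 4)                  ;mutates in place -> (1 2 3 4)
    (defq m (cat l (list 5 6))) ;concatenation -> (1 2 3 4 5 6)
    (defq s (slice m 0 3))      ;sub-list -> (1 2 3)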

Function and macro definitions are scoped and visible only within the scope of the declaring function. There is no global macro list. During macro expansion the environment chain is searched to see if a macro exists.


This is cool, but the code is... strange? https://github.com/vygr/ChrysaLisp/blob/master/gui/canvas/tg...

The code doesn't look like Lisp at all. Is this some sort of lower-level assembly language that the Lisp is implemented on top of?


That code is using the C-Script assembler prototyping macros. It will produce VP assembler output, but lets you work without worrying about registers while you're getting stuff going.

Have a read through the docs/ASSIGNMENT.md file to see how this all fits together.



As someone who isn't familiar with this Lisp dialect / DSL, something that stands out to me is the (def-struct ...) ... (def-struct-end), (def-method ...) ... (def-func-end), (switch) ... (endswitch), and (vpif ...) ... (endif). The appeal of s-expressions is that their nested structure is supposed to delineate the abstract syntax tree of the program, with the opening and closing parentheses denoting the boundaries of a construct. IMO this language's syntax isn't faithful to the spirit of Lisp.


The (def-struct) stuff is part of the assembler! Not the Lisp.

Same with (def-enum) etc.

I am going to have to write a document to clearly state where the boundaries between the Lisp, the C-Script and the VP layers lie. I realise it's not that obvious sometimes.

Chris


(def-func-end) and other ‘ends’ look pretty un-lispy to me. Helps readability, though.



Looks like imperative/sequential Lisp to me. Got those parentheses, quoting and some light-weight nesting.

Anything in particular that sticks out to you? Lack of recursion?




Very confusing. It's not actually Lisp but some new language with superficially Lisp-like syntax. It doesn't say what it's for or why the hell anybody would want to use it.


What makes it "not actually Lisp but some new language with a superficially Lisp-like syntax"? (Never mind that it's ostensibly not just a language but an entire OS with a Lisp-geared runtime -- I'm assuming you're just referring to the language).

https://github.com/vygr/ChrysaLisp/blob/1808d2db54cdda378aae... sure looks plausibly like a Lisp dialect to me.

Is your objection to the execution VM (all the files ending in .vp)?


1. no garbage collector

2. No tail recursion

3. There is no cons, cdr or car stuff. Lists are just vector objects and you use push, cat, slice etc to manipulate elements.

This much alone is radical enough to say that it isn't a Lisp, but something disguised in Lisp syntax. Lisp is as much about semantics as syntax. The power of Lisp to do amazing things comes from the language's capabilities. If you cannot reasonably translate powerful programs from Lisp textbooks, then is it a Lisp?

Some languages (example: JavaScript, Python) make it easy to create an inefficient Lisp interpreter that could execute common textbook programs. Minimax, A-star search, unification matching, etc.


Welp, you're right, all of those things are true for the lisp files too (not just vp). I just read some sample source and missed https://github.com/vygr/ChrysaLisp/blob/5483e0926b926c2cde63... . I agree those are pretty serious deviations that make "only superficially lisp-like" a reasonable statement. Thanks!

(IIUC the "garbage collector" is refcounting; I'm not sure what "All objects used by the Lisp are reference counted objects from the class library." means in practice, and specifically I'm not sure if that means "you basically have a GC in lisp as long as you don't make cycles".)


It means what it says ;)

All objects used by the Lisp are instances of classes from the class library, ie that stuff in class/ folder.

These instances are ref counted objects; as they are derefed and drop to 0 refs, they get deinited and freed.

The Lisp is constructed from these; for example, lists in the Lisp are an instance of class/vector, numbers are an instance of class/num, the environment is a chain of class/hmap, etc.

So yes, you do have GC if you don't make cycles, and you do have to care about not doing that just as you would in C++.


There is currently no GC, by choice, but I did some work to assist the recently added Stats app that could lead to me implementing a GC at some point in the future.


Lisp does not require tail recursion, only Scheme does.


Most Common Lisps (certainly the compiled ones) do tail call optimization.

Clojure doesn't because it's a Lisp in Java, and Java doesn't. Back when Clojure was just starting to get some attention I watched a video [1] of Rich doing a demo for a user group. Tail call optimization was one of their first questions. He gave them an answer that sounded like he had given it multiple times before.

[1] can't find it, sorry.


> Clojure doesn't because it's a Lisp in Java, and Java doesn't.

I keep hearing this and I have no idea where it comes from. x86_64 doesn't either; Clojure has recur, and prefers consistent var semantics to silently special-casing some tail calls. There's no particularly good reason the compiler couldn't just do it.


The JVM defines a particular model for program organization, which is largely inescapable from the point of view of someone targeting a language front-end to it.

x86_64 instruction set gives the programmer complete control over the organization of memory: the stack, use of registers, calling conventions and so on.

It has few safety features.

x86_64 doesn't specifically support tail calls, but it makes tail calls possible by way of jump instructions being able to go anywhere in the address space. An instruction in the middle of one function can branch to an instruction in the middle of another function, without doing anything with the registers or stack.

The JVM byte code doesn't allow such a thing.

The JVM supports iteration. Therefore local tail calling is possible, because it's syntactic sugar for local control transfers. That's presumably why Clojure can have recur.
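A small sketch of what that looks like from the Clojure side; recur compiles to rebinding the loop locals and jumping back, not to a method invocation:

    (defn sum-to [n]
      (loop [i 0 acc 0]
        (if (> i n)
          acc
          (recur (inc i) (+ acc i)))))
    ;; (sum-to 1000000) => 500000500000, with constant stack depth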


That sounds like a long-winded way of saying "x86_64 and the JVM are different but neither prevents TCO", which is precisely my point! Unless your point is "it's not TCO unless it's literally an instruction called 'jump' and iteration transforms don't count"?


For every tail call, it should not grow the stack, just jump into the function. The JVM does not support that. One would need to compile the code in complex ways; function calls would no longer map directly to JVM instructions.


You're explaining TCO. I understand how TCO works.

I'm going to rephrase my own argument because it doesn't appear you interacted with it: what part of your argument doesn't work for x86_64? "One would need to compile the code in complex ways: function calls would no longer map directly to CALL/RET instructions." Sure! That's arguably what makes it an optimization. Who cares? CPUs can loop, and so can the JVM.

I would maybe see your point if Clojure didn't already know how to do tail calls (and so would need to implement the allegedly complicated compilation step), but as I have pointed out several times: it already has `recur`, which lets you call a function without the JVM thinking there is a function call going on.


Tail recursion optimization would be when the compiler on its own would recognize tail recursion calls and generate special code for that.

That one needs to manually annotate it in Clojure just means that the compiler is manually instructed to generate a loop-like construct instead of a function call.

The advantage in Clojure is that these loops are explicitly marked, which IMHO improves readability.

Many Lisps don't bother to implement tail recursion optimization, because they support the more general tail call optimization - by adjusting/reusing the current stack frame and using a JMP instruction.


Yes, it is a common optimization; however, it is not a requirement for Common Lisp standard compliance, as it is with Scheme.


IMO closures and macros are absolutely essential, not just conses.


It ostensibly has both macros and closures, though. The Lisp source I linked creates an anonymous function and uses it with map (which I assume means it has real closures, though I guess you could fake that too), and here are some macros: https://github.com/vygr/ChrysaLisp/blob/bb4ad4116d6e794eb66e...


> It's not actually Lisp but some new language with superficially Lisp-like syntax.

So, like Clojure.


Why Clojure?

Clojure has cons, car, cdr which you can use in the classic Lisp sense. The only thing you CAN'T do is have a dotted pair or dotted list in the last cons cell.

One might argue that you cannot surgically alter data structures in Clojure. But you can! It's just that the function doing the alteration returns an entirely new data structure with the alteration in place -- but without the expected inefficiency of a copy operation. If you have an array of a billion elements, and alter one of the elements, you get back a new array with the alteration. But it is not a COPY of the entire array (with the expected time required to copy). The original array without the alteration also still exists -- but you don't have twice the memory usage now that there seem to be two slightly different arrays with these billion elements. Yet accessing or altering any element in the array has close to the performance you would expect of an actual array implementation.
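For instance (standard Clojure):

    (def v (vec (range 5)))  ;; [0 1 2 3 4]
    (def w (assoc v 2 99))   ;; [0 1 99 3 4], a "new" vector
    ;; v is unchanged; v and w share most of their internal tree
    ;; nodes, so the update costs O(log32 n), not O(n)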


> Clojure has cons, car, cdr which you can use in the classic Lisp sense. The only thing you CAN'T do is have a dotted pair or dotted list in the last cons cell.

So, it doesn't have them in the classic Lisp sense. Conses are just pairs; using them as such isn't exotic. (cons 1 2) being an error in Clojure isn't a minor thing; it's quite unusual compared to other Lisps. It has a very different definition of cons.


Yes. Clojure doesn't have cons cells in the traditional sense. But as long as you don't need dotted cdrs, you can "think" of using cons, car, cdr in the ordinary way. You can use existing lisp idioms of recursively taking apart, editing and rebuilding a complex form.
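For example:

    (cons 1 '(2 3))   ;; => (1 2 3)
    (first '(1 2 3))  ;; => 1, the "car"
    (rest '(1 2 3))   ;; => (2 3), the "cdr", always a seq
    ;; (cons 1 2) throws: the tail must be a seq, no dotted pairs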


Tail recursion? As far as I know, Clojure doesn't and cannot have it because JVM doesn't support it.


Kawa supports TCO with an optional flag. It uses trampolines to work around the JVM's limitations. This makes Kawa more akin to a real Scheme. (It still doesn't have full continuations, only upward continuations, which are homeomorphic to Java exceptions.)

Clojure could have done the same, but Rich Hickey deliberately chose not to, instead providing a special 'recur' construct for tail self-calls. But this doesn't make Clojure not a Lisp, it only makes it not a Scheme.


Clojure has `recur` that does what you want.

I'm not sure why it matters what the JVM supports. x86_64 doesn't do tail recursion either; it's the compiler's job to translate. Let's separate the `recur` vs the function name itself for a minute: if Clojure automatically took a function body with a tail-position recursion using the name of the function itself and translated it to a loop, would you agree that's tail recursion optimization?


Recur is useful in Clojure, but still not a replacement for tail calls.

Imagine two routines that call each other. A calls B, which calls A, which calls B, and continues recursion until the real answer is calculated, then returns and unwinds the stack. With real tail calls, there is no stack expansion. When A calls B, the original stack frame of A is overwritten to be the new stack call frame for B, and then A does a JUMP to the right code in B which eventually will do the RETURN instruction (or recursively call A).


I agree that's an optimization and an important litmus test, but calling any TCO that doesn't also handle mutual recursion (but does naive fib just fine) not "real" feels like a stretch. At least we agree on the facts: if you want mutual recursion in Clojure you need to write it with letfn + trampoline. That feels like it's only a hop away from recur to me, as in: if recur doesn't feel like real TCO to you I imagine letfn + trampoline won't either, but letfn+trampoline means that I can write all the mutual recursion things I want to write almost exactly the way I want to write them, so it's close enough for me.
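Concretely, the letfn + trampoline version of the mutual-recursion example:

    ;; each "tail call" returns a thunk; trampoline keeps invoking
    ;; the returned thunks, so the stack never grows
    (letfn [(ev? [n] (if (zero? n) true  #(od? (dec n))))
            (od? [n] (if (zero? n) false #(ev? (dec n))))]
      (trampoline ev? 1000000))
    ;; => true, with no StackOverflowError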


Common Lisp doesn't require tail-call support (though some implementations have it).


Tail recursion can be cobbled together with macros. This is particularly effective for recursion among local functions in the same lexical scope:

http://www.kylheku.com/cgit/lisp-snippets/tree/tail-recursio...

The above tail recursion macros achieve local tail calls by transforming to a tagbody with go. This is offered in the form of a tlet macro whose syntax is like labels. Write your mutually tail recursive functions as labels, then change labels to tlet to try it.

tlet is based on argtags: a form of tagbody whose labels take arguments. These arguments perform a re-assignment of local variables from parameters that accompany the goto transfer.
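By hand, the transformation those macros perform looks roughly like this (a hand-written sketch, not the actual macro expansion):

    ;; the tail call becomes: re-assign the "parameters", then GO
    (defun countdown (n)
      (let ((i n))
        (tagbody
         again
           (when (> i 0)
             (setq i (- i 1))
             (go again)))
        i))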

Tail calls among top-level functions are supported by a complementary facility called deftail, which uses a combination of non-local dynamic control transfers and a dispatch trampoline.


While tail calls are an optional feature in CL, are there any (or very many) implementations that DO NOT have tail call elimination? (Just curious.)


Don't they get around that by compiling the tailrec to bytecode as a while loop?

Other languages supporting tail recursion on the JVM do that. I don't know enough about Clojure to say that is how they do it, though.


Tail calls aren't just for loops. Tail calls can be across functions.

A calls B calls C calls A calls B ... repeat until ... one of the functions solves the problem and does a return. Then the stack is unwound. But with TCO all of the tail calls simply re-use the current stack frame and JUMP to the next function.


Yes, but you have to be explicit that it's a recur by using the recur special form. (I'm not convinced that matters, but some people get hung up on it.)


Yeah, there is special syntax for it in other JVM languages that support tail recursion as well (Kotlin).

I honestly don't know how they would do it otherwise, really. I actually liked the recur syntax of Clojure when I was playing around with it a few years ago, though.


I don't see why the compiler couldn't be aware of the function it's compiling and special-case that tail recursion; that's what every other compiler does.


One interesting thing is the idea that applications and utilities are in a virtual object code rather than native code.

This reminds me of the UCSD p-System of the early 1980s.

To port the OS all you had to port was the very small p-Machine emulator (PME). Once you had a bootable PME, everything else, the entire OS, utilities, and all applications instantly came along for the ride, without even being recompiled.

I could only imagine a modern OS that would take an approach like this, but with a HotSpot-style JIT compiling everything to native code.


Inferno could do this. Port the Dis VM and whamo, you had all of Inferno. On Windows, Linux, bare metal, x86, RISC... pick your poison.


This is how most mainframes and microcomputers work.

Check the Burroughs B5000 (now Unisys ClearPath), AS/400 (now IBM i), z/OS (now IBM Z), Xerox PARC workstations, Lilith, Ceres, watchOS, Garmin devices and plenty of others.


I mean that’s basically the JVM, isn’t it?


Yes, speaking as a long-time Java developer. But the UCSD p-System was much earlier (and more primitive) than Java.

p-System never did JIT. Remember this was for machines with 64 K of memory. Not 64 MB. But 64 K! The p-Code was much more compact than native code. Interpreting most code was plenty fast for most things -- even on the slow (under 10 MHz) clock speeds of the era. Critical operations could be written in assembler. The p-System had its own assembler for native code.

JVM is a whole other world. Adding GC is a whole new dimension in complexity. But decades of research have gone into the JVM making it the amazing runtime platform it is today.


There were toolchains that did compile p-System code into native code, though.

https://en.wikipedia.org/wiki/Pascal_MicroEngine

https://en.wikipedia.org/wiki/Corvus_Systems

Adding a GC to bytecode systems is what Xerox PARC did with their Interlisp-D, Smalltalk and Mesa/Cedar workstations.

The CPUs were microcoded and as part of the boot process they would load the respective hardware interpreter for the environment being booted into the workstation.

A similar idea was explored in one of Oberon's implementations, where instead of using straight native code like on the Ceres workstation, the modules would have a compact representation, JITed into native code on module load.

See section 7, Machine-Independent Mobile Code on http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90....


I remember Corvus quite well. (I've been doing this job for a _loooong_ time.)



https://en.wikipedia.org/wiki/Virtual_Processor

Taos article links at the bottom: Byte, IEEE, Edge etc.


It's real deja vu looking at this thread. Many of the questions/comments here are exactly the same ones that came up in the early 1990s with Taos on CiX, and then later with Elate/intent. :)

Chris was far ahead of the curve - a heterogeneous, load-balancing, multi-processor, multi-tasking OS, with an Object-Oriented Virtual Processor, byte code, load-time translation, support for multiple CPU types (including RISC/CISC/transputer, and both LE and BE), and a tiny-footprint message-passing kernel, when most people used MS-DOS 3.3 or DR-DOS 5.0, with Windows 95 still a gleam in Bill Gates's beady eye! Nearly 30 years on and he is still ahead of the curve!

p.s. no one has mentioned it yet - try a Google search for "edge tao spyfish". That was 1995!! Not released, but still very interesting reading material.


This is pretty amazing! Would appreciate some more documentation on this, and perhaps some benchmarks.


There is a start on documentation in the docs/ folder. I know it's sparse at present; I do aim to add more over time.

The system can build itself from source, after a clean, on my 2014 MacBook in under 0.4s. It's not exactly a standard benchmark, but it's not slow considering that's a Lisp (like) interpreter doing all the compiling and assembling! Footprint at the moment for everything is 158KB.

The snapshot.zip file contains 3 separate snapshots for the 3 ABIs supported currently.


This is quite an interesting and approachable OS. Hope to see progress on it.

Could this be run in VMware? Or is it specifically bound to a host OS?


It's not bound to any host OS. I am using SDL to provide a simple abstraction to open a window and obtain key and mouse input. I use very few of the SDL features, just a rectangle fill and rectangle bit blit. See gui/ctx/class.inc for all those; everything else in the GUI, like the compositor and vector graphics library, widgets et al, is done in VP on the ChrysaLisp side.

I have a small abstraction over the host OS via the sys/pii/ functions; these just use the jump table you can see in the main.c file. Any host OS should be able to compile the main.c with little or no changes, but eventually there will be a native filesystem available from VP/Lisp.

I will be doing direct bare metal hardware support, and there are things happening on that front but I can't talk about that here.

But during my experimentation stage it was very useful to have it running hosted, and I'll keep the hosted version running going forward as it provides a nice dev environment.

I do have a plan to get a version done that could boot under VMWare etc, but I've only got so much time on my hands and have to spread myself over the whole codebase as best I can. :)



