Coalton is a project that adds tried-and-true Hindley-Milner type checking to Common Lisp, allowing gradual adoption in the same way Typed Racket or Hack does. You can think of Coalton as an embedded DSL in Lisp that resembles Standard ML or OCaml, but lets you seamlessly interoperate with non-statically-typed Lisp code (and vice versa).
It’s really intended as a practical tool to allow parts or most of your codebase to be statically typed, without compromising the dynamic and interactive nature of Common Lisp.
On top of it all, you get all of the facilities of Lisp: macros, compile-time evaluation, etc. Coalton code is truly Lisp code; at the end of the day it's just extra syntax that allows a type checker (also written in Lisp!) to run. As a benefit, it allows quasi-trivial porting of ML-like code to Lisp. Almost every ML construct has a direct correspondence in Coalton.
It’s still a work in progress, especially in terms of polish and documentation, but it works.
Shen is another interesting project, but I didn't generally find it very compatible with the ML style of program construction, and its goals seemed to diverge from being an extension of Lisp toward being a hyper-portable independent language.
The point of the article wasn't to say that all Common Lisps support type declarations, but to show how easy it is to add support for such declarations (and optional support, to boot).
Arbitrary amounts of inference could be added in such declarations before feeding the result to the compiler; the same work could be easily extended for other compilers -- and that other pre-analysis would work there too.
No amount of inference can extend Lisp’s type system to support parametric polymorphism. It’s just not an idea Lisp’s type system can express.
Well of course it has runtime polymorphism. I assume you mean the metasyntactic polymorphism of, say, C++'s std::vector<foobar>. Yeah, you could sort of get there with some macrology (not even extreme macrology) but you'd probably need to add some call-site sugar which wouldn't be very transparent. Clearly the generic type system could handle it, but there's no inherent representation, which I presume is what you mean.
Is it your intent that Coalton work with any ANSI Common Lisp or is it dependent on the extra-special type checking that SBCL does?
The first example is easy: Implement an O(1) jump table in Common Lisp that respects the lexical environment. (You can’t, unless CASE is itself optimized to a jump table, which it usually isn’t.)
The second example: Consider the SAME-FRINGE  problem. You’ll have a hard time macro-ing your way out of that unless your Lisp implementation has a built-in API for coroutines , or you write extraordinarily difficult macros to CPS-convert the entirety of Common Lisp to allow for CALL/CC. The latter is itself a hefty project to do well.
This is not to say SAME-FRINGE can't be solved in Lisp. It can. The page I linked has several solutions in Common Lisp. But idiomatic solutions are consy and inefficient. For instance, the usual SICP hack of writing lazy sequences with thunks "works", but doesn't integrate deeply with the rest of Lisp. And coroutine/CPS libraries in Lisp often have a bag of gotchas (e.g., they don't work with higher-order functions like MAPCAR).
While I understand this is subjective, the most natural solution to this problem uses a control mechanism that just doesn’t exist in Common Lisp.
 Many Lisps used to, but it has fallen by the wayside since Lisps began to support native OS threads.
Perhaps I don't understand your objection: you can certainly store any object in an array and FUNCALL it, which would be O(1); the object could be a lambda that captured the lexical environment. I was doing that 35 years ago on the Lisp machine.
Interestingly I did this also on the D-Machine Interlisp and each closure forked the stack, so it ground the machine to a halt, a design bug later fixed. Both examples I'm talking about predate Common Lisp standardization.
> Implement an O(1) jump table in Common Lisp that respects the lexical environment.
You don't know the lexical environment until there's a lexical environment to know, you can't have your cake and also eat it (or not have your cake and also know it).
Interestingly, try/catch also solves this problem fairly elegantly without distastefully consing up a list or array at run-time.
(defun jump2 (i)
  (let ((location nil))
    (setf location (elt #(a b c d) i))
    (throw location nil)))
edit: removed some cruft from experiments and fixed formatting
But that's beside the point, because in practice this is probably O(n), not O(1). I think most CL compilers are going to need to push things onto the control stack for each CATCH. Even if the compiler is clever enough to eliminate the catches, it still might not implement the computation of which catch catches in O(1). So even in the best case of a very smart compiler, your code is basically equivalent to:
(0 (print "A!"))
The point the GP is making is that the only solution which looks like it might be O(1) (and indeed is O(1) if there is no lexical scope, or just one, to capture) is the array-of-lambdas-doing-GO, but this isn't guaranteed by the language to be O(1) when the lexical scope changes each time, because in a naive evaluation the array of lambdas must be consed before jumping, and this is O(n).
The reason it is reasonable to complain about this is that what one needs for a jump table is weaker than this. The lambdas do capture the lexical scope, but they only have dynamic extent and don't extend the scope at all, so a lambda of the form (lambda () (go foo)) only has to be consed each time you hit the jump table because you can only reference FOO once you are inside the TAGBODY. However, the code to actually jump is likely to be static for most compilers.
For guaranteed O(1) performance (which doesn’t rely on the cleverness of the compiler) you’d need to introduce a new special form, e.g. (jump n tag1 tag2 ... tagn) which evaluates n and jumps to the nth tag in its arguments in O(1) time.
I think this is a fine demonstration of not being able to “macro your way” out of a problem.
As for a sufficiently smart compiler, it probably can’t optimize CATCH/THROW since those have dynamic extent to all callees. The compiler will certainly not be able to prove that a function won’t possibly throw a tag down deep somewhere.
- If the prints were changed to go outside of the catches before executing (i.e., so the compiler could know that they wouldn't throw), then I think a sufficiently smart compiler could prove that nothing else could throw here (it would be allowed to know that ELT wouldn't throw)
- If I recall correctly, there are Java compilers which can optimize throw-inside-catch code to not need the catch on the control stack (i.e., the throw is just compiled into a goto)
- some languages allow one to throw things that are unique to the function call which gives a better chance of them being unique.
This is further confused by the inclusion of 'lexical scope' as a priority. Lexical scope is a thing, and of course the jump table will have to do more work if lexical scope is involved. If you aren't managing the stack, you aren't dealing with lexical scope appropriately. You have incompatible requirements here.
If you were simulating the equivalent of a C jump table, you would simply have an array of functions, and you'd call the function that you want based on a reference.
This is very easy to do in common lisp, trivial in fact, so I'm a little confused about what the commotion is about, in that case.
(defvar *jump-table* (vector (lambda () 'a) (lambda () 'b)
                             (lambda () 'c) (lambda () 'd)))

(defun jump-table-caller (i)
  (funcall (aref *jump-table* i)))

(let ((jump-table (vector (lambda () 'a) (lambda () 'b)
                          (lambda () 'c) (lambda () 'd))))
  (defun jump-table-caller2 (i)
    (funcall (aref jump-table i))))
Write your second example where the lambda bodies depend on the function parameter i, or indeed any other binding in the scope of the “jump”. This is a very normal and natural thing to do in other languages, such as C. Forget big-O, complexity, whatever. In a language such as C, you incur no runtime penalty for using local scope in the branches.
In Common Lisp, you incur a runtime penalty dependent on the size of the table. This penalty rears its head either in the number of comparisons that must be made to find the branch (dependent on the size of the table), or in terms of memory allocation (allocating closures for each branch). Either of those penalties can be expressed as a linear (or at least non-constant) function in the size of the table.
You are incorrect that lexical scope is at odds with a jump table. This can be formally proven, but it has also been empirically shown in actual (non-portable) implementations of a jump table.
(defvar *my-variable* nil)

(let ((jump-table (vector (lambda () (list 'a *my-variable*))
                          (lambda () (list 'b *my-variable*))
                          (lambda () (list 'c *my-variable*))
                          (lambda () (list 'd *my-variable*)))))
  (defun jump-table-caller3 (i)
    (let ((*my-variable* i))
      (funcall (aref jump-table i)))))
You will quickly find that this also will not work, for if a closure was made inside of each branch, the environment would not be shared among them.
The original intent of this exercise was to demonstrate you can’t macro your way out of all deficiencies in Lisp. And certainly in this case, even if we were to accept this dynamic variable rigamarole, we would find that we could not write a macro that, e.g., replaces CASE.
In fact, it would still be a constant if the 'case' were implemented as a chain of 'if's. Time complexity, since it concerns asymptotic growth of runtime as a function of input size, is not the correct notion to invoke here. You're just interested in performance in the ordinary sense: you want your program to run as fast as reasonably possible, and you know that a chain of 'if's leaves something to be desired in that regard.
Fair enough. But let's look at the array-of-closures implementation again. Assuming that the number of jumps made through one of these tables, once you've created it, is much larger than the number of times you have to create a table, it's a perfectly workable implementation.
I think your second point is more substantive. I too have come to the conclusion that coroutines were an unfortunate omission from the CL spec.
The point of a jump table is to compile N branches, laying them out in memory at addresses known at compile-time, and allowing a fixed jump to occur during program execution, equivalent to incrementing a program counter by some static amount. This cannot be done portably in ANSI Common Lisp.
The best you can do is construct a binary tree of IF branches to achieve O(log N) time.
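That O(log N) tree of IFs can itself be generated by a macro, along these lines (a sketch; the macro name is invented):

```lisp
;; Expand (binary-dispatch index form0 form1 ...) into a balanced tree
;; of IFs that selects the INDEXth form in O(log N) comparisons.
(defmacro binary-dispatch (index &body forms)
  (let ((i (gensym "I")))
    (labels ((build (lo hi)
               (if (= lo hi)
                   (nth lo forms)
                   (let ((mid (floor (+ lo hi) 2)))
                     `(if (<= ,i ,mid)
                          ,(build lo mid)
                          ,(build (1+ mid) hi))))))
      `(let ((,i ,index))
         ,(build 0 (1- (length forms)))))))

;; e.g. (binary-dispatch i (print "A") (print "B") (print "C") (print "D"))
```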
You cannot construct the vector of N function entries without first constructing the vector itself (which can be done at compile time), second constructing N closures representing the code you'd jump to (which must be done at runtime to capture the lexical environment), and third populating the vector (which must be done at runtime, since the data being populated was constructed at runtime).
Once the jump table is constructed, the actual vector dereference and call is O(1), but that's beside the point if each invocation requires linear construction.
The number of possible inputs is proportional to the size of the table (i.e., the number of entries it can discriminate among). It may be that the input is an integer N of unbounded length (say, K bits), which means that the time complexity is O(2^K) = O(N) > O(1).
If you don’t believe me, please show me how you would implement this.
Not necessarily. What matters is the ratio of calls through the table to the number of times the table is constructed. If that ratio is bounded, for any size input, then you have a point. My response is simply that programs for which such bounds exist are relatively rare in practice; the common case is that such a table will be called through many more times than it is constructed, and furthermore that the ratio of calls to constructions will increase without bound as the size of the input increases. Then the fraction of time that the program spends constructing the tables asymptotically approaches zero.
Admittedly, this argument is of little comfort if you're actually writing a program that constructs many such tables, only to call through each a small (and bounded) number of times. But the force of that fact as a criticism of a language has to take into account the likelihood of needing to write such a program in the first place.
There was no consensus on multiprocessing at the time of standardization and language support was very uncommon.
But still it’s weird: coroutines were over 20 years old by then, and newer ideas like the condition system and CLOS were included. And while nowadays the CL runtime is considered small, at the time it was criticized for being too large.
I'm sure some of the implementors (though not Symbolics, of course) would have been unenthusiastic about their inclusion, but still I wish it had been done.
Their inclusion would have made staples like UNWIND-PROTECT, conditions, and the CATCH/THROW machinery underlying them much more difficult to implement, if not nearly impossible to get right.
I personally think that those language features are worth it and that this particular trade-off was well made.
Therefore to reply as if the GP were making a statement about O-notation is to reply to a point which was not made.
What GP means is slightly fiddly to define formally as you need to formalise the jump operation that processors can do in constant time, but informally I think it’s completely obvious what this means. It’s also obvious you can do this in C (with the extension that lets you make jump tables) or assembly for basically any modern ISA, so I think it is reasonable to complain that you can’t do it in CL.
But the array-of-closures implementation satisfies that requirement! reikonomusha's objection to that proposal is that even though the time for one branch is constant, there is a setup time dependent on the number of branches. This is true. The essence of my reply is that it is, however, unlikely to matter much in practice, because the whole thing still runs in amortized constant time per jump.
If you're doing something that's really so performance-sensitive that amortized constant time per operation isn't good enough for you, then you're probably going to have to work in C or assembler, because any higher-level language — even C++ — will have constructs with linear setup times (consider 'make-array' in CL, or 'std::vector' in C++).
> It’s also obvious you can do this in C (with the extension that lets you make jump tables)
Complaining that CL is deficient because a nonstandard extension exists in some other language that allows you to do something, or that you can do it in assembler, seems like a rather weak criticism. Besides, this seems to me more like a quality-of-implementation issue: as reikonomusha mentioned, it's entirely possible for a CL compiler to compile 'case' into a jump table, given reasonable constraints on the case values. I don't know which ones do this, if any, but it seems like the kind of thing the SBCL maintainers could be persuaded to do if someone had a clear need for it.
(defun leaves (tree)
  "Return a function that returns the leaves of TREE one by one when called repeatedly."
  (let ((work (list tree)))
    (flet ((schedule (x) (when x (push x work))))
      (lambda ()
        (loop for task = (pop work)
              until (atom task)
              do (schedule (cdr task))
                 (schedule (car task))
              finally (return task))))))
(defun same-fringe-p (t1 t2)
  (loop with l1 = (leaves t1) and l2 = (leaves t2)
        for x1 = (funcall l1) and x2 = (funcall l2)
        while (or x1 x2) always (eql x1 x2)))
That's surprising. Even the brain-dead compiler in my TXR Lisp (a dialect that has some things in common with Common Lisp) does it.
'(caseql x (1 'a) (2 'b) (3 'c) ((4 5 6) 'd) (7 'e) (8 'f) (9 'g))))
** warning: (expr-1:3) unbound variable x
0: 04020002 frame 2 2
1: A0800000 getlx v00000 0
2: 30810400 movsr v00001 d000
3: 24010003 gcall t003 1 v00000
5: 48000026 if t003 38
7: 38050004 movrsi t004 1
8: 38250005 movrsi t005 9
9: 24030003 gcall t003 2 t004 v00000 t005
12: 48000026 if t003 38
14: 38050005 movrsi t005 1
15: 24020004 gcall t004 3 v00000 t005
18: 54090004 swtch t004 24 26 28 30 30 30 32 34 36
24: 30810401 movsr v00001 d001
25: 44000025 jmp 37
26: 30810402 movsr v00001 d002
27: 44000025 jmp 37
28: 30810403 movsr v00001 d003
29: 44000025 jmp 37
30: 30810404 movsr v00001 d004
31: 44000025 jmp 37
32: 30810405 movsr v00001 d005
33: 44000025 jmp 37
34: 30810406 movsr v00001 d006
35: 44000025 jmp 37
36: 30810407 movsr v00001 d007
37: 30030801 movsr t003 v00001
38: 4C00002A ifq v00001 d000 42
40: 30020000 movsr t002 nil
41: 4400002B jmp 43
42: 30020801 movsr t002 v00001
43: 10000002 end t002
44: 10000002 end t002
2> (macroexpand '(caseql x (1 'a) (2 'b) (3 'c) ((4 5 6) 'd) (7 'e) (8 'f) (9 'g)))
(and (integerp #:test-0034)
(<= 1 #:test-0034
(sys:switch (- #:test-0034
#(('a) ('b) ('c)
('d) ('d) ('d)
('e) ('f) ('g)))))
(if (eq #:swres-0036
What you do is modify the individual cases so that instead of accessing the surrounding runtime lexenv directly, they go through reader and writer thunks. These thunks are then passed in as a separate argument to the lambda for each case. The table mapping case values to these lambdas is a compile time constant.
You might complain that this takes time proportional to the number of accessed objects in the lexenv, to set up the thunks. And that's true, but it's constant in the number of cases.
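A minimal sketch of that arrangement (names invented): the per-case lambdas are load-time constants; only the two thunks close over the lexical variable.

```lisp
;; The case table is constant: each entry takes reader/writer thunks
;; instead of touching the caller's lexical environment directly.
(defparameter *case-table*
  (vector (lambda (rd wr) (funcall wr (1+ (funcall rd))))
          (lambda (rd wr) (funcall wr (* 2 (funcall rd))))
          (lambda (rd wr) (funcall wr 0))))

(defun run-case (i)
  (let ((x i))
    (funcall (aref *case-table* i)
             (lambda () x)             ; reader thunk for X
             (lambda (v) (setf x v))) ; writer thunk for X
    x))
```

Setting up the two thunks costs O(number of captured variables) per call, independent of the number of cases, which is the point being made above.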
There is a working delimited continuation library, but it's not exactly trivial.
(go (aref #(a b c d) N))
Suppose you have a function that you know only returns fixnums but isn't declared as such, and you use its result as the input to a function declared to take only fixnums; then the compiler has to add runtime type checks. Even very cheap type checks can cause massive overhead if they are executed in tight loops.
Or suppose the fixnum x is given as input to an identity function: (id x). The compiler knows that x is a fixnum, but unless it is sufficiently smart it has no idea that (id x) is one too.
x: int = oh()
x + 3
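In SBCL, the usual way out is to declare the helper's type so the compiler can propagate it (a sketch; function names are invented):

```lisp
;; Declaring ID's signature lets the compiler derive that (id x) is a
;; fixnum, so no runtime check is needed at the call site.
(declaim (ftype (function (fixnum) fixnum) id))
(defun id (x) x)

(defun sum-to (n)
  (declare (type fixnum n) (optimize speed))
  (let ((acc 0))
    (declare (type fixnum acc))
    (dotimes (i n acc)
      (setf acc (the fixnum (+ acc (id i)))))))
```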
I suppose you could argue that someone might write a program that relies on being able to call a function with the wrong type of arguments and catch the resulting error. In this case removing the type checks obviously changes the behavior of the program. This however would not be portable across different CL implementations anyway, so you should really be using explicit type checks.
> But the difference is the frequency in which it is lost. In SBCL Lisp, if you want fast code, the answer is "almost all the time".
SBCL code is fast enough for the vast majority of a program even with full run time type checking, so you would only be using `(safety 0)` for very performance critical code, whose safety you have ensured yourself. The `(safety 0)` declaration also does not disable any compile time checking that SBCL does.
Think of soundness as a guarantee that if the program type checks, then all type declarations are correct. Note also that soundness is a property of the type system itself. A piece of code is not sound or unsound -- it is the type system itself that is.
* (defun id (f) f)
* (declaim (ftype (function (fixnum) fixnum) id))
* (declaim (optimize (safety 0)))
Let's define a function which calls ID with a string
* (defun use-id (s) (declare (type string s)) (id s))
; in: DEFUN USE-ID
; (ID S)
; caught WARNING:
; Derived type of S is
; (VALUES STRING &OPTIONAL),
; conflicting with its asserted type
; See also:
; The SBCL Manual, Node "Handling of Types"
; compilation unit finished
; caught 1 WARNING condition
This would be the case that I referred to in the second paragraph of my last comment. I would agree if we were talking about explicit type checks, but a portable Common Lisp program cannot consider implicit checks added by the compiler to be part of the program's semantics, because a compliant implementation is not required to add any type checks. SBCL just adds them as a debugging aid; if you rely on type errors being signaled, you have to check for them explicitly with CHECK-TYPE or something equivalent.
> Think of soundness as a guarantee that if the program type checks
This would be where our view is different. You consider soundness to be a guarantee given by the compiler, while I consider it a property of the program itself (which can sometimes be checked by the compiler, and sometimes by the programmer). Common Lisp does not give any guarantee of type safety. SBCL simply tries to be helpful and catch most real world type errors, but ultimately it is the programmer who must ensure the soundness of the program.
Hence if you want to talk about portable Common Lisp, you cannot achieve any type of soundness!
> This would be where our view is different. You consider soundness to be a guarantee given by the compiler, while I consider it a property of the program itself (which can sometimes be checked by the compiler, and sometimes by the programmer).
Your view is wrong. Read about type system soundness here https://papl.cs.brown.edu/2014/safety-soundness.html
Absolutely false; (declare (optimize (speed 3) (safety 1))) is perfectly valid.
If you know, you can declare it. You can drop a (declaim (ftype ...)) anywhere you want, it doesn't have to be next to the function. You can also declare it locally, near the call site. Finally, you can always use `the', as in (the fixnum (yourfunction ...)), which declares what the return value at a given point is going to be.
As 'juki said, when you add (declare (optimize (safety 0))), SBCL will happily drop runtime type checks within the scope of the declaration. You can do that in as small a scope as possible, even doing something like this:
(defun bar (y)
  (let ((x (locally
             (declare (optimize (speed 3) (safety 0)))
             (the fixnum (foo y)))))
    ;; ... use X here, with normal safety settings back in effect ...
    x))
My problems so far are:
- lack of (loop), (dotimes), etc.
- lack of (restart-case), (invoke-restart), etc.
- REPL feels extremely different from SBCL's one (or emacs + SLIME + SBCL), it is definitely non-Lispy
- inconsistent naming (e.g. see (path-get-extension) and (file-name-from-path) where in other libraries it is "filename" not "file-name" hence the inconsistency, and the "get" in (path-get-extension) seems odd. There is a deprecated (filename-extension), I think sticking to that would be better. I know that backwards-compatibility is the issue but cannot we just issue a warning for a few releases instead?)
- (time) is much less informative than SBCL's (time)
- this is really subjective, but I prefer (progn) over (begin)
In my opinion Racket is better in some regards than Common Lisp, but it still has its warts. Common Lisp has its own warts. I really wish that these issues were to be resolved somehow in Racket so I could have my "perfect" language. :(
Make an issue with regards to the filename inconsistency. I hadn't noticed before, and had to look it up. The confusion stems from the fact that both "file name" and "filename" count as correct spellings (sigh).
With regard to `time`, take a look at the `benchmark` package:
I don't know how its features compare to the real thing.
I am using https://github.com/samth/disassemble for now.
I kind of wish the author had named his type defun `defunct` though!
The interesting question is: what does that type checker actually check? How does it deal with normal lisp functions? How is non-trivial data represented? And what happens when it stumbles upon a runtime macro?
And, on a more practical side, how does it access the environment?
The big caveat with this post: while types and type declarations are part of the standard, what SBCL does to infer types beyond what's declared, or to use them for compile-time warnings and optimized assembly generation, is up to the implementation. Implementations other than SBCL (or older SBCL versions, or programs that declare things like (safety 0) to turn off type checks) may behave differently.
Normal lisp functions can have more restrictive types than T, too. Typing (describe 'symbol) for a function will have SBCL say (among other things) that the symbol names a compiled function with a certain argument list, declared type, and derived type. (There's a distinction because I might have declared the function parameter a to be type (integer 0 255) which SBCL simplifies to (unsigned-byte 8).) SBCL has a convenient way to get the derived type info as a list that might be useful for macros or editor extensions, it works for your functions or standard functions e.g. the standard string-to-uppercase function as provided by SBCL has the type: (sb-introspect:function-type #'string-upcase)
(FUNCTION
 ((OR (VECTOR CHARACTER) (VECTOR NIL) BASE-STRING SYMBOL CHARACTER) &KEY
  (:START (MOD 4611686018427387901))
  (:END (OR NULL (MOD 4611686018427387901))))
 (VALUES SIMPLE-STRING &OPTIONAL))
You can define your own types with DEFTYPE for non-trivial data (or even trivial data, like an enum type: (deftype valid-color () `(member :red :green :blue))). Depending on how non-trivial it is, this may impair optimization opportunities or the compiler's ability to notice type mismatches (though you can tell the compiler to give you a notice about type uncertainties), so you're back to runtime checking.
Runtime macros talk to the compiler to generate code, same as compile-time macros. The compiler never goes away at runtime (without some effort in extracting it from the final deliverable, anyway) -- COMPILE is a standard compiled function -- which makes it possible to, e.g., interactively debug an erroring function and recompile a fixed version that everything will now call instead, all from 100 million miles away back on Earth.
What do you mean by "the environment"?