Hacker News new | past | comments | ask | show | jobs | submit login
The empty list (tfeb.org)
90 points by todsacerdoti on Dec 17, 2022 | hide | past | favorite | 105 comments



I like that the article was non judgement. The real world is messy and Common Lisp formalized existing practice among several branches of an already old language from before programming language theory.

There are a number things worth “fixing” and it’s delightful that so much interest and implementation in this regard has been sustained for so long.

But to the ideologues, I can only quote Emerson: "A foolish consistency is the hobgoblin of little minds, adored by little statesmen and [programming] philosophers and divines."


To help someone learn to use a language, we should be judgemental and call special attention to all of its shortcomings and gotchas. I don't see value in sweeping them under the rug. That kind of language advocacy is a zero-sum game (or negative-sum).

For example:

> ... it is not at all strange that there is an object whose type is both list and symbol.

Actually, it is strange, and unfortunate. One should think about examples of code that might store different types of values in a variable or data structure, using e.g. `symbolp` and `listp` to distinguish the different representations, and consider how the collision between the symbol nil and empty list could produce unexpected behavior. One must be aware and remain vigilant.

> the things which are not necessary are that it be a symbol, and that it represent falsity.

Other non-necessary things: `car` and `cdr` special cases for `()`, self-evaluating `()`.

> CL requires precisely one implementationally-weird object, while Scheme requires two, or three if you count #t

There is nothing intrinsically weird about having an empty list object. The only thing weird about it is the behavior of `car`, `cdr`, and `symbolp` in CL.

Also, there is nothing weird about having a proper boolean type. On the other hand, if CL's `(type-of 't)` actually returns `boolean` and not `symbol`, as the article indicates, then that is indeed weird.


If you want that kind of clarity, use another lisp, like Scheme, that doesn't have the legacy quirks. There are plenty to choose from that make different choices.

That's kind of my point.


Lisp dialects that have deviated from these choices require the programmer to write hideously verbose code, unable to take advantage of the economic idioms they afford.


What useful idiom is afforded by nil being a symbol?


There are a number of benefits.

One is that, in the role of the list terminator, nil effectively serves as a symbol. It is unique, and tested for its identity: if the cdr of the cell of a list is the nil object, then that is the last cell. This is how symbols are used. Anything that is exploited for its unique identity should be a symbol.

Nil isn't necessarily implemented the same way as other symbols. It could actually be a null pointer. One way to look at it is that something now known as the GoF Null Object Pattern is at work. All the functions which take a symbolic argument, such as symbol-name, symbol-value, symbol-package, ... are made to work with a nil argument. Just like (car nil) is made work with a nil argument, even though nil isn't a cons.

Being able to pass nil to symbol functions makes some things work that otherwise wouldn't, which can reduce the code. All the special case testing (if necessary) is done in the library already.

Say we have a hash table H which maps keys to symbols.

We can do, say, (symbol-value (gethash K H)). If K is not found in the table, gethash will return nil. Because nil is a symbol, (symbol-value nil) works fine and returns nil. If that's what we want in the not found case, it keeps our code short and sweet.

Whenever some slot or variable defaults to nil, it's not only defaulting to a false value, and empty list, but to a valid symbol object that can be passed to symbol API's.

If the empty list weren't a symbol, it could be notated (), and nothing but that. That notation works grewat when we are dealing with the empty list semantics. It's not so nice when () is denoting false, other than () kind of looking like a zero. When () denotes the bottom type in the type system, things are also not nice; all other types are either symbols or expressions like (or integer string). It's much nicer to have a word like "nil" for all these roles.


Personally I feel about Scheme's distinction between false and the empty list the way Schemers feel about CL's distinction between variables and functions.


Why? It is largely inconsequential. You save a line or two, but there are bigger fish. Clos, contitions and restarts, and types change how you actually write code. Nil being false doesn't mean anything. You can save a line of code when iterating over a list, which is handy, but it lacks meaning.

Multiple namespaces are a nuisance for about one day until you have gotten used to them.


An annoying thing about `nil` is when it stomps a namespace in which you're using symbols, such as in a minilanguage, by being read as the nil value rather than as the symbol `nil`.

Besides it being obvious, I actually ran into this in a real-world situation, with some super-rapid work on a dataset of only about 100 entries, where it turned out one user's real name had the initials N.I.L.


You should not have been reading that in a package that inherited from COMMON-LISP. Did you really want user names that happened to be COMMON-LISP symbols to collide?


That time was in Emacs Lisp. And it was very rapid-convenient to be using symbols.

When I moved to Scheme, I liked that it didn't have that particular collision. Though quasiquoting had some collisions unnecessarily (when they already had precedent of using the `#` character to introduce special values).


The section on the `t` value could also have added that in the Common Lisp standard, there are a great many functions that returns a boolean value. When they return a true value, all that is required is that they return a value that isn't nil. This could be `t`, and often is, but is not required to be.

There are only a few forms that are required to return `t` for true, calls to the functions `not` and `null` being two of them. So, to coerce a true to `t`, one does `(not (not <form>))`.


I've been trying to learn CL by using it to do AOC. As someone coming from mostly Python, there is a lot about it that confuses and, occasionally, frustrates me.

Maybe I'm just missing something obvious but it doesn't seem like you can start with an empty list that you can repeatedly append to. The empty list is nil, and `(append nil foo)` seems to just yield `foo`, not `(list foo)`. So trying to append to that object will (probably) raise an error.


In LISP, the cheap / straightforward operation is attaching things to the front of a list.

Appending to the rear of a list is (usually) going to be the more expensive operation (naively, doing so requires walking all the way to the end of the list, because lists are more like linked list constructs than arrays that silently resize as needed).


`append` on lists is equivalent to concatenation, not Python's append which is equivalent to `vector-push` or `vector-push-extend` in Lisp (and which, as the name suggests, takes a vector for the place and not a list). For performance reasons, the more conventional behavior in Lisp to collect values would be:

  (let ((some-place nil))
    (push an-item some-place)
    ...
    (nreverse some-place)) ;; if the order really matters, you see the same pattern in Erlang and others
Pushing (and popping) work on the front of lists. Counterintuitively, vector-push and vector-pop work on the end of the vector (though the rationale around performance is the same,). By pushing/popping on the front of the list the action can be done in constant time. `push item place` is equivalent to this whether it's precisely this under the hood for an implementation or not:

  (setf place (cons item place))
Pop is this:

  (prog1
    (car place)
    (setf place (cdr place)))
(Maybe I've been spending too much time in the hyper spec, my equivalence examples could have been copy/pasted from it and I wrote them before double checking.)

Pop: http://clhs.lisp.se/Body/m_pop.htm#pop

Push: http://clhs.lisp.se/Body/m_push.htm#push


> For performance reasons, the more conventional behavior in Lisp to collect values would be

What you describe is what a human typical would write.

For tools this is optimized. For example in (loop for e in list collect (oddp e)) this usually would be optimized. It would not add to the end and then reverse. Instead, it would keep a pointer to the end and add to the end via that pointer.


Would the code to maintain a pointer on the last element look something like this?

(let* ((list (list 1 2 3 4))

       (last (last list)))

  (rplacd last (list 5 6))

  (setf last (last last))

  (format t "list: ~a~%last: ~a~%" list last))

=> list: (1 2 3 4 5 6)

last: (6)


in an iteration:

  (let* ((rlist (list '()))
         (last  list))

    (flet ((collect (item)
             (setf (cdr last) (list item)
                   last       (cdr last))
             item)
           (finish ()
             (cdr rlist)))

      (dotimes (i 10)
        (collect i))

      (finish)))


  (let* ((rlist (list '()))
         (last  rlist))

    (flet ((collect (item &aux (cons-cell (list item)))
             (setf (cdr last) cons-cell
                   last       cons-cell)
             item)
           (finish ()
             (cdr rlist)))

      (dotimes (i 10)
        (collect i))

      (finish)))


Thank you for the reply!

And I see that even in my original attempt, I could have used (setf (cdr)) instead of (rplacd)...


I think what you want to do is

  (cons 'foo nil)
It's in the oposite order. It starts with the empty list and it adds foo in the head.

An extension is

  (cons 'bar (cons 'foo nil))
that is equivalent to

  (list 'bar 'foo)


Lists in python are what are called vectors in almost every other language.

You can make a resizable vector in CL and use vector-push-extend to do things just like python.


APPEND is for concatenating two lists. If you're trying to add a value, then if your NIL is already in a variable, use PUSH to add the value (to the front). This will build the result list in reverse, so you can NREVERSE it at the end.


There’s a relatively straightforward deque implementation too if you need to append in addition to pushing. It can require a little rebalancing, but I imagine it amortizes to constant time.


The easiest way to think about this, as Tim mentions in his article, is that in Common Lisp, you can write nil as '(), so if foo is defined as the list (d e f), (append nil foo) is equivalent to (append '() '(d e f)) which more obviously evaluates to (d e f), just as (append '(a b c) '(d e f)) evaluates to (a b c d e f). It's better to think of Lisp append's first argument being appended onto its second: (append '(a b) '(c d)) => (cons 'a (append '(b) '(c d))) => (cons 'a (cons 'b (append '() '(c d)))) => (cons 'a (cons 'b '(c d))) => (cons 'a '(b c d)), which evaluates to (a b c d). The lists '(a b) and '(c d) are still unaltered at the end of the computation, and a new list '(a b c d) is returned by the call to append.


> Maybe I'm just missing something obvious but it doesn't seem like you can start with an empty list that you can repeatedly append to

You can, but appending some list L and an empty list, or an empty list and L yields L in either case; there's nothing to append. Subsequent appends will be more interesting, e.g.

    CL-USER> (append '(x) (append '(y z) '()))
    (X Y Z)
> The empty list is nil, and `(append nil foo)` seems to just yield `foo`, not `(list foo)`

Right. Conceptually APPEND appends lists and not bare elements; try (append '() '(foo))


There's a huge amount of documentation telling you that NIL is an empty list.

It isn't. It has none of the mechanics of an empty list in Python etc.

It's more like a null link, and has quite a bit in common with /0 as a string terminator.

It does different things on its own and in the context of a list.

The key is that Lisp list items are stored as car (link/pointer to an item) and cdr (link to the next item/s in the list) pairs.

Lisp's dot notation makes this explicit, but it's hidden behind syntactic sugar because it's messy and hard to read.

     (a b c) is really (a . (b . (c . NIL)))
Each dot shows the cdr of the preceding car.

On its own NIL is just a constant. It has no listy features - specifically no slots for car or cdr values. Although you can do things like (length NIL) you can't change it. It's simply not a list.

You can add car/cdr pointers to it with cons. Now you can use NIL in a list.

You can use NIL as a cdr list terminator. (something . NIL) means the list is over. There are no more cdrs after it.

You can use NIL as an empty car placeholder. (NIL . something) means the car has no value, but the list continues onwards with a cdr link.

In your example (append NIL foo) actually returns (foo . NIL). NIL is being used as a cdr terminator. Lisp hides the ". NIL" because syntactic sugar.

If you (append NIL foo) again, you still get (foo) because NIL already terminates the list and there's no reason to add another NIL after it.

(push foo NIL) throws an error because push is destructive and is attempting to change NIL. Which isn't allowed.

(cons NIL a) gives you (NIL a) because NIL is being prepended as an empty car pointer. Which is how you get NILs into a list without it collapsing around itself. It's just like any other list item, except it has a null value. The list can continue past it into further cdrs in the usual way.

Anyway. This is confusing because you have a single symbol doing three different things in three different contexts. (Not even counting its use as a Boolean...) Worse, syntactic sugar and the various function internals hide this from you. And the documentation tells you something that isn't true.

So unless you look at the source and/or learn how the various functions understand and use NIL you will be confused.

The up side is a REPL is interactive and easy to play with, so it's not a huge effort to experiment and see what falls out.


> NIL is an empty list.... It isn't. It has none of the mechanics of an empty list in Python etc.

I can see how NIL is confusing when coming from Python, but I think this comment has it backwards. The idea of lists as chains or trees of pointers with a distinguished termination value predates LISP by a few years. Moreover, this kind of list predates Python's use of the term by more than 35 years. If you treat lists as just a particularly useful subclass of trees of CONS cells, then they and NIL make perfect sense. Python's lists are fine too, they're just a different kind of thing.


What is AOC? It’s hard to keep up with all the acronyms.



Thanks I was just on that website!


append concatenates lists so you'd need (append nil '(foo))


It actually leaves both its arguments unaltered and creates a new list. There's another function called nconc which does concatenate them, changing its first argument to the new list (unless its first argument is NIL).


Note that append does not copy the last argument, so there is structure sharing when you use it.


I don't see the benefit of the Scheme approach at all. As far as I'm concerned, the empty list being also the false object makes certain list functions more elegant to implement and it is mostly irrelevant elsewhere.


> It is necessary that there be a special empty-list object which is a list but not a cons.

That is not true. NIL could be a CONS. In fact, it acts just like a CONS whose CAR and CDR are itself. The only way in which NIL does not behave like a CONS is that it does not answer true to CONSP. But that doesn't matter because it answers true to NULL, so you can build a CL-style CONSP if you want it by checking for (AND (CONSP thing) (NOT (NULL thing))).

It is not even necessary that NIL be unique. The only thing that is necessary is that there be some non-empty set of distinguished objects with a predicate to test for membership in that set which by convention designate the ends of lists.


If NIL were a CONS, how would you distinguish between an empty list and a list of length one containing NIL?


The same way you do now: (NIL) would not answer true to NULL. And it would not be EQ to NIL.


If NIL were a CONS whose CAR is NIL and whose CDR is NIL, wouldn't NIL be the same as (CONS NIL NIL) = (NIL . NIL) = '(NIL)?


It depends on what you mean by "the same". It is already "the same" in the sense that the CAR and CDR of both NIL and (NIL) are all NIL. They just aren't EQUAL, despite the fact that their CARs and CDRs are EQUAL (in fact, they are all EQ).


Technically not, because a cons of two objects is not eq to another cons of the same two objects.

  (eq (cons 1 2)
      (cons 1 2))
  ;; => nil
It's memory locations, after all.


I see, fair enough. I forgot to switch out of pure functional programming/referential transparency mode for this discussion. Fine, you can take NIL to be one particular (NIL . NIL) that's different from all other (NIL . NIL)s, the only particular (NIL . NIL) that is considered a list of length 0 while all other (NIL . NIL)s are lists of length 1.


I tend to agree but would be happy to be proven wrong by more knowledgeable folks in these comments.

However I assume here we are talking about dotted lists and not ‘proper’ ones?


No. Dotted lists are just a notational convention. All lists can be written in dotted notation.

You can define NIL as follows:

    (setf NIL (cons 0 0)))
    (setf (car NIL) NIL)
    (setf (cdr NIL) NIL)
Then:

    (defun null (thing) (eq thing NIL))
You also have to add a bunch of special cases to other functions:

    (defun cl-style-consp (thing)
      (and (consp thing) (not (null thing)))
    (defun cl-style-symbolp (thing)
      (or (symbolp thing) (null thing))
    (defun cl-style-symbol-name (thing)
      (if (null thing) "NIL" (symbol-name thing)))
and a few others.

In fact, many CL implementations actually implement NIL that way so that CAR and CDR of NIL return NIL without having to make that a special case.


In other words

    CL-USER> (equalp '(1 2 3)
                     '(1 . (2 . (3 . nil))))
    => T


You are conflating implementation tricks with language semantics. In Lisp, NIL is always an atom, never a cons.

Also, a dotted list is a nonempty list where the cdr of the last cons is not NIL. It is not a notational convention. A list (a b c) is not a dotted list, even if you write it as (a . (b . (c . nil))).


> You are conflating implementation tricks with language semantics.

No, I'm not.

> In Lisp, NIL is always an atom, never a cons.

That depends on what you mean by "atom". If by "atom" you mean something that answers true to the ATOM predicate then yes, NIL is an atom. But if by "atom" you mean something that produces an error if you try to call CAR or CDR on it then no, NIL is not an atom, it is equal to (CONS NIL NIL) because (CAR NIL) and (CDR NIL) are both NIL.

So the result of (ATOM NIL) is an arbitrary choice. And CL arguably got it wrong because it fails to preserve the invariant that if (ATOM X) is true then (CAR X) and (CDR X) will produce errors.

[UPDATE]

> a dotted list is a nonempty list where the cdr of the last cons is not NIL

But this is just terminology. In CL, NIL behaves exactly the same as (NIL . NIL) with respect to CAR and CDR.

Indeed, it is exactly this confusion that is the basis for a lot of criticism of CL's design. In Scheme, the empty list is unambiguously atomic: trying to take the CAR or CDR of the empty list in Scheme is an error, as with any other atom.


When baby Lispers are born, the first thing they are taught is a dichotomy: the universe is split into conses and atoms. Cons cells are simple and composed of two components, which are called the car and cdr. There are functions to retrieve what's stored in these components, which are called CAR and CDR. Atoms may be as complex as you like. They include objects like numbers, characters, strings, arrays, symbols, etc. NIL is not a cons cell, but a symbol. We can represent lists by chaining cons cells. We may start with an empty list, and by convention this is the symbol NIL. If we also choose to designate NIL as the false value, and everything else as true values, then it is useful to modify CAR and CDR so that they take not only conses, but also the symbol NIL, and return NIL, which is false, and the empty list. We then say that CAR and CDR take a list, i.e. an object of type (or cons null). We do not say that NIL is a cons.

Your [UPDATE] shows that you still don't understand what is meant by "dotted list". I already gave a definition of one, but did not give an example. An example of a dotted list is (a b . c) i.e. the last cons has a cdr that is (i) an atom (otherwise, it wouldn't have been the last cons) and (ii) not NIL (which is the conventional empty list designation).


(a b . c), as an object is an improper list.

The printed notation is a dotted list.

By an informal metonymy, the internal object is called a dotted list. It's only an informal usage among Lisp coders. The correct terminology is "improper" for the object and "dotted" for the spelling.

Note that (a b c . nil) is dotted, but it's the same as (a b c) which is proper.

Unfortunately, the Common Lisp specification encodes the "dotted list" informality in the Glossary. Common Lisp does things like that.

Furthermore Common Lisp uses "dotted list" as an essential term denoting a subset of of "improper list". An "improper list" is circular or not terminated by nil. A dotted list is only the latter.

That's in spite of the fact that a circular list will print with the dot notation: #1=(a b c . #1#)! Circular lists are dotted when completely printed, under the circle notation.

Another problem is that the append function and others support the idea of a non-nil atom being an empty dotted list. For instance (append '(a b c) 'd) will work and produce (a b c . d). Yet the "dotted list" definition excludes such an atom. If we go by the presence of a dot, that is correct, but then we know from circular lists not being dotted that that isn't the criterion.


Right, I just wrote about it in my reply to "lisper". The name "dotted list" may not have been the best choice, but that's not an outlier in human history.


You don't need to be quite so condescending. I understand all this perfectly well. But the concept of a "dotted list" has to do with how a list is serialized, not how it is represented in the internally, which is what I am talking about.

NIL is weird in CL and different implementations handle this weirdness in different ways. One way to handle it is to represent NIL as a symbol whose name is "NIL" and write CAR and CDR to recognize when they see this symbol and return it. Another way to handle it is to represent NIL as a cons cell whose CAR and CDR point to itself and write SYMBOLP and SYMBOL-NAME to recognize when they see this privileged cons cell and return T and "NIL" respectively. (There are a lot of other special cases -- this is not an exhaustive list.)

However you slice it, the concept of a dotted list is a non-sequitur because that has NOTHING to do with how NIL is implemented internally, which what I am talking about.


The name "dotted list" comes from how the list is serialized, yes, but not the concept. You can write a function dotted-list-p that takes an object and returns the value of (and (consp object) (cdr (last object))). Indeed, it is not related to the NIL issue, and I made no such assumption. You can read an equivalent definition that uses different words in the CLHS glossary.

You admit to talking about "how NIL is implemented internally", hence my original remark that you are conflating implementation tricks with language semantics. From language semantics point of view, NIL is not weird, it is an ordinary symbol, like NIK or NIM. Lisp tradition assigns it certain roles, like representing the empty list, representing the false value, representing the empty type, and also has conveniences like having NIL evaluate to itself or having CAR and CDR take lists instead of conses.

I didn't mean to be condescending. I know you are not a baby Lisper. The matter at hand is very basic, and I understand the desire to present a more sophisticated take, but believe it leads to (and reflects) a distorted ontological view of Lisp if taken seriously. NIL is not weird, it is a simple symbol. We just assigned it a few roles and made a few conveniences. Why did we pick it for those roles? Arbitrary choice. Why did we choose to extend the domain of CAR and CDR? Practical choice. Let the puritans complain and build their own ivory tower languages. Lisp is pragmatic. Could an implementor choose to represent NIL as a closet cons for simple CAR and CDR implementations and special case everything else? Sure. But this has nothing to do with the ontology of Lisp.


> having CAR and CDR take lists instead of conses.

(car nil) and (cdr nil) safely returning nil was introduced in InterLisp, according to Gabriel's HOPL palper. At some big Lisp summit in the early 1970's, InterLisp decided to adopt MacLisp's readtables, and MacLisp adopted (car nil) -> nil.

Why the empty list is a symbol is natural: math uses symbols to refer to such things. For instance the empty set is notated both {} and ∅: empty braces or the special null set symbol.

I mean, why have symbols refer to things in a language that is consciously oriented toward symbolic processing, whose initial creators were people with math backgrounds? It's pretty much a forgone no-brainer.


Having a symbol DENOTE the empty list and having a symbol be THE SAME OBJECT as the empty list are two very different things. The former is perfectly reasonable. The latter causes no end of difficulties.


The former is somewhat reasonable for having pi denote 3.14159... and such.

It brings in its own difficulties. If nil is a variable/constant which evaluates to some nil object, then to talk about nil itself, we have to quote it.

The object which it denotes doesn't print as nil; it has its own printed rep like () and we will end up seeing that printed rep and using it. So then we have two nil representations to deal with: the nil variable which we can use in evaluated contexts, and the literal nil like () that we use elsewhere.

If () isn't self-evaluating (like the criminally stupid design in Scheme), we have to quote it: '().

If we have additional semantic roles for () like it being Boolean false and the bottom of the type spindle, those uses are not nicely served by the () notation, from an esthetic point of view.

The list terminator is de facto a symbol because it's exploited for its identity: we care whether the cdr of a cons is or is not nil, and that's all. That's a symbolic behavior; we might as well complete things so that the symbol functions work with nil; it can have a property list, name and so on.


> to talk about nil itself, we have to quote it.

So? That's no different than if you want to talk about 'pi rather than pi a.k.a. 3.14159...

> So then we have two nil representations to deal with

No different than "pi" and "3.14159".

> If () isn't self-evaluating (like the criminally stupid design in Scheme), we have to quote it: '().

I agree, () should be self-evaluating just like numbers and vectors. The behavior of () should be analogous to the behavior of 0 or 0.0 or #() or "".

> If we have additional semantic roles for () like it being Boolean false

And why would you want to do a stupid thing like that?

> The list terminator is de facto a symbol because it's exploited for its identity

First, the list terminator need not be the same thing as the empty list. The list terminator is an implementation thing. The empty list is a language-semantics thing. These need not be the same.

Second, neither of these need to be unique. There can be multiple list terminators, and there can be multiple empty lists, just as there can be multiple empty vectors and multiple empty strings and even multiple instances of the same number (which actually happens with bignums).

> we care whether the cdr of a cons is or is not nil

No, what we care about -- or at least what we should care about -- is whether the CDR of a cons is an (n.b. not the) empty list. (eq #() #()) need not be true (in fact, generally isn't). Why should (eq () ()) be any different?

And BTW under no circumstances should taking the CAR or CDR of an empty list do anything other than signal an error.


> First, the list terminator need not be the same thing as the empty list.

No it doesn't, but that choice happens to give us a compact recursive definition.

> There can be multiple list terminators ... multiple empty lists ...

Sure, and 2022 can be written MMXXII, and whatnot.

Mathematically, there is one empty list, so why proliferate it?

There can be multiple empty strings which is useful if strings are mutable. If strings are immutable, it's silly to have more than one empty string.

The empty list is immutable, so ...

> under no circumstances should taking the CAR or CDR of an empty list do anything other than signal an error.

Lisp 1 and 1.5 had it that way, certainly. It's mostly just inconvenient. A good mix is to have strict vectors and strings which signal on out-of-bounds, but lists which are forgiving. Forgiving lists allow good old Ashwin Ram to have:

  (cdr (assq key a-list))
rather than

  (let ((val (assq key a-list)))
     (cond ((not (null? val)) (cdr val))
          (else nil)))


> If strings are immutable, it's silly to have more than one empty string.

I guess Common Lisp is silly then.

    Clozure Common Lisp Version 1.12.1 (v1.12.1-10-gca107b94) DarwinX8664
    ? (eq "" "")
    NIL
> (cdr (assq key a-list))

Much better to have an abstract associative map (dictionary) type with an opaque implementation rather than punning cons cells (which locks you in to an O(n) implementation). ALists are interesting from a historical point of view but they should never be used in modern code without hiding them under a layer of abstraction.

And even if you are going to pun cons cells to build an associative map, alists are the wrong way to do it because it forces you to duplicate the keys for every frame. Much better to use D-lists ((key1 key2 ...) val1 val2 ...) because that lets you re-use the key list, which can cut your memory usage in half, and provides a much more straightforward path from interpreter to compiler for pedagogical purposes.


Strings aren't immutable in Lisp, so there isn't a huge benefit to making (eq "" "").

It may be undefined behavior to modify "", but it's not the same thing. Suppose that mutating "" signals an error; that still leaves the problem that some other string which you are allowed to mutate can be mutated empty, yet is a distinct object from that empty string.

Moreover, though every string has "" as a suffix, it's not by way of pointing to a "" object. A unique "" wouldn't serve the role of terminator.

A language with immutable strings could intern all strings, so they are de facto symbol, and then exact string comparison is eq. That implies there is only one empty string object.


> Strings aren't immutable in Lisp

Neither are lists. Only the empty list is immutable, and likewise empty strings. It really is a completely equivalent situation. There is no principled reason that the empty list should be unique and the empty string not.


Yes there is a principled reason. If we accept that we have a list which is recursively defined using a binary cell aggregate structure, as a right-leaning tree shape, then it is advantageous to have an atomic, unique empty list at the bottom of the recursion. That atomic empty list can be represented as a machine word. We can perform a single comparison to detect the terminator: is it that object or not? Anything else, though workable, is a gratuitous complication.

The empty string is mutable. Even the literal one is potentially mutable: you can try it at your own risk. You ca make a mutable empty string with (make-string 0) or by mutating a non-empty string.

Any mutable string can be mutated to make it empty:

   (delete #\a (make-string 3 :initial-element #\a)) -> ""
Since strings aren't linked structures with a terminator, the comparison is moot.


> The name "dotted list" comes from how the list is serialized, yes, but not the concept.

Yes, it does. Dotted lists are entirely related to serialization. The only reason they matter is that, by convention, (a b ... z) is a shorthand notation for (a . (b . (... (z . nil)) ...) and (a b ... z . anything-but-nil) is a shorthand notation for (a . (b . (... (z . anything-but-nil)) ...)

> You can write a function dotted-list-p

Of course you can. So what? You can write a predicate for any (computable) property. I can write a function list-ends-in-3-p. That doesn't mean that lists that end in 3 matter.

The reason "dotted lists" are called dotted lists is entirely because of their serialization behavior: dotted lists have a dot in their serialization and non-dotted-lists don't. And the reason that the I/O behavior is what it is is that it turns out that punning cons cells as linked lists is a useful hack (or at least it was in 1958). But it is only a hack. There is no reason that the data structure used to represent pairs has to be the same data structure that is used to represent linked lists. Indeed, one could argue that this punning is actually a serious mistake. There should be pairs, with CAR and CDR fields which can take on any value, and there should be (linked) lists, with a FIRST field that can take on any value, and a REST field that is restricted to only contain another linked list (including a privileged empty list object). In such a design, the whole concept of "dotted list" would be non-sensical. You could still write (a . nil) or (a . ()) but that would no longer be the same object as (a), the former being a pair and the latter being a list.

So you see the concept of dotted list is rooted entirely in an I/O hack that John McCarthy invented back in 1958 so he could build linked lists out of punned pairs rather than make them a separate data type.

> NIL is not weird, it is an ordinary symbol, like NIK or NIM

No, NIL is not an "ordinary symbol". NIL is both a symbol and a list. NIL answers T to LISTP. No other symbol does that. You can call CAR, CDR, LENGTH, ASSOC etc. on NIL and not get an error. You cannot do that with any other symbol.

NIL is the only symbol to which you cannot give a function binding. In some implementations, attempting to do so triggers its own error message:

    Clozure Common Lisp Version 1.12.1 (v1.12.1-10-gca107b94) DarwinX8664
    ? (defun nil () t)
    > Error: Using NIL as a function name is silly.
Also, NIL answers T to LISTP but not to CONSP. So you can call CAR and CDR on it but not RPLACA and RPLACD. And, at the risk of stating the obvious, NIL answers T to NULL.

NIL is the only object with these properties. That is the very definition of "weird". (And note that I have made no reference to implementation details here.)

> The matter at hand is very basic

No, it isn't, or we wouldn't be arguing about it.


The point was that dotted-list-p doesn't have anything to do with the "serialized form" (printed representation) of the list. Dotted lists are important because they are one of the two types of improper list, the other type being circular lists. Many Lisp functions work on proper lists.

Of course NIL is an ordinary symbol. It has roles, the same way other symbols have roles. Are you saying the symbol &BODY is not an ordinary symbol because it is a lambda list keyword? Are you saying the symbol T is not an ordinary symbol because type boolean is (member t nil)? Are you saying symbol * is not an ordinary symbol because it is a special variable bound by the REPL, it is used in declaration syntax, and also is the name of the product function? Are you saying MUMBLE:VAPORIZE is not an ordinary symbol because it names the most dangerous operation in the mumble library? They are all ordinary symbols, sometimes with unique roles. Because NIL was chosen to satisfy the role of the empty list, is it any wonder that type list is (or cons null), type null is (eql nil), LENGTH and ASSOC and other list functions can deal with it etc.? Of course not. That doesn't make NIL qua symbol any weirder than any other symbol that has particular roles in a given context.

There are many symbols in Common Lisp that you cannot DEFUN. See section 11.1.2.1.2 in the hyperspec. Since NIL names a constant variable but not a standardized function, macro, or special operator, you can bind it to a function lexically using FLET or LABELS.

Once again, NIL is not a cons, it is an atomic symbol that also represents the empty list, so it's no wonder that LISTP returns true and CONSP returns false. CAR and CDR take lists (again see their entries in the CLHS) so it's no wonder that they can work with it. RPLACA and RPLACD take conses only so it's no wonder that they don't. At the risk of stating the obvious (again and again) the type null is (eql nil) so it's no wonder that NULL returns T for it. That operators treat a symbol specially does not make a symbol weird. It merely means that it serves a particular role. Any symbol can take any number of roles within a program. That doesn't make the symbol weird. I will repeat once more, because I've a feeling you didn't get this. It doesn't make a symbol weird.

It is a basic matter, definitely. I learned all this stuff about conses, atoms, symbols, lists, etc. very early on, as baby Lisper. I don't know why someone who spent 35 years programming Lisp should have trouble understanding all this, and has to keep bringing up new irrelevant/incorrect claims instead of just opening a beginner's Lisp book and re-reading the first chapter or two, where they talk about all this stuff.


> dotted-list-p doesn't have anything to do with the "serialized form" (printed representation) of the list

It does: the details of the serialization are the reason that the concept of "dotted list" even exists. At the risk of belaboring the obvious, dotted lists are called "dotted lists" because there is a syntactically-significant dot in their serialization.

> There are many symbols in Common Lisp that you cannot DEFUN. See section 11.1.2.1.2 in the hyperspec.

It is not that you cannot defun them, it is that this is undefined behavior. At least one implementation (CCL) lets you defun just about everything except NIL:

    ? (defun &body () t)
    &BODY
    ? (&body)
    T
    ? (defun t () nil)
    T
    ? (t)
    NIL
But, as noted earlier...

    ? (defun nil () t)
    > Error: Using NIL as a function name is silly.
> Since NIL names a constant variable but not a standardized function, macro, or special operator, you can bind it to a function lexically using FLET or LABELS.

NIL's lack of weirdness in this one regard actually seems pretty weird to me. IMHO it would actually makes sense to prohibit defining a function whose name is the same object that designates an empty list, just as it makes sense to prohibit defining functions whose names are (say) numbers.

> That operators treat a symbol specially does not make a symbol weird.

That all turns on how you define "weird". One dictionary definition of "weird" is "strikingly odd or unusual, especially in an unsettling way; strange". The fact that the CAR and CDR of NIL are both NIL seems weird to me. The fact that NIL is the canonical boolean false is NIL rather than F seems weird to me. (Actually, the canonical booleans, if they were going to be symbols at all, should have been :T and :F or :TRUE and :FALSE. But that's a different discussion.)

> Are you saying the symbol &BODY is not an ordinary symbol because it is a lambda list keyword?

Yes. Lambda-list keywords are weird. They behave differently from X and Y and Z and BAZ and BAR and BING and BIFF and BOFF and ORDINARY-SYMBOL and even WEIRD-SYMBOL. The same is true for symbols interned in the keyword package. All of these things are weird.


But NIL is not equal to (CONS NIL NIL).

(CONS X NIL) = '(X), a list of length 1 containing X, so (CONS NIL NIL) = '(NIL), a list of length 1 containing NIL.

But NIL is a list of length 0, not a list of length 1.

The wart is that it should never have been the case that you could call CAR and CDR on NIL. But even though you can wartily call them on NIL, there is still clearly a very important distinction between NIL and (CONS (CAR NIL) (CDR NIL))!


> But NIL is not equal to (CONS NIL NIL).

Yes, that's true, but that is a special case. For all other objects X and Y, if (CAR X) is equal to (CAR Y) and (CDR X) is equal to (CDR Y) then X and Y are equal. NIL and (NIL) are the only exception. And the fact that they are an exception is a consequence of the design decision to allow CAR and CDR to be called on NIL and return NIL.

> The wart is that it should never have been the case that you could call CAR and CDR on NIL.

Yes, that is the whole point.

> But even though you can wartily call them on NIL, there is still clearly a very important distinction between NIL and (CONS (CAR NIL) (CDR NIL))!

You could have as well said between NIL and (CONS NIL NIL) or just (NIL). And yes, this is true. Nonetheless, it is possible to implement NIL as a privileged cons cell with both CAR and CDR pointing to itself under the hood, and many CL implementations actually do this. It's a design decision. You have to put the warty code somewhere. You can put it in CAR and CDR, or you can put it in NULL, EQUAL, SYMBOLP, etc. But you have to put it somewhere.


I like the way scheme does it, but IMO this is one of the least significant reasons scheme choices are much preferable to cl imo. It’s just a much cleaner, friendlier language overall


I really like Scheme, though the Common Lisp type system has completely won me over (especially with defstar).


I distinctly remember from over two decades ago that, as I was exploring Lisp under the guidance of Shapiro's book Common Lisp: An Interactive Approach, the situation with nil being a symbol, empty list and false was simply a fantastic design, and it likely helped me get hooked on Lisp.

My view today is that if anything calls itself Lisp and doesn't have these design elements, it is, in a way, vandalizing the word "Lisp".


The Null Object[1] is a useful thing. It can represent the behavior when there's no correct behavior. It exists in languages like SmallTalk as well as LISP and LISP-derived languages.

In Go, you can call methods on nil objects. [2]

1 https://wiki.c2.com/?NullObject

2 https://go.dev/play/p/mgayNTjdacL


It's very difficult to deserialize in serialize things like yaml and json in CL because it doesn't have a distinct false object. When you do deserialize `false` it turns into nil. Then when you serialize it back again it turns into `null`. From a practical perspective, this is annoying.

I don't hate using it as a language, it's fine, but because it's different enough than other languages it confuses people.


> It's very difficult to deserialize in serialize things like yaml and json in CL because it doesn't have a distinct false object.

Its not particularly hard to represent JSON’s type system in CL and deserialize JSON to CL objects.

Its also not particularly hard to construct and use a model of CL data in JSON from CL. (Its even easier with YAML.)

OTOH, yes, JSONs data model is almost exactly JS’s data model (minus anything callable, which js a lot of JS, and with a slightly different model of numbers, but anything that doesn’t exactly match JS’s model is canonically unreliable and so discouraged), So with any thing other than JS using JSON, you have to determine if your use case is modeling your language’s data in JSON or JSON’s data model in your language, while with a particular subset of JS you can just ignore the distinction.


Serialization formats not designed specifically for the language should not be treated as serializers for objects in that language.

That is, JSON is not a representation of objects in your language. It's your application that holds a representation of JSON data. If you're not paying attention to the translation layer between your native objects and the JSON ones, you'll hit problems sooner or later.


JSON is the JavaScript Object Notation; it's in the name! The closer your language's object model is to JavaScript, the more likely it is you'll get away with treating it as a direct serialization format. Conversely, the further your language is from JavaScript, the more you'll need to add types specifically to represent JSON.


I share your frustration, particularly since one could add a canonical false value to CL in a backwards-compatible way. But it's too late now.

But as a practical solution you could de-serialize false into, say, :false, in order to distinguish them from empty lists, and then do a post-processing step to turn those into nils if you wanted to.


> you could de-serialize false into, say, :false ...

That's the approach that st-json [0] builds on:

- true and false become :true and :false, respectively

- [] becomes nil

- {} becomes #s(st-json:jso :alist nil)

---

[0] in QuickLisp and at https://github.com/marijnh/ST-JSON


Yeah, good point. IMHO the lack of a canonical serialization for hash tables is a much more serious omission than not having a separate boolean false value.


> in a backwards-compatible way. But it's too late now.

You just contradicted yourself.


No, that's not a contradiction. To add a canonical false value you would have to change the semantics of the language, which means you have to change the standard. In general changing standards is possible, but in the case of CL it is not possible for practical reasons: changing standards costs money and no one would fund such an effort. CL is moribund, not unlike Smalltalk, APL, and COBOL.


>CL is moribund, not unlike Smalltalk, APL, and COBOL.

If someone wants to do some hobby/small-scale but complex programming in LISP, what kind of dialect should they use? Scheme/Racket or something else?


Personally I do all my personal coding in CL but it's a matter of taste. Another word for "moribund" is "stable" and that can be a good thing. But it really is mostly personal preference. And you might want to add Clojure to your list of languages to consider.

The reason I like CL is that I find it is a local optimum impedance match for my brain in language design space. But that is at least in part because I have been using it for 35 years so I am used to its quirks (of which there are many). But it also has lots of really nice features, some of which are still unique to CL and which I find indispensable to my coding style. Generic functions are at the top of that list, and a close second is the macro system.


I'm afraid you're confusing 'stable' with 'stagnant'. (The dictionary suggests 'dying'... I won't go there ;))


CL is not stagnant. SBCL and Quicklisp are still active, and Franz is still a going concern.


As a programming language, CL isn't moribund. There are several high-quality Common Lisp implementations. There's just no real chance of there being a revision of the language standard.


Yep. It's probably the case that one should have a more complicated serializer / deserializer for JSON in CL (one that takes into account the expected JSON datatypes) because JSON is yet another type and value system from Scheme's and CL's, but there's an isomorphism from the Scheme one to the JSON one and there's no such similar isomorphism for the primitive CL types (though were one so inclined, one could certainly build one in one's own application).

I'd be willing to argue that this is indicative of JSON's unnecessary complexity (inherited from its source language... why does JavaScript need a false and an empty array and a null and even an undefined?), but reasonable developers can disagree on that point.


> why does JavaScript need a false and an empty array and a null and even an undefined?)

Why does Lisp need a 0 and a nil? Because numbers are conceptually different from lists, and the number 0 is conceptually different from no number at all.

Why does JavaScript need a false and an empty array and a null and an undefined? Because all of those represent different concepts. Trying to stuff as many concepts as possible into a single representation doesn't necessarily make a language better (though I'm certainly not claiming that JavaScript is the epitome of good language design).


You want empty lists because you can't iterate a null. So an empty list should be an empty list, otherwise it will require special code to check for the null case.


Do you mean you can't iterate a null in JavaScript or in LISP?

You can't iterate a null in JavaScript, but that's a design decision of the language that could have gone another way; `for (const elem of null) {do();}` could have been specified to be valid JavaScript that never calls `do()`, but it wasn't because it wasn't.


Well if you could iterate a null, you'd also be able to sum 2 nulls. And you'd need null to carry a length (of zero, I presume).


Correct. Which could, of course, all be defined.

- null + null === null ; list + null === list; null + list === list

- null.length === 0 /* should probably make it a runtime error to write this property, or allow it but have the result be a list with length elements, all undefined */

(Note: somewhat hilariously, null + null is already defined in JavaScript. It, of course, is 0. Wat. ;) )

Whether null and empty list should be different or the same is an old argument, and both directions lead to different problems. http://thecodelesscode.com/case/6


Why not just have empty lists instead of null then? :D Seems simpler.


I agree. I use quite a few languages that don't have the "scorpion in a jar" problem where an argument might be an empty list or null (some that address it with static typing, some where the two values are the same value).


I'm not familiar with the "scorpion in a jar" idiom; guessing it's something like [to kill the scorpion you have to open the jar]... care to elucidate?


It's a reference to the "Codeless Code" link I shared upthread: 'All nothings are not equal.' A language that supports null and empty list as separate things opens the risk "what happens when the user passes null to something expecting a list?"

It is, at least, something a language with good static typing can mitigate by disallowing null as a list argument. And then there's Java...


In C you sometimes have to pass int* instead of int, because you want null to signify that you aren't passing a value, rather than passing a value of 0. The two things aren't interchangeable. An empty string and a string of length 0 aren't the same thing. You'd just end up needing an extra boolean parameter for "is_null".


Agreed. I'm not a fan of 0===false===null for equality / truthiness tests. But "null is empty list" is categorically different and fine by me.

> You'd just end up needing an extra boolean parameter

Correct. Or an Option() wrapper or another box. Such an unusual construction is fine because, as you noted, the cases where one would want to distinguish null from empty list are rare. Rare cases should stick out.

The fact that in JavaScript (and Java), every argument that takes an array could also take null and mean something different by it, triggering a runtime error as a result, is a design foot-gun. Rarely are both null and [] as different symbols appropriate; those languages made that a by-default-always-allowed feature.


> I'm not a fan of 0===false===null for equality / truthiness tests. But "null is empty list" is categorically different and fine by me.

Wat?

You might want to signify that you are not passing a list instead of you are passing an empty list. Same way that happens with any other type…

> The fact that in JavaScript (and Java)

Well in JS it's completely insane. With typed python if you say "list of XXX" null is not accepted by linters. You need to specify it's also an option.


JavaScript already has undefined as a primitive that could be used to signify you are not passing a list as opposed to passing an empty list.

I think we are talking around agreement here, which is that JavaScript was not particularly well-designed.


There is no undefined value in JSON. (Of course ommitting something is like an undefined)

https://stackoverflow.com/questions/13796751/json-undefined-...


> it's different enough than other languages it confuses people.

And javascript integers (or lack thereof) don't match CL integers. So if you want to interoperate between distinct languages you do have to consider the differences anyway.


I agree that it’s annoying. It is possible to work around without too much trouble by defining some other constant that evaluates to nil to distinguish.


TLDR (at the end): "Certainly the CL approach carries more historical baggage."

> Combining the empty list and the special false object can lead to particularly good implementations perhaps.

I suppose they wanted the null pointer to represent them both in RAM, because every CPU has an instruction to test whether a value is zero, or even a flag that is set for free when you encounter the value.


That's one reason. Another is that it allows you to use the idiom (if X ...) to test for non-null rather than (if (not (null X)) ...)


Indeed, although some languages achieve the same by defining multiple falsy values, e.g. JS has 9 such values: https://developer.mozilla.org/en-US/docs/Glossary/Falsy


Yeah, but this causes its own set of problems when, for example, you are surprised that 0.0 is false but -0.0 is true.

In general, punning of any kind is a Really Bad Idea when code gets complicated.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: