
What is the “minimal” set of primitives needed for a Lisp interpreter? - shawndumas
http://www.cs.cmu.edu/Groups/AI/html/faqs/lang/lisp/part1/faq-doc-6.html
======
farva
There's been a lot of discussion about the minimal set of primitives to build
an interpreter in the Forth community (for instance
[https://groups.google.com/d/msg/comp.lang.forth/KHw28PKZXUk/...](https://groups.google.com/d/msg/comp.lang.forth/KHw28PKZXUk/JrXcjrYshxAJ).)
Particularly interesting is eForth, a Forth made to be easily ported while
still having acceptable performance, and Minimal ANS Forth, a heavily-
commented ANS compliant Forth written in Forth (both of these use
significantly more than the minimum for practical reasons.) The answer, for
both Lisp and Forth, is somewhere between 8 and 11. In "real world" usage, the
set of primitives largely depends on the hardware you're targeting and how
well you need it to perform.

~~~
upofadown
The Forth exercise is interesting because you end up with a super simple
instruction set. If someone ever invents a machine that can perform simple
operations really fast, but where implementing complex operations is
prohibitively expensive, such exercises could have practical value.

------
juliangamble
On this site they say there are _10 LISP primitives_. The primitives are:

    
    
        atom 
        quote 
        eq
        car 
        cdr 
        cons 
        cond 
        lambda 
        label 
        apply
    

[http://hyperpolyglot.wikidot.com/lisp#ten-
primitives](http://hyperpolyglot.wikidot.com/lisp#ten-primitives)

Steve Yegge reckons there are _seven or five_ :

    
    
        It's part of the purity of the idea of LISP: you only need the seven (or is 
        it five?) primitives to build the full machine.
    

[http://steve-yegge.blogspot.com/2006/04/lisp-is-not-
acceptab...](http://steve-yegge.blogspot.com/2006/04/lisp-is-not-acceptable-
lisp.html)

McCarthy had _five_ :

    
    
        atom
        eq
        car
        cdr
        cons
    

[http://www-formal.stanford.edu/jmc/recursive.pdf](http://www-
formal.stanford.edu/jmc/recursive.pdf)

McCarthy then _added another four_ :

    
    
        quote
        cond
        lambda
        label
    

So according to McCarthy - _the answer is nine_.

See here for a more in depth discussion:

[http://stackoverflow.com/questions/3482389/how-many-
primitiv...](http://stackoverflow.com/questions/3482389/how-many-primitives-
does-it-take-to-build-a-lisp-machine-ten-seven-or-five/)

~~~
sklogic
The answer will vary significantly depending on the degree of interpretation
required (i.e., how much of your Lisp you want to lower to something simpler
prior to interpretation). That's why some lists include lambda, while others
list only primitive functions.

------
5kg
Shen[1] is a powerful language built upon Kl, which has only 46
primitives[2].

[1]: [http://www.shenlanguage.org/](http://www.shenlanguage.org/)

[2]: [http://www.shenlanguage.org/learn-shen/shendoc.htm#The](http://www.shenlanguage.org/learn-shen/shendoc.htm#The) ("The Primitive Functions of K Lambda")

------
auvi
pg has an implementation of the Lisp primitives described in McCarthy's 1960
paper.

[http://ep.yimg.com/ty/cdn/paulgraham/jmc.lisp](http://ep.yimg.com/ty/cdn/paulgraham/jmc.lisp)

------
jules
You only need lambda (and primitives for doing I/O). You can define numbers,
lists, and all other data types using only lambda. For example you can define
booleans like this:

    
    
        (define true (lambda (a b) a))
        (define false (lambda (a b) b))
        (define if (lambda (c a b) (c a b)))
    

Now you can do:

    
    
        (if true 4 9) -> returns 4
    

You can define logical `and` and `or` like this:

    
    
        (define and (lambda (p q) (if p q false)))
        (define or (lambda (p q) (if p true q)))
    

To make that work right for (if b (...) (...)) you'd need lazy evaluation,
though; otherwise both the then and else branches would always be evaluated
regardless of whether the condition is true or false. Alternatively, you could
wrap the then and else branches in lambdas, and then call the lambda that is
returned by `if`.
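The same encoding can be sketched in Python, whose lambdas behave the same way (the names `iff`, `lazy_iff`, `and_`, and `or_` are mine, chosen to avoid shadowing Python keywords):

```python
# Church booleans: a boolean is a function that selects one of two arguments.
true = lambda a, b: a
false = lambda a, b: b

# 'if' just applies the boolean to the two branches (both get evaluated).
iff = lambda c, a, b: c(a, b)

print(iff(true, 4, 9))   # 4
print(iff(false, 4, 9))  # 9

# Wrapping the branches in lambdas (thunks) recovers laziness: only the
# chosen branch is actually run.
lazy_iff = lambda c, a, b: c(a, b)()
print(lazy_iff(true, lambda: 4, lambda: 9))  # 4

# 'and' and 'or' in terms of iff, as defined above.
and_ = lambda p, q: iff(p, q, false)
or_ = lambda p, q: iff(p, true, q)
```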

- So booleans are the type forall r. r -> r -> r. "To use a boolean, you have
to give it a value of type r for the then branch, and a value of type r for
the else branch, and then you get a value of type r back".

- You can define natural numbers as functions of type forall r. r -> (r -> r)
-> r. "To use a natural number, you should give a value of type r for n=0, and
a function of type r -> r for n=k+1, then you get a value of type r back".

- You can define pairs of type (A,B) as forall r. (A -> B -> r) -> r. "To use
a pair of type A,B you have to give a function that takes an A and a B and
returns an r, then you get an r back".

- You can define sum types A+B (either an A or a B) as forall r. (A -> r) ->
(B -> r) -> r.

- You can define least fixed points (inductive types) of a type constructor F
as forall r. (F r -> r) -> r.

- You can define greatest fixed points (coinductive types) of a type
constructor F as exists s. (s, s -> F s).
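The natural-number encoding in the second bullet translates directly to Python lambdas (a sketch; `zero`, `succ`, `to_int`, and `add` are names I've chosen):

```python
# A Church natural has type  forall r. r -> (r -> r) -> r:
# give it a base value (the n=0 case) and a successor function (the n=k+1
# case), and it applies the successor n times to the base.
zero = lambda z, s: z
succ = lambda n: lambda z, s: s(n(z, s))

# Recover an ordinary int by choosing z = 0 and s = (+1).
to_int = lambda n: n(0, lambda k: k + 1)

three = succ(succ(succ(zero)))
print(to_int(three))  # 3

# Addition: run m's successor applications on top of n's.
add = lambda m, n: lambda z, s: m(n(z, s), s)
print(to_int(add(three, three)))  # 6
```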

See also
[http://en.wikipedia.org/wiki/Church_encoding](http://en.wikipedia.org/wiki/Church_encoding)

Church encodings are a general way to define data types in lambda calculus
purely based on lambdas. Church encodings are still not fully understood. For
example the relation between Church encodings and mathematical induction is
not fully understood yet (in type theory terms, the relation between non-
dependent elimination / recursion principle and dependent elimination /
induction principle). Also it's not yet fully understood how to take a fix
point of a contravariant type constructor. Another thing that's as far as I
know not fully understood is how to do Church encodings for higher inductive
types. This goes to the very foundations of mathematics, and researchers in
type theory think that in the coming years we will have breakthroughs that
will make type theory a viable replacement for set theory as the basis of
mathematics, with the benefit that mathematical proofs will be verified by
computers.

~~~
jbert
I like the definition of cons/car/cdr in terms of lambda.

A cons cell is just a function you can call to get back the car or cdr. Once
you have 'if' and any two constants, you can do:

    
    
        (define mcons
          (lambda (a b)
            (lambda (method)
              (if (= method :car)
                  a
                  (if (= method :cdr) b)))))
        
        (define (mcar p) (p :car))
        (define (mcdr p) (p :cdr))
    

~~~
jules
There is also an encoding based directly on lambda:

    
    
        (define (cons a b) (lambda (f) (f a b)))
        (define (car c) (c (lambda (a b) a)))
        (define (cdr c) (c (lambda (a b) b)))
    

No `if`, no `=`, no `:car/:cdr`. Only lambda!
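For comparison, the same lambda-only pair encoding sketched in Python:

```python
# A pair is a function waiting for a selector; car and cdr pass in the
# selector that keeps the component they want.
cons = lambda a, b: lambda f: f(a, b)
car = lambda c: c(lambda a, b: a)
cdr = lambda c: c(lambda a, b: b)

p = cons(1, 2)
print(car(p), cdr(p))  # 1 2
```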

~~~
jbert
Nice.

It looks like that's effectively using true/false (as defined via lambda) as
the two constants and inlining the "if true" and "if false"?

~~~
jules
Yes, that's easier to see if you write your version like this:

    
    
        (define mcons (lambda (a b) (lambda (c) (if c a b))))
        (define (mcar p) (p true))
        (define (mcdr p) (p false))
    

In type theory terms this expresses the isomorphism (A,B) ~= (b:Bool) -> (if b
then A else B), where the latter is a type called a 'dependent function'. This
is a function where the type of output depends on the _value_ of the input.
For example if you had (cons 42 "hello") then that would be a function that
takes a boolean, and if the boolean is true then it returns the integer 42,
and if the boolean is false then it returns the string "hello". So the type of
the output depends on the value of the input. This is what dependent types are
all about, but that hasn't yet penetrated into mainstream programming
languages.

Church pairs on the other hand express the isomorphism (A,B) ~= forall r. (A
-> B -> r) -> r. For that I find it easier to think in terms of
producers/consumers. You can ask yourself "what is a consumer of a pair of
type (A,B)?" -> answer: it's a consumer of an A and a B. Therefore we can
define cons like this:

    
    
        (define (cons car-value cdr-value) (lambda (consumer) (consumer car-value cdr-value)))
    

Instead of thinking of a cons as a pair, we think of a cons as a function that
takes a cons-consumer and passes its data to that consumer. What kind of
consumers can we write? For example, a consumer that sums the components:

    
    
        (define summer (lambda (car-value cdr-value) (+ car-value cdr-value)))
    

Now given a cons, we can sum it like this:

    
    
        (define somecons (cons 4 5))
        (somecons summer) -> 9
    

Now we can define a function sum that sums the components of a cons like this:

    
    
        (define (sum c) (c summer))
    

Similarly, we can define "cdr-ers" and "car-ers":

    
    
        (define car-er (lambda (car-value cdr-value) car-value))
        (define cdr-er (lambda (car-value cdr-value) cdr-value))
    

Now we define car and cdr like this:

    
    
        (define (car c) (c car-er))
        (define (cdr c) (c cdr-er))
    

The same sort of reasoning works for church booleans, church numerals, church
lists, church binary trees, and so forth. That is, you _could_ try to think
about this in a dynamically typed framework, but for the more difficult church
encodings that becomes almost impossible. At least for me, types really help
to bring order to the madness of lambda. If I try to think about what a type
like forall r. (F r -> r) -> r means in dynamic terms, that causes instant
brain melt; from the perspective of types it's a short and innocent
expression.

~~~
jbert
Excellent post. Your type-based approach is a very useful tool for thinking
about things which I find difficult. Thanks for that.

------
TazeTSchnitzel
lambda, quote, cons, car, cdr, atom, cond, eq, label

Or with pattern-matching: cons, lambda, label, quote

And really, label is just immediately-invoked lambda, so: cons, lambda, quote

[http://bodil.org/building-lisp/#0](http://bodil.org/building-lisp/#0)

------
whitten
The struggle to understand what a primitive is has to take into account the
instruction set architecture (ISA) of the machine that will be running the
system.

This article doesn't even consider the idea that the LISP may be running on
exotic hardware at all.

It might be instructive to look at the ISA of the Lisp Machines from the
1980s.

It might be instructive to look at the ISA for the Lisp Machine from the
1980s.

~~~
rdc12
Which is quite different to the theoretical minimum number which they are
presumably looking at

~~~
throwaway823472
Wouldn't the minimal number be 1: NAND?

~~~
arethuza
Or if you want to build things on a purely functional basis all you need is S
and K:

[http://en.wikipedia.org/wiki/SKI_combinator_calculus](http://en.wikipedia.org/wiki/SKI_combinator_calculus)

It's straightforward to convert lambda calculus expressions to SK(I)
combinators - the fact that you can get recursion using the Y combinator from
SK is trivial to demonstrate but still surprises me!
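The combinators are easy to play with as curried Python lambdas (a sketch; the classic identity I = S K K checks out mechanically):

```python
# Curried S, K, I combinators.
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda x: lambda y: x
I = lambda x: x

# I is definable from S and K alone:
#   (S K K) x  =  K x (K x)  =  x
skk = S(K)(K)
print(skk(42))  # 42
```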

~~~
ingenter
Then again, you could use a single universal combinator,
[http://en.wikipedia.org/wiki/Iota_and_Jot](http://en.wikipedia.org/wiki/Iota_and_Jot)
.

I find it weird that one can express all of combinator calculus using one
function.
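Iota is defined by iota x = x S K, and the usual combinators can be recovered from it alone; a quick Python sketch (curried lambdas, names mine):

```python
# Curried S and K, as before.
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda x: lambda y: x

# The iota combinator:  iota x = x S K
iota = lambda x: x(S)(K)

# iota applied to itself is the identity:
#   iota(iota) = iota S K = S S K K, and (S S K K) a = K a (K K a) = a
ident = iota(iota)
print(ident(42))  # 42

# K can be recovered too: iota(iota(iota(iota))) behaves as K.
k = iota(iota(iota(iota)))
print(k(1)(2))  # 1
```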

~~~
arethuza
Cool - I had heard that there was a single-combinator scheme but hadn't seen
a definition of it before.

I had some fun as an undergraduate "compiling" proper code down to, in the
worst case, SK combinators and that was entertainingly inefficient (resulting
in a need to become acquainted with implementing garbage collection). I
suspect that U might be even more fun...

NB I did my SK stuff on a multi-user Unix minicomputer in '87/'88, so modern
machines probably make this a bit more realistic to play with.

