There's been a lot of discussion in the Forth community about the minimal set of primitives needed to build an interpreter (for instance https://groups.google.com/d/msg/comp.lang.forth/KHw28PKZXUk/.... ) Particularly interesting are eForth, a Forth made to be easily ported while still having acceptable performance, and Minimal ANS Forth, a heavily-commented, ANS-compliant Forth written in Forth (both of these use significantly more than the minimum for practical reasons). The answer, for both Lisp and Forth, is somewhere between 8 and 11. In "real world" usage, the set of primitives largely depends on the hardware you're targeting and how well you need it to perform.
The Forth exercise is interesting because you end up with a super simple instruction set. If someone ever invents a machine that can execute simple operations really fast, but where implementing complex operations is prohibitive, such exercises could have practical value.
The answer will vary significantly depending on the degree of interpretation required (i.e., how much of your Lisp you want to lower to something simpler prior to interpretation). That's why some lists include lambda, while others list only primitive functions.
You only need lambda (and primitives for doing I/O). You can define numbers, lists, and all other data types using only lambda. For example you can define booleans like this:
(define true (lambda (a b) a))
(define false (lambda (a b) b))
(define if (lambda (c a b) (c a b)))
Now you can do:
(if true 4 9) -> returns 4
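With the definitions above, that call reduces by plain substitution:

(if true 4 9)
=> (true 4 9)                  ; if just applies the boolean to both branches
=> ((lambda (a b) a) 4 9)
=> 4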
You can define logical `and` and `or` like this:
(define and (lambda (p q) (if p q false)))
(define or (lambda (p q) (if p true q)))
To make that work right for (if b (...) (...)) you'd need lazy evaluation, though; otherwise both the then and the else branch would always be evaluated, regardless of whether the condition is true or false. Or you could wrap the then and else branches in lambdas, and then call the lambda that is returned by `if`.
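A minimal sketch of that thunking workaround (`force-if` is my own name, not something from the thread):

(define (force-if c then-thunk else-thunk)
  ((c then-thunk else-thunk)))  ; c selects one thunk; the outer call forces it

(force-if true (lambda () 4) (lambda () (error "never evaluated")))  ; => 4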
- So booleans are the type forall r. r -> r -> r. "To use a boolean, you have to give it a value of type r for the then branch, and a value of type r for the else branch, and then you get a value of type r back".
- You can define natural numbers as functions of type forall r. r -> (r -> r) -> r. "To use a natural number, you should give a value of type r for n=0, and a function of type r -> r for n=k+1, then you get a value of type r back" (see the sketch after this list).
- You can define pairs of type (A,B) as forall r. (A -> B -> r) -> r. "To use a pair of type A,B you have to give a function that takes an A and a B and returns an r, then you get an r back".
- You can define sum types A+B (either an A or a B) as forall r. (A -> r) -> (B -> r) -> r.
- You can define least fixed points (inductive types) of a type constructor F as forall r. (F r -> r) -> r (the Church lists in the sketch below are an instance of this).
- You can define greatest fixed points (coinductive types) of a type constructor F as exists s. (s, s -> F s).
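To make a couple of these concrete, here's a sketch in Scheme (the names are all mine). Church numerals follow forall r. r -> (r -> r) -> r, and Church lists follow the least-fixed-point pattern, a list being its own fold:

(define czero (lambda (z s) z))
(define (csucc n) (lambda (z s) (s (n z s))))

; instantiate r = integer: use 0 for the n=0 case and add-one for n=k+1
(define (church->int n) (n 0 (lambda (k) (+ k 1))))
(church->int (csucc (csucc czero)))  ; => 2

; a Church list takes a value for the nil case and a function for the cons case
(define cnil (lambda (n k) n))
(define (ccons x xs) (lambda (n k) (k x (xs n k))))
(define (church->list xs) (xs '() (lambda (x rest) (cons x rest))))
(church->list (ccons 1 (ccons 2 cnil)))  ; => (1 2)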
Church encodings are a general way to define data types in lambda calculus purely based on lambdas. Church encodings are still not fully understood. For example, the relation between Church encodings and mathematical induction is not fully understood yet (in type theory terms, the relation between the non-dependent elimination / recursion principle and the dependent elimination / induction principle). It's also not yet fully understood how to take a fixed point of a contravariant type constructor, nor, as far as I know, how to do Church encodings for higher inductive types. This goes to the very foundations of mathematics, and researchers in type theory think that in the coming years we will have breakthroughs that will make type theory a viable replacement for set theory as the basis of mathematics, with the benefit that mathematical proofs will be verified by computers.
Yes, that's easier to see if you write your version like this:
(define mcons (lambda (a b) (lambda (c) (if c a b))))
(define (mcar p) (p #t))
(define (mcdr p) (p #f))
(Here `if` is Scheme's built-in, so the selector has to be a native #t/#f rather than the Church booleans defined earlier.)
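For example:

(mcar (mcons 42 "hello"))  ; => 42
(mcdr (mcons 42 "hello"))  ; => "hello"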
In type theory terms this expresses the isomorphism (A,B) ~= (b:Bool) -> (if b then A else B), where the latter is a type called a 'dependent function': a function where the type of the output depends on the value of the input. For example, (mcons 42 "hello") is a function that takes a boolean and returns the integer 42 if the boolean is true, and the string "hello" if it is false, so the type of the output depends on the value of the input. This is what dependent types are all about, but that hasn't yet penetrated into mainstream programming languages.
Church pairs on the other hand express the isomorphism (A,B) ~= forall r. (A -> B -> r) -> r. For that I find it easier to think in terms of producers/consumers. You can ask yourself "what is a consumer of a pair of type (A,B)?" -> answer: it's a consumer of an A and a B. Therefore we can define cons like this:
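(define (cons a b) (lambda (consumer) (consumer a b)))  ; feed both fields to the consumer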
Instead of thinking of a cons as a pair, we think of a cons as a function that takes a cons-consumer and passes its data to the consumer. What kind of consumers can we write? For example, a consumer that sums the components:
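(define (sum-er a b) (+ a b))
((cons 1 2) sum-er)  ; => 3

Or consumers that keep just one component, which give us car and cdr:

(define (car-er a b) a)
(define (cdr-er a b) b)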
(define (car c) (c car-er))
(define (cdr c) (c cdr-er))
The same sort of reasoning works for Church booleans, Church numerals, Church lists, Church binary trees, and so forth. You could try to think about all this in a dynamically typed framework, but for the more difficult Church encodings that becomes almost impossible. At least for me, types really help to bring order to the madness of lambda: if I try to think about what a type like forall r. (F r -> r) -> r means in dynamic terms, that causes instant brain melt, whereas from the perspective of types it's a short and innocent expression.
Lambda is a very complicated thing, with extremely twisted name substitution rules. A minimal system needs only one combinator - http://en.wikipedia.org/wiki/Iota_and_Jot
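For the curious, here's a sketch of the iota combinator from that article in Scheme, via curried S and K (using the definition iota = \f. f S K):

(define (K x) (lambda (y) x))
(define (S f) (lambda (g) (lambda (x) ((f x) (g x)))))
(define (iota f) ((f S) K))

((iota iota) 42)  ; iota applied to itself behaves as the identity => 42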
Any struggle to understand what a primitive is has to take into account the instruction set architecture (ISA) of the machine that will be running the system.
This article doesn't even consider the idea that the LISP may be running on exotic hardware.
It might be instructive to look at the ISA for the Lisp Machine from the 1980s.
Why would what counts as a primitive depend on the ISA? If you need car, cdr and cons, it doesn't matter whether each is a single instruction or ten thousand, does it?
It's straightforward to convert lambda calculus expressions to SK(I) combinators - the fact that you can get recursion using the Y combinator built from S and K is trivial to demonstrate but still surprises me!
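A sketch of that conversion in Scheme (classic bracket abstraction; the term representation and names are my own): terms are symbols, one-variable lambdas, or two-element applications.

(define (free? x t)  ; does variable x occur free in term t?
  (cond ((symbol? t) (eq? x t))
        ((eq? (car t) 'lambda)
         (and (not (eq? x (car (cadr t)))) (free? x (caddr t))))
        (else (or (free? x (car t)) (free? x (cadr t))))))

(define (translate t)  ; lambda term -> S/K/I term
  (cond ((symbol? t) t)
        ((eq? (car t) 'lambda)
         (abstract (car (cadr t)) (translate (caddr t))))
        (else (list (translate (car t)) (translate (cadr t))))))

(define (abstract x t)  ; compile away a single binder x over a lambda-free term
  (cond ((not (free? x t)) (list 'K t))
        ((eq? x t) 'I)
        (else (list (list 'S (abstract x (car t))) (abstract x (cadr t))))))

(translate '(lambda (x) (lambda (y) x)))  ; => ((S (K K)) I)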
Cool - I had heard that there was a single-combinator scheme but hadn't seen a definition of it before.
I had some fun as an undergraduate "compiling" proper code down to, in the worst case, SK combinators, and that was entertainingly inefficient (resulting in a need to become acquainted with implementing garbage collection). I suspect that U might be even more fun...
NB I did my SK stuff on a multi-user Unix minicomputer in '87/'88, so modern machines probably make this a bit more realistic to play with.
You could build the lisp primitives out of NAND, but you still need those primitives to build the lisp implementation, which is what the original question is about. You could also write the lisp primitives without direct reference to NAND (excluding the NAND gates inside the CPU).