
Compiler basics: Lisp to Assembly - eatonphil
http://notes.eatonphil.com/compiler-basics-lisp-to-assembly.html
======
paddy_m
Years ago I read an article or paper about writing a lisp compiler. The
compiler started with regular looking lisp, then through macro expansion lisp
primitives were defined in terms of each other. Then the small subset of lisp
primitives were expanded to goto's and conditionals. The gotos and
conditionals were still expressed in lisp S Expressions. Finally the gotos and
conditionals were implemented in terms of assembly looking lisp primitives.
Then they removed the parentheses and it left an assembly program.

I can't remember the name of the paper. Any ideas? It was such an elegant way
of walking through a compiler implementation.

~~~
absrnd
Was it this?
[https://github.com/kingcons/cl-6502](https://github.com/kingcons/cl-6502)

------
z3phyr
A similar approach is taken in this Obligatory mention:
[https://compilers.iecc.com/crenshaw/](https://compilers.iecc.com/crenshaw/)

~~~
kureikain
Thank you. This is such a great resource and very readable on mobile too since
it's just txt like a man page.

------
eiurafhlfie
Hi Hacker News PL enthusiasts. This is off topic, but I hope you will have
mercy on me.

I'm reaching out for some help from you. I've been working on interpreters
this whole year, taking on no more work for pay than needed for rent and food,
so that I can realize my programming language dream. I'm writing a couple of
interpreters every week, trying out new ideas. But there is one hurdle I've
been unable to overcome for a long time.

I'm trying to write a typechecker for a simple lambda calculus with
polymorphic functions. My acceptance criteria is being able to typecheck the K
combinator. I've been stuck at this point for months. Have read all I can find
on the internet, and have read TAPL many times.

Am honestly feeling really stupid for not having been able to complete this.

What I'm struggling with beta reducing expressions that are type lambdas (i.e.
the expression related to `Forall`).

Λ means type lambda, i.e. the set of expressions indexed by type

λ means expression lambda, i.e. the set of expressions indexed by expression

E.g. Λa.λx:a.x is the polymorphic "identity" function, which in javascript
looks like `x => x`

k = ΛA.ΛB.λx:A.λy:B.x

k_intint = ((k IntTy) IntTy)

What I have yet been unable to do is type_of k_intint, and this because beta
reduction should return an expression, but type_of should return a type.

I'll paste the source code of my latest attempt below, the function in
question is `typ_of_expression`. I've written this in OCaml, but would be very
thankful for a response in any language, or pseudocode, or just explained in
English.

Many thanks for taking the time to read this. For any reply or pointer in the
right direction I would be greatly thankful. You know what, I'd be happy to
pay €100 for any help that lets me solve this (SEPA, Bitcoin).

    
    
      (* λ calculus, with Forall and simple types *)
      type 'a binding =
        | SymBind of 'a
        | QuantifierBind
    
      type 'a context = (string * 'a binding) list
      
      type typ =
        | UnitTy
        | IntTy
        | Arrow of typ * typ
        | Forall of string * typ
        (* Thought it simpler to separate Closure and Lambda, for w  hen I later add records/unions etc *)
        | ClosureTy of (typ context) * typ
        | SymbolTy of string
      
      type expr =
        (* again thought it simpler to separate Closure and Lambda,   for when I later add records/unions etc *)
        | Closure of (expr context) * expr
        | Unit
        | Symbol of string
        | Int of int
        | Lam of string * typ * expr  (* turns into Arrow. Takes ex  pr as argument, returns expr *)
        | App of expr * expr
        | TLam of string * expr       (* expr version of Forall. Ta  kes typ as argument, returns expr *)
        | TApp of expr * typ
      
      let rec lookup context key =
        match context with
        | [] -> Error "No result"
        | (k, QuantifierBind) :: rest -> lookup rest key
        | (k, SymBind v) :: rest -> if key = k
                                    then Ok v
                                    else lookup rest key
      
      let str = String.concat ""
      
      let rec string_of_typ t = match t with
        | UnitTy -> "UnitTy"
        | IntTy -> "IntTy"
        | Arrow _ -> "Arrow"
        | Forall (s, t) -> str ["Forall "; s; " "; string_of_typ t]
        | ClosureTy (_ctx, t) -> str ["Closure "; string_of_typ t]
        | SymbolTy s -> str ["SymbolTy "; s]
      
      let string_of_expr e = match e with
        | Unit -> "Unit"
        | Closure _ -> "Closure"
        | Symbol s -> str ["Symbol "; s]
        | Int i -> str ["Int "; string_of_int i]
        | Lam _ -> "Lam"
        | App _ -> "App"
        | TLam _ -> "TLam"
        | TApp _ -> "TApp"
      
      (* Convert between context types, such as (expr context) or (  typ context) *)
      let rec convert_context converter_fn original_context exprcon  text acc = (* カリたべたい *)
        let convert_context = convert_context converter_fn original  _context in
        match exprcontext with
        | [] -> acc
        | (x, QuantifierBind) :: rest ->
           convert_context rest ((x, QuantifierBind) :: acc)
        | (x, SymBind e) :: rest ->
           (match converter_fn original_context e with
            | Ok t -> convert_context rest ((x, SymBind t) :: acc)
            | Error e -> failwith (str ["typ_context error: "; e]))
      
      let rec typ_of_expr context e =
        match e with
        | Unit -> Ok UnitTy
        | Int i -> Ok IntTy
        | Closure (ctx, e) ->
           let new_context = convert_context typ_of_expr context ct  x context in
           typ_of_expr new_context e
        | Symbol s -> lookup context s
        | Lam (x, in_typ, body) ->
           (let new_context = (x, SymBind in_typ) :: context in
            match typ_of_expr new_context body with
            | Ok return_typ -> Ok (Arrow (in_typ, return_typ))
            | Error e -> Error (str [ "Error finding return type of   lambda. Error: "
                                    ; e]))
        | App (e0,
               e1) ->
           (match (typ_of_expr context e0,
                   typ_of_expr context e1) with
            | (Ok (Arrow (t0_0, t0_1)),
               Ok t1) -> if t1 = t0_0
                         then Ok t0_1
                         else Error (str ["Type mismatch"])
            | (Ok (ClosureTy (ctx, Arrow (t0_0, t0_1))),
               Ok t1) -> if t1 = t0_0
                         then Ok t0_1
                         else Error
                                (str
                                   ["Type mismatch. t0_0: "
                                   ;string_of_typ t0_0
                                   ;", t0_1: "
                                   ;string_of_typ t0_1
                                   ;", t1: "
                                   ;string_of_typ t1
                                ])
            | (Ok t, _) -> Error (str ["Not given a lambda as first   thing to App. "
                                      ;string_of_typ t])
            | _ -> Error (str ["Error getting typ of expr App"]))
        | TLam (a, body) ->
           (let new_context = (a, QuantifierBind) :: context in
            match typ_of_expr new_context body with
            | Ok body_type -> Ok (ClosureTy (new_context,
                                             Forall (a,
                                                     body_type)))
            | Error e -> Error (str [ "Failed to get typ of TLam, a  nd thus can not construct Forall. Error: "
                                    ; e]))
        | TApp (TLam (a, body),
                t1) -> let new_context = (a, QuantifierBind) :: con  text in
                       typ_of_expr new_context body
        | TApp (e0, t1) ->
           (match typ_of_expr context e0 with
            | Ok (ClosureTy (ctx,
                             Forall (a, body))) ->
               let new_context = (a, SymBind t1) :: ctx @ context i  n
               Ok (ClosureTy (new_context,
                              body))
            | Ok (ClosureTy(ctx,
                            t)) ->
               Error (str ["TApp given ClosureTy with non-forall on   left-hand-side. IS: "
                          ;string_of_typ t])
            | Ok (Forall (a, body)) -> Error "TApp given naked Fora  ll as first argument"
            | Ok _ -> Error "TApp given non-Forall first argument"
            | Error e -> Error (str ["Error in TApp - error: "
                                    ;e]))
      
      let rec eval context e =
        let context: (expr context) = context in
        match e with
        | Unit -> Ok Unit
        | Closure (ctx, e) -> eval (List.append ctx context) e
        | Int i -> Ok (Int i)
        | Lam (x, t, body) -> Ok (Closure (context,
                                           Lam (x, t, body)))
        | Symbol s -> lookup context s
        | App (Closure(ctx,
                       Lam (x, t, body)),
               e1) ->
           let new_context = (x, SymBind e1) :: ctx @ context in
           eval new_context body
        | App (e0, e1) ->
           (match eval context e0 with
            | Ok (Closure (ctx,
                           Lam (x, t, body))) ->
               eval (ctx @ context) (App (Lam (x,
                                               t,
                                               body),
                                          e1))
            | Ok _ -> Error "Can't apply non-Lam"
            | Error e -> Error (str ["Apply error: "
                                    ;e]))
        | TLam (a, body) -> Ok (TLam (a, body))
        | TApp (TLam (a, body),
                given_typ) ->
           let new_context = (a, QuantifierBind) :: context in
           eval new_context body
        | TApp (e0, t) ->
           (match eval context e0 with
            | Ok (TLam (a, body)) ->
               let _ = failwith "Not gonna happen" in
               eval context (TApp (TLam (a, body), t))
            | Ok (Closure (ctx,
                           TLam (a,
                                 body))) ->
               eval (ctx @ context) body
            | Ok _ -> Error "Can't type-apply non-Λ"
            | Error e ->
               Error (str [ "Error applying type lambda. Error: "
                          ; e]))
      
      (* helper functions *)
      let tlam a body = TLam (a, body)
      let lam x t body = Lam (x, t, body)
      let app e0 e1 = App (e0, e1)
      let tapp e0 tyT1 = TApp (e0, tyT1)
      
      (* eval [] (tlam "a" (lam "x"
       *                      (SymbolTy "a")
       *                      (Symbol "x"))) *)
      
      let k_combinator = (tlam "a" (tlam "_b"
                                      (lam
                                         "x"
                                         (SymbolTy "a")
                                         (lam
                                            "_y"
                                            (SymbolTy "_b")
                                            (Symbol "x")))))
      
      let k_intint = (TApp (TApp (k_combinator,
                                  IntTy),
                            IntTy))
      
      (* DEAR HN READER: This is what is failing me. *)
      let applied_k = typ_of_expr [] (app k_intint (Int 1))

~~~
lower
Are you aware that the TAPL book comes with an implementation of System F?

[https://www.cis.upenn.edu/~bcpierce/tapl/checkers/fullpoly/](https://www.cis.upenn.edu/~bcpierce/tapl/checkers/fullpoly/)

------
toutouast
Why not just compile "primitive" functions call to respective assembly code?
(+ => "add")

~~~
eatonphil
> In a better compiler, we would not make plus a built-in function. We'd emit
> code for the assembly instruction ADD. However, making plus a function makes
> code generation simpler and also allows us to see what function calls look
> like.

:)

------
m0skit0
"[Assembly is] 1:1 representation of the binary instructions the CPU can
interpret". 99% correct.

~~~
m0skit0
Assembly is NOT a 1:1 representation of the binary instructions. It is for the
99% of a given assembly. Pseudo-instructions exist in most assemblers and they
usually translate to several machine code instructions.

~~~
eatonphil
Thanks for the correction. Do you have an example?

~~~
m0skit0
For Intel:
[https://www.intel.com/content/www/us/en/programmable/documen...](https://www.intel.com/content/www/us/en/programmable/documentation/iga1420498949526/iga1409764012031/iga1409764164759.html)

------
evolveyourmind
Compiler without a Grammar...

~~~
junke
What do you mean?

~~~
duck2
The article is about "compiler basics", but it never mentions PLs are parsed
according to context-free grammars, let alone giving one. Just throws a bunch
of code and says "yeah this'll parse it".

~~~
vkazanov
Maybe because compilers nowadays are mostly _not_ about syntax? Parsers are
the boring part.

------
aap_
Calling this a lisp is a bit of a stretch. S-expressions aren't lisp if they
don't have lisp semantics.

~~~
fao_
What part of the code that will compile under this simple compiler (i.e. `(+ 2
(+ 3 4))`) _doesn't_ have lisp semantics?

~~~
junke
Integer overflow.

~~~
kazinator
Not all historic dialects of Lisp have bignum integers. I don't know what the
current situation is in Emacs Lisp, but it had only fixnums at one point.

------
amelius
Ok, I get this is a DIY post, but wouldn't it make much more sense to use LLVM
here? That way, you get low-level optimizations for free, and this allows you
to focus on the rest of the language. You can always implement the code-
generation part if you want to and have the time for it.

~~~
cloudhead
I think you answered your own question with the last sentence.

~~~
amelius
You could learn more from targeting an intermediate language than from
targeting a specific architecture, imho.

~~~
vidarh
Maybe you learn "more". But you learn very different things.

