In the TXR Lisp compiler, I did lambda lifiting simply: lambda expressions that ...

In the TXR Lisp compiler, I did lambda lifiting simply: lambda expressions that don't capture variables can move to the top via a code transformation that inserts them into a load-time form (very similar to ANSI Common Lisp's load-time-value).

E.g.

  (let ((fun (lambda (x) (+ x x))))
    ...)

That can just be turned into:

  (let ((fun (load-time (lambda (x) (+ x x)))))
    ...)

Then the compilation strategy for load-time takes care of it. I had load-time working and debugged at the time I started thinking about optimizing lambdas in this way, so it was obvious.

load-time creates a kind of pseudo-constant. The compiler arranges for the enclosed expression to be evaluated just once. The object is captured and it becomes a de-facto constant after that; each time the expression is evaluated it just refers to that object.

At the VM level, constants are represented by D registers. The only reason D registers are writable is to support load-time: load time will store a value into a D register, where it becomes indistinguishable from a constant. If I were so inclined, I could put in a vm feature that will write-protect the D register file after the static time has done executing.

If we compile the following expression, the d0 register is initially nil. The d1 register holds 3, which comes from the (+ 3 x ) expression:

  1> (compile-toplevel '(lambda () (lambda (x) (+ 3 x))))
  #<sys:vm-desc: a32a150>
  2> (disassemble *1)
  data:
      0: nil
      1: 3
  syms:
      0: sys:b+
  code:
      0: 8C000009 close d0 0 4 9 1 1 nil t2
      1: 00000400
      2: 00010001
      3: 00000004
      4: 00000002
      5: 20020003 gcall t3 0 d1 t2
      6: 04010000
      7: 00000002
      8: 10000003 end t3
      9: 8C00000E close t2 0 2 14 0 0 nil
     10: 00000002
     11: 00000000
     12: 00000002
     13: 10000400 end d0
     14: 10000002 end t2
  instruction count:
      6
  #<sys:vm-desc: a32a150>

The close instruction has d0 as its destination register "close d0 ...". The 9 argument in it indicates the instruction offset where to jump to after the closure is created: offset 9, where another "close ..." instruction is found: that represents the outer (lambda () ...)

We have only compiled this top-level form and not yet executed any of the code. To execute it we can call it as if it were a function, with no arguments:

  3> [*1]
  #<vm fun: 0 param>

It returns the outer lambda produced at instruction 9: as expected. When we dissassemble the compiled form again, register d0 is filled in, because the close instruction at 0 executed:

  4> (disassemble *1)
  data:
      0: #<vm fun: 1 param>
      1: 3
  syms:
      0: sys:b+
  code:
      0: 8C000009 close d0 0 4 9 1 1 nil t2
  [... SNIP; all same]

d0 now holds a #<vm fun: 1 param>, which is the compiled (lambda (x) ...). We can call the #<fm fun: 0 param> returned at prompt 3 to get that inner lambda:

  5> [*3]
  #<vm fun: 1 param>
  6> [*5 4]
  7

We can disassemble the functions 3 and 5; we get the same assembly code, but different entry points. I.e. the lambdas reference this same VM description for their code and static data:

  7> (disassemble *3)   ; <-- outer (lambda () ...)
  data:
      0: #<vm fun: 1 param>
      1: 3
  [ SNIP same disassembly ]
      9: 8C00000E close t2 0 2 14 0 0 nil
     10: 00000002
     11: 00000000
     12: 00000002
     13: 10000400 end d0    <---
     14: 10000002 end t2
  instruction count:
      6
  entry point:
     13                     <---
  #<vm fun: 0 param>

The entry point for the outer-lambda is offset 13. And that just executes "end d0": terminate and return d0, which holds the compiled inner lambda.

If we disassemble that inner lambda:

  8> (disassemble *5) ; <--- inner lambda (lambda (x) (+ 3 x)) 
  data:
      0: #<vm fun: 1 param>  <--- also in here 
      1: 3
  syms:
      0: sys:b+
  code:
      0: 8C000009 close d0 0 4 9 1 1 nil t2
      1: 00000400
      2: 00010001
      3: 00000004
      4: 00000002    <---
      5: 20020003 gcall t3 0 d1 t2.    <-- sys:b+ 3 x
      6: 04010000
      7: 00000002
      8: 10000003 end t3
  [ SNIP ... ]
  entry point:
      4             <---
  #<vm fun: 1 param>

The entry point is 4, referencing into the lifted lambda that got placed into d0. Entry point 4 is in part of the close instruction which indicates the parameter mapping. The word there indicates that the argument value is to be put into register t2. Function 0 is called with the t2 argument and d1 (which is 3). Function 0 is in the syms table: sys:b+, a binary add. When it returns, its value is put into t3 and execution terminates with "end t3".