Hacker News new | comments | show | ask | jobs | submit login

I've been thinking about a compiler for Lisp/ML like functional languages. If we take the source program, convert it to continuation passing style, then lambda lift, we have an intermediate form where no closures appear, every call is a tail call and where the stack is represented as data structures explicitly. The next step is to convert every call site


    if(f == FUNCTION1) FUNCTION1(x)
    else if(f == FUNCTION2) FUNCTION2(x)
The reason for doing this is that now all function calls are to known functions (there are no calls to function pointers anymore). Because all calls are tail calls we can then convert every call to a jump instruction.

(note that this could cause code blowup with a lot of different functions and call sites, so it would probably be better to do this transformation lazily as you analyse what values f could take, and only actually do this transformation if you know that f can take at most 10 different values)

What we have now is one giant control flow graph (state machine) representing the entire program. All functions have been eliminated. The only thing left is primitive operations and jumps.

Why do I think that such a representation is interesting for a compiler? Because many things that previously had to be coded separately are now one analysis/transformation. Because there are no functions left all analysis and transformations automatically become interprocedural. And for example (loop) unrolling and inlining become the same transformation. So this could possibly make the compiler much simpler.

Could something like this work?

Have you read Appel's _Compiling with Continuations_?

I have not, but the table of contents sure looks interesting. Am I right that they don't do lambda lifting / closure immediately? They seem to have a lot of optimizations working on lambda expressions, and also optimizations working on records. By lambda lifting immediately I am hoping that the optimizations on records can take care of (some of) the lambda-optimizations.

CwC is very much what you want.

The focus on lambda and record optimizations is because ML code is usually heavy with sum types (tagged unions), and it's often possible to completely unbox things at compile time.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact