
Fascinating read.

"[..] I had never been to California before, so I was discovering San Francisco, my favorite city in the US and second-favorite city in the world. [..]"

Out of curiosity, does anybody know what PG's first-favorite city in the world is?

"[..] It was also a massive validation of a thesis Steele had argued for his Master's, which was that CPS was a great intermediate representation for a compiler [..]"

What does CPS mean? Further down in the article:

"[..] Richard Kelsey took his front end, which was a very aggressive CPS-based optimiser, and extended it all the way down to the ground to produce a complete, second compiler, which he called "TC" for the "Transformational Compiler." His approach was simply to keep transforming the program from one simple, CPS, lambda language to an even simpler one, until the language was so simple it only had 16 variables... r1 through r15, at which time you could just kill the lambdas and call it assembler. [..]"

So I guess CPS means Continuation-Passing-Style? But then I wonder what PG means by "CPS was a great intermediate representation for a compiler"?
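If that guess is right, the idea is that CPS conversion rewrites a program so that no function ever returns: every function takes an extra continuation argument and hands its result to it. A minimal, runnable Scheme sketch (hypot and the -cps helpers are names I've invented for illustration):

    ;; Direct style: the result is handed back to an implicit caller.
    (define (hypot x y)
      (sqrt (+ (* x x) (* y y))))

    ;; CPS primitives: each takes an explicit continuation k.
    (define (*-cps a b k)  (k (* a b)))
    (define (+-cps a b k)  (k (+ a b)))
    (define (sqrt-cps a k) (k (sqrt a)))

    ;; CPS version: every intermediate value is named and passed
    ;; forward, so control flow and evaluation order are completely
    ;; explicit -- presumably what makes it attractive as an IR.
    (define (hypot-cps x y k)
      (*-cps x x
        (lambda (x2)
          (*-cps y y
            (lambda (y2)
              (+-cps x2 y2
                (lambda (sum)
                  (sqrt-cps sum k))))))))

    (hypot-cps 3 4 display)  ; prints 5 (or 5.0, depending on your Scheme)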

Even further down we can see PG's own opinion:

"[..] So the lineage of the CPS-as-compiler-IR thesis goes from Steele's Rabbit compiler through T's Orbit to SML/NJ. At which point Sabry & Felleisen at Rice published a series of very heavy-duty papers dumping on CPS as a representation and proposing an alternate called A-Normal Form. ANF has been the fashionable representation for about ten years now; CPS is out of favor. This thread then sort of jumps tracks over to the CMU ML community, where it picks up the important typed-intermediate-language track and heads to Cornell, and Yale, but I'm not going to follow that now. However, just to tell you where I am on this issue, I think the whole movement from CPS to ANF is a bad idea (though Sabry & Felleisen's technical observations and math are as rock solid as one would expect from people of their caliber). [..]

Fascinating stuff, though I do not know what IRs are used by modern compilers like LLVM, GHC, Java, or .NET.

"[..] Jim Philbin, like Kelsey, also went to NEC, where he built an operating system tuned for functional programming languages, STING (or perhaps it was spelled "STNG" -- in any event it was pronounced "sting"). He built it in T, of course, and one could see that it had its roots in his work on T3. (Implementing the runtime for a functional language, in some sense, requires you to implement a little virtual OS on top of the real, underlying OS.) [..]" - Does anybody know what happened to this OS? The only link that I found is http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

The summary of that paper goes on to say "[..] Sting is currently a prototype implemented on an eight processor Silicon Graphics PowerSeries 480 running Unix. The next step is to integrate Sting into the micro-kernel of an operating system such as Mac [..]" - It would be interesting to know whether this bore fruit in later work on VMs for functional languages.

> Fascinating stuff, though I do not know what IRs are used by modern compilers like LLVM, GHC, Java, or .NET.

There are a couple of things going on with CPS as implemented in this style of compilers that make it particularly nice for writing optimization passes:

1) There are only function calls ("throws") and no returns; see the first sketch after these two points. Though there's no theoretical reduction in the stuff you have to track (since what was a direct-style return is now a throw to the rest of the program starting at the return point), there's a huge reduction in the complexity of your optimizations because you're not tracking the extra thing. For some examples, you can look in the Manticore codebase, where even simple optimizations like contraction and let-floating are implemented in both our early direct-style IR (BOM) and our CPS-style IR (CPS). The latter is gobs more readable, and there's no way at all I'd be willing to port most of the harder optimizations, like reflow-informed higher-order inlining or useless-variable elimination, to BOM.

2) IRs are cheap in languages like Lisp and ML, so you write optimizations as tree-to-tree transformations (micro-optimization passes); see the second sketch below. This style makes it much easier to enforce invariants. If you look at the internals of most compilers written in C++, you'll see far fewer copies of the IR made and a whole bunch of staged state in the object that's only valid at certain points in the compilation process (e.g., symbols fully resolved only after phase 2b, but possibly invalid for a short time during optimization 3d unless you call function F3d_resolve_symbol...). Just CPS'ing the representation of a C++ compiler without also making it easy to write your optimization passes as efficient tree-to-tree transformations will not buy you much, IMO.
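To make the first point concrete, here is a minimal Scheme sketch (abs-val is an invented example, not Manticore code). In direct style an optimizer reasons about two control constructs, calls and returns; after CPS conversion every return is just another call to the continuation, so only one construct is left to track:

    ;; Direct style: a call to abs-val plus two distinct return points.
    (define (abs-val n)
      (if (< n 0)
          (- n)   ; return point 1
          n))     ; return point 2

    ;; CPS: both returns become ordinary calls ("throws") to the
    ;; continuation k, so the optimizer only ever sees calls.
    (define (abs-val-cps n k)
      (if (< n 0)
          (k (- n))
          (k n)))

    (abs-val-cps -7 display)  ; prints 7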
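And for the second point, a toy tree-to-tree pass over an s-expression IR (my own minimal sketch, not from Manticore or any real compiler): it folds binary + over literals and returns a fresh tree instead of mutating staged state, so the IR is valid at every pass boundary:

    ;; Micro-optimization pass: constant-fold (+ n m) when both
    ;; arguments are literal numbers. The input tree is never
    ;; mutated; a new tree comes out, so invariants hold before
    ;; and after the pass rather than "only after phase 2b".
    (define (fold-constants e)
      (if (pair? e)
          (let ((e* (map fold-constants e)))
            (if (and (eq? (car e*) '+)
                     (= (length e*) 3)
                     (number? (cadr e*))
                     (number? (caddr e*)))
                (+ (cadr e*) (caddr e*))
                e*))
          e))

    (fold-constants '(lambda (k) (k (+ 1 (+ 2 3)))))
    ;; => (lambda (k) (k 6))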

PG didn't write this article, Olin Shivers did.

And Jonathan had some small amendments:


Olin's favorite city is Paris, where he is right now, if I understand correctly.

CPS is short for continuation passing style (http://en.wikipedia.org/wiki/Continuation-passing_style).
