Hacker News new | past | comments | ask | show | jobs | submit login
A Simple Graph-Based Intermediate Representation (1995) [pdf] (amazonaws.com)
80 points by luu 15 days ago | hide | past | web | favorite | 27 comments

This is such an influential paper.

However, I don’t recommend using this IR style (aka sea of nodes) because:

- The original CFG and the original ordering of operations contains useful data. You want an IR that preserves that at least up to the point where some pass proves that a better order exists.

- Lots of optimizations can be made cheaper to run by using basic blocks, so it’s good to have an IR in which control flow nodes are basic blocks. Load elimination is an example.

- IRs need to be intuitive to the people who work on the compiler. Sea of nodes is more challenging to think about than SSA over CFG.

- There does not exist a program transform or analysis that is uniquely enabled by sea of nodes. Those program transforms that are made easier by sea of nodes are pretty easy to write in SSA over CFG as well.

> - IRs need to be intuitive to the people who work on the compiler. Sea of nodes is more challenging to think about than SSA over CFG.

I worked on V8 for awhile, which uses a sea of nodes representation. This point really resonates with me. I and many of my teammates never completely developed an intuition for the different edges in the graph. There was a lot of trial and error involved.

I remember reading this paper and thinking "wow! this is the solution to all the problems I never knew I had." The generality and simplicity of a single graph that does everything is appealing. But in practice, I think you'd most often be better with SSA over CFG.

I don’t completely disagree with you, but I would point out the paper ‘A Higher-Order Graph-Based Intermediate Representation’ by Roland LeBia, et al from the U of Saarland Compiler Lab. That paper, which describes the groups Thorin IR, does show some advantage to the sea of nodes based approach (although the representation may not be classified as purely SoN). The advantages seem particularly compelling for mixed functional/imperative source languages, which was essentially their goal.

The name of the author is "Leißa". (The fourth letter is a sharp s).

Apologies, I was aware, but am posting from iOS and couldn’t find the character.

fyi, if you long-press the "s" on your keyboard, it should come up as an option!

And in ASCII, ß->ss: Leißa -> Leissa

and ä, ö, ü -> ae, oe, ue


Take a look at mlir. It's a very novel and powerful abstraction that can be used and applied in very creative ways to solve problems that require translation from source syntax to some target(s) (machine, api, a mix).

tensorflow is the biggest user of mlir so far. The project recently moved to llvm repo to be used by other llvm sub projects and because it's the natural place for it to live.

for example, c/c++ -> mlir -> llvm ir can allow many optimizations that cannot be possibly done at llvm ir level given that it would have lost a lot of context from its original source(the language syntax).

Meh. Seems over engineered and too general. Maybe it’ll be cool for ML. Probably best left ignored for anything else.

The whole thing about using it for C++ optimizations... I’ll believe it when I see a statistically significant speedup. Until then it’s vapor.

you don't need to limit yourself to c/c++. It can be used for various other reasons.

As for c/c++ specifically, you can optimize much better and safer if you have 1 or more intermediate dialects prior to llvm's IR. You don't need to believe it. This is compiler 101.

Of course you can optimize better if you have the right intermediate dialects.

I’m saying that MLIR is unproven for that purpose outside of the ML space.

On a related note: What's a good introduction to compiler design ? Has the dragon book been superseded by something yet? Ideally you'd be able to write a decent compiler for your own language after going through it.

The 2nd edition of the Dragon Book is quite good. I don't think it's been surpassed. For myself, I prefer it with respect to compilers much more than I prefer Computer Architecture A Quantitative Approach with respect to computer architecture.

Second edition, as in the first "Compilers: Principles, Techniques, and Tools", or the second ?

What's wrong with the Dragon Book? Obviously, the older versions won't cover SSA. However, in general, for a hobby project, single-static assignment and abstract interpretation (the modern bits) are only going to help with a very tiny part of the compiler (the optimizer). By the time you need a "lot" of SSA (LICM, SCEV, PEO-register assignment, etc.), you're looking at an easy 10-years of effort, so there'd be no need for an "intro" book at that point...

For a lot of us, the optimizer is not the tiny part. It's the biggest and most important part.

It does not take 10 years to implement an SSA compiler with LICM, SCEV, and good register allocation. (No idea why you'd want Professional Employer Organizations as part of your RA, but that's just me.) For example, WebKit's B3 compiler took 4 months to write. It's not hard to write it if you know what to write. Hence the need for a good book. Maybe the reason why you think it takes ten years and Professional Employer Organizations is that you never read a good book on the subject!

Could you recommend a good book/reference?

I can’t! :-(

I learned by interning on a team that knew how to do it right and learning from them and then learning more by word of mouth - asking folks who architected compilers how/why they did things.

It would be good if title included the publication year: 1995.

Ah, thanks. I have no idea why academic papers don't have the date right under the title. That's what's nice about papers on arxiv, the date is right there on the left margin of the first page.

Academic papers don't really have a publication date when they first become available - the date typically used is the one when the issue it's bundled in is released. That's one complication, and one of the many oddities of publishing metadata (that's what I deal with).

Papers are always so careful to point to grant info (when applicable), so it always bothered me that they didn't have any dates in them. I've been a proponent of including a footnote or something that includes the research period, or date of first authorship, or something. I didn't gain much traction when I was in academia with this, since various publications have pretty strict formatting rules. You may be in a position to push on that idea though!

I've seen a lot of discussion on publication dates, drafts, acceptance, preprints, publication, revisions, etc. However this is the first time I've seen someone mention the research period. It's a fascinating idea. If you want to push this, perhaps an avenue might be to talk to crossref. While they're not the only DOI registration agency, they cover most western content and are good with their APIs. They regularly have events and meetups, and have always been good for a chat. https://www.crossref.org/events/

They're a (relatively) central point for this kind of metadata.

> You may be in a position to push on that idea though!

Ah, unfortunately probably not, I'm very much on the receiving end of the data.

> Papers are always so careful to point to grant info (when applicable),

There's a lot here - surprisingly common for this not to be the case! Funders definitely want this though.

That's a good point, and it's a point strongly in favor of this paper and its authors.

Very few papers keep getting cited and talked about so long after publication!

Added. Thanks!

Applications are open for YC Summer 2020

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact