Hacker Newsnew | comments | show | ask | jobs | submit login
Ask HN - What's the best language for AST manipulation?
7 points by rch 1678 days ago | 7 comments
I'm starting a new project at work that will require a good bit of run-time code generation. I like the AST module in Python, but I'd like to think I have an open mind.

I also already know about, and sometimes use, Java Emitter Templates, but that doesn't feel like a run-time solution.

Maybe I'm subconsciously looking for something with the flavor of Go, Lua, Erlang, Ruby, etc. etc. -- you know, trendy ;)

This is a 18+ month long project, so learning a new language is completely reasonable (I picked up Python under similar circumstances).

It is OK if Lisp is the answer... how is arc coming along anyway?

ASTs are trees of nodes. At each level, a node will typically be one out of a range of choices.

In OO languages this turns into a class hierarchy of nodes, and you'd use the visitor pattern to manipulate them. This approach is awful: it generates large amount of boilerplate code in your program, which, if you're not careful, becomes so large that it obscures the actual logic.

Functional languages provide a better abstraction, in the form of discriminated unions for data structures, and pattern matching in place of the visitor pattern.

I recommend one of the ML family of languages: this is likely to be OCaml, Haskell or F#. I recommend Andrew W Appel's books on "Modern Compiler Implementation": he wrote three books, aimed at Java, C and ML, and the ML book is wonderfully clear.


Thanks for the excellent answer, and I am certainly considering Haskell (even though I didn't mention it). I'll grab the book as well.

I'd still like to make the best case for the AST side though, since my first goal is to compare the two approaches directly. I'd hate to relay what you've listed to a bunch of Perl programmers just to have someone stand up and ask about Lua0x, or some such.

It might be worth noting that this project is only on the table because the Python generator just plain works...

Anyway, thanks again.


You'll also want to investigate Prolog. That's the grandfather of pattern-matching tree-manipulation languages. Also, it is dynamically typed if that floats your boat more than Haskell or ML. ML always forces you to deal with incomplete patterns, but you need to specifically ask your Haskell compiler to warn you about incomplete patterns.

It is hard to beat Prolog for quickly whipping up prototype language interpreters. Besides pattern matching, there are easy to use symbols (just start a word with a lower case letter), declarative semantics, definite clause grammars (for parsing), user definable infix, prefix, and postfix operators, etc


Thanks very much! That is exactly the kind of answer I was looking for.


I'm having a hard time thinking of a downside to Clojure here...


The answer is probably Lisp. Whether you use macros or eval, homoiconicity is your friend. Especially if it's 18+ months; you get a lot of leverage from macros in large/long projects, even if you don't use them as the engine of the application.


Here's one: 'automatic dynamic compilation, generating code at run time with the goal of improving application performance.'



Applications are open for YC Winter 2016

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact