

Ask HN - What's the best language for AST manipulation? - rch

I'm starting a new project at work that will require a good bit of run-time code generation. I like the AST module in Python, but I'd like to think I have an open mind.<p>I also already know about, and sometimes use, Java Emitter Templates, but that doesn't feel like a run-time solution.<p>Maybe I'm subconsciously looking for something with the flavor of Go, Lua, Erlang, Ruby, etc. etc. -- you know, trendy ;)<p>This is a 18+ month long project, so learning a new language is completely reasonable (I picked up Python under similar circumstances).<p>It is OK if Lisp is the answer... how is arc coming along anyway?
======
timrobinson
ASTs are trees of nodes. At each level, a node will typically be one out of a
range of choices.

In OO languages this turns into a class hierarchy of nodes, and you'd use the
visitor pattern to manipulate them. This approach is awful: it generates large
amount of boilerplate code in your program, which, if you're not careful,
becomes so large that it obscures the actual logic.

Functional languages provide a better abstraction, in the form of
discriminated unions for data structures, and pattern matching in place of the
visitor pattern.

I recommend one of the ML family of languages: this is likely to be OCaml,
Haskell or F#. I recommend Andrew W Appel's books on "Modern Compiler
Implementation": he wrote three books, aimed at Java, C and ML, and the ML
book is wonderfully clear.

~~~
rch
Thanks for the excellent answer, and I am certainly considering Haskell (even
though I didn't mention it). I'll grab the book as well.

I'd still like to make the best case for the AST side though, since my first
goal is to compare the two approaches directly. I'd hate to relay what you've
listed to a bunch of Perl programmers just to have someone stand up and ask
about Lua0x, or some such.

It might be worth noting that this project is only on the table because the
Python generator just plain works...

Anyway, thanks again.

~~~
GregBuchholz
You'll also want to investigate Prolog. That's the grandfather of pattern-
matching tree-manipulation languages. Also, it is dynamically typed if that
floats your boat more than Haskell or ML. ML always forces you to deal with
incomplete patterns, but you need to specifically ask your Haskell compiler to
warn you about incomplete patterns.

It is hard to beat Prolog for quickly whipping up prototype language
interpreters. Besides pattern matching, there are easy to use symbols (just
start a word with a lower case letter), declarative semantics, definite clause
grammars (for parsing), user definable infix, prefix, and postfix operators,
etc

~~~
rch
Thanks very much! That is exactly the kind of answer I was looking for.

------
regularfry
I'm having a hard time thinking of a downside to Clojure here...

------
daniel-cussen
The answer is probably Lisp. Whether you use macros or eval, homoiconicity is
your friend. Especially if it's 18+ months; you get a lot of leverage from
macros in large/long projects, even if you don't use them as the engine of the
application.

------
rch
Here's one: 'automatic dynamic compilation, generating code at run time with
the goal of improving application performance.'

[http://www.cs.washington.edu/research/projects/unisw/DynComp...](http://www.cs.washington.edu/research/projects/unisw/DynComp/www/)

