
Compiling dynamic programming languages - eatonphil
http://notes.eatonphil.com/compiling-dynamic-programming-languages.html
======
alangpierce
Hmm, I think the performance results for the last example aren't actually
valid. The example/tco.js program is a linear-time algorithm since it's only
making one recursive call, not two, so at n=50 you should expect it to be
pretty much instant. Much, much less than a millisecond to evaluate.

I think the "0.080 total" and "0.087 total" are just from node startup time
(and the variability within that), not time actually executing the function. I
just ran node on an empty file and the running time was "0.087 total".

I think when doing these sorts of perf measurements it's best to make the code
run for multiple seconds to better account for startup time, JIT warmup,
caches, and the various other subtle factors that come into play.

~~~
snek
I cannot agree more here. I strongly recommend benchmark.js
([https://github.com/bestiejs/benchmark.js/](https://github.com/bestiejs/benchmark.js/))

~~~
eatonphil
Totally fair, and thank you for the link! To focus too heavily on performance
right now would be a mistake. Even longer-term, jsc's most likely application
is in simplifying deployments and packaging of Node applications. I included
the programs in this article because I wanted to show _some_ example of where
jsc stood a few weeks in.

------
snek
In V8, crossing the boundary between the public C++ API and "js land" is
actually quite expensive. In most cases you will get more perf from writing
your code in JS. This is why we write a lot of Node.js's core in JS instead of
C++.

Operations you see in the compiled output like
`Local<Function>::Cast(global_3->Get(String::NewFromUtf8(isolate,
"Boolean")));` are extraordinarily expensive, and should pretty much be
avoided at all costs.
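The fix on the C++ side is to do the lookup once and reuse the handle. The same hoisting pattern can be sketched in plain JS (a rough analogy, not the article's actual generated code):

```javascript
// Re-resolving a global on every element, analogous to rebuilding the
// "Boolean" string and calling Get() on each use in the generated C++.
const slow = (xs) => xs.map((x) => globalThis.Boolean(x));

// Hoisted once and reused -- the analogue of caching the Local<Function>
// (or a v8::Persistent handle) outside the hot path.
const cachedBoolean = globalThis.Boolean;
const fast = (xs) => xs.map((x) => cachedBoolean(x));

console.log(fast([0, 1, "", "x"])); // [ false, true, false, true ]
```

In JS the engine makes this lookup cheap anyway; in embedder C++ it is not, which is the point above.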

~~~
eatonphil
Thanks for the info. jsc right now is little more than a PoC and I don't have
a ton of hope of competing with V8 on performance long term (except perhaps
when type inference or hinting enter the story).

The most immediate advantage I can think of for a project like jsc long-term
is packaging/deployment. We use Zeit's pkg tool at work today, but a
robust, mature JavaScript-to-native compiler is much more compelling.

~~~
mncharity
> competing with V8

The early Python community went for C monoliths, but there was an alternative
of Python objects carrying C pointers. So Python provides a form of API/plugin
runtime dynamic linkage for smaller-grain-size C libraries. I don't know if
anyone has explored that for JavaScript. If you have a problem domain of
interest in need of almost-native speed (pointer call rather than static
linkage) with 'compute-intensive runtime assemblages of C chunks' (e.g.,
graphics or scientific computing), it might be a possibility.

> [...limited subset of javascript...?]

I don't quickly find the comment I intended to reply to, but fwiw, note that
the ecmascript spec is written in sort-of pseudo-code English. Many years back
I framed spec implementation as a semi-manual textual database cleanup and
transformation exercise, and in a couple of days massaged the spec into code.
It helped that the target language had label gotos and permitted arbitrary
identifiers, so the transformation was simple and the resulting code looked
just like the spec. (Kind of ironic to see a comment elsewhere on this page
dissing regexps.)

------
timruffles
I'm doing this for Javascript - finding it a lot of fun!
[https://github.com/timruffles/js-to-c](https://github.com/timruffles/js-to-c)

I'd recommend it as a project to any wanting to learn more about a particular
language (implementing a language teaches you how it works in painstaking
detail), and get a better 'feel' for how languages and compilation works in
general.

~~~
sdegutis
Ironic that your first initial and last name spell "truffles", considering
that bridging between C and JS (and other languages) is possible if they're
implemented via Graal's Truffle library:
[https://github.com/oracle/graal/tree/master/truffle](https://github.com/oracle/graal/tree/master/truffle)

~~~
jjnoakes
It doesn't have to be negative to get downvotes. Sometimes irrelevant
discussion items are downvoted to keep them in the periphery.

------
UncleEntity
I wonder if it's possible to just get the AST out of V8 and not have to
maintain your own lexer/parser? I also wonder if abstract interpretation might
be the way to go to get type information out of the JS code, which should
simplify the generated C++.

That said, reading TFA and the linked BSDScheme one gave me some ideas for the
next time I get around to playing with minischeme. Too many toys and not
enough time...

~~~
eatonphil
I'm not using a parser I wrote. Though of course someone must maintain it.

I considered writing this in C++ to get more tooling (JS parser, C++ AST
libraries, etc.) but in the short-term I do not see myself switching. I'd be a
little more likely to switch back to D though because I find data structures
in Rust annoying.

------
bambataa
It’s striking how unreadable the V8 output is. Is that due to V8 itself or
just this particular use of it?

~~~
benbristow
I don't think the output really needs to be readable since they're going for
speed over anything else.

Who is normally reading this stuff?

~~~
alangpierce
Depends on the context, but FWIW, CoffeeScript has an explicit design goal of
producing human-readable output (even though it's just JS to run in the
browser) so that people learning the language can try out examples and build
intuition and trust in what JS is going to be produced. It also helps when
using a debugger without source maps.

------
jejones3141
I'm disappointed. From the title it sounded like someone had created a
language specifically for dynamic programming
([https://en.wikipedia.org/wiki/Dynamic_programming](https://en.wikipedia.org/wiki/Dynamic_programming)).

~~~
jakeinspace
Plenty already exist which support dynamic programming "natively." For
example, the 'memoize' function in clojure.
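For reference outside Clojure, here is a minimal single-argument `memoize` sketched in JavaScript (Clojure's version handles arbitrary argument lists):

```javascript
// Wrap a function so repeated calls with the same argument hit a cache.
// Assumes a single argument that works as a Map key.
function memoize(fn) {
  const cache = new Map();
  return (arg) => {
    if (!cache.has(arg)) cache.set(arg, fn(arg));
    return cache.get(arg);
  };
}

// Classic DP use: exponential naive Fibonacci becomes linear once memoized.
const fib = memoize((n) => (n < 2 ? n : fib(n - 1) + fib(n - 2)));
console.log(fib(40)); // 102334155
```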

~~~
alangpierce
I bet there are ways a programming language could support dynamic programming
better than just having a memoize function in the standard library.

Examples:

* For a given DP algorithm, you might evaluate it top-down (recursive memoized call) or bottom-up (filling in the table values in topological order). Top-down is easiest to implement, but bottom-up can avoid stack depth limits and I think can be more efficient if implemented well. A language specifically crafted for it might let you write the simpler recursive implementation and efficiently evaluate it bottom-up.

* Sometimes you might want to run the same DP algorithm in different contexts, e.g. operating on different data sets (where the data set is constant for a recursion tree). Keeping the data set (or even a pointer to it) in each cache entry is wasteful if your cache has 100 million things in it, and passing it down as a parameter is also not as efficient as it could be. I guess closures may solve the problem, but maybe there are smarter ways that a language could help here.
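The two evaluation orders in the first bullet can be sketched for Fibonacci (JavaScript here purely for illustration; the bullet's point is that a DP-aware language could derive the bottom-up version from the simpler top-down one):

```javascript
// Top-down: the natural recursive definition plus a memo table.
function fibTopDown(n, memo = new Map()) {
  if (n < 2) return n;
  if (!memo.has(n)) {
    memo.set(n, fibTopDown(n - 1, memo) + fibTopDown(n - 2, memo));
  }
  return memo.get(n);
}

// Bottom-up: fill the table in topological order, no recursion at all,
// so stack depth is never a concern for large n.
function fibBottomUp(n) {
  const table = [0, 1];
  for (let i = 2; i <= n; i++) table[i] = table[i - 1] + table[i - 2];
  return table[n];
}

console.log(fibBottomUp(50) === fibTopDown(50)); // true
```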

------
znpy
One might argue that this guy actually wrote a transpiler and not a true
compiler as the target language is D and not assembly or machine language.

I don't want to diminish the work this guy has done, but one of the biggest
challenges, unless you want to take credit for someone else's work, is doing
all the various kinds of optimization that a mature compiler does, at various
levels.

So yeah, that's cool, but I was expecting more given the use of the word
"compiling".

~~~
n4r9
I think the use of the word is fine. Whilst "compile" is often used to refer
to programs whose target language is assembly/machine code, it generally
refers to any translation from one computer language to another [0].

A "transpiler" is therefore a specific type of compiler.

[0]
[http://www.compilers.net/paedia/compiler/index.htm](http://www.compilers.net/paedia/compiler/index.htm)

~~~
pulsarpietro
Yeah, that is what I was taught as well: from one abstract machine to another.

If we think about it, javac would not be a compiler otherwise :-)

~~~
eatonphil
The reason I'm comfortable calling BSDScheme in particular a compiler is
because ultimately you get a small binary you can bring to another machine
without a runtime. Javac may get you bytecode, but you're still dependent on a
JVM. That's the most useful distinction for most users, not whether it
compiles straight to assembly.

In the case of JSC, a binary is produced currently but it must be launched by
a Node app. So it doesn't exactly meet my criteria. But being able to produce
embedded V8 code will not be significantly more difficult for these examples.
(Recreating Node's stdlib would of course be difficult but that's separate.)
I'm hoping to have a Node-independent target soon.

~~~
pulsarpietro
"I depend on the JVM" in my mind translates to "my target is the JVM".

You are "dependent", if you like, on the runtime (rt.jar). The way it was
explained to me, and the model I'm sticking to in my mind, is:

Abstract machine/language + runtime = one layer of the onion that is a
computer.

For example:

  Assembly/CPU + libc
  Bytecode/JVM + rt.jar

The language manipulates the resources provided by the machine (registers,
the stack machine, etc.).

My recollections are fading, but I found this model good enough to explain
to me _well_ how a computer works.

