
Ask HN: Can an interpreted-only language be implemented in a JITted one? - rscho
To illustrate my question, J is a language that is, to the best of my knowledge, only interpreted since parsing it requires runtime information (http:&#x2F;&#x2F;www.jsoftware.com&#x2F;help&#x2F;jforc&#x2F;parsing_and_execution_ii.htm).<p>Given that we now have languages that combine a JIT with metaprogramming abilities such as Racket or Scheme, would it be possible to implement an interpreter-only language such as J as a set of macros to compile to the host language without compromising the original language behaviour?<p>I am asking this because, as I understand it, a JIT blurs the &quot;compile-time vs runtime&quot; frontier.
======
yorwba
Yes, but the JIT may not actually know how to optimize the resulting
translated code, if it doesn't have the kinds of patterns the optimizer is
looking for.

Based on the page you linked, it seems like there are 4 different kinds of
language elements, and parsing depends on which of those kinds something
evaluates to. I'd assume that in normal code, most expressions will always
evaluate to the same kind, since otherwise the humans writing the code would
get confused. In that case, it might be possible to eliminate most of the
ambiguity through static type inference.

For programs where you are unable to eliminate that ambiguity at compile time,
you could compile code into a series of of function calls that take the
relevant runtime parameters and dispatch on the types to perform the
appropriate action. If you do that in a way that's similar to the usual type-
dispatch in the host language, the JIT compiler will likely optimize it using
e.g. a polymorphic inline cache.

I hope that helps and you'll actually see my answer. If you do, I'd like to
know what got you interested in that question. (Are you trying to optimize an
existing J codebase?)

~~~
rscho
Thank you for your very informative answer!

My job leads me to write a lot of ad-hoc scripts for medical research. It
consists almost solely in chains of extremely boring pandas/R functions which
quickly turn into a mess, because the raw data ALWAYS comes in non-regular
form, with corner cases and irregularities everywhere that must be regularized
before analysis (that is, of course, done with Excel by my boss...)

The key here, is that medical data consists in very varied small datasets in
the vast majority of cases (typically < 10^6 cells), so code reusability is
very low since the data irregularities are different every time.

My impression is that languages such as J/APL would shine for such
applications, since you won't have to remember what your code meant a year
later, and allows very concise code compared to the pages of Pandas/R I have
to write to handle corner cases. That said, I would also definitely appreciate
to freely mix J in my codebase as a DSL rather than count on FFI. I also hope
to be able to achieve auto annotation of functions using J code through the
nice Racket tools for language transformation. As for performance improvement,
I'm not really counting on it since I suspect that the Racket JIT will not be
any faster than the J interpreter that is so small that it fits entirely into
cache memory.

In summary, big and regular data is all the rage, but let's not forget the
armies of people handling small and irregular data :-)

~~~
yorwba
If something more like APL is within the realm of possibilities, you might
want to take a look at the co-dfns compiler (although some of it may be
specific to GPU code): [https://github.com/Co-dfns/Co-
dfns](https://github.com/Co-dfns/Co-dfns)

However, for 10^6 cells, even O(n^2) algorithms should run in a few hours at
worst, so performance optimizations might not be too critical. I'd suggest
first validating your idea with an existing interpreter. Building your own
tools is always fun, but can also be a huge distraction that you probably
shouldn't take on unless your existing options are not adequate.

------
gus_massa
Can you give a simple example, that shows the problem parsing J? IIUC the
example in the link, the parser has to know the arity of the functions. This
can be done in Racket using macros that look like functions. (For example,
functions with optional arguments are actually macros that look like
functions.)

I'm not sure that JIT is so important here, unless you are planning to
generate bytecode on the flight and JIT it.

