
Transpiling between any programming languages (2019) - chrisseaton
https://engineering.mongodb.com/post/transpiling-between-any-programming-languages-part-1
======
krcz
While it seems that their transpiler supports just simple JSON-like
expressions with support for constructors for non-native literals (like big
integers) and I don't think it could generalize into arbitrary code
translation, I still appreciate the work they've done. Transpilation between
arbitrary high-level languages is a very hard task, and I doubt anyone can
achieve a good solution for that anytime soon.

I have a different solution in mind: designing a language specifically for
transpilation, so that it could be used to generate high quality, readable
code for various programming languages (I call it "idiomatic translation").
I've recently started exploring that topic as part of the Zygote project [1].

[1] [https://github.com/krcz/zygote](https://github.com/krcz/zygote)

~~~
swsieber
You should check out Haxe. It compiles to 10 source-level languages and a
couple of VMs.

[https://haxe.org/](https://haxe.org/)

~~~
krcz
Thank you, Haxe definitely has its place on my list of things to check. What
worries me however is that there are no examples of output code on their
webpage.

~~~
haxiomic
The output is pretty slim; it's one of the reasons I prefer using Haxe over
TypeScript when targeting the web. Haxe has better dead-code elimination, so
the output JS tends to be smaller and include a greater level of optimisation.
Check out this example to see what I mean:
[https://try.haxe.org/#4f46D](https://try.haxe.org/#4f46D)

------
cdirkx
I did something similar for my bachelor's final project. A manufacturer of
industrial printers wanted to generate API bindings in multiple languages from
a single source of truth.

We started with just transpiling data definitions based on Apache Thrift [1],
but later expanded it to also be able to transpile code checking invariants on
the data and integration tests.

JetBrains' Meta Programming System (MPS) [2] was a really great tool for this,
as it does a lot of what is mentioned in the article for you. It makes it
really easy to define new nodes, syntax rules, parse rules from text,
AST-to-AST transformations, etc.

[1] [https://thrift.apache.org/](https://thrift.apache.org/)

[2] [https://www.jetbrains.com/mps/](https://www.jetbrains.com/mps/)

------
geewee
This seems like a really hard problem to solve - but it's some pretty cool
progress. I do wonder how Mongo decided building a high-level transpiler was
worth their time though. Is the business value of translating those queries
really that high?

~~~
PaulHoule
I think potentially very high.

There's a subset of operators that are similar (but not identical) between
programming languages. For instance, you can add numbers like

    const a = b + c

in JavaScript,

    int a = b + c

in Java, etc. You can even write

    (b+c) as a

in the SELECT statement of a SQL query. If you look at basic operations on
numbers and strings it is not hard to translate expressions between
conventional languages and query languages. Now query languages have different
ideas about control structures, but working at the expression level you can
mostly avoid that.
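
A minimal sketch of that expression-level idea, in Python. The `Var` and `Add` node classes and the `emit_binding` helper are made up for illustration and are not taken from the article; the Java branch simply assumes the operands are ints.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Var, Add]


def emit_expr(e: Expr) -> str:
    """Render the expression; addition looks the same in all three targets."""
    if isinstance(e, Var):
        return e.name
    return f"({emit_expr(e.left)} + {emit_expr(e.right)})"


def emit_binding(target: str, name: str, e: Expr) -> str:
    """Wrap the shared expression in each target's way of naming a result."""
    body = emit_expr(e)
    if target == "javascript":
        return f"const {name} = {body};"
    if target == "java":
        return f"int {name} = {body};"  # assumption: operands are ints
    if target == "sql":
        return f"{body} AS {name}"  # one item of a SELECT list
    raise ValueError(f"unsupported target: {target}")


expr = Add(Var("b"), Var("c"))
for target in ("javascript", "java", "sql"):
    print(emit_binding(target, "a", expr))
```

Only the wrapper around the expression differs per target, which is why staying at the expression level sidesteps most of the control-structure differences.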

Now there are issues around data types (are we adding ints or floats, what to
do about overflow, when you index the Nth element of a string is it an 8-bit
byte (maybe part of a Unicode char), a Basic Multilingual Plane Unicode char,
any code point, or something else...) but those problems don't turn up 100%
of the time when you try to translate.
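
To make those data-type issues concrete, here is a small Python sketch. The `add_int32` helper is hypothetical; it just imitates 32-bit two's complement wraparound (what a Java `int` does) so it can be compared with Python's arbitrary-precision integers.

```python
# One string, three different "lengths" depending on what an index means.
s = "a\U0001F600b"  # 'a', one code point outside the Basic Multilingual Plane, 'b'

code_points = len(s)                           # Python indexes by code point
utf8_bytes = len(s.encode("utf-8"))            # byte-indexed strings
utf16_units = len(s.encode("utf-16-le")) // 2  # 16-bit code units, as in Java or JavaScript

print(code_points, utf8_bytes, utf16_units)


def add_int32(x: int, y: int) -> int:
    """Add two integers with 32-bit two's complement wraparound."""
    r = (x + y) & 0xFFFFFFFF
    return r - 0x100000000 if r >= 0x80000000 else r


max_int32 = 2**31 - 1
print(max_int32 + 1)            # Python integers keep growing
print(add_int32(max_int32, 1))  # a 32-bit int wraps around to the minimum
```

The same source expression `s[1]` or `b + c` therefore needs different target code depending on which of these behaviours the original language guarantees.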

~~~
nradov
The data type and edge case issues have to be considered 100% of the time
unless the transpiler can reliably determine through static analysis that they
aren't a factor.

------
LakshyAAAgrawal
As a Google Summer of Code project, I worked on transpiling Maxima CAS to
Python, both high-level languages. The approach I followed involved converting
to a custom internal IR, and then converting that to Python code.
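
A toy sketch of that two-stage shape (source form to IR, then IR to Python text). Everything here is an assumption for illustration: the nested-list input format, the `Num`/`Sym`/`BinOp` node classes and the operator names are hypothetical and are not the IR used in the actual project.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: float


@dataclass(frozen=True)
class Sym:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, Sym, BinOp]

# Hypothetical source operator names mapped to Python operators.
PY_OPS = {"plus": "+", "times": "*", "expt": "**"}


def to_ir(form) -> Node:
    """Stage 1: lower a nested-list prefix form into IR nodes."""
    if isinstance(form, (int, float)):
        return Num(form)
    if isinstance(form, str):
        return Sym(form)
    op, left, right = form
    return BinOp(op, to_ir(left), to_ir(right))


def to_python(node: Node) -> str:
    """Stage 2: emit Python source text from the IR."""
    if isinstance(node, Num):
        return repr(node.value)
    if isinstance(node, Sym):
        return node.name
    return f"({to_python(node.left)} {PY_OPS[node.op]} {to_python(node.right)})"


source = ["plus", "a", ["times", 2, ["expt", "b", 2]]]
python_code = to_python(to_ir(source))
print(python_code)
```

Keeping the IR separate from both ends means a second target language would only need another stage-2 emitter.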

The full project report:
[https://gist.github.com/LakshyAAAgrawal/33eee2d33c4788764087...](https://gist.github.com/LakshyAAAgrawal/33eee2d33c4788764087eef1fa67269e)

------
empath75
I’m wondering whether, with a large enough corpus, you could use the same
techniques you use for NLP translation.

~~~
tgv
Of course. I hope you don't want to use the output, though.

~~~
sdenton4
Convert the tests by hand, then convert the language by NLP. Then go back and
fix breakages in the test suite...

------
franciscop
I thought that this was near when I wrote "Running PHP in Javascript" [1]. I'm
so happy this article includes images and more in-depth technical
explanations, since I basically hacked it together without understanding half
of it (for fun!).

[1] [https://francisco.io/blog/running-php-in-javascript/](https://francisco.io/blog/running-php-in-javascript/)

------
wruza
If you believe that “transpilation between any languages” is a thing, chances
are you’re observing your local blub paradox. That said, most practical and
practiced languages are really interchangeable, because they bring nothing new
beyond the syntax, minor runtime API differences and fresh names for old
subroutines. Heck, this makes me want to work on my custom best language
again.

------
kyberias
Looks like they read the Watt & Brown book.

~~~
agumonkey
this [https://www.pearson.com/us/higher-education/program/Watt-Pro...](https://www.pearson.com/us/higher-education/program/Watt-Programming-Language-Processors-in-Java-Compilers-and-Interpreters/PGM308089.html) ?

~~~
kyberias
That.

------
maest
What is the use-case for these transpilation features? Who would want to
convert queries from an arbitrary language to another one?

------
yters
General transpilation is logically impossible.

~~~
firethief
Any language that can be compiled for x86 can be transpiled to any language in
which an interpreter of x86 opcodes could be written. The question is what
level of abstraction can be preserved; if it requires lowering that far, it's
not likely to be worth it.

~~~
yters
Well, for that matter any Turing complete language can embed an
interpreter/compiler for any other Turing complete language. If that's all
that is meant by transpiling, it is no different than something like the JVM.

My understanding of 'transpiling' is that it is meant to convert, say,
idiomatic Python into idiomatic Java.

------
fsajkdnjk
Sounds great, doesn't work. If it worked, we would already have perfect
translations for spoken languages. And Google (Translate), with its massive
computing, financial and brain power, still hasn't figured it out.

~~~
cdirkx
Spoken languages have the problems of ambiguity, context, and meaning that
changes over time and between people. Programming languages are pretty
structured and well defined.

~~~
krcz
On the other hand, most spoken languages use similar underlying concepts (with
some exceptions, e.g. translation into a language without subjective direction
terms like "left" or "right" might be tricky). Programming languages can have
completely different execution models, especially when you compare ones
across paradigms. Translating an imperative for loop into a map/filter/fold
chain in a functional language would have to be very non-local.
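
A small Python illustration of why that rewrite is non-local (the function names are made up for the example). To get from the first form to the second, a translator has to look at the whole loop body and recognise that the `if` is a filter, the `x * x` is a map, and the running `total` is a fold.

```python
from functools import reduce


def sum_even_squares_loop(xs):
    """Imperative form: one loop mixing the test, the transform and the accumulator."""
    total = 0
    for x in xs:
        if x % 2 == 0:
            total += x * x
    return total


def sum_even_squares_chain(xs):
    """Functional form: the same computation split into filter, map and fold."""
    evens = filter(lambda x: x % 2 == 0, xs)
    squares = map(lambda x: x * x, evens)
    return reduce(lambda acc, y: acc + y, squares, 0)


print(sum_even_squares_loop([1, 2, 3, 4]))
print(sum_even_squares_chain([1, 2, 3, 4]))
```

A statement-by-statement translation would keep the loop and the mutable accumulator; producing the chain requires understanding what the loop as a whole computes.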

~~~
cdirkx
Yes, it would probably produce a lot of unidiomatic code (see for example
c2rust [1], which generates semantically equivalent Rust code, but with raw
pointers and unsafe everywhere). In the worst case you could even just emulate
the origin language in the target language; everything is Turing complete
after all.

With enough effort and smart transformation passes however, I think you could
come pretty close to something resembling "native" code. It might not always
be worth it to spend that development time though.

[1] [https://c2rust.com/](https://c2rust.com/)

