
What do people mean when they say “transpiler”? - tosh
http://composition.al/blog/2017/07/30/what-do-people-mean-when-they-say-transpiler/
======
Stratoscope
Back in the mid-80s, I frequented BIX (the BYTE Information eXchange). There
was a discussion group for the C programming language, and someone mentioned
this new C++ language. I naively asked, "C++? That's a preprocessor, isn't
it?" I was thinking of the original Cfront, of course, which translated C++ to
C.

Bjarne Stroustrup jumped in and gently chewed me out: "The Cfront compiler is
_not_ a preprocessor! It is a full-fledged compiler just like any other. Sure,
it uses C as the compilation target right now, but it won't always be that
way. It could generate native code, but compiling to C let us get up and
running more quickly and on more hardware architectures than we could if we
targeted machine code right away."

(Those weren't his exact words, but definitely the gist of it. Some things
like this stick in your mind.)

If "compiler" was good enough for Bjarne, it's definitely good enough for me.

Now jumping ahead to the present, consider Kotlin. Its compiler can target JVM
bytecode and JavaScript, and they're working on native support using LLVM for
the final code generation pass.

Is Kotlin a transpiler when it targets JavaScript, and only a true compiler
when it targets the JVM or native code?

It just seems like a silly distinction. Why bother having a separate name? A
compiler is a compiler.

~~~
snarfy
It is silly. I think part of it is simply a cultural divide between old school
C programmers and young front end web developers.

There used to be two spaces after a period, oxford commas, and 'literally'
meant 'as it was written'. Languages change (for better or worse). Kids these
days.

~~~
Stratoscope
Hey, you're welcome on my lawn any time.

Some years ago I went on an editing binge on Wikipedia, finding a bunch of
templates that "misused" the phrase "due to" and changed them to "because of".
As we all know, you _must_ get this right:

"The flight delay was due to bad weather."

vs.

"The flight was delayed because of bad weather."

but _never_ :

"The flight was delayed due to bad weather."

The next day, someone reverted all my changes and asked, "Mike, have you
looked at a dictionary lately?"

Oops.

In that moment I was enlightened: I realized I could care less.

I even leave out the Oxford comma as often as I use it. In fact I leave out a
_lot_ of commas I used to put in.

But there is one case where the Oxford comma is still necessary. If you leave
it out of this joke it ruins it:

"There are two hard problems in computer science: naming things, cache
expiration, and off by one errors."

~~~
gumoro
> I realized I could care less.

I see what you did there. Well done.

~~~
e12e
Webster doesn't mention any change in use, simply:

"care less : not to care — used positively and negatively with the same
meaning <I could care less what happens> <I couldn't care less what happens>"

¯\_(ツ)_/¯

~~~
taneq
The first doesn't mean the same as the second, though. O.o

~~~
e12e
(On the off-chance you're not being tongue in cheek): No, "used positively and
negatively _with the same meaning_ ", as in the two differently phrased
_idioms_ mean the same thing:

[http://www.dictionary.com/e/could-care-less/](http://www.dictionary.com/e/could-care-less/)

[http://www.slate.com/blogs/lexicon_valley/2016/04/05/the_rea...](http://www.slate.com/blogs/lexicon_valley/2016/04/05/the_real_reason_people_say_i_could_care_less.html)

[https://xkcd.com/1576/](https://xkcd.com/1576/) (!)

~~~
e12e
I suppose, these days, for the sake of clarifying that I'm being sarcastic, I
should just start saying: "I cloud care less".

~~~
taneq
To be perfectly honest, I'm kind of 50/50 between a genuine vexation with
"could care less" being semantically wrong, and a cheerful agreement that
usage defines meaning. The world's inconsistent and so occasionally, in
response, am I. :P

~~~
Asooka
Just use I'd'nt careless.

------
duncanawoods
I have written transpilers and would not want to use the term compiler because
it would be a deceptive description of the work involved. It would be a
pretentious aggrandisement akin to saying I built a car because I changed its
air freshener.

A production ready optimising compiler is one of the greatest achievements in
software engineering. A level of complexity and sophistication few of us can
reach. A transpiler is often just a trivial text transformation that a junior
programmer can do. The purpose may seem superficially similar but the effort
and expertise is so totally incomparable that a specific term is justified.

edit: it reminds me of the difference between a ship and a boat. A transpiler
will use someone else's compiler/interpreter but a compiler won't use someone
else's transpiler. The two terms are meaningful to those intimate enough to
need to distinguish degrees of sophistication but seem redundant to those who
merely use them.

~~~
Stratoscope
Sure, if you're doing a simple text transformation akin to the C preprocessor,
then maybe that shouldn't be called a compiler. But we already have a word for
that: "preprocessor". Which is probably why, as I mentioned in my other
comment, Bjarne was miffed when I called Cfront a preprocessor - it was
certainly nothing like the C preprocessor!

I guess in today's terminology I would have called it a "transpiler", because
it simply transformed one source language to a fairly similar source language
and didn't have to optimize the final machine code.

But there's a lot more to a compiler than optimization. Take TypeScript for
example. Even though it generates JavaScript - and JavaScript that looks very
much like the original TypeScript code if you're not having it translate newer
syntax to older syntax - it does quite a bit of rather sophisticated work with
all the type inference and type checking.

TypeScript doesn't need to worry too much about optimization, because it knows
that the JavaScript engine that eventually runs the compiled code will do a
bunch of JIT optimization.

Similarly, a compiler that targets LLVM or the JVM can rely on those engines
to do much of the optimization. But the compiler is still a nontrivial piece
of work.

Maybe I would suggest the term "preprocessor" for something that really is
just a simple text transformation, one that you might implement with regular
expressions or hand off to a junior programmer.
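
A minimal sketch of that regex-level kind of "preprocessor", in Python. The
`DEFINES` table and the `preprocess` function are hypothetical names made up
for illustration. The tool never parses anything: it only swaps whole-word
names for text, so it would rewrite a name inside a string literal or a
comment just as readily.

```python
import re

# Hypothetical macro table, purely illustrative.
DEFINES = {"MAX_USERS": "64", "GREETING": '"hello"'}

def preprocess(source: str) -> str:
    """Replace whole-word macro names with their values; no parsing involved."""
    names = "|".join(map(re.escape, DEFINES))
    pattern = re.compile(r"\b(" + names + r")\b")
    return pattern.sub(lambda m: DEFINES[m.group(1)], source)

print(preprocess("int users[MAX_USERS]; puts(GREETING);"))
# prints: int users[64]; puts("hello");
```

Nothing here knows about scopes, types or even tokens, which is the gap
between this and something like Cfront or TypeScript.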

~~~
duncanawoods
I see a preprocessor as something distinct from a transpiler - a way to extend
a language rather than transform it into another one.

This debate is really just about how terms fit a continuum from extension
(supported by language constructs) -> preprocessor (add features that can't be
supported by host language extensions) -> transpiler (support a totally
different but conceptually similar language) -> compiler (conceptual
differences from target requiring a more self-standing implementation that
tackles fundamental problems).

The separations between the levels of sophistication are fuzzy and matters of
degree. A single project might legitimately mature through the stages. For
example, I don't object to calling TypeScript a compiler once it began to
generate code for sophisticated concepts not supported by the underlying
language.

------
derefr
My take is that a "compiler" always compiles _down_ (i.e. throws away
information that cannot be recovered from the result), while a transpiler may
actually be _lossless_ —you could potentially write another transpiler to go
in the other direction, and the two together would form a bijective function.

On the other hand, sometimes "transpiration" involves both throwing away
information, and then either heuristically _recovering_ information (i.e.
compiling to assembler, followed by _decompiling_ to the target language) or
_inventing_ information (i.e. compiling to object code and then wrapping that
object code in a VM written in the target language.) You wouldn't call a
program that involved either of these a "compiler"; it would most certainly be
a "transpiler" only.

~~~
sigjuice
Citation(s) needed for "compiler" always compiles _down_.

------
deergomoo
Despite the variation in meaning, I think what ‘transpiler’ means locally to a
particular technology group is usually well understood by its community. For
example, if you’re a JavaScript developer and someone mentions a transpiler, I
think it’s safe to say you immediately think of something like Babel (although
interestingly, it describes itself as a compiler) and it’s generally well
understood that it means a tool that takes either a JavaScript alternative or
currently browser-unsupported JS and builds something that a wide range of
browsers support.

Is local understanding good enough? I’d say probably, because if you’re unsure
what a particular transpiler does, you could just research that particular
tool. And, in the case of JavaScript at least, when you’re in deep enough to
learn about transpilers and build systems, the myriad tools available for
slightly different approaches to the same job is probably more of a concern
than the generic word used to describe them.

~~~
yoz-y
I know it when I see it. Having more precise words makes communication faster
and I agree that people who talk about transpilers know what they mean. I
suspect that most people who criticise the term understand its meaning very
well too. They just don't like it.

------
rocqua
I'd say if it doesn't output bytecode, it's a transpiler. In bytecode I'd
include JVM, LLVM and actual machine code. There is some ambiguity for Python,
since Python files are transformed to Python bytecode before being
interpreted.
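
The Python case can be poked at from inside the interpreter. A small sketch,
assuming CPython: the built-in `compile` turns source text into a code object
whose `co_code` attribute holds the bytecode, and the standard `dis` module
prints a listing of it. The opcodes differ between CPython versions, so no
listing is reproduced here.

```python
import dis

source = "x = 1 + 2\n"

# Source text -> code object; "<example>" is just a label used in tracebacks.
code = compile(source, "<example>", "exec")

print(type(code.co_code))  # the raw bytecode is a bytes object
dis.dis(code)              # human-readable instruction listing

# The same code object is what the interpreter loop then executes.
namespace = {}
exec(code, namespace)
print(namespace["x"])
```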

~~~
xigency
So Chicken Scheme uses a transpiler and not a compiler? What about Nim?

I don't think that's a sufficient definition when these fully capable
languages call their tools that output C code "compilers."

Honestly, the article's definition with levels of abstraction seems
reasonable. The languages I mentioned are high-level and C operates at a lower
level of abstraction, therefore they use compilers.

~~~
dom96
Indeed, Wikipedia also defines it this way. Compiling from Nim to C is moving
from a high level of abstraction to a lower level and so Nim is a compiler.

------
duneroadrunner
One case the author left out was translation to "idiomatic" higher level
language, which involves identifying abstractions that are intrinsically
present, but were never explicitly expressed, in the original code. For
example I've been working on a translator/converter from C to a memory-safe
subset of C++[1]. The output is intended to be readable and maintainable as
source code in its own right.

So when for example, encountering a pointer in the original C source, you have
to determine from context whether the author is using it as an iterator to a
fixed-sized array, an iterator to a dynamically-sized array, an "owning"
reference to a dynamically allocated object, or just a weak/observer reference
to an object, and translate to the appropriate "higher-level" element.

So in a sense it's a "decompiler" to a (higher-level) language the code was
never compiled from. As a source-to-source transformer, presumably it would
qualify as a "transpiler". But does that term have a connotation that the
output is just an intermediate translation not intended to be maintained or
used directly?

[1] [https://github.com/duneroadrunner/SaferCPlusPlus-AutoTransla...](https://github.com/duneroadrunner/SaferCPlusPlus-AutoTranslation)

~~~
saltcured
I think you are describing a source to source translator, and an awfully fancy
one at that. A translator generally would be trying to map or maintain the
structural idioms in the source code, allowing for human understanding or
maintenance of the translated code. But to improve the coding style by
detecting latent idioms seems a bit much even for most translators.

I always considered a transpiler to be closer to a translator than to a
compiler, but without the translator's concern for maintaining human-readable
code. For me, the boundary between transpiler and compiler is in the runtimes.

A transpiler would be targeting the whole runtime of a target language, i.e.
one with non-trivial type systems, flow control/exception handling, and memory
management/GC. The transpiler maps source language runtime concepts to target
language runtime concepts in a relatively straightforward fashion. A compiler
targets some lower-level abstract machine language and provides its own
distinct runtime system as well.

A similar boundary exists for interpreters, where a meta-interpreter does
relatively high-level source-to-source conversion before delegating to a
target language interpreter and its runtime. These meta-interpreters are akin
to transpilers, while full blown interpreters are akin to compilers, targeting
a lower level abstract machine and providing their own runtime systems.

~~~
tom_mellior
> where a meta-interpreter does relatively high-level source-to-source
> conversion before delegating to a target language interpreter

This is most emphatically _not_ what Prolog meta-interpreters do. (Those are
the ones I am most familiar with.) They do not build up new source code, they
interpret given terms in "new" ways that are not built into the Prolog
implementation. Systems that build new source code are called expansions or
sometimes macros.

I don't think the Lisp world would call such systems meta-interpreters either.
There, too, you have macros that transform source code.

Do you have a reference that uses the term "meta-interpreter" in the way you
are describing?

~~~
saltcured
Sorry, it's been many years and I misremembered the term "meta-circular
evaluator", i.e. everybody's toy implementation of an interpreter with a REPL
and very simplistic (if any) changes to the source language being interpreted.

------
sjrd
My main issue with the term "transpiler" is not its existence per se. It is
the effect that its existence has on a large portion of developers, having
them reject some kinds of technologies for the wrong reasons.

If we look back to when the term "transpiler" was made popular (not
necessarily coined), it is fairly widely acknowledged that it was through
CoffeeScript, which defined itself as such. In a sense, I think CoffeeScript
was right (or at least not wrong) to define itself as "transpiler" rather than
"compiler". After all, it was a syntax tree to syntax tree transformation,
technically involving no more than a parser and a pretty-printer (not saying
there's anything wrong with that).
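
A rough sketch of that "parser, tree rewrite, pretty-printer" shape. It uses
Python's standard `ast` module (`ast.unparse` needs Python 3.9+) as a
stand-in and is not how CoffeeScript itself is implemented. The
`DesugarAugAssign` class and `transpile` function are hypothetical names for
illustration.

```python
import ast

class DesugarAugAssign(ast.NodeTransformer):
    """Rewrite `x += e` into `x = x + e`, for plain variable names only.

    Illustrative simplification: for mutable objects the two forms are not
    strictly equivalent, because `+=` may update the object in place.
    """

    def visit_AugAssign(self, node):
        if not isinstance(node.target, ast.Name):
            return node  # leave `xs[0] += 1` and similar untouched
        load = ast.Name(id=node.target.id, ctx=ast.Load())
        value = ast.BinOp(left=load, op=node.op, right=node.value)
        assign = ast.Assign(targets=[node.target], value=value)
        return ast.copy_location(assign, node)

def transpile(source: str) -> str:
    tree = ast.parse(source)                     # parser
    tree = DesugarAugAssign().visit(tree)        # tree-to-tree rewrite
    tree = ast.fix_missing_locations(tree)
    return ast.unparse(tree)                     # pretty-printer

print(transpile("total += price * 2"))
# prints: total = total + price * 2
```

The output is still recognisably the input program, which is the property
being described: no type checking, no lowering, just a reshaped syntax tree.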

However, because CoffeeScript was the first language to compile to JavaScript,
swaths of developers have associated the term "transpiler" to "compiles to JS"
or "compiles to another language that also happens to be used as source
language". Now, all these developers will systematically refuse to call
TypeScript, ClojureScript, Scala.js, etc. "true compilers" (let me list a
few more in alphabetical order so I'm not perceived as _totally_ biased:
BuckleScript, Elm, Flow, Kotlin, PureScript). Instead they insist on calling
them "transpilers" and associating their characteristics to that of
CoffeeScript.

Now that CoffeeScript is falling out of favor (in part because most of what it
brought to the table has been picked up by ES2015), this category of
developers associates any kind of language that compiles to JS as a thing that
no one would ever want to use. Worse, they will often consider such languages
as insults to their craft. "Why don't you just _learn_ JavaScript and code in
it?", they ask. This is a cultural problem, because this mindset prevents them
from even considering what other languages can bring to them.

Technologically speaking, there is absolutely nothing separating
ClojureScript/Scala.js from Clojure/Scala. The former compile to JavaScript;
the latter to JVM bytecode; but the amount of compiler engineering that goes
into all these compilers is basically the same. However, somehow, culturally,
they are fundamentally different: ClojureScript/Scala.js devs should just
learn JavaScript, while it's OK for Clojure/Scala devs not to "learn Java".

In the end, the existence of the term "transpiler" has a negative effect on
the _perception_ that many developers have on the quality of languages that
compile to JS. You will often read things like "such transpilers always leak
JavaScript in the end", which is simply not true for ClojureScript and
Scala.js. Or "interop with JS libraries is always an issue with transpilers";
again, not true. And those misconceptions make them reject and bash on similar
technologies, for no good reason.

And _that_ is why I loathe the term "transpiler".

~~~
ovao
_In the end, the existence of the term "transpiler" has a negative effect on
the perception that many developers have on the quality of languages that
compile to JS._

Is this actually true? Have there been actual widespread complaints about,
say, TypeScript's quality because CoffeeScript was labeled similarly?

~~~
sjrd
Fair enough. I think TypeScript enjoys some "protection" from the obvious
complaints because it advertises itself as (and it is) "just JavaScript", with
types. Therefore the complaints I often see about other languages compiling to
JS are kind of moot or very easy to dismiss. For example "it eventually leaks
JS" is dismissed as "yeah, duh! it _is_ JS".

TBH I do not have too much contact with TS other than its type definitions, so
the experience I describe may not be relevant to TypeScript. I have repeatedly
seen it for languages that provide a different set of abstractions than JS,
though.

------
foldr
I would define a transpiler as a compiler that targets a language that is (i)
high level and (ii) typically a source language in its own right. I think that
accurately captures most people's usage of the term. I don't see anything
objectionable about having a term for a specific subset of compilers.

~~~
rdiddly
Your definition also includes _decompilers_ though, which go from low-level to
high-level. Or is a decompiler a type of transpiler? Oh brother!

------
eldavido
You want to know what they mean?

"I haven't studied CS and don't understand that all compilers are just
translation layers from one language to another"

------
everheardofc
Words mean whatever the people want them to mean. Just pick whatever word you
want. That's how language evolves.

~~~
xigency
Yes, especially people as a collective rather than individuals.

------
pornel
I'd say AST to AST translation is a transpiler. It keeps the high-level
structure of the program intact.

Once you break down the AST into basic blocks/CFG, then you have a compiler
(if the output is from the "lowered" representation that has lost its high-
level shape).
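
A small sketch of what "lowering" does to the shape of a program, in Python.
It flattens a nested arithmetic expression into straight-line three-address
instructions. There is no control flow here, so the result is a single basic
block rather than a full CFG. The `lower` function and the instruction
mnemonics are invented for illustration and do not match any real IR.

```python
import ast
from itertools import count

def lower(expr_source: str) -> list[str]:
    """Flatten an arithmetic expression into three-address instructions."""
    ops = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul", ast.Div: "div"}
    temps = count(1)
    code: list[str] = []

    def emit(node: ast.expr) -> str:
        # Leaves are returned as operands; inner nodes become instructions.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            left, right = emit(node.left), emit(node.right)
            temp = f"t{next(temps)}"
            code.append(f"{temp} = {ops[type(node.op)]} {left}, {right}")
            return temp
        raise NotImplementedError(type(node).__name__)

    result = emit(ast.parse(expr_source, mode="eval").body)
    code.append(f"ret {result}")
    return code

for line in lower("(a + b) * (c - 1)"):
    print(line)
# prints:
# t1 = add a, b
# t2 = sub c, 1
# t3 = mul t1, t2
# ret t3
```

The nesting and parentheses of the source are gone; only a flat sequence of
operations on temporaries remains, which is the "lost its high-level shape"
point.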

