TeaVM – Ahead-of-time transpiler of Java bytecode to JavaScript or WebAssembly (teavm.org)
195 points by entelechy on Jan 5, 2018 | 126 comments



For the reverse, I wrote a WASM-to-JVM compiler: https://github.com/cretz/asmble.

I think WASM has the ability for us to finally cross the language boundaries without marrying ourselves to C/C++ ABI. But at this early stage, there's no common stdlib for all languages to share.


> I think WASM has the ability for us to finally cross the language boundaries without marrying ourselves to C/C++ ABI

Uhhhhh. This sounds like a kind of silly thing to say, when you just described the jvm. Except for the part where there is a common stdlib.


If only the JVM had included unsigned math.


Legend has it that Gosling could not find anyone in his department who could answer all his questions about unsigned arithmetic, which is why he left it out of Java's initial design.

However since Java 8, the java.lang numeric classes do support unsigned arithmetic.

Yes it is a bummer that byte is signed, requiring extra math to simulate unsigned.
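
A minimal sketch of what those Java 8 helpers look like in practice. The class name `UnsignedDemo` and its wrapper methods are made up for illustration; the `Byte`, `Integer` calls themselves are standard `java.lang` methods added in Java 8.

```java
class UnsignedDemo {
    // Interprets a signed byte as an unsigned value in the range 0..255.
    static int unsignedByte(byte b) {
        return Byte.toUnsignedInt(b); // same result as b & 0xFF
    }

    // Divides two ints as if both were unsigned 32-bit values.
    static int unsignedDivide(int dividend, int divisor) {
        return Integer.divideUnsigned(dividend, divisor);
    }

    // Renders an int's bit pattern as an unsigned decimal string.
    static String unsignedString(int value) {
        return Integer.toUnsignedString(value);
    }

    public static void main(String[] args) {
        System.out.println(unsignedByte((byte) 0xFF));          // 255
        System.out.println(unsignedDivide(-2, 2));              // 2147483647
        System.out.println(unsignedString(-1));                 // 4294967295
        System.out.println(Integer.compareUnsigned(-1, 1) > 0); // true
    }
}
```

The storage types stay signed; only the operations reinterpret the bits, which is why the byte case still needs an explicit widening step.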


Yup, I use those unsigned methods, e.g. https://github.com/cretz/asmble/blob/3cb439e887245f23bf876e8.... Luckily WASM only has 32/64 bit integer arithmetic.


there is... 'char'; char is an unsigned 16-bit primitive (of course, adding 2 chars results in an int and you need a cast, but that's another story)
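
To illustrate the "another story" part, here is a small sketch (the class and method names are hypothetical) showing that `char + char` is promoted to `int` and that casting back wraps modulo 65536:

```java
class CharArithmetic {
    // Adds two chars and wraps the result back into 16 bits (mod 65536).
    static char addWrapping(char a, char b) {
        // a + b is promoted to int, so the cast back to char is required.
        return (char) (a + b);
    }

    public static void main(String[] args) {
        char max = Character.MAX_VALUE;             // 0xFFFF = 65535
        int widened = max + 1;                      // int arithmetic, no wrap: 65536
        char wrapped = addWrapping(max, (char) 1);  // wraps around to 0
        System.out.println(widened);                // 65536
        System.out.println((int) wrapped);          // 0
    }
}
```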


> Except for the part where there is a common stdlib.

Yeah, and it's huge and interconnected (just see how much they had to do to get a java.base module and how large that still is). And the JVM has lots of specifics around threading, GC, OOP, classes, security model, etc. And many of those specifics don't translate to the needed safe-by-default, minimal-by-default bytecode for the web and other targets.

I could really go on for a long time about why applets failed and why JVM bytecode and the JVM runtime are terrible for the web (and any generic bytecode that doesn't target the web can't really be that generic). What is silly is ignoring all of this. Think about why there are already C++-to-WASM compilers after such a short time and no reasonably-usable and maintained C++-to-JVM compilers.


Whoever said JVM should be for the web? The question was, why don't we have a way to cross language boundaries and the answer was: we do, the jvm.


I would have had a great answer: Normally I count upwards from 0.


There is also a WASM to Java source transpiler: https://github.com/wrmsr/wava


> I think WASM has the ability for us to finally cross the language boundaries without marrying ourselves to C/C++ ABI

That is how many mainframes work, and the original idea behind the CLR.


Isn't the CLR just the same as the JVM?


No.

CLR was designed to be able to support multiple kinds of languages, including C and C++.

At the release event there were about 40 languages being supported by multiple vendors.

The SDK being offered across multiple developer magazines had samples for several languages.

CLR was born from the Ext-VOS project, formerly known as COM Runtime, the idea being that Windows would eventually be fully .NET based.

The idea never came to be, because .NET belongs to DevTools and Windows team was never happy with it.

The biggest example being the sabotage during Longhorn development and how the Phoenix, Singularity and Midori projects were received by them.

Also CLR always had the option of AOT compilation.

Initially via NGEN, Bartok in Singularity later adopted in Windows 8 for store apps, Project N from Midori later .NET Native for UWP apps.


That's really interesting! How does calling WASM code on the JVM work?

And also in addition to a stdlib, maybe a great package management system. Is there a popular crate / maven / npm etc equivalent out there for wasm yet?


There seems to be a good deal of info in the README: https://github.com/cretz/asmble#compilation-details

I haven't dug deeply but it seems from the documentation that the project is trying to do things the 'right' way (mapping instructions to JVM bytecodes at a low level)... as opposed to, say, trying somehow to compile to Java code.


I read some parts of the spec of WebAssembly recently, and there were several things that disappointed me. To name a few:

1) There is no way to allocate, among a function's locals, a contiguous memory region that spans more than 8 bytes. This implies that if a function has a compound data structure such as an array or a struct as a local variable, you have to a) keep a global virtual stack pointer, b) manipulate the stack pointer to allocate space for the data structure on a virtual stack in linear memory, c) read from or write to the allocated memory, and finally d) restore the stack pointer before the function exits. This is exactly what the assembly output of a C compiler does. Why do we have function locals as a WASM primitive, then, if we end up manipulating a stack pointer anyway?

2) Currently a function can only return one value, even though it would be trivial to add support for multiple return values, given that WASM is stack-based. What puzzles me is that multiple return values are actually mentioned as a possible future feature in several places in the spec. WASM reached the MVP before adding them, which means that we will be stuck with one-return-value functions for quite a long time even if support is added later.
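
For readers who have not seen the pattern in point 1, steps a) to d) can be sketched roughly as follows. This is only a simulation in Java: the `ByteBuffer` stands in for WASM linear memory, and every name here (`ShadowStackSketch`, `MEMORY`, `stackPointer`, `sumOfSquares`) is invented for illustration, not taken from any compiler's output.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ShadowStackSketch {
    // Stand-in for a WASM linear memory (one 64 KiB page), little-endian like WASM.
    static final ByteBuffer MEMORY =
            ByteBuffer.allocate(64 * 1024).order(ByteOrder.LITTLE_ENDIAN);

    // (a) Global virtual stack pointer; the stack grows downward from the top.
    static int stackPointer = MEMORY.capacity();

    // Sums the first n squares via a temporary int array on the virtual stack.
    static int sumOfSquares(int n) {
        int saved = stackPointer;           // remember the caller's stack pointer
        stackPointer -= n * Integer.BYTES;  // (b) allocate room for n ints
        int base = stackPointer;

        for (int i = 0; i < n; i++) {       // (c) write to the allocated space
            MEMORY.putInt(base + i * Integer.BYTES, i * i);
        }
        int sum = 0;
        for (int i = 0; i < n; i++) {       // (c) read it back
            sum += MEMORY.getInt(base + i * Integer.BYTES);
        }

        stackPointer = saved;               // (d) restore before returning
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(5)); // 0 + 1 + 4 + 9 + 16 = 30
    }
}
```

The scalars (`saved`, `base`, `sum`, `i`) are the kind of values that can live in WASM locals; only the array needs addressable memory, which is where the manual stack pointer comes in.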


See the GC proposal: https://github.com/WebAssembly/gc/blob/master/proposals/gc/O.... Here you get structs, arrays, and tuples (that can be used for multiple returns). I am not disappointed that none of these features made the MVP personally.


What does this have to do with TeaVM? I see this as a generic tangent as it could be posted to any thread mentioning WebAssembly.

>Avoid unrelated controversies and generic tangents.

https://news.ycombinator.com/newsguidelines.html


“TeaVM is a primarily web development tool. It’s not for getting your large existing codebase in Java or Kotlin and producing JavaScript.”

“TeaVM is for you, if:

- You are a Java developer and you are going to write web front-end from scratch.

- You already have Java-based backend and want to integrate front-end code tightly into your existing development infrastructure.

- You have some Java back-end code you want to reuse in front-end.

- You are ready to rewrite your code to work with TeaVM.”

http://teavm.org/docs/intro/overview.html


That word,"Transpile". Lol. GCC transpiles C to assembly.


I always understood it as this

Transpilation: text to text (eg: Scheme to C)

Compilation: anything to anything but usually it implies text to binary code

transpilation is compilation but compilation is not necessarily transpilation (all squares are rectangles, but not all rectangles ...)


So classical UNIX C compilers are now transpilers, because they always did text to text (C to Assembly), before forking to the assembler.


The first AIX C compiler called itself a "transcompiler" in its manual, so yes, they are.


Is any digitized version of it available online?


Not that I know of.

But you can see the term of the day, "translator", here[0][1], and it isn't much of a leap to see how that evolved into translator-compiler, and then shortened itself.

[0] XLT86, 8080 to 8086 Translator http://www.s100computers.com/Software%20Folder/Assembler%20C...

[1] Pg. 47, Trans Z80 to 8086 Translator http://www.patersontech.com/dos/Docs/86_Dos_usr_03.pdf


I guess it implies a 1-1 or similar conversion without optimizations or multiple levels of intermediate representation. GCC certainly doesn't "transpile" in that case...


So GCC -O0 is a transpiler, but GCC -O3 is a compiler? Nice.


You can think of it as "translate (but in the tool-chain role of a compiler)"


it's not a new word. i find it odd how some people react to it.


Not new, but as unnecessary as ever.


i find it fascinating how some in a community notoriously anal about 'being precise' want to be less precise, presumably because of some incorrect assumptions about a word.

is it because they think it's a neologism? is it because they associate it with javascript developers, who they look down upon and sneer at?

the word and idea predates javascript. rather amusing. :)


From a quick Google search, I find mainly two kinds of definitions of "transpile".

The first is a generic "translating from one source programming language to another, producing translated source code in the other language". This is basically compiling, except that it excludes "non-source" languages from the output.

The second, which looks more rigorous to me, is "taking source code written in one language and transforming into another language that has a similar level of abstraction" (found for example here: https://www.stevefenton.co.uk/2012/11/compiling-vs-transpili...).

Both are totally unnecessary because if you mention or know the source and target languages of your compiler, then you already know if it fits those definitions or not.

For example, "Transpile Java Bytecode to WebAssembly" adds exactly zero information to "Compile Java Bytecode to WebAssembly", because the word "transpile" only carries information about the relation between the languages you are processing and that is an already-known variable. As it is every time I see this word.


> The second, which looks more rigorous to me, is "taking source code written in one language and transforming into another language that has a similar level of abstraction" (found for example here: https://www.stevefenton.co.uk/2012/11/compiling-vs-transpili...).

> For example, “Transpile Java Bytecode to WebAssembly” adds exactly zero information to “Compile Java Bytecode to WebAssembly”

No, the point is that ‘transpile’ more narrowly defines the action being undertaken. While all transpiling is a form of compiling, the reverse is not true. Specifically, from the second definition you cited, transpiling connotes that a similar level of abstraction is retained. Often this means that type-level information is retained. It’s similar to the distinction of lossy vs lossless compression.

So yes, when I read “X is transpiled to Y” that implies that a certain level of abstraction is maintained, which is not necessarily the case with compiling. You can transpile, say, OCaml to JavaScript and use the garbage collection and native JS types, or you can compile OCaml to a subset of JavaScript such as ASM.js, using byte arrays and losing type information in the target language. That’s different.


It may not be semantically required, but redundancy adds emphasis and makes things easier to process.


An old colleague of mine hated the word "awesome" because, in his words, "it's a word js-kiddies use to exaggerate their little accomplishments".

Guess how he reacted when someone used the phrase "isomorphic js"...


> If you are a Java (or Kotlin, or Scala) developer who used to write back-end code, TeaVM might be your choice. It’s true that a good developer (including Java developer) can learn JavaScript. However, to become an expert you have to spend reasonable amount of your time.

Oh, this is nice. I finally have a reason to give Kotlin a go. I have some time off and want to write something basic. Pretty excited about this, Rust, and all things wasm popping up.

I am, in no way, excited about debugging it, though.


Kotlin compiles to javascript already...why wouldn't you go that route?


Kotlin/JS does not allow reusing Java code and Java libraries. That's the main difference.


So... from reading the page linked (admitting my ignorance)... I really wonder how useful this is? Not to be rude, or to talk ill of the work, it’s far better than I could do... but!

With the release of other ahead of time compilers/VMs/features for the JVM/Java9, I cannot fathom this produces a small enough payload for a web app over a mobile connection.

I'm probably totally missing something; could someone educate me on practical or even useful but novel applications of this?


One of the most important features in TeaVM is its interprocedural dead code elimination algorithm, which allows it to produce very small code for simple Java programs. For example, Hello World produces about 40 KB of JavaScript, and TodoMVC (http://teavm.org/live-examples/todomvc/#/) is only 125 KB.
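
As a rough illustration of the general idea (not TeaVM's actual algorithm, which this thread does not describe), dead code elimination can be thought of as reachability over a call graph: start from the entry point, follow call edges, and drop whatever was never visited. The class name and the toy call graph below are invented for the example, and the sketch assumes Java 9+ for `Map.of`/`List.of`.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReachabilitySketch {
    // Returns every method reachable from the entry point by following call edges.
    static Set<String> reachable(Map<String, List<String>> callGraph, String entry) {
        Set<String> seen = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(entry);
        while (!work.isEmpty()) {
            String method = work.pop();
            if (!seen.add(method)) {
                continue; // already visited
            }
            for (String callee : callGraph.getOrDefault(method, List.of())) {
                work.push(callee);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical call graph: ArrayList.sort is never called from Main.main.
        Map<String, List<String>> callGraph = Map.of(
                "Main.main", List.of("PrintStream.println"),
                "PrintStream.println", List.of("String.valueOf"),
                "ArrayList.sort", List.of("Arrays.sort"));
        // Everything outside this set could be left out of the generated output.
        System.out.println(reachable(callGraph, "Main.main"));
    }
}
```

A real implementation also has to account for virtual dispatch, class initializers and reflection, which is presumably part of why reflection support is limited.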


If you look at how much the demos in their gallery download, it's in the hundreds of kilobytes on average. Granted, these are not full web apps, but even so it's roughly on par with the latest web app frameworks' demos.

Performance is not too shabby either


If you are compiling to the JVM the output is probably nothing related to a web app.


One isn't compiling to the JVM with this, they're compiling to WebAssembly. And I think the point of this is to compile JVM code to WebAssembly, right? Am I missing something even larger than that?!

Should someone want another VM to run their code in: how good is this at reducing the size of the std lib or the JVM itself? Can I depend on useful packages with this, or will they immediately shit the bed? Will the payload be something like 150mb?


In that case it will be interesting to observe this in practice. My fear (and prediction) is a lot of continued frustration and failure because the web environment is a different context than the JVM target with many different technical requirements. If the underlying technical differences in the environments were so trivial in the first place then simply swapping languages from Java to JavaScript wouldn't be so challenging.


Generally speaking, there are ways to reduce a JVM library to the subset that is actually ever referenced by the app during compilation, like ProGuard. So of the 150 MB theorized library payload only maybe a megabyte is actually used by some app and needs to be delivered to client (browser).


Does this support GC?



For those that are wondering but skimmed the page: It looks like they're thinking about SPA frameworks (Angular/React/etc.) and have a subproject to support them.


Does this support (a subset of) the standard Java libraries? Where can I find what is supported? (I just saw mentions that reflection wasn't)


As for 0.5.x version, see this: http://teavm.org/jcl-support/0.5.x/jcl.html

And even more in 0.6.x (including Stream API)


my guess is no. otherwise it would be called out as a feature with lots of fanfare.


In the end, how will this differ from a Java applet?


You don't need to have Java enabled in your browser.


The problem with that is you'll essentially re-download the entire JVM (and libraries) on each page load. If you don't want to use a Java applet, you would need to reimplement GC, JNI, and so on. Java didn't work on the web for a reason.


GC should be available in WASM in the future. And, not all applets needed JNI.

In other words, this is enabling a subset of well behaved applets, while excluding the naughty (or advanced) applets.


GC is only about 300 lines of code. There's no JNI in TeaVM (and never will be); instead of JNI there's JSO to interact with JavaScript, and still no interop layer for WebAssembly (that's why I claim WASM support to be experimental). JSO is rather lightweight, and I hope I'll be able to implement a lightweight interop for WASM.

As for Java libraries, TeaVM is able to throw away unused parts of it, producing relatively small binaries.


> The problem with that is you'll essentially re-download the entire JVM (and libraries) on each page load.

I always hear these arguments, but if you develop an Angular application and somehow can't turn on AoT, you need to download a ton of stuff, too. Also, it's way easier to make use of dead-code elimination in Java than in JavaScript (especially without reflective code).


The browser would cache that automatically. And in many cases, you'd probably use a Service Worker to implement your own custom cache and "install" the application locally in the browser, so that it is usable offline and only version updates are downloaded in the background.


Weren't applets also cached?

To be honest, I think that hate of applets comes from a time when 56kbps download speeds were considered very high, the average CPU had 1 core at 200 MHz and the average webpage weighed 20k and executed 5-10 lines of JavaScript.

I didn't use applets at the time, but if the user experience were improved (nice looking UI, respecting modern UX conventions, etc.), I imagine your average applet from circa 1999 would run circles around Gmail, for example :)


Not only that, swing applets are notoriously fugly. One reason electron is successful is because it kinda looks like native. Swing Look and Feels all seem alien.


Swing can be styled, it's just that not a lot of people really bothered. JetBrains products use Swing and they're decent.

Not amazing, but decent.


The JVM can be very small. The libraries shouldn't be any worse than with current JS.

The browser won't have to re-download on each page load because the browser has a cache, which can be validated on each page load to check for updates.


Caching exists. You shouldn't have to redownload the same libraries for each page load.


The argument against the critics saying WebAssembly would make the web less open was that you could always disassemble WebAssembly and that said disassembly would be much more readable than, for example, x86 assembly.

So - since a quick search did not turn up anything - is there a WASM to C "transpiler" that generates readable C?


But that's not really the worry. It's always been possible to obfuscate JavaScript. All that changes with WebAssembly is that it might drive even more needless code, and complicate the web's architecture even further.

For example, web browsers for the blind become far harder to implement when ordinary websites are using heaps of bloated JavaScript. (The web used to have HTML for content, and CSS for presentation. No longer. Now it's just a mess.) The problem would be compounded if WebAssembly+canvas were to catch on.


> So - since a quick research did not turn up anything - is there a WASM to C "transpiler" that generates readable C?

I'm not aware of one, though it would probably be trivial to write. https://github.com/WebAssembly/binaryen has the helping code for things like parsing. Both https://github.com/kanaka/wac and https://github.com/WebAssembly/wasm-jit-prototype interpret WASM. The latter even JITs into LLVM IR.


Currently, if you run TeaVM with the 'debug' (-g) option and the WASM target, you'll get '.wasm', '.wast' and '.c' files. The C file is compilable with GCC, and you need to write several trivial functions manually to get a working binary, either '.so' or '.exe'.


But how does garbage collection work?

Is it of the stop-the-world kind?



So basically useless for browsers


Why do you think so?


I think, by now "transpile" just means "compile, but in web development".

Because, as we all know, in web development you never compile things... /s


Compilation is for old people writing old programs that run on old things called "computers".

We're fresh and new and iterating quickly on our MVP, so we don't have time to wait for long compilation time. Instead we transpile our code and run it in a serverless environment, a docker container or a browser.


> ... and run it in a serverless environment, a docker container or a browser.

... Waiting for tens of seconds, or minutes, until all the JS libs are loaded into their GBs of RAM by their GHz multicore processors, while those old veterans compiled all their stuff within a second on their ancient Turbo Pascal on their ancient MHz CPUs with barely noticeable 64K RAM :-)


very cute, i like your sarc.

for those who aren't familiar with transpiling, in this case it's kind of a legitimate use of the term: taking one flavor of compiled code (java bytecode, binary stuff that runs on a jvm) and translating that directly to another type of compiled code (wasm, binary stuff that runs in a browser)


Yeah, that's a good definition.

My (less sarcastic) understanding was that "compiling" means translation from a "high-level"/"humans-first" to a "low-level"/"machines-first" language while "transpiling" is translation from one high-level to another high-level language.

That would make gcc, clang (mostly) and javac clear examples of compilers while GWT, coffeescript and the C preprocessor would be transpilers.

I guess TeaVM would fit neither by those definitions but it would definitely be closer to a transpiler.

Then again, I believe GWT calls itself a compiler, so I got nothing...


That is actually what compilation means.


I suppose "compilation" usually refers to translating source code (as in human-readable) to some kind of byte code (as in machine-readable).

Here, we start with Java byte code which isn't exactly human-readable, hence the use of the term "transpile". But then again, you're right that on some abstract level, it all is just a translation process from one representation to another.

"A rose by any other name would smell as sweet."


They're both bytecodes though? I don't see how that is compilation.


Compiling is independent of source and target language, although it is mostly used to describe transformation from a higher-level human friendly language into a lower-level machine friendly language. Even so, the defined meaning of compiling is independent of the source and target languages. If you want to use the word "transpiling", you would have to define it in terms of compiling; transpiling is a special case of compiling, where the source and target languages have such and such properties.


It's really just semantics, but Merriam Webster defines a compiler as "A computer program that translates an entire set of instructions written in a higher-level symbolic language (such as C) into machine language before the instructions can be executed"


A dictionary really isn't a great source for the exact definitions of technical words. Wikipedia does much better in this instance: https://en.wikipedia.org/wiki/Compiler

> A compiler is computer software that transforms computer code written in one programming language (the source language) into another programming language (the target language).

Later on it lists some classes of compilers that could be called by more specific names than just "compiler".


That would be recursive:

"Compilation means taking one type of compiled code and translating it into another type of compiled code."


What is the difference between cross-compile and transpile? (Honest question, no sarcasm)


Doesn't "cross-compile" mean to compile a binary meant to run on a different platform than the host? Like, compiling a Linux binary on a Mac... If I'm right then that's nothing like what "transpile" means.


"Transpile" is an artificial buzzword, just meaning source-to-source translation, from Pascal to C for instance.

Actually, there is no such thing as a "transpiler". There are only compilers which are programs which translate from one regular language A into another regular language B. It doesn't matter whether B is a high-level language or low-level machine code.


Yes, it's another word for source-to-source translation, you are right. It is not, however, cross-compilation, which is what I was responding to. So I'm not sure why you replied to me with this statement.

In any event, is it so bad to have another word for source-to-source compilation? What is it that angers you about it?


> is it so bad to have another word for source-to-source compilation?

If you compile C to Assembler then you compile from one code (C) to another code (Asm, or machine code which is the compact form of Asm). What's the difference from a "transpiler"?

Wikipedia states that the difference is just the (almost) equal level of abstraction. I would call such a thing a high-level source-to-source compiler. The word "transpiler" is uncertain and barely known, hence the people here who ask what a transpiler is.

Cross-compilation is another thing since it means compiling on one platform for another platform. In all three cases we deal with compilers from one regular language A into another regular language B.


compile: convert from one language to another (though usually used in the specific context of converting from high level source code to lower level code like assembly/machine/etc)

cross-compile: convert from one language to another that will be executed on another platform (e.g. compiling something on a linux box to be run on a windows machine)

transpile: convert from one (typically high level) source language to another (high level) source language


The way you write them.


The last thing we need in light of Meltdown and Spectre is running untrusted binaries in the browser, aka WebAssembly.

Btw deactivating WebAssembly support in Chrome 63 (up-to-date) doesn't work anymore!!

  chrome://flags/#enable-webassembly
Setting it to "deactivated" does nothing, WebAssembly is still active.


I keep saying that WebAssembly is the revenge of Flash, Applets, Silverlight, ActiveX,...

Just wait until it gets a bit more mature.

It will be the same fun as when Ads moved away from Flash into HTML 5/JavaScript.


Well, in general the problem with Flash was that the plugin wasn't well isolated and would often crash the browser, and, before NT-based Windows was common, pretty readily the entire OS (this presumes Windows). Not to mention the security track record. Browser isolation, and how well it will likely work with wasm, is quite a bit different.

That said, it will lead to more closed commercial sites, but the JS outputted from webpack+babel+uglify is already unbelievably difficult to wade through without source maps. It's not significantly different imho.


> untrusted binaries

afaik, wasm is not 'binaries' in that it's not an arbitrary blob of machine code fed right into the cpu. it's still running in a sandbox (a la javascript) including similar limitations wrt CORS etc.


And a sandbox so powerful that all the browser vendors just turned off SharedArrayBuffer.

https://www.chromium.org/Home/chromium-security/ssca

https://blog.mozilla.org/security/2018/01/03/mitigations-lan...

WASM is portable binaries.


well, i guess in the same way that javascript or anything else is, when there are sandbox escapes. bugs are a thing.


> it's still running in a sandbox (a la javascript)

People frequently make this comparison upon hearing the term sandbox, but this is a weak comparison. Yes JavaScript executes in a sandbox, but the JavaScript (JIT) sandbox is purely for performance instead of isolation, which is like comparing a pencil sharpener to a bulldozer just because they are both portable machines. A better comparison of the JavaScript (JIT) sandbox is the JVM.


[flagged]


I get that we've lost this battle, and some people seem to think there's a useful distinction being made, but man, does the word 'transpile' grate on me.


At the point where we are calling things that compile to assembly "transpilers" I don't think there is any distinction left, much less a useful one. I mean the classic notion of a compiler is a program that turns things into assembly, which is then taken into machine code by an assembler.

I agree with you that we've lost though: no amount of protest is going to make people stop using the word "transpiler".


A compiler is a program that turns programs in a source language into equivalent programs in a target language.

It is generally assumed that a compiler goes from high level to low level, a decompiler low-to-high and a transpiler high-to-high. I guess in this case the transpiler is low-to-low, so maybe "transpiler" is just used to mean "samey-to-samey"


Isn't "low-to-low" typically called static recompilation or binary translation esp. in emulation circles?


That is correct but we are not talking in the context of the emulation circles.


A compiler is a program that translates programs from one language to another.

> At the point where we are calling things that compile to assembly "transpilers" I don't think there is any distinction left

"Transpiler" is grating (apparently).


thank you; some consolation to know it's not just me

this seems like another instance of the annoying practice in our field of someone giving a name to a thing because they think it's new, though it's not (i.e., source-to-source compilers have been around as long as source-to-bytecode compilers); no need to rename them

wish they would just stop it and get the hell off my lawn


Yep, like calling themselves Engineer without having the respective degree.

Thankfully in most jurisdictions it is forbidden by law to do as such.



> some people seem to think there's a useful distinction being made

What is that distinction? Excluding the javascript crowd, I doubt this phrase will see extensive use in the field until the term gets properly defined and is meaningfully distinct from the word Compile as used today.


I assume "javascript crowd" knows how to compile their node binary. Thus, they know the distinction between babeljs and gcc, and therefore they know the distinction between compiler and transpiler.


What's the distinction between Compile and transpile then? and is this distinction agreed upon and meaningful?


I agree with drdrey's definition here[1], which is echoed in paragraph two here[2] as well, which provides you with the agreement part at least. Whether it is meaningful to you is up to you.

[1] - https://news.ycombinator.com/item?id=16076823 [2] - https://hackernoon.com/moving-to-es6-babel-and-transpilers-3...


There was a battle? Transpile is between human readable sources, and compiler is always into machine readable code. This is the only project that I've seen to (for some reason) not conform to that definition. I guess I don't see the point of contention.


> compiler is always into machine readable code

The frustrating bit is that this is absolutely not true. A compiler is anything that parses some text according to some grammar, manipulates it, and emits it in a different format. While the most well-known and popular compilers are for C, to emit machine code, there's nothing inherent in the definition of a compiler that means it can't emit something human-readable.

I wouldn't have nearly as much issue with this if the JavaScript community instead had decided "compiler isn't specific enough: we need different words for compilers that drop vs. maintain a level of abstraction", rather than "we need a word for a thing like a compiler, but that doesn't, as compilers apparently inherently do, drop a level of abstraction".


Transpilers used to be called "source-to-source compilers", or "compilers" when the "source-to-source" distinction wasn't relevant.

Then the JS world came along and decided to use a new word to make it look more hip and cool.


>decided to use a new word

A friendly FYI... the old word "transpiler" has been around since at least the 1960s. See the 2nd-to-last paragraph on the last page:

http://comjnl.oxfordjournals.org/content/7/1/28.full.pdf+htm...


It's not a new word. People have been using the word "transpiler" or "transcompiler" for source-to-source compilers since at least the late 80's [0]. They used to be popular for translating between different dialects of assembly. Then they fell out of fashion for a while before gaining popularity again in the last decade for targeting JavaScript. So while the "JS world" is responsible for re-popularizing both the word and the concept, they certainly didn't invent it.

[0] edit: Apparently 60's.


YACC compiles human readable source to human readable source.

Is it YACT now?


well, there's even a wiki page for source-to-source translation. it's not always useful to distinguish 'truck' from 'automobile', but sometimes it is.


It's sometimes useful to distinguish “truck” from “automobile”, but never when the thing being labelled a “truck” is a Honda Civic.


are you saying it's not useful to refer to something as a transpiler if it's not a transpiler? that sounds reasonable.


Agreed, especially given that "transpile" lacks a well-defined meaning.

Just ask folks how this term differentiates from Compile and you'll get wildly varying answers and definitions.


“Transpile” is a proper subset of “compile”, where both the source and target of compilation are languages designed primarily for direct human editing (source code); it is source-to-source compilation.

Using the term for anything where the source is JVM bytecode is plain wrong, and it's also dubious for anything targeting WebAssembly (though if the target is specifically .wat/.wast, it may perhaps be arguably defensible.)


there existed "translators" from one asm variant to another in the past. [0][1]

so i guess one might use "transpile" wrt bytecode in the spirit of that. but i guess asm source files are still "human readable", where bytecode generally isn't considered as such.

[0] P47, "TRANS": http://www.patersontech.com/dos/Docs/86_Dos_usr_03.pdf

[1] http://www.s100computers.com/Software%20Folder/Assembler%20C...


My point exactly!


I think it's pretty clear that "transpile" is a subset of "compile" and that there are both examples of "compile" that are indisputably not "transpile" (plain old source to architecture specific binary) and examples of "compile" that anybody who uses the term "transpile" would include in their personal variety of "transpile" (translation between source formats that are commonly used for human written code).

I don't see why the existence of a grey area should be enough to question the utility of the term.

If compile/decompile are used for transformations along one axis, transpile is used for transformations that are predominantly orthogonal to that axis.


At the time of this reply, there are at least five competing definitions being argued for and against in this thread; some either requiring that the input and output be machine code or disallowing it, some necessitating that both input and output be at "the same level of abstraction" and others arguing that inputs and output can be at different levels of abstraction.

Given the large inconsistency and incompatibility between all these definitions and the existence of a well-accepted term that comprises all of these definitions, is it a mystery why people question the utility of the term? After all, what use is a nuanced term if it can't reliably convey that additional information.


>After all, what use is a nuanced term if it can't reliably convey that additional information.

I would recommend the general term "A/B compiler" where A and B are arbitrary regular languages. That way it would be pretty clear what a Pascal/C compiler is, or a JVM/Webasm compiler, for instance. Or a x86/C compiler which would also make the word "decompiler" obsolete.



