Java Polyfill for the Browser (javapoly.com)
330 points by velmu on May 8, 2016 | 88 comments

To others trying to use this demonstration: Be patient with the page, as it has to download quite a few JAR files in order to be ready to compile and run any test code. To verify that the javapoly is ready to run your code, open the web developer console and wait for it to say "Java Main started". Additionally, first-time compilation will also need to download additional JARs for the demo code to run. Generally speaking though, the compilation time is a good 5-10 seconds on my computer for the demo code, so be sure to not just spam the compile button like I did :)

Just a friendly note to the author of this page: The total download size of this page's underlying Java resources is much larger than I realized (maybe a collective 30-40MB). Given that I waited almost 5 minutes after clicking "Compile & Run!" for nothing to happen (because I had clicked it well before the runtime JARs had fully downloaded), I would suggest at least adding a progress bar or something to let users know when they can actually compile and run the demo code, in addition to some acknowledgement that the code is compiling. This is definitely an interesting project though.

Yeah, Doppio needs to download the entire JDK to run, since it's a full-featured JVM (JRE for running Java programs, JDK for javac so it can compile them). Compressed, it's ~30MB.

The authors of this project can preload the JDK with a progress bar and stash it into IndexedDB, since Doppio's file system, BrowserFS, supports arbitrary backends [1].

[1] https://github.com/jvilk/browserfs
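A minimal sketch of the progress-bar half of this suggestion, assuming the JDK bundle is served from a single URL with a Content-Length header. The names `percentComplete` and `downloadWithProgress` and the callback shape are hypothetical; only `fetch` and the stream reader are standard browser APIs. Stashing the result into IndexedDB through BrowserFS is left out, since that depends on BrowserFS's own API.

```javascript
// Hypothetical helper: percentage of the download finished, or null when
// the total size is unknown (for example, no Content-Length header).
function percentComplete(receivedBytes, totalBytes) {
  if (!(totalBytes > 0)) return null;
  return Math.min(100, Math.floor((receivedBytes / totalBytes) * 100));
}

// Hypothetical helper: download `url` chunk by chunk and report progress
// after every chunk so the page can update a progress bar.
async function downloadWithProgress(url, onProgress) {
  const response = await fetch(url);
  if (!response.ok) throw new Error("Download failed: " + response.status);
  const totalBytes = Number(response.headers.get("Content-Length")) || 0;
  const reader = response.body.getReader();
  const chunks = [];
  let receivedBytes = 0;
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    receivedBytes += value.length;
    onProgress(percentComplete(receivedBytes, totalBytes));
  }
  return new Blob(chunks);
}
```

A page could call `downloadWithProgress(bundleUrl, pct => { /* update the bar */ })` and only enable the compile button once the promise resolves. Note that Content-Length may describe the compressed size, so the percentage is approximate.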

Very nice :). Maybe the JAR could be split so that people don't load things they don't use?

Why isn't http caching sufficient?

If the authors compressed the JAR files into a single bundle, storing the data into a BrowserFS-created IndexedDB file system removes decompression/extraction overhead on subsequent visits.

If they didn't.... then it might be sufficient, but I am not sure how well the browser cache handles large files! How long until it evicts them? Is there an upper bound on the size of cached items? etc.

I am only familiar with Firefox, where the default limit is 43.75MiB; anything larger than that is not cached.

There's a `browser.cache.disk.max_entry_size` setting that defaults to 51200 (50MiB), however the code also has an explicit override that no item larger than 1/8 of the total cache size (as dictated by `browser.cache.disk.capacity`, default 350MiB) is ever cached, hence the 43.75MiB limit.
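The arithmetic behind that 43.75MiB figure, as a quick sketch. It assumes both preferences are expressed in KiB (which is what 51200 = 50MiB implies) and uses only the default values quoted above; the variable names are made up for illustration.

```javascript
const KIB_PER_MIB = 1024;

// Defaults quoted in the comment above, assumed to be in KiB.
const maxEntrySizeKiB = 51200;              // browser.cache.disk.max_entry_size
const diskCapacityKiB = 350 * KIB_PER_MIB;  // browser.cache.disk.capacity

// An entry must satisfy both rules: the explicit per-entry maximum and
// the "no larger than 1/8 of the cache" override.
const effectiveLimitKiB = Math.min(maxEntrySizeKiB, diskCapacityKiB / 8);
const effectiveLimitMiB = effectiveLimitKiB / KIB_PER_MIB;

console.log(effectiveLimitMiB); // 43.75
```

With these defaults the 1/8 rule (44800 KiB) is the tighter of the two, so it decides the limit.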

All these settings can be changed, obviously, but I suspect few people ever touch them.

I assume other desktop browsers have similar default limits, while mobile browsers probably have a much lower threshold.

Lead author of Doppio here! This is quite cool. I don't recognize the authors of this work, so I was unaware of the project and was surprised to see it on the front page of HN!

If you find any issues with Doppio or have any requests, feel free to open up an issue on our GitHub issue tracker.

doppio sounds awesome. from the paper:

Numeric support. Direct support for 64-bit integers would enable languages to efficiently represent a broader range of numeric types in the browser. The DOPPIOJVM uses a comprehensive software implementation of 64-bit integers to bring the long data type into the browser, but it is extremely slow when compared to normal numeric operations in JavaScript.


how far is this from being solved?

Thanks! I hope Doppio can become useful to many more people. :)

There is a proposal for value types [0] that Niko Matsakis [1] and Brendan Eich [2] talked about a couple of years back and could be used to implement 64-bit numbers in JS, but I'm still waiting for an implementation or a more complete proposal. Looks like it might be dead in the water, as the proposal hasn't been edited in quite some time, but I do not have my finger on the pulse of browser standards so I could be wrong.

[0] http://wiki.ecmascript.org/doku.php?id=strawman:value_object...

[1] http://smallcultfollowing.com/babysteps/blog/2014/04/01/valu...

[2] http://www.slideshare.net/BrendanEich/value-objects

You could use (the compiled and optimized version of) the implementation we have in Scala.js: https://github.com/scala-js/scala-js/blob/master/library/src... It is significantly faster, especially for arithmetic operations, and dramatically so for division, remainder, as well as `toString()` (up to 100x speedup). It also contains the unsigned variant of operations (for java.lang.Long.divideUnsigned for instance).

Oh, nice! I'll take a look at that when I next get the chance. I assumed that Closure's Long library, which is based on GWT's Long implementation, would be well-optimized, and never looked into an alternative.
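For readers wondering what a "software implementation of 64-bit integers" looks like, here is a minimal sketch of addition on a pair of 32-bit halves. This is not the Closure, GWT, or Scala.js code; `makeLong` and `addLong` are hypothetical names, and real libraries also cover multiplication, division, comparison, and string conversion, which is where most of the cost lies.

```javascript
// Hypothetical representation: a 64-bit integer as two signed 32-bit halves.
function makeLong(high, low) {
  return { high: high | 0, low: low | 0 };
}

function addLong(a, b) {
  // Add the low halves as unsigned values; the sum fits exactly in a double.
  const lowSum = (a.low >>> 0) + (b.low >>> 0);
  // A carry occurs when the unsigned sum overflows 32 bits.
  const carry = lowSum > 0xFFFFFFFF ? 1 : 0;
  // `| 0` wraps each half back into the signed 32-bit range.
  return makeLong((a.high + b.high + carry) | 0, lowSum | 0);
}

// 0x00000000FFFFFFFF + 1 carries into the high half.
const sum = addLong(makeLong(0, -1), makeLong(0, 1));
console.log(sum.high, sum.low); // 1 0
```

Every Java `long` operation turns into several JavaScript operations plus an object allocation in a scheme like this, which is why it is so much slower than plain number arithmetic.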

Maybe someone could make Rhino[0] run in this, so we can run Javascript in Java in Javascript! Yo dawg!

[0]: https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Rh...

Actually, the master version of Doppio can run Rhino AND Nashorn!

(I'm the primary author of Doppio.)

JavaPoly uses Doppio. As a result, you can run both Rhino and Nashorn in JavaPoly too!

Awesome, and can we run Doppio in Rhino run in Doppio then? Recursive Atwood's law!

Haha, I've actually tried to do this. The only missing link, I believe, is to add a Nashorn/Rhino backend to BrowserFS (Doppio's file system library) so that Doppio can access the filesystem that Rhino is running in to load the JDK.

If that sounds like fun, I take pull requests [0]. :)

[0] https://github.com/jvilk/browserfs

I have an experimental version of chromium running with the JVM somewhere on my HD. It's pretty easy to do with CEF/JCEF.

Two thoughts:

I always imagined something like this should use the <object src="app.jar" /> tag. If you consider that someone could build a browser with Java in it again (like I have), then it would be good to design the polyfill so that it allows for a native version to exist (and then doesn't run). Also, using the same MIME type for JARs, classes, and source files seems suboptimal.

It would be really good if gwt,doppio,teavm all agreed on a single api facade for targeting the browser API. I think that would really put some steam behind the underlying idea here.

> It would be really good if gwt,doppio,teavm all agreed on a single api facade for targeting the browser API.

Unfortunately, that is impossible due to differing requirements; I've actually talked with the teavm and bck2brwsr folks [0]. Doppio requires asynchronous function calls to support preemptive multithreading. TeaVM, GWT, and bck2brwsr all map Java methods directly to synchronous JavaScript methods, preventing them from supporting multithreading.

This is actually one of the larger differences we describe in the academic paper that sets Doppio apart from those projects and Emscripten [1].

[0] Start of the conversation: https://groups.google.com/d/msg/plasma-umass-gsoc/uuZk09CGIM...

[1] PLDI 2014 paper (Sorry, I know I'm linking this a lot in the comments thread, but I promise it's a fun read!): https://plasma-umass.github.io/doppio-demo/paper.pdf

To my knowledge, TeaVM supports multithreading. See the async demo: http://teavm.org/

'Async demo that shows how TeaVM can translate multithreaded applications with synchronization primitives.'

Also see this effort trying to unify the APIs: http://dukescript.com/

Interesting! A cursory glance at the source code looks like they are doing some form of stack splitting at synchronization points, denoted by method annotations. Note that they reimplement the class library and eschew compatibility with traditional Java programs, so I'm not completely sure how general purpose their threads are. I tried finding some writeup about them, but I suspect the code is the documentation. :)

I haven't checked in with this project since we last talked a few years ago, so it's nice to see notable progress! Thanks for the pointer.

What are your thoughts on WebAssembly? Do you see Doppio supporting it as a target in the future?

One of Doppio's explicit goals was to leverage existing resources in the browser to bring conventional programming languages to the web on top of JavaScript. Using WebAssembly would prevent Doppio from using the browser's garbage collector; we would have to write our own. It would also prevent Doppio from mapping JVM objects onto JavaScript objects.

The WebAssembly standard is constantly evolving, and now contains some ambiguous statements regarding the ability to take advantage of the browser's GC [0]. Considering WebAssembly's current focus on C/C++ code, I do not believe this will come to fruition anytime soon. If it does, I do not see how it would noticeably improve Doppio's current performance, which is bottlenecked primarily by its interpreter. A contributor is working on a JIT right now to make execution faster [1].

Hope that clarifies!

[0] https://github.com/WebAssembly/design/blob/master/GC.md

[1] https://github.com/plasma-umass/doppio/pull/443

It could be an async api facade. It would just happen to be mostly synchronous in those other cases.

Absolutely true. I believe I proposed that in our correspondence, but it was a dealbreaker for bck2brwsr and TeaVM, which is understandable.

This should have a "DK:" prefix (data plan killer). I was browsing on my phone and realized that it was downloading a lot of libraries. It can be a particular problem when data roaming in Europe.

Should show a loading progress indicator for the demo because that took ages & I thought it was broken until I inspected Console.

Huh - could this run existing Java tools like Google Closure Compiler in the browser? That would be handy for simplifying distribution, especially since the Java installer started bundling crapware.

The last time I checked, doppio can run the Closure compiler [0].

[0] https://github.com/plasma-umass/doppio/issues/317

Yes, you can!

What is the difference of this to doppio? A new approach from some of the doppio authors?

Also, if someone says this is useless: an interesting use case could be, e.g., a hybrid navigation application https://karussell.wordpress.com/2014/05/04/graphhopper-in-th... or portable native apps using a webview

BTW: I got lots of errors in the Firefox console saying 'Error: Assertion failed: A non-running thread has an expired quantum' although the example works (should have a progress bar :))

The Doppio authors (me) are completely uninvolved; this is an independent effort! This project makes it a bit easier to integrate Doppio and use it out-of-the-box.

I suspect the assertion failure is caused by some of their modifications to Doppio's start up code to pause and extend the main thread's runtime.

Thanks! As you seem to be the author of BrowserFS too :) ... will or does Doppio/BrowserFS support the Java memory mapping API?

You mean this [0]? I have not implemented it, but it might be possible to do...

[0] https://docs.oracle.com/javase/7/docs/api/java/nio/MappedByt...

That would be really cool :) !

> What is the difference of this to doppio? A new approach from some of the doppio authors?

(I contribute to both Javapoly and Doppio)

Javapoly tries to make Doppio easier to use (in my subjective opinion):

* easier loading of jars, classes and Java source code
* a promise based async interface to Java methods
* automatic marshaling of primitive values between JS and Java lands
* a proxy based interface into the Java namespace.
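To make "a proxy based interface into the Java namespace" with "a promise based async interface" concrete, here is a rough sketch of how such a facade can be built with the standard `Proxy` object. It is not JavaPoly's actual code or API; `makeNamespace` and the `invoke` callback are hypothetical, and a real bridge would forward the call to the JVM instead of to a stub.

```javascript
// Hypothetical sketch: property accesses accumulate a dotted Java name,
// and calling the result forwards (className, method, args) to `invoke`.
function makeNamespace(invoke, path = []) {
  // The proxy target is a function so that the proxy itself is callable.
  return new Proxy(function () {}, {
    get(_target, name) {
      return makeNamespace(invoke, path.concat(String(name)));
    },
    apply(_target, _thisArg, args) {
      const method = path[path.length - 1];
      const className = path.slice(0, -1).join(".");
      // Always answer with a promise, since the JVM runs asynchronously.
      return Promise.resolve(invoke(className, method, args));
    },
  });
}

// Stub "JVM" that only records what it was asked to run.
const recordedCalls = [];
const J = makeNamespace((className, method, args) => {
  recordedCalls.push({ className, method, args });
  return Math.abs(args[0]);
});

J.java.lang.Math.abs(-42).then((result) => console.log(result)); // 42
```

The caller writes what looks like an ordinary method call, while the proxy decides how and when the Java side actually runs.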

Why didn't you contribute to Doppio itself if this is just about making it easier to use?

As I see it, Javapoly is a layer above Doppio.

As a very crude analogy: shells and editors make it easy to use the filesystem. But we can't contribute the shell / editor to the filesystem! They sit in different layers of the stack.

It's nice to see this work taking Doppio into new places. For an overview of Doppio, BrowserFS and the DoppioJVM, here's a video presentation mostly based on a talk given by John Vilk at PLDI 2014. Unfortunately, there's no audio, but the slides should be easy enough to follow.


FIXED - forget this video, watch John deliver it at Microsoft Research! http://research.microsoft.com/apps/video/default.aspx?id=238...

And here's a video presentation of John Vilk delivering his PLDI 2014 talk at Microsoft Research. This one has audio! :)


My browser hangs for ten seconds as soon as I try to load the page, followed by an unresponsive script error.

I'm on a pretty anemic computer, but there's got to be a way to keep my browser responsive while it loads in everything it needs.

I finally had to kill safari on my iphone 4. Which is admittedly old.

Excellent execution, and very cool idea.

Somewhat in jest let me say, "People are complaining that JavaScript is a terrible language. Great, let's put Java in the browser." Now we have an even worse language to code in for the browser.

Note: jvilk explains the perfectly valid reason for this library in another comment below.

Having Java's powerful multithreading constructs available in the browser is a pretty big deal, I think.

As I understand the Doppio paper, they just simulate JVM threads in synchronous JavaScript. So there does not seem to be any added performance benefit, at least.

It's still a very neat hack, though!


My browser froze when loading the website which apparently uses the polyfill. Everything as expected.

Write once, crash everywhere! For everyone!

<script type="text/java" src="http://www.yourdomain.com/jimboxutilities.jar"></script>

Having text/java as the MIME type, instead of at least looking up the correct one on Wikipedia, makes this implementation look quite ugly at first sight. https://en.wikipedia.org/wiki/JAR_(file_format)

Also I cannot imagine a single valid use case for this.

I suppose Clojure devs needn't learn ClojureScript anymore. ;)

And Scala devs don't need to learn Scala.js; Doppio runs Clojure and Scala. :)

(Joking; both of those projects are pretty great, and have different goals from Doppio.)

(Also, Scala in Doppio is... uh... quite slow. As you may imagine, if you are familiar with Scala's compilation process.)

Kotlin, the "better Java" language by JetBrains, can also compile to JavaScript.


Which to me appears to be the better way around, since it generates static JS instead of having to compile Java every time, as well as having to set up a JVM and whatnot.

It's better if you are writing new code for the web and want to use a different language, but Doppio is better if you need compatibility with existing JVM code. The goals are different.

There's a good reason every major vendor killed off Java applets in the browser. This is probably the worst idea since Windows 95 on the Apple Watch.

The main reason is not present in this implementation, which is that clients could execute arbitrary lower-level code in a separate process which the browser was powerless to make secure.

Yes, there is a good reason: Security exploits!

Fortunately, Doppio requires no plugins, and is written in 100% JavaScript. Crisis averted!

Also, more languages than just Java run on the JVM; Doppio can run Scala, Clojure, and other JVM languages.

First I thought this was a belated April fools...

But it seems to download crazy Java stuff and I get "threadpool.ts:83 Uncaught TypeError: Cannot read property 'run' of undefined" with Chrome 50.

License is ISC, in case anyone is wondering.

They don't mention it anywhere, but you can find it in the package.json file in their git repo.

ISC is the default license for NPM modules when you `npm init`, so I am not sure if that's their actual license.

Doppio itself is under the MIT license.

Regardless of their intent, it is the license with which they released it.

I'm always amused when a new and "better" (browser) technology's website makes my browser hang up.

The main goal of Doppio was never speed -- it was compatibility [0]. It's actually a JVM interpreter at present, leading to noticeable slowdown over a native JVM. Thus, there are many opportunities for performance improvements!

A recent contributor is starting to add a JIT to Doppio, which is a step in the right direction [1].

[0] Our academic paper from PLDI 2014 has more details, although the project has evolved since then (it's now JDK8 compatible, for example): https://plasma-umass.github.io/doppio-demo/paper.pdf

[1] https://github.com/plasma-umass/doppio/pull/443

The most difficult part (I think) of writing a JVM is the concurrent garbage collector. Is this making use of Javascript's garbage collector to do the heavy lifting? Or do they implement their own GC in Javascript? Also, I'm wondering how they are handling concurrency.

Doppio uses JavaScript's garbage collector. (As a result, it cannot support weak references.) As for concurrency, thread quanta are mapped to JavaScript events, so Doppio can emulate preemptive multithreading. Doppio potentially preempts a "thread" at each function call.

The nitty gritty details are in the PLDI 2014 paper [0]. Some details have slightly changed, though (e.g. DoppioJVM supports JDK8 now).

[0] https://plasma-umass.github.io/doppio-demo/paper.pdf
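The idea of mapping thread quanta onto JavaScript events can be illustrated with a toy scheduler. This is only a sketch under simplifying assumptions, not Doppio's implementation: each simulated thread is a generator, every `yield` stands in for a point where the thread may be preempted, and the names `runThreads` and `worker` are hypothetical.

```javascript
// Hypothetical sketch, not Doppio's code. Each simulated thread is a
// generator; every `yield` marks a point where it may be preempted.
function runThreads(threads, quantumSteps, onDone, schedule) {
  // By default, hand control back to the event loop between quanta.
  schedule = schedule || ((fn) => setTimeout(fn, 0));
  const runQueue = threads.slice();

  function runNextQuantum() {
    if (runQueue.length === 0) { onDone(); return; }
    const thread = runQueue.shift();
    let finished = false;
    // Run this thread for at most `quantumSteps` steps.
    for (let step = 0; step < quantumSteps && !finished; step++) {
      finished = thread.next().done;
    }
    if (!finished) runQueue.push(thread); // round-robin: back of the queue
    schedule(runNextQuantum);             // the next quantum is a new event
  }
  runNextQuantum();
}

function* worker(name, steps, trace) {
  for (let i = 0; i < steps; i++) {
    trace.push(name + i);
    yield;
  }
}

const trace = [];
runThreads([worker("a", 3, trace), worker("b", 3, trace)], 2,
  () => console.log(trace.join(" ")));
```

Because each quantum is its own event, the browser gets a chance to handle input and repaint between quanta, and the interleaving of the two workers shows up in the trace.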

Once ES6 support is everywhere, you should be able to use WeakMap and WeakSet [0].

[0] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

Actually, those are not sufficient to implement weak references, and have a much different use case! With WeakMap and WeakSet, the keys are weakly referenced, hence the names. If you have a WeakMap, you can't produce a value stored in the map without a strong reference to a key.

You can actually polyfill WeakMap and WeakSet. You can't do the same for weak references.

I misspoke -- you can polyfill WeakSet, but you cannot polyfill WeakMap, since you do not know when you can shrink the map. (But you still cannot use it to emulate weak references, since you need a strong reference to get data out of the map.)
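A small illustration of the point about needing a strong reference to the key, using only standard `WeakMap` methods. The object and variable names are placeholders.

```javascript
const metadata = new WeakMap();
const javaObject = { name: "placeholder for a JVM object" };
metadata.set(javaObject, "payload");

// Lookup works only because we still hold a strong reference to the key.
console.log(metadata.get(javaObject)); // "payload"

// An equal-looking object is a different key...
console.log(metadata.has({ name: "placeholder for a JVM object" })); // false

// ...and WeakMap has no way to enumerate its entries to find the value.
console.log(typeof metadata.keys); // "undefined"
```

A weak reference needs the opposite: a handle that lets you reach the object without keeping it alive, which a WeakMap cannot provide because the only way in is through the key itself.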

It'd be interesting to see if they could get multithreaded support with web workers. Good stuff!

You can use simulated threads with JavaPoly, which allows you to utilize the full Java threading model (including locks). If you enable the native jvm, you get true threads. The reason you wouldn't want to use WebWorkers as a primitive for building threads is that the javascript memory model doesn't allow for shared memory, so you would take a huge performance hit while crossing the serialization boundary for virtually every memory access. You'd be better off just using the simulated threads at that point.
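The "serialization boundary" can be seen without spinning up a worker: `postMessage` copies its argument with the structured clone algorithm, and the standard `structuredClone()` function exposes that same algorithm directly. A brief sketch, with made-up variable names:

```javascript
// What the main thread holds.
const original = { counter: 1 };

// What a worker would receive via postMessage: a copy, not the same object.
const receivedByWorker = structuredClone(original);
receivedByWorker.counter += 1;

// The mutation on the "worker" side is invisible to the original.
console.log(original.counter, receivedByWorker.counter); // 1 2
```

Keeping two such copies in sync would require a message for every write, which is the performance hit described above.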

Just a note: shared memory is coming and I think Firefox Nightly has an experimental version that you can enable in about:config.

It's true that SharedArrayBuffer is coming, but there's no way to share objects from what I understand. Since Doppio maps JVM objects to JS objects, it cannot take advantage of shared array buffers to emulate shared memory threads.

Well, at some point, all objects are just a blob of bytes.

Well, if we represented objects with a blob of bytes, we would have to implement our own garbage collector and manage our own heap, as object references would just be a pointer that points into an array somewhere.

For interop with JavaScript, there's a usability difference between a JavaScript object and a blob of bytes, although that could be overcome with an object "mirror" that proxies operations appropriately.

Our approach was to leverage the existing GC and language features that browsers already have.
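To show what "object references would just be a pointer that points into an array" means in practice, here is a deliberately tiny sketch of objects stored as words in a `SharedArrayBuffer`. It is not something Doppio does; the layout, `allocPoint`, and the accessors are hypothetical, and there is no garbage collector, which is exactly the missing piece the comment describes.

```javascript
// A "heap" of 256 32-bit words backed by shareable memory.
const heap = new Int32Array(new SharedArrayBuffer(1024));
let nextFreeWord = 1; // word 0 is reserved so that pointer 0 can mean null

// Hypothetical layout: a point occupies two consecutive words (x, y).
function allocPoint(x, y) {
  const pointer = nextFreeWord;
  nextFreeWord += 2; // bump allocation; nothing is ever freed here
  heap[pointer] = x;
  heap[pointer + 1] = y;
  return pointer;    // the "object reference" is just an index
}

function getX(pointer) { return Atomics.load(heap, pointer); }
function getY(pointer) { return Atomics.load(heap, pointer + 1); }

const p = allocPoint(3, 4);
console.log(p, getX(p), getY(p)); // 1 3 4
```

Any view created over the same buffer sees the same words, so this representation can be shared between workers, but reclaiming dead objects and exposing them to JavaScript as normal objects become the program's own job.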

If someone wants to jump directly to the sources :)


The only thing that came to my mind was "why put yet another abstraction in the browser?".

Then I realised the answer was probably "because they could".

Someone will write a C compiler in Java and stuff it in the browser next...

Actually, Doppio is an artifact from my own research, so there is a very good reason "why" [0]! Basically, if you want to re-use your existing, well-tested code in the browser, it is quite difficult. The browser environment is very different from the environment that most programs expect. Doppio bridges the gap between the environment that these programs expect and the environment that the browser presents, making it possible to bring a full JVM into the browser that can run complicated, unmodified programs.

[0] Doppio: Breaking the Browser Language Barrier paper from PLDI 2014: https://plasma-umass.github.io/doppio-demo/paper.pdf

From a technical because-we-can standpoint this is really cool, but I am having a hard time imagining actual use cases that warrant running a JVM in the browser. For modern web applications it is too slow (judging from your paper), so perhaps abandoned legacy applications?

Have you encountered any practical use cases?

Yes; the paper mentions CodeMoo.com, which the University of Illinois created independently of us to teach basic programming skills to kids [0]. The file system component is readily used by the Internet Archive for their MS-DOS collection [1]. A number of other instructors approached me regarding using Doppio for in-browser IDEs, but Doppio is somewhat cumbersome to integrate into webpages, and I never had the time to dramatically improve its documentation and ease-of-use. I believe that is a more fundamental barrier to using Doppio than its performance, especially since you can run Doppio in a WebWorker to avoid some of the responsiveness issues.

Also, note that Doppio is an interpreter, so there is significant interpreter overhead. Using a JIT compilation approach would improve performance, and a contributor is currently working on a basic implementation. I honestly believe that it could become significantly faster with additional work, but as a single person with other projects, I lack the resources to do this work myself.

[0] http://www.codemoo.com/

[1] http://ascii.textfiles.com/archives/4924

There is already Emscripten, for compiling from C or C++ to JavaScript...

Every time I read “polyfill” I think it has to do with pixel graphics.

I was expecting a routine in JavaScript for a fast implementation of 2d polygonal texture filling. I was disappointed as this was something about loading Java in a browser.

Other useful hint to authors: the Starter Pack seems to be missing on the server ...

It would be interesting to see how this performs, when it comes, in WebAssembly.

It would require a complete rewrite; I commented previously [0] describing how WebAssembly seems inappropriate for Doppio, and also discussed similar thoughts about SharedArrayBuffer [1]. Also, the main source of slowness is due to the fact that Doppio uses a JVM interpreter and does not JIT; it could be much faster than it currently is with additional engineering.

[0] https://news.ycombinator.com/item?id=11656373

[1] https://news.ycombinator.com/item?id=11655922

You can kiss your data plan goodbye after visiting this. Exercise caution.

Cool! It would be great if it could be a little bit faster, though.

Madness. No thanks!

Wow speechless
