Interesting! It looks like you've reimplemented portions of the Java Class Library rather than use, e.g., the existing class files from OpenJDK.
My current research project, Doppio [1], implements the native portions of the OpenJDK Java Class Library so it can use an unmodified copy of the OpenJDK JCL. As a result, it can run a bunch of nontrivial programs (javac/javap/Rhino/Kawa-Scheme).
One issue you will run into is with multithreading. Since JavaScript has no true threading implementation with shared memory, you'll need to be able to suspend and resume virtual JVM threads. For this reason, Doppio maintains an explicit JVM stack representation.
Anyway, feel free to check out our code, reuse portions of it, or contribute if you're interested; it's MIT Licensed and under active development. :)
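To make the "explicit JVM stack representation" point concrete, here is a minimal sketch of the idea. It is not Doppio's code: the VThread class, the three toy opcodes, and the round-robin runAll scheduler are all made up for illustration. The point is only that when frames live in an ordinary array instead of on the JavaScript call stack, a virtual thread can stop after any instruction and pick up again later.

```javascript
// Hypothetical sketch: none of these names come from Doppio.
// Each virtual thread owns an explicit array of frames instead of using the JS call stack.
class VThread {
  constructor(name, code) {
    this.name = name;
    // A frame holds the code being run, a program counter, and an operand stack.
    this.frames = [{ code, pc: 0, stack: [] }];
    this.result = undefined;
  }

  get done() { return this.frames.length === 0; }

  // Run at most `budget` instructions, then return so another thread can run.
  // All state lives in this.frames, so the thread resumes exactly where it stopped.
  run(budget) {
    while (budget-- > 0 && !this.done) {
      const frame = this.frames[this.frames.length - 1];
      const [op, arg] = frame.code[frame.pc++];
      switch (op) {
        case 'push':
          frame.stack.push(arg);
          break;
        case 'add': {
          const b = frame.stack.pop();
          const a = frame.stack.pop();
          frame.stack.push(a + b);
          break;
        }
        case 'ret':
          this.result = frame.stack.pop();
          this.frames.pop();
          break;
        default:
          throw new Error('unknown opcode: ' + op);
      }
    }
  }
}

// Round-robin scheduler: give each live thread a small time slice in turn.
function runAll(threads, budget) {
  let slices = 0;
  while (threads.some(t => !t.done)) {
    for (const t of threads) {
      if (!t.done) { t.run(budget); slices++; }
    }
  }
  return slices;
}

// Two threads run the same toy program, interleaved two instructions at a time.
const program = [['push', 1], ['push', 2], ['add'], ['push', 3], ['add'], ['ret']];
const t1 = new VThread('t1', program);
const t2 = new VThread('t2', program);
const slices = runAll([t1, t2], 2);
```

A real interpreter would also yield back to the browser event loop between slices (for example via setTimeout), which is only possible because no interpreter state is held on the native stack.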
Seems like the "reimplemented" portions are bits that need to interact directly with the script environment -- String is implemented in terms of V8's native strings, etc... That seems like a feature, not a bug -- you wouldn't want the interpreter needlessly doing bytewise diddling of an abstraction that is already very robust in the host environment.
In the Java Class Library, any function that needs to interact with the environment outside of the JVM itself (OS/file system/network/threading/etc.) is implemented in C and is marked as 'native'. Those are the functions that we explicitly implement for the JavaScript/browser environment.
You are right that we could extract more performance by mapping particular classes directly onto efficient browser functionality -- such as String. In the future, we could implement specific classes such as String directly in JavaScript for a performance boost. :)
As a side note, ASM.js itself does not have a String type, which leads me to believe that this particular optimization might not be beneficial if we switch to an efficient JIT strategy.
asm.js doesn't because C doesn't, fundamentally. Given both Java and JavaScript strings are immutable, you probably want to for the sake of having non-copying operations for substring and the like.
Nice experiment and a huge undertaking, do you already support the current implementation of java.util.concurrent? (requires atomic CAS through sun.misc.Unsafe and ability to park threads)
Note that the demo is somewhat old; I do not know if it has park implemented. You'll want to build it from GitHub, which we've made relatively straightforward.
We are planning to update the demo (and perhaps post on HN) once I fix some IE issues (since we strive to be compatible with IE9).
We have the ability to park threads and atomic CAS implemented, so we should support java.util.concurrent.
Granted, it's possible that it invokes a native function somewhere deep inside itself that we don't implement yet, but fixing those is usually very simple.
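For anyone wondering what CAS and park can look like when the host has only one real thread, here is a rough sketch. The function and class names are assumptions for illustration; they are not Doppio's implementation or the real sun.misc.Unsafe signatures. The only borrowed behaviour is the usual park/unpark permit rule: an unpark that arrives before park makes the next park return immediately.

```javascript
// Hypothetical sketch; names are illustrative, not Doppio's or sun.misc.Unsafe's real API.

// With one host thread, nothing can run between the read and the write,
// so compare-and-swap reduces to a plain check followed by an assignment.
function compareAndSwap(obj, field, expected, update) {
  if (obj[field] !== expected) return false;
  obj[field] = update;
  return true;
}

// A scheduler that tracks which virtual threads are parked.
class Scheduler {
  constructor() {
    this.parked = new Set();   // threads that are currently blocked
    this.permits = new Set();  // threads with one pending unpark
  }

  // park: consume a pending permit if there is one, otherwise mark the thread as blocked.
  // Returns true if the thread must stop running.
  park(threadId) {
    if (this.permits.delete(threadId)) return false;
    this.parked.add(threadId);
    return true;
  }

  // unpark: wake a parked thread, or leave a permit so the next park returns immediately.
  unpark(threadId) {
    if (!this.parked.delete(threadId)) this.permits.add(threadId);
  }

  isRunnable(threadId) { return !this.parked.has(threadId); }
}

const counter = { value: 0 };
const first = compareAndSwap(counter, 'value', 0, 1);   // expected value matches
const second = compareAndSwap(counter, 'value', 0, 2);  // value is no longer 0

const sched = new Scheduler();
sched.unpark('A');                  // permit granted before park
const blockedA = sched.park('A');   // consumes the permit
const blockedB = sched.park('B');   // no permit available
sched.unpark('B');
```

An interpreter loop would consult isRunnable before giving a virtual thread its next time slice.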
We had an excellent Google Summer of Code student work on AWT support for half of the summer. We discovered that the OpenJDK AWT implementation betrays its own internal abstractions, making it impossible to implement a new backend without forking the Java Class Library and making nontrivial changes.
Unfortunately, the project described in the above link appears to be defunct and unmaintained. :( We don't have the resources to work further on the problem at the moment.
Regarding performance, we are about 20-40X slower than the HotSpot Interpreter in Google Chrome, with similar numbers in other browsers. There is a significant amount of work that we can and plan to do to make that much better, though.
Right now, http://codemoo.com/ by the University of Illinois uses it to teach Java. A few teachers have expressed interest in using it for similar purposes, but we need to make deployment easier.
It's too slow at the moment for production use, since it is an interpreter. But if we add JIT compilation support, I believe it will be fast enough for usably running legacy code/libraries in the browser.
Really cool hack, and yet I can't help but think that the mile-high software stack between the developer and the CPU continues to grow meaning we're stuck in a perpetual game of filling up spare CPU cycles with nothing in particular.
I'm sure I'll see a demo showing off something that barely runs on modern hardware that we were more than capable of running with good performance in 1990.
sigh I feel like I'm being such a Debbie-downer even if this is a really cool hack.
Not to the users who sit around watching swirlies and beachballs. They don't care what rad, flavor of the month framework you used to build your software. Websites and desktop apps were more responsive in 1998 than they are today*
*In general and not supported by facts of any kind other than my own anecdotia.
Users do care about the tools they know. Our Perl-based webapp must interact with Java, Windows services, ... you name it.
You should not use virtual runtimes for everything your application does but using them in overnight cronjobs is ok.
Right now we are working on some middleware because the customer wants to use existing reporting tools (C#/Windows only) with our product (Perl/Linux).
>Look how fast and simple this is!
>
>Yeah nice so how do I generate my <wateva> reports?
>
>Well we can do the same with our <another> reporting service!
>
>No I want to keep my <goodOldstuff> based reports can you integrate those?
The paradigm is that it's easier (cheaper) to have less performant code that takes less time for developers to build than it is to have maximally performant code...
This project at least provides the potential for people like myself who have existing Java projects with longstanding functioning methods that I don't want to rewrite in Node.js... Yes it's another layer, but for some functions that layer can be less complicated than reimplementation.
> This project at least provides the potential for people like myself who have existing Java projects with longstanding functioning methods that I don't want to rewrite in Node.js
In this case, wouldn't a java-to-js translator be more efficient than rewriting the whole jvm for Node.js?
Sometimes you just have to let people try to convince themselves that wildly impractical (but really neat) toys could have serious applications. <s>For example, Node.js.</s>
I agree, but then again, I remember the days when any given piece of software was ported to a dozen competing home computing platforms with wildly different architectures and operating systems as a matter of course. These days we basically have 2 architectures and 3 or 4 different OSs and that's about it.
Looking at lots of "web-scale" technologies, I can't help but think that lots of what we're trying to solve with racks of computers could probably be handled with ease by a single modern machine if the software wasn't so inefficient.
You have a point. But then, what I remember about that software that was ported to all the different architectures was:
1. The software was an order of magnitude or two less complex in terms of features and platform. That is, there were maybe a dozen different kinds of (highly predictable, testable) home computers instead of a combinatorial infinity of video cards/motherboards/sound cards/operating systems/etc. Plus, the GUI/evented model is more conceptually difficult for programmers to cope with than straightforward "I own the machine" imperative blocking code. If we want to have GUIs, we need languages that either abstract away events, or at least provide tools to handle events more easily than C or assembler could. Such languages tend to be fairly complex (in implementation)/"inefficient"/garbage collected (since events link to memory/state that must be disposed of eventually, somehow, safely, without segfaulting). Failing that, we need programmers who are smart and capable enough to just plow through the complexity of events in plain C or C++. It's a different caliber of programmer than those who could singlehandedly make simple computer games on an 8-bit home microcomputer.
2. In cases of more complex software, ports were achieved by first designing and implementing a one off virtual machine, then simply porting the virtual machine to the different platforms. See ScummVM, Z-machine, Another World, etc. Surely that threw away some notion of efficiency. You made up for it by implementing the performance sensitive routines inside the VM itself.
3. The inefficiency of software is usually a tradeoff for development speed. To make a virtual platform work consistently across the many different real platforms it must run on, the efficiency of the virtual platform becomes (roughly) a function of ( platform implementors / platform count ) * time. We're only 5 years down the road of this round of serious JavaScript optimisation. At the end of the day, the curve of that function isn't that different from Java, or C or C++ or anything else. Those other things just have a huge head start.
Why can't this be used to replace all of the java applets or allow java back into the browser? I know there were security issues before with native Java, perhaps JS sandboxed Java could solve the security issues.
Java applets are more powerful than any web browser API (you can't get at USB devices from the web browser except through a plugin, for example). BankID (used for banking and electronic payment in the Nordic countries by almost every bank) makes use of this to do validation with a USB device to authenticate based on bank card.
>you can't get at USB devices from the web browser except through a plugin, for example
Not yet, we can't. Mozilla is working on the WebUSB[0] API that'll allow access to USB devices via javascript as part of their ongoing WebAPI effort. They need this for FirefoxOS at some point, and these APIs are being implemented in desktop Firefox as well.
It didn't go full speed until Sun released HotSpot, which was probably a year or two later. The farther back in time you go, naturally, the slower Java was, sort of like the universe as you go back to the Big Bang. You also could have made more in the stock market with each receding year. But 1999 was a memorable year.
1999 was right about the time Java started getting fast enough to use for servers (where you don't care much about start-up time).
About that time, I needed to decide whether to use Java or Delphi (an ahead-of-time Pascal compiler) for a new project. I was worried about garbage collection overhead. So, I designed a trivial benchmark wherein an array of simple objects was constructed in a loop (as we revisit each slot in the array, we replace the old object with a new one). Java came out ahead. The superiority of GC, along with platform independence, convinced me to go with Java.
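For reference, the shape of that benchmark is roughly the following, sketched here in JavaScript rather than the original Java or Delphi. The slot count, round count, object fields, and the churn name are all assumptions; the original numbers were not given, and timings from a sketch like this say little about any particular runtime.

```javascript
// Sketch of the benchmark shape described above; sizes and names are assumptions.
function churn(slots, rounds) {
  const arr = new Array(slots).fill(null);
  let allocations = 0;
  for (let r = 0; r < rounds; r++) {
    for (let i = 0; i < slots; i++) {
      // Replacing the slot makes the previous object unreachable (garbage).
      arr[i] = { id: allocations++, round: r };
    }
  }
  return { arr, allocations };
}

const start = Date.now();
const { arr, allocations } = churn(1000, 100);
const elapsedMs = Date.now() - start;
console.log(`${allocations} allocations in ${elapsedMs} ms`);
```

Every pass over the array turns the previous generation of objects into garbage, so the loop stresses allocation and reclamation rather than computation.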
Highly object-oriented code makes extensive use of memory allocation; I suspect this to be a general rule. Java almost forces you to use this style, so consequently, I expect that the GC is highly optimized for it, and GC is lazy, which cuts the cost of deallocations. I can't see any reason why manual memory management with new/delete, malloc/free, etc., couldn't also be done lazily, but I think it usually reclaims memory eagerly, and gives a different trade-off (i.e. delete/free are implemented eagerly inside the standard library, rather than just marking the memory and returning). If you want the highest performance, you'll allocate memory in large blocks and take it out of the hands of the GC or library.
PS. I can't swear to it that you could do a decent lazy delete or free and get better performance in C or C++, but it doesn't sound absurd on its face. What the standard library could not do is run its own background thread and update memory structures asynchronously, the way some advanced GCs do. But the whole problem can be avoided by doing fewer allocations in the first place, and performance will be consistently good no matter what the platform.
ClojureScript is probably more efficient for Clojure, but running Scala REPL (and scripts) would be really nice. Some tasks are not CPU-intensive, they mostly wait on network I/O, but expressing them in something like Akka is just cleaner.
I'm thinking the next big step is to build ASICs with hierarchical memory sub-systems that can quickly load and execute either x86 or ARM instructions, kind of like binary run-times implemented in hardware, and then boot operating systems on them so that programs don't have to worry about the physical memory address layout or disk file-systems, and then finally run a compiled version of the JVM that includes HotSpot. Then we would all be able to run Java, and it would be pretty fast, probably faster than any other implementation, and totally capable of integrating with the entire Java ecosystem.
What you're describing sounds like how computers work now. What I see as the logical next step is to design chips that use Javascript as their instruction set, because in the future, every layer underneath Javascript will just be overhead, because everything will be Javascript. We might as well start planning for it now.
like a chip that runs asm.js directly as its instruction set, and a js interpreter written in asm.js, with javascript runtime semantics implemented in hardware?
Java 6.0 comes with JSR 223 ("scripting for the Java platform"). It's a framework for hosting scripting engines. 6.0+ is shipped with a JavaScript engine based on Mozilla's Rhino.
Just a toy implementation[1], if someone is wondering.
[1]The basic .class file attribute parsers are located in libs/classfile, while a simple bytecode interpreter can be found in jvm.js. It doesn't load a real runtime library (classpath, apache harmony,openjdk,etc...) but it partially implements a few java.* classes in pure javascript under libs/java.
Can someone point out what it can do and what it cannot do? I am curious what 2,000 lines of JavaScript can do. How much more code would it take to get a full-featured implementation?
It has a Java bytecode interpreter and an initial implementation of a classfile loader. As said above, it doesn't have a full implementation of the Java runtime, just a few classes to be able to print to standard output and do things with strings, and not much else (at least from what I've seen of the contents of libs/java).
Regarding the loader, it can load a few sections from a classfile: the one containing method bytecode, the exception table, and the constant pool (i.e. where all your strings are stored along with some other "constants").
How long would it take for a full implementation? It depends on what you mean by full. Even using one of the open-source runtime libraries like OpenJDK, Classpath, or Harmony/Android, I'd say a few years done solo, and not much less with more people, to build something complete and stable enough for general use. Definitely not a simple project.
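To give a feel for what loading the constant pool involves, here is a small sketch. It is not the code from libs/classfile; parseClassHeader and the returned object shape are invented for illustration. It reads only the header (magic number, version, constant pool) and handles just four constant tags (Utf8, Integer, Class, String), rejecting everything else.

```javascript
// Sketch only: handles a handful of constant-pool tags and ignores everything after the pool.
function parseClassHeader(bytes) {
  // Class files are big-endian, which is DataView's default byte order.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let pos = 0;
  const u1 = () => view.getUint8(pos++);
  const u2 = () => { const v = view.getUint16(pos); pos += 2; return v; };
  const u4 = () => { const v = view.getUint32(pos); pos += 4; return v; };

  if (u4() !== 0xCAFEBABE) throw new Error('not a class file');
  const minor = u2();
  const major = u2();
  const count = u2();          // number of entries plus one; index 0 is unused
  const pool = [null];
  for (let i = 1; i < count; i++) {
    const tag = u1();
    switch (tag) {
      case 1: {                // CONSTANT_Utf8: u2 length, then the bytes
        const len = u2();
        let s = '';
        for (let j = 0; j < len; j++) s += String.fromCharCode(u1()); // ASCII only
        pool.push({ tag, value: s });
        break;
      }
      case 3: {                // CONSTANT_Integer: signed 32-bit value
        const v = view.getInt32(pos);
        pos += 4;
        pool.push({ tag, value: v });
        break;
      }
      case 7:                  // CONSTANT_Class: index of a Utf8 name
      case 8:                  // CONSTANT_String: index of a Utf8 value
        pool.push({ tag, index: u2() });
        break;
      default:
        throw new Error('constant tag not handled in this sketch: ' + tag);
    }
  }
  return { minor, major, pool };
}

// A hand-built header with three constant-pool entries.
const sample = new Uint8Array([
  0xCA, 0xFE, 0xBA, 0xBE,                          // magic
  0x00, 0x00, 0x00, 0x34,                          // minor 0, major 52
  0x00, 0x04,                                      // constant_pool_count = 4
  0x01, 0x00, 0x05, 0x48, 0x65, 0x6C, 0x6C, 0x6F,  // Utf8 "Hello"
  0x08, 0x00, 0x01,                                // String -> entry 1
  0x07, 0x00, 0x01,                                // Class -> entry 1
]);
const parsed = parseClassHeader(sample);
```

A real loader also has to handle the remaining tags (including Long and Double, which take two pool slots each), Java's modified UTF-8 encoding, and the fields, methods, and attributes that follow the pool.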
It's too early to talk about java versions, but i suppose he used the latest jvm specification to implement his classfile parser. Only a few methods of some classes of java.lang and java.io (screen output and some basic stuff like String methods,etc..).
Mozilla has written pdf.js to replace Adobe's PDF plugin and shumway.js to replace Adobe's Flash plugin. When will we see a java.js that replaces Java applets on the web?
This question on this kind of topic needs to stop being asked on Hacker News. It's fair to ask whether or not there was a material usage intended for the project, but if isn't one, we really need to stop implying that there should be.
that's not a chance I'm willing to take. This is how PHP started. It was a reject experiment that escaped the lab. Then some poor souls took it seriously. Now all of humanity suffers.
I would argue you haven't really learned how to program until you implement some kind of machine yourself or at least written a compiler. He probably has a very intimate understanding of how all that works now, and I bet he did it for that reason alone.
It depends on what you consider the JVM. If you're just talking about something that can load class files and do some instruction interpretation (like this project) then that's one thing... if you're talking about implementing all of the JNI functions within the JVM standard library (awt, net) along with the concurrency primitives (conforming to the Java memory model?) then that is a significant multi-year, multi-developer project.
Is PyJNIus support coming anytime soon? The ability to use Python to script Java calls into the JS run-time to access Cordova APIs on top of Android libraries is really important for developers who have Python, Cython, and Java skills, but find P4A too daunting compared to Cordova. It's only three more layers of nonsense.
With GWT you can translate the Java application to standalone JavaScript files ... and with this implementation you can run Java in a Javascript environment.
I think that the other way round, Go on the JVM, might be nice. It could actually be faster than native Go; Go function dispatch might benefit from the HotSpot JIT inlining optimizations.
I hate java, I hate javascript, I hate node.js, so basically, I just leave a comment here, and you can make a conclusion about what I think about all this.
[1] Demo: http://doppiojvm.org/ Code: https://github.com/int3/doppio