Java Virtual Machine in pure Node.js (github.com/yaroslavgaponov)
286 points by binarymax on Oct 30, 2013 | 138 comments

Interesting! It looks like you've reimplemented portions of the Java Class Library rather than use e.g. the existing class files from OpenJDK.

My current research project, Doppio [1], implements the native portions of the OpenJDK Java Class Library so it can use an unmodified copy of the OpenJDK JCL. As a result, it can run a bunch of nontrivial programs (javac/javap/Rhino/Kawa-Scheme).

One issue you will run into is with multithreading. Since JavaScript has no true threading implementation with shared memory, you'll need to be able to suspend and resume virtual JVM threads. For this reason, Doppio maintains an explicit JVM stack representation.
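
To make the explicit-stack idea concrete, here is a minimal JavaScript sketch. It is not Doppio's code: the names `Frame`, `VThread`, `runSlice` and `runAll`, and the instruction-budget scheduling, are assumptions made up for illustration. Only the opcode values are taken from the JVM specification.

```javascript
// Hypothetical sketch: a virtual thread whose call stack lives in a plain
// array, so the interpreter can stop after any instruction and resume later.
const OP = { ICONST_1: 0x04, ICONST_2: 0x05, IADD: 0x60, IRETURN: 0xac };

class Frame {
  constructor(code) {
    this.code = code;   // bytecode of the method being executed
    this.pc = 0;        // index of the next instruction
    this.operands = []; // operand stack for this frame
  }
}

class VThread {
  constructor(code) {
    this.frames = [new Frame(code)]; // explicit JVM stack, not the JS stack
    this.result = undefined;
    this.done = false;
  }

  // Run at most `budget` instructions, then return so other threads can run.
  runSlice(budget) {
    while (budget-- > 0 && !this.done) {
      const f = this.frames[this.frames.length - 1];
      const op = f.code[f.pc++];
      switch (op) {
        case OP.ICONST_1: f.operands.push(1); break;
        case OP.ICONST_2: f.operands.push(2); break;
        case OP.IADD: {
          const b = f.operands.pop();
          const a = f.operands.pop();
          f.operands.push((a + b) | 0); // wrap to a 32-bit int like the JVM
          break;
        }
        case OP.IRETURN: {
          const value = f.operands.pop();
          this.frames.pop();
          if (this.frames.length === 0) {
            this.result = value;
            this.done = true;
          } else {
            // Hand the return value to the caller's operand stack.
            this.frames[this.frames.length - 1].operands.push(value);
          }
          break;
        }
        default:
          throw new Error("unsupported opcode 0x" + op.toString(16));
      }
    }
    return this.done;
  }
}

// Round-robin scheduler: each thread gets one instruction per turn.
function runAll(threads) {
  let pending = threads.slice();
  while (pending.length > 0) {
    pending = pending.filter((t) => !t.runSlice(1));
  }
  return threads.map((t) => t.result);
}

const program = [OP.ICONST_1, OP.ICONST_2, OP.IADD, OP.IRETURN];
const results = runAll([new VThread(program), new VThread(program)]);
console.log(results);
```

Because all interpreter state sits in `frames`, a thread can be dropped mid-method and picked up again on a later turn, which a recursive interpreter that leans on the JavaScript call stack cannot easily do.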

Anyway, feel free to check out our code, reuse portions of it, or contribute if you're interested; it's MIT Licensed and under active development. :)

[1] Demo: http://doppiojvm.org/ Code: https://github.com/int3/doppio

Seems like the "reimplemented" portions are bits that need to interact directly with the script environment -- String is implemented in terms of V8's native strings, etc... That seems like a feature, not a bug -- you wouldn't want the interpreter needlessly doing bytewise diddling of an abstraction that is already very robust in the host environment.

In the Java Class Library, any function that needs to interact with the environment outside of the JVM itself (OS/file system/network/threading/etc.) is implemented in C and is marked as 'native'. Those are the functions that we explicitly implement for the JavaScript/browser environment.
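
As a rough illustration of what "implementing the native portions" can look like, here is a small JavaScript sketch. The registry, `registerNative` and `invokeNative` are hypothetical names, not Doppio's API; the two method descriptors are the real ones for `System.currentTimeMillis` and `System.arraycopy`, both of which are declared `native` in the class library. Returning a Java `long` as a plain JS number is a simplification.

```javascript
// Hypothetical registry: keys are "class.method + descriptor", values are the
// JavaScript functions that stand in for the C implementations.
const natives = new Map();

function registerNative(key, fn) {
  natives.set(key, fn);
}

function invokeNative(key, args) {
  const fn = natives.get(key);
  // A missing entry corresponds to a native method nobody has written yet.
  if (!fn) throw new Error("UnsatisfiedLinkError: " + key);
  return fn(...args);
}

const CURRENT_TIME = "java/lang/System.currentTimeMillis()J";
const ARRAYCOPY =
  "java/lang/System.arraycopy(Ljava/lang/Object;ILjava/lang/Object;II)V";

registerNative(CURRENT_TIME, () => Date.now());

registerNative(ARRAYCOPY, (src, srcPos, dest, destPos, length) => {
  // Copy through a temporary slice so overlapping ranges behave like Java.
  const chunk = src.slice(srcPos, srcPos + length);
  for (let i = 0; i < length; i++) dest[destPos + i] = chunk[i];
});

const data = [1, 2, 3, 4, 5];
invokeNative(ARRAYCOPY, [data, 0, data, 1, 4]);
console.log(data); // [ 1, 1, 2, 3, 4 ]
```

Everything that is not `native` keeps running as ordinary bytecode from the unmodified class files; only the lookups that reach this table touch the host environment.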

You are right that we could extract more performance by mapping particular classes directly onto efficient browser functionality -- such as String. In the future, we could implement specific classes such as String directly in JavaScript for a performance boost. :)

As a side note, ASM.js itself does not have a String type, which leads me to believe that this particular optimization might not be beneficial if we switch to an efficient JIT strategy.

asm.js doesn't because C doesn't, fundamentally. Given both Java and JavaScript strings are immutable, you probably want to for the sake of having non-copying operations for substring and the like.

Nice experiment and a huge undertaking, do you already support the current implementation of java.util.concurrent? (requires atomic CAS through sun.misc.Unsafe and ability to park threads)

Note that the demo is somewhat old; I do not know if it has park implemented. You'll want to build it from GitHub, which we've made relatively straightforward.

We are planning to update the demo (and perhaps post on HN) once I fix some IE issues (since we strive to be compatible with IE9).

We have the ability to park threads and atomic CAS implemented, so we should support java.util.concurrent.
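
A rough sketch of why both primitives are cheap to emulate when virtual threads are scheduled cooperatively on a single JavaScript thread. This illustrates that assumption only and is not Doppio's actual code: the `Scheduler` class is invented for the example, and `compareAndSwapInt` here takes a field name, whereas the real `sun.misc.Unsafe` method takes an object, a field offset, and two ints.

```javascript
// Hypothetical sketch: nothing can interleave inside a JS function when
// virtual threads only switch between bytecode instructions, so CAS is a
// plain compare followed by an assignment.
function compareAndSwapInt(obj, field, expected, update) {
  if (obj[field] !== expected) return false;
  obj[field] = update;
  return true;
}

class Scheduler {
  constructor() {
    this.parked = new Set();  // threads that must not be scheduled
    this.permits = new Set(); // threads with a pending unpark
  }

  // Returns true if the thread is now blocked, false if it may keep running.
  park(thread) {
    if (this.permits.delete(thread)) return false; // consume the permit
    this.parked.add(thread);
    return true;
  }

  // Wake a parked thread, or leave one permit behind if it is not parked.
  unpark(thread) {
    if (!this.parked.delete(thread)) this.permits.add(thread);
  }

  isRunnable(thread) {
    return !this.parked.has(thread);
  }
}

const counter = { value: 0 };
console.log(compareAndSwapInt(counter, "value", 0, 1)); // true
console.log(compareAndSwapInt(counter, "value", 0, 2)); // false
```

The permit set mirrors the `LockSupport` rule that an `unpark` issued before `park` is not lost: the next `park` returns immediately instead of blocking.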

Granted, it's possible that it invokes a native function somewhere deep inside itself that we don't implement yet, but fixing those is usually very simple.

Looks great. How is the performance? Do you expect to be able to produce GUIs (AWT/Swing) soon?


We had an excellent Google Summer of Code student work on AWT support for half of the summer. We discovered that the OpenJDK AWT implementation betrays its own internal abstractions, making it impossible to implement a new backend without forking the Java Class Library and making nontrivial changes.

You can read more about the problem here: http://mail.openjdk.java.net/pipermail/challenge-discuss/200...

Unfortunately, the project described in the above link appears to be defunct and unmaintained. :( We don't have the resources to work further on the problem at the moment.

Regarding performance, we are about 20-40X slower than the HotSpot Interpreter in Google Chrome, with similar numbers in other browsers. There is a significant amount of work that we can and plan to do to make that much better, though.

Is this used in actual apps?

Right now, http://codemoo.com/ by the University of Illinois uses it to teach Java. A few teachers have expressed interest in using it for similar purposes, but we need to make deployment easier.

It's too slow at the moment for production use, since it is an interpreter. But if we add JIT compilation support, I believe it will be fast enough for usably running legacy code/libraries in the browser.

Really cool hack, and yet I can't help but think that the mile-high software stack between the developer and the CPU continues to grow meaning we're stuck in a perpetual game of filling up spare CPU cycles with nothing in particular.

I'm sure I'll see a demo showing off something that barely runs on modern hardware that we were more than capable of running with good performance in 1990.

sigh I feel like I'm being such a Debbie-downer even if this is a really cool hack.

Now we just need to wait for someone to run Rhino on this so we can have Javascript hosting a JVM which hosts Javascript.

Nashorn would be more appropriate once it is released. It leverages the JVM dynamic invocation instructions better IIRC.

And Nashorn should jit to bytecodes, which should jit back to JS. Like realtime closure compiler.


That actually made me giggle

>we're stuck in a perpetual game of filling up spare CPU cycles with nothing in particular.

There are lots of CPU cycles, and plenty of cases where trading a few for increased flexibility makes a ton of sense.

Not to the users who sit around watching swirlies and beachballs. They don't care what rad, flavor of the month framework you used to build your software. Websites and desktop apps were more responsive in 1998 than they are today*

*In general and not supported by facts of any kind other than my own anecdotia.

users do care about the tools they know. Our perl based webapp must interact with java, windows services, ... you name it.

You should not use virtual runtimes for everything your application does but using them in overnight cronjobs is ok.

Right now we are working on some middleware because the customer wants to use existing reporting tools (C#/Windows only) with our product (Perl/Linux).

>Look how fast and simple this is!

>Yeah nice so how do I generate my <wateva> reports?

>Well we can do the same with our <another> reporting service!

>No I want to keep my <goodOldstuff> based reports can you integrate those?

beachballs are for IO. You should have an SSD and banish the beachball.

I feel exactly the same. But there are some cool projects around that do thin the layers between software and hardware. Look at, for example, Extempore (http://extempore.moso.com.au/) or Hylas-Lisp (https://github.com/eudoxia0/Hylas-Lisp).

The paradigm is that it's easier (cheaper) to have less performant code that takes developers less time to build than it is to have maximally performant code...

This project at least provides the potential for people like myself who have existing Java projects with longstanding functioning methods that I don't want to rewrite in Node.js... Yes it's another layer, but for some functions that layer can be less complicated than reimplementation.

> This project at least provides the potential for people like myself who have existing Java projects with longstanding functioning methods that I don't want to rewrite in Node.js

In this case, wouldn't a java-to-js translator be more efficient than rewriting the whole jvm for Node.js?

Sometimes you just have to let people try to convince themselves that wildly impractical (but really neat) toys could have serious applications. <s>For example, Node.js.</s>

There's only one way to find out ;)

the differences are in how quickly the software can get written, and where that software can go to be run, and in how many different places.

It could be we haven't improved those either, but there's more to life than raw performance you see.

I agree, but then again, I remember the days when any given piece of software was ported to a dozen competing home computing platforms with wildly different architectures and operating systems as a matter of course. These days we basically have 2 architectures and 3 or 4 different OSs and that's about it.

Looking at lots of "web-scale" technologies, I can't help but think that lots of what we're trying to solve with racks of computers could probably be handled with ease by a single modern machine if the software wasn't so inefficient.

You have a point. But then, what I remember about that software that was ported to all the different architectures was:

1. The software was an order of magnitude or two less complex in terms of features and platform. That is, there were maybe a dozen different kinds of (highly predictable, testable) home computers instead of a combinatorial infinity of video cards/motherboards/sound cards/operating systems/etc. Plus, the GUI/evented model is conceptually more difficult for programmers to cope with than straightforward "I own the machine" imperative blocking code. If we want to have GUIs, we need languages that either abstract away events, or at least provide better tools for handling events than C or assembler could. Such languages tend to be fairly complex (in implementation), "inefficient", and garbage collected (since events link to memory/state that must be disposed of eventually, somehow, safely, without segfaulting). Failing that, we need programmers that are smart and capable enough to just plow through the complexity of events in plain C or C++. It's a different caliber of programmer than those who could singlehandedly make simple computer games on an 8-bit home microcomputer.

2. In cases of more complex software, ports were achieved by first designing and implementing a one off virtual machine, then simply porting the virtual machine to the different platforms. See ScummVM, Z-machine, Another World, etc. Surely that threw away some notion of efficiency. You made up for it by implementing the performance sensitive routines inside the VM itself.

3. The inefficiency of software is usually a tradeoff for development speed. To make a virtual platform work consistently across the many different real platforms they must run on, the efficiency of the virtual platform becomes (roughly) a function of ( platform implementors / platform count ) * time. We're only 5 years down the road of this round of serious javascript optimisation. At the end of the day, the curve of that function isn't that different from java, or C or C++ or anything else. those other things just have a huge head start.

Why can't this be used to replace all of the java applets or allow java back into the browser? I know there were security issues before with native Java, perhaps JS sandboxed Java could solve the security issues.

Java applets are more powerful than any web browser API (you can't get at USB devices from the web browser except through a plugin, for example). BankID (used for banking and electronic payment in the Nordic countries by almost every bank) makes use of this to do validation with a USB device to authenticate based on bank card.

>you can't get at USB devices from the web browser except through a plugin, for example

Not yet, we can't. Mozilla is working on the WebUSB[0] API that'll allow access to USB devices via javascript as part of their ongoing WebAPI effort. They need this for FirefoxOS at some point, and these APIs are being implemented in desktop Firefox as well.

[0] https://wiki.mozilla.org/WebAPI/WebUSB

Interesting point. On that note, does a JVM exist that can run on bare metal?

18ms for calculating Fibonacci numbers from 1 to 10. This is definitely bringing JAVA back to the good old days of enterprise execution speeds. :-)

Really cool proof of concept! I would really enjoy seeing more projects like this!

Who doesn't wish it was still 1999? Hot stock market, millennium mania, cheap gas, dot-coms. Really slow Java makes it all come rushing back.

> [1999's] slow Java

Java 1.2 (1998) used a JIT and a generational collector. It was probably faster than today's Ruby, Python, or PHP.

What was slow was start up. Applets in particular were really horrible since they completely froze the browser for several seconds.

If I remember correctly, this was finally fixed many years later with some version of Java 6.0.

It didn't go full speed until Sun released HotSpot, which was probably a year or two later. The farther back in time you go, naturally, the slower Java was, sort of like the universe as you go back to the Big Bang. You also could have made more in the stock market with each receding year. But 1999 was a memorable year.

1999 was right about the time Java started getting fast enough to use for servers (where you don't care much about start-up time).

About that time, I needed to decide whether to use Java or Delphi (an ahead-of-time Pascal compiler) for a new project. I was worried about garbage collection overhead. So, I designed a trivial benchmark wherein an array of simple objects was constructed in a loop (as we revisit each slot in the array, we replace the old object with a new one). Java came out ahead. The superiority of GC, along with platform independence, convinced me to go with Java.
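
For anyone who wants to try the same shape of benchmark, here is a minimal sketch. It is written in JavaScript rather than the Java and Delphi used in the original comparison, and the function name `churn`, the slot count, and the round count are arbitrary choices for illustration, so the timing it prints says nothing about either of those languages.

```javascript
// Hypothetical allocation-churn benchmark: revisit every slot of an array
// and replace the old object with a freshly allocated one.
function churn(slots, rounds) {
  const arr = new Array(slots).fill(null);
  let allocated = 0;
  for (let r = 0; r < rounds; r++) {
    for (let i = 0; i < slots; i++) {
      // The previous occupant of arr[i] becomes garbage at this point.
      arr[i] = { id: allocated++ };
    }
  }
  return { arr, allocated };
}

const started = Date.now();
const run = churn(1000, 100);
const elapsedMs = Date.now() - started;
console.log(run.allocated + " objects allocated in " + elapsedMs + " ms");
```

The live set never exceeds `slots` objects, so almost everything allocated dies young, which is the case a generational collector is built to handle cheaply.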

Highly object-oriented code makes extensive use of memory allocation; I suspect this to be a general rule. Java almost forces you to use this style, so consequently, I expect that the GC is highly optimized for it, and GC is lazy, which cuts the cost of deallocations. I can't see any reason why manual memory management with new/delete, malloc/free, etc., couldn't be done lazily as well, but I think it usually reclaims memory eagerly, which gives a different trade-off (i.e. delete/free are implemented eagerly inside the standard library, rather than just marking the memory and returning). If you want the highest performance, you'll allocate memory in large blocks and take it out of the hands of the GC or library.

PS. I can't swear to it that you could do a decent lazy delete or free and get better performance in C or C++, but it doesn't sound absurd on its face. What the standard library could not do is run its own background thread and update memory structures asynchronously, the way some advanced GCs do. But the whole problem can be avoided by doing fewer allocations in the first place, and performance will be consistently good no matter what the platform.


Why did you spell it in all-caps here?

Everything enterprisey has to be in all caps and mean something that's trying to be cool.

JAVA = Just Another Vague Acronym.

That and you need to use xml for everything to be enterprise.

XML is like violence: if it isn't working, you aren't using enough.

At least now in java things are going from XML to annotations :)

Hasn't that transition been going on for 6+ years now? I guess that's what counts as agile in the enterprise space.

Finally, I can run JRuby on my node server!

From poster with similar project above:


> We're also making progress on others, but we're not quite there yet: JRuby, Clojure, Scala REPL

I think I just had a micro heart attack reading that.

It'd be interesting to port the JavaFX/GL stuff to WebGL...

...but I'd rather stab myself in the face with a screwdriver.

ClojureScript is probably more efficient for Clojure, but running Scala REPL (and scripts) would be really nice. Some tasks are not CPU-intensive, they mostly wait on network I/O, but expressing them in something like Akka is just cleaner.

and clojurescript, circle is complete.

You mean Rhino (the JavaScript engine written in Java)? :)


Hey, if you want your precious node back, you can then just run Rhino on the JVM on the node.

This provides a highly scalable, highly available infrastructure for running the JVM in Rhino on the JVM in Node.

Scalable in the sense that you can try and climb up the layers of abstraction in the same way you'd scale a mountain of rubble, yes.

It's turtles all the way down.

No, but apparently all the way up.

Why not just use the javascript x86 emulator that somebody posted to boot an OS? Then you can run the Hotspot JVM, v8, ad infinitum. Piece of cake.

Hotspot runs the program faster than V8? no problem.

Emulate x86 in v8 run hotspot inside emulator.

now you got your performance back.

Yes, it's like a beautiful perpetual motion machine.

Only one problem. Where's node.js?

I'm thinking the next big step is to build ASICs with hierarchical memory sub-systems that can quickly load and execute either x86 or ARM instructions, kind of like binary run-times implemented in hardware, and then boot operating systems on them so that programs don't have to worry about the physical memory address layout or disk file-systems, and then finally run a compiled version of the JVM that includes HotSpot. Then we would all be able to run Java, and it would be pretty fast, probably faster than any other implementation, and totally capable of integrating with the entire Java ecosystem.

What you're describing sounds like how computers work now. What I see as the logical next step is to design chips that use Javascript as their instruction set, because in the future, every layer underneath Javascript will just be overhead, because everything will be Javascript. We might as well start planning for it now.

like a chip that runs asm.js directly as its instruction set, and a js interpreter written in asm.js, with javascript runtime semantics implemented in hardware?

kind of like the espruino but even moreso?

I didn't know of asm.js and espruino before you mentioned them, but it looks like the future may be coming sooner than I thought.

What I really want to say is that you can turn my kernel calls into JSON-RPC when you pry my ARM assembler from my cold, dead fingers.

Now I'll have an even bigger problem explaining what the difference between Java and Javascript is! :)

It's like Car and Carpet. But it's also the kind of car that you can build out of carpets.

...or the carpet that you can build out of cars, considering RingoJS.

Java 6.0 comes with JSR 223 ("scripting for the Java platform"). It's a framework for hosting scripting engines. 6.0+ is shipped with a JavaScript engine based on Mozilla's Rhino.


For anyone interested in the details:

-Reads in real .class files

-Uses the JS run-time to implement the Java run-time (e.g. there isn't a garbage collector written in JS, the JS collector is used)

-Only implemented part of java.lang and java.io

A handful of classes with a few of their methods, in java.lang and java.io.

I created something similar some time ago.


Although I targeted only the J2ME subset. I also found a few similar projects:



Just a toy implementation[1], if someone is wondering.

[1]The basic .class file attribute parsers are located in libs/classfile, while a simple bytecode interpreter can be found in jvm.js. It doesn't load a real runtime library (classpath, apache harmony,openjdk,etc...) but it partially implements a few java.* classes in pure javascript under libs/java.

Can someone point out what it can do and what it cannot do? I am curious what 2,000 lines of JavaScript can do. How much more code would it take to get a full-featured implementation?

It has a Java bytecode interpreter and an initial implementation of a classfile loader. As said above, it doesn't have a full implementation of the Java runtime, just a few classes to be able to print to standard output and do some things with strings, and not much else (at least from what I've seen of the content of libs/java).

Regarding the loader, it can load a few sections from a classfile: the one containing the methods' bytecode, the exceptions table and the constant pool (i.e. where all your strings are stored along with some other "constants"). How much would it take for a full implementation? It depends on what you mean by full. Even using one of the open-source runtime libraries like OpenJDK, Classpath, or Harmony/Android, I'd say a few years done solo, and not much less with more people, to build something complete and stable enough for general use. Definitely not a simple project.
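
To give a feel for where a classfile loader starts, here is a small sketch that reads only the fixed-size header. It is not taken from the project under discussion; `readClassHeader` and the hand-built sample bytes are made up for illustration. The layout itself (a 4-byte magic number `0xCAFEBABE`, then minor version, major version, and constant pool count as big-endian 16-bit values) comes from the JVM specification.

```javascript
// Hypothetical helper: read the fixed-size header of a .class file.
function readClassHeader(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const magic = view.getUint32(0); // DataView reads big-endian by default
  if (magic !== 0xcafebabe) throw new Error("not a class file");
  return {
    minor: view.getUint16(4),
    major: view.getUint16(6),
    // One more than the number of constant pool entries that follow.
    constantPoolCount: view.getUint16(8),
  };
}

// Hand-built sample: magic, minor 0, major 51 (Java 7), constant pool count 16.
const sample = new Uint8Array([
  0xca, 0xfe, 0xba, 0xbe, 0x00, 0x00, 0x00, 0x33, 0x00, 0x10,
]);
const header = readClassHeader(sample);
console.log(header); // { minor: 0, major: 51, constantPoolCount: 16 }
```

Everything after these ten bytes is variable-length (constant pool entries, fields, methods, attributes), which is where most of the parsing work in a real loader goes.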

Should be renamed to node-java instead of node-jvm, I think.

What version of java does it handle? Does it support java.lang.*? What other standard libraries?

It's too early to talk about Java versions, but I suppose he used the latest JVM specification to implement his classfile parser. It supports only a few methods of some classes in java.lang and java.io (screen output and some basic stuff like String methods, etc.).

Please feel free to fork it, improve and post back.

I respectfully decline the offer :)

Atwood's Law: any application that can be written in JavaScript, will eventually be written in JavaScript.

tlrobinson's Law: any submission to Hacker News about a novel JavaScript program will contain a comment referencing Atwood's Law.

...and, eventually, there will be no other language than JavaScript.


And Mozilla's (mothballed) Narcissus: https://github.com/mozilla/narcissus

And I thought I was making a joke.

Yeah, can I spin up Nashorn on this Java running on V8?

Javascript's Rule 34

Mozilla has written pdf.js to replace Adobe's PDF plugin and shumway.js to replace Adobe's Flash plugin. When will we see a java.js that replaces Java applets on the web?

But really though, that could be amazing! Or fail! I can't decide.

maybe they like oracle more than adobe

I'm waiting for a JavaScript interpreter written in Javascript.

Edit: It's already here :) https://github.com/jterrace/js.js/

Atwood's Law ad extremum.

Atwood's Ouroboros

indeed :)

Now, my question is. WHY!? I might be crazy, but does anyone see a legitimate use for this?

This question on this kind of topic needs to stop being asked on Hacker News. It's fair to ask whether or not there was a material usage intended for the project, but if there isn't one, we really need to stop implying that there should be.

that's not a chance I'm willing to take. This is how PHP started. It was a reject experiment that escaped the lab. Then some poor souls took it seriously. Now all of humanity suffers.

I would argue you haven't really learned how to program until you implement some kind of machine yourself or at least written a compiler. He probably has a very intimate understanding of how all that works now, and I bet he did it for that reason alone.

Yes, yes and you can't DJ without crates of vinyl and 2 turntables. I'll get off your lawn now

Now you just need to run Rhino(https://developer.mozilla.org/en/docs/Rhino) on it and you will be totally meta.

My current research project, Doppio, can already run Rhino in the browser using JavaScript. :)

Demo: http://doppiojvm.org/


Why run Rhino when you can run Nashorn and Node.jar? http://kaeff.net/posts/why-ruby-and-nodejs-folks-should-care...

I have a feeling James Gosling choked on his coffee this morning. :)

How hard is it to write a JVM? I'd think it pretty easy, as it's intended to be small and easily portable.

You could even write one in Java (and I'm sure it's been done, many times). EDIT e.g. http://igormaznitsa.com/projects/mjvm/index.html

Of course, doing all the tricky JIT etc of the JVM is a different story...

It depends on what you consider the JVM. If you're just talking about something that can load class files and do some instruction interpretation (like this project) then that's one thing... if you're talking about implementing all of the JNI functions within the JVM standard library (awt, net) along with the concurrency primitives (conforming to the Java memory model?) then that is a significant multi-year, multi-developer project.

Is PyJNIus support coming anytime soon? The ability to use Python to script Java calls into the JS run-time to access Cordova APIs on top of Android libraries is really important for developers who have Python, Cython, and Java skills, but find P4A too daunting compared to Cordova. It's only three more layers of nonsense.

This is the worst thing that ever happened

Worse than the last worst thing that ever happened?


I thought that that was the worst thing that ever happened, and I turned out to be wrong. No inconsistency here.

haha this is a really excellent callout. well done.

Yep. Mad props.

Agreed. He must have done it for fun, just to prove that it could be done. I don't think anyone legitimately would actually want this...

What does Node have to do with it (besides perhaps a little IO)? Isn't it just JavaScript?

For comparison, here is a recent JVM in Lua: https://cowlark.com/luje/doc/stable/doc/index.wiki

With GWT you can translate the Java application to standalone JavaScript files ... and with this implementation you can run Java in a Javascript environment.

Which one would perform best?

GWT without a doubt. It's running on a smaller stack and the translation is done at compile time, not runtime.

The example makefiles appear to use javac.

Next step: Java compiler in node

So now I can use one of those Java libraries that implement a JavaScript engine and finally have JavaScript running on node.js?

I made a house out of butter yey!!!!

Would be interested in seeing a JVM written in Go.

I think that the other way round, Go on the JVM, might be nice. It could actually be faster than native Go; Go function dispatch might benefit from the HotSpot JIT inlining optimizations.

Why? I must be missing the connection between Node and Go.

For 15 minutes of fame on HN, of course.

Considering Go is slower than C, and the JVM is written in C/C++, I'd say the JVM would be slower.

I'm not trying to say I wouldn't like to see it, but I don't think anyone would use it, is all. And it's sad when cool software doesn't get used.

Once the JVM emits machine code for the bytecode, it doesn't matter what language the JVM is coded in.

It's exactly what I thought when I saw this.

What's the point of doing this, why not just use Mozilla Rhino?

Rhino is the opposite of this.

It is the right tool for mixed JS and Java codebase.

I hate java, I hate javascript, I hate node.js, so basically, I just leave a comment here, and you can make a conclusion about what I think about all this.

Just out of curiosity, what are your preferred languages and server environments?

Based on his HN profile, my guess is Python.

C, C++, Python, Lua

Why all the hate but none of the conceptualization?


