Rootbeer GPU Compiler Lets Almost Any Java Code Run On the GPU (github.com/pcpratts)
223 points by doublextremevil on Aug 12, 2012 | hide | past | favorite | 79 comments

I wonder if this is as big a win as it sounds. Regardless of what language you're using, you have to "think GPU" to get any performance from GPUs. The additional overhead of using CUDA/OpenCL syntax seems pretty small in comparison.

Yes, and it requires quite a low-level understanding of the architecture to "think GPU". SIMD, warps, blocks, threads, different memory types, no branching/identical branching per core, ... Some of this could probably be abstracted away but you definitely need to be aware and adjust algorithms appropriately. You can't just convert code and hope for the best, unless you just want a slow co-processor.

That's my experience in a nutshell. The cost of doing a cudaMemcpy() far outweighs the advantages for computationally small tasks. The surprising bit for me was what counts as a small task.

Decompressing a 5MP jpg then applying various filters is too lightweight a task to benefit. I thought that would be more or less a perfect GPU task, not so.

Running on OpenCL, a CPU with vector instructions beats a small GPU performance-wise for this problem.

Probably not; typical Java code is dominated by branchy, indirect control flow. Such code typically operates on mutable data, which is not what GPUs are designed to handle efficiently.

This should be a performance nightmare.

I had forgotten just how much I hate java namespaces.

import edu.syr.pcpratts.rootbeer.testcases.rootbeertest.serialization.MMult;

This seems like a pretty amazing project if the claims are true, though - I wasn't aware that CUDA was able to express so many of the concepts used to implement Java applications. The performance data in the slides is certainly compelling!

It's nothing compared to the pain of debugging in a language that doesn't have/encourage proper namespaces.

Ruby has Modules, but many, many common libraries do not use them. I had fun recently debugging a project that (through transitive dependencies) relied on two different "progress bar" libraries, both with a class called "Progress", and neither of which was namespaced. Namespaces solve a real problem.
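Java's standard library itself contains the same kind of clash, resolved cleanly by packages: both java.util and java.awt define a class named List. A small illustrative sketch (class name is mine):

```java
import java.util.ArrayList;
import java.util.List; // the explicit import picks java.util.List

public class NamespaceDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("rootbeer");
        // The other List remains reachable via its fully qualified
        // name, so the two classes never collide.
        Class<?> awtList = java.awt.List.class;
        System.out.println(names.size() + " " + awtList.getName());
    }
}
```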

The thing with namespaces is that they also introduce a problem through the various 'using' clauses programming languages provide and how those interact with common tools like grep. An API with a naming convention that emulates namespaces (e.g. as in Emacs Lisp) avoids that problem. Namespaces can also make code harder to follow when reading, especially when C++ things like SFINAE are around and suddenly everybody qualifies everything fully again to make the code look, and be, unambiguous.

Of course, that can be solved with proper tooling, but some people are just averse to using something more heavyweight than necessary.

Does it really matter? Namespaces/packages only serve to:

- make things unique

- group things logically (which makes the systems design more explicit)

This applies to all programming languages. It's just that there's a convention in the Java community to prefix namespaces with a FQDN, which adds to the length. You're free to choose another convention if you fancy, although I wouldn't recommend it, since the length isn't a major issue, especially considering IDE support.

It's really not a problem when IDEs like Eclipse and Netbeans automatically handle imports for you.

It's a problem with a language if it requires heavyweight environments to make aspects of writing code in it acceptable.


Writing Java is unbearable without Eclipse/NetBeans; .NET is a little less unbearable.

If your IDE "works harder" than your compiler, something seems wrong to me (of course we all expect things like syntax highlighting today)

Meh. It's about trade-offs. Some languages are designed to be lightweight and require minimal assistance from the IDE. Some languages are optimised for very large codebases and expect a lot of "ceremony and plumbing" around them. And then there are, of course, millions of shades between the two extremes. Java and C# are in the second group. These are all very useful tools when used for the right purposes.

(Also I have written a fair amount of Java in vim and I didn't find it any more frustrating than writing Ruby in vim. And I don't even like Java.)

Java pretty much created the concept of a modern IDE where the editor knows about the libraries and does boilerplate expansion for you. This wasn't generally needed before the age of Java.

Java also invented extreme overengineering of libraries, making syntax completion a necessity for even trivial tasks.

Probably doing socket programming in C is easier than in Java

(To be fair, it wasn't really Java; it was "OO evangelists", most likely, starting with C++ and then going crazy with Java. Too bad they never even touched Smalltalk.)

> Probably doing socket programming in C is easier than in Java

Actually... networked programming is really easy in Java.

You are right, though: the state of a lot of Java libs is hilarious! Smalltalk seems really cool, but I've never had a reason to devote time to it.

Learning OOP with Smalltalk can actually hurt your career.

I learned OOP with Smalltalk/V, wrote some apps with Actor on Windows and, when first confronted with C++ I closed the book in horror and only came back 8 years later.

Isn't Smalltalk's "IDE" so, so much more all-knowing than any Java one?

It's not unbearable, I've done a huge amount of Java coding on the command-line. Yes, it's harder, but I strongly believe it's better to move onto Eclipse only after learning how the tools actually work underneath. It makes it far easier to debug weird issues with the IDE, and means you're not SOL if the IDE breaks, or you're on another person's computer - you're just a bit slower, and I'd argue that applies to all tools (custom aliases, bash scripts, vim config) and not just IDEs.

This may be a problem if your truck breaks down in the desert, your 3G is out of range, and you never thought to install an IDE beforehand. But apart from that, when would you find yourself writing Java without an IDE?

When you're using somebody else's computer.

When was the last time any professional software developer was forced to do real development on someone else's computer? In that case, you'd likely miss your keyboard, desktop shortcuts, shell aliases, and browser bookmarks much more than your IDE.

So you have never been asked for help from a junior developer whose development setup is different than yours?

Help in this context means discussion, advice, him showing me what he had done, me suggesting a quick solution, or talking him through what to type, similar to pair programming. If _I_ had to actually write more than a few lines of code, we'd just switch to my computer.

We have a standard IDE set-up for our team and require more junior devs to use it -- at least when starting out. It makes pairing with them easier.

The junior developer who insists on not using an IDE working with Java is precisely the one who needs the most help.

Edge case. I develop on my own machine. (That said, I absolutely hate the verbosity of Java and love Node.js for exactly this reason.)

Is this really a concern? It's like blaming C for needing an editor, what's the big deal about using an IDE for some languages. It makes you more productive and they are not heavyweight by modern laptop standards.

And even without an IDE, it's hardly that difficult to use namespaces. As has been mentioned here already, they solve a real problem, are necessary, and are much better than not having them.

This just seems like ignorant hating.

You are assuming the IDE makes everyone more productive. For some people that is absolutely not the case. And the idea that you need to use an IDE for a bunch of boilerplate is abhorrent to them.

It's hard for a Java zealot to understand this position for some reason.

It's pretty hard to imagine an IDE not making a Java programmer more productive. The difference is truly astounding.

Is that why Lisps are so poorly thought of? They're designed around the idea of having a whole REPL all the time.

Do you even know what a REPL is?

It's just a way to run code interactively. "A whole REPL" is something that usually fits in 10 lines of code. With comments.
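To make "fits in 10 lines" concrete, here is a toy read-eval-print loop sketched in Java (the class name and the tiny, sum-only evaluator are made up for illustration; real Lisp REPLs evaluate the full language):

```java
import java.util.Scanner;

public class TinyRepl {
    // Toy evaluator: only handles sums like "1 + 2 + 3".
    static int eval(String line) {
        int sum = 0;
        for (String part : line.split("\\+")) sum += Integer.parseInt(part.trim());
        return sum;
    }

    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        while (in.hasNextLine()) {                   // Read
            System.out.println(eval(in.nextLine())); // Eval, Print
        }                                            // Loop
    }
}
```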

That is an unusually long namespace. It says more about the coding style of the programmer that wrote it than it does about Java.

After a while, most Java developers start thinking it's ok to build deep hierarchies of modules.

Complaining about Java as a language for providing a namespace feature that's basically identical to every other language's just shows a desire to hate on Java.

It's OK to build hierarchies of modules. It's not OK to make them deeper than needed and violate YAGNI with sparse hierarchies prepared for situations that'll never arise. Not all programmers fall into this trap, but due to IDE support, Java developers are especially vulnerable: for them, it costs nothing, because programs do the job of keeping everything neatly over-organized.

I don't hate Java. In fact, along with Python and C, it's one of my favorite (and more used) languages. But I do realize it has a huge potential for being abused in horrible ways.

You find similarly named namespaces, packages, modules in most languages that support modules.

Don't blame the language for what some guys do with them.

If they would remove all the redundant information it would be a lot better.

import edu.syr.pcpratts.*;

The only reason you see the fully qualified class names is that the IDE writes them automatically for you. If there were no IDEs, everyone would be using wildcard imports.

Wildcard imports are a bad idea regardless of whether you're using an IDE or not due to the risk of future name conflicts.

This sentiment is dogma from long ago, when there was supposedly a slight performance advantage to individual imports. Today there is no such advantage. There is a risk of a namespace collision, but in practice it is very rare and easy to identify when it happens. The main reason the practice remains is default IDE configuration.

On the flip side, wildcard imports make coding much more fluid and concise. How many times have you had to hit ctrl+space two or more times in the same declaration to add import statements for interfaces and classes from the same package? IMHO, explicit imports greatly decrease the usefulness of the static import feature.

There has never been any runtime performance difference between individual and wildcard imports. Compiled byte code is identical.

Name conflicts are a compile error - the worst thing that can happen is that you have to change it back to an explicit include.
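A concrete case: java.util and java.sql both define Date, so with two wildcard imports the simple name is ambiguous and won't compile. A single explicit import resolves it, since single-type imports take precedence over on-demand (wildcard) imports (class name is mine):

```java
import java.sql.*;
import java.util.*;
import java.util.Date; // explicit import wins over both wildcards

public class WildcardDemo {
    public static void main(String[] args) {
        // Without the explicit import above, 'Date' would fail to
        // compile: "reference to Date is ambiguous".
        Date epoch = new Date(0L);
        System.out.println(epoch.getTime());
    }
}
```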

They are not a bad idea if you use an IDE rather than a plain editor. An IDE lets you find usages of a type instantly. Anyway, such conflicts are rare.

True only if the code cannot be modified (say, because it's packaged in a jar and closed source); otherwise this is trivially worked around the very few times it ever happens.

This issue is irrelevant for jar files. Compiled byte code contains fully qualified class names, not import statements.

My point was more that Java namespaces can be easy to use.

Pretty much everyone uses an IDE anyway so it's not like namespaces are ever an issue unless you do come across that rare name clash.

> Pretty much everyone uses X anyway so it's not like Y are ever an issue

This 'its good enough, and everyone does it anyway' attitude leads us to incredibly damaging status quos.

Often though, "good enough" really is good enough. You can spend untold man hours trying to get that last little bit of perfection and never really achieve it anyways.

It would be good to post the code for the performance tests in the slides. https://raw.github.com/pcpratts/rootbeer1/master/doc/hpcc_ro...

this sort of thing is why NVIDIA is supporting LLVM:


in other words, with that and the right frontend, you can take Language X, compile to LLVM IR, and run it through the PTX backend to get CUDA.

however, in the grand scheme of things, this probably doesn't make GPU programming significantly easier for your average developer (you still have to deal with big, complicated parallel machines); what it really does is ease integration into codebases in those different languages.

A comparison with AMD's Java offering, Aparapi (http://developer.amd.com/zones/java/aparapi/pages/default.as...), would be interesting.

Looking through the code, this seems to do the exact same thing as Aparapi. I'm surprised this received funding, considering the high-quality implementation AMD has already put together.

The headline is misleading; only a small subset of Java can be ported to the GPU. It works great for inner math loops and such, but not for higher level problems. Even if the author managed to find a way to translate more complicated problems (I see object locking in the list of features), they would be better suited to run on a CPU, or refactored to avoid locks.
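As an illustration of the kind of inner math loop that does map well: each iteration below is independent, touches no shared mutable state, and takes no locks, so every index could become one GPU thread (the class and method names here are mine, not Rootbeer's):

```java
public class VectorAdd {
    // Data-parallel: iteration i reads only a[i] and b[i] and writes
    // only out[i], so iterations can run in any order, or all at once.
    static float[] add(float[] a, float[] b) {
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] + b[i]; // each i could be one GPU thread
        }
        return out;
    }

    public static void main(String[] args) {
        float[] r = add(new float[]{1, 2, 3}, new float[]{4, 5, 6});
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```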

"Rootbeer was created using Test Driven Development and testing is essentially important in Rootbeer"

I'm not sure what the "essentially" means here, but this is the first "big" program I'm aware of that name-checks TDD, and a counter-example to my theory that programs where much of the programmer effort goes into algorithms and data structures are not suited to TDD.

Was the TDD approach "pure"? (Only one feature implemented at a time, with absolutely no design thought given to what language features might need to be implemented in the future.)

I'd think a project like this is ideally suited to TDD: you know what the results should be for most operations, and they are easily testable. It's the same reason the Perl 6 test suite has been so valuable to the various Perl 6 compiler projects. (Not that any of them claim to use TDD.)

I agree, the biggest philosophical problem with TDD in my mind is that it neglects design -- a little foresight can make all the difference, as in that sudoku case study that went off the rails. But if you are building to a vetted, stable spec as in the case of Java, design isn't nearly as important as compliance so TDD can shine.

It's pretty cool. Of course, you are probably using that GPU in a desktop, but an on-die GPU in a server-class machine? Something to consider.

The GPU in a desktop is the only interesting kind of GPU. The built-in GPU in servers is ten years behind the current cutting edge on desktops. Though servers can have PCI Express slots for modern GPU installation.

I think most of the GPU-based bitcoin farmers are using desktop hardware, but I might be wrong.

Well, there's really two classes of server GPUs.

One is the tiny ancient GPUs used to drive VGA outputs on servers. Those are hardly even worth talking about; they're only there so that you can hook a monitor up in an emergency.

And then there are real server GPUs, like nVidia Tesla stuff. Those typically don't even have video outputs, but they're on par with modern high-end gaming GPUs, possibly even better at some tasks.

I believe most bitcoin farmers are still using desktop GPUs, but that's largely just because they're running relatively simple kernels, all things considered, and because desktop GPUs are cheaper and easier to source.

To my knowledge, bitcoin farmers found the hash rates of the Tesla-based EC2 instances far slower than consumer-grade high-performance graphics cards.

I'm not clear on the reason, but desktop GPUs from nVidia and AMD with similar graphics performance have very different bitcoin mining performance. The nVidia cards get about half the hash rate of a comparable AMD GPU.

AMD GPUs are better optimized for integer ops, whereas on nVidia integer ops are basically a second class citizen.

Also, AMD tends to have more cores at a lower clock rate than nVidia. For embarrassingly parallel integer operations (like hashing), AMD blows nVidia out of the water.

The built-in GPU on Xeons/Opterons/similar hardware is closer than you think. Not everybody needs Quadro/FirePro performance, but a lot of people still want workstation/CAD graphics support in the drivers.

Xeon and Opteron CPUs generally don't have integrated GPUs. In fact, the primary difference between Intel's desktop and server LGA1155 CPUs is that most of the server equivalents have the GPU disabled (probably to save power).

I was talking about what's coming, not what's available. Expect workstation CPUs with good-enough integrated GPUs next year.

Does this simply run your java code on the GPU, or does it parallelize your code automatically? The latter would be really cool.

This is indeed IMHO a central question in this topic, since parallelizing an algorithm is not an easy task.

So I guess you still end up writing your algorithm in OpenCL / Cuda and maybe use the serialization provided by this lib.

Update: (Just read the hpcc_rootbeer.pdf slides.) You write your _parallelized_ implementation of an algorithm in Java, and it will be executed on the GPU.
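From the slides, the programming model looks roughly like the sketch below. The Kernel interface here is a simplified local stand-in for Rootbeer's own (which lives under edu.syr.pcpratts.rootbeer.runtime), and I run the kernels serially where the real library would ship them to the GPU:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for Rootbeer's Kernel interface.
interface Kernel { void gpuMethod(); }

class ScaleKernel implements Kernel {
    private final int[] data;
    private final int index;

    ScaleKernel(int[] data, int index) {
        this.data = data;
        this.index = index;
    }

    // Each kernel instance handles one element: you hand the runtime
    // an already-parallelized decomposition of the work.
    public void gpuMethod() { data[index] *= 2; }
}

public class RootbeerSketch {
    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        List<Kernel> jobs = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            jobs.add(new ScaleKernel(data, i));
        }
        // Stand-in for the library dispatching jobs to the GPU; here we
        // just run them serially to show the decomposition.
        for (Kernel k : jobs) k.gpuMethod();
        System.out.println(data[0] + " " + data[1] + " " + data[2]);
    }
}
```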

I prefer programming in CUDA over programming in Java. However, I have a lot of respect for the Java runtime.

If Rootbeer or something similar allows me to program CUDA stuff in Clojure, then I am impressed and excited.

Has anybody used Rootbeer with language which compiles to Java bytecode (like Clojure)?

It would be interesting to see how functional languages designed for parallelism perform on gpu.

This was my first thought on reading this! As other people have said, you'd have to think in GPU terms, but from a quick glance at the code it looks like you could just look into the compile functions. I hope someone out there beats me to this, because they'll do a better job :D

> The license is currently GPL, but I am planning on changing to a more permissive license. My overall goal is to get as many people using Rootbeer as possible.

It would be bad to compromise the freedoms of the users in order to be able to limit the freedoms of more of them.

Any reason why the GPLv3 would be considered unsuitable? How about the LGPLv3?

The title captures the true character of Java: write once, run almost everywhere.

Don't spill the beans :)

  4. sleeping while inside a monitor.
... Can someone clarify what this is?

  // In Java, all instances of Object are also monitors.
  // See https://en.wikipedia.org/wiki/Monitor_%28synchronization%29
  Foo myMonitor = new Foo();

  // To "enter" a monitor, you use 'synchronized'.
  synchronized (myMonitor) {
    // Inside the monitor, we could call Thread.sleep().
  }
This is a very simplistic problem case. However, it is very possible for this to become a bigger problem. Because I can call arbitrary code when "inside" a monitor, it is very possible to call a method that does a sleep incidentally. (e.g. many implementations of IO operations will require some sort of sleep.)
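A self-contained version of the pattern, for anyone who wants to run it: perfectly legal on the JVM, but apparently one of the cases quoted above that Rootbeer cannot translate:

```java
public class MonitorSleep {
    private static final Object lock = new Object();

    public static void main(String[] args) throws InterruptedException {
        synchronized (lock) { // enter the monitor
            Thread.sleep(10); // sleep while still holding it
        }
        System.out.println("released");
    }
}
```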

Seems much nicer than coding directly for CUDA or OpenCL

The host-side application could already be written in Java, at least for OpenCL applications[1] (the kernel--that is, the GPU code--was still written in OpenCL). My only concern is that Java will make it more difficult to find out exactly what's going on in the kernel code, and hence more difficult to optimise.

Now, this also doesn't solve the issue of needing to consider the parallel architecture when coding the kernel to actually make use of the hardware. Nevertheless, kudos to the guys behind this.

[1] http://code.google.com/p/javacl/

At first glance this seems very impressive.

Any security implications? Malware running on the GPU?

This is already possible, but it never was a problem. The GPU is just a slave to the CPU.

The biggest problem I could see would be your computer becoming part of a botnet; it could be used for brute-force encryption cracking. Again, this was already possible with plain CUDA or OpenCL.
