
Rootbeer GPU Compiler Lets Almost Any Java Code Run On the GPU - doublextremevil
https://github.com/pcpratts/rootbeer1
======
wmf
I wonder if this is as big a win as it sounds. Regardless of what language
you're using, you have to "think GPU" to get any performance from GPUs. The
additional overhead of using CUDA/OpenCL syntax seems pretty small in
comparison.

~~~
relix
Yes, and it requires quite a low-level understanding of the architecture to
"think GPU". SIMD, warps, blocks, threads, different memory types, no
branching/identical branching per core, ... Some of this could probably be
abstracted away but you definitely need to be aware and adjust algorithms
appropriately. You can't just convert code and hope for the best, unless you
just want a slow co-processor.
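
To make that concrete, here's a tiny made-up example (hypothetical names, not code from the project): adjacent threads take different branches, so on the GPU the warp has to execute both paths one after the other and you lose roughly half the throughput for nothing:

      // Hypothetical sketch, not Rootbeer code: adjacent GPU threads take
      // different branches, so the hardware runs both paths for the whole
      // warp and throughput roughly halves.
      public class DivergentWork {
        static void perThread(int[] result, int threadId) {
          if (threadId % 2 == 0) {
            result[threadId] = threadId * threadId;    // path A
          } else {
            result[threadId] = -(threadId * threadId); // path B
          }
        }
      }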

------
kevingadd
I had forgotten just how much I hate java namespaces.

import edu.syr.pcpratts.rootbeer.testcases.rootbeertest.serialization.MMult;

This seems like a pretty amazing project if the claims are true, though - I
wasn't aware that CUDA was able to express so many of the concepts used to
implement Java applications. The performance data in the slides is certainly
compelling!

~~~
kaffeinecoma
It's nothing compared to the pain of debugging in a language that doesn't
have/encourage proper namespaces.

Ruby has Modules, but many, _many_ common libraries do not use them. I had fun
recently debugging a project that (through transitive dependencies) relied on
two different "progress bar" libraries, both with a class called "Progress",
and neither of which was namespaced. Namespaces solve a real problem.

~~~
pmr_
The thing with namespaces is that they also introduce a problem: the various
using clauses programming languages provide interact badly with common tools
like grep. An API that uses a naming convention to emulate namespaces (e.g. as
in Emacs Lisp) avoids that problem. Using clauses also make code harder to
follow when reading, especially when C++ things like SFINAE are around and
suddenly everybody is fully qualifying everything again to make the code look
and be unambiguous.

Of course, that can be solved with proper tooling, but some people are just
averse to using something more heavyweight than necessary.

------
sillysaurus
It would be good to post the code for the performance tests in the slides.
[https://raw.github.com/pcpratts/rootbeer1/master/doc/hpcc_ro...](https://raw.github.com/pcpratts/rootbeer1/master/doc/hpcc_rootbeer.pdf)

------
tmurray
this sort of thing is why NVIDIA is supporting LLVM:

[http://nvidianews.nvidia.com/Releases/NVIDIA-Contributes-
CUD...](http://nvidianews.nvidia.com/Releases/NVIDIA-Contributes-CUDA-
Compiler-to-Open-Source-Community-7d0.aspx)

in other words, with that and the right frontend, you can take Language X,
compile to LLVM IR, and run it through the PTX backend to get CUDA.

however, in the grand scheme of things, this probably doesn't make GPU
programming significantly easier for your average developer (as you still have
to deal with big complicated parallel machines); what it really does is ease
integration into various codebases in those different languages.

------
pjmlp
A comparison with AMD's Java offering, Aparapi
([http://developer.amd.com/zones/java/aparapi/pages/default.as...](http://developer.amd.com/zones/java/aparapi/pages/default.aspx)),
would be interesting.

~~~
wetherbeei
Looking through the code, this seems to do the exact same thing as Aparapi.
I'm surprised this project received funding, given the high-quality
implementation AMD has already put together.

The headline is misleading; only a small subset of Java can be ported to the
GPU. It works great for inner math loops and such, but not for higher level
problems. Even if the author managed to find a way to translate more
complicated problems (I see object locking in the list of features), they
would be better suited to run on a CPU, or refactored to avoid locks.
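
For a sense of what that GPU-friendly subset looks like in practice, here is the kind of inner math loop that translates cleanly (an illustrative sketch, not code from Rootbeer's test cases): primitive arrays, fixed bounds, no allocation, no locking.

      // Illustrative sketch only: one row of a dense matrix multiply over
      // primitive float arrays (row-major, n x n). No object allocation,
      // no locking, no exceptions -- the kind of loop that ports well.
      static void multiplyRow(float[] a, float[] b, float[] c, int n, int row) {
        for (int col = 0; col < n; col++) {
          float sum = 0.0f;
          for (int k = 0; k < n; k++) {
            sum += a[row * n + k] * b[k * n + col];
          }
          c[row * n + col] = sum;
        }
      }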

------
mjs
"Rootbeer was created using Test Driven Development and testing is essentially
important in Rootbeer"

I'm not sure what the "essentially" means here, but this is the first "big"
program I'm aware of that name-checks TDD, and a counter-example to my theory
that programs where much of the programmer effort goes into algorithms and
data structures are not suited to TDD.

Was the TDD approach "pure"? (Only one feature implemented at a time, with
absolutely no design thought given to what language features might need to be
implemented in the future.)

~~~
colomon
I'd think a project like this is ideally suited to TDD: you know what the
results should be for most operations, and they are easily testable. It's the
same reason the Perl 6 test suite has been so valuable to the various Perl 6
compiler projects. (Not that any of them claim to use TDD.)

~~~
sirclueless
I agree, the biggest philosophical problem with TDD in my mind is that it
neglects design -- a little foresight can make all the difference, as in that
sudoku case study that went off the rails. But if you are building to a
vetted, stable spec as in the case of Java, design isn't nearly as important
as compliance so TDD can shine.

------
ChuckMcM
It's pretty cool. Of course you are probably using that GPU in a desktop, but
an on-die GPU in a server-class machine? Something to consider.

~~~
SwellJoe
The GPU in a desktop is the only interesting kind of GPU. The built-in GPU in
servers is ten years behind the current cutting edge on desktops. Though
servers can have PCI Express slots for modern GPU installation.

I think most of the GPU-based bitcoin farmers are using desktop hardware, but
I might be wrong.

~~~
duskwuff
Well, there are really two classes of server GPUs.

One is the tiny ancient GPUs used to drive VGA outputs on servers. Those are
hardly even worth talking about; they're only there so that you can hook a
monitor up in an emergency.

And then there are real server GPUs, like nVidia Tesla stuff. Those typically
don't even have video outputs, but they're on par with modern high-end gaming
GPUs, possibly even better at some tasks.

I believe most bitcoin farmers are still using desktop GPUs, but that's
largely just because they're running relatively simple kernels, all things
considered, and because desktop GPUs are cheaper and easier to source.

~~~
jvc26
To my knowledge, bitcoin farmers found the hash rates of the Tesla-based EC2
instances far slower than those of consumer-grade high-performance graphics
cards.

~~~
sp332
I'm not clear on the reason, but desktop GPUs from nvidia and AMD that have
similar graphics performance have very different bitcoin mining performance.
The nvidia cards get about half the hash rate of a similar AMD GPU.

~~~
phamilton
AMD GPUs are better optimized for integer ops, whereas on nVidia integer ops
are basically a second class citizen.

Also, AMD tends to have more cores at a lower clock rate than nVidia. For
embarrassingly parallel integer operations (like hashing), AMD blows nVidia
out of the water.
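
Bitcoin mining is essentially repeated SHA-256, and the inner loop is almost nothing but 32-bit rotates, shifts, xors and adds. Just to illustrate the shape of the work (these are the standard SHA-256 sigma functions, not mining code):

      // The SHA-256 "small sigma" functions: pure 32-bit rotate/shift/xor.
      // A miner evaluates operations like these billions of times per
      // second, so raw integer throughput (and cheap rotates) decides the
      // hash rate far more than graphics performance does.
      static int smallSigma0(int x) {
        return Integer.rotateRight(x, 7) ^ Integer.rotateRight(x, 18) ^ (x >>> 3);
      }

      static int smallSigma1(int x) {
        return Integer.rotateRight(x, 17) ^ Integer.rotateRight(x, 19) ^ (x >>> 10);
      }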

------
winter_blue
Does this simply run your java code on the GPU, or does it parallelize your
code automatically? The latter would be really cool.

~~~
dlsym
This is indeed, IMHO, the central question here, since parallelizing an
algorithm is not an easy task.

So I guess you still end up writing your algorithm in OpenCL / CUDA and maybe
use the serialization provided by this lib.

Update: (Just read the hpcc_rootbeer.pdf slides.) You write your
_parallelized_ implementation of an algorithm in Java - and it will be
executed on the GPU.
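
From the README, the pattern looks roughly like this (a sketch from memory, so treat package and method names as approximate): you express the parallelism yourself by creating one Kernel object per GPU thread, then hand the whole list to Rootbeer, which serializes the objects to the GPU and runs all the gpuMethod() calls there.

      // Rough sketch based on the README; package/method names approximate.
      import java.util.ArrayList;
      import java.util.List;
      import edu.syr.pcpratts.rootbeer.runtime.Kernel;
      import edu.syr.pcpratts.rootbeer.runtime.Rootbeer;

      class ScaleKernel implements Kernel {
        private final float[] data;
        private final int index;

        ScaleKernel(float[] data, int index) {
          this.data = data;
          this.index = index;
        }

        // This is the method that gets cross-compiled and run on the GPU.
        public void gpuMethod() {
          data[index] = data[index] * 2.0f;
        }
      }

      class Launcher {
        public static void main(String[] args) {
          float[] data = new float[4096];
          List<Kernel> kernels = new ArrayList<Kernel>();
          for (int i = 0; i < data.length; i++) {
            kernels.add(new ScaleKernel(data, i)); // one kernel per element
          }
          new Rootbeer().runAll(kernels);          // executed on the GPU
        }
      }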

------
AnthonBerg
I prefer programming in CUDA over programming in Java. However, I have a lot
of respect for the Java runtime.

If Rootbeer or something similar allows me to program CUDA stuff in Clojure,
then I am impressed and excited.

------
skardan
Has anybody used Rootbeer with a language which compiles to Java bytecode
(like Clojure)?

It would be interesting to see how functional languages designed for
parallelism perform on the GPU.

~~~
th0ma5
This was my first thought on reading this! As other people have said, you
would have to think in GPU terms, but a quick glance at the code suggests you
could just look into the compile functions. I hope someone out there beats me
to this because they'll do a better job :D

------
rbanffy
> The license is currently GPL, but I am planning on changing to a more
> permissive license. My overall goal is to get as many people using Rootbeer
> as possible.

It would be bad to compromise the freedoms of the existing users just in order
to reach more users whose freedoms can then be limited.

Any reason why the GPLv3 would be considered unsuitable? How about the LGPLv3?

------
DaNmarner
The title captures the true character of Java: write once, almost run, almost
everywhere.

~~~
malkia
Don't spill the beans :)

------
damncabbage

      4. sleeping while inside a monitor.
    

... Can someone clarify what this is?

~~~
mkross

      // In Java, all instances of Object are also a monitor.
      // See https://en.wikipedia.org/wiki/Monitor_%28synchronization%29
      // (Foo here stands for any class.)
      Foo myMonitor = new Foo();

      // To "enter" a monitor, you use 'synchronized'.
      synchronized (myMonitor) {
        // Inside the monitor, we could do a Thread.sleep (note that the
        // enclosing method must declare or catch InterruptedException).
        Thread.sleep(100);
      }
    

This is a very simplistic case, but it can easily become a bigger problem.
Because you can call arbitrary code while "inside" a monitor, it is very
possible to call a method that does a sleep incidentally (e.g. many
implementations of IO operations will require some sort of sleep or blocking
wait).
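
For example (hypothetical code, just to show the shape of the problem): blocking IO inside a synchronized block is effectively a sleep while holding the monitor.

      // Hypothetical example of an "incidental" sleep: readLine() can block
      // waiting for input while the monitor is still held.
      import java.io.BufferedReader;
      import java.io.IOException;
      import java.io.InputStreamReader;

      class IncidentalSleep {
        private final Object myMonitor = new Object();

        void handleInput() throws IOException {
          BufferedReader reader =
              new BufferedReader(new InputStreamReader(System.in));
          synchronized (myMonitor) {
            // Looks harmless, but readLine() may block for a long time,
            // so we are effectively sleeping while inside the monitor.
            String line = reader.readLine();
            System.out.println(line);
          }
        }
      }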

------
stephen272
Seems much nicer than coding directly for CUDA or OpenCL.

~~~
joestringer
The host-side application could already be written using Java, at least for
OpenCL applications[1] (the kernel--that is, the GPU code--was still written
in OpenCL). My only concern is that Java will make it more difficult to find
out exactly what's going on in the kernel code, and hence more difficult to
optimise.

Now, this also doesn't solve the issue of needing to consider the parallel
architecture when coding the kernel to actually make use of the hardware.
Nevertheless, kudos to the guys behind this.

[1] <http://code.google.com/p/javacl/>

------
pron
At first glance this seems very impressive.

------
algad
Any security implications? Malware running on the GPU?

~~~
Vulume
This is already possible, but it never was a problem. The GPU is just a slave
to the CPU.

The biggest problem I could see would be when your computer becomes part of a
botnet. Your computer could be used for brute-force encryption cracking. Again,
this was also possible with just CUDA or OpenCL.

