
FastR: An implementation of the R language in Java [pdf] - susi22
http://www.oracle.com/technetwork/java/jvmls2013vitek-2013524.pdf
======
brendano
The programming language analysis is pretty interesting, but you have to ask,
what's the point of a brand-new Java implementation? R isn't just a
programming language, but it's a software framework/ecosystem. They mentioned
this in the slides, but it's problematic because R _crucially_ relies on C and
Fortran interaction (which I thought the JVM can't do efficiently, since it
doesn't like giving C/Fortran raw memory access to its internals). Decades of
work has gone into highly optimized Fortran linear algebra libraries, for
example -- which R and all the other high-level numerical languages
(NumPy/SciPy, Matlab, Julia) use. And many of the CRAN packages (the
availability of which is a major reason anyone uses R in the first place) are
partly or mostly C/Fortran code.

There are many other R implementation efforts going on right now -- Radford
Neal lists a few (as well as his own) here:
[http://radfordneal.wordpress.com/2013/07/24/deferred-evaluat...](http://radfordneal.wordpress.com/2013/07/24/deferred-evaluation-in-renjin-riposte-and-pqr/)

The presentation focuses on the R programming language, which they nicely show
has all sorts of misfeatures that impede rapid execution. If you're not going
to aim for compatibility with R and CRAN, you might as well start from
scratch with design and performance in mind, as in Julia:
[http://julialang.org/](http://julialang.org/)

~~~
pron
> ... C and Fortran interaction (which I thought the JVM can't do efficiently,
> since it doesn't like giving C/Fortran raw memory access to its internals).

As of 2002 (JDK 1.4) Java has excellent integration with native memory (you
can freely pass pointers from C/FORTRAN to Java and vice versa[1]).

There are numerous Java math libraries that use BLAS/LAPACK already[2]. In
fact, AFAIK, _most_ Java matrix math libraries use FORTRAN code (at least as
an option).

[1]: Java side:
[http://docs.oracle.com/javase/7/docs/api/java/nio/ByteBuffer...](http://docs.oracle.com/javase/7/docs/api/java/nio/ByteBuffer.html)
C side:
[http://docs.oracle.com/javase/7/docs/technotes/guides/jni/sp...](http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/functions.html#nio_support)

[2]: For example, [https://github.com/fommil/matrix-toolkits-java](https://github.com/fommil/matrix-toolkits-java),
[http://mikiobraun.github.io/jblas/](http://mikiobraun.github.io/jblas/)
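For illustration, here is a minimal sketch of the Java side of that mechanism. It assumes nothing beyond the standard java.nio API; the class and method names (DirectBufferSketch, allocateDoubles, sum) are made up for this example. A direct buffer lives outside the Java heap, and native code can obtain its address through JNI's GetDirectBufferAddress, so a C or Fortran routine could read and write the same memory without a copy.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical example class, not part of any library.
final class DirectBufferSketch {
    /** Allocates off-heap memory for n doubles in the platform's native byte order. */
    static ByteBuffer allocateDoubles(int n) {
        return ByteBuffer.allocateDirect(n * Double.BYTES).order(ByteOrder.nativeOrder());
    }

    /** Sums the first n doubles using absolute (index-based) reads. */
    static double sum(ByteBuffer buf, int n) {
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            total += buf.getDouble(i * Double.BYTES);
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 4;
        ByteBuffer buf = allocateDoubles(n);
        for (int i = 0; i < n; i++) {
            buf.putDouble(i * Double.BYTES, i + 1.0); // stores 1.0, 2.0, 3.0, 4.0
        }
        System.out.println(buf.isDirect()); // prints "true"
        System.out.println(sum(buf, n));    // prints "10.0"
    }
}
```

The native-order call matters when the buffer is shared with native code, since ByteBuffer defaults to big-endian regardless of the platform.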

~~~
beagle3
Is there any way to ensure that accessing ByteBuffers (or whatever) is fast?

I had used memory mapped buffers (which should be equivalent in performance),
and there was no way to make the JIT inline access to these arrays. It was all
calls (indirect, not properly branch predicted at that). Equivalent code to
C++, running 10 times slower, with no way to speed it up.

(The reason I was using memory maps, if you insist, is a 2GB read-only
dataset used by multiple processes at the same time. I went to C++
eventually, because there was no way to get reasonable performance from Java,
in either memory use or speed. This was circa 2010.)
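For readers who have not used the API, a read-only mapping of the kind described here might look roughly like the sketch below. It uses only standard java.nio calls; the class name, the helper methods, and the tiny sample file are assumptions made for the example, and it says nothing about how the JIT will compile the access loop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class, not part of any library.
final class MappedReadSketch {
    /** Writes the ints 0..count-1 (little-endian) to a temp file and returns its path. */
    static Path writeSample(int count) {
        try {
            Path path = Files.createTempFile("mapped-sketch", ".bin");
            path.toFile().deleteOnExit();
            ByteBuffer out = ByteBuffer.allocate(count * Integer.BYTES)
                                       .order(ByteOrder.LITTLE_ENDIAN);
            for (int i = 0; i < count; i++) {
                out.putInt(i);
            }
            Files.write(path, out.array());
            return path;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Maps the whole file read-only and sums its contents as little-endian ints. */
    static long sumInts(Path path) {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer map =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            map.order(ByteOrder.LITTLE_ENDIAN);
            long total = 0;
            while (map.remaining() >= Integer.BYTES) {
                total += map.getInt();
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path sample = writeSample(10);
        System.out.println(sumInts(sample)); // prints "45" (0 + 1 + ... + 9)
    }
}
```

Note that a single FileChannel.map call cannot cover more than Integer.MAX_VALUE bytes, so a dataset larger than about 2GB has to be split across several mappings.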

~~~
pron
Yes. High performance Java code sometimes makes use of JVM intrinsics,
accessible through the sun.misc.Unsafe class. Those are JITted down to a
simple memory access instruction. That class also has intrinsics for CAS, and
in JDK 8 it's got intrinsics for different memory fences as well. Those calls
are compiled to a single native instruction.

For example, the Java Chronicle library[1] uses these techniques, as well as
memory mapped files, to implement fast persistent message queues.

[1]: [https://github.com/OpenHFT/Java-Chronicle](https://github.com/OpenHFT/Java-Chronicle)

~~~
beagle3
Cool. Was this in Java 6? (circa 2010 - I couldn't find a way to do it back
then). Also, why would you need (even on JDK7 or JDK8) unsafe access to jit a
memory mapped access inline? Is there an underlying philosophical reason, or
is it just that they never got to do it?

~~~
pron
Yes, it was in Java 6 (except for direct access to fences, which has been
added in Java 8).

The sun.misc.Unsafe class is used extensively by JDK classes, and is meant for
internal use. It provides intrinsics that are translated to a single machine
instruction. Normally, you don't use the class directly. For example, you use,
say, AtomicInteger for CAS operations (which internally uses s.m.Unsafe) or the
ByteBuffer class (which internally uses s.m.Unsafe for direct pointer access).
The JDK classes add all sorts of protection (like range checks) around
s.m.Unsafe, but if you know what you're doing, you can use s.m.Unsafe directly
and skip some of those protections (usually adding ones more pertinent to
your domain) for performance gains that may be significant depending on your
use case.
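As a small illustration of the "normal" route, the sketch below does a CAS retry loop through AtomicInteger rather than touching s.m.Unsafe directly. The class and method names (CasSketch, updateMax) are invented for the example; only AtomicInteger.get and AtomicInteger.compareAndSet are real JDK calls.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class, not part of any library.
final class CasSketch {
    /**
     * Atomically raises 'target' to at least 'candidate' and returns the
     * resulting value. Retries if another thread changed the value between
     * the read and the compare-and-set.
     */
    static int updateMax(AtomicInteger target, int candidate) {
        while (true) {
            int current = target.get();
            if (candidate <= current) {
                return current; // nothing to do, existing value is already larger
            }
            if (target.compareAndSet(current, candidate)) {
                return candidate; // our write won
            }
            // Lost the race: loop and re-read the current value.
        }
    }

    public static void main(String[] args) {
        AtomicInteger max = new AtomicInteger(5);
        System.out.println(updateMax(max, 3)); // prints "5"
        System.out.println(updateMax(max, 9)); // prints "9"
    }
}
```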

------
th0br0
FWIW, there's also Renjin that does this:
[https://github.com/bedatadriven/renjin](https://github.com/bedatadriven/renjin)

------
ubasu
Given the problems with Java's floating-point implementation [1], would this
be reliable for statistical analysis?

[1]
[https://news.ycombinator.com/item?id=6585828](https://news.ycombinator.com/item?id=6585828)

~~~
pron
1\. It is more a philosophical disagreement than a problem, and whatever the
semantics you want, the JVM does not limit you. The disagreement revolves
around the Java compiler and language semantics, not the JVM.

2\. The math in FastR, if I understand the presentation correctly, is
performed by FORTRAN libraries anyway. Using battle tested FORTRAN libraries
for matrix computations is common practice in C, Java, Julia, Matlab and most
other environments. They basically all share the same underlying matrix math
code.

~~~
StefanKarpinski
I do think that Kahan's objections to Java's floating-point support were more
than just philosophical, although it's unclear to me at this point how many of
them still apply. The biggest issue that seems to still remain is that you
can't change rounding modes or check flags, both of which are essential for
verifying numerical stability and correctness. Far worse than any
floating-point issues on the JVM is that array indices are 32-bit ints, so
you're limited to about 2^31 elements per array (2GB for a byte array) before
you have to take bizarre measures to access larger amounts of data.
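One such workaround, sketched here purely as an illustration, is to split the data into fixed-size chunks and translate a long index into a (chunk, offset) pair. The class name BigDoubleArray and the chunk size are arbitrary choices for this example, not an existing library API.

```java
// Hypothetical example class, not part of any library.
final class BigDoubleArray {
    private static final int CHUNK_BITS = 20;              // 2^20 doubles (8 MiB) per chunk
    private static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    private static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private final double[][] chunks;
    private final long length;

    BigDoubleArray(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length: " + length);
        }
        this.length = length;
        // Round up so a partial final chunk is still allocated.
        int chunkCount = (int) ((length + CHUNK_SIZE - 1) >>> CHUNK_BITS);
        chunks = new double[chunkCount][];
        long remaining = length;
        for (int c = 0; c < chunkCount; c++) {
            chunks[c] = new double[(int) Math.min(CHUNK_SIZE, remaining)];
            remaining -= chunks[c].length;
        }
    }

    long length() {
        return length;
    }

    double get(long index) {
        checkIndex(index);
        return chunks[(int) (index >>> CHUNK_BITS)][(int) (index & CHUNK_MASK)];
    }

    void set(long index, double value) {
        checkIndex(index);
        chunks[(int) (index >>> CHUNK_BITS)][(int) (index & CHUNK_MASK)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }

    public static void main(String[] args) {
        BigDoubleArray data = new BigDoubleArray(3_000_000L); // spans three chunks
        data.set(2_500_000L, 1.5);
        System.out.println(data.get(2_500_000L)); // prints "1.5"
    }
}
```

The demo uses a small length so it runs anywhere; the same indexing works for lengths beyond Integer.MAX_VALUE given enough heap, at the cost of an extra indirection on every access.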

Although it is standard in high-level systems to call out to a BLAS library
[1], for some inexplicable reason it seems that both R and NumPy use the
reference BLAS by default, which is quite slow – around 4x slower than better
BLASes. Matlab ships with Intel's proprietary MKL, which includes a very fast
BLAS implementation, while Julia ships with OpenBLAS, which is a similarly
fast open source BLAS implementation derived from the legendary GotoBLAS [2].
Since all BLAS implementations share a common Fortran ABI, it's easy to swap
them out, but it's not quite true that all of these systems are using the
exact same Fortran code.

[1]
[https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprogra...](https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms)

[2]
[https://en.wikipedia.org/wiki/GotoBLAS](https://en.wikipedia.org/wiki/GotoBLAS)

~~~
pron
There is a lot of work going into designing Java arrays "2.0", including
support for "long" arrays, immutable arrays, and arrays of structs.

------
susi22
Video presentation for the slides:

[http://download.oracle.com/technetwork/java/javase/community...](http://download.oracle.com/technetwork/java/javase/community/H264_1280x720/20144523.mov)

------
noahmarc
The Relite mention at the end looks very worthwhile. We were just about to
begin a large rewrite of analysis code from R to CUDA; Relite has the
potential to save the effort of rewriting our existing code.

~~~
tiarkrompf
Relite author here. I'd love to learn more about your use cases, feel free to
get in touch!

~~~
noahmarc
Definitely, thank you for the offer. I'm running through the install now but
will send you an email once I've run a couple tests. I see your contact info
is listed on your home page.

------
moondowner
Worth noting is that FastR is a JVM implementation of R that uses Truffle and
Graal.

[http://openjdk.java.net/projects/graal/](http://openjdk.java.net/projects/graal/)

[https://wiki.openjdk.java.net/display/Graal/Publications+and...](https://wiki.openjdk.java.net/display/Graal/Publications+and+Presentations)

------
Demiurge
Are there any tools to convert R to JS or Python, or... any other common
language that doesn't require a 60MB runtime distribution?

~~~
chubot
That's a really bizarre reason not to use R. Python's distribution is about
30MB if I recall. It doesn't really make much sense to convert R to JS or
Python, since the semantics are so different.

If you don't want to use R, Pandas in Python provides very powerful data
frames (which are likely faster for many cases). However, it depends on NumPy,
matplotlib, and a few other libraries, which probably total more than 60 MB.

~~~
neeee
Python 3's installed size is 90MB here; Python 2's is 60MB.

~~~
Demiurge
The good thing is that Python is pretty ubiquitous, as is virtualenv :)

------
knodi
more java -_- I need less java in my life, not more.

~~~
AsmMAn
Which platforms are you targeting?

