
Haskell running on top of Apache Spark - rkrzr
http://www.tweag.io/blog/haskell-meets-large-scale-distributed-analytics
======
rkrzr
This is a very interesting use case of the new static pointer feature in GHC:

They first generate JVM bytecode from the Haskell source using their own code
generator. This code is then shipped to the Spark cluster, which executes the
bytecode until it hits a static closure; the JVM then calls back into Haskell
through a static pointer to execute the closure, and afterwards continues with
the bytecode.

~~~
mboes
Yup, pretty close. Just to be clear, though: we don't generate any JVM
bytecode for any of the Haskell code. We compile that to native code using GHC
as usual, and then dynamically load the resulting shared objects into the JVM.

The static pointer thing is what allows us to share (practically) arbitrary
Haskell closures with Scala/Java. And more than that - to get Spark's Scala
code to ship these Haskell closures between machines across a cluster.
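For readers who haven't seen the extension, here is a minimal sketch of what a static pointer looks like in plain GHC. This shows only the language feature itself, not sparkle's API; `double` is an illustrative function, not anything from the project:

```haskell
{-# LANGUAGE StaticPointers #-}

-- GHC's StaticPointers extension: a 'static' expression over a
-- closed (no free dynamic variables) term yields a StaticPtr, whose
-- key is a stable fingerprint valid in any process running the same
-- binary. The key, not the closure itself, is what goes over the wire.
import GHC.StaticPtr

double :: Int -> Int
double x = x * 2

doublePtr :: StaticPtr (Int -> Int)
doublePtr = static double

main :: IO ()
main = do
  -- staticKey gives a serializable fingerprint; a remote peer running
  -- the same binary can resolve it back with unsafeLookupStaticPtr.
  print (staticKey doublePtr)
  -- Locally, deRefStaticPtr recovers the underlying value.
  print (deRefStaticPtr doublePtr 21)
```

Because only the key is serialized, both sides must be running the same binary, which is exactly the situation when the same shared object is loaded on every node of the cluster.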

~~~
rkrzr
Ah, thanks for the clarification!

So just for my understanding: the interop with the JVM works via JNI then?

The native code makes calls through the JNI, and the Java bytecode calls back
into the Haskell native code using the static pointers?

~~~
mboes
That's right!
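A rough sketch of the JVM-to-Haskell direction, using only the standard FFI (the module and function names here are hypothetical, not sparkle's actual API). `foreign export` makes GHC emit a C-callable symbol in the compiled shared object, which the JVM can then bind as a JNI `native` method and call directly:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

-- Hypothetical entry point: the JVM hands us an argument and we run
-- some Haskell on it. A real implementation would receive a static
-- pointer key and resolve it with GHC.StaticPtr.unsafeLookupStaticPtr
-- before applying the closure; here a stand-in computation is used.
module ClosureEntry where

import Foreign.C.Types (CLong (..))

runClosure :: CLong -> IO CLong
runClosure x = return (x * 2)

-- Emits a symbol 'runClosure' with C calling convention into the
-- shared object, callable from a matching JNI native method.
foreign export ccall runClosure :: CLong -> IO CLong
```

The Haskell-to-JVM direction goes the other way: `foreign import ccall` bindings to the C functions exposed through the JNI `JNIEnv` interface (`FindClass`, `CallObjectMethod`, and friends).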

