
Programming R at native speed using Haskell - psibi
http://www.tweag.io/blog/programming-r-at-native-speed-using-haskell
======
baldfat
I love R and I love programming in it. I have always thought I need to learn
Julia for a few reasons, including speed, but the R community keeps coming
out with great answers to the very questions that keep me interested in
Julia. I don't see how this would speed up my typical R programming or code
enough to be worth learning Haskell. What am I missing? And yes, I read the
whole article.

I have found Haskell very difficult to use with Cabal and its package
management. The reason I went to Haskell was to teach myself functional
programming. After struggling and going through 2 books I still felt like I
hated Haskell because of how hard it was to make the environment just work. I
work in three different locations, and getting all 5 computers to work with
Haskell was a serious pain. This led me to see what else is out there. It led
me to
[https://www.coursera.org/course/proglang](https://www.coursera.org/course/proglang)
which introduced me to Racket. I loved it and felt that Racket was the
perfect fit for me.

~~~
peatmoss
I've long had an unrealized dream of doing something similar with R and Racket
as the HaskellR people have done. I've thought that it would be cool to:

1\. Implement R or something very akin to R as a Racket language. I'd call it
"Arket" because it is far more Googleable than R or Racket ;-) Functions would
be callable either from Arket or Racket.

2\. Ensure that all my important R packages work with Arket, either by porting
or by virtue of similarity between Arket and R.

3\. Use Racket for everything.

I think this would be great until I realize I would need to do a _lot_ of work
to create this.

~~~
baldfat
Best part is:

> I'd call it "Arket" because it is far more Googleable than R or Racket

------
jmount
Looks fun. However, what is REALLY slow in R isn't the interpreter or the
garbage collector, but some of the object re-allocation patterns you
accidentally trigger. Here is my note making some comparisons and
recommendations on writing fast code in R:
[http://www.win-vector.com/blog/2015/07/efficient-accumulatio...](http://www.win-vector.com/blog/2015/07/efficient-accumulation-in-r/)
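
To illustrate the kind of re-allocation pattern meant here, a minimal sketch
in Python rather than R (it is not taken from the linked note, and the
function names are hypothetical). Growing an immutable sequence one element
at a time copies everything accumulated so far on every step, so the total
work is quadratic; filling a buffer allocated once up front is linear.

```python
def accumulate_by_copy(n):
    """Grow an immutable tuple one element at a time.

    Returns the squares 0..n-1 and the number of existing elements that
    had to be copied along the way (computed, not measured).
    """
    result = ()
    copied = 0
    for i in range(n):
        copied += len(result)        # the new tuple re-copies all prior elements
        result = result + (i * i,)   # allocates a fresh tuple every iteration
    return list(result), copied


def accumulate_preallocated(n):
    """Allocate the output once, then fill it in place."""
    result = [0] * n
    for i in range(n):
        result[i] = i * i            # no re-allocation inside the loop
    return result
```

Both functions return the same values; only the first pays for roughly
n*(n-1)/2 element copies.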

------
JanneVee
Well, Haskell is not the only option.

From .NET: [http://rdotnet.codeplex.com/](http://rdotnet.codeplex.com/). There
is also a type provider for F#:
[http://bluemountaincapital.github.io/FSharpRProvider/](http://bluemountaincapital.github.io/FSharpRProvider/)

There is also a C++ binding available:
[http://www.rcpp.org/](http://www.rcpp.org/)

~~~
pron
Or just run it with FastR
([https://bitbucket.org/allr/fastr](https://bitbucket.org/allr/fastr)). Here
it is in action (running with JS and C in the same REPL):
[https://dl.dropboxusercontent.com/u/292832/useR_multilang_de...](https://dl.dropboxusercontent.com/u/292832/useR_multilang_demo.mp4)

------
jzwinck
How does garbage collection work? Does this version improve on the R I know
from a year ago, which had a reference counting system with only three values:
0, 1, and 2+?
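
For readers unfamiliar with that scheme, here is a deliberately simplified
model in Python. It is only a sketch of the idea of a saturating "0, 1, 2+"
counter with copy-on-modify, not R's actual implementation; the `Value`
class and `set_element` function are hypothetical names.

```python
class Value:
    """A vector with a saturating sharing counter: 0, 1, or 2 ("2 or more")."""

    def __init__(self, data):
        self.data = list(data)
        self.named = 0

    def bind(self):
        # Saturating increment: the counter never exceeds 2 and, in this
        # model, is never decremented.
        self.named = min(self.named + 1, 2)
        return self


def set_element(value, index, item):
    """Modify in place if the value is unshared, otherwise copy first."""
    if value.named >= 2:
        value = Value(value.data).bind()  # fresh copy owned by a single name
    value.data[index] = item
    return value
```

Because the counter saturates, a value that was ever shared keeps being
copied on modification in this model, even after the other name is gone.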

~~~
rfergie
[https://tweag.github.io/HaskellR/docs/managing-memory.html](https://tweag.github.io/HaskellR/docs/managing-memory.html)

[https://tweag.github.io/HaskellR/docs/memory-allocation.html](https://tweag.github.io/HaskellR/docs/memory-allocation.html)

Both these links have more information on this.

As far as I can tell (and most of this is way over my head), the native R
garbage collection is used.

~~~
mboes
Yes - the R GC manages R objects allocated by the R interpreter, and
conversely the Haskell GC manages Haskell objects allocated by the Haskell
runtime.

------
ximeng
Does anyone have good links to documentation on quasiquotation in Haskell?
Last time I looked I struggled to find a good summary.

~~~
chriswarbo
I've been teaching myself Template Haskell recently, and found
[https://ocharles.org.uk/blog/guest-posts/2014-12-22-template...](https://ocharles.org.uk/blog/guest-posts/2014-12-22-template-haskell.html)
to be a really understandable explanation. It specifically _doesn't_ cover
quasiquotation, but it has many links to things which do.

------
Mikeb85
While this really has nothing to do with programming R, this is a neat way to
use Haskell for some statistics/data science tasks whilst making use of all
the work that has gone into R over the years.

I personally don't see this as a way of using R, rather a way of using
Haskell.

------
jbssm
I know almost nothing about Haskell (I tried to learn it some years ago but
never got past the basics).

Yet from what I'm seeing we are only calling R functions from Haskell. I'm
probably missing something in the explanation, but how does this speed up R?

~~~
dagw
Often what is slow in an R program is not the core analysis function you want
to call, but everything you have to do before and after you call that
function, i.e. turning your raw data into a data structure containing the
relevant data to feed into the function, and taking the output of the function
and munging it into whatever output format you actually need. This should help
speed up those bits quite a lot.

~~~
jbssm
I see, thanks. In my own code I suspect that it's the for loops I can't
vectorise that are making R take so long in some operations, so this is
where this project would help.

Must give it a try then.

~~~
Mikeb85
In R it's very easy to simply create a function in C++ or Fortran, then call
that. Plus there are plenty of existing functions that you can call instead of
using for loops in R (which really aren't idiomatic).

