

The fastest Statistical Programming Language is …Javascript? - TalGalili
http://www.r-bloggers.com/the-best-statistical-programming-language-is-%E2%80%A6javascript/

======
disgruntledphd2
Urrgh. This came through on my feed earlier today, and I left a comment on it.
While javascript is fast (and can be used for many things), the real issue
with using it for stats is the lack of libraries. More specifically, as far as
I know it cannot interface with Fortran. That's a death knell for any
statistical programming language, as it means no LAPACK, and no-one (sane) is
going to rewrite all of those linear algebra libraries. So regardless of how
fast it is, it's not going to make it as a stats language.

That being said, it does make it easier to develop statistically aware web-
apps (a particular interest of mine), so that's definitely good.

~~~
olalonde
> no-one (sane) is going to rewrite all of those linear algebra libraries

Is it because it would take a long time or because it's inherently hard?

~~~
jules
It's easy enough to do a basic implementation, but getting good numerical
stability and good performance is hard (and in Javascript, it's pretty much
impossible with current implementations). See also the matrix multiplication
benchmarks in the post: JS is 60x slower than Matlab, even though it's already
using typed arrays. A naive triply nested for loop in C would probably perform
similarly to the JS, which is to say a lot slower than something optimized for
the characteristics of the processor (number of registers, vectorized floating
point and cache sizes mostly, I'm not sure if Matlab is using multiple cores
here).
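For reference, the naive triply nested loop being discussed looks like this in JavaScript over flat, row-major typed arrays. This is a sketch of the baseline, not an optimized kernel; a tuned BLAS routine would additionally block for cache, vectorize, and possibly use multiple cores, which is where the large gap comes from.

```javascript
// Naive n x n matrix multiply: c = a * b, all stored as flat
// row-major Float64Arrays. No cache blocking, no vectorization.
function matmul(a, b, n) {
  const c = new Float64Array(n * n);
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) {
      let sum = 0;
      for (let k = 0; k < n; k++) {
        sum += a[i * n + k] * b[k * n + j];
      }
      c[i * n + j] = sum;
    }
  }
  return c;
}
```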

~~~
bmuon
Yes, latest versions of Matlab use multithreading.

------
shocks
Eh... I'm getting tired of this "x is faster than y" business. JavaScript
might be fast in these examples, but that doesn't mean it's faster at
_everything_. Comparing programming languages is fruitless. X might be faster
than Y at Z, but that does not mean X is better suited than Y for all
applications.

JavaScript is faster than MATLAB etc. in these examples, but as mentioned
already it's slower at matrix multiplication, and I'm sure that's just one
example.

Does JavaScript have tonnes of libs? Does it have type-checking? Does it have
all those other things that I would be desperate for if I was performing
important calculations? Can I distribute the computing easily? Etc, etc.

Let's stop comparing programming languages as if they're one tool to do one
job. Different programming languages have different applications and are
suited for different jobs.

------
Tichy
The fascinating question is: why is Javascript fast? I suppose it is because
of the competition between the browsers. Yay for competition!

~~~
arunoda
Of course, yes: competition. It also has a good foundation and solid
commercial backing from big companies.

------
gte910h
Who the hell would use the relatively library-less JavaScript to do analysis?

Sorry: R, Matlab, Python, Syntax, and Fortran have actual libraries for this
stuff; JS does not.

~~~
xtracto
Yeah, I find it kind of funny when people compare a general purpose
programming language with a statistical software. In R you have libraries for
things like Approximate Bayesian Computation, parametric and non-parametric
statistics, and even neural networks.

Sure, you could achieve the same with a general purpose PL, but you would have
to implement everything from scratch.

~~~
gte910h
Several general-purpose languages (Fortran, Python, and Matlab) have very
nice statistical programming packages these days.

------
plg
What about straight C??? There are many great stats libraries in C (e.g.
Apophenia). C is not a hipster language but maybe (like polaroid filters in
Instagram) it's time for it to make a "retro" comeback. C kicks ass for speed.
Obviously.

~~~
msutherl
I'm waiting for somebody to make a CoffeeScript for C. No semicolons,
comprehensions, syntactic sugar for function pointers, etc.

~~~
amalter
<http://golang.org/>

------
lucian1900
PyPy is often faster than v8, so a more complete benchmark should include it.

~~~
wheaties
PyPy should have direct access to LAPACK. No need to bring in Pandas. LAPACK
is just that fast.

------
jbooth
Fastest for everything but the statistical parts. Not that someone couldn't
write the bindings to C for server-side JS, but they haven't.

------
igorgue
Does performance really matter? I'd rather have richer libraries (like R has)
than performance. It's impossible to plot, for example, all your Apache logs
or any other big-data set; you just need a subset of the data to plot, and
for that you don't need a super fast language.

------
NonEUCitizen
His table shows js is 40x slower on matrix multiplication.

~~~
platzhirsch
Ergo, JavaScript isn't the fastest language where it matters, because matrix
multiplication is too important.

~~~
simonster
It's not just that. It's that there's no concept of a vector or matrix at all,
and no operator overloading to allow these concepts to be introduced into the
language in an idiomatic way. You could put these things into a bastardized
JavaScript JIT, but that seems at least as awkward as Julia.

~~~
btilly
<http://coffeescript.org/> has demonstrated how to fix that problem.

If the underlying engine is fast, a more convenient syntax can be introduced.

~~~
simonster
I thought about this a little bit, and I don't think it would be trivial.
CoffeeScript is designed to map easily onto JavaScript. A
transcompiler that compiles JavaScript with matrix extensions to performant
plain JavaScript would likely be significantly more complex than the
CoffeeScript transcompiler.

Consider that you want to translate the matrix operation A * B into
A.times(B). You have two options:

1) Figure out what's a matrix before runtime, using static type inference.

2) Translate the code into JavaScript that determines at runtime whether to
treat the operands as matrices.
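The second option could be sketched like this: a hypothetical transcompiler rewrites every `a * b` in the source into a call to a helper (named `times` here purely for illustration) that inspects its operands at runtime. The `typeof` check is exactly the per-operation guard whose cost on scalar code is the concern.

```javascript
// Hypothetical transcompiler output: every `a * b` in the source
// language becomes `times(a, b)`. The helper dispatches at runtime.
function times(a, b) {
  if (typeof a === "number" && typeof b === "number") {
    return a * b; // scalar fast path, but it still pays the guard
  }
  return a.times(b); // assume matrix objects expose a .times() method
}
```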

In the first case, you don't need a JIT at all. JITs exist largely because you
can't do perfect type inference in dynamic languages. If you can do perfect
type inference on all acceptable code (a la RPython), or if you require type
annotations, you can compile straight to C or machine code.

In the second case, you take a speed hit of 25-50% on scalar operations for
the guard, at least in modern versions of SpiderMonkey and V8 (see
<http://jsperf.com/cost-of-multiplication-via-function>).

You can probably get acceptable performance out of combining static type
inference with guards. My understanding is that this is what SpiderMonkey does
internally. But at this point, it might be easier to integrate your
functionality into an existing JIT than to write your transcompiler with type
inference, particularly since you will have to implement matrix and vector ops
inside the JS engine to achieve acceptable performance anyway.

