
ND4J: Scientific Computing on the JVM - bryanrasmussen
https://github.com/deeplearning4j/nd4j
======
mruts
Hopefully this can pave the way for deep learning and data analysis for
Scala/Java. It's really frustrating having to use a subpar and inexpressive
language like Python because of the huge data science network effect.

Or if Julia could get more libraries (it's maybe 25% there, which isn't bad),
that would be pretty great as well.

~~~
geoalchimista
> It's really frustrating having to use a subpar and inexpressive language
> like Python because of the huge data science network effect.

I'm afraid that the "network effect" is gonna continue for the years to come.
NDArray is not the only thing you need for data analysis. DataFrame,
statistical functions, and data visualization packages like matplotlib and
ggplot are all indispensable components in a data analysis workflow. It has
taken thousands of contributors a decade to build a solid ecosystem for data
science in Python or in R. It's just unlikely for Java to become a mainstream
data science language overnight.

I'd say if you are looking for a statically typed language for deep learning,
C++ is probably a better bet than Java for now.

~~~
mruts
I don’t really want Java as a data science language, but Scala would be a
perfect fit.

There’s already a nice data frame library that I helped develop at my old
company called Saddle. Now we just need some quant finance libraries (like the
excellent zipline, pyfolio, and alphalens), some industrial NNs,
optimization libs (like cvxpy), and something like scipy.

Is this ever going to happen? Probably not, but a man can dream..

------
lern_too_spel
This moved into the deeplearning4j monorepo:
[https://github.com/deeplearning4j/deeplearning4j/blob/master...](https://github.com/deeplearning4j/deeplearning4j/blob/master/nd4j/README.md)

------
lenticular
I'm eagerly awaiting their sparse matrix support. It's unbelievable that the
entire JVM doesn't have a single comprehensive, production quality sparse
matrix library [0]. This is one of the big things keeping my machine learning
in Python.

~~~
sooheon
Check out Neanderthal[0], it seems to have support for some subset of sparse
matrices and it's faster than ND4J[1] to boot.

[0]: [https://neanderthal.uncomplicate.org](https://neanderthal.uncomplicate.org)
[1]: [https://dragan.rocks/articles/18/Neanderthal-vs-ND4J-vol1](https://dragan.rocks/articles/18/Neanderthal-vs-ND4J-vol1)

~~~
lenticular
Yeah, Neanderthal is great (I'm a Clojure user). It's got support for
structured sparse matrices (like Toeplitz) the last I checked, but not general
CSC/CSR matrices.
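
For anyone unfamiliar with the term, here is a rough sketch of what a general CSR (compressed sparse row) matrix looks like, written in plain Java. It is not the API of ND4J or Neanderthal; the class name `CsrMatrix` and its methods are made up for illustration, and it assumes a rectangular dense input.

```java
import java.util.Arrays;

// Minimal CSR (compressed sparse row) sketch. Hypothetical class, not a library API.
final class CsrMatrix {
    final int rows;
    final int cols;
    final double[] values;  // non-zero entries, stored row by row
    final int[] colIndex;   // column of each entry in `values`
    final int[] rowPtr;     // entries of row i live in values[rowPtr[i] .. rowPtr[i + 1])

    CsrMatrix(int rows, int cols, double[] values, int[] colIndex, int[] rowPtr) {
        this.rows = rows;
        this.cols = cols;
        this.values = values;
        this.colIndex = colIndex;
        this.rowPtr = rowPtr;
    }

    // Build from a dense, rectangular array, dropping exact zeros.
    static CsrMatrix fromDense(double[][] dense) {
        int rows = dense.length;
        int cols = rows == 0 ? 0 : dense[0].length;
        int nnz = 0;
        for (double[] row : dense) {
            for (double v : row) {
                if (v != 0.0) nnz++;
            }
        }
        double[] values = new double[nnz];
        int[] colIndex = new int[nnz];
        int[] rowPtr = new int[rows + 1];
        int k = 0;
        for (int i = 0; i < rows; i++) {
            rowPtr[i] = k;
            for (int j = 0; j < cols; j++) {
                if (dense[i][j] != 0.0) {
                    values[k] = dense[i][j];
                    colIndex[k] = j;
                    k++;
                }
            }
        }
        rowPtr[rows] = k;
        return new CsrMatrix(rows, cols, values, colIndex, rowPtr);
    }

    // Sparse matrix times dense vector; only stored entries are visited.
    double[] multiply(double[] x) {
        if (x.length != cols) {
            throw new IllegalArgumentException("expected vector of length " + cols);
        }
        double[] y = new double[rows];
        for (int i = 0; i < rows; i++) {
            double sum = 0.0;
            for (int k = rowPtr[i]; k < rowPtr[i + 1]; k++) {
                sum += values[k] * x[colIndex[k]];
            }
            y[i] = sum;
        }
        return y;
    }

    public static void main(String[] args) {
        CsrMatrix m = fromDense(new double[][] {{1, 0, 2}, {0, 0, 0}, {0, 3, 0}});
        System.out.println(Arrays.toString(m.multiply(new double[] {1, 2, 3}))); // prints [7.0, 0.0, 6.0]
    }
}
```

Unlike a structured format such as Toeplitz, nothing here assumes a pattern in where the non-zeros sit, which is why general CSR/CSC support is the piece that matters for arbitrary sparse data.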

------
nestorD
William Kahan, the father of IEEE floating-point arithmetic and one of the
main references on numerical errors, has some bad things to say about doing
floating-point computations on the JVM:
[http://people.eecs.berkeley.edu/~wkahan/JAVAhurt.pdf](http://people.eecs.berkeley.edu/~wkahan/JAVAhurt.pdf)

An nd array library could be great for machine learning but it is probably not
the place where you want to do numerical simulations.

~~~
walkingolof
Honest question: how much of this is about Java rather than the JVM? And it's
from "July 30, 2004"; is it even relevant now, 15 years later?

~~~
freeone3000
Everything except operator overloading also applies to the JVM... but equally
well to Python. Basically, the complaint is that you don't get any of the
hardware FP stuff like traps and signals, and all of the math is slightly
unportable because it gives different answers on different machines, but these
are well-known issues.
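
To illustrate the "no traps" part, here is a small Java sketch: floating-point division by zero and invalid operations quietly produce Infinity or NaN instead of raising anything, so any checking has to be done by hand. The class `FpNoTraps` and the helper `requireFinite` are hypothetical names, not part of any library.

```java
// Sketch: floating-point exceptions are silent on the JVM; checks must be explicit.
final class FpNoTraps {
    // Hypothetical helper: turns a silent NaN or Infinity into a Java exception.
    static double requireFinite(double value, String what) {
        if (!Double.isFinite(value)) {
            throw new ArithmeticException(what + " produced " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        double zero = 0.0;
        System.out.println(1.0 / zero);       // Infinity, no exception thrown
        System.out.println(zero / zero);      // NaN, no exception thrown
        System.out.println(Math.sqrt(-1.0));  // NaN, no exception thrown
        try {
            requireFinite(1.0 / zero, "division");
        } catch (ArithmeticException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

A manual check like this only catches the problem where you remember to put it, which is a different thing from the hardware flags and traps the paper asks for.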

~~~
nestorD
There is also the fact that intermediate operations between floats are done in
float precision and not double precision (as would be the case with C). This
might be the behaviour one would expect, but it can seriously degrade a
numerical computation.
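
A minimal Java sketch of the effect, with made-up class and method names: the same expression `(a + b) - a` is evaluated once with float intermediates and once with the operands widened to double first.

```java
// Sketch: float-precision intermediates vs. explicitly widened double intermediates.
final class FloatIntermediates {
    // Java evaluates float + float in float precision, so a + b is rounded to a float.
    static float inFloat(float a, float b) {
        return (a + b) - a;
    }

    // Same expression with the intermediate results kept in double precision.
    static float inDouble(float a, float b) {
        return (float) (((double) a + (double) b) - (double) a);
    }

    public static void main(String[] args) {
        float a = 16_777_216f; // 2^24: above this, consecutive floats are 2 apart
        float b = 1f;
        System.out.println(inFloat(a, b));  // prints 0.0 (the added 1 is rounded away)
        System.out.println(inDouble(a, b)); // prints 1.0
    }
}
```

The float version loses `b` entirely because 2^24 + 1 is not representable in a 24-bit significand; with a double intermediate it survives until the final narrowing.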

