
Deep neural network written from scratch in Julia - jostmey
https://github.com/jostmey/DeepNeuralClassifier
======
arvinsim
All this machine learning news is making me regret not taking math seriously
during my undergraduate studies. I'm beginning to get bored doing web
development.

Time to peruse Khan Academy and befriend Math.

~~~
ivan_ah
Here's a crash course on high school math for coders:
[http://minireference.com/static/tutorials/sympy_tutorial.pdf](http://minireference.com/static/tutorials/sympy_tutorial.pdf)

~~~
arvinsim
Cool! I did not know such a library existed.

------
IshKebab
The README actually clarified a lot of things for me that many longer texts
skip over - why ReLUs are used (they prevent vanishing or exploding
gradients), exactly how dropout works, why there is "momentum", etc.

I'm still not sure why he uses softplus instead of ReLU, though. The
implication is that it is better to have a smooth function, but is it? And
does the benefit outweigh the extra computational cost?

Also, the code is fantastically short.
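For what it's worth, the trade-off can be sketched directly (in Python rather than the repo's Julia; function names are just illustrative): ReLU's gradient is a hard step, while softplus's gradient is the logistic sigmoid - smooth everywhere, but costing an exp/log per unit.

```python
import math

def relu(x):
    # ReLU: piecewise linear, with a kink at 0
    return max(0.0, x)

def softplus(x):
    # softplus: smooth approximation of ReLU, log(1 + e^x)
    return math.log1p(math.exp(x))

def relu_grad(x):
    # derivative is a step function: exactly 0 or 1
    return 1.0 if x > 0 else 0.0

def softplus_grad(x):
    # derivative is the logistic sigmoid: smooth everywhere, never exactly 0
    return 1.0 / (1.0 + math.exp(-x))

# For large inputs the two activations converge; they differ near 0,
# where ReLU's gradient jumps from 0 to 1 and softplus's passes through 0.5.
print(relu(5.0), round(softplus(5.0), 3))            # 5.0 5.007
print(relu_grad(0.0), round(softplus_grad(0.0), 3))  # 0.0 0.5
```

The extra cost per unit is the transcendental calls; whether the smoothness is worth it is exactly the open question in the comment above.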

~~~
jostmey
One could separate the training data into a validation and training set. Then
you could try both the ReLU and softplus to see which performs better. I have
no idea which type would---I just liked the idea of using a smooth activation
function instead of a jagged one.
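That experiment could be set up with an ordinary hold-out split; here is a minimal sketch in Python (not from the repo; the names and the 80/20 fraction are arbitrary). Train one copy of the network per activation on `train`, then compare their accuracy on `val`.

```python
import random

def train_validation_split(data, val_fraction=0.2, seed=0):
    # Shuffle, then hold out a fraction of the examples for validation.
    # Using a fixed seed keeps the comparison between activations fair.
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

examples = list(range(100))  # stand-in for (input, label) pairs
train, val = train_validation_split(examples)
print(len(train), len(val))  # 80 20
```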

------
Ono-Sendai
Why don't these library authors ever attach performance information? How many
training runs per second? How many weights processed per second?

~~~
aidanf
From the README, under the performance section:

> This package is not written for speed. It is meant to serve as a working
> example of an artificial neural network. As such, there is no GPU
> acceleration. Training using only the CPU can take days or even weeks. The
> training time can be shortened by reducing the number of updates, but this
> could lead to poorer performance on the test data. Consider using an existing
> machine learning package when searching for a deployable solution.

It seems the main aim of this software is educational, not production use.

~~~
Ono-Sendai
Just because it doesn't run on the GPU doesn't mean it can't be fast. At least
they acknowledge it's slow.

~~~
jostmey
Actually, I added that disclaimer on performance because of your first
comment. I realized people were getting the wrong idea about my little
example, and were thinking this could be used in place of packages like Caffe,
Torch7, Theano, TensorFlow, etc.

------
kapv89
I really hope someone writes a few good machine learning tutorials in
JavaScript (preferably ES2015). Many people use it for everything other than
machine learning. It wouldn't hurt to avoid shifting to another language when
dealing with ML.

------
aman2304
With so many deep/machine learning frameworks around, which one would you put
your money on? I use Theano, but it makes life difficult when it comes to
debugging.

~~~
Veratyr
Why choose? Write Keras and switch backends to whichever takes your fancy at
the time.

~~~
aman2304
Doesn't Keras work with TensorFlow and Theano only?

~~~
Veratyr
Yup, but those are the only libraries for Python that'll use the GPU. The rest
are mostly CPU-only (PyBrain, which is quite slow as a result), built on
Theano (Lasagne, pydnn), very specific to a particular problem (Caffe), or
written in a non-compatible language (Torch).

Given that it supports 2/3 of the big general purpose libraries, it's good
enough.
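For context on how that switch works: Keras reads its backend from a small
config file (~/.keras/keras.json in the versions current at the time), so
moving a model between Theano and TensorFlow is a one-line config change
rather than a code change. A sketch of that file (field values illustrative):

```json
{
    "backend": "theano",
    "floatx": "float32",
    "epsilon": 1e-07
}
```

Setting "backend" to "tensorflow" switches the same model code over, provided
it sticks to the Keras API.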

------
NonEUCitizen
Can this take advantage of CUDA ?

~~~
olympus
Short answer: No.

Long answer: Looking at the code, this is written in pure Julia, with nothing
in place for running on a GPU. You could (re)write it for the GPU, but I'm
guessing that's not what you meant when you asked.

Look at Mocha.jl if you want a neural network implementation in Julia that can
run on a GPU: [http://devblogs.nvidia.com/parallelforall/mocha-jl-deep-learning-julia/](http://devblogs.nvidia.com/parallelforall/mocha-jl-deep-learning-julia/)

~~~
etrain
Short answer is actually: maybe!

The bulk of the work done in this code (in terms of FLOPS and, likely, wall-
clock time) is going to be in BLAS-3 operations in the feed-forward and back-
prop steps. That is, almost all of the work is done using Matrix-Matrix
multiplies and in-place arithmetic/transcendental functions.

CUBLAS[1] will let you run these types of operations on your GPU at highly
accelerated rates, with little more effort than swapping in a new BLAS
library. Additionally, if you want finer-grained control over what gets done
on the GPU, there are other libraries[2] which provide a direct interface to
CUBLAS.

[1] [https://developer.nvidia.com/cublas](https://developer.nvidia.com/cublas)
[2]
[https://github.com/JuliaGPU/CUBLAS.jl](https://github.com/JuliaGPU/CUBLAS.jl)
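To make the point concrete, here is a sketch (pure Python, toy sizes, none of it from the repo) of why the feed-forward step is a BLAS-3 operation: the whole mini-batch goes through one matrix-matrix multiply, which is exactly the kind of call a BLAS or CUBLAS library would accelerate.

```python
import math

def matmul(A, B):
    # (m x k) @ (k x n) -> (m x n); this is the GEMM a GPU would speed up
    m, k, n = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

def softplus(x):
    # the activation used in the repo; applied elementwise after the GEMM
    return math.log1p(math.exp(x))

batch = [[1.0, 2.0], [3.0, 4.0]]                   # 2 examples, 2 inputs each
weights = [[0.5, -0.5, 1.0], [0.25, 0.75, -1.0]]   # 2 inputs -> 3 units

pre = matmul(batch, weights)                        # one GEMM for the batch
act = [[softplus(x) for x in row] for row in pre]   # cheap elementwise pass
print(pre[0])  # [1.0, 1.0, -1.0]
```

Since nearly all the FLOPs are in `matmul`, replacing it with a CUBLAS call moves nearly all the work onto the GPU without restructuring the rest of the code.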

