
Building a neural network from scratch in Haskell - allenleein
http://www-cs-students.stanford.edu/~blynn/haskell/brain.html
======
eigenvalue
Maybe it's just me, but this looks like a bunch of incomprehensible gibberish
code. I'm sure I could understand it if I spent many hours poring over it,
but why would you ever do that? When written out in Python (or even in JS!),
the whole thing is so much simpler looking and more closely resembles the
underlying math. This is especially true if you use a package like Numpy. I
know he said he didn't want to use any libraries, but if the underlying
primitives in the problem are vectors/matrices, then it seems like you are
reinventing the wheel in a very substandard way that doesn't aid in
understanding in any way and results in something that isn't beautiful, isn't
high performance, and is confusing for someone to read-- even if the person is
familiar with the subject matter!
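
For a sense of what the comment is claiming, here is a single gradient step for a one-hidden-layer ReLU net written with NumPy, so that each line tracks one equation of backprop. This is an illustrative sketch (my own names and shapes), not code from the post:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def step(x, y, W1, b1, W2, b2, eta=0.01):
    """One gradient step for a one-hidden-layer net with squared-error loss."""
    z1 = W1 @ x + b1               # z1 = W1 x + b1
    a1 = relu(z1)                  # a1 = relu(z1)
    yhat = W2 @ a1 + b2            # prediction
    d2 = yhat - y                  # output-layer error
    d1 = (W2.T @ d2) * (z1 > 0)    # error backpropagated through relu
    # Gradient-descent updates for each parameter
    return (W1 - eta * np.outer(d1, x), b1 - eta * d1,
            W2 - eta * np.outer(d2, a1), b2 - eta * d2)

# Tiny smoke test: fit a single example and watch the loss drop.
x, y = rng.normal(size=4), np.array([1.0])
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)

def loss(W1, b1, W2, b2):
    return (((W2 @ relu(W1 @ x + b1) + b2) - y) ** 2).sum()

before = loss(W1, b1, W2, b2)
for _ in range(100):
    W1, b1, W2, b2 = step(x, y, W1, b1, W2, b2)
after = loss(W1, b1, W2, b2)
```

Whether this is clearer than the Haskell is exactly what the thread is arguing about, but the matrix/vector notation does map one-to-one onto the math.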

~~~
kick
_but if the underlying primitives in the problem are vectors/matrices, then
it seems like you are reinventing the wheel in a very substandard way that
doesn't aid in understanding in any way and results in something that isn't
beautiful, isn't high performance, and is confusing for someone to read_

You mean like...both of the languages you listed?

There's an obviously superior, faster, simpler language when working with
vectors (APL), but people are _obsessed_ with new languages.

If you really think it can be done in Python better than in Haskell, why not
demo it in Python? You'll get internet points, and if you're right, you'll
have something to show for it.

~~~
stfwn
> Why not demo it in Python?

Not OP, but here:
[https://gist.github.com/stfwn/62e51d86ca4ff155becd3c6a14adf6...](https://gist.github.com/stfwn/62e51d86ca4ff155becd3c6a14adf60e)

You should be able to wget the file and run it (Python 3) from start to finish
without any set-up and get ~88% accuracy on the test set.

It uses all the data (not one-sixth like in the blog posts) and does 200
iterations by default, so here's the loss plot on the training set if you want
to skip all the fun:
[https://i.imgur.com/F57zmXV.png](https://i.imgur.com/F57zmXV.png)

------
6gvONxR4sf7o
I love haskell as much as the next PL nerd, but the community has a real code
golf problem. An example from the blog post:

    
    
        deltas :: [Float] -> [Float] -> [([Float], [[Float]])] -> ([[Float]], [[Float]])
        deltas xv yv layers = let
          (avs@(av:_), zv:zvs) = revaz xv layers
          delta0 = zipWith (*) (zipWith dCost av yv) (relu' <$> zv)
          in (reverse avs, f (transpose . snd <$> reverse layers) zvs [delta0]) where
            f _ [] dvs = dvs
            f (wm:wms) (zv:zvs) dvs@(dv:_) = f wms zvs $ (:dvs) $
              zipWith (*) [(sum $ zipWith (*) row dv) | row <- wm] (relu' <$> zv)
        
        ...
    
        descend av dv = zipWith (-) av ((eta *) <$> dv)
    
        learn :: [Float] -> [Float] -> [([Float], [[Float]])] -> [([Float], [[Float]])]
        learn xv yv layers = let (avs, dvs) = deltas xv yv layers
          in zip (zipWith descend (fst <$> layers) dvs) $
            zipWith3 (\wvs av dv -> zipWith (\wv d -> descend wv ((d*) <$> av)) wvs dv)
              (snd <$> layers) avs dvs
    
    

Writing this in 2-3x as many lines with clear variable names for some
intermediate expressions would make it so much clearer. Haskell has a nasty
reputation for "you have to study the shit out of it to make heads or tails of
the code" and I'm pretty certain that 90% of it comes from how terse
haskellers try to make code.

Just add intermediate expressions and annotate their types, maybe even with
some type synonyms for intermediate types, because code is for humans.
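
As an illustration of what named intermediates buy, here is the same deltas recurrence written out in plain Python lists. This is a rough sketch of the idea, not the post's code, and all the names (`activations`, `weighted_inputs`, etc.) are invented:

```python
def relu(z):
    return z if z > 0 else 0.0

def relu_deriv(z):
    return 1.0 if z > 0 else 0.0

def forward(x, layers):
    """layers: list of (biases, weight_matrix) pairs, one per layer."""
    activations = [x]        # a^0 = the input vector
    weighted_inputs = []     # z^l = W^l a^(l-1) + b^l
    for biases, weights in layers:
        z = [b + sum(w * a for w, a in zip(row, activations[-1]))
             for b, row in zip(biases, weights)]
        weighted_inputs.append(z)
        activations.append([relu(v) for v in z])
    return activations, weighted_inputs

def deltas(x, y, layers):
    """Backpropagated error delta^l for each layer, in forward order."""
    activations, weighted_inputs = forward(x, layers)
    # Output-layer error: delta^L = (a^L - y) * relu'(z^L)
    delta = [(a - t) * relu_deriv(z)
             for a, t, z in zip(activations[-1], y, weighted_inputs[-1])]
    all_deltas = [delta]
    # Walk backwards: delta^l = (W^(l+1)^T delta^(l+1)) * relu'(z^l)
    for (_, weights), z in zip(reversed(layers[1:]),
                               reversed(weighted_inputs[:-1])):
        delta = [sum(weights[j][i] * delta[j] for j in range(len(delta)))
                 * relu_deriv(z[i])
                 for i in range(len(z))]
        all_deltas.insert(0, delta)
    return activations, all_deltas
```

It is several times longer than the Haskell, but each intermediate has a name you can inspect, which is the trade the comment is asking for.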

~~~
tome
Your wish is my command:

[http://h2.jaguarpaw.co.uk/posts/refactoring-neural-network/](http://h2.jaguarpaw.co.uk/posts/refactoring-neural-network/)

------
nafizh
All these posts build NNs in Haskell, OCaml, or another language to solve an
extremely simple problem, which is fine if you just want to have some fun. But
if the proponents are really serious, they would put out a detailed tutorial
solving a complex task (e.g. building a state-of-the-art language model),
showing the ease or difficulty of the process and how using a typed functional
language helps in debugging the model - which is supposed to be the biggest
selling point of these languages. That would also show whether it is viable
for a real-life practitioner to invest time in learning these languages /
frameworks.

~~~
RobertDeNiro
People have done this, but with Hasktorch, which provides Haskell bindings to
the PyTorch C++ libraries. The static typing definitely helps when you want to
ensure that the input/output shapes of a layer are consistent.
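
A sketch of the bug class this refers to: in untyped code, a shape mismatch between layers only surfaces when the code actually runs, whereas typed tensor shapes reject it at compile time. The names below are invented for illustration, not Hasktorch's API:

```python
def linear_layer(n_in, n_out):
    """A dense layer mapping vectors of length n_in to length n_out."""
    weights = [[0.1] * n_in for _ in range(n_out)]
    def forward(x):
        # This check is the runtime stand-in for what a typed tensor
        # library verifies statically.
        if len(x) != n_in:
            raise ValueError(f"expected input of length {n_in}, got {len(x)}")
        return [sum(w * a for w, a in zip(row, x)) for row in weights]
    return forward

layer1 = linear_layer(784, 30)
layer2 = linear_layer(30, 10)

out = layer2(layer1([0.0] * 784))   # shapes line up: 784 -> 30 -> 10

try:
    layer2([0.0] * 784)             # wrong shape: only caught when it runs
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
```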

~~~
nafizh
Can you please point to such a tutorial? I am interested.

~~~
RobertDeNiro
There's no proper tutorial AFAIK. Just examples in the hasktorch repo.

------
thom
Not really my scene, but I've been liking the look of Torsten Scholak's
Hasktorch:

[https://github.com/tscholak/hasktorch](https://github.com/tscholak/hasktorch)

~~~
mark_l_watson
I think that Hasktorch is a very cool project but it is not turtles all the
way down: it is a Haskell API on top of PyTorch.

The Haskell bindings for TensorFlow are a little bit difficult for me to work
with. When HaskTorch gets more mature and stable, it will hopefully be easier
to use than the TensorFlow bindings.

------
pickle-ts
In typescript (from scratch recognizing hand-written digits in browser):

Live: [https://deep-learning.stackblitz.io/](https://deep-learning.stackblitz.io/)

Live edit code: [https://stackblitz.com/edit/deep-learning](https://stackblitz.com/edit/deep-learning)

~~~
mark_l_watson
Nice. I really recommend that practitioners implement simple neural
architectures from scratch for learning, but use TensorFlow, PyTorch, mxnet,
etc. for production and serious research.

New frameworks like Swift for TensorFlow and Julia’s Flux are a little easier
to understand if you read the code, but they are still complex stuff.

------
kidintech
Played with this for ~5 minutes and it's insanely bad (maybe it "guessed"
right 3/15 tries?), i.e. only slightly better than random guessing, even with
very clear handwriting :(

~~~
Jaxan
You had bad luck I guess. I just did 15 tries and it only got one wrong. Not
really sure what this says though.

~~~
bspammer
Are you talking about the "Sample" button, or drawing the digits yourself? It
seems to have a very good accuracy on the samples, but gets a lot of my hand-
drawn digits wrong.

Seems like a classic case of overfitting to be honest.

~~~
Jaxan
Ah I didn’t know you could draw yourself. That explains the difference.

------
cluoma
The book linked at the top of the article is a great read[1]. Very easy to
follow along if you want to do a similar thing yourself.

[1]
[http://neuralnetworksanddeeplearning.com/](http://neuralnetworksanddeeplearning.com/)

------
iiian
Where is 8? I flipped the sample like 100+ times. Didn't encounter a single
#8. I guess Haskell bytes back sometimes.

~~~
mrkeen
If this is a failing of the language, then at least blame the right one.

[http://www-cs-students.stanford.edu/~blynn/mnist/samples.js](http://www-cs-students.stanford.edu/~blynn/mnist/samples.js)

------
dmead
This is really not a good look for Haskell.

