
Neural Networks, Types, and Functional Programming (2015) - ghosthamlet
http://colah.github.io/posts/2015-09-NN-Types-FP/
======
KirinDave
Folks who are into this may also like the "Type Safe Neural Networks in
Haskell" post:
[https://blog.jle.im/entry/practical-dependent-types-in-haskell-1.html](https://blog.jle.im/entry/practical-dependent-types-in-haskell-1.html)

I'm also starting work on a set of bindings to libdarknet for Idris with
similar properties.
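
For a flavor of the idea, here is a minimal sketch (names and structure are
my own toy approximation, not the post's actual API): layer widths live in
the types, so mismatched layers fail to compose at compile time.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, GADTs #-}
-- Illustrative only: a toy approximation of type-indexed networks,
-- not the API from the linked post.
import GHC.TypeLits (Nat)

-- A weights placeholder indexed by input and output width.
data Weights (i :: Nat) (o :: Nat) = Weights

-- A network from i inputs to o outputs: an output layer, or a hidden
-- layer followed by the rest of the network.
data Network (i :: Nat) (o :: Nat) where
  Output :: Weights i o -> Network i o
  Layer  :: Weights i h -> Network h o -> Network i o

-- Composition only type-checks when the inner widths line up.
example :: Network 4 1
example = Layer (Weights :: Weights 4 3) (Output (Weights :: Weights 3 1))
```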

------
partycoder
Note that the article has a comment by Yann LeCun (hopefully it's not an
impersonator).

You may also be interested in Differentiable Neural Computers:

\- [https://deepmind.com/blog/differentiable-neural-computers/](https://deepmind.com/blog/differentiable-neural-computers/)

\- [https://github.com/deepmind/dnc](https://github.com/deepmind/dnc)

~~~
vinn124
> Note that the article has a comment by Yann LeCun (hopefully it's not an
> impersonator).

I wouldn't be surprised if it was LeCun. colah's illustrations of nonlinear
transformations have made it into several LeCun papers, including the
following Nature review:
[https://www.nature.com/articles/nature14539](https://www.nature.com/articles/nature14539)

------
dang
Discussed at the time:
[https://news.ycombinator.com/item?id=10165716](https://news.ycombinator.com/item?id=10165716).

------
charlescearl
Some related work also includes:

\- "Strongly-Typed Recurrent Neural Networks"
[http://proceedings.mlr.press/v48/balduzzi16.pdf](http://proceedings.mlr.press/v48/balduzzi16.pdf)

\- Principled Approaches to Deep Learning workshop
[http://padl.ws/](http://padl.ws/) (maybe this is in line with the meta-point
of Colah's post)

\- The Haskell Accelerate library
[https://github.com/AccelerateHS/accelerate/](https://github.com/AccelerateHS/accelerate/).
Not deep learning per se, but perhaps some of the ideas are applicable.

------
gtani
This covers some of the same topics, plus the rapidly expanding number of
frameworks, SIMD/SIMT backends, etc.:
[https://julialang.org/blog/2017/12/ml&pl](https://julialang.org/blog/2017/12/ml&pl)

A 20-page tutorial on why RNNs are tricky:
[https://arxiv.org/abs/1801.01078](https://arxiv.org/abs/1801.01078)

~~~
marmaduke
Many mentions of Python, none of Numba, which does a good job of
JIT-compiling Python with LLVM.

It seems unlikely, though, that an entire modeling community could rally
behind a single language or framework, given all the possibilities, many of
which are commercially oriented. But one I've used recently with a lot of
flexibility is Loopy:

[https://documen.tician.de/loopy/](https://documen.tician.de/loopy/)

------
josquindesprez
There's also this take on differentiating datatypes:
[https://codewords.recurse.com/issues/three/algebra-and-calculus-of-algebraic-data-types](https://codewords.recurse.com/issues/three/algebra-and-calculus-of-algebraic-data-types)

Reading this makes a lot of the operations in colah's article feel more
intuitive. (To me, at least. I'm no expert here.)
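
Roughly, the headline example from that article goes like this (type names
here are mine, for illustration): algebraically, List a behaves like
1 / (1 - a), so its derivative 1 / (1 - a)^2 is a pair of lists, i.e. a
one-hole context.

```haskell
-- Rough sketch of the "derivative of a datatype" idea; names are mine.
-- The elements before and after a missing element form the one-hole
-- context of a list.
data ListHole a = ListHole [a] [a]

-- Plugging an element back into the hole reconstructs a list.
plug :: a -> ListHole a -> [a]
plug x (ListHole before after) = before ++ [x] ++ after
```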

------
platz
The examples are odd because he doesn't incorporate any notion of
differentiability.

So a Generating RNN is not quite like foldr, since foldr has no notion of
differentiability.

One needs to show examples that pull in some kind of automatic-
differentiation capability.
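
For concreteness, the fold analogy in question is just this kind of toy
sketch (the step function and weights are made up); the objection is that
nothing here says anything about gradients:

```haskell
import Data.List (foldl')

-- One recurrent step: fold the next input into the hidden state.
-- The constants stand in for learned weights.
rnnStep :: Double -> Double -> Double
rnnStep h x = tanh (0.5 * h + 1.2 * x)

-- Unrolling the RNN over a sequence is a left fold of the step function.
runRNN :: Double -> [Double] -> Double
runRNN = foldl' rnnStep
```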

~~~
colah3
When you pass a differentiable function into fold -- or most higher-order
functions, for that matter -- you get a function that is differentiable
everywhere except on a measure-zero set.

The mechanics of how you compute the derivatives are separate from this.
Obviously, the efficient way is to use backprop (reverse mode AD), as we
always do in deep learning. But you could also use discrete derivative
approximations. The point is that the resulting function is differentiable,
which is independent of how you compute the derivatives.
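
As a toy illustration of that separation (repeating the runRNN fold sketched
above so this compiles on its own; a central finite difference stands in for
reverse-mode AD):

```haskell
import Data.List (foldl')

-- The folded RNN from the sketch above: a composite, differentiable-a.e.
-- function of the initial hidden state.
runRNN :: Double -> [Double] -> Double
runRNN = foldl' (\h x -> tanh (0.5 * h + 1.2 * x))

-- A central finite difference: one (inefficient) way to compute the
-- derivative; backprop would give the same value more cheaply.
centralDiff :: (Double -> Double) -> Double -> Double
centralDiff f x = (f (x + 1e-5) - f (x - 1e-5)) / 2e-5

-- Sensitivity of the final hidden state to the initial state h0.
main :: IO ()
main = print (centralDiff (\h0 -> runRNN h0 [0.1, -0.3, 0.7]) 0.0)
```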

