
Show HN: KANN – mid-scale deep learning and autodiff in ANSI C (4 files; 4k LOC) - attractivechaos
https://github.com/attractivechaos/kann/
======
attractivechaos
There have been several posts about tiny/lightweight neural network
libraries. They are indeed small, but they only implement the most basic
multi-layer perceptrons (MLPs), with little optimization (e.g. they often
implement backprop inefficiently). KANN goes far beyond MLPs. It implements
automatic differentiation on top of generic computational graphs [1],
supporting dropout, proper mini-batching, reshape/slice/concat of general
n-d arrays, arbitrary weight sharing, 1D/2D convolution/pooling, SGD/RMSprop,
RNN/LSTM, and graph/weight saving/loading. KANN is like a CPU-only
mini-TensorFlow.
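
To make "autodiff on a computational graph" concrete, here is a toy scalar
sketch of the idea. It is not KANN's actual API (all names here are made
up); it just shows the bare mechanics of one forward pass followed by one
reverse chain-rule pass over the graph:

```c
/* Toy reverse-mode autodiff on a computational graph -- the idea behind
 * KANN and [1]. NOT KANN's API; all names are hypothetical. Computes
 * y = (x*w + b)^2 and the gradients dy/dw, dy/db. */
#include <stdio.h>

typedef enum { OP_LEAF, OP_ADD, OP_MUL, OP_SQR } op_t;

typedef struct node_s {
    op_t op;
    double val, grad;
    struct node_s *a, *b;     /* operands; NULL for leaves */
} node_t;

static double fwd(node_t *n)  /* forward pass: compute values */
{
    double x, y;
    if (n->op == OP_LEAF) return n->val;
    x = fwd(n->a);
    y = n->b ? fwd(n->b) : 0.0;
    if (n->op == OP_ADD) n->val = x + y;
    else if (n->op == OP_MUL) n->val = x * y;
    else n->val = x * x;      /* OP_SQR */
    return n->val;
}

static void bwd(node_t *n, double g)  /* reverse pass: chain rule */
{
    n->grad += g;
    if (n->op == OP_ADD) { bwd(n->a, g); bwd(n->b, g); }
    else if (n->op == OP_MUL) { bwd(n->a, g * n->b->val); bwd(n->b, g * n->a->val); }
    else if (n->op == OP_SQR) bwd(n->a, g * 2.0 * n->a->val);
    /* OP_LEAF: gradient already accumulated above */
}

int main(void)
{
    node_t x = {OP_LEAF, 3.0}, w = {OP_LEAF, 2.0}, b = {OP_LEAF, 1.0};
    node_t xw, lin, y;
    xw.op = OP_MUL;  xw.grad = 0.0;  xw.a = &x;   xw.b = &w;
    lin.op = OP_ADD; lin.grad = 0.0; lin.a = &xw; lin.b = &b;
    y.op = OP_SQR;   y.grad = 0.0;   y.a = &lin;  y.b = 0;
    fwd(&y);                  /* y = (3*2 + 1)^2 = 49 */
    bwd(&y, 1.0);             /* seed with dy/dy = 1 */
    printf("y=%g dy/dw=%g dy/db=%g\n", y.val, w.grad, b.grad);
    /* expect dy/dw = 2*(x*w+b)*x = 42, dy/db = 2*(x*w+b) = 14 */
    return 0;
}
```

KANN generalizes the same two passes to n-d arrays and a much larger set of
operators.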

KANN optimizes convolution [2] and matrix multiplication [3], and can
optionally call BLAS' sgemm instead. It uses SSE and multi-threading when
available. With multiple threads, KANN is sometimes faster than CPU-only
Theano+Keras.
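
For a flavor of [3], here is a simplified sketch of one idea from that post
(not KANN's actual kernel): reordering the three matmul loops to i-k-j so the
inner loop strides through B and C contiguously, which is cache-friendly and
easy for compilers to vectorize. The real kernel goes further with explicit
SSE intrinsics and tiling.

```c
/* Simplified sketch of the loop-reordering optimization discussed in [3];
 * not KANN's actual code. C[m][p] += A[m][n] * B[n][p], row-major. */
#include <string.h>

void sgemm_ikj(int m, int n, int p, const float *A, const float *B, float *C)
{
    int i, k, j;
    memset(C, 0, (size_t)m * p * sizeof(float));
    for (i = 0; i < m; ++i)
        for (k = 0; k < n; ++k) {
            const float a = A[i * n + k];   /* hoist the scalar */
            const float *brow = &B[k * p];  /* contiguous row of B */
            float *crow = &C[i * p];        /* contiguous row of C */
            for (j = 0; j < p; ++j)         /* unit stride: vectorizable */
                crow[j] += a * brow[j];
        }
}
```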

I implemented KANN while I was studying deep learning [4]. I have not touched
the code base in a while, as I am not really working on ML/DL anymore.
Nonetheless, KANN should still work fine.

[1]: [http://colah.github.io/posts/2015-08-Backprop/](http://colah.github.io/posts/2015-08-Backprop/)

[2]: [https://github.com/attractivechaos/kann/blob/master/doc/02dev.md#implementing-the-convolution-operation](https://github.com/attractivechaos/kann/blob/master/doc/02dev.md#implementing-the-convolution-operation)

[3]: [https://attractivechaos.wordpress.com/2016/08/28/optimizing-matrix-multiplication/](https://attractivechaos.wordpress.com/2016/08/28/optimizing-matrix-multiplication/)

[4]: [https://attractivechaos.wordpress.com/2017/03/04/kann-a-c-library-for-artificial-neural-network/](https://attractivechaos.wordpress.com/2017/03/04/kann-a-c-library-for-artificial-neural-network/)

~~~
p1esk
There are also tiny-dnn [1] and Darknet [2]. I'm curious how they compare to
KANN.

[1] [https://github.com/tiny-dnn/tiny-dnn](https://github.com/tiny-dnn/tiny-dnn)
[2] [https://github.com/pjreddie/darknet](https://github.com/pjreddie/darknet)

~~~
attractivechaos
Darknet supports GPUs and thus complex models; to my understanding, it is not
intended as a library. The last time I checked, tiny-dnn was built on top of
"layers" rather than a general computational graph, which greatly limits its
functionality. Its implementation is fairly inefficient, too.
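
As a toy illustration of what a graph buys you (hypothetical code, not
tiny-dnn's or KANN's API): in a general computational graph, one weight node
can feed several branches, e.g. the two arms of a siamese network, whereas a
strict stack of layers has no natural way to express that sharing.

```c
/* Hypothetical sketch of weight sharing across two branches, which a
 * graph expresses naturally (not any library's real API). Both branches
 * read the SAME W; in a layer stack, each dense layer would own its own
 * private copy of the weights. */
#include <stdio.h>

/* y = W x for a 2x2 row-major W */
static void dense2(const float W[4], const float x[2], float y[2])
{
    y[0] = W[0] * x[0] + W[1] * x[1];
    y[1] = W[2] * x[0] + W[3] * x[1];
}

int main(void)
{
    const float W[4] = {1, 2, 3, 4};            /* one shared parameter node */
    const float xa[2] = {1, 0}, xb[2] = {0, 1};
    float ya[2], yb[2];
    dense2(W, xa, ya);                          /* branch A */
    dense2(W, xb, yb);                          /* branch B reuses the same W */
    printf("ya=(%g,%g) yb=(%g,%g)\n", ya[0], ya[1], yb[0], yb[1]);
    return 0;
}
```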

