
Basic Neural Network on Python - dfrodriguez143
http://danielfrg.github.io/blog/2013/07/03/basic-neural-network-python/
======
gamegoblin
Very good write-up. If you want to trade a bit of accuracy (and some memory) for
speed, you can build a large lookup table for your sigmoid function, which should
just about double its speed.
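
Something like this, as a rough numpy sketch (the table size and input range are
illustrative choices, not taken from the post):

    import numpy as np

    TABLE_SIZE = 10000           # table resolution (illustrative assumption)
    X_MIN, X_MAX = -8.0, 8.0     # sigmoid is essentially 0/1 outside this range
    _xs = np.linspace(X_MIN, X_MAX, TABLE_SIZE)
    _table = 1.0 / (1.0 + np.exp(-_xs))

    def sigmoid_lut(x):
        # snap x to the nearest precomputed entry instead of calling exp()
        x = np.clip(np.asarray(x, dtype=float), X_MIN, X_MAX)
        idx = np.rint((x - X_MIN) / (X_MAX - X_MIN) * (TABLE_SIZE - 1)).astype(int)
        return _table[idx]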

As an aside, and not to be too critical because the post was great, but as
(presumably) a non-native English speaker you might want to run a spell-checker
over your post. There are also some missing pronouns that make some sentences
read very Spanishy.

~~~
lightcatcher
A relatively small lookup table for the sigmoid function can also work well.
Here are the various sigmoid approximations that Theano (a library used for
deep learning research among other things) offers:
[http://deeplearning.net/software/theano/library/tensor/nnet/...](http://deeplearning.net/software/theano/library/tensor/nnet/nnet.html#tensor.nnet.sigmoid)

~~~
gamegoblin
I usually use an array with a few thousand entries. In C this gives me a 2.5x
speedup over the exact function with no meaningful loss of accuracy.

~~~
mistercow
I wonder though if you might actually do better overall with a smaller lookup
table and interpolation (or even just a polynomial approximation, which can be
evaluated without branching), since large lookup tables can lead to bad cache
behavior.
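
For instance, a small table plus linear interpolation is a one-liner in numpy
(256 entries is just an illustrative size):

    import numpy as np

    _xs = np.linspace(-8.0, 8.0, 256)      # small table; size is an arbitrary choice
    _ys = 1.0 / (1.0 + np.exp(-_xs))

    def sigmoid_interp(x):
        # linear interpolation between entries; np.interp clamps outside [-8, 8]
        return np.interp(x, _xs, _ys)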

------
benhamner
Both datasets you used (iris and digits) are way too simple for neural
networks to shine.

Neural networks / deep neural networks work best in domains where the
underlying data has a very rich, complex, and hierarchical structure (such as
computer vision and speech recognition). Currently, training these models is
both computationally expensive and fickle. Most state-of-the-art research in
this area is performed on GPUs, and there are many tunable parameters.

For most typical applied machine learning problems, especially on simpler
datasets that fit in RAM, variants of ensembled decision trees (such as Random
Forests) tend to perform at least as well as neural networks, with less
parameter tuning and far shorter training times.
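
For reference, a baseline like that is only a few lines with the current
scikit-learn API (the split and forest size here are illustrative, not tuned):

    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    digits = load_digits()
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.3, random_state=0)

    rf = RandomForestClassifier(n_estimators=100, random_state=0)  # near-default settings
    rf.fit(X_train, y_train)
    print(rf.score(X_test, y_test))        # accuracy on the held-out split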

~~~
sine_dicendo
Not for nothing, but Ben, did you read the article? He's not even discussing
most of what you mention. He is simply taking what he has learned and applying
it. You seem to be going off on a tangent about advanced applications, when he
is obviously just learning how these things work, not trying to teach a method
or suggesting that he has discovered anything significant.

To the author: I liked the article. A simple, concise read.

~~~
dbecker
In Ben's defense: The original article declares random forest a "winner" over
neural networks. Ben's comment is a cautionary note that this result only
applies to a specific class of problems.

This was a nice post, but it's reasonable to warn users not to overgeneralize
the algorithm comparison.

------
theschreon
You could try the following improvements to speed up neural network training:

- Resilient Propagation (RPROP), which significantly speeds up training for
full-batch learning:
[http://davinci.fmph.uniba.sk/~uhliarik4/recognition/resource...](http://davinci.fmph.uniba.sk/~uhliarik4/recognition/resources/rprop/rb_1993_rprop.pdf)

- RMSProp, introduced by Geoffrey Hinton, which also speeds up training and can
be used for mini-batch learning:
[https://class.coursera.org/neuralnets-2012-001/lecture/67](https://class.coursera.org/neuralnets-2012-001/lecture/67)
(sign up to view the video)
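
A minimal sketch of the RMSProp update, assuming you already have the gradient
for a weight array (the hyperparameter values are common defaults, not taken
from the lecture):

    import numpy as np

    def rmsprop_update(w, grad, mean_sq, lr=0.001, decay=0.9, eps=1e-8):
        # keep a running average of squared gradients and scale the step by its RMS
        mean_sq = decay * mean_sq + (1.0 - decay) * grad ** 2
        w = w - lr * grad / (np.sqrt(mean_sq) + eps)
        return w, mean_sq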

Please consider more datasets when benchmarking methods:

- MNIST (70k 28x28-pixel images of handwritten digits):
[http://yann.lecun.com/exdb/mnist/](http://yann.lecun.com/exdb/mnist/). There
are several Python wrappers on GitHub.

- UCI Machine Learning Repository:
[http://archive.ics.uci.edu/ml/datasets.html](http://archive.ics.uci.edu/ml/datasets.html)

~~~
dfrodriguez143
Definitely a lot to read and improvements to make. I will probably do a more
complete benchmark with more datasets in a later post.

Thanks for the suggestions.

~~~
benhamner
You may be interested in this ICML 2006 paper, which empirically compared many
standard algorithms across a combination of metrics and UCI datasets -
[http://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icm...](http://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml06.pdf)

------
mbq
You are just doing a simple validation on a test set rather than cross-
validation; the point of CV is to run many iterations of validation on
different train-test splits and average the results.
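
With scikit-learn, which the post already uses, that averaging is a one-liner
(the 5-fold split and forest settings are just illustrative):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    iris = load_iris()
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    scores = cross_val_score(clf, iris.data, iris.target, cv=5)  # 5 different splits
    print(scores.mean(), scores.std())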

~~~
dfrodriguez143
I completely agree; a more thorough benchmark should be done with proper
cross-validation.

Just for future reference, I did run the fitting a few times and found very
similar (±2%) results. Also, Random Forests already average over many trees, so
there is probably not much to gain for that particular algorithm.

~~~
mbq
To be honest I don't expect the results to change; but this is the only way to
attach significance to the observed differences and to make sure it wasn't a
lucky shot.

------
lelandbatey
Hmmmm... The layout of the page seems very messed up. Is anyone else having it
show up like this?:

[http://puu.sh/3vTL8.png](http://puu.sh/3vTL8.png)

~~~
dfrodriguez143
Should work with most newer versions of any browser.

Which browser are you using?

~~~
lelandbatey
Firefox 22 on Windows 8

------
scotty79
What kind of learning do scientists think the brain actually uses? Back-
propagation and the like seem like a method a god would use to architect a
static brain for a given task.

~~~
wfn
For starters - see Hebbian theory. [1]
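
For flavor, the plain Hebbian rule ("cells that fire together wire together")
is just a correlational weight update; a toy numpy sketch (the learning rate is
an arbitrary illustrative value):

    import numpy as np

    def hebbian_update(W, x, y, lr=0.01):
        # strengthen each weight in proportion to pre- (x) and post-synaptic (y) activity
        return W + lr * np.outer(y, x)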

Backprop falls within the class of 'supervised learning' which can indeed be
said not to be very biologically realistic. However, reinforcement learning is
observed, so the overall picture is probably much more complex: e.g.
associative/recurrent/etc networks with Hebb-like unsupervised learning
developing neuronal group testing and selection systems that involve
reinforcement learning. (see first lecture/talk in [3].)

Perhaps worth a watch is a very nice talk by Geoffrey Hinton [2], which is
often referred to on HN. (Hinton does touch on the notion of biological
plausibility in this talk, as far as I recall, but the focus is elsewhere:
developing the next generation of state-of-the-art, mostly unsupervised,
machine learning techniques and systems.)

[1]:
[https://en.wikipedia.org/wiki/Hebbian_theory](https://en.wikipedia.org/wiki/Hebbian_theory)

[2]:
[https://www.youtube.com/watch?v=AyzOUbkUf3M](https://www.youtube.com/watch?v=AyzOUbkUf3M)

[3]:
[http://kostas.mkj.lt/almaden2006/agenda.shtml](http://kostas.mkj.lt/almaden2006/agenda.shtml)
(The original summary HTML file is gone from the original source, so this is a
mirror; the links to videos and slides do work, though.) The first and the
second talks are somewhat relevant (particularly the first one, re: bio
plausibility etc ("Nobelist Gerald Edelman, The Neurosciences Institute: From
Brain Dynamics to Consciousness: A Prelude to the Future of Brain-Based
Devices")), but all are great. Rather heavy, though. (Also, skip the intros.)

 _edit_ that first talk/lecture from Almaden (Edelman's) is actually a very
nice exposition of the whole paradigm on which {cognitive, computational, etc.}
neuroscience rests; it does get hairy later on, but overall it's a great talk
for the truly curious.

------
primelens
Good writeup. Is there a feed for that blog? I only found one for the
comments.

~~~
dfrodriguez143
Yes:
[http://danielfrg.github.io/feeds/all.atom.xml](http://danielfrg.github.io/feeds/all.atom.xml)

Gonna add a direct link from the site soon.

------
skatenerd
"def function(...)"

