

Geoff Hinton AMA – Deep Learning's Biological Inspiration - zackchase
http://www.reddit.com/r/MachineLearning/comments/2lmo0l/ama_geoffrey_hinton

======
timClicks
The AMA highlights one of the deficiencies of Markdown. Prof Hinton has
attempted to number many of his responses to the highest rated thread, but
they're all coming up as 1. 1. 1. 1.
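For what it's worth, this is standard Markdown behavior: numbered items separated by intervening paragraphs parse as separate one-item lists, each restarting at 1. A minimal illustration:

```markdown
1. First answer

Some explanatory text in between.

1. Second answer (starts a new list, so it renders as "1." again)
```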

------
msaroufim
I'm having trouble understanding why the success of pooling would be deemed
unfortunate. Max-pooling and average-pooling essentially summarize a region of
contiguous features: max-pooling keeps only the most prominent/largest value,
while average-pooling compresses the information into the mean. Saying this is
unfortunate from Hinton's standpoint amounts to saying that it is very
unlikely we would see any sort of pooling behavior at the level of neuronal
populations. At a physiological and psychological level, what would pooling
equate to?

~~~
zackchase
I think the intuition behind max-pooling is that it says "something is in this
local region", without saying too specifically where it is. Intuitively, a
human may detect an edge or an intense bright light, but not care so much
about precisely where it falls in the field of vision. After many successive
layers of max-pooling, however (if the pools are not overlapping), even
somewhat coarse information about locality is lost.

I believe Hinton objects to this gross loss of spatial information for two
reasons: 1) Humans don't lose so much spatial information, and Hinton would
like his models to ultimately capture a neurologically plausible computation.
2) It may not be necessary for object detection (Imagenet), but it would
likely be important for more sophisticated tasks.
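The locality loss described above is easy to see in a minimal NumPy sketch (not from the thread; the function name and toy image are my own illustration): each non-overlapping 2x2 pooling layer halves the resolution, so after a few layers the feature map only records "something is present", not where.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max-pooling: keep the largest value in each
    size x size window, discarding its exact position within the window."""
    h, w = x.shape
    return (x[:h - h % size, :w - w % size]
            .reshape(h // size, size, w // size, size)
            .max(axis=(1, 3)))

# An 8x8 toy "image" with a single bright pixel at row 2, column 5.
img = np.zeros((8, 8))
img[2, 5] = 1.0

# Resolution halves at each layer; locality information degrades.
p1 = max_pool2d(img)   # 4x4: the pixel is localized to one of 16 cells
p2 = max_pool2d(p1)    # 2x2: localized only to a quadrant
p3 = max_pool2d(p2)    # 1x1: "something bright exists", position lost
```

With overlapping pools (stride smaller than the window) some of this information survives longer, which is part of why the non-overlapping case in the comment above is the worst one.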

~~~
bainsfather
He also 'did not like' Support Vector Machines, back when they were the best
method for image recognition. His reason was that SVMs were a 'dead end':
they were not a step on the path to human-level image recognition. His
argument now seems pretty valid.

I think he is saying the same thing about Max Pooling. Just my guess.

------
bglazer
>I guess we should just train an RNN to output a caption so that it can tell
us what it thinks is there. Then maybe the philosophers and cognitive
scientists will stop telling us what our nets cannot do.

I wonder if he knew about the Stanford paper that demonstrates this? Or if he
just guessed this would happen.

[http://cs.stanford.edu/people/karpathy/deepimagesent/](http://cs.stanford.edu/people/karpathy/deepimagesent/)

~~~
Teodolfo
Or knew about the Google paper demonstrating this?

[http://googleresearch.blogspot.ca/2014/11/a-picture-is-worth...](http://googleresearch.blogspot.ca/2014/11/a-picture-is-worth-thousand-coherent.html)

Or the Toronto one?
[http://deeplearning.cs.toronto.edu/i2t](http://deeplearning.cs.toronto.edu/i2t)

Two institutions he is affiliated with.

------
dang
Url changed from [http://www.kdnuggets.com/2014/12/geoff-hinton-ama-neural-net...](http://www.kdnuggets.com/2014/12/geoff-hinton-ama-neural-networks-brain-machine-learning.html), which points to this.

