He talks about a surprising number of cutting-edge achievements made by deep neural networks over just the last few months.
Unfortunately, even though it was posted to HN three times, it never made the front page.
Here is my summary and comment:
Great talk. I don't know much about artificial neural networks (ANN) and even less about natural ones, but I have the feeling that I learnt a lot from this video.
If I understand correctly, Hinton uses so many artificial neurons relative to the amount of training data that you would usually expect overfitting. However, his ANNs randomly shut off a substantial part (~50%) of the neurons during each learning iteration. He calls this "dropout". Therefore, a single ANN represents many different models. Most models never get trained, but they exist in the ANN because they share their weights with the trained models. This learning method avoids overspecializing and therefore improves robustness with respect to new data, but it also allows for arbitrary combinations of different models, which tremendously enlarges the pool of testable models.
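To make the dropout step described above a bit more concrete, here is a minimal sketch in plain Python. It is only my illustration, not Hinton's code: the names `dropout_mask` and `hidden_layer`, the layer sizes, the ReLU activation and the random weights are all assumptions. It shows only the forward pass, where each iteration switches off a random half of the hidden units while every sampled sub-network reuses the same weights.

```python
import random

def dropout_mask(n_units, keep_prob, rng):
    """Return one 0/1 flag per unit; 1 means the unit stays active."""
    return [1 if rng.random() < keep_prob else 0 for _ in range(n_units)]

def hidden_layer(inputs, weights, mask):
    """ReLU hidden layer (assumed); masked units output 0 for this pass."""
    outputs = []
    for unit_weights, keep in zip(weights, mask):
        activation = sum(w * x for w, x in zip(unit_weights, inputs))
        outputs.append(max(0.0, activation) * keep)
    return outputs

rng = random.Random(0)
n_hidden = 8  # hypothetical layer size
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
x = [0.5, -1.0, 2.0]  # made-up input

# Each iteration samples a different sub-network over the same shared weights.
masks = [dropout_mask(n_hidden, 0.5, rng) for _ in range(4)]
activations = [hidden_layer(x, weights, m) for m in masks]

# With n_hidden units in this layer there are 2 ** n_hidden possible masks.
n_possible_models = 2 ** n_hidden
```

Even this toy layer has 256 possible masks, which is why most of the sub-networks are never sampled during training yet still inherit trained weights.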
When using or testing these ANNs, you also "drop out" neurons during every prediction. In practice, every rerun predicts a different result by using a different model. Afterwards, these results are averaged. The more results, the higher the chance that the classification is correct.
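The prediction-averaging idea above can be sketched like this. Again, it is a rough illustration under my own assumptions: `predict_once` and `predict_averaged` are hypothetical names, the network is a tiny one-hidden-layer regressor with random weights, and the keep probability of 0.5 simply mirrors the ~50% mentioned above.

```python
import random

def predict_once(inputs, hidden_weights, output_weights, keep_prob, rng):
    """One stochastic forward pass through a randomly thinned network."""
    total = 0.0
    for unit_weights, out_w in zip(hidden_weights, output_weights):
        if rng.random() >= keep_prob:
            continue  # this hidden unit is dropped for this prediction
        activation = sum(w * x for w, x in zip(unit_weights, inputs))
        total += out_w * max(0.0, activation)  # ReLU is an assumption
    return total

def predict_averaged(inputs, hidden_weights, output_weights, keep_prob, n_runs, rng):
    """Average the outputs of n_runs independently thinned networks."""
    runs = [
        predict_once(inputs, hidden_weights, output_weights, keep_prob, rng)
        for _ in range(n_runs)
    ]
    return sum(runs) / n_runs

rng = random.Random(1)
hidden_weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]
output_weights = [rng.uniform(-1, 1) for _ in range(8)]
x = [0.5, -1.0, 2.0]  # made-up input

single = predict_once(x, hidden_weights, output_weights, 0.5, rng)
averaged = predict_averaged(x, hidden_weights, output_weights, 0.5, 1000, rng)
```

In this toy setup the average settles towards half of what the full network would output, because each hidden unit contributes half of the time. As far as I know, a cheaper alternative that is often used is a single pass with all units active and the outgoing weights scaled by the keep probability.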
Hinton argues that our brains work in a similar way. This explains, among other things:
a) Why are neurons firing in a random manner? It's an equivalent implementation to his "dropout" where only a part of the neurons is used at any given time.
b) Why does spending more time on a decision improve the likelihood of success? Even though there might be more at work, his theory alone is able to explain the effect. The longer you think, the more models you test, simply by rerunning the prediction. The more such predictions, the higher the chance that the average prediction is correct.
To me, the latter also explains in an intuitive way why the "wisdom of the crowds" works well when predicting events that many people have a halfway sophisticated understanding of. Examples are betting on sporting events or a movie's box-office success. As far as I know, no single expert beats the "wisdom of the crowd" in such cases.
What I would like to know is: how many random-model-based predictions do you need until the improvement rate becomes insignificant? In other words, would humans act much smarter if they could afford more time to think about decisions? Put another way, does the "wisdom of the crowd" effect stem from the larger number of combined neurons and the resulting diversity of available models, or from the larger number of predictions used to compute the average? How much less effective would the crowd be if fewer people made more predictions each (e.g. a "top 5"), or if the crowd was made up of a few cloned individuals?
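A toy simulation gives at least a feel for the questions above. It rests on strong assumptions of my own: every prediction is the true value (set to 0 here) plus Gaussian noise, `shared_sd` is a made-up stand-in for an error that all predictors have in common (the "cloned crowd" case), and `independent_sd` is the part that differs between predictions. The function name `rms_error_of_average` is hypothetical too.

```python
import random

def rms_error_of_average(n_predictions, shared_sd, independent_sd, trials, rng):
    """RMS error of the mean of n noisy predictions of a true value of 0."""
    squared_errors = []
    for _ in range(trials):
        shared = rng.gauss(0.0, shared_sd)  # error common to every prediction
        total = 0.0
        for _ in range(n_predictions):
            total += shared + rng.gauss(0.0, independent_sd)
        mean = total / n_predictions
        squared_errors.append(mean ** 2)
    return (sum(squared_errors) / trials) ** 0.5

rng = random.Random(42)
sizes = (1, 4, 16, 64)

# Diverse crowd: errors are fully independent, so averaging keeps helping,
# but the error only shrinks roughly like 1 / sqrt(n).
diverse = {n: rms_error_of_average(n, 0.0, 1.0, 4000, rng) for n in sizes}

# Cloned crowd: a shared error that averaging cannot remove.
cloned = {n: rms_error_of_average(n, 1.0, 1.0, 4000, rng) for n in sizes}

for n in sizes:
    print(n, round(diverse[n], 3), round(cloned[n], 3))
```

Under these assumptions, each halving of the error costs four times as many predictions, so the returns diminish quickly. When part of the error is shared, the averaged error stops improving at about the size of that shared part, no matter how many predictions are added. Whether real crowds or real brains behave like this is exactly the open question.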
If the limiting factor for humans is the time needed to predict based on many different models, and not the number of neurons we have, this would have interesting implications. Once a single computer had sufficient complexity to compete with the human brain, you could merely build more of these computers and average their opinions to arrive at better conclusions than any human could. Computers wouldn't be just faster than humans, they would be much smarter, too.
I'm talking about brain-like ANN implementations here. Obviously, we already use specialized software to predict complex events like the weather better than any single human could. But these are not general-purpose machines.
Did I about cover all the problems with groupthink and hivemind?
It's a shame, though. I hang out in new a lot, and I urge other HN'ers to hang out there too.
Upvote only interesting things