
backpropagation didn't get worked out for training neural nets until the '80s, weirdly. before then people were using things like genetic algorithms to train them.

and it was only in the last decade that the vanishing gradients problem was tamed.

my impression is that ML researchers were stumbling along in the mathematical dark, until they hit a combination (deep neural nets trained via stochastic gradient descent with ReLU activation) that worked like magic and ended the AI winter.
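
A toy illustration (my own sketch, not from the thread) of why that combination matters: push a gradient back through a deep stack of random layers and the sigmoid's derivative (at most 0.25) shrinks it toward zero, while ReLU's derivative is exactly 1 for active units, so with a suitable init the gradient survives. Depth, width, and the He-style init below are my assumptions.

    # Toy demo: backprop a gradient through a deep stack and compare
    # sigmoid vs ReLU activations. All sizes/inits are illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    depth, width = 50, 64

    def sigmoid_grad(x):
        s = 1.0 / (1.0 + np.exp(-x))
        return s * (1.0 - s)              # never exceeds 0.25

    def relu_grad(x):
        return (x > 0).astype(float)      # exactly 1 for active units

    def grad_norm_through_stack(act_grad):
        g = np.ones(width)
        for _ in range(depth):
            W = rng.normal(0, np.sqrt(2.0 / width), (width, width))  # He-style scale
            pre = rng.normal(size=width)   # stand-in pre-activations for the toy
            g = (W.T @ g) * act_grad(pre)  # chain rule through one layer
        return np.linalg.norm(g)

    print("sigmoid:", grad_norm_through_stack(sigmoid_grad))  # vanishes toward 0
    print("relu:   ", grad_norm_through_stack(relu_grad))     # stays roughly O(1)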




Right, and the practice of neural networks has significantly overshot the mathematical theory. Most of the techniques we know work and produce good models have poorly understood theoretical underpinnings: the whole overparameterization thing, for example, or generalization in general. There's a lot that "just works" without us knowing why, hence the stumbling around and landing on stuff that works.


> and it was only in the last decade that the vanishing gradients problem was tamed.

One of the big pieces was Schmidhuber's lab's highway networks, which carried the LSTM-style gating idea from their '90s work over to very deep feedforward nets, but it didn't really land until a more limited version (the plain residual connection) was rediscovered.
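
For anyone who hasn't seen them, a rough sketch (mine, with made-up shapes and inits) of the two ideas: a highway layer uses an LSTM-style gate to mix a transform of the input with the input itself, while the residual block that later took off drops the gate and just adds the transform.

    # Minimal sketch of a highway layer vs a plain residual layer.
    # Dimensions, inits, and the tanh transform are illustrative choices.
    import numpy as np

    rng = np.random.default_rng(0)
    d = 64
    W_h = rng.normal(0, 0.1, (d, d))   # weights of the transform H(x)
    W_t = rng.normal(0, 0.1, (d, d))   # weights of the transform gate T(x)
    b_t = -2.0 * np.ones(d)            # negative bias pushes the gate toward carrying x

    def highway_layer(x):
        h = np.tanh(W_h @ x)                        # candidate transform H(x)
        t = 1.0 / (1.0 + np.exp(-(W_t @ x + b_t)))  # gate T(x) in (0, 1)
        return t * h + (1.0 - t) * x                # y = T*H(x) + (1-T)*x

    def residual_layer(x):
        # The "more limited version": no gate, just x + F(x).
        return x + np.tanh(W_h @ x)

    x = rng.normal(size=d)
    print(highway_layer(x).shape, residual_layer(x).shape)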



