backpropagation didn't get worked out for training neural networks until the '80s, weirdly. before then people were using genetic algorithms to train them.
and it was only in the last decade that the vanishing gradients problem was tamed.
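to make the vanishing-gradient bit concrete, here's a toy numpy sketch (my own illustration, not from any particular paper): the chain rule multiplies one local activation derivative per layer, sigmoid's is at most 0.25 so the product shrinks geometrically with depth, while ReLU's is exactly 1 wherever the unit is active.

```python
import numpy as np

rng = np.random.default_rng(0)
depth = 30

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Product of local activation derivatives along one path through the net;
# this is the factor the chain rule applies to the gradient at the bottom layer.
sig_grad = 1.0
relu_grad = 1.0

for _ in range(depth):
    z = rng.normal()           # made-up pre-activation at this layer
    s = sigmoid(z)
    sig_grad *= s * (1.0 - s)  # sigmoid' is at most 0.25, so the product collapses
    relu_grad *= 1.0           # ReLU' is exactly 1, assuming the path stays active

print(f"sigmoid path after {depth} layers: {sig_grad:.2e}")  # vanishingly small
print(f"ReLU path after {depth} layers:    {relu_grad:.2e}")  # stays at 1
```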
my impression is that ML researchers were stumbling along in the mathematical dark, until they hit a combination (deep neural nets trained via stochastic gradient descent with ReLU activation) that worked like magic and ended the AI winter.
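and the funny part is that the winning recipe is now about a dozen lines of library code. here's a rough sketch of the combination with stock PyTorch (toy data and hyperparameters are mine, nothing canonical): a small deep ReLU net fit with plain minibatch SGD.

```python
import torch
from torch import nn

torch.manual_seed(0)

# toy regression data: learn y = sin(x) on [-3, 3]
x = torch.linspace(-3, 3, 256).unsqueeze(1)
y = torch.sin(x)

# the combination in question: a deep stack of ReLU layers...
model = nn.Sequential(
    nn.Linear(1, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)

# ...trained with plain stochastic gradient descent
opt = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

for step in range(2000):
    idx = torch.randint(0, x.shape[0], (32,))  # random minibatch
    loss = loss_fn(model(x[idx]), y[idx])
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 500 == 0:
        print(f"step {step:4d}  loss {loss.item():.4f}")
```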
Right, and the practice of neural networks has significantly overshot the mathematical theory. Most of the techniques we know work and produce good models have poorly understood theoretical underpinnings. The whole overparameterized thing, for example, or generalization generally. There's a lot that "just works" but we don't know why, hence the stumbling around and landing on stuff that works.