
None of the parties mentioned actually deny the above equivalence. Backpropagation became a popular idea in deep learning because people started developing continuous models, in which the output (and hence the error) is a continuous, differentiable function of the inputs and the weights. That made it possible to compute the gradients via the chain rule and to train with gradient descent methods. It was this shift from discrete units to continuous units that was termed error backpropagation, not just the chain rule itself.
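
For concreteness, here is a minimal sketch in plain NumPy (the data, names, and learning rate are illustrative, not from the comment above): a single sigmoid unit trained by gradient descent, where the "backpropagation" step is literally the chain rule applied to a differentiable loss.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(100, 3))                              # toy inputs
    y = (x @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)     # toy targets

    w = np.zeros(3)
    lr = 0.1
    for _ in range(500):
        z = x @ w                            # pre-activation
        p = 1.0 / (1.0 + np.exp(-z))         # sigmoid: continuous and differentiable
        # Squared-error loss L = mean((p - y)^2); chain rule gives
        # dL/dw = dL/dp * dp/dz * dz/dw
        grad = (2 * (p - y) * p * (1 - p)) @ x / len(y)
        w -= lr * grad                       # gradient descent step

With a discrete (step-function) unit, dp/dz is zero almost everywhere, so the same chain-rule decomposition yields no usable gradient; the continuous sigmoid is what makes the update step meaningful.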



It's kind of unfortunate, as it has forced everything to be continuous, which is neither very computationally efficient nor easily interpreted by humans.



