My thinking is that if this line of reasoning is true, it suggests we may as well be using kernel methods or other high-dimensional feature-generation systems with better algebraic properties, which we might be able to train faster or interpret more easily. My understanding is that those techniques haven't been especially competitive with deep neural nets on large data sets, so the high-dimensionality argument for gradient descent is insufficient to explain the success of deep neural nets.
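To make that concrete, here's a rough sketch of the sort of alternative I have in mind: random Fourier features approximating an RBF kernel, with a convex linear model fit on top. The library, parameters, and toy data here are purely illustrative, not a claim about what would actually work at scale:

    # Lift inputs into a high-dimensional random feature space
    # (an approximation of an RBF kernel), then fit a plain linear
    # model. Training is convex and the feature map is easy to
    # reason about algebraically, unlike a deep net.
    import numpy as np
    from sklearn.kernel_approximation import RBFSampler
    from sklearn.linear_model import RidgeClassifier
    from sklearn.pipeline import make_pipeline

    rng = np.random.RandomState(0)
    X = rng.randn(1000, 20)                  # stand-in toy data
    y = (X[:, 0] * X[:, 1] > 0).astype(int)  # a nonlinear target

    model = make_pipeline(
        RBFSampler(gamma=0.5, n_components=5000, random_state=0),
        RidgeClassifier(alpha=1.0),
    )
    model.fit(X, y)
    print(model.score(X, y))

The whole pipeline is well understood algebraically; the puzzle is why this kind of approach stops being competitive with deep nets on large data sets.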


