> But I’ve always thought that the major advantage of using deep learning over simpler models is that if you have a massive amount of data you can fit a massive number of parameters.

The major advantage of deep learning is not that it works better on more data. It's that it automatically learns features that would otherwise take expert humans a lot of time and energy to figure out and hardcode into the system.




I don't really think of it as learning. I think of it as statistics on crack.


That's an advantage of convolutional nets. Deep fully connected nets don't do that afaik.


They do, at least as far as I understand the statement. Historically the big benefit was training them layer by layer, which was like training a feature detector, then a feature-of-features detector, and so on. If that's still how they're trained (it's been nearly a decade for me now), then they discover features rather than you engineering them.

This meant you could pretrain on large amounts of unlabelled data and then fine-tune on a small labelled set.
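Roughly, the greedy layer-wise scheme looks like this (a minimal sketch in PyTorch; the layer sizes, activation, and training loop are illustrative, not from any particular paper):

  import torch
  import torch.nn as nn

  # Illustrative layer sizes; real values depend on the data.
  sizes = [784, 256, 64]

  def pretrain_layer(encoder, data, epochs=5, lr=1e-3):
      # Train one layer as an autoencoder: reconstruct its own input.
      decoder = nn.Linear(encoder.out_features, encoder.in_features)
      params = list(encoder.parameters()) + list(decoder.parameters())
      opt = torch.optim.Adam(params, lr=lr)
      for _ in range(epochs):
          opt.zero_grad()
          hidden = torch.sigmoid(encoder(data))
          loss = nn.functional.mse_loss(decoder(hidden), data)
          loss.backward()
          opt.step()
      return torch.sigmoid(encoder(data)).detach()  # features for next layer

  features = torch.rand(1000, sizes[0])  # stand-in for unlabelled data
  encoders = []
  for d_in, d_out in zip(sizes, sizes[1:]):
      layer = nn.Linear(d_in, d_out)
      features = pretrain_layer(layer, features)  # features of features
      encoders.append(layer)
  # The stacked encoders are then fine-tuned on the small labelled set.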


Yeah, now that I think about it, my statement didn't make sense, since each intermediate layer computes a projection of the previous one, which is technically feature learning. I still disagree with the original comment, though, because the intermediate representations computed by a fully connected network are nothing like the features a human would engineer by hand. The ones learned by a convolutional layer are closer to human-understandable features.


Loosely speaking, a convolutional layer is a fully connected layer with local connectivity and weight sharing baked in: it computes a function a dense layer could also represent, but with a fraction of the parameters and computation.
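You can see the savings directly in the parameter counts (a sketch; the input size and channel counts here are made up for illustration):

  import torch.nn as nn

  # A 32x32 RGB input mapped to 16 feature maps of the same spatial size.
  conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
  dense = nn.Linear(3 * 32 * 32, 16 * 32 * 32)  # same shapes, no sharing

  count = lambda m: sum(p.numel() for p in m.parameters())
  print(count(conv))   # 448
  print(count(dense))  # 50,348,032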


All kinds of other structures do that too - for example, the family of recurrent networks applies the same weight reuse across time steps in many non-visual problems.
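Same idea in a bare-bones recurrent loop (a sketch; the sizes are arbitrary): one weight matrix reused at every time step instead of a separate one per step.

  import torch
  import torch.nn as nn

  cell = nn.RNNCell(input_size=8, hidden_size=16)  # one set of weights...
  x = torch.rand(5, 3, 8)                          # (time, batch, features)
  h = torch.zeros(3, 16)
  for t in range(x.size(0)):
      h = cell(x[t], h)                            # ...reused at every step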



