
The number of layers is one of the factors controlling model complexity.

Interestingly, the number of units in the hidden layers isn't as important as the size of the weights of the hidden units. This is a famous result from the 1990s.

Hence, a principled way of controlling overfitting in NNs is: pick a large enough number of weights that you don't underfit, and apply L2 regularization to the hidden-unit weights. This is superior to fiddling with the number of hidden units in an unregularized net.
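For instance, a minimal sketch in PyTorch (layer sizes and the decay value are made up for illustration): over-size the hidden layer and let the optimizer's weight_decay term supply the L2 penalty, rather than hand-tuning the width.

    import torch
    import torch.nn as nn

    # Deliberately over-sized hidden layer; the L2 penalty, not the unit
    # count, does the work of controlling complexity.
    model = nn.Sequential(
        nn.Linear(100, 1024),
        nn.ReLU(),
        nn.Linear(1024, 10),
    )

    # weight_decay adds an L2 penalty on the parameters at each update,
    # roughly equivalent to adding lambda * ||w||^2 to the loss.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)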

A related result is that you can control model complexity by imposing sparsity on the activations of the hidden units.
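A rough sketch of that idea (the names and the 1e-3 penalty weight are placeholders): add an L1 term on the hidden activations to the task loss, which pushes most activations toward zero.

    import torch
    import torch.nn as nn

    hidden = nn.Linear(100, 1024)
    readout = nn.Linear(1024, 10)
    criterion = nn.CrossEntropyLoss()

    def loss_fn(x, y, sparsity_weight=1e-3):
        h = torch.relu(hidden(x))   # hidden activations
        logits = readout(h)
        # task loss plus an L1 penalty that encourages sparse activations
        return criterion(logits, y) + sparsity_weight * h.abs().mean()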




Thanks, very interesting! How do I know if I have enough regularization?


Both underfitting and overfitting cause poor generalization performance. You can use cross-validation to search for the regularization strength that gives the lowest validation error.
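For example, a sketch with scikit-learn (the alpha grid and network size are arbitrary): grid-search the L2 penalty and keep the value with the best cross-validated score.

    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPClassifier

    # alpha is MLPClassifier's L2 regularization strength.
    param_grid = {"alpha": [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]}
    search = GridSearchCV(
        MLPClassifier(hidden_layer_sizes=(256,), max_iter=500),
        param_grid,
        cv=5,
    )
    # search.fit(X_train, y_train)  # X_train, y_train: your training data
    # print(search.best_params_, search.best_score_)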


Do you have a reference?

Is L2 better than L1 in this regard? My experience is that L1 significantly outperforms L2 whenever overfitting (rather than noise / bad measurements) is the problem you are addressing.



