
Transfer Learning and Fine-Tuning Deep Convolutional Neural Networks - rasmi
http://blog.revolutionanalytics.com/2016/08/deep-learning-part-2.html
======
eveningcoffee
What I have not seen is an explanation of how to transfer the data
normalization parameters. Say you apply contrast normalization to the
images you use to train the first network.

Is there anything better you can do than applying the same parameters to the
second training set?

~~~
argonaut
If you had enough data, I imagine you could just re-learn the parameters on
the new dataset (and also finetune the network)?
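
A minimal sketch of the two options being discussed, assuming plain per-dataset mean/std normalization as a stand-in for whatever contrast normalization the first network used. The helper names (`fit_normalizer`, `normalize`) and the tiny pixel lists are hypothetical, chosen only for illustration:

```python
from statistics import fmean, pstdev


def fit_normalizer(pixels):
    """Estimate (mean, std) from a flat list of pixel intensities."""
    mean = fmean(pixels)
    std = pstdev(pixels, mu=mean)
    # Guard against a constant image set, where std would be zero.
    return mean, (std if std > 0 else 1.0)


def normalize(pixels, mean, std):
    """Shift and scale pixel intensities with the given parameters."""
    return [(p - mean) / std for p in pixels]


# Toy stand-ins: the new dataset is brighter than the original one.
source_pixels = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
target_pixels = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

src_mean, src_std = fit_normalizer(source_pixels)
tgt_mean, tgt_std = fit_normalizer(target_pixels)

# Option 1: reuse the parameters from the first training set.
reused = normalize(target_pixels, src_mean, src_std)

# Option 2: re-estimate the parameters on the new dataset.
refit = normalize(target_pixels, tgt_mean, tgt_std)
```

With reused parameters the new data is no longer centered at zero, so the pretrained layers see shifted inputs; re-estimating centers the data but changes the input distribution the pretrained weights were fit to, which is presumably why fine-tuning alongside it is suggested.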

------
iraphael
This is the interesting part:

> New dataset is smaller in size and similar in content compared to original
> dataset: If the data is small, it is not a good idea to fine-tune the DCNN
> due to overfitting concerns. Since the data is similar to the original data,
> we expect higher-level features in the DCNN to be relevant to this dataset
> as well. Hence, the best idea might be to train a linear classifier on the
> CNN-features.

> New dataset is relatively large in size and similar in content compared to
> the original dataset: Since we have more data, we can have more confidence
> that we would not overfit if we were to try to fine-tune through the full
> network.

> New dataset is smaller in size but very different in content compared to the
> original dataset: Since the data is small, it is likely best to only train a
> linear classifier. Since the dataset is very different, it might not be best
> to train the classifier from the top of the network, which contains more
> dataset-specific features. Instead, it might work better to train a
> classifier from activations somewhere earlier in the network.

> New dataset is relatively large in size and very different in content
> compared to the original dataset: Since the dataset is very large, we may
> expect that we can afford to train a DCNN from scratch. However, in practice
> it is very often still beneficial to initialize with weights from a
> pre-trained model. In this case, we would have enough data and confidence to
> fine-tune through the entire network.
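
The four cases quoted above boil down to a two-by-two lookup on dataset size and similarity. A rough sketch, where the function name `transfer_strategy` and its boolean parameters are hypothetical and the returned strings simply paraphrase the quoted advice:

```python
def transfer_strategy(dataset_is_large: bool, similar_to_original: bool) -> str:
    """Map the quoted rules of thumb onto a (size, similarity) pair."""
    if not dataset_is_large and similar_to_original:
        # Small + similar: fine-tuning risks overfitting; top features transfer.
        return "train a linear classifier on top-layer CNN features"
    if dataset_is_large and similar_to_original:
        # Large + similar: enough data to fine-tune without overfitting.
        return "fine-tune through the full network"
    if not dataset_is_large and not similar_to_original:
        # Small + different: top layers are too dataset-specific.
        return "train a linear classifier on activations from an earlier layer"
    # Large + different: could train from scratch, but pretrained init still helps.
    return "initialize from pre-trained weights and fine-tune the entire network"


for large in (False, True):
    for similar in (True, False):
        print(large, similar, "->", transfer_strategy(large, similar))
```

What counts as "large" or "similar" is left to the caller; the quoted text gives no thresholds, so none are assumed here.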

------
rasmi
This is one of the most concise and accessible explanations of fine-tuning
CNNs that I've come across. I hope someone finds it as helpful as I did.

