Right way to Fine Tune - Train a fully connected layer as a separate step

2017-03-20 23:29:59

I'm fine-tuning CaffeNet and it works really well, but then I read this in the Keras blog entry on fine-tuning (they use a trained VGG16 model):

"in order to perform fine-tuning, all layers should start with properly trained weights:

for instance you should not slap a randomly initialized fully-connected network on top of a pre-trained convolutional base.

This is because the large gradient updates triggered by the randomly initialized weights would wreck the learned weights in the convolutional base.

In our case this is why we first train the top-level classifier, and only then start fine-tuning convolutional weights alongside it."

So, as a separate step in fine-tuning, they save the output of the last layer before the fully connected layers (the "bottleneck features") and train a "small fully-connected model" on those features. Only then do they put the newly trained fully connected layers on top of the whole net and train the "last convolutional block".
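
To check that I understand the procedure, here is a toy sketch in plain Python. It is not the Keras or Caffe code. The two-weight `base_w` is a hypothetical stand-in for the pre-trained convolutional base, the scalar `head_a` stands in for the new fully connected classifier, and the data, initial values, and learning rates are all made up for illustration. It compares the gradient that reaches the base with an untrained head versus a head trained first on cached bottleneck features, then fine-tunes both together.

```python
# Toy two-stage fine-tuning in plain Python (no deep-learning library).
# Hypothetical stand-ins: `base_w` plays the pre-trained convolutional base,
# `head_a` plays the new fully connected classifier.

# Tiny made-up regression dataset: targets follow y = 2*x1 + 1*x2.
X = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
Y = [2.0, 1.0, 3.0, 5.0]

def features(base_w, x):
    """Output of the 'base': the bottleneck feature for one input."""
    return base_w[0] * x[0] + base_w[1] * x[1]

def loss_and_grads(base_w, head_a):
    """Mean squared error and its gradients w.r.t. the head and the base."""
    n = len(X)
    loss, grad_a, grad_w = 0.0, 0.0, [0.0, 0.0]
    for x, y in zip(X, Y):
        f = features(base_w, x)
        err = head_a * f - y
        loss += err * err / n
        grad_a += 2.0 * err * f / n
        grad_w[0] += 2.0 * err * head_a * x[0] / n
        grad_w[1] += 2.0 * err * head_a * x[1] / n
    return loss, grad_a, grad_w

def norm(v):
    """Euclidean length of a 2-element vector."""
    return (v[0] ** 2 + v[1] ** 2) ** 0.5

pretrained_w = [1.0, 0.3]   # assumed "pre-trained" base weights
random_a = -0.5             # stand-in for a randomly initialized head

# Gradient that would hit the base if we fine-tuned with the untrained head.
loss_random, _, gw = loss_and_grads(pretrained_w, random_a)
base_grad_random_head = norm(gw)

# Stage 1: cache bottleneck features once, then train only the head on them.
bottleneck = [features(pretrained_w, x) for x in X]
head_a = random_a
for _ in range(100):
    grad = sum(2.0 * (head_a * f - y) * f for f, y in zip(bottleneck, Y)) / len(X)
    head_a -= 0.1 * grad

loss_stage1, _, gw = loss_and_grads(pretrained_w, head_a)
base_grad_trained_head = norm(gw)

# Stage 2: put the trained head on top and fine-tune base and head together
# with a small learning rate.
base_w = list(pretrained_w)
for _ in range(200):
    _, ga, gw = loss_and_grads(base_w, head_a)
    head_a -= 0.01 * ga
    base_w = [base_w[0] - 0.01 * gw[0], base_w[1] - 0.01 * gw[1]]

loss_stage2, _, _ = loss_and_grads(base_w, head_a)

print("base gradient norm (random head, trained head):",
      base_grad_random_head, base_grad_trained_head)
print("loss (random head, after stage 1, after stage 2):",
      loss_random, loss_stage1, loss_stage2)
```

In this toy setup the gradient reaching the base is much smaller once the head has been trained, which is the effect the blog describes. Whether the gap is as large in a real network like CaffeNet or VGG16 is exactly what I'm unsure about.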