Very Deep Convolutional Networks for Large-Scale Image Recognition (VGG)
Published year : 2014 Paper URL
What
A neural network architecture built from very small 3x3 convolution filters. The paper shows that pushing the depth to 16-19 weight layers gives a significant improvement over prior-art configurations.
How
1. Using stacks of 3x3 conv. layers instead of a single 7x7 or 5x5 layer, which reduces the number of parameters in the model while covering the same receptive field.
2. Incorporating 1x1 conv. layers to increase the nonlinearity of the decision function without affecting the receptive fields of the conv. layers.
3. To initialize the weights of the deeper networks better, the authors first train configuration A in the above picture with random parameter initialization, then use the weights of network A to initialize the other networks B, C, D and E before training them.
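The parameter saving in point 1 can be checked with a little arithmetic. The sketch below is illustrative only: the helper names and the channel count `C = 256` are assumptions made for this example, biases are ignored, and every layer is assumed to have C input and C output channels with stride 1.

```python
def stack_params(kernel: int, layers: int, channels: int) -> int:
    """Weights (biases ignored) in `layers` stacked kernel x kernel conv layers,
    each with `channels` input and `channels` output channels."""
    return layers * kernel * kernel * channels * channels


def stack_receptive_field(kernel: int, layers: int) -> int:
    """Effective receptive field of stacked stride-1 conv layers with no pooling
    in between: each extra layer widens the field by (kernel - 1)."""
    return 1 + layers * (kernel - 1)


C = 256  # hypothetical channel count, chosen only for illustration

comparisons = [
    ("three 3x3 layers", 3, 3),
    ("one 7x7 layer", 7, 1),
    ("two 3x3 layers", 3, 2),
    ("one 5x5 layer", 5, 1),
]
for name, kernel, layers in comparisons:
    field = stack_receptive_field(kernel, layers)
    params = stack_params(kernel, layers, C)
    print(f"{name}: receptive field {field}x{field}, {params:,} weights")
```

Three stacked 3x3 layers see the same 7x7 region as one 7x7 layer but need 27C² weights instead of 49C², and they apply three nonlinearities instead of one.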
From the results below we can see that performance improves as the network gets deeper.
- Note:
- VGG stands for Visual Geometry Group, a research group at the University of Oxford.
- VGG16 means 13 conv. layers + 3 FC layers.
- VGG19 means 16 conv. layers + 3 FC layers
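The layer counts in the note can be verified by writing the two configurations out as lists. This is a small sketch, not code from the paper: the list encoding (`"M"` for max pooling) and the function names are assumptions made for this example, and the parameter count assumes a 224x224 RGB input and 1000 output classes.

```python
# Numbers are output channels of 3x3 conv layers; "M" marks a 2x2 max-pooling layer.
VGG16 = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
         512, 512, 512, "M", 512, 512, 512, "M"]
VGG19 = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
         512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]
FC_LAYERS = [4096, 4096, 1000]  # the three fully connected layers


def count_weight_layers(cfg):
    """Return (number of conv layers, number of FC layers)."""
    conv = sum(1 for v in cfg if v != "M")
    return conv, len(FC_LAYERS)


def count_parameters(cfg, in_channels=3, input_size=224):
    """Total weights and biases for the given configuration."""
    total, channels, size = 0, in_channels, input_size
    for v in cfg:
        if v == "M":
            size //= 2  # each pooling layer halves the spatial resolution
        else:
            total += 3 * 3 * channels * v + v  # 3x3 kernel weights + biases
            channels = v
    features = channels * size * size  # flattened input to the first FC layer
    for units in FC_LAYERS:
        total += features * units + units
        features = units
    return total


for name, cfg in [("VGG16", VGG16), ("VGG19", VGG19)]:
    conv, fc = count_weight_layers(cfg)
    print(f"{name}: {conv} conv + {fc} FC layers, {count_parameters(cfg):,} parameters")
```

Under these assumptions VGG16 comes out at roughly 138 million parameters and VGG19 at roughly 144 million, most of them in the first fully connected layer.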
And?
The drawback of the VGG network is that it has a very large number of parameters and takes a long time to train. We therefore usually do not train it from scratch; instead, we load a pre-trained model, for example from TensorFlow, and use it as the starting point for other projects.
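One common way to reuse a pre-trained VGG in TensorFlow is sketched below. It is a minimal example under assumptions, not the only approach: the function name `build_transfer_model`, the 224x224 input size, and the small classification head are choices made for this illustration, and `weights="imagenet"` downloads the ImageNet weights on first use.

```python
import tensorflow as tf


def build_transfer_model(num_classes, weights="imagenet"):
    """Frozen VGG16 convolutional base with a new, trainable classification head."""
    # include_top=False drops the three original FC layers and keeps the 13 conv layers.
    base = tf.keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=(224, 224, 3)
    )
    base.trainable = False  # keep the pre-trained conv weights fixed

    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = base(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)
```

Calling `build_transfer_model(num_classes=10)` returns a model in which only the new dense layer is trained. Input images should be passed through `tf.keras.applications.vgg16.preprocess_input` first so they match what the pre-trained weights expect.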