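Stage2:
Contains two convolutional layers and one pooling layer. The convolutional layers share the following configuration:

| Kernel size | Depth | Stride |
|-------------|-------|--------|
| 3 * 3 | 128 | 1 * 1 |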
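Stage3:
Contains four convolutional layers and one pooling layer. The convolutional layers share the following configuration:

| Kernel size | Depth | Stride |
|-------------|-------|--------|
| 3 * 3 | 256 | 1 * 1 |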
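Stage4:
Contains four convolutional layers and one pooling layer. The convolutional layers share the following configuration:

| Kernel size | Depth | Stride |
|-------------|-------|--------|
| 3 * 3 | 512 | 1 * 1 |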
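Stage5:
Contains four convolutional layers and one pooling layer. The convolutional layers share the following configuration:

| Kernel size | Depth | Stride |
|-------------|-------|--------|
| 3 * 3 | 512 | 1 * 1 |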
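Pooling layers
The network contains five pooling layers, one at the end of each stage. All of them have the same size:

| Kernel size | Stride |
|-------------|--------|
| 2 * 2 | 2 * 2 |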
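To see what the stage layout above implies for the feature-map size, here is a minimal sketch. It assumes a 224 * 224 input image and that the 3 * 3 convolutions use padding that preserves the spatial size; neither is stated in this section, and the function name `stage_output_sizes` is made up for illustration.

```python
def stage_output_sizes(input_size=224, num_stages=5, pool_stride=2):
    """Return the spatial size after the pooling layer of each stage.

    Assumption: the 3 * 3 convolutions (stride 1) keep the spatial size
    unchanged, so only the 2 * 2 pooling with stride 2 shrinks the map.
    """
    sizes = []
    size = input_size
    for _ in range(num_stages):
        size = size // pool_stride  # each pooling layer halves height and width
        sizes.append(size)
    return sizes


sizes = stage_output_sizes()
print(sizes)  # [112, 56, 28, 14, 7]

# Stage5 has depth 512, so the flattened output of the last pooling layer is:
flattened = sizes[-1] * sizes[-1] * 512
print(flattened)  # 25088
```

Under these assumptions the result, 25088, matches the input size of the `fc6` layer in the code listing later in this post.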
For the other hidden layers, the authors state the following in the paper:
"All hidden layers are equipped with the rectification (ReLU (Krizhevsky et al., 2012)) non-linearity. We note that none of our networks (except for one) contain Local Response Normalisation (LRN) normalisation (Krizhevsky et al., 2012): as will be shown in Sect. 4, such normalisation does not improve the performance on the ILSVRC dataset, but leads to increased memory consumption and computation time."
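The network therefore contains no LRN: according to the paper, LRN does not improve accuracy on ILSVRC while increasing memory consumption and computation time. After the convolutional stages, the features pass through three fully connected layers, and a softmax layer outputs the classification result over 1000 classes.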
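As a reminder of what the final softmax step does, here is a minimal pure-Python sketch. It is only an illustration of the formula, not the TensorFlow implementation; subtracting the maximum logit is a common numerical-stability trick and is an addition of this sketch.

```python
import math


def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs.index(max(probs)))  # 0: the largest score gets the largest probability
```

In VGG19 the input to this step is the 1000-element output of the last fully connected layer, one score per class.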
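2. Building the VGG19 network with TensorFlow

A TensorFlow implementation of VGG16 and VGG19 already exists. Before using it, you need to download the pre-trained parameter file vgg19.npy; the download address is: . The original VGG16/19 model code is at https://github.com/machrisaa/tensorflow-vgg (the weights file referenced there is no longer available). Based on that code, we made some adjustments to the VGG19 network to fit our own training needs and, as with AlexNet in the previous post, added fine-tuning code, which is introduced later.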
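A complete VGG19 network in TensorFlow, including my modifications, takes 160 lines of code; part of it is shown below. The network comes with weights already trained by the VGG team, so you can use it directly for image recognition and classification. If you have a different recognition task, you need to train it once on your own training set and save your own weights file.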
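We changed the original by adding the parameter num_class, which is the number of classes. If you have 100 categories of images to train on, this value must be set to 100, and so on.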
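One consequence of num_class is worth spelling out: the last fully connected layer, fc8, maps 4096 units to num_class outputs, so the pre-trained 1000-class weights for that layer only fit when num_class is 1000. The sketch below illustrates the shape check; the helper names `fc8_shape` and `can_reuse_pretrained_fc8` are hypothetical and are not part of the model code.

```python
def fc8_shape(num_class, fc7_units=4096):
    """Shape of the fc8 weight matrix: (inputs from fc7, number of classes)."""
    return (fc7_units, num_class)


def can_reuse_pretrained_fc8(num_class, pretrained_classes=1000):
    """Pre-trained fc8 weights fit only if the shapes are identical."""
    return fc8_shape(num_class) == fc8_shape(pretrained_classes)


print(fc8_shape(100))                  # (4096, 100)
print(can_reuse_pretrained_fc8(100))   # False
print(can_reuse_pretrained_fc8(1000))  # True
```

When the shapes differ, fc8 would presumably have to be initialized from scratch and learned during fine-tuning, while the earlier layers can still start from the pre-trained values.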
    import numpy as np
    import tensorflow as tf  # TensorFlow 1.x API (tf.cond, tf.nn.dropout with keep_prob)


    class Vgg19(object):
        """
        A trainable version VGG19.
        """

        def __init__(self, bgr_image, num_class, vgg19_npy_path=None, trainable=True, dropout=0.5):
            if vgg19_npy_path is not None:
                # The .npy file stores a pickled dict; NumPy >= 1.16.3 needs allow_pickle=True.
                self.data_dict = np.load(vgg19_npy_path, encoding='latin1', allow_pickle=True).item()
            else:
                self.data_dict = None
            self.BGR_IMAGE = bgr_image
            self.NUM_CLASS = num_class
            self.var_dict = {}
            self.trainable = trainable
            self.dropout = dropout
            self.build()

        # conv_layer, max_pool and fc_layer are helper methods of the full class;
        # they are not shown in this excerpt.
        def build(self, train_mode=None):
            # Stage 1: two 3*3 convolutions with depth 64, then pooling
            self.conv1_1 = self.conv_layer(self.BGR_IMAGE, 3, 64, "conv1_1")
            self.conv1_2 = self.conv_layer(self.conv1_1, 64, 64, "conv1_2")
            self.pool1 = self.max_pool(self.conv1_2, 'pool1')

            # Stage 2: two convolutions with depth 128, then pooling
            self.conv2_1 = self.conv_layer(self.pool1, 64, 128, "conv2_1")
            self.conv2_2 = self.conv_layer(self.conv2_1, 128, 128, "conv2_2")
            self.pool2 = self.max_pool(self.conv2_2, 'pool2')

            # Stage 3: four convolutions with depth 256, then pooling
            self.conv3_1 = self.conv_layer(self.pool2, 128, 256, "conv3_1")
            self.conv3_2 = self.conv_layer(self.conv3_1, 256, 256, "conv3_2")
            self.conv3_3 = self.conv_layer(self.conv3_2, 256, 256, "conv3_3")
            self.conv3_4 = self.conv_layer(self.conv3_3, 256, 256, "conv3_4")
            self.pool3 = self.max_pool(self.conv3_4, 'pool3')

            # Stage 4: four convolutions with depth 512, then pooling
            self.conv4_1 = self.conv_layer(self.pool3, 256, 512, "conv4_1")
            self.conv4_2 = self.conv_layer(self.conv4_1, 512, 512, "conv4_2")
            self.conv4_3 = self.conv_layer(self.conv4_2, 512, 512, "conv4_3")
            self.conv4_4 = self.conv_layer(self.conv4_3, 512, 512, "conv4_4")
            self.pool4 = self.max_pool(self.conv4_4, 'pool4')

            # Stage 5: four convolutions with depth 512, then pooling
            self.conv5_1 = self.conv_layer(self.pool4, 512, 512, "conv5_1")
            self.conv5_2 = self.conv_layer(self.conv5_1, 512, 512, "conv5_2")
            self.conv5_3 = self.conv_layer(self.conv5_2, 512, 512, "conv5_3")
            self.conv5_4 = self.conv_layer(self.conv5_3, 512, 512, "conv5_4")
            self.pool5 = self.max_pool(self.conv5_4, 'pool5')

            # Fully connected layers; 25088 is the flattened size of pool5
            self.fc6 = self.fc_layer(self.pool5, 25088, 4096, "fc6")
            self.relu6 = tf.nn.relu(self.fc6)
            # Dropout is applied only when a train_mode tensor is supplied and is true.
            # The original "elif train_mode:" branch could never run (it was reached
            # only when train_mode is None), so it is omitted; behavior is unchanged.
            if train_mode is not None:
                self.relu6 = tf.cond(train_mode,
                                     lambda: tf.nn.dropout(self.relu6, self.dropout),
                                     lambda: self.relu6)

            self.fc7 = self.fc_layer(self.relu6, 4096, 4096, "fc7")
            self.relu7 = tf.nn.relu(self.fc7)
            if train_mode is not None:
                self.relu7 = tf.cond(train_mode,
                                     lambda: tf.nn.dropout(self.relu7, self.dropout),
                                     lambda: self.relu7)

            # Output layer sized by num_class, followed by softmax
            self.fc8 = self.fc_layer(self.relu7, 4096, self.NUM_CLASS, "fc8")
            self.prob = tf.nn.softmax(self.fc8, name="prob")

            # Release the loaded weights once the graph is built
            self.data_dict = None
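The dropout handling after fc6 and fc7 can be hard to read inside a graph definition, so here is a minimal pure-Python sketch of the same idea: in training mode each activation is kept with probability keep_prob and scaled by 1 / keep_prob, otherwise the input passes through unchanged. The function `dropout` and its `rng` parameter are illustrative only and are not TensorFlow's implementation.

```python
import random


def dropout(values, keep_prob, train_mode, rng=None):
    """Drop each value with probability 1 - keep_prob during training.

    Kept values are scaled by 1 / keep_prob so the expected sum is unchanged;
    outside training the values are returned as they are.
    """
    if not train_mode:
        return list(values)
    rng = rng or random.Random()
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


print(dropout([1.0, 2.0, 3.0], keep_prob=0.5, train_mode=False))  # [1.0, 2.0, 3.0]

# In training mode every output is either 0.0 or the input scaled by 2.
train_out = dropout([1.0] * 1000, keep_prob=0.5, train_mode=True, rng=random.Random(0))
```

This mirrors the two branches passed to `tf.cond` in the listing: the first applies dropout, the second returns the activations untouched.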