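PyTorch VGG16: Network Structure Explained and Source Code Walkthrough

1. Brief introduction

Significance: VGG showed that increasing the number of convolution kernels and the depth of the network improves classification accuracy.

Preprocessing: the mean RGB value computed over the training set is subtracted from each channel.

Characteristics:
1) Stacks of small convolution kernels (3x3) are used in place of large kernels (5x5 or 7x7).
2) Convolutional layers do not change the spatial size of a layer; the size is reduced only by max pooling.
3) The network is comparatively deep.

Advantages:
1) The structure is simple: the whole network uses the same kernel size (3x3) and the same max pooling size (2x2).
2) Stacking small kernels in place of large ones gives stronger expressive power and better network performance.

Disadvantages:
1) The network has many parameters, so training takes a long time and tuning is difficult.
2) It needs a lot of storage, which makes deployment harder. For example, the file that stores the VGG16 weights is over 500 MB.

Note that two stacked 3x3 convolutions can take the place of one 5x5 convolution, as the accompanying diagram illustrates.

2. Network structure

As the architecture diagram shows, every variant contains 5 blocks. The VGG family includes vgg11, vgg13, vgg16 and vgg19, where the number is the count of convolutional plus fully connected layers in that network; vgg16, for example, has 13 convolutional layers and 3 fully connected layers. The LRN entry in the vgg11 column stands for Local Response Normalization.

3. Source code walkthrough

First install torchvision on your machine. The VGG source code is in the models folder of the torchvision package, in a file named vgg.py. torchvision is an important and convenient package in the PyTorch ecosystem.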
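The claim in section 1 that two 3x3 convolutions can replace one 5x5 convolution can be checked with a little arithmetic. The sketch below is a minimal illustration, not part of torchvision: the helper names and the channel width of 64 are assumptions chosen only for the example, and it assumes stride-1 convolutions with the same number of input and output channels.

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of a stack of stride-1 convolutions."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1  # each stride-1 layer widens the field by k - 1
    return rf


def conv_weight_count(kernel_size, channels):
    """Weights of one conv layer with `channels` inputs and outputs (bias ignored)."""
    return kernel_size * kernel_size * channels * channels


channels = 64  # illustrative width, an assumption for this example only
two_3x3 = 2 * conv_weight_count(3, channels)
one_5x5 = conv_weight_count(5, channels)

print(stacked_receptive_field([3, 3]))     # 5, same as one 5x5 kernel
print(stacked_receptive_field([3, 3, 3]))  # 7, same as one 7x7 kernel
print(two_3x3, one_5x5)                    # 73728 102400
```

Under these assumptions the stacked version covers the same 5x5 window with fewer weights (18 per channel pair instead of 25), and it applies a second nonlinearity between the two layers.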
torchvision consists mainly of three subpackages: torchvision.datasets, torchvision.models and torchvision.transforms.

1) Import the required packages

    import torch
    import torch.nn as nn
    # Relative import: this line only works inside the torchvision.models package
    from .utils import load_state_dict_from_url

2) Names of all the networks and their pretrained weight files

    __all__ = [
        'VGG', 'vgg11', 'vgg11_bn', 'vgg13', 'vgg13_bn', 'vgg16', 'vgg16_bn',
        'vgg19_bn', 'vgg19',
    ]

    model_urls = {
        'vgg11': 'https://download.pytorch.org/models/vgg11-bbd30ac9.pth',
        'vgg13': 'https://download.pytorch.org/models/vgg13-c768596a.pth',
        'vgg16': 'https://download.pytorch.org/models/vgg16-397923af.pth',
        'vgg19': 'https://download.pytorch.org/models/vgg19-dcbb9e9d.pth',
        'vgg11_bn': 'https://download.pytorch.org/models/vgg11_bn-6002323d.pth',
        'vgg13_bn': 'https://download.pytorch.org/models/vgg13_bn-abd245e5.pth',
        'vgg16_bn': 'https://download.pytorch.org/models/vgg16_bn-6c64b313.pth',
        'vgg19_bn': 'https://download.pytorch.org/models/vgg19_bn-c79401a0.pth',
    }

3) Definition of the VGG class. features holds all the convolutional and pooling layers; avgpool is an average pooling layer (pooling is either average pooling or max pooling); classifier is the fully connected part, three layers in total; the _initialize_weights function initializes the network parameters.
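Two numbers in this article can be reproduced without PyTorch: the 512 * 7 * 7 input size of the classifier and the weight file of over 500 MB. The sketch below is a rough check under stated assumptions: a 224x224 RGB input (not stated in the article), float32 parameters at 4 bytes each, and the vgg16 layer list copied from torchvision's cfgs entry 'D'. The helpers trace_features and classifier_params are hypothetical names written for this illustration.

```python
# vgg16 layer list ("D"): ints are conv output channels, "M" is 2x2 max pooling.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def trace_features(cfg, size=224, in_channels=3):
    """Return (channels, spatial size, conv parameter count) after the feature stack."""
    params = 0
    for v in cfg:
        if v == "M":
            size //= 2  # 2x2 max pooling with stride 2 halves height and width
        else:
            # 3x3 conv with padding 1 keeps the spatial size; count weights + biases
            params += 3 * 3 * in_channels * v + v
            in_channels = v
    return in_channels, size, params


def classifier_params(in_features, num_classes=1000, hidden=4096):
    """Parameter count of the three Linear layers (weights + biases)."""
    dims = [in_features, hidden, hidden, num_classes]
    return sum(d_in * d_out + d_out for d_in, d_out in zip(dims, dims[1:]))


channels, size, conv_params = trace_features(VGG16_CFG)
fc_params = classifier_params(channels * size * size)
total = conv_params + fc_params

print(channels, size)      # 512 7
print(total)               # 138357544
print(total * 4 / 2**20)   # roughly 527.8 MiB of float32 weights
```

Five pooling layers shrink 224 to 7, which gives the 512 * 7 * 7 features fed to the first Linear layer. Most of the roughly 138 million parameters sit in the three fully connected layers, which is why the stored weights are so large.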
    class VGG(nn.Module):

        def __init__(self, features, num_classes=1000, init_weights=True):
            super(VGG, self).__init__()
            self.features = features
            # Pool to a fixed 7x7 map so the classifier input size is constant
            self.avgpool = nn.AdaptiveAvgPool2d((7, 7))
            self.classifier = nn.Sequential(
                nn.Linear(512 * 7 * 7, 4096),
                nn.ReLU(True),
                nn.Dropout(),
                nn.Linear(4096, 4096),
                nn.ReLU(True),
                nn.Dropout(),
                nn.Linear(4096, num_classes),
            )
            if init_weights:
                self._initialize_weights()

        def forward(self, x):
            x = self.features(x)
            x = self.avgpool(x)
            x = torch.flatten(x, 1)  # flatten everything except the batch dimension
            x = self.classifier(x)
            return x

        def _initialize_weights(self):
            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                    if m.bias is not None:
                        nn.init.constant_(m.bias, 0)
                elif isinstance(m, nn.BatchNorm2d):
                    nn.init.constant_(m.weight, 1)
                    nn.init.constant_(m.bias, 0)
                elif isinstance(m, nn.Linear):
                    nn.init.normal_(m.weight, 0, 0.01)
                    nn.init.constant_(m.bias, 0)

4) The following function adds the convolutional and pooling layers. nn.Sequential is an ordered container: the modules passed to it are added to the computation graph and executed in the order in which they were passed.

    def make_layers(cfg, batch_norm=False):
        layers = []
        in_channels = 3
        for v in cfg:
            if v == 'M':
                layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
            else:
                conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
                if batch_norm:
                    layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]
                else:
                    layers += [conv2d, nn.ReLU(inplace=True)]
                in_channels = v  # the next conv takes this layer's output channels
        return nn.Sequential(*layers)

5) A, B, D and E stand for vgg11, vgg13, vgg16 and vgg19 respectively. Each number is the output channel count of one convolutional layer, and 'M' marks a pooling layer.

    cfgs = {
        'A': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
        'B': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
        'D': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
        'E': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
    }

6) The interfaces for the different VGG networks are shown below.

    def _vgg(arch, cfg, batch_norm, pretrained, progress, **kwargs):
        if pretrained:
            # Skip random initialization; the weights are overwritten below
            kwargs['init_weights'] = False
        model = VGG(make_layers(cfgs[cfg], batch_norm=batch_norm), **kwargs)
        if pretrained:
            state_dict = load_state_dict_from_url(model_urls[arch],
                                                  progress=progress)
            model.load_state_dict(state_dict)
        return model


    def vgg11(pretrained=False, progress=True, **kwargs):
        r"""VGG 11-layer model (configuration "A") from
        "Very Deep Convolutional Networks For Large-Scale Image Recognition"

        Args:
            pretrained (bool): If True, returns a model pre-trained on ImageNet
            progress (bool): If True, displays a progress bar of the download to stderr
        """
        return _vgg('vgg11', 'A', False, pretrained, progress, **kwargs)


    def vgg11_bn(pretrained=False, progress=True, **kwargs):
        r"""VGG 11-layer model (configuration "A") with batch normalization
        "Very Deep Convolutional Networks For Large-Scale Image Recognition"

        Args:
            pretrained (bool): If True, returns a model pre-trained on ImageNet
            progress (bool): If True, displays a progress bar of the download to stderr
        """
        return _vgg('vgg11_bn', 'A', True, pretrained, progress, **kwargs)


    def vgg13(pretrained=False, progress=True, **kwargs):
        r"""VGG 13-layer model (configuration "B")
        "Very Deep Convolutional Networks For Large-Scale Image Recognition"

        Args:
            pretrained (bool): If True, returns a model pre-trained on ImageNet
            progress (bool): If True, displays a progress bar of the download to stderr
        """
        return _vgg('vgg13', 'B', False, pretrained, progress, **kwargs)


    def vgg13_bn(pretrained=False, progress=True, **kwargs):
        r"""VGG 13-layer model (configuration "B") with batch normalization
        "Very Deep Convolutional Networks For Large-Scale Image Recognition"

        Args:
            pretrained (bool): If True, returns a model pre-trained on ImageNet
            progress (bool): If True, displays a progress