
A PyTorch Implementation of AlexNet

1. Original Paper

ImageNet Classification with Deep Convolutional Neural Networks

2. Abstract

We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images of the ImageNet LSVRC-2010 contest into 1000 classes. On the test set, the network achieved top-1 and top-5 error rates of 37.5% and 17.0%, considerably better than the previous state of the art. The network has 60 million parameters and 650,000 neurons, and consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully connected layers ending in a 1000-way softmax. To make training faster, we used non-saturating neurons (such as ReLU, whose output is not confined to a fixed range) and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully connected layers, we employed a recently developed regularization method called dropout, which proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and won with a top-5 test error rate of 15.3%, compared to 26.2% for the second-place entry.
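
Two of the tricks mentioned above, ReLU activations and dropout in the fully connected layers, appear directly in the implementation in Section 4. As a minimal sketch of how dropout behaves (my own illustration, not part of the original post), nn.Dropout randomly zeroes activations during training and becomes a no-op at evaluation time:

import torch
import torch.nn as nn

drop = nn.Dropout()            # default p=0.5, the same rate used in the classifier below
x = torch.ones(1, 8)

drop.train()                   # training mode: roughly half the entries are zeroed,
print(drop(x))                 # survivors are scaled by 1/(1-p) = 2.0

drop.eval()                    # evaluation mode: dropout does nothing
print(drop(x))                 # prints the input unchanged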

3. Network Architecture
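
As described in the abstract, the network stacks five convolutional layers (max pooling follows the first, second, and fifth) and three fully connected layers ending in a 1000-way softmax. The spatial sizes annotated in the comments of the code in Section 4 follow the standard output-size formula, output = floor((input + 2*padding - kernel) / stride) + 1. A quick sanity check of that arithmetic (this helper is my own addition, not from the original post):

def conv_output_size(size, kernel, stride=1, padding=0):
    # floor((W + 2P - K) / S) + 1, valid for both conv and pooling layers
    return (size + 2 * padding - kernel) // stride + 1

size = conv_output_size(224, 11, stride=4, padding=2)   # 55   conv1
size = conv_output_size(size, 3, stride=2)              # 27   max pool
size = conv_output_size(size, 5, padding=2)             # 27   conv2
size = conv_output_size(size, 3, stride=2)              # 13   max pool
size = conv_output_size(size, 3, padding=1)             # 13   conv3 (conv4 and conv5 keep 13)
size = conv_output_size(size, 3, stride=2)              # 6    final max pool
print(size * size * 256)                                # 9216 features fed to the classifier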

4. PyTorch Implementation

The implementation below follows the torchvision version of AlexNet (see the reference at the end), with an adaptive average-pooling layer before the classifier and an optional download of pretrained weights.

import torch.nn as nn
from torchsummary import summary

try:
    from torch.hub import load_state_dict_from_url
except ImportError:
    from torch.utils.model_zoo import load_url as load_state_dict_from_url

model_urls = {
    # URL of the torchvision pretrained weights. Note that those weights use
    # torchvision's channel layout (64/192/384/256/256); loading them into the
    # paper-style layout below (96/256/384/384/256) will raise a size-mismatch error.
    'alexnet': 'https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth',
}


class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2),    # (224+2*2-11)/4+1=55
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (55-3)/2+1=27
            nn.Conv2d(96, 256, kernel_size=5, stride=1, padding=2),   # (27+2*2-5)/1+1=27
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (27-3)/2+1=13
            nn.Conv2d(256, 384, kernel_size=3, stride=1, padding=1),  # (13+1*2-3)/1+1=13
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, stride=1, padding=1),  # (13+1*2-3)/1+1=13
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, stride=1, padding=1),  # (13+1*2-3)/1+1=13
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                    # (13-3)/2+1=6
        )   # 6*6*256 = 9216 features after flattening

        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = x.view(x.size(0), -1)   # flatten to (batch, 9216)
        x = self.classifier(x)
        return x


def alexnet(pretrained=False, progress=True, **kwargs):
    r"""
    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    model = AlexNet(**kwargs)
    if pretrained:
        state_dict = load_state_dict_from_url(model_urls['alexnet'],
                                              progress=progress)
        model.load_state_dict(state_dict)
    return model


if __name__ == "__main__":
    model = alexnet()
    summary(model, (3, 224, 224))   # torchsummary prints the table itself
Output:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 96, 55, 55]          34,944
              ReLU-2           [-1, 96, 55, 55]               0
         MaxPool2d-3           [-1, 96, 27, 27]               0
            Conv2d-4          [-1, 256, 27, 27]         614,656
              ReLU-5          [-1, 256, 27, 27]               0
         MaxPool2d-6          [-1, 256, 13, 13]               0
            Conv2d-7          [-1, 384, 13, 13]         885,120
              ReLU-8          [-1, 384, 13, 13]               0
            Conv2d-9          [-1, 384, 13, 13]       1,327,488
             ReLU-10          [-1, 384, 13, 13]               0
           Conv2d-11          [-1, 256, 13, 13]         884,992
             ReLU-12          [-1, 256, 13, 13]               0
        MaxPool2d-13            [-1, 256, 6, 6]               0
AdaptiveAvgPool2d-14            [-1, 256, 6, 6]               0
          Dropout-15                 [-1, 9216]               0
           Linear-16                 [-1, 4096]      37,752,832
             ReLU-17                 [-1, 4096]               0
          Dropout-18                 [-1, 4096]               0
           Linear-19                 [-1, 4096]      16,781,312
             ReLU-20                 [-1, 4096]               0
           Linear-21                 [-1, 1000]       4,097,000
================================================================
Total params: 62,378,344
Trainable params: 62,378,344
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 11.16
Params size (MB): 237.95
Estimated Total Size (MB): 249.69
----------------------------------------------------------------
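
A short usage sketch (my own addition, assuming the definitions above): instantiate the model, cross-check the parameter count against the "Total params" line of the summary, and run a dummy forward pass.

import torch

model = alexnet()                                     # pretrained=False: randomly initialized weights
print(sum(p.numel() for p in model.parameters()))     # 62378344, matching "Total params" above

model.eval()                                          # switch off dropout for inference
with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))       # dummy batch containing one 224x224 RGB image
print(logits.shape)                                   # torch.Size([1, 1000])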

References

https://github.com/pytorch/vision/tree/master/torchvision/mo