在jupyter Notebook中使用PyTorch中的預訓練模型ResNet進行影象分類
預訓練模型是在像ImageNet這樣的大型基準資料集上訓練得到的神經網路模型。
現在通過Pytorch的torchvision.models 模組中現有模型如 ResNet,用一張圖片去其預測類別。
1. 下載資源
這裡隨意從網上下載一張狗的圖片。
類別標籤IMAGENET1000 從 https://blog.csdn.net/weixin_34304013/article/details/93708121複製到一個空的txt裡,去掉最外面的{}即可。
2.使用TorchVision載入預訓練模型ResNet
2.1 從torchvison模組匯入models模組,可以看一下有哪些不同的模型和網路結構。
1 fromtorchvision import models 2 dir(models)
1 ['AlexNet', 2 'DenseNet', 3 'GoogLeNet', 4 'GoogLeNetOutputs', 5 'Inception3', 6 'InceptionOutputs', 7 'MNASNet', 8 'MobileNetV2', 9 'ResNet', 10 'ShuffleNetV2', 11 'SqueezeNet', 12 'VGG', 13 '_GoogLeNetOutputs', 14 '_InceptionOutputs', 15 '__builtins__', 16 '__cached__', 17 '__doc__', 18 '__file__', 19 '__loader__', 20 '__name__', 21 '__package__', 22 '__path__', 23 '__spec__', 24 '_utils', 25 'alexnet', 26 'densenet', 27 'densenet121', 28 'densenet161', 29 'densenet169', 30 'densenet201', 31 'detection', 32 'googlenet', 33 'inception', 34 'inception_v3', 35 'mnasnet', 36 'mnasnet0_5', 37 'mnasnet0_75', 38 'mnasnet1_0', 39 'mnasnet1_3', 40 'mobilenet', 41 'mobilenet_v2', 42 'quantization', 43 'resnet', 44 'resnet101', 45 'resnet152', 46 'resnet18', 47 'resnet34', 48 'resnet50', 49 'resnext101_32x8d', 50 'resnext50_32x4d', 51 'segmentation', 52 'shufflenet_v2_x0_5', 53 'shufflenet_v2_x1_0', 54 'shufflenet_v2_x1_5', 55 'shufflenet_v2_x2_0', 56 'shufflenetv2', 57 'squeezenet', 58 'squeezenet1_0', 59 'squeezenet1_1', 60 'utils', 61 'vgg', 62 'vgg11', 63 'vgg11_bn', 64 'vgg13', 65 'vgg13_bn', 66 'vgg16', 67 'vgg16_bn', 68 'vgg19', 69 'vgg19_bn', 70 'video', 71 'wide_resnet101_2', 72 'wide_resnet50_2']
注:大寫的名稱指的是實現許多流行模型的Python類。它們的體系結構不同——也就是說,在輸入和輸出之間發生的操作的安排不同。
小寫的名稱是函式,返回從這些類例項化的模型,有時使用不同的引數集。例如,resnet101返回一個有101層的ResNet例項,resnet18有18層,以此類推。
2.2載入預訓練模型,建立例項
1 resnet = models.resnet101(pretrained=True)
注:下載的模型檔案會快取在使用者的相應目錄中,這裡在C:\Users\Dell\.cache\torch\hub\checkpoints\resnet101-5d3b4d8f.pth
可輸出以下網路結構如下
1 ResNet( 2 (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False) 3 (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 4 (relu): ReLU(inplace=True) 5 (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False) 6 (layer1): Sequential( 7 (0): Bottleneck( 8 (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) 9 (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 10 (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 11 (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 12 (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 13 (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 14 (relu): ReLU(inplace=True) 15 (downsample): Sequential( 16 (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 17 (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 18 ) 19 ) 20 (1): Bottleneck( 21 (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) 22 (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 23 (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 24 (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 25 (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 26 (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 27 (relu): ReLU(inplace=True) 28 ) 29 (2): Bottleneck( 30 (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) 31 (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 32 (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 33 (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 34 (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 35 (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 36 (relu): ReLU(inplace=True) 37 ) 38 ) 39 (layer2): Sequential( 40 (0): Bottleneck( 41 (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) 42 (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 43 (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) 44 (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 45 (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) 46 (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 47 (relu): ReLU(inplace=True) 48 (downsample): Sequential( 49 (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False) 50 (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 51 ) 52 ) 53 (1): Bottleneck( 54 (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) 55 (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 56 (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 57 (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 58 (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) 59 (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 60 (relu): ReLU(inplace=True) 61 ) 62 (2): Bottleneck( 63 (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) 64 (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 65 (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 66 (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 67 (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) 68 (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 69 (relu): ReLU(inplace=True) 70 ) 71 (3): Bottleneck( 72 (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) 73 (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 74 (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 75 (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 76 (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) 77 (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 78 (relu): ReLU(inplace=True) 79 ) 80 ) 81 (layer3): Sequential( 82 (0): Bottleneck( 83 (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 84 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 85 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) 86 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 87 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 88 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 89 (relu): ReLU(inplace=True) 90 (downsample): Sequential( 91 (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False) 92 (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 93 ) 94 ) 95 (1): Bottleneck( 96 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 97 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 98 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 99 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 100 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 101 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 102 (relu): ReLU(inplace=True) 103 ) 104 (2): Bottleneck( 105 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 106 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 107 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 108 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 109 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 110 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 111 (relu): ReLU(inplace=True) 112 ) 113 (3): Bottleneck( 114 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 115 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 116 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 117 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 118 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 119 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 120 (relu): ReLU(inplace=True) 121 ) 122 (4): Bottleneck( 123 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 124 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 125 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 126 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 127 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 128 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 129 (relu): ReLU(inplace=True) 130 ) 131 (5): Bottleneck( 132 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 133 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 134 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 135 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 136 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 137 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 138 (relu): ReLU(inplace=True) 139 ) 140 (6): Bottleneck( 141 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 142 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 143 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 144 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 145 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 146 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 147 (relu): ReLU(inplace=True) 148 ) 149 (7): Bottleneck( 150 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 151 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 152 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 153 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 154 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 155 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 156 (relu): ReLU(inplace=True) 157 ) 158 (8): Bottleneck( 159 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 160 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 161 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 162 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 163 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 164 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 165 (relu): ReLU(inplace=True) 166 ) 167 (9): Bottleneck( 168 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 169 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 170 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 171 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 172 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 173 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 174 (relu): ReLU(inplace=True) 175 ) 176 (10): Bottleneck( 177 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 178 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 179 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 180 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 181 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 182 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 183 (relu): ReLU(inplace=True) 184 ) 185 (11): Bottleneck( 186 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 187 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 188 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 189 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 190 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 191 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 192 (relu): ReLU(inplace=True) 193 ) 194 (12): Bottleneck( 195 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 196 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 197 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 198 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 199 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 200 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 201 (relu): ReLU(inplace=True) 202 ) 203 (13): Bottleneck( 204 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 205 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 206 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 207 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 208 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 209 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 210 (relu): ReLU(inplace=True) 211 ) 212 (14): Bottleneck( 213 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 214 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 215 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 216 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 217 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 218 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 219 (relu): ReLU(inplace=True) 220 ) 221 (15): Bottleneck( 222 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 223 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 224 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 225 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 226 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 227 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 228 (relu): ReLU(inplace=True) 229 ) 230 (16): Bottleneck( 231 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 232 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 233 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 234 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 235 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 236 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 237 (relu): ReLU(inplace=True) 238 ) 239 (17): Bottleneck( 240 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 241 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 242 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 243 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 244 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 245 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 246 (relu): ReLU(inplace=True) 247 ) 248 (18): Bottleneck( 249 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 250 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 251 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 252 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 253 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 254 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 255 (relu): ReLU(inplace=True) 256 ) 257 (19): Bottleneck( 258 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 259 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 260 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 261 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 262 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 263 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 264 (relu): ReLU(inplace=True) 265 ) 266 (20): Bottleneck( 267 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 268 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 269 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 270 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 271 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 272 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 273 (relu): ReLU(inplace=True) 274 ) 275 (21): Bottleneck( 276 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 277 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 278 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 279 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 280 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 281 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 282 (relu): ReLU(inplace=True) 283 ) 284 (22): Bottleneck( 285 (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) 286 (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 287 (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 288 (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 289 (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) 290 (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 291 (relu): ReLU(inplace=True) 292 ) 293 ) 294 (layer4): Sequential( 295 (0): Bottleneck( 296 (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) 297 (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 298 (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) 299 (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 300 (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) 301 (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 302 (relu): ReLU(inplace=True) 303 (downsample): Sequential( 304 (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False) 305 (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 306 ) 307 ) 308 (1): Bottleneck( 309 (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) 310 (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 311 (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 312 (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 313 (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) 314 (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 315 (relu): ReLU(inplace=True) 316 ) 317 (2): Bottleneck( 318 (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) 319 (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 320 (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) 321 (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 322 (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) 323 (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) 324 (relu): ReLU(inplace=True) 325 ) 326 ) 327 (avgpool): AdaptiveAvgPool2d(output_size=(1, 1)) 328 (fc): Linear(in_features=2048, out_features=1000, bias=True) 329 )View Code
2.3 影象預處理
1 from torchvision import transforms 2 preprocess = transforms.Compose([ 3 transforms.Resize(256), 4 transforms.CenterCrop(224), 5 transforms.ToTensor(), 6 transforms.Normalize( 7 mean=[0.485, 0.456, 0.406], 8 std=[0.229, 0.224, 0.225] 9 )])
注:藉助TochVision模組中的transforms對輸入影象進行預處理。
第2行:定義了一個變數,是對輸入影象進行的所有影象轉換的組合。
第3行:將影象調整為256×256畫素。
第4行:將影象中心裁剪出來,大小為224×224畫素。
第5行:將影象轉換為PyTorch張量(tensor)資料型別。
第6-8行]:通過將影象的平均值和標準差設定為指定的值來正則化影象。
2.4 載入影象,並進行預處理轉換為模型對應的輸入形式
1 from PIL import Image 2 img = Image.open("C:/Users/Dell/Pictures/dog.jpg") 3 img
影象如下:
注:PIL:Python Imaging Library,可以通過Python直譯器進行影象處理,提供了強大的影象處理能力。
1 import torch 2 img_t = preprocess(img) 3 batch_t = torch.unsqueeze(img_t, 0)
注:將影象tensor增加一個維度,因為一張影象只有3個維度,但模型要求輸入是4緯張量,也就是預設是輸入一批影象,而不是一張。
經過處理後,batch_t也代表一批影象,不過其中只有一張影象而已。
這裡檢視一下 img_t 和 batch_t 的維度
1 img_t.size() 2 3 torch.Size([3, 224, 224])
1 batch_t.size() 2 3 torch.Size([1, 3, 224, 224])
2.5 模型推斷
使用預訓練模型來看看模型認為影象是什麼。首先,將模型置於eval模式,然後推斷。
1 resnet.eval() 2 out = resnet(batch_t) 3 out.size()
1 torch.Size([1, 1000])
注:out為一個二維向量,行為1,列為1000。
前面提到,模型輸入要求是一批影象,如果我們輸入5張影象,則out的行為5,列為1000,列表示1000個類。
也就是行表示每一個影象,列表示1000個類,每個類的置信度。故每一行中的1000個元素,分別表示該行對應影象為每個類的可能性。
接下來需要用到一開始下載的類別標籤imagenet_classes.txt,從文字檔案中讀取和儲存標籤。
1 with open('C:/Users/Dell/Desktop/imagenet_classes.txt') as f: 2 classes = [line.strip() for line in f.readlines()]
注:classes為含有1000個類名稱字串的列表(ImageNet資料集共包含1000個類)。
行號確定了類號,因此順序不可更改。
接下來找出輸出向量out中的最大置信度發生在哪個位置,用這個位置的下標來得出預測。
即索引最大預測值的位置。
1 _, index = torch.max(out, 1)
1 index 2 3 tensor([207])
注:取出二維向量 out 中每一行的最大值及下標,index為每行最大值的下標組成的列表。
接下來把預測值變為概率值。
1 percentage = torch.nn.functional.softmax(out, dim=1)[0] * 100 2 classes[index[0]], percentage[index[0]].item()
1 ("207: 'golden retriever',", 41.10982131958008)
該模型預測影象是一隻金毛獵犬,置信度為41.11%
注:
第1行:對二維向量 out 中的每一行進行歸一化(softmax是常用的歸一化指數函式),然後取出第一行並使每個元素乘以100,得到本例中狗對應的每種型別的可能性(即置信度)。
第2行:列印類名及其置信度。
classes[index[0]]即是最大置信度對應的類名稱。classes[index[0]]中,index[0]是第一行最大值的下標,即第一張圖片的最大置信度的下標,index[1]為第二張圖片的,index[2]是第三張圖片的,以此類推。所以classes列表的元素順序不可更改。
percentage[index[0]].item()中,index[0]的含義同上,percentage[index[0]]代表最大置信度那一項,.item()取出該項的值。
接下來看一看模型認為影象屬於其他類的置信度。
1 _, indices = torch.sort(out, descending=True) 2 [(classes[idx], percentage[idx].item()) for idx in indices[0][:5]]
1 [("207: 'golden retriever',", 41.10982131958008), 2 ("151: 'Chihuahua',", 20.436004638671875), 3 ("154: 'Pekinese, Pekingese, Peke',", 8.29426097869873), 4 ("852: 'tennis ball',", 7.233486175537109), 5 ("259: 'Pomeranian',", 5.713674068450928)]
注:
torch.sort將out進行排序,預設對每一行排序,這裡指定以遞減的方式排序。
[(classes[idx], percentage[idx].item()) for idx in indices[0][:5]]中,indices[0][:5]產生一個臨時的一維列表,包含indices的第一行前5個元素,也就是置信度最高的5個元素的下標值。
參考:使用PyTorch中的預訓練模型進行影象分類_u013679159的部落格-CSDN部落格
PyTorch預訓練模型影象分類之一_zhangzhifu2019的部落格-CSDN部落格(前面有.pth檔案載入)