
PyTorch Learning: Training a CIFAR10 Classifier


TRAINING A CLASSIFIER

Reference: PyTorch Tutorial, Deep Learning with PyTorch: A 60 Minute Blitz

Having already learned how to:

  1. define a neural network
  2. compute a loss function
  3. update the weights

What about data

Generally, when you have to deal with image, text, audio or video data, you can use standard python packages that load data into a numpy array. Then you can convert this array into a torch.*Tensor.

For images, packages such as Pillow, OpenCV are useful
For audio, packages such as scipy and librosa
For text, either raw Python or Cython based loading, or NLTK and SpaCy are useful

Specifically for vision, we have created a package called torchvision, that has data loaders for common datasets such as Imagenet, CIFAR10, MNIST, etc. and data transformers for images, viz., torchvision.datasets and torch.utils.data.DataLoader.

When dealing with image, text, audio, or video data, you can use standard Python packages to load the data into a NumPy array and then convert it into a torch.Tensor.

  • Images: Pillow and OpenCV are commonly used (see the sketch after this list)
  • Audio: scipy and librosa
  • Text: raw Python or Cython loading, or NLTK and SpaCy
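For the image case, a minimal sketch (example.jpg is a hypothetical file name used only for illustration):

from PIL import Image
import numpy as np
import torch

# load with Pillow, convert to a NumPy array, then to a torch.Tensor
img = Image.open('example.jpg')   # PIL image
arr = np.array(img)               # NumPy array, shape (H, W, C), dtype uint8
tensor = torch.from_numpy(arr)    # tensor with the same shape and dtype
print(arr.shape, tensor.shape)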

For computer vision specifically, PyTorch provides the torchvision package, which includes data loaders for common datasets such as ImageNet, CIFAR10, MNIST, etc.

It also includes image transformers (which can be used for data augmentation):

torchvision.datasets and torch.utils.data.DataLoader

This experiment uses the CIFAR10 dataset.

It contains the classes: ‘airplane’, ‘automobile’, ‘bird’, ‘cat’, ‘deer’, ‘dog’, ‘frog’, ‘horse’, ‘ship’, ‘truck’.

The images in CIFAR10 are all of size 3x32x32 (3 RGB channels, 32x32 pixels).
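A quick sanity check of that size (a minimal sketch; it downloads CIFAR10 to ./data if it is not there yet, just like the code in Step 1 below):

import torchvision
import torchvision.transforms as transforms

ds = torchvision.datasets.CIFAR10(root='./data', train=True, download=True,
                                  transform=transforms.ToTensor())
img, label = ds[0]       # the first training sample
print(img.shape)         # torch.Size([3, 32, 32]) -> 3 channels, 32x32 pixels
print(label)             # an integer class index in 0..9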

Training an image classifier

Steps:

  1. Load and normalize the training and test datasets using torchvision
  2. Define a convolutional neural network
  3. Define a loss function
  4. Train the network on the training data
  5. Test the network on the test data

Step 1: Load and normalize the training and test datasets

import torch
import torchvision
import torchvision.transforms as transforms

The torchvision datasets are PILImage images with values in [0, 1]; they need to be converted to Tensors and normalized to [-1, 1].
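Normalize computes output = (input - mean) / std per channel, so with mean 0.5 and std 0.5 every value in [0, 1] maps to [-1, 1]. A minimal check of that formula:

import torch
import torchvision.transforms as transforms

norm = transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
x = torch.zeros(3, 2, 2)   # a fake 3-channel 2x2 image
x[1] = 0.5
x[2] = 1.0
print(norm(x).unique())    # tensor([-1., 0., 1.]) -> (x - 0.5) / 0.5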

transform = transforms.Compose([transforms.ToTensor(),transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))])
# Compose chains several transforms together
# ./ is the current directory, ../ is the parent directory, / is the root directory
trainset = torchvision.datasets.CIFAR10(root='./data',train=True,download=True,transform=transform) # will not download again if already present
trainloader = torch.utils.data.DataLoader(trainset,batch_size=4,shuffle=True,num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data',train=False,download=True,transform=transform)
testloader = torch.utils.data.DataLoader(testset,batch_size=4,shuffle=False,num_workers=2)
# num_workers: number of worker processes used for loading
classes = ('plane','car','bird','cat','deer','dog','frog','horse','ship','truck')
Files already downloaded and verified
Files already downloaded and verified
print(trainset)
print("----"*10)
print(testset)
Dataset CIFAR10
    Number of datapoints: 50000
    Split: train
    Root Location: ./data
    Transforms (if any): Compose(
                             ToTensor()
                             Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
                         )
    Target Transforms (if any): None
----------------------------------------
Dataset CIFAR10
    Number of datapoints: 10000
    Split: test
    Root Location: ./data
    Transforms (if any): Compose(
                             ToTensor()
                             Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
                         )
    Target Transforms (if any): None
# show some images, just for fun
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img/2+0.5 # unnormalize: map [-1,1] back to [0,1]
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg,(1,2,0))) # transpose back from CHW to HWC for display

dataiter = iter(trainloader) # iterator over batches
images,labels = next(dataiter)
print(labels)
imshow(torchvision.utils.make_grid(images))

print(''.join('%5s'%classes[labels[j]] for j in range(4))) # batch_size is 4, so each batch holds 4 images
tensor([2, 8, 1, 5])
 bird ship  car  dog

labels
tensor([2, 8, 1, 5])

Step 2: Define a convolutional neural network

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    # __init__ only declares the layers that may be used; in forward, some layers may be reused and others not used at all
    def __init__(self):
        super(Net,self).__init__()
        self.conv1 = nn.Conv2d(3,6,5) # (input channels, output channels, kernel size)
        self.pool = nn.MaxPool2d(2,2) # one pooling layer, used twice
        self.conv2 = nn.Conv2d(6,16,5)
        self.fc1 = nn.Linear(16*5*5,120)
        self.fc2 = nn.Linear(120,84)
        self.fc3 = nn.Linear(84,10)

    # how the network is actually wired together is determined by forward
    def forward(self,x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1,16*5*5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
print(net)
Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
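The in_features=400 of fc1 comes from tracing the spatial size: 32 → conv1 (5x5) → 28 → pool → 14 → conv2 (5x5) → 10 → pool → 5, so the flattened feature map is 16*5*5 = 400. A small sketch that checks this with a dummy input, using the net defined above:

x = torch.randn(1, 3, 32, 32)            # one fake CIFAR10-sized image
x = net.pool(F.relu(net.conv1(x)))
print(x.shape)                           # torch.Size([1, 6, 14, 14])
x = net.pool(F.relu(net.conv2(x)))
print(x.shape)                           # torch.Size([1, 16, 5, 5]) -> 16*5*5 = 400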

Define a loss function and an optimizer (used to update the weights)

Note ⚠️: the network outputs a 10-dimensional score vector per sample, while each label is a single class index. This is still handled correctly: when computing the loss, the score at index labels is picked out of the 10 outputs, which is equivalent to treating the label as a one-hot 10-dimensional target, as is done elsewhere.

It is the same thing either way; PyTorch handles it internally, so there is no need to worry about the details.

import torch.optim as optim
# CrossEntropyLoss already includes the softmax, so no extra softmax layer is needed.
# The loss pushes the score of the correct class up and the scores of the wrong classes down.
criterion = nn.CrossEntropyLoss() # cross entropy; it takes integer class indices directly rather than one-hot vectors
optimizer = optim.SGD(net.parameters(),lr = 0.001,momentum=0.9)
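To illustrate the note above: CrossEntropyLoss takes the raw 10-dimensional scores and integer class indices directly, and is equivalent to log_softmax followed by the negative log-likelihood loss. A minimal sketch:

logits = torch.randn(4, 10)              # a batch of 4 samples with 10 class scores each
targets = torch.tensor([2, 8, 1, 5])     # integer class indices, not one-hot vectors
loss1 = criterion(logits, targets)
loss2 = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(loss1.item(), loss2.item())        # the same value (up to floating-point rounding)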

Train the network

for epoch in range(2): # number of training epochs
    running_loss = 0.0
    for i,data in enumerate(trainloader,0): # the 0 means the index starts at 0 (the default)
        # get the inputs and labels
        inputs,labels = data
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward pass
        outputs = net(inputs)
        # compute the loss
        loss = criterion(outputs,labels)
        # backward pass (compute gradients)
        loss.backward()
        # update the weights
        optimizer.step()

        # print statistics
        running_loss += loss.item() # accumulate the loss
        if i% 2000 == 1999: # print every 2000 mini-batches
            # note: this prints the loss summed over 2000 batches (the official tutorial divides by 2000 for the average)
            print('[%d, %5d] loss: %.3f'%(epoch+1,i+1,running_loss))
            running_loss = 0.0 # reset after printing
print('Finished Training')
[1,  2000] loss: 4505.347
[1,  4000] loss: 3816.202
[1,  6000] loss: 3448.905
[1,  8000] loss: 3221.118
[1, 10000] loss: 3091.055
[1, 12000] loss: 2993.834
[2,  2000] loss: 2793.536
[2,  4000] loss: 2777.763
[2,  6000] loss: 2710.222
[2,  8000] loss: 2668.854
[2, 10000] loss: 2622.627
[2, 12000] loss: 2571.615
Finished Training

Test the network on the test data

Predict the class labels and compare them with the ground truth.

# first, show some test images
dataiter = iter(testloader)
images,labels = next(dataiter)

imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ',' '.join('%5s' % classes[labels[j]] for j in range(4)))

GroundTruth:    cat  ship  ship plane


outputs = net(images) # run the test images through the network to get the predicted scores

_,predicted = torch.max(outputs,1) # torch.max along dim 1 returns (max values, their indices); discard the values and keep the indices as the predicted classes

print('Predicted: ' ,' '.join('%5s'% classes[predicted[j]] for j in range(4)))
Predicted:   deer   cat  deer horse
print(outputs)
print(predicted)
tensor([[-3.4898, -3.6106,  1.2521,  3.3437,  3.3692,  3.2635,  2.6993,  2.0445,
         -4.8485, -3.5421],
        [-1.9592, -2.6239,  1.1073,  3.4853,  1.0128,  3.2079, -0.2431,  1.9412,
         -2.4887, -2.2249],
        [-0.2035,  1.3960,  0.6715, -0.1788,  3.5923, -1.4808,  0.4605, -0.0833,
         -2.6476, -1.5091],
        [-1.7742, -2.5306,  1.0426,  0.2753,  3.6487,  0.9355,  0.2774,  4.9753,
         -4.7646, -2.7965]], grad_fn=<ThAddmmBackward>)
tensor([4, 3, 4, 7])
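The raw scores above are not probabilities; applying softmax does not change which index is largest, so its argmax matches predicted. A small sketch:

probs = F.softmax(outputs, dim=1)   # turn the raw scores into probabilities per row
print(probs.argmax(dim=1))          # tensor([4, 3, 4, 7]) -- the same indices as predicted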

Compute the overall accuracy

Performance on the whole test set:

correct = 0
total = 0
with torch.no_grad(): # tell PyTorch not to track gradients for these computations
    for data in testloader:
        images,labels = data
        outputs = net(images)
        _,predicted = torch.max(outputs.data,1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Accuracy of the network on the 10000 test images:%d %%'%(100*correct/total))
Accuracy of the network on the 10000 test images:54 %

The network seems to have learned something. Let's see which classes it does better on.

class_correct = list(0. for i in range(10)) # a list of ten 0.0 floats
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images,labels = data
        outputs = net(images)
        _,predicted = torch.max(outputs,1)
        c = (predicted == labels).squeeze() # squeeze the comparison result into a 1-D tensor so c[i] can be indexed directly
        for i in range(4): # batch size is 4
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] +=1
for i in range(10):
    print('Accuracy of %5s : %2d %%'%(classes[i],100*class_correct[i]/class_total[i]))
Accuracy of plane : 57 %
Accuracy of   car : 80 %
Accuracy of  bird : 37 %
Accuracy of   cat : 45 %
Accuracy of  deer : 45 %
Accuracy of   dog : 43 %
Accuracy of  frog : 61 %
Accuracy of horse : 54 %
Accuracy of  ship : 64 %
Accuracy of truck : 54 %

How do you run this on a GPU?

Just as you move a tensor to the GPU, you move the whole neural net to the GPU.
First define a device as the first visible CUDA device (if one is available; otherwise it falls back to the CPU).

device = torch.device("cuda:0" if torch.cuda.is_available() else 'cpu')
# on a CUDA machine this prints a CUDA device
print(device)
cpu
net.to(device)
# remember: at every step the inputs and targets must also be moved to the GPU device
inputs,labels = inputs.to(device),labels.to(device)
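For example, the inner training loop above only needs the two .to(device) calls added (a sketch of one pass over the data, assuming net.to(device) has already been called):

for i, data in enumerate(trainloader, 0):
    inputs, labels = data
    # move the batch to the same device as the network
    inputs, labels = inputs.to(device), labels.to(device)

    optimizer.zero_grad()
    outputs = net(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()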

Why is there no significant speedup? Because the network is so small, the difference is not noticeable.

How do you use all of your GPUs (multiple)? See Data Parallelism.

Useful functions (demonstrated in the short sketch after the list)

  • torch.from_numpy(): converts a NumPy array directly to a tensor without changing its dimensions
  • transforms.ToTensor(): converts a NumPy array (or PIL image) to a tensor, moving the last (channel) dimension to the front and shifting the other two back (HWC -> CHW)
  • x.numpy(): converts a tensor x back to a NumPy array
  • x.transpose((2,0,1)): for a NumPy array x with the wrong dimension order, this moves the last dimension to the front; (0,1,2) means no change
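A small sketch demonstrating the four of them (only the shapes matter here):

import numpy as np
import torch
import torchvision.transforms as transforms

a = np.random.rand(32, 32, 3).astype(np.float32)   # a fake HWC image

t1 = torch.from_numpy(a)         # keeps the shape: torch.Size([32, 32, 3])
t2 = transforms.ToTensor()(a)    # channels moved to the front: torch.Size([3, 32, 32])
b = t1.numpy()                   # back to NumPy, shape (32, 32, 3)
c = a.transpose((2, 0, 1))       # NumPy HWC -> CHW, shape (3, 32, 32)

print(t1.shape, t2.shape, b.shape, c.shape)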