PyTorch: complete training and testing code (4)

阿新 • Published: 2018-11-10

Continuing from the previous post.

            # Show 10 * 3 images results each epoch
            if ii % (num_img_tr // 10) == 0:
                grid_image = make_grid(inputs[:3].clone().cpu().data, 3, normalize=True)
                writer.add_image('Image', grid_image, global_step)
                grid_image = make_grid(utils.decode_seg_map_sequence(torch.max(outputs[:3], 1)[1].detach().cpu().numpy()), 3, normalize=False,
                                       range=(0, 255))
                writer.add_image('Predicted label', grid_image, global_step)
                grid_image = make_grid(utils.decode_seg_map_sequence(torch.squeeze(labels[:3], 1).detach().cpu().numpy()), 3, normalize=False, range=(0, 255))
                writer.add_image('Groundtruth label', grid_image, global_step)

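This part still needs to be filled in. The topics left to cover so far are the model architecture, tensorboardX, and how data is stored through the writer.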

 # Save the model
        if (epoch % snapshot) == snapshot - 1:  # snapshot = 10
            torch.save(net.state_dict(), os.path.join(save_dir, 'models', modelName + '_epoch-' + str(epoch) + '.pth'))
            print("Save model at {}\n".format(os.path.join(save_dir, 'models', modelName + '_epoch-' + str(epoch) + '.pth')))

net.load_state_dict(
        torch.load(os.path.join(save_dir, 'models', modelName + '_epoch-' + str(resume_epoch - 1) + '.pth'),
                   map_location=lambda storage, loc: storage)) # Load all tensors onto the CPU

The snippet above loads the model parameters.

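The model parameters are saved once every ten epochs; this scheme could be improved. torch.save(net.state_dict(), os.path.join(save_dir, 'models', modelName + '_epoch-' + str(epoch) + '.pth')) writes them to a file ending in .pth. One question about net.state_dict() is worth answering: net and all its parameters live on the GPU, while the save target is clearly a local path, so why are they not moved to the CPU before saving? state_dict() does not move anything; it is torch.save that copies the tensor data into the file, and it records the device each tensor came from. By default torch.load puts the tensors back on that device, which is why the loading code above passes map_location to force everything onto the CPU.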
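As a minimal sketch of the save and load round trip described above, the snippet below uses a tiny stand-in model (`nn.Linear`, not the network from this series) and an in-memory buffer instead of a `.pth` file; both are assumptions made for illustration. It shows that the saved tensors keep their device and that `map_location` decides where they end up on load.

```python
import io

import torch
import torch.nn as nn

# Hypothetical stand-in for `net`; the real model is defined elsewhere in this series.
net = nn.Linear(4, 2)
if torch.cuda.is_available():
    net = net.cuda()

# state_dict() returns tensors that still live on the model's current device.
state = net.state_dict()
saved_device = state['weight'].device

# torch.save accepts a path or a file-like object; a buffer keeps the sketch self-contained.
buffer = io.BytesIO()
torch.save(state, buffer)
buffer.seek(0)

# map_location='cpu' has the same effect as the lambda used above: all tensors load onto the CPU.
loaded = torch.load(buffer, map_location='cpu')

# A freshly built model of the same shape can now take over the saved weights.
restored = nn.Linear(4, 2)
restored.load_state_dict(loaded)
print(saved_device, '->', loaded['weight'].device)
```

On a machine without a GPU both devices print as cpu; on a GPU machine the first one is a CUDA device and the second is still cpu.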

 # One testing epoch
        if useTest and epoch % nTestInterval == (nTestInterval - 1):  # nTestInterval = 5
            total_miou = 0.0
            net.eval()
            for ii, sample_batched in enumerate(testloader):
                inputs, labels = sample_batched['image'], sample_batched['label']

                # Forward pass of the mini-batch
                inputs, labels = Variable(inputs, requires_grad=True), Variable(labels)
                if gpu_id >= 0:
                    inputs, labels = inputs.cuda(), labels.cuda()

                with torch.no_grad():
                    outputs = net.forward(inputs)

                predictions = torch.max(outputs, 1)[1]

                loss = criterion(outputs, labels, size_average=False, batch_average=True)
                running_loss_ts += loss.item()

                total_miou += utils.get_iou(predictions, labels)

                # Print stuff
                if ii % num_img_ts == num_img_ts - 1:

                    miou = total_miou / (ii * testBatch + inputs.data.shape[0])
                    running_loss_ts = running_loss_ts / num_img_ts

                    print('Validation:')
                    print('[Epoch: %d, numImages: %5d]' % (epoch, ii * testBatch + inputs.data.shape[0]))
                    writer.add_scalar('data/test_loss_epoch', running_loss_ts, epoch)
                    writer.add_scalar('data/test_miour', miou, epoch)
                    print('Loss: %f' % running_loss_ts)
                    print('MIoU: %f\n' % miou)
                    running_loss_ts = 0

The code above is the validation part, which also runs inside the training epoch loop.

net.eval()  # switch to evaluation mode for testing
# Earlier we already saw the following calls that involve net:
net.load_state_dict(
        torch.load(os.path.join(save_dir, 'models', modelName + '_epoch-' + str(resume_epoch - 1) + '.pth'),
                   map_location=lambda storage, loc: storage))
net.cuda()
optimizer = optim.SGD(net.parameters(), lr=p['lr'], momentum=p['momentum'], weight_decay=p['wd'])
net.train()
net.forward(inputs)
torch.save(net.state_dict(), os.path.join(save_dir, 'models', modelName + '_epoch-' + str(epoch) + '.pth'))

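Call net.eval() before testing, just as net.train() is called before training. The two calls switch layers such as dropout and batch normalization between their inference and training behavior.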
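To make the effect of the two modes concrete, here is a small sketch with a hypothetical toy model (not the segmentation network from this series). It only relies on `nn.Dropout`, one of the layers whose behavior depends on the mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model; Dropout behaves differently in train and eval mode.
toy = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))
x = torch.ones(1, 8)

toy.train()                      # training mode: dropout randomly zeroes activations
train_flag = toy.training

toy.eval()                       # evaluation mode: dropout becomes a no-op
eval_flag = toy.training
with torch.no_grad():
    out_a = toy(x)
    out_b = toy(x)

# In eval mode two forward passes on the same input give identical outputs.
print(train_flag, eval_flag, torch.equal(out_a, out_b))   # True False True
```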

 # Forward pass of the mini-batch
                inputs, labels = Variable(inputs, requires_grad=True), Variable(labels)
                if gpu_id >= 0:
                    inputs, labels = inputs.cuda(), labels.cuda()

                with torch.no_grad():
                    outputs = net.forward(inputs)

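Gradients are no longer needed here, so torch.no_grad() is used. The code could also be changed to the version below, since requires_grad defaults to False for a Variable. Note that the two are not fully equivalent: the network's parameters still require gradients, so without torch.no_grad() autograd still records the graph for outputs.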

 # Forward pass of the mini-batch
                inputs, labels = Variable(inputs, requires_grad=False), Variable(labels)
              # or inputs, labels = Variable(inputs), Variable(labels)
                if gpu_id >= 0:
                    inputs, labels = inputs.cuda(), labels.cuda()

                outputs = net.forward(inputs)
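The difference between the two variants can be checked with a short sketch. It uses a hypothetical `nn.Linear` in place of the real network and plain tensors instead of `Variable` (since PyTorch 0.4, `Variable(t)` simply returns a tensor).

```python
import torch
import torch.nn as nn

net = nn.Linear(3, 2)            # hypothetical stand-in for the real network
inputs = torch.randn(5, 3)       # requires_grad defaults to False

# Without no_grad the output is still tracked, because the weights require gradients.
tracked = net(inputs)

# Inside no_grad nothing is recorded, so the output carries no graph.
with torch.no_grad():
    untracked = net(inputs)

print(inputs.requires_grad, tracked.requires_grad, untracked.requires_grad)  # False True False
```

Both variants give the same numbers; `torch.no_grad()` just avoids the memory spent on storing the graph during validation.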

predictions = torch.max(outputs, 1)[1]

For what torch.max does, see: https://blog.csdn.net/Z_lbj/article/details/79766690

I have not yet worked out the structure and values of predictions; that requires understanding the network's output first.
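As a hedged illustration, assume the network output has the usual segmentation layout [batch, n_classes, height, width] with 21 classes (the default n_classes of get_iou); the real output shape is not shown in this post. Under that assumption, torch.max over dimension 1 returns the best score and the index of the winning class for every pixel:

```python
import torch

# Assumed shape for a segmentation output: [batch, n_classes, height, width].
outputs = torch.zeros(2, 21, 4, 4)
outputs[0, 15] = 1.0             # class 15 gets the highest score everywhere in image 0
outputs[1, 0] = 1.0              # class 0 (background) wins everywhere in image 1

# Reduce over the class dimension: values are the scores, predictions the class indices.
values, predictions = torch.max(outputs, 1)

print(predictions.shape)         # torch.Size([2, 4, 4])
print(predictions.dtype)         # torch.int64
print(predictions[0, 0, 0].item(), predictions[1, 0, 0].item())   # 15 0
```

So `[1]` picks the index tensor: one integer class label per pixel, with the class dimension removed.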

Next comes the mIoU computation:

total_miou += utils.get_iou(predictions, labels)
def get_iou(pred, gt, n_classes=21):
    total_miou = 0.0
    for i in range(len(pred)):
        pred_tmp = pred[i]
        gt_tmp = gt[i]

        intersect = [0] * n_classes  # the * operator repeats the list n_classes times
        # e.g. with n_classes=21 this gives a list of 21 zeros
        union = [0] * n_classes
        for j in range(n_classes):
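            # Cast to int so the sum is 0, 1, or 2; recent PyTorch versions return bool masks, whose + acts as logical OR
            match = (pred_tmp == j).int() + (gt_tmp == j).int()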

            it = torch.sum(match == 2).item()
            un = torch.sum(match > 0).item()

            intersect[j] += it
            union[j] += un

        iou = []
        unique_label = np.unique(gt_tmp.data.cpu().numpy())
        for k in range(len(intersect)):
            if k not in unique_label:
                continue
            iou.append(intersect[k] / union[k])

        miou = (sum(iou) / len(iou))
        total_miou += miou

    return total_miou

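utils.get_iou(predictions, labels) computes the mIoU for one batch: it returns the sum of the per-image mIoU values in the batch. For each class, intersect stores the number of pixels where the prediction and the label agree, and union stores the number of pixels assigned to that class by the prediction, the label, or both, so it also counts pixels where background was predicted as the object. unique_label = np.unique(gt_tmp.data.cpu().numpy()) gives the set of classes that actually appear in the label map.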

if k not in unique_label:
                continue

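This check skips every class that does not appear in the label map, so a class that exists only in the prediction contributes no IoU term of its own; in that sense spurious labels are dropped. As a result iou may not have 21 elements, and usually has fewer. For each remaining class, iou = intersect / union, that is, the number of correctly predicted pixels divided by the number of pixels that carry the class in either the prediction or the label. Pixels wrongly predicted as a class that is present in the label therefore still enlarge that class's union.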
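A worked example on a tiny, made-up 2x3 label map may help here. The helper per_class_iou below is hypothetical (it is not part of utils); it follows the same intersect/union idea as get_iou but uses boolean masks with & and |.

```python
import torch

# Tiny hypothetical 2x3 label maps with three classes (0 = background).
pred = torch.tensor([[0, 1, 1],
                     [0, 1, 2]])
gt = torch.tensor([[0, 1, 0],
                   [0, 1, 1]])

def per_class_iou(pred, gt, n_classes):
    """Return {class: IoU} for the classes that occur in gt."""
    result = {}
    for j in range(n_classes):
        pred_mask = pred == j
        gt_mask = gt == j
        if not gt_mask.any():    # class absent from the label: skipped, like the `continue`
            continue
        intersect = (pred_mask & gt_mask).sum().item()
        union = (pred_mask | gt_mask).sum().item()
        result[j] = intersect / union
    return result

ious = per_class_iou(pred, gt, n_classes=3)
miou = sum(ious.values()) / len(ious)
print(ious, miou)
```

Class 0 gets 2/3, class 1 gets 2/4, and class 2 is skipped because it never appears in gt, so the mean is taken over two classes only. The pixel wrongly predicted as class 1 still counts in the union of class 1.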

The values returned for all batches are then added up to give total_miou:

total_miou += utils.get_iou(predictions, labels)
# running total of the mIoU values

Once all validation images have been processed:

 # Print stuff
                if ii % num_img_ts == num_img_ts - 1:

                    miou = total_miou / (ii * testBatch + inputs.data.shape[0])
                    running_loss_ts = running_loss_ts / num_img_ts

                    print('Validation:')
                    print('[Epoch: %d, numImages: %5d]' % (epoch, ii * testBatch + inputs.data.shape[0]))
                    writer.add_scalar('data/test_loss_epoch', running_loss_ts, epoch)
                    writer.add_scalar('data/test_miour', miou, epoch)
                    print('Loss: %f' % running_loss_ts)
                    print('MIoU: %f\n' % miou)
                    running_loss_ts = 0

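Here ii * testBatch + inputs.data.shape[0] = 241*6 + 3 = 1449. There are 242 batches (ii runs from 0 to 241), but the total number of images is 1449 rather than 242*6 = 1452, because the last batch contains only 3 images and inputs.data.shape[0] is therefore 3.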

So the miou obtained here is the average IoU per image, and running_loss_ts is likewise the average loss per image.
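The image count can be checked with a few lines of arithmetic. The sketch assumes the data loader keeps the final partial batch (the DataLoader default), and uses the two numbers quoted above: 1449 images and a test batch size of 6.

```python
import math

num_images = 1449    # validation set size quoted above
test_batch = 6       # value of testBatch quoted above

num_batches = math.ceil(num_images / test_batch)       # 242 batches, so ii runs 0..241
last_ii = num_batches - 1
last_batch_size = num_images - last_ii * test_batch    # 3 images in the final batch

# Mirrors ii * testBatch + inputs.data.shape[0] on the last iteration.
seen = last_ii * test_batch + last_batch_size
print(num_batches, last_batch_size, seen)              # 242 3 1449
```

Dividing total_miou by this count gives a per-image average, because the last batch contributes only its 3 images to the denominator.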

Finally, once all epochs have finished, the writer must be closed:

writer.close()
