PyTorch官方教程(四)-Transfer_Learning_Tutorial

阿新 • Published: 2018-11-12

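In practice we rarely train an entire neural network from scratch. The more common approach is to pretrain a model on a very large dataset, then either use the pretrained weights to initialize the model for the current task or use the network as a fixed feature extractor. That is, we usually face one of the following two scenarios: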

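Finetuning the convnet: continue training an already trained model on the new task, updating all of its weights.
ConvNet as fixed feature extractor: freeze the weights of the whole network and replace the final fully connected layer with a new layer suited to the task. Only that new layer is trained, so the rest of the network acts as a fixed feature extractor.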

Load Data

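We will use the torch.utils.data package to load the data. The task is to train a model that classifies ants and bees. There are about 120 training images and 75 validation images for each class.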

import copy
import os
import time

import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torch.optim import lr_scheduler
from torchvision import datasets, models, transforms

data_transforms = {
    # training: random crop and flip for augmentation, then normalize
    "train": transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),  # after conversion to a tensor, pixel values are floats in [0, 1]
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    # validation: deterministic resize and center crop, then normalize
    "val": transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
}

data_dir = "hymenoptera_data"

image_datasets = {x: datasets.ImageFolder(root=os.path.join(data_dir, x),
                                          transform=data_transforms[x])
                  for x in ["train", "val"]}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
                                              shuffle=True, num_workers=4)
               for x in ["train", "val"]}
dataset_sizes = {x: len(image_datasets[x]) for x in ["train", "val"]}
class_names = image_datasets["train"].classes
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
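
datasets.ImageFolder expects one subdirectory per class under each split directory and, to the best of my knowledge, assigns label indices by sorted directory name. The sketch below mimics that convention with the standard library only, on a throwaway directory tree with no real images. list_classes and build_demo_tree are hypothetical helpers written for this illustration, not part of torchvision.

```python
import os
import tempfile


def list_classes(split_dir):
    """Hypothetical helper: mimic how ImageFolder derives class names."""
    classes = sorted(
        entry.name for entry in os.scandir(split_dir) if entry.is_dir()
    )
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    return classes, class_to_idx


def build_demo_tree(root):
    """Create an empty hymenoptera_data-style layout (no real images)."""
    for split in ("train", "val"):
        for cls in ("bees", "ants"):
            os.makedirs(os.path.join(root, split, cls), exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    demo_dir = os.path.join(tmp, "hymenoptera_data")
    build_demo_tree(demo_dir)
    demo_classes, demo_class_to_idx = list_classes(os.path.join(demo_dir, "train"))

print(demo_classes)       # ['ants', 'bees']
print(demo_class_to_idx)  # {'ants': 0, 'bees': 1}
```

Under this convention, class_names in the tutorial code would be ['ants', 'bees'], so label 0 means ant and label 1 means bee.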

Visualize a few images

def imshow(inp, title=None):
    """Display a normalized image tensor of shape (C, H, W)."""
    inp = inp.numpy().transpose((1, 2, 0))  # (C, H, W) -> (H, W, C) for matplotlib
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = std * inp + mean  # undo transforms.Normalize
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # pause a bit so that plots are updated

inputs, class_ids = next(iter(dataloaders["train"]))  # fetch one batch
out = torchvision.utils.make_grid(inputs)
imshow(out, title=[class_names[x] for x in class_ids])
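
The imshow function works because std * x + mean is the exact inverse of the per-channel (x - mean) / std step applied by transforms.Normalize. Here is a minimal pure-Python sketch of that round trip on a single RGB pixel. The helper names normalize and denormalize and the sample pixel value are assumptions made for illustration.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize(pixel):
    """Per-channel (x - mean) / std, the arithmetic Normalize applies."""
    return [(x - m) / s for x, m, s in zip(pixel, MEAN, STD)]


def denormalize(pixel):
    """Inverse step used in imshow: x * std + mean, clipped to [0, 1]."""
    return [min(max(x * s + m, 0.0), 1.0) for x, m, s in zip(pixel, MEAN, STD)]


pixel = [0.5, 0.5, 0.5]  # one mid-gray RGB pixel, as produced by ToTensor
normed = normalize(pixel)
restored = denormalize(normed)

print([round(v, 3) for v in normed])
print([round(v, 3) for v in restored])  # [0.5, 0.5, 0.5]
```

The clipping step only matters for values that drift slightly outside [0, 1] through floating-point error, which matplotlib would otherwise warn about.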

Training the model

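Next, we define a general function to train the model. It uses a learning-rate scheduler from torch.optim.lr_scheduler to adjust the learning rate, and it keeps a copy of the best-performing model weights.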

def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print(epoch)

        for phase in ["train", "val"]:
            if phase == "train":
                model.train()
            else:
                model.eval()

            running_loss = 0.0
            running_corrects = 0

            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                optimizer.zero_grad()

                # forward
                with torch.set_grad_enabled(phase == "train"):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)  # preds holds the index of the largest output, i.e. the predicted class
                    loss = criterion(outputs, labels)

                    if phase == "train":  # update the weights only in the training phase
                        loss.backward()
                        optimizer.step()
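                # accumulate statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels)

            # advance the learning-rate schedule once per epoch, after the optimizer updates
            if phase == "train":
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]
            print(phase, epoch_loss, epoch_acc)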

            if phase == "val" and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

    time_elapsed = time.time() - since
    print(time_elapsed)
    print(best_acc)

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model
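
The training loop multiplies loss.item() by inputs.size(0) because nn.CrossEntropyLoss returns the mean loss over a batch by default, and the last batch of an epoch may be smaller than the others. The pure-Python sketch below shows the difference between that size-weighted average and a naive mean of batch means. The loss values and the 10-sample dataset are made up, and epoch_average is a hypothetical helper.

```python
def epoch_average(batch_means, batch_sizes):
    """Average a per-sample metric from per-batch means and batch sizes."""
    total = sum(mean * size for mean, size in zip(batch_means, batch_sizes))
    return total / sum(batch_sizes)


# Hypothetical epoch: 10 samples with batch_size=4, so the last batch has 2.
batch_losses = [0.5, 0.7, 1.0]
batch_sizes = [4, 4, 2]

weighted = epoch_average(batch_losses, batch_sizes)  # what train_model computes
naive = sum(batch_losses) / len(batch_losses)        # over-weights the small batch

print(f"weighted: {weighted:.4f}")
print(f"naive:    {naive:.4f}")
```

The weighted value is about 0.68 while the naive one is about 0.73, since the naive mean gives the two-sample batch the same weight as the four-sample batches.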

Visualizing the model predictions

The following code displays the model's predictions on a few validation images.

def visualize_model(model, num_images=6):
    was_training = model.training
    model.eval()
    images_so_far = 0
    fig = plt.figure()

    with torch.no_grad():  # no gradients are needed for inference
        for i, (inputs, labels) in enumerate(dataloaders["val"]):
            inputs = inputs.to(device)
            labels = labels.to(device)

            outputs = model(inputs)
            _, preds = torch.max(outputs,1)

            for j in range(inputs.size()[0]):  # iterate over the images in the batch
                images_so_far += 1
                ax = plt.subplot(num_images//2, 2, images_so_far)
                ax.axis("off")
                ax.set_title(class_names[preds[j]])
                imshow(inputs.cpu().data[j])  # imshow cannot operate on GPU tensors, so move the data to the CPU first

                if images_so_far == num_images:
                    model.train(mode = was_training)
                    return
        model.train(mode=was_training)

FineTuning the convnet

Load a pretrained model and reset the final fully connected layer.

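# from torchvision import models
# newer torchvision releases use weights=models.ResNet18_Weights.DEFAULT instead of pretrained=True
model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features
# replace the final layer with a new one that outputs the two classes (ants, bees)
model_ft.fc = nn.Linear(num_ftrs, 2)
model_ft = model_ft.to(device)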

criterion = nn.CrossEntropyLoss()


# all parameters are optimized here
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# decay the learning rate by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
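
With step_size=7 and gamma=0.1, a step-decay schedule follows the closed form lr = base_lr * gamma ** (epoch // step_size), assuming the scheduler is stepped once per epoch. The sketch below evaluates that formula in plain Python for a 25-epoch run. step_lr is a hypothetical helper for illustration and does not call PyTorch.

```python
def step_lr(base_lr, epoch, step_size=7, gamma=0.1):
    """Closed form of step decay: base_lr * gamma ** (epoch // step_size)."""
    return base_lr * gamma ** (epoch // step_size)


# Print the learning rate at the epochs where it changes, plus the last epoch.
for epoch in (0, 6, 7, 13, 14, 20, 21, 24):
    print(epoch, f"{step_lr(0.001, epoch):.0e}")
```

Under this assumption the rate is 1e-3 for epochs 0 to 6, 1e-4 for epochs 7 to 13, 1e-5 for epochs 14 to 20, and 1e-6 for the remaining epochs.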

Train and evaluate

Call the training function defined above to train the model.

model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler, num_epochs=25)

visualize_model(model_ft)

Convnet as Fixed Feature Extractor

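Here we freeze the parameters of every layer except the last one. To do so, we set the requires_grad attribute of those parameters to False, so that no gradients are computed for them during backward().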

model_conv = torchvision.models.resnet18(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False

# point the final fc layer at a new module; parameters of a newly constructed module have requires_grad=True by default
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 2)

model_conv = model_conv.to(device)

criterion = nn.CrossEntropyLoss()

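# only the parameters of the final layer are passed to the optimizer
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)

# decay the learning rate by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)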
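
The freezing pattern can be checked without downloading ResNet-18. The sketch below applies the same steps to a tiny stand-in network: freeze every parameter, swap in a fresh head, optimize only the head, and confirm that the frozen layer neither receives a gradient nor changes. The toy model, its layer sizes, and the random inputs are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained backbone plus classifier head (not ResNet-18).
toy = nn.Sequential(
    nn.Linear(8, 16),  # plays the role of the frozen feature extractor
    nn.ReLU(),
    nn.Linear(16, 2),  # plays the role of model_conv.fc
)

# Freeze everything, then replace the head; a newly constructed module
# has requires_grad=True on its parameters by default.
for param in toy.parameters():
    param.requires_grad = False
toy[2] = nn.Linear(16, 2)

trainable = [name for name, p in toy.named_parameters() if p.requires_grad]
print(trainable)  # ['2.weight', '2.bias']

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.SGD(toy[2].parameters(), lr=0.001, momentum=0.9)
before = toy[0].weight.clone()

# One training step on random data with made-up labels.
loss = nn.CrossEntropyLoss()(toy(torch.randn(4, 8)), torch.tensor([0, 1, 0, 1]))
loss.backward()
optimizer.step()

print(torch.equal(before, toy[0].weight))  # True: frozen layer is unchanged
print(toy[0].weight.grad is None)          # True: no gradient was computed for it
```

Because gradients are skipped for the frozen layers, the backward pass does less work, which is why the feature-extractor setup typically trains faster than full fine-tuning.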

Train and evaluate

model_conv = train_model(model_conv, criterion, optimizer_conv, exp_lr_scheduler, num_epochs=25)
visualize_model(model_conv)
