Getting Started with PyTorch: Learn by Doing 01, Basics

阿新 • Published: 2018-11-08

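Preface

First, a word on why I am writing this. Often you read the official tutorials and feel you understand them, yet you do not know where to start when it comes to real work. So I plan to put together a few posts to help everyone get started quickly. If you have suggestions, feel free to leave a comment at the end of the post.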

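1. Introduction to PyTorch

There are many popular machine learning frameworks today, such as TensorFlow and Keras, so why use PyTorch? I had been using Keras, but it adds an extra layer of wrapping and is not flexible enough. PyTorch is also easy to learn, so I plan to move part of my work over to it, which also means picking up one more skill.
Starting today, I will learn PyTorch together with you.
As for learning resources, the PyTorch documentation is detailed, and there is plenty of code on GitHub. The official PyTorch website (which hopefully is not blocked for you) has detailed installation instructions and related documentation, so this post does not cover installation. When I have time I will add a post on setting up a CUDA environment. I have noticed that many posts hardly mention CUDA programming, which is actually quite interesting.
I thought about where to begin, and after going back and forth I decided to follow the pytorch-tutorial repository on GitHub. Typing out the code yourself is the fastest way to learn.
Here is my learning process.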

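2. What Are We Doing Today?

We start with the basic operations in PyTorch. For the full code, see PyTorch Basics. If you have already gone through the official tutorials and know the basic operations, feel free to skip this post and move on to the later ones.
First, import the packages we need.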

import torch 
import torchvision
import torch.nn as nn
import numpy as np
import torchvision.transforms as transforms

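In practice we often work with images, text, and other unstructured data. How is such data represented inside a framework? As the name TensorFlow hints, the basic unit of computation is the tensor. Once we have tensors, what do we do with them? We add, subtract, multiply, and divide them, arrive at a result after a series of operations, and then differentiate that result. A so-called network is really just a more complicated formula.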

3. A Small Example

Here is a simple example. We first create some tensors by assigning constant values:

# Create tensors.
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)

# Build a computational graph.
y = w * x + b    # y = 2 * x + 3

# Compute gradients.
y.backward()

# Print out the gradients.
print(x.grad)    # x.grad = 2 
print(w.grad)    # w.grad = 1 
print(b.grad)    # b.grad = 1

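Differentiating y = 2 * x + 3 could hardly be simpler. Setting requires_grad=True tells PyTorch to track the operations on a tensor so that gradients can be computed later. All we need to type is: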

# Compute gradients.
y.backward()

and the derivative with respect to each tensor is computed.

# Print out the gradients.
print(x.grad)    # x.grad = 2 
print(w.grad)    # w.grad = 1 
print(b.grad)    # b.grad = 1
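
As a sanity check on the gradients above, the following minimal sketch compares the autograd result with a central finite-difference estimate. The helper `f` and the step size `eps` are introduced here purely for illustration and are not part of the original tutorial.

```python
import torch


def f(x, w, b):
    # Same formula as in the example above: y = w * x + b
    return w * x + b


x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)

y = f(x, w, b)
y.backward()

# Central finite difference with respect to x (eps is an arbitrary small step).
eps = 1e-3
with torch.no_grad():
    numeric_dx = (f(x + eps, w, b) - f(x - eps, w, b)) / (2 * eps)

# Both values should be close to 2, the value of w.
print(x.grad.item(), numeric_dx.item())
```

The numeric estimate only agrees up to floating-point error, so compare it with a tolerance rather than with exact equality.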

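That example is quite simple, so let's look at a more complex one.

4. A More Complex Example

First we construct some tensors. To make the data more representative, we want them to be multi-dimensional.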

# Create tensors of shape (10, 3) and (10, 2).
x = torch.randn(10, 3)
y = torch.randn(10, 2)

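Now that we have two "matrices" (10×3 and 10×2), what can we do with them? Let's treat y as the target: can we find a way to fit y using x? It seems possible. We only need something of the form y = Wx + b. How do we represent W and b in PyTorch? We can use the linear layer from torch.nn.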

# Build a fully connected layer.
linear = nn.Linear(3, 2)
print ('w: ', linear.weight)
print ('b: ', linear.bias)

The printed result is:

w:  Parameter containing:
tensor([[ 0.4690, -0.5511,  0.2672],
        [ 0.4337, -0.4777,  0.1417]], requires_grad=True)
b:  Parameter containing:
tensor([-0.2510,  0.2035], requires_grad=True)

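As we can see, these are ultimately just tensors. So what exactly is nn.Linear? Let's trace the code into linear.py.
It contains a class, class Linear(Module), which the documentation describes as follows.
Applies a linear transformation to the incoming data: :math:`y = xA^T + b`
It takes three arguments, which are easy to understand: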

    Args:
        in_features: size of each input sample
        out_features: size of each output sample
        bias: If set to False, the layer will not learn an additive bias.
            Default: ``True``

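With that, nn.Linear(3, 2) is easy to understand: the first argument is the input dimension (3) and the second is the output dimension (2). The layer is simply a matrix multiplication followed by adding the bias.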
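
To make the matrix-multiplication view concrete, here is a minimal sketch that reproduces the layer's output by hand using its `weight` and `bias` tensors. The fixed seed and the names `manual` and `builtin` are illustrative choices only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the run is reproducible

x = torch.randn(10, 3)
linear = nn.Linear(3, 2)

# nn.Linear stores weight with shape (out_features, in_features) = (2, 3),
# so the forward pass is x @ weight.T + bias, matching y = xA^T + b.
manual = x @ linear.weight.t() + linear.bias
builtin = linear(x)

print(builtin.shape)  # torch.Size([10, 2])
print(torch.allclose(manual, builtin, atol=1e-6))
```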
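Next, how do we approximate y? What counts as a good approximation? The smaller the difference the better: we backpropagate to compute gradients and reduce the error step by step.
Can we define the difference ourselves, as in the previous example? Let's sort out our thinking: what does the loss formula look like?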
$ \ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad
l_n = \left( x_n - y_n \right)^2 $

$ \ell(x, y) = \begin{cases}
\operatorname{mean}(L), & \text{if}\; \text{size\_average} = \text{True},\\
\operatorname{sum}(L), & \text{if}\; \text{size\_average} = \text{False}.
\end{cases} $
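Of course we can, but there is a catch: do we have to implement it ourselves every time? That would be tedious, so let's call the function PyTorch has already implemented for us.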

# Build loss function and optimizer.
criterion = nn.MSELoss()
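
To connect the formula to the built-in loss, the following sketch computes the squared error by hand and compares it with nn.MSELoss. The tensors `pred` and `target` are random stand-ins. Note that recent PyTorch versions select mean or sum through the `reduction` argument, which replaces the older `size_average` flag quoted in the formula.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
pred = torch.randn(10, 2)
target = torch.randn(10, 2)

# Element-wise squared error l_n = (x_n - y_n)^2, then mean or sum over all elements.
squared_error = (pred - target) ** 2
manual_mean = squared_error.mean()
manual_sum = squared_error.sum()

# reduction='mean' is the default behaviour of nn.MSELoss.
builtin_mean = nn.MSELoss(reduction='mean')(pred, target)
builtin_sum = nn.MSELoss(reduction='sum')(pred, target)

print(manual_mean.item(), builtin_mean.item())
print(manual_sum.item(), builtin_sum.item())
```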

Likewise, PyTorch provides optimizers that use the computed gradients to update the parameters, and we can call one directly.

optimizer = torch.optim.SGD(linear.parameters(), lr=0.01)

So far we have worked out the overall approach and set up the loss function and the optimizer. Let's complete the remaining steps.

# Forward pass.
pred = linear(x)

# Compute loss.
loss = criterion(pred, y)
print('loss: ', loss.item())

Now pull the trigger on differentiation:

# Backward pass.
loss.backward()

# Print out the gradients.
print ('dL/dw: ', linear.weight.grad) 
print ('dL/db: ', linear.bias.grad)

The result is as follows:

loss:  1.4480115175247192
dL/dw:  tensor([[ 0.9242,  0.2026,  0.4504],
        [ 0.6620, -0.2875,  0.1874]])
dL/db:  tensor([-0.0164,  0.1238])

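The code above performs one round of differentiation. If you want to see the update more directly, you can do the following. Calling optimizer.step() and applying the commented-out manual update give practically the same result: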

# 1-step gradient descent.
optimizer.step()

# You can also perform gradient descent at the low level.
# linear.weight.data.sub_(0.01 * linear.weight.grad.data)
# linear.bias.data.sub_(0.01 * linear.bias.grad.data)

# Print out the loss after 1-step gradient descent.
pred = linear(x)
loss = criterion(pred, y)
print('loss after 1 step optimization: ', loss.item())
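
The claim that the optimizer step and the manual update agree can be checked with a small sketch. It assumes plain SGD without momentum or weight decay. The copy named `manual` and the variable `lr` are introduced here only for the comparison.

```python
import copy

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(10, 3)
y = torch.randn(10, 2)

linear = nn.Linear(3, 2)
manual = copy.deepcopy(linear)  # identical starting weights for the manual update
criterion = nn.MSELoss()
lr = 0.01

# Path 1: let the optimizer apply the update.
optimizer = torch.optim.SGD(linear.parameters(), lr=lr)
loss_before = criterion(linear(x), y)
optimizer.zero_grad()
loss_before.backward()
optimizer.step()

# Path 2: apply p <- p - lr * grad by hand.
criterion(manual(x), y).backward()
with torch.no_grad():
    for p in manual.parameters():
        p -= lr * p.grad

loss_after = criterion(linear(x), y)
print(loss_before.item(), loss_after.item())
print(torch.allclose(linear.weight, manual.weight))
print(torch.allclose(linear.bias, manual.bias))
```

In a real training loop, call optimizer.zero_grad() before each backward pass, because gradients accumulate across calls to backward().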

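5. Loading Data from NumPy

One especially convenient feature of PyTorch is how easily you can convert between NumPy arrays and torch tensors. Let's go straight to the code.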

# Create a numpy array.
x = np.array([[1, 2], [3, 4]])

# Convert the numpy array to a torch tensor.
y = torch.from_numpy(x)

# Convert the torch tensor to a numpy array.
z = y.numpy()
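
One detail worth knowing is that torch.from_numpy and Tensor.numpy share memory with the source rather than copying it. The sketch below demonstrates this. The extra tensor `c`, created with torch.tensor, is added here to show the copying alternative.

```python
import numpy as np
import torch

x = np.array([[1, 2], [3, 4]])
y = torch.from_numpy(x)  # shares memory with x
z = y.numpy()            # shares memory with y, and therefore with x

x[0, 0] = 100            # modify the NumPy array in place
print(y[0, 0].item())    # 100
print(z[0, 0])           # 100

# torch.tensor(x) copies the data, so later changes to x do not propagate.
c = torch.tensor(x)
x[0, 0] = -1
print(c[0, 0].item())    # 100
print(y[0, 0].item())    # -1
```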

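6. Reading Data

Reading data is often the first problem people run into when adapting code. Let's illustrate it with a standard dataset.

Step 1: Download the dataset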

# Download and construct CIFAR-10 dataset.
train_dataset = torchvision.datasets.CIFAR10(root='../../data/',
                                             train=True, 
                                             transform=transforms.ToTensor(),
                                             download=True)

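It is very simple: we specify a location and download the data. What does the data look like? Let's take a look.

Step 2: Preview the data
The dataset's documentation describes the data. What we need to do is inspect its format and size.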

# Fetch one data pair (read data from disk).
image, label = train_dataset[0]
print (image.size())
print (label)

The output is:

torch.Size([3, 32, 32])
6

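Does PyTorch give us a convenient interface for iterating over the data? Of course. Looking at the documentation, we find class DataLoader. Here is what it says.
Data loader. Combines a dataset and a sampler, and provides single- or multi-process iterators over the dataset.
What arguments does it take? There are a lot of them, but they are easy to understand.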

Arguments:
        dataset (Dataset): dataset from which to load the data.
        batch_size (int, optional): how many samples per batch to load
            (default: 1).
        shuffle (bool, optional): set to ``True`` to have the data reshuffled
            at every epoch (default: False).
        sampler (Sampler, optional): defines the strategy to draw samples from
            the dataset. If specified, ``shuffle`` must be False.
        batch_sampler (Sampler, optional): like sampler, but returns a batch of
            indices at a time. Mutually exclusive with batch_size, shuffle,
            sampler, and drop_last.
        num_workers (int, optional): how many subprocesses to use for data
            loading. 0 means that the data will be loaded in the main process.
            (default: 0)
        collate_fn (callable, optional): merges a list of samples to form a mini-batch.
        pin_memory (bool, optional): If ``True``, the data loader will copy tensors
            into CUDA pinned memory before returning them.
        drop_last (bool, optional): set to ``True`` to drop the last incomplete batch,
            if the dataset size is not divisible by the batch size. If ``False`` and
            the size of dataset is not divisible by the batch size, then the last batch
            will be smaller. (default: False)
        timeout (numeric, optional): if positive, the timeout value for collecting a batch
            from workers. Should always be non-negative. (default: 0)
        worker_init_fn (callable, optional): If not None, this will be called on each
            worker subprocess with the worker id (an int in ``[0, num_workers - 1]``) as
            input, after seeding and before data loading. (default: None)

Continuing from above, let's look at an example and instantiate a DataLoader first:

Step 3: Define the loader

# Data loader (this provides queues and threads in a very simple way).
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=64, 
                                           shuffle=True)

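As you can see, only a few options are commonly used, since the data here is not very large. We will leave parallelism aside for now. So what is this train_loader, and how do we access its elements one by one? The documentation shows that DataLoader is accessed through iteration, so we can create an iterator from it.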

# When iteration starts, queue and thread start to load data from files.
data_iter = iter(train_loader)

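Step 4: Use the data
How many samples are read each time is determined by batch_size above. Now that we have the iterator, we can fetch batches one at a time and do real work with them.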

# Mini-batch images and labels.
images, labels = next(data_iter)

# Actual usage of the data loader is as below.
for images, labels in train_loader:
    # Training code should be written here.
    pass
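
To see how batch_size shapes each batch without downloading CIFAR-10, here is a sketch that uses random tensors wrapped in a TensorDataset as a stand-in. The sample count of 150 is an arbitrary value chosen so that the last batch is incomplete.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random stand-in for an image dataset: 150 samples shaped like CIFAR-10 images.
images = torch.randn(150, 3, 32, 32)
labels = torch.randint(0, 10, (150,))
dataset = TensorDataset(images, labels)

loader = DataLoader(dataset, batch_size=64, shuffle=True)

batch_shapes = []
for batch_images, batch_labels in loader:
    batch_shapes.append((tuple(batch_images.shape), tuple(batch_labels.shape)))

# 150 samples with batch_size=64 give batches of 64, 64 and 22,
# because drop_last defaults to False.
for shapes in batch_shapes:
    print(shapes)
```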

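7. Loading Data from a Custom Dataset

Above we loaded a built-in dataset directly, which is clearly not enough for users with other data requirements. What we need to do is subclass torch.utils.data.Dataset and implement its methods.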

# You should build your custom dataset as below.
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self):
        # TODO
        # 1. Initialize file paths or a list of file names. 
        pass
    def __getitem__(self, index):
        # TODO
        # 1. Read one data from file (e.g. using numpy.fromfile, PIL.Image.open).
        # 2. Preprocess the data (e.g. torchvision.transforms).
        # 3. Return a data pair (e.g. image and label).
        pass
    def __len__(self):
        # You should change 0 to the total size of your dataset.
        return 0 

# You can then use the prebuilt data loader. 
custom_dataset = CustomDataset()
train_loader = torch.utils.data.DataLoader(dataset=custom_dataset,
                                           batch_size=64, 
                                           shuffle=True)
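
The skeleton above returns 0 from `__len__`, which leaves nothing to iterate over. As a filled-in illustration, the following sketch serves samples from in-memory NumPy arrays. The class name `ArrayDataset` and the toy data are hypothetical. A real dataset would typically read and preprocess files inside `__getitem__`.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class ArrayDataset(Dataset):
    """Hypothetical dataset that serves (feature, label) pairs from NumPy arrays."""

    def __init__(self, features, labels):
        assert len(features) == len(labels)
        self.features = features
        self.labels = labels

    def __getitem__(self, index):
        # Convert one sample to a tensor; real code would read a file here.
        feature = torch.from_numpy(self.features[index]).float()
        label = int(self.labels[index])
        return feature, label

    def __len__(self):
        return len(self.features)


features = np.arange(40, dtype=np.float64).reshape(10, 4)
labels = np.arange(10) % 2
custom_dataset = ArrayDataset(features, labels)
train_loader = DataLoader(dataset=custom_dataset, batch_size=4, shuffle=False)

for batch_features, batch_labels in train_loader:
    print(batch_features.shape, batch_labels)
```

The default collate function stacks the per-sample tensors into a batch and turns the integer labels into a tensor, so no custom collate_fn is needed here.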

8. Loading a Pretrained Model

PyTorch provides a number of pretrained models, which can be found under torchvision.models.

# Download and load the pretrained ResNet-18.
# Note: torchvision >= 0.13 deprecates pretrained=True in favor of the weights argument.
resnet = torchvision.models.resnet18(pretrained=True)

# If you want to finetune only the top layer of the model, set as below.
for param in resnet.parameters():
    param.requires_grad = False

# Replace the top layer for finetuning.
resnet.fc = nn.Linear(resnet.fc.in_features, 100)  # 100 is an example.

# Forward pass.
images = torch.randn(64, 3, 224, 224)
outputs = resnet(images)
print (outputs.size())     # (64, 100)

9. Saving and Loading Models

These are very simple operations, and I trust you will understand them at a glance.


# Save and load the entire model.
torch.save(resnet, 'model.ckpt')
model = torch.load('model.ckpt')

# Save and load only the model parameters (recommended).
torch.save(resnet.state_dict(), 'params.ckpt')
resnet.load_state_dict(torch.load('params.ckpt'))
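
As a quick check of the recommended state_dict approach, this sketch saves the parameters of a small layer to a temporary file, loads them into a fresh instance, and verifies that both produce the same output. The nn.Linear model and the temporary path are stand-ins chosen to keep the example light.

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 2)
x = torch.randn(5, 3)
expected = model(x)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'params.ckpt')
    torch.save(model.state_dict(), path)

    # A new model has different random weights until the state dict is loaded.
    restored = nn.Linear(3, 2)
    restored.load_state_dict(torch.load(path))

restored.eval()
same_output = torch.equal(restored(x), expected)
print(same_output)  # True
```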

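10. Summary

Today we covered the basic operations of PyTorch and some commonly used built-in modules. Simple, isn't it? More content will follow in later posts, so stay tuned!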
