What is PyTorch (Part 3: Neural Networks) (translation)

阿新 • Published: 2018-11-12

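Neural Networks

Neural networks can be constructed using the torch.nn package.

The previous part introduced autograd. nn depends on autograd to define models and differentiate them. An nn.Module contains layers and a method forward(input) that returns the output.

For example, look at this network that classifies handwritten digit images: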

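(Figure: convolutional neural network)

This is a simple feed-forward network. It takes the input, feeds it through several layers one after the other, and finally gives the output.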

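A typical training procedure for a neural network is as follows:

Define a neural network that has some learnable parameters (weights).
Iterate over a dataset of inputs.
Process each input through the network.
Compute the loss (how far is the output from being correct).
Propagate gradients back into the network's parameters.
Update the weights of the network, typically using a simple update rule: weight = weight - learning_rate * gradient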
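
The steps above can be sketched on a toy problem. This is only an illustration and not the tutorial's network: the single nn.Linear layer, the random data, the number of steps, and the learning rate of 0.1 are all assumptions chosen to keep the example small.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# 1. Define a network with learnable parameters (a single linear layer here).
model = nn.Linear(3, 1)

# Toy dataset (assumed for illustration): targets follow a known linear rule.
inputs = torch.randn(64, 3)
targets = inputs @ torch.tensor([[1.0], [-2.0], [0.5]])

learning_rate = 0.1
losses = []
for step in range(100):                       # 2. iterate over the data
    outputs = model(inputs)                   # 3. process input through the network
    loss = ((outputs - targets) ** 2).mean()  # 4. compute the loss
    model.zero_grad()                         #    clear gradients from the previous step
    loss.backward()                           # 5. propagate gradients back
    with torch.no_grad():                     # 6. weight = weight - learning_rate * gradient
        for p in model.parameters():
            p -= learning_rate * p.grad
    losses.append(loss.item())

print(losses[0], losses[-1])
```

The last recorded loss should be lower than the first one, which is the only thing this sketch is meant to show.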

Define the network

Let's define this network:

import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square you can only specify a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features


net = Net()
print(net)

Out:

Net(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)

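You only have to define the forward function. The backward function (where gradients are computed) is defined for you automatically by autograd. You can use any Tensor operation inside forward.

The learnable parameters of a model are returned by net.parameters().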

params = list(net.parameters())
print(len(params))
print(params[0].size())  # conv1's .weight

Out:

10
torch.Size([6, 1, 5, 5])
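
The count of 10 comes from five layers that each hold a weight and a bias. A minimal sketch that lists them by name, assuming the same layer sizes as Net but collected in an nn.ModuleDict (a container chosen here only to keep the example short):

```python
import torch.nn as nn

# Same layer sizes as Net; the ModuleDict container is an assumption of this sketch.
layers = nn.ModuleDict({
    "conv1": nn.Conv2d(1, 6, 5),
    "conv2": nn.Conv2d(6, 16, 5),
    "fc1": nn.Linear(16 * 5 * 5, 120),
    "fc2": nn.Linear(120, 84),
    "fc3": nn.Linear(84, 10),
})

names = []
total = 0
for name, p in layers.named_parameters():
    names.append(name)
    total += p.numel()          # number of scalar values in this parameter
    print(name, tuple(p.shape))

print(len(names), total)
```

Calling net.named_parameters() on the Net instance should give the same kind of (name, tensor) pairs.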

Let's try a random 32x32 input. Note: the expected input size for this net (LeNet) is 32x32.

To use this net on the MNIST dataset, resize the images from the dataset to 32x32.

input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)

Out:

tensor([[ 0.1246, -0.0511,  0.0235,  0.1766, -0.0359, -0.0334,  0.1161,  0.0534,
          0.0282, -0.0202]], grad_fn=<ThAddmmBackward>)
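
Why the input has to be 32x32, and why fc1 expects 16 * 5 * 5 features, can be checked by tracing tensor shapes through the convolution and pooling stages. The sketch below uses freshly created layers with the same sizes as conv1 and conv2 rather than the net instance itself:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(1, 6, 5)    # same sizes as Net.conv1
conv2 = nn.Conv2d(6, 16, 5)   # same sizes as Net.conv2

x = torch.randn(1, 1, 32, 32)      # a batch of one single-channel 32x32 image
a = conv1(x)                       # 5x5 kernel, no padding: 32 -> 28
b = F.max_pool2d(F.relu(a), 2)     # 2x2 pooling: 28 -> 14
c = conv2(b)                       # 5x5 kernel, no padding: 14 -> 10
d = F.max_pool2d(F.relu(c), 2)     # 2x2 pooling: 10 -> 5
flat = d.view(d.size(0), -1)       # 16 channels * 5 * 5 = 400 features per sample

for name, t in [("conv1", a), ("pool1", b), ("conv2", c), ("pool2", d), ("flat", flat)]:
    print(name, tuple(t.shape))
```

With a different input size, the flattened feature count would no longer match the 400 inputs of fc1.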

Zero the gradient buffers of all parameters, then backpropagate with random gradients:

net.zero_grad()
out.backward(torch.randn(1, 10))

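Note:

torch.nn only supports mini-batches. The entire torch.nn package expects inputs that are a mini-batch of samples, not a single sample.

For example, nn.Conv2d takes a 4D Tensor of shape nSamples x nChannels x Height x Width.

If you have a single sample, use input.unsqueeze(0) to add a fake batch dimension.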
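
A minimal sketch of the unsqueeze(0) step, assuming a single 1-channel 32x32 sample and a standalone convolution layer with the same sizes as conv1:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 6, 5)

single = torch.randn(1, 32, 32)   # one sample: nChannels x Height x Width
batch = single.unsqueeze(0)       # insert a batch dimension at position 0
out = conv(batch)                 # now nSamples x nChannels x Height x Width

print(tuple(single.shape), tuple(batch.shape), tuple(out.shape))
```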

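Before proceeding further, let's recap all the classes you have seen so far.

Recap:

torch.Tensor - A multi-dimensional array with support for autograd operations such as backward(). It also holds the gradient with respect to the tensor.
nn.Module - A neural network module. A convenient way of encapsulating parameters, with helpers for moving them to the GPU, exporting, loading, etc.
nn.Parameter - A kind of Tensor that is automatically registered as a parameter when assigned as an attribute to a Module.
autograd.Function - Implements the forward and backward definitions of an autograd operation. Every Tensor operation creates at least a single Function node that connects to the functions that created a Tensor and encodes its history.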
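
The following sketch illustrates how these classes interact. The Scale module is hypothetical and exists only for this example: a tensor wrapped in nn.Parameter shows up in named_parameters(), a plain tensor attribute does not, and the result of forward carries a grad_fn node.

```python
import torch
import torch.nn as nn


class Scale(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.gain = nn.Parameter(torch.ones(3))  # registered as a parameter automatically
        self.offset = torch.zeros(3)             # plain tensor attribute: not registered

    def forward(self, x):
        return x * self.gain + self.offset


m = Scale()
names = [name for name, _ in m.named_parameters()]

y = m(torch.tensor([1.0, 2.0, 3.0])).sum()
y.backward()                                     # fills m.gain.grad with d(y)/d(gain) = x

print(names)
print(m.gain.grad)
print(y.grad_fn)
```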

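At this point, we have covered:

Defining a neural network
Processing inputs and calling backward

Still left:

Computing the loss
Updating the weights of the network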

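Loss Function

A loss function takes the (output, target) pair as input and computes a value that estimates how far the output is from the target.

There are several different loss functions in the nn package. A simple one is nn.MSELoss, which computes the mean-squared error between the output and the target.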

For example:

output = net(input)
target = torch.randn(10)  # a dummy target, for example
target = target.view(1, -1)  # make it the same shape as output
criterion = nn.MSELoss()

loss = criterion(output, target)
print(loss)

Out:

tensor(1.3638, grad_fn=<MseLossBackward>)
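
As a check of what nn.MSELoss computes with its default settings, the mean of the squared element-wise differences can be calculated by hand. In this sketch the random output tensor is a stand-in for net(input):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
output = torch.randn(1, 10)   # stand-in for net(input)
target = torch.randn(1, 10)

loss = nn.MSELoss()(output, target)
manual = ((output - target) ** 2).mean()   # mean over all elements

print(loss.item(), manual.item())
```

Both values should agree up to floating-point rounding.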

Now, if you follow loss in the backward direction, using its .grad_fn attribute, you will see a graph of computations that looks like this:

input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d -> view -> linear -> relu -> linear -> relu -> linear -> MSELoss -> loss

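So, when we call loss.backward(), the whole graph is differentiated with respect to the loss, and every Tensor in the graph that has requires_grad=True will have the gradient accumulated into its .grad Tensor.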

For illustration, let us follow a few steps backward:

print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # ReLU

Out:

<MseLossBackward object at 0x7f0e86396a90>
<ThAddmmBackward object at 0x7f0e863967b8>
<ExpandBackward object at 0x7f0e863967b8>
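
The same walk can be written as a loop. This is a sketch under two assumptions: a single nn.Linear layer stands in for the tutorial's network, and only the first input of each node is followed, so it shows one path through the graph rather than the whole graph. Node class names differ between PyTorch versions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(4, 2)                        # stand-in for the tutorial's Net
loss = nn.MSELoss()(layer(torch.randn(1, 4)), torch.randn(1, 2))

chain = []
fn = loss.grad_fn
while fn is not None:
    chain.append(type(fn).__name__)            # class name of this graph node
    parents = fn.next_functions                # tuple of (node, index) pairs, possibly empty
    fn = parents[0][0] if parents else None    # follow only the first input

print(" <- ".join(chain))
```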

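Backprop

To backpropagate the error, all we have to do is call loss.backward(). The existing gradients need to be cleared first, though, otherwise the new gradients are accumulated on top of them.

Now we call loss.backward() and look at conv1's bias gradients before and after the backward pass.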

net.zero_grad()     # zeroes the gradient buffers of all parameters

print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)

loss.backward()

print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)

Out:

conv1.bias.grad before backward
tensor([0., 0., 0., 0., 0., 0.])
conv1.bias.grad after backward
tensor([ 0.0181, -0.0048, -0.0229, -0.0138, -0.0088, -0.0107])
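
Why the gradients have to be cleared can be seen on a single tensor. The sketch below is not the tutorial's network: it calls backward twice without zeroing in between, then once more after zeroing. Depending on the PyTorch version, zero_grad() may reset gradients to None instead of to zero tensors, in which case the "before backward" print above shows None.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

(w ** 2).sum().backward()
first = w.grad.clone()           # d/dw sum(w^2) = 2w

(w ** 2).sum().backward()        # no zeroing in between: the new gradient is added
accumulated = w.grad.clone()

w.grad.zero_()                   # clear the buffer for this one tensor
(w ** 2).sum().backward()
after_zeroing = w.grad.clone()

print(first, accumulated, after_zeroing)
```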

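Now we have seen how to use loss functions.

Read later:

The nn package contains various modules and loss functions that form the building blocks of deep neural networks. A full list is available in the torch.nn documentation.

The only thing left to learn is:

Updating the weights of the network

Update the weights

The simplest update rule used in practice is Stochastic Gradient Descent (SGD):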

weight = weight - learning_rate * gradient

We can implement this with simple Python code:

learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
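
The loop above updates parameters through .data. A commonly used alternative is to perform the in-place update inside torch.no_grad(), so that the update itself is not recorded by autograd. A minimal sketch on a single tensor, where the toy loss is an assumption made for illustration:

```python
import torch

weight = torch.tensor([1.0, -1.0], requires_grad=True)
loss = (weight ** 2).sum()           # toy loss (assumed for this sketch)
loss.backward()                      # weight.grad is 2 * weight

learning_rate = 0.01
with torch.no_grad():                # do not track the update in the graph
    weight -= learning_rate * weight.grad

print(weight)
```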

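However, as you use neural networks, you will want to use various update rules such as SGD, Nesterov-SGD, Adam, RMSProp, etc. To enable this, a small package, torch.optim, implements all of these methods.

Using it is very simple: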

import torch.optim as optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad()   # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step()    # Does the update

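Note:

Observe how the gradient buffers had to be set to zero manually with optimizer.zero_grad(). This is because gradients are accumulated, as explained in the Backprop section.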
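
To connect optimizer.step() with the update rule weight = weight - learning_rate * gradient, the sketch below runs one step of plain optim.SGD (no momentum or weight decay) and compares the result with the rule applied by hand. The small nn.Linear model and the random data are assumptions standing in for the tutorial's net, input, and target.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model = nn.Linear(4, 2)                  # stand-in for the tutorial's net
criterion = nn.MSELoss()
lr = 0.01
optimizer = optim.SGD(model.parameters(), lr=lr)

inputs = torch.randn(8, 4)
targets = torch.randn(8, 2)

optimizer.zero_grad()                    # clear the gradient buffers
loss = criterion(model(inputs), targets)
loss.backward()

# Update rule applied by hand, computed before the optimizer changes the parameters.
expected = [p.detach() - lr * p.grad for p in model.parameters()]

optimizer.step()                         # performs the update in place

matches = all(
    torch.allclose(p.detach(), e) for p, e in zip(model.parameters(), expected)
)
print(matches)
```

Other optimizers in torch.optim keep extra state (for example momentum buffers), so this simple equality is specific to plain SGD.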
