
Torch notes: the torch.nn module

Recurrent layers

class torch.nn.RNN(*args, **kwargs)

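Parameters:
input_size – the number of features in the input x.
hidden_size – the number of features in the hidden state.
num_layers – the number of recurrent layers.
bidirectional – if True, the module becomes a bidirectional RNN. Default: False.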

RNN inputs: (input, h_0)

input (seq_len, batch, input_size): tensor containing the features of the input sequence.
h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state.

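RNN outputs: (output, h_n)

output (seq_len, batch, hidden_size * num_directions): tensor containing the output features of the last RNN layer for every time step.
h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state at the last time step.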

Example:

import torch
import torch.nn as nn

# input feature size is 10, hidden size is 20, and the RNN has 2 layers
rnn = nn.RNN(10, 20, 2)
# (seq_len, batch, input_size)
input = torch.randn(5, 3, 10)
# (num_layers * num_directions, batch, hidden_size)
h0 = torch.randn(2, 3, 20)
output, hn = rnn(input, h0)
print(output.shape)  # (seq_len, batch, hidden_size * num_directions)
print(hn.shape)  # (num_layers * num_directions, batch, hidden_size)

torch.Size([5, 3, 20])
torch.Size([2, 3, 20])
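The num_directions factor in the shapes above only matters when bidirectional=True. As a minimal sketch, assuming the same sizes as the example (the variable names here are illustrative, not from the original notes), the shapes can be checked like this:

```python
import torch
import torch.nn as nn

# Same sizes as the example above, but bidirectional, so num_directions = 2.
rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2, bidirectional=True)

x = torch.randn(5, 3, 10)       # (seq_len, batch, input_size)
h0 = torch.zeros(2 * 2, 3, 20)  # (num_layers * num_directions, batch, hidden_size)

output, hn = rnn(x, h0)
out_shape = tuple(output.shape)  # last dimension is hidden_size * num_directions
hn_shape = tuple(hn.shape)       # first dimension is num_layers * num_directions
print(out_shape, hn_shape)
```

The last dimension of output doubles because the forward and backward hidden states are concatenated at every time step.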

The same interface applies to:

class torch.nn.GRU(*args, **kwargs)
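For illustration, here is a hedged sketch of nn.GRU called with the same sizes as the nn.RNN example. The variable names are assumptions made for this sketch. It also checks that, for a unidirectional network, the last time step of output matches the top layer's entry in h_n.

```python
import torch
import torch.nn as nn

gru = nn.GRU(10, 20, 2)      # input_size=10, hidden_size=20, num_layers=2
x = torch.randn(5, 3, 10)    # (seq_len, batch, input_size)
h0 = torch.randn(2, 3, 20)   # (num_layers * num_directions, batch, hidden_size)

output, hn = gru(x, h0)
gru_out_shape = tuple(output.shape)
gru_hn_shape = tuple(hn.shape)

# output holds the top layer's hidden state at every step, so its last
# step should equal the top layer's final hidden state in hn.
last_step_matches = torch.allclose(output[-1], hn[-1])
print(gru_out_shape, gru_hn_shape, last_step_matches)
```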

Another class, which processes a single time step per call:

class torch.nn.RNNCell(input_size, hidden_size, bias=True, nonlinearity='tanh')
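Because nn.RNNCell handles only one time step, the loop over the sequence has to be written by hand. The following is a rough sketch with assumed sizes (10 input features, 20 hidden features, sequence length 5, batch size 3):

```python
import torch
import torch.nn as nn

cell = nn.RNNCell(input_size=10, hidden_size=20)

x = torch.randn(5, 3, 10)  # (seq_len, batch, input_size)
h = torch.zeros(3, 20)     # (batch, hidden_size), initial hidden state

outputs = []
for t in range(x.size(0)):
    # One step: takes (batch, input_size) and (batch, hidden_size),
    # returns the next hidden state of shape (batch, hidden_size).
    h = cell(x[t], h)
    outputs.append(h)

stacked = torch.stack(outputs)  # (seq_len, batch, hidden_size)
print(stacked.shape)
```

Stacking the per-step hidden states gives a tensor with the same layout as the output of a single-layer nn.RNN.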

Linear layers

class torch.nn.Linear(in_features, out_features, bias=True)
Applies a linear transformation to the incoming data: y=xA^T+b

Example:

# map 3-dimensional features to 2-dimensional features
m = nn.Linear(3, 2)
input = torch.randn(10, 3)
output = m(input)
print(output.size())


torch.Size([10, 2])
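To make the formula y = xA^T + b concrete, the sketch below recomputes the layer's output by hand from its weight and bias attributes. The names x and manual are illustrative only.

```python
import torch
import torch.nn as nn

m = nn.Linear(3, 2)
x = torch.randn(10, 3)

# weight has shape (out_features, in_features), so it is transposed
# before the matrix product; bias has shape (out_features,).
manual = x @ m.weight.t() + m.bias

weight_shape = tuple(m.weight.shape)
matches = torch.allclose(m(x), manual, atol=1e-6)
print(weight_shape, matches)
```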

Dropout layers

class torch.nn.Dropout(p=0.5, inplace=False)

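Parameters:

p - probability of an element being zeroed. Default: 0.5
inplace - if set to True, performs the operation in place. Default: False

Shape:

Input: any. The input can have any shape.
Output: same. The output has the same shape as the input.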

Example:

m = nn.Dropout(p=0.5)
input = torch.randn(2, 2)
output = m(input)
output

tensor([[-0.0000, -2.9296],
        [ 0.0924,  0.0000]])
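The example output reflects training mode, where PyTorch scales the surviving elements by 1/(1-p) and zeroes the rest; in evaluation mode the layer passes its input through unchanged. A small sketch of both modes, using an all-ones input chosen here so the scaling is easy to see:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
m = nn.Dropout(p=0.5)
x = torch.ones(1000)

m.train()                     # training mode: dropout is active
y_train = m(x)
kept = y_train[y_train != 0]  # elements that were not zeroed
# Survivors are scaled by 1 / (1 - p) = 2 so the expected value is preserved.
all_scaled = bool(torch.all(kept == 2.0))

m.eval()                      # evaluation mode: dropout is a no-op
y_eval = m(x)
identity_in_eval = torch.equal(y_eval, x)
print(all_scaled, identity_in_eval)
```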

Sparse layers

class torch.nn.Embedding(num_embeddings, embedding_dim, padding_idx=None, max_norm=None, norm_type=2, scale_grad_by_freq=False, sparse=False, _weight=None)

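Parameters:

num_embeddings (int) - size of the embedding dictionary
embedding_dim (int) - size of each embedding vector
padding_idx (int, optional) - if given, the output is filled with zeros whenever this index is encountered
max_norm (float, optional) - if given, embedding vectors are renormalized so that their norm does not exceed this value
norm_type (float, optional) - the p of the p-norm computed for the max_norm option
scale_grad_by_freq (boolean, optional) - if given, gradients are scaled by the inverse frequency of the words in the mini-batch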

Variables:

weight (Tensor) - the learnable weights of the module, of shape (num_embeddings, embedding_dim)

Shape:

Input: LongTensor (N, W), where N = mini-batch size and W = number of indices to extract per mini-batch sample
Output: (N, W, embedding_dim)

Example:

# an Embedding module containing 10 tensors of size 3
embedding = nn.Embedding(10, 3)
# a batch of 2 samples of 4 indices each
input = torch.LongTensor([[1, 2, 4, 5], [5, 4, 2, 1]])
embedding(input)

tensor([[[-0.4031,  1.8008,  1.4954],
         [ 0.3768, -0.2439,  0.9262],
         [ 0.8444, -0.1265,  2.0801],
         [ 1.0576, -0.9705, -0.1841]],

        [[ 1.0576, -0.9705, -0.1841],
         [ 0.8444, -0.1265,  2.0801],
         [ 0.3768, -0.2439,  0.9262],
         [-0.4031,  1.8008,  1.4954]]])
embedding.weight


Parameter containing:
tensor([[-0.6084,  0.0402, -1.5447],
        [-0.4031,  1.8008,  1.4954],
        [ 0.3768, -0.2439,  0.9262],
        [ 0.4351, -1.6146,  0.7603],
        [ 0.8444, -0.1265,  2.0801],
        [ 1.0576, -0.9705, -0.1841],
        [ 0.6502, -0.1189,  0.0794],
        [-0.9843, -0.1582, -0.0912],
        [ 0.1690, -0.0980, -0.1338],
        [-0.9448, -1.9642, -0.1723]])
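Comparing the two printouts shows that each output row is simply the row of weight at the given index. A brief sketch that checks this directly (idx and out are names introduced here for illustration):

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3)
idx = torch.tensor([[1, 2, 4, 5], [5, 4, 2, 1]])  # (N, W) indices, dtype int64

out = embedding(idx)  # (N, W, embedding_dim)
out_shape = tuple(out.shape)

# The forward pass is a table lookup: row i of weight for index i.
same_as_indexing = torch.equal(out, embedding.weight[idx])
print(out_shape, same_as_indexing)
```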

Example with padding_idx:

embedding = nn.Embedding(10, 3, padding_idx=1)
input = torch.LongTensor([[0, 1, 0, 5]])
embedding(input)

tensor([[[-1.1790,  1.2073, -1.0174],
         [ 0.0000,  0.0000,  0.0000],
         [-1.1790,  1.2073, -1.0174],
         [-0.2278,  1.1332, -0.2259]]])
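Beyond returning zeros, the row at padding_idx also receives no gradient, so it stays at zero during training. The sketch below illustrates this under the same sizes as the example; summing the output is just a convenient stand-in for a loss.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3, padding_idx=1)
idx = torch.tensor([[0, 1, 0, 5]])

out = embedding(idx)
# The padding row is initialized to zeros.
pad_row_is_zero = bool(torch.all(embedding.weight[1] == 0))

# Use the sum of all outputs as a stand-in loss and backpropagate.
out.sum().backward()
grad = embedding.weight.grad

pad_grad_is_zero = bool(torch.all(grad[1] == 0))  # no update for the padding row
grad_index_0 = grad[0]  # index 0 appears twice, so its gradient accumulates
print(pad_row_is_zero, pad_grad_is_zero, grad_index_0)
```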