PyTorch: Understanding the torch.nn.Embedding Word Embedding Layer

阿新 • Published: 2020-07-25

1. Understanding the concept of word embedding

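First, let us look at what an embedding is. A word embedding turns text into a sequence of numbers, because numbers are a form that computers handle far more easily than raw text. Building word embeddings is like compiling a dictionary for the computer, which it then uses to recognize text indirectly. A word embedding vector can also be understood as the vector representation of a word inside a neural network.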
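The "dictionary" idea above can be sketched in plain Python. This is only an illustration under assumed data: the toy sentences and the names `sentences`, `word_to_index` and `encoded` are hypothetical and not part of PyTorch.

```python
# Hypothetical toy corpus; the sentences and names are illustrative only.
sentences = ["i like deep learning", "i like pytorch"]

# Build the "dictionary": each distinct word gets an integer index,
# assigned in order of first appearance.
word_to_index = {}
for sentence in sentences:
    for word in sentence.split():
        if word not in word_to_index:
            word_to_index[word] = len(word_to_index)

# Encode a sentence as the list of indices of its words.
encoded = [word_to_index[word] for word in "i like pytorch".split()]

print(word_to_index)  # {'i': 0, 'like': 1, 'deep': 2, 'learning': 3, 'pytorch': 4}
print(encoded)        # [0, 1, 4]
```

These integer indices are what an embedding layer later maps to vectors.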

2. Embedding in PyTorch

Definition from the official documentation:

A simple lookup table that stores embeddings of a fixed dictionary and size.
This module is often used to store word embeddings and retrieve them using indices.
The input to the module is a list of indices, and the output is the corresponding word embeddings.

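In other words, it is a simple lookup table that stores the embedding vectors of a fixed-size dictionary: given an index, the embedding layer returns the embedding vector for that index. Once trained, these vectors reflect the semantic relationships between the symbols that the indices stand for. The module is typically used to store word embeddings and retrieve them by index.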

The input to the module is a list of indices, and the output is the corresponding word embeddings.
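A minimal sketch of this lookup behaviour, assuming a dictionary of 10 entries and 3-dimensional vectors (both sizes, and the index values, are arbitrary choices made for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the random initial weights repeatable

# A table with 10 rows (dictionary size) and 3 columns (vector size).
embedding = nn.Embedding(num_embeddings=10, embedding_dim=3)

# A batch of 2 sequences with 4 indices each; indices must be integers
# in the range [0, num_embeddings - 1].
indices = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]])

output = embedding(indices)
print(output.shape)  # torch.Size([2, 4, 3])

# The lookup is plain row selection from the weight matrix.
same = torch.equal(output, embedding.weight[indices])
print(same)  # True
```

The output keeps the shape of the input and appends one trailing dimension of size `embedding_dim`.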

Parameter description from the official documentation:

def __init__(self, num_embeddings, embedding_dim, padding_idx=None,
                 max_norm=None, norm_type=2., scale_grad_by_freq=False,
                 sparse=False, _weight=None)

Args:
        num_embeddings (int): size of the dictionary of embeddings
        embedding_dim (int): the size of each embedding vector
        padding_idx (int, optional): If given, pads the output with the embedding vector at :attr:`padding_idx`
                                         (initialized to zeros) whenever it encounters the index.
        max_norm (float, optional): If given, each embedding vector with norm larger than :attr:`max_norm`
                                    is renormalized to have norm :attr:`max_norm`.
        norm_type (float, optional): The p of the p-norm to compute for the :attr:`max_norm` option. Default ``2``.
        scale_grad_by_freq (boolean, optional): If given, this will scale gradients by the inverse of frequency of
                                                the words in the mini-batch. Default ``False``.
        sparse (bool, optional): If ``True``, gradient w.r.t. :attr:`weight` matrix will be a sparse tensor.
                                 See Notes for more details regarding sparse gradients.

Notes on the parameters:

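num_embeddings (int) – The size of the dictionary, i.e. how many words it holds. If 5000 distinct words occur in total, pass 5000; the valid indices are then 0 to 4999.
embedding_dim (int) – The dimension of each embedding vector, i.e. how many numbers are used to represent one symbol.
padding_idx (int, optional) – The padding index. For example, if the input length is fixed at 100 but sentences vary in length, the shorter ones must be filled up with one fixed index, and this parameter specifies that index. The embedding vector at padding_idx is initialized to zeros and receives no gradient updates, so padded positions map to a zero vector.
max_norm (float, optional) – The maximum norm. Any embedding vector whose norm exceeds this bound is renormalized to have norm max_norm.
norm_type (float, optional) – The p of the p-norm that is computed and compared against max_norm. Defaults to 2, the 2-norm.
scale_grad_by_freq (bool, optional) – If True, gradients are scaled by the inverse of the frequency of each word in the mini-batch. Defaults to False.
sparse (bool, optional) – If True, the gradient with respect to the weight matrix is a sparse tensor.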
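The effect of padding_idx, max_norm and sparse can be checked with a short sketch. The sizes, the choice of index 0 for padding, and the fill value 2.0 are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn

# padding_idx: row 0 is reserved for padding in this sketch.
embedding = nn.Embedding(num_embeddings=6, embedding_dim=4, padding_idx=0)
pad_row_is_zero = bool(torch.all(embedding.weight[0] == 0))

# Two sentences padded with index 0 to a common length of 4.
batch = torch.tensor([[3, 1, 0, 0], [2, 4, 5, 1]])
embedding(batch).sum().backward()

# The padding row receives no gradient, so training leaves it at zero.
pad_grad_is_zero = bool(torch.all(embedding.weight.grad[0] == 0))

# max_norm: rows that are looked up are rescaled in place so that their
# norm (2-norm by default, see norm_type) does not exceed max_norm.
clipped = nn.Embedding(num_embeddings=6, embedding_dim=4, max_norm=1.0)
with torch.no_grad():
    clipped.weight.fill_(2.0)  # every row now has 2-norm 4.0
vectors = clipped(torch.tensor([0, 1]))
norms = vectors.norm(p=2, dim=1)  # both values are now at most about 1.0

# sparse=True: the gradient of the weight matrix is a sparse tensor.
sparse_embedding = nn.Embedding(num_embeddings=6, embedding_dim=4, sparse=True)
sparse_embedding(torch.tensor([1, 2])).sum().backward()
grad_is_sparse = sparse_embedding.weight.grad.is_sparse

print(pad_row_is_zero, pad_grad_is_zero, norms, grad_is_sparse)
```

Note that max_norm modifies the weight matrix itself during the forward pass, and only for the rows that are actually looked up.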
