Keras: A Detailed Guide to Using the Embedding Layer

阿新 • Published: 2020-06-11

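I recently worked on an NLP task at work and used the word embeddings provided by the Embedding layer in Keras.

This article introduces the Embedding layer in Keras.

Chinese documentation: https://keras.io/zh/layers/embeddings/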

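The parameters are as follows:

[Figure: parameter list of the Embedding layer]

The most important parameters are input_dim (the vocabulary size) and output_dim (the dimension of each word vector), plus the optional input_length (the length of each input sequence).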
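To make the roles of these three parameters concrete, the following NumPy sketch mimics what the layer computes: a row lookup into a table of shape (input_dim, output_dim). The table values and the helper name `embed` are illustrative assumptions, not Keras internals.

```python
import numpy as np

input_dim = 1000    # vocabulary size: valid indices are 0 .. 999
output_dim = 64     # length of each word vector
input_length = 10   # number of indices per input sequence

# Stand-in for the layer's weight matrix (randomly initialized here).
rng = np.random.default_rng(0)
table = rng.uniform(-0.05, 0.05, size=(input_dim, output_dim))

def embed(indices):
    """Look up one row of `table` per integer index."""
    indices = np.asarray(indices)
    if indices.min() < 0 or indices.max() >= input_dim:
        raise ValueError("every index must lie in [0, input_dim)")
    return table[indices]

batch = rng.integers(0, input_dim, size=(32, input_length))
vectors = embed(batch)
print(vectors.shape)  # (32, 10, 64)
```

Each integer in the batch is replaced by a vector of length output_dim, which is why the result gains one trailing dimension.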

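Initializer settings are summarized separately further below.

The demo uses pretrained word vectors (word2vec trained on a Baidu Baike corpus) as a reference.

A demo of building the embedding matrix: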

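import numpy as np

def create_embedding(word_index, num_words, word2vec_model):
    # Row i holds the vector of the word whose index is i.
    embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
    for word, i in word_index.items():
        if i >= num_words:
            continue  # index falls outside the matrix
        try:
            embedding_matrix[i] = word2vec_model[word]
        except KeyError:
            continue  # words missing from the pretrained model keep a zero row
    return embedding_matrix

# word_index: the vocabulary (maps each word to an integer index)
# num_words: vocabulary size + 1
# word2vec_model: the word-vector model
# EMBEDDING_DIM: module-level constant, the dimension of the pretrained vectors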
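A quick way to check the function is to run it against a toy vocabulary. In the sketch below a plain dict stands in for the word2vec model (an assumption made so the example runs without gensim), and EMBEDDING_DIM is set to 3 for readability.

```python
import numpy as np

EMBEDDING_DIM = 3

def create_embedding(word_index, num_words, word2vec_model):
    embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
    for word, i in word_index.items():
        if i >= num_words:
            continue
        try:
            embedding_matrix[i] = word2vec_model[word]
        except KeyError:
            continue
    return embedding_matrix

# Hypothetical data: indices start at 1 so that 0 can be reserved for padding.
word_index = {"keras": 1, "embedding": 2, "unseen": 3}
toy_vectors = {
    "keras": np.array([0.1, 0.2, 0.3]),
    "embedding": np.array([0.4, 0.5, 0.6]),
}
num_words = len(word_index) + 1

embedding_matrix = create_embedding(word_index, num_words, toy_vectors)
print(embedding_matrix)
```

Row 0 (padding) and row 3 (a word absent from the toy vectors) stay all zeros, which is the behavior the "vocabulary size + 1" convention relies on.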

How to load the word-vector model:

import gensim

def pre_load_embedding_model(model_file):
    # For a model saved with gensim's own save(): gensim.models.Word2Vec.load(model_file)
    # For the binary word2vec format, pass binary=True to load_word2vec_format.
    model = gensim.models.KeyedVectors.load_word2vec_format(model_file)
    return model
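For reference, the plain-text word2vec format that load_word2vec_format reads by default starts with a header line holding the vocabulary size and the vector dimension, followed by one line per word. The sketch below parses that layout with the standard library only; parse_word2vec_text is a hypothetical helper for illustration and is not part of gensim.

```python
import io

def parse_word2vec_text(stream):
    """Parse the text word2vec format into a {word: [floats]} dict."""
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in stream:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    if len(vectors) != vocab_size:
        raise ValueError(
            "header promised %d words, found %d" % (vocab_size, len(vectors)))
    return vectors

# A two-word, three-dimensional sample in the text format.
sample = io.StringIO("2 3\nkeras 0.1 0.2 0.3\nembedding 0.4 0.5 0.6\n")
vectors = parse_word2vec_text(sample)
print(vectors["keras"])
```

For real files gensim remains the better choice, since it also handles the binary format and large vocabularies efficiently.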

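Setting up the Embedding layer in the model (note the arguments, the input to the Input layer, and the initializer):

from keras.initializers import Constant
from keras.layers import Embedding, Input

embedding_matrix = create_embedding(word_index, num_words, word2vec_model)
# trainable=False keeps the pretrained vectors frozen during training.
embedding_layer = Embedding(num_words, EMBEDDING_DIM,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length=MAX_SEQUENCE_LENGTH, trainable=False)
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)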
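The Input layer above expects integer sequences of exactly MAX_SEQUENCE_LENGTH indices. Below is a minimal sketch of producing such sequences from tokenized text, assuming index 0 is reserved for padding and unknown words. texts_to_padded is a hypothetical helper that pads and truncates at the end; Keras ships its own Tokenizer and pad_sequences utilities for real use.

```python
MAX_SEQUENCE_LENGTH = 5

def texts_to_padded(token_lists, word_index, max_len):
    """Map tokens to indices, then truncate or right-pad with 0 to max_len."""
    padded = []
    for tokens in token_lists:
        ids = [word_index.get(tok, 0) for tok in tokens][:max_len]
        padded.append(ids + [0] * (max_len - len(ids)))
    return padded

word_index = {"keras": 1, "embedding": 2, "layer": 3}
batch = texts_to_padded(
    [["keras", "embedding", "layer"], ["embedding", "oov"]],
    word_index,
    MAX_SEQUENCE_LENGTH,
)
print(batch)  # [[1, 2, 3, 0, 0], [2, 0, 0, 0, 0]]
```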

Initializing the Embedding layer

Two ways to set the initial values of a Keras Embedding layer

Random initialization of the Embedding

from keras.models import Sequential
from keras.layers import Embedding
import numpy as np
 
model = Sequential()
model.add(Embedding(1000,64,input_length=10))
# the model will take as input an integer matrix of size (batch,input_length).
# the largest integer (i.e. word index) in the input should be no larger than 999 (vocabulary size).
# now model.output_shape == (None,10,64),where None is the batch dimension.
 
input_array = np.random.randint(1000,size=(32,10))
 
model.compile('rmsprop','mse')
output_array = model.predict(input_array)
print(output_array)
assert output_array.shape == (32, 10, 64)

Specifying the initial embedding values with the weights argument

import numpy as np
import keras
 
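m = keras.models.Sequential()
"""
Initial weights can be supplied through the weights argument.
The Embedding layer is not differentiable with respect to its integer input,
so gradients stop here. Placing an Embedding in the middle of a network is
therefore pointless; it can only serve as the first layer.
Note that the path from weights to embeddings is convoluted, and that weights is a list.
"""
embedding = keras.layers.Embedding(
    input_dim=3,
    output_dim=2,
    input_length=1,
    weights=[np.arange(3 * 2).reshape((3, 2))],
    mask_zero=True)
m.add(embedding)  # add() automatically calls the embedding's build function
print(keras.backend.get_value(embedding.embeddings))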
m.compile(keras.optimizers.RMSprop(),keras.losses.mse)
print(m.predict([1,2,1,0]))
print(m.get_layer(index=0).get_weights())
print(keras.backend.get_value(embedding.embeddings))
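As a sanity check on the printed values, the lookup the layer performs can be reproduced in NumPy. This sketch is independent of Keras; it only assumes the same 3×2 weight matrix and the same four input indices, reshaped to (4, 1) to match input_length=1.

```python
import numpy as np

weights = np.arange(3 * 2).reshape((3, 2))   # rows: [0 1], [2 3], [4 5]
indices = np.array([1, 2, 1, 0]).reshape((4, 1))

# Fancy indexing picks one row of `weights` per index.
looked_up = weights[indices]
print(looked_up.shape)        # (4, 1, 2)
print(looked_up[:, 0, :])
```

As far as the lookup itself is concerned, mask_zero=True should not change these values; it only makes the layer emit a mask marking index 0 as padding for downstream layers.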

The second way to set initial embedding values: use an initializer

import numpy as np
import keras
 
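m = keras.models.Sequential()
"""
Besides the weights argument, initial values can be set with an initializer.
The Embedding layer is not differentiable with respect to its integer input,
so gradients stop here; an Embedding can only serve as the first layer.
This is the second way to set the embedding weights: a constant initializer.
"""
embedding = keras.layers.Embedding(
    input_dim=3,
    output_dim=2,
    embeddings_initializer=keras.initializers.constant(
        np.arange(3 * 2, dtype=np.float32).reshape((3, 2))))
m.add(embedding)
print(keras.backend.get_value(embedding.embeddings))
m.compile(keras.optimizers.RMSprop(), keras.losses.mse)
# The input list is truncated in the source; [1, 2] is an assumed stand-in.
print(m.predict(np.array([1, 2])))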
print(m.get_layer(index=0).get_weights())
print(keras.backend.get_value(embedding.embeddings))

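The real difficulty is working out how weights ends up in the embedding.embeddings tensor.

Embedding is a layer that inherits from Layer. Layer accepts a weights argument, which is a list whose elements are all NumPy arrays. When the Layer constructor runs, the weights argument is saved in the _initial_weights attribute.

The Layer class in base_layer.py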

  if 'weights' in kwargs:
   self._initial_weights = kwargs['weights']
  else:
   self._initial_weights = None

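When the Embedding layer is added to a model and connected to the previous layer, layer(previous_output) is called. Here layer is the Embedding instance. Embedding inherits from Layer and does not override __call__(); Layer implements it.

The __call__ method of the parent class Layer calls the subclass's call() method to obtain the result.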
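The division of labor between __call__ and call() can be sketched in plain Python. ToyLayer and Doubler are made-up classes for illustration; the real Layer.__call__ does far more bookkeeping.

```python
class ToyLayer:
    """Base class: __call__ is the public entry point and wraps call()."""

    def __call__(self, inputs):
        # Shared pre- and post-processing would live here.
        return self.call(inputs)

    def call(self, inputs):
        raise NotImplementedError("subclasses implement call()")


class Doubler(ToyLayer):
    # Only call() is overridden; __call__ is inherited from ToyLayer.
    def call(self, inputs):
        return [2 * x for x in inputs]


layer = Doubler()
result = layer([1, 2, 3])   # invokes ToyLayer.__call__, which calls Doubler.call
print(result)  # [2, 4, 6]
```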

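So what ultimately runs is Layer.__call__(). This method automatically checks whether the layer has already been built (via the boolean self.built).

The Layer.__call__ function is very important.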

 def __call__(self,inputs,**kwargs):
  """Wrapper around self.call(),for handling internal references.
  If a Keras tensor is passed:
   - We call self._add_inbound_node().
   - If necessary,we `build` the layer to match
    the _keras_shape of the input(s).
   - We update the _keras_shape of every input tensor with
    its new shape (obtained via self.compute_output_shape).
    This is done as part of _add_inbound_node().
   - We update the _keras_history of the output tensor(s)
    with the current layer.
    This is done as part of _add_inbound_node().
  # Arguments
   inputs: Can be a tensor or list/tuple of tensors.
   **kwargs: Additional keyword arguments to be passed to `call()`.
  # Returns
   Output of the layer's `call` method.
  # Raises
   ValueError: in case the layer is missing shape information
    for its `build` call.
  """
  if isinstance(inputs,list):
   inputs = inputs[:]
  with K.name_scope(self.name):
   # Handle laying building (weight creating,input spec locking).
   if not self.built:  # if not built yet, run build first, then call()
    # Raise exceptions in case the input is not compatible
    # with the input_spec specified in the layer constructor.
    self.assert_input_compatibility(inputs)
 
    # Collect input shapes to build layer.
    input_shapes = []
    for x_elem in to_list(inputs):
     if hasattr(x_elem,'_keras_shape'):
      input_shapes.append(x_elem._keras_shape)
     elif hasattr(K,'int_shape'):
      input_shapes.append(K.int_shape(x_elem))
     else:
      raise ValueError('You tried to call layer "' +
           self.name +
           '". This layer has no information'
           ' about its expected input shape,'
           'and thus cannot be built. '
           'You can build it manually via: '
           '`layer.build(batch_input_shape)`')
    self.build(unpack_singleton(input_shapes))
    self.built = True  # somewhat redundant: self.build has already set built to True
 
    # Load weights that were specified at layer instantiation.
    if self._initial_weights is not None:  # if weights was passed, assign it to the variables; this overwrites the values set by self.build above
     self.set_weights(self._initial_weights)
 
   # Raise exceptions in case the input is not compatible
   # with the input_spec set at build time.
   self.assert_input_compatibility(inputs)
 
   # Handle mask propagation.
   previous_mask = _collect_previous_mask(inputs)
   user_kwargs = copy.copy(kwargs)
   if not is_all_none(previous_mask):
    # The previous layer generated a mask.
    if has_arg(self.call,'mask'):
     if 'mask' not in kwargs:
      # If mask is explicitly passed to __call__,
      # we should override the default mask.
      kwargs['mask'] = previous_mask
   # Handle automatic shape inference (only useful for Theano).
   input_shape = _collect_input_shape(inputs)
 
   # Actually call the layer,
   # collecting output(s), mask(s), and shape(s).
   output = self.call(inputs,**kwargs)
   output_mask = self.compute_mask(inputs,previous_mask)
 
   # If the layer returns tensors from its inputs, unmodified,
   # we copy them to avoid loss of tensor metadata.
   output_ls = to_list(output)
   inputs_ls = to_list(inputs)
   output_ls_copy = []
   for x in output_ls:
    if x in inputs_ls:
     x = K.identity(x)
    output_ls_copy.append(x)
   output = unpack_singleton(output_ls_copy)
 
   # Inferring the output shape is only relevant for Theano.
   if all([s is not None
     for s in to_list(input_shape)]):
    output_shape = self.compute_output_shape(input_shape)
   else:
    if isinstance(input_shape,list):
     output_shape = [None for _ in input_shape]
    else:
     output_shape = None
 
   if (not isinstance(output_mask,(list,tuple)) and
     len(output_ls) > 1):
    # Augment the mask to match the length of the output.
    output_mask = [output_mask] * len(output_ls)
 
   # Add an inbound node to the layer,so that it keeps track
   # of the call and of all new variables created during the call.
   # This also updates the layer history of the output tensor(s).
   # If the input tensor(s) had not previous Keras history,
   # this does nothing.
   self._add_inbound_node(input_tensors=inputs,
                          output_tensors=output,
                          input_masks=previous_mask,
                          output_masks=output_mask,
                          input_shapes=input_shape,
                          output_shapes=output_shape,
                          arguments=user_kwargs)
 
   # Apply activity regularizer if any:
   if (hasattr(self,'activity_regularizer') and
     self.activity_regularizer is not None):
    with K.name_scope('activity_regularizer'):
     regularization_losses = [
      self.activity_regularizer(x)
      for x in to_list(output)]
    self.add_loss(regularization_losses,inputs=to_list(inputs))
  return output

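If the layer has not been built yet, the build() function of the Embedding class is called automatically. Embedding.build() pays no attention to weights. If no initializer was passed in, self.embeddings_initializer falls back to random initialization.

If an initializer was passed in, the embeddings are already set to the intended values at this step.

If both embeddings_initializer and weights are passed in, the weights argument later overwrites Embedding.embeddings.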
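The ordering described above (store weights, build with the initializer, then overwrite with weights) can be mimicked with a small NumPy class. ToyEmbedding is a made-up stand-in written for this sketch; it copies only the control flow, not the real Keras implementation.

```python
import numpy as np

class ToyEmbedding:
    def __init__(self, input_dim, output_dim, initializer=None, weights=None):
        self.input_dim = input_dim
        self.output_dim = output_dim
        # Fall back to random values when no initializer is given.
        self.initializer = initializer or (
            lambda shape: np.random.uniform(-0.05, 0.05, size=shape))
        self._initial_weights = weights   # stored, not applied yet
        self.built = False

    def build(self):
        # build() only consults the initializer, never `weights`.
        self.embeddings = self.initializer((self.input_dim, self.output_dim))
        self.built = True

    def set_weights(self, weights):
        self.embeddings = np.asarray(weights[0], dtype=float)

    def __call__(self, indices):
        if not self.built:
            self.build()
            if self._initial_weights is not None:
                # Applied after build(), so it overwrites the initializer's values.
                self.set_weights(self._initial_weights)
        return self.embeddings[np.asarray(indices)]


ones = lambda shape: np.ones(shape)
explicit = np.arange(6).reshape((3, 2))

only_init = ToyEmbedding(3, 2, initializer=ones)
both = ToyEmbedding(3, 2, initializer=ones, weights=[explicit])
print(only_init([1]))   # row 1 of the all-ones table
print(both([1]))        # row 1 of `explicit`: weights won over the initializer
```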

The build function of the Embedding class in embeddings.py

 def build(self,input_shape):
  self.embeddings = self.add_weight(
   shape=(self.input_dim, self.output_dim),
   initializer=self.embeddings_initializer,
   name='embeddings',
   regularizer=self.embeddings_regularizer,
   constraint=self.embeddings_constraint,
   dtype=self.dtype)
  self.built = True

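In summary, assigning values to a Layer's variables through weights is a fairly general mechanism in Keras, but it is not very intuitive. Keras encourages using an explicit initializer and avoiding weights where possible.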
