Keras: A Detailed Guide to Using the Embedding Layer
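I recently worked on some NLP tasks at work, using the word embeddings provided by the Embedding layer in Keras.
This post gives an introduction to the Embedding layer in Keras.
Chinese documentation: https://keras.io/zh/layers/embeddings/
Among the layer's parameters, the important ones are input_dim and output_dim, plus the optional input_length.
The initializer-related parameters are summarized separately further down.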
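To make the roles of input_dim, output_dim and input_length concrete, here is a minimal sketch in plain Python (no Keras required) of the table lookup an embedding layer performs. The helper name `embed` and the toy table are illustrative assumptions and not part of the Keras API.

```python
import random

def embed(batch, table):
    """Look up each integer index in `table` (hypothetical helper).

    batch: list of sequences of ints, shape (batch, input_length)
    table: list of input_dim rows, each of length output_dim
    returns nested lists of shape (batch, input_length, output_dim)
    """
    input_dim = len(table)
    out = []
    for seq in batch:
        row = []
        for idx in seq:
            # Every index must lie in [0, input_dim); otherwise no row exists for it.
            if not 0 <= idx < input_dim:
                raise ValueError(f"index {idx} is outside [0, {input_dim})")
            row.append(table[idx])
        out.append(row)
    return out

input_dim, output_dim, input_length = 1000, 64, 10
random.seed(0)
# Randomly initialised table, standing in for the layer's trainable weights.
table = [[random.uniform(-0.05, 0.05) for _ in range(output_dim)]
         for _ in range(input_dim)]
batch = [[random.randrange(input_dim) for _ in range(input_length)]
         for _ in range(32)]

vectors = embed(batch, table)
shape = (len(vectors), len(vectors[0]), len(vectors[0][0]))
print(shape)  # (32, 10, 64)
```

Each integer is replaced by a row of length output_dim, which is why the output gains one extra dimension compared with the input.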
The demo below uses pretrained vectors (word2vec trained on a Baidu Baike corpus) as a reference.
Demo of building the embedding matrix:
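import numpy as np

def create_embedding(word_index, num_words, word2vec_model):
    # word_index: vocabulary dict mapping each word to its integer index
    # num_words: vocabulary size + 1
    # word2vec_model: the loaded word-vector model
    # EMBEDDING_DIM must be defined elsewhere and match the pretrained vector size.
    embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
    for word, i in word_index.items():
        try:
            embedding_matrix[i] = word2vec_model[word]
        except (KeyError, IndexError):
            # Word missing from the pretrained model, or index beyond num_words:
            # leave the row as all zeros.
            continue
    return embedding_matrix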
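The relationship between word_index and num_words is easy to get wrong, so here is a small sketch of it in plain Python. It assumes indices start at 1 and are ordered by descending frequency (as I understand the Keras Tokenizer to do); `build_word_index` and the toy `pretrained` dict are hypothetical stand-ins, not the author's code.

```python
from collections import Counter

def build_word_index(texts):
    """Map each word to an index >= 1, most frequent word first (hypothetical helper)."""
    counts = Counter(word for text in texts for word in text.split())
    # most_common() sorts by descending count; ties keep first-seen order.
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}

texts = ["keras embedding layer", "embedding layer demo", "embedding"]
word_index = build_word_index(texts)
num_words = len(word_index) + 1  # +1 because index 0 is never assigned to a word

# Toy stand-in for the pretrained vectors: any mapping word -> vector works here.
pretrained = {"embedding": [0.1, 0.2], "layer": [0.3, 0.4]}
missing = [w for w in word_index if w not in pretrained]

print(word_index)  # {'embedding': 1, 'layer': 2, 'keras': 3, 'demo': 4}
print(num_words)   # 5
print(missing)     # ['keras', 'demo']
```

Words listed in `missing` are the ones that would keep an all-zero row in the embedding matrix, so checking this list before training shows how much of the vocabulary the pretrained vectors actually cover.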
How to load the word-vector model:
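import gensim

def pre_load_embedding_model(model_file):
    # For a model saved with gensim's own save():
    # model = gensim.models.Word2Vec.load(model_file)
    # For a file in binary word2vec format:
    # model = gensim.models.KeyedVectors.load_word2vec_format(model_file, binary=True)
    model = gensim.models.KeyedVectors.load_word2vec_format(model_file)
    return model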
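For orientation, the text variant of the word2vec format that load_word2vec_format reads starts with a header line holding the vocabulary size and the vector dimension, followed by one line per word. The sketch below parses that layout with the standard library only; `read_word2vec_text` is a hypothetical name for illustration, and gensim should still be used for real files.

```python
import io

def read_word2vec_text(handle):
    """Parse word2vec text format into {word: [floats]} (illustrative only)."""
    header = handle.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    if len(vectors) != vocab_size:
        raise ValueError("header vocabulary size does not match the data")
    return vectors

# Two words, three dimensions each.
sample = io.StringIO("2 3\nkeras 0.1 0.2 0.3\nembedding 0.4 0.5 0.6\n")
vectors = read_word2vec_text(sample)
print(vectors["embedding"])  # [0.4, 0.5, 0.6]
```

The dimension in the header is the value EMBEDDING_DIM has to match.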
Setting up the Embedding layer in the model (pay attention to the arguments, the input to the Input layer, and the initializer):
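from keras.layers import Embedding, Input
from keras.initializers import Constant

# create_embedding takes three arguments: word_index, num_words, word2vec_model
embedding_matrix = create_embedding(word_index, num_words, word2vec_model)
embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)  # keep the pretrained vectors frozen
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)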
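Because the Input layer above has a fixed shape of (MAX_SEQUENCE_LENGTH,), every index sequence must be brought to exactly that length before it is fed in. A minimal sketch of that step follows; `pad_to_length` is a hypothetical helper that pads and truncates on the left, which I believe matches the defaults of Keras's pad_sequences, though that is an assumption rather than something the original states.

```python
def pad_to_length(sequences, maxlen, value=0):
    """Left-pad or left-truncate each sequence to exactly `maxlen` (hypothetical helper)."""
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]  # keep only the last maxlen items
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded

MAX_SEQUENCE_LENGTH = 5
batch = pad_to_length([[3, 1], [4, 1, 5, 9, 2, 6]], MAX_SEQUENCE_LENGTH)
print(batch)  # [[0, 0, 0, 3, 1], [1, 5, 9, 2, 6]]
```

Padding with 0 fits the convention that index 0 is not assigned to any real word, which is also why num_words is the vocabulary size plus one.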
Initializing the Embedding layer
Two ways to set the initial values of a Keras Embedding
Random initialization of the Embedding
from keras.models import Sequential
from keras.layers import Embedding
import numpy as np

model = Sequential()
model.add(Embedding(1000, 64, input_length=10))
# The model takes as input an integer matrix of size (batch, input_length).
# The largest integer (i.e. word index) in the input should be no larger than 999 (vocabulary size).
# Now model.output_shape == (None, 10, 64), where None is the batch dimension.

input_array = np.random.randint(1000, size=(32, 10))

model.compile('rmsprop', 'mse')
output_array = model.predict(input_array)
print(output_array)
# 32 samples, 10 indices each, one 64-dimensional vector per index
assert output_array.shape == (32, 10, 64)
Using the weights argument to specify the initial embedding values
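import numpy as np
import keras

m = keras.models.Sequential()
"""
The initial weights can be specified through the weights argument.
The Embedding lookup is not differentiable with respect to its integer input,
so gradients stop here; placing an Embedding in the middle of a network is
pointless, and it can only be used as the first layer.
Note that the way weights gets bound to embeddings is fairly involved;
weights is a list.
"""
embedding = keras.layers.Embedding(input_dim=3, output_dim=2, input_length=1,
                                   weights=[np.arange(3 * 2).reshape((3, 2))],
                                   mask_zero=True)
m.add(embedding)  # add() automatically calls the embedding's build function
print(keras.backend.get_value(embedding.embeddings))
m.compile(keras.optimizers.RMSprop(), keras.losses.mse)
# Four samples of input_length 1, passed as an explicit (4, 1) array
print(m.predict(np.array([[1], [2], [1], [0]])))
print(m.get_layer(index=0).get_weights())
print(keras.backend.get_value(embedding.embeddings))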
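As a worked check of what the example above should print, the lookup can be reproduced by hand in plain Python. This is a sketch of the expected values only, assuming the layer behaves as a plain table lookup and that mask_zero=True marks index 0 as padding without changing the looked-up row.

```python
# Same values as np.arange(3 * 2).reshape((3, 2)), written out by hand.
table = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]

inputs = [[1], [2], [1], [0]]  # four samples, input_length = 1
outputs = [[table[idx] for idx in seq] for seq in inputs]
print(outputs)
# [[[2.0, 3.0]], [[4.0, 5.0]], [[2.0, 3.0]], [[0.0, 1.0]]]

# With mask_zero=True, index 0 is treated as padding: the mask is `index != 0`.
mask = [[idx != 0 for idx in seq] for seq in inputs]
print(mask)  # [[True], [True], [True], [False]]
```

The result has shape (4, 1, 2): one 2-dimensional row per input index.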
The second way to set initial values for the embedding: using an initializer
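import numpy as np
import keras

m = keras.models.Sequential()
"""
The initial weights can be specified through the weights argument.
The Embedding lookup is not differentiable with respect to its integer input,
so gradients stop here and it can only be used as the first layer.
This is the second way to set the embedding weights: a constant initializer.
"""
# output_dim, the loss and the predict input were lost in the original listing;
# they are assumed here to mirror the weights example.
embedding = keras.layers.Embedding(
    input_dim=3, output_dim=2,
    embeddings_initializer=keras.initializers.constant(
        np.arange(3 * 2, dtype=np.float32).reshape((3, 2))))
m.add(embedding)
print(keras.backend.get_value(embedding.embeddings))
m.compile(keras.optimizers.RMSprop(), keras.losses.mse)
print(m.predict(np.array([[1], [2]])))
print(m.get_layer(index=0).get_weights())
print(keras.backend.get_value(embedding.embeddings))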
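The tricky part is working out how weights ends up inside the embedding.embeddings tensor.
Embedding is a layer that inherits from Layer. Layer accepts a weights argument, which is a list whose elements are all NumPy arrays. When the Layer constructor runs, the weights argument is stored in the _initial_weights attribute.
The Layer class in base_layer.py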
if 'weights' in kwargs:
    self._initial_weights = kwargs['weights']
else:
    self._initial_weights = None
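When the Embedding layer is added to a model and connected to the previous layer, layer(previous_layer) is invoked. Here layer is the Embedding instance. Embedding is a class that inherits from Layer; it does not override __call__(), which is implemented by Layer.
The parent class's __call__ method calls the subclass's call() method to obtain the result.
So what ultimately runs is Layer.__call__(). Inside this method, the layer automatically checks whether it has already been built (via the boolean self.built).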
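The dispatch described above (an inherited __call__ that builds the layer on first use and then delegates to the subclass's call()) can be sketched with two toy classes. `MiniLayer` and `Doubler` are hypothetical names used only for illustration; this is not the Keras implementation.

```python
class MiniLayer:
    """Toy stand-in for Layer: __call__ builds lazily, then delegates to call()."""

    def __init__(self):
        self.built = False
        self.build_count = 0

    def build(self, input_shape):
        self.build_count += 1
        self.built = True

    def call(self, inputs):
        raise NotImplementedError

    def __call__(self, inputs):
        if not self.built:              # only true on the first call
            self.build((len(inputs),))
        return self.call(inputs)        # subclass logic runs here


class Doubler(MiniLayer):
    # Overrides call() only; __call__ is inherited, as with Embedding.
    def call(self, inputs):
        return [2 * x for x in inputs]


layer = Doubler()
first = layer([1, 2, 3])
second = layer([4, 5])
print(first, second, layer.build_count)  # [2, 4, 6] [8, 10] 1
```

build() runs exactly once even though the layer is called twice, which is the behaviour the self.built flag exists to guarantee.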
The Layer.__call__ function is very important.
def __call__(self, inputs, **kwargs):
    """Wrapper around self.call(), for handling internal references.

    If a Keras tensor is passed:
        - We call self._add_inbound_node().
        - If necessary, we `build` the layer to match
            the _keras_shape of the input(s).
        - We update the _keras_shape of every input tensor with
            its new shape (obtained via self.compute_output_shape).
            This is done as part of _add_inbound_node().
        - We update the _keras_history of the output tensor(s)
            with the current layer.
            This is done as part of _add_inbound_node().

    # Arguments
        inputs: Can be a tensor or list/tuple of tensors.
        **kwargs: Additional keyword arguments to be passed to `call()`.

    # Returns
        Output of the layer's `call` method.

    # Raises
        ValueError: in case the layer is missing shape information
            for its `build` call.
    """
    if isinstance(inputs, list):
        inputs = inputs[:]
    with K.name_scope(self.name):
        # Handle laying building (weight creating, input spec locking).
        if not self.built:  # not built yet: run build first, then call
            # Raise exceptions in case the input is not compatible
            # with the input_spec specified in the layer constructor.
            self.assert_input_compatibility(inputs)

            # Collect input shapes to build layer.
            input_shapes = []
            for x_elem in to_list(inputs):
                if hasattr(x_elem, '_keras_shape'):
                    input_shapes.append(x_elem._keras_shape)
                elif hasattr(K, 'int_shape'):
                    input_shapes.append(K.int_shape(x_elem))
                else:
                    raise ValueError('You tried to call layer "' +
                                     self.name +
                                     '". This layer has no information'
                                     ' about its expected input shape, '
                                     'and thus cannot be built. '
                                     'You can build it manually via: '
                                     '`layer.build(batch_input_shape)`')
            self.build(unpack_singleton(input_shapes))
            # Somewhat redundant: self.build has already set built to True.
            self.built = True

            # Load weights that were specified at layer instantiation.
            # If weights was passed in, assign it to the layer's variables;
            # this overwrites the values assigned in self.build above.
            if self._initial_weights is not None:
                self.set_weights(self._initial_weights)

        # Raise exceptions in case the input is not compatible
        # with the input_spec set at build time.
        self.assert_input_compatibility(inputs)

        # Handle mask propagation.
        previous_mask = _collect_previous_mask(inputs)
        user_kwargs = copy.copy(kwargs)
        if not is_all_none(previous_mask):
            # The previous layer generated a mask.
            if has_arg(self.call, 'mask'):
                if 'mask' not in kwargs:
                    # If mask is explicitly passed to __call__,
                    # we should override the default mask.
                    kwargs['mask'] = previous_mask
        # Handle automatic shape inference (only useful for Theano).
        input_shape = _collect_input_shape(inputs)

        # Actually call the layer,
        # collecting output(s), mask(s), and shape(s).
        output = self.call(inputs, **kwargs)
        output_mask = self.compute_mask(inputs, previous_mask)

        # If the layer returns tensors from its inputs, unmodified,
        # we copy them to avoid loss of tensor metadata.
        output_ls = to_list(output)
        inputs_ls = to_list(inputs)
        output_ls_copy = []
        for x in output_ls:
            if x in inputs_ls:
                x = K.identity(x)
            output_ls_copy.append(x)
        output = unpack_singleton(output_ls_copy)

        # Inferring the output shape is only relevant for Theano.
        if all([s is not None for s in to_list(input_shape)]):
            output_shape = self.compute_output_shape(input_shape)
        else:
            if isinstance(input_shape, list):
                output_shape = [None for _ in input_shape]
            else:
                output_shape = None

        if (not isinstance(output_mask, (list, tuple)) and
                len(output_ls) > 1):
            # Augment the mask to match the length of the output.
            output_mask = [output_mask] * len(output_ls)

        # Add an inbound node to the layer, so that it keeps track
        # of the call and of all new variables created during the call.
        # This also updates the layer history of the output tensor(s).
        # If the input tensor(s) had not previous Keras history,
        # this does nothing.
        self._add_inbound_node(input_tensors=inputs,
                               output_tensors=output,
                               input_masks=previous_mask,
                               output_masks=output_mask,
                               input_shapes=input_shape,
                               output_shapes=output_shape,
                               arguments=user_kwargs)

        # Apply activity regularizer if any:
        if (hasattr(self, 'activity_regularizer') and
                self.activity_regularizer is not None):
            with K.name_scope('activity_regularizer'):
                regularization_losses = [
                    self.activity_regularizer(x)
                    for x in to_list(output)]
            self.add_loss(regularization_losses,
                          inputs=to_list(inputs))
    return output
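If the layer has not been built yet, the build() function of the Embedding class is called automatically. Embedding.build() pays no attention to weights. If no embeddings_initializer was passed in, self.embeddings_initializer falls back to the default random initialization ('uniform').
If an initializer was passed in, the embeddings already receive their intended values at this step.
If both embeddings_initializer and weights are passed in, the weights argument is applied afterwards and overwrites Embedding.embeddings.
The build function of the Embedding class in embeddings.py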
def build(self, input_shape):
    self.embeddings = self.add_weight(
        shape=(self.input_dim, self.output_dim),
        initializer=self.embeddings_initializer,
        name='embeddings',
        regularizer=self.embeddings_regularizer,
        constraint=self.embeddings_constraint,
        dtype=self.dtype)
    self.built = True
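The order of operations described above (build() applies the initializer, and any weights passed to the constructor are applied afterwards and win) can be reproduced with a toy class. `MiniEmbedding` and `ones` are hypothetical names, and the zero-filled default stands in for Keras's random default purely so the output is deterministic.

```python
class MiniEmbedding:
    """Toy model of the initialisation order; not the real Keras class."""

    def __init__(self, input_dim, output_dim, initializer=None, weights=None):
        self.input_dim, self.output_dim = input_dim, output_dim
        # Without an explicit initializer, fall back to a default (zeros here).
        self.initializer = initializer or (
            lambda rows, cols: [[0.0] * cols for _ in range(rows)])
        self._initial_weights = weights  # stored, not applied yet
        self.built = False

    def build(self):
        # build() only consults the initializer; it never looks at weights.
        self.embeddings = self.initializer(self.input_dim, self.output_dim)
        self.built = True

    def set_weights(self, weights):
        self.embeddings = [list(row) for row in weights[0]]

    def __call__(self, indices):
        if not self.built:
            self.build()
            if self._initial_weights is not None:
                # Applied after build(), so it overwrites the initializer's values.
                self.set_weights(self._initial_weights)
        return [self.embeddings[i] for i in indices]


def ones(rows, cols):
    return [[1.0] * cols for _ in range(rows)]

only_init = MiniEmbedding(3, 2, initializer=ones)
both = MiniEmbedding(3, 2, initializer=ones, weights=[[[0, 1], [2, 3], [4, 5]]])
print(only_init([1]))  # [[1.0, 1.0]]
print(both([1]))       # [[2, 3]]
```

Note that weights is a list containing one table, mirroring the list-of-arrays convention of the real weights argument.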
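In summary, assigning values to a Layer's variables through weights is a fairly general mechanism in Keras, but it is not very intuitive. Keras encourages using an explicit initializer wherever possible and leaving weights alone.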