tf.nn.embedding_lookup in TensorFlow
import tensorflow as tf

src_vocab_size = 10  # vocabulary size
src_embed_size = 5   # embedding dimension
source = [1, 3]      # token ids to look up

with tf.variable_scope("encoder"):
    # Trainable embedding matrix of shape [src_vocab_size, src_embed_size]
    embedding_encoder = tf.get_variable(
        "embedding_encoder", [src_vocab_size, src_embed_size], tf.float32)
    # Gather the rows of the matrix indexed by `source`
    encoder_emb_inp = tf.nn.embedding_lookup(embedding_encoder, source)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    # Print the full embedding matrix
    emb_mat = sess.run(embedding_encoder)
    for line in emb_mat:
        print(line)
    # Print the looked-up rows (rows 1 and 3)
    en_input = sess.run(encoder_emb_inp)
    print()
    for line in en_input:
        print(line)
Output (the full 10x5 embedding matrix, followed by the looked-up rows for ids 1 and 3):
[ 0.56113797 0.04369807 0.18308383 -0.48125005 -0.43450889]
[-0.6047132 -0.21060479 0.40796143 -0.40531671 0.55036896]
[ 0.31311834 -0.4060598 0.36560428 0.2722581 -0.02451819]
[ 0.18635517 -0.12266624 -0.39344144 -0.1277926 -0.45468265]
[ 0.30129766 0.56903845 -0.03529584 -0.33247966 0.45404953]
[-0.58887643 0.50933784 -0.19886917 -0.03041148 -0.44376266]
[ 0.35494697 0.25374722 0.41377074 0.06932443 -0.21179438]
[ 0.10084659 -0.60172981 0.49977249 -0.28413546 -0.33590576]
[-0.01577765 0.41795093 0.43442172 0.59790486 0.58752233]
[ 0.42998117 -0.0969131 -0.34563044 0.16796118 0.62855309]
[-0.6047132 -0.21060479 0.40796143 -0.40531671 0.55036896]
[ 0.18635517 -0.12266624 -0.39344144 -0.1277926 -0.45468265]
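The last two rows are exactly rows 1 and 3 of the matrix above. The lookup is equivalent to multiplying a one-hot encoding of each id by the embedding matrix; a minimal sketch of that equivalence under the same TF 1.x API (the variable names here are illustrative, not from the original code):

import numpy as np
import tensorflow as tf

emb = tf.get_variable("emb", [10, 5], tf.float32)
ids = [1, 3]

one_hot = tf.one_hot(ids, depth=10)            # shape [2, 10]
via_matmul = tf.matmul(one_hot, emb)           # one-hot rows select matrix rows
via_lookup = tf.nn.embedding_lookup(emb, ids)  # same result, without the dense matmul

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    a, b = sess.run([via_matmul, via_lookup])
    print(np.allclose(a, b))  # True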
1. For a one-hot encoded id, the embedding operation is equivalent to the one-hot-times-matrix multiplication sketched above.
2. embedding_lookup fetches a row of the matrix, but it is not a simple static table lookup: the vector associated with each id is trainable, so the operation behaves like a fully connected layer applied to a one-hot input.
3. When using id-type features in a classification model, we want the model to memorize per-id information, but the id space is very high-dimensional and each individual item has relatively few samples, so an item embedding can be used in place of the raw id (see the sketch after this list).
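A minimal sketch of point 3, replacing a raw item id with a learned item embedding that feeds a classifier. This is an illustrative layout rather than code from the post; num_items, embed_dim, and the single dense head are assumptions:

import tensorflow as tf

num_items, embed_dim = 100000, 16               # illustrative sizes (assumptions)
item_ids = tf.placeholder(tf.int32, [None])     # batch of raw item ids
labels = tf.placeholder(tf.float32, [None, 1])  # binary labels

# Trainable item embedding table replaces the huge one-hot id feature
item_emb = tf.get_variable("item_emb", [num_items, embed_dim], tf.float32)
x = tf.nn.embedding_lookup(item_emb, item_ids)  # dense [batch, embed_dim] feature

logits = tf.layers.dense(x, 1)                  # simple classification head
loss = tf.losses.sigmoid_cross_entropy(multi_class_labels=labels, logits=logits)
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

During training, only the embedding rows for the ids that appear in each batch receive gradient updates, which is what makes the lookup behave like a trainable layer rather than a fixed table.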