
DSSM & Multi-view DSSM TensorFlow Implementation

1. Data

For DSSM, the input data consists of query pairs: a query (short sentence) and the documents shown for it. Among the shown documents, clicked ones are positive samples and un-clicked ones are negative samples, and clicks at different positions in the ranking can be weighted differently; see the paper for details.
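A minimal sketch of what one training example could look like; the field names, placeholder strings, and the NEG value are illustrative assumptions, not the original data format:

# Illustrative only: one sample pairs a query with one clicked (positive) doc title
# and NEG un-clicked (negative) doc titles from the same impression list.
NEG = 4  # number of negatives per query (assumed value)

sample = {
    "query": "query text (a short Chinese sentence)",
    "doc_positive": "title of the clicked doc",
    "doc_negatives": ["un-clicked title 1", "un-clicked title 2",
                      "un-clicked title 3", "un-clicked title 4"],
}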

I am not allowed to release my query data, so please prepare your own dataset.

2. word hashing

The original paper uses letter 3-grams (tri-letter word hashing). For Chinese I use uni-grams instead, since individual Chinese characters already carry meaning (some papers even decompose characters into strokes). Each gram is replaced by a one-hot encoding, which greatly reduces the dimensionality of the short sentences.
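A minimal sketch of the uni-gram hashing idea, assuming a character vocabulary built from the training corpus (the toy corpus and sentence below are illustrative):

def build_vocab(sentences):
    # Collect every distinct character (uni-gram) and assign it an index.
    chars = sorted({ch for s in sentences for ch in s})
    return {ch: idx for idx, ch in enumerate(chars)}

def sentence_to_indices(sentence, vocab):
    # Indices of the non-zero dimensions of the bag-of-characters vector.
    return sorted({vocab[ch] for ch in sentence if ch in vocab})

corpus = ["附近的川菜館", "川菜館推薦"]   # toy corpus, illustration only
vocab = build_vocab(corpus)              # TRIGRAM_D in the code below would be len(vocab)
print(sentence_to_indices("川菜推薦", vocab))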

3. Structure

Structure diagram:

[Figure: DSSM network structure]

  1. Map the entries (queries and documents) into low-dimensional vectors.
  2. Compute the cosine similarity between the query and the documents.

3.1 Input

TensorBoard is used for visualization here, so a name_scope is defined:

with tf.name_scope('input'):
    query_batch = tf.sparse_placeholder(tf.float32, shape=[None, TRIGRAM_D], name='QueryBatch')
    doc_positive_batch = tf.sparse_placeholder(tf.float32, shape=[None, TRIGRAM_D], name='DocBatch')
    doc_negative_batch = tf.sparse_placeholder(tf.float32, shape=[None, TRIGRAM_D], name='DocBatch')
    on_train = tf.placeholder(tf.bool)
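Since these are sparse placeholders, a batch is fed as a tf.SparseTensorValue built from the active uni-gram indices. A minimal sketch, with toy indices and an assumed TRIGRAM_D value:

import numpy as np
import tensorflow as tf

TRIGRAM_D = 5000                       # vocabulary size, same constant as above (value assumed)
active = [[3, 17, 42], [5, 42]]        # active dims of two queries in a batch (toy data)
indices = [[row, col] for row, cols in enumerate(active) for col in cols]
values = np.ones(len(indices), dtype=np.float32)
query_in = tf.SparseTensorValue(indices=indices, values=values,
                                dense_shape=[len(active), TRIGRAM_D])
# later: sess.run(train_step, feed_dict={query_batch: query_in, ...})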

3.2 Fully connected layers

I use three fully connected layers. Every layer is identical except for the number of neurons, so a single function can be written and reused.

$l_n = W_n x + b_n$
def add_layer(inputs, in_size, out_size, activation_function=None):
    wlimit = np.sqrt(6.0 / (in_size + out_size))
    Weights = tf.Variable(tf.random_uniform([in_size, out_size], -wlimit, wlimit))
    biases = tf.Variable(tf.random_uniform([out_size], -wlimit, wlimit))
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs
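Note that query_batch and the doc batches are SparseTensors, and tf.matmul cannot consume a SparseTensor directly, so the first layer needs a sparse matmul. A sketch of such a variant (the function name is mine, not from the original code):

def add_layer_sparse(sparse_inputs, in_size, out_size, activation_function=None):
    # Same initialization as add_layer, but multiplies the SparseTensor input
    # with the dense weight matrix via tf.sparse_tensor_dense_matmul.
    wlimit = np.sqrt(6.0 / (in_size + out_size))
    Weights = tf.Variable(tf.random_uniform([in_size, out_size], -wlimit, wlimit))
    biases = tf.Variable(tf.random_uniform([out_size], -wlimit, wlimit))
    Wx_plus_b = tf.sparse_tensor_dense_matmul(sparse_inputs, Weights) + biases
    return Wx_plus_b if activation_function is None else activation_function(Wx_plus_b)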

The weights and biases are initialized in the particular way described in the paper:

    wlimit = np.sqrt(6.0 / (in_size + out_size))
    Weights = tf.Variable(tf.random_uniform([in_size, out_size], -wlimit, wlimit))
    biases = tf.Variable(tf.random_uniform([out_size], -wlimit, wlimit))
  • Batch Normalization
def batch_normalization(x, phase_train, out_size):
    """
    Batch normalization on convolutional maps.
    Ref.: http://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow
    Args:
        x:           Tensor, 4D BHWD input maps
        out_size:       integer, depth of input maps
        phase_train: boolean tf.Varialbe, true indicates training phase
        scope:       string, variable scope
    Return:
        normed:      batch-normalized maps
    """
    with tf.variable_scope('bn'):
        beta = tf.Variable(tf.constant(0.0, shape=[out_size]),
                           name='beta', trainable=True)
        gamma = tf.Variable(tf.constant(1.0, shape=[out_size]),
                            name='gamma', trainable=True)
        batch_mean, batch_var = tf.nn.moments(x, [0], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=0.5)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        mean, var = tf.cond(phase_train,
                            mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, 1e-3)
    return normed

A single layer:

with tf.name_scope('FC1'):
    # The activation function comes after BN, so no activation here (None).
    query_l1 = add_layer(query_batch, TRIGRAM_D, L1_N, activation_function=None)
    doc_positive_l1 = add_layer(doc_positive_batch, TRIGRAM_D, L1_N, activation_function=None)
    doc_negative_l1 = add_layer(doc_negative_batch, TRIGRAM_D, L1_N, activation_function=None)

with tf.name_scope('BN1'):
    query_l1 = batch_normalization(query_l1, on_train, L1_N)
    doc_l1 = batch_normalization(tf.concat([doc_positive_l1, doc_negative_l1], axis=0), on_train, L1_N)
    doc_positive_l1 = tf.slice(doc_l1, [0, 0], [query_BS, -1])
    doc_negative_l1 = tf.slice(doc_l1, [query_BS, 0], [-1, -1])
    query_l1_out = tf.nn.relu(query_l1)
    doc_positive_l1_out = tf.nn.relu(doc_positive_l1)
    doc_negative_l1_out = tf.nn.relu(doc_negative_l1)
······
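The omitted layers repeat the same FC + BN + ReLU pattern. A sketch of how the second layer could look, where L2_N (the width of the second layer) is an assumed constant:

with tf.name_scope('FC2'):
    query_l2 = add_layer(query_l1_out, L1_N, L2_N, activation_function=None)
    doc_positive_l2 = add_layer(doc_positive_l1_out, L1_N, L2_N, activation_function=None)
    doc_negative_l2 = add_layer(doc_negative_l1_out, L1_N, L2_N, activation_function=None)

with tf.name_scope('BN2'):
    query_l2 = batch_normalization(query_l2, on_train, L2_N)
    doc_l2 = batch_normalization(tf.concat([doc_positive_l2, doc_negative_l2], axis=0), on_train, L2_N)
    doc_positive_l2 = tf.slice(doc_l2, [0, 0], [query_BS, -1])
    doc_negative_l2 = tf.slice(doc_l2, [query_BS, 0], [-1, -1])
    query_l2_out = tf.nn.relu(query_l2)
    doc_positive_l2_out = tf.nn.relu(doc_positive_l2)
    doc_negative_l2_out = tf.nn.relu(doc_negative_l2)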

Merging the negative samples

with tf.name_scope('Merge_Negative_Doc'):
    # Merge the negative samples; tile can optionally expand the negative samples.
    doc_y = tf.tile(doc_positive_y, [1, 1])
    for i in range(NEG):
        for j in range(query_BS):
            # slice(input_, begin, size) is the slicing API
            doc_y = tf.concat([doc_y, tf.slice(doc_negative_y, [j * NEG + i, 0], [1, -1])], 0)
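To see the row layout this loop produces, here is an equivalent numpy sketch with toy sizes (query_BS = 2, NEG = 2 are assumed for illustration only):

import numpy as np

query_BS, NEG, dim = 2, 2, 3                       # toy sizes, illustration only
doc_positive_y = np.arange(query_BS * dim).reshape(query_BS, dim)
# doc_negative_y is grouped per query: query j's negatives occupy rows j*NEG .. j*NEG+NEG-1
doc_negative_y = np.arange(100, 100 + query_BS * NEG * dim).reshape(query_BS * NEG, dim)

doc_y = doc_positive_y.copy()
for i in range(NEG):
    for j in range(query_BS):
        doc_y = np.concatenate([doc_y, doc_negative_y[j * NEG + i:j * NEG + i + 1]], axis=0)

# Resulting rows of doc_y:
#   0 .. BS-1       -> positive doc of every query
#   BS .. 2*BS-1    -> 1st negative of every query
#   2*BS .. 3*BS-1  -> 2nd negative of every query, ...
# which lines up with tf.tile(query_y, [NEG + 1, 1]) in the cosine step below.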

3.3 Computing the cosine similarity

with tf.name_scope('Cosine_Similarity'):
    # Cosine similarity
    # query_norm = sqrt(sum(each x^2))
    query_norm = tf.tile(tf.sqrt(tf.reduce_sum(tf.square(query_y), 1, True)), [NEG + 1, 1])
    # doc_norm = sqrt(sum(each x^2))
    doc_norm = tf.sqrt(tf.reduce_sum(tf.square(doc_y), 1, True))

    prod = tf.reduce_sum(tf.multiply(tf.tile(query_y, [NEG + 1, 1]), doc_y), 1, True)
    norm_prod = tf.multiply(query_norm, doc_norm)

    # cos_sim_raw = query * doc / (||query|| * ||doc||)
    cos_sim_raw = tf.truediv(prod, norm_prod)
    # gamma = 20
    cos_sim = tf.transpose(tf.reshape(tf.transpose(cos_sim_raw), [NEG + 1, query_BS])) * 20
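The transpose / reshape / transpose at the end turns the [(NEG + 1) * query_BS, 1] column of similarities into a [query_BS, NEG + 1] matrix whose first column belongs to the positive docs; a numpy sketch with toy sizes (illustration only):

import numpy as np

query_BS, NEG = 2, 2                                   # toy sizes, illustration only
# Row k*query_BS + j of cos_sim_raw holds sim(query j, its k-th doc); k = 0 is the positive.
cos_sim_raw = np.arange((NEG + 1) * query_BS, dtype=np.float32).reshape(-1, 1)
cos_sim = np.reshape(cos_sim_raw.T, [NEG + 1, query_BS]).T * 20   # gamma = 20
print(cos_sim.shape)   # (query_BS, NEG + 1); column 0 = positive-doc scores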

3.4 Defining the loss function

with tf.name_scope('Loss'):
    # Train Loss
    # Convert to a softmax probability matrix.
    prob = tf.nn.softmax(cos_sim)
    # Take only the first column, i.e., the probability of the positive sample.
    hit_prob = tf.slice(prob, [0, 0], [-1, 1])
    loss = -tf.reduce_sum(tf.log(hit_prob))
    tf.summary.scalar('loss', loss)
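Written out, this is the softmax posterior over the candidate docs from the DSSM paper, where γ is the smoothing factor (the factor 20 above) and D contains the clicked doc plus the NEG sampled negatives; the loss maximizes the likelihood of the clicked docs:

$$P(D^+ \mid Q) = \frac{\exp\left(\gamma \cos(Q, D^+)\right)}{\sum_{D' \in D} \exp\left(\gamma \cos(Q, D')\right)}, \qquad loss = -\sum_{(Q, D^+)} \log P(D^+ \mid Q)$$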

3.5 Choosing the optimization method

with tf.name_scope('Training'):
    # Optimizer
    train_step = tf.train.AdamOptimizer(FLAGS.learning_rate).minimize(loss)

3.6 Start training

# Create a Saver object to selectively save variables or the model.
saver = tf.train.Saver()
# with tf.Session(config=config) as sess:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/train', sess.graph)
    start = time.time()
    for step in range(FLAGS.max_steps):
        batch_id = step % FLAGS.epoch_steps
        sess.run(train_step, feed_dict=feed_dict(True, True, batch_id % FLAGS.pack_size, 0.5))
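The feed_dict(...) helper called above is not part of this excerpt. A hypothetical sketch of what it is assumed to do; pull_batch and the *_train_data names are placeholders of my own, and the dropout-probability argument is not consumed by any layer shown here:

def feed_dict(on_training, is_train_set, batch_id, drop_prob):
    """Hypothetical helper: map batch `batch_id` of pre-vectorized sparse data onto
    the graph placeholders. `pull_batch` (assumed) returns a tf.SparseTensorValue."""
    query_in = pull_batch(query_train_data, batch_id)
    doc_positive_in = pull_batch(doc_train_positive_data, batch_id)
    doc_negative_in = pull_batch(doc_train_negative_data, batch_id)
    return {query_batch: query_in,
            doc_positive_batch: doc_positive_in,
            doc_negative_batch: doc_negative_in,
            on_train: on_training}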

The Multi-view DSSM implementation follows the same approach; see GitHub: multi_view_dssm_v3