Introduction to Siamese Networks

阿新 • Published: 2020-08-28

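A Siamese Network is mainly used to measure how similar two inputs are. In the architecture figure, Network 1 and Network 2 are the same neural network with shared weights. This network maps both inputs into a new feature space, and its parameters are trained with a distance-based loss function, so that the trained network can measure the similarity of the two inputs.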
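The weight sharing can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration: the one-layer `tanh` network, the helper names `make_embedding_net` and `euclidean_distance`, and the input values. It only shows that "Network 1" and "Network 2" are one function applied to two inputs.

```python
import math
import random


def make_embedding_net(input_size, output_size, seed=0):
    """Return a toy one-layer network f(x) = tanh(W x + b) with fixed random weights."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(input_size)]
               for _ in range(output_size)]
    biases = [0.0] * output_size

    def net(x):
        # Every call uses the same weights and biases captured above.
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(weights, biases)]

    return net


def euclidean_distance(u, v):
    """L2 distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# One set of weights serves both branches of the Siamese Network.
net = make_embedding_net(input_size=4, output_size=3)

x1 = [0.2, 0.1, 0.9, 0.4]
x2 = [0.2, 0.1, 0.8, 0.4]  # differs from x1 in a single coordinate

d_near = euclidean_distance(net(x1), net(x2))
print(d_near)
```

With untrained weights the distance carries no meaning yet; training with a distance-based loss is what makes a small distance correspond to similar inputs.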

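Siamese Networks are often used when there are many classes, or the set of classes is not fixed in advance, but each class has only a few samples. Face recognition is a typical example. Several different loss functions can be designed for training a Siamese Network; this article covers the following two: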

1. Triplet Loss. Taking face recognition as the example, the model architecture during training is shown in the figure:

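During training, three face images $(x, x^+, x^-)$ are fed in at the same time, where $x$ and $x^+$ show the same person and $x$ and $x^-$ show different people. The loss is designed so that the distance between the network outputs $Net(x)$ and $Net(x^+)$ is small, while the distance between $Net(x)$ and $Net(x^-)$ is large. The Triplet Loss can be written as:

$$L(x, x^+, x^-) = \max \left(\|Net(x)-Net(x^+)\|_2^{2}-\|Net(x)-Net(x^-)\|_2^{2}+\alpha, 0\right)$$

where $\alpha$ is a positive margin fixed in advance.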

import tensorflow as tf


def triplet_loss(y_pred, alpha=0.2):
    """
    Compute the triplet loss, averaged over the training batch.

    Arguments:
    y_pred -- python list containing three objects:
            anchor -- the network output for the anchor images x, of shape (None, output_size)
            positive -- the network output for the positive images x+, of shape (None, output_size)
            negative -- the network output for the negative images x-, of shape (None, output_size)
    alpha -- the margin, a positive real number

    Returns:
    loss -- real number, value of the loss
    """
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
    # Squared L2 distance between the anchor and positive encodings, per example
    pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), axis=-1)
    # Squared L2 distance between the anchor and negative encodings, per example
    neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), axis=-1)
    # Subtract the two distances and add the margin alpha
    basic_loss = tf.add(tf.subtract(pos_dist, neg_dist), alpha)
    # Clamp each example's loss at 0.0, then average over the training batch
    loss = tf.reduce_mean(tf.maximum(basic_loss, 0.0), axis=None)
    return loss
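To see how the margin $\alpha$ behaves, the following pure-Python sketch evaluates the same formula on a single triplet. The helper names `squared_l2` and `triplet_loss_single` and the two-dimensional embeddings are hypothetical values chosen so the arithmetic is easy to check by hand.

```python
def squared_l2(u, v):
    """Squared L2 distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss_single(anchor, positive, negative, alpha=0.2):
    """max(||a - p||^2 - ||a - n||^2 + alpha, 0) for one triplet."""
    gap = squared_l2(anchor, positive) - squared_l2(anchor, negative)
    return max(gap + alpha, 0.0)


anchor = [0.0, 0.0]
positive = [0.5, 0.0]        # squared distance to the anchor: 0.25

easy_negative = [2.0, 0.0]   # squared distance 4.0, so 0.25 - 4.0 + 0.2 < 0
hard_negative = [0.0, 0.5]   # squared distance 0.25, so 0.25 - 0.25 + 0.2 = 0.2

print(triplet_loss_single(anchor, positive, easy_negative))  # clamped to 0.0
print(triplet_loss_single(anchor, positive, hard_negative))  # 0.2, the margin
```

A negative that already lies more than the margin farther away than the positive contributes no loss, so only triplets that violate the margin produce a gradient.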


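Once the network is trained, the Siamese Network is used to judge how similar two inputs are. Staying with face recognition: for two input face images $x_1$ and $x_2$, compute the distance $d(x_1,x_2)=\|Net(x_1)-Net(x_2)\|_2$. If the distance is below a preset threshold, the two images are judged to show the same person; otherwise they are judged to show different people (see the figure).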
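The decision rule can be sketched in a few lines of pure Python. The embeddings below are hypothetical stand-ins for network outputs, and the threshold of 0.7 is an assumed value; in practice the threshold would be tuned on held-out pairs.

```python
import math


def l2_distance(u, v):
    """d(x1, x2) = ||Net(x1) - Net(x2)||_2, given the two embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def is_same_person(emb1, emb2, threshold=0.7):
    """Declare a match when the embedding distance is below the threshold."""
    return l2_distance(emb1, emb2) < threshold


# Hypothetical embeddings standing in for the outputs of a trained network.
emb_a = [0.10, 0.90, 0.30]
emb_b = [0.15, 0.85, 0.35]   # close to emb_a (distance about 0.09)
emb_c = [0.90, 0.10, 0.80]   # far from emb_a (distance about 1.24)

print(is_same_person(emb_a, emb_b))
print(is_same_person(emb_a, emb_c))
```

Because only embeddings are compared, a new person can be enrolled by storing one embedding, without retraining the network.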

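2. Binary Classification Loss. As shown in the figure, each training step takes a pair of images $(x^{(i)},x^{(j)})$. From the network outputs $f(x^{(i)})$ and $f(x^{(j)})$, the probability $\hat{y}$ that the two images show the same person is computed as: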

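$$\hat{y}=\mathrm{sigmoid}\left(\sum_{k=1}^{\mathrm{output\_size}} w_{k}\left|f\left(x^{(i)}\right)_{k}-f\left(x^{(j)}\right)_{k}\right|+b\right)$$

If the two images show the same person, the true label $y$ is 1; otherwise it is 0. The loss is the cross-entropy loss used in logistic regression:

$$L(x^{(i)},x^{(j)})=L(\hat{y},y)=-y\ln\hat{y}-(1-y)\ln{(1-\hat{y})}$$

After training, computing $\hat{y}$ for two input images is enough to decide whether they show the same person.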
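The two formulas can be checked numerically with a small pure-Python sketch. The weights `w`, the bias `b`, and the feature vectors are hypothetical values picked by hand, not learned parameters; the clipping constant `eps` is an added numerical safeguard that is not part of the formula.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def same_person_probability(f_i, f_j, w, b):
    """y_hat = sigmoid(sum_k w_k * |f_i[k] - f_j[k]| + b)."""
    z = sum(w_k * abs(a - c) for w_k, a, c in zip(w, f_i, f_j)) + b
    return sigmoid(z)


def binary_cross_entropy(y_hat, y, eps=1e-12):
    """-y ln(y_hat) - (1 - y) ln(1 - y_hat), with y_hat clipped away from 0 and 1."""
    y_hat = min(max(y_hat, eps), 1.0 - eps)
    return -y * math.log(y_hat) - (1 - y) * math.log(1.0 - y_hat)


# Assumed parameters: negative weights, so larger feature differences
# push the probability of "same person" down.
w = [-4.0, -4.0, -4.0]
b = 2.0

f_i = [0.10, 0.90, 0.30]
f_j = [0.15, 0.85, 0.35]   # similar to f_i
f_k = [0.90, 0.10, 0.80]   # dissimilar to f_i

y_hat_same = same_person_probability(f_i, f_j, w, b)
y_hat_diff = same_person_probability(f_i, f_k, w, b)

print(y_hat_same, binary_cross_entropy(y_hat_same, y=1))
print(y_hat_diff, binary_cross_entropy(y_hat_diff, y=0))
```

Unlike the fixed distance threshold used with Triplet Loss, the weights $w_k$ and the bias $b$ are learned here, so the model decides which embedding dimensions matter most for the comparison.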

References

Convolutional Neural Networks, a course in the Coursera Deep Learning Specialization
"Siamese network 孿生神經網路--一個簡單神奇的結構" (Siamese networks: a simple and remarkable structure)
