Multiway Attention Networks for Modeling Sentence Pairs

阿新 • Published: 2018-11-10

Model architecture:

[Figure: model architecture]

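Overall idea:

Information from the query is injected into the answer through several different forms of attention, giving a query-aware representation of the answer, and the prediction is made from that representation.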

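1. Input

Each word is represented by concatenating its word embedding with a contextual embedding produced by a language model. A bidirectional GRU then encodes each sentence.
[Figure: input encoding equations]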
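The input step can be sketched in plain Python as follows. This is only an illustration under assumptions: the dimensions, the random weights, and the helper names (`make_gru`, `gru_step`, `bigru`) are made up for the example, biases are omitted, and the GRU update uses the `(1 - z) * h + z * n` convention, which some libraries write the other way round.

```python
import math
import random

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def make_gru(in_dim, hid_dim, rng):
    # One input matrix and one recurrent matrix per gate:
    # z = update gate, r = reset gate, n = candidate state (hypothetical weights).
    return {g: (rand_matrix(hid_dim, in_dim, rng), rand_matrix(hid_dim, hid_dim, rng))
            for g in "zrn"}

def gru_step(p, x, h):
    z = sigmoid([a + b for a, b in zip(matvec(p["z"][0], x), matvec(p["z"][1], h))])
    r = sigmoid([a + b for a, b in zip(matvec(p["r"][0], x), matvec(p["r"][1], h))])
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b) for a, b in zip(matvec(p["n"][0], x), matvec(p["n"][1], rh))]
    return [(1 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]

def bigru(xs, fwd, bwd, hid_dim):
    # Left-to-right pass.
    h, forward = [0.0] * hid_dim, []
    for x in xs:
        h = gru_step(fwd, x, h)
        forward.append(h)
    # Right-to-left pass, re-aligned to the original token order.
    h, backward = [0.0] * hid_dim, []
    for x in reversed(xs):
        h = gru_step(bwd, x, h)
        backward.append(h)
    backward.reverse()
    # Each token state is the concatenation of both directions.
    return [f + b for f, b in zip(forward, backward)]

rng = random.Random(0)
word_emb = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]     # 4 tokens, dim 3
context_emb = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(4)]  # 4 tokens, dim 2
inputs = [w + c for w, c in zip(word_emb, context_emb)]  # concatenated input, dim 5
fwd, bwd = make_gru(5, 4, rng), make_gru(5, 4, rng)
states = bigru(inputs, fwd, bwd, 4)
print(len(states), len(states[0]))  # 4 8
```

Both sentences P and Q would go through the same kind of encoder, giving the states $h_t^p$ and $h_j^q$ used in the later steps.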

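2. Similarity computation

Four similarity functions, marked by the superscripts c, b, d, and m (concat, bilinear, dot, and minus attention in the paper), are computed between the bidirectional GRU representations of the two sentences P and Q. Each one serves as an attention that produces a weighted representation of Q.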
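A rough sketch of the four attention functions for a single position t of P is given below. The post does not spell out the scoring formulas, so the ones used here follow my reading of the paper and should be treated as an assumption: additive scoring for c, a bilinear form for b, and a `tanh` layer over the element-wise product (d) or difference (m). All weight names and values are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    return [dot(row, x) for row in W]

def tanh_vec(v):
    return [math.tanh(x) for x in v]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(weights, vectors):
    return [sum(w * vec[k] for w, vec in zip(weights, vectors))
            for k in range(len(vectors[0]))]

def multiway_attention(hp_t, hq, P):
    """Return the four attended views of Q for one position t of P."""
    scores = {
        # concat: v_c . tanh(W_c1 h_j^q + W_c2 h_t^p)
        "c": [dot(P["v_c"], tanh_vec([a + b for a, b in
              zip(matvec(P["W_c1"], h), matvec(P["W_c2"], hp_t))])) for h in hq],
        # bilinear: h_j^q . (W_b h_t^p)
        "b": [dot(h, matvec(P["W_b"], hp_t)) for h in hq],
        # dot: v_d . tanh(W_d (h_j^q * h_t^p)), element-wise product
        "d": [dot(P["v_d"], tanh_vec(matvec(P["W_d"],
              [a * b for a, b in zip(h, hp_t)]))) for h in hq],
        # minus: v_m . tanh(W_m (h_j^q - h_t^p))
        "m": [dot(P["v_m"], tanh_vec(matvec(P["W_m"],
              [a - b for a, b in zip(h, hp_t)]))) for h in hq],
    }
    # Normalize each score list over the positions of Q and average the Q states.
    return {k: weighted_sum(softmax(s), hq) for k, s in scores.items()}

rng = random.Random(0)
d = 4

def rand_vec(n):
    return [rng.uniform(-1, 1) for _ in range(n)]

def rand_mat(rows, cols):
    return [rand_vec(cols) for _ in range(rows)]

params = {name: rand_mat(d, d) for name in ("W_c1", "W_c2", "W_b", "W_d", "W_m")}
params.update({name: rand_vec(d) for name in ("v_c", "v_d", "v_m")})
hq = [rand_vec(d) for _ in range(5)]  # 5 encoded tokens of Q
hp_t = rand_vec(d)                    # one encoded token of P
q_t = multiway_attention(hp_t, hq, params)
print(sorted(q_t), len(q_t["c"]))  # ['b', 'c', 'd', 'm'] 4
```

Each of the four outputs is a convex combination of the Q states, so the views differ only in how the weights over Q are scored.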

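3. Aggregation

(1) Concatenate the concat attention representation of Q, $q_t^c$, with the hidden state of P at time step t, $h_t^p$ (the usual attention form), then filter the information through a gate mechanism.
[Figure: gated concatenation equations]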
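As a minimal sketch of the gating step, assuming the gate is a sigmoid of a linear map of the concatenated vector and is applied element-wise (the original equations were in an image, so this form is an assumption, and `W_g` is a hypothetical weight matrix):

```python
import math

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def gated_concat(q_t, h_t, W_g):
    x = q_t + h_t  # list concatenation: [q_t ; h_t]
    gate = [1.0 / (1.0 + math.exp(-s)) for s in matvec(W_g, x)]
    # Each gate value lies in (0, 1) and scales one component of x.
    return [g * xi for g, xi in zip(gate, x)]

q_t = [1.0, -2.0]                         # attended view of Q (toy values)
h_t = [4.0]                               # hidden state of P at step t (toy value)
W_zero = [[0.0] * 3 for _ in range(3)]    # all-zero weights give a gate of 0.5
out = gated_concat(q_t, h_t, W_zero)
print(out)  # [0.5, -1.0, 2.0]
```

With trained weights the gate would differ per component, letting the model pass through some parts of the concatenated vector and suppress others.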

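(2) A GRU then re-encodes the sequence.
[Figure: GRU aggregation equations]

That is, the gated concatenated vectors from the previous step are encoded once more with a GRU. The four attentions yield four such representations.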
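(3) Attention is applied once more to combine the four attention representations with weights. Here $v_a$ is a parameter (I do not fully understand this part).
[Figure: mixed aggregation equations]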
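One possible reading of $v_a$ is that it acts as a learned query vector: there is no external sentence to attend from at this point, so a trainable vector takes that role and scores the four views at each position. The sketch below illustrates that reading with additive attention. The scoring form and the names `W1`, `W2`, and `v` are assumptions, not taken from the post.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    return [dot(row, x) for row in W]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mix_views(views, W1, W2, v, v_a):
    """Combine the four per-position views (c, b, d, m) into one vector."""
    query = matvec(W2, v_a)  # the same learned query for every view
    scores = [dot(v, [math.tanh(a + b) for a, b in zip(matvec(W1, h), query)])
              for h in views]
    weights = softmax(scores)
    mixed = [sum(w * h[k] for w, h in zip(weights, views))
             for k in range(len(views[0]))]
    return weights, mixed

rng = random.Random(0)
d = 3
W1 = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
W2 = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
v = [rng.uniform(-1, 1) for _ in range(d)]
v_a = [rng.uniform(-1, 1) for _ in range(d)]

# Sanity check: four identical views must receive equal weights.
same = [[0.2, -0.1, 0.7]] * 4
weights, mixed = mix_views(same, W1, W2, v, v_a)
print(weights)  # [0.25, 0.25, 0.25, 0.25]
```

When the views differ, the weights shift toward whichever attention type scores highest against the learned query, so the model can prefer different attention functions at different positions.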

(4) The combined representation is then modeled once more with a GRU.
[Figure: final GRU aggregation equation]

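4. Prediction layer

(1) Re-represent Q through attention pooling, which introduces a parameter $v^q$.
[Figure: attention pooling over Q]
(2) Apply attention between the representation of Q and the representation of P.

Finally, $r_p$ is fed into an MLP.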
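The prediction layer can be sketched as two rounds of additive attention pooling followed by a small MLP. This is an illustrative sketch under assumptions: the additive scoring form, the one-hidden-layer MLP, the three output classes, and every weight name in `P` are hypothetical, and `agg_p` stands for the aggregated states of P coming out of step 3.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    return [dot(row, x) for row in W]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def additive_pool(states, query, W1, W2, v):
    """Pool a sequence of states into one vector, scored against `query`."""
    q = matvec(W2, query)
    scores = [dot(v, [math.tanh(a + b) for a, b in zip(matvec(W1, h), q)])
              for h in states]
    weights = softmax(scores)
    return [sum(w * h[k] for w, h in zip(weights, states))
            for k in range(len(states[0]))]

def predict(hq, agg_p, P):
    # (1) Pool Q using the learned parameter v_q as the query.
    r_q = additive_pool(hq, P["v_q"], P["W_q1"], P["W_q2"], P["u_q"])
    # (2) Pool the aggregated states of P using r_q as the query.
    r_p = additive_pool(agg_p, r_q, P["W_p1"], P["W_p2"], P["u_p"])
    # MLP: one tanh hidden layer, then a softmax over the classes.
    hidden = [math.tanh(x) for x in matvec(P["W_h"], r_p)]
    return softmax(matvec(P["W_o"], hidden))

rng = random.Random(0)
d, n_classes = 4, 3

def rand_vec(n):
    return [rng.uniform(-1, 1) for _ in range(n)]

def rand_mat(rows, cols):
    return [rand_vec(cols) for _ in range(rows)]

P = {name: rand_mat(d, d) for name in ("W_q1", "W_q2", "W_p1", "W_p2", "W_h")}
P.update({name: rand_vec(d) for name in ("v_q", "u_q", "u_p")})
P["W_o"] = rand_mat(n_classes, d)

hq = [rand_vec(d) for _ in range(5)]     # encoded tokens of Q
agg_p = [rand_vec(d) for _ in range(6)]  # aggregated states of P
probs = predict(hq, agg_p, P)
print(len(probs))  # 3
```

The same pooling function is reused for both steps; only the query changes, from the learned vector $v^q$ in step (1) to the pooled Q vector in step (2).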
