Implementing attention in Keras (still not fully understood)
阿新 • Published: 2019-02-16
from keras import backend as K
from keras.engine.topology import Layer
from keras import initializers, regularizers, constraints


class Attention_layer(Layer):
    """
    Attention operation, with a context/query vector, for temporal data.
    Supports Masking.
    Follows the work of Yang et al. [https://www.cs.cmu.edu/~diyiy/docs/naacl16.pdf]
    "Hierarchical Attention Networks for Document Classification"
    by using a context vector to assist the attention.

    # Input shape
        3D tensor with shape: `(samples, steps, features)`.
    # Output shape
        2D tensor with shape: `(samples, features)`.

    Just put it on top of an RNN Layer (GRU/LSTM/SimpleRNN) with return_sequences=True.
    The dimensions are inferred based on the output shape of the RNN.

    Example:
        model.add(LSTM(64, return_sequences=True))
        model.add(Attention_layer())
    """

    def __init__(self,
                 W_regularizer=None, b_regularizer=None,
                 W_constraint=None, b_constraint=None,
                 bias=True, **kwargs):
        self.supports_masking = True
        self.init = initializers.get('glorot_uniform')

        self.W_regularizer = regularizers.get(W_regularizer)
        self.b_regularizer = regularizers.get(b_regularizer)

        self.W_constraint = constraints.get(W_constraint)
        self.b_constraint = constraints.get(b_constraint)

        self.bias = bias
        super(Attention_layer, self).__init__(**kwargs)

    def build(self, input_shape):
        assert len(input_shape) == 3

        self.W = self.add_weight((input_shape[-1], input_shape[-1],),
                                 initializer=self.init,
                                 name='{}_W'.format(self.name),
                                 regularizer=self.W_regularizer,
                                 constraint=self.W_constraint)
        if self.bias:
            self.b = self.add_weight((input_shape[-1],),
                                     initializer='zero',
                                     name='{}_b'.format(self.name),
                                     regularizer=self.b_regularizer,
                                     constraint=self.b_constraint)

        super(Attention_layer, self).build(input_shape)

    def compute_mask(self, input, input_mask=None):
        # do not pass the mask to the next layers
        return None

    def call(self, x, mask=None):
        uit = K.dot(x, self.W)          # (samples, steps, features)

        if self.bias:
            uit += self.b

        uit = K.tanh(uit)

        a = K.exp(uit)

        # apply mask after the exp. will be re-normalized next
        if mask is not None:
            # Cast the mask to floatX to avoid float64 upcasting in theano
            a *= K.cast(mask, K.floatx())

        # in some cases, especially in the early stages of training, the sum may be almost zero
        # and this results in NaN's. A workaround is to add a very small positive number to the sum.
        # a /= K.cast(K.sum(a, axis=1, keepdims=True), K.floatx())
        a /= K.cast(K.sum(a, axis=1, keepdims=True) + K.epsilon(), K.floatx())

        weighted_input = x * a          # weight each timestep, then sum over time
        return K.sum(weighted_input, axis=1)

    def compute_output_shape(self, input_shape):
        return input_shape[0], input_shape[-1]
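For reference, here is a minimal usage sketch of my own (not from the original post) showing the layer on top of an LSTM that returns the full sequence, for a toy binary classification model. The hyper-parameters `max_len`, `vocab_size` and `embed_dim` are placeholders.

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

max_len, vocab_size, embed_dim = 100, 20000, 128   # hypothetical hyper-parameters

model = Sequential()
# mask_zero=True produces a mask that Attention_layer can consume
model.add(Embedding(vocab_size, embed_dim, input_length=max_len, mask_zero=True))
model.add(LSTM(64, return_sequences=True))   # 3D output: (samples, steps, 64)
model.add(Attention_layer())                 # collapses steps -> (samples, 64)
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()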
step-1 How to define your own layer in Keras
To define your own layer, you need to implement the following three methods (from the official documentation); a minimal skeleton follows the list below.
build(input_shape): This is where you define your weights. This method must set self.built = True, which can be done by calling super([Layer], self).build().

call(x): This is where you write the layer's logic. You only need to care about the first argument passed to call, the input tensor, unless you want your layer to support masking.

compute_output_shape(input_shape): If your layer changes the shape of its input tensor, you should define the shape-change logic here, so that Keras can automatically infer the output shapes of the layers.
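To illustrate the three methods, here is a minimal custom-layer skeleton of my own (an illustrative sketch, not from the original post); it simply applies a learned projection W to its input.

from keras import backend as K
from keras.engine.topology import Layer


class MyDense(Layer):
    """Minimal custom layer: a learned linear projection (illustration only)."""

    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyDense, self).__init__(**kwargs)

    def build(self, input_shape):
        # define the trainable weights of the layer
        self.W = self.add_weight(name='W',
                                 shape=(input_shape[-1], self.output_dim),
                                 initializer='glorot_uniform',
                                 trainable=True)
        # must set self.built = True; done by the parent class's build()
        super(MyDense, self).build(input_shape)

    def call(self, x, mask=None):
        # the layer's forward logic
        return K.dot(x, self.W)

    def compute_output_shape(self, input_shape):
        # the layer changes the last dimension, so declare the new shape
        return input_shape[:-1] + (self.output_dim,)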