TensorFlow activation functions -- tf.nn.dropout
Preface: An activation function (Activation Function) activates a subset of the neurons in a network at run time and passes the activation forward to the next layer. The mathematics underlying neural networks relies on differentiability, so the chosen activation function must keep the mapping from input to output differentiable.
### The role of activation functions

If no activation function is used (equivalently, if the activation is linear, f(x) = ax + b), then every layer's output is a linear function of the previous layer's output. It is easy to see that no matter how many layers such a network has, its output is still a linear function of its input, which is no better than having no hidden layers at all; the network reduces to the original perceptron (Perceptron). As is well known, the perceptron cannot solve even the basic XOR problem, let alone more complex non-linear ones. Neural networks can handle non-linear problems precisely because of the non-linear expressive power of their activation functions.
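To make the "layers collapse into one linear map" argument concrete, here is a minimal NumPy sketch (not from the original article; the shapes are arbitrary): two affine layers with no activation in between are exactly reproduced by a single affine layer, whereas inserting a ReLU breaks the equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))                      # a small batch of 5 inputs
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 2)), rng.normal(size=2)

# Two stacked layers with no activation in between ...
h = x @ W1 + b1
y_linear = h @ W2 + b2

# ... are equivalent to one affine layer with W = W1 @ W2, b = b1 @ W2 + b2.
W, b = W1 @ W2, b1 @ W2 + b2
assert np.allclose(y_linear, x @ W + b)

# With a ReLU in between, the composition is no longer affine,
# so in general no single (W, b) can reproduce it.
y_relu = np.maximum(h, 0.0) @ W2 + b2
```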
### Activation functions provided by the official TensorFlow API:
[Activation Functions](https://www.tensorflow.org/api_guides/python/nn#activation-functions)
- tf.nn.relu
- tf.nn.relu6
- tf.nn.crelu
- tf.nn.elu
- tf.nn.selu
- tf.nn.softplus
- tf.nn.softsign
- tf.nn.dropout
- tf.nn.bias_add
- tf.sigmoid
- tf.tanh
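As a quick illustration of how these are used (a minimal sketch, assuming TensorFlow 1.x as in the rest of this article), most of them take a tensor and return an element-wise transformed tensor of the same shape:

```python
import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])

with tf.Session() as sess:
    print(sess.run(tf.nn.relu(x)))    # [0.  0.  0.  0.5 2. ]
    print(sess.run(tf.sigmoid(x)))    # values squashed into (0, 1)
    print(sess.run(tf.tanh(x)))       # values squashed into (-1, 1)
```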
The dropout function decides, with keep probability keep_prob, whether each neuron is suppressed. A suppressed neuron outputs 0; a neuron that is kept outputs its input multiplied by 1/keep_prob.
By default, each neuron is kept or dropped independently of the others. This behaviour can be controlled with noise_shape: wherever noise_shape[i] == shape(x)[i], the elements of x along that dimension are treated independently. For example, if shape(x) = [k, l, m, n] (k is the number of samples, l the number of rows, m the number of columns, and n the number of channels), then noise_shape = [k, 1, 1, n] means the keep/drop decisions are independent across samples and channels but shared across rows and columns: for a given sample and channel, the whole slice is either all 0 or all scaled by 1/keep_prob.
```python
def dropout(incoming, keep_prob, noise_shape=None, name="Dropout"):
    """ Dropout.

    Outputs the input element scaled up by `1 / keep_prob`. The scaling is
    so that the expected sum is unchanged.

    By default, each element is kept or dropped independently. If noise_shape
    is specified, it must be broadcastable to the shape of x, and only
    dimensions with noise_shape[i] == shape(x)[i] will make independent
    decisions. For example, if shape(x) = [k, l, m, n] and
    noise_shape = [k, 1, 1, n], each batch and channel component will be
    kept independently and each row and column will be kept or not kept
    together.

    Arguments:
        incoming : A `Tensor`. The incoming tensor.
        keep_prob : A float representing the probability that each element
            is kept.
        noise_shape : A 1-D Tensor of type int32, representing the shape for
            randomly generated keep/drop flags.
        name : A name for this layer (optional).
    """
```
The examples below illustrate this.
```python
import tensorflow as tf

dropout = tf.placeholder(tf.float32)        # keep_prob, fed at run time
x = tf.Variable(tf.ones([10, 10]))
y = tf.nn.dropout(x, dropout)

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

# Keep each element with probability 0.5; kept elements are scaled by 1/0.5 = 2.
a = sess.run(y, feed_dict={dropout: 0.5})
print(a)
```
Result:
```
[[0. 2. 0. 2. 2. 2. 0. 2. 2. 2.]
 [2. 0. 0. 0. 2. 2. 0. 2. 0. 2.]
 [0. 0. 2. 2. 2. 0. 2. 2. 2. 2.]
 [0. 0. 2. 2. 0. 0. 2. 2. 0. 2.]
 [0. 2. 0. 0. 2. 0. 0. 0. 0. 0.]
 [2. 0. 0. 0. 0. 2. 0. 0. 0. 0.]
 [0. 2. 0. 0. 2. 2. 2. 0. 2. 0.]
 [0. 2. 2. 2. 0. 0. 0. 2. 0. 2.]
 [0. 0. 2. 0. 2. 2. 0. 2. 0. 0.]
 [0. 2. 2. 2. 2. 0. 2. 0. 2. 2.]]
```
The next example shows how noise_shape controls which dimensions share a keep/drop decision:

```python
import tensorflow as tf

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    d = tf.constant([[ 1.,  2.,  3.,  4.],
                     [ 5.,  6.,  7.,  8.],
                     [ 9., 10., 11., 12.],
                     [13., 14., 15., 16.]])
    print(sess.run(tf.shape(d)))

    # noise_shape == [4, 4] == tf.shape(d): rows and columns are both independent.
    dropout_a44 = tf.nn.dropout(d, 0.5, noise_shape=[4, 4])
    result_dropout_a44 = sess.run(dropout_a44)
    print(result_dropout_a44)

    # noise_shape[0] = 4 == tf.shape(d)[0] = 4
    # noise_shape[1] = 1 != tf.shape(d)[1] = 4
    # Rows are independent, columns are tied: within each row, every entry is
    # either 0 or scaled by 1/keep_prob.
    dropout_a41 = tf.nn.dropout(d, 0.5, noise_shape=[4, 1])
    result_dropout_a41 = sess.run(dropout_a41)
    print(result_dropout_a41)

    # noise_shape[0] = 1 != tf.shape(d)[0] = 4
    # noise_shape[1] = 4 == tf.shape(d)[1] = 4
    # Columns are independent, rows are tied: within each column, every entry is
    # either 0 or scaled by 1/keep_prob.
    dropout_a14 = tf.nn.dropout(d, 0.5, noise_shape=[1, 4])
    result_dropout_a14 = sess.run(dropout_a14)
    print(result_dropout_a14)

    # Any dimension of noise_shape that differs from the input shape must be 1.
```
Result:
```
[4 4]
[[ 0.  4.  0.  8.]
 [ 0.  0. 14.  0.]
 [ 0.  0. 22.  0.]
 [ 0.  0. 30.  0.]]
[[ 2.  4.  6.  8.]
 [ 0.  0.  0.  0.]
 [18. 20. 22. 24.]
 [26. 28. 30. 32.]]
[[ 0.  0.  6.  0.]
 [ 0.  0. 14.  0.]
 [ 0.  0. 22.  0.]
 [ 0.  0. 30.  0.]]
```
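For the 4-D case described earlier (shape(x) = [k, l, m, n] with noise_shape = [k, 1, 1, n]), the same pattern applies per sample and per channel. A minimal sketch, with a toy 4-D input chosen purely for illustration:

```python
import tensorflow as tf

# Assumed toy input: 2 samples, 3 rows, 3 columns, 2 channels, all ones.
x = tf.ones([2, 3, 3, 2])

# One keep/drop decision per (sample, channel) pair; each 3x3 row/column slice
# is either all 0 or all scaled by 1/keep_prob = 2.
y = tf.nn.dropout(x, 0.5, noise_shape=[2, 1, 1, 2])

with tf.Session() as sess:
    out = sess.run(y)
    # Every out[sample, :, :, channel] slice is uniformly 0. or uniformly 2.
    print(out[0, :, :, 0])
    print(out[0, :, :, 1])
```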
### Definition of Dropout
- Dropout is a function TensorFlow provides to prevent or mitigate overfitting; it is typically applied to fully connected layers.
- During training, dropout temporarily removes units from the network with a given probability. The removal is only temporary: because stochastic gradient descent drops units at random, every mini-batch effectively trains a different thinned network (see the usage sketch below).
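The usual usage pattern is sketched below, assuming TensorFlow 1.x as in the rest of this article (the layer sizes and keep_prob values are placeholders chosen for illustration): dropout follows a fully connected layer, keep_prob is fed as e.g. 0.5 during training, and as 1.0 during evaluation so that nothing is dropped at test time.

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])   # input features
keep_prob = tf.placeholder(tf.float32)         # fed differently for training vs. evaluation

# A fully connected layer followed by dropout, then the output layer.
hidden = tf.layers.dense(x, 256, activation=tf.nn.relu)
hidden_drop = tf.nn.dropout(hidden, keep_prob)
logits = tf.layers.dense(hidden_drop, 10)

# Training step (train_op is whatever optimizer op the model defines):
#     sess.run(train_op, feed_dict={x: batch_x, keep_prob: 0.5})
# Evaluation (keep_prob = 1.0 disables dropout):
#     sess.run(logits, feed_dict={x: test_x, keep_prob: 1.0})
```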