Dropout usage: some "dropout" tricks in deep learning

阿新 • Published: 2021-01-16

1. Dropout usage

def dropout(x, keep_prob, noise_shape=None, seed=None, name=None)

Where:

x is the output of the neurons

keep_prob is the fraction of neurons that are kept

TensorFlow source code:

def dropout(x, keep_prob, noise_shape=None, seed=None, name=None):  # pylint: disable=invalid-name
  """Computes dropout.

  With probability `keep_prob`, outputs the input element scaled up by
  `1 / keep_prob`, otherwise outputs `0`.  The scaling is so that the expected
  sum is unchanged.

  By default, each element is kept or dropped independently.  If `noise_shape`
  is specified, it must be
  [broadcastable](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
  to the shape of `x`, and only dimensions with `noise_shape[i] == shape(x)[i]`
  will make independent decisions.  For example, if `shape(x) = [k, l, m, n]`
  and `noise_shape = [k, 1, 1, n]`, each batch and channel component will be
  kept independently and each row and column will be kept or not kept together.

  Args:
    x: A floating point tensor.
    keep_prob: A scalar `Tensor` with the same type as x. The probability
      that each element is kept.
    noise_shape: A 1-D `Tensor` of type `int32`, representing the
      shape for randomly generated keep/drop flags.
    seed: A Python integer. Used to create random seeds. See
      `tf.set_random_seed`
      for behavior.
    name: A name for this operation (optional).

  Returns:
    A Tensor of the same shape of `x`.

  Raises:
    ValueError: If `keep_prob` is not in `(0, 1]` or if `x` is not a floating
      point tensor.
  """
  with ops.name_scope(name, "dropout", [x]) as name:
    x = ops.convert_to_tensor(x, name="x")
    if not x.dtype.is_floating:
      raise ValueError("x has to be a floating point tensor since it's going to"
                       " be scaled. Got a %s tensor instead." % x.dtype)
    if isinstance(keep_prob, numbers.Real) and not 0 < keep_prob <= 1:
      raise ValueError("keep_prob must be a scalar tensor or a float in the "
                       "range (0, 1], got %g" % keep_prob)

    # Early return if nothing needs to be dropped.
    if isinstance(keep_prob, float) and keep_prob == 1:
      return x
    if context.executing_eagerly():
      if isinstance(keep_prob, ops.EagerTensor):
        if keep_prob.numpy() == 1:
          return x
    else:
      keep_prob = ops.convert_to_tensor(
          keep_prob, dtype=x.dtype, name="keep_prob")
      keep_prob.get_shape().assert_is_compatible_with(tensor_shape.scalar())

      # Do nothing if we know keep_prob == 1
      if tensor_util.constant_value(keep_prob) == 1:
        return x

    noise_shape = _get_noise_shape(x, noise_shape)

    # uniform [keep_prob, 1.0 + keep_prob)
    random_tensor = keep_prob
    random_tensor += random_ops.random_uniform(
        noise_shape, seed=seed, dtype=x.dtype)
    # 0. if [keep_prob, 1.0) and 1. if [1.0, 1.0 + keep_prob)
    binary_tensor = math_ops.floor(random_tensor)
    ret = math_ops.div(x, keep_prob) * binary_tensor
    if not context.executing_eagerly():
      ret.set_shape(x.get_shape())
    return ret
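
The `noise_shape` behaviour described in the docstring above can be illustrated without TensorFlow. The following is a minimal sketch in plain Python, not the library's implementation: the helper name `dropout_rows` is hypothetical, and it assumes a 2-D input with `noise_shape = [rows, 1]`, so that one keep/drop decision is shared by every column of a row.

```python
import random

def dropout_rows(x, keep_prob, seed=None):
    """Dropout on a 2-D list with noise_shape = [rows, 1] (hypothetical helper).

    One keep/drop decision is made per row and shared by all columns,
    which is what broadcasting a [rows, 1] mask over [rows, cols] does.
    """
    if not 0 < keep_prob <= 1:
        raise ValueError("keep_prob must be in (0, 1], got %g" % keep_prob)
    rng = random.Random(seed)
    out = []
    for row in x:
        keep = rng.random() < keep_prob  # single decision for the whole row
        scale = 1.0 / keep_prob if keep else 0.0
        out.append([v * scale for v in row])
    return out

x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]
y = dropout_rows(x, keep_prob=0.5, seed=0)
for row_in, row_out in zip(x, y):
    # Each row is either all zeros or the input row scaled by 1 / keep_prob.
    print(row_in, "->", row_out)
```

With the default `noise_shape` (the full shape of `x`), every element would instead get its own independent decision.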

How dropout works, based on the TensorFlow source:

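1) keep_prob is the probability that a neuron's output is kept. If keep_prob = 1, every output is kept and x is returned unchanged, as the following code shows: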

    # Early return if nothing needs to be dropped.
    if isinstance(keep_prob, float) and keep_prob == 1:
      return x
    if context.executing_eagerly():
      if isinstance(keep_prob, ops.EagerTensor):
        if keep_prob.numpy() == 1:
          return x
    else:
      keep_prob = ops.convert_to_tensor(
          keep_prob, dtype=x.dtype, name="keep_prob")
      keep_prob.get_shape().assert_is_compatible_with(tensor_shape.scalar())

      # Do nothing if we know keep_prob == 1
      if tensor_util.constant_value(keep_prob) == 1:
        return x

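2) If keep_prob is less than 1, some neuron outputs are dropped. To keep the network's overall output unaffected, each surviving output is scaled up by 1 / keep_prob, so the expected output equals the undropped one: E[y] = keep_prob * (x / keep_prob) + (1 - keep_prob) * 0 = x. This keeps training and testing results consistent. In other words, y = y / keep_prob for the kept neurons, as the following code shows: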

    # 0. if [keep_prob, 1.0) and 1. if [1.0, 1.0 + keep_prob)
    binary_tensor = math_ops.floor(random_tensor)
    ret = math_ops.div(x, keep_prob) * binary_tensor
    if not context.executing_eagerly():
      ret.set_shape(x.get_shape())
    return ret
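
The floor trick and the 1 / keep_prob scaling can be checked numerically. The following is a minimal sketch in plain Python that mirrors the logic of the excerpt on a flat list; it is an illustration under that simplifying assumption, not TensorFlow's implementation, and the function here is a stand-in that shares only the name `dropout`.

```python
import math
import random

def dropout(x, keep_prob, seed=None):
    """Inverted dropout on a flat list, mirroring the floor trick above."""
    if not 0 < keep_prob <= 1:
        raise ValueError("keep_prob must be in (0, 1], got %g" % keep_prob)
    if keep_prob == 1:
        return list(x)  # nothing to drop
    rng = random.Random(seed)
    out = []
    for v in x:
        # keep_prob + U[0, 1) lies in [keep_prob, 1 + keep_prob);
        # its floor is 1 with probability keep_prob and 0 otherwise.
        binary = math.floor(keep_prob + rng.random())
        out.append(v / keep_prob * binary)
    return out

x = [1.0] * 100000
y = dropout(x, keep_prob=0.8, seed=42)
kept_fraction = sum(1 for v in y if v != 0) / len(y)
mean_output = sum(y) / len(y)
print("kept fraction:", round(kept_fraction, 3))  # close to keep_prob
print("mean output:", round(mean_output, 3))      # close to the input mean of 1.0
```

Because the surviving values are scaled to 1.25 while about 20% are zeroed, the mean of the output stays near the mean of the input.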

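3) When using dropout, always distinguish between training and testing. Training with dropout aims to learn many different small-scale feature extractors, whereas testing needs to combine all of them to obtain the overall features, so nothing should be dropped at test time.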

Define the placeholder:
keep_prob = tf.placeholder(tf.float32)  
When running the training optimizer:
sess.run(train_step, feed_dict={xs: X_train, ys: y_train, keep_prob: 0.5})  

When running only the forward pass, without updating parameters:
train_result = sess.run(merged, feed_dict={xs: X_train, ys: y_train, keep_prob: 1})  
test_result = sess.run(merged, feed_dict={xs: X_test, ys: y_test, keep_prob: 1})
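
The effect of feeding a different keep_prob for training and evaluation can be seen in a small stand-alone sketch. It uses plain Python rather than a TensorFlow session, and the one-layer `forward` model, its weights, and its inputs are hypothetical values chosen only for illustration.

```python
import random

def dropout(x, keep_prob, rng):
    """Inverted dropout on a flat list; identity when keep_prob == 1."""
    if keep_prob == 1:
        return list(x)
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in x]

def forward(x, weights, keep_prob, rng):
    """Hypothetical one-layer model: dropout on the inputs, then a weighted sum."""
    dropped = dropout(x, keep_prob, rng)
    return sum(w * v for w, v in zip(weights, dropped))

rng = random.Random(0)
x = [1.0, 2.0, 3.0, 4.0]
weights = [0.5, -0.25, 0.1, 0.2]

# Training: keep_prob = 0.5, so a fresh random mask is drawn on every call.
train_outputs = [forward(x, weights, 0.5, rng) for _ in range(5)]

# Evaluation: keep_prob = 1, so nothing is dropped and the output is deterministic.
eval_outputs = [forward(x, weights, 1, rng) for _ in range(5)]

print("training:", train_outputs)
print("evaluation:", eval_outputs)
```

Reporting metrics with keep_prob = 1, as in the session calls above, avoids measuring the model through a random mask.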
