Batch Normalization and Binarized Neural Networks

阿新 • Published: 2018-01-18


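1. Why use BN to normalize the data

　　a) Training a neural network is essentially learning the data distribution. If the training data and the test data follow different distributions, the network's generalization ability drops sharply;
　　b) In addition, if every batch of training data has a different distribution (mini-batch gradient descent), the network has to adapt to a new distribution at every iteration, which greatly slows down training.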

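2. BN overview

　　a) Essence. A normalization layer is inserted at the input of every layer of the network: the data are normalized first and only then passed on to the next layer. The BN layer sits after the pre-activation X = WU + B has been computed and before the nonlinear transformation.
　　b) Whitening as data preprocessing
　　After true whitening the data satisfy two conditions: (a) the correlation between features is reduced, which is equivalent to PCA; (b) the mean and standard deviation are normalized, so that every feature dimension has mean 0 and standard deviation 1.
　　However, making whitening satisfy both conditions is computationally very expensive.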
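To make the cost of true whitening concrete, here is a minimal NumPy sketch of PCA whitening. The function name `pca_whiten` and the toy data are illustrative assumptions, not part of the original post. The eigendecomposition of the feature covariance matrix is the expensive step, and it would have to be repeated for every layer at every training step.

```python
import numpy as np

def pca_whiten(X, eps=1e-5):
    """Whiten X of shape (n_samples, n_features): decorrelate and scale to unit variance."""
    Xc = X - X.mean(axis=0)                    # zero-mean features
    cov = Xc.T @ Xc / Xc.shape[0]              # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # cost grows cubically with d
    return (Xc @ eigvecs) / np.sqrt(eigvals + eps)

# Toy data with correlated features (hypothetical mixing matrix).
rng = np.random.default_rng(0)
mix = np.array([[2.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 0.7]])
X = rng.normal(size=(1000, 3)) @ mix
Xw = pca_whiten(X)
cov_w = Xw.T @ Xw / Xw.shape[0]                # close to the identity matrix
```

BN avoids this cost by normalizing each feature dimension independently, which gives up the decorrelation in condition (a).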

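3. Core idea of the BN algorithm

　　a) Normalization formula (pseudo-whitening)

　　　　[Image: normalization formula]

　　It can be verified that the normalized values x̂(k) given by this formula all have mean 0 and variance 1; if the activations are roughly Gaussian, they approximately follow the standard normal distribution N(0, 1).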
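Assuming the formula referred to here is the standard per-dimension normalization from the Batch Normalization paper (Ioffe and Szegedy, 2015), it reads as follows, with the expectation and variance taken over the training mini-batch:

```latex
\hat{x}^{(k)} = \frac{x^{(k)} - \mathrm{E}\left[x^{(k)}\right]}{\sqrt{\mathrm{Var}\left[x^{(k)}\right]}}
```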

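　　b) Restoring the data distribution

　　The formula above forcibly normalizes the features learned by an intermediate layer, fixing the mean at 0 and the standard deviation at 1. This pushes the data into the middle (nearly linear) part of the sigmoid function and damages the features that layer has learned. The data are restored with the following formula:

　　　　　　[Image: scale-and-shift formula with learnable parameters γ and β]

　　It can be derived that y(k) = x(k) when γ(k) equals the standard deviation of x(k) and β(k) equals its mean.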
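Assuming the restoring formula is the usual scale-and-shift step of BN with learnable parameters γ and β, the identity y(k) = x(k) follows by substituting the standard deviation and mean of x(k) for the two parameters:

```latex
y^{(k)} = \gamma^{(k)} \hat{x}^{(k)} + \beta^{(k)}, \qquad
\gamma^{(k)} = \sqrt{\mathrm{Var}\left[x^{(k)}\right]}, \quad
\beta^{(k)} = \mathrm{E}\left[x^{(k)}\right]
\;\Longrightarrow\;
y^{(k)} = \sqrt{\mathrm{Var}\left[x^{(k)}\right]} \cdot
\frac{x^{(k)} - \mathrm{E}\left[x^{(k)}\right]}{\sqrt{\mathrm{Var}\left[x^{(k)}\right]}}
+ \mathrm{E}\left[x^{(k)}\right] = x^{(k)}
```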

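　　The forward-propagation formula of a BN network is then:

　　　[Image: BN forward-propagation formula]

　　If the normalized value x̂ follows the standard normal distribution, then y, being a linear transformation of x̂, is still normally distributed, with mean β (beta) and standard deviation γ (gamma), i.e. variance γ².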
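The forward pass can be sketched in NumPy as below. This is a training-time sketch that uses mini-batch statistics only. The function name `batch_norm_forward`, the small constant `eps` added for numerical stability, and the sample values of γ and β are assumptions for illustration.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale by gamma and shift by beta."""
    mu = x.mean(axis=0)                        # per-feature mini-batch mean
    var = x.var(axis=0)                        # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)      # mean 0, variance ~1
    return gamma * x_hat + beta                # mean beta, standard deviation ~gamma

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(512, 4))
gamma = np.array([0.5, 1.0, 2.0, 3.0])         # hypothetical learned scales
beta = np.array([-1.0, 0.0, 1.0, 2.0])         # hypothetical learned shifts
y = batch_norm_forward(x, gamma, beta)
```

Each output column has mean β and standard deviation approximately γ, which matches the statement above that the variance of y is γ².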

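4. Advantages of using BN

　　a) Fast training convergence. A relatively large initial learning rate can be used. The learning rate, parameter initialization, weight-decay coefficient, dropout ratio and so on no longer need such painstaking, gradual tuning.

　　b) No need to choose dropout and L2-regularization parameters to fight overfitting. With BN these two can be removed, or a smaller L2 constraint can be used, because BN tends to improve the network's generalization ability;

　　c) No need for local response normalization (LRN) layers, since BN already normalizes the data.

　　d) The training data can be shuffled thoroughly during training (to prevent the same sample from being picked again and again within the batches); the literature reports that this improves accuracy by 1%.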

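4. The idea behind the BNN model

　　From the first layer through to the last, all weights are binarized parameters, except that the connection weights and activations from the last hidden layer to the output layer are real-valued. Binarize denotes the binarization function.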

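4.1 Forward propagation

　　The operations performed at a neuron are: binarize the connection weights -> multiply the weights by the input -> apply BatchNorm (Batch Normalization) to obtain this layer's activation ak -> binarize ak. In other words, all values used in the hidden-layer computation are binarized results.

　　　　　　[Image: BNN forward-propagation algorithm]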
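The four steps of the forward pass can be sketched for a single fully connected hidden layer as follows. All function names here are hypothetical, and the sketch assumes the deterministic sign binarization and training-time batch statistics.

```python
import numpy as np

def binarize(x):
    """Deterministic binarization: +1 where x >= 0, otherwise -1."""
    return np.where(x >= 0, 1.0, -1.0)

def batch_norm(s, gamma, beta, eps=1e-5):
    """Per-feature batch normalization using mini-batch statistics."""
    mu = s.mean(axis=0)
    var = s.var(axis=0)
    return gamma * (s - mu) / np.sqrt(var + eps) + beta

def bnn_hidden_forward(a_prev_b, W, gamma, beta):
    W_b = binarize(W)                  # 1) binarize the real-valued weights
    s = a_prev_b @ W_b                 # 2) multiply binarized weights by the input
    a = batch_norm(s, gamma, beta)     # 3) BatchNorm gives the activation a_k
    return binarize(a)                 # 4) binarize a_k for the next layer

rng = np.random.default_rng(0)
a_prev_b = binarize(rng.normal(size=(8, 16)))   # binarized input from the previous layer
W = rng.uniform(-1.0, 1.0, size=(16, 4))        # real-valued weights kept for the update step
out = bnn_hidden_forward(a_prev_b, W, gamma=np.ones(4), beta=np.zeros(4))
```

Every entry of `out` is either -1 or +1, so the next hidden layer again sees only binary inputs.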

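4.2 Backward propagation

　　　　[Image: BNN backward-propagation algorithm]

　　First compute the error of layer k+1:

　　　　[Image: error of layer k+1]

　　Then compute the gradient through the binarization operation and, following the chain rule, the gradient of the BN layer and the gradient with respect to the binarized W.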

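4.3 Parameter update

　　　　[Image: parameter-update algorithm]

　　The parameters are updated with the gradients computed above. Note that the gradient for the weights (W) is taken with respect to the binarized weights, whereas the update itself is applied to the real-valued weights using that gradient.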
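A minimal sketch of this update rule is given below. Plain SGD and the clipping of the real-valued weights to [-1, 1] after each step are assumptions taken from the usual BNN training recipe, not something stated in the text above. The gradient values are made up for illustration.

```python
import numpy as np

def binarize(x):
    """Deterministic binarization: +1 where x >= 0, otherwise -1."""
    return np.where(x >= 0, 1.0, -1.0)

def update_weights(W_real, grad_wrt_W_b, lr=0.01):
    """Apply the gradient taken w.r.t. the binarized weights to the real-valued weights."""
    W_new = W_real - lr * grad_wrt_W_b
    return np.clip(W_new, -1.0, 1.0)   # assumption: keep real weights inside [-1, 1]

W_real = np.array([0.995, -0.2, 0.004, -0.999])
grad = np.array([-1.0, 0.5, 1.0, 1.0])   # hypothetical gradient w.r.t. binarize(W_real)
W_next = update_weights(W_real, grad)
W_next_b = binarize(W_next)              # binarized weights used in the next forward pass
```

The third weight shows why the real-valued copy matters: a small step moves it from 0.004 to -0.006, which flips its binarized value from +1 to -1, while the other binarized weights stay unchanged.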

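5. Binarization methods

　　There are two binarization methods:

　　　　[Image: the two binarization functions]

　　The function clip(x, min, max) restricts the data to the range between min and max: values smaller than min become min, and values larger than max become max.

　　Because generating random numbers on a computer is time-consuming and the original motivation is acceleration, the first method, which needs no random numbers, is generally the one implemented.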
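The two methods can be sketched as follows, assuming they are the deterministic sign function and the stochastic variant from the BNN paper (Courbariaux et al., 2016), where the probability of +1 is the "hard sigmoid" clip((x + 1) / 2, 0, 1). The function names are illustrative.

```python
import numpy as np

def hard_sigmoid(x):
    """clip((x + 1) / 2, 0, 1): probability of producing +1 in the stochastic method."""
    return np.clip((x + 1.0) / 2.0, 0.0, 1.0)

def binarize_deterministic(x):
    """Method 1: +1 where x >= 0, otherwise -1. No random numbers needed."""
    return np.where(x >= 0, 1.0, -1.0)

def binarize_stochastic(x, rng):
    """Method 2: +1 with probability hard_sigmoid(x), otherwise -1."""
    p = hard_sigmoid(x)
    return np.where(rng.random(x.shape) < p, 1.0, -1.0)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
rng = np.random.default_rng(0)
p = hard_sigmoid(x)
det = binarize_deterministic(x)
sto = binarize_stochastic(x, rng)   # one random draw per element on every call
```

The stochastic method draws a fresh random number for every element on every forward pass, which is the cost the paragraph above refers to.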

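　　However, the derivative of the first method's function is 0 almost everywhere, so gradients cannot be back-propagated through it. In addition, gradients have an accumulation effect: every gradient carries some noise, which is generally assumed to be normally distributed, so gradients have to be accumulated many times before the noise averages out.

　　A simple modification of the first method's function is therefore used:

　　　　[Image: modified binarization function]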
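One common form of this modification, assumed here to be the one intended, is the straight-through estimator from the BNN paper: the forward pass still uses the sign function, while the backward pass uses the gradient of Htanh(x) = clip(x, -1, 1), so the incoming gradient passes through unchanged where |x| <= 1 and is zeroed elsewhere. The function names are hypothetical.

```python
import numpy as np

def sign_forward(r):
    """Forward pass: deterministic binarization of the real value r."""
    return np.where(r >= 0, 1.0, -1.0)

def sign_backward_ste(r, grad_out):
    """Backward pass: pass the gradient through where |r| <= 1, cancel it elsewhere."""
    return grad_out * (np.abs(r) <= 1.0)

r = np.array([-1.5, -0.3, 0.0, 0.8, 2.0])
q = sign_forward(r)                       # binarized values used in the forward pass
grad_out = np.ones_like(r)                # hypothetical gradient arriving from the next layer
grad_in = sign_backward_ste(r, grad_out)  # gradient handed back to the previous layer
```

Cancelling the gradient for |r| > 1 stops values that are already saturated from being pushed further.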

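　　In the forward pass, binarizing the weights and activations is equivalent to injecting noise into the network's parameters, which can improve the network's resistance to overfitting. It can also be viewed as a variant of dropout: dropout sets half of the activations to 0, which creates a degree of sparsity, while binarization additionally sets the other half to 1, so it can be seen as a further form of dropout.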
