Deep Residual Network Study Notes (Part 2)
From the previous attempt at reproducing ResNet on Cifar10 we obtained the table above, whose last column lists the results from the paper; the best initialization method (MSRA) already comes very close to the paper's numbers. This time we follow the experimental setup of the paper exactly and reproduce the results reported in the ResNet paper.
The previous reproduction differed from the original paper in two main respects:
1.Data Augmentation
The images in Cifar10 are 32x32. The paper pads each training image with 4 pixels on every side, giving 40x40 images, and randomly crops a 32x32 patch from it for training (in Caffe this random crop is what the data layer's crop_size does in the TRAIN phase), while the test set is left untouched. The script below builds the padded training LMDB:
import lmdb
import cv2
import caffe
from caffe.proto import caffe_pb2

# Source LMDB: the original 32x32 training images.
env1 = lmdb.open('cifar10_train_lmdb')
txn1 = env1.begin()
cursor = txn1.cursor()
datum = caffe_pb2.Datum()

# Destination LMDB: the 40x40 padded training images.
env2 = lmdb.open('cifar10_pad4_train_lmdb', map_size=50000*1000*10)
txn2 = env2.begin(write=True)

count = 0
for key, value in cursor:
    datum.ParseFromString(value)
    label = datum.label
    data = caffe.io.datum_to_array(datum)
    img = data.transpose(1, 2, 0)   # CHW -> HWC for OpenCV
    pad = cv2.copyMakeBorder(img, 4, 4, 4, 4, cv2.BORDER_REFLECT)
    array = pad.transpose(2, 0, 1)  # HWC -> CHW for Caffe
    datum1 = caffe.io.array_to_datum(array, label)
    str_id = '{:08}'.format(count)
    txn2.put(str_id, datum1.SerializeToString())
    count += 1
    if count % 1000 == 0:
        print('already handled with {} pictures'.format(count))
        # Commit periodically and open a fresh write transaction.
        txn2.commit()
        txn2 = env2.begin(write=True)

txn2.commit()
env2.close()
env1.close()
The script is straightforward; the key line is this one:
pad=cv2.copyMakeBorder(img,4,4,4,4,cv2.BORDER_REFLECT)
cv2's copyMakeBorder extends each side by 4 pixels. After the script finishes you will have cifar10_pad4_train_lmdb locally. Note that the mean file also needs to be regenerated from this padded training data.
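To regenerate the mean file, Caffe's own build/tools/compute_image_mean tool can be pointed at the new LMDB; as a rough pycaffe alternative, here is a minimal sketch (the output name cifar10_pad4_mean.binaryproto is just an assumption):

import numpy as np
import lmdb
import caffe
from caffe.proto import caffe_pb2

env = lmdb.open('cifar10_pad4_train_lmdb', readonly=True)
mean = None
count = 0
with env.begin() as txn:
    for _, value in txn.cursor():
        datum = caffe_pb2.Datum()
        datum.ParseFromString(value)
        img = caffe.io.datum_to_array(datum).astype(np.float64)  # (C, H, W)
        mean = img if mean is None else mean + img
        count += 1
mean /= count
blob = caffe.io.array_to_blobproto(mean[np.newaxis, :, :, :])    # shape (1, C, H, W)
with open('cifar10_pad4_mean.binaryproto', 'wb') as f:
    f.write(blob.SerializeToString())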
2.Different Shortcut Structure
This is a partial screenshot of the network structure rendered with ethereon's netscope; the red numbers indicate how many convolution filters each layer has. Once the number of filters doubles, the shortcut can no longer be added element-wise directly, so the original paper pads the extra channels with zeros. The structure is as shown above: an average pooling layer first halves the feature map size, and a PadChannel layer then appends 16 all-zero filter channels. We will call this approach the zero-padding method; a small numpy illustration follows.
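To make the shapes concrete, here is a tiny numpy sketch of what the zero-padding shortcut does at the first stage transition (the 2x2 average pooling is an assumed stand-in for the pooling layer in the prototxt, which is not shown here):

import numpy as np

x = np.random.randn(1, 16, 32, 32)                         # (N, C, H, W): first-stage output, 16 filters at 32x32
pooled = x.reshape(1, 16, 16, 2, 16, 2).mean(axis=(3, 5))  # 2x2 average pooling, stride 2 -> (1, 16, 16, 16)
zeros = np.zeros((1, 16, 16, 16))                          # the 16 all-zero channels added by PadChannel
shortcut = np.concatenate([pooled, zeros], axis=1)         # (1, 32, 16, 16): matches the 32-filter residual branch
print(shortcut.shape)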
Following the official instructions for adding a new layer to Caffe, four files are involved:
1.pad_channel_layer.hpp, added to include/caffe/layers:
#ifndef CAFFE_PAD_CHANNEL_LAYER_HPP_
#define CAFFE_PAD_CHANNEL_LAYER_HPP_

#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

namespace caffe {

/*
 * @brief zero-padding channel to extend number of channels
 *
 * Note: Back-propagate just drop the pad derivatives
 */
template <typename Dtype>
class PadChannelLayer : public Layer<Dtype> {
 public:
  explicit PadChannelLayer(const LayerParameter& param)
      : Layer<Dtype>(param) {}
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  virtual inline const char* type() const { return "PadChannel"; }
  virtual inline int ExactNumBottomBlobs() const { return 1; }
  virtual inline int ExactNumTopBlobs() const { return 1; }

 protected:
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);

  int num_channels_to_pad_;
};

}  // namespace caffe

#endif  // CAFFE_PAD_CHANNEL_LAYER_HPP_
2.pad_channel_layer.cpp, added to src/caffe/layers:
#include "caffe/layers/pad_channel_layer.hpp"

namespace caffe {

template <typename Dtype>
void PadChannelLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  CHECK_NE(top[0], bottom[0]) << this->type() << " Layer does not "
      "allow in-place computation.";
  num_channels_to_pad_ = this->layer_param_.pad_channel_param().num_channels_to_pad();
  CHECK_GT(num_channels_to_pad_, 0) << "num channels to pad must be greater than 0!";
}

template <typename Dtype>
void PadChannelLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // The top blob has the same shape as the bottom blob, plus num_channels_to_pad_ channels.
  vector<int> top_shape = bottom[0]->shape();
  top_shape[1] += num_channels_to_pad_;
  top[0]->Reshape(top_shape);
}

template <typename Dtype>
void PadChannelLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  const Dtype* bottom_data = bottom[0]->cpu_data();
  Dtype* top_data = top[0]->mutable_cpu_data();
  int num = bottom[0]->num();
  int channels = bottom[0]->channels();
  int dim = bottom[0]->height() * bottom[0]->width();
  int channel_by_dim = channels * dim;
  for (int n = 0; n < num; n++) {
    // Copy the original channels, then fill the padded channels with zeros.
    caffe_copy(channel_by_dim, bottom_data, top_data);
    bottom_data += channel_by_dim;
    top_data += channel_by_dim;
    caffe_set(num_channels_to_pad_ * dim, Dtype(0), top_data);
    top_data += num_channels_to_pad_ * dim;
  }
}

template <typename Dtype>
void PadChannelLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  const Dtype* top_diff = top[0]->cpu_diff();
  Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
  int num = bottom[0]->num();
  int channels = bottom[0]->channels();
  int dim = bottom[0]->height() * bottom[0]->width();
  int channel_by_dim = channels * dim;
  for (int n = 0; n < num; n++) {  // just drop the derivatives of the padded part.
    caffe_copy(channel_by_dim, top_diff, bottom_diff);
    top_diff += (channels + num_channels_to_pad_) * dim;
    bottom_diff += channel_by_dim;
  }
}

INSTANTIATE_CLASS(PadChannelLayer);
REGISTER_LAYER_CLASS(PadChannel);

}  // namespace caffe
3.pad_channel_layer.cu, added to src/caffe/layers:
#include "caffe/layers/pad_channel_layer.hpp"

namespace caffe {

// Forward: one element per thread; copy the bottom data into the first
// src_channels channels of the top blob and write zeros into the padded channels.
template <typename Dtype>
__global__ void pad_forward_kernel(const int dst_count, const int src_channels,
    const int dst_channels, const int dim, const Dtype* src, Dtype* dst) {
  CUDA_KERNEL_LOOP(index, dst_count) {
    int num = index / (dim * dst_channels);
    int dst_c = index / dim % dst_channels;
    int pixel_pos = index % dim;
    if (dst_c < src_channels)
      dst[index] = src[num * src_channels * dim + dst_c * dim + pixel_pos];
    else
      dst[index] = Dtype(0);
  }
}

template <typename Dtype>
void PadChannelLayer<Dtype>::Forward_gpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  const Dtype* bottom_data = bottom[0]->gpu_data();
  Dtype* top_data = top[0]->mutable_gpu_data();
  int src_channels = bottom[0]->channels();
  int dim = bottom[0]->height() * bottom[0]->width();
  int dst_channels = src_channels + num_channels_to_pad_;
  const int dst_count = top[0]->count();
  pad_forward_kernel<Dtype><<<CAFFE_GET_BLOCKS(dst_count), CAFFE_CUDA_NUM_THREADS>>>(
      dst_count, src_channels, dst_channels, dim, bottom_data, top_data);
  CUDA_POST_KERNEL_CHECK;
}

// Backward: copy the gradient of the original channels back to the bottom blob;
// the gradient of the padded channels is simply dropped.
template <typename Dtype>
__global__ void pad_backward_kernel(const int bottom_count, const int bottom_channels,
    const int top_channels, const int dim, const Dtype* top, Dtype* bottom) {
  CUDA_KERNEL_LOOP(index, bottom_count) {
    int num = index / (dim * bottom_channels);
    int bottom_c = index / dim % bottom_channels;
    int pixel_pos = index % dim;
    bottom[index] = top[num * top_channels * dim + bottom_c * dim + pixel_pos];
  }
}

template <typename Dtype>
void PadChannelLayer<Dtype>::Backward_gpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  const Dtype* top_diff = top[0]->gpu_diff();
  Dtype* bottom_diff = bottom[0]->mutable_gpu_diff();
  int bottom_count = bottom[0]->count();
  int bottom_channels = bottom[0]->channels();
  int dim = bottom[0]->height() * bottom[0]->width();
  int top_channels = bottom_channels + num_channels_to_pad_;
  pad_backward_kernel<Dtype><<<CAFFE_GET_BLOCKS(bottom_count), CAFFE_CUDA_NUM_THREADS>>>(
      bottom_count, bottom_channels, top_channels, dim, top_diff, bottom_diff);
  CUDA_POST_KERNEL_CHECK;
}

INSTANTIATE_LAYER_GPU_FUNCS(PadChannelLayer);

}  // namespace caffe
4.Add the corresponding message to caffe.proto:
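The proto snippet itself is not reproduced in this post; below is a minimal sketch of what it needs to contain, inferred from the pad_channel_param().num_channels_to_pad() call in the layer code. The message name, field type and IDs here are assumptions, so pick an unused ID for the LayerParameter field in your own copy of caffe.proto.

// Inside message LayerParameter (the ID 150 below is only a placeholder):
// optional PadChannelParameter pad_channel_param = 150;

message PadChannelParameter {
  // Number of all-zero channels appended after the existing channels.
  optional uint32 num_channels_to_pad = 1 [default = 0];
}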
After recompiling Caffe once, the PadChannel layer is ready to use. A quick sketch of how the zero-padding shortcut can then be declared from pycaffe follows.
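For example, with pycaffe's NetSpec the zero-padding shortcut can be written roughly as follows (a sketch only: the 2x2/stride-2 average pooling and the blob names are assumptions, and it presumes the proto change from step 4 has been compiled in):

from caffe import layers as L, params as P

def zero_padding_shortcut(bottom, channels_to_pad):
    # Halve the spatial size so the shortcut matches the stride-2 residual branch.
    pool = L.Pooling(bottom, pool=P.Pooling.AVE, kernel_size=2, stride=2)
    # Append all-zero feature maps so the channel counts match again.
    return L.PadChannel(pool,
                        pad_channel_param=dict(num_channels_to_pad=channels_to_pad))

# Usage inside a residual block (n is a caffe.NetSpec; blob names are illustrative):
# n.shortcut = zero_padding_shortcut(n.stage1_out, 16)
# n.sum = L.Eltwise(n.shortcut, n.stage2_branch, operation=P.Eltwise.SUM)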
Next, let's look at the other shortcut structure:
The paper calls this structure projection (i.e. option B): a 1x1 convolution is used to increase the number of channels, at the cost of introducing extra parameters. A sketch of this variant follows as well.
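A matching NetSpec sketch of the projection shortcut (again just a sketch: the msra weight filler, bias_term=False, and whether a BatchNorm/Scale pair follows the convolution are assumptions, since the author's prototxt is not reproduced here):

from caffe import layers as L

def projection_shortcut(bottom, num_output):
    # 1x1 convolution with stride 2: halves the spatial size and raises the
    # channel count, but adds learnable parameters to the shortcut path.
    return L.Convolution(bottom, kernel_size=1, stride=2, pad=0,
                         num_output=num_output, bias_term=False,
                         weight_filler=dict(type='msra'))

# e.g. n.shortcut = projection_shortcut(n.stage1_out, 32)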
3.Reproducing the Experiments
Now that everything is in place, let's fully reproduce the ResNet results on Cifar10.
The hyper-parameter settings are as follows:
weight_decay = 0.0001, momentum = 0.9
batch_size = 128
learning_rate = 0.1, divided by 10 at 32k iterations (0.01) and again at 48k (0.001)
max_iter = 64k
When the depth reaches 110 layers, to speed up convergence we first set the learning_rate to 0.01 and, after 400 iterations, switch it back to 0.1 and train as usual. These settings map onto a Caffe solver as sketched below.
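A minimal pycaffe sketch of such a solver (the net file name, snapshot and test settings are assumptions; the 110-layer warm-up is handled by hand as described above and not shown):

from caffe.proto import caffe_pb2

s = caffe_pb2.SolverParameter()
s.net = 'resnet_cifar10_train_test.prototxt'  # hypothetical file name
s.test_iter.append(100)                       # 10000 test images / test batch of 100 (assumption)
s.test_interval = 1000                        # assumption
s.base_lr = 0.1
s.lr_policy = 'multistep'
s.gamma = 0.1
s.stepvalue.extend([32000, 48000])            # 0.1 -> 0.01 at 32k, -> 0.001 at 48k
s.max_iter = 64000
s.momentum = 0.9
s.weight_decay = 0.0001
s.snapshot = 16000                            # assumption
s.snapshot_prefix = 'resnet_cifar10'          # assumption
s.solver_mode = caffe_pb2.SolverParameter.GPU
with open('solver.prototxt', 'w') as f:
    f.write(str(s))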
A stands for zero-padding and B for projection. Option A's results essentially match the paper, as expected since that is the method the paper uses, while Option B gives the best results of all; the projection shortcut therefore outperforms zero-padding here.
Here are a few of the training curves:
First, zero-padding:
Then, projection:
(Ignore the 164-layer result for now.)
4.Summary
With this, the reproduction of ResNet on Cifar10 is complete: we have faithfully reproduced the results in the paper, and even obtained results better than those reported.