Learning Caffe: Creating a Custom New Layer
The Caffe source tree already ships with a wide variety of layers, but sometimes the existing layers cannot satisfy the requirements of the network you are designing, and you need to define a new layer yourself. This post, drawing on the reference here, gives a brief walkthrough. The concrete steps are as follows:
I. Create the .hpp file
1. Add your layer header under include/caffe/layers/, e.g. include/caffe/layers/your_layer.hpp.
2. Make your_layer inherit from one of the base classes declared in layer.hpp, common_layers.hpp, data_layers.hpp, loss_layers.hpp, neuron_layers.hpp, or vision_layers.hpp.
3. Override the function
virtual inline const char* type() const { return "YourLayerName"; }
so that when you write layer { type: "YourLayerName" } in net.prototxt, the type string has a class to match.
4. Override the {*}Blobs() methods (ExactNumBottomBlobs, ExactNumTopBlobs, MinBottomBlobs, and so on, declared in layer.hpp) as your layer requires, to constrain the number of bottom and top blobs. For example, overriding
virtual inline int ExactNumBottomBlobs() const { return 1; }
restricts the layer to exactly one bottom blob.
5. Declare
virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top);
virtual void Reshape(const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top);
virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top);
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top, const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
6. If GPU acceleration is needed, also declare:
virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top);
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top, const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
7. Anything else (other functions and member variables your algorithm needs).
You can find many corresponding examples under include/caffe/layers/, for example inner_product_layer.hpp:
#ifndef CAFFE_INNER_PRODUCT_LAYER_HPP_
#define CAFFE_INNER_PRODUCT_LAYER_HPP_

#include <vector>

#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

namespace caffe {

/**
 * @brief Also known as a "fully-connected" layer, computes an inner product
 *        with a set of learned weights, and (optionally) adds biases.
 *
 * TODO(dox): thorough documentation for Forward, Backward, and proto params.
 */
template <typename Dtype>
class InnerProductLayer : public Layer<Dtype> {
 public:
  explicit InnerProductLayer(const LayerParameter& param)
      : Layer<Dtype>(param) {}
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  virtual inline const char* type() const { return "InnerProduct"; }
  virtual inline int ExactNumBottomBlobs() const { return 1; }
  virtual inline int ExactNumTopBlobs() const { return 1; }

 protected:
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);

  int M_;
  int K_;
  int N_;
  bool bias_term_;
  Blob<Dtype> bias_multiplier_;
  bool transpose_;  ///< if true, assume transposed weights
};

}  // namespace caffe

#endif  // CAFFE_INNER_PRODUCT_LAYER_HPP_
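Putting steps 1 through 7 together, a minimal header skeleton for a hypothetical custom layer might look like the sketch below (the class name MyNewLayer, the type string "MyNew", and the include guard are illustrative names, not existing Caffe code):

#ifndef CAFFE_MY_NEW_LAYER_HPP_
#define CAFFE_MY_NEW_LAYER_HPP_

#include <vector>

#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

namespace caffe {

// Hypothetical skeleton of a one-bottom, one-top layer (steps 1-7 above).
template <typename Dtype>
class MyNewLayer : public Layer<Dtype> {
 public:
  explicit MyNewLayer(const LayerParameter& param)
      : Layer<Dtype>(param) {}
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  // Step 3: matches layer { type: "MyNew" } in net.prototxt.
  virtual inline const char* type() const { return "MyNew"; }
  // Step 4: allow exactly one bottom blob and one top blob.
  virtual inline int ExactNumBottomBlobs() const { return 1; }
  virtual inline int ExactNumTopBlobs() const { return 1; }

 protected:
  // Step 5: CPU forward and backward passes.
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  // Step 6: optional GPU versions, implemented in the .cu file.
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
};

}  // namespace caffe

#endif  // CAFFE_MY_NEW_LAYER_HPP_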
II. Create the corresponding .cpp file
1. Add your source file under src/caffe/layers/, e.g. src/caffe/layers/your_layer.cpp.
2. Implement the LayerSetUp method (here you can read the layer's parameters, initialize the weights, and so on). It is called during Layer::SetUp and handles the layer's one-time initialization.
3. Implement the Reshape method, which adjusts the shape of the top blobs according to the shape of the bottom blobs, etc. It is also called during Layer::SetUp as part of the layer's initialization.
4. Implement the Forward_cpu and Backward_cpu methods: the forward pass computes the top blobs (and the loss, for loss layers), and the backward pass computes the diffs (gradients).
5. At the end of the file, add the following two lines (where XXXLayer is the layer's class name). They instantiate the class template and register the layer in layer_factory.hpp so that it can be created uniformly at run time:
INSTANTIATE_CLASS(XXXLayer);
REGISTER_LAYER_CLASS(XXX);
A concrete example can be found here; a minimal sketch follows.
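For orientation, here is a minimal sketch of what my_new_layer.cpp could look like for the hypothetical MyNewLayer declared above. It simply copies bottom to top (an identity layer) so that the overall structure, the registration macros, and the optional CPU_ONLY stub are visible; a real layer replaces the caffe_copy calls with its own math:

#include <vector>

#include "caffe/layers/my_new_layer.hpp"  // hypothetical header from above
#include "caffe/util/math_functions.hpp"

namespace caffe {

template <typename Dtype>
void MyNewLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // Read parameters from this->layer_param_ and initialize weights here.
}

template <typename Dtype>
void MyNewLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // Give the top blob the same shape as the bottom blob.
  top[0]->ReshapeLike(*bottom[0]);
}

template <typename Dtype>
void MyNewLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // Identity forward pass: top = bottom.
  caffe_copy(bottom[0]->count(), bottom[0]->cpu_data(),
             top[0]->mutable_cpu_data());
}

template <typename Dtype>
void MyNewLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  if (propagate_down[0]) {
    // Identity backward pass: bottom diff = top diff.
    caffe_copy(top[0]->count(), top[0]->cpu_diff(),
               bottom[0]->mutable_cpu_diff());
  }
}

#ifdef CPU_ONLY
STUB_GPU(MyNewLayer);
#endif

INSTANTIATE_CLASS(MyNewLayer);
REGISTER_LAYER_CLASS(MyNew);

}  // namespace caffe

Note that REGISTER_LAYER_CLASS(MyNew) registers the factory entry under the type string "MyNew", which is exactly what type() returns in the header.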
III. Create the .cu file
1. If you need GPU acceleration, create a .cu file under src/caffe/layers/, e.g. src/caffe/layers/your_layer.cu.
2. Implement the Forward_gpu and Backward_gpu methods in CUDA. The implementations mirror Forward_cpu and Backward_cpu in the .cpp file, with every cpu changed to gpu.
A concrete example can be found here; a matching sketch follows.
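Continuing the hypothetical identity layer, a minimal my_new_layer.cu might look like this. INSTANTIATE_LAYER_GPU_FUNCS is the macro Caffe provides for instantiating the GPU forward/backward functions; a real layer would launch its own CUDA kernels in place of the caffe_copy calls:

#include <vector>

#include "caffe/layers/my_new_layer.hpp"  // hypothetical header from above
#include "caffe/util/math_functions.hpp"

namespace caffe {

template <typename Dtype>
void MyNewLayer<Dtype>::Forward_gpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // Identity forward pass on the GPU: top = bottom.
  caffe_copy(bottom[0]->count(), bottom[0]->gpu_data(),
             top[0]->mutable_gpu_data());
}

template <typename Dtype>
void MyNewLayer<Dtype>::Backward_gpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  if (propagate_down[0]) {
    // Identity backward pass on the GPU: bottom diff = top diff.
    caffe_copy(top[0]->count(), top[0]->gpu_diff(),
               bottom[0]->mutable_gpu_diff());
  }
}

INSTANTIATE_LAYER_GPU_FUNCS(MyNewLayer);

}  // namespace caffe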
IV. Add a message for your_layer in caffe.proto
1. If you want to set your layer's parameters in net.prototxt, you need to define them in caffe.proto. Once defined, the parameter values can be read inside the forward or backward methods and used in the computation.
A simple example (InnerProductLayer) looks like this:
message InnerProductParameter {
  optional uint32 num_output = 1;  // The number of outputs for the layer
  optional bool bias_term = 2 [default = true];  // whether to have bias terms
  optional FillerParameter weight_filler = 3;  // The filler for the weight
  optional FillerParameter bias_filler = 4;  // The filler for the bias

  // The first axis to be lumped into a single inner product computation;
  // all preceding axes are retained in the output.
  // May be negative to index from the end (e.g., -1 for the last axis).
  optional int32 axis = 5 [default = 1];
  // Specify whether to transpose the weight matrix or not.
  // If transpose == true, any operations will be performed on the transpose
  // of the weight matrix. The weight matrix itself is not going to be transposed
  // but rather the transfer flag of operations will be toggled accordingly.
  optional bool transpose = 6 [default = false];
}
2. At the same time, add the corresponding field to message LayerParameter in caffe.proto, and update the comment that records the next available layer-specific ID, for example:
// LayerParameter next available layer-specific ID: 118 (last added: inner_product_param)
message LayerParameter {
  ...
  optional InnerProductParameter inner_product_param = 117;
  ...
}
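With both messages in place, the parameter can be set in net.prototxt and read back in the layer code. A usage sketch with the InnerProduct example from above (the layer and blob names are illustrative):

layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 10
  }
}

Inside the layer's methods (for instance LayerSetUp), the values are then available through this->layer_param_:

const int num_output = this->layer_param_.inner_product_param().num_output();
const bool bias_term = this->layer_param_.inner_product_param().bias_term();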
V. Compile
Finally, recompile the Caffe code:
CAFFE_ROOT$ make clean
CAFFE_ROOT$ make all