
Understanding the voc-fcn-alexnet Network Structure


1. Preface

FCN was the first work to use CNNs for semantic segmentation. Paper: Fully Convolutional Networks for Semantic Segmentation.

Implementation: https://github.com/shelhamer/fcn.berkeleyvision.org

Fully convolutional networks rely on three main techniques:

1. Convolutionalization (Convolutional)

2. Upsampling (Upsample)

3. Skip architecture (Skip Layer)

To make this easier to understand, I use the simplest structure, voc-fcn-alexnet, as the example. This network uses only the first two techniques and does not include the skip architecture.
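Convolutionalization deserves a quick illustration: a fully connected layer can be reinterpreted as a convolution whose kernel covers its entire input, and the trained weights carry over with a simple reshape (this is the net-surgery idea used to initialize fc6/fc7 in the FCN repo). Below is a minimal NumPy sketch of the idea with toy shapes; the names and sizes are made up for illustration (AlexNet's real fc6 would be 256 channels of 6x6 mapping to 4096 outputs):

import numpy as np

# Toy shapes: C-channel HxW input, OUT fully connected outputs.
C, H, W, OUT = 8, 6, 6, 16

fc_weights = np.random.randn(OUT, C * H * W)        # fully connected: (16, 288)
conv_weights = fc_weights.reshape(OUT, C, H, W)     # same weights as a (16, 8, 6, 6) conv kernel

# On an HxW input, both layers compute the same OUT numbers; on a larger
# input, the convolution slides and yields a spatial map of "fc" outputs.
x = np.random.randn(C, H, W)
fc_out = fc_weights @ x.reshape(-1)                  # (16,)
conv_out = np.einsum('ochw,chw->o', conv_weights, x) # the conv at its single valid position
assert np.allclose(fc_out, conv_out)

On an input exactly the size of the kernel, the two forms agree; on a larger input the convolution slides, turning a single classification into a coarse score map, which is what segmentation needs.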

2. The train.prototxt File of voc-fcn-alexnet

layer {
  name: "data"
  type: "Python"
  top: "data"
  top: "label"
  python_param {
    module: "voc_layers"
    layer: "SBDDSegDataLayer"
    param_str: "{\'sbdd_dir\': \'../data/sbdd/dataset\', \'seed\': 1337, \'split\': \'train\', \'mean\': (104.00699, 116.66877, 122.67892)}"
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96
    pad: 100
    kernel_size: 11
    group: 1
    stride: 4
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    stride: 1
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 1
    stride: 1
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    stride: 1
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    stride: 1
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "Convolution"
  bottom: "pool5"
  top: "fc6"
  convolution_param {
    num_output: 4096
    pad: 0
    kernel_size: 6
    group: 1
    stride: 1
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "Convolution"
  bottom: "fc6"
  top: "fc7"
  convolution_param {
    num_output: 4096
    pad: 0
    kernel_size: 1
    group: 1
    stride: 1
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "score_fr"
  type: "Convolution"
  bottom: "fc7"
  top: "score_fr"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 21
    pad: 0
    kernel_size: 1
  }
}
layer {
  name: "upscore"
  type: "Deconvolution"
  bottom: "score_fr"
  top: "upscore"
  param {
    lr_mult: 0
  }
  convolution_param {
    num_output: 21
    bias_term: false
    kernel_size: 63
    stride: 32
  }
}
layer {
  name: "score"
  type: "Crop"
  bottom: "upscore"
  bottom: "data"
  top: "score"
  crop_param {
    axis: 2
    offset: 18
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "score"
  bottom: "label"
  top: "loss"
  loss_param {
    ignore_label: 255
    normalize: true
  }
}

3. Network Structure

Assume the input image is 500x500:

(Figure: the voc-fcn-alexnet network structure.)

From the train.prototxt file we can derive the network structure shown in the figure above. In addition to keeping the first five convolutional layers, the network converts the last three fully connected layers of AlexNet into convolutional layers. score_fr, the last convolutional layer, produces the heatmap; the heatmap is the high-dimensional feature map we care about most, holding a coarse per-class score for every spatial location. Once the heatmap is obtained, the final and most important step is upsampling (i.e., deconvolution), enlarging the map back to the size of the original image.
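Note that upscore has lr_mult: 0, so the deconvolution filter is not learned in this net; the repo's solve.py initializes it to a bilinear interpolation kernel via surgery.interp(). Here is a sketch of that kernel, following the upsample_filt construction in the repo's surgery.py:

import numpy as np

def upsample_filt(size):
    # 2-D bilinear interpolation kernel of the given size
    # (same construction as surgery.py in the FCN repo).
    factor = (size + 1) // 2
    if size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))

# The 63x63, stride-32 "upscore" filter: each upsampled pixel is a
# bilinear blend of its coarse neighbours; lr_mult: 0 keeps it fixed.
k = upsample_filt(63)
print(k.shape, k.max())  # (63, 63), peak of 1.0 at the centre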

4. Loss Function

The network's loss function is SoftmaxWithLoss. First, a softmax is applied per pixel to obtain the probability that the pixel belongs to each class; since there are 21 classes in total, each pixel gets 21 probability values (the output has 21 channels). The loss is then the negative of the mean, over all pixels, of the log-probability assigned to each pixel's ground-truth class (pixels labeled 255 are ignored), as follows:

loss = -(1/N) * Σ_i log p_i(c_i)

where p_i(c_i) is the softmax probability that pixel i belongs to its ground-truth class c_i, and N is the number of pixels counted in the loss.
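To make the definition concrete, here is a NumPy sketch of the same computation (illustrative shapes; Caffe's own implementation differs in details such as its normalization options):

import numpy as np

def softmax_with_loss(scores, labels, ignore_label=255):
    # scores: (C, H, W) class scores per pixel (C = 21 for PASCAL VOC)
    # labels: (H, W) integer ground-truth classes
    # Per-pixel softmax over the channel axis (shifted for stability).
    shifted = scores - scores.max(axis=0, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=0, keepdims=True)

    valid = labels != ignore_label            # mask out ignored pixels
    h, w = np.nonzero(valid)
    true_probs = probs[labels[valid], h, w]   # p_i(c_i) for each valid pixel
    return -np.log(true_probs).mean()         # negative mean log-likelihood

# Toy usage: 21 classes on a 4x4 "image" with one ignored pixel.
rng = np.random.default_rng(0)
scores = rng.normal(size=(21, 4, 4))
labels = rng.integers(0, 21, size=(4, 4))
labels[0, 0] = 255
print(softmax_with_loss(scores, labels))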

