
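[Deep Learning: From Beginner to Cross-Dressing] FCN

This post briefly introduces the FCN model and reads through its Caffe source, layer by layer.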

For convolution (the division rounds down):

output = (input + 2 * padding - ksize) / stride + 1;

For deconvolution:

output = (input - 1) * stride + ksize - 2 * padding;
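
The two formulas above can be written as small helpers to check the sizes quoted in the rest of this post. This is only a sketch: the function names `conv_out` and `deconv_out` are made up for illustration and are not part of Caffe.

```python
def conv_out(size, ksize, stride=1, padding=0):
    """Output side length of a convolution (integer division rounds down)."""
    return (size + 2 * padding - ksize) // stride + 1


def deconv_out(size, ksize, stride=1, padding=0):
    """Output side length of a deconvolution (transposed convolution)."""
    return (size - 1) * stride + ksize - 2 * padding


print(conv_out(500, ksize=3, stride=1, padding=100))  # conv1_1 -> 698
print(deconv_out(16, ksize=4, stride=2))              # upscore2 -> 34
```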

fcn8s code:

layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param {
    # These dimensions are purely for sake of example;
    # see infer.py for how to reshape the net to the given input size.
    shape { dim: 1 dim: 3 dim: 500 dim: 500 }
  }
}

The input is 500*500*3.

layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    pad: 100
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}

The first convolution layer, conv1_1, has pad 100, so the padded input is 700*700*3.

Convolving with 64 kernels of size 3*3*3 gives an output of 698*698*64.

layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "relu1_2"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}

The second convolution layer, conv1_2, has pad 1, so the padded input is 700*700*64.

Convolving with 64 kernels of size 3*3*64 gives an output of 698*698*64.

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

Max pooling with kernel size 2 and stride 2 gives an output of 349*349*64.

conv2_1: num_output: 128 pad: 1 kernel_size: 3 stride: 1

Output is 349*349*128.

conv2_2: num_output: 128 pad: 1 kernel_size: 3 stride: 1

Output is 349*349*128.

pool2: max pooling, kernel size 2, stride 2. Output is 175*175*128 (pooling rounds up).
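
The "rounds up" note can be checked with a small helper. This is a minimal sketch that assumes the pooling layers use no padding, as in the prototxt shown here; `pool_out` is a name made up for illustration.

```python
import math


def pool_out(size, ksize=2, stride=2):
    """Output side length of a pooling layer without padding (rounds up)."""
    return math.ceil((size - ksize) / stride) + 1


# Sizes entering pool1 .. pool5 for a 500*500 input.
for size in (698, 349, 175, 88, 44):
    print(size, "->", pool_out(size))
# 698 -> 349
# 349 -> 175
# 175 -> 88
# 88 -> 44
# 44 -> 22
```

With rounding down instead, 349 would become 174 rather than 175, and every later size would shift.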

conv3_1, conv3_2, conv3_3: num_output: 256 pad: 1 kernel_size: 3 stride: 1

Output is 175*175*256.

pool3: max pooling, kernel size 2, stride 2. Output is 88*88*256 (pooling rounds up).

conv4_1, conv4_2, conv4_3: num_output: 512 pad: 1 kernel_size: 3 stride: 1

Output is 88*88*512.

pool4: max pooling, kernel size 2, stride 2. Output is 44*44*512 (pooling rounds up, although here the division is exact).

conv5_1, conv5_2, conv5_3: num_output: 512 pad: 1 kernel_size: 3 stride: 1

Output is 44*44*512.

pool5: max pooling, kernel size 2, stride 2. Output is 22*22*512 (pooling rounds up, although here the division is exact).
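
The whole backbone from conv1_1 to pool5 can be traced in a few lines. This is a sketch under the settings listed above (3*3 convolutions with pad 1, except conv1_1 with pad 100, and 2*2 max pooling with stride 2); the helper names are made up for illustration.

```python
import math


def conv_out(size, ksize, stride=1, padding=0):
    return (size + 2 * padding - ksize) // stride + 1


def pool_out(size, ksize=2, stride=2):
    return math.ceil((size - ksize) / stride) + 1


def backbone_sizes(size):
    """Return the side length after each pooling stage."""
    sizes = {}
    convs_per_stage = [2, 2, 3, 3, 3]  # conv1_x .. conv5_x
    for stage, n_convs in enumerate(convs_per_stage, start=1):
        for i in range(n_convs):
            # Only conv1_1 uses pad 100; every other conv uses pad 1.
            padding = 100 if (stage == 1 and i == 0) else 1
            size = conv_out(size, ksize=3, padding=padding)
        size = pool_out(size)
        sizes[f"pool{stage}"] = size
    return sizes


print(backbone_sizes(500))
# {'pool1': 349, 'pool2': 175, 'pool3': 88, 'pool4': 44, 'pool5': 22}
```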

layer {
  name: "fc6"
  type: "Convolution"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 4096
    pad: 0
    kernel_size: 7
    stride: 1
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}

(The author was really lazy here: the network is adapted from VGG16, and the fully connected layers were not even renamed after being turned into convolution layers...)

Input is 22*22*512; output is 16*16*4096.

layer {
  name: "fc7"
  type: "Convolution"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 4096
    pad: 0
    kernel_size: 1
    stride: 1
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}

Input is 16*16*4096; output is 16*16*4096.

layer {
  name: "score_fr"
  type: "Convolution"
  bottom: "fc7"
  top: "score_fr"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 21
    pad: 0
    kernel_size: 1
  }
}

score_fr: input is 16*16*4096; output is 16*16*21.
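
A 1*1 convolution such as fc7 or score_fr leaves the spatial size alone and only remaps channels: each pixel's channel vector goes through the same linear map. The sketch below shows this in plain Python on toy sizes (2 input channels, 3 output channels, a 2*2 map) standing in for 4096, 21, and 16*16. The function `conv1x1` and all the numbers are invented for illustration.

```python
def conv1x1(feature_map, weights, bias):
    """feature_map: [C_in][H][W]; weights: [C_out][C_in]; bias: [C_out]."""
    c_in = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    return [
        [
            [
                bias[o] + sum(weights[o][c] * feature_map[c][y][x] for c in range(c_in))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for o in range(len(weights))
    ]


fmap = [[[1.0, 2.0], [3.0, 4.0]], [[10.0, 20.0], [30.0, 40.0]]]
w = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, 0.5]
scores = conv1x1(fmap, w, b)
print(len(scores), len(scores[0]), len(scores[0][0]))  # 3 2 2
```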

layer {
  name: "upscore2"
  type: "Deconvolution"
  bottom: "score_fr"
  top: "upscore2"
  param {
    lr_mult: 0
  }
  convolution_param {
    num_output: 21
    bias_term: false
    kernel_size: 4
    stride: 2
  }
}

upscore2: input is 16*16*21; output is 34*34*21.
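
Because upscore2 sets lr_mult: 0, its weights are never updated during training, so they have to be set to something sensible up front. The prototxt does not say how. A common choice for FCN-style upsampling, assumed here, is a fixed bilinear interpolation kernel. The sketch below builds one in plain Python; the function names are made up for illustration.

```python
def bilinear_kernel_1d(ksize):
    """1D bilinear interpolation weights for a kernel of the given size."""
    factor = (ksize + 1) // 2
    center = factor - 1 if ksize % 2 == 1 else factor - 0.5
    return [1 - abs(i - center) / factor for i in range(ksize)]


def bilinear_kernel_2d(ksize):
    """2D kernel as the outer product of the 1D weights."""
    row = bilinear_kernel_1d(ksize)
    return [[a * b for b in row] for a in row]


print(bilinear_kernel_1d(4))  # [0.25, 0.75, 0.75, 0.25]
```

Under this assumption, each of the 21 class channels would be upsampled independently with the same 4*4 kernel (or a 16*16 one for a stride of 8).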

layer {
  name: "score_pool4"
  type: "Convolution"
  bottom: "pool4"
  top: "score_pool4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 21
    pad: 0
    kernel_size: 1
  }
}

score_pool4: input is 44*44*512; output is 44*44*21.

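layer {
  name: "score_pool4c"
  type: "Crop"
  bottom: "score_pool4"
  bottom: "upscore2"
  top: "score_pool4c"
  crop_param {
    axis: 2
    offset: 5
  }
}

score_pool4c: this layer crops score_pool4 to the size of upscore2. For how crop works in Caffe, see "Caffe中crop_layer層的理解和使用" (Understanding and Using the Crop Layer in Caffe).

Input is 44*44*21; output is 34*34*21.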
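
With axis: 2, the Crop layer leaves the batch and channel axes alone and cuts the height and width of the first bottom down to those of the second bottom, starting at the given offset. Below is a minimal sketch of that behaviour on a single channel; `crop_range` and `crop2d` are names made up for illustration.

```python
def crop_range(src_size, ref_size, offset):
    """Return the [start, end) index range kept along one spatial axis."""
    if offset < 0 or offset + ref_size > src_size:
        raise ValueError("crop window does not fit inside the source")
    return offset, offset + ref_size


def crop2d(channel, ref_h, ref_w, offset):
    """Crop one [H][W] channel, using the same offset on both axes."""
    y0, y1 = crop_range(len(channel), ref_h, offset)
    x0, x1 = crop_range(len(channel[0]), ref_w, offset)
    return [row[x0:x1] for row in channel[y0:y1]]


# Each cell stores its own (y, x) coordinate so the kept window is visible.
channel = [[(y, x) for x in range(44)] for y in range(44)]
cropped = crop2d(channel, 34, 34, offset=5)
print(len(cropped), len(cropped[0]), cropped[0][0])  # 34 34 (5, 5)
```

For this 500*500 example, offset 5 happens to equal the centered crop (44 - 34) / 2.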

layer {
  name: "fuse_pool4"
  type: "Eltwise"
  bottom: "upscore2"
  bottom: "score_pool4c"
  top: "fuse_pool4"
  eltwise_param {
    operation: SUM
  }
}

fuse_pool4: this layer merges upscore2 and score_pool4c by element-wise sum, fusing features from different levels. Both inputs and the output are 34*34*21.
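
Eltwise with operation: SUM simply adds the two score maps position by position, which is why the crop to an identical shape is needed first. A minimal sketch on nested lists follows; `eltwise_sum` is a name made up for illustration.

```python
def eltwise_sum(a, b):
    """Element-wise sum of two [C][H][W] nested lists of identical shape."""
    if len(a) != len(b) or len(a[0]) != len(b[0]) or len(a[0][0]) != len(b[0][0]):
        raise ValueError("Eltwise SUM needs inputs of identical shape")
    return [
        [[va + vb for va, vb in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(a, b)
    ]


coarse = [[[1, 2], [3, 4]]]       # stands in for upscore2
fine = [[[10, 20], [30, 40]]]     # stands in for score_pool4c
print(eltwise_sum(coarse, fine))  # [[[11, 22], [33, 44]]]
```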

layer {
  name: "upscore_pool4"
  type: "Deconvolution"
  bottom: "fuse_pool4"
  top: "upscore_pool4"
  param {
    lr_mult: 0
  }
  convolution_param {
    num_output: 21
    bias_term: false
    kernel_size: 4
    stride: 2
  }
}

upscore_pool4: input is 34*34*21; output is 70*70*21.

layer {
  name: "score_pool3"
  type: "Convolution"
  bottom: "pool3"
  top: "score_pool3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 21
    pad: 0
    kernel_size: 1
  }
}

score_pool3: input is 88*88*256; output is 88*88*21.

layer {
  name: "score_pool3c"
  type: "Crop"
  bottom: "score_pool3"
  bottom: "upscore_pool4"
  top: "score_pool3c"
  crop_param {
    axis: 2
    offset: 9
  }
}

score_pool3c: this layer crops score_pool3 to the same size as upscore_pool4.

Input is 88*88*21; output is 70*70*21.

layer {
  name: "fuse_pool3"
  type: "Eltwise"
  bottom: "upscore_pool4"
  bottom: "score_pool3c"
  top: "fuse_pool3"
  eltwise_param {
    operation: SUM
  }
}

fuse_pool3: fuses the upscore_pool4 and score_pool3c feature maps by element-wise sum. Output is 70*70*21.

layer {
  name: "upscore8"
  type: "Deconvolution"
  bottom: "fuse_pool3"
  top: "upscore8"
  param {
    lr_mult: 0
  }
  convolution_param {
    num_output: 21
    bias_term: false
    kernel_size: 16
    stride: 8
  }
}

upscore8: input is 70*70*21; output is 568*568*21.

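layer {
  name: "score"
  type: "Crop"
  bottom: "upscore8"
  bottom: "data"
  top: "score"
  crop_param {
    axis: 2
    offset: 31
  }
}

score: crops the final segmentation map to the spatial size of data, starting at offset 31 on each spatial axis. Output is 500*500*21, the same height and width as the input image.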
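
To tie the walkthrough together, the sketch below traces the side length through every size-changing layer of fcn8s, using the kernel sizes, strides, pads, and crop offsets from the prototxt above. The helper names are made up for illustration, and only one spatial dimension is tracked since height and width are treated identically.

```python
import math


def conv_out(size, ksize, stride=1, padding=0):
    return (size + 2 * padding - ksize) // stride + 1


def pool_out(size, ksize=2, stride=2):
    return math.ceil((size - ksize) / stride) + 1


def deconv_out(size, ksize, stride):
    return (size - 1) * stride + ksize


def crop(src_size, ref_size, offset):
    """Crop src to ref's size; fail if the window does not fit."""
    if offset + ref_size > src_size:
        raise ValueError("crop window does not fit inside the source")
    return ref_size


def fcn8s_sizes(data):
    s = {}
    x = data
    for stage, n_convs in enumerate([2, 2, 3, 3, 3], start=1):
        for i in range(n_convs):
            padding = 100 if (stage == 1 and i == 0) else 1
            x = conv_out(x, ksize=3, padding=padding)
        x = pool_out(x)
        s[f"pool{stage}"] = x
    s["fc6"] = conv_out(s["pool5"], ksize=7)  # fc7 and score_fr are 1*1
    s["upscore2"] = deconv_out(s["fc6"], ksize=4, stride=2)
    s["score_pool4c"] = crop(s["pool4"], s["upscore2"], offset=5)
    s["upscore_pool4"] = deconv_out(s["upscore2"], ksize=4, stride=2)
    s["score_pool3c"] = crop(s["pool3"], s["upscore_pool4"], offset=9)
    s["upscore8"] = deconv_out(s["upscore_pool4"], ksize=16, stride=8)
    s["score"] = crop(s["upscore8"], data, offset=31)
    return s


sizes = fcn8s_sizes(500)
print(sizes["fc6"], sizes["upscore2"], sizes["upscore_pool4"],
      sizes["upscore8"], sizes["score"])  # 16 34 70 568 500
```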