[Deep Learning from Beginner to Cross-Dressing] FCN
This post briefly introduces the FCN model and reads through its Caffe source code.
For convolution (the division rounds down):
output = (input + 2 * padding - ksize) / stride + 1;
For deconvolution:
output = (input - 1) * stride + ksize - 2 * padding;
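The two formulas above can be written as small helpers, which makes the shape arithmetic in the rest of the post easy to check. This is a minimal sketch: the function names `conv_out` and `deconv_out` are made up for illustration and are not part of Caffe.

```python
def conv_out(size, ksize, stride=1, padding=0):
    """Spatial output size of a convolution (integer division rounds down)."""
    return (size + 2 * padding - ksize) // stride + 1


def deconv_out(size, ksize, stride=1, padding=0):
    """Spatial output size of a deconvolution (transposed convolution)."""
    return (size - 1) * stride + ksize - 2 * padding


# A 500-pixel side with pad 100 and a 3*3 kernel, as in conv1_1 of FCN-8s.
print(conv_out(500, ksize=3, stride=1, padding=100))  # 698
# A 16-pixel side upsampled by a 4*4 kernel with stride 2, as in upscore2.
print(deconv_out(16, ksize=4, stride=2))              # 34
```

Note that the deconvolution formula is the convolution formula solved for the input size, which is why a stride-2 deconvolution roughly doubles the map.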
fcn8s code:
layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param {
    # These dimensions are purely for sake of example;
    # see infer.py for how to reshape the net to the given input size.
    shape { dim: 1 dim: 3 dim: 500 dim: 500 }
  }
}
The input is 500*500*3.
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 64
    pad: 100
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
The first convolutional layer, conv1_1, has pad 100, so the padded input is 700*700*3.
Convolving it with 64 kernels of size 3*3*3 gives an output of 698*698*64.
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "relu1_2"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
The second convolutional layer, conv1_2, has pad 1, so the padded input is 700*700*64.
Convolving it with 64 kernels of size 3*3*64 gives an output of 698*698*64.
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
Max pooling with stride 2 gives an output of 349*349*64.
conv2_1: num_output: 128 pad: 1 kernel_size: 3 stride: 1
The output is 349*349*128.
conv2_2: num_output: 128 pad: 1 kernel_size: 3 stride: 1
The output is 349*349*128.
pool2: max pooling with kernel size 2 and stride 2. The output is 175*175*128 (pooling rounds up, so 349 / 2 becomes 175).
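The round-up behaviour is what turns 349 into 175 rather than 174. Here is a small sketch of the pooling size rule; `pool_out` is a hypothetical helper, and it leaves out the extra boundary adjustment Caffe applies when the pooling layer has padding, which does not matter here because every pooling layer in this net uses pad 0.

```python
import math


def pool_out(size, ksize=2, stride=2, ceil_mode=True):
    """Pooled size without padding; ceil_mode=True mimics the round-up rule."""
    rounding = math.ceil if ceil_mode else math.floor
    return rounding((size - ksize) / stride) + 1


print(pool_out(698))                   # 349, even input: rounding does not matter
print(pool_out(349))                   # 175, odd input rounded up
print(pool_out(349, ceil_mode=False))  # 174, what round-down would give
```

With round-up, an odd-sized map keeps a final, partially covered window instead of dropping the last row and column.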
conv3_1, conv3_2, conv3_3: num_output: 256 pad: 1 kernel_size: 3 stride: 1
The output is 175*175*256.
pool3: max pooling with kernel size 2 and stride 2. The output is 88*88*256 (pooling rounds up).
conv4_1, conv4_2, conv4_3: num_output: 512 pad: 1 kernel_size: 3 stride: 1
The output is 88*88*512.
pool4: max pooling with kernel size 2 and stride 2. The output is 44*44*512 (88 is even, so no rounding is needed).
conv5_1, conv5_2, conv5_3: num_output: 512 pad: 1 kernel_size: 3 stride: 1
The output is 44*44*512.
pool5: max pooling with kernel size 2 and stride 2. The output is 22*22*512 (44 is even, so no rounding is needed).
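The whole VGG16 backbone can be traced in a few lines to confirm the pool1 to pool5 sizes listed above. This is only a sketch: `VGG16_BLOCKS` and `trace_backbone` are names invented here, and the layer settings are the ones quoted in this post (3*3 convolutions with stride 1 and pad 1, except pad 100 on conv1_1, and 2*2 max pooling with stride 2 and round-up).

```python
import math

# (block number, how many conv layers it has, output channels)
VGG16_BLOCKS = [("1", 2, 64), ("2", 2, 128), ("3", 3, 256), ("4", 3, 512), ("5", 3, 512)]


def conv_out(size, ksize=3, stride=1, padding=1):
    return (size + 2 * padding - ksize) // stride + 1


def pool_out(size, ksize=2, stride=2):
    return math.ceil((size - ksize) / stride) + 1


def trace_backbone(input_size=500, first_pad=100):
    """Return {pool name: (height, width, channels)} for a square input."""
    sizes = {}
    size = input_size
    for name, n_convs, channels in VGG16_BLOCKS:
        for i in range(1, n_convs + 1):
            # Only conv1_1 uses the large padding; every other conv uses pad 1.
            pad = first_pad if (name, i) == ("1", 1) else 1
            size = conv_out(size, padding=pad)
        size = pool_out(size)
        sizes["pool" + name] = (size, size, channels)
    return sizes


for layer, shape in trace_backbone().items():
    print(layer, shape)
```

For a 500*500 input this prints 349, 175, 88, 44 and 22 as the side lengths of pool1 through pool5.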
layer {
  name: "fc6"
  type: "Convolution"
  bottom: "pool5"
  top: "fc6"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 4096
    pad: 0
    kernel_size: 7
    stride: 1
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
(The author was really lazy: the network is adapted from VGG16, and the fully connected layers were not even renamed after being converted to convolutional layers...)
The input is 22*22*512 and the output is 16*16*4096.
layer {
  name: "fc7"
  type: "Convolution"
  bottom: "fc6"
  top: "fc7"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 4096
    pad: 0
    kernel_size: 1
    stride: 1
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
The input is 16*16*4096 and the output is 16*16*4096.
layer {
  name: "score_fr"
  type: "Convolution"
  bottom: "fc7"
  top: "score_fr"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 21
    pad: 0
    kernel_size: 1
  }
}
score_fr: the input is 16*16*4096 and the output is 16*16*21.
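The layers keep their fully connected names because each one is the convolutional form of the corresponding VGG16 layer. As a rough check, assuming the original VGG16 fc6 maps a flattened 7*7*512 feature map to 4096 units, the 7*7 convolution has exactly the same number of weights, and sliding it over the 22*22 pool5 map yields a 16*16 grid of outputs instead of a single vector. The variable names below are illustrative only.

```python
def conv_out(size, ksize, stride=1, padding=0):
    return (size + 2 * padding - ksize) // stride + 1


# Fully connected fc6: 7*7*512 = 25088 inputs, 4096 outputs.
fc6_dense_weights = (7 * 7 * 512) * 4096
# Convolutional fc6: 4096 filters, each of shape 7*7*512.
fc6_conv_weights = 4096 * (7 * 7 * 512)
print(fc6_dense_weights == fc6_conv_weights)  # True

# Spatial size through fc6 (7*7), fc7 (1*1) and score_fr (1*1), all with pad 0.
size = 22
for name, ksize in [("fc6", 7), ("fc7", 1), ("score_fr", 1)]:
    size = conv_out(size, ksize)
    print(name, size)
```

Only the 7*7 layer shrinks the map; the 1*1 layers change the channel count (4096, then 21 classes) and leave the 16*16 grid untouched.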
layer {
  name: "upscore2"
  type: "Deconvolution"
  bottom: "score_fr"
  top: "upscore2"
  param { lr_mult: 0 }
  convolution_param {
    num_output: 21
    bias_term: false
    kernel_size: 4
    stride: 2
  }
}
upscore2: the input is 16*16*21 and the output is 34*34*21.
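To see where 34 comes from, it helps to run a deconvolution by hand in one dimension: each input value stamps a scaled copy of the kernel into the output, and consecutive stamps are `stride` positions apart. The sketch below uses an all-ones kernel purely for illustration; it says nothing about the weights the real upscore2 layer holds.

```python
def deconv1d(x, kernel, stride):
    """1-D transposed convolution: each input value stamps a scaled kernel copy."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for j, weight in enumerate(kernel):
            out[i * stride + j] += value * weight
    return out


row = deconv1d([1.0] * 16, [1.0] * 4, stride=2)
print(len(row))  # 34
```

The last stamp starts at position (16 - 1) * 2 = 30 and is 4 wide, so the output has 34 entries, matching the deconvolution formula with padding 0.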
layer {
  name: "score_pool4"
  type: "Convolution"
  bottom: "pool4"
  top: "score_pool4"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 21
    pad: 0
    kernel_size: 1
  }
}
score_pool4: the input is 44*44*512 and the output is 44*44*21.
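layer {
  name: "score_pool4c"
  type: "Crop"
  bottom: "score_pool4"
  bottom: "upscore2"
  top: "score_pool4c"
  crop_param {
    axis: 2
    offset: 5
  }
}
score_pool4c: this layer crops score_pool4 to the size of upscore2. For how Crop works in Caffe, see the post "Caffe中crop_layer層的理解和使用" (Understanding and Using the crop_layer in Caffe).
The input is 44*44*21 and the output is 34*34*21.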
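The crop can be pictured as plain slicing. In Caffe's Crop layer the first bottom is the blob being cropped and the second is the size reference; axes before `axis` are left alone, and a single `offset` applies to every axis from `axis` onward. The helper below, `crop_slices`, is a hypothetical name used to sketch that rule on N*C*H*W shapes.

```python
def crop_slices(src_shape, ref_shape, axis=2, offset=0):
    """Slices a Caffe-style Crop would take from a blob of shape src_shape."""
    slices = []
    for dim, (src, ref) in enumerate(zip(src_shape, ref_shape)):
        if dim < axis:
            slices.append(slice(0, src))  # axes before `axis` are kept whole
        else:
            if offset + ref > src:
                raise ValueError(f"crop does not fit on axis {dim}")
            slices.append(slice(offset, offset + ref))
    return tuple(slices)


# score_pool4 is 1*21*44*44, the reference upscore2 is 1*21*34*34.
window = crop_slices((1, 21, 44, 44), (1, 21, 34, 34), axis=2, offset=5)
print(window[2], window[3])  # slice(5, 39, None) slice(5, 39, None)
```

So score_pool4c keeps rows and columns 5 to 38 of score_pool4, which is a 34*34 window, and leaves the batch and channel axes as they are.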
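layer {
  name: "fuse_pool4"
  type: "Eltwise"
  bottom: "upscore2"
  bottom: "score_pool4c"
  top: "fuse_pool4"
  eltwise_param {
    operation: SUM
  }
}
fuse_pool4: this layer adds upscore2 and score_pool4c element by element, fusing features from different depths of the network. Both inputs are 34*34*21, and so is the output.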
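An element-wise sum only makes sense when both maps have identical shapes, which is why score_pool4 has to be cropped first. A toy version on a single class channel, with made-up values and a hypothetical `eltwise_sum` helper, looks like this:

```python
def eltwise_sum(a, b):
    """Element-wise sum of two equally sized 2-D score maps (lists of rows)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("Eltwise SUM needs inputs of identical shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


upscore2 = [[1.0] * 34 for _ in range(34)]      # one 34*34 class channel
score_pool4c = [[0.5] * 34 for _ in range(34)]  # already cropped to 34*34
fused = eltwise_sum(upscore2, score_pool4c)
print(len(fused), len(fused[0]), fused[0][0])   # 34 34 1.5
```

Passing the uncropped 44*44 map instead would raise the shape error, which mirrors why the prototxt inserts a Crop layer before each Eltwise layer.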
layer {
  name: "upscore_pool4"
  type: "Deconvolution"
  bottom: "fuse_pool4"
  top: "upscore_pool4"
  param { lr_mult: 0 }
  convolution_param {
    num_output: 21
    bias_term: false
    kernel_size: 4
    stride: 2
  }
}
upscore_pool4: the input is 34*34*21 and the output is 70*70*21.
layer {
  name: "score_pool3"
  type: "Convolution"
  bottom: "pool3"
  top: "score_pool3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 21
    pad: 0
    kernel_size: 1
  }
}
score_pool3: the input is 88*88*256 and the output is 88*88*21.
layer {
  name: "score_pool3c"
  type: "Crop"
  bottom: "score_pool3"
  bottom: "upscore_pool4"
  top: "score_pool3c"
  crop_param {
    axis: 2
    offset: 9
  }
}
score_pool3c: this layer crops score_pool3 to the same size as upscore_pool4.
The input is 88*88*21 and the output is 70*70*21.
layer {
  name: "fuse_pool3"
  type: "Eltwise"
  bottom: "upscore_pool4"
  bottom: "score_pool3c"
  top: "fuse_pool3"
  eltwise_param {
    operation: SUM
  }
}
fuse_pool3: adds the upscore_pool4 and score_pool3c feature maps element by element. The output is 70*70*21.
layer {
  name: "upscore8"
  type: "Deconvolution"
  bottom: "fuse_pool3"
  top: "upscore8"
  param { lr_mult: 0 }
  convolution_param {
    num_output: 21
    bias_term: false
    kernel_size: 16
    stride: 8
  }
}
upscore8: the input is 70*70*21 and the output is 568*568*21.
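layer {
  name: "score"
  type: "Crop"
  bottom: "upscore8"
  bottom: "data"
  top: "score"
  crop_param {
    axis: 2
    offset: 31
  }
}
score: crops the final segmentation map. The Crop layer takes its output size from the reference blob, which here is data, so the output is 500*500*21, the same spatial size as the input image (rows and columns 31 to 530 of the 568*568 map), not 568 - 2 * 31 = 506.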
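The upsampling half of the network can be traced the same way as the backbone. The sketch below starts from the pool3, pool4 and pool5 sizes derived earlier and applies the deconvolution formula plus the three crops; `trace_decoder` and `crop_to` are illustrative names, and `crop_to` encodes the rule that a Crop layer outputs the spatial size of its reference blob as long as the window starting at `offset` fits inside the source.

```python
def deconv_out(size, ksize, stride):
    """Output size of a Deconvolution layer with no padding."""
    return (size - 1) * stride + ksize


def crop_to(src, ref, offset):
    """Size after a Crop: the reference size, provided the window fits in src."""
    if offset + ref > src:
        raise ValueError(f"window [{offset}, {offset + ref}) exceeds size {src}")
    return ref


def trace_decoder(pool3=88, pool4=44, pool5=22, data=500):
    sizes = {}
    sizes["score_fr"] = pool5 - 7 + 1  # fc6 is 7*7; fc7 and score_fr are 1*1
    sizes["upscore2"] = deconv_out(sizes["score_fr"], 4, 2)
    sizes["score_pool4c"] = crop_to(pool4, sizes["upscore2"], 5)
    sizes["upscore_pool4"] = deconv_out(sizes["upscore2"], 4, 2)
    sizes["score_pool3c"] = crop_to(pool3, sizes["upscore_pool4"], 9)
    sizes["upscore8"] = deconv_out(sizes["upscore_pool4"], 16, 8)
    sizes["score"] = crop_to(sizes["upscore8"], data, 31)
    return sizes


for layer, size in trace_decoder().items():
    print(layer, size)
```

For the 500*500 example this gives 16, 34, 34, 70, 70, 568 and finally 500: the last crop window covers positions 31 to 530 of the 568-wide map, so the prediction lines up pixel for pixel with the input image.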