Applying SqueezeNet to Faster R-CNN for Object Detection

阿新 • Published: 2019-01-19

I. Introduction to SqueezeNet

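The paper was submitted to ICLR 2017.
Paper: https://arxiv.org/abs/1602.07360
Code: https://github.com/DeepScale/SqueezeNet
Note: the repository only provides the prototxt files and the trained caffemodel. Since the whole network is built on Caffe, these two are all you need.
This section only summarizes the paper briefly; see the paper itself for the details.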

MOTIVATION

At the same accuracy, a model with fewer parameters has three advantages:

More efficient distributed training
Less overhead when exporting new models to clients
Feasible FPGA and embedded deployment

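In other words: more efficient distributed training, easier delivery of updated models to clients, and more practical deployment on FPGAs and embedded devices.
With this in mind, the paper proposes three strategies: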

Replace 3x3 filters with 1x1 filters.
Decrease the number of input channels to 3x3 filters.
Downsample late in the network so that convolution layers have large activation maps.

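That is:

Replace 3x3 filters with 1x1 filters, since a 1x1 filter has 1/9 the parameters of a 3x3 filter.
Reduce the number of input channels fed to the 3x3 filters. The parameter count of a layer is determined by the number of input channels, the number of filters, and the filter size, so cutting the number of 3x3 filters alone is not enough; the number of input channels has to come down too. In the paper, the authors use a squeeze layer for this.

Move downsampling later in the network to get larger feature maps. The authors argue that pooling with a large stride early in the network shrinks every feature map that follows, whereas larger feature maps lead to higher accuracy.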
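To make the first two strategies concrete, here is a minimal sketch of the weight count of a convolution layer. The helper name `conv_weights` and the channel counts are my own illustrative choices, not values taken from the paper, and biases are ignored.

```python
def conv_weights(in_channels: int, num_filters: int, kernel_size: int) -> int:
    """Number of weights in a convolution layer (biases excluded)."""
    return in_channels * num_filters * kernel_size * kernel_size


# Strategy 1: a 1x1 filter needs 1/9 of the weights of a 3x3 filter.
w3 = conv_weights(in_channels=64, num_filters=64, kernel_size=3)
w1 = conv_weights(in_channels=64, num_filters=64, kernel_size=1)
print(w3, w1, w3 // w1)

# Strategy 2: feeding the 3x3 filters 16 channels instead of 64
# shrinks that layer by another factor of 4.
w3_squeezed = conv_weights(in_channels=16, num_filters=64, kernel_size=3)
print(w3_squeezed, w3 // w3_squeezed)
```

The two reductions multiply, which is why the paper applies both rather than relying on 1x1 filters alone.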

FIRE MODULE

Based on the ideas above, the authors propose the Fire module, structured as follows:
[Figure: structure of the Fire module]
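As a rough sketch of why the Fire module saves parameters, the snippet below counts the weights of a squeeze layer followed by 1x1 and 3x3 expand layers and compares them with a plain 3x3 convolution that has the same input and output channels. The function names and the channel sizes (128 in, 16 squeeze, 64 + 64 expand) are illustrative assumptions, and biases are ignored.

```python
def fire_weights(in_channels: int, s1x1: int, e1x1: int, e3x3: int) -> int:
    """Weights of a Fire module: 1x1 squeeze, then parallel 1x1 and 3x3 expand."""
    squeeze = in_channels * s1x1       # 1x1 squeeze filters
    expand_1x1 = s1x1 * e1x1           # 1x1 expand filters see only s1x1 channels
    expand_3x3 = s1x1 * e3x3 * 3 * 3   # 3x3 expand filters see only s1x1 channels
    return squeeze + expand_1x1 + expand_3x3


def plain_conv3x3_weights(in_channels: int, out_channels: int) -> int:
    """Weights of an ordinary 3x3 convolution layer."""
    return in_channels * out_channels * 3 * 3


fire = fire_weights(in_channels=128, s1x1=16, e1x1=64, e3x3=64)
plain = plain_conv3x3_weights(in_channels=128, out_channels=64 + 64)
print(fire, plain, plain // fire)
```

With these sizes the Fire module needs 12 times fewer weights than the plain layer while still producing 128 output channels.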

ARCHITECTURE

[Figure: SqueezeNet architecture]

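The paper also describes in detail how SqueezeNet is built:

To make the feature maps output by the 3x3 filters the same size as those from the 1x1 filters, the 3x3 convolutions use a padding of 1 (mainly so that the two outputs can be concatenated).
ReLU activations follow the squeeze and expand layers.
Dropout with a ratio of 0.5 is applied after fire9.
Fully connected layers are dropped, following the NIN design.
Training uses a polynomial learning-rate policy (I switched to the step policy when using the network for detection).

Caffe does not support a single convolution layer that contains both 1x1 and 3x3 filters, so the expand layer is implemented as two separate convolution layers whose outputs are concatenated along the channel dimension. This is mathematically equivalent to one layer containing both filter sizes.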
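The equivalence can be checked on a toy example. The sketch below is my own pure-Python illustration (single channel, stride 1, made-up numbers), not code from the SqueezeNet repository: a 1x1 filter with no padding gives the same output as a 3x3 filter that is zero everywhere except the centre, applied with padding 1. That is also why padding 1 keeps the two expand outputs the same size for concatenation.

```python
def conv2d(image, kernel, pad):
    """Stride-1 2-D cross-correlation of a single-channel image with zero padding."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    # Build the zero-padded copy of the input.
    padded = [[0] * (w + 2 * pad) for _ in range(h + 2 * pad)]
    for r in range(h):
        for c in range(w):
            padded[r + pad][c + pad] = image[r][c]
    out_h = h + 2 * pad - k + 1
    out_w = w + 2 * pad - k + 1
    return [
        [
            sum(padded[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
k1 = [[2]]                                       # a 1x1 filter
k1_as_3x3 = [[0, 0, 0], [0, 2, 0], [0, 0, 0]]    # the same filter embedded in a 3x3 one

out_1x1 = conv2d(image, k1, pad=0)
out_3x3 = conv2d(image, k1_as_3x3, pad=1)
print(out_1x1 == out_3x3)
```

Both calls return a 3x4 map, so in a real network the two branches can be stacked along the channel axis without any cropping.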

EVALUATION

[Figure: evaluation results]

II. Combining SqueezeNet with Faster R-CNN

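I first tried alt-opt training, but the results were poor and essentially unusable, so I switched to end2end. At first I used the solver that the official Faster R-CNN code provides for ZF-net end2end training. Unfortunately, after roughly 400 iterations the network reported: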

loss = NAN

Reducing the learning rate to 1/10 of its previous value fixed this.
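For reference, here is a small sketch of how Caffe's "step" and "poly" learning-rate policies compute the rate, and what dividing the base rate by 10 does. The formulas follow the policy descriptions in Caffe's solver definition; the concrete numbers are hypothetical, since the post does not list the solver values that were used.

```python
def lr_step(base_lr: float, gamma: float, stepsize: int, iteration: int) -> float:
    """Caffe "step" policy: base_lr * gamma ^ floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


def lr_poly(base_lr: float, power: float, max_iter: int, iteration: int) -> float:
    """Caffe "poly" policy: base_lr * (1 - iteration / max_iter) ^ power."""
    return base_lr * (1 - iteration / max_iter) ** power


# Hypothetical solver values, chosen only for illustration.
original_lr = 0.001
reduced_lr = original_lr / 10  # the 1/10 reduction described above

for it in (0, 400, 50000):
    print(it, lr_step(reduced_lr, gamma=0.1, stepsize=50000, iteration=it))
```

Under the step policy the rate stays flat until the first step boundary, so a base rate that is too high is still too high at iteration 400; lowering `base_lr` is the direct way to change that.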
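Here is the prototxt. The beginning is unchanged; you only need to replace the conv1-conv5 part of ZF net and swap fc6-fc7 for the convolution layers from SqueezeNet.
The prototxt is too long to show in full, so only the beginning and the end of each part are given: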

name: "Alex_Squeeze_v1.1"
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 4"
  }
}

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 2
  }
}
.
.
.
layer {
  name: "drop9"
  type: "Dropout"
  bottom: "fire9/concat"
  top: "fire9/concat"
  dropout_param {
    dropout_ratio: 0.5
  }
}

#========= RPN ============

layer {
  name: "rpn_conv/3x3"
  type: "Convolution"
  bottom: "fire9/concat"
  top: "rpn/output"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  convolution_param {
    num_output: 256
    kernel_size: 3 pad: 1 stride: 1
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
.
.
.
layer {
  name: 'roi-data'
  type: 'Python'
  bottom: 'rpn_rois'
  bottom: 'gt_boxes'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'rpn.proposal_target_layer'
    layer: 'ProposalTargetLayer'
    param_str: "'num_classes': 4"
  }
}

#===================== RCNN =============

layer {
  name: "roi_pool5"
  type: "ROIPooling"
  bottom: "fire9/concat"
  bottom: "rois"
  top: "roi_pool5"
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.0625 # 1/16
  }
}

layer {
  name: "conv1_last"
  type: "Convolution"
  bottom: "roi_pool5"
  top: "conv1_last"
  param { lr_mult: 1.0 }
  param { lr_mult: 1.0 }
  convolution_param {
    num_output: 1000
    kernel_size: 1
    weight_filler {
      type: "gaussian"
      mean: 0.0
      std: 0.01
    }
  }
}
layer {
  name: "relu/conv1_last"
  type: "ReLU"
  bottom: "conv1_last"
  top: "relu/conv1_last"
}


layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "relu/conv1_last"
  top: "cls_score"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 4  # must equal num_classes (4 in the param_str above)
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "relu/conv1_last"
  top: "bbox_pred"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 16  # 4 box coordinates per class: 4 * num_classes
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss_cls"
  type: "SoftmaxWithLoss"
  bottom: "cls_score"
  bottom: "labels"
  propagate_down: 1
  propagate_down: 0
  top: "loss_cls"
  loss_weight: 1
}
layer {
  name: "loss_bbox"
  type: "SmoothL1Loss"
  bottom: "bbox_pred"
  bottom: "bbox_targets"
  bottom: "bbox_inside_weights"
  bottom: "bbox_outside_weights"
  top: "loss_bbox"
  loss_weight: 1
}

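The structure of the latter part looks like this:
[Figure: structure of the latter part of the network, with the replaced layers circled in red]
Note the part circled in red: the former fc layers are replaced by the convolution layer from SqueezeNet, which cuts the number of network parameters considerably. Because I changed the anchor ratios and counts that the RPN uses to pick proposals, for 70 anchor configurations in total, the final trained model is 17M. That is much larger than the 4.8M model used for initialization, but still very small.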
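The layer sizes in these listings follow from two numbers: the number of anchors per position and `num_classes`. The sketch below assumes the usual py-faster-rcnn convention (two scores and four box offsets per anchor in the RPN, one score and four box offsets per class in the detection head). The helper names are hypothetical, and `rpn_cls_score` is assumed to be the name of the RPN scoring layer, which is elided from the listing.

```python
def rpn_outputs(num_anchors: int) -> dict:
    """Output channels of the RPN heads for a given number of anchors per position."""
    return {
        "rpn_cls_score": 2 * num_anchors,  # background/foreground score per anchor
        "rpn_bbox_pred": 4 * num_anchors,  # four box offsets per anchor
    }


def rcnn_outputs(num_classes: int) -> dict:
    """Output sizes of the detection heads; num_classes includes the background class."""
    return {
        "cls_score": num_classes,
        "bbox_pred": 4 * num_classes,  # four box offsets per class
    }


print(rpn_outputs(70))
print(rcnn_outputs(4))
```

With 70 anchors the RPN score layer has 140 channels, which matches the `dim: 140` in the reshape layer of the OHEM listing, and with `num_classes` set to 4 the heads have 4 and 16 outputs, as in that listing.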

III. SqueezeNet + Faster R-CNN + OHEM

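OHEM simply adds a read-only part, but the results improve considerably once it is in place. As before, I only show part of the prototxt; you can fill in the rest yourself. The listing starts at the RPN, and everything before it is identical to the listing above.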

#====== RoI Proposal ====================
layer {
  name: "rpn_cls_prob"
  type: "Softmax"
  bottom: "rpn_cls_score_reshape"
  top: "rpn_cls_prob"
}
layer {
  name: 'rpn_cls_prob_reshape'
  type: 'Reshape'
  bottom: 'rpn_cls_prob'
  top: 'rpn_cls_prob_reshape'
  reshape_param { shape { dim: 0 dim: 140 dim: -1 dim: 0 } }
}
layer {
  name: 'proposal'
  type: 'Python'
  bottom: 'rpn_cls_prob_reshape'
  bottom: 'rpn_bbox_pred'
  bottom: 'im_info'
  top: 'rpn_rois'
  python_param {
    module: 'rpn.proposal_layer'
    layer: 'ProposalLayer'
    param_str: "'feat_stride': 16"
  }
}
layer {
  name: 'roi-data'
  type: 'Python'
  bottom: 'rpn_rois'
  bottom: 'gt_boxes'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'rpn.proposal_target_layer'
    layer: 'ProposalTargetLayer'
    param_str: "'num_classes': 4"
  }
}
##########################
## Readonly RoI Network ##
######### Start ##########
layer {
  name: "roi_pool5_readonly"
  type: "ROIPooling"
  bottom: "fire9/concat"
  bottom: "rois"
  top: "pool5_readonly"
  propagate_down: false
  propagate_down: false
  roi_pooling_param {
    pooled_w: 7  # must match roi_pool5 so the shared InnerProduct weights have the same shape
    pooled_h: 7
    spatial_scale: 0.0625 # 1/16
  }
}
layer {
  name: "conv1_last_readonly"
  type: "Convolution"
  bottom: "pool5_readonly"
  top: "conv1_last_readonly"
  propagate_down: false  
  param {
    name: "conv1_last_w"
  }
  param {
    name: "conv1_last_b"
  }
  convolution_param {
    num_output: 1000
    kernel_size: 1
    weight_filler {
      type: "gaussian"
      mean: 0.0
      std: 0.01
    }
  }
}
layer {
  name: "relu/conv1_last_readonly"
  type: "ReLU"
  bottom: "conv1_last_readonly"
  top: "relu/conv1_last_readonly"
  propagate_down: false
}
layer {
  name: "cls_score_readonly"
  type: "InnerProduct"
  bottom: "relu/conv1_last_readonly"
  top: "cls_score_readonly"
  propagate_down: false
  param {
    name: "cls_score_w"
  }
  param {
    name: "cls_score_b"
  }
  inner_product_param {
    num_output: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred_readonly"
  type: "InnerProduct"
  bottom: "relu/conv1_last_readonly"
  top: "bbox_pred_readonly"
  propagate_down: false
  param {
    name: "bbox_pred_w"
  }
  param {
    name: "bbox_pred_b"
  }
  inner_product_param {
    num_output: 16
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "cls_prob_readonly"
  type: "Softmax"
  bottom: "cls_score_readonly"
  top: "cls_prob_readonly"
  propagate_down: false
}
layer {
  name: "hard_roi_mining"
  type: "Python"
  bottom: "cls_prob_readonly"
  bottom: "bbox_pred_readonly"
  bottom: "rois"
  bottom: "labels"
  bottom: "bbox_targets"
  bottom: "bbox_inside_weights"
  bottom: "bbox_outside_weights"
  top: "rois_hard"
  top: "labels_hard"
  top: "bbox_targets_hard"
  top: "bbox_inside_weights_hard"
  top: "bbox_outside_weights_hard"
  propagate_down: false
  propagate_down: false
  propagate_down: false
  propagate_down: false
  propagate_down: false
  propagate_down: false
  propagate_down: false
  python_param {
    module: "roi_data_layer.layer"
    layer: "OHEMDataLayer"
    param_str: "'num_classes': 4"
  }
}
########## End ###########
## Readonly RoI Network ##
##########################
#===================== RCNN =============
layer {
  name: "roi_pool5"
  type: "ROIPooling"
  bottom: "fire9/concat"
  bottom: "rois_hard"
  top: "roi_pool5"
  propagate_down: true
  propagate_down: false
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.0625 # 1/16
  }
}
layer {
  name: "conv1_last"
  type: "Convolution"
  bottom: "roi_pool5"
  top: "conv1_last"
  param {
    lr_mult: 1.0
    name: "conv1_last_w"
  }
  param {
    lr_mult: 1.0
    name: "conv1_last_b"
  }
  convolution_param {
    num_output: 1000
    kernel_size: 1
    weight_filler {
      type: "gaussian"
      mean: 0.0
      std: 0.01
    }
  }
}
layer {
  name: "relu/conv1_last"
  type: "ReLU"
  bottom: "conv1_last"
  top: "relu/conv1_last"
}
layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "relu/conv1_last"
  top: "cls_score"
  param {
    lr_mult: 1
    name: "cls_score_w"
  }
  param {
    lr_mult: 2
    name: "cls_score_b"
  }
  inner_product_param {
    num_output: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "relu/conv1_last"
  top: "bbox_pred"
  param {
    lr_mult: 1
    name: "bbox_pred_w"
  }
  param {
    lr_mult: 2
    name: "bbox_pred_b"
  }
  inner_product_param {
    num_output: 16
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss_cls"
  type: "SoftmaxWithLoss"
  bottom: "cls_score"
  bottom: "labels_hard"
  propagate_down: true
  propagate_down: false
  top: "loss_cls"
  loss_weight: 1
}
layer {
  name: "loss_bbox"
  type: "SmoothL1Loss"
  bottom: "bbox_pred"
  bottom: "bbox_targets_hard"
  bottom: "bbox_inside_weights_hard"
  bottom: "bbox_outside_weights_hard"
  top: "loss_bbox"
  loss_weight: 1
  propagate_down: false
  propagate_down: false
  propagate_down: false
  propagate_down: false
}

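The structure diagram is as follows:
[Figure: network structure with the read-only RoI network]
Compared with the network trained earlier, there is one extra read-only part. For details, see the paper:
Training Region-based Object Detectors with Online Hard Example Mining
https://arxiv.org/abs/1604.03540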
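The core idea of the hard-example step can be sketched in a few lines: the read-only network scores every RoI, and only the RoIs with the highest loss are passed on to the trainable network. This is a simplified illustration with made-up loss values and a hypothetical function name. It is not the `OHEMDataLayer` implementation, and it leaves out the NMS-based deduplication that the paper applies before selection.

```python
def select_hard_rois(losses, batch_size):
    """Return the indices of the batch_size RoIs with the highest loss, hardest first."""
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:batch_size]


# Made-up per-RoI losses from the read-only forward pass.
losses = [0.05, 1.20, 0.30, 2.10, 0.01, 0.75]
hard = select_hard_rois(losses, batch_size=3)
print(hard)  # [3, 1, 5]
```

In the prototxt above, the selected RoIs correspond to the `rois_hard` blob that feeds `roi_pool5`, so only those examples contribute gradients.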

That wraps up the SqueezeNet + Faster R-CNN framework. It runs roughly 5 times as fast as ZF on a GPU and roughly 2.5 times as fast on a CPU.
