[Object Detection] On "adding convolutional and fully connected layers to pretrained networks can improve performance"

阿新 • Published: 2018-12-08

The YOLO paper states: "Ren et al. show that adding both convolutional and connected layers to pretrained networks can improve performance [28]."

[28] S. Ren, K. He, R. B. Girshick, X. Zhang, and J. Sun. Object detection networks on convolutional feature maps. CoRR, abs/1504.06066, 2015.

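Object detection systems generally use a pretrained convolutional network to extract features and then add further layers on top of it, forming "Networks on Convolutional feature maps" (NoCs). A NoC can be an SVM, an MLP, or a ConvNet.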

SVM vs MLP as NoC:

Table 1 shows the results of using MLP as NoC. Here we randomly initialize the weights by Gaussian distributions. The accuracy of NoC with 2 to 4 fc layers increases with the depth. Compared with the SVM classifier trained on the RoI features (“SVM on RoI”, equivalent to a 1-fc structure), the 4-fc NoC as a classifier on the same features has 7.8% higher mAP. Note that in this comparison the NoC classifiers have no pre-training (randomly initialized). The gain is solely because that MLPs are better classifiers than single-layer SVMs.
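To make the "MLP as NoC" setup concrete, here is a minimal sketch of a 4-fc classifier with Gaussian-initialized weights, applied to a single RoI feature vector. It is not the implementation from [28]: the helper names (`init_fc`, `build_noc`, `mlp_forward`), the standard deviation of 0.01, the ReLU between fc layers, and the toy layer widths are all assumptions chosen for illustration. Only the 21-way output follows the "f21" notation used in the paper.

```python
import random


def init_fc(n_in, n_out, std=0.01, rng=random):
    """One fc layer: Gaussian weights (std is an assumed value), zero biases."""
    weights = [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_noc(sizes, seed=0):
    """Build len(sizes) - 1 randomly initialized fc layers (no pre-training)."""
    rng = random.Random(seed)
    return [init_fc(a, b, rng=rng) for a, b in zip(sizes, sizes[1:])]


def mlp_forward(x, layers):
    """Apply fc layers in order, with ReLU after every layer except the last."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


# Toy widths (assumed): a 16-d RoI feature, three hidden layers, 21 class scores.
noc = build_noc([16, 8, 8, 8, 21])
scores = mlp_forward([1.0] * 16, noc)
print(len(noc), len(scores))
```

In a real detector the input would be the pooled RoI feature and the hidden layers would be much wider (the paper's notation uses 4096), but the structure is the same: a stack of randomly initialized fc layers trained from scratch on top of fixed features.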

Using ConvNet as NoC:

In recent detection systems [12], [13], [14], [23], [24], conv layers in the pre-trained models are thought of as region-independent feature extractors, and thus are shared on the entire image without being aware of the regions that are of interest. Although this is a computationally efficient solution, it misses the opportunities of using conv layers to learn region-aware features that are fit to the regions of interest (instead of full images).

We investigate using 1 to 3 additional conv layers (with ReLU) in a NoC. We use 256 conv filters for the ZF net and 512 for the VGG net. The conv filters have a spatial size of 3×3 and a padding of 1, so the m×m spatial resolution is unchanged. After the last additional conv layer, we apply three fc layers as in the above MLP case. For example, we denote a NoC with 2 conv layers as “c256-c256-f4096-f4096-f21”.
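The notation and the resolution claim above can be checked with a short sketch. The snippet below parses a NoC string into layer descriptions and applies the standard convolution output-size formula to confirm that a 3×3 filter with padding 1 leaves an m×m map unchanged. The function names (`parse_noc`, `conv_output_size`) and the stride of 1 are assumptions made for illustration; they do not come from [28].

```python
def conv_output_size(m, kernel=3, padding=1, stride=1):
    """Spatial size after a conv layer (standard formula; stride 1 is assumed)."""
    return (m + 2 * padding - kernel) // stride + 1


def parse_noc(spec):
    """Parse a NoC string like 'c256-c256-f4096-f4096-f21' into (type, width) pairs."""
    kinds = {"c": "conv", "f": "fc"}
    layers = []
    for token in spec.split("-"):
        if not token or token[0] not in kinds or not token[1:].isdigit():
            raise ValueError(f"unrecognized layer token: {token!r}")
        layers.append((kinds[token[0]], int(token[1:])))
    return layers


for kind, width in parse_noc("c256-c256-f4096-f4096-f21"):
    print(kind, width)

# A 3x3 conv with padding 1 keeps the spatial resolution for any m >= 1.
print(all(conv_output_size(m) == m for m in range(1, 20)))
```

Because each added conv layer preserves the m×m resolution, any number of them can be stacked before the fc layers without changing the input size of the first fc layer.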

When using VOC 07 trainval for training, the mAP is nearly unchanged when using 1 additional conv layer, but drops when using more conv layers. We observe that the degradation is a result of overfitting. The VOC 07 trainval set is too small to train deeper models. However, NoCs with conv layers show improvements when trained on the VOC 07+12 trainval set (Table 2). For this training set, the 3fc NoC baseline is lifted to 56.5% mAP. The advanced 2conv3fc NoC improves over this baseline to 58.9%. This justifies the effects of the additional conv layers. Table 2 also shows that the mAP gets saturated when using 3 additional conv layers.

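The convolutional layers of a pretrained network can be regarded as region-independent feature extractors: they are unaware of the regions of interest and are shared across the whole image. Adding convolutional layers in a NoC makes it possible to learn region-aware features that are fitted specifically to the regions of interest.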

[28] summarizes several findings:

The following key observations can be concluded from the above subsections:
(i) A deeper region-wise classifier is useful and is in general orthogonal to deeper feature maps.
(ii) A convolutional region-wise classifier is more effective than an MLP-based region-wise classifier.
