[Paper note] Feature Pyramid Networks for Object Detection
阿新 • Published: 2018-12-29
Intuition
- Multi-scale representation (e.g., image pyramids) is important in traditional methods
- Current detection systems (e.g., Faster R-CNN) use a single-scale feature map to save time and memory
- A CNN can represent high-level semantics, but not all of its levels are semantically strong
- Single Shot Detector (SSD) is one of the first attempts at using a ConvNet's pyramidal feature hierarchy, but it adds new layers on top of a layer high up in the network, which may lose the information in the high-resolution feature maps
- Main contribution: build the multi-scale feature pyramid inside the network itself
Model
- Feature Pyramid Network (FPN) building block
- The feature maps used by the building block are the outputs of the last layer of each stage of the ConvNet
- Nearest-neighbor upsampling (by a factor of 2) in the top-down pathway
- Denote the final set of feature maps as {P2, P3, P4, P5}, corresponding to {C2, C3, C4, C5}
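The top-down merge in the building block can be sketched roughly as below. This is a minimal illustration rather than the paper's implementation: it uses single-channel maps stored as nested lists, assumes each level is exactly half the size of the one before it, and leaves out the 1x1 lateral convolution and the final 3x3 convolution. The function names and the toy values are hypothetical.

```python
def upsample_nearest_2x(fmap):
    """Nearest-neighbor 2x upsampling of a 2D map given as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))  # repeat each row twice
    return out


def top_down_merge(laterals):
    """Merge lateral maps ordered fine to coarse, e.g. [C3, C4, C5].

    Assumption: every map has already been reduced to a common channel
    count (the 1x1 lateral conv is omitted), so merging is a plain
    element-wise addition. Returns the merged maps in the same order.
    """
    pyramid = [laterals[-1]]  # the coarsest map starts the top-down pathway
    for lateral in reversed(laterals[:-1]):
        up = upsample_nearest_2x(pyramid[0])
        merged = [[a + b for a, b in zip(row_lat, row_up)]
                  for row_lat, row_up in zip(lateral, up)]
        pyramid.insert(0, merged)
    return pyramid


# Toy single-channel stand-ins for three stage outputs (hypothetical values).
c5 = [[1.0]]
c4 = [[0.0, 1.0], [2.0, 3.0]]
c3 = [[0.0] * 4 for _ in range(4)]
p3, p4, p5 = top_down_merge([c3, c4, c5])
```

Each merged map carries the coarse map's values spread over a finer grid plus the finer stage's own values, which is how semantically strong features reach the high-resolution levels.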
- FPN in Region Proposal Network (RPN)
- Replace the single-scale feature map with the FPN
- Anchors with different aspect ratios: 1:2, 1:1, 2:1
- 15 anchors over the pyramid
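The count of 15 anchors can be checked with a small sketch. It assumes one anchor scale per pyramid level over five levels, following the RPN setup in the FPN paper. The level names and sizes in `LEVEL_SCALES` are not given in these notes, and reading a ratio such as 1:2 as height:width is a convention chosen here for illustration.

```python
import math

# Assumption: one anchor scale per pyramid level (values not stated in these notes).
LEVEL_SCALES = {"P2": 32, "P3": 64, "P4": 128, "P5": 256, "P6": 512}
# Aspect ratios 1:2, 1:1, 2:1, read here as height / width (an assumed convention).
ASPECT_RATIOS = [0.5, 1.0, 2.0]


def anchor_shapes(level_scales, ratios):
    """Return {level: [(width, height), ...]}; every anchor has area scale**2."""
    shapes = {}
    for level, scale in level_scales.items():
        per_level = []
        for r in ratios:
            w = scale / math.sqrt(r)
            h = scale * math.sqrt(r)
            per_level.append((w, h))
        shapes[level] = per_level
    return shapes


shapes = anchor_shapes(LEVEL_SCALES, ASPECT_RATIOS)
total = sum(len(v) for v in shapes.values())  # 5 levels x 3 ratios
```

Because each level handles a single scale, the pyramid itself covers the scale range and the anchors only need to vary in aspect ratio.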
- FPN in Fast R-CNN
Experiment
- Evaluated on the COCO minival set
- Surpasses the 2016 COCO winner
- Lateral and top-down connections are both helpful