Original URLs:

http://blog.csdn.net/u010167269/article/details/52563573

https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html

Object Detection

Published: 09 Oct 2015

Method	VOC2007	VOC2010	VOC2012	ILSVRC 2013	MSCOCO 2015	Speed
OverFeat	-	-	-	24.3%	-	-
R-CNN (AlexNet)	58.5%	53.7%	53.3%	31.4%	-	-
R-CNN (VGG16)	66.0%	-	-	-	-	-
SPP_net (ZF-5)	54.2% (1-model), 60.9% (2-model)	-	-	31.84% (1-model), 35.11% (6-model)	-	-
DeepID-Net	64.1%	-	-	50.3%	-	-
NoC	73.3%	-	68.8%	-	-	-
Fast-RCNN (VGG16)	70.0%	68.8%	68.4%	-	19.7% (@[0.5-0.95]), 35.9% (@0.5)	-
MR-CNN	78.2%	-	73.9%	-	-	-
Faster-RCNN (VGG16)	78.8%	-	75.9%	-	21.9% (@[0.5-0.95]), 42.7% (@0.5)	198 ms
Faster-RCNN (ResNet-101)	85.6%	-	83.8%	-	37.4% (@[0.5-0.95]), 59.0% (@0.5)	-
SSD300 (VGG16)	72.1%	-	-	-	-	58 fps
SSD500 (VGG16)	75.1%	-	-	-	-	23 fps
ION	79.2%	-	76.4%	-	-	-
AZ-Net	70.4%	-	-	-	22.3% (@[0.5-0.95]), 41.0% (@0.5)	-
CRAFT	75.7%	-	71.3%	48.5%	-	-
OHEM	78.9%	-	76.3%	-	25.5% (@[0.5-0.95]), 45.9% (@0.5)	-
R-FCN (ResNet-50)	77.4%	-	-	-	-	0.12 sec (K40), 0.09 sec (Titan X)
R-FCN (ResNet-101)	79.5%	-	-	-	-	0.17 sec (K40), 0.12 sec (Titan X)
R-FCN (ResNet-101), multi-scale train	83.6%	-	82.0%	-	31.5% (@[0.5-0.95]), 53.2% (@0.5)	-
PVANet 9.0	81.8%	-	82.5%	-	-	750 ms (CPU), 46 ms (Titan X)
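
The entries marked @0.5 and @[0.5-0.95] refer to the IoU (intersection over union) threshold at which a detection counts as correct: @0.5 uses a single threshold, while @[0.5-0.95] averages over ten thresholds from 0.50 to 0.95 in steps of 0.05 (the COCO-style metric). A minimal sketch of the overlap test is below. The function names and the (x1, y1, x2, y2) box layout are assumptions for illustration and are not taken from any benchmark toolkit, which may also use slightly different pixel conventions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Ten thresholds: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """Return the thresholds at which pred_box would count as a match."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


# Hypothetical boxes: the prediction is shifted 10 pixels from the ground truth.
gt = (0, 0, 100, 100)
pred = (10, 10, 110, 110)
print(iou(pred, gt))
print(thresholds_passed(pred, gt))
```

A box that clears the 0.5 threshold can still fail the stricter ones, which is why the @[0.5-0.95] figures in the table are consistently lower than the @0.5 figures for the same method.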

Leaderboard

Detection Results: VOC2012

Papers

Deep Neural Networks for Object Detection

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

intro: A deep version of the sliding-window method that predicts a bounding box directly from each location of the topmost feature map, after the confidences of the underlying object categories are known.
intro: Training a convolutional network to simultaneously classify, localize, and detect objects in images can boost accuracy on all three tasks.
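
The sliding-window idea in the first intro line can be pictured as each cell of the top feature map standing for one window of the input image. The sketch below enumerates those windows under assumed values for the network stride and window size. The numbers and the function name are hypothetical and are not OverFeat's actual configuration.

```python
def feature_map_windows(feat_h, feat_w, stride, window):
    """Map each feature-map cell (row, col) to an input-image window.

    Assumes cell (0, 0) is aligned with the top-left corner of the image and
    that neighbouring cells are `stride` pixels apart in input space.
    Returns a list of (x1, y1, x2, y2) boxes, one per cell, in row-major order.
    """
    boxes = []
    for row in range(feat_h):
        for col in range(feat_w):
            x1 = col * stride
            y1 = row * stride
            boxes.append((x1, y1, x1 + window, y1 + window))
    return boxes


# Hypothetical 2x3 feature map, stride 32, 224-pixel windows.
windows = feature_map_windows(2, 3, stride=32, window=224)
for box in windows:
    print(box)
```

In a detector of this kind, a class score and a box prediction are produced for every such window in a single forward pass, instead of cropping each window and running the network on it separately.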

R-CNN

Rich feature hierarchies for accurate object detection and semantic segmentation

MultiBox

Scalable Object Detection using Deep Neural Networks

Scalable, High-Quality Object Detection

SPP-Net

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Learning Rich Features from RGB-D Images for Object Detection and Segmentation

DeepID-Net

DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection

Object Detectors Emerge in Deep Scene CNNs

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

NoC

Object Detection Networks on Convolutional Feature Maps

Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction

Fast R-CNN

DeepBox

DeepBox: Learning Objectness with Convolutional Networks

MR-CNN

Object detection via a multi-region & semantic segmentation-aware CNN model

Faster R-CNN

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Faster R-CNN in MXNet with distributed implementation and data parallelization

YOLO

You Only Look Once: Unified, Real-Time Object Detection

Start Training YOLO with Our Own Data

R-CNN minus R

AttentionNet

AttentionNet: Aggregating Weak Directions for Accurate Object Detection

DenseBox

DenseBox: Unifying Landmark Localization with End to End Object Detection

SSD

SSD: Single Shot MultiBox Detector

Why does SSD (Single Shot MultiBox Detector) perform poorly on small objects?

Inside-Outside Net (ION)

Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

Adaptive Object Detection Using Adjacency and Zoom Prediction

G-CNN

G-CNN: an Iterative Grid Based Object Detector

Factors in Finetuning Deep Model for object detection

Factors in Finetuning Deep Model for Object Detection with Long-tail Distribution

We don’t need no bounding-boxes: Training object class detectors using only human verification

HyperNet

HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection

MultiPathNet

A MultiPath Network for Object Detection

CRAFT

CRAFT Objects from Images

OHEM

Training Region-based Object Detectors with Online Hard Example Mining

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection

Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers

R-FCN

R-FCN: Object Detection via Region-based Fully Convolutional Networks

Weakly supervised object detection using pseudo-strong labels

Recycle deep features for better object detection

MS-CNN

A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection

Multi-stage Object Detection with Group Recursive Learning

Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection

PVANET

PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection

PVANet: Lightweight Deep Neural Networks for Real-time Object Detection

intro: Presented at NIPS 2016 Workshop on Efficient Methods for Deep Neural Networks (EMDNN). Continuation of arXiv:1608.08021

GBD-Net

Gated Bi-directional CNN for Object Detection

Crafting GBD-Net for Object Detection

intro: winner of the ImageNet object detection challenge of 2016. CUImage and CUVideo
intro: gated bi-directional CNN (GBD-Net)

StuffNet

StuffNet: Using ‘Stuff’ to Improve Object Detection

Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene

Hierarchical Object Detection with Deep Reinforcement Learning

Learning to detect and localize many objects from few examples

Detection From Video

Learning Object Class Detectors from Weakly Annotated Video

Analysing domain shift factors between videos and images for object detection

Video Object Recognition

Deep Learning for Saliency Prediction in Natural Video

intro: Submitted on 12 Jan 2016
keywords: Deep learning, saliency map, optical flow, convolution network, contrast features

T-CNN

T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos

Object Detection from Video Tubelets with Convolutional Neural Networks

Object Detection in Videos with Tubelets and Multi-context Cues

Context Matters: Refining Object Detection in Video with Recurrent Neural Networks

CNN Based Object Detection in Large Video Images

Datasets

YouTube-Objects dataset v2.2

ILSVRC2015: Object detection from video (VID)

Object Detection in 3D

Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks

Salient Object Detection

This task involves predicting the salient regions of an image given by human eye fixations.

Large-scale optimization of hierarchical features for saliency prediction in natural images

Predicting Eye Fixations using Convolutional Neural Networks

Saliency Detection by Multi-Context Deep Learning

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection

SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection

Shallow and Deep Convolutional Networks for Saliency Prediction

Recurrent Attentional Networks for Saliency Detection

intro: CVPR 2016. recurrent attentional convolutional-deconvolution network (RACDNN)

Two-Stream Convolutional Networks for Dynamic Saliency Prediction

Unconstrained Salient Object Detection

Unconstrained Salient Object Detection via Proposal Subset Optimization

Salient Object Subitizing

Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection

intro: ACMMM 2016. deeply-supervised recurrent convolutional neural network (DSRCNN)

Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs

Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection

A Deep Multi-Level Network for Saliency Prediction

Visual Saliency Detection Based on Multiscale Deep CNN Features

A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection

Deeply supervised salient object detection with short connections

Weakly Supervised Top-down Salient Object Detection

Specific Object Detection

Face Detection

Multi-view Face Detection Using Deep Convolutional Neural Networks

From Facial Parts Responses to Face Detection: A Deep Learning Approach

Compact Convolutional Neural Network Cascade for Face Detection

Face Detection with End-to-End Integration of a ConvNet and a 3D Model

Supervised Transformer Network for Efficient Face Detection

UnitBox

UnitBox: An Advanced Object Detection Network

Bootstrapping Face Detection with Hard Negative Examples

author: 萬韶華 @ Xiaomi.
intro: Faster R-CNN, hard negative mining. state-of-the-art on the FDDB dataset

A Multi-Scale Cascade Fully Convolutional Network Face Detector

MTCNN

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks

Datasets / Benchmarks

FDDB: Face Detection Data Set and Benchmark

WIDER FACE: A Face Detection Benchmark

Facial Point / Landmark Detection

Deep Convolutional Network Cascade for Facial Point Detection

A Recurrent Encoder-Decoder Network for Sequential Face Alignment

Detecting facial landmarks in the video based on a hybrid framework

Deep Constrained Local Models for Facial Landmark Detection

People Detection

End-to-end people detection in crowded scenes

Detecting People in Artwork with CNNs

Person Head Detection

Context-aware CNNs for person head detection

Pedestrian Detection

Pedestrian Detection aided by Deep Learning Semantic Tasks

Deep Learning Strong Parts for Pedestrian Detection

intro: ICCV 2015. CUHK. DeepParts
intro: Achieving 11.89% average miss rate on Caltech Pedestrian Dataset

Deep convolutional neural networks for pedestrian detection

New algorithm improves speed and accuracy of pedestrian detection

Pushing the Limits of Deep CNNs for Pedestrian Detection

intro: “set a new record on the Caltech pedestrian dataset, lowering the log-average miss rate from 11.7% to 8.9%”
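
The log-average miss rate quoted for the Caltech benchmark is commonly computed by sampling the miss rate at nine false-positives-per-image (FPPI) values evenly spaced in log space between 10^-2 and 10^0 and taking the geometric mean. A minimal sketch, assuming the curve is given as (fppi, miss_rate) pairs sorted by increasing FPPI; the function name and the sample curve are made up for illustration and do not reproduce any published result.

```python
import math


def log_average_miss_rate(curve, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI values.

    `curve` is a list of (fppi, miss_rate) pairs sorted by increasing fppi.
    For each reference FPPI, the miss rate of the last curve point at or
    below it is used; if there is none, the miss rate is taken as 1.0.
    """
    # Reference points 10^-2, 10^-1.75, ..., 10^0.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    sampled = []
    for ref in refs:
        miss = 1.0
        for fppi, miss_rate in curve:
            if fppi <= ref:
                miss = miss_rate
            else:
                break
        # Clamp to avoid log(0) for a perfect detector.
        sampled.append(max(miss, 1e-10))
    return math.exp(sum(math.log(m) for m in sampled) / len(sampled))


# Made-up curve, not a published result.
toy_curve = [(0.005, 0.40), (0.05, 0.20), (0.5, 0.10)]
print(round(log_average_miss_rate(toy_curve), 4))
```

Averaging in log space keeps the low-FPPI end of the curve from being swamped by the high-FPPI end, so a lower value means fewer missed pedestrians across the whole operating range.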

A Real-Time Deep Learning Pedestrian Detector for Robot Navigation

A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation

Is Faster R-CNN Doing Well for Pedestrian Detection?

Reduced Memory Region Based Deep Convolutional Neural Network Detection

Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection

Multispectral Deep Neural Networks for Pedestrian Detection

Vehicle Detection
