SSD Object Detection Roundup

阿新 • Published: 2019-01-12

SSD multi-object detection:

https://github.com/ex4sperans/SSD

https://github.com/georgesung/ssd_tensorflow_traffic_sign_detection

The model was trained on the LISA Traffic Sign Dataset, a dataset of US traffic signs.

Dependencies

Python 3.5+
TensorFlow v0.12.0
Pickle
OpenCV-Python
Matplotlib (optional)

How to run

Clone this repository somewhere; we will refer to that location as $ROOT

To run predictions using the pre-trained model:

cd $ROOT
python inference.py -m demo
- This will take the images from sample_images, annotate them, and display them on screen
To run predictions on your own images and/or videos, use the -i flag in inference.py (see the code for more details)
- Note that the model severely overfits at this time

Training the model from scratch:

Download the LISA Traffic Sign Dataset, and store it in a directory $LISA_DATA
cd $LISA_DATA
Follow instructions in the LISA Traffic Sign Dataset to create 'mergedAnnotations.csv' such that only stop signs and pedestrian crossing signs are shown
cp $ROOT/data_gathering/create_pickle.py $LISA_DATA
python create_pickle.py
cd $ROOT
ln -s $LISA_DATA/resized_images_* .
ln -s $LISA_DATA/data_raw_*.p .
python data_prep.py
- This performs box matching between ground-truth boxes and default boxes, and packages the data into a format used later in the pipeline
python train.py
- This trains the SSD model
python inference.py -m demo
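
The box-matching step performed by data_prep.py can be illustrated with a minimal sketch. This is not the repository's code: the function names, the corner-coordinate box format, and the 0.5 overlap threshold (the value used in the SSD paper) are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Jaccard overlap of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_boxes(gt_boxes, default_boxes, threshold=0.5):
    """Map each default-box index to the ground-truth index it best overlaps.

    Default boxes whose best overlap is below the threshold are left out,
    which corresponds to treating them as background.
    """
    matches = {}
    for d_idx, d_box in enumerate(default_boxes):
        best_gt, best_iou = None, 0.0
        for g_idx, g_box in enumerate(gt_boxes):
            overlap = iou(d_box, g_box)
            if overlap > best_iou:
                best_gt, best_iou = g_idx, overlap
        if best_gt is not None and best_iou >= threshold:
            matches[d_idx] = best_gt
    return matches


# Hypothetical boxes: one ground-truth sign and three default boxes.
gt = [(0, 0, 10, 10)]
defaults = [(0, 0, 10, 10), (20, 20, 30, 30), (0, 0, 10, 8)]
print(match_boxes(gt, defaults))  # {0: 0, 2: 0}
```

A training sample for which this mapping comes back empty has no matching default box, which is the case the dataset section below says is discarded.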

Differences from the original SSD implementation

Obviously, we are only detecting certain traffic signs in this implementation, whereas the original SSD implementation detected a greater number of object classes in the PASCAL VOC and MS COCO datasets. Other notable differences are:

Uses AlexNet as the base network
Input image resolution is 400x260
Uses a dynamic scaling factor based on the dimensions of the feature map relative to original image dimensions
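
One plausible reading of the dynamic scaling factor is sketched below. The repository's actual formula is not reproduced here; this sketch simply assumes the per-axis scale is the image dimension divided by the feature-map dimension, and the 25x13 feature-map size is a hypothetical value chosen for illustration.

```python
IMG_W, IMG_H = 400, 260  # input resolution stated above


def default_box_centers(fm_w, fm_h, img_w=IMG_W, img_h=IMG_H):
    """Map every feature-map cell to the pixel coordinates of its center.

    The per-axis scale is the image size divided by the feature-map size,
    so it adapts to whatever feature-map dimensions the network produces.
    """
    scale_x = img_w / fm_w
    scale_y = img_h / fm_h
    centers = []
    for row in range(fm_h):
        for col in range(fm_w):
            centers.append(((col + 0.5) * scale_x, (row + 0.5) * scale_y))
    return scale_x, scale_y, centers


# Hypothetical 25x13 feature map (not taken from the repository).
scale_x, scale_y, centers = default_box_centers(25, 13)
print(scale_x, scale_y, centers[0])  # 16.0 20.0 (8.0, 10.0)
```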

Performance

This SSD implementation achieves 40-45 fps on a GTX 1080 with an Intel Core i7 6700K.

The inference time is the sum of the neural network inference time and the Non-Maximum Suppression (NMS) time. The neural network inference time is significantly less than the NMS time: it is generally 7-8 ms, whereas the NMS time is 15-16 ms. The NMS algorithm implemented here has not been optimized and runs on CPU only, so there is room to improve performance there.
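
For reference, greedy NMS can be sketched in a few lines of plain Python. This is a generic illustration, not the repository's implementation; the function names, the corner-coordinate box format, and the 0.5 IoU threshold are assumptions.

```python
def iou(box_a, box_b):
    """Jaccard overlap of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first.

    Boxes are visited in descending score order; a box is kept only if it
    does not overlap any already-kept box by more than the threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # [0, 2]
```

The pairwise loop is quadratic in the number of candidate boxes, which is one reason an unoptimized CPU version can dominate the frame time. Summing the times quoted above (about 22 to 24 ms per frame) gives roughly 42 to 45 fps, consistent with the reported frame rate.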

Dataset characteristics

The entire LISA Traffic Sign Dataset consists of 47 distinct traffic sign classes. Since we are only concerned with a subset of those classes, we only use a subset of the LISA dataset. We also ignore all training samples for which we do not find a matching default box, further reducing the dataset's size. As a result, we end up with very little data to work with.

To mitigate this, we can perform image data augmentation and/or pre-train the model on a larger dataset (e.g. VOC2012, ILSVRC).
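
As an illustration of what augmentation involves for detection data, the sketch below picks a random crop window and shifts the bounding boxes to match. It is only one possible augmentation, not something the repository implements; the function name, the 0.8 crop fraction, and the rule of dropping boxes that are not fully inside the crop are assumptions.

```python
import random


def random_crop_with_boxes(img_w, img_h, boxes, crop_frac=0.8, rng=random):
    """Pick a random crop window and express the boxes in crop coordinates.

    Boxes are (x_min, y_min, x_max, y_max) in pixels. Boxes that are not
    fully inside the window are dropped. The caller is expected to slice
    the image with the returned window.
    """
    crop_w, crop_h = int(img_w * crop_frac), int(img_h * crop_frac)
    x0 = rng.randint(0, img_w - crop_w)
    y0 = rng.randint(0, img_h - crop_h)
    x1, y1 = x0 + crop_w, y0 + crop_h
    kept = []
    for bx0, by0, bx1, by1 in boxes:
        if bx0 >= x0 and by0 >= y0 and bx1 <= x1 and by1 <= y1:
            kept.append((bx0 - x0, by0 - y0, bx1 - x0, by1 - y0))
    return (x0, y0, x1, y1), kept


# Hypothetical 400x260 frame with one annotated sign.
window, new_boxes = random_crop_with_boxes(
    400, 260, [(150, 100, 200, 140)], rng=random.Random(0)
)
print(window, new_boxes)
```

The point of the sketch is that box coordinates must be transformed together with the pixels; an augmentation that moves the image but not the annotations silently corrupts the training data.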

Training process

Given the small size of our pruned dataset, I chose a train/validation split of 95/5. The model was trained with the Adadelta optimizer, using the default parameters provided by TensorFlow, for 200 epochs with a batch size of 32.
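
The split and batching arithmetic can be sketched as follows. This is an illustration only: the helper names, the fixed seed, and the dataset size of 1000 samples are hypothetical, and the repository's own splitting code may differ.

```python
import math
import random


def split_train_val(samples, val_fraction=0.05, seed=0):
    """Shuffle the samples and hold out a fraction for validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def batches_per_epoch(n_train, batch_size=32):
    """Number of batches needed to see every training sample once."""
    return math.ceil(n_train / batch_size)


# Hypothetical dataset of 1000 samples.
train, val = split_train_val(range(1000))
print(len(train), len(val), batches_per_epoch(len(train)))  # 950 50 30
```

With so few validation samples, the validation loss is a noisy estimate, which is worth keeping in mind when judging the overfitting noted earlier.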

Areas of improvement

There are multiple potential areas of improvement in this project:

Pre-train the model on VOC2012 and/or ILSVRC
Image data augmentation
Hyper-parameter tuning
Optimize the NMS algorithm, or leverage an existing optimized NMS implementation
Implement and report mAP metric
Try different base networks
Expand to more traffic sign classes
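
For the mAP item, the per-class average precision (AP) can be sketched as below; mAP is then the mean of AP over classes. The sketch uses all-point interpolation of the precision-recall curve and assumes each detection has already been labeled as a true or false positive (typically by an IoU test against the ground truth). The function name and input format are assumptions, not part of the repository.

```python
def average_precision(detections, num_gt):
    """Average precision for one class.

    detections: list of (score, is_true_positive) pairs.
    num_gt: number of ground-truth objects of this class.
    """
    if num_gt == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the resulting step curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Hypothetical class with two ground-truth signs and three detections.
dets = [(0.9, True), (0.8, False), (0.7, True)]
print(round(average_precision(dets, num_gt=2), 4))  # 0.8333
```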

https://github.com/seann999/ssd_tensorflow

ssd_tensorflow

A TensorFlow implementation of the Single Shot MultiBox Detector (SSD) paper, in development.

Results of some hand-picked test images through an experimental run with MS COCO, some good and some bad:

Just looking through them, the results are okay but not good enough.

However, several major things that the original paper did for COCO have not yet been done here:

Train on 500x500 images (this was 300x300)
Use COCO trainval (this was only train)
Use batch size 32 (this was only 8)

Other major improvements needed:

Implement proper evaluation (mAP)
Optimize training (currently pretty slow)

Concerns:

Simple momentum optimizer stopped working (stopped converging) at some point during development, but adding batch normalization made it work again

Dependencies

TensorFlow
OpenCV
MS COCO tools

Basic Instructions

This project is still under development--it's especially slow, but here are some instructions anyway.

You need vgg16.npy from this repository, which is what I used for the base network. Unfortunately, it's a big file, and for now it's just uploaded to Mega, so you might need an account. Direct link to npy
For now, the code uses and depends on MS COCO. You need the MS COCO dataset from here. You should at least have the 2014 training images and corresponding 2014 train/val object instance annotations.
Download and install COCO tools from here
Change the COCO paths in coco_loader.py
Test or train with trainer.py

