Training a tf-ssd model on your own dataset

阿新 • Published: 2019-01-04

Building the dataset

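My boss took on a project for Toyota, and the factory wants the system to recognize pedestrians holding umbrellas in the rain, traffic cones, people lying on the ground, and so on. The PASCAL VOC classes include none of these, so VOC cannot meet the requirement and we had to build our own dataset to train the network. We shot some video on site, split it into raw frames with a script I had written earlier, and handed the images to the junior students in our lab to annotate. The annotation tool is an open-source project on GitHub whose author built a GUI that produces labeled .xml files; thanks to him for that. https://github.com/tzutalin/labelImg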

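Annotation leaves us with a large number of .xml files, which then need to be arranged in the same format as VOC. Let's first look at how the VOC training set is laid out: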

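Since we are only doing detection, three folders are enough: Annotations, ImageSets, and JPEGImages. The two segmentation folders and the test folder can be ignored. Annotations holds the .xml files and JPEGImages holds the original .jpg images.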

Inside ImageSets, only the Main folder is needed. It holds the .txt files that list which samples belong to each split.
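As a rough sketch of the layout described above, the three folders could be created as follows. The helper name `make_voc_skeleton` is purely illustrative and not part of any library; the folder names are the ones from the VOC layout.

```python
import os

def make_voc_skeleton(root):
    """Create the minimal VOC-style folders needed for detection."""
    subdirs = [
        "Annotations",                      # .xml label files
        "JPEGImages",                       # original .jpg images
        os.path.join("ImageSets", "Main"),  # train/val/test split lists
    ]
    created = []
    for sub in subdirs:
        path = os.path.join(root, sub)
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created
```

Calling `make_voc_skeleton("mydataset")` would create the folders under the dataset root used later in this post.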

Nothing else in Main matters: only train.txt, trainval.txt, val.txt, and test.txt are required. The following script generates all four:

import os
import random

xmlfilepath = r'/home/ogai/ngy/nissd/mydataset/Annotations'
saveBasePath = r'/home/ogai/ngy/nissd/'

trainval_percent = 0.8  # share of all samples used for train + val
train_percent = 0.7     # share of trainval used for train
total_xml = [f for f in os.listdir(xmlfilepath) if f.endswith('.xml')]
num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
# Sets make the membership tests in the loop below fast.
trainval = set(random.sample(indices, tv))
train = set(random.sample(sorted(trainval), tr))

print("train and val size", tv)
print("train size", tr)

main_dir = os.path.join(saveBasePath, 'mydataset/ImageSets/Main')
with open(os.path.join(main_dir, 'trainval.txt'), 'w') as ftrainval, \
     open(os.path.join(main_dir, 'test.txt'), 'w') as ftest, \
     open(os.path.join(main_dir, 'train.txt'), 'w') as ftrain, \
     open(os.path.join(main_dir, 'val.txt'), 'w') as fval:
    for i in indices:
        # Drop the .xml extension; each line is one sample name.
        name = os.path.splitext(total_xml[i])[0] + '\n'
        if i in trainval:
            ftrainval.write(name)
            if i in train:
                ftrain.write(name)
            else:
                fval.write(name)
        else:
            ftest.write(name)

That completes the dataset.
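Before moving on, it may be worth checking that the splits are consistent and that every listed sample has both an annotation and an image. The sketch below is one way to do that; `read_split` and `check_dataset` are hypothetical helper names, and it assumes the images use the `.jpg` extension.

```python
import os

def read_split(path):
    """Return the sample names listed in one split file, one per line."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def check_dataset(root):
    """Return a list of human-readable problems found under a VOC-style root."""
    problems = []
    main_dir = os.path.join(root, "ImageSets", "Main")
    splits = {name: set(read_split(os.path.join(main_dir, name + ".txt")))
              for name in ("train", "val", "trainval", "test")}

    # train and val together should make up trainval; test must not overlap it.
    if splits["train"] | splits["val"] != splits["trainval"]:
        problems.append("train + val does not equal trainval")
    if splits["train"] & splits["val"]:
        problems.append("train and val overlap")
    if splits["trainval"] & splits["test"]:
        problems.append("trainval and test overlap")

    # Every listed sample needs both an annotation and an image.
    for name in sorted(splits["trainval"] | splits["test"]):
        if not os.path.isfile(os.path.join(root, "Annotations", name + ".xml")):
            problems.append("missing annotation: " + name)
        if not os.path.isfile(os.path.join(root, "JPEGImages", name + ".jpg")):
            problems.append("missing image: " + name)
    return problems
```

An empty list from `check_dataset("mydataset")` would mean none of these particular problems were found.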

Generating tfrecords

First, edit the class label definitions in /datasets/pascalvoc_common.py. Here is what I did (the original definition is kept as a comment):

""" 
VOC_LABELS = { 
    'none': (0, 'Background'), 
    'aeroplane': (1, 'Vehicle'), 
    'bicycle': (2, 'Vehicle'), 
    'bird': (3, 'Animal'), 
    'boat': (4, 'Vehicle'), 
    'bottle': (5, 'Indoor'), 
    'bus': (6, 'Vehicle'), 
    'car': (7, 'Vehicle'), 
    'cat': (8, 'Animal'), 
    'chair': (9, 'Indoor'), 
    'cow': (10, 'Animal'), 
    'diningtable': (11, 'Indoor'), 
    'dog': (12, 'Animal'), 
    'horse': (13, 'Animal'), 
    'motorbike': (14, 'Vehicle'), 
    'person': (15, 'Person'),
    'pottedplant': (16, 'Indoor'), 
    'sheep': (17, 'Animal'), 
    'sofa': (18, 'Indoor'), 
    'train': (19, 'Vehicle'), 
    'tvmonitor': (20, 'Indoor'), 
} 
"""  
  
VOC_LABELS = {  
    'none': (0, 'Background'),  
    'cone': (1, 'Cone'),  
    'umbrellaman': (2, 'Umbrella Man'),  # remaining classes omitted here
}

In short, list exactly as many classes as your data has.
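Since the dictionary keys must match the names written into the annotations, a quick cross-check can catch typos before conversion. The sketch below counts the `<object><name>` values in the .xml files and reports any that have no entry in the label dictionary. The helper names are hypothetical, and the `VOC_LABELS` shown is only the illustrative subset from above.

```python
import os
import xml.etree.ElementTree as ET
from collections import Counter

# Illustrative subset; use the same keys as your edited VOC_LABELS.
VOC_LABELS = {
    'none': (0, 'Background'),
    'cone': (1, 'Cone'),
    'umbrellaman': (2, 'Umbrella Man'),
}

def count_object_names(annotation_dir):
    """Count how often each <object><name> value occurs in the .xml files."""
    counts = Counter()
    for fname in os.listdir(annotation_dir):
        if not fname.endswith('.xml'):
            continue
        root = ET.parse(os.path.join(annotation_dir, fname)).getroot()
        for obj in root.findall('object'):
            counts[obj.findtext('name', default='').strip()] += 1
    return counts

def unknown_names(counts, labels):
    """Return annotated names that have no entry in the label dictionary."""
    return sorted(name for name in counts if name not in labels)
```

The per-class counts are also a cheap way to see how unbalanced the dataset is.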

Write a shell script that uses tf_convert_data.py to generate the tfrecords:

#!/bin/bash

DATASET_DIR=./mydataset/
OUTPUT_DIR=./tfrecords
python tf_convert_data.py \
    --dataset_name=pascalvoc \
    --dataset_dir=${DATASET_DIR} \
    --output_name=voc_2007_train \
    --output_dir=${OUTPUT_DIR}

In my case, though, this failed with: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

The fix is to edit line 83 of pascalvoc_to_tfrecords.py and change 'r' to 'rb':

image_data = tf.gfile.FastGFile(filename, 'rb').read()
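The error makes sense once you recall that JPEG files start with the byte 0xFF, which is not a valid first byte in UTF-8, so a text-mode read fails at position 0. The sketch below reproduces this with Python's built-in `open` standing in for `tf.gfile.FastGFile` (an assumption made only to keep the example dependency-free); `try_read` is a hypothetical helper.

```python
import os
import tempfile

# First bytes of a typical JPEG file: the SOI marker 0xFF 0xD8, then 0xFF 0xE0.
JPEG_HEADER = b'\xff\xd8\xff\xe0'

def try_read(path, mode):
    """Read a file in the given mode and report what happened."""
    kwargs = {} if 'b' in mode else {'encoding': 'utf-8'}
    try:
        with open(path, mode, **kwargs) as f:
            data = f.read()
        return 'ok: read %d %s' % (len(data), 'bytes' if 'b' in mode else 'chars')
    except UnicodeDecodeError as err:
        return 'UnicodeDecodeError: ' + err.reason

if __name__ == '__main__':
    fd, path = tempfile.mkstemp(suffix='.jpg')
    with os.fdopen(fd, 'wb') as f:
        f.write(JPEG_HEADER)
    print(try_read(path, 'r'))   # text mode tries to decode 0xff as UTF-8
    print(try_read(path, 'rb'))  # binary mode returns the raw bytes
    os.remove(path)
```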

Fine-tuning

To train on our own data starting from the pretrained SSD model (vgg300), write a shell script that calls train_ssd_network.py. Watch the paths:

#!/bin/bash

DATASET_DIR=./tfrecords
TRAIN_DIR=./logs/
CHECKPOINT_PATH=./checkpoints/ssd_300_vgg.ckpt
python train_ssd_network.py \
    --train_dir=${TRAIN_DIR} \
    --dataset_dir=${DATASET_DIR} \
    --dataset_name=pascalvoc_2007 \
    --dataset_split_name=train \
    --model_name=ssd_300_vgg \
    --checkpoint_path=${CHECKPOINT_PATH} \
    --save_summaries_secs=60 \
    --save_interval_secs=600 \
    --weight_decay=0.0005 \
    --optimizer=adam \
    --learning_rate=0.001 \
    --batch_size=16

TRAIN_DIR is the directory where the checkpoints produced by training are written; the demo later loads its checkpoint from here.

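I trained for a little over a day on a 1070 Ti (6 GB), but the loss kept oscillating between 1 and 5 and never converged, and I don't know why. In the end I simply stopped training and tested with the latest checkpoint, and oddly enough the results turned out to be decent...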
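Per-batch loss at a small batch size tends to be noisy, so raw values bouncing within a band do not by themselves show whether the underlying trend is still falling. One way to look at the trend, offered only as a sketch and not something done in the run above, is an exponential moving average over the logged losses. The function name, the `alpha` value, and the sample numbers are all made up for illustration.

```python
def exponential_moving_average(values, alpha=0.02):
    """Smooth a sequence; a smaller alpha means heavier smoothing."""
    smoothed = []
    avg = None
    for v in values:
        # Seed with the first value, then blend each new value in.
        avg = v if avg is None else alpha * v + (1 - alpha) * avg
        smoothed.append(avg)
    return smoothed

if __name__ == '__main__':
    # Hypothetical noisy loss values, not taken from a real training log.
    losses = [4.8, 1.2, 3.9, 2.1, 4.4, 1.6, 3.0, 2.2]
    for raw, smooth in zip(losses, exponential_moving_average(losses, alpha=0.3)):
        print(raw, round(smooth, 3))
```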

Demo

Open /notebooks/ssd_notebook.ipynb in Jupyter Notebook, then change the checkpoint path so that it points to your own training output.
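To find which checkpoint to point the notebook at, something like the following could pick the newest one in the training directory. It is a sketch that assumes checkpoints are saved with names of the form `model.ckpt-<step>.index`, in line with the `model.ckpt-252448` prefix used in the evaluation script; `latest_checkpoint_prefix` is a hypothetical helper. TensorFlow's own `tf.train.latest_checkpoint` serves a similar purpose if you would rather not parse file names.

```python
import os
import re

def latest_checkpoint_prefix(train_dir):
    """Return the 'model.ckpt-<step>' prefix with the highest step, or None."""
    if not os.path.isdir(train_dir):
        return None
    # Assumed naming scheme: model.ckpt-<step>.index
    pattern = re.compile(r'^(model\.ckpt-(\d+))\.index$')
    best_step, best_prefix = -1, None
    for fname in os.listdir(train_dir):
        match = pattern.match(fname)
        if match and int(match.group(2)) > best_step:
            best_step = int(match.group(2))
            best_prefix = os.path.join(train_dir, match.group(1))
    return best_prefix
```

Steps are compared as integers, so `model.ckpt-99999` correctly sorts before `model.ckpt-252448`.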

Then change the class labels to your own.

I grabbed a few images from the web to try it out.

Evaluation

Write a shell script that uses eval_ssd_network.py to run the evaluation:

#!/bin/bash

EVAL_DIR=./logs/
DATASET_DIR=./tfrecords/
CHECKPOINT_PATH=./logs/model.ckpt-252448
python eval_ssd_network.py \
    --eval_dir=${EVAL_DIR} \
    --dataset_dir=${DATASET_DIR} \
    --dataset_name=pascalvoc_2007 \
    --dataset_split_name=train \
    --model_name=ssd_300_vgg \
    --checkpoint_path=${CHECKPOINT_PATH} \
    --batch_size=1

EVAL_DIR is where the evaluation results are saved, and DATASET_DIR is the path to the dataset used for evaluation.

But this raised an error: TypeError: Can not convert a tuple into a Tensor or Operation

The fix is to define a flatten function in eval_ssd_network.py:

def flatten(x):
    result = []
    for el in x:
        if isinstance(el, tuple):
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

Then, on what were originally lines 318 and 338, change

eval_op = list(names_to_updates.values()),

to

eval_op = flatten(list(names_to_updates.values())),
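To see what the change does, here is a small standalone sketch of the same flatten function applied to a dictionary whose values mix single items and nested tuples. Plain strings stand in for the TensorFlow update ops, which is an assumption made purely for illustration.

```python
def flatten(x):
    """Recursively expand tuples inside x into a single flat list."""
    result = []
    for el in x:
        if isinstance(el, tuple):
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

# Strings stand in for TensorFlow update ops here (illustration only).
names_to_updates = {
    'metric_a': 'update_a',
    'metric_b': ('update_b1', ('update_b2', 'update_b3')),
}

print(list(names_to_updates.values()))           # still contains nested tuples
print(flatten(list(names_to_updates.values())))  # one flat list of items
```

The unflattened list still has tuples as elements, which is the shape the TypeError complains about; after flattening, every element is a single item.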
