
Training and Testing YOLO on Your Own Data

Following my earlier post on installing YOLO, this one shares how to train and test YOLO on your own data. (Here we detect only one class, the steering wheel inside a car, but the same steps generalize to other classes such as cars, animals, people, and so on.)

How to Build the Dataset

Training needs a dataset to work from, so the first step is building one. Since we detect only a single class, I prepared a little over 500 images. I used LabelImg to annotate them; download it from GitHub, where usage instructions are provided. I recommend running it under Anaconda.


The first time you use it, you need to run three setup commands; on every later use, just switch into the LabelImg directory and run the third command to open the annotation UI.
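For reference, the three commands meant here (as I recall them from the LabelImg README for an Anaconda setup; check the current README to be sure) are roughly:

conda install pyqt=5
pyrcc5 -o libs/resources.py resources.qrc
python labelImg.py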

The Dataset

I created a VOC2007 directory directly under a drive root (E: in my case), with the usual VOC subfolders Annotations, ImageSets, and JPEGImages.


The Annotations folder holds the XML files saved from the LabelImg annotations.



The ImageSets folder contains three subfolders, of which only Main is used here; it holds files such as train.txt and test.txt, and the other two are not needed for now. The JPEGImages folder holds the collected images. Both jpg and png work: some tutorials claim jpg is required, but png is fine too, you just have to adjust a few lines later on (my images are png). Before annotating, it is best to batch-rename the images to something like 0000001.png, as sketched below.
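A minimal Python sketch of such a batch rename (the directory path is assumed from the layout above, and the originals are assumed not to already use the target names):

import os

image_dir = "E:/VOC2007/JPEGImages"   # assumed location from the layout above

# Rename every image to a zero-padded sequential name such as 0000001.png.
for i, name in enumerate(sorted(os.listdir(image_dir)), start=1):
    ext = os.path.splitext(name)[1].lower()
    if ext in (".png", ".jpg"):
        os.rename(os.path.join(image_dir, name),
                  os.path.join(image_dir, "%07d%s" % (i, ext)))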


The train.txt file under Main simply lists the image names, without extension and without path; it can be generated with a short script, as sketched below.
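A minimal sketch (paths assumed from the layout above) that writes one base name per line:

import os

image_dir = "E:/VOC2007/JPEGImages"               # assumed paths
out_path  = "E:/VOC2007/ImageSets/Main/train.txt"

# Write each image's base name (no path, no extension), one per line.
with open(out_path, "w") as f:
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() in (".png", ".jpg"):
            f.write(stem + "\n")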


Converting the Generated XML Files to txt

There is a voc_label.py script in the darknet-master\scripts directory:

import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import join

#sets=[('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test')]
#classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
sets=[('2007', 'train')]   # my Main folder only contains train.txt; every image is used for training, none for validation or testing
classes = ["wheel"]        # only one class: the steering wheel
wd = "E:"                  # hard-coded root so the script need not live in the scripts directory; a few lines below change accordingly

def convert(size, box):
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)

def convert_annotation(year, image_id):
    #in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id))
    #out_file = open('VOCdevkit/VOC%s/labels/%s.txt'%(year, image_id), 'w')
    in_file = open('%s/VOC%s/Annotations/%s.xml'%(wd, year, image_id))
    out_file = open('%s/VOC%s/labels/%s.txt'%(wd, year, image_id), 'w')
    tree=ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w,h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

#wd = getcwd()

#for year, image_set in sets:
#    if not os.path.exists('VOCdevkit/VOC%s/labels/'%(year)):
#        os.makedirs('VOCdevkit/VOC%s/labels/'%(year))
#    image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt'%(year, image_set)).read().strip().split()
#    list_file = open('%s_%s.txt'%(year, image_set), 'w')
#    for image_id in image_ids:
#        list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg\n'%(wd, year, image_id))
#        convert_annotation(year, image_id)
#    list_file.close()

for year, image_set in sets:
    if not os.path.exists('%s/VOC%s/labels/'%(wd, year)):
        os.makedirs('%s/VOC%s/labels/'%(wd, year))
    image_ids = open('%s/VOC%s/ImageSets/Main/%s.txt'%(wd, year, image_set)).read().strip().split()
    list_file = open('%s_%s.txt'%(year, image_set), 'w')
    for image_id in image_ids:
        list_file.write('%s/VOC%s/JPEGImages/%s.png\n'%(wd, year, image_id))
        convert_annotation(year, image_id)
    list_file.close()

Running the script creates a labels folder under the VOC2007 directory, holding one converted txt file per image.
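Each txt file contains one line per annotated object in the format produced by convert(): the class id followed by the normalized box center and size. A hypothetical line for a single steering wheel might look like:

0 0.512500 0.487037 0.275000 0.362963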

At the same time, 2007_train.txt is generated in the E: drive root; move it into the darknet-master\build\darknet\x64 directory.


Modifying the Configuration Files

Go into the darknet-master\build\darknet\x64 directory.

First, edit the voc.names file in the data folder. Since we detect only one class, the file needs just a single line: wheel.

Next, edit the voc.data file in the cfg folder; a sketch of its contents follows.
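For this one-class setup the file should look roughly like this (paths assumed from the steps above; the valid entry is not used during training, so if present it can simply point at the same list):

classes = 1
train   = 2007_train.txt
valid   = 2007_train.txt
names   = data/voc.names
backup  = backup/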


Then edit a .cfg file in the cfg folder, which defines the network. Pick whichever you like; I use yolov2.cfg here.

[net]
# Testing
batch=5                # kept the batch small; my machine cannot handle a large one
subdivisions=1
# Training
# batch=64
# subdivisions=8
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.0005   # lowered the learning rate
burn_in=1000
max_batches = 2000     # with only 500-odd images there is no need for many iterations
policy=steps
steps=200,250
scales=.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky


#######

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[route]
layers=-9

[convolutional]
batch_normalize=1
size=1
stride=1
pad=1
filters=64
activation=leaky

[reorg]
stride=2

[route]
layers=-1,-4

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=30    # the last conv layer before [region] needs filters = num*(classes+5) = 5*(1+5) = 30
activation=linear


[region]
anchors =  0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828
bias_match=1
classes=1          # set the number of classes to 1
coords=4
num=5
softmax=1
jitter=.3
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .6
random=0     # 1 enables multi-scale training on randomly resized inputs; set 0 to save computation

Finally, edit yolo.c: open the darknet project, find the yolo.c source, change voc_names to hold just the one class, wheel, and rebuild the solution.
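Assuming the stock darknet yolo.c, the edit amounts to shrinking the class-name array to a single entry:

char *voc_names[] = {"wheel"};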

Rebuilding regenerates darknet.exe in the x64 directory.


Training

In the darknet-master\build\darknet\x64 directory, run:

darknet.exe detector train cfg\voc.data cfg\yolov2.cfg
The trained models are saved in the backup directory.

Testing

In the darknet-master\build\darknet\x64 directory, run:

darknet.exe detector test cfg/voc.data cfg/yolov2.cfg backup/yolov2_1800.weights E:/VOC2007/JPEGImages/00001.png

This was a first pass at training YOLO on my own data; I have only tried out the basic workflow and have not dug in deeply yet. More in-depth exploration to come.