
Training YOLOv3 on Your Own Data

I spent nearly a week tinkering to get YOLO training on my own data, and went through dozens of blog posts along the way. There are a lot of pitfalls out there, especially in the CSDN posts. Enough backstory; straight to the point.

  1. Follow the official YOLO instructions to get the stock demo running first (this part works fine).
  2. Edit the Makefile:

   A: GPU=1 (the GPU build is said to be up to ~500× faster than the CPU build)

CUDNN=1

OPENCV=1

   B: ARCH= -gencode arch=compute_30,code=sm_30 \
      -gencode arch=compute_32,code=sm_32 \
      -gencode arch=compute_30,code=[sm_30,compute_30] \
      -gencode arch=compute_32,code=[sm_32,compute_32]  (match your NVIDIA card's compute capability)

   C: NVCC=/usr/local/cuda-9.0/bin/nvcc (your CUDA version)

   D: ifeq ($(GPU), 1)
COMMON+= -DGPU -I/usr/local/cuda-9.0/include/
CFLAGS+= -DGPU
LDFLAGS+= -L/usr/local/cuda-9.0/lib64 -lcuda -lcudart -lcublas -lcurand
endif (adjust these paths to match your machine)
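If you rebuild darknet often, the Makefile switches above can be flipped with a small script instead of editing by hand. A minimal sketch, assuming the stock darknet Makefile layout; the CUDA path is only an example:

```python
import re

def patch_makefile(text, cuda_home="/usr/local/cuda-9.0"):
    """Flip the GPU/CUDNN/OPENCV switches in a darknet-style Makefile."""
    for key in ("GPU", "CUDNN", "OPENCV"):
        # turn e.g. "GPU=0" into "GPU=1" (line-anchored, so keys don't collide)
        text = re.sub(rf"^{key}=0", f"{key}=1", text, flags=re.M)
    # point NVCC at the local CUDA install
    text = re.sub(r"^NVCC=.*$", f"NVCC={cuda_home}/bin/nvcc", text, flags=re.M)
    return text

# demo on a toy Makefile fragment
print(patch_makefile("GPU=0\nCUDNN=0\nOPENCV=0\nNVCC=nvcc\n"))
```

Run it on the real Makefile contents, write the result back, then `make` as usual.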

  3. Label the images. labelImg is excellent and writes the .txt files YOLO needs directly. I labeled two classes:

0 0.392708 0.499219 0.110417 0.026562

0 0.636458 0.520312 0.122917 0.021875

1 0.392708 0.496875 0.014583 0.009375

1 0.639583 0.518750 0.012500 0.006250

Each line holds five numbers: the class id, the box center coordinates x y, and the box width and height w h, all normalized to the image size.

Save the images and label files wherever you like, then create a train.txt and a val.txt and write the image paths into them; the paths of these two files are needed when you create face.data below.
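The label format and the train/val lists can also be produced programmatically. A sketch under my own assumptions (a pixel-coordinate `(x1, y1, x2, y2)` input box, a flat image directory, a 90/10 split); labelImg already emits this format for you:

```python
import os

def to_yolo(cls, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-coordinate box into one YOLO label line:
    class x_center y_center width height, all normalized to [0, 1]."""
    xc = (x1 + x2) / 2.0 / img_w
    yc = (y1 + y2) / 2.0 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

def write_image_lists(image_dir, train_ratio=0.9):
    """Split the images into train.txt / val.txt with absolute paths."""
    images = sorted(
        os.path.join(image_dir, f)
        for f in os.listdir(image_dir)
        if f.lower().endswith((".jpg", ".png"))
    )
    cut = int(len(images) * train_ratio)
    for name, subset in (("train.txt", images[:cut]), ("val.txt", images[cut:])):
        with open(os.path.join(image_dir, name), "w") as fh:
            fh.write("\n".join(subset) + "\n")

print(to_yolo(0, 100, 200, 300, 250, 1920, 1080))
```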

  4. Edit the config file: copy cfg/yolov3.cfg to face.cfg and make the following changes:

   What to change:

   A: For testing, uncomment:

batch=1

subdivisions=1

For training, uncomment:

batch=64

subdivisions=8 (set according to your GPU's memory and compute)

   B: In the [convolutional] section immediately above each [yolo] layer, set filters=(classes+5)*3 (three places in total).

   C: Set classes in each [yolo] layer (three places in total).
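The filters rule from step B is easy to sanity-check. For the two classes here, (2+5)*3 = 21, which matches the filters=21 that appears before each [yolo] layer in the cfg below:

```python
def yolo_filters(num_classes, coords=4, anchors_per_scale=3):
    """Filters for the conv layer right before each [yolo] layer:
    (classes + coords + 1) * anchors, where the +1 is the objectness score."""
    return (num_classes + coords + 1) * anchors_per_scale

print(yolo_filters(2))   # 21
print(yolo_filters(80))  # 255, the stock COCO yolov3.cfg value
```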

  5. Create a face.names file under data/ and list the classes you labeled, e.g.:

    eye

Pupils

  6. Create face.data under cfg/ with:

classes= 2

train  = /home/bluesandals/code/darknet/face_data/train.txt (any path you like)

valid  = /home/bluesandals/code/darknet/face_data/val.txt (any path you like)

names = data/face.names (the .names file created above)

backup = backup (directory for the output weight files; result works too)
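Before launching training it is worth checking that every image listed in train.txt/val.txt actually exists and has a label file. A small sketch, assuming labels sit next to the images with the same stem (darknet can also look in a parallel labels/ directory, which this sketch does not cover):

```python
import os

def check_dataset(list_file):
    """Return the images from a train.txt/val.txt list that are missing
    either the image itself or the label .txt with the same stem."""
    missing = []
    with open(list_file) as fh:
        for line in fh:
            img = line.strip()
            if not img:
                continue
            label = os.path.splitext(img)[0] + ".txt"
            if not (os.path.isfile(img) and os.path.isfile(label)):
                missing.append(img)
    return missing
```

Run it on both lists; an empty result means darknet will find a label for every image.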

  7. Notes on the parameters in the .cfg file:

  [net]

# Testing

# batch=1

# subdivisions=1

# Training

batch=32 # number of training samples per batch; the weights are updated once per batch

subdivisions=16 # batch/subdivisions samples are pushed through the network at a time

# Each iteration draws batch=32 random samples from the training set; they are fed into the network in subdivisions chunks to ease memory pressure

width=608

height=608

channels=3

momentum=0.9  # momentum: accelerates learning when successive gradients agree, damps oscillation when they do not

decay=0.0005  # weight-decay regularization, to reduce overfitting

angle=0   # rotation angle for data augmentation

saturation = 1.5 # saturation range for augmentation

exposure = 1.5  # exposure range for augmentation

hue=.1  # hue range for augmentation

learning_rate=0.001 # initial learning rate; start in the 0.1–0.001 range, decay it later (e.g. by ~100×)

burn_in=1000  # below burn_in iterations the learning rate ramps up with its own rule; above it, the policy below applies

max_batches = 500200 # training stops once max_batches iterations are reached

policy=steps

steps=400000,450000

scales=.1,.1
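Taken together, burn_in, policy=steps, steps and scales define the learning-rate schedule. A sketch of how the steps policy behaves (the power=4 burn-in exponent is darknet's default, to the best of my knowledge):

```python
def lr_at(iteration, base_lr=0.001, burn_in=1000,
          steps=(400000, 450000), scales=(0.1, 0.1), power=4):
    """Learning rate at a given iteration under darknet's policy=steps."""
    if iteration < burn_in:
        # warm-up ramp: base_lr * (i / burn_in) ** power
        return base_lr * (iteration / burn_in) ** power
    lr = base_lr
    for step, scale in zip(steps, scales):
        if iteration >= step:
            lr *= scale  # each passed step multiplies the rate by its scale
    return lr

print(lr_at(500))     # still warming up
print(lr_at(400000))  # first decay step: 0.001 -> 0.0001
```

With steps=400000,450000 and scales=.1,.1 the rate drops 10× at each step, i.e. 100× overall, matching the note on learning_rate above.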

[convolutional]

batch_normalize=1  # whether to apply batch normalization

filters=32         # number of output feature maps

size=3             # convolution kernel size

stride=1           # convolution stride

pad=1  # if pad=0, padding is taken from the padding parameter; if pad=1, padding=size/2

activation=leaky # activation function

# Downsample

[convolutional]

batch_normalize=1

filters=64

size=3

stride=2

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=32

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=64

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

# Downsample

[convolutional]

batch_normalize=1

filters=128

size=3

stride=2

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=64

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=128

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=64

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=128

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

# Downsample

[convolutional]

batch_normalize=1

filters=256

size=3

stride=2

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=256

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=256

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=256

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=256

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=256

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=256

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=256

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=256

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

# Downsample

[convolutional]

batch_normalize=1

filters=512

size=3

stride=2

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=512

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=512

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=512

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=512

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=512

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=512

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=512

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=512

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

# Downsample

[convolutional]

batch_normalize=1

filters=1024

size=3

stride=2

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=512

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=1024

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=512

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=1024

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=512

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=1024

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

[convolutional]

batch_normalize=1

filters=512

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

filters=1024

size=3

stride=1

pad=1

activation=leaky

[shortcut]

from=-3

activation=linear

######################

[convolutional]

batch_normalize=1

filters=512

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

size=3

stride=1

pad=1

filters=1024

activation=leaky

[convolutional]

batch_normalize=1

filters=512

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

size=3

stride=1

pad=1

filters=1024

activation=leaky

[convolutional]

batch_normalize=1

filters=512

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

size=3

stride=1

pad=1

filters=1024

activation=leaky

stopbackward=1 # speeds up training by stopping backpropagation into the earlier layers

[convolutional]

size=1

stride=1

pad=1

filters=21  # for the last convolutional layer before each yolo layer, filters=(classes+1+coords)*anchors_num

activation=linear

[yolo]

mask = 6,7,8

anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326

# anchors can be precomputed ahead of time; they depend on the image set, width, height and the number of clusters. They can also be hand-picked, or learned from the training boxes with k-means.

classes=2

num=9 # boxes predicted per grid cell, equal to the number of anchors; to use more anchors, increase num. If obj stays near 0 during training after raising num, try increasing object_scale

jitter=.3 # random jitter to generate more augmented data

ignore_thresh = .7 # controls whether the IoU error enters the cost: predictions with IoU above this threshold do not contribute

truth_thresh = 1

random=0 # if 1, the input size is randomly resampled between 320 and 608 (in steps of 32) during training; if 0, the input size stays fixed
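As the anchors comment above notes, the nine anchors can be re-derived from your own training boxes with k-means, using 1 − IoU as the distance. A self-contained sketch (not darknet's own implementation; feed it box sizes in input-resolution pixels, i.e. the normalized label w/h multiplied by width/height):

```python
import random

def iou_wh(box, centroid):
    """IoU between two (w, h) pairs, both anchored at the origin."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """k-means on box sizes with 1 - IoU as the distance."""
    random.seed(seed)
    centroids = random.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            # assign each box to the centroid it overlaps most
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            clusters[best].append(b)
        # recompute centroids as the mean size of each cluster
        centroids = [
            (sum(b[0] for b in c) / len(c), sum(b[1] for b in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centroids, key=lambda wh: wh[0] * wh[1])
```

The sorted output maps naturally onto the mask indices: the three smallest anchors go to mask 0,1,2, the middle three to 3,4,5, the largest three to 6,7,8.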

[route]

layers = -4

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[upsample]

stride=2

[route]

layers = -1, 61

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

size=3

stride=1

pad=1

filters=512

activation=leaky

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

size=3

stride=1

pad=1

filters=512

activation=leaky

[convolutional]

batch_normalize=1

filters=256

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

size=3

stride=1

pad=1

filters=512

activation=leaky

[convolutional]

size=1

stride=1

pad=1

filters=21

activation=linear

[yolo]

mask = 3,4,5

anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326

classes=2

num=9

jitter=.3

ignore_thresh = .7

truth_thresh = 1

random=0

[route]

layers = -4

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[upsample]

stride=2

[route]

layers = -1, 36

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

size=3

stride=1

pad=1

filters=256

activation=leaky

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

size=3

stride=1

pad=1

filters=256

activation=leaky

[convolutional]

batch_normalize=1

filters=128

size=1

stride=1

pad=1

activation=leaky

[convolutional]

batch_normalize=1

size=3

stride=1

pad=1

filters=256

activation=leaky

[convolutional]

size=1

stride=1

pad=1

filters=21

activation=linear

[yolo]

mask = 0,1,2

anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326

classes=2

num=9

jitter=.3

ignore_thresh = .7

truth_thresh = 1

random=0

# Per-batch output: 1491: 2.497199, 2.531538 avg, 0.001000 rate, 12.053180 seconds, 47712 images

#  1491: current training iteration

#  2.497199: total loss

#  2.531538 avg: average loss (lower is better; as a rule of thumb, training can be stopped once it drops below 0.060730)

#  0.001000 rate: current learning rate, defined by the initial value and policy in the .cfg file

#  12.053180 seconds: total time spent on this batch

#  47712 images: total images seen so far = iterations (1491) × batch (32)

#  Region 106 Avg IOU: 0.498723, Class: 0.998587, Obj: 0.022303, No Obj: 0.000196, .5R: 0.500000, .75R: 0.250000,  count: 8;

# Boxes of different sizes are predicted at three scales:

# Region 82 Avg IOU: the coarsest prediction scale, using the larger anchors (mask 6,7,8) to detect the larger objects;

# Region 94 Avg IOU: the middle prediction scale, using the medium anchors to detect medium objects;

# Region 106 Avg IOU: the finest prediction scale, using the smaller anchors to detect the smaller objects;

#  Avg IOU: 0.498723: the average IoU over the images in the current subdivision, i.e. the intersection over union of predicted vs ground-truth boxes (0.9 would already be very high);

#  Class: 0.998587: classification accuracy on the labeled objects; should approach 1;

#  Obj: 0.022303: the closer to 1 the better;

#  No Obj: 0.000196: should keep shrinking, but not reach 0;

#  .5R: 0.500000: recall at an IoU threshold of 0.5; recall = detected positives / actual positives

#  .75R: 0.250000: recall at an IoU threshold of 0.75;

#  count: 8: number of images in the current subdivision that contain positive samples.
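The per-batch lines above are easy to scrape out of face_train.log, for example to plot the average-loss curve. A minimal sketch:

```python
import re

# one line of darknet's per-batch output, as shown above
LINE = "1491: 2.497199, 2.531538 avg, 0.001000 rate, 12.053180 seconds, 47712 images"

def parse_batch_line(line):
    """Pull iteration, total loss, average loss and rate out of a batch line;
    returns None for non-batch lines (Region lines, loading messages, ...)."""
    m = re.match(r"\s*(\d+): ([\d.]+), ([\d.]+) avg, ([\d.]+) rate", line)
    if not m:
        return None
    it, loss, avg, rate = m.groups()
    return {"iter": int(it), "loss": float(loss),
            "avg_loss": float(avg), "rate": float(rate)}

print(parse_batch_line(LINE))
```

Apply it line by line over the log file and keep the non-None results; the avg_loss column is the one to watch for the 0.060730 stopping rule.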

To train while writing the output to a log file:

./darknet detector train cfg/face.data cfg/face.cfg darknet53.conv.74 -gpus 0,1 2>&1 | tee face_train.log

To test the training result:

./darknet detector test cfg/face.data cfg/face.cfg face_900.weights face_data/0000001.jpg -thresh 0.25

Validation set: labeled data that takes no part in training; by comparing predictions against the labels, validation measures how often the model is right and thus evaluates it;

acc = correct / (correct + wrong)

./darknet detector recall cfg/face.data cfg/face.cfg backup/face_final.weights

./darknet detector valid cfg/face.data cfg/face.cfg backup/face_final.weights -out 123 -gpu 0 -thresh .5