Training YOLOv3 on Your Own Data
I spent close to a week wrestling with training YOLO on my own data, consulting dozens of blog posts along the way, many of them (especially on CSDN) riddled with pitfalls. Enough preamble; straight to the point.
- First get the stock demo running by following the instructions on the YOLO website (this part works without trouble).
- Edit the Makefile:
A: GPU=1 (GPU training is vastly faster than CPU, roughly 500x)
CUDNN=1
OPENCV=1
B: ARCH= -gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_32,code=sm_32 \
-gencode arch=compute_30,code=[sm_30,compute_30] \
-gencode arch=compute_32,code=[sm_32,compute_32] (match your NVIDIA GPU's compute capability)
C: NVCC=/usr/local/cuda-9.0/bin/nvcc (match your CUDA version)
D: ifeq ($(GPU), 1)
COMMON+= -DGPU -I/usr/local/cuda-9.0/include/
CFLAGS+= -DGPU
LDFLAGS+= -L/usr/local/cuda-9.0/lib64 -lcuda -lcudart -lcublas -lcurand
endif (adjust the CUDA paths to your machine)
- Annotate your images. labelImg is very convenient and writes .txt files directly in the format YOLO needs. I annotated two classes:
0 0.392708 0.499219 0.110417 0.026562
0 0.636458 0.520312 0.122917 0.021875
1 0.392708 0.496875 0.014583 0.009375
1 0.639583 0.518750 0.012500 0.006250
Each line holds five numbers: the class index, the normalized box center x y, and the normalized box width and height w h.
Store the images and label files wherever you like, then create a train.txt and a val.txt and write the image paths into them; the paths of these two files are needed in the face.data file below.
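To build those list files, a short script can walk the image directory and split it; a minimal sketch (the `write_split` helper, its extension filter, and the 10% default split are my own assumptions, not anything darknet ships with):

```python
import os
import random

def write_split(image_dir, train_path="train.txt", val_path="val.txt",
                val_ratio=0.1, seed=0):
    """Collect absolute image paths and split them into train.txt / val.txt."""
    images = sorted(
        os.path.join(os.path.abspath(image_dir), f)
        for f in os.listdir(image_dir)
        if f.lower().endswith((".jpg", ".jpeg", ".png"))
    )
    random.Random(seed).shuffle(images)          # deterministic shuffle
    n_val = max(1, int(len(images) * val_ratio))  # keep at least one val image
    with open(val_path, "w") as f:
        f.write("\n".join(images[:n_val]) + "\n")
    with open(train_path, "w") as f:
        f.write("\n".join(images[n_val:]) + "\n")
    return len(images) - n_val, n_val
```

Darknet reads one image path per line from these files; the matching .txt label file is found by swapping the image extension.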
- Edit the network config: copy cfg/yolov3.cfg to face.cfg and change the following:
A: For testing, use:
# batch=1
# subdivisions=1
For training, use:
# batch=64
# subdivisions=8 (tune to your GPU's memory)
B: In the [convolutional] section immediately above each [yolo] layer, set filters=(classes+5)*3 (three places in total).
C: Set classes in each [yolo] layer (three places in total).
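The filters arithmetic in B is worth double-checking, since a mismatch makes darknet abort at load time; a trivial sanity check (the helper is illustrative, not part of darknet):

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """filters for the conv layer feeding a [yolo] layer:
    each anchor predicts 4 box coords + 1 objectness + num_classes scores."""
    return (num_classes + 5) * anchors_per_scale

print(yolo_filters(2))   # 21 for the two classes (eye, Pupils) in this guide
print(yolo_filters(80))  # 255, the stock COCO value in yolov3.cfg
```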
- Under data/ create a face.names file listing the classes you want to label, one per line, e.g.:
eye
Pupils
- Under cfg/ create face.data containing:
classes= 2
train = /home/bluesandals/code/darknet/face_data/train.txt (any path you like)
valid = /home/bluesandals/code/darknet/face_data/val.txt (any path you like)
names = data/face.names (the .names file created above)
backup = backup (directory for the output weight files; could also be, e.g., result)
- Notes on the parameters in the .cfg file:
[net]
# Testing
# batch=1
# subdivisions=1
# Training
batch=32 # number of samples per training batch; parameters are updated once per batch
subdivisions=16 # batch/subdivisions samples are fed to the network at a time
# Each iteration draws batch=32 random samples from the training set; they are fed into
# the network in `subdivisions` chunks to reduce memory pressure.
width=608
height=608
channels=3
momentum=0.9 # momentum: accelerates learning when successive gradients agree, damps oscillation when they disagree
decay=0.0005 # weight-decay regularization, guards against overfitting
angle=0 # augmentation: image rotation angle range
saturation = 1.5 # augmentation: saturation range
exposure = 1.5 # augmentation: exposure range
hue=.1 # augmentation: hue shift range
learning_rate=0.001 # initial learning rate, usually 0.1 to 0.001 to start, decayed later by about 100x overall
burn_in=1000 # below burn_in iterations the rate ramps up with its own rule; beyond it, `policy` takes over
max_batches = 500200 # training stops once max_batches iterations are reached
policy=steps
steps=400000,450000
scales=.1,.1
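Taken together, learning_rate, burn_in, policy=steps, steps, and scales define the whole schedule; a sketch of how the effective rate evolves (the quartic warm-up exponent matches darknet's default burn-in power, but treat this helper as an illustration, not darknet's exact code):

```python
def learning_rate(batch_num, base_lr=0.001, burn_in=1000,
                  steps=(400000, 450000), scales=(0.1, 0.1), power=4):
    """Darknet-style schedule: polynomial warm-up during burn_in,
    then the 'steps' policy multiplies the rate at each step boundary."""
    if batch_num < burn_in:
        return base_lr * (batch_num / burn_in) ** power
    lr = base_lr
    for step, scale in zip(steps, scales):
        if batch_num >= step:
            lr *= scale
    return lr
```

With the values above, the rate warms up to 0.001 over the first 1000 iterations, drops to 0.0001 at 400000, and to 0.00001 at 450000.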
[convolutional]
batch_normalize=1 # whether to apply batch normalization
filters=32 # number of output feature maps
size=3 # convolution kernel size
stride=1 # convolution stride
pad=1 # if pad=0, padding comes from the `padding` parameter; if pad=1, padding=size/2
activation=leaky # activation function
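The pad rule above determines each layer's output resolution; a quick way to check what a layer does to the 608x608 input (the helper name is made up):

```python
def conv_out(in_size, size, stride, pad):
    """Spatial output size of a darknet conv layer (pad=1 means padding=size//2)."""
    padding = size // 2 if pad else 0
    return (in_size + 2 * padding - size) // stride + 1

print(conv_out(608, 3, 1, 1))  # 608: stride 1 keeps the resolution
print(conv_out(608, 3, 2, 1))  # 304: the Downsample blocks halve it
```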
# Downsample
[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
# Downsample
[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
# Downsample
[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
# Downsample
[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
# Downsample
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[shortcut]
from=-3
activation=linear
######################
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
stopbackward=1 # cut off backpropagation to the earlier layers here to speed up training
[convolutional]
size=1
stride=1
pad=1
filters=21 # the last conv layer before each [yolo] layer: filters=(classes+1+coords)*anchors_per_scale = (2+1+4)*3
activation=linear
[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
# anchors can be precomputed with a darknet command; they depend on the image set, width,
# height, and cluster count. They can also be hand-picked, or learned from the training
# labels with k-means.
classes=2
num=9 # boxes predicted per grid cell, must match the number of anchors; to use more anchors, increase num. If obj stays near 0 during training after increasing num, try raising object_scale
jitter=.3 # augmentation: random jitter of the crop to generate more data
ignore_thresh = .7 # governs whether a prediction contributes IoU error: predictions with IoU above this threshold are not counted in the cost
truth_thresh = 1
random=0 # if 1, the input size is randomly resized between 320 and 608 (in steps of 32) each iteration; if 0, the input size stays fixed
[route]
layers = -4
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[upsample]
stride=2
[route]
layers = -1, 61
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky
[convolutional]
size=1
stride=1
pad=1
filters=21
activation=linear
[yolo]
mask = 3,4,5
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=2
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=0
[route]
layers = -4
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[upsample]
stride=2
[route]
layers = -1, 36
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky
[convolutional]
size=1
stride=1
pad=1
filters=21
activation=linear
[yolo]
mask = 0,1,2
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=2
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=0
# Batch output: 1491: 2.497199, 2.531538 avg, 0.001000 rate, 12.053180 seconds, 47712 images
# 1491: current training iteration
# 2.497199: total loss for this batch
# 2.531538 avg: average loss (lower is better; as a rule of thumb, training can stop once it falls below roughly 0.06)
# 0.001000 rate: current learning rate, defined by the initial value and policy in the .cfg
# 12.053180 seconds: total time spent on this batch
# 47712 images: images seen so far = iterations (1491) x batch (32)
# Region 106 Avg IOU: 0.498723, Class: 0.998587, Obj: 0.022303, No Obj: 0.000196, .5R: 0.500000, .75R: 0.250000, count: 8;
# Boxes of different sizes are predicted at three scales:
# Region 82 Avg IOU: the coarsest prediction scale, using the largest anchors (mask 6,7,8) to detect large objects;
# Region 94 Avg IOU: the middle prediction scale, using the medium anchors to detect medium objects;
# Region 106 Avg IOU: the finest prediction scale, using the smallest anchors to detect small objects;
# Avg IOU: 0.498723: mean IoU over the images in the current subdivision, i.e. intersection over union of predicted and ground-truth boxes; 0.9 would be very good;
# Class: 0.998587: classification accuracy on annotated objects; should approach 1;
# Obj: 0.022303: objectness score; the closer to 1 the better;
# No Obj: 0.000196: should keep shrinking, but stay above 0;
# .5R: 0.500000: recall at an IoU threshold of 0.5; recall = detected positives / actual positives
# .75R: 0.250000: recall at an IoU threshold of 0.75;
# count: 8: number of images in the current subdivision that contain positive samples.
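These batch-summary lines are easy to plot once pulled out of the log file; a small parser sketch (the regex and helper name are my own, matched to the sample line above):

```python
import re

def avg_loss_curve(log_path):
    """Extract (iteration, avg loss) pairs from darknet batch-summary lines
    like '1491: 2.497199, 2.531538 avg, 0.001000 rate, ...'."""
    pat = re.compile(r"^\s*(\d+):\s*([\d.]+),\s*([\d.]+)\s+avg", re.M)
    with open(log_path) as f:
        return [(int(m.group(1)), float(m.group(3))) for m in pat.finditer(f.read())]
```

Plotting the returned pairs gives the avg-loss curve used to decide when to stop training.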
Train, teeing the console output into a log file:
./darknet detector train cfg/face.data cfg/face.cfg darknet53.conv.74 -gpus 0,1 2>&1 | tee face_train.log
Test the trained model:
./darknet detector test cfg/face.data cfg/face.cfg face_900.weights face_data/0000001.jpg -thresh 0.25
Validation set: labeled data held out from training; the model is evaluated by comparing its predictions against these labels to measure accuracy;
acc = correct / (correct + incorrect)
./darknet detector recall cfg/face.data cfg/face.cfg backup/face_final.weights
./darknet detector valid cfg/face.data cfg/face.cfg backup/face_final.weights -out 123 -gpu 0 -thresh .5