Summary of Problems Testing and Training YOLO v1

阿新 • Published: 2019-01-14

0. Environment
Ubuntu 16.04
OpenCV 3.1
CUDA 8.0
cuDNN 6.0

1. Problems

(1). After installing Darknet, running the following test command fails:
./darknet -i 1 imagenet test cfg/alexnet.cfg alexnet.weights

Error:
yolov1/darknet-master# ./darknet -i 1 imagenet test cfg/alexnet.cfg alexnet.weights
Not an option: imagenet

Solution:
./darknet -i 0 test ./data/horses.jpg cfg/alexnet.cfg alexnet.weights

(2). Running make fails in convolutional_layer.c when cuDNN is enabled:
[email protected]:/data/gpu_cuda8_cudnn6_py2/yolov1/darknet# make
mkdir -p obj
mkdir -p results
gcc -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN -Wall -Wfatal-errors -Ofast -DOPENCV -DGPU -DCUDNN -c ./src/gemm.c -o obj/gemm.o
gcc -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN -Wall -Wfatal-errors -Ofast -DOPENCV -DGPU -DCUDNN -c ./src/utils.c -o obj/utils.o
./src/utils.c: In function ‘fgetl’:
./src/utils.c:265:9: warning: ignoring return value of ‘fgets’, declared with attribute warn_unused_result [-Wunused-result]
fgets(&line[curr], readsize, fp);
^
gcc -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN -Wall -Wfatal-errors -Ofast -DOPENCV -DGPU -DCUDNN -c ./src/cuda.c -o obj/cuda.o
gcc -DOPENCV `pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN -Wall -Wfatal-errors -Ofast -DOPENCV -DGPU -DCUDNN -c ./src/convolutional_layer.c -o obj/convolutional_layer.o
./src/convolutional_layer.c: In function ‘cudnn_convolutional_setup’:
./src/convolutional_layer.c:145:5: error: too few arguments to function ‘cudnnSetConvolution2dDescriptor’
cudnnSetConvolution2dDescriptor(l->convDesc, l->pad, l->pad, l->stride, l->stride, 1, 1, CUDNN_CROSS_CORRELATION);
^
compilation terminated due to -Wfatal-errors.
Makefile:59: recipe for target ‘obj/convolutional_layer.o’ failed
make: *** [obj/convolutional_layer.o] Error 1
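Analysis:
The Makefile was configured as follows (cuDNN disabled):
GPU=1
CUDNN=0
OPENCV=1

A quoted explanation of the same kind of cuDNN version mismatch, written about Caffe:
[Quote]: Cause: my guess is that the Caffe used to train OpenPose was modified from an older version, so the cuDNN it supports is also an old one. When the cuDNN version on your system is greater than 6, the problem above appears. The official Caffe that you download updates cudnn.hpp and the other related sources as cuDNN versions change, so compiling the official Caffe does not hit this problem. The cuDNN version on my system is 8.0, which produces the error above.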
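
Disabling cuDNN avoids the compile error, but it also gives up cuDNN acceleration. As an alternative, here is a minimal sketch that rests on an assumption not taken from the original post: from cuDNN 6 onward, cudnnSetConvolution2dDescriptor takes one extra trailing argument, the compute type. The call can therefore be guarded on CUDNN_MAJOR and given CUDNN_DATA_FLOAT. The helper name patch_cudnn_call is hypothetical; it only rewrites the text of the source file, so back up convolutional_layer.c before trying it.

```python
import re

# Matches the pre-cuDNN-6 call, which ends with the convolution mode argument.
OLD_CALL = re.compile(
    r"^(?P<indent>[ \t]*)cudnnSetConvolution2dDescriptor\("
    r"(?P<args>.*CUDNN_CROSS_CORRELATION)\);[ \t]*$",
    re.MULTILINE,
)


def patch_cudnn_call(source: str) -> str:
    """Wrap the old call in a CUDNN_MAJOR guard (hypothetical helper)."""
    if "CUDNN_MAJOR >= 6" in source:
        return source  # already patched, leave the text unchanged

    def guarded(match: re.Match) -> str:
        indent, args = match.group("indent"), match.group("args")
        name = "cudnnSetConvolution2dDescriptor"
        return (
            "#if CUDNN_MAJOR >= 6\n"
            # Assumption: cuDNN >= 6 expects the compute type as a last argument.
            f"{indent}{name}({args}, CUDNN_DATA_FLOAT);\n"
            "#else\n"
            f"{indent}{name}({args});\n"
            "#endif"
        )

    return OLD_CALL.sub(guarded, source)
```

To use it, read ./src/convolutional_layer.c into a string, pass it through patch_cudnn_call, write the result back, set CUDNN=1 in the Makefile, and rebuild.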

(3). Compiling image.c fails with an unknown type name CvCapture:
./src/image.c:481:14: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
if( (src = cvLoadImage(filename, flag)) == 0 )
^
./src/image.c: At top level:
./src/image.c:496:29: error: unknown type name ‘CvCapture’
image get_image_from_stream(CvCapture *cap)
^
compilation terminated due to -Wfatal-errors.
Makefile:59: recipe for target ‘obj/image.o’ failed
make: *** [obj/image.o] Error 1

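Solution:
In each file that reports the error, add the following line inside the #ifdef OPENCV block:
#include "opencv2/videoio/videoio_c.h"
Several files need this line, so make sure every one of them is updated.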
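
Because several files need the same edit, it can be scripted. The following is a minimal sketch, assuming each affected file contains a line that reads exactly #ifdef OPENCV; the helper name add_videoio_include is hypothetical. It inserts the include directly after the first such line and leaves files that already have it untouched.

```python
INCLUDE_LINE = '#include "opencv2/videoio/videoio_c.h"'


def add_videoio_include(source: str) -> str:
    """Insert the videoio include after the first '#ifdef OPENCV' line."""
    if INCLUDE_LINE in source:
        return source  # nothing to do, the include is already present
    lines = source.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.strip() == "#ifdef OPENCV":
            lines.insert(index + 1, INCLUDE_LINE + "\n")
            return "".join(lines)
    return source  # no OPENCV block found, leave the file as it is
```

Apply it to the text of every file that fails with the CvCapture error (for example ./src/image.c), write the result back, and run make again.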

(4). Loading the model
29: Dropout Layer: 12544 inputs, 0.500000 probability
30: Connected Layer: 12544 inputs, 1715 outputs
31: Detection Layer
forced: Using default '0'
Loading weights from yolo.weights...Done!
CUDA Error: invalid device function
darknet: ./src/cuda.c:35: check_error: Assertion `0' failed.
Aborted (core dumped)

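Solution:
The binary was compiled for a GPU architecture that does not match the installed graphics card, so the Makefile has to be changed.
Modify the ARCH setting. (I never paid attention to this option when compiling before, but every CUDA error I have hit recently came from this well-hidden pitfall. The project was copied over from another machine with a different graphics card, and I compiled it as it was. It compiles without complaint, but a CUDA error appears as soon as a YOLO model is run. Check your own graphics card model carefully and set ARCH to match it.)
A darknet binary built after the configuration changes in steps 1 and 2 may report the following error at run time:
Loading weights from yolo.weights...Done!
CUDA Error: invalid device function
darknet: ./src/cuda.c:21: check_error: Assertion `0' failed.
Aborted (core dumped)
This is the result of ignoring the graphics card model, as described above: the GPU architecture configured in the Makefile does not match the GPU in the machine.
The default configuration before the change is as follows (it may differ between versions):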
ARCH= -gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=[sm_50,compute_50] \
-gencode arch=compute_52,code=[sm_52,compute_52]

compute_30 means the card's compute capability is 3.0. Compute capabilities of several mainstream GPUs:
GTX Titan X: 5.2
GTX 980: 5.2
Tesla K80: 3.7
Tesla K40: 3.5
K4200: 3.0
After changing ARCH to match your GPU, recompile.
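
The mapping from a compute capability to an ARCH entry is mechanical, so it can be written down as a small helper. This is a minimal sketch; the names CAPABILITIES and gencode_flag are hypothetical, and the capability values are copied from the list above. With embed_ptx=True the flag also embeds PTX, which is the bracketed form used for compute_50 and compute_52 in the default Makefile.

```python
# Compute capabilities copied from the list above.
CAPABILITIES = {
    "GTX Titan X": "5.2",
    "GTX 980": "5.2",
    "Tesla K80": "3.7",
    "Tesla K40": "3.5",
    "K4200": "3.0",
}


def gencode_flag(capability: str, embed_ptx: bool = False) -> str:
    """Build one nvcc -gencode flag from a capability string such as '5.2'."""
    major, minor = capability.split(".")
    cc = f"{int(major)}{int(minor)}"  # '5.2' becomes '52'
    code = f"[sm_{cc},compute_{cc}]" if embed_ptx else f"sm_{cc}"
    return f"-gencode arch=compute_{cc},code={code}"


if __name__ == "__main__":
    # Print a single-architecture ARCH line for a Tesla K80.
    print("ARCH= " + gencode_flag(CAPABILITIES["Tesla K80"]))
```

Paste the printed line over the ARCH block in the Makefile, then rebuild so that the CUDA kernels are compiled for the GPU that is actually installed.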
