Training Caffe on your own data and calling the model for classification
I recently had an assignment that needed Caffe for some image classification. I'm used to TensorFlow, so this threw me at first. The image set I used is from South China University of Technology.
I was lazy about the installation, so I had an experienced classmate set it up for me. I had planned to do it all myself, but having help turned out to be great.
Since I had previously installed tensorflow-gpu with CUDA 9.0, and Caffe currently seems to support CUDA 8.0 at most, a build against 9.0 fails; to avoid the hassle I just installed the CPU version.
Then I wanted to warm up with a simple classification task. The first blog post I came across turned out to read much like the others, with roughly the same sequence of steps, but I still hit problems of my own, mostly to do with paths.
So pay careful attention to where each script's paths are used and how they get joined with other paths along the way.
First, probably because of Caffe version differences, many online tutorials place the executables under "/build/tools/", whereas mine are under "caffe\scripts\build\tools\Release". With that noted, let's walk through the workflow.
The files my whole training run produced:
1. train.txt, val.txt and label.txt. I put all my images together under data; I originally generated the txt files with Python (a small sketch of that is shown right after these files). The images are split into train and val folders, and those folder paths have to be added in some of the later files; I'll explain why below.
ftw93.jpg 0
ftw94.jpg 0
ftw95.jpg 0
ftw96.jpg 0
ftw97.jpg 0
ftw98.jpg 0
ftw99.jpg 0
...
mtw1.jpg 1
mtw10.jpg 1
mtw100.jpg 1
mtw101.jpg 1
mtw102.jpg 1
mtw103.jpg 1
mtw104.jpg 1
mtw105.jpg 1
mtw106.jpg 1
mtw107.jpg 1
label.txt simply lists all the classes:
0 Western female
1 Asian female
2 Western male
3 Asian male
All my files are configured like this. The reason I don't use absolute paths here is path concatenation, which I'll explain in a moment.
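Since I mentioned doing the txt processing in Python, here is a minimal sketch of that approach (the folder path and the prefix-to-label mapping are assumptions matching my layout; the prefix order follows the generation script below):
# Minimal sketch (paths and prefixes assumed from my layout): write train.txt
# as "filename label" lines by mapping filename prefixes to label indices.
import os

data_dir = 'D:/caffe/examples/my_image/data/train'  # assumed training image folder
prefixes = ['ftw', 'fty', 'mtw', 'mty']             # one prefix per class, labels 0-3

with open('train.txt', 'w') as f:
    for name in sorted(os.listdir(data_dir)):
        if not name.lower().endswith('.jpg'):
            continue
        for label, prefix in enumerate(prefixes):
            if name.startswith(prefix):
                f.write('%s %d\n' % (name, label))
                break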
(Updated 2018-12-06) The following script automatically generates the training and test txt files for Caffe (with training and test images in two separate folders):
#!/usr/bin/env sh
DATA=D:/caffe/examples/my_image
FILETYPE=jpg  # image format of the samples to process
echo "Create train.txt..."
rm -rf $DATA/train.txt
array=("ftw" "fty" "mtw" "mty")
# loop over the classes
for i in 0 1 2 3
do
echo ${array[i]}
find $DATA/data/train -name ${array[i]}*.$FILETYPE | cut -d '/' -f7 | sed "s/$/ $i/">>train.txt  # append "filename label" lines
done
echo "Create val.txt..."
rm -rf $DATA/val.txt
for i in 0 1 2 3
# cut -f7 keeps the 7th '/'-separated field (the bare filename); adjust it to your
# directory depth, e.g. -f6-7 for two levels; without the extra train folder, -f7 alone is enough
do
find $DATA/data/test -name ${array[i]}*.$FILETYPE | cut -d '/' -f7 | sed "s/$/ $i/">>val.txt
done
echo "All done"
pause
2. Generating the lmdb files. I also used the create_imagenet.sh file, located at caffe\examples\imagenet. At first I used relative paths inside it and parts of it kept failing, so I switched everything to absolute paths. This is where TRAIN_DATA_ROOT and VAL_DATA_ROOT come in: they point to the training and test data directories, and they get concatenated with the paths in train.txt and val.txt to form the full image paths. The reason I don't put full paths inside train.txt is that I launch the sh file from git bash; if TRAIN_DATA_ROOT is set to /, it defaults to the directory where the git executable lives, so training kept failing. I won't repeat all the comments; they are mostly self-explanatory, and the link I mentioned earlier covers them too.
#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
set -e
EXAMPLE=D:/caffe/examples/my_image
DATA=D:/caffe/examples/my_image/data/
TOOLS=D:/caffe/scripts/build/tools/Release
TRAIN_DATA_ROOT=D:/caffe/examples/my_image/data/train/
VAL_DATA_ROOT=D:/caffe/examples/my_image/data/test/
# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
RESIZE_HEIGHT=32 # the file sets this to 256 by default, but I used 32 to get results faster
RESIZE_WIDTH=32
else
RESIZE_HEIGHT=0
RESIZE_WIDTH=0
fi
if [ ! -d "$TRAIN_DATA_ROOT" ]; then
echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
"where the ImageNet training data is stored."
exit 1
fi
if [ ! -d "$VAL_DATA_ROOT" ]; then
echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
"where the ImageNet validation data is stored."
exit 1
fi
echo "Creating train lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
--resize_height=$RESIZE_HEIGHT \
--resize_width=$RESIZE_WIDTH \
--shuffle \
$TRAIN_DATA_ROOT \
$DATA/train.txt \
$EXAMPLE/ilsvrc12_train_lmdb # path of the generated lmdb
echo "Creating val lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
--resize_height=$RESIZE_HEIGHT \
--resize_width=$RESIZE_WIDTH \
--shuffle \
$VAL_DATA_ROOT \
$DATA/val.txt \
$EXAMPLE/ilsvrc12_val_lmdb # path of the generated lmdb
echo "Done."
3. Generating the mean_file, again with absolute paths everywhere; the blog post linked above describes what it is for. That link only computes the mean file for train, and every tutorial I saw likewise only produces the train mean_file and then uses it for both phases in the network, which struck me as odd. So I added one for test as well, but the resulting file turned out to be exactly the same size as the train mean_file, and it made no difference once fed into the network. Confusing.
(Note: a day later it hit me. In practice you only ever have training data and data to be predicted, so of course there is only a train mean file. As for why the sizes are the same: it is a mean file, so its size depends on the image dimensions, not on how many images went into it. See the small sketch after the commands below.)
EXAMPLE=D:/caffe/examples/my_image
DATA=D:/caffe/examples/my_image/data/
TOOLS=D:/caffe/scripts/build/tools/Release
$TOOLS/compute_image_mean $EXAMPLE/ilsvrc12_train_lmdb $EXAMPLE/imagenet_train_mean.binaryproto
$TOOLS/compute_image_mean $EXAMPLE/ilsvrc12_val_lmdb $EXAMPLE/imagenet_val_mean.binaryproto
echo "Done."
4. cifar10_quick_solver.prototxt and cifar10_quick_train_test.prototxt, both copied over from caffe\examples\cifar10.
cifar10_quick_train_test.prototxt; you can compare it against the original to see what differs (the lines I commented are the ones that changed):
name: "CIFAR10_quick"
layer {
name: "cifar"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mean_file: "D:/caffe/examples/my_image/imagenet_train_mean.binaryproto" #均值檔案路徑
}
data_param {
source: "D:/caffe/examples/my_image/ilsvrc12_train_lmdb" # lmdb檔案路徑
batch_size: 20 # 圖片數量比較少的話就不要設定太大了
backend: LMDB # 有兩種,生成的是lmdb,就選LMDB
}
}
layer {
name: "cifar"
type: "Data"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mean_file: "D:/caffe/examples/my_image/imagenet_train_mean.binaryproto" # 注意也是train的均值檔案
}
data_param {
source: "D:/caffe/examples/my_image/ilsvrc12_val_lmdb" # lmdb資料夾
batch_size: 20 # 測試時候batch_size
backend: LMDB # 同上
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
stride: 1
weight_filler {
type: "gaussian"
std: 0.0001
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "pool1"
top: "pool1"
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: AVE
kernel_size: 3
stride: 2
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3"
top: "pool3"
pooling_param {
pool: AVE
kernel_size: 3
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool3"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 64
weight_filler {
type: "gaussian"
std: 0.1
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 4 # number of output classes
weight_filler {
type: "gaussian"
std: 0.1
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
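Before training, it can help to check that the net definition parses and that the blob shapes line up; a minimal sketch with pycaffe (assuming it is built, and the lmdb files above already exist, since the Data layers open them):
# Minimal sketch (assumes pycaffe is built and the lmdb files exist): load the
# train/test net definition and print every blob's shape to verify the dimensions.
import caffe

caffe.set_mode_cpu()
net = caffe.Net('D:/caffe/examples/my_image/cifar10_quick_train_test.prototxt', caffe.TEST)
for name, blob in net.blobs.items():
    print(name, blob.data.shape)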
5. The cifar10_quick_solver.prototxt file
# reduce the learning rate after 8 epochs (4000 iters) by a factor of 10
# The train/test net protocol buffer definition
net: "D:/caffe/examples/my_image/cifar10_quick_train_test.prototxt" # 網路路徑
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 20 # number of test iterations; test_iter times the TEST batch_size should cover the whole test set
# Carry out testing every 500 training iterations.
test_interval: 10 # run a test pass every this many training iterations
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.001
momentum: 0.9
weight_decay: 0.004
# The learning rate policy
lr_policy: "fixed"
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 4000
# snapshot intermediate results
snapshot: 1000
snapshot_prefix: "D:/caffe/examples/my_image/cifar10_quick"
# solver mode: CPU or GPU
solver_mode: CPU
6. Start training: the train.sh file
D:/caffe/scripts/build/tools/Release/caffe train --solver=D:/caffe/examples/my_image/cifar10_quick_solver.prototxt
Then just run the train.sh file.
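If pycaffe was built along with the tools, the same training can also be driven from Python instead of the caffe executable; a minimal sketch:
# Minimal sketch (assumes pycaffe is built): run the same training from Python.
import caffe

caffe.set_mode_cpu()  # I built the CPU-only version
solver = caffe.get_solver('D:/caffe/examples/my_image/cifar10_quick_solver.prototxt')
solver.solve()  # runs to max_iter and writes snapshots as configured in the solver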
7. (Updated 2018-12-06) Classifying your own images. Note that this needs a deploy.prototxt, which must be compatible with the network trained earlier.
The deploy.prototxt I created (set up to match the network above) and test.sh are as follows:
name: "CIFAR10_quick"
layer {
name: "cifar"
type: "Input"
top: "data"
input_param { shape: { dim: 10 dim: 3 dim: 32 dim: 32 } } # the two 32s here (227 in many tutorials) must match the size you trained with, otherwise you will hit an exception
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
stride: 1
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "pool1"
top: "pool1"
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
stride: 1
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: AVE
kernel_size: 3
stride: 2
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
stride: 1
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3"
top: "pool3"
pooling_param {
pool: AVE
kernel_size: 3
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool3"
top: "ip1"
inner_product_param {
num_output: 64
}
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
inner_product_param {
num_output: 4 # the number of labels configured
}
}
layer {
name: "prob"
type: "Softmax"
bottom: "ip2"
top: "prob"
}
for i in 1 2 3 4
do
D:/caffe/scripts/build/examples/cpp_classification/Release/classification.exe D:/caffe/examples/my_image/deploy.prototxt D:/caffe/examples/my_image/cifar10_quick_iter_4000.caffemodel.h5 D:/caffe/examples/my_image/imagenet_train_mean.binaryproto D:/caffe/examples/my_image/label.txt C:/Users/xxx/Desktop/$i.jpg
# five paths in total: the deploy prototxt, the trained model, the mean file, the label file, and the image to classify
done
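For reference, the same classification can also be done from Python with pycaffe instead of classification.exe. This is only a sketch under my paths, assuming pycaffe is built; it mirrors what the classification example does (mean subtraction, channel swap) on a single image:
# Minimal sketch (assumes pycaffe is built): classify one image with the trained
# model, mirroring the classification.exe call above.
import caffe
from caffe.proto import caffe_pb2

caffe.set_mode_cpu()
net = caffe.Net('D:/caffe/examples/my_image/deploy.prototxt',
                'D:/caffe/examples/my_image/cifar10_quick_iter_4000.caffemodel.h5',
                caffe.TEST)

# load the train mean and set up the usual preprocessing
blob = caffe_pb2.BlobProto()
with open('D:/caffe/examples/my_image/imagenet_train_mean.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())
mean = caffe.io.blobproto_to_array(blob)[0]

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))     # HxWxC -> CxHxW
transformer.set_mean('data', mean)
transformer.set_raw_scale('data', 255)           # load_image returns values in [0, 1]
transformer.set_channel_swap('data', (2, 1, 0))  # RGB -> BGR

image = caffe.io.load_image('C:/Users/xxx/Desktop/1.jpg')
net.blobs['data'].reshape(1, 3, 32, 32)
net.blobs['data'].data[...] = transformer.preprocess('data', image)
prob = net.forward()['prob'][0]
print(prob.argmax(), prob)  # predicted class index (see label.txt) and all class probabilities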
8. Since this post originally used only two classes and the early steps were written very briefly and only patched up later, some parts may not connect smoothly. I have since retrained everything and kept the scripts, network definitions and trained model for the whole pipeline, which can be downloaded directly.
9. Building my own project in VS2015 to call the model for classification: I'm still figuring this step out.