Training a classifier on your own images with Caffe's ImageNet example (super detailed version)

阿新 • Published: 2019-01-29

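The material I found online points a beginner in the right direction but gives few concrete examples of the details, so I am recording my own training process here.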

[Experiment Environment]

Physical memory: 64 GB (7.5 GB free); CPUs: 3, with 8 physical cores each

Operating system: Linux

Note: GPU computing is available

[Goal]

Use my own image set and the Caffe framework to train the ImageNet example network (CaffeNet) and obtain my own model.

[Preparation]

1. Install and configure the Caffe environment

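[Procedure]

1. Prepare the dataset

Collect a training image set and a validation image set, and generate train.txt and val.txt, which list image paths and class labels; resize the images to 256x256; use the create_imagenet.sh script to convert the two image sets to LMDB format.

2. Compute the image mean

Use make_imagenet_mean.sh to compute the image mean, which produces the file imagenet_mean.binaryproto.

3. Set the network parameters

Copy the files in caffe-master/models/bvlc_reference_caffenet, edit the run parameters in train_val.prototxt and solver.prototxt, and update the paths; copy train_caffenet.sh from caffe-master/examples/imagenet and update its paths.

4. Run train_caffenet.sh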

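[Procedure in Detail]

First, a note on the directory layout, to keep things organized:

Caffe root directory: caffe_root=/home/james/caffe/

Image data: caffe_root/data/mydata

Scripts and configuration files: caffe_root/examples/mytask

Note: by default, every file we add by hand other than the images and the .txt lists counts as a script or configuration file; just watch the paths when running them. Also, we switched to someone else's computer partway through the experiment, so the Caffe root path is not consistent throughout (/home/james/caffe in the scripts, /home/dina/caffe in the prototxt files); adjust it to your own setup.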

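1. Prepare the dataset

a. Prepare the training and validation image sets

Create caffe_root/data/mydata and place the two image sets under caffe_root/data/mydata/train and caffe_root/data/mydata/val respectively.

b. Prepare the image lists

Create two files, train.txt and val.txt, under caffe_root/data/mydata. The content of train.txt looks like this: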

1.jpg 7

2.jpg 7

3.jpg 7

…

Each line has the form image name + space + class label (an integer). val.txt uses the same format (it needs class labels too).
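Once a list file is finished, it can help to check how many images each label has. The following is a minimal sketch, assuming the two-column format described above; `label_counts` is a hypothetical helper name, not a Caffe tool.

```shell
#!/bin/sh
# label_counts LISTFILE
# Prints "<label> <number of images>" for each label, sorted by label.
# (label_counts is a hypothetical helper, not part of Caffe.)
label_counts() {
    awk 'NF == 2 { count[$2]++ }
         END { for (label in count) print label, count[label] }' "$1" | sort -n
}

# Example usage (placeholder path):
# label_counts /home/james/caffe/data/mydata/train.txt
```

The largest label printed has to be smaller than num_output of the final layer in train_val.prototxt, because Caffe expects labels in the range 0 to num_output - 1.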

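You can use create_filelist.sh to add image entries to train.txt or val.txt in bulk. Edit it to match your own image names and labels, and run it repeatedly, once per range of images (each run appends to the end of the list file). The version below writes to val.txt: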

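#!/bin/bash
# Append one "<image name> <label>" line per image to val.txt.
# The {a..b} range below is bash syntax, so run this script with bash.
DATA=/home/james/caffe/data/mydata/val
MY=/home/james/caffe/data/mydata

for i in {3122..3221}
do
  echo "$i.jpg 3" >> "$MY/val.txt"
done

echo "All done"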

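The script above says that, among the images in the val folder, those named 3122.jpg through 3221.jpg all belong to class 3, so it writes the following to val.txt:

3122.jpg 3

3123.jpg 3

…

Note: you may see a "bad loop variable" error at this point. This usually means the script was run by sh, which on Ubuntu is dash rather than bash, and dash lacks bash-only loop syntax; running the script with bash usually avoids the error.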
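Instead of editing the script for every range of images, the same idea can be wrapped in a small function. This is a minimal sketch, assuming the images are named with consecutive integers as in the example; `append_range` is a hypothetical helper name. It uses a plain while loop, so it does not depend on bash-only syntax.

```shell
#!/bin/sh
# append_range FIRST LAST LABEL LISTFILE
# Appends one "<n>.jpg <label>" line to LISTFILE for every n in FIRST..LAST.
# (append_range is a hypothetical helper, not part of Caffe.)
append_range() {
    first=$1; last=$2; label=$3; list=$4
    i=$first
    while [ "$i" -le "$last" ]; do
        echo "$i.jpg $label" >> "$list"
        i=$((i + 1))
    done
}

# Example usage (placeholder path), equivalent to the script above:
# append_range 3122 3221 3 /home/james/caffe/data/mydata/val.txt
```

Because each call appends, remove or empty the list file first when regenerating it from scratch.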

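c. Resize the images to 256x256

I had not read the Caffe documentation carefully at first and only learned later that Caffe can do the resizing automatically (the RESIZE option in create_imagenet.sh), so in this step I resized the images myself with a separate command. If the images are not resized, the later commands fail with an error.

You can use convert256.sh for the conversion. It relies on the ImageMagick tools, so install them first if needed (sudo apt-get install imagemagick). The content of convert256.sh is: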

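#!/bin/sh
# Resize every training image to exactly 256x256 in place ("!" ignores the aspect ratio).
for name in /home/james/caffe/data/mydata/train/*.jpg; do
  convert -resize 256x256\! "$name" "$name"
done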
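The script above overwrites the original images and only covers the train folder. A non-destructive variant is sketched below, assuming ImageMagick's convert is installed; `resize_dir` and the `train_256` output folder are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# resize_dir SRC_DIR DST_DIR
# Writes a 256x256 copy of every .jpg in SRC_DIR into DST_DIR and leaves
# the originals untouched. Requires ImageMagick's convert on the PATH.
# (resize_dir is a hypothetical helper name.)
resize_dir() {
    src=$1; dst=$2
    mkdir -p "$dst"
    for name in "$src"/*.jpg; do
        [ -e "$name" ] || continue   # no .jpg files: the glob stays literal
        convert "$name" -resize '256x256!' "$dst/$(basename "$name")"
    done
}

# Example usage (placeholder paths); run once for train and once for val:
# resize_dir /home/james/caffe/data/mydata/train /home/james/caffe/data/mydata/train_256
```

If the resized copies go to a new folder, TRAIN_DATA_ROOT and VAL_DATA_ROOT in create_imagenet.sh need to point at that folder.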

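d. Build the image databases

Caffe does not train directly from image files; it reads its input from an image database. Use the create_imagenet.sh script to convert the train and val image sets to LMDB format. The content of create_imagenet.sh is: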

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs

EXAMPLE=/home/james/caffe/examples/mytask
DATA=/home/james/caffe/data/mydata
TOOLS=/home/james/caffe/build/tools

TRAIN_DATA_ROOT=/home/james/caffe/data/mydata/train/
VAL_DATA_ROOT=/home/james/caffe/data/mydata/val/

# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=false
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/ilsvrc12_train_lmdb

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/ilsvrc12_val_lmdb

echo "Done."

Note: change every path in the script to your own; leave everything that is not a path unchanged.
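A malformed list line (such as a missing space before the label) or a missing image is easier to catch before building the database. The check below is a minimal sketch, assuming the "name label" list format used in this post; `check_list` is a hypothetical helper, not a Caffe tool.

```shell
#!/bin/sh
# check_list ROOT LISTFILE
# Reports lines that are not "<name> <integer label>" and images that do not
# exist under ROOT. Returns non-zero if any problem was found.
# (check_list is a hypothetical helper, not part of Caffe.)
check_list() {
    root=$1; list=$2; bad=0
    while read -r name label extra || [ -n "$name" ]; do
        if [ -z "$label" ] || [ -n "$extra" ]; then
            echo "malformed line: $name $label $extra"; bad=1
        elif ! echo "$label" | grep -Eq '^[0-9]+$'; then
            echo "non-numeric label: $name $label"; bad=1
        elif [ ! -f "$root/$name" ]; then
            echo "missing image: $root/$name"; bad=1
        fi
    done < "$list"
    return $bad
}

# Example usage (placeholder paths):
# check_list /home/james/caffe/data/mydata/train /home/james/caffe/data/mydata/train.txt
```

Running it once for train.txt and once for val.txt before create_imagenet.sh keeps list mistakes from surfacing only during the conversion.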

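2. Compute the image mean

Subtracting the image mean from the inputs is said to improve training results. Use make_imagenet_mean.sh to compute the image mean, which produces imagenet_mean.binaryproto. The content of make_imagenet_mean.sh is: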

#!/usr/bin/env sh
# Compute the mean image from the imagenet training lmdb
# N.B. this is available in data/ilsvrc12

EXAMPLE=/home/james/caffe/examples/mytask
DATA=/home/james/caffe/data/mydata
TOOLS=/home/james/caffe/build/tools

$TOOLS/compute_image_mean $EXAMPLE/ilsvrc12_train_lmdb \
    $DATA/imagenet_mean.binaryproto

echo "Done."

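Note: change the paths to your own. The generated imagenet_mean.binaryproto ends up in the data/mydata folder, so keep that path in mind for the configuration that follows. The train_val.prototxt shown in the next step looks for the file under examples/mytask instead, so either copy the file there or change mean_file accordingly.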
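Because the mean file and the LMDB folders are hard-coded in the prototxt files, a wrong path is an easy mistake to make. The following is a minimal sketch of a check that can be run on train_val.prototxt or solver.prototxt before training; `check_prototxt_paths` is a hypothetical helper, and it only looks at the net:, source: and mean_file: fields.

```shell
#!/bin/sh
# check_prototxt_paths FILE
# Lists every path named by an uncommented net:, source: or mean_file: field
# in FILE that does not exist, and returns non-zero if there is one.
# (check_prototxt_paths is a hypothetical helper, not part of Caffe.)
check_prototxt_paths() {
    missing=$(grep -E '^[[:space:]]*(net|source|mean_file):' "$1" |
        sed 's/^[^"]*"\([^"]*\)".*/\1/' |
        while read -r path; do
            [ -e "$path" ] || echo "$path"
        done)
    if [ -n "$missing" ]; then
        echo "paths that do not exist:"
        echo "$missing"
        return 1
    fi
}

# Example usage (placeholder path):
# check_prototxt_paths /home/james/caffe/examples/mytask/train_val.prototxt
```

Relative paths are checked against the current directory, so run the check from the directory where the training command will be started.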

3. Set the training parameters

train_val.prototxt defines the network structure. Its content is:

# Layer names follow the stock bvlc_reference_caffenet/train_val.prototxt.
name: "CaffeNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "/home/dina/caffe/examples/mytask/imagenet_mean.binaryproto"
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: true
#  }
  data_param {
    source: "/home/dina/caffe/examples/mytask/ilsvrc12_train_lmdb"
    batch_size: 256
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "/home/dina/caffe/examples/mytask/imagenet_mean.binaryproto"
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: false
#  }
  data_param {
    source: "/home/dina/caffe/examples/mytask/ilsvrc12_val_lmdb"
    batch_size: 50
    backend: LMDB
  }
}

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}

layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}

layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}

layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}

layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    # Number of output classes. 1000 is the ImageNet default; every label
    # in train.txt and val.txt must be smaller than this value.
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
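To see how the layer parameters above fit together, the spatial size of each feature map can be worked out from crop_size, kernel_size, stride and pad. The sketch below uses the standard convolution output-size formula; `out_size` is a hypothetical helper name, and the layer parameters are the ones listed in the network definition.

```shell
#!/bin/sh
# out_size IN KERNEL STRIDE PAD
# Spatial output size of a convolution: floor((in + 2*pad - kernel) / stride) + 1.
# For the pooling layers of this network the division is exact, so the same
# formula applies to them as well.
out_size() { echo $(( ($1 + 2 * $4 - $2) / $3 + 1 )); }

s=227                                   # crop_size of the data layers
s=$(out_size "$s" 11 4 0); echo "conv1: $s"
s=$(out_size "$s" 3 2 0);  echo "pool1: $s"
s=$(out_size "$s" 5 1 2);  echo "conv2: $s"
s=$(out_size "$s" 3 2 0);  echo "pool2: $s"
s=$(out_size "$s" 3 1 1);  echo "conv3 to conv5: $s"
s=$(out_size "$s" 3 2 0);  echo "pool5: $s"
echo "fc6 inputs: $((256 * s * s))"     # 256 channels come out of conv5
```

The chain runs 227, 55, 27, 27, 13, 13, 6, so fc6 receives 256 x 6 x 6 = 9216 values per image. The ReLU and LRN layers do not change the spatial size.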

solver.prototxt sets the solver (training) parameters. Its content is:

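net: "/home/dina/caffe/examples/mytask/train_val.prototxt"
test_iter: 2          # number of test batches per validation pass
test_interval: 50     # run a validation pass every 50 training iterations
base_lr: 0.001
lr_policy: "step"
gamma: 0.1            # multiply the learning rate by 0.1 ...
stepsize: 100         # ... every 100 iterations
display: 20           # log the loss every 20 iterations
max_iter: 1000
momentum: 0.9
weight_decay: 0.0005
snapshot: 500         # save a snapshot every 500 iterations
snapshot_prefix: "models/bvlc_reference_caffenet/caffenet_train"
solver_mode: GPU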
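With lr_policy "step", the learning rate at a given iteration is base_lr * gamma^floor(iter / stepsize). The sketch below evaluates that formula for the solver values above; `step_lr` is a hypothetical helper name, and awk is only used for the floating-point arithmetic.

```shell
#!/bin/sh
# step_lr BASE_LR GAMMA STEPSIZE ITER
# Learning rate under the "step" policy: base_lr * gamma^floor(iter / stepsize).
# (step_lr is a hypothetical helper, not part of Caffe.)
step_lr() {
    awk -v b="$1" -v g="$2" -v s="$3" -v i="$4" \
        'BEGIN { printf "%g\n", b * g ^ int(i / s) }'
}

# Values taken from the solver settings in this post.
for it in 0 100 200 500; do
    echo "iter $it: lr = $(step_lr 0.001 0.1 100 "$it")"
done
```

With stepsize 100 and gamma 0.1 the rate shrinks by a factor of ten every 100 iterations, so it is already around 1e-08 by iteration 500 and the later iterations change the weights very little. A larger stepsize may be worth trying if the loss stops improving early.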

train_caffenet.sh is the command that runs the training. Its content is:

#!/usr/bin/env sh
# Run from the Caffe root directory so that the relative paths resolve.
./build/tools/caffe train \
    --solver=./examples/mytask/solver.prototxt

That's it; now wait for the training to finish. With 2000 training images and 1000 validation images, training took roughly 3 to 4 hours.
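The dataset sizes can be related to the batch sizes and iteration counts used in this post with a little arithmetic. The sketch below assumes the values shown earlier (batch_size 256 for training and 50 for testing, test_iter 2, max_iter 1000); `ceil_div` is a hypothetical helper name.

```shell
#!/bin/sh
# ceil_div A B: integer ceiling of A / B.
ceil_div() { echo $(( ($1 + $2 - 1) / $2 )); }

train_images=2000; train_batch=256; max_iter=1000
val_images=1000;   test_batch=50;   test_iter=2

echo "validation images seen per test pass: $((test_batch * test_iter))"
echo "test_iter needed to cover all validation images: $(ceil_div $val_images $test_batch)"
echo "approximate passes over the training set: $((train_batch * max_iter / train_images))"
```

With these numbers each test pass looks at only 100 of the 1000 validation images; setting test_iter to 20 would cover all of them. The 1000 iterations amount to about 128 passes over the 2000 training images.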
