LARC Caffe Notes (2): Training on Your Own Images

阿新 • Published: 2019-01-17

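Having finished reading "賀完結！CS231n官方筆記" (the CS231n official notes),
last time I successfully ran the examples that ship with Caffe, mnist and cifar10.
Those only use the ready-made scripts, though, so I decided to train on my own images.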

0. Goal

Prepare three classes of food images (for data-security reasons, the public food101 dataset is used).

Each class has 1000 images that have not been resized.

The task now is:

classify these three classes of food.

This small task should be enough to become fluent with Caffe.

A list of small questions:

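(1) In the list file, each image path is followed by a number. Is it enough for these numbers to differ between classes, since they only denote the class?

Answer: in ML, classes are represented by numbers, and they must be consecutive integers. This follows from the softmax function: the label is used as an index into the softmax output, so a non-consecutive label cannot be computed.

They must also start from 0; with only three classes, the labels are 0, 1, 2.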
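
As a rough sketch of the label rule above, a list file can be checked before it is converted. The helper name `check_labels` is hypothetical (not part of Caffe), and it assumes the label is the last whitespace-separated field on each line.

```shell
#!/usr/bin/env sh
# check_labels LISTFILE  (hypothetical helper, not part of Caffe)
# Verifies that the labels (last field of each line) are exactly 0..N-1.
# Prints "ok N" on success, or a short diagnostic and returns 1.
check_labels() {
    awk '
        NF == 0 { next }                       # skip blank lines
        {
            n++
            label = $NF
            if (label !~ /^[0-9]+$/) {
                printf "bad label on line %d: %s\n", NR, label
                bad = 1
                exit
            }
            seen[label + 0] = 1
            if (label + 0 > max) max = label + 0
        }
        END {
            if (bad) exit 1
            if (n == 0) { print "empty list"; exit 1 }
            for (i = 0; i <= max; i++)
                if (!(i in seen)) { printf "missing label %d\n", i; exit 1 }
            printf "ok %d\n", max + 1
        }' "$1"
}
```

For example, `check_labels train.txt` on a list that uses labels 0, 1 and 2 reports three classes, while a list that skips label 1 is rejected.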

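(2) For train, val and test: with three classes of food, what should the folder structure look like? Or does the folder layout not matter, as long as the paths are given in the txt files?

It does not matter; specifying the paths in the txt files is enough.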

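(3) With a split of train 25%, val 25%, test 50%, should each of the three classes follow that ratio? For example, for the food data above, apple pie would be train 25%, val 25%, test 50%.

It does not matter; the split does not have to be strictly proportional.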

1. createList

fileList=`ls`
for file in $fileList;do
echo $file
done

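Should test.txt carry labels?

Answer: if you want to compute accuracy, it must have labels, in the same format as the train list. For the details, read the cpp that does the data reading; if the data is read from a txt file, it should be the cpp of a data layer.

Sometimes the data reading is written by hand, when the data is more complex. In detection, for example, an image does not have a single label but many bounding boxes, and the data layer cpp is replaced.

The shell script below generates train.txt, val.txt and test.txt, but it is not general. Also, the image paths written to the txt files should not be absolute paths; they should be paths that combine with the root folder passed to create_lmdb later.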

#!/usr/bin/env sh
# by Bill

DATA=`pwd`
echo "Create train.txt..."
fileList0=`ls $DATA/train/apple_pie`
fileList1=`ls $DATA/train/baby_back_ribs`
fileList2=`ls $DATA/train/caesar_salad`
for file in $fileList0;do
echo apple_pie/$file | sed "s/$/ 0/">>$DATA/train.txt
done
for file in $fileList1;do
echo baby_back_ribs/$file | sed "s/$/ 1/">>$DATA/train.txt
done
for file in $fileList2;do
echo caesar_salad/$file | sed "s/$/ 2/">>$DATA/train.txt
done

echo "Create val.txt..."
fileList0=`ls $DATA/val/apple_pie`
fileList1=`ls $DATA/val/baby_back_ribs`
fileList2=`ls $DATA/val/caesar_salad`
for file in $fileList0;do
echo apple_pie/$file | sed "s/$/ 0/">>$DATA/val.txt
done
for file in $fileList1;do
echo baby_back_ribs/$file | sed "s/$/ 1/">>$DATA/val.txt
done
for file in $fileList2;do
echo caesar_salad/$file | sed "s/$/ 2/">>$DATA/val.txt
done

echo "Create test.txt..."
fileList0=`ls $DATA/test/apple_pie`
fileList1=`ls $DATA/test/baby_back_ribs`
fileList2=`ls $DATA/test/caesar_salad`
for file in $fileList0;do
echo apple_pie/$file | sed "s/$/ 0/">>$DATA/test.txt
done
for file in $fileList1;do
echo baby_back_ribs/$file | sed "s/$/ 1/">>$DATA/test.txt
done
for file in $fileList2;do
echo caesar_salad/$file | sed "s/$/ 2/">>$DATA/test.txt
done
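
A more general variant of the script above could loop over whatever class folders exist instead of naming each one. This is only a sketch: `make_list` is a hypothetical helper, it assumes the ROOT/SPLIT/CLASS/image layout used above, and it assigns labels 0, 1, 2, ... in the sorted order of the class directory names.

```shell
#!/usr/bin/env sh
# make_list ROOT SPLIT  (hypothetical helper)
# Writes ROOT/SPLIT.txt with one "class/file label" line per image.
# Labels follow the sorted order of the class directories under ROOT/SPLIT.
make_list() {
    root=$1
    split=$2
    out="$root/$split.txt"
    : > "$out"                       # truncate, so reruns do not append duplicates
    label=0
    for dir in "$root/$split"/*/; do
        [ -d "$dir" ] || continue    # no class directories found
        class=$(basename "$dir")
        for img in "$dir"*; do
            [ -f "$img" ] || continue
            echo "$class/$(basename "$img") $label" >> "$out"
        done
        label=$((label + 1))
    done
}
```

Usage would be along the lines of `for s in train val test; do make_list "$(pwd)" "$s"; done`. The paths written are relative to the split folder, which matches how the list is later combined with the root folder.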

2. Converting images to lmdb

Adapted from examples/imagenet/create_imagenet.sh

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# By Bill
set -e

DBNAME=.
ListPath=.
TOOLS=/home/hwang/hwang/caffe-master/build/tools

TRAIN_DATA_ROOT=/home/hwang/hwang/dataset/whFoodTrainTest/train/
VAL_DATA_ROOT=/home/hwang/hwang/dataset/whFoodTrainTest/val/

# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $ListPath/train.txt \
    $DBNAME/whAlexNet_train_lmdb

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $ListPath/val.txt \
    $DBNAME/whAlexNet_val_lmdb

echo "Done."

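Under the hood this calls Caffe's convert_imageset [FLAGS] ROOTFOLDER/ LISTFILE DB_NAME [2],
which takes four arguments:

FLAGS: the group of image options

ROOTFOLDER/: the absolute path where the images are stored, starting from the Linux root directory

LISTFILE: the list of image files, usually a txt file with one image per line

DB_NAME: the directory where the generated db is stored

The FLAGS group contains:

-gray: whether to open images as grayscale. The program calls imread() from the OpenCV library to open images. Default: false

-shuffle: whether to randomly shuffle the order of the images. Default: false

-backend: the db format to convert to, either leveldb or lmdb. Default: lmdb

-resize_width/resize_height: resize the images. At run time all images must have the same size, so they need to be resized. The program calls OpenCV's resize() to scale images up or down. Default: 0, no resizing

-check_size: check whether all the data have the same size. Default: false, no check

-encoded: whether to store the original encoded image in the final data. Default: false

-encode_type: goes with the previous flag; which format to encode the images to: 'png', 'jpg', ...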
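
Because convert_imageset joins ROOTFOLDER/ and each path in LISTFILE by plain concatenation, a quick pre-flight check can catch a wrong root or a missing trailing slash before the conversion runs. The sketch below is not part of Caffe: `count_missing` is a hypothetical helper, and it assumes image paths contain no spaces.

```shell
#!/usr/bin/env sh
# count_missing ROOTFOLDER LISTFILE  (hypothetical helper, not part of Caffe)
# Prints how many list entries have no file at ROOTFOLDER + path.
# Missing paths are reported on stderr.
count_missing() {
    root=$1
    list=$2
    missing=0
    while read -r path label; do
        [ -n "$path" ] || continue           # skip blank lines
        if [ ! -f "$root$path" ]; then       # same concatenation as the tool uses
            echo "missing: $root$path" >&2
            missing=$((missing + 1))
        fi
    done < "$list"
    echo "$missing"
}
```

Running `count_missing "$TRAIN_DATA_ROOT" train.txt` should print 0 when every listed image can be found under the root folder.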

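3. Computing and saving the mean

Subtracting the mean from the images before training improves training speed and accuracy, so this step is usually done.

Caffe provides compute_image_mean.cpp for computing the mean, and we can use it directly.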

# sudo build/tools/compute_image_mean examples/myfile/img_train_lmdb examples/myfile/mean.binaryproto

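compute_image_mean takes two arguments: the first is the location of the training lmdb, and the second sets the name and path of the mean file.
After a successful run, a mean file named mean.binaryproto is generated under examples/myfile/.

The model requires the mean to be subtracted from every image, so we need the mean of the training images, which tools/compute_image_mean.cpp computes. This cpp is also a good example for getting familiar with how several components are used together, such as protocol buffers, leveldb, and logging.
The shell code below is adapted from examples/imagenet/make_imagenet_mean.sh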

#!/usr/bin/env sh
# Compute the mean image from the imagenet training lmdb
# By Bill

DBNAME=.
ListPath=.
TOOLS=/home/hwang/hwang/caffe-master/build/tools

$TOOLS/compute_image_mean $DBNAME/whAlexNet_train_lmdb \
  $ListPath/whAlexNet_mean.binaryproto

echo "Done."

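4. Defining the network

The AlexNet model is defined in models/bvlc_alexnet/train_val.prototxt. Note that the training and test dataset paths in that file must be changed to where the data is actually stored on the server.
The training parameters are defined in models/bvlc_alexnet/solver.prototxt

The main change is the file paths in each data layer.

Looking carefully at the train and val parts of train_val.prototxt, you can see that apart from the data source and the last layer, they are largely the same. During training, a softmax loss layer computes the loss function and initializes backpropagation; during validation, an accuracy layer measures accuracy.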

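There is also the solver definition, solver.prototxt. Copy it over and change the path on the first line to our own: net: "examples/mydata/train_val.prototxt". From it we can see that training runs in batches of 256 for 450,000 iterations (about 90 epochs); every 1,000 iterations the learned network is tested on the validation data; the initial learning rate is 0.01 and is decreased every 100,000 iterations (about 20 epochs); information is displayed every 20 iterations; training uses a weight_decay of 0.0005; and every 10,000 iterations a snapshot of the current state is saved.
The above is from the tutorial. In practice it would take a very long time, so we change it a little:
test_iter: 1000 is the number of test batches; we only have 10 images, so 10 is enough.
test_interval: 1000 means testing once every 1000 iterations; we change it to once every 500.
base_lr: 0.01 is the base learning rate; because the dataset is small, 0.01 would descend too fast, so it is changed to 0.001
lr_policy: "step" how the learning rate changes
gamma: 0.1 the factor by which the learning rate changes
stepsize: 100000 decrease the learning rate every 100000 iterations
display: 20 display once every 20 iterations
max_iter: 450000 the maximum number of iterations
momentum: 0.9 a learning parameter, no need to change
weight_decay: 0.0005 a learning parameter, no need to change
snapshot: 10000 save a snapshot every 10000 iterations; changed to 2000 here
solver_mode: GPU add this line at the end to run on the GPU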
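
To see what the "step" policy does with these numbers, the learning rate at a given iteration can be worked out as base_lr * gamma^floor(iter / stepsize). The helper below is only an illustrative calculation; the name `step_lr` is hypothetical, not a Caffe tool.

```shell
#!/usr/bin/env sh
# step_lr BASE_LR GAMMA STEPSIZE ITER  (hypothetical helper)
# Learning rate at iteration ITER under lr_policy "step":
#   base_lr * gamma ^ floor(iter / stepsize)
step_lr() {
    awk -v base="$1" -v gamma="$2" -v step="$3" -v iter="$4" \
        'BEGIN { printf "%g\n", base * gamma ^ int(iter / step) }'
}

# With the tutorial values, the rate has been cut twice by iteration 250000.
step_lr 0.01 0.1 100000 250000
```

One consequence worth noting: when max_iter is smaller than stepsize, the rate never drops and training runs at base_lr throughout.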

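Notes pasted from another blog post:

net: "examples/my_simple_image/cifar/cifar10_quick_train_test.prototxt"   # path to the network file
test_iter: 20        # number of iterations run per test
test_interval: 10    # test once every this many iterations
base_lr: 0.001       # learning rate; lowered by an order of magnitude here because there is little data
momentum: 0.9
weight_decay: 0.004
lr_policy: "fixed"   # use a fixed learning rate
display: 1           # display information every this many iterations; set to 1 here to follow progress closely
max_iter: 4000       # maximum number of iterations
snapshot: 1000       # save a snapshot every this many iterations
snapshot_prefix: "examples/my_simple_image/cifar/cifar10_quick"     # snapshot path and prefix
solver_mode: CPU     # CPU or GPU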

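熊偉 (Xiong Wei) said:

(1) base_lr: 0.01
gamma: 0.1
stepsize: 100000
These three parameters are the most important.

(2) In a conv layer, setting these two numbers to 0 means the layer's parameters are no longer updated (they are multipliers on the learning rate and weight decay for this layer's parameters), so the other layers can be trained on their own while this one stays fixed.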
param {
lr_mult: 1
decay_mult: 1
}

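(3) 25% for train, 25% for val, 50% for test? This split seems questionable.

(4) How to set the number of iterations: total dataset size / batch size gives the iterations needed for one epoch. Usually 10 epochs are enough.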
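
The rule of thumb in (4) is simple arithmetic, sketched below. The function names `iters_per_epoch` and `max_iter_for` are hypothetical, and rounding up the last partial batch is an assumption made here.

```shell
#!/usr/bin/env sh
# iters_per_epoch NUM_IMAGES BATCH_SIZE  (hypothetical helper)
# Iterations needed to see every image once, rounding up the last batch.
iters_per_epoch() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# max_iter_for NUM_IMAGES BATCH_SIZE EPOCHS  (hypothetical helper)
# Total iterations for the requested number of epochs.
max_iter_for() {
    echo $(( $(iters_per_epoch "$1" "$2") * $3 ))
}

# 1500 training images with batch_size 16, trained for 10 epochs
iters_per_epoch 1500 16
max_iter_for 1500 16 10
```

For 1500 images and a batch size of 16 this gives 94 iterations per epoch, so 10 epochs correspond to a max_iter of about 940.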

For my own training data, I took inspiration from [7]:

Example: caffe/examples/mnist/lenet_solver.prototxt
# The train/test net protocol buffer definition
net: "examples/mnist/lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: CPU

My own settings: the train data has 3 classes with 20 food images each (60 train images in total); val has 30 (10*3) images; test has 30 (10*3) images.

batch_size is set to 16

data_param {
    source: "/home/hwang/hwang/dataset/whFoodTrainTest/whAlexNet_train_lmdb"
    batch_size: 16
    backend: LMDB
  }

solver

net: "./train_val.prototxt"
test_iter: 4
test_interval: 500
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 1000
momentum: 0.9
weight_decay: 0.0005
snapshot: 2000
snapshot_prefix: "snapshot_prefix/caffe_alexnet_train"
solver_mode: GPU

My own settings 2
1500 (3*500) train images

batch size of 16
The solver is as follows:

net: "./train_val.prototxt"
test_iter: 32
test_interval: 500
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 10000
momentum: 0.9
weight_decay: 0.0005
snapshot: 2000
snapshot_prefix: "snapshot_prefix/caffe_alexnet_train"
solver_mode: GPU

5. Starting training

#!/usr/bin/env sh
#By Bill
set -e

/home/hwang/hwang/caffe-master/build/tools/caffe train \
    --solver=./solver.prototxt $@

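One pass over all the training data is one epoch.

Training on the 1500 food images gave a very low loss but an accuracy that would not go up; according to 熊偉, the cause is overfitting because there is too little data.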
