
Using Caffe to classify your own data


This post walks through a single example: training the AlexNet network on your own data.

Running AlexNet on your own data
Reference 1: http://blog.csdn.net/gybheroin/article/details/54095399
Reference 2: http://www.cnblogs.com/alexcai/p/5469436.html
1. Prepare the data

Create a new folder under data/ in the Caffe root directory; any name will do (I used food). Inside food, create two folders holding the train and val data.
Under train, create one folder per class to be recognized (toast, pizza, and so on; as many folders as there are classes) and put the corresponding images into each. (All images can also go into a single folder, as long as the label file records each image's class.)
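The per-class folders above can be turned into the train.txt, val.txt, and category.txt label files used later with a small script. This is only a sketch: the function name is mine, and it assumes the images are .jpg files sitting in one subfolder per class.

```shell
#!/usr/bin/env sh
# Sketch: build "relative/path label" listings plus a category file from a
# directory holding one subfolder per class (assumed layout; adjust paths).
make_labels() {
    train_dir=$1; out=$2; cat_out=$3
    : > "$out"
    : > "$cat_out"
    label=0
    for class_dir in "$train_dir"/*/; do
        [ -d "$class_dir" ] || continue
        class=$(basename "$class_dir")
        for img in "$class_dir"*.jpg; do
            [ -f "$img" ] || continue
            echo "$class/$(basename "$img") $label" >> "$out"
        done
        echo "$label $class" >> "$cat_out"
        label=$((label + 1))
    done
}

# Example (run from the caffe root):
# make_labels data/food/train data/food/train.txt data/food/category.txt
```

Classes are numbered in the order the shell expands the glob (alphabetical), so run the same script over train and val to keep the label numbering consistent.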
The resulting layout:

./data/food
    train/    (pizza, sandwich, ...)
    val/      (pizza, sandwich, ...)

Then create train.txt, val.txt, and category.txt in the food directory.
train.txt and val.txt contain lines like:

toast/62.jpg 0
toast/107.jpg 0
toast/172.jpg 0
pizza/62.jpg 1
pizza/107.jpg 1
pizza/172.jpg 1

category.txt contains lines like:

0 toast
1 pizza

Note: the images are split into a training set (train) and a test set (test). The train:test ratio is usually at least 5:1, and no class should have too few images; here each class has roughly 5000 training images plus 1000 test images.

2. Build the lmdb

(The lmdb conversion is optional: setting the data layer's type to "ImageData" in the train prototxt lets you train directly from image files.)
A successful Caffe build leaves a convert_imageset(.exe) tool under the bin folder for converting the data. In the food folder, create a script create_foodnet.sh modeled on examples/imagenet/create_imagenet.sh:

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
set -e

EXAMPLE=data/food   # where the generated lmdb data goes
DATA=data/food      # where train.txt and val.txt live
TOOLS=build/tools

TRAIN_DATA_ROOT=/path/to/imagenet/train/
VAL_DATA_ROOT=/path/to/imagenet/val/

# Set RESIZE=true to resize the images to 256x256. Leave as false if images
# have already been resized using another tool.
RESIZE=false
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/food_train_lmdb    # path of the generated lmdb

echo "Creating val lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/food_val_lmdb      # path of the generated lmdb

echo "Done."
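A common failure mode at this step is a listing file that points at images which do not exist under the data root. The sketch below checks every path in a listing before conversion; the function name and paths are mine, assuming the "path label" format shown earlier.

```shell
#!/usr/bin/env sh
# Sketch: verify that every image named in a listing file ("path label" per
# line) exists under the given root before running convert_imageset.
check_listing() {
    root=$1; listing=$2; missing=0
    while read -r path label; do
        [ -f "$root/$path" ] || { echo "missing: $root/$path"; missing=$((missing + 1)); }
    done < "$listing"
    echo "$missing missing files"
}

# Example:
# check_listing data/food/train data/food/train.txt
```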
3. Generate the mean file

Next, use the lmdb to compute the image mean, which is used during training:

EXAMPLE=data/food
DATA=data/food
TOOLS=build/tools
$TOOLS/compute_image_mean $EXAMPLE/food_train_lmdb $DATA/foodnet_mean.binaryproto

4. Modify the solver and the train net

solver.prototxt, annotated:

# Number of test iterations. One iteration pushes one batch of images
# through the network, so to test every image in the validation set this
# value times the TEST batch_size should equal the number of validation
# images: test_iter * batch_size = val_num.
test_iter: 299
# How many training iterations between two tests. One iteration is the
# full forward and backward pass over one batch. With 224, accuracy is
# validated every 224 iterations. Generally the whole training set should
# pass through the network once between tests, so this value times the
# TRAIN data layer's batch_size should equal the number of training
# images: test_interval * batch_size = train_num.
test_interval: 224
# Base learning rate. Too high and the loss may get stuck at a constant
# value (e.g. 86.33333) or fail to converge; too low and the network
# converges slowly and gradients may vanish. 0.01 is a common default.
base_lr: 0.01
display: 20
max_iter: 6720
lr_policy: "step"
gamma: 0.1
momentum: 0.9            # weight given to the previous parameter update
weight_decay: 0.0001
stepsize: 2218           # lower the learning rate every stepsize iterations
snapshot: 224            # save a snapshot (.caffemodel) every 224 iterations
snapshot_prefix: "food/food_net/food_alex_snapshot"  # snapshot path and prefix
solver_mode: GPU
net: "train_val.prototxt"   # path to the network definition file
solver_type: SGD

train_val.prototxt changes. The two variants differ mainly in the data layer. To feed raw images, it uses type "ImageData":

layer {
  name: "data"
  type: "ImageData"   # ImageData: train directly from image files
  top: "data"
  top: "label"
  include { phase: TRAIN }
  image_data_param {
    source: "examples/finetune_myself/train.txt"
    batch_size: 50
    new_height: 256
    new_width: 256
  }
}

To feed lmdb data, it uses type "Data":

layer {
  name: "data"
  type: "Data"        # Data: train from the images converted to lmdb
  top: "data"
  top: "label"
  include { phase: TRAIN }
  data_param {
    source: "examples/imagenet/car_train_lmdb"
    batch_size: 256
    backend: LMDB
  }
}

The full network definition:

name: "AlexNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "mimg_mean.binaryproto"   # mean file
  }
  data_param {
    source: "mtrainldb"                  # training data
    batch_size: 256
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "mimg_mean.binaryproto"   # mean file
  }
  data_param {
    source: "mvaldb"                     # validation data
    batch_size: 50
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer { name: "relu1" type: "ReLU" bottom: "conv1" top: "conv1" }
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer { name: "relu2" type: "ReLU" bottom: "conv2" top: "conv2" }
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer { name: "relu3" type: "ReLU" bottom: "conv3" top: "conv3" }
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer { name: "relu5" type: "ReLU" bottom: "conv5" top: "conv5" }
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer { name: "relu6" type: "ReLU" bottom: "fc6" top: "fc6" }
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer { name: "relu7" type: "ReLU" bottom: "fc7" top: "fc7" }
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 2   # note: change this to the number of classes you want
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include { phase: TEST }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}

Run the following script to start training:

#!/usr/bin/env sh
set -e
./build/tools/caffe train \
    --solver=food/food_alexnet/solver.prototxt

5. Testing

Testing likewise needs a class-label file, category.txt, with the same content as above. Modify deploy.prototxt accordingly, then start testing:

./bin/classification "food/foodnet/deploy.prototxt" "food/foodnet/food_iter_100000.caffemodel" "ming_mean.binaryproto" "test001.jpg"

Fine-tuning

http://www.cnblogs.com/denny402/p/5074212.html
http://www.cnblogs.com/alexcai/p/5469478.html

1. When fine-tuning, rename the last fully connected layer, change its number of outputs to your class count, and give it a comparatively large learning rate: only this layer's weights are retrained, while all the other layers are already trained.
2. When starting training, pass the model to be fine-tuned as the initial weights:

./build/tools/caffe train -solver examples/money_test/fine_tune/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel

Here -weights points to the pretrained CaffeNet model.
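Besides the classification binary, validation-set accuracy can also be checked with the stock `caffe test` tool. The sketch below only assembles and prints the command: the snapshot filename is hypothetical (it depends on snapshot_prefix and the iteration reached), and -iterations should match test_iter from the solver so the whole validation set is covered.

```shell
#!/usr/bin/env sh
# Sketch: build a `caffe test` command for a trained snapshot. Substitute
# the snapshot actually written during training before running it.
MODEL=food/food_net/train_val.prototxt
WEIGHTS=food/food_net/food_alex_snapshot_iter_6720.caffemodel
ITERS=299   # should equal test_iter in solver.prototxt

CMD="./build/tools/caffe test -model $MODEL -weights $WEIGHTS -iterations $ITERS -gpu 0"
# Print the command; run it once training has produced the snapshot.
echo "$CMD"
```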
