Training Caffe on Your Own Data on Windows 10 with conda2
Thanks to the blog series at https://blog.csdn.net/gybheroin/article/details/72581318 for the help it provided.
Installing Caffe was covered in an earlier post (verified working for my own setup): https://www.cnblogs.com/MY0213/p/9225310.html
1. Data source
The dataset used here is a face dataset, which can be downloaded from Baidu Yun:
Link: https://pan.baidu.com/s/156DiOuB46wKrM0cEaAgfMw  Password: 1ap0
Unzipping train.zip yields the image data; the label files are val.txt and train.txt.
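The label files are plain text with one `relative_path label` pair per line. If you ever need to regenerate them for your own data, a minimal sketch (the directory layout and two-class labels below are hypothetical, not from the dataset above) could look like this:

```python
import os

def write_label_file(root, class_dirs, out_path):
    """Write one 'relative_path label' line per image, assigning
    label = index of the class directory in class_dirs."""
    with open(out_path, "w") as f:
        for label, d in enumerate(class_dirs):
            folder = os.path.join(root, d)
            for name in sorted(os.listdir(folder)):
                f.write("%s/%s %d\n" % (d, name, label))

# Hypothetical layout: train/background/*.jpg -> 0, train/face/*.jpg -> 1
# write_label_file("train", ["background", "face"], "train.txt")
```

The resulting file is exactly what convert_imageset expects in the next step.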
2. Building the LMDB data source from the images
```bat
SET GLOG_logtostderr=1
SET RESIZE_HEIGHT=227
SET RESIZE_WIDTH=227
"convert_imageset" --resize_height=227 --resize_width=227 --shuffle "train/" "train.txt" "mtraindb"
"convert_imageset" --resize_height=227 --resize_width=227 --shuffle "val/" "val.txt" "mvaldb"
pause
```
See face_lmdb.bat; convert_imageset resizes every image to the same 227x227 size while packing it into LMDB.
3. Computing the image mean
```bat
SET GLOG_logtostderr=1
"compute_image_mean" "mtraindb" "train_mean.binaryproto"
pause
```
See mean_face.bat.
Subtracting the mean before training may improve results.
You can also use fixed per-channel mean values (the standard ones are easy to find via Baidu or Google), or skip this step entirely; according to 唐宇迪 (Tang Yudi), it makes little difference.
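For intuition, the fixed-value alternative just means computing one mean per channel over the training set and subtracting it from every pixel, instead of using a full per-pixel mean_file. A pure-Python sketch (images here are toy nested lists, not the real dataset):

```python
def channel_means(images):
    """Per-channel mean over a batch of images given as
    nested lists shaped [image][row][col][channel]."""
    sums, count = None, 0
    for img in images:
        for row in img:
            for px in row:
                if sums is None:
                    sums = [0.0] * len(px)
                for c, v in enumerate(px):
                    sums[c] += v
                count += 1
    return [s / count for s in sums]
```

Subtracting these few numbers from every pixel at train and test time is the "fixed mean" shortcut mentioned above.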
4. Training
```bat
SET GLOG_logtostderr=1
caffe train --solver=solver.prototxt
pause
```
See train.bat.
```protobuf
net: "train.prototxt"
test_iter: 100
test_interval: 1000
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 1000
display: 50
max_iter: 10000
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000
snapshot_prefix: "model"
# uncomment the following to default to CPU mode solving
# solver_mode: CPU
```
See solver.prototxt. For what each solver.prototxt field means, see
https://blog.csdn.net/qq_27923041/article/details/55211808
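With `lr_policy: "step"`, Caffe multiplies the learning rate by `gamma` every `stepsize` iterations. A quick sanity check of the schedule in the solver above (base_lr=0.001, gamma=0.1, stepsize=1000):

```python
def step_lr(base_lr, gamma, stepsize, it):
    """Learning rate under Caffe's "step" policy:
    lr = base_lr * gamma ^ floor(iter / stepsize)."""
    return base_lr * gamma ** (it // stepsize)

# Schedule from solver.prototxt above
for it in (0, 999, 1000, 2500):
    print(it, step_lr(0.001, 0.1, 1000, it))
```

So the rate drops by 10x at iteration 1000, 2000, and so on, reaching base_lr * 0.1^9 by max_iter 10000.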
```protobuf
############################# DATA Layer #############################
name: "face_train_val"
layer {
  top: "data"
  top: "label"
  name: "data"
  type: "Data"
  data_param {
    source: "mtraindb"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
    mean_file: "train_mean.binaryproto"
    mirror: true
  }
  include: { phase: TRAIN }
}
layer {
  top: "data"
  top: "label"
  name: "data"
  type: "Data"
  data_param {
    source: "mvaldb"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
    mean_file: "train_mean.binaryproto"
    mirror: true
  }
  include: { phase: TEST }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer { name: "relu1" type: "ReLU" bottom: "conv1" top: "conv1" }
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer { name: "relu2" type: "ReLU" bottom: "conv2" top: "conv2" }
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer { name: "relu3" type: "ReLU" bottom: "conv3" top: "conv3" }
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer { name: "relu4" type: "ReLU" bottom: "conv4" top: "conv4" }
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer { name: "relu5" type: "ReLU" bottom: "conv5" top: "conv5" }
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer { name: "relu6" type: "ReLU" bottom: "fc6" top: "fc6" }
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 0.1 }
  }
}
layer { name: "relu7" type: "ReLU" bottom: "fc7" top: "fc7" }
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc8-expr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8-expr"
  param { lr_mult: 10 decay_mult: 1 }
  param { lr_mult: 20 decay_mult: 0 }
  inner_product_param {
    num_output: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8-expr"
  bottom: "label"
  top: "accuracy"
  include { phase: TEST }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8-expr"
  bottom: "label"
  top: "loss"
}
```
See train.prototxt. This is just AlexNet with the final 1000-way fc8 output changed to 2 classes.
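For intuition on what the two-way fc8-expr output feeds into: the SoftmaxWithLoss layer computes a softmax over the 2 raw scores and then the negative log-likelihood of the true label. A minimal per-sample sketch:

```python
import math

def softmax_with_loss(scores, label):
    """Softmax over raw scores, then negative log-likelihood of the
    true class -- what Caffe's SoftmaxWithLoss computes per sample."""
    m = max(scores)  # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[label]), probs
```

With equal scores for both classes the loss is ln(2), the "random guessing" baseline you should see at the very start of training.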
Training takes about 5 days (I used a CPU); alternatively, you can directly use the existing model alexnet_iter_50000_full_conv.caffemodel.
5. Testing
Face detection can be tested with run_face_detect_batch.py.
6. Summary
This network is very slow at test time because it uses a sliding-window approach. Follow-up posts will cover faster detectors: Faster R-CNN and FPN.
The sliding-window test uses the "Casting a Classifier into a Fully Convolutional Network" technique, which can be applied to other networks as well.
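The idea behind the casting trick: once the fc layers are rewritten as convolutions, one forward pass over a larger image produces a whole map of classifier scores, equivalent to evaluating the classifier at every window position. A toy 1-D illustration in pure Python (the signal and "classifier" weights are made up for demonstration):

```python
def score_map(signal, weights):
    """Slide a linear 'classifier' of len(weights) over the signal;
    each output is the dot product at one window position. A conv
    layer with this kernel yields the entire map in a single pass,
    instead of re-running the classifier window by window."""
    k = len(weights)
    return [sum(w * signal[i + j] for j, w in enumerate(weights))
            for i in range(len(signal) - k + 1)]
```

This is why the fully convolutional version is much cheaper than naively cropping and re-classifying each window: overlapping windows share computation.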
For the evolution of R-CNN, see https://www.cnblogs.com/MY0213/p/9460562.html
Comments and corrections are welcome.