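Training on Your Own Dataset with the MaskRCNN-Benchmark Framework
Facebook AI Research has open-sourced MaskRCNN-Benchmark, a PyTorch 1.0 reference implementation of Faster R-CNN and Mask R-CNN. Compared with Detectron and mmdetection, MaskRCNN-Benchmark achieves comparable accuracy while training faster and using less GPU memory. Its highlights are listed below.
- PyTorch 1.0: RPN, Faster R-CNN and Mask R-CNN implementations that match or exceed Detectron's accuracy;
- Very fast: training is up to twice as fast as Detectron and 1.3 times as fast as mmdetection;
- Memory efficient: uses roughly 500MB less GPU memory than mmdetection during training;
- Multi-GPU training and inference;
- Batched inference: inference can run on multiple images per batch per GPU;
- CPU support for inference: the model can run on the CPU at inference time;
- Pre-trained models are provided for almost all reference Mask R-CNN and Faster R-CNN configurations with the 1x schedule.
Introduction: https://mp.weixin.qq.com/s/XSGYlNO1wtRrEv2ivJvonA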
Project page: https://github.com/facebookresearch/maskrcnn-benchmark
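This post mainly records how I trained on my own dataset with this framework. Overall it is fairly easy to get started, although there are a few pitfalls. The framework currently includes only two detection models, Mask R-CNN and Faster R-CNN. I tried object detection with both Mask R-CNN (with the segmentation branch turned off) and Faster R-CNN, since object detection is all my current work requires.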
Installation
My base environment:
OS: Ubuntu 16.04
Kernel: 4.15.0-36-generic
Python environment: Anaconda3
- conda 4.5.4
- pip 10.0.1
- Python 3.6.5 :: Anaconda, Inc.
Required environment:
- PyTorch 1.0 from a nightly release. Installation instructions can be found in https://pytorch.org/get-started/locally/
- torchvision from master
- cocoapi
- yacs
- matplotlib
- GCC >= 4.9
- (optional) OpenCV for the webcam demo
$ conda create --name maskrcnn_benchmark
$ source activate maskrcnn_benchmark
# this installs the right pip and dependencies for the fresh python
$ conda install ipython
# maskrcnn_benchmark and coco api dependencies
$ pip install ninja yacs cython matplotlib
# follow PyTorch installation in https://pytorch.org/get-started/locally/
# we give the instructions for CUDA 9.0
$ conda install pytorch-nightly -c pytorch
# install torchvision
$ cd ~/github
$ git clone https://github.com/pytorch/vision.git
$ cd vision
$ python setup.py install
# install pycocotools
$ cd ~/github
$ git clone https://github.com/cocodataset/cocoapi.git
$ cd cocoapi/PythonAPI
$ python setup.py build_ext install
# install PyTorch Detection
$ cd ~/github
$ git clone https://github.com/facebookresearch/maskrcnn-benchmark.git
$ cd maskrcnn-benchmark
$ python setup.py build develop
At this point the installation of maskrcnn-benchmark is complete. The next step is to prepare the training and validation data.
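Before moving on, it can be worth confirming that the packages installed above are visible from the active environment. The following is a minimal sketch using only the standard library. The helper name `check_packages` is hypothetical, and the module names in `REQUIRED` are the import names I assume for the dependencies listed above.

```python
import importlib.util

# Import names assumed for the dependencies installed above.
REQUIRED = ["torch", "torchvision", "pycocotools", "yacs", "matplotlib", "maskrcnn_benchmark"]


def check_packages(names):
    """Return {name: bool} telling whether each top-level module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(REQUIRED).items():
        print("{:20s} {}".format(name, "ok" if found else "MISSING"))
```

This only checks that the modules can be located, not that the compiled extensions actually load, so a clean report is a necessary rather than sufficient sign of a working install.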
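Data Preparation
maskrcnn-benchmark is tailored to the COCO dataset by default, so to keep things simple I reused the COCO setup as-is for my own dataset. COCO currently has three annotation types: object instances, object keypoints, and image captions, all stored as JSON files. See the article on the COCO annotation format for details. My experiment only needs object detection, not even segmentation, so the format is relatively simple. A brief description follows.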
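{
    "info": {..},  # overall description of the dataset; an empty dict is fine for your own data
    "licenses": [license],  # may contain several license entries; an empty list is fine for your own data
    "images": [
        {
            "file_name": "xx",  # image path; it is joined with a root directory to form the full path used to open the file
            "height": xx,  # image height
            "width": xx,  # image width
            "id": xx,  # unique id of the image; numbering from 0 is fine
        },
        ...
    ],
    "annotations": [
        {
            "segmentation": [],  # used for segmentation; I only do object detection, so it is left empty
            "area": xx,  # area of the region; for a box this is width * height
            "image_id": xx,  # an image may have several annotations; this matches the id in images
            "bbox": [x, y, w, h],  # the four values that locate the box: top-left x, top-left y, width, height
            "category_id": xx,  # class id (matches the id in categories)
            "id": xx,  # unique id of this annotation; numbering from 0 is fine
        },
        ...
    ],
    "categories": [
        {
            "supercategory": xx,  # name of the parent category, e.g. vehicle, with car, truck, etc. below it. My dataset has no such hierarchy, so I just picked the name adas
            "id": xx,  # class id, numbered from 1; 0 is reserved for background
            "name": xx,  # name of this (sub)category
        },
        ...
    ],
}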
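Following the annotation format above, generate one JSON annotation file for the training set and one for the validation set. You can keep the default COCO file names: instances_train2014.json and instances_val2014.json. For the dataset directory layout, refer to the datasets directory in the overall directory tree below.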
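As a rough illustration of the format above, the sketch below builds a detection-only annotation dictionary from an in-memory list of labelled boxes. The input layout (`samples`), the helper name `build_coco_dict`, the example file name, and the class names are all hypothetical; only the output keys follow the structure described above. I also add `iscrowd: 0` as an assumption, since COCO tooling commonly reads that field; it is not among the fields listed above.

```python
import json


def build_coco_dict(samples, class_names):
    """Convert labelled images into a COCO-style detection dict.

    samples: list of dicts such as (hypothetical input layout)
        {"file_name": str, "width": int, "height": int,
         "boxes": [(class_name, x, y, w, h), ...]}
    class_names: list of class names; ids start at 1 because 0 is background.
    """
    name_to_id = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        "info": {},
        "licenses": [],
        "images": [],
        "annotations": [],
        "categories": [
            {"supercategory": "adas", "id": cid, "name": name}
            for name, cid in name_to_id.items()
        ],
    }
    ann_id = 0
    for image_id, sample in enumerate(samples):
        coco["images"].append({
            "file_name": sample["file_name"],
            "height": sample["height"],
            "width": sample["width"],
            "id": image_id,
        })
        for class_name, x, y, w, h in sample["boxes"]:
            coco["annotations"].append({
                "segmentation": [],          # unused for pure detection
                "area": w * h,               # box area
                "image_id": image_id,
                "bbox": [x, y, w, h],        # top-left x, top-left y, width, height
                "category_id": name_to_id[class_name],
                "iscrowd": 0,                # assumption, see the note above
                "id": ann_id,
            })
            ann_id += 1
    return coco


samples = [
    {"file_name": "0001.jpg", "width": 1280, "height": 720,
     "boxes": [("car", 100, 200, 50, 40), ("person", 300, 220, 20, 60)]},
]
coco = build_coco_dict(samples, ["car", "person"])
print(json.dumps(coco["annotations"][0]))
# To write the training annotation file:
# with open("datasets/coco/annotations/instances_train2014.json", "w") as f:
#     json.dump(coco, f)
```

Running the same function once over the training images and once over the validation images yields the two JSON files mentioned above.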
(maskrcnn_benchmark) $ tree -L 3
.
├── configs
│   ├── e2e_faster_rcnn_R_101_FPN_1x.yaml  # Faster R-CNN model config used for training and validation
│   ├── e2e_mask_rcnn_R_101_FPN_1x.yaml  # Mask R-CNN model config used for training and validation
│   └── quick_schedules
├── CONTRIBUTING.md
├── datasets
│   └── coco
│       ├── annotations
│       │   ├── instances_train2014.json  # training set annotation file
│       │   └── instances_val2014.json  # validation set annotation file
│       ├── train2014  # training set images
│       └── val2014  # validation set images
├── maskrcnn_benchmark
│   ├── config
│   │   ├── defaults.py  # default config of maskrcnn_benchmark; read at startup and merged with the model config file from configs
│   │   ├── __init__.py
│   │   ├── paths_catalog.py  # the training and test set paths are configured in this file
│   │   └── __pycache__
│   ├── csrc
│   ├── data
│   │   ├── build.py  # where the datasets are built
│   │   ├── datasets  # coco.py in this directory provides the access interface to the COCO dataset
│   │   └── transforms
│   ├── engine
│   │   ├── inference.py  # validation engine
│   │   └── trainer.py  # training engine
│   ├── __init__.py
│   ├── layers
│   │   ├── batch_norm.py
│   │   ├── __init__.py
│   │   ├── misc.py
│   │   ├── nms.py
│   │   ├── __pycache__
│   │   ├── roi_align.py
│   │   ├── roi_pool.py
│   │   ├── smooth_l1_loss.py
│   │   └── _utils.py
│   ├── modeling
│   │   ├── backbone
│   │   ├── balanced_positive_negative_sampler.py
│   │   ├── box_coder.py
│   │   ├── detector
│   │   ├── __init__.py
│   │   ├── matcher.py
│   │   ├── poolers.py
│   │   ├── __pycache__
│   │   ├── roi_heads
│   │   ├── rpn
│   │   └── utils.py
│   ├── solver
│   │   ├── build.py
│   │   ├── __init__.py
│   │   ├── lr_scheduler.py  # the learning rate schedule is defined here
│   │   └── __pycache__
│   ├── structures
│   │   ├── bounding_box.py
│   │   ├── boxlist_ops.py
│   │   ├── image_list.py
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   └── segmentation_mask.py
│   └── utils
│       ├── c2_model_loading.py
│       ├── checkpoint.py  # checkpointing
│       ├── __init__.py
│       ├── logger.py  # logging setup
│       ├── model_zoo.py
│       ├── __pycache__
│       └── README.md
├── output  # output directory I set up myself
├── tools
│   ├── test_net.py  # validation entry point
│   └── train_net.py  # training entry point
└── TROUBLESHOOTING.md
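With that, the dataset is ready.
Configuration Files
Three configuration files are mainly involved:
- the model configuration file (e.g. configs/e2e_mask_rcnn_R_101_FPN_1x.yaml);
- the dataset path configuration file (maskrcnn_benchmark/config/paths_catalog.py);
- the MaskRCNN-Benchmark framework configuration file (maskrcnn_benchmark/config/defaults.py).
The model configuration file is specified with the --config-file argument when training is launched. The configs directory ships YAML configuration files for Mask R-CNN and Faster R-CNN with different backbones. I chose e2e_mask_rcnn_R_101_FPN_1x.yaml, i.e. the Mask R-CNN detection model with a ResNet-101-FPN backbone. The configuration is as follows (adjust it to your own dataset):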
MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  WEIGHT: "catalog://ImageNetPretrained/MSRA/R-101"
  BACKBONE:
    CONV_BODY: "R-101-FPN"
    OUT_CHANNELS: 256
  RPN:
    USE_FPN: True  # whether to use FPN (feature pyramid network); with True, proposals are extracted from several feature maps
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)  # anchor strides
    PRE_NMS_TOP_N_TRAIN: 2000  # number of proposals kept before NMS during training
    PRE_NMS_TOP_N_TEST: 1000  # number of proposals kept before NMS during testing
    POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_TOP_N_TEST: 1000
  ROI_HEADS:
    USE_FPN: True
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
  ROI_MASK_HEAD:
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    FEATURE_EXTRACTOR: "MaskRCNNFPNFeatureExtractor"
    PREDICTOR: "MaskRCNNC4Predictor"
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 2
    RESOLUTION: 28
    SHARE_BOX_FEATURE_EXTRACTOR: False
  MASK_ON: False  # True in the shipped file; I set it to False because I do not use the segmentation (mask) branch
DATASETS:
  TRAIN: ("coco_2014_train",)  # note the training and test set names here;
  TEST: ("coco_2014_val",)  # they correspond to the keys of DATASETS in paths_catalog.py
DATALOADER:
  SIZE_DIVISIBILITY: 32
SOLVER:
  BASE_LR: 0.01  # initial learning rate; there are several strategies for adjusting it, and this framework defines its own
  WEIGHT_DECAY: 0.0001
  # STEPS lists the iterations at which the learning rate is decayed. For my dataset,
  # with 149898 images and one image per batch, I plan to decay every 4 epochs (4 x 149898 = 599592 iterations), hence the values below.
  STEPS: (599592, 1199184)
  MAX_ITER: 1300000  # maximum number of iterations
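The STEPS values above follow from simple arithmetic: one iteration consumes IMS_PER_BATCH images, so one epoch is (number of images / IMS_PER_BATCH) iterations. The sketch below reproduces the numbers, assuming IMS_PER_BATCH = 1 as in the defaults.py used here. The function name `decay_steps` is hypothetical.

```python
import math


def decay_steps(num_images, ims_per_batch, epochs_per_decay, num_decays):
    """Return the iteration indices at which the learning rate is decayed."""
    iters_per_epoch = math.ceil(num_images / ims_per_batch)
    return tuple(iters_per_epoch * epochs_per_decay * k
                 for k in range(1, num_decays + 1))


steps = decay_steps(num_images=149898, ims_per_batch=1,
                    epochs_per_decay=4, num_decays=2)
print(steps)  # (599592, 1199184)
```

If the batch size changes, the milestones should be rescaled accordingly; for example, the same call with `ims_per_batch=2` gives (299796, 599592).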
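After reading the model configuration file, look at the framework's default configuration file (defaults.py) and you will notice that quite a few parameters overlap. Reading the code shows that the values from the model configuration file are merged into defaults.py: as the name suggests, defaults.py supplies the default value of every parameter, and whenever the model configuration file changes a parameter, the value from the model configuration file takes precedence. Many more parameters do not appear in the model configuration file at all; I briefly annotate some of them here.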
import os
from yacs.config import CfgNode as CN
_C = CN()
_C.MODEL = CN()
_C.MODEL.RPN_ONLY = False
_C.MODEL.MASK_ON = False
_C.MODEL.DEVICE = "cuda"
_C.MODEL.META_ARCHITECTURE = "GeneralizedRCNN"
_C.MODEL.WEIGHT = ""
_C.INPUT = CN()
_C.INPUT.MIN_SIZE_TRAIN = 800 # size of the shorter image side during training
_C.INPUT.MAX_SIZE_TRAIN = 1333 # maximum size of the longer image side during training
_C.INPUT.MIN_SIZE_TEST = 800
_C.INPUT.MAX_SIZE_TEST = 1333
_C.INPUT.PIXEL_MEAN = [102.9801, 115.9465, 122.7717]
_C.INPUT.PIXEL_STD = [1., 1., 1.]
_C.INPUT.TO_BGR255 = True
_C.DATASETS = CN()
_C.DATASETS.TRAIN = () # set in the model configuration file
_C.DATASETS.TEST = ()
_C.DATALOADER = CN()
_C.DATALOADER.NUM_WORKERS = 4 # number of data loading worker processes
_C.DATALOADER.SIZE_DIVISIBILITY = 0
_C.DATALOADER.ASPECT_RATIO_GROUPING = True
_C.MODEL.BACKBONE = CN()
_C.MODEL.BACKBONE.CONV_BODY = "R-50-C4"
_C.MODEL.BACKBONE.FREEZE_CONV_BODY_AT = 2
_C.MODEL.BACKBONE.OUT_CHANNELS = 256 * 4
_C.MODEL.RPN = CN()
_C.MODEL.RPN.USE_FPN = False
_C.MODEL.RPN.ANCHOR_SIZES = (32, 64, 128, 256, 512)
_C.MODEL.RPN.ANCHOR_STRIDE = (16,)
_C.MODEL.RPN.ASPECT_RATIOS = (0.5, 1.0, 2.0)
_C.MODEL.RPN.STRADDLE_THRESH = 0
_C.MODEL.RPN.FG_IOU_THRESHOLD = 0.7
_C.MODEL.RPN.BG_IOU_THRESHOLD = 0.3
_C.MODEL.RPN.BATCH_SIZE_PER_IMAGE = 256
_C.MODEL.RPN.POSITIVE_FRACTION = 0.5
_C.MODEL.RPN.PRE_NMS_TOP_N_TRAIN = 12000
_C.MODEL.RPN.PRE_NMS_TOP_N_TEST = 6000
_C.MODEL.RPN.POST_NMS_TOP_N_TRAIN = 2000
_C.MODEL.RPN.POST_NMS_TOP_N_TEST = 1000
_C.MODEL.RPN.NMS_THRESH = 0.7
_C.MODEL.RPN.MIN_SIZE = 0
_C.MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN = 2000
_C.MODEL.RPN.FPN_POST_NMS_TOP_N_TEST = 2000
_C.MODEL.ROI_HEADS = CN()
_C.MODEL.ROI_HEADS.USE_FPN = False
_C.MODEL.ROI_HEADS.FG_IOU_THRESHOLD = 0.5
_C.MODEL.ROI_HEADS.BG_IOU_THRESHOLD = 0.5
_C.MODEL.ROI_HEADS.BBOX_REG_WEIGHTS = (10., 10., 5., 5.)
_C.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 512
_C.MODEL.ROI_HEADS.POSITIVE_FRACTION = 0.25
_C.MODEL.ROI_HEADS.SCORE_THRESH = 0.05
_C.MODEL.ROI_HEADS.NMS = 0.5
_C.MODEL.ROI_HEADS.DETECTIONS_PER_IMG = 100
_C.MODEL.ROI_BOX_HEAD = CN()
_C.MODEL.ROI_BOX_HEAD.FEATURE_EXTRACTOR = "ResNet50Conv5ROIFeatureExtractor"
_C.MODEL.ROI_BOX_HEAD.PREDICTOR = "FastRCNNPredictor"
_C.MODEL.ROI_BOX_HEAD.POOLER_RESOLUTION = 14
_C.MODEL.ROI_BOX_HEAD.POOLER_SAMPLING_RATIO = 0
_C.MODEL.ROI_BOX_HEAD.POOLER_SCALES = (1.0 / 16,)
# Number of classes. The default is 81 because COCO has 80 classes + 1 (background); my dataset has only 4 classes, so 5 including background
_C.MODEL.ROI_BOX_HEAD.NUM_CLASSES = 5
_C.MODEL.ROI_BOX_HEAD.MLP_HEAD_DIM = 1024
_C.MODEL.ROI_MASK_HEAD = CN()
_C.MODEL.ROI_MASK_HEAD.FEATURE_EXTRACTOR = "ResNet50Conv5ROIFeatureExtractor"
_C.MODEL.ROI_MASK_HEAD.PREDICTOR = "MaskRCNNC4Predictor"
_C.MODEL.ROI_MASK_HEAD.POOLER_RESOLUTION = 14
_C.MODEL.ROI_MASK_HEAD.POOLER_SAMPLING_RATIO = 0
_C.MODEL.ROI_MASK_HEAD.POOLER_SCALES = (1.0 / 16,)
_C.MODEL.ROI_MASK_HEAD.MLP_HEAD_DIM = 1024
_C.MODEL.ROI_MASK_HEAD.CONV_LAYERS = (256, 256, 256, 256)
_C.MODEL.ROI_MASK_HEAD.RESOLUTION = 14
_C.MODEL.ROI_MASK_HEAD.SHARE_BOX_FEATURE_EXTRACTOR = True
_C.MODEL.RESNETS = CN()
_C.MODEL.RESNETS.NUM_GROUPS = 1
_C.MODEL.RESNETS.WIDTH_PER_GROUP = 64
_C.MODEL.RESNETS.STRIDE_IN_1X1 = True
_C.MODEL.RESNETS.TRANS_FUNC = "BottleneckWithFixedBatchNorm"
_C.MODEL.RESNETS.STEM_FUNC = "StemWithFixedBatchNorm"
_C.MODEL.RESNETS.RES5_DILATION = 1
_C.MODEL.RESNETS.RES2_OUT_CHANNELS = 256
_C.MODEL.RESNETS.STEM_OUT_CHANNELS = 64
_C.SOLVER = CN()
_C.SOLVER.MAX_ITER = 40000 # maximum number of iterations
_C.SOLVER.BASE_LR = 0.02 # initial learning rate; usually set in the model configuration file
_C.SOLVER.BIAS_LR_FACTOR = 2
_C.SOLVER.MOMENTUM = 0.9
_C.SOLVER.WEIGHT_DECAY = 0.0005
_C.SOLVER.WEIGHT_DECAY_BIAS = 0
_C.SOLVER.GAMMA = 0.1
_C.SOLVER.STEPS = (30000,)
_C.SOLVER.WARMUP_FACTOR = 1.0 / 3
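_C.SOLVER.WARMUP_ITERS = 500 # number of warmup iterations; the learning rate is kept lower while the iteration count is below this value
_C.SOLVER.WARMUP_METHOD = "constant" # warmup strategy: either 'constant' or 'linear'
_C.SOLVER.CHECKPOINT_PERIOD = 2000 # interval, in iterations, between checkpoints
_C.SOLVER.IMS_PER_BATCH = 1 # number of images per batch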
_C.TEST = CN()
_C.TEST.EXPECTED_RESULTS = []
_C.TEST.EXPECTED_RESULTS_SIGMA_TOL = 4
_C.TEST.IMS_PER_BATCH = 1
_C.OUTPUT_DIR = "output" # output directory, mainly for checkpoints and inference results
_C.PATHS_CATALOG = os.path.join(os.path.dirname(__file__), "paths_catalog.py")
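The override rule described above (defaults first, then the model configuration file wins wherever it sets a value) can be imitated with plain dictionaries. This is only an illustration of the merge semantics, not the framework's or yacs's actual code; `merge_config` and the trimmed-down dictionaries are hypothetical, with values taken from the listings above.

```python
import copy


def merge_config(defaults, overrides):
    """Recursively overlay `overrides` on a copy of `defaults`.

    Keys that the defaults do not declare are rejected, mimicking the idea
    that a model config may only change options that already exist.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if key not in merged:
            raise KeyError("Unknown config key: {}".format(key))
        if isinstance(value, dict) and isinstance(merged[key], dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Trimmed-down stand-ins for defaults.py and the model YAML file.
defaults = {
    "MODEL": {"MASK_ON": False, "RPN": {"USE_FPN": False, "NMS_THRESH": 0.7}},
    "SOLVER": {"BASE_LR": 0.02, "MAX_ITER": 40000, "WARMUP_ITERS": 500},
}
model_yaml = {
    "MODEL": {"RPN": {"USE_FPN": True}},
    "SOLVER": {"BASE_LR": 0.01, "MAX_ITER": 1300000},
}
cfg = merge_config(defaults, model_yaml)
print(cfg["SOLVER"])
```

Here BASE_LR and MAX_ITER come from the model file, while WARMUP_ITERS and NMS_THRESH keep their defaults because the model file never mentions them.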
In paths_catalog.py, the most important piece is the DatasetCatalog class.
import os


class DatasetCatalog(object):
    DATA_DIR = "datasets"
    DATASETS = {
        "coco_2014_train": (
            # Root directory of this dataset's images. The root is joined with the
            # file_name of each entry in the annotation file's images field to get
            # the full image path.
            "coco/train2014",
            "coco/annotations/instances_train2014.json",  # annotation file path
        ),
        "coco_2014_val": (
            "coco/val2014",  # image root, as above
            "coco/annotations/instances_val2014.json",  # annotation file path, as above
        ),
    }

    @staticmethod
    def get(name):
        if "coco" in name:  # e.g. "coco_2014_train"
            data_dir = DatasetCatalog.DATA_DIR
            attrs = DatasetCatalog.DATASETS[name]
            args = dict(
                root=os.path.join(data_dir, attrs[0]),
                ann_file=os.path.join(data_dir, attrs[1]),
            )
            return dict(
                factory="COCODataset",
                args=args,
            )
        raise RuntimeError("Dataset not available: {}".format(name))
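To make the path handling concrete, the sketch below reproduces only the lookup logic and shows how a dataset name from the model configuration resolves to an image root and an annotation file, and how an image's `file_name` is then joined onto the root. The `resolve` helper and the sample file name are hypothetical; the directory names are the ones used above.

```python
import os

DATA_DIR = "datasets"
DATASETS = {
    "coco_2014_train": ("coco/train2014", "coco/annotations/instances_train2014.json"),
    "coco_2014_val": ("coco/val2014", "coco/annotations/instances_val2014.json"),
}


def resolve(name):
    """Map a dataset name to (image root, annotation file)."""
    if name not in DATASETS:
        raise RuntimeError("Dataset not available: {}".format(name))
    root, ann_file = DATASETS[name]
    return os.path.join(DATA_DIR, root), os.path.join(DATA_DIR, ann_file)


root, ann_file = resolve("coco_2014_train")
# "0001.jpg" stands in for the file_name of one entry in the images field.
image_path = os.path.join(root, "0001.jpg")
print(root)
print(ann_file)
print(image_path)
```

Because DATA_DIR is a relative path, these locations are resolved against the current working directory, which is why the training and test scripts are launched from the maskrcnn-benchmark root.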
Starting Training
# Enter the maskrcnn-benchmark directory and activate the maskrcnn_benchmark virtual environment
$ cd maskrcnn-benchmark
$ source activate maskrcnn_benchmark
# Specify the model configuration file and run the training script
(maskrcnn_benchmark) $ python tools/train_net.py --config-file configs/adas_e2e_mask_rcnn_R_101_FPN_1x.yaml
Intermediate training information, mainly the loss values, is printed every fixed number of iterations (I set it to 200).
2018-11-09 14:40:22,020 maskrcnn_benchmark.trainer INFO: Start training
2018-11-09 14:42:00,113 maskrcnn_benchmark.trainer INFO: eta: 17:35:44 iter: 200 loss: 0.1553 (0.3598) loss_classifier: 0.0728 (0.1902) loss_box_reg: 0.0764 (0.1221) loss_objectness: 0.0110 (0.0392) loss_rpn_box_reg: 0.0028 (0.0083) time: 0.4775 (0.4880) data: 0.0027 (0.0105) avg_loss: 0.3616 (0.3616) lr: 0.003333 max mem: 3629
2018-11-09 14:43:37,005 maskrcnn_benchmark.trainer INFO: eta: 17:30:17 iter: 400 loss: 0.2033 (0.3071) loss_classifier: 0.1271 (0.1587) loss_box_reg: 0.0883 (0.1162) loss_objectness: 0.0033 (0.0244) loss_rpn_box_reg: 0.0049 (0.0078) time: 0.4763 (0.4862) data: 0.0029 (0.0068) avg_loss: 0.2541 (0.3078) lr: 0.003333 max mem: 3629
2018-11-09 14:45:13,014 maskrcnn_benchmark.trainer INFO: eta: 17:24:13 iter: 600 loss: 0.3123 (0.2915) loss_classifier: 0.1296 (0.1511) loss_box_reg: 0.1310 (0.1127) loss_objectness: 0.0090 (0.0197) loss_rpn_box_reg: 0.0086 (0.0080) time: 0.4613 (0.4842) data: 0.0028 (0.0056) avg_loss: 0.2604 (0.2920) lr: 0.010000 max mem: 3629
2018-11-09 14:46:48,015 maskrcnn_benchmark.trainer INFO: eta: 17:17:40 iter: 800 loss: 0.3133 (0.2929) loss_classifier: 0.1620 (0.1534) loss_box_reg: 0.1227 (0.1121) loss_objectness: 0.0067 (0.0189) loss_rpn_box_reg: 0.0075 (0.0084) time: 0.4625 (0.4819) data: 0.0029 (0.0049) avg_loss: 0.2604 (0.2932) lr: 0.010000 max mem: 3629
2018-11-09 14:48:24,037 maskrcnn_benchmark.trainer INFO: eta: 17:15:17 iter: 1000 loss: 0.2165 (0.2952) loss_classifier: 0.1061 (0.1554) loss_box_reg: 0.0781 (0.1148) loss_objectness: 0.0037 (0.0167) loss_rpn_box_reg: 0.0047 (0.0082) time: 0.4688 (0.4815) data: 0.0031 (0.0046) avg_loss: 0.2968 (0.2955) lr: 0.010000 max mem: 3629
.... (lines omitted) ....
2018-11-10 12:59:40,231 maskrcnn_benchmark.trainer INFO: eta: 4 days, 0:48:47 iter: 230600 loss: 0.0727 (0.0878) loss_classifier: 0.0355 (0.0466) loss_box_reg: 0.0321 (0.0369) loss_objectness: 0.0002 (0.0018) loss_rpn_box_reg: 0.0017 (0.0026) time: 0.6915 (0.3259) data: 0.0041 (0.0033) avg_loss: 0.0849 (0.0877) lr: 0.010000 max mem: 3626
2018-11-10 13:01:57,302 maskrcnn_benchmark.trainer INFO: eta: 4 days, 0:56:11 iter: 230800 loss: 0.0767 (0.0878) loss_classifier: 0.0388 (0.0466) loss_box_reg: 0.0275 (0.0368) loss_objectness: 0.0002 (0.0018) loss_rpn_box_reg: 0.0022 (0.0026) time: 0.6475 (0.3264) data: 0.0040 (0.0033) avg_loss: 0.0849 (0.0877) lr: 0.010000 max mem: 3626
2018-11-10 13:04:13,533 maskrcnn_benchmark.trainer INFO: eta: 4 days, 1:03:28 iter: 231000 loss: 0.0705 (0.0878) loss_classifier: 0.0338 (0.0466) loss_box_reg: 0.0350 (0.0368) loss_objectness: 0.0004 (0.0018) loss_rpn_box_reg: 0.0023 (0.0026) time: 0.7095 (0.3269) data: 0.0038 (0.0033) avg_loss: 0.0849 (0.0877) lr: 0.010000 max mem: 3626
2018-11-10 13:06:31,076 maskrcnn_benchmark.trainer INFO: eta: 4 days, 1:10:53 iter: 231200 loss: 0.0825 (0.0878) loss_classifier: 0.0428 (0.0466) loss_box_reg: 0.0383 (0.0368) loss_objectness: 0.0001 (0.0018) loss_rpn_box_reg: 0.0018 (0.0026) time: 0.7105 (0.3273) data: 0.0042 (0.0033) avg_loss: 0.0849 (0.0877) lr: 0.010000 max mem: 3626
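Looking closely, during the warmup phase, i.e. the first 500 iterations, the warmup strategy lowers the learning rate to 0.003333 (BASE_LR x WARMUP_FACTOR = 0.01 x 1/3) even though I set the initial learning rate to 0.01; after 500 iterations it returns to 0.01. The average training loss (averaged over 200 iterations) drops from 0.3616 at the start to 0.0849. Training is of course not finished at this point, but I ran a validation pass to see how the model is doing.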
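The learning-rate values in the log can be reproduced from the solver settings. The function below is a sketch of a warmup-plus-step schedule written from the configuration options described earlier, not a copy of the framework's lr_scheduler.py. The name `lr_at` is hypothetical, and the linear branch (interpolating from WARMUP_FACTOR up to 1) is my assumption about what 'linear' means.

```python
from bisect import bisect_right


def lr_at(iteration, base_lr=0.01, steps=(599592, 1199184), gamma=0.1,
          warmup_factor=1.0 / 3, warmup_iters=500, warmup_method="constant"):
    """Learning rate at a given iteration for a warmup + multi-step schedule."""
    factor = 1.0
    if iteration < warmup_iters:
        if warmup_method == "constant":
            factor = warmup_factor
        elif warmup_method == "linear":
            alpha = iteration / warmup_iters
            factor = warmup_factor * (1 - alpha) + alpha
        else:
            raise ValueError("Unknown warmup method: {}".format(warmup_method))
    # Every milestone already passed multiplies the rate by gamma.
    return base_lr * factor * gamma ** bisect_right(steps, iteration)


for it in (200, 400, 600, 600000):
    print(it, round(lr_at(it), 6))
```

With the settings used in this post, iterations 200 and 400 give 0.003333 and iteration 600 gives 0.01, matching the log; after the first milestone at 599592 the rate becomes 0.001.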
# Specify the model configuration file and run the test script
(maskrcnn_benchmark) $ python tools/test_net.py --config-file configs/adas_e2e_mask_rcnn_R_101_FPN_1x.yaml
Validation results:
2018-11-10 13:15:13,025 maskrcnn_benchmark.inference INFO: Start evaluation on 3035 images
3035it [24:16, 2.07it/s]
2018-11-10 13:39:29,606 maskrcnn_benchmark.inference INFO: Total inference time: 0:24:16.580470 (0.4799276670867568 s / img per device, on 1 devices)
2018-11-10 13:39:30,015 maskrcnn_benchmark.inference INFO: Preparing results for COCO format
2018-11-10 13:39:30,015 maskrcnn_benchmark.inference INFO: Preparing bbox results
2018-11-10 13:39:30,466 maskrcnn_benchmark.inference INFO: Evaluating predictions
Loading and preparing results...
DONE (t=0.18s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=4.26s).
Accumulating evaluation results...
DONE (t=1.10s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.481
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.814
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.557
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.211
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.512
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.583
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.605
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.605
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.396
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.617
2018-11-10 13:39:36,404 maskrcnn_benchmark.inference INFO: OrderedDict([('bbox', OrderedDict([('AP', 0.4805768356976347), ('AP50', 0.813735887686001), ('AP75', 0.5574143235378376), ('APs', -1.0), ('APm', 0.21119579684755593), ('APl', 0.5118301401808005)]))])
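The value -1.000 in the small-area rows is what the COCO evaluation reports when there is nothing to evaluate in that bucket, which here most likely means the validation set contains no ground-truth boxes in the small range. A quick way to check is to count annotations per area bucket. The sketch below uses the standard COCO thresholds (small: area below 32^2, medium: between 32^2 and 96^2, large: above 96^2); the helper name and the sample data are hypothetical, and boundary handling is simplified.

```python
from collections import Counter


def count_by_area(annotations):
    """Count annotations per COCO area bucket using each entry's 'area' field."""
    counts = Counter({"small": 0, "medium": 0, "large": 0})
    for ann in annotations:
        area = ann["area"]
        if area < 32 ** 2:
            counts["small"] += 1
        elif area < 96 ** 2:
            counts["medium"] += 1
        else:
            counts["large"] += 1
    return dict(counts)


# Hypothetical annotations; in practice, load the "annotations" list from
# datasets/coco/annotations/instances_val2014.json with the json module.
sample = [{"area": 40 * 50}, {"area": 120 * 200}, {"area": 100 * 100}]
print(count_by_area(sample))  # {'small': 0, 'medium': 1, 'large': 2}
```

A zero in the small bucket for the real validation file would explain the -1.000 entries for APs and the small-area AR.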