mask_rcnn keras source code walkthrough (3): the configuration file

阿新 • Published: 2018-11-10

This post goes through the parameters in config.py one by one.

# NUMBER OF GPUs to use. For CPU training, use 1
GPU_COUNT = 1
# Number of images to train with on each GPU. A 12GB GPU can typically
# handle 2 images of 1024x1024px.
# Adjust based on your GPU memory and image sizes. Use the highest
# number that your GPU can handle for best performance.
IMAGES_PER_GPU = 2

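GPU_COUNT: the number of GPUs on your machine to train with.
IMAGES_PER_GPU: the number of images each GPU processes per step. The effective batch size is GPU_COUNT * IMAGES_PER_GPU.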
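To make the relationship between the two values concrete, here is a minimal sketch. `TinyConfig` and `TwoGpuConfig` are hypothetical stand-ins written for this post, not the library's Config class; they only mimic the idea that the batch size is derived from the two settings.

```python
class TinyConfig:
    """Hypothetical stand-in for the real Config class (illustration only)."""
    GPU_COUNT = 1
    IMAGES_PER_GPU = 2

    def __init__(self):
        # Effective batch size: images per GPU times the number of GPUs.
        self.BATCH_SIZE = self.IMAGES_PER_GPU * self.GPU_COUNT


class TwoGpuConfig(TinyConfig):
    # Override only what differs, the same way sub-classes of Config do.
    GPU_COUNT = 2


print(TinyConfig().BATCH_SIZE)    # 2
print(TwoGpuConfig().BATCH_SIZE)  # 4
```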

# Number of training steps per epoch
# This doesn't need to match the size of the training set. Tensorboard
# updates are saved at the end of each epoch, so setting this to a
# smaller number means getting more frequent TensorBoard updates.
# Validation stats are also calculated at each epoch end and they
# might take a while, so don't set this too small to avoid spending
# a lot of time on validation stats.
STEPS_PER_EPOCH = 1000

# Number of validation steps to run at the end of every training epoch.
# A bigger number improves accuracy of validation stats, but slows
# down the training.
VALIDATION_STEPS = 50

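STEPS_PER_EPOCH: how many steps make up one epoch (each step consumes one batch of data).
VALIDATION_STEPS: how many batches are drawn from the validation set (not the training set) at the end of each epoch.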
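As a quick piece of arithmetic, the number of images seen per epoch follows directly from these two values and the batch size. The batch size of 2 below is an assumption taken from the defaults above (GPU_COUNT = 1, IMAGES_PER_GPU = 2).

```python
STEPS_PER_EPOCH = 1000
VALIDATION_STEPS = 50
BATCH_SIZE = 2  # assumed: GPU_COUNT (1) * IMAGES_PER_GPU (2)

# Each step consumes exactly one batch.
train_images_per_epoch = STEPS_PER_EPOCH * BATCH_SIZE
val_images_per_epoch = VALIDATION_STEPS * BATCH_SIZE

print(train_images_per_epoch)  # 2000
print(val_images_per_epoch)    # 100
```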

# Backbone network architecture
# Supported values are: resnet50, resnet101
BACKBONE = "resnet101"

BACKBONE: the backbone network used for feature extraction, typically resnet101.

# The strides of each layer of the FPN Pyramid. These values
# are based on a Resnet101 backbone.
BACKBONE_STRIDES = [4, 8, 16, 32, 64]

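BACKBONE_STRIDES: the factor by which each feature map is downscaled relative to the input image. In resnet101, C2 is 1/4 the size of the input, C3 is 1/8, C4 is 1/16 and C5 is 1/32, and these correspond to the FPN outputs P2-P5. P6 is obtained by downsampling P5 once more, so it is 1/64 of the input.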
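The feature-map sizes implied by these strides can be worked out with a few lines. This is a minimal sketch; the helper name `backbone_shapes` is made up for this post, and rounding up with `ceil` is an assumption about how non-divisible image sizes are handled.

```python
import math

BACKBONE_STRIDES = [4, 8, 16, 32, 64]


def backbone_shapes(image_height, image_width, strides=BACKBONE_STRIDES):
    """Return the (height, width) of each pyramid level for a given input size."""
    return [(math.ceil(image_height / s), math.ceil(image_width / s))
            for s in strides]


# For a 1024x1024 input, P2..P6 shrink by a factor of 2 at each level.
for level, shape in zip(["P2", "P3", "P4", "P5", "P6"], backbone_shapes(1024, 1024)):
    print(level, shape)
```

For a 1024x1024 image this gives 256, 128, 64, 32 and 16 pixels per side for P2 through P6.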

# Number of classification classes (including background)
NUM_CLASSES = 1  # Override in sub-classes

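NUM_CLASSES: the number of classes to detect, including the background. Class IDs start at 0 and 0 is reserved for the background, so the default value of 1 means background only. For a dataset with N object classes, override it with 1 + N in a sub-class.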

# Length of square anchor side in pixels
RPN_ANCHOR_SCALES = (32, 64, 128, 256, 512)

RPN_ANCHOR_SCALES: the default anchor size for each pyramid level, one scale per level from P2 to P6. For example, the anchors on P2 are 32*32.

# Ratios of anchors at each cell (width/height)
# A value of 1 represents a square anchor, and 0.5 is a wide anchor
RPN_ANCHOR_RATIOS = [0.5, 1, 2]

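RPN_ANCHOR_RATIOS: the width/height ratios of the anchors. Take P2, whose default anchor is 32*32. As I read generate_anchors(), the anchor area is kept constant while the ratio changes: height = scale / sqrt(ratio) and width = scale * sqrt(ratio). With a ratio of 0.5 the anchor is therefore about 22.6 wide and 45.3 high (area still 1024). If that reading is right, the 0.5 anchor is taller than it is wide, which does not match the wording of the comment above.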
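The anchor shapes for one scale can be worked out as follows. This is a sketch under the assumption that the anchor area stays at scale**2 while the width/height ratio changes (my reading of generate_anchors()); the helper `anchor_shapes` is a name invented for this post.

```python
import numpy as np


def anchor_shapes(scale, ratios):
    """Return (height, width) pairs for one scale, keeping the area at scale**2."""
    ratios = np.asarray(ratios, dtype=np.float64)
    heights = scale / np.sqrt(ratios)  # a smaller ratio gives a taller box
    widths = scale * np.sqrt(ratios)   # ratio = width / height
    return np.stack([heights, widths], axis=1)


ratios = [0.5, 1, 2]
shapes = anchor_shapes(32, ratios)
for (h, w), r in zip(shapes, ratios):
    print(f"ratio={r}: height={h:.1f}, width={w:.1f}, area={h * w:.0f}")
```

Every row has an area of 1024, and the ratio of 1 reproduces the 32*32 square.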

# Anchor stride
# If 1 then anchors are created for each cell in the backbone feature map.
# If 2, then anchors are created for every other cell, and so on.
RPN_ANCHOR_STRIDE = 1

# Non-max suppression threshold to filter RPN proposals.
# You can increase this during training to generate more proposals.
RPN_NMS_THRESHOLD = 0.7

# How many anchors per image to use for RPN training
RPN_TRAIN_ANCHORS_PER_IMAGE = 256

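RPN_ANCHOR_STRIDE: for each FPN output (P2-P6), the RPN first applies a 3*3 convolution to produce the shared feature map, which then feeds the BG/FG classification and the box-coordinate regression. anchor_stride is passed as the stride of that convolution: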

# Shared convolutional base of the RPN
shared = KL.Conv2D(512, (3, 3), padding='same', activation='relu',
                   strides=anchor_stride, #RPN_ANCHOR_STRIDE
                   name='rpn_conv_shared')(feature_map)
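To see what the stride means for the total number of anchors, here is a rough count. It assumes a 1024x1024 input, the five BACKBONE_STRIDES above, one anchor per ratio at each selected cell, and that a stride of 2 keeps every other cell in both directions; `count_anchors` is a hypothetical helper, not a library function.

```python
import math


def count_anchors(image_size, strides, num_ratios, anchor_stride):
    """Count anchors over all pyramid levels for a square input image."""
    total = 0
    for s in strides:
        cells = math.ceil(image_size / s)             # feature-map side length
        positions = math.ceil(cells / anchor_stride)  # anchor positions per side
        total += positions * positions * num_ratios
    return total


strides = [4, 8, 16, 32, 64]
print(count_anchors(1024, strides, num_ratios=3, anchor_stride=1))  # 261888
print(count_anchors(1024, strides, num_ratios=3, anchor_stride=2))  # 65472
```

Going from a stride of 1 to 2 cuts the anchor count to a quarter, which is why the default of 1 is the densest setting.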

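RPN_NMS_THRESHOLD: the threshold of the non-maximum suppression used in ProposalLayer. NMS repeatedly picks the highest-scoring remaining anchor (best_anchor) and discards every other remaining anchor (left_anchor) whose IoU with best_anchor is greater than 0.7.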
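The procedure described above can be sketched in plain NumPy. This is an illustration of greedy NMS in general, not the code ProposalLayer runs; the function names `iou_one_to_many` and `nms` and the sample boxes are made up for this post. Boxes are given as (y1, x1, y2, x2).

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all as (y1, x1, y2, x2)."""
    y1 = np.maximum(box[0], boxes[:, 0])
    x1 = np.maximum(box[1], boxes[:, 1])
    y2 = np.minimum(box[2], boxes[:, 2])
    x2 = np.minimum(box[3], boxes[:, 3])
    inter = np.maximum(y2 - y1, 0) * np.maximum(x2 - x1, 0)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def nms(boxes, scores, threshold):
    """Greedy NMS: keep the best box, drop boxes overlapping it too much, repeat."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= threshold]  # discard boxes with IoU > threshold
    return keep


boxes = np.array([[0, 0, 10, 10], [0, 1, 10, 11], [20, 20, 30, 30]], dtype=np.float64)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores, threshold=0.7))  # [0, 2]
print(nms(boxes, scores, threshold=0.9))  # [0, 1, 2]
```

The first two boxes overlap with an IoU of about 0.82, so the second one is dropped at a threshold of 0.7 but survives at 0.9. That is the sense in which raising the threshold generates more proposals.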

RPN_TRAIN_ANCHORS_PER_IMAGE: for one image, how many anchors are used to train the RPN.

# RPN bounding boxes: [max anchors per image, (dy, dx, log(dh), log(dw))]
rpn_bbox = np.zeros((config.RPN_TRAIN_ANCHORS_PER_IMAGE, 4))

# ROIs kept after non-maximum suppression (training and inference)
POST_NMS_ROIS_TRAINING = 2000
POST_NMS_ROIS_INFERENCE = 1000

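These two parameters rarely need to be changed. They set how many proposals ProposalLayer keeps after NMS: 2000 during training and 1000 during inference.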

# If enabled, resizes instance masks to a smaller size to reduce
# memory load. Recommended when using high-resolution images.
USE_MINI_MASK = True
MINI_MASK_SHAPE = (56, 56)  # (height, width) of the mini-mask

Each instance mask is cropped to its bounding box and then resized to MINI_MASK_SHAPE. See minimize_mask() for details.
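The idea of cropping and shrinking a mask can be sketched with NumPy alone. This is a simplified illustration, not minimize_mask() itself: it uses nearest-neighbour sampling, whereas the real function may interpolate and threshold, and the name `minimize_mask_sketch` and the sample box are made up for this post.

```python
import numpy as np


def minimize_mask_sketch(bbox, mask, mini_shape=(56, 56)):
    """Crop a full-size boolean mask to bbox (y1, x1, y2, x2) and shrink it."""
    y1, x1, y2, x2 = bbox
    crop = mask[y1:y2, x1:x2]
    # Nearest-neighbour sampling: pick evenly spaced rows and columns of the crop.
    rows = (np.arange(mini_shape[0]) * crop.shape[0] / mini_shape[0]).astype(int)
    cols = (np.arange(mini_shape[1]) * crop.shape[1] / mini_shape[1]).astype(int)
    return crop[np.ix_(rows, cols)]


# A 1024x1024 mask containing one rectangular instance.
mask = np.zeros((1024, 1024), dtype=bool)
mask[100:300, 200:600] = True
mini = minimize_mask_sketch((100, 200, 300, 600), mask)

print(mini.shape)                 # (56, 56)
print(mask.nbytes, mini.nbytes)   # 1048576 3136
```

The memory saving is the point: one full-resolution mask per instance adds up quickly when an image holds many instances.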

# Input image resizing
# Generally, use the "square" resizing mode for training and inferencing
# and it should work well in most cases. In this mode, images are scaled
# up such that the small side is = IMAGE_MIN_DIM, but ensuring that the
# scaling doesn't make the long side > IMAGE_MAX_DIM. Then the image is
# padded with zeros to make it a square so multiple images can be put
# in one batch.
# Available resizing modes:
# none:   No resizing or padding. Return the image unchanged.
# square: Resize and pad with zeros to get a square image
#         of size [max_dim, max_dim].
# pad64:  Pads width and height with zeros to make them multiples of 64.
#         If IMAGE_MIN_DIM or IMAGE_MIN_SCALE are not None, then it scales
#         up before padding. IMAGE_MAX_DIM is ignored in this mode.
#         The multiple of 64 is needed to ensure smooth scaling of feature
#         maps up and down the 6 levels of the FPN pyramid (2**6=64).
# crop:   Picks random crops from the image. First, scales the image based
#         on IMAGE_MIN_DIM and IMAGE_MIN_SCALE, then picks a random crop of
#         size IMAGE_MIN_DIM x IMAGE_MIN_DIM. Can be used in training only.
#         IMAGE_MAX_DIM is not used in this mode.
IMAGE_RESIZE_MODE = "square"
IMAGE_MIN_DIM = 800
IMAGE_MAX_DIM = 1024
# Minimum scaling ratio. Checked after MIN_IMAGE_DIM and can force further
# up scaling. For example, if set to 2 then images are scaled up to double
# the width and height, or more, even if MIN_IMAGE_DIM doesn't require it.
# However, in 'square' mode, it can be overruled by IMAGE_MAX_DIM.
IMAGE_MIN_SCALE = 0

These parameters control the preprocessing of the input images. See load_image_gt for details.
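The "square" mode described in the comments can be worked through for a concrete image size. The sketch below only computes the scale factor and the zero padding and does not touch pixel data; `square_resize_plan` is a hypothetical helper that follows the rules in the comments, not the library's resize function.

```python
def square_resize_plan(h, w, min_dim=800, max_dim=1024, min_scale=0):
    """Return (scale, resized (h, w), padding (top, bottom, left, right))."""
    scale = 1.0
    if min_dim:
        # Scale up (never down) so that the short side reaches min_dim.
        scale = max(1.0, min_dim / min(h, w))
    if min_scale and scale < min_scale:
        scale = min_scale
    # The long side must not exceed max_dim; this overrules the rules above.
    if max_dim and round(max(h, w) * scale) > max_dim:
        scale = max_dim / max(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    # Pad with zeros to a max_dim x max_dim square, keeping the image centred.
    top = (max_dim - new_h) // 2
    bottom = max_dim - new_h - top
    left = (max_dim - new_w) // 2
    right = max_dim - new_w - left
    return scale, (new_h, new_w), (top, bottom, left, right)


scale, size, padding = square_resize_plan(480, 640)
print(scale, size, padding)
```

For a 480x640 image, scaling the short side to 800 would push the long side past 1024, so the scale is capped at 1.6. The image becomes 768x1024 and gets 128 rows of zeros above and below.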

    # Image mean (RGB)
    MEAN_PIXEL = np.array([123.7, 116.8, 103.9])

MEAN_PIXEL: the per-channel RGB mean that is subtracted from every image to normalize it.
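A minimal sketch of the normalization, assuming it is a plain per-channel mean subtraction. `subtract_mean` and `add_mean` are names made up for this post; they are simplified stand-ins, not the library's functions.

```python
import numpy as np

MEAN_PIXEL = np.array([123.7, 116.8, 103.9])


def subtract_mean(image):
    """Convert an RGB uint8 image to float and subtract the per-channel mean."""
    return image.astype(np.float32) - MEAN_PIXEL


def add_mean(normalized):
    """Undo subtract_mean and return a uint8 image again."""
    return np.round(normalized + MEAN_PIXEL).astype(np.uint8)


image = np.full((2, 2, 3), 128, dtype=np.uint8)
normalized = subtract_mean(image)
print(normalized[0, 0])  # approximately [4.3, 11.2, 24.1]
print(np.array_equal(add_mean(normalized), image))  # True
```

Broadcasting applies the three mean values along the last (channel) axis, so the same code works for a single image or a batch.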

# Number of ROIs per image to feed to classifier/mask heads
# The Mask RCNN paper uses 512 but often the RPN doesn't generate
# enough positive proposals to fill this and keep a positive:negative
# ratio of 1:3. You can increase the number of proposals by adjusting
# the RPN NMS threshold.
TRAIN_ROIS_PER_IMAGE = 200

# Percent of positive ROIs used to train classifier/mask heads
ROI_POSITIVE_RATIO = 0.33

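TRAIN_ROIS_PER_IMAGE: ProposalLayer produces many ROIs. Each ROI is labeled a candidate positive or negative according to its IoU with the gt_boxes (positive when IoU >= 0.5), and at most int(200 * 0.33) = 66 of the candidate positives are then sampled as the final positive samples.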

# Subsample ROIs. Aim for 33% positive
# Positive ROIs
positive_count = int(config.TRAIN_ROIS_PER_IMAGE * config.ROI_POSITIVE_RATIO)
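The subsampling can be sketched in NumPy. The assumptions here are the 0.5 IoU cut-off between positives and negatives and that negatives are picked so that positives make up roughly ROI_POSITIVE_RATIO of the result; `subsample_rois` and the sample IoU values are invented for this post.

```python
import numpy as np


def subsample_rois(iou_max, rois_per_image=200, positive_ratio=0.33, seed=0):
    """Pick positive and negative ROI indices given each ROI's best IoU with a GT box."""
    rng = np.random.default_rng(seed)
    positive_idx = np.flatnonzero(iou_max >= 0.5)
    negative_idx = np.flatnonzero(iou_max < 0.5)

    # Cap the positives at rois_per_image * positive_ratio.
    positive_count = int(rois_per_image * positive_ratio)
    positive_idx = rng.permutation(positive_idx)[:positive_count]

    # Add enough negatives to keep the positive share near positive_ratio.
    negative_count = int(len(positive_idx) / positive_ratio) - len(positive_idx)
    negative_idx = rng.permutation(negative_idx)[:negative_count]
    return positive_idx, negative_idx


# 10 well-matched ROIs and 500 poorly matched ones.
iou_max = np.concatenate([np.full(10, 0.8), np.full(500, 0.1)])
pos, neg = subsample_rois(iou_max)
print(len(pos), len(neg))  # 10 20
```

With only 10 positives available, just 20 negatives are kept, so the batch ends up well short of 200 ROIs. This is the situation the comment above describes when it says the RPN often does not generate enough positive proposals.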

# Pooled ROIs
POOL_SIZE = 7
MASK_POOL_SIZE = 14

# Shape of output mask
# To change this you also need to change the neural network mask branch
MASK_SHAPE = [28, 28]

This is the ROI pooling part. See build_fpn_mask_graph for details.

The remaining parameters are straightforward; the comments are enough to understand them.

    # Maximum number of ground truth instances to use in one image
    MAX_GT_INSTANCES = 100

    # Bounding box refinement standard deviation for RPN and final detections.
    RPN_BBOX_STD_DEV = np.array([0.1, 0.1, 0.2, 0.2])
    BBOX_STD_DEV = np.array([0.1, 0.1, 0.2, 0.2])

    # Max number of final detections
    DETECTION_MAX_INSTANCES = 100

    # Minimum probability value to accept a detected instance
    # ROIs below this threshold are skipped
    DETECTION_MIN_CONFIDENCE = 0.7

    # Non-maximum suppression threshold for detection
    # Lowering this value keeps the final masks from piling up on top of each other
    DETECTION_NMS_THRESHOLD = 0.3

    # Learning rate and momentum
    # The Mask RCNN paper uses lr=0.02, but on TensorFlow it causes
    # weights to explode. Likely due to differences in optimizer
    # implementation.
    LEARNING_RATE = 0.001
    LEARNING_MOMENTUM = 0.9

    # Weight decay regularization
    WEIGHT_DECAY = 0.0001

    # Use RPN ROIs or externally generated ROIs for training
    # Keep this True for most situations. Set to False if you want to train
    # the head branches on ROI generated by code rather than the ROIs from
    # the RPN. For example, to debug the classifier head without having to
    # train the RPN.
    USE_RPN_ROIS = True
