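SSD Vehicle Detection with TensorFlow - 3

The Baidu Cloud link keeps going down. If you really need the files, email me at [email protected]

This blog series is my way of learning TensorFlow and Python. I am a beginner, so please point out any mistakes you find.

Google Drive: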

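III. Label preparation and batch data feeding

This part covers three topics:

  • some constants used for anchor generation;
  • how to turn the original annotation boxes into the labels and masks needed to compute the loss;
  • how to supply training data in batches during training, including operations such as shuffling.

1. Constants for anchor generation

The file constants.py defines the anchor-related constants: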

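# coding=utf-8
# pre-defined constants

# feature-map sizes of the 6 prediction branches of the SSD network
feature_size = [38, 19, 10, 5, 3, 1]

# roughly 300 / feature_size: the extent in the original image
# that one feature-map pixel corresponds to
anchor_steps = [8, 16, 30, 60, 100, 300]

# number of anchor types in each of the 6 prediction branches.
# Note: the SSD paper uses [4, 6, 6, 6, 4, 4]. Resizing the KITTI images
# produces more small objects, so the first branch uses 6 anchor types
# instead of 4 to improve the detection rate for small objects.
anchors_num = [6, 6, 6, 6, 4, 4]

# the total number of anchors therefore grows from 8732 in the paper to 11620
all_anchors_num = 11620

# aspect ratios used by the 6 branches; note that the 1:1 ratio appears twice,
# with two different sizes
anchors_ratio = [[1, 1, 2, 0.5, 3, 1./3],
                 [1, 1, 2, 0.5, 3, 1./3],
                 [1, 1, 2, 0.5, 3, 1./3],
                 [1, 1, 2, 0.5, 3, 1./3],
                 [1, 1, 2, 0.5],
                 [1, 1, 2, 0.5]]

# anchor sizes follow the paper's rule: smallest 0.07, largest 0.87, evenly spaced,
# so the 6 sizes as fractions of the input image are [0.07 0.23 ... 0.87].
# In addition, the 1:1 ratio gets one slightly larger size.
# the first:  ratio=1, sqrt(S_k * S_(k+1))
# the second: S_k = 0.07 + (k-1) * (0.87-0.07) / (6-1), k = 1...6
"""anchors_scales = [[0.13, 0.07], [0.30, 0.23], [0.46, 0.39], [0.62, 0.55], [0.79, 0.71], [0.95, 0.87]]"""

# 300 * anchors_scales
# (the 108 in the third pair is kept as published; 300 * 0.39 would be 117)
anchors_size = [[39, 21], [90, 69], [138, 108], [186, 165], [237, 213], [285, 261]]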
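The arithmetic behind these constants can be checked with a short script. The sketch below recomputes the anchor totals and the evenly spaced scales, then builds an anchor array of the expected shape. `gen_anchors_sketch` is a hypothetical stand-in, not the author's `gen_anchors`: it assumes centre-format `[x, y, w, h]` boxes in pixels, cell centres at `(index + 0.5) * step`, and that the first 1:1 ratio takes the larger of the two sizes.

```python
import numpy as np

# Constants copied from constants.py above.
feature_size = [38, 19, 10, 5, 3, 1]
anchor_steps = [8, 16, 30, 60, 100, 300]
anchors_num = [6, 6, 6, 6, 4, 4]
anchors_ratio = [[1, 1, 2, 0.5, 3, 1. / 3]] * 4 + [[1, 1, 2, 0.5]] * 2
anchors_size = [[39, 21], [90, 69], [138, 108], [186, 165], [237, 213], [285, 261]]

# Anchor totals: one set of anchors per feature-map cell.
total_ssd = sum(f * f * n for f, n in zip(feature_size, [4, 6, 6, 6, 4, 4]))  # 8732
total = sum(f * f * n for f, n in zip(feature_size, anchors_num))             # 11620

# Evenly spaced scales from 0.07 to 0.87 over the 6 branches.
scales = [round(0.07 + k * (0.87 - 0.07) / 5, 2) for k in range(6)]


def gen_anchors_sketch():
    """Hypothetical anchor generator: returns an array of [cx, cy, w, h] rows."""
    anchors = []
    for f, step, ratios, (big, small) in zip(feature_size, anchor_steps,
                                             anchors_ratio, anchors_size):
        for row in range(f):
            for col in range(f):
                cx, cy = (col + 0.5) * step, (row + 0.5) * step
                for idx, r in enumerate(ratios):
                    # Assumption: the first 1:1 anchor uses the larger size.
                    s = big if idx == 0 else small
                    anchors.append([cx, cy, s * np.sqrt(r), s / np.sqrt(r)])
    return np.array(anchors, dtype=np.float32)


anchors = gen_anchors_sketch()
print(total_ssd, total, anchors.shape, scales)
```

The key point is that the total depends only on `feature_size` and `anchors_num`, so raising the first branch from 4 to 6 anchor types adds 38 * 38 * 2 = 2888 anchors.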

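2. How to generate the labels and masks

My way of generating labels is rather mechanical:
- (1) First, the gen_anchors function in genBatch.py generates every possible anchor, giving an array of shape 11620*4 (coordinates in [x y w h] format);
- (2) Then gen_labels in genBatch.py loops over each annotated vehicle bounding box. Each bounding box is compared against all anchors by IOU. If its IOU with an anchor exceeds a threshold, that anchor's class label is set to 1 and the corresponding bounding box offset is computed with the following formula: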

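\[
t_x = \frac{g_x - d_x}{d_w}, \quad
t_y = \frac{g_y - d_y}{d_h}, \quad
t_w = \ln\frac{g_w}{d_w}, \quad
t_h = \ln\frac{g_h}{d_h}
\]

Here g is the ground-truth box (boxG) and d is the default anchor (boxD), both in [x, y, w, h] format.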

The corresponding function is:

import numpy as np

# compute the normalized offset between boxG (ground truth) and boxD (default anchor), both [x, y, w, h]
def compute_offset(boxG, boxD):
    offset = np.zeros([1, 4])
    # offset_x, offset_y
    offset[0, :2] = [(boxG[0] - boxD[0]) / boxD[2], (boxG[1] - boxD[1]) / boxD[3]]
    # offset_w, offset_h
    offset[0, 2:] = np.log([boxG[2] / boxD[2], boxG[3] / boxD[3]])
    return offset
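To see what these offsets encode, it helps to invert them. The sketch below copies `compute_offset` and adds a hypothetical `apply_offset` helper (not part of the original code) that maps an offset back to a box. The example boxes are made-up values chosen only for illustration.

```python
import numpy as np


def compute_offset(boxG, boxD):
    # Same computation as the function above.
    offset = np.zeros([1, 4])
    offset[0, :2] = [(boxG[0] - boxD[0]) / boxD[2], (boxG[1] - boxD[1]) / boxD[3]]
    offset[0, 2:] = np.log([boxG[2] / boxD[2], boxG[3] / boxD[3]])
    return offset


def apply_offset(offset, boxD):
    """Hypothetical inverse: recover [x, y, w, h] from an offset and its anchor."""
    dx, dy, dw, dh = offset[0]
    return np.array([boxD[0] + dx * boxD[2],
                     boxD[1] + dy * boxD[3],
                     boxD[2] * np.exp(dw),
                     boxD[3] * np.exp(dh)])


anchor = [150.0, 150.0, 60.0, 30.0]   # illustrative anchor
gt = [160.0, 145.0, 90.0, 45.0]       # illustrative ground-truth box

offset = compute_offset(gt, anchor)
recovered = apply_offset(offset, anchor)
print(offset, recovered)
```

Dividing the position offsets by the anchor's width and height, and taking the log of the size ratios, keeps the regression targets in a similar range for small and large anchors.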

Building the masks is simpler. Their definition was covered in the previous part, and the code is:

# generate two masks that weight the different parts of the final SSD loss
def gen_masks(cls_label, neg_weight=3.0, reg_weight=1.0):
    pos_mask = cls_label[:, 1]
    neg_mask = 1. - pos_mask
    pos_num = np.sum(pos_mask)
    neg_num = np.sum(neg_mask)

    if pos_num > 0:
        pos_mask = pos_mask / pos_num
    if neg_num > 0:
        neg_mask = neg_mask / neg_num * neg_weight

    return pos_mask + neg_mask, pos_mask * reg_weight
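A small worked example makes the weighting concrete. The sketch below copies `gen_masks` and applies it to a made-up one-hot label array with 2 positive anchors out of 10. The column layout (column 0 background, column 1 vehicle) is inferred from `cls_label[:, 1]` being used as the positive mask.

```python
import numpy as np


def gen_masks(cls_label, neg_weight=3.0, reg_weight=1.0):
    # Same computation as the function above.
    pos_mask = cls_label[:, 1]
    neg_mask = 1. - pos_mask
    pos_num = np.sum(pos_mask)
    neg_num = np.sum(neg_mask)
    if pos_num > 0:
        pos_mask = pos_mask / pos_num
    if neg_num > 0:
        neg_mask = neg_mask / neg_num * neg_weight
    return pos_mask + neg_mask, pos_mask * reg_weight


# Illustrative labels: 10 anchors, anchors 2 and 7 are positive.
cls_label = np.zeros([10, 2], dtype=np.float32)
cls_label[:, 0] = 1.0
cls_label[[2, 7], :] = [0.0, 1.0]

cls_mask, reg_mask = gen_masks(cls_label)
print(cls_mask)  # 0.5 at the positives, 0.375 (= 3 / 8) elsewhere
print(reg_mask)  # 0.5 at the positives, 0 elsewhere
```

Because each group is normalized by its own count, the positives always contribute a total weight of 1 and the negatives a total weight of `neg_weight`, however many anchors fall in each group.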
- (3) Note: when several annotated bounding boxes each have an IOU above the threshold with the same anchor, only the annotation with the largest IOU is kept.
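Steps (2) and (3) can be sketched as follows. This is not the author's `gen_labels`: `iou_xywh` and `assign_labels` are hypothetical helpers, the boxes are assumed to be in centre `[x, y, w, h]` format, and the 0.5 threshold is an assumed value, since the text only says "a certain threshold".

```python
import numpy as np


def iou_xywh(box, anchors):
    """IoU between one box and N anchors, all as centre-format [x, y, w, h]."""
    b_x1, b_y1 = box[0] - box[2] / 2, box[1] - box[3] / 2
    b_x2, b_y2 = box[0] + box[2] / 2, box[1] + box[3] / 2
    a_x1 = anchors[:, 0] - anchors[:, 2] / 2
    a_y1 = anchors[:, 1] - anchors[:, 3] / 2
    a_x2 = anchors[:, 0] + anchors[:, 2] / 2
    a_y2 = anchors[:, 1] + anchors[:, 3] / 2
    iw = np.clip(np.minimum(b_x2, a_x2) - np.maximum(b_x1, a_x1), 0, None)
    ih = np.clip(np.minimum(b_y2, a_y2) - np.maximum(b_y1, a_y1), 0, None)
    inter = iw * ih
    union = box[2] * box[3] + anchors[:, 2] * anchors[:, 3] - inter
    return inter / union


def assign_labels(gt_boxes, anchors, iou_thresh=0.5):
    """Label each anchor; on conflicts keep the ground truth with the largest IoU."""
    n = anchors.shape[0]
    cls_label = np.zeros([n, 2], dtype=np.float32)
    cls_label[:, 0] = 1.0                      # start as background
    reg_label = np.zeros([n, 4], dtype=np.float32)
    best_iou = np.zeros(n)
    for box in gt_boxes:
        iou = iou_xywh(box, anchors)
        hit = (iou > iou_thresh) & (iou > best_iou)
        best_iou[hit] = iou[hit]
        cls_label[hit] = [0.0, 1.0]
        reg_label[hit, :2] = (box[:2] - anchors[hit, :2]) / anchors[hit, 2:]
        reg_label[hit, 2:] = np.log(box[2:] / anchors[hit, 2:])
    return cls_label, reg_label


# Illustrative data: both ground-truth boxes overlap anchor 0, the second slightly more.
anchors = np.array([[50., 50., 40., 40.], [200., 200., 40., 40.]])
gt_boxes = np.array([[52., 50., 40., 40.], [48., 50., 44., 40.]])
cls_label, reg_label = assign_labels(gt_boxes, anchors)
print(cls_label, reg_label)
```

Tracking `best_iou` per anchor means a later ground-truth box overwrites an earlier one only when it overlaps that anchor more, which is the rule stated in (3).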

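3. How to feed batch data

Batch feeding means automatically supplying the right images and their matching labels during training. The main factors to consider are batch_size, whether to shuffle, whether to apply data augmentation, and the proportion of each kind of augmentation.

For this purpose we define the following class: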

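import random

import cv2
import numpy as np

import readKITTI
import imAugment
from constants import all_anchors_num  # assumed import form; constants.py defines this value


class GenBatch: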
    def __init__(self, image_path, label_path,
                 batch_size, new_w, new_h, is_color=True, is_shuffle=True):
        self.image_path, self.label_path = image_path, label_path
        self.batch_size, self.new_w, self.new_h = batch_size, new_w, new_h
        self.is_color, self.is_shuffle = is_color, is_shuffle

        self.readPos = 0

        # read KITTI
        self.image_list = readKITTI.get_filelist(image_path, '.png')
        self.bbox_list = readKITTI.get_bboxlist(label_path, self.image_list)
        if len(self.image_list) > 0 and len(self.image_list) == len(self.bbox_list):
            print("The amount of images is %d" % (len(self.image_list)))

            self.initOK = True
            self.all_anchors = gen_anchors()

            # init the outputs
            self.batch_image = np.zeros([batch_size, new_h, new_w, 3 if self.is_color else 1], dtype=np.float32)
            self.batch_cls_label = np.zeros([batch_size * all_anchors_num, 2], dtype=np.float32)
            self.batch_reg_label = np.zeros([batch_size * all_anchors_num, 4], dtype=np.float32)
            self.batch_cls_mask = np.zeros([batch_size * all_anchors_num], dtype=np.float32)
            self.batch_reg_mask = np.zeros([batch_size * all_anchors_num], dtype=np.float32)
        else:
            print("The amount of images is %d, while the amount of "
                  "corresponding label is %d" % (len(self.image_list), len(self.bbox_list)))
            self.initOK = False

    # generate a new batch
    # mirror_ratio and crop_ratio are the probabilities of applying each augmentation;
    # the default of 0.0 means no augmentation
    # the masks for the final SSD loss are built by gen_masks with its default weights
    def nextbatch(self, mirror_ratio=0.0, crop_ratio=0.0):
        if not self.initOK:
            print("Initialization failed.")
            return []
        for i in range(self.batch_size):
            # when an epoch is completed, rewind and optionally reshuffle
            if self.readPos >= len(self.image_list):
                self.readPos = 0
                if self.is_shuffle:
                    # shuffle images and bounding boxes together so the pairs stay aligned
                    paired = list(zip(self.image_list, self.bbox_list))
                    random.shuffle(paired)
                    self.image_list, self.bbox_list = map(list, zip(*paired))
                    print('Shuffle the data successfully.\n')

            img = cv2.imread(self.image_path + self.image_list[self.readPos])

            bbox = self.bbox_list[self.readPos]

            self.readPos += 1

            # randomly crop under a specified probability
            if crop_ratio > 0 and random.random() < crop_ratio:
                img, bbox = imAugment.imcrop(img, bbox, min(self.new_w, self.new_h))

            # check the input image's size and color
            img, bbox = imAugment.imresize(img, bbox, self.new_w, self.new_h, self.is_color)

            # horizontally flip the input image under a specified probability
            if mirror_ratio > 0 and random.random() < mirror_ratio:
                img, bbox = imAugment.immirror(img, bbox)

            # generate processed labels
            cls_label, reg_label = gen_labels(bbox, self.all_anchors)

            # generate masks
            cls_mask, reg_mask = gen_masks(cls_label)

            self.batch_image[i, :, :, :] = img.astype(np.float32)
            self.batch_cls_label[i*all_anchors_num:(i+1)*all_anchors_num, :] = cls_label
            self.batch_reg_label[i*all_anchors_num:(i+1)*all_anchors_num, :] = reg_label
            self.batch_cls_mask[i*all_anchors_num:(i+1)*all_anchors_num] = cls_mask
            self.batch_reg_mask[i*all_anchors_num:(i+1)*all_anchors_num] = reg_mask

        return self.batch_image, self.batch_cls_label, self.batch_reg_label, self.batch_cls_mask, self.batch_reg_mask
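The bookkeeping in `nextbatch` (a read position that wraps at the end of an epoch, plus a shuffle that keeps images and labels aligned) can be tried without KITTI data or OpenCV. `PairedFeeder` below is a hypothetical stand-in written for illustration, not part of the original code. It shuffles one index permutation and applies it to both lists, which is one of several ways to keep the two lists in step.

```python
import random


class PairedFeeder:
    """Hypothetical stand-in for GenBatch's epoch and shuffle bookkeeping (no image I/O)."""

    def __init__(self, samples, labels, batch_size, is_shuffle=True, seed=None):
        assert len(samples) == len(labels)
        self.samples, self.labels = list(samples), list(labels)
        self.batch_size = batch_size
        self.is_shuffle = is_shuffle
        self.read_pos = 0
        self.rng = random.Random(seed)

    def next_batch(self):
        batch = []
        for _ in range(self.batch_size):
            # When an epoch is completed, rewind and optionally reshuffle.
            if self.read_pos >= len(self.samples):
                self.read_pos = 0
                if self.is_shuffle:
                    order = list(range(len(self.samples)))
                    self.rng.shuffle(order)
                    # Apply the same permutation to both lists.
                    self.samples = [self.samples[i] for i in order]
                    self.labels = [self.labels[i] for i in order]
            batch.append((self.samples[self.read_pos], self.labels[self.read_pos]))
            self.read_pos += 1
        return batch


names = ['a.png', 'b.png', 'c.png', 'd.png', 'e.png']   # illustrative file names
labels = [0, 1, 2, 3, 4]                                # illustrative labels
feeder = PairedFeeder(names, labels, batch_size=2, seed=0)

# Five batches of two samples cover exactly two epochs of five samples.
seen = [pair for _ in range(5) for pair in feeder.next_batch()]
print(seen)
```

A batch may straddle an epoch boundary, as the third batch does here. The first epoch is read in the original order, and shuffling only starts once the read position wraps.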