SSD Vehicle Detection Based on TensorFlow - Part 3
Published: 2019-01-05
The Baidu Cloud link keeps getting taken down; if you really need the files, send me an email at [email protected].
This blog series documents my learning of TensorFlow and Python. Since I am new to this, please do not hesitate to point out any mistakes.
Google Drive:
III. Label preparation and batch data feeding
This part covers the following three topics:
- some constants used for anchor generation;
- how to generate, from the raw annotated bounding boxes, the labels and masks needed to compute the loss;
- how to feed training data in batches during the training stage, including shuffling and related operations.
1. Constants for anchor generation
Several anchor-related constants are defined in constants.py:
# coding=utf-8
# to pre-define some constant variables

# sizes of the feature maps of the 6 prediction branches of the SSD network
feature_size = [38, 19, 10, 5, 3, 1]
# 300 / feature_size: the stride of one feature-map pixel in the original image
anchor_steps = [8, 16, 30, 60, 100, 300]
# number of anchor types per prediction branch. Note: the SSD paper uses [4 6 6 6 4 4],
# but resizing the KITTI images produces many more small objects, so the number of anchor
# types of the first branch is raised from 4 to 6 to improve small-object detection.
anchors_num = [6, 6, 6, 6, 4, 4]
# accordingly, the total number of anchors rises from 8732 (in the paper) to 11620
all_anchors_num = 11620
# aspect ratios of the anchors used by the 6 branches; note that each branch has
# two 1:1 anchors of different sizes
anchors_ratio = [[1, 1, 2, 0.5, 3, 1./3],
                 [1, 1, 2, 0.5, 3, 1./3],
                 [1, 1, 2, 0.5, 3, 1./3],
                 [1, 1, 2, 0.5, 3, 1./3],
                 [1, 1, 2, 0.5],
                 [1, 1, 2, 0.5]]
# anchor scales designed following the paper: smallest 0.07, largest 0.87, linearly spaced
# in between, so the 6 scales as a fraction of the original image are [0.07 0.23 ... 0.87];
# in addition, the 1:1 anchors get a second, slightly larger size
# the first:  ratio=1, sqrt(S_k * S_(k+1))
# the second: S_k = 0.07 + (k-1)*(0.87-0.07)/(6-1), k=1...6
"""anchors_scales = [[0.13, 0.07],
                     [0.30, 0.23],
                     [0.46, 0.39],
                     [0.62, 0.55],
                     [0.79, 0.71],
                     [0.95, 0.87]]"""
# 300 * anchors_scales
anchors_size = [[39, 21],
                [90, 69],
                [138, 108],
                [186, 165],
                [237, 213],
                [285, 261]]
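To make the relationship between these constants concrete, below is a rough sketch of how they could be expanded into the 11620 x 4 anchor array. This is my own illustration, not the gen_anchors function from genBatch.py that is used later; it assumes [x, y, w, h] anchors with (x, y) being the anchor centre in the 300x300 input image, and that the first 1:1 anchor of each branch takes the larger of the two sizes.

import numpy as np

from constants import anchor_steps, anchors_ratio, anchors_size, feature_size


def sketch_gen_anchors():
    """Expand the constants into an [11620, 4] array of [x, y, w, h] anchors."""
    anchors = []
    for k, fsize in enumerate(feature_size):
        for i in range(fsize):
            for j in range(fsize):
                cx = (j + 0.5) * anchor_steps[k]   # anchor centre in the original image
                cy = (i + 0.5) * anchor_steps[k]
                for a, ratio in enumerate(anchors_ratio[k]):
                    # the first 1:1 anchor uses the larger size, all others the smaller one
                    base = anchors_size[k][0] if a == 0 else anchors_size[k][1]
                    w = base * np.sqrt(ratio)
                    h = base / np.sqrt(ratio)
                    anchors.append([cx, cy, w, h])
    return np.asarray(anchors, dtype=np.float32)


# sanity check: 38*38*6 + 19*19*6 + 10*10*6 + 5*5*6 + 3*3*4 + 1*1*4 = 11620
print(sketch_gen_anchors().shape)  # (11620, 4)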
2. How to generate the labels and masks
My way of generating the labels is fairly brute-force:
- (1) First, the gen_anchors function in genBatch.py generates all possible anchors, an array of shape 11620 x 4 (coordinate format [x y w h]);
- (2) Then the gen_labels function in genBatch.py loops over every annotated vehicle bounding box: each bounding box is compared against all anchors by IoU, and for every anchor whose IoU exceeds a threshold, that anchor's class label is set to 1 and the corresponding bounding box offset is computed as

  offset_x = (x_G - x_D) / w_D,  offset_y = (y_G - y_D) / h_D,
  offset_w = log(w_G / w_D),     offset_h = log(h_G / h_D),

  where the subscript G denotes the ground-truth box and D the default anchor.
The corresponding function is:
# genBatch.py (excerpt)
import numpy as np


# compute the normalized offset between boxG (ground truth) and boxD (default anchor),
# both given in [x, y, w, h] format
def compute_offset(boxG, boxD):
    offset = np.zeros([1, 4])
    # offset_x, offset_y
    offset[0, :2] = [(boxG[0] - boxD[0]) / boxD[2], (boxG[1] - boxD[1]) / boxD[3]]
    # offset_w, offset_h
    offset[0, 2:] = np.log([boxG[2] / boxD[2], boxG[3] / boxD[3]])
    return offset
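As a quick illustration of how these offsets behave, the snippet below encodes a made-up ground-truth box against one anchor and decodes it back. decode_offset is my own helper for this example and is not part of genBatch.py.

import numpy as np


def decode_offset(offset, boxD):
    # inverse of compute_offset: recover the [x, y, w, h] box from the normalized offset
    x = offset[0, 0] * boxD[2] + boxD[0]
    y = offset[0, 1] * boxD[3] + boxD[1]
    w = boxD[2] * np.exp(offset[0, 2])
    h = boxD[3] * np.exp(offset[0, 3])
    return [x, y, w, h]


boxG = [150.0, 120.0, 60.0, 40.0]   # made-up ground-truth box [x, y, w, h]
boxD = [144.0, 112.0, 69.0, 69.0]   # one of the default anchors
off = compute_offset(boxG, boxD)    # the regression target stored in reg_label
print(off)                          # approximately [[ 0.087  0.116 -0.140 -0.545]]
print(decode_offset(off, boxD))     # ~= boxG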
Building the masks is much simpler; their exact definition was already introduced in the previous part, and the corresponding code is:
# generate two masks that weight the different parts of the final SSD loss:
# the first weights the classification term, the second the box-regression term
def gen_masks(cls_label, neg_weight=3.0, reg_weight=1.0):
    pos_mask = cls_label[:, 1]          # column 1 marks the positive anchors
    neg_mask = 1. - pos_mask
    pos_num = np.sum(pos_mask)
    neg_num = np.sum(neg_mask)
    if pos_num > 0:
        pos_mask = pos_mask / pos_num   # normalize by the number of positives
    if neg_num > 0:
        neg_mask = neg_mask / neg_num * neg_weight
    return pos_mask + neg_mask, pos_mask * reg_weight
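A tiny toy example (made-up labels: five anchors, two of them positive) shows what the masks look like. Each positive anchor gets weight 1/2 and each negative 3/3 = 1, so the negatives carry roughly three times the total weight of the positives in the classification term, while only positives contribute to the regression term.

import numpy as np

cls_label = np.array([[1, 0],
                      [0, 1],
                      [1, 0],
                      [0, 1],
                      [1, 0]], dtype=np.float32)  # column 1 marks the positive anchors

cls_mask, reg_mask = gen_masks(cls_label)
print(cls_mask)  # [1.  0.5 1.  0.5 1. ]
print(reg_mask)  # [0.  0.5 0.  0.5 0. ]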
- (3) Note that when several annotated bounding boxes have an IoU above the threshold with the same anchor, only the annotation with the largest IoU is kept (a sketch of the whole labelling procedure follows below).
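Putting steps (1) to (3) together, a minimal sketch of the labelling routine could look like the following. This is my own reconstruction of the idea rather than the actual gen_labels in genBatch.py; compute_iou is a hypothetical helper returning the IoU between one ground-truth box and every anchor, and the 0.5 threshold is just a placeholder.

import numpy as np

from constants import all_anchors_num


def sketch_gen_labels(bboxes, all_anchors, iou_thresh=0.5):
    cls_label = np.zeros([all_anchors_num, 2], dtype=np.float32)
    cls_label[:, 0] = 1.0                            # every anchor starts as background
    reg_label = np.zeros([all_anchors_num, 4], dtype=np.float32)
    best_iou = np.zeros([all_anchors_num], dtype=np.float32)
    for boxG in bboxes:                              # each annotated vehicle box
        iou = compute_iou(boxG, all_anchors)         # hypothetical helper, shape [all_anchors_num]
        # anchors that exceed the threshold AND match this box better than any earlier box
        update = (iou > iou_thresh) & (iou > best_iou)
        best_iou[update] = iou[update]
        cls_label[update, 0] = 0.0                   # no longer background
        cls_label[update, 1] = 1.0                   # positive anchor
        for idx in np.where(update)[0]:
            reg_label[idx, :] = compute_offset(boxG, all_anchors[idx]).ravel()
    return cls_label, reg_label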
3. How to supply batch data
Batch feeding is about automatically providing the training procedure with the right images and their corresponding labels. The main factors to take into account are the batch_size, whether to shuffle, whether to apply data augmentation, and the probability of each kind of augmentation.
For this purpose, we define the following class:
# genBatch.py (excerpt): gen_anchors, gen_labels and gen_masks are defined in the same file
import random

import cv2
import numpy as np

import imAugment
import readKITTI
from constants import all_anchors_num


class GenBatch:
    def __init__(self, image_path, label_path,
                 batch_size, new_w, new_h, is_color=True, is_shuffle=True):
        self.image_path, self.label_path = image_path, label_path
        self.batch_size, self.new_w, self.new_h, self.is_color, self.is_shuffle = \
            batch_size, new_w, new_h, is_color, is_shuffle
        self.readPos = 0
        # read the KITTI file list and the corresponding bounding boxes
        self.image_list = readKITTI.get_filelist(image_path, '.png')
        self.bbox_list = readKITTI.get_bboxlist(label_path, self.image_list)
        if len(self.image_list) > 0 and len(self.image_list) == len(self.bbox_list):
            print("The amount of images is %d" % (len(self.image_list)))
            self.initOK = True
            self.all_anchors = gen_anchors()
            # init the outputs
            self.batch_image = np.zeros([batch_size, new_h, new_w, 3 if self.is_color else 1], dtype=np.float32)
            self.batch_cls_label = np.zeros([batch_size * all_anchors_num, 2], dtype=np.float32)
            self.batch_reg_label = np.zeros([batch_size * all_anchors_num, 4], dtype=np.float32)
            self.batch_cls_mask = np.zeros([batch_size * all_anchors_num], dtype=np.float32)
            self.batch_reg_mask = np.zeros([batch_size * all_anchors_num], dtype=np.float32)
        else:
            print("The amount of images is %d, while the amount of "
                  "corresponding labels is %d" % (len(self.image_list), len(self.bbox_list)))
            self.initOK = False

    # generate a new batch
    # mirror_ratio and crop_ratio control the image augmentation;
    # their default value of 0.0 disables augmentation.
    # The masks that weight the different parts of the final SSD loss
    # are produced by gen_masks (see above).
    def nextbatch(self, mirror_ratio=0.0, crop_ratio=0.0):
        if self.initOK is False:
            print("No successful initialization!")
            return []
        for i in range(self.batch_size):
            # if an epoch is completed, restart from the beginning (and shuffle)
            if self.readPos >= len(self.image_list) - 1:
                self.readPos = 0
                if self.is_shuffle is True:
                    # shuffle images and bounding boxes with the same seed so they stay aligned
                    r_seed = random.random()
                    random.seed(r_seed)
                    random.shuffle(self.image_list)
                    random.seed(r_seed)
                    random.shuffle(self.bbox_list)
                    print('Shuffled the data successfully.\n')
            img = cv2.imread(self.image_path + self.image_list[self.readPos])
            bbox = self.bbox_list[self.readPos]
            self.readPos += 1
            # randomly crop with a specified probability
            if crop_ratio > 0 and random.random() < crop_ratio:
                img, bbox = imAugment.imcrop(img, bbox, min(self.new_w, self.new_h))
            # check the input image's size and color, and resize it
            img, bbox = imAugment.imresize(img, bbox, self.new_w, self.new_h, self.is_color)
            # horizontally flip the input image with a specified probability
            if mirror_ratio > 0 and random.random() < mirror_ratio:
                img, bbox = imAugment.immirror(img, bbox)
            # generate the processed labels
            cls_label, reg_label = gen_labels(bbox, self.all_anchors)
            # generate the masks
            cls_mask, reg_mask = gen_masks(cls_label)
            self.batch_image[i, :, :, :] = img.astype(np.float32)
            self.batch_cls_label[i*all_anchors_num:(i+1)*all_anchors_num, :] = cls_label
            self.batch_reg_label[i*all_anchors_num:(i+1)*all_anchors_num, :] = reg_label
            self.batch_cls_mask[i*all_anchors_num:(i+1)*all_anchors_num] = cls_mask
            self.batch_reg_mask[i*all_anchors_num:(i+1)*all_anchors_num] = reg_mask
        return self.batch_image, self.batch_cls_label, self.batch_reg_label, self.batch_cls_mask, self.batch_reg_mask
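Finally, a short usage sketch of the class (the paths, batch size and augmentation probabilities below are placeholders, not values from this post):

batcher = GenBatch('./KITTI/training/image_2/', './KITTI/training/label_2/',
                   batch_size=8, new_w=300, new_h=300)
if batcher.initOK:
    images, cls_label, reg_label, cls_mask, reg_mask = \
        batcher.nextbatch(mirror_ratio=0.5, crop_ratio=0.3)
    print(images.shape)     # (8, 300, 300, 3)
    print(cls_label.shape)  # (8 * 11620, 2) = (92960, 2)
    print(cls_mask.shape)   # (92960,)

The five returned arrays (images, classification and regression labels, and the two masks) are exactly what gets fed into the SSD network and its loss during training.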