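[FPN Vehicle Object Detection] Getting the Dataset, Plus a Hands-On Test of Setting Up and Running the Faster-RCNN + FPN Code on Windows 7 + TensorFlow
PS: I have been learning object detection lately and wanted to try the latest FPN network. I happened to come across this blog post, https://blog.csdn.net/Angela_qin/article/details/80944604, and tried to reproduce it, explaining things in more beginner-friendly terms.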
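1. Getting the dataset
The blogger only says the task is vehicle detection and does not say where to get the dataset. In the code I found the path setting E:/study_materials/ECCV Vision Meets Drones Challenge/datasets/carData/carData/. I searched Baidu for "ECCV Vision Meets Drones Challenge" and, sure enough, the data is available.
URL: http://www.aiskyeye.com/views/getInfo?loc=2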
2. Installing Miniconda + PyCharm
You can look this up yourself or read my earlier posts: https://blog.csdn.net/qq_36401512/article/details/84583552 and https://blog.csdn.net/qq_36401512/article/details/84580625 (both were written for CentOS 7, so adapt the steps to Windows).
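3. Download the source code from https://github.com/yangxue0827/FPN_Tensorflow
and modify it following the blog post.
1. In FPN_Tensorflow-master\libs\label_name_dict\label_dict.py, change the top of the file, from the line from libs.configs import cfgs downward, to: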
# -*- coding: utf-8 -*-
from __future__ import division, print_function, absolute_import

from libs.configs import cfgs

if cfgs.DATASET_NAME == 'car':
    NAME_LABEL_MAP = {
        'back_ground': 0,
        "car": 1
    }
elif cfgs.DATASET_NAME == 'ship':
    NAME_LABEL_MAP = {
        'back_ground': 0,
        "ship": 1
    }
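Later stages usually need the reverse lookup as well, from label id back to class name. A minimal sketch, assuming a dictionary shaped like the one above; invert_label_map is a hypothetical helper name for illustration and is not necessarily what the repository itself uses:

```python
# Hypothetical helper: build the id -> name lookup from a name -> id map.
NAME_LABEL_MAP = {'back_ground': 0, 'car': 1}


def invert_label_map(name_label_map):
    """Return a dict mapping each label id back to its class name."""
    label_name_map = {}
    for name, label in name_label_map.items():
        # Two names sharing one id would make the reverse lookup ambiguous.
        if label in label_name_map:
            raise ValueError('duplicate label id: {}'.format(label))
        label_name_map[label] = name
    return label_name_map


LABEL_NAME_MAP = invert_label_map(NAME_LABEL_MAP)
print(LABEL_NAME_MAP)
```

Failing on duplicate ids catches a copy-paste mistake in the dictionary before training starts rather than after.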
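2. In FPN_Tensorflow-master\data\io\convert_data_to_tfrecord.py, compared with the blog author's version I added
    if gtbox_label.shape[0] == 0:
        continue
so that images without any vehicle annotation are skipped. The paths still have to be changed to your own, of course. The full file: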
# -*- coding: utf-8 -*-
from __future__ import division, print_function, absolute_import
import sys
sys.path.append('../../')
import xml.etree.cElementTree as ET
import numpy as np
import tensorflow as tf
import glob
import cv2
from help_utils.tools import *
from libs.label_name_dict.label_dict import *
import os
VOC_dir = 'E:/DcmData/xlc/VisDrone2018/VisDrone2018-DET-train/'
txt_dir = 'annotations'
img_dir = 'images'
save_name = 'train'
save_dir = 'D:/Documents and Settings/Administrator/Desktop/ATP/FPN_Tensorflow-master/data/tfrecords/'
img_format = '.jpg'
dataset = 'car'
# FLAGS = tf.app.FLAGS


def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def convert_pascal_to_tfrecord():
    save_path = save_dir + dataset + '_' + save_name + '.tfrecord'
    mkdir(save_dir)
    label_dir = VOC_dir + txt_dir
    image_dir = VOC_dir + img_dir
    writer = tf.python_io.TFRecordWriter(path=save_path)
    for count, fn in enumerate(os.listdir(image_dir)):
        # Keep only every 4th image to shrink the training set.
        if ((count + 1) % 4) != 0:
            continue
        else:
            print(count + 1)
        image_fp = os.path.join(image_dir, fn)
        image_fp = image_fp.replace('\\', '/')
        label_fp = os.path.join(label_dir, fn.replace('.jpg', '.txt'))
        # print('label_fp:', label_fp)
        img_name = str.encode(fn)
        if not os.path.exists(label_fp):
            print('{} is not exist!'.format(label_fp))
            continue
        # img = np.array(Image.open(img_path))
        img = cv2.imread(image_fp)
        sizeImg = img.shape
        img_height = sizeImg[0]
        img_width = sizeImg[1]
        boxes = []
        with open(label_fp, 'r') as f:
            for line in f.readlines():
                # strip() removes leading/trailing whitespace (spaces, newlines) by default.
                line = line.strip().split(',')
                # print('line:', line)
                if line[4] != '0':
                    # print(line)
                    try:
                        line = [int(i) for i in line]
                    except ValueError:
                        # list.pop() drops the last field when it cannot be parsed
                        # as an int (e.g. an empty field left by a trailing comma).
                        line.pop()
                        # Convert the remaining string fields to numbers.
                        line = [round(float(i)) for i in line]
                    # print('line', line)
                    # Target layout: xmin, ymin, xmax, ymax, label
                    # Raw annotation: xmin, ymin, box_width, box_height, score, category, truncation, occlusion
                    if line[4] == 1 and line[5] == 4:
                        boxes.append([line[0], line[1], line[0] + line[2], line[1] + line[3], 1])
        gtbox_label = np.array(boxes, dtype=np.int32)  # [x1, y1, x2, y2, label]
        if gtbox_label.shape[0] == 0:
            continue
        xmin, ymin, xmax, ymax, label = (gtbox_label[:, 0], gtbox_label[:, 1], gtbox_label[:, 2],
                                         gtbox_label[:, 3], gtbox_label[:, 4])
        gtbox_label = np.transpose(
            np.stack([ymin, xmin, ymax, xmax, label], axis=0))  # [ymin, xmin, ymax, xmax, label]
        feature = tf.train.Features(feature={
            # maybe do not need encode() in linux
            'img_name': _bytes_feature(img_name),
            'img_height': _int64_feature(img_height),
            'img_width': _int64_feature(img_width),
            'img': _bytes_feature(img.tostring()),
            'gtboxes_and_label': _bytes_feature(gtbox_label.tostring()),
            'num_objects': _int64_feature(gtbox_label.shape[0])
        })
        example = tf.train.Example(features=feature)
        writer.write(example.SerializeToString())
        # view_bar('Conversion progress', count + 1, len(glob.glob(image_dir + '/*.jpg')))
    # Close the writer so everything is flushed to disk.
    writer.close()
    print('\nConversion is complete!')


if __name__ == '__main__':
    # xml_path = '../data/dataset/VOCdevkit/VOC2007/Annotations/000005.xml'
    # read_xml_gtbox_and_label(xml_path)
    convert_pascal_to_tfrecord()
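VOC_dir: where the dataset is stored
txt_dir: name of the folder under VOC_dir that holds the annotation files
img_dir: name of the folder under VOC_dir that holds the images
save_name: suffix setting; selects whether the output is saved as car_train.tfrecord or car_test.tfrecord
save_dir: where the data is saved after conversion to tfrecord format
dataset: change to car
Following his blog post, run the script to generate car_train.tfrecord and car_test.tfrecord under ATP\FPN_Tensorflow-master\data\tfrecords (once with save_name = 'train' and once with save_name = 'test', pointing VOC_dir at the matching split).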
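The per-line filtering in the conversion script can be hard to follow, so here is a minimal standalone sketch of the same idea. It assumes the annotation layout noted in the script's comments (xmin, ymin, box_width, box_height, score, category, truncation, occlusion) and the same filter the script applies (score 1, category 4); parse_visdrone_line is a hypothetical helper name, not part of the repository:

```python
def parse_visdrone_line(line, wanted_category=4):
    """Parse one annotation line into [xmin, ymin, xmax, ymax, 1].

    Returns None when the line is malformed or the box should be skipped.
    """
    # Drop empty fields, e.g. the one left behind by a trailing comma.
    fields = [f for f in line.strip().split(',') if f != '']
    if len(fields) < 6:
        return None
    xmin, ymin, width, height, score, category = (
        int(round(float(f))) for f in fields[:6])
    # Same filter as the script above: score must be 1, category must match.
    if score != 1 or category != wanted_category:
        return None
    # Convert (x, y, w, h) to corner coordinates and attach label 1 ("car").
    return [xmin, ymin, xmin + width, ymin + height, 1]


print(parse_visdrone_line('10,20,30,40,1,4,0,0,'))
```

Filtering out empty fields up front avoids the try/except detour, because the trailing-comma case never reaches the integer conversion.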
3. Change FPN_Tensorflow-master\data\io\read_tfrecord.py as follows:
def next_batch(dataset_name, batch_size, shortside_len, is_training):
    if dataset_name not in ['nwpu', 'airplane', 'SSDD', 'ship', 'pascal', 'coco', 'car']:
        raise ValueError('dataSet name must be in pascal or coco')
    if is_training:
        # pattern = os.path.join('../data/tfrecords', dataset_name + '_train*')
        pattern = 'D:/Documents and Settings/Administrator/Desktop/ATP/FPN_Tensorflow-master/data/tfrecords/car_train.tfrecord'
    else:
        # pattern = os.path.join('../data/tfrecords', dataset_name + '_test.tfrecord')
        pattern = 'D:/Documents and Settings/Administrator/Desktop/ATP/FPN_Tensorflow-master/data/tfrecords/car_test.tfrecord'
    print('tfrecord path is -->', os.path.abspath(pattern))
    filename_tensorlist = tf.train.match_filenames_once(pattern)
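A wrong path here tends to fail late and confusingly, so it can help to check the pattern before building the graph. The sketch below assumes the data/tfrecords layout used above and builds the pattern from a project root instead of hardcoding the absolute path; tfrecord_pattern and check_tfrecords are hypothetical helper names:

```python
import glob
import os


def tfrecord_pattern(root, dataset_name, is_training):
    """Build the tfrecord glob pattern relative to a project root."""
    suffix = '_train*' if is_training else '_test.tfrecord'
    return os.path.join(root, 'data', 'tfrecords', dataset_name + suffix)


def check_tfrecords(root, dataset_name, is_training):
    """Return the matching files, raising early if there are none."""
    pattern = tfrecord_pattern(root, dataset_name, is_training)
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise IOError('no tfrecord matches {}'.format(pattern))
    return matches
```

Calling check_tfrecords(ROOT_PATH, 'car', True) before training would report a missing file immediately, with the exact pattern that was tried.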
4. Adjust FPN_Tensorflow-master\libs\configs\cfgs.py to match your own setup, for example:
# -*- coding: utf-8 -*-
from __future__ import division, print_function, absolute_import
import os
# root path
ROOT_PATH = os.path.abspath(r'D:\Documents and Settings\Administrator\Desktop\ATP\FPN_Tensorflow-master')
# pretrain weights path
TEST_SAVE_PATH = ROOT_PATH + '/tools/test_result'
INFERENCE_IMAGE_PATH = ROOT_PATH + '/tools/inference_image'
INFERENCE_SAVE_PATH = ROOT_PATH + '/tools/inference_result'
NET_NAME = 'resnet_v1_101'
#VERSION = 'v2_airplane'
VERSION = 'v1_car'
CLASS_NUM = 1
BASE_ANCHOR_SIZE_LIST = [15, 25, 40, 60, 80]
LEVEL = ['P2', 'P3', 'P4', 'P5', "P6"]
STRIDE = [4, 8, 16, 32, 64]
ANCHOR_SCALES = [1.]
ANCHOR_RATIOS = [1, 0.5, 2, 1 / 3., 3., 1.5, 1 / 1.5]
SCALE_FACTORS = [10., 10., 5., 5.]
OUTPUT_STRIDE = 16
SHORT_SIDE_LEN = 600
#DATASET_NAME = 'airplane'
DATASET_NAME = 'car'
BATCH_SIZE = 1
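To get a feel for what these anchor settings mean, the sketch below expands one base size into concrete anchor shapes. It assumes the common convention that a ratio is height / width and that the anchor area stays at base_size squared; the repository's own anchor code may use a different convention, and anchor_shapes is a hypothetical helper:

```python
import math

BASE_ANCHOR_SIZE_LIST = [15, 25, 40, 60, 80]
ANCHOR_SCALES = [1.]
ANCHOR_RATIOS = [1, 0.5, 2, 1 / 3., 3., 1.5, 1 / 1.5]


def anchor_shapes(base_size, scales, ratios):
    """Return (width, height) pairs, assuming ratio = height / width."""
    shapes = []
    for scale in scales:
        side = base_size * scale
        for ratio in ratios:
            # Keep the area at side ** 2 while changing the aspect ratio.
            width = side / math.sqrt(ratio)
            height = side * math.sqrt(ratio)
            shapes.append((width, height))
    return shapes


# One anchor per (scale, ratio) pair at every feature-map location.
anchors_per_location = len(ANCHOR_SCALES) * len(ANCHOR_RATIOS)
print('anchors per location:', anchors_per_location)
for level_size in BASE_ANCHOR_SIZE_LIST:
    shapes = anchor_shapes(level_size, ANCHOR_SCALES, ANCHOR_RATIOS)
    print(level_size, [(round(w, 1), round(h, 1)) for w, h in shapes])
```

With one scale and seven ratios, each location on each pyramid level gets seven anchors, and the base sizes 15 to 80 are small because vehicles in drone imagery cover few pixels.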
5. Open https://github.com/yangxue0827/FPN_Tensorflow and download the pretrained model.
In the Train section of the README,
step 2 reads: download pretrain weight(resnet_v1_101_2016_08_28.tar.gz or resnet_v1_50_2016_08_28.tar.gz) from here, then extract to folder $FPN_ROOT/data/pretrained_weights
Click to download resnet_v1_101_2016_08_28.tar.gz, extract it, and place the result at FPN_Tensorflow-master\data\pretrained_weights\resnet_v1_101.ckpt.
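On Windows, where no tar command may be at hand, the archive can also be unpacked from Python. A minimal sketch using only the standard library; extract_weights is a hypothetical helper, and the paths in the usage note are placeholders to adapt:

```python
import os
import tarfile


def extract_weights(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and list what landed there."""
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    with tarfile.open(archive_path, 'r:gz') as tar:
        tar.extractall(dest_dir)
    return sorted(os.listdir(dest_dir))
```

For example, extract_weights('resnet_v1_101_2016_08_28.tar.gz', 'data/pretrained_weights') run from the project root should leave resnet_v1_101.ckpt in that folder; the returned listing makes it easy to confirm.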
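6. Run FPN_Tensorflow-master\tools\train.py to train the model. (The related parameters can be changed in cfgs.py.)
7. Run FPN_Tensorflow-master\tools\test.py. It kept failing with Exception: test() missing 1 required positional argument: 'img_num', even though img_num=548 (the downloaded validation set has 548 images) was already written there. With no better idea, I changed it as follows: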
1. In the main block:
if __name__ == '__main__':
    # img_num = 548
    # test(img_num)
    test()
2. Change def test(img_num): to def test():
3. Move img_num = 548 into the test function:
    with tf.Session(config=config) as sess:
        sess.run(init_op)
        if restorer is not None:
            restorer.restore(sess, restore_ckpt)
            print('restore model')
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess, coord)
        img_num = 548  # added line
        for i in range(img_num):
            start = time.time()
4. Run it again and it works.
The results are in the test_result folder.
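A less invasive way to get the same effect is to give the parameter a default value, so the function works both with and without an argument. This is only a sketch of the signature change: run_test is a hypothetical stand-in for the real test function, and its loop body is a placeholder rather than the actual evaluation code:

```python
def run_test(img_num=548):
    """Stand-in for the real test loop; only the signature matters here."""
    processed = 0
    for _ in range(img_num):
        # The real function would run detection on one image here.
        processed += 1
    return processed


if __name__ == '__main__':
    print(run_test())      # falls back to the default of 548 images
    print(run_test(10))    # an explicit count still works
```

Keeping the parameter means the image count can still be overridden for a different validation set without editing the function body.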
8. To run the model on your own images, put them in the FPN_Tensorflow-master\tools\inference_image folder and run inference.py; the output appears in the inference_result folder.
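Because I trained on too little data, only 1/4 of the set (a bit over 6,400 images in total), and for only 38,000 iterations (batch size 1, roughly 24 passes over the data), the results are not great. Still, it shows the pipeline works.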
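The "roughly 24 passes" figure follows directly from the numbers above. A quick check, assuming the total is rounded to 6,400 images (the post only says "a bit over 6,400"); epochs_seen is a hypothetical helper:

```python
def epochs_seen(iterations, batch_size, num_images):
    """Number of passes over the data for a given iteration count."""
    return iterations * batch_size / float(num_images)


total_images = 6400              # approximate, per the text above
used_images = total_images // 4  # only every 4th image was converted
print(epochs_seen(38000, 1, used_images))
```

With 1,600 images actually used, 38,000 iterations at batch size 1 comes to just under 24 epochs.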
In addition, the author of the GitHub repository has since released an improved FPN network, so I plan to switch to the new code.
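PS: This attempt ended up taking a bit more than a day, with all sorts of silly problems along the way. The silliest one: another program was occupying the GPU memory while Lu Dashi (魯大師) showed GPU = 0%, so my program kept reporting that it was out of memory. At the same time, slim.get_or_create_global_step() happened to emit the warning "UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape", and I assumed that function was the culprit. I dug through all sorts of material, which only said the warning comes from tf.gather(…). In the end I found nvidia-smi.exe in C:\Program Files\NVIDIA Corporation\NVSMI, dragged it into cmd, pressed Enter, and only then saw that more than 90% of the GPU memory was already in use. All I can say is: Lu Dashi, could you be a bit more reliable?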
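To avoid repeating that hunt, GPU memory can be checked from Python before training starts. The sketch below calls nvidia-smi with its CSV query options and parses the result; it assumes nvidia-smi is on the PATH (otherwise pass the full path to the executable), and both function names are hypothetical:

```python
import subprocess


def parse_memory_csv(text):
    """Parse 'used, total' lines (MiB) into (used, total, fraction) tuples."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(part.strip()) for part in line.split(','))
        rows.append((used, total, used / float(total)))
    return rows


def query_gpu_memory(exe='nvidia-smi'):
    """Ask nvidia-smi for per-GPU memory usage, one tuple per GPU."""
    out = subprocess.check_output([
        exe,
        '--query-gpu=memory.used,memory.total',
        '--format=csv,noheader,nounits',
    ])
    return parse_memory_csv(out.decode('utf-8'))
```

Calling query_gpu_memory() at the top of train.py and printing a warning when the used fraction is high would have exposed the problem in seconds.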