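[TensorFlow tf Digging Log] Note 5: Implementing YOLOv3 in TensorFlow
YOLOv3 is the YOLO author's optimized version of the YOLO algorithm. Compared with the earlier versions, the network adds residual blocks to connect its layers, adopts a pyramid structure, and is far deeper, reaching 53 layers, which is why the author named the network DarkNet-53.
The author's coding skills are so strong that even the framework he uses appears to be one he wrote himself, so no version for other frameworks has been released so far. I implemented YOLOv3 according to my understanding of the author's YOLOv3 paper and all of the earlier YOLO papers. (If I have misunderstood the papers, corrections are welcome and much appreciated.)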
As usual, all settings can be changed through the corresponding options in the config folder.
Project code
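A brief introduction to YOLO
YOLO is short for "you only look once". As the name suggests, the whole prediction process looks at the image only once. This differs from earlier object localization and recognition projects, which typically looked at the image once to localize the target and a second time to classify it. Intuitively, then, YOLO should be more efficient than they are, since it saves one pass. In practice YOLO is indeed astonishingly fast: the author achieved real-time object localization and recognition on a Titan GPU, with the recognition video at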
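Project structure
This project consists mainly of 3 files and 2 folders. The utils folder contains 6 files, and the config folder contains the config.yml configuration file.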
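reader.py:
- Holds the methods that read the storage paths and files of the dataset and its labels. The mini_batch operation is performed while the file names are being read.
train.py:
- Combines the various tools into the code that trains the network.
eval.py:
- Runs the trained YOLOv3.
utils folder:
extract_labels.py:
- Its labels_normalizer() reads the labels from the label paths passed in and converts them into the format our network needs.
get_loss.py:
- Brings together the calculations for the 3 losses described by the YOLOv3 author and computes the loss of a batch.
IOU.py:
- Its IOU_calculator() method computes the IOU from the predicted values and the target's label values passed in.
net.py:
- Implements the core algorithm of YOLOv3, DarkNet-53.
read_config.py:
- Reads the configuration parameters from the config file.
select_things.py:
- As the name suggests, its methods implement selection, for example choosing the size of the YOLOv3 scale and choosing the check_point file that corresponds to that scale.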
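The mini-batch step mentioned for reader.py can be pictured as grouping a flat list of file names into fixed-size chunks. This is only a sketch of the idea, not the project's reader.py: the function name make_batches and the drop_last flag are hypothetical.

```python
def make_batches(filenames, batch_size, drop_last=False):
    """Group a flat list of file names into consecutive mini-batches.

    make_batches and drop_last are illustrative names, not part of the project.
    """
    batches = [filenames[i:i + batch_size]
               for i in range(0, len(filenames), batch_size)]
    # A placeholder with a fixed batch size cannot take a short final batch,
    # so the caller may choose to drop it.
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


names = ['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg', 'e.jpg']
full = make_batches(names, 2)
trimmed = make_batches(names, 2, drop_last=True)
```

Each element of the result is one batch of file names, which is the nesting that the batches_filenames argument of labels_normalizer() iterates over.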
Utils tools
extract_labels
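For the annotated data, my approach is to create an array whose shape matches the shape of the network's output array, and to assign each target's label to the array cell at the corresponding index.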
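In the labels_normalizer() method I create a map that converts every class into the index of its position in the array. I use the VOC2007 dataset here, which can be downloaded from the official site; the site states how many classes the dataset contains. If you switch datasets, remember to come back and update the contents of the map.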
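One way to keep such a map in step with the dataset is to build it from a list of class names. The sketch below assumes the layout used in this post, where positions 0 to 4 of each box hold x, y, width, height and objectness, so class indices start at 5. VOC_CLASSES and build_class_map are hypothetical names.

```python
# The 20 VOC class names, in the order used by the map in this post.
VOC_CLASSES = [
    'person', 'bird', 'cat', 'cow', 'dog', 'horse', 'sheep',
    'aeroplane', 'bicycle', 'boat', 'bus', 'car', 'motorbike', 'train',
    'bottle', 'chair', 'diningtable', 'pottedplant', 'sofa', 'tvmonitor',
]


def build_class_map(class_names, offset=5):
    """Map each class name to its index inside one box slot.

    offset=5 skips the x, y, width, height and objectness positions.
    """
    return {name: offset + i for i, name in enumerate(class_names)}


class_map = build_class_map(VOC_CLASSES)
```

Switching datasets then only means replacing the list of names.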
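Because VOC labels are stored as XML, I use the xml.dom.minidom library to parse the XML files. From each file I extract object_name, bdbox, xmin, ymin, xmax and ymax, pack them into tuples, append the tuples to a list, and return the list.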
from xml.dom.minidom import parse

def xml_extractor( xml_path ):
    DOMTree = parse( xml_path )
    collection = DOMTree.documentElement  # root node of the XML file
    file_name_xml = collection.getElementsByTagName( 'filename' )[0]
    objects_xml = collection.getElementsByTagName( 'object' )
    size_xml = collection.getElementsByTagName( 'size' )
    file_name = file_name_xml.childNodes[0].data
    # Image size as recorded in the annotation (kept as strings)
    for size in size_xml:
        width = size.getElementsByTagName( 'width' )[0]
        height = size.getElementsByTagName( 'height' )[0]
        width = width.childNodes[0].data
        height = height.childNodes[0].data
    objects = []
    for object_xml in objects_xml:
        object_name = object_xml.getElementsByTagName( 'name' )[0]
        bdbox = object_xml.getElementsByTagName( 'bndbox' )[0]
        xmin = bdbox.getElementsByTagName( 'xmin' )[0]
        ymin = bdbox.getElementsByTagName( 'ymin' )[0]
        xmax = bdbox.getElementsByTagName( 'xmax' )[0]
        ymax = bdbox.getElementsByTagName( 'ymax' )[0]
        # One tuple per object: ( name, xmin, ymin, xmax, ymax ), all strings
        obj = ( object_name.childNodes[0].data,
                xmin.childNodes[0].data,
                ymin.childNodes[0].data,
                xmax.childNodes[0].data,
                ymax.childNodes[0].data )
        objects.append( obj )
    return file_name, width, height, objects
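To see what the extraction returns without having the dataset on disk, the same minidom calls can be run on an inline string. The annotation below is a made-up sample in the VOC layout, and text_of and extract_objects are hypothetical helper names; this is a sketch, not the project's code.

```python
from xml.dom.minidom import parseString

# A made-up annotation in the VOC layout, for illustration only.
SAMPLE_XML = """<annotation>
  <filename>000001.jpg</filename>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""


def text_of(node, tag):
    """Return the text of the first descendant element named tag."""
    return node.getElementsByTagName(tag)[0].childNodes[0].data


def extract_objects(xml_text):
    root = parseString(xml_text).documentElement
    size = root.getElementsByTagName('size')[0]
    objects = []
    for obj in root.getElementsByTagName('object'):
        box = obj.getElementsByTagName('bndbox')[0]
        corners = tuple(text_of(box, t) for t in ('xmin', 'ymin', 'xmax', 'ymax'))
        objects.append((text_of(obj, 'name'),) + corners)
    return text_of(root, 'filename'), text_of(size, 'width'), text_of(size, 'height'), objects


result = extract_objects(SAMPLE_XML)
```

Every value comes back as a string, which is why the label code converts them with int() and float() before doing arithmetic.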
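Only in labels_normalizer() do I actually convert the extracted labels into arrays. One point deserves attention: when creating the new array, be sure to add 1e-8 (a number close to 0), so that the later IOU calculation never has a zero denominator and outputs nan. The VOC dataset annotates the corner coordinates of each target's diagonal, whereas we need the target's center point and the width and height of its bounding box, so a little arithmetic is required.
In addition, YOLOv3 detects a target with the box (grid cell) that contains the target's center, so we also need to work out which box the object falls in and assign the values at that index of the array. To avoid an index out of range, points on the rightmost and bottom borders are assigned to the previous box; strictly they belong to the next box, but that would overflow the array.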
import numpy as np

def labels_normalizer( batches_filenames, target_width, target_height, layerout_width, layerout_height ):
    # Index of each class inside one 25-value box slot (0-4 hold x, y, width, height, objectness)
    class_map = {
        'person' : 5,
        'bird' : 6,
        'cat' : 7,
        'cow' : 8,
        'dog' : 9,
        'horse' : 10,
        'sheep' : 11,
        'aeroplane' : 12,
        'bicycle' : 13,
        'boat' : 14,
        'bus' : 15,
        'car' : 16,
        'motorbike' : 17,
        'train' : 18,
        'bottle' : 19,
        'chair' : 20,
        'diningtable' : 21,
        'pottedplant': 22,
        'sofa' : 23,
        'tvmonitor' : 24
    }
    batches_labels = []
    for batch_filenames in batches_filenames:
        batch_labels = []
        for filename in batch_filenames:
            _, width, height, objects = xml_extractor( filename )
            width_proportion = target_width / int( width )
            height_proportion = target_height / int( height )
            # Add 1e-8 so that later IOU calculations on this data never divide by 0 and output nan.
            # The 255 channels match the net output; only the first 3 * 25 = 75 are filled here.
            label = np.add( np.zeros( [int( layerout_height ), int( layerout_width ), 255] ), 1e-8 )
            for obj in objects:
                class_label = class_map[obj[0]]
                xmin = float( obj[1] )
                ymin = float( obj[2] )
                xmax = float( obj[3] )
                ymax = float( obj[4] )
                x = ( 1.0 * xmax + xmin ) / 2 * width_proportion  # x of the target's center
                y = ( 1.0 * ymax + ymin ) / 2 * height_proportion  # y of the target's center
                bdbox_width = ( 1.0 * xmax - xmin ) * width_proportion  # width of the target's bounding box
                bdbox_height = ( 1.0 * ymax - ymin ) * height_proportion  # height of the target's bounding box
                flag_width = int( target_width ) / layerout_width  # image pixels per box along the horizontal axis
                flag_height = int( target_height ) / layerout_height  # image pixels per box along the vertical axis
                box_x = x // flag_width  # x index of the box that contains x
                box_y = y // flag_height  # y index of the box that contains y
                if box_x == layerout_width:  # a point on the right border goes to the last box (it would belong to the next one)
                    box_x -= 1
                if box_y == layerout_height:  # a point on the bottom border goes to the bottom box (it would belong to the next one)
                    box_y -= 1
                for i in range( 3 ):  # each box predicts 3 bdboxes
                    label[int( box_y ), int( box_x ), i * 25] = x  # point x
                    label[int( box_y ), int( box_x ), i * 25 + 1] = y  # point y
                    label[int( box_y ), int( box_x ), i * 25 + 2] = bdbox_width  # bdbox width
                    label[int( box_y ), int( box_x ), i * 25 + 3] = bdbox_height  # bdbox height
                    label[int( box_y ), int( box_x ), i * 25 + 4] = 1  # objectness
                    label[int( box_y ), int( box_x ), i * 25 + int( class_label )] = 0.9  # class label
            batch_labels.append( label )
        batches_labels.append( batch_labels )
    return batches_labels
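The corner-to-center conversion and the grid-cell lookup described for labels_normalizer() can be checked by hand on one box. The sketch below uses assumed numbers (a 500x375 image resized to 416x416 and a 13x13 output grid); to_center_box and cell_index are hypothetical helper names.

```python
def to_center_box(xmin, ymin, xmax, ymax, scale_x, scale_y):
    """Convert diagonal corners to (center x, center y, width, height) in the resized image."""
    x = (xmax + xmin) / 2 * scale_x
    y = (ymax + ymin) / 2 * scale_y
    w = (xmax - xmin) * scale_x
    h = (ymax - ymin) * scale_y
    return x, y, w, h


def cell_index(x, y, target_w, target_h, grid_w, grid_h):
    """Return the (column, row) of the grid cell that contains the point (x, y)."""
    col = int(x // (target_w / grid_w))
    row = int(y // (target_h / grid_h))
    # Points on the right or bottom border would index one cell too far, so clamp them.
    return min(col, grid_w - 1), min(row, grid_h - 1)


# Assumed example: original image 500x375, network input 416x416, grid 13x13.
x, y, w, h = to_center_box(48, 240, 195, 371, 416 / 500, 416 / 375)
cell = cell_index(x, y, 416, 416, 13, 13)
corner_cell = cell_index(416, 416, 416, 416, 13, 13)
```

With these numbers each cell covers 32 pixels, the center lands near (101, 339), and the box is assigned to column 3, row 10.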
get_loss
The mathematical expression of the total loss function:
λ_coord is set to 5 and λ_noobj to 0.5.
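For reference, the sum-of-squares loss from the original YOLO paper, which the two coefficients and the code in this section appear to follow, can be written as below. Treat it as an assumption about which expression is meant. Here S² is the number of grid cells, B the number of boxes per cell, C the objectness (confidence), p(c) the class probabilities, and the hatted symbols are the predictions.

```latex
\begin{aligned}
L ={}& \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}
       \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
     &+ \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}
       \left[ \left( \sqrt{w_i} - \sqrt{\hat{w}_i} \right)^2
            + \left( \sqrt{h_i} - \sqrt{\hat{h}_i} \right)^2 \right] \\
     &+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left( C_i - \hat{C}_i \right)^2
      + \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj}
        \left( C_i - \hat{C}_i \right)^2 \\
     &+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{obj} \sum_{c \in classes}
        \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

The first two lines are the location terms, the third is the objectness term with its separate weight for empty boxes, and the last is the class term.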
Code implementation:
import tensorflow as tf
from utils import IOU as get_IOU  # assumed import path: IOU.py lives in the utils folder

def calculate_loss( batch_inputs, batch_labels ):
    batch_loss = 0
    for image_num in range( batch_inputs.shape[0] ):
        for y in range( batch_inputs.shape[1] ):
            for x in range( batch_inputs.shape[2] ):
                for i in range( 3 ):  # 3 boxes per cell, 25 values per box
                    predict_x = batch_inputs[image_num][y][x][i * 25]
                    predict_y = batch_inputs[image_num][y][x][i * 25 + 1]
                    predict_width = batch_inputs[image_num][y][x][i * 25 + 2]
                    predict_height = batch_inputs[image_num][y][x][i * 25 + 3]
                    predict_objectness = batch_inputs[image_num][y][x][i * 25 + 4]
                    predict_class = batch_inputs[image_num][y][x][i * 25 + 5 : i * 25 + 5 + 20]
                    label_x = batch_labels[image_num][y][x][i * 25]
                    label_y = batch_labels[image_num][y][x][i * 25 + 1]
                    label_width = batch_labels[image_num][y][x][i * 25 + 2]
                    label_height = batch_labels[image_num][y][x][i * 25 + 3]
                    label_objectness = batch_labels[image_num][y][x][i * 25 + 4]
                    label_class = batch_labels[image_num][y][x][i * 25 + 5 : i * 25 + 5 + 20]
                    IOU = get_IOU.IOU_calculator( tf.cast( predict_x, tf.float32 ),
                                                  tf.cast( predict_y, tf.float32 ),
                                                  tf.cast( predict_width, tf.float32 ),
                                                  tf.cast( predict_height, tf.float32 ),
                                                  tf.cast( label_x, tf.float32 ),
                                                  tf.cast( label_y, tf.float32 ),
                                                  tf.cast( label_width, tf.float32 ),
                                                  tf.cast( label_height, tf.float32 ) )
                    # Sum the class, location and objectness losses of this box
                    loss = class_loss( predict_class,
                                       label_class ) + location_loss( predict_x,
                                                                      predict_y,
                                                                      predict_width,
                                                                      predict_height,
                                                                      label_x,
                                                                      label_y,
                                                                      label_width,
                                                                      label_height ) + objectness_loss( IOU, predict_objectness, label_objectness )
                    batch_loss += loss
    return batch_loss
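The index arithmetic in calculate_loss() follows a fixed layout: each grid cell holds 3 boxes, and each box occupies 25 consecutive values (x, y, width, height, objectness, then 20 class scores). The sketch below splits one cell vector along those lines; split_cell is a hypothetical helper, and a plain list stands in for the tensor.

```python
def split_cell(cell, boxes_per_cell=3, slot=25):
    """Split one grid-cell vector into its per-box fields."""
    boxes = []
    for i in range(boxes_per_cell):
        base = i * slot  # first index of box i
        boxes.append({
            'x': cell[base],
            'y': cell[base + 1],
            'width': cell[base + 2],
            'height': cell[base + 3],
            'objectness': cell[base + 4],
            'classes': cell[base + 5:base + slot],  # the 20 class scores
        })
    return boxes


# Use the index itself as the value so the layout is easy to read off.
boxes = split_cell(list(range(75)))
```

Reading boxes[1]['x'] returns 25, confirming that the second box starts right after the first box's 25 values.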
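The loss function for the IOU:
Code implementation:
def objectness_loss( input, switch, l_switch, alpha = 0.5 ):
    '''
    Calculate the objectness loss
    :param input: input IOU
    :param switch: 1 if a target is in this box, else 1e-8
    :param l_switch: 1 if a target is in this box, else 0
    :param alpha: weight applied when l_switch is below 1
    :return: objectness_loss
    '''
    IOU_loss = tf.square( l_switch - input * switch )
    loss_max = tf.square( l_switch * 0.5 - input * switch )
    # Errors inside the accepted range are replaced by 1e-8
    IOU_loss = tf.cond( IOU_loss < loss_max, lambda : tf.cast( 1e-8, tf.float32 ), lambda : IOU_loss )
    # Boxes without a target are weighted by alpha
    IOU_loss = tf.cond( l_switch < 1, lambda : IOU_loss * alpha, lambda : IOU_loss )
    return IOU_loss
The author says that an IOU error of 0.5 is within the acceptable range this time, so I added conditionals to the objectness_loss method that set every IOU_loss whose error is below 0.5 to 1e-8 (a number very close to 0). The author also wants boxes that contain no target point to predict an IOU of 0; I represent that 0 with 1e-8.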
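A plain-Python mirror of the same branching, with scalars in place of tensors, makes it easy to check the behaviour by hand. The name objectness_loss_scalar is hypothetical and not part of the project. One thing the arithmetic shows: with l_switch = 1, the first comparison only fires when input * switch is above 0.75, because (1 - p)² < (0.5 - p)² reduces to p > 0.75.

```python
def objectness_loss_scalar(iou, switch, l_switch, alpha=0.5):
    """Scalar version of the objectness loss branching, for checking values by hand."""
    iou_loss = (l_switch - iou * switch) ** 2
    loss_max = (l_switch * 0.5 - iou * switch) ** 2
    if iou_loss < loss_max:   # error inside the accepted range
        iou_loss = 1e-8
    if l_switch < 1:          # box without a target: down-weight by alpha
        iou_loss *= alpha
    return iou_loss


good_match = objectness_loss_scalar(0.9, 1.0, 1.0)   # 0.9 > 0.75, so the loss is clamped
poor_match = objectness_loss_scalar(0.2, 1.0, 1.0)   # stays at (1 - 0.2) ** 2
empty_box = objectness_loss_scalar(0.4, 0.5, 0.0)    # (0 - 0.2) ** 2, then halved
```

The three calls cover the clamped case, the unclamped case and the down-weighted empty-box case.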
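The loss function for the class:
Code implementation:
def class_loss( inputs, labels ):
    # Sum of squared errors over the 20 class scores
    classloss = tf.square( labels - inputs )
    loss_sum = tf.reduce_sum( classloss )
    return loss_sum
This kind of loss is basic machine learning; nothing difficult here.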
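The loss function for the location:
Code implementation:
def location_loss( x, y, width, height, l_x, l_y, l_width, l_height, alpha = 5 ):
    # Squared error of the center point
    point_loss = ( tf.square( l_x - x ) + tf.square( l_y - y ) ) * alpha
    # Squared error of the square roots of width and height
    size_loss = ( tf.square( tf.sqrt( l_width ) - tf.sqrt( width ) ) + tf.square( tf.sqrt( l_height ) - tf.sqrt( height ) ) ) * alpha
    loss = point_loss + size_loss
    return loss
There is a square root here, so when writing the net later, remember to take the absolute value of the output. That keeps a negative value from ending up under the root and producing nan.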
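The effect of the square roots is easier to see with numbers: the same absolute error in width and height costs more on a small box than on a large one. The scalar sketch below mirrors the formula with math.sqrt; location_loss_scalar is a hypothetical name.

```python
import math


def location_loss_scalar(x, y, width, height, l_x, l_y, l_width, l_height, alpha=5):
    """Scalar version of the location loss, for checking values by hand."""
    point = ((l_x - x) ** 2 + (l_y - y) ** 2) * alpha
    size = ((math.sqrt(l_width) - math.sqrt(width)) ** 2
            + (math.sqrt(l_height) - math.sqrt(height)) ** 2) * alpha
    return point + size


# Both predictions are off by 10 pixels in width and height.
small_box = location_loss_scalar(0, 0, 10, 10, 0, 0, 20, 20)
large_box = location_loss_scalar(0, 0, 100, 100, 0, 0, 110, 110)
exact = location_loss_scalar(3, 4, 50, 60, 3, 4, 50, 60)
```

Here small_box comes out several times larger than large_box. Note that math.sqrt raises an error on a negative width, whereas tf.sqrt returns nan, which is the case the absolute value on the network output guards against.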
IOU
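IOU is the area of the intersection of the target's predicted box and its label box, divided by the total area the two boxes cover (their union).
Here we know the center coordinates, width and height of the label box and of the predicted box, so what remains is a middle-school math problem.
I try to keep the denominator from ever being 0, to avoid the annoying nan error.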
import tensorflow as tf

def IOU_calculator( x, y, width, height, l_x, l_y, l_width, l_height ):
    '''
    Calculate IOU
    :param x: net predicted x
    :param y: net predicted y
    :param width: net predicted width
    :param height: net predicted height
    :param l_x: label x
    :param l_y: label y
    :param l_width: label width
    :param l_height: label height
    :return: IOU
    '''
    # calculate_max / calculate_min are helpers of this module that are not listed here
    x_max = calculate_max( x, width / 2 )
    y_max = calculate_max( y, height / 2 )
    x_min = calculate_min( x, width / 2 )
    y_min = calculate_min( y, height / 2 )
    # The label box uses the label's own width and height
    l_x_max = calculate_max( l_x, l_width / 2 )
    l_y_max = calculate_max( l_y, l_height / 2 )
    l_x_min = calculate_min( l_x, l_width / 2 )
    l_y_min = calculate_min( l_y, l_height / 2 )
    # -------- Corners of the overlapping area --------
    xend = tf.minimum( x_max, l_x_max )
    xstart = tf.maximum( x_min, l_x_min )
    yend = tf.minimum( y_max, l_y_max )
    ystart = tf.maximum( y_min, l_y_min )
    area_width = xend - xstart
    area_height = yend - ystart
    # -------- Calculate the IOU --------
    area = area_width * area_height
    union = width * height + l_width * l_height - area
    # Never divide by 0 or by a negative area
    all_area = tf.cond( union <= 0, lambda : tf.cast( 1e-8, tf.float32 ), lambda : union )
    IOU = area / all_area
    # No overlap along either axis means an IOU of (almost) 0
    IOU = tf.cond( area_width < 0, lambda : tf.cast( 1e-8, tf.float32 ), lambda : IOU )
    IOU = tf.cond( area_height < 0, lambda : tf.cast( 1e-8, tf.float32 ), lambda : IOU )
    return IOU
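The same geometry can be tried out without TensorFlow. The sketch below computes the IOU of two boxes given as (center x, center y, width, height); iou_center_boxes is a hypothetical name, and unlike the function above it returns a plain 0.0 rather than 1e-8 when the boxes do not overlap.

```python
def iou_center_boxes(box_a, box_b):
    """IOU of two boxes given as (center_x, center_y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap along each axis: smaller right/bottom edge minus larger left/top edge.
    inter_w = min(ax + aw / 2, bx + bw / 2) - max(ax - aw / 2, bx - bw / 2)
    inter_h = min(ay + ah / 2, by + bh / 2) - max(ay - ah / 2, by - bh / 2)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


same = iou_center_boxes((0, 0, 2, 2), (0, 0, 2, 2))
shifted = iou_center_boxes((0, 0, 2, 2), (1, 0, 2, 2))
apart = iou_center_boxes((0, 0, 2, 2), (5, 5, 2, 2))
```

For the shifted pair the intersection is 1 x 2 = 2 and the union is 4 + 4 - 2 = 6, so the IOU is 1/3.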
net
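The DarkNet-53 structure diagram from the paper:
The paper says there are 3 scales of different sizes. The scales take their parameters from the outputs of the last, second-to-last and third-to-last boxes in the diagram above, each combined with the output of the final layer. The author says this yields a great many "features".
The activation function I use is Leaky ReLU. I did not find a Leaky ReLU in TensorFlow (newer releases provide tf.nn.leaky_relu), so I wrote one myself. In essence it is a ReLU whose gradient is not 0 on the negative side of x.
def Leaky_Relu( input, alpha = 0.01 ):
    # For 0 < alpha < 1 this returns input when it is positive and alpha * input otherwise
    output = tf.maximum( input, tf.multiply( input, alpha ) )
    return output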
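The max-based formulation can be checked on single numbers. This scalar sketch uses a hypothetical name, leaky_relu_scalar, and relies on alpha being between 0 and 1; for a larger alpha the maximum would pick the wrong branch.

```python
def leaky_relu_scalar(value, alpha=0.01):
    """Leaky ReLU on one number, written as max(value, alpha * value)."""
    return max(value, alpha * value)


positive = leaky_relu_scalar(3.0)    # 3.0 beats 0.03, so the input passes through
negative = leaky_relu_scalar(-2.0)   # -0.02 beats -2.0, so the input is scaled by alpha
zero = leaky_relu_scalar(0.0)
```

On the negative side the slope is alpha instead of 0, which is the whole difference from a plain ReLU.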
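I declared two convolution functions. One applies batch_normalization and Leaky ReLU right after the convolution and carries straight on. The other applies batch_normalization and Leaky ReLU after the convolution, then adds the residual network's shortcut and passes the sum through Leaky ReLU again.
def Res_conv2d( inputs, shortcut, filters, shape, stride = ( 1, 1 ) ):
    conv = conv2d( inputs, filters, shape, stride )
    Res = Leaky_Relu( conv + shortcut )  # add the shortcut, then activate again
    return Res
def conv2d( inputs, filters, shape, stride = ( 1, 1 ) ):
    layer = tf.layers.conv2d( inputs,
                              filters,
                              shape,
                              stride,
                              padding = 'SAME',
                              kernel_initializer=tf.truncated_normal_initializer( stddev=0.01 ) )
    # training is fixed to True here, so batch statistics are also used at evaluation time
    layer = tf.layers.batch_normalization( layer, training = True )
    layer = Leaky_Relu( layer )
    return layer
Then the neural network is implemented by following the diagram.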
def feature_extractor( inputs ):
    layer = conv2d( inputs, 32, [3, 3] )
    layer = conv2d( layer, 64, [3, 3], ( 2, 2 ) )
    shortcut = layer
    layer = conv2d( layer, 32, [1, 1] )
    layer = Res_conv2d( layer, shortcut, 64, [3, 3] )
    layer = conv2d( layer, 128, [3, 3], ( 2, 2 ) )
    for _ in range( 2 ):
        shortcut = layer  # each residual block adds back its own input
        layer = conv2d( layer, 64, [1, 1] )
        layer = Res_conv2d( layer, shortcut, 128, [3, 3] )
    layer = conv2d( layer, 256, [3, 3], ( 2, 2 ) )
    for _ in range( 8 ):
        shortcut = layer
        layer = conv2d( layer, 128, [1, 1] )
        layer = Res_conv2d( layer, shortcut, 256, [3, 3] )
    pre_scale3 = layer  # taken after 3 stride-2 convolutions
    layer = conv2d( layer, 512, [3, 3], ( 2, 2 ) )
    for _ in range( 8 ):
        shortcut = layer
        layer = conv2d( layer, 256, [1, 1] )
        layer = Res_conv2d( layer, shortcut, 512, [3, 3] )
    pre_scale2 = layer  # taken after 4 stride-2 convolutions
    layer = conv2d( layer, 1024, [3, 3], ( 2, 2 ) )
    for _ in range( 4 ):
        shortcut = layer
        layer = conv2d( layer, 512, [1, 1] )
        layer = Res_conv2d( layer, shortcut, 1024, [3, 3] )
    pre_scale1 = layer  # taken after 5 stride-2 convolutions
    return pre_scale1, pre_scale2, pre_scale3
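The spatial size of the three returned feature maps follows from the five stride-2 convolutions: with 'SAME' padding each one maps a side of length n to ceil(n / 2). The sketch below works this out for an assumed 416x416 input (the real size comes from the config file); feature_map_sizes is a hypothetical name.

```python
import math


def feature_map_sizes(input_size=416):
    """Side length of each returned feature map for a square input."""
    size = input_size
    sizes = []
    for _ in range(5):                # five stride-2 convolutions in feature_extractor
        size = math.ceil(size / 2)    # 'SAME' padding with stride 2
        sizes.append(size)
    # pre_scale3, pre_scale2 and pre_scale1 are taken after the 3rd, 4th and 5th downsampling.
    return {'pre_scale3': sizes[2], 'pre_scale2': sizes[3], 'pre_scale1': sizes[4]}


sizes = feature_map_sizes(416)
```

Each map is twice the side length of the next one, which is what makes the 2x upsampling in the following step line up.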
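The author says that for scale2 and scale3 the parameters taken from the middle of the network go through a 2x operation. My understanding is that the output is simply resized as if it were an image.
In this function I also join the upsampled final-layer output with the earlier feature map, concatenating the two along the channel axis.
def get_layer2x( layer_final, pre_scale ):
    # Double the height and width of the final layer's output
    layer2x = tf.image.resize_images(layer_final,
                                     [2 * tf.shape(layer_final)[1], 2 * tf.shape(layer_final)[2]])
    # Concatenate with the earlier feature map along the channel axis
    layer2x_add = tf.concat( [layer2x, pre_scale], 3 )
    return layer2x_add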
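What the 2x step does to shapes can be sketched without TensorFlow. The example below uses nearest-neighbour repetition on a small 2D grid for simplicity, whereas tf.image.resize_images defaults to bilinear interpolation, and then computes the shape that results from joining two NHWC tensors along the channel axis. Both function names are hypothetical.

```python
def upsample2x_nearest(grid):
    """Double the height and width of a 2D grid by repeating every value."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def concat_shape(upsampled_shape, skip_shape):
    """Shape after concatenating two NHWC tensors along the channel axis."""
    n, h, w, c1 = upsampled_shape
    n2, h2, w2, c2 = skip_shape
    if (n, h, w) != (n2, h2, w2):
        raise ValueError('batch, height and width must match before concatenation')
    return (n, h, w, c1 + c2)


doubled = upsample2x_nearest([[1, 2], [3, 4]])
joined = concat_shape((1, 26, 26, 256), (1, 26, 26, 512))
```

The channel counts add up while height and width stay the same, so the two inputs must already agree on their spatial size.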
Each scale finally passes through a few more layers to produce the final prediction. I implemented this part according to the structure printed by the DarkNet I installed. At the very end I take the absolute value of all results to prevent the nan error mentioned earlier.
def scales( layer, pre_scale2, pre_scale3 ):
    layer = conv2d( layer, 512, [1, 1] )
    layer = conv2d( layer, 1024, [3, 3] )
    layer = conv2d( layer, 512, [1, 1] )
    layer_final = layer
    layer = conv2d( layer, 1024, [3, 3] )
    # -------- scale_1 --------
    scale_1 = conv2d( layer, 255, [1, 1] )
    # -------- scale_2 --------
    layer = conv2d( layer_final, 256, [1, 1] )
    layer = get_layer2x( layer, pre_scale2 )
    layer = conv2d( layer, 256, [1, 1] )
    layer = conv2d( layer, 512, [3, 3] )
    layer = conv2d( layer, 256, [1, 1] )
    layer = conv2d( layer, 512, [3, 3] )
    layer = conv2d( layer, 256, [1, 1] )
    layer_final = layer
    layer = conv2d( layer, 512, [3, 3] )
    scale_2 = conv2d( layer, 255, [1, 1] )
    # -------- scale_3 --------
    layer = conv2d( layer_final, 128, [1, 1] )
    layer = get_layer2x( layer, pre_scale3 )
    for _ in range( 3 ):
        layer = conv2d( layer, 128, [1, 1] )
        layer = conv2d( layer, 256, [3, 3] )
    scale_3 = conv2d( layer, 255, [1, 1] )
    # Absolute values keep the square roots in the loss from producing nan
    scale_1 = tf.abs( scale_1 )
    scale_2 = tf.abs( scale_2 )
    scale_3 = tf.abs( scale_3 )
    return scale_1, scale_2, scale_3
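The 255 output channels of each scale correspond to 3 boxes per cell, each with 4 coordinates, 1 objectness score and 80 class scores, which is the size used for the 80-class COCO dataset. With the 20 VOC classes and the 25-value box layout used for the labels in this post, only 3 * 25 = 75 channels are needed. The helper below, a hypothetical output_channels, just spells out that arithmetic.

```python
def output_channels(num_classes, boxes_per_cell=3):
    """Channels per grid cell: every box has 4 coordinates, 1 objectness and the class scores."""
    return boxes_per_cell * (4 + 1 + num_classes)


coco_channels = output_channels(80)
voc_channels = output_channels(20)
```

Passing the result to the last 1x1 convolution of each scale would be one way to keep the output size in step with the label layout.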
eval
import argparse
import os
import time

import cv2
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

import reader
from utils import net, select_things
# eval_uitls is also used below; its module path is not shown in this post

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument( '-c', '--conf', default = './config/eval_config.yml', help = 'the path to the eval_config file' )
    return parser.parse_args()

def main( FLAGS ):
    if not os.path.exists( FLAGS.save_dir ):
        os.makedirs( FLAGS.save_dir )
    input_image = reader.get_image( FLAGS.image_dir, FLAGS.image_width, FLAGS.image_height )
    output_image = np.copy( input_image )
    # -------- Create placeholder --------
    image = net.create_eval_placeholder( FLAGS.image_width, FLAGS.image_height )
    # -------- net --------
    pre_scale1, pre_scale2, pre_scale3 = net.feature_extractor( image )
    scale1, scale2, scale3 = net.scales( pre_scale1, pre_scale2, pre_scale3 )
    with tf.Session() as sess:
        saver = tf.train.Saver()
        save_path = select_things.select_checkpoint( FLAGS.scale )
        last_checkpoint = tf.train.latest_checkpoint( save_path, 'checkpoint' )
        if last_checkpoint:
            saver.restore( sess, last_checkpoint )
            print( 'Successfully loaded model from: {}'.format( last_checkpoint ) )
        else:
            print( 'Model has not been trained' )
            return  # nothing to evaluate without restored weights
        start_time = time.time()
        scale1, scale2, scale3 = sess.run( [scale1, scale2, scale3], feed_dict = {image: [output_image]} )
    # Pick the output of the configured scale
    if FLAGS.scale == 1:
        scale = scale1
    if FLAGS.scale == 2:
        scale = scale2
    if FLAGS.scale == 3:
        scale = scale3
    boxes_labels = eval_uitls.label_extractor( scale[0] )
    bdboxes = eval_uitls.get_bdboxes( boxes_labels )
    for bdbox in bdboxes:
        font = cv2.FONT_HERSHEY_SIMPLEX
        # bdbox holds ( center x, center y, width, height, ... ); draw it from its two corners
        output_image = cv2.rectangle( output_image,
                                      ( int( bdbox[0] - bdbox[2] / 2 ), int( bdbox[1] - bdbox[3] / 2 ) ),
                                      ( int( bdbox[0] + bdbox[2] / 2 ), int( bdbox[1] + bdbox[3] / 2 ) ),
                                      ( 200, 0, 0 ),
                                      1 )
        # output_image = cv2.putText( output_image, bdbox[5],
        #                             ( bdbox[0] - bdbox[2] / 2, bdbox[1] - bdbox[3] / 2 ),
        #                             1.2,
        #                             (0, 255, 0),
        #                             2 )
    # output_image = np.multiply( output_image, 255 )
    generate_image = FLAGS.save_dir + '/res.jpg'
    # Encode the array as a JPEG file; cv2.imwrite expects BGR channel order
    cv2.imwrite( generate_image, output_image )
    end_time = time.time()
    print( 'Use time: ', end_time - start_time )
    plt.imshow( output_image )
    plt.show()
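The drawing step turns a box stored as center point plus size into the two corner points that cv2.rectangle expects. That conversion can be isolated as below; to_corners is a hypothetical helper, and the tuple layout (center x, center y, width, height) is taken from how bdbox is indexed in the listing.

```python
def to_corners(bdbox):
    """Convert (center_x, center_y, width, height, ...) to integer corner points."""
    x, y, w, h = bdbox[:4]
    top_left = (int(x - w / 2), int(y - h / 2))
    bottom_right = (int(x + w / 2), int(y + h / 2))
    return top_left, bottom_right


corners = to_corners((100, 50, 40, 20))
```

Pixel coordinates must be integers for drawing, hence the int() around each value.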
train
import argparse
import os
import time

import numpy as np
import tensorflow as tf

import reader
from utils import net, read_config, get_loss, IOU, extract_labels, select_things

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument( '-c', '--conf', default = './config/config.yml', help = 'the path to the config file' )
    return parser.parse_args()

def main( FLAGS ):
    scale_width, scale_height = select_things.select_scale( FLAGS.scale, FLAGS.width, FLAGS.height )
    # -------- Create placeholder --------
    datas, labels = net.create_placeholder( FLAGS.batch_size, FLAGS.width, FLAGS.height, scale_width, scale_height )
    # -------- net --------
    pre_scale1, pre_scale2, pre_scale3 = net.feature_extractor( datas )
    scale1, scale2, scale3 = net.scales( pre_scale1, pre_scale2, pre_scale3 )
    # -------- get labels_filenames and datas_filenames --------
    datas_filenames = reader.images( FLAGS.batch_size, FLAGS.datas_path )
    labels_filenames = reader.labels( FLAGS.batch_size, FLAGS.labels_path )
    normalize_labels = extract_labels.labels_normalizer( labels_filenames,
                                                         FLAGS.width,
                                                         FLAGS.height,
                                                         scale_width,
                                                         scale_height )
    # -------- partition the train data and val data --------
    # The first 90% of the batches are for training, the remaining 10% for validation
    train_filenames = datas_filenames[: int( len( datas_filenames ) * 0.9 )]
    train_labels = normalize_labels[: int( len( normalize_labels ) * 0.9 )]
    val_filenames = datas_filenames[int( len( datas_filenames ) * 0.9 ) :]
    val_labels = normalize_labels[int( len( normalize_labels ) * 0.9 ) :]
    # -------- calculate loss --------
    if FLAGS.scale == 1:
        loss = get_loss.calculate_loss( scale1, labels )
    if FLAGS.scale == 2:
        loss = get_loss.calculate_loss( scale2, labels )
    if FLAGS.scale == 3:
        loss = get_loss.calculate_loss( scale3, labels )
    # -------- Optimizer --------
    optimizer = tf.train.AdamOptimizer( learning_rate=FLAGS.learning_rate ).minimize( loss )
    tf.summary.scalar( 'epoch_loss', loss )
    merged = tf.summary.merge_all()
    init = tf.global_variables_initializer()
    # -------- train --------
    with tf.Session() as sess:
        saver = tf.train.Saver()
        save_path = select_things.select_checkpoint( FLAGS.scale )
        last_checkpoint = tf.train.latest_checkpoint( save_path, 'checkpoint' )
        if last_checkpoint:
            saver.restore( sess, last_checkpoint )
            print( 'Reuse model' )
        else:
            sess.run( init )
        for epoch in range( FLAGS.epoch ):
            epoch_loss = 0.0  # plain Python float, accumulated over the batches
            for i in range( len( train_filenames ) ):
                normalize_datas = []
                for data_filename in train_filenames[i]:
                    image = reader.get_image( data_filename, FLAGS.width, FLAGS.height )
                    image = np.array( image, np.float32 )
                    normalize_datas.append( image )
                normalize_datas = np.array( normalize_datas )
                _, batch_loss = sess.run( [optimizer, loss], feed_dict = {datas: normalize_datas, labels: train_labels[i]} )
                epoch_loss += batch_loss
            if epoch % 10 == 0:
                print( 'Cost after epoch %i: %f' % ( epoch, epoch_loss ) )
            if epoch % 50 == 0:
                val_loss = 0.0
                for i in range( len( val_filenames ) ):
                    normalize_datas = []
                    for val_filename in val_filenames[i]:
                        image = reader.get_image( val_filename, FLAGS.width, FLAGS.height )
                        image = np.array( image, np.float32 )
                        # NOTE: validation images are scaled to [0, 1] here, training images above are not
                        image = np.divide( image, 255 )
                        normalize_datas.append( image )
                    normalize_datas = np.array( normalize_datas )
                    batch_loss = sess.run( loss, feed_dict = {datas: normalize_datas, labels: val_labels[i]} )
                    val_loss += batch_loss
                print( 'VAL_Cost after epoch %i: %f' % ( epoch, val_loss ) )
                saver.save( sess, save_path, global_step = epoch )

if __name__ == '__main__':
    args = parse_args()
    FLAGS = read_config.read_config_file( args.conf )
    main( FLAGS )
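A 90/10 partition in which the training and validation parts do not overlap can be sketched with a single cut index. The function name split_batches and the train_fraction parameter are hypothetical; the input is assumed to be a list of batches, as produced by the reader.

```python
def split_batches(batches, train_fraction=0.9):
    """Split a list of batches into a leading training part and a trailing validation part."""
    cut = int(len(batches) * train_fraction)
    return batches[:cut], batches[cut:]


train_part, val_part = split_batches(list(range(10)))
```

Using the same cut index for both slices guarantees that every batch lands in exactly one of the two parts.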