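Training on your own image data with TensorFlow (3): building the network model
1. Overview
In the previous post, Training on your own image data with TensorFlow (2), we obtained the training inputs for the neural network: image_batch and label_batch. The next step is to build the network model. The network I use has the following structure: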
Input data: (batch_size, IMG_W, IMG_H, col_channel) = (20, 64, 64, 3)
Convolution layer 1: (conv_kernel, num_channel, num_out_neure) = (3, 3, 3, 64)
Pooling layer 1: (ksize, strides, padding) = ([1,3,3,1], [1,2,2,1], 'SAME')
Convolution layer 2: (conv_kernel, num_channel, num_out_neure) = (3, 3, 64, 16)
Pooling layer 2: (ksize, strides, padding) = ([1,3,3,1], [1,1,1,1], 'SAME')
Fully connected layer 1: (out_pool2_reshape, num_out_neure) = (dim, 128)
Fully connected layer 2: (fc1_out, num_out_neure) = (128, 128)
Softmax layer: (fc2_out, num_classes) = (128, 4)
Activation function: tf.nn.relu
Loss function: tf.nn.sparse_softmax_cross_entropy_with_logits
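The value of dim in fully connected layer 1 follows from the layer list above. Here is a minimal sketch of that arithmetic, assuming TensorFlow's 'SAME' padding rule (output size = ceil(input size / stride)). The helper name same_out is hypothetical and not part of the original code:

```python
import math

def same_out(size, stride):
    # With padding='SAME', the output size depends only on input size and stride.
    return math.ceil(size / stride)

h = w = 64                               # input: 64 x 64 x 3
h, w = same_out(h, 1), same_out(w, 1)    # conv1, stride 1 -> 64 x 64 x 64
h, w = same_out(h, 2), same_out(w, 2)    # pool1, stride 2 -> 32 x 32 x 64
h, w = same_out(h, 1), same_out(w, 1)    # conv2, stride 1 -> 32 x 32 x 16
h, w = same_out(h, 1), same_out(w, 1)    # pool2, stride 1 -> 32 x 32 x 16
dim = h * w * 16                         # flattened features per image
print(dim)  # 16384
```

Under these assumptions the weight matrix of fully connected layer 1 has shape (16384, 128).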
2. Implementation
#=========================================================================
import tensorflow as tf
#=========================================================================
# Network definition
# Input:   images, an image batch: 4-D tensor, tf.float32, [batch_size, width, height, channels]
# Returns: logits, float, [batch_size, n_classes]
def inference(images, batch_size, n_classes):
    # A simple CNN: two conv + pooling blocks, two fully connected layers,
    # and a final softmax layer for classification.
    # Convolution layer 1
    # 64 kernels of size 3x3 (3 input channels); padding='SAME' with stride 1 keeps
    # the output the same spatial size as the input; ReLU activation.
    with tf.variable_scope('conv1') as scope:
        weights = tf.Variable(tf.truncated_normal(shape=[3,3,3,64], stddev = 1.0, dtype = tf.float32),
                              name = 'weights', dtype = tf.float32)
        biases = tf.Variable(tf.constant(value = 0.1, dtype = tf.float32, shape = [64]),
                             name = 'biases', dtype = tf.float32)
        conv = tf.nn.conv2d(images, weights, strides=[1,1,1,1], padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(pre_activation, name= scope.name)
    # Pooling layer 1
    # 3x3 max pooling with stride 2, followed by lrn() (local response normalization),
    # which can help training.
    with tf.variable_scope('pooling1_lrn') as scope:
        pool1 = tf.nn.max_pool(conv1, ksize=[1,3,3,1],strides=[1,2,2,1],padding='SAME', name='pooling1')
        norm1 = tf.nn.lrn(pool1, depth_radius=4, bias=1.0, alpha=0.001/9.0, beta=0.75, name='norm1')
    # Convolution layer 2
    # 16 kernels of size 3x3 (64 input channels); padding='SAME' with stride 1 keeps
    # the output the same spatial size as the input; ReLU activation.
    with tf.variable_scope('conv2') as scope:
        weights = tf.Variable(tf.truncated_normal(shape=[3,3,64,16], stddev = 0.1, dtype = tf.float32),
                              name = 'weights', dtype = tf.float32)
        biases = tf.Variable(tf.constant(value = 0.1, dtype = tf.float32, shape = [16]),
                             name = 'biases', dtype = tf.float32)
        conv = tf.nn.conv2d(norm1, weights, strides = [1,1,1,1],padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(pre_activation, name='conv2')
    # Pooling layer 2
    # Here lrn() runs first, then 3x3 max pooling with stride 1 (norm2, then pool2).
    with tf.variable_scope('pooling2_lrn') as scope:
        norm2 = tf.nn.lrn(conv2, depth_radius=4, bias=1.0, alpha=0.001/9.0,beta=0.75,name='norm2')
        pool2 = tf.nn.max_pool(norm2, ksize=[1,3,3,1], strides=[1,1,1,1],padding='SAME',name='pooling2')
    # Fully connected layer 1 (local3)
    # 128 neurons; the output of the preceding pooling layer is reshaped to one row
    # per image; ReLU activation.
    with tf.variable_scope('local3') as scope:
        reshape = tf.reshape(pool2, shape=[batch_size, -1])
        dim = reshape.get_shape()[1].value
        weights = tf.Variable(tf.truncated_normal(shape=[dim,128], stddev = 0.005, dtype = tf.float32),
                              name = 'weights', dtype = tf.float32)
        biases = tf.Variable(tf.constant(value = 0.1, dtype = tf.float32, shape = [128]),
                             name = 'biases', dtype=tf.float32)
        local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)
    # Fully connected layer 2 (local4)
    # 128 neurons; ReLU activation.
    with tf.variable_scope('local4') as scope:
        weights = tf.Variable(tf.truncated_normal(shape=[128,128], stddev = 0.005, dtype = tf.float32),
                              name = 'weights',dtype = tf.float32)
        biases = tf.Variable(tf.constant(value = 0.1, dtype = tf.float32, shape = [128]),
                             name = 'biases', dtype = tf.float32)
        local4 = tf.nn.relu(tf.matmul(local3, weights) + biases, name='local4')
    # Dropout layer (disabled)
    # with tf.variable_scope('dropout') as scope:
    #     drop_out = tf.nn.dropout(local4, 0.8)
    # Softmax regression layer
    # A linear layer on top of the FC output that computes one score (logit) per class,
    # i.e. n_classes scores per image. The softmax itself is applied inside the loss function.
    with tf.variable_scope('softmax_linear') as scope:
        weights = tf.Variable(tf.truncated_normal(shape=[128, n_classes], stddev = 0.005, dtype = tf.float32),
                              name = 'softmax_linear', dtype = tf.float32)
        biases = tf.Variable(tf.constant(value = 0.1, dtype = tf.float32, shape = [n_classes]),
                             name = 'biases', dtype = tf.float32)
        softmax_linear = tf.add(tf.matmul(local4, weights), biases, name='softmax_linear')
    return softmax_linear
#-----------------------------------------------------------------------------
# Loss computation
# Inputs:  logits, the network output. labels, the ground truth, given as integer
#          class indices in [0, n_classes).
# Returns: loss, the loss value
def losses(logits, labels):
    with tf.variable_scope('loss') as scope:
        cross_entropy =tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels, name='xentropy_per_example')
        loss = tf.reduce_mean(cross_entropy, name='loss')
        tf.summary.scalar(scope.name+'/loss', loss)
    return loss
#--------------------------------------------------------------------------
# Loss optimization
# Inputs:  loss. learning_rate, the learning rate.
# Returns: train_op, the training op; pass it to sess.run to train the model.
def trainning(loss, learning_rate):
    with tf.name_scope('optimizer'):
        optimizer = tf.train.AdamOptimizer(learning_rate= learning_rate)
        global_step = tf.Variable(0, name='global_step', trainable=False)
        train_op = optimizer.minimize(loss, global_step= global_step)
    return train_op
#-----------------------------------------------------------------------
# Evaluation / accuracy computation
# Inputs:  logits, the network output. labels, the ground truth, given as integer
#          class indices in [0, n_classes).
# Returns: accuracy, the mean accuracy for the current step, i.e. the fraction of
#          images in the batch that were classified correctly.
def evaluation(logits, labels):
    with tf.variable_scope('accuracy') as scope:
        correct = tf.nn.in_top_k(logits, labels, 1)
        correct = tf.cast(correct, tf.float16)
        accuracy = tf.reduce_mean(correct)
        tf.summary.scalar(scope.name+'/accuracy', accuracy)
    return accuracy
#========================================================================
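To make losses() and evaluation() more concrete, here is a pure-Python sketch of what they compute for a small batch. It covers tf.nn.sparse_softmax_cross_entropy_with_logits followed by tf.reduce_mean, and tf.nn.in_top_k with k=1. The function names and the sample logits are made up for illustration, and ties between logits are ignored:

```python
import math

def sparse_softmax_xent(logits, label):
    # Cross-entropy of one example: -log(softmax(logits)[label]),
    # computed with the log-sum-exp trick for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def mean_loss(batch_logits, labels):
    # Average the per-example losses over the batch.
    per_example = [sparse_softmax_xent(z, y) for z, y in zip(batch_logits, labels)]
    return sum(per_example) / len(per_example)

def top1_accuracy(batch_logits, labels):
    # An example counts as correct when its label has the largest logit.
    hits = [z.index(max(z)) == y for z, y in zip(batch_logits, labels)]
    return sum(hits) / len(hits)

# Hypothetical batch: 3 examples, 4 classes.
batch_logits = [[2.0, 0.5, 0.1, 0.1],
                [0.0, 0.0, 3.0, 0.0],
                [1.0, 1.5, 0.2, 0.3]]
labels = [0, 2, 0]

loss = mean_loss(batch_logits, labels)
accuracy = top1_accuracy(batch_logits, labels)
```

In this sample the third example is misclassified, so accuracy is 2/3. Note that the labels are plain class indices, not one-hot vectors, which is what the "sparse" variant of the loss expects.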
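3. Additional notes
The local response normalization function in TensorFlow: tf.nn.lrn
tf.nn.lrn(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)
input is a 4-D tensor; its type must be float.
depth_radius is an int scalar: the number of neighboring kernels (channels) on each side that are included in the normalization window.
bias is an offset.
alpha is a scale factor: it multiplies the sum of the squared activations of the kernels inside the window.
beta is the exponent.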
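As a rough illustration of how these parameters interact, the sketch below applies the formula given in the TensorFlow documentation, output = input / (bias + alpha * sqr_sum) ** beta, to the channel vector at a single pixel. Here sqr_sum is the sum of squares over channels d - depth_radius through d + depth_radius. The function lrn_pixel and the sample values are hypothetical; tf.nn.lrn applies the same rule at every position of the 4-D tensor.

```python
def lrn_pixel(channels, depth_radius=4, bias=1.0, alpha=0.001 / 9.0, beta=0.75):
    # channels: activations of all kernels at one spatial position.
    out = []
    n = len(channels)
    for d, value in enumerate(channels):
        # Window of neighboring channels, clipped at the ends.
        lo = max(0, d - depth_radius)
        hi = min(n, d + depth_radius + 1)
        sqr_sum = sum(c * c for c in channels[lo:hi])
        out.append(value / (bias + alpha * sqr_sum) ** beta)
    return out

# Hypothetical activations: one channel responds much more strongly than the rest.
activations = [0.0, 1.0, 50.0, 1.0, 0.0, 0.0, 0.0, 0.0]
normalized = lrn_pixel(activations)
```

With the parameter values used in the code above, the channels next to the strong response of 50.0 are scaled down more than they would be on their own, because the large value enters their sqr_sum.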
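LRN is a form of normalization. Here the purpose of normalization is suppression, that is, damping the outputs of neurons. The design of LRN borrows a concept from neurobiology called "lateral inhibition".
Lateral inhibition: neighboring neurons inhibit one another. When one neuron is stimulated and becomes excited, and a nearby neuron is then stimulated as well, the excitation of the latter inhibits the former. In other words, lateral inhibition is the phenomenon in which adjacent receptors suppress each other.
Note: see also the blog post http://blog.csdn.net/gzhermit/article/details/75389130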