How to Do Image Recognition with TensorFlow

阿新 • Published: 2018-12-16

The weak console themselves with tears; the strong temper themselves with sweat.

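The previous article covered handwritten-digit recognition with TensorFlow. This one is a follow-up: it shows how to use TensorFlow to classify images from several categories. The dataset is CIFAR-10, a classic benchmark (a quick web search will turn up its official site). It contains 60,000 color images of 32x32 pixels, 50,000 for training and 10,000 for testing, split across 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship and truck.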

The first step is to download the TensorFlow Models repository. You can grab it from GitHub directly or clone it with git:

git clone https://github.com/tensorflow/models.git

Import the libraries, then define batch_size, the number of training steps max_steps, and the path where the CIFAR-10 data lives.

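# Assumes the CIFAR-10 tutorial code from the cloned models repo is importable under this package path
from tensorflow.models.tutorials.image.cifar10 import cifar10, cifar10_input
import tensorflow as tf
import numpy as np
import time

max_steps=3000   # number of training steps
batch_size=128   # examples per step
# Assumption: maybe_download_and_extract() uses its default /tmp/cifar10_data directory
# and unpacks the binary batches into the cifar-10-batches-bin subfolder
data_dir='/tmp/cifar10_data/cifar-10-batches-bin'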

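Define a function that initializes a weight. It draws the initial values from a truncated normal distribution with tf.truncated_normal and attaches an L2 loss to the weight; L2 regularization penalizes large weights, which helps the network keep only the most useful features. The parameter w1 controls the size of the L2 loss: tf.nn.l2_loss computes the L2 loss of the weight, and tf.multiply scales it by w1 to give the final weight loss. tf.add_to_collection then stores the weight loss in a collection named losses, which is used later to compute the network's total loss.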

def variable_with_weight_loss(shape,stddev,w1):
    var = tf.Variable(tf.truncated_normal(shape,stddev=stddev))
    if w1 is not None:
        weight_loss=tf.multiply(tf.nn.l2_loss(var),w1,name='weight_loss')
        tf.add_to_collection('losses',weight_loss)
    return var
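To make the two ingredients of this helper concrete, here is a minimal pure-Python sketch of what they compute, with no TensorFlow involved. It assumes the documented behaviour of tf.truncated_normal (values more than two standard deviations from the mean are redrawn) and of tf.nn.l2_loss (sum of squares divided by 2). The names truncated_normal_sample and weight_loss are made up for this illustration.

```python
import random

def truncated_normal_sample(n, stddev, seed=0):
    """Draw n values from N(0, stddev^2), redrawing any value that lies
    more than 2 standard deviations from the mean."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        value = rng.gauss(0.0, stddev)
        if abs(value) <= 2.0 * stddev:
            samples.append(value)
    return samples

def weight_loss(weights, w1):
    """L2 penalty on a flat list of weights: w1 * sum(w^2) / 2."""
    l2 = sum(w * w for w in weights) / 2.0
    return w1 * l2

weights = truncated_normal_sample(1000, stddev=0.05)
print(max(abs(w) for w in weights) <= 0.1)   # every sample stays within 2 * stddev
print(weight_loss([3.0, 4.0], w1=0.004))     # 0.004 * (9 + 16) / 2
```

With w1 set to 0 the penalty is zero, which is how the convolutional layers later in the article opt out of regularization.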

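Use cifar10 to download the dataset, then use the distorted_inputs function in cifar10_input to produce the training data, both the features and their labels. What comes back are ready-made tensors; every time they are run they yield batch_size examples. The function applies data augmentation: random horizontal flips, random 24x24 crops, random brightness and contrast changes, and standardization of the data. If you want to know more, have a look at my earlier articles. Because augmentation is computationally expensive, the function creates 16 independent threads internally to do the work and schedules them with TensorFlow queues.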

cifar10.maybe_download_and_extract()

images_train,labels_train=cifar10_input.distorted_inputs(data_dir=data_dir,batch_size=batch_size)
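The augmentation steps are easier to picture with a small NumPy sketch. This is not the code inside cifar10_input, only an illustration of three of the steps described above (random crop, random horizontal flip, per-image standardization); brightness and contrast changes are left out, and the function name augment and the random test image are assumptions made for the example.

```python
import numpy as np

def augment(image, rng, crop=24):
    """Random crop, random horizontal flip, then per-image standardization."""
    height, width, _ = image.shape
    # Pick the top-left corner so the crop stays inside the image
    top = rng.randint(0, height - crop + 1)
    left = rng.randint(0, width - crop + 1)
    patch = image[top:top + crop, left:left + crop, :].astype(np.float64)
    # Flip left-right with probability 0.5
    if rng.rand() < 0.5:
        patch = patch[:, ::-1, :]
    # Zero mean and unit variance; the floor on std avoids dividing by ~0
    adjusted_std = max(patch.std(), 1.0 / np.sqrt(patch.size))
    return (patch - patch.mean()) / adjusted_std

rng = np.random.RandomState(0)
fake_image = rng.randint(0, 256, size=(32, 32, 3))  # stand-in for a CIFAR-10 image
out = augment(fake_image, rng)
print(out.shape)
```

Each call produces a slightly different 24x24 view of the same image, which is what lets a 50,000-image training set behave like a much larger one.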

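Next, use cifar10_input.inputs to generate the test data. Then create placeholders for the features and the labels. Because batch_size is used later when defining the network, the first dimension of the placeholder shape has to be fixed in advance; the images are 24x24 with 3 color channels.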

images_test,labels_test=cifar10_input.inputs(eval_data=True,data_dir=data_dir,batch_size=batch_size)

image_holder=tf.placeholder(tf.float32,[batch_size,24,24,3])
label_holder=tf.placeholder(tf.int32,[batch_size])

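Now build the first convolutional layer. Start by creating and initializing the kernel parameters with the variable_with_weight_loss function written earlier. The first layer uses 5x5 kernels, 3 color channels and 64 kernels, with a standard deviation of 0.05 for the weight initialization. We do not apply L2 regularization to the first convolutional layer, so w1 is set to 0. Use tf.nn.conv2d to convolve the input with a stride of 1 and SAME padding. Initialize this layer's bias to all zeros, add the bias to the convolution result, and apply a ReLU activation for non-linearity. After the ReLU comes a max-pooling layer with a 3x3 window and a 2x2 stride, followed by tf.nn.lrn, which makes strong responses relatively stronger and weak responses relatively weaker.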

weight1=variable_with_weight_loss(shape=[5,5,3,64],stddev=5e-2,w1=0.0)
kernel1=tf.nn.conv2d(image_holder,weight1,[1,1,1,1],padding='SAME')
bias1=tf.Variable(tf.constant(0.0,shape=[64]))
conv1=tf.nn.relu(tf.nn.bias_add(kernel1,bias1))
pool1=tf.nn.max_pool(conv1,ksize=[1,3,3,1],strides=[1,2,2,1],padding='SAME')
norm1=tf.nn.lrn(pool1,4,bias=1.0,alpha=0.01/9.0,beta=0.75)
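For readers wondering what the LRN step does numerically, here is a small pure-Python sketch for the channel vector of a single pixel. It follows the formula in the tf.nn.lrn documentation (each value is divided by (bias + alpha * sum of squares over neighbouring channels) ** beta); the function name lrn_channels and the sample values are assumptions for illustration only.

```python
def lrn_channels(values, depth_radius=4, bias=1.0, alpha=0.01 / 9.0, beta=0.75):
    """Local response normalization across the channels of one pixel."""
    normalized = []
    n = len(values)
    for d in range(n):
        # Neighbouring channels within depth_radius of channel d
        lo = max(0, d - depth_radius)
        hi = min(n, d + depth_radius + 1)
        sqr_sum = sum(v * v for v in values[lo:hi])
        normalized.append(values[d] / (bias + alpha * sqr_sum) ** beta)
    return normalized

responses = [0.5, 4.0, 0.2, 0.0, 1.0]   # hypothetical ReLU outputs of 5 channels
print(lrn_channels(responses))
```

A channel surrounded by strong neighbours gets damped more than one surrounded by weak neighbours, which is the competition effect the paragraph above describes.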

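The second convolutional layer follows much the same steps as the first. The differences are that its bias is initialized to 0.1 and that the order of the max-pooling and LRN layers is swapped: LRN comes first, then pooling.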

weight2=variable_with_weight_loss(shape=[5,5,64,64],stddev=5e-2,w1=0.0)
kernel2=tf.nn.conv2d(norm1,weight2,[1,1,1,1],padding='SAME')
bias2=tf.Variable(tf.constant(0.1,shape=[64]))
conv2=tf.nn.relu(tf.nn.bias_add(kernel2,bias2))
norm2=tf.nn.lrn(conv2,4,bias=1.0,alpha=0.01/9.0,beta=0.75)
pool2=tf.nn.max_pool(norm2,ksize=[1,3,3,1],strides=[1,2,2,1],padding='SAME')

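Next comes a fully connected layer. First flatten the previous output, using tf.reshape to turn each example into a one-dimensional vector, and use get_shape to read the length of the flattened data. Then initialize the fully connected layer's weight with variable_with_weight_loss. This layer has 384 hidden units, the standard deviation of the normal distribution is 0.04, and the bias is initialized to 0.1. Note that we do not want the fully connected layer to overfit, so the weight loss coefficient is set to a non-zero value, 0.004, which puts every parameter of this layer under the L2 constraint. Finally, apply a ReLU activation for non-linearity.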

reshape=tf.reshape(pool2,[batch_size,-1])
dim=reshape.get_shape()[1].value
weight3=variable_with_weight_loss(shape=[dim,384],stddev=0.04,w1=0.004)
bias3=tf.Variable(tf.constant(0.1,shape=[384]))
local3=tf.nn.relu(tf.matmul(reshape,weight3)+bias3)
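It can help to work out by hand what dim ends up being. The sketch below assumes the usual rule for padding='SAME' (output size = ceil(input size / stride)) and traces the 24x24 input through the two convolution and pooling stages; the helper name same_padding_output is made up for the example.

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a conv or pool layer with padding='SAME'."""
    return int(math.ceil(size / float(stride)))

size = 24                             # cropped input is 24x24
size = same_padding_output(size, 1)   # conv1, stride 1
size = same_padding_output(size, 2)   # pool1, stride 2
size = same_padding_output(size, 1)   # conv2, stride 1
size = same_padding_output(size, 2)   # pool2, stride 2
dim = size * size * 64                # 64 feature maps after conv2
print(size, dim)                      # 6 2304
print(dim * 384)                      # number of entries in weight3
```

So each example is flattened into a vector of 2304 values, and the first fully connected weight alone holds 884,736 parameters, which is why this is the layer that gets the L2 penalty.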

Add one more fully connected layer, halving the number of hidden units to 192.

weight4=variable_with_weight_loss(shape=[384,192],stddev=0.04,w1=0.004)
bias4=tf.Variable(tf.constant(0.1,shape=[192]))
local4=tf.nn.relu(tf.matmul(local3,weight4)+bias4)

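Build the final layer. Create the weight first, setting the standard deviation of its normal distribution to the reciprocal of the previous hidden layer's unit count (1/192), and leave it out of the L2 regularization.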

weight5=variable_with_weight_loss(shape=[192,10],stddev=1/192.0,w1=0.0)
bias5=tf.Variable(tf.constant(0.0,shape=[10]))
logits=tf.add(tf.matmul(local4,weight5),bias5)

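Next, compute the CNN's loss. Compute the softmax and the cross-entropy loss together, take the mean of the cross entropy with tf.reduce_mean, and add that mean to the losses collection with tf.add_to_collection. Finally, sum every loss in the collection with tf.add_n.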

def loss(logits,labels):
    labels=tf.cast(labels,tf.int64)
    cross_entropy=tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits,labels=labels,name='cross_entropy_per_example')
    cross_entropy_mean=tf.reduce_mean(cross_entropy,name='cross_entropy')
    tf.add_to_collection('losses',cross_entropy_mean)
    return tf.add_n(tf.get_collection('losses'),name='total_loss')
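As a sanity check on what this function returns, here is a minimal pure-Python sketch of the same calculation: per-example softmax cross entropy, averaged over the batch, plus the stored weight penalties. The function names and the toy inputs are assumptions for illustration; the real work is done by tf.nn.sparse_softmax_cross_entropy_with_logits.

```python
import math

def sparse_softmax_cross_entropy(logits, label):
    """-log(softmax(logits)[label]), computed stably via log-sum-exp."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]

def total_loss(batch_logits, batch_labels, weight_losses):
    """Mean cross entropy over the batch plus all L2 weight penalties."""
    per_example = [sparse_softmax_cross_entropy(z, y)
                   for z, y in zip(batch_logits, batch_labels)]
    cross_entropy_mean = sum(per_example) / len(per_example)
    return cross_entropy_mean + sum(weight_losses)

uniform = [0.0] * 10                               # a model that has learned nothing yet
print(sparse_softmax_cross_entropy(uniform, 3))    # equals ln(10)
print(total_loss([uniform, uniform], [0, 1], [0.5, 0.25]))
```

A network that guesses uniformly over the 10 classes has a cross entropy of ln(10), roughly 2.3, so a total loss far above that in the first steps mostly reflects the weight penalties.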

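Pass the logits node and label_holder to the loss function to get the final loss.

The optimizer is Adam, with the learning rate set to 1e-3.

Use tf.nn.in_top_k to compute the top-k accuracy of the output. Here we use top 1, that is, the accuracy of the class with the highest score.

Use tf.InteractiveSession to create the default session and initialize all the variables.

Start the queue-runner threads that feed the input pipeline.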

loss=loss(logits,label_holder)
train_op=tf.train.AdamOptimizer(1e-3).minimize(loss)
top_k_op=tf.nn.in_top_k(logits,label_holder,1)
sess=tf.InteractiveSession()
tf.global_variables_initializer().run()
tf.train.start_queue_runners()
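To see what top_k_op produces for a batch, here is a pure-Python sketch of the top-k test. It is an illustration rather than TensorFlow's implementation: a label counts as a hit when fewer than k classes score strictly higher than it. The function name in_top_k mirrors the TensorFlow op, but the scores and labels are made-up values.

```python
def in_top_k(scores, target, k=1):
    """True if the target class is among the k highest-scoring classes."""
    higher = sum(1 for s in scores if s > scores[target])
    return higher < k

batch_scores = [[0.1, 2.0, 0.3],
                [1.5, 0.2, 0.1],
                [0.0, 0.1, 0.9]]
batch_labels = [1, 2, 2]
hits = [in_top_k(s, y) for s, y in zip(batch_scores, batch_labels)]
print(hits, sum(hits))   # [True, False, True] 2
```

Summing the boolean hits over a batch gives the number of correct predictions, which is exactly how the evaluation loop later counts true_count.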

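Now for the actual training. In each step, first call the session's run method on images_train and labels_train to get one batch of training data, then feed that batch into the computation of train_op and loss. Record how long each step takes; every 10 steps, print the loss, the training speed in examples per second, and the time spent on one batch. Without a GPU this runs fairly slowly.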

for step in range(max_steps):
    start_time=time.time()
    image_batch,label_batch=sess.run([images_train,labels_train])
    _,loss_value=sess.run([train_op,loss],feed_dict={image_holder:image_batch,label_holder:label_batch})
    duration=time.time()-start_time
    if step % 10==0:
        examples_per_sec=batch_size/duration
        sec_per_batch=float(duration)
        format_str=('step %d,loss=%.2f(%.1f examples/sec; %.3f sec/batch)')
        print(format_str%(step,loss_value,examples_per_sec,sec_per_batch))
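A quick back-of-the-envelope calculation shows how much data this loop actually covers. The sketch below only uses the numbers already given in the article (3000 steps, batches of 128, 50,000 training images); the 0.25-second step duration is a hypothetical value used to show the throughput formula.

```python
max_steps = 3000
batch_size = 128
train_examples = 50000

examples_seen = max_steps * batch_size
epochs = examples_seen / float(train_examples)
print(examples_seen, epochs)   # 384000 7.68

def examples_per_sec(batch_size, duration):
    """Throughput figure printed every 10 steps in the training loop."""
    return batch_size / duration

print(examples_per_sec(batch_size, 0.25))   # hypothetical 0.25 s per step
```

So 3000 steps amount to fewer than 8 passes over the training set, and raising max_steps is the obvious knob to turn if you want to train longer.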

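Next, evaluate the model's accuracy on the test set. As in training, process the data one batch at a time, keep a count of the correct predictions, then compute and print the accuracy.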

import math

num_examples=10000
# float() keeps the division from truncating under Python 2
num_iter=int(math.ceil(num_examples/float(batch_size)))
true_count=0
total_sample_count=num_iter*batch_size
step=0
while step < num_iter:
    # Fetch a batch of test images together with their labels
    image_batch,label_batch=sess.run([images_test,labels_test])
    predictions=sess.run([top_k_op],feed_dict={image_holder:image_batch,label_holder:label_batch})
    true_count+=np.sum(predictions)
    step+=1
precision=float(true_count)/total_sample_count
print('precision @ 1=%.3f'%precision)
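One detail of this loop is worth working through: the number of evaluated samples is not exactly 10,000. The sketch below redoes the arithmetic with the article's numbers and adds a small helper, precision_at_1, that is a made-up name for the accumulate-and-divide step.

```python
import math

num_examples = 10000
batch_size = 128
num_iter = int(math.ceil(num_examples / float(batch_size)))
total_sample_count = num_iter * batch_size
print(num_iter, total_sample_count, total_sample_count - num_examples)   # 79 10112 112

def precision_at_1(batch_hits):
    """batch_hits: one list of booleans per batch, as produced by a top-1 check."""
    true_count = sum(sum(hits) for hits in batch_hits)
    total = sum(len(hits) for hits in batch_hits)
    return float(true_count) / total

print(precision_at_1([[True, False, True, True], [False, True, True, True]]))   # 0.75
```

Because 10,000 is not a multiple of 128, the loop runs 79 batches and scores 10,112 samples, so presumably about a hundred test images are counted twice. The effect on the reported precision is small, but it explains why the denominator is total_sample_count rather than num_examples.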
