Implementing Google InceptionNet V3 in TensorFlow (Forward-Pass Timing)
阿新 • Published: 2018-12-31
Google Inception V3 network architecture:
Type | Kernel size / stride (or note) | Input size |
---|---|---|
Convolution | 3*3 / 2 | 299 * 299 * 3 |
Convolution | 3*3 / 1 | 149 * 149 * 32 |
Convolution | 3*3 / 1 | 147 * 147 * 32 |
Pooling | 3*3 / 2 | 147 * 147 * 64 |
Convolution | 3*3 / 1 | 73 * 73 * 64 |
Convolution | 3*3 / 2 | 71 * 71 * 80 |
Convolution | 3*3 / 1 | 35 * 35 * 192 |
Inception module group | 3 Inception Modules | 35 * 35 * 288 |
Inception module group | 5 Inception Modules | 17 * 17 * 768 |
Inception module group | 3 Inception Modules | 8 * 8 * 1280 |
Pooling | 8*8 | 8 * 8 * 2048 |
Linear | logits | 1 * 1 * 2048 |
Softmax | classification output | 1 * 1 * 1000 |
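Ideas and tricks for designing CNNs in Inception V3:
(1) Factorization into small convolutions is effective: it reduces the number of parameters, mitigates overfitting, and adds nonlinearity, which increases the network's expressive power.
(2) From input to output, a convolutional network should gradually shrink the spatial size of the feature maps while increasing the number of output channels. In other words, it should simplify the spatial structure and convert spatial information into higher-level abstract features.
(3) Using several branches in an Inception Module to extract high-level features at different levels of abstraction is effective and enriches the network's expressive power.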
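The parameter saving claimed in idea (1) can be checked with a little arithmetic. The sketch below counts convolution weights (biases ignored) for a single layer and for its factorized replacement. The helper name `conv_params` and the choice of 192 input and output channels are illustrative assumptions, not values taken from the listing in this post.

```python
def conv_params(kernel_h, kernel_w, in_ch, out_ch):
    """Number of weights in one convolution layer, ignoring biases."""
    return kernel_h * kernel_w * in_ch * out_ch


C = 192  # illustrative channel count, assumed equal for input and output

# One 5x5 convolution versus two stacked 3x3 convolutions (same receptive field)
single_5x5 = conv_params(5, 5, C, C)
two_3x3 = 2 * conv_params(3, 3, C, C)

# One 7x7 convolution versus a 1x7 convolution followed by a 7x1 convolution
single_7x7 = conv_params(7, 7, C, C)
asym_7 = conv_params(1, 7, C, C) + conv_params(7, 1, C, C)

print("5x5 -> two 3x3 keeps %.1f%% of the weights" % (100.0 * two_3x3 / single_5x5))
print("7x7 -> 1x7 + 7x1 keeps %.1f%% of the weights" % (100.0 * asym_7 / single_7x7))
```

With equal input and output channel counts the ratios are 18/25 and 14/49 whatever value `C` takes. Each factorized version also inserts an extra activation between its two layers, which is where the added nonlinearity comes from.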
# coding:UTF-8
import tensorflow as tf
from datetime import datetime
import math
import time
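slim = tf.contrib.slim  # tf.contrib is only available in TensorFlow 1.x
# Helper that returns a truncated normal initializer with the given standard deviation
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)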
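######## Define a function that builds the default arguments for commonly used layer functions ########
# Defaults: the activation function, weight initializer, and normalizer for convolutions
def inception_v3_arg_scope(weight_decay=0.00004,  # weight decay for L2 regularization
                           stddev=0.1,  # default standard deviation for weight initialization
                           batch_norm_var_collection='moving_vars'):
    batch_norm_params = {  # parameter dictionary for batch normalization
        'decay': 0.9997,  # decay rate of the moving averages
        'epsilon': 0.001,  # small constant added to the variance to avoid division by zero
        'updates_collections': tf.GraphKeys.UPDATE_OPS,
        'variables_collections': {
            'beta': None,
            'gamma': None,
            'moving_mean': [batch_norm_var_collection],
            'moving_variance': [batch_norm_var_collection],
        }
    }
    # slim.arg_scope supplies default values for arguments of the listed functions,
    # here L2 weight regularization for slim.conv2d and slim.fully_connected.
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        weights_regularizer=slim.l2_regularizer(weight_decay)):
        # With slim.arg_scope, arguments need not be repeated at every call;
        # set them explicitly only where they differ from the defaults.
        # The nested slim.arg_scope sets further defaults for slim.conv2d only.
        with slim.arg_scope(
                [slim.conv2d],
                weights_initializer=trunc_normal(stddev),  # weight initializer
                activation_fn=tf.nn.relu,  # activation function
                normalizer_fn=slim.batch_norm,  # normalizer
                normalizer_params=batch_norm_params) as sc:  # normalizer arguments defined above
            return sc  # return the configured scope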
######## Define a function that builds the convolutional part of Inception V3 ########
def inception_v3_base(inputs, scope=None):
    '''
    Args:
        inputs: the input tensor
        scope: the variable scope for the network (defaults to 'InceptionV3')
    '''
    end_points = {}  # dictionary that stores key nodes for later use
    with tf.variable_scope(scope, 'InceptionV3', [inputs]):
        # Defaults for the three layer functions: stride 1 and VALID padding
        with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                            stride=1, padding='VALID'):
            # The network definition starts with the plain convolutional layers
            # that precede the Inception Modules.
            # 299 x 299 x 3
            # slim.conv2d arguments: input tensor, number of output channels,
            # kernel size, stride, and padding mode.
            net = slim.conv2d(inputs, 32, [3, 3], stride=2, scope='Conv2d_1a_3x3')
            # 149 x 149 x 32
            # Thanks to slim and slim.arg_scope, one line defines a convolutional layer.
            # This is more convenient than the several lines per layer used for AlexNet
            # or the dedicated helper function written for VGGNet.
            net = slim.conv2d(net, 32, [3, 3], scope='Conv2d_2a_3x3')
            # 147 x 147 x 32
            net = slim.conv2d(net, 64, [3, 3], padding='SAME', scope='Conv2d_2b_3x3')
            # 147 x 147 x 64
            net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_3a_3x3')
            # 73 x 73 x 64
            net = slim.conv2d(net, 80, [1, 1], scope='Conv2d_3b_1x1')
            # 73 x 73 x 80.
            net = slim.conv2d(net, 192, [3, 3], scope='Conv2d_4a_3x3')
            # 71 x 71 x 192.
            net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_5a_3x3')
            # 35 x 35 x 192.
            # The code above uses 5 convolutional layers and 2 pooling layers to
            # shrink the image and abstract its features.
        # Next come three consecutive Inception module groups, each containing
        # several Inception Modules. This part is the essence of Inception V3.
        # The modules inside one group are very similar but differ in some details.
        # Inception blocks
        # Defaults for all module groups: stride 1 and SAME padding for
        # convolution, max pooling, and average pooling.
        with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                            stride=1, padding='SAME'):
            # mixed: 35 x 35 x 256.
            # The first module group contains three similar Inception Modules.
            with tf.variable_scope('Mixed_5b'):  # first Inception Module, with four branches
                with tf.variable_scope('Branch_0'):  # branch 1: 1*1 convolution with 64 channels
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):  # branch 2: 1*1 convolution with 48 channels, then 5*5 convolution with 64 channels
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                with tf.variable_scope('Branch_2'):  # branch 3: 1*1 convolution with 64 channels, then two 3*3 convolutions with 96 channels
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):  # branch 4: 3*3 average pooling, then 1*1 convolution with 32 channels
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 32, [1, 1], scope='Conv2d_0b_1x1')
                # Concatenate the four branch outputs along axis 3, the channel axis
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
                # Every layer here uses stride 1 with SAME padding, so the spatial size
                # stays the same while the channel count grows. The four branches sum to
                # 64+64+96+32=256 channels, so the output tensor is 35*35*256.
                # All Inception Modules in the first group output 35*35 feature maps,
                # but the last two modules output 288 channels instead of 256.
            # mixed_1: 35 x 35 x 288.
            with tf.variable_scope('Mixed_5c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0b_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv_1_0c_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # mixed_2: 35 x 35 x 288.
            with tf.variable_scope('Mixed_5d'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # Second Inception module group, made of five Inception Modules;
            # the second through the fifth are similar in structure.
            # mixed_3: 17 x 17 x 768.
            with tf.variable_scope('Mixed_6a'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 384, [3, 3], stride=2,
                                           padding='VALID', scope='Conv2d_1a_1x1')  # stride 2 shrinks the feature map
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_1 = slim.conv2d(branch_1, 96, [3, 3], stride=2,
                                           padding='VALID', scope='Conv2d_1a_1x1')  # the feature map shrinks here as well
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',
                                               scope='MaxPool_1a_3x3')
                net = tf.concat([branch_0, branch_1, branch_2], 3)  # output is 17 x 17 x 768 (384 + 96 + 288 channels)
            # mixed4: 17 x 17 x 768.
            with tf.variable_scope('Mixed_6b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
                    # A 1*7 convolution followed by a 7*1 convolution stands in for a
                    # 7*7 convolution, with fewer parameters and less overfitting.
                    branch_1 = slim.conv2d(branch_1, 128, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    # This branch applies the factorized 7*7 convolution twice in a row
                    branch_2 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 128, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # mixed_5: 17 x 17 x 768.
            with tf.variable_scope('Mixed_6c'):
                with tf.variable_scope('Branch_0'):
                    # Each Inception Module refines the features once more, even when
                    # the output size stays the same. The rich mix of convolutions and
                    # nonlinearities helps the network's performance considerably.
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # mixed_6: 17 x 17 x 768.
            with tf.variable_scope('Mixed_6d'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # mixed_7: 17 x 17 x 768.
            with tf.variable_scope('Mixed_6e'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 192, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 192, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 192, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            end_points['Mixed_6e'] = net  # keep Mixed_6e in end_points as the input of the auxiliary classifier
            # The third Inception module group contains three Inception Modules
            # mixed_8: 8 x 8 x 1280.
            with tf.variable_scope('Mixed_7a'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_0 = slim.conv2d(branch_0, 320, [3, 3], stride=2,
                                           padding='VALID', scope='Conv2d_1a_3x3')  # stride 2 shrinks the feature map
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 192, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                    branch_1 = slim.conv2d(branch_1, 192, [3, 3], stride=2,
                                           padding='VALID', scope='Conv2d_1a_3x3')
                with tf.variable_scope('Branch_2'):  # pooling does not change the number of channels
                    branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',
                                               scope='MaxPool_1a_3x3')
                # The spatial size shrinks and the channel count grows, while the
                # total size of the tensor keeps decreasing.
                net = tf.concat([branch_0, branch_1, branch_2], 3)
            # mixed_9: 8 x 8 x 2048.
            with tf.variable_scope('Mixed_7b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = tf.concat([
                        slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_0b_1x3'),
                        slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_0b_3x1')], 3)
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(
                        branch_2, 384, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = tf.concat([
                        slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_0c_1x3'),
                        slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_0d_3x1')], 3)
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(
                        branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)  # the channel count grows to 2048
            # mixed_10: 8 x 8 x 2048.
            with tf.variable_scope('Mixed_7c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = tf.concat([
                        slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_0b_1x3'),
                        slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_0c_3x1')], 3)
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(
                        branch_2, 384, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = tf.concat([
                        slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_0c_1x3'),
                        slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_0d_3x1')], 3)
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(
                        branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
        return net, end_points
# This completes the core of Inception V3, that is, its convolutional part.
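'''
An important principle in designing Inception Net is that the image size keeps shrinking. Each Inception module
group aims to simplify the spatial structure while converting spatial information into higher-level abstract
features, that is, to move information from the spatial dimensions into the channel dimension, which also reduces
computation. An Inception Module combines a relatively simple feature abstraction (branch 1), more complex feature
abstractions (branches 2 and 3), and a structure-simplifying pooling layer (branch 4). These four kinds of feature
abstraction and transformation selectively preserve high-level features at different levels, which enriches the
network's expressive power as much as possible.
'''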
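The size comments in `inception_v3_base` can be reproduced with the standard output-size rules for the two padding modes: `(size - kernel) // stride + 1` for VALID and `ceil(size / stride)` for SAME. The sketch below traces the spatial size through the stem and the two downsampling modules, then sums the branch channels of each kind of module. The helper `out_size` and the two tables are written for this illustration; the kernel sizes, strides, and channel counts are copied from the listing above.

```python
def out_size(size, kernel, stride, padding):
    """Output length along one spatial axis for VALID or SAME padding."""
    if padding == 'VALID':
        return (size - kernel) // stride + 1
    if padding == 'SAME':
        return -(-size // stride)  # integer form of ceil(size / stride)
    raise ValueError('unknown padding: %r' % padding)


# (layer name, kernel, stride, padding) for every layer that can change the size
LAYERS = [
    ('Conv2d_1a_3x3', 3, 2, 'VALID'),
    ('Conv2d_2a_3x3', 3, 1, 'VALID'),
    ('Conv2d_2b_3x3', 3, 1, 'SAME'),
    ('MaxPool_3a_3x3', 3, 2, 'VALID'),
    ('Conv2d_3b_1x1', 1, 1, 'VALID'),
    ('Conv2d_4a_3x3', 3, 1, 'VALID'),
    ('MaxPool_5a_3x3', 3, 2, 'VALID'),
    ('Mixed_6a', 3, 2, 'VALID'),  # stride-2 branches of the module
    ('Mixed_7a', 3, 2, 'VALID'),  # stride-2 branches of the module
]

size = 299
sizes = [size]
for name, kernel, stride, padding in LAYERS:
    size = out_size(size, kernel, stride, padding)
    sizes.append(size)
    print('%-15s -> %d x %d' % (name, size, size))

# Output channels of each branch; tf.concat along the channel axis adds them up
BRANCH_CHANNELS = {
    'Mixed_5b': [64, 64, 96, 32],
    'Mixed_5d': [64, 64, 96, 64],
    'Mixed_6a': [384, 96, 288],  # the max-pool branch passes its 288 input channels through
    'Mixed_6e': [192, 192, 192, 192],
    'Mixed_7a': [320, 192, 768],  # the max-pool branch passes its 768 input channels through
    'Mixed_7c': [320, 384 + 384, 384 + 384, 192],
}
channels = {name: sum(parts) for name, parts in BRANCH_CHANNELS.items()}
for name in sorted(channels):
    print('%s: %d channels' % (name, channels[name]))
```

The trace makes the design principle above visible: the spatial size falls from 299 to 8 while the channel count rises from 3 to 2048.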
######## Global average pooling, Softmax, and Auxiliary Logits ########
def inception_v3(inputs,
                 num_classes=1000,  # number of output classes (the number of categories in the competition dataset)
                 is_training=True,  # training flag: batch normalization updates its statistics and dropout is active only in training
                 dropout_keep_prob=0.8,  # fraction of units kept by dropout
                 prediction_fn=slim.softmax,  # function used for the final classification
                 spatial_squeeze=True,  # whether to squeeze the output, i.e. drop size-1 dimensions (e.g. 5*3*1 becomes 5*3)
                 reuse=None,  # whether to reuse the network and its variables
                 scope='InceptionV3'):  # name of the variable scope
    with tf.variable_scope(scope, 'InceptionV3', [inputs, num_classes],
                           reuse=reuse) as scope:
        # Set is_training once for both batch normalization and dropout
        with slim.arg_scope([slim.batch_norm, slim.dropout],
                            is_training=is_training):
            # Build the convolutional part with the function defined above. It returns
            # the last layer's output net and the end_points dictionary of key nodes.
            net, end_points = inception_v3_base(inputs, scope=scope)
            # Auxiliary head logits: an auxiliary classifier that helps the final classification result
            with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                                stride=1, padding='SAME'):  # default stride 1 and SAME padding for convolution and pooling
                aux_logits = end_points['Mixed_6e']  # fetch Mixed_6e from end_points
                with tf.variable_scope('AuxLogits'):
                    # Average pooling after Mixed_6e shrinks the feature map from 17 x 17 to 5 x 5
                    aux_logits = slim.avg_pool2d(
                        aux_logits, [5, 5], stride=3, padding='VALID',
                        scope='AvgPool_1a_5x5')
                    # The 1*1 convolution reduces the channel count to 128
                    aux_logits = slim.conv2d(aux_logits, 128, [1, 1],
                                             scope='Conv2d_1b_1x1')
                    # Shape of feature map before the final layer.
                    aux_logits = slim.conv2d(
                        aux_logits, 768, [5, 5],
                        weights_initializer=trunc_normal(0.01),  # truncated normal with standard deviation 0.01
                        padding='VALID', scope='Conv2d_2a_5x5')
                    aux_logits = slim.conv2d(
                        aux_logits, num_classes, [1, 1], activation_fn=None,
                        normalizer_fn=None, weights_initializer=trunc_normal(0.001),  # the output becomes 1*1*1000
                        scope='Conv2d_2b_1x1')
                    if spatial_squeeze:  # tf.squeeze removes the two size-1 spatial dimensions (axes 1 and 2)
                        aux_logits = tf.squeeze(aux_logits, [1, 2], name='SpatialSqueeze')
                    end_points['AuxLogits'] = aux_logits  # store the auxiliary output in end_points
            # Main classification head
            # Final pooling and prediction
            with tf.variable_scope('Logits'):
                net = slim.avg_pool2d(net, [8, 8], padding='VALID',
                                      scope='AvgPool_1a_8x8')
                # 1 x 1 x 2048
                net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')
                end_points['PreLogits'] = net
                # 2048
                logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,  # num_classes (1000) output channels
                                     normalizer_fn=None, scope='Conv2d_1c_1x1')  # no activation function and no normalizer
                if spatial_squeeze:  # drop the size-1 spatial dimensions of the output tensor
                    logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')
            # 1000
            end_points['Logits'] = logits
            end_points['Predictions'] = prediction_fn(logits, scope='Predictions')  # Softmax turns the logits into class probabilities
    return logits, end_points  # return logits and end_points, which also holds the auxiliary node
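Both heads end with `tf.squeeze(..., [1, 2])`, which only works when the two spatial dimensions have already been reduced to 1. A quick check of the VALID-padding arithmetic confirms that this holds for a 299 x 299 input. The helper `valid_out` is a name chosen for this sketch; the kernel sizes and strides are those used in `inception_v3`.

```python
def valid_out(size, kernel, stride=1):
    """Output length along one spatial axis for VALID padding."""
    return (size - kernel) // stride + 1


# Auxiliary head, starting from the 17 x 17 x 768 output of Mixed_6e
aux_after_pool = valid_out(17, kernel=5, stride=3)  # AvgPool_1a_5x5
# Conv2d_1b_1x1 uses a 1x1 kernel, so the spatial size is unchanged
aux_final = valid_out(aux_after_pool, kernel=5)  # Conv2d_2a_5x5
# Conv2d_2b_1x1 again leaves the spatial size unchanged

# Main head, starting from the 8 x 8 x 2048 output of Mixed_7c.
# An 8x8 window fits an 8x8 map exactly once, whatever the stride.
main_final = valid_out(8, kernel=8)  # AvgPool_1a_8x8

print('aux head : 17 -> %d -> %d' % (aux_after_pool, aux_final))
print('main head: 8 -> %d' % main_final)
```

For other input resolutions the final maps would generally not be 1 x 1, so the fixed 8 x 8 pooling window and the squeeze over axes 1 and 2 tie this listing to 299 x 299 images.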
######## Measure the per-batch forward time of Inception V3 ########
def time_tensorflow_run(session, target, info_string):
    # Args:
    #   session: the TensorFlow session to run the computation under.
    #   target: the op to be timed.
    #   info_string: name of the test, used in the printed output.
    # num_batches is the module-level variable defined before this function is called.
    num_steps_burn_in = 10  # warm-up steps: the first iterations are skewed by GPU memory loading, cache misses, etc., so they are skipped
    total_duration = 0.0  # running sum of durations
    total_duration_squared = 0.0  # running sum of squared durations, used for the variance below
    for i in range(num_batches + num_steps_burn_in):  # total number of iterations
        start_time = time.time()  # record the start time
        _ = session.run(target)  # run one iteration
        duration = time.time() - start_time  # elapsed time of this iteration
        if i >= num_steps_burn_in:
            if not i % 10:
                print('%s: step %d, duration = %.3f' %
                      (datetime.now(), i - num_steps_burn_in, duration))
            total_duration += duration  # accumulate for the mean and standard deviation
            total_duration_squared += duration * duration
    mn = total_duration / num_batches  # mean time per iteration
    vr = total_duration_squared / num_batches - mn * mn  # variance
    sd = math.sqrt(vr)  # standard deviation
    print('%s: %s across %d steps, %.3f +/- %.3f sec / batch' %
          (datetime.now(), info_string, num_batches, mn, sd))
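The timing loop stores only a running sum and a running sum of squares, and recovers the spread from the identity Var(x) = E[x^2] - (E[x])^2. The sketch below isolates that computation in plain Python and compares it with the standard library. The function name `mean_and_std` and the sample durations are made up for illustration; the clamp to zero is an added safeguard that the original loop does not have.

```python
import math
import statistics


def mean_and_std(durations):
    """Mean and population standard deviation from running sums."""
    n = len(durations)
    total = sum(durations)
    total_squared = sum(d * d for d in durations)
    mean = total / n
    variance = total_squared / n - mean * mean
    # Rounding can push a tiny variance slightly below zero, which would make
    # math.sqrt raise ValueError, so clamp it first.
    return mean, math.sqrt(max(variance, 0.0))


samples = [0.151, 0.149, 0.150, 0.152, 0.148]  # made-up per-batch durations in seconds
mean, std = mean_and_std(samples)
print('%.3f +/- %.3f sec / batch' % (mean, std))
print('statistics.pstdev gives %.6f' % statistics.pstdev(samples))
```

Note that this is the population standard deviation (division by n), which matches `statistics.pstdev` rather than `statistics.stdev`.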
batch_size = 32  # the network is large, so the batch size stays at 32 to avoid running out of GPU memory
height, width = 299, 299  # image size
inputs = tf.random_uniform((batch_size, height, width, 3))  # random image data used as input
# The scope carries the default batch normalization parameters, activation function, and weight initializer
with slim.arg_scope(inception_v3_arg_scope()):
    logits, end_points = inception_v3(inputs, is_training=False)  # build the network and get logits and end_points
init = tf.global_variables_initializer()  # initialize all model parameters
sess = tf.Session()  # create the session
sess.run(init)
num_batches = 100  # number of batches to time
time_tensorflow_run(sess, logits, "Forward")
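'''
Although the input image has 78% more pixels than VGGNet's 224*224 input, the forward pass is faster
than VGGNet's. This is mainly due to the smaller number of parameters: Inception V3 has many more
parameters than the 7 million of Inception V1, yet still fewer than half of AlexNet's 60 million, and
far fewer than VGGNet's 140 million. The whole network needs about 5 billion floating-point operations,
considerably more than the 1.5 billion of Inception V1 but modest compared with VGGNet. This relatively
low computational cost makes Inception V3 very practical: it can easily be deployed on an ordinary
server to provide fast responses, or even on a mobile phone for real-time image recognition.
'''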
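The 78% figure refers to the number of pixels, not to the side length, as a two-line check shows. The variable names below are chosen for this sketch.

```python
inception_pixels = 299 * 299  # input resolution used in this post
vgg_pixels = 224 * 224  # VGGNet input resolution

extra_pixels = float(inception_pixels) / vgg_pixels - 1.0
extra_side = 299.0 / 224 - 1.0

print('pixel count: %.0f%% larger' % (100 * extra_pixels))
print('side length: %.0f%% larger' % (100 * extra_side))
```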