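Examples of loading a trained model in TensorFlow
1. A few TensorFlow basics first
The details are not covered in depth here; we only learn by example.
1.1 Basic use of tf.reshape():
tf.reshape(tensor, shape, name=None)
Reshaping a matrix is a common operation, and in TensorFlow it can be done in several ways, for example:
1.tf.reshape
tf.reshape(L3, [-1, W4.get_shape().as_list()[0]])
2.object.reshape (the reshape method of the array object itself)
mnist.test.images.reshape(-1, 28, 28, 1)
Example: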
import tensorflow as tf
#import numpy as np
# tensor 't' is [1, 2, 3, 4, 5, 6, 7, 8, 9]
# tensor 't' has shape [9]
t1=[1, 2, 3, 4, 5, 6, 7, 8, 9]
print('t:',t1)
print(tf.reshape(t1, [3, 3]))
with tf.Session() as sess:
    print(sess.run(tf.reshape(t1, [3, 3])))
print('----------------------')
# tensor 't' is [[[1, 1], [2, 2]],
# [[3, 3], [4, 4]] ]
# tensor 't' has shape [2, 2, 2]
t2=[[[1, 1], [2, 2]],
[[3, 3], [4, 4]]]
print('t:',t2)
print(tf.reshape(t2, [2,4]))
with tf.Session() as sess:
    print(sess.run(tf.reshape(t2, [2, 4])))
print('----------------------')
# tensor 't' is [[[1, 1, 1],
# [2, 2, 2]],
# [[3, 3, 3],
# [4, 4, 4]] ,
# [[5, 5, 5],
# [6, 6, 6]]]
# tensor 't' has shape [3, 2, 3]
# pass '[-1]' to flatten 't'
t3=[[[1, 1, 1],
[2, 2, 2]],
[[3, 3, 3],
[4, 4, 4]],
[[5, 5, 5],
[6, 6, 6]]]
print('t:',t3)
print(tf.reshape(t3, [-1]))
with tf.Session() as sess:
    print(sess.run(tf.reshape(t3, [-1])),'\n')
    # -1 can also be used to infer the shape
    # -1 is inferred to be 9:
    print(sess.run(tf.reshape(t3, [2,-1])),'\n')
    # -1 is inferred to be 2:
    print(sess.run(tf.reshape(t3, [-1,9])),'\n')
    # -1 is inferred to be 3:
    print(sess.run(tf.reshape(t3, [2,-1,3])),'\n')
    # -1 is inferred to be 1:
    print(sess.run(tf.reshape(t3, [-1,3, 2, 3])))
Output:
t: [1, 2, 3, 4, 5, 6, 7, 8, 9]
Tensor("Reshape_47:0", shape=(3, 3), dtype=int32)
[[1 2 3]
[4 5 6]
[7 8 9]]
----------------------
t: [[[1, 1], [2, 2]], [[3, 3], [4, 4]]]
Tensor("Reshape_49:0", shape=(2, 4), dtype=int32)
[[1 1 2 2]
[3 3 4 4]]
----------------------
t: [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]], [[5, 5, 5], [6, 6, 6]]]
Tensor("Reshape_51:0", shape=(18,), dtype=int32)
[1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6]
[[1 1 1 2 2 2 3 3 3]
[4 4 4 5 5 5 6 6 6]]
[[1 1 1 2 2 2 3 3 3]
[4 4 4 5 5 5 6 6 6]]
[[[1 1 1]
[2 2 2]
[3 3 3]]
[[4 4 4]
[5 5 5]
[6 6 6]]]
[[[[1 1 1]
[2 2 2]]
[[3 3 3]
[4 4 4]]
[[5 5 5]
[6 6 6]]]]
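The -1 inference seen in the output above can be sketched without TensorFlow. The helper infer_shape below is a hypothetical name introduced only for illustration; it is not part of any library and merely mirrors the arithmetic: the missing dimension is the total element count divided by the product of the known dimensions.

```python
from math import prod

def infer_shape(num_elements, shape):
    """Resolve a single -1 entry of a target shape, or raise ValueError."""
    if shape.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    # Product of the dimensions that are given explicitly (1 if there are none)
    known = prod(d for d in shape if d != -1)
    if -1 not in shape:
        if known != num_elements:
            raise ValueError("shape does not match the number of elements")
        return list(shape)
    if known == 0 or num_elements % known != 0:
        raise ValueError("cannot infer the -1 dimension")
    return [num_elements // known if d == -1 else d for d in shape]

# t3 in the example has 3 * 2 * 3 = 18 elements
for target in ([-1], [2, -1], [-1, 9], [2, -1, 3], [-1, 3, 2, 3]):
    print(target, '->', infer_shape(18, target))
```

A target such as [4, -1] raises an error here because 18 is not divisible by 4; a real reshape rejects such a shape for the same reason.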
2. Examples of training and saving a model
2.1 Linear fit
import tensorflow as tf
import numpy as np
# Train the model
def train_model():
    # Generate synthetic data
    x_data = np.random.rand(100).astype(np.float32)
    print ('x_data:',x_data)
    y_data = x_data * 0.1 + 0.2
    print ('y_data:',y_data)
    # Define the weight and the bias
    W = tf.Variable(tf.random_uniform([1], -20.0, 20.0), dtype=tf.float32, name='w')
    b = tf.Variable(tf.random_uniform([1], -10.0, 10.0), dtype=tf.float32, name='b')
    # Compute the linear output
    y = W * x_data + b
    # Define the loss function
    loss = tf.reduce_mean(tf.square(y - y_data))
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
    # Saver: max_to_keep=4 means only the 4 most recent checkpoints are kept
    saver = tf.train.Saver(max_to_keep=4)
    # Open a session and train the model
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print ("------------------------------------------------------")
        print ("before the train, the W is %6f, the b is %6f" % (sess.run(W), sess.run(b)))
        for epoch in range(300):
            if epoch % 10 == 0:
                print ("------------------------------------------------------")
                print ("after epoch %d, the loss is %6f" % (epoch, sess.run(loss)))
                print ("the W is %f, the b is %f" % (sess.run(W), sess.run(b)))
                saver.save(sess, "model/my-model", global_step=epoch)
                print ("save the model")
            sess.run(train_step)
        print ("------------------------------------------------------")
# Load the model
def load_model():
    with tf.Session() as sess:
        # import_meta_graph takes the path of the .meta file
        saver = tf.train.import_meta_graph('model/my-model-290.meta')
        # latest_checkpoint reads the checkpoint file, so pass the directory that contains it, not the file itself
        saver.restore(sess, tf.train.latest_checkpoint("model"))
        # saver.restore(sess, tf.train.latest_checkpoint("model/checkpoint"))
        print (sess.run('w:0'))
        print (sess.run('b:0'))
# Train the model
#train_model()
# Load the model
load_model()
Result:
INFO:tensorflow:Restoring parameters from model/my-model-290
[0.09999993]
[0.20000005]
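The result above omits the training output. In practice you train first, save the model, and only then load it to run on test data. The data here is randomly generated, so the accuracy itself does not matter.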
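As a cross-check on the restored values (W close to 0.1, b close to 0.2), the same fit can be sketched in plain Python. This is only an illustration under stated assumptions: fit_line is a hypothetical helper, the gradients of the mean squared error are written out by hand instead of using an optimizer, and the random seed is fixed so the run is repeatable.

```python
import random

def fit_line(steps=300, learning_rate=0.5, seed=0):
    """Gradient descent on mean((w*x + b - y)^2) for y = 0.1*x + 0.2."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(100)]   # 100 samples in [0, 1)
    ys = [x * 0.1 + 0.2 for x in xs]
    w = rng.uniform(-20.0, 20.0)              # same init ranges as the listing
    b = rng.uniform(-10.0, 10.0)
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Hand-derived gradients of the mean squared error
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = fit_line()
print("W = %.6f, b = %.6f" % (w, b))
```

With 300 steps at learning rate 0.5 the parameters should land very close to the true slope and intercept, which is consistent with the values restored from the checkpoint.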
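Points to note:
- When creating the saver you can specify which variables to save; if none are specified, all of them are saved.
- When creating the saver you can limit how many checkpoints are kept: with max_to_keep=4, only the 4 most recent checkpoints remain on disk.
- saver.save() accepts a global_step argument, which records the step at which the model was saved and is appended to the file name.
- If you do not want to save every variable, pass the variables to save as a list or a dict when creating the saver instance, e.g.:
w1 = tf.Variable(tf.random_normal(shape=[2]), name='w1')
w2 = tf.Variable(tf.random_normal(shape=[5]), name='w2')
saver = tf.train.Saver([w1,w2])
- After the program finishes, four checkpoints remain. Each one consists of three kinds of file: .meta stores the network structure, while .data and .index store the trained parameters. A single checkpoint file next to them records which checkpoint is the latest.
- The .meta file is a protocol buffer that stores the complete TensorFlow graph, including variables, operations and collections.
- import_meta_graph takes the name of the .meta file. At restore time, tf.train.latest_checkpoint reads the checkpoint file, so pass only the directory that contains checkpoint, not the file itself. Otherwise it returns None and restore fails with "ValueError: Can't load save_path when it is None.".
- It is best to give tensors a name when defining them, as in the code above:
name='w'
- saver = tf.train.Saver(keep_checkpoint_every_n_hours=2) does not save on a timer. In addition to the most recent max_to_keep checkpoints, it keeps one checkpoint for every 2 hours of training.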
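The retention rule and the directory-versus-file pitfall from the notes above can be sketched without TensorFlow. Both helpers below are hypothetical stand-ins, not library functions: simulate_saves mimics max_to_keep, and read_latest imitates tf.train.latest_checkpoint under the assumption that the checkpoint file is a small text file with a model_checkpoint_path line.

```python
import os
import tempfile
from collections import deque

def simulate_saves(steps, max_to_keep):
    """Return the global_step values whose checkpoints would survive."""
    kept = deque(maxlen=max_to_keep)  # the oldest entry is dropped automatically
    for step in steps:
        kept.append(step)
    return list(kept)

def read_latest(checkpoint_dir):
    """Parse the 'checkpoint' file inside checkpoint_dir; None if it is absent."""
    state_file = os.path.join(checkpoint_dir, "checkpoint")
    if not os.path.isfile(state_file):
        return None
    with open(state_file) as f:
        for line in f:
            if line.startswith("model_checkpoint_path:"):
                name = line.split(":", 1)[1].strip().strip('"')
                return os.path.join(checkpoint_dir, name)
    return None

# Saving every 10 epochs over 300 epochs with max_to_keep=4
kept = simulate_saves(range(0, 300, 10), max_to_keep=4)
print(kept)  # [260, 270, 280, 290]

with tempfile.TemporaryDirectory() as model_dir:
    with open(os.path.join(model_dir, "checkpoint"), "w") as f:
        f.write('model_checkpoint_path: "my-model-%d"\n' % kept[-1])
    # Passing the directory finds the latest checkpoint prefix
    print(read_latest(model_dir))
    # Passing the file itself finds nothing, which is what leads to the ValueError
    print(read_latest(os.path.join(model_dir, "checkpoint")))
```

The surviving steps match the listing, where the newest checkpoint is my-model-290.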
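2.2 A simple convolutional neural network
The code below defines a simple convolutional neural network with two convolutional layers, two pooling layers and two fully connected layers. The data it loads is meaningless: it simulates 10 RGB images of 32x32 pixels in 4 classes, 0, 1, 2 and 3. The goal here is to learn how to save and load a model, so neither the origin of the data nor the accuracy matters.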
import tensorflow as tf
import numpy as np
import os
# Load the custom training set
def load_data(resultpath):
    datapath = os.path.join(resultpath, "data10_4.npz")
    # Load the data if it already exists
    if os.path.exists(datapath):
        data = np.load(datapath)
        # Arrays are retrieved by the keys they were saved under
        X, Y = data["X"], data["Y"]
    else:
        # The data is meaningless: it simulates 10 RGB images of 32x32 in 4 classes: 0, 1, 2, 3
        # Reshape 30720 numbers into a 10*32*32*3 tensor
        X = np.array(np.arange(30720)).reshape(10, 32, 32, 3)
        Y = [0, 0, 1, 1, 2, 2, 3, 3, 2, 0]
        X = X.astype('float32')
        Y = np.array(Y)
        # Save the data in .npz format
        np.savez(datapath, X=X, Y=Y)
        print('Saved dataset to', datapath)
    # A convenient way to format printed output
    print('X_shape:{}\nY_shape:{}'.format(X.shape, Y.shape))
    return X, Y
# Build the convolutional network: two convolutional layers, two pooling layers and two fully connected layers.
def define_model(x):
    x_image = tf.reshape(x, [-1, 32, 32, 3])
    print ('x_image.shape:',x_image.shape)
    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial, name="w")
    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial, name="b")
    # 2-D convolution with stride 1; SAME padding keeps the spatial size
    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
    # 3x3 max pooling with stride 3
    def max_pool_2d(x):
        return tf.nn.max_pool(x, ksize=[1, 3, 3, 1], strides=[1, 3, 3, 1], padding='SAME')
    with tf.variable_scope("conv1"): # [-1,32,32,3]
        weights = weight_variable([3, 3, 3, 32])
        biases = bias_variable([32])
        conv1 = tf.nn.relu(conv2d(x_image, weights) + biases)
        pool1 = max_pool_2d(conv1) # [-1,11,11,32]
    with tf.variable_scope("conv2"):
        weights = weight_variable([3, 3, 32, 64])
        biases = bias_variable([64])
        conv2 = tf.nn.relu(conv2d(pool1, weights) + biases)
        pool2 = max_pool_2d(conv2) # [-1,4,4,64]
    with tf.variable_scope("fc1"):
        weights = weight_variable([4 * 4 * 64, 128]) # [-1,1024]
        biases = bias_variable([128])
        fc1_flat = tf.reshape(pool2, [-1, 4 * 4 * 64])
        fc1 = tf.nn.relu(tf.matmul(fc1_flat, weights) + biases)
        # keep_prob=0.5 is fixed in the graph, so dropout stays active when the saved model is loaded for prediction
        fc1_drop = tf.nn.dropout(fc1, 0.5) # [-1,128]
    with tf.variable_scope("fc2"):
        weights = weight_variable([128, 4])
        biases = bias_variable([4])
        fc2 = tf.matmul(fc1_drop, weights) + biases # [-1,4]
    return fc2
# Train the model
def train_model():
    # Placeholders for the training data
    x = tf.placeholder(tf.float32, shape=[None, 32, 32, 3], name="x")
    y_ = tf.placeholder('int64', shape=[None], name="y_")
    # Learning rate
    initial_learning_rate = 0.001
    # Define the network and run the forward pass to get the predicted output
    y_fc2 = define_model(x)
    # One-hot labels for the training set
    y_label = tf.one_hot(y_, 4, name="y_labels")
    # Define the loss function
    loss_temp = tf.losses.softmax_cross_entropy(onehot_labels=y_label, logits=y_fc2)
    cross_entropy_loss = tf.reduce_mean(loss_temp)
    # Optimizer used for training
    train_step = tf.train.AdamOptimizer(learning_rate=initial_learning_rate, beta1=0.9, beta2=0.999,
                                        epsilon=1e-08).minimize(cross_entropy_loss)
    # True where the prediction matches the label, False otherwise
    correct_prediction = tf.equal(tf.argmax(y_fc2, 1), tf.argmax(y_label, 1))
    # Cast correct_prediction to tf.float32 and take the mean
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    # Saver: keep at most 4 checkpoints
    saver = tf.train.Saver(max_to_keep=4)
    # Add the prediction and the accuracy to named collections so they can be fetched after loading
    tf.add_to_collection("predict", y_fc2)
    tf.add_to_collection("acc", accuracy )
    # Open a session
    with tf.Session() as sess:
        # Initialize all variables
        sess.run(tf.global_variables_initializer())
        print ("------------------------------------------------------")
        # Load the training data; it is synthetic, since the aim is to learn saving and loading
        X, Y = load_data("model1/") # this folder must be created beforehand
        X = np.multiply(X, 1.0 / 255.0)
        for epoch in range(200):
            if epoch % 10 == 0:
                print ("------------------------------------------------------")
                train_accuracy = accuracy.eval(feed_dict={x: X, y_: Y})
                train_loss = cross_entropy_loss.eval(feed_dict={x: X, y_: Y})
                print ("after epoch %d, the loss is %6f" % (epoch, train_loss))
                # The accuracy here is computed over the whole training set
                print ("after epoch %d, the acc is %6f" % (epoch, train_accuracy))
                saver.save(sess, "model1/my-model", global_step=epoch)
                print ("save the model")
            train_step.run(feed_dict={x: X, y_: Y})
        print ("------------------------------------------------------")
# Load the model
def load_model():
    # Build test data: simulate 2 RGB images of 32x32
    X = np.array(np.arange(6144, 12288)).reshape(2, 32, 32, 3)
    Y = [3, 1]
    Y = np.array(Y)
    X = X.astype('float32')
    X = np.multiply(X, 1.0 / 255.0)
    with tf.Session() as sess:
        # Load the meta graph and the weights
        saver = tf.train.import_meta_graph('model1/my-model-190.meta')
        saver.restore(sess, tf.train.latest_checkpoint("model1/"))
        # Fetch the weights by name
        graph = tf.get_default_graph()
        fc2_w = graph.get_tensor_by_name("fc2/w:0")
        fc2_b = graph.get_tensor_by_name("fc2/b:0")
        print ("------------------------------------------------------")
        print ('fc2_w:',sess.run(fc2_w))
        print ("#######################################")
        print ('fc2_b:',sess.run(fc2_b))
        print ("------------------------------------------------------")
        #input_x = graph.get_operation_by_name("x").outputs[0]
        # Predicted output
        feed_dict = {"x:0":X, "y_:0":Y}
        y = graph.get_tensor_by_name("y_labels:0")
        yy = sess.run(y, feed_dict)
        print ('yy:',yy)
        print ("the answer is: ", sess.run(tf.argmax(yy, 1)))
        print ("------------------------------------------------------")
        # get_collection returns a list, hence the [0] below
        pred_y = tf.get_collection("predict")
        print('i am here..1')
        pred = sess.run(pred_y, feed_dict)[0]
        print ('pred:',pred, '\n')
        pred = sess.run(tf.argmax(pred, 1))
        print ("the predict is: ", pred)
        print ("------------------------------------------------------")
        acc = tf.get_collection("acc")
        #acc = graph.get_operation_by_name("acc")
        acc = sess.run(acc, feed_dict)
        #print(acc.eval())
        print ("the accuracy is: ", acc)
        print ("------------------------------------------------------")
# Train the model
train_model()
# Load the model
load_model()
Note the order: train first, and only after training has finished load the trained model for testing.
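The shape comments in define_model ([-1,11,11,32], [-1,4,4,64] and the 4 * 4 * 64 input of fc1) follow from the SAME-padding rule that the output size is the input size divided by the stride, rounded up. A minimal sketch of that arithmetic, where same_pool_output is a hypothetical helper introduced only for this check:

```python
import math

def same_pool_output(size, stride):
    """Spatial output size of a pooling layer with padding='SAME'."""
    return math.ceil(size / stride)

size = 32  # the stride-1 SAME convolutions leave 32x32 unchanged
for layer in ("pool1", "pool2"):
    size = same_pool_output(size, stride=3)
    print(layer, "spatial size:", size)   # 11, then 4

flat = size * size * 64  # 64 channels after the second convolution
print("flattened features:", flat)        # 1024
```

If the pooling stride or the input resolution changes, the first dimension of the fc1 weight matrix has to change with it.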
Output of training alone:
x_image.shape: (?, 32, 32, 3)
------------------------------------------------------
X_shape:(10, 32, 32, 3)
Y_shape:(10,)
------------------------------------------------------
after epoch 0, the loss is 37.972336
after epoch 0, the acc is 0.200000
save the model
------------------------------------------------------
after epoch 10, the loss is 55.470387
after epoch 10, the acc is 0.100000
save the model
------------------------------------------------------
after epoch 20, the loss is 17.129293
after epoch 20, the acc is 0.200000
save the model
------------------------------------------------------
after epoch 30, the loss is 15.748987
after epoch 30, the acc is 0.300000
save the model
------------------------------------------------------
after epoch 40, the loss is 4.500556
after epoch 40, the acc is 0.300000
save the model
------------------------------------------------------
after epoch 50, the loss is 2.675602
after epoch 50, the acc is 0.200000
save the model
------------------------------------------------------
after epoch 60, the loss is 2.377462
after epoch 60, the acc is 0.500000
save the model
------------------------------------------------------
after epoch 70, the loss is 1.419432
after epoch 70, the acc is 0.500000
save the model
------------------------------------------------------
...
...
...
after epoch 130, the loss is 1.356822
after epoch 130, the acc is 0.500000
save the model
------------------------------------------------------
after epoch 140, the loss is 1.361622
after epoch 140, the acc is 0.200000
save the model
------------------------------------------------------
after epoch 150, the loss is 1.204934
after epoch 150, the acc is 0.300000
save the model
------------------------------------------------------
after epoch 160, the loss is 1.273999
after epoch 160, the acc is 0.300000
save the model
------------------------------------------------------
after epoch 170, the loss is 1.213519
after epoch 170, the acc is 0.400000
save the model
------------------------------------------------------
after epoch 180, the loss is 1.276478
after epoch 180, the acc is 0.300000
save the model
------------------------------------------------------
after epoch 190, the loss is 1.162433
after epoch 190, the acc is 0.300000
save the model
------------------------------------------------------
Output of testing alone:
INFO:tensorflow:Restoring parameters from model1/my-model-190
------------------------------------------------------
fc2_w: [[ 0.09413899 -0.07282051 0.02397597 0.05508222]
[-0.05514605 -0.03894351 -0.0548727 -0.02125386]
[ 0.06236398 -0.00028329 0.13300249 0.06448492]
[-0.0921673 0.00342558 0.10539673 -0.02442357]
[-0.04699677 0.11520271 -0.04514726 -0.13220425]
...
...
...
[ 0.08583067 -0.06123111 0.10699942 0.03429044]
[-0.05737718 0.0714161 -0.04370898 -0.0397063 ]
[ 0.00849419 -0.04352335 0.01004444 0.03862172]]
#######################################
fc2_b: [0.12246324 0.11658503 0.10220832 0.06499074]
------------------------------------------------------
yy: [[0. 0. 0. 1.]
[0. 1. 0. 0.]]
the answer is: [3 1]
------------------------------------------------------
i am here..1
pred: [[ 0.6232525 0.18511544 0.08325944 -0.4809047 ]
[ 0.12246324 0.11658503 0.10220832 0.06499074]]
the predict is: [0 0]
------------------------------------------------------
the accuracy is: [0.0, 0.0]
------------------------------------------------------
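The last lines of the test output can be re-derived by hand from the printed logits. The sketch below copies the pred values and the labels [3, 1] from the output above and recomputes the argmax and the accuracy in plain Python. It only re-checks those printed numbers and makes no claim about what another run would give.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=lambda i: row[i])

# Logits and labels copied from the test output above
pred = [[0.6232525, 0.18511544, 0.08325944, -0.4809047],
        [0.12246324, 0.11658503, 0.10220832, 0.06499074]]
labels = [3, 1]

predicted = [argmax(row) for row in pred]
accuracy = sum(p == y for p, y in zip(predicted, labels)) / len(labels)
print("predicted:", predicted)  # [0, 0]
print("accuracy:", accuracy)    # 0.0
```

Neither prediction matches its label, so an accuracy of 0.0 is what these logits imply. Since the training data is meaningless, this says nothing about whether saving and loading worked.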
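2.3 Loading a moving-average model and renaming variables
When a model is trained with gradient descent, a shadow variable can be maintained for each weight and updated every time the weight is updated. As training proceeds, the shadow variable settles near the value of the real weight. Using the shadow values in place of the real values at prediction time can give better results. In the author's experience the moving-average model helps mainly with gradient descent and not with other optimization algorithms, and there is no accepted explanation for this yet. Since many optimization methods exist, it can still serve as one way to improve robustness.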
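The shadow-variable update can be written out in a few lines. This sketch assumes the standard exponential-moving-average rule, shadow = decay * shadow + (1 - decay) * value, with the decay of 0.99 used in the Part5 listing. The function name ema_update is made up for illustration.

```python
def ema_update(shadow, value, decay):
    """One exponential-moving-average step for a shadow variable."""
    return decay * shadow + (1.0 - decay) * value

# The variable starts at 0, is assigned 10, then the average is updated once
shadow = 0.0
shadow = ema_update(shadow, 10.0, decay=0.99)
print(shadow)  # close to 0.1, in line with the 0.099999905 printed by Part5

# With many more updates the shadow drifts toward the variable's value
for _ in range(2000):
    shadow = ema_update(shadow, 10.0, decay=0.99)
print(shadow)
```

A decay close to 1 makes the shadow variable move slowly, which is why a single update only moves it 1% of the way from 0 toward 10.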
# Mind the IDE's current working directory when running; it is best to restart the console before each part so the output is more reliable
# Part1: save and load a neural network model with the tf.train.Saver class
# Mind the current working directory when running this part
import tensorflow as tf
v1 = tf.Variable(tf.constant(1.0, shape=[1]), name="v1")
v2 = tf.Variable(tf.constant(2.0, shape=[1]), name="v2")
result = v1 + v2
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, "Model/model.ckpt")
# Part3: to avoid redefining the operations of the graph, load the persisted graph directly
import tensorflow as tf
saver = tf.train.import_meta_graph("Model/model.ckpt.meta")
graph=tf.get_default_graph()
with tf.Session() as sess:
    saver.restore(sess, "./Model/model.ckpt") # note how the path is written
    print(sess.run(graph.get_tensor_by_name("add:0"))) # [ 3.]
# Part4: tf.train.Saver also supports renaming variables when saving and loading
import tensorflow as tf
# The names declared here differ from the variable names in the saved model
u1 = tf.Variable(tf.constant(1.0, shape=[1]), name="other-v1")
u2 = tf.Variable(tf.constant(2.0, shape=[1]), name="other-v2")
result = u1 + u2
# Declaring the Saver without arguments would fail because the variables cannot be found
# A dict renames the variables: {"name of the saved variable": variable to load into}
# The variable saved under the name v1 is now loaded into u1 (whose name is other-v1)
saver = tf.train.Saver({"v1": u1, "v2": u2})
with tf.Session() as sess:
    saver.restore(sess, "./Model/model.ckpt")
    print(sess.run(result)) # [ 3.]
#INFO:tensorflow:Restoring parameters from ./Model/model.ckpt
#[3.]
#INFO:tensorflow:Restoring parameters from ./Model/model.ckpt
#[3.]
# Part5: save a moving-average model
import tensorflow as tf
v = tf.Variable(0, dtype=tf.float32, name="v")
for variables in tf.global_variables():
    print(variables.name) # v:0
print('...........')
ema = tf.train.ExponentialMovingAverage(0.99)
maintain_averages_op = ema.apply(tf.global_variables())
for variables in tf.global_variables():
    print(variables.name) # v:0
                          # v/ExponentialMovingAverage:0
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.assign(v, 10))
    sess.run(maintain_averages_op)
    saver.save(sess, "Model/model_ema.ckpt")
    print('here..1:',sess.run([v, ema.average(v)])) # [10.0, 0.099999905]
#v:0
#...........
#v:0
#v/ExponentialMovingAverage:0
#here..1: [10.0, 0.099999905]
# Part6: read a variable's moving average directly by renaming the variable
import tensorflow as tf
v = tf.Variable(0, dtype=tf.float32, name="v")
# {"name of the saved variable": variable to load into}
saver = tf.train.Saver({"v/ExponentialMovingAverage": v})
with tf.Session() as sess:
    saver.restore(sess, "./Model/model_ema.ckpt")
    print('here..2:',sess.run(v)) # 0.0999999
# INFO:tensorflow:Restoring parameters from ./Model/model_ema.ckpt
# here..2: 0.099999905
# Part7: get the renaming dict from the variables_to_restore() function of tf.train.ExponentialMovingAverage
import tensorflow as tf
v = tf.Variable(0, dtype=tf.float32, name="v")
# The variable name here must match the name of the saved variable
ema = tf.train.ExponentialMovingAverage(0.99)
print(ema.variables_to_restore())
# {'v/ExponentialMovingAverage': <tf.Variable 'v:0' shape=() dtype=float32_ref>}
# The v in the key comes from the name="v" of the variable above
saver = tf.train.Saver(ema.variables_to_restore())
with tf.Session() as sess:
    saver.restore(sess, "./Model/model_ema.ckpt")
    print(sess.run(v)) # 0.0999999
#{'v/ExponentialMovingAverage': <tf.Variable 'v:0' shape=() dtype=float32_ref>}
#INFO:tensorflow:Restoring parameters from ./Model/model_ema.ckpt
#0.099999905
# Part8: use convert_variables_to_constants to store the graph's variables and their values as constants in a single file
import tensorflow as tf
from tensorflow.python.framework import graph_util
v1 = tf.Variable(tf.constant(1.0, shape=[1]), name="v1")
v2 = tf.Variable(tf.constant(2.0, shape=[1]), name="v2")
result = v1 + v2
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Export the GraphDef part of the current graph, i.e. the computation from the input layer to the output layer
    graph_def = tf.get_default_graph().as_graph_def()
    output_graph_def = graph_util.convert_variables_to_constants(sess,graph_def, ['add'])
    with tf.gfile.GFile("Model/combined_model.pb", 'wb') as f:
        f.write(output_graph_def.SerializeToString())
#INFO:tensorflow:Froze 2 variables.
#INFO:tensorflow:Converted 2 variables to const ops.
# Part9: load the model that contains the variables and their values
import tensorflow as tf
from tensorflow.python.platform import gfile
with tf.Session() as sess:
    model_filename = "Model/combined_model.pb"
    with gfile.FastGFile(model_filename, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    result = tf.import_graph_def(graph_def, return_elements=["add:0"])
    print(sess.run(result))
#[array([3.], dtype=float32)]
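tf.train.Saver also supports renaming variables when saving and loading. Pass a dict of the form {"name of the saved variable": variable to load into} when declaring the Saver object, e.g. saver = tf.train.Saver({"v1":u1, "v2": u2}), so that the variable saved under the name v1 is loaded into u1 (whose own name is other-v1).
The purpose is to make the moving averages of variables easy to use. If the shadow variables are mapped directly onto the variables themselves at load time, there is no need to call a function to fetch the moving averages when using the trained model. At load time, a dict passed to the Saver object loads the moving average straight into a new variable: saver = tf.train.Saver({"v/ExponentialMovingAverage": v}). The renaming dict can also be obtained from the variables_to_restore() function of tf.train.ExponentialMovingAverage.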
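The renaming idea can be illustrated with plain dictionaries. Everything in this sketch is a toy stand-in: Var and restore are hypothetical and only model a checkpoint as a mapping from saved names to values, which is a simplification and not how tf.train.Saver is implemented.

```python
class Var:
    """Toy stand-in for a variable: a graph name plus a value."""
    def __init__(self, name, value=None):
        self.name = name
        self.value = value

def restore(checkpoint, name_map):
    """Copy checkpoint[saved_name] into each mapped variable."""
    for saved_name, var in name_map.items():
        if saved_name not in checkpoint:
            raise KeyError("%s not found in checkpoint" % saved_name)
        var.value = checkpoint[saved_name]

# Names and values as they would appear in the two saved models above
checkpoint = {"v1": 1.0, "v2": 2.0, "v": 10.0, "v/ExponentialMovingAverage": 0.1}

# Load the saved v1 and v2 into variables that carry different names
u1, u2 = Var("other-v1"), Var("other-v2")
restore(checkpoint, {"v1": u1, "v2": u2})
print(u1.value + u2.value)  # 3.0

# Load the shadow value into v itself instead of the raw value 10.0
v = Var("v")
restore(checkpoint, {v.name + "/ExponentialMovingAverage": v})
print(v.value)  # 0.1
```

Looking up u1 under its own name other-v1 would raise the KeyError, which corresponds to the "variable not found" failure mentioned in Part4.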
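2.4 fine-tuning
Take a model that has already been pre-trained and fine-tune it yourself.
- First obtain the pre-trained graph structure:
saver = tf.train.import_meta_graph('my_test_model-1000.meta')
- Load the parameters:
saver.restore(sess,tf.train.latest_checkpoint('./'))
- Prepare the feed_dict with new training or test data. The same model can then be trained or tested on different data.
- To add new layers on top of the existing network, take the convolutional network above as an example: fetch fc2, then append a fully connected layer and an output layer. (The layer-adding code here has not been tested.)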
# pre-train and fine-tuning
fc2 = graph.get_tensor_by_name("fc2/add:0")
fc2 = tf.stop_gradient(fc2)  # stop gradients from flowing back into the pre-trained layers
fc2_shape = fc2.get_shape().as_list()
# fine-tuning: new fully connected layer followed by a softmax output
new_nums = 6
weights = tf.Variable(tf.truncated_normal([fc2_shape[1], new_nums], stddev=0.1), name="w")
biases = tf.Variable(tf.constant(0.1, shape=[new_nums]), name="b")
logits2 = tf.matmul(fc2, weights) + biases
output2 = tf.nn.softmax(logits2)
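Because the layer-adding code above is untested, a quick shape check of the new head in plain Python may help. This is a sketch under assumptions: dense and softmax are hand-written helpers, the frozen fc2 output is represented by the two logit rows printed in section 2.2, and the weights are drawn from an ordinary normal distribution, not a truncated one.

```python
import math
import random

def softmax(row):
    """Numerically stable softmax over one row."""
    shift = max(row)
    exps = [math.exp(x - shift) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def dense(batch, weights, biases):
    """batch: [n, in], weights: [in, out], biases: [out] -> [n, out]."""
    return [[sum(x * w for x, w in zip(row, col)) + b
             for col, b in zip(zip(*weights), biases)]
            for row in batch]

rng = random.Random(0)
# Stand-in for the frozen fc2 output: shape [2, 4]
frozen_fc2 = [[0.6232525, 0.18511544, 0.08325944, -0.4809047],
              [0.12246324, 0.11658503, 0.10220832, 0.06499074]]
new_nums = 6
weights = [[rng.gauss(0.0, 0.1) for _ in range(new_nums)] for _ in range(4)]
biases = [0.1] * new_nums

output2 = [softmax(row) for row in dense(frozen_fc2, weights, biases)]
print(len(output2), len(output2[0]))        # 2 6
print([round(sum(row), 6) for row in output2])
```

The new weight matrix has shape [4, 6] because fc2 produces 4 values per example, which is what fc2_shape[1] supplies in the TensorFlow fragment.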