
Fixing trainable=False having no effect when mixing Keras and TensorFlow

This is a problem I ran into recently. Let me describe it first:

I have a pretrained model (VGG16, for example) that I want to modify, e.g. by adding a fully connected layer on top. For various reasons I can only optimize the model with TensorFlow directly, and a TF optimizer by default updates every variable in tf.trainable_variables(). That is exactly where the problem lies: even though the VGG16 layers are set to trainable=False, the TF optimizer still updates the VGG16 weights.
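Before reproducing it with VGG16, here is a minimal TF 1.x sketch (two toy variables, nothing from the model above) that shows the default behaviour in question: when minimize() is called without var_list, it builds update ops for everything in the graph's TRAINABLE_VARIABLES collection.

import tensorflow as tf

# one variable we intend to freeze, one we intend to train
frozen = tf.Variable(1.0, name='frozen')
head = tf.Variable(1.0, name='head')
loss = tf.square(frozen * head - 2.0)

# no var_list given: the optimizer defaults to tf.trainable_variables(),
# so it creates update ops for BOTH variables
train_step = tf.train.AdamOptimizer().minimize(loss)

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  before = sess.run(frozen)
  sess.run(train_step)
  print(before, sess.run(frozen))  # 'frozen' has moved as well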

That is the problem. After some searching on Google and Baidu I finally found a solution; let's reconstruct the whole problem step by step.

trainable=False has no effect

First, load the pretrained VGG16 model and set trainable=False on it:

from keras.applications import VGG16
import tensorflow as tf
from keras import layers
# load the pretrained model
base_model = VGG16(include_top=False)
# list the trainable variables
tf.trainable_variables()
[<tf.Variable 'block1_conv1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block1_conv2/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
 ...
 <tf.Variable 'block5_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv3_1/bias:0' shape=(512,) dtype=float32_ref>]
# set trainable=False on every layer
# (base_model.trainable = False also seems to work)
for layer in base_model.layers:
  layer.trainable = False

After setting trainable=False, list the trainable variables again: nothing has changed, i.e. the setting has no effect.

# list the trainable variables again
tf.trainable_variables()

[<tf.Variable 'block1_conv1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv1/bias:0' shape=(64,) dtype=float32_ref>,
 ...
 <tf.Variable 'block5_conv3_1/bias:0' shape=(512,) dtype=float32_ref>]
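To be precise about what the flag does and does not do (a small check, assuming the base_model from above is still in scope): Keras itself honours layer.trainable and moves the weights to non_trainable_weights, but the variables were registered in the graph's TRAINABLE_VARIABLES collection when the layers were built, Keras never removes them from that collection, and that collection is the only thing tf.trainable_variables() reads.

# Keras's own bookkeeping does reflect the flag ...
print(len(base_model.trainable_weights))      # 0 after the loop above
print(len(base_model.non_trainable_weights))  # 26 kernels/biases in the conv base
# ... but the graph collection read by TF optimizers is untouched
print(len(tf.trainable_variables()))          # still lists every VGG16 variable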

The solution

The solution is to create a variable_scope when loading the pretrained model, put the variables that do need training under a different variable_scope, fetch those variables with tf.get_collection, and finally tell the TF optimizer which variables to train via its var_list argument.

from keras import models
with tf.variable_scope('base_model'):
  base_model = VGG16(include_top=False,input_shape=(224,224,3))
with tf.variable_scope('xxx'):
  model = models.Sequential()
  model.add(base_model)
  model.add(layers.Flatten())
  model.add(layers.Dense(10))

# fetch the variables that should be trained
trainable_var = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,'xxx')
trainable_var

[<tf.Variable 'xxx_2/dense_1/kernel:0' shape=(25088,10) dtype=float32_ref>,
<tf.Variable 'xxx_2/dense_1/bias:0' shape=(10,) dtype=float32_ref>]

# define a TF optimizer step; assume we already have a loss
loss = model.output / 2  # arbitrary, just for demonstration
train_step = tf.train.AdamOptimizer().minimize(loss, var_list=trainable_var)
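To actually run this step, one way (a sketch under the assumption that you reuse the session Keras created, so the pretrained weights are not wiped; x_batch is a made-up random batch, not part of the original article) looks like this:

import numpy as np
from keras import backend as K

# reuse the session Keras is already using, so the VGG16 weights
# that Keras loaded are kept
sess = K.get_session()

# initialize only what is still uninitialized (mainly the Adam slot variables);
# a plain global_variables_initializer() here would wipe the VGG16 weights
uninitialized = [v for v in tf.global_variables()
                 if not sess.run(tf.is_variable_initialized(v))]
sess.run(tf.variables_initializer(uninitialized))

# x_batch is just a dummy batch for demonstration
x_batch = np.random.rand(2, 224, 224, 3).astype('float32')
sess.run(train_step, feed_dict={model.input: x_batch})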

Summary

When mixing Keras and TensorFlow, setting trainable=False on the Keras side has no effect as far as TensorFlow is concerned.

The fix is to separate the variables with variable_scope, then fetch the ones that should be trained with tf.get_collection, and finally pass them to the TF optimizer through var_list.
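As a final sanity check (continuing the sketch above, with the same sess, model and x_batch), one can verify that a training step leaves the frozen VGG16 kernels bit-for-bit unchanged while the new Dense kernel moves:

frozen_kernel = base_model.get_layer('block1_conv1').kernel  # a VGG16 weight
dense_kernel = model.layers[-1].kernel                       # the new Dense(10) weight

before_frozen, before_dense = sess.run([frozen_kernel, dense_kernel])
sess.run(train_step, feed_dict={model.input: x_batch})
after_frozen, after_dense = sess.run([frozen_kernel, dense_kernel])

print((before_frozen == after_frozen).all())  # expected: True, VGG16 untouched
print((before_dense == after_dense).all())    # expected: False, only the head updated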
