
Scope naming in TensorFlow

Two articles cover scope usage in TensorFlow:

[1] Scope naming in TensorFlow (this article)

[2] The difference between tf.name_scope() and tf.variable_scope() in TensorFlow


1. tf.name_scope()

TensorFlow offers two ways to create a variable: tf.get_variable() and tf.Variable(). Used inside tf.name_scope(), the two behave as follows.

import tensorflow as tf

with tf.name_scope("a_name_scope"):
    initializer = tf.constant_initializer(value=1)
    var1 = tf.get_variable(name='var1', shape=[1], dtype=tf.float32, initializer=initializer)
    var2 = tf.Variable(name='var2', initial_value=[2], dtype=tf.float32)
    var21 = tf.Variable(name='var2', initial_value=[2.1], dtype=tf.float32)
    var22 = tf.Variable(name='var2', initial_value=[2.2], dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(var1.name)        # var1:0
    print(sess.run(var1))   # [ 1.]
    print(var2.name)        # a_name_scope/var2:0
    print(sess.run(var2))   # [ 2.]
    print(var21.name)       # a_name_scope/var2_1:0
    print(sess.run(var21))  # [ 2.1]
    print(var22.name)       # a_name_scope/var2_2:0
    print(sess.run(var22))  # [ 2.2]

Results: see the output comments in the code above.

Analysis:

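When tf.Variable() is used inside tf.name_scope(), TensorFlow makes every variable name unique even if the same name is passed each time, so var2, var21, and var22 are three distinct variables. A variable created with tf.get_variable(), by contrast, does not receive the name scope as a prefix; in effect, tf.name_scope() has no influence on tf.get_variable().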
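The naming rule just described can be sketched in plain Python. This is a toy model only, not TensorFlow code: ToyGraph, variable_name, and get_variable_name are hypothetical names introduced for illustration, and the real bookkeeping inside TensorFlow is more involved.

```python
class ToyGraph:
    """Toy stand-in for a graph's name bookkeeping (not TensorFlow code)."""

    def __init__(self):
        self._used = {}  # full name -> number of times it has been requested

    def unique_name(self, name):
        # First request keeps the name; later requests get _1, _2, ...
        count = self._used.get(name, 0)
        self._used[name] = count + 1
        return name if count == 0 else "%s_%d" % (name, count)


def variable_name(graph, name_scope, name):
    """Mimic tf.Variable: prefix with the name scope, then make it unique."""
    return graph.unique_name(name_scope + "/" + name) + ":0"


def get_variable_name(name_scope, name):
    """Mimic tf.get_variable under tf.name_scope: the scope is ignored."""
    return name + ":0"


graph = ToyGraph()
names = [variable_name(graph, "a_name_scope", "var2") for _ in range(3)]
print(names)
print(get_variable_name("a_name_scope", "var1"))
```

Under these assumptions the three requests for 'var2' yield three different names, while the get_variable-style name never carries the scope prefix, which mirrors the names printed in the example above.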

2. tf.variable_scope()

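To reuse a variable, use tf.variable_scope() together with tf.get_variable().

Unlike tf.Variable(), which creates a new variable on every call, tf.get_variable() can return an existing variable with the same name instead of creating a new one. To reuse a name you must call scope.reuse_variables() explicitly; otherwise TensorFlow raises an error, on the assumption that the name was repeated by accident.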

import tensorflow as tf

with tf.variable_scope("a_variable_scope") as scope:
    initializer = tf.constant_initializer(value=3)
    var3 = tf.get_variable(name='var3', shape=[1], dtype=tf.float32, initializer=initializer)
    scope.reuse_variables()
    var3_reuse = tf.get_variable(name='var3')
    var4 = tf.Variable(name='var4', initial_value=[4], dtype=tf.float32)
    var4_reuse = tf.Variable(name='var4', initial_value=[4], dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(var3.name)  # a_variable_scope/var3:0
    print(sess.run(var3))  # [ 3.]
    print(var3_reuse.name)  # a_variable_scope/var3:0
    print(sess.run(var3_reuse))  # [ 3.]
    print(var4.name)  # a_variable_scope/var4:0
    print(sess.run(var4))  # [ 4.]
    print(var4_reuse.name)  # a_variable_scope/var4_1:0
    print(sess.run(var4_reuse))  # [ 4.]

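Results: see the output comments in the code above.

Analysis:

In the output above, var3 and var3_reuse are the same variable, while var4 and var4_reuse are different variables.

Inside tf.variable_scope(), if a variable created with tf.get_variable() already exists and you want to use it directly, call scope.reuse_variables() first; without that call, TensorFlow raises an error because the variable already exists.

Whether inside tf.name_scope() or tf.variable_scope(), tf.Variable() always creates a new variable. If the name is already taken, a numeric suffix such as _1 or _2 is appended to tell the variables apart.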
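The reuse rule for tf.get_variable() can be modeled in a few lines of plain Python. This is a simplified sketch, not TensorFlow's implementation: ToyVariableStore and its reuse flag are hypothetical, and the flag stands in for the state that scope.reuse_variables() switches on.

```python
class ToyVariableStore:
    """Toy model of name-keyed variable sharing (not TensorFlow code)."""

    def __init__(self):
        self._vars = {}  # full name -> variable record

    def get_variable(self, scope, name, reuse=False, initial_value=None):
        full_name = scope + "/" + name
        if reuse:
            # Reuse mode: the variable must already exist.
            if full_name not in self._vars:
                raise ValueError("Variable %s does not exist" % full_name)
            return self._vars[full_name]
        # Create mode: a repeated name is treated as a mistake.
        if full_name in self._vars:
            raise ValueError("Variable %s already exists" % full_name)
        self._vars[full_name] = {"name": full_name + ":0", "value": initial_value}
        return self._vars[full_name]


store = ToyVariableStore()
var3 = store.get_variable("a_variable_scope", "var3", initial_value=[3.0])
var3_reuse = store.get_variable("a_variable_scope", "var3", reuse=True)
print(var3 is var3_reuse)

try:
    store.get_variable("a_variable_scope", "var3")  # no reuse requested
    duplicate_rejected = False
except ValueError as err:
    duplicate_rejected = True
    print("error:", err)
```

In this model the second lookup returns the very same object as the first, and asking for an existing name without reuse fails, which is the behavior the analysis above attributes to tf.get_variable().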

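3. RNN example

When modeling sequential data with a recurrent neural network (RNN), the training RNN and the test RNN may use different values of time_steps. This changes the structure of the unrolled network, so the model built for training cannot simply be used as-is at test time. Yet the two must share the same weights and biases. This is a good case for variable reuse.

An example follows.

First, define the different configurations for training and testing: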

class TrainConfig:
    batch_size = 20
    time_steps = 20
    input_size = 10
    output_size = 2
    cell_size = 11
    learning_rate = 0.01


class TestConfig(TrainConfig):
    time_steps = 1
    
train_config = TrainConfig()
test_config = TestConfig()

Next, build train_rnn and test_rnn inside the same tf.variable_scope('rnn') and call scope.reuse_variables(), so that all of train_rnn's weights and biases are bound to test_rnn. However much their time_steps and structure differ, whatever values training gives W and b in train_rnn, test_rnn sees the same values.

with tf.variable_scope('rnn') as scope:
    sess = tf.Session()
    train_rnn = RNN(train_config)
    scope.reuse_variables()
    test_rnn = RNN(test_config)
    sess.run(tf.global_variables_initializer())
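Why sharing works regardless of time_steps can be illustrated with a scalar recurrence in plain Python. This is a toy sketch under stated assumptions, not the RNN class used in this article: ToyRNN, the parameter names w, u, b, and the update rule are all hypothetical. The point it shows is that the parameters do not depend on the number of unrolled steps, so two models with different time_steps can hold the same parameter object.

```python
class ToyRNN:
    """Scalar recurrent model: h_t = w * x_t + u * h_{t-1} + b."""

    def __init__(self, params, time_steps):
        self.params = params          # shared dict, deliberately not copied
        self.time_steps = time_steps  # only affects how far we unroll

    def run(self, xs):
        assert len(xs) == self.time_steps
        h = 0.0
        for x in xs:
            h = self.params["w"] * x + self.params["u"] * h + self.params["b"]
        return h


shared = {"w": 0.5, "u": 0.5, "b": 0.0}
train_rnn = ToyRNN(shared, time_steps=3)
test_rnn = ToyRNN(shared, time_steps=1)

before = test_rnn.run([2.0])
shared["w"] = 1.0  # stands in for a parameter update made during training
after = test_rnn.run([2.0])
print(before, after)
print(train_rnn.run([1.0, 1.0, 1.0]))
```

The test model's output changes after the update even though only the shared parameters were touched, which is the effect scope.reuse_variables() is used for in the snippet above.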

Finally, here is the complete example of sharing parameters in an RNN with tf.variable_scope() and tf.get_variable_scope():

# 22 scope (name_scope/variable_scope)
from __future__ import print_function
import tensorflow as tf

class TrainConfig:
    batch_size = 20
    time_steps = 20
    input_size = 10
    output_size = 2
    cell_size = 11
    learning_rate = 0.01


class TestConfig(TrainConfig):
    time_steps = 1


class RNN(object):

    def __init__(self, config):
        self._batch_size = config.batch_size
        self._time_steps = config.time_steps
        self._input_size = config.input_size
        self._output_size = config.output_size
        self._cell_size = config.cell_size
        self._lr = config.learning_rate
        self._built_RNN()

    def _built_RNN(self):
        with tf.variable_scope('inputs'):
            self._xs = tf.placeholder(tf.float32, [self._batch_size, self._time_steps, self._input_size], name='xs')
            self._ys = tf.placeholder(tf.float32, [self._batch_size, self._time_steps, self._output_size], name='ys')
        with tf.name_scope('RNN'):
            with tf.variable_scope('input_layer'):
                l_in_x = tf.reshape(self._xs, [-1, self._input_size], name='2_2D')  # (batch*n_step, in_size)
                # Ws (in_size, cell_size)
                Wi = self._weight_variable([self._input_size, self._cell_size])
                print(Wi.name)
                # bs (cell_size, )
                bi = self._bias_variable([self._cell_size, ])
                # l_in_y = (batch * n_steps, cell_size)
                with tf.name_scope('Wx_plus_b'):
                    l_in_y = tf.matmul(l_in_x, Wi) + bi
                l_in_y = tf.reshape(l_in_y, [-1, self._time_steps, self._cell_size], name='2_3D')

            with tf.variable_scope('cell'):
                cell = tf.contrib.rnn.BasicLSTMCell(self._cell_size)
                with tf.name_scope('initial_state'):
                    self._cell_initial_state = cell.zero_state(self._batch_size, dtype=tf.float32)

                self.cell_outputs = []
                cell_state = self._cell_initial_state
                for t in range(self._time_steps):
                    if t > 0: tf.get_variable_scope().reuse_variables()
                    cell_output, cell_state = cell(l_in_y[:, t, :], cell_state)
                    self.cell_outputs.append(cell_output)
                self._cell_final_state = cell_state

            with tf.variable_scope('output_layer'):
                # cell_outputs_reshaped (BATCH*TIME_STEP, CELL_SIZE)
                cell_outputs_reshaped = tf.reshape(tf.concat(self.cell_outputs, 1), [-1, self._cell_size])
                Wo = self._weight_variable((self._cell_size, self._output_size))
                bo = self._bias_variable((self._output_size,))
                product = tf.matmul(cell_outputs_reshaped, Wo) + bo
                # _pred shape (batch*time_step, output_size)
                self._pred = tf.nn.relu(product)    # for displacement

        with tf.name_scope('cost'):
            _pred = tf.reshape(self._pred, [self._batch_size, self._time_steps, self._output_size])
            mse = self.ms_error(_pred, self._ys)
            mse_ave_across_batch = tf.reduce_mean(mse, 0)
            mse_sum_across_time = tf.reduce_sum(mse_ave_across_batch, 0)
            self._cost = mse_sum_across_time
            self._cost_ave_time = self._cost / self._time_steps

        with tf.variable_scope('train'):
            self._lr = tf.convert_to_tensor(self._lr)
            self.train_op = tf.train.AdamOptimizer(self._lr).minimize(self._cost)

    @staticmethod
    def ms_error(y_target, y_pre):
        return tf.square(tf.subtract(y_target, y_pre))

    @staticmethod
    def _weight_variable(shape, name='weights'):
        initializer = tf.random_normal_initializer(mean=0., stddev=0.5, )
        return tf.get_variable(shape=shape, initializer=initializer, name=name)

    @staticmethod
    def _bias_variable(shape, name='biases'):
        initializer = tf.constant_initializer(0.1)
        return tf.get_variable(name=name, shape=shape, initializer=initializer)


if __name__ == '__main__':
    train_config = TrainConfig()
    test_config = TestConfig()

    # wrong: two separate scopes create two independent sets of parameters
    with tf.variable_scope('train_rnn'):
        train_rnn1 = RNN(train_config)
    with tf.variable_scope('test_rnn'):
        test_rnn1 = RNN(test_config)

    # right: one shared scope, with reuse enabled before building the test RNN
    with tf.variable_scope('rnn') as scope:
        sess = tf.Session()
        train_rnn2 = RNN(train_config)
        scope.reuse_variables()
        test_rnn2 = RNN(test_config)
        # tf.initialize_all_variables() is no longer valid from
        # 2017-03-02 if using tensorflow >= 0.12
        if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
            init = tf.initialize_all_variables()
        else:
            init = tf.global_variables_initializer()
        sess.run(init)

4. References

[1] 莫煩PYTHON: scope naming

[2] Shared-variable RNN implementation code (reuse variable RNN code)