Introduction to the TF-Slim module API in TensorFlow
GitHub: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim
TensorFlow-Slim
TF-Slim is a lightweight library for defining, training and evaluating complex models in TensorFlow. Components of tf-slim can be freely mixed with native tensorflow, as well as other frameworks, such as tf.contrib.learn.
Usage
import tensorflow.contrib.slim as slim
Why TF-Slim?
TF-Slim is a library that makes building, training and evaluating neural networks simple:
Allows the user to define models much more compactly by eliminating boilerplate code. This is accomplished through the use of argument scoping and numerous high level layers and variables. These tools increase readability and maintainability, reduce the likelihood of an error from copy-and-pasting hyperparameter values and simplify hyperparameter tuning.
Makes developing models simple by providing commonly used regularizers.
Several widely used computer vision models (e.g., VGG, AlexNet) have been developed in slim, and are available to users. These can either be used as black boxes, or can be extended in various ways, e.g., by adding “multiple heads” to different internal layers.
Slim makes it easy to extend complex models, and to warm start training algorithms by using pieces of pre-existing model checkpoints.
What are the various components of TF-Slim?
TF-Slim is composed of several parts which were designed to exist independently. These include the following main pieces (explained in detail below). A short orientation sketch follows the list.
arg_scope: provides a new scope named arg_scope that allows a user to define default arguments for specific operations within that scope.
data: contains TF-Slim's dataset definition, data providers, parallel_reader, and decoding utilities.
evaluation: contains routines for evaluating models.
layers: contains high level layers for building models using tensorflow.
learning: contains routines for training models.
losses: contains commonly used loss functions.
metrics: contains popular evaluation metrics.
nets: contains popular network definitions such as VGG and AlexNet models.
queues: provides a context manager for easily and safely starting and closing QueueRunners.
regularizers: contains weight regularizers.
variables: provides convenience wrappers for variable creation and manipulation.
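As a quick orientation, the sketch below shows how these pieces are reached from the single slim import. The attribute layout is the one assumed from tf.contrib.slim and may differ slightly between TensorFlow releases, so treat it as a sketch rather than a guarantee:
import tensorflow as tf
slim = tf.contrib.slim

# Layers, regularizers and arg_scope hang off slim itself.
conv2d_layer = slim.conv2d
weight_regularizer = slim.l2_regularizer(0.0005)
# Losses, metrics and training helpers live in sub-modules.
get_total_loss = slim.losses.get_total_loss
streaming_mae = slim.metrics.streaming_mean_absolute_error
train_fn = slim.learning.train
# Reference network definitions (VGG, AlexNet, ...) ship under nets.
vgg = tf.contrib.slim.nets.vgg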
Defining Models
Models can be succinctly defined using TF-Slim by combining its variables, layers and scopes. Each of these elements is defined below.
Variables
Creating Variables in native tensorflow requires either a predefined value or an initialization mechanism (e.g. randomly sampled from a Gaussian). Furthermore, if a variable needs to be created on a specific device, such as a GPU, the specification must be made explicit. To alleviate the code required for variable creation, TF-Slim provides a set of thin wrapper functions in variables.py which allow callers to easily define variables.
For example, to create a weight variable, initialize it using a truncated_normal distribution, regularize it with an l2_loss and place it on the CPU, one need only declare the following:
weights = slim.variable('weights',
shape=[10, 10, 3 , 3],
initializer=tf.truncated_normal_initializer(stddev=0.1),
regularizer=slim.l2_regularizer(0.05),
device='/CPU:0')
Note that in native TensorFlow, there are two types of variables: regular variables and local (transient) variables. The vast majority of variables are regular variables: once created, they can be saved to disk using a saver. Local variables are those variables that only exist for the duration of a session and are not saved to disk.
TF-Slim further differentiates variables by defining model variables, which are variables that represent parameters of a model. Model variables are trained or fine-tuned during learning and are loaded from a checkpoint during evaluation or inference. Examples include the variables created by a slim.fully_connected or slim.conv2d layer. Non-model variables are all other variables that are used during learning or evaluation but are not required for actually performing inference. For example, the global_step is a variable used during learning and evaluation but it is not actually part of the model. Similarly, moving average variables might mirror model variables, but the moving averages are not themselves model variables.
Both model variables and regular variables can be easily created and retrieved via TF-Slim:
# Model Variables
weights = slim.model_variable('weights',
shape=[10, 10, 3 , 3],
initializer=tf.truncated_normal_initializer(stddev=0.1),
regularizer=slim.l2_regularizer(0.05),
device='/CPU:0')
model_variables = slim.get_model_variables()
# Regular variables
my_var = slim.variable('my_var',
shape=[20, 1],
initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()
How does this work? When you create a model variable via TF-Slim's layers or directly via the slim.model_variable function, TF-Slim adds the variable to the tf.GraphKeys.MODEL_VARIABLES collection. What if you have your own custom layers or variable creation routine but still want TF-Slim to manage or be aware of your model variables? TF-Slim provides a convenience function for adding the model variable to its collection:
my_model_variable = CreateViaCustomCode()
# Letting TF-Slim know about the additional variable.
slim.add_model_variable(my_model_variable)
Layers
While the set of TensorFlow operations is quite extensive, developers of neural networks typically think of models in terms of higher level concepts like “layers”, “losses”, “metrics”, and “networks”. A layer, such as a Convolutional Layer, a Fully Connected Layer or a BatchNorm Layer, is more abstract than a single TensorFlow operation and typically involves several operations. Furthermore, a layer usually (but not always) has variables (tunable parameters) associated with it, unlike more primitive operations. For example, a Convolutional Layer in a neural network is composed of several low level operations:
Creating the weight and bias variables
Convolving the weights with the input from the previous layer
Adding the biases to the result of the convolution.
Applying an activation function.
Using only plain TensorFlow code, this can be rather laborious:
input = ...
with tf.name_scope('conv1_1') as scope:
  kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                            stddev=1e-1), name='weights')
  conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
  biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                       trainable=True, name='biases')
  bias = tf.nn.bias_add(conv, biases)
  conv1 = tf.nn.relu(bias, name=scope)
To alleviate the need to duplicate this code repeatedly, TF-Slim provides a number of convenient operations defined at the more abstract level of neural network layers. For example, compare the code above to an invocation of the corresponding TF-Slim code:
input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')
TF-Slim provides standard implementations for numerous components for building neural networks. These include:
Layer TF-Slim
BiasAdd slim.bias_add
BatchNorm slim.batch_norm
Conv2d slim.conv2d
Conv2dInPlane slim.conv2d_in_plane
Conv2dTranspose (Deconv) slim.conv2d_transpose
FullyConnected slim.fully_connected
AvgPool2D slim.avg_pool2d
Dropout slim.dropout
Flatten slim.flatten
MaxPool2D slim.max_pool2d
OneHotEncoding slim.one_hot_encoding
SeparableConv2d slim.separable_conv2d
UnitNorm slim.unit_norm
TF-Slim also provides two meta-operations called repeat and stack that allow users to repeatedly perform the same operation. For example, consider the following snippet from the VGG network whose layers perform several convolutions in a row between pooling layers:
net = ...
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
One way to reduce this code duplication would be via a for loop:
net = ...
for i in range(3):
  net = slim.conv2d(net, 256, [3, 3], scope='conv3_%d' % (i + 1))
net = slim.max_pool2d(net, [2, 2], scope='pool2')
This can be made even cleaner by using TF-Slim’s repeat operation:
net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
Notice that slim.repeat not only applies the same arguments in-line, it also is smart enough to unroll the scopes such that the scopes assigned to each subsequent call of slim.conv2d are appended with an underscore and iteration number. More concretely, the scopes in the example above would be named 'conv3/conv3_1', 'conv3/conv3_2' and 'conv3/conv3_3'.
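To make the naming concrete, here is a small sketch (assuming a placeholder input of compatible shape) that prints the variable names produced by slim.repeat, which should carry the conv3/conv3_1 through conv3/conv3_3 prefixes described above:
import tensorflow as tf
slim = tf.contrib.slim

inputs = tf.placeholder(tf.float32, [None, 28, 28, 256])
net = slim.repeat(inputs, 3, slim.conv2d, 256, [3, 3], scope='conv3')
for v in slim.get_model_variables():
  print(v.op.name)  # conv3/conv3_1/weights, conv3/conv3_1/biases, conv3/conv3_2/weights, ...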
Furthermore, TF-Slim's slim.stack operator allows a caller to repeatedly apply the same operation with different arguments to create a stack or tower of layers. slim.stack also creates a new tf.variable_scope for each operation created. For example, a simple way to create a Multi-Layer Perceptron (MLP):
# Verbose way:
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')
# Equivalent, TF-Slim way using slim.stack:
slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')
In this example, slim.stack calls slim.fully_connected three times passing the output of one invocation of the function to the next. However, the number of hidden units in each invocation changes from 32 to 64 to 128. Similarly, one can use stack to simplify a tower of multiple convolutions:
# Verbose way:
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')
x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')
# Using stack:
slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')
Scopes
In addition to the types of scope mechanisms in TensorFlow (name_scope, variable_scope), TF-Slim adds a new scoping mechanism called arg_scope. This new scope allows a user to specify one or more operations and a set of arguments which will be passed to each of the operations defined in the arg_scope. This functionality is best illustrated by example. Consider the following code:
net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')
It should be clear that these three convolution layers share many of the same hyperparameters. Two have the same padding, all three have the same weights_initializer and weights_regularizer. This code is hard to read and contains a lot of repeated values that should be factored out. One solution would be to specify default values using variables:
padding = 'SAME'
initializer = tf.truncated_normal_initializer(stddev=0.01)
regularizer = slim.l2_regularizer(0.0005)
net = slim.conv2d(inputs, 64, [11, 11], 4,
padding=padding,
weights_initializer=initializer,
weights_regularizer=regularizer,
scope='conv1')
net = slim.conv2d(net, 128, [11, 11],
padding='VALID',
weights_initializer=initializer,
weights_regularizer=regularizer,
scope='conv2')
net = slim.conv2d(net, 256, [11, 11],
padding=padding,
weights_initializer=initializer,
weights_regularizer=regularizer,
scope='conv3')
This solution ensures that all three convolutions share the exact same parameter values but doesn't completely reduce the code clutter. By using an arg_scope, we can both ensure that each layer uses the same values and simplify the code:
with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
  net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
  net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
  net = slim.conv2d(net, 256, [11, 11], scope='conv3')
As the example illustrates, the use of arg_scope makes the code cleaner, simpler and easier to maintain. Notice that while argument values are specified in the arg_scope, they can be overwritten locally. In particular, while the padding argument has been set to ‘SAME’, the second convolution overrides it with the value of ‘VALID’.
One can also nest arg_scopes and use multiple operations in the same scope. For example:
with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
  with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
    net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
    net = slim.conv2d(net, 256, [5, 5],
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                      scope='conv2')
    net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')
In this example, the first arg_scope applies the same weights_initializer and weights_regularizer arguments to the conv2d and fully_connected layers in its scope. In the second arg_scope, additional default arguments to conv2d only are specified.
Working Example: Specifying the VGG16 Layers
By combining TF-Slim Variables, Operations and scopes, we can write a normally very complex network with very few lines of code. For example, the entire VGG architecture can be defined with just the following snippet:
def vgg16(inputs):
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                      weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
    net = slim.max_pool2d(net, [2, 2], scope='pool1')
    net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
    net = slim.max_pool2d(net, [2, 2], scope='pool2')
    net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
    net = slim.max_pool2d(net, [2, 2], scope='pool3')
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
    net = slim.max_pool2d(net, [2, 2], scope='pool4')
    net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
    net = slim.max_pool2d(net, [2, 2], scope='pool5')
    net = slim.fully_connected(net, 4096, scope='fc6')
    net = slim.dropout(net, 0.5, scope='dropout6')
    net = slim.fully_connected(net, 4096, scope='fc7')
    net = slim.dropout(net, 0.5, scope='dropout7')
    net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
  return net
Training Models
Training Tensorflow models requires a model, a loss function, the gradient computation and a training routine that iteratively computes the gradients of the model weights relative to the loss and updates the weights accordingly. TF-Slim provides both common loss functions and a set of helper functions that run the training and evaluation routines.
Losses
The loss function defines a quantity that we want to minimize. For classification problems, this is typically the cross entropy between the true distribution and the predicted probability distribution across classes. For regression problems, this is often the sum-of-squares differences between the predicted and true values.
Certain models, such as multi-task learning models, require the use of multiple loss functions simultaneously. In other words, the loss function ultimately being minimized is the sum of various other loss functions. For example, consider a model that predicts both the type of scene in an image as well as the depth from the camera of each pixel. This model’s loss function would be the sum of the classification loss and depth prediction loss.
TF-Slim provides an easy-to-use mechanism for defining and keeping track of loss functions via the losses module. Consider the simple case where we want to train the VGG network:
import tensorflow as tf
slim = tf.contrib.slim
vgg = tf.contrib.slim.nets.vgg
# Load the images and labels.
images, labels = ...
# Create the model.
predictions, _ = vgg.vgg_16(images)
# Define the loss functions and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)
In this example, we start by creating the model (using TF-Slim's VGG implementation), and add the standard classification loss. Now, let's turn to the case where we have a multi-task model that produces multiple outputs:
# Load the images and labels.
images, scene_labels, depth_labels = ...
# Create the model.
scene_predictions, depth_predictions = CreateMultiTaskModel(images)
# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
# The following two lines have the same effect:
total_loss = classification_loss + sum_of_squares_loss
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)
In this example, we have two losses which we add by calling slim.losses.softmax_cross_entropy and slim.losses.sum_of_squares. We can obtain the total loss by adding them together (total_loss) or by calling slim.losses.get_total_loss(). How did this work? When you create a loss function via TF-Slim, TF-Slim adds the loss to a special TensorFlow collection of loss functions. This enables you to either manage the total loss manually, or allow TF-Slim to manage them for you.
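Under the hood, this works through a losses collection. The following sketch (using helpers from the slim losses module; the exact collection name is an assumption here) shows that the individual terms can be retrieved alongside the total:
# Each slim.losses.* call above added its loss tensor to a losses collection,
# so the individual terms can be inspected as well as summed:
individual_losses = slim.losses.get_losses()
regularization_losses = slim.losses.get_regularization_losses()
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)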
What if you want to let TF-Slim manage the losses for you but have a custom loss function? loss_ops.py also has a function that adds this loss to TF-Slim's collection. For example:
# Load the images and labels.
images, scene_labels, depth_labels, pose_labels = ...
# Create the model.
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)
# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)
slim.losses.add_loss(pose_loss) # Letting TF-Slim know about the additional loss.
# The following two ways to compute the total loss are equivalent:
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss
# (Regularization Loss is included in the total loss by default).
total_loss2 = slim.losses.get_total_loss()
In this example, we can again either produce the total loss function manually or let TF-Slim know about the additional loss and let TF-Slim handle the losses.
Training Loop
TF-Slim provides a simple but powerful set of tools for training models, found in learning.py. These include a Train function that repeatedly measures the loss, computes gradients and saves the model to disk, as well as several convenience functions for manipulating gradients. For example, once we've specified the model, the loss function and the optimization scheme, we can call slim.learning.create_train_op and slim.learning.train to perform the optimization:
g = tf.Graph()
# Create the model and specify the losses...
...
total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
# create_train_op ensures that each time we ask for the loss, the update_ops
# are run and the gradients being computed are applied too.
train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ... # Where checkpoints are stored.
slim.learning.train(
train_op,
logdir,
number_of_steps=1000,
save_summaries_secs=300,
save_interval_secs=600)
In this example, slim.learning.train is provided with the train_op which is used to (a) compute the loss and (b) apply the gradient step. logdir specifies the directory where the checkpoints and event files are stored. We can limit the number of gradient steps taken to any number. In this case, we've asked for 1000 steps to be taken. Finally, save_summaries_secs=300 indicates that we'll compute summaries every 5 minutes and save_interval_secs=600 indicates that we'll save a model checkpoint every 10 minutes.
Working Example: Training the VGG16 Model
To illustrate this, let's examine the following sample of training the VGG network:
import tensorflow as tf
slim = tf.contrib.slim
vgg = tf.contrib.slim.nets.vgg
...
train_log_dir = ...
if not tf.gfile.Exists(train_log_dir):
  tf.gfile.MakeDirs(train_log_dir)

with tf.Graph().as_default():
  # Set up the data loading:
  images, labels = ...

  # Define the model:
  predictions, _ = vgg.vgg_16(images, is_training=True)

  # Specify the loss function:
  slim.losses.softmax_cross_entropy(predictions, labels)
  total_loss = slim.losses.get_total_loss()
  tf.summary.scalar('losses/total_loss', total_loss)

  # Specify the optimization scheme:
  optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)

  # create_train_op that ensures that when we evaluate it to get the loss,
  # the update_ops are done and the gradient updates are computed.
  train_tensor = slim.learning.create_train_op(total_loss, optimizer)

  # Actually runs training.
  slim.learning.train(train_tensor, train_log_dir)
Fine-Tuning Existing Models
Brief Recap on Restoring Variables from a Checkpoint
After a model has been trained, it can be restored using tf.train.Saver() which restores Variables from a given checkpoint. For many cases, tf.train.Saver() provides a simple mechanism to restore all or just a few variables.
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to restore all the variables.
restorer = tf.train.Saver()
# Add ops to restore some variables.
restorer = tf.train.Saver([v1, v2])
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...
See Restoring Variables and Choosing which Variables to Save and Restore sections of the Variables page for more details.
Partially Restoring Models
It is often desirable to fine-tune a pre-trained model on an entirely new dataset or even a new task. In these situations, one can use TF-Slim’s helper functions to select a subset of variables to restore:
# Create some variables.
v1 = slim.variable(name="v1", ...)
v2 = slim.variable(name="nested/v2", ...)
...
# Get list of variables to restore (which contains only 'v2'). These are all
# equivalent methods:
variables_to_restore = slim.get_variables_by_name("v2")
# or
variables_to_restore = slim.get_variables_by_suffix("2")
# or
variables_to_restore = slim.get_variables(scope="nested")
# or
variables_to_restore = slim.get_variables_to_restore(include=["nested"])
# or
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])
# Create the saver which will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)
with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...
Restoring models with different variable names
When restoring variables from a checkpoint, the Saver locates the variable names in a checkpoint file and maps them to variables in the current graph. Above, we created a saver by passing to it a list of variables. In this case, the names of the variables to locate in the checkpoint file were implicitly obtained from each provided variable's var.op.name.
This works well when the variable names in the checkpoint file match those in the graph. However, sometimes, we want to restore a model from a checkpoint whose variables have different names from those in the current graph. In this case, we must provide the Saver a dictionary that maps from each checkpoint variable name to each graph variable. Consider the following example where the checkpoint variable names are obtained via a simple function:
# Assuming that 'conv1/weights' should be restored from 'vgg16/conv1/weights'
def name_in_checkpoint(var):
  return 'vgg16/' + var.op.name

# Assuming that 'conv1/weights' and 'conv1/bias' should be restored from
# 'conv1/params1' and 'conv1/params2'
def name_in_checkpoint(var):
  if "weights" in var.op.name:
    return var.op.name.replace("weights", "params1")
  if "bias" in var.op.name:
    return var.op.name.replace("bias", "params2")

variables_to_restore = slim.get_model_variables()
variables_to_restore = {name_in_checkpoint(var): var for var in variables_to_restore}
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
  # Restore variables from disk.
  restorer.restore(sess, "/tmp/model.ckpt")
Fine-Tuning a Model on a different task
Consider the case where we have a pre-trained VGG16 model. The model was trained on the ImageNet dataset, which has 1000 classes. However, we would like to apply it to the Pascal VOC dataset which has only 20 classes. To do so, we can initialize our new model using the values of the pre-trained model excluding the final layer:
# Load the Pascal VOC data
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)
# Create the model
predictions, _ = vgg.vgg_16(images)
train_op = slim.learning.create_train_op(...)
# Specify where the Model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'
# Specify where the new model will live:
log_dir = '/path/to/my_pascal_model_dir/'
# Restore only the convolutional layers:
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)
# Start training.
slim.learning.train(train_op, log_dir, init_fn=init_fn)
Evaluating Models.
Once we’ve trained a model (or even while the model is busy training) we’d like to see how well the model performs in practice. This is accomplished by picking a set of evaluation metrics, which will grade the models performance, and the evaluation code which actually loads the data, performs inference, compares the results to the ground truth and records the evaluation scores. This step may be performed once or repeated periodically.
Metrics
We define a metric to be a performance measure that is not a loss function (losses are directly optimized during training), but which we are still interested in for the purpose of evaluating our model. For example, we might want to minimize log loss, but our metrics of interest might be F1 score (test accuracy), or Intersection Over Union score (which are not differentiable, and therefore cannot be used as losses).
TF-Slim provides a set of metric operations that makes evaluating models easy. Abstractly, computing the value of a metric can be divided into three parts:
Initialization: initialize the variables used to compute the metrics.
Aggregation: perform operations (sums, etc) used to compute the metrics.
Finalization: (optionally) perform any final operation to compute metric values. For example, computing means, mins, maxes, etc.
For example, to compute mean_absolute_error, two variables, a count and a total, are initialized to zero. During aggregation, we observe some set of predictions and labels, compute their absolute differences and add the sum to total. Each time we observe another value, count is incremented. Finally, during finalization, total is divided by count to obtain the mean.
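As a rough, hand-rolled illustration of the three phases (this is a sketch, not the slim implementation; predictions and labels are assumed to be same-shaped tensors):
import tensorflow as tf

predictions = tf.placeholder(tf.float32, [None])
labels = tf.placeholder(tf.float32, [None])

# Initialization: the accumulator variables start at zero.
total = tf.Variable(0.0, trainable=False, name='total')
count = tf.Variable(0.0, trainable=False, name='count')

# Aggregation: each batch adds its absolute errors and its element count.
abs_errors = tf.abs(predictions - labels)
update_op = tf.group(
    tf.assign_add(total, tf.reduce_sum(abs_errors)),
    tf.assign_add(count, tf.to_float(tf.size(abs_errors))))

# Finalization: the metric value is the ratio of the two accumulators.
mean_absolute_error = total / count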
The following example demonstrates the API for declaring metrics. Because metrics are often evaluated on a test set which is different from the training set (upon which the loss is computed), we’ll assume we’re using test data:
images, labels = LoadTestData(...)
predictions = MyModel(images)
mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels)
pl_value_op, pl_update_op = slim.metrics.percentage_less(mean_relative_errors, 0.3)
As the example illustrates, the creation of a metric returns two values: a value_op and an update_op. The value_op is an idempotent operation that returns the current value of the metric. The update_op is an operation that performs the aggregation step mentioned above as well as returning the value of the metric.
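In practice the two ops are used like this (a sketch reusing mae_value_op and mae_update_op from the snippet above; num_eval_batches is an assumed batch count): run the update op once per batch, then read the value op.
num_eval_batches = 100  # assumed number of test batches
with tf.Session() as sess:
  # Streaming metrics keep their accumulators in local variables.
  sess.run(tf.local_variables_initializer())
  for _ in range(num_eval_batches):
    sess.run(mae_update_op)  # aggregate over one batch
  print('MAE: %f' % sess.run(mae_value_op))  # read the current metric value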
Keeping track of each value_op and update_op can be laborious. To deal with this, TF-Slim provides two convenience functions:
# Aggregates the value and update ops in two lists:
value_ops, update_ops = slim.metrics.aggregate_metrics(
slim.metrics.streaming_mean_absolute_error(predictions, labels),
slim.metrics.streaming_mean_squared_error(predictions, labels))
# Aggregates the value and update ops in two dictionaries:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
"eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
"eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})
Working example: Tracking Multiple Metrics
Putting it all together:
import tensorflow as tf
slim = tf.contrib.slim
vgg = tf.contrib.slim.nets.vgg
# Load the data
images, labels = load_data(...)
# Define the network
predictions, _ = vgg.vgg_16(images)
# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
"eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
"eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})
# Evaluate the model using 1000 batches of data:
num_batches = 1000
with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  sess.run(tf.local_variables_initializer())
  for batch_id in range(num_batches):
    sess.run(names_to_updates.values())

  metric_values = sess.run(names_to_values.values())
  for metric, value in zip(names_to_values.keys(), metric_values):
    print('Metric %s has value: %f' % (metric, value))
Note that metric_ops.py can be used in isolation without using either layers.py or loss_ops.py.
Evaluation Loop
TF-Slim provides an evaluation module (evaluation.py), which contains helper functions for writing model evaluation scripts using metrics from the metric_ops.py module. These include a function for periodically running evaluations, evaluating metrics over batches of data and printing and summarizing metric results. For example:
import math

import tensorflow as tf
slim = tf.contrib.slim
# Load the data
images, labels = load_data(...)
# Define the network
predictions = MyModel(images)
# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
'accuracy': slim.metrics.accuracy(predictions, labels),
'precision': slim.metrics.precision(predictions, labels),
'recall': slim.metrics.recall(mean_relative_errors, 0.3),
})
# Create the summary ops such that they also print out to std output:
summary_ops = []
for metric_name, metric_value in names_to_values.iteritems():
  op = tf.summary.scalar(metric_name, metric_value)
  op = tf.Print(op, [metric_value], metric_name)
  summary_ops.append(op)
num_examples = 10000
batch_size = 32
num_batches = math.ceil(num_examples / float(batch_size))
# Setup the global step.
slim.get_or_create_global_step()
checkpoint_dir = ... # Where the model checkpoints are stored.
log_dir = ... # Where the evaluation summaries are stored.
eval_interval_secs = ... # How often to run the evaluation.
slim.evaluation.evaluation_loop(
'local',
checkpoint_dir,
log_dir,
num_evals=num_batches,
eval_op=names_to_updates.values(),
summary_op=tf.summary.merge(summary_ops),
eval_interval_secs=eval_interval_secs)