TensorFlow: tfdbg and tfprof
阿新 · Published 2018-11-06
Tfdbg
TensorFlow debugger (tfdbg) is a specialized debugger for TensorFlow.
To enable tfdbg, wrap your Session with the CLI debugger wrapper:
from tensorflow.python import debug as tf_debug
sess = tf_debug.LocalCLIDebugWrapperSession(sess)
# Register a tensor filter that flags abnormal values (NaN/Inf)
sess.add_tensor_filter("has_inf_or_nan", tf_debug.has_inf_or_nan)
Features
- Bringing up a terminal-based user interface (UI) before and after each run() call, to let you control the execution and inspect the graph’s internal state.
- Allowing you to register special “filters” for tensor values, to facilitate the diagnosis of issues.
Commands
run
Executes one sess.run() call. -t <N> executes N runs; -n runs through without debugging; -f <filter> keeps running until the named filter triggers, e.g. on nan or inf values (a custom-filter sketch follows the pt example below).
pt
Prints a tensor; slicing and output redirection are supported, e.g.: pt cross_entropy/Log:0[:, 0:10] > /tmp/xent_value_slices.txt
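A filter registered with add_tensor_filter is just a callable taking (datum, tensor) and returning a bool, which is what run -f evaluates against every dumped tensor. Below is a minimal sketch of a custom filter; the name has_large_value and the 1e8 threshold are illustrative, not part of tfdbg:
import numpy as np

def has_large_value(datum, tensor):
    # tensor arrives as a numpy array (it may be None for tensors that
    # were not computed); flag any float tensor with |x| > 1e8.
    return (tensor is not None and tensor.size > 0
            and np.issubdtype(tensor.dtype, np.floating)
            and np.abs(tensor).max() > 1e8)

sess.add_tensor_filter("has_large_value", has_large_value)
# Inside the CLI, trigger it with: run -f has_large_value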
Tfprof
Examine the shapes and sizes of all trainable Variables.
import sys
import tensorflow as tf

# Print trainable variable parameter statistics to stdout.
# By default, statistics are associated with each graph node.
param_stats = tf.contrib.tfprof.model_analyzer.print_model_analysis(
    tf.get_default_graph(),
    tfprof_options=tf.contrib.tfprof.model_analyzer.
    TRAINABLE_VARS_PARAMS_STAT_OPTIONS)

# param_stats is a tensorflow.tfprof.TFGraphNodeProto proto.
# Let's print the root below.
sys.stdout.write('total_params: %d\n' % param_stats.total_parameters)
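The tfprof_options values are plain Python dicts (see model_analyzer.py in the references), so a copy can be adjusted before the call. A sketch, assuming the max_depth and order_by keys of the contrib options format:
opts = dict(tf.contrib.tfprof.model_analyzer.TRAINABLE_VARS_PARAMS_STAT_OPTIONS)
opts['max_depth'] = 3        # only descend three levels of name scopes
opts['order_by'] = 'params'  # sort siblings by parameter count
tf.contrib.tfprof.model_analyzer.print_model_analysis(
    tf.get_default_graph(), tfprof_options=opts)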
Examine the number of floating point operations
# Print to stdout an analysis of the number of floating point operations in the
# model, broken down by individual operations.
#
# Note: Only ops with RegisterStatistics('flops') defined have flop stats. It
# also requires complete shape information, and shapes are often unknown
# statically. To complete the shapes, provide run-time shape information with
# tf.RunMetadata to the API (see the next example on how to provide RunMetadata).
#
tf.contrib.tfprof.model_analyzer.print_model_analysis(
    tf.get_default_graph(),
    tfprof_options=tf.contrib.tfprof.model_analyzer.FLOAT_OPS_OPTIONS)
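Once run-time shapes have been captured (as shown in the next example), the same FLOPs query can accept them; a sketch, reusing the run_metadata defined below:
tf.contrib.tfprof.model_analyzer.print_model_analysis(
    tf.get_default_graph(),
    run_meta=run_metadata,  # captured with FULL_TRACE, see the next example
    tfprof_options=tf.contrib.tfprof.model_analyzer.FLOAT_OPS_OPTIONS)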
Examine the timing and memory usage
# Generate the meta information for the model that contains the memory usage
# and timing information.
#
# Note: When run on GPU, a kernel is first scheduled (enqueued) and then
# executed asynchronously. tfprof only tracks the execution time.
# In addition, a substantial amount of time might be spent between the Python
# and TensorFlow runtimes, which is also not tracked by tfprof.
#
run_metadata = tf.RunMetadata()
with tf.Session() as sess:
    _ = sess.run(train_op,
                 options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE),
                 run_metadata=run_metadata)

# Print to stdout an analysis of the memory usage and the timing information
# broken down by operations.
tf.contrib.tfprof.model_analyzer.print_model_analysis(
    tf.get_default_graph(),
    run_meta=run_metadata,
    tfprof_options=tf.contrib.tfprof.model_analyzer.PRINT_ALL_TIMING_MEMORY)
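As a complementary view (this uses TensorFlow's timeline module, not tfprof), the same run_metadata can be exported as a Chrome trace and opened in chrome://tracing, which makes the enqueue-versus-execute gap noted above visible:
from tensorflow.python.client import timeline

# Convert the captured step stats into Chrome's trace-event format.
tl = timeline.Timeline(run_metadata.step_stats)
with open('/tmp/timeline.json', 'w') as f:
    f.write(tl.generate_chrome_trace_format(show_memory=True))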
Analyzing the printed memory report
- The printed memory report looks like this:
==================Model Analysis Report======================
_TFProfRoot (0B/165.49MB, 0us/5.86ms)
adam_optimizer/Adam/Assign (4B/4B, 43us/43us)
adam_optimizer/Adam/Assign_1 (4B/4B, 39us/39us)
adam_optimizer/Adam/beta1 (4B/4B, 2us/2us)
adam_optimizer/Adam/beta2 (4B/4B, 2us/2us)
adam_optimizer/Adam/epsilon (4B/4B, 2us/2us)
adam_optimizer/Adam/learning_rate (4B/4B, 3us/3us)
adam_optimizer/Adam/mul (4B/4B, 46us/46us)
adam_optimizer/Adam/mul_1 (4B/4B, 40us/40us)
adam_optimizer/Adam/update_conv1/Variable/ApplyAdam (3.20KB/3.20KB, 88us/88us)
adam_optimizer/Adam/update_conv1/Variable_1/ApplyAdam (128B/128B, 92us/92us)
adam_optimizer/Adam/update_conv2/Variable/ApplyAdam (204.80KB/204.80KB, 91us/91us)
adam_optimizer/Adam/update_conv2/Variable_1/ApplyAdam (256B/256B, 93us/93us)
adam_optimizer/Adam/update_fc1/Variable/ApplyAdam (12.85MB/12.85MB, 90us/90us)
adam_optimizer/Adam/update_fc1/Variable_1/ApplyAdam (4.10KB/4.10KB, 95us/95us)
adam_optimizer/Adam/update_fc2/Variable/ApplyAdam (40.96KB/40.96KB, 94us/94us)
adam_optimizer/Adam/update_fc2/Variable_1/ApplyAdam (40B/40B, 98us/98us)
adam_optimizer/beta1_power (4B/8B, 10us/13us)
adam_optimizer/beta1_power/read (4B/4B, 3us/3us)
adam_optimizer/beta2_power (4B/8B, 9us/12us)
adam_optimizer/beta2_power/read (4B/4B, 3us/3us)
adam_optimizer/gradients/Mean_grad/Cast (4B/4B, 52us/52us)
adam_optimizer/gradients/Mean_grad/Maximum (4B/8B, 13us/15us)
adam_optimizer/gradients/Mean_grad/Maximum/y (4B/4B, 2us/2us)
adam_optimizer/gradients/Mean_grad/Prod (4B/4B, 14us/14us)
adam_optimizer/gradients/Mean_grad/Prod_1 (4B/4B, 102us/102us)
adam_optimizer/gradients/Mean_grad/Shape (4B/4B, 11us/11us)
adam_optimizer/gradients/Mean_grad/Tile (200B/200B, 46us/46us)
adam_optimizer/gradients/Mean_grad/floordiv (4B/4B, 12us/12us)
adam_optimizer/gradients/Mean_grad/truediv (200B/200B, 44us/44us)
adam_optimizer/gradients/conv1/Conv2D_grad/Conv2DBackpropFilter (3.20KB/3.20KB, 183us/183us)
adam_optimizer/gradients/conv1/Conv2D_grad/Conv2DBackpropInput (156.80KB/156.80KB, 222us/222us)
adam_optimizer/gradients/conv1/Conv2D_grad/Shape (16B/16B, 8us/8us)
adam_optimizer/gradients/conv1/Conv2D_grad/Shape_1 (16B/16B, 5us/5us)
adam_optimizer/gradients/conv1/Conv2D_grad/tuple/control_dependency_1 (3.20KB/3.20KB, 3us/3us)
adam_optimizer/gradients/conv1/Relu_grad/ReluGrad (5.02MB/5.02MB, 40us/40us)
adam_optimizer/gradients/conv1/add_grad/BroadcastGradientArgs (12B/12B, 10us/10us)
adam_optimizer/gradients/conv1/add_grad/Reshape (5.02MB/5.02MB, 5us/5us)
adam_optimizer/gradients/conv1/add_grad/Reshape_1 (128B/128B, 4us/4us)
adam_optimizer/gradients/conv1/add_grad/Shape (16B/16B, 8us/8us)
adam_optimizer/gradients/conv1/add_grad/Shape_1 (4B/4B, 3us/3us)
adam_optimizer/gradients/conv1/add_grad/Sum (5.02MB/5.02MB, 7us/7us)
adam_optimizer/gradients/conv1/add_grad/Sum_1 (128B/128B, 72us/72us)
adam_optimizer/gradients/conv1/add_grad/tuple/control_dependency (5.02MB/5.02MB, 3us/3us)
adam_optimizer/gradients/conv1/add_grad/tuple/control_dependency_1 (128B/128B, 3us/3us)
adam_optimizer/gradients/conv2/Conv2D_grad/Conv2DBackpropFilter (204.80KB/204.80KB, 310us/310us)
adam_optimizer/gradients/conv2/Conv2D_grad/Conv2DBackpropInput (1.25MB/1.25MB, 287us/287us)
adam_optimizer/gradients/conv2/Conv2D_grad/Shape (16B/16B, 8us/8us)
adam_optimizer/gradients/conv2/Conv2D_grad/Shape_1 (16B/16B, 2us/2us)
adam_optimizer/gradients/conv2/Conv2D_grad/tuple/control_dependency (1.25MB/1.25MB, 3us/3us)
adam_optimizer/gradients/conv2/Conv2D_grad/tuple/control_dependency_1 (204.80KB/204.80KB, 3us/3us)
adam_optimizer/gradients/conv2/Relu_grad/ReluGrad (2.51MB/2.51MB, 41us/41us)
adam_optimizer/gradients/conv2/add_grad/BroadcastGradientArgs (12B/12B, 10us/10us)
adam_optimizer/gradients/conv2/add_grad/Reshape (2.51MB/2.51MB, 6us/6us)
adam_optimizer/gradients/conv2/add_grad/Reshape_1 (256B/256B, 4us/4us)
adam_optimizer/gradients/conv2/add_grad/Shape (16B/16B, 9us/9us)
adam_optimizer/gradients/conv2/add_grad/Shape_1 (4B/4B, 2us/2us)
adam_optimizer/gradients/conv2/add_grad/Sum (2.51MB/2.51MB, 6us/6us)
adam_optimizer/gradients/conv2/add_grad/Sum_1 (256B/256B, 72us/72us)
adam_optimizer/gradients/conv2/add_grad/tuple/control_dependency (2.51MB/2.51MB, 2us/2us)
adam_optimizer/gradients/conv2/add_grad/tuple/control_dependency_1 (256B/256B, 3us/3us)
adam_optimizer/gradients/dropout/dropout/div_grad/BroadcastGradientArgs (8B/8B, 13us/13us)
adam_optimizer/gradients/dropout/dropout/div_grad/Neg (204.80KB/204.80KB, 41us/41us)
adam_optimizer/gradients/dropout/dropout/div_grad/RealDiv (204.80KB/204.80KB, 45us/45us)
adam_optimizer/gradients/dropout/dropout/div_grad/RealDiv_1 (204.80KB/204.80KB, 45us/45us)
adam_optimizer/gradients/dropout/dropout/div_grad/RealDiv_2 (204.80KB/204.80KB, 41us/41us)
adam_optimizer/gradients/dropout/dropout/div_grad/Reshape (204.80KB/204.80KB, 6us/6us)
adam_optimizer/gradients/dropout/dropout/div_grad/Reshape_1 (4B/4B, 4us/4us)
adam_optimizer/gradients/dropout/dropout/div_grad/Sum (204.80KB/204.80KB, 7us/7us)
adam_optimizer/gradients/dropout/dropout/div_grad/Sum_1 (4B/4B, 49us/49us)
adam_optimizer/gradients/dropout/dropout/div_grad/mul (204.80KB/204.80KB, 43us/43us)
adam_optimizer/gradients/dropout/dropout/div_grad/tuple/control_dependency (204.80KB/204.80KB, 4us/4us)
adam_optimizer/gradients/dropout/dropout/mul_grad/Reshape (204.80KB/204.80KB, 5us/5us)
adam_optimizer/gradients/dropout/dropout/mul_grad/Reshape_1 (204.80KB/204.80KB, 4us/4us)
adam_optimizer/gradients/dropout/dropout/mul_grad/Shape (8B/8B, 9us/9us)
adam_optimizer/gradients/dropout/dropout/mul_grad/Shape_1 (8B/8B, 10us/10us)
adam_optimizer/gradients/dropout/dropout/mul_grad/Sum (204.80KB/204.80KB, 8us/8us)
adam_optimizer/gradients/dropout/dropout/mul_grad/Sum_1 (204.80KB/204.80KB, 5us/5us)
adam_optimizer/gradients/dropout/dropout/mul_grad/mul (204.80KB/204.80KB, 42us/42us)
adam_optimizer/gradients/dropout/dropout/mul_grad/mul_1 (204.80KB/204.80KB, 40us/40us)
adam_optimizer/gradients/dropout/dropout/mul_grad/tuple/control_dependency (204.80KB/204.80KB, 3us/3us)
adam_optimizer/gradients/fc1/MatMul_grad/MatMul (627.20KB/627.20KB, 53us/53us)
adam_optimizer/gradients/fc1/MatMul_grad/MatMul_1 (12.85MB/12.85MB, 89us/89us)
adam_optimizer/gradients/fc1/MatMul_grad/tuple/control_dependency (627.20KB/627.20KB, 4us/4us)
adam_optimizer/gradients/fc1/MatMul_grad/tuple/control_dependency_1 (12.85MB/12.85MB, 3us/3us)
adam_optimizer/gradients/fc1/Relu_grad/ReluGrad (204.80KB/204.80KB, 41us/41us)
adam_optimizer/gradients/fc1/Reshape_grad/Reshape (627.20KB/627.20KB, 5us/5us)
adam_optimizer/gradients/fc1/Reshape_grad/Shape (16B/16B, 8us/8us)
adam_optimizer/gradients/fc1/add_grad/BroadcastGradientArgs (4B/4B, 12us/12us)
adam_optimizer/gradients/fc1/add_grad/Reshape (204.80KB/204.80KB, 10us/10us)
adam_optimizer/gradients/fc1/add_grad/Reshape_1 (4.10KB/4.10KB, 4us/4us)
adam_optimizer/gradients/fc1/add_grad/Shape (8B/8B, 11us/11us)
adam_optimizer/gradients/fc1/add_grad/Shape_1 (4B/4B, 2us/2us)
adam_optimizer/gradients/fc1/add_grad/Sum (204.80KB/204.80KB, 7us/7us)
adam_optimizer/gradients/fc1/add_grad/Sum_1 (4.10KB/4.10KB, 99us/99us)
adam_optimizer/gradients/fc1/add_grad/tuple/control_dependency (204.80KB/204.80KB, 3us/3us)
adam_optimizer/gradients/fc1/add_grad/tuple/control_dependency_1 (4.10KB/4.10KB, 3us/3us)
adam_optimizer/gradients/fc2/MatMul_grad/MatMul (204.80KB/204.80KB, 54us/54us)
adam_optimizer/gradients/fc2/MatMul_grad/MatMul_1 (40.96KB/40.96KB, 52us/52us)
adam_optimizer/gradients/fc2/MatMul_grad/tuple/control_dependency (204.80KB/204.80KB, 4us/4us)
adam_optimizer/gradients/fc2/MatMul_grad/tuple/control_dependency_1 (40.96KB/40.96KB, 3us/3us)
adam_optimizer/gradients/fc2/add_grad/BroadcastGradientArgs (4B/4B, 11us/11us)
adam_optimizer/gradients/fc2/add_grad/Reshape (2.00KB/2.00KB, 6us/6us)
adam_optimizer/gradients/fc2/add_grad/Reshape_1 (40B/40B, 4us/4us)
adam_optimizer/gradients/fc2/add_grad/Shape (8B/8B, 6us/6us)
adam_optimizer/gradients/fc2/add_grad/Shape_1 (4B/4B, 2us/2us)
adam_optimizer/gradients/fc2/add_grad/Sum (2.00KB/2.00KB, 5us/5us)
adam_optimizer/gradients/fc2/add_grad/Sum_1 (40B/40B, 54us/54us)
adam_optimizer/gradients/fc2/add_grad/tuple/control_dependency (2.00KB/2.00KB, 3us/3us)
adam_optimizer/gradients/fc2/add_grad/tuple/control_dependency_1 (40B/40B, 3us/3us)
adam_optimizer/gradients/loss/Reshape_2_grad/Reshape (200B/200B, 7us/7us)
adam_optimizer/gradients/loss/Reshape_2_grad/Shape (4B/4B, 10us/10us)
adam_optimizer/gradients/loss/Reshape_grad/Reshape (2.00KB/2.00KB, 6us/6us)
adam_optimizer/gradients/loss/SoftmaxCrossEntropyWithLogits_grad/ExpandDims (200B/204B, 7us/9us)
adam_optimizer/gradients/loss/SoftmaxCrossEntropyWithLogits_grad/ExpandDims/dim (4B/4B, 2us/2us)
adam_optimizer/gradients/loss/SoftmaxCrossEntropyWithLogits_grad/mul (2.00KB/2.00KB, 44us/44us)
adam_optimizer/gradients/pool1/MaxPool_grad/MaxPoolGrad (5.02MB/5.02MB, 193us/193us)
adam_optimizer/gradients/pool2/MaxPool_grad/MaxPoolGrad (2.51MB/2.51MB, 195us/195us)
conv1/Conv2D (5.02MB/5.02MB, 154us/154us)
conv1/Relu (5.02MB/5.02MB, 43us/43us)
conv1/Variable (3.20KB/12.80KB, 17us/33us)
conv1/Variable/Adam (3.20KB/3.20KB, 8us/8us)
conv1/Variable/Adam_1 (3.20KB/3.20KB, 5us/5us)
conv1/Variable/read (3.20KB/3.20KB, 3us/3us)
conv1/Variable_1 (128B/512B, 13us/24us)
conv1/Variable_1/Adam (128B/128B, 4us/4us)
conv1/Variable_1/Adam_1 (128B/128B, 4us/4us)
conv1/Variable_1/read (128B/128B, 3us/3us)
conv1/add (5.02MB/5.02MB, 37us/37us)
conv2/Conv2D (2.51MB/2.51MB, 276us/276us)
conv2/Relu (2.51MB/2.51MB, 47us/47us)
conv2/Variable (204.80KB/819.20KB, 13us/27us)
conv2/Variable/Adam (204.80KB/204.80KB, 5us/5us)
conv2/Variable/Adam_1 (204.80KB/204.80KB, 6us/6us)
conv2/Variable/read (204.80KB/204.80KB, 3us/3us)
conv2/Variable_1 (256B/1.02KB, 11us/36us)
conv2/Variable_1/Adam (256B/256B, 10us/10us)
conv2/Variable_1/Adam_1 (256B/256B, 12us/12us)
conv2/Variable_1/read (256B/256B, 3us/3us)
conv2/add (2.51MB/2.51MB, 36us/36us)
dropout/dropout/Floor (204.80KB/204.80KB, 40us/40us)
dropout/dropout/Shape (8B/8B, 6us/6us)
dropout/dropout/add (204.80KB/204.80KB, 41us/41us)
dropout/dropout/div (204.80KB/204.80KB, 69us/69us)
dropout/dropout/mul (204.80KB/204.80KB, 44us/44us)
dropout/dropout/random_uniform (204.80KB/614.40KB, 41us/132us)
dropout/dropout/random_uniform/RandomUniform (204.80KB/204.80KB, 45us/45us)
dropout/dropout/random_uniform/min (4B/4B, 5us/5us)
dropout/dropout/random_uniform/mul (204.80KB/204.80KB, 41us/41us)
fc1/MatMul (204.80KB/204.80KB, 43us/43us)
fc1/Relu (204.80KB/204.80KB, 42us/42us)
fc1/Reshape (627.20KB/627.21KB, 4us/9us)
fc1/Reshape/shape (8B/8B, 5us/5us)
fc1/Variable (12.85MB/51.38MB, 5us/25us)
fc1/Variable/Adam (12.85MB/12.85MB, 12us/12us)
fc1/Variable/Adam_1 (12.85MB/12.85MB, 6us/6us)
fc1/Variable/read (12.85MB/12.85MB, 2us/2us)
fc1/Variable_1 (4.10KB/16.38KB, 11us/22us)
fc1/Variable_1/Adam (4.10KB/4.10KB, 4us/4us)
fc1/Variable_1/Adam_1 (4.10KB/4.10KB, 5us/5us)
fc1/Variable_1/read (4.10KB/4.10KB, 2us/2us)
fc1/add (204.80KB/204.80KB, 68us/68us)
fc2/MatMul (2.00KB/2.00KB, 53us/53us)
fc2/Variable (40.96KB/163.84KB, 7us/19us)
fc2/Variable/Adam (40.96KB/40.96KB, 5us/5us)
fc2/Variable/Adam_1 (40.96KB/40.96KB, 5us/5us)
fc2/Variable/read (40.96KB/40.96KB, 2us/2us)
fc2/Variable_1 (40B/160B, 5us/16us)
fc2/Variable_1/Adam (40B/40B, 4us/4us)
fc2/Variable_1/Adam_1 (40B/40B, 4us/4us)
fc2/Variable_1/read (40B/40B, 3us/3us)
fc2/add (2.00KB/2.00KB, 46us/46us)
loss/Reshape (2.00KB/2.00KB, 5us/5us)
loss/Reshape_1 (2.00KB/2.00KB, 3us/3us)
loss/Reshape_2 (200B/200B, 4us/4us)
loss/Shape (8B/8B, 6us/6us)
loss/Shape_2 (8B/8B, 7us/7us)
loss/Slice (4B/8B, 8us/10us)
loss/Slice/size (4B/4B, 2us/2us)
loss/Slice_1 (4B/4B, 7us/7us)
loss/Slice_2 (4B/8B, 7us/9us)
loss/Slice_2/begin (4B/4B, 2us/2us)
loss/SoftmaxCrossEntropyWithLogits (2.20KB/2.20KB, 260us/260us)
loss/concat (8B/16B, 11us/16us)
loss/concat/axis (4B/4B, 2us/2us)
loss/concat/values_0 (4B/4B, 3us/3us)
loss/concat_1 (8B/8B, 8us/8us)
pool1/MaxPool (1.25MB/1.25MB, 45us/45us)
pool2/MaxPool (627.20KB/627.20KB, 43us/43us)
reshape/Reshape (156.80KB/156.82KB, 10us/19us)
reshape/Reshape/shape (16B/16B, 9us/9us)
======================End of Report==========================
- A script to total the reported allocations:
#!/bin/bash
# Sum the per-node allocation totals from a saved tfprof report (tf.txt).
# Indented lines denote reused memory, and the _TFProfRoot line is the
# report's own grand total, so both are skipped.
f="tf.txt"
grep -v '^[[:space:]]' "$f" | grep -v '_TFProfRoot' | grep 'B' \
    | awk '{split($0, a, "[(,]"); print a[2]}' \
    | awk -F '/' '{print $2}' > array.txt

ans=0.0
for str in $(grep 'B' array.txt); do
    echo "$str"
    if [ "${str:(-2):2}" == 'MB' ]; then
        str=${str%%MB}
        ans=$(awk '{print $1 + $2*1024.0*1024.0}' <<< "$ans $str")
    elif [ "${str:(-2):2}" == 'KB' ]; then
        str=${str%%KB}
        ans=$(awk '{print $1 + $2*1024.0}' <<< "$ans $str")
    else
        str=${str%%B}
        ans=$(awk '{print $1 + $2}' <<< "$ans $str")
    fi
    echo "$ans"
done

echo "total bytes: $ans"
# Convert the byte total to MB for comparison with the report's root total.
awk '{print $1/1024.0/1024.0}' <<< "$ans"
Analysis
Comparing the two, the total memory allocation computed by the script is close to the total the report itself gives. Note, however, that some lines in the printed report are indented; indentation indicates reuse, and I did not count reused allocations toward the total.
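For reference, here is a Python equivalent of the shell aggregation above; it is a sketch that assumes the report was saved to tf.txt, that reused lines are indented, and that the _TFProfRoot line carries the report's own grand total:
import re

UNIT = {'B': 1.0, 'KB': 1024.0, 'MB': 1024.0 ** 2}
total = 0.0
with open('tf.txt') as f:
    for line in f:
        if '_TFProfRoot' in line:
            continue  # the root line is the grand total we compare against
        if line.startswith((' ', '\t')):
            continue  # indented lines mark reused memory; don't double count
        m = re.search(r'\(([\d.]+)(MB|KB|B)/([\d.]+)(MB|KB|B),', line)
        if m:  # take the second (total) figure, converted to bytes
            total += float(m.group(3)) * UNIT[m.group(4)]
print('total: %.2f MB' % (total / 1024.0 ** 2))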
References
- https://www.tensorflow.org/programmers_guide/debugger
- http://blog.csdn.net/liuchonge/article/details/69397860
- https://github.com/tensorflow/tensorflow/tree/v1.2.1/tensorflow/tools/tfprof
- https://github.com/tensorflow/tensorflow/blob/v1.3.0-rc2/tensorflow/contrib/tfprof/model_analyzer.py
- http://www.sohu.com/a/126825399_473283