Eager execution in TensorFlow

阿新 • Published: 2019-02-02

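This article is based mainly on Stanford University's CS20SI course; interested readers can follow the CS20 link to view the course notes.

The TensorFlow we usually use today is declarative. This means that before running a graph we must first declare everything in it, and only then execute it.

Graphs are...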

Optimizable

-automatic buffer reuse

-constant folding

-inter-op parallelism

-automatic trade-off between compute and memory

Deployable

-the graph is an intermediate representation of a model.

Rewritable

-experiment with automatic device placement or quantization.
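To make one of these optimizations concrete, here is a minimal sketch of constant folding on a toy expression graph. The tuple-based graph format and the `fold_constants` helper are assumptions made up for this illustration; they are not TensorFlow's graph representation or its optimizer.

```python
# Toy expression graph: a node is a number, a variable name (str),
# or a tuple (op, left, right). This is NOT TensorFlow's graph format.

def fold_constants(node):
    """Return an equivalent graph with constant-only subtrees precomputed."""
    if not isinstance(node, tuple):
        return node  # a constant or a variable name: nothing to fold
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    both_constant = (isinstance(left, (int, float))
                     and isinstance(right, (int, float)))
    if both_constant:
        return left + right if op == "add" else left * right
    return (op, left, right)

# (x * (2 + 3)) + (4 * 5): both constant subtrees can be computed ahead of time.
graph = ("add", ("mul", "x", ("add", 2, 3)), ("mul", 4, 5))
folded = fold_constants(graph)
print(folded)  # ('add', ('mul', 'x', 5), 20)
```

Because the whole graph is known before anything runs, a rewrite like this can be applied once, up front, which is the kind of opportunity a declarative graph offers.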

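But graphs are also...

Difficult to debug

-errors are reported long after the graph was constructed, when it is finally executed.

-graph execution cannot be debugged with pdb or print statements.

Un-Pythonic

-writing a TensorFlow program is an exercise in metaprogramming.

-TensorFlow control flow differs greatly from Python's.

-graph construction does not mix easily with ordinary Python data structures.

To address these problems with graphs, the TensorFlow developers introduced eager execution.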

"A NumPy-like library for numerical computation with support for GPU acceleration and automatic differentiation, and a flexible platform for machine learning research and experimentation."

----the eager execution user guide

A demo that enables eager execution:

$ python
import tensorflow as tf  # version >= 1.5
import tensorflow.contrib.eager as tfe
tfe.enable_eager_execution()

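Key advantages:

Compatible with Python debugging tools

You can finally use pdb.set_trace()!

Immediate error feedback
Python data structures can be used
Python control flow can be used, such as if statements, for loops, and recursion.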

Eager execution makes your code more concise

You no longer need to worry about...

1. placeholders

2. sessions

3. control dependencies

4. lazy loading

5. {name, variable, op} scopes

Some comparisons

Before eager execution:

Here we multiply a matrix by itself.

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[1, 1])
m = tf.matmul(x, x)

print(m)  # a symbolic tensor; nothing has been computed yet
# Tensor("MatMul:0", shape=(1, 1), dtype=float32)

with tf.Session() as sess:
  m_out = sess.run(m, feed_dict={x: [[2.]]})
print(m_out)
# [[4.]]

With eager execution:

x = [[2.]]  # No need for placeholders!
m = tf.matmul(x, x)

print(m)  # No sessions!
# tf.Tensor([[4.]], shape=(1, 1), dtype=float32)

With eager execution, three lines of code are enough to accomplish the same task. There is no placeholder and no session, which greatly simplifies the code.
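The difference between the two styles can be sketched without TensorFlow at all. The toy below uses scalar multiplication instead of `tf.matmul`, and the `Node`, `placeholder`, `multiply`, `run` and `eager_multiply` names are hypothetical helpers invented for this illustration, not TensorFlow APIs.

```python
class Node:
    """A symbolic operation in a toy graph (hypothetical, not a TensorFlow class)."""
    def __init__(self, op, inputs=()):
        self.op, self.inputs = op, inputs

    def __repr__(self):
        return "Node(%r)" % self.op

def placeholder():
    return Node("placeholder")

def multiply(a, b):
    return Node("multiply", (a, b))

def run(node, feed):
    """Evaluate a node, looking up placeholder values in `feed`."""
    if node.op == "placeholder":
        return feed[node]
    a, b = (run(inp, feed) for inp in node.inputs)
    return a * b

# Declarative style: describe the computation first, run it later.
x = placeholder()
m = multiply(x, x)
print(m)                 # Node('multiply'): a description, not a value
print(run(m, {x: 2.0}))  # 4.0

# Eager style: the operation runs as soon as it is called.
def eager_multiply(a, b):
    return a * b

print(eager_multiply(2.0, 2.0))  # 4.0
```

In the declarative half, `run` plays the role a session plays in the snippet above; in the eager half there is nothing left to run.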

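On lazy loading:

x = tf.random_uniform([2, 2])

with tf.Session() as sess:
  for i in range(x.shape[0]):
    for j in range(x.shape[1]):
      print(sess.run(x[i, j]))

In this code, every iteration adds a new op to the graph, because the indexing expression x[i, j] creates an op each time it is evaluated. With eager execution there is no graph to grow and no op is re-created on each pass, so the code becomes simpler: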

x = tf.random_uniform([2, 2])

for i in range(x.shape[0]):
  for j in range(x.shape[1]):
    print(x[i, j])
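Why the graph version is a problem can be shown with a toy counter. `ToyGraph` and `add_op` below are hypothetical stand-ins that only record op names; they are an assumption for illustration and not how TensorFlow stores its graph.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph: it only records the ops added to it."""
    def __init__(self):
        self.ops = []

    def add_op(self, name):
        self.ops.append(name)
        return name

graph = ToyGraph()

# Lazy loading: the indexing op is created inside the loop,
# so every iteration adds a new node to the graph.
for i in range(2):
    for j in range(2):
        graph.add_op("slice_%d_%d" % (i, j))
lazy_count = len(graph.ops)

# Running the same loop again adds four more nodes for the same work.
for i in range(2):
    for j in range(2):
        graph.add_op("slice_%d_%d" % (i, j))

print(lazy_count, len(graph.ops))  # 4 8
```

In a long training loop this kind of growth adds up, which is why lazy loading is on the list of things eager execution lets you stop worrying about.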

As a further tip, tensors can behave much like NumPy arrays. A small example:

import numpy as np

x = tf.constant([1.0, 2.0, 3.0])

# Tensors are backed by NumPy arrays
assert type(x.numpy()) == np.ndarray
squared = np.square(x) # Tensors are compatible with NumPy functions

# Tensors are iterable!
for i in x:
  print(i)

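Gradients

Automatic differentiation is built into eager execution.

Under the hood...

Operations are recorded on a tape
The tape is played back to compute gradients. (This is reverse-mode differentiation, that is, backpropagation.)

In other words, every op executed in eager mode is logged, and replaying that log in reverse yields the gradients. If you are familiar with the autograd package, the API is very similar.

For example: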
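The record-and-replay idea can be sketched in a few lines of plain Python. The `Tape`, `Value` and `gradient` names are hypothetical and support only scalar addition and multiplication; this is a toy under those assumptions, not how TensorFlow implements its tape.

```python
class Tape:
    """Records each executed op with the information needed to differentiate it."""
    def __init__(self):
        self.records = []  # entries: (output, [(input, local_gradient), ...])

class Value:
    """A number that logs the ops applied to it on a tape."""
    def __init__(self, data, tape):
        self.data, self.tape = data, tape

    def __mul__(self, other):
        out = Value(self.data * other.data, self.tape)
        self.tape.records.append((out, [(self, other.data), (other, self.data)]))
        return out

    def __add__(self, other):
        out = Value(self.data + other.data, self.tape)
        self.tape.records.append((out, [(self, 1.0), (other, 1.0)]))
        return out

def gradient(tape, output, source):
    """Replay the tape backwards, accumulating d(output)/d(value) by the chain rule."""
    grads = {id(output): 1.0}
    for out, inputs in reversed(tape.records):
        upstream = grads.get(id(out), 0.0)
        for inp, local in inputs:
            grads[id(inp)] = grads.get(id(inp), 0.0) + upstream * local
    return grads.get(id(source), 0.0)

tape = Tape()
x = Value(3.0, tape)
y = x * x + x                # y = x^2 + x, recorded as two ops
print(y.data)                # 12.0
print(gradient(tape, y, x))  # 7.0, i.e. 2 * 3 + 1
```

The forward pass is ordinary Python; only the bookkeeping on the tape is extra, and the backward pass is a single loop over that log in reverse order.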

def square(x):
  return x ** 2

grad = tfe.gradients_function(square)

print(square(3.))    # tf.Tensor(9., shape=(), dtype=float32)
print(grad(3.))      # [tf.Tensor(6., shape=(), dtype=float32)]

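Here tfe.gradients_function() takes a Python function and returns a new function that computes the derivatives of the original with respect to its arguments. Another example:

x = tfe.Variable(2.0)
def loss(y):
  return (y - x ** 2) ** 2

grad = tfe.implicit_gradients(loss)

print(loss(7.))  # tf.Tensor(9., shape=(), dtype=float32)
print(grad(7.))  # [(<tf.Tensor: -24.0, shape=(), dtype=float32>,
                 #   <tf.Variable 'Variable:0' shape=()
                 #    dtype=float32, numpy=2.0>)]

With eager execution, variables should be declared with tfe.Variable. tfe.implicit_gradients() differentiates with respect to the variables the function uses rather than its arguments, and returns (gradient, variable) pairs.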
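The values 9 and -24 in the comments above can be checked by hand. The sketch below restates the loss in plain Python with the variable x passed explicitly (an assumption made so the code runs without TensorFlow) and compares the analytic derivative with a finite-difference estimate.

```python
def loss(x, y):
    # Same formula as the loss above, with the variable x passed explicitly.
    return (y - x ** 2) ** 2

def analytic_grad(x, y):
    # d/dx (y - x^2)^2 = 2 * (y - x^2) * (-2x)
    return 2 * (y - x ** 2) * (-2 * x)

def numeric_grad(x, y, eps=1e-5):
    # Central finite difference as an independent check.
    return (loss(x + eps, y) - loss(x - eps, y)) / (2 * eps)

print(loss(2.0, 7.0))           # 9.0
print(analytic_grad(2.0, 7.0))  # -24.0
print(abs(numeric_grad(2.0, 7.0) - analytic_grad(2.0, 7.0)) < 1e-4)
```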

The following APIs can be used to compute gradients even when eager execution is not enabled.

tfe.gradients_function()
tfe.value_and_gradients_function()
tfe.implicit_gradients()
tfe.implicit_value_and_gradients()

Huber regression with eager execution

It differs little from the version written without eager execution.
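The article does not show the regression code itself, so the following is only a sketch of the idea in plain Python. It assumes the standard textbook form of the Huber loss, a one-parameter model y = w * x, and made-up toy data; `huber_loss`, `huber_grad` and the learning rate are all choices made for this illustration. The point is that the loss can be written with an ordinary Python if statement, the kind of control flow that eager execution allows.

```python
def huber_loss(y, y_predicted, delta=1.0):
    """Standard Huber loss: quadratic for small residuals, linear for large ones."""
    t = abs(y - y_predicted)
    if t <= delta:                      # plain Python control flow
        return 0.5 * t ** 2
    return delta * (t - 0.5 * delta)

def huber_grad(y, y_predicted, delta=1.0):
    """Derivative of huber_loss with respect to y_predicted."""
    t = y_predicted - y
    if abs(t) <= delta:
        return t
    return delta if t > 0 else -delta

# Fit y = w * x to toy data (made up for this example) with gradient descent.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 30.0)]  # last point is an outlier
w = 0.0
for _ in range(500):
    grad_w = sum(huber_grad(y, w * x) * x for x, y in data) / len(data)
    w -= 0.05 * grad_w
print(round(w, 2))  # 2.29
```

On this data an ordinary least-squares fit would give w = 148/30, about 4.93, so the Huber fit stays much closer to the slope of 2 suggested by the first three points.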

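A collection of operations

TensorFlow = Operation Kernels + Execution

Graph construction: execute compositions of operations with Sessions.

Eager execution: execute compositions of operations with Python.

One way to understand TensorFlow is as a collection of operations (for math, linear algebra, image processing, generating TensorBoard visualizations, and so on) together with a mechanism for executing compositions of them. A Session provides one way to execute these ops; in eager execution mode, Python executes them directly.

The underlying operations are the same in both cases, so the APIs are largely the same as well.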

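In general, the TensorFlow APIs can be used whether or not eager execution is enabled. With eager execution enabled, however...

Prefer tfe.Variable for defining variables, which keeps the code compatible with graph construction.
Manage your own variable storage; variable collections are not supported.
Use tf.contrib.summary
Use tfe.Iterator to iterate over datasets under eager execution.
Prefer object-oriented layers (e.g. tf.layers.Dense)

-Functional layers (e.g. tf.layers.dense) only work if wrapped in tfe.make_template.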
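One way to see why object-oriented layers are the natural fit is to look at where the weights live. The toy below is an assumption-laden sketch: `Dense` and `dense` are hypothetical plain-Python stand-ins, not the TensorFlow layers of the same names. It contrasts a layer object that creates its weights once with a function that has nowhere to keep them between calls.

```python
import random

class Dense:
    """Toy object-oriented layer: weights are created once and stored on the object."""
    def __init__(self, n_in, n_out, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_in)]

    def __call__(self, x):
        return [sum(x[i] * self.w[i][j] for i in range(len(x)))
                for j in range(len(self.w[0]))]

def dense(x, n_out):
    """Toy functional layer: fresh weights are created on every call."""
    rng = random.Random()
    w = [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in x]
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(n_out)]

layer = Dense(2, 3)
x = [1.0, 2.0]
print(layer(x) == layer(x))        # True: the same weights are reused
print(dense(x, 3) == dense(x, 3))  # almost surely False: weights differ per call
```

Without a graph-level variable store, something has to own the weights across calls; in this sketch that owner is the layer object.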

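What if I like graphs?

You can go from imperative to declarative and back.

The model only needs to be defined once.

-The same code can execute ops in one Python process and construct a graph in another.

Checkpoints are compatible.

-Train eagerly, checkpoint, load in a graph, or vice-versa.

Creating graphs in eager execution mode

-tfe.defun: "compile" a computation into a graph and then execute it.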

So, when should I use eager execution?

If you are a researcher who wants a flexible framework, you are developing a new machine learning model, or you are new to TensorFlow, we recommend using eager execution.
