TensorFlow 2.0 (5): Tensor Clipping

阿新 • Published: 2019-10-09

1 maximum() and minimum()

maximum() sets a lower bound: every element of a tensor that is smaller than the given value is replaced by that value:

In [2]:

import tensorflow as tf

In [3]:

a = tf.range(10)
a

Out[3]:

<tf.Tensor: id=3, shape=(10,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])>

In [4]:

tf.maximum(a, 4)

Out[4]:

<tf.Tensor: id=6, shape=(10,), dtype=int32, numpy=array([4, 4, 4, 4, 4, 5, 6, 7, 8, 9])>

In [12]:

b = tf.random.uniform([3,4], minval=1, maxval=10, dtype=tf.int32)
b

Out[12]:

<tf.Tensor: id=36, shape=(3, 4), dtype=int32, numpy=
array([[6, 7, 5, 3],
       [3, 2, 9, 4],
       [8, 4, 5, 8]])>

In [13]:

tf.maximum(b, 4)

Out[13]:

<tf.Tensor: id=39, shape=(3, 4), dtype=int32, numpy=
array([[6, 7, 5, 4],
       [4, 4, 9, 4],
       [8, 4, 5, 8]])>

minimum() is the opposite of maximum(): it sets an upper bound, so every element of a tensor that is larger than the given value is replaced by that value:

In [14]:

tf.minimum(a, 6)

Out[14]:

<tf.Tensor: id=42, shape=(10,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 6, 6, 6, 6])>

In [15]:

tf.minimum(b, 6)

Out[15]:

<tf.Tensor: id=45, shape=(3, 4), dtype=int32, numpy=
array([[6, 6, 5, 3],
       [3, 2, 6, 4],
       [6, 4, 5, 6]])>

To bound a tensor from below and from above at the same time, you can nest the two calls:

In [16]:

tf.minimum(tf.maximum(b,4),6)

Out[16]:

<tf.Tensor: id=50, shape=(3, 4), dtype=int32, numpy=
array([[6, 6, 5, 4],
       [4, 4, 6, 4],
       [6, 4, 5, 6]])>
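
For each element x, the nested call computes min(max(x, 4), 6). The following is a minimal sketch of that arithmetic in plain Python, without TensorFlow; the helper names clip_scalar and clip_nested are hypothetical and exist only for this illustration, and the input is the b tensor from above written as nested lists.

```python
def clip_scalar(x, low, high):
    """Raise x to at least low, then cap it at high."""
    return min(max(x, low), high)


def clip_nested(rows, low, high):
    """Apply clip_scalar to every element of a list of rows."""
    return [[clip_scalar(x, low, high) for x in row] for row in rows]


# The values of b shown in Out[12], as plain nested lists.
b_rows = [[6, 7, 5, 3],
          [3, 2, 9, 4],
          [8, 4, 5, 8]]

clipped_rows = clip_nested(b_rows, 4, 6)
print(clipped_rows)
```

The printed rows should match the numpy array shown in Out[16].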

Calling minimum() and maximum() together like this is not very convenient, so TensorFlow provides clip_by_value() to do the same thing.

2 clip_by_value()

Under the hood, clip_by_value() also calls minimum() and maximum() to apply the lower and upper bounds together. Let's try it out:

In [17]:

b

Out[17]:

<tf.Tensor: id=36, shape=(3, 4), dtype=int32, numpy=
array([[6, 7, 5, 3],
       [3, 2, 9, 4],
       [8, 4, 5, 8]])>

In [18]:

tf.clip_by_value(b,4,6)

Out[18]:

<tf.Tensor: id=56, shape=(3, 4), dtype=int32, numpy=
array([[6, 6, 5, 4],
       [4, 4, 6, 4],
       [6, 4, 5, 6]])>

3 relu()

relu() sets the lower bound of a tensor to 0, which is equivalent to tf.maximum(a, 0). Note that relu() lives in the tf.nn module:

In [19]:

a = tf.range(-5,5,1)
a

Out[19]:

<tf.Tensor: id=61, shape=(10,), dtype=int32, numpy=array([-5, -4, -3, -2, -1,  0,  1,  2,  3,  4])>

In [20]:

tf.nn.relu(a)

Out[20]:

<tf.Tensor: id=63, shape=(10,), dtype=int32, numpy=array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4])>

In [21]:

b = tf.random.uniform([3,4],minval=-10, maxval=10, dtype=tf.int32)
b

Out[21]:

<tf.Tensor: id=68, shape=(3, 4), dtype=int32, numpy=
array([[ 2,  0,  6,  7],
       [ 6,  2, -1, -8],
       [-7, -6, -1, -6]])>

In [22]:

tf.nn.relu(b)

Out[22]:

<tf.Tensor: id=70, shape=(3, 4), dtype=int32, numpy=
array([[2, 0, 6, 7],
       [6, 2, 0, 0],
       [0, 0, 0, 0]])>
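
The equivalence between relu() and a maximum with 0 can be checked element by element. This is a small plain-Python sketch, not TensorFlow code; the function name relu_sketch is hypothetical, and the input reproduces the values of tf.range(-5, 5, 1) as a list.

```python
def relu_sketch(values):
    """Replace every negative element with 0, i.e. max(v, 0) per element."""
    return [max(v, 0) for v in values]


a_values = list(range(-5, 5))  # [-5, -4, ..., 4]
relu_values = relu_sketch(a_values)
print(relu_values)
```

The result should match the array shown in Out[20].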

4 clip_by_norm()

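clip_by_norm() clips a tensor proportionally, based on its L2 norm (magnitude) and a given clipping value. This limits the magnitude of a vector without changing its direction. Let's first carry out the process by hand. Start by defining a tensor, which we treat as a single vector: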

In [23]:

a = tf.random.normal([2,3],mean=10)
a

Out[23]:

<tf.Tensor: id=77, shape=(2, 3), dtype=float32, numpy=
array([[10.035151, 11.085695, 11.230698],
       [11.222443, 10.120184,  8.86371 ]], dtype=float32)>

Then compute the L2 norm of this vector, that is, its magnitude:

In [24]:

n = tf.norm(a) 
n

Out[24]:

<tf.Tensor: id=83, shape=(), dtype=float32, numpy=25.625225>

Dividing the vector by its norm rescales it to unit length, so every component here falls between 0 and 1:

In [25]:

a1 = a / n
a1

Out[25]:

<tf.Tensor: id=85, shape=(2, 3), dtype=float32, numpy=
array([[0.3916122 , 0.4326087 , 0.4382673 ],
       [0.43794513, 0.39493054, 0.34589785]], dtype=float32)>

To clip the vector, for example to a norm of 10, multiply the normalized vector by 10:

In [29]:

a2 = a1 * 10
a2

Out[29]:

<tf.Tensor: id=94, shape=(2, 3), dtype=float32, numpy=
array([[3.916122 , 4.326087 , 4.382673 ],
       [4.3794513, 3.9493055, 3.4589787]], dtype=float32)>

clip_by_norm() carries out exactly these steps:

In [30]:

tf.clip_by_norm(a,10)

Out[30]:

<tf.Tensor: id=111, shape=(2, 3), dtype=float32, numpy=
array([[3.9161217, 4.326087 , 4.382673 ],
       [4.3794513, 3.9493055, 3.4589784]], dtype=float32)>

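clip_by_norm() also performs a check internally: if the given clipping value is greater than the tensor's norm, the tensor is not modified and is returned as is. Continuing the example above, the norm of a is 25.625225, so a clipping value larger than that leaves a unchanged: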

In [31]:

tf.clip_by_norm(a,26)

Out[31]:

<tf.Tensor: id=128, shape=(2, 3), dtype=float32, numpy=
array([[10.035151, 11.085695, 11.230698],
       [11.222443, 10.120184,  8.86371 ]], dtype=float32)>
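
The manual steps and the internal check can be combined into one function. The following is a minimal sketch in plain Python, assuming the tensor is given as a flat list of floats; clip_by_norm_sketch is a hypothetical name for this illustration and is not TensorFlow's implementation.

```python
import math


def clip_by_norm_sketch(values, clip_norm):
    """Scale a flat list of floats so that its L2 norm is at most clip_norm."""
    l2_norm = math.sqrt(sum(v * v for v in values))
    if l2_norm <= clip_norm:
        # Already within the limit: return the values unchanged.
        return list(values)
    scale = clip_norm / l2_norm  # same factor for every element, so direction is kept
    return [v * scale for v in values]


# The values of a shown in Out[23], flattened into one list.
a_flat = [10.035151, 11.085695, 11.230698, 11.222443, 10.120184, 8.86371]

clipped = clip_by_norm_sketch(a_flat, 10)    # norm is about 25.6, so this rescales
unchanged = clip_by_norm_sketch(a_flat, 26)  # 26 exceeds the norm, so nothing changes
print(clipped)
print(unchanged)
```

Because every element is multiplied by the same factor, the ratios between elements stay the same, which is why the direction of the vector is preserved.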

5 clip_by_global_norm()

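In many situations, such as gradient updates, several parameters (tensors) have to be handled together. clip_by_norm() is not enough there, which is why clip_by_global_norm() exists. It works on the same principle as clip_by_norm(): clipping is based on a norm and a given clipping value. The difference is that clip_by_global_norm() computes the norm over all of the given tensors combined.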

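Note: clip_by_global_norm() is used to rescale gradients and keep exploding gradients under control. Exploding and vanishing gradients share the same cause: the chain rule multiplies many factors together, so gradients can grow exponentially (explode) or shrink exponentially (vanish). To avoid exploding gradients, the gradients need to be clipped.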
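
To see why a chain of multiplications behaves this way, consider a toy case in which every layer contributes the same factor to the gradient. This is only an illustrative sketch in plain Python; the function name, the factors 1.5 and 0.5, and the depth of 50 are assumptions chosen for the example, not values from a real network.

```python
def product_of_factors(factor, depth):
    """Multiply the same per-layer factor depth times, as the chain rule would."""
    result = 1.0
    for _ in range(depth):
        result *= factor
    return result


grown = product_of_factors(1.5, 50)   # factor above 1: the product blows up
shrunk = product_of_factors(0.5, 50)  # factor below 1: the product collapses toward 0
print(grown, shrunk)
```

Clipping addresses only the first case: it caps the magnitude of large gradients and leaves small ones as they are.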

Take the following three vectors as an example and clip them together:

In [52]:

t1 = tf.random.normal([3],mean=10)
t1

Out[52]:

<tf.Tensor: id=298, shape=(3,), dtype=float32, numpy=array([9.564309, 9.443071, 8.37221 ], dtype=float32)>

In [53]:

t2 = tf.random.normal([3],mean=10)
t2

Out[53]:

<tf.Tensor: id=305, shape=(3,), dtype=float32, numpy=array([10.853721,  9.294285,  8.552048], dtype=float32)>

In [54]:

t3 = tf.random.normal([3],mean=10)
t3

Out[54]:

<tf.Tensor: id=312, shape=(3,), dtype=float32, numpy=array([10.658405,  9.979499,  8.440408], dtype=float32)>

In [55]:

t_list = [t1,t2,t3]

First compute the global L2 norm, which is defined as: global_norm = sqrt(sum([l2norm(t)**2 for t in t_list]))

In [56]:

global_norm = tf.norm([tf.norm(t) for t in t_list])

Suppose the clipping value is 25. Each tensor is then scaled by 25 / global_norm:

In [57]:

[t*25/global_norm for t in t_list]

Out[57]:

[<tf.Tensor: id=337, shape=(3,), dtype=float32, numpy=array([8.388461, 8.282128, 7.34292 ], dtype=float32)>,
 <tf.Tensor: id=340, shape=(3,), dtype=float32, numpy=array([9.519351 , 8.151634 , 7.5006485], dtype=float32)>,
 <tf.Tensor: id=343, shape=(3,), dtype=float32, numpy=array([9.348048, 8.752607, 7.402734], dtype=float32)>]

In [58]:

tf.clip_by_global_norm(t_list,25)

Out[58]:

([<tf.Tensor: id=365, shape=(3,), dtype=float32, numpy=array([8.388461 , 8.282129 , 7.3429203], dtype=float32)>,
  <tf.Tensor: id=366, shape=(3,), dtype=float32, numpy=array([9.519351, 8.151634, 7.500649], dtype=float32)>,
  <tf.Tensor: id=367, shape=(3,), dtype=float32, numpy=array([9.348048, 8.752607, 7.402734], dtype=float32)>],
 <tf.Tensor: id=355, shape=(), dtype=float32, numpy=28.50436>)

The results are the same, except that clip_by_global_norm() returns two values: the list of clipped tensors and the global norm.
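
The whole procedure, including the two return values, can be sketched in plain Python. This is a minimal illustration that assumes each tensor is a flat list of floats and that the clipping value is positive; clip_by_global_norm_sketch is a hypothetical name and not TensorFlow's implementation.

```python
import math


def clip_by_global_norm_sketch(tensors, clip_norm):
    """Return (clipped tensors, global norm) for a list of flat float lists.

    Assumes clip_norm > 0.
    """
    # Global norm: square root of the sum of squares over every element of every tensor.
    global_norm = math.sqrt(sum(v * v for t in tensors for v in t))
    # The scale is 1 when the global norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[v * scale for v in t] for t in tensors]
    return clipped, global_norm


# The values of t1, t2 and t3 shown above.
t_values = [
    [9.564309, 9.443071, 8.37221],
    [10.853721, 9.294285, 8.552048],
    [10.658405, 9.979499, 8.440408],
]

clipped_list, norm = clip_by_global_norm_sketch(t_values, 25)
print(clipped_list)
print(norm)
```

All tensors share one scale factor, so their relative sizes are preserved. In the gradient setting, this keeps the overall update direction intact while limiting its length.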
