TensorFlow2.0 (8): Computing Error - A Summary of Loss Functions

阿新 • Published: 2019-10-23

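Note: All posts in this series will be continuously updated and published on GitHub, where you can download the notebook files for every article in the series.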

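1 Mean Squared Error Loss: MSE

Mean squared error (MSE) is probably the most commonly used way of measuring error. It is defined as: $$loss = \frac{1}{N}\sum {{{(y - pred)}^2}} $$

Here $y$ is the true value, $pred$ is the predicted value, and $N$ usually denotes batch_size, though it sometimes denotes the number of features.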

In [1]:

import tensorflow as tf
y = tf.random.uniform((5,),maxval=5,dtype=tf.int32)  # assume these are the true labels
print(y)

y = tf.one_hot(y,depth=5)  # convert to one-hot encoding
print(y)

tf.Tensor([2 4 4 0 2], shape=(5,), dtype=int32)
tf.Tensor(
[[0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 1.]
 [1. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0.]], shape=(5, 5), dtype=float32)

In [2]:

y

Out[2]:

<tf.Tensor: id=7, shape=(5, 5), dtype=float32, numpy=
array([[0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 1.],
       [0., 0., 0., 0., 1.],
       [1., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0.]], dtype=float32)>

In [3]:

pred = tf.random.uniform((5,),maxval=5,dtype=tf.int32)  # assume these are the predicted labels
pred = tf.one_hot(pred,depth=5)  # convert to one-hot encoding
print(pred)

tf.Tensor(
[[0. 1. 0. 0. 0.]
 [0. 0. 0. 1. 0.]
 [1. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]], shape=(5, 5), dtype=float32)

In [4]:

loss1 = tf.reduce_mean(tf.square(y-pred))
loss1

Out[4]:

<tf.Tensor: id=19, shape=(), dtype=float32, numpy=0.4>

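TensorFlow's losses module provides an MSE function for computing the mean squared error. Note that the abbreviation MSE refers to a function, while the full name MeanSquaredError refers to a class; the function form is the one usually called. MSE returns one error value per sample (the mean over the last axis of each pair of true and predicted values), so to obtain the error over all samples you need to take a further mean: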

In [5]:

loss_mse_1 = tf.losses.MSE(y,pred)
loss_mse_1

Out[5]:

<tf.Tensor: id=22, shape=(5,), dtype=float32, numpy=array([0.4, 0.4, 0.4, 0.4, 0.4], dtype=float32)>

In [6]:

loss_mse_2 = tf.reduce_mean(loss_mse_1)
loss_mse_2

Out[6]:

<tf.Tensor: id=24, shape=(), dtype=float32, numpy=0.4>
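As a cross-check on the two-step reduction above, here is a minimal sketch in plain Python (no TensorFlow) that recomputes the per-sample and overall MSE. The class indices for the predictions are read off the printed one-hot output, and the names `one_hot`, `per_sample`, and `overall` are illustrative, not part of any library.

```python
# Class indices taken from the printed tensors above.
y_idx = [2, 4, 4, 0, 2]      # true labels
pred_idx = [1, 3, 0, 3, 4]   # predicted labels, read from the one-hot rows
depth = 5

def one_hot(index, depth):
    """Return a one-hot list of length `depth` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(depth)]

y = [one_hot(i, depth) for i in y_idx]
pred = [one_hot(i, depth) for i in pred_idx]

# Step 1: one MSE value per sample (mean over the last axis).
per_sample = [
    sum((a - b) ** 2 for a, b in zip(y_row, p_row)) / depth
    for y_row, p_row in zip(y, pred)
]

# Step 2: mean over the batch.
overall = sum(per_sample) / len(per_sample)

print(per_sample)  # [0.4, 0.4, 0.4, 0.4, 0.4]
print(overall)
```

Every prediction here misses the true class, so each row differs in exactly two positions and contributes 2/5 = 0.4, which is why all per-sample values and the overall mean coincide.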

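In general, mean squared error is better suited to regression problems. For classification problems, especially those whose target output is a one-hot vector, the cross-entropy loss described next is a much better fit.

2 Cross-Entropy Loss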

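Cross entropy is an important concept in information theory, mainly used to measure the difference between two probability distributions: the smaller the cross entropy, the smaller the difference between them. With a one-hot true distribution, a cross entropy of 0 is the optimum, meaning the prediction matches the true value exactly. The formula is:

$$H(p,q) = - \sum\limits_i {p(x_i)\log q(x_i)} $$

where $p(x_i)$ is the probability of class $i$ under the true distribution and $q(x_i)$ is the probability the model estimates for it from the data.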

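Not clear yet? No problem, let's go through an example. Suppose a classification problem has 5 possible classes, denoted $[1,2,3,4,5]$, and a sample $x$ truly belongs to class 2. In one-hot encoding this is $[0,1,0,0,0]$, which is the true distribution $p$ in the formula above. Now suppose two models predict $[0.1, 0.7, 0.05, 0.05, 0.1]$ and $[0, 0.6, 0.2, 0.1, 0.1]$ for $x$; these are the estimates $q$ in the formula. Intuitively we would judge the first model to be more accurate, because it is more confident that $x$ belongs to class 2, but we need a quantitative comparison to confirm this: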

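Cross entropy of the first model: ${H_1} = - (0 \times \log 0.1 + 1 \times \log 0.7 + 0 \times \log 0.05 + 0 \times \log 0.05 + 0 \times \log 0.1) = - \log 0.7 = 0.36$

Cross entropy of the second model (taking $0 \times \log 0 = 0$ by convention): ${H_2} = - (0 \times \log 0 + 1 \times \log 0.6 + 0 \times \log 0.2 + 0 \times \log 0.1 + 0 \times \log 0.1) = - \log 0.6 = 0.51$

Here $\log$ is the natural logarithm. Since ${H_1} < {H_2}$, the first model's result is the more reliable one.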
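The hand calculation above can be reproduced with a few lines of plain Python. This is only a sketch of the formula, assuming the natural logarithm and the convention that terms with a true probability of 0 contribute nothing; the function name `cross_entropy` is illustrative.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i), skipping terms where p_i == 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0, 1, 0, 0, 0]                  # one-hot true distribution (class 2)
q1 = [0.1, 0.7, 0.05, 0.05, 0.1]     # first model's prediction
q2 = [0, 0.6, 0.2, 0.1, 0.1]         # second model's prediction

h1 = cross_entropy(p, q1)
h2 = cross_entropy(p, q2)
print(round(h1, 4), round(h2, 4))  # 0.3567 0.5108
```

Skipping the zero-probability terms is what keeps the second model's predicted 0 from triggering a `log(0)` error: with one-hot labels, only the predicted probability of the true class affects the result.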

In TensorFlow, cross entropy is computed with the categorical_crossentropy() function in the tf.losses module.

In [7]:

tf.losses.categorical_crossentropy([0,1,0,0,0],[0.1, 0.7, 0.05, 0.05, 0.1])

Out[7]:

<tf.Tensor: id=41, shape=(), dtype=float32, numpy=0.35667497>

In [8]:

tf.losses.categorical_crossentropy([0,1,0,0,0],[0, 0.6, 0.2, 0.1, 0.1])

Out[8]:

<tf.Tensor: id=58, shape=(), dtype=float32, numpy=0.5108256>

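The output of a model's last layer may not be in the form of probabilities. It can be converted to probabilities with the softmax function before computing the cross entropy, but this is sometimes numerically unstable and produces NaN or inf. In that case you can compute the cross entropy directly from the raw output of the last layer by passing from_logits=True to categorical_crossentropy().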

In [9]:

x = tf.random.normal([1,784])
w = tf.random.normal([784,2])
b = tf.zeros([2])

In [10]:

logits = x@w + b  # the output of a final layer with no activation function is called the logits
logits

Out[10]:

<tf.Tensor: id=75, shape=(1, 2), dtype=float32, numpy=array([[ 5.236802, 18.843138]], dtype=float32)>

In [12]:

prob = tf.math.softmax(logits, axis=1)  # convert to probabilities
prob

Out[12]:

<tf.Tensor: id=77, shape=(1, 2), dtype=float32, numpy=array([[1.2326591e-06, 9.9999881e-01]], dtype=float32)>

In [13]:

tf.losses.categorical_crossentropy([0,1],logits,from_logits=True)  # compute cross entropy directly from the logits

Out[13]:

<tf.Tensor: id=112, shape=(1,), dtype=float32, numpy=array([1.1920922e-06], dtype=float32)>

In [14]:

tf.losses.categorical_crossentropy([0,1],prob)  # compute cross entropy from the softmax probabilities

Out[14]:

<tf.Tensor: id=128, shape=(1,), dtype=float32, numpy=array([1.1920936e-06], dtype=float32)>
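To illustrate why working from logits can be more stable, the sketch below compares a naive "softmax, then log" computation with the standard log-sum-exp formulation in plain Python. It is not TensorFlow's implementation, and both function names are hypothetical. In pure Python the naive version fails with an exception on extreme logits; with floating-point tensors the same situations typically show up as inf or NaN.

```python
import math

def naive_cross_entropy(labels, logits):
    """Softmax first, then cross entropy; breaks on extreme logits."""
    exps = [math.exp(z) for z in logits]   # overflows for large z
    total = sum(exps)
    probs = [e / total for e in exps]      # can underflow to exactly 0.0
    return -sum(p * math.log(q) for p, q in zip(labels, probs) if p > 0)

def stable_cross_entropy(labels, logits):
    """Cross entropy computed from logits with the log-sum-exp trick."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log(softmax(z)_i) equals log_sum_exp - z_i
    return sum(p * (log_sum_exp - z) for p, z in zip(labels, logits))

# The logits printed above: both routes agree on a tiny loss.
print(stable_cross_entropy([0, 1], [5.236802, 18.843138]))

# Extreme logits: the stable version still returns a finite value.
print(stable_cross_entropy([0, 1], [0.0, -1000.0]))

for labels, logits in [([1, 0], [1000.0, 0.0]), ([0, 1], [0.0, -1000.0])]:
    try:
        naive_cross_entropy(labels, logits)
    except (OverflowError, ValueError) as err:
        print("naive version failed:", type(err).__name__)
```

Subtracting the maximum logit before exponentiating keeps every exponent at or below 0, so nothing overflows, and the log of the true-class probability is never formed from a value that has already underflowed to 0.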
