Machine Learning---Regression---Vector and Matrix Derivatives

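Gradient

For f(x_1, x_2, x_3), the gradient can be written as the following vector equation, first in terms of the unit vectors and then as a row of partial derivatives: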

\nabla f = \frac{\partial f}{\partial x_1} \hat{x_1} + \frac{\partial f}{\partial x_2}  \hat{x_2} + \frac{\partial f}{\partial x_3} \hat{x_3}

\nabla f = \frac{\partial f}{\partial \mathbf{x}} = \begin{bmatrix}\frac{\partial f}{\partial x_1} &\frac{\partial f}{\partial x_2} &\frac{\partial f}{\partial x_3} \\\end{bmatrix}.
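
To make the definition concrete, here is a minimal sketch that approximates each partial derivative with a central difference and collects them into a list. The function `f`, the helper `numerical_gradient`, the step size `h`, and the evaluation point are all assumptions chosen for illustration; none of them come from the original text.

```python
import math

def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at point x using central differences."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h   # perturb only the i-th coordinate
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad

# Hypothetical example: f(x1, x2, x3) = x1^2 + 3*x2 + sin(x3)
def f(x):
    return x[0] ** 2 + 3 * x[1] + math.sin(x[2])

point = [1.0, 2.0, 0.0]
grad = numerical_gradient(f, point)

# The analytic gradient is [2*x1, 3, cos(x3)], which is [2, 3, 1] at this point.
print(grad)
```

The printed values should agree with the analytic gradient up to a small finite-difference error.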

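Layout conventions

Derivative of a vector with respect to a vector, i.e. \frac{\partial \mathbf{y}}{\partial \mathbf{x}}, where the numerator y is m-dimensional and the denominator x is n-dimensional:

  1. Numerator layout (Jacobian formulation): y is laid out as a column vector and x as a row vector. The result is an m×n matrix. Reading across a row, x varies (∂y_1/∂x_1, ∂y_1/∂x_2, ∂y_1/∂x_3); reading down a column, y varies (∂y_1/∂x_1, ∂y_2/∂x_1, ∂y_3/∂x_1).
  2. Denominator layout (Hessian formulation, also called the gradient layout): y is laid out as a row vector and x as a column vector. This is the transpose of the Jacobian formulation, so the result is an n×m matrix. Reading across a row, y varies (∂y_1/∂x_1, ∂y_2/∂x_1, ∂y_3/∂x_1); reading down a column, x varies (∂y_1/∂x_1, ∂y_1/∂x_2, ∂y_1/∂x_3).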
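
The two layouts differ only in shape, which a small numerical sketch can show. The map `g` from R^3 to R^2, the helper names, and the evaluation point below are hypothetical and chosen just for illustration; the derivatives are approximated by central differences rather than computed symbolically.

```python
def jacobian_numerator_layout(f, x, h=1e-6):
    """Return the m x n matrix J with J[i][j] ~ d y_i / d x_j."""
    m = len(f(x))
    n = len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward = list(x)
        backward = list(x)
        forward[j] += h
        backward[j] -= h
        y_plus = f(forward)
        y_minus = f(backward)
        for i in range(m):
            J[i][j] = (y_plus[i] - y_minus[i]) / (2 * h)
    return J

def transpose(M):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*M)]

# Hypothetical map with n = 3 inputs and m = 2 outputs:
# y1 = x1 * x2, y2 = x2 + x3^2
def g(x):
    return [x[0] * x[1], x[1] + x[2] ** 2]

x0 = [1.0, 2.0, 3.0]
J_num = jacobian_numerator_layout(g, x0)  # numerator layout: 2 x 3
J_den = transpose(J_num)                  # denominator layout: 3 x 2

print(len(J_num), len(J_num[0]))  # rows, columns in numerator layout
print(len(J_den), len(J_den[0]))  # rows, columns in denominator layout
```

Analytically, the numerator-layout matrix at this point is [[2, 1, 0], [0, 1, 6]]; the denominator-layout matrix holds the same entries transposed.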

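Vector derivatives in numerator layout (the denominator layout is the transpose of the numerator layout)

Scalar by vector (a row vector):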

\frac{\partial y}{\partial \mathbf{x}} =\begin{bmatrix}\frac{\partial y}{\partial x_1} & \frac{\partial y}{\partial x_2} & \cdots & \frac{\partial y}{\partial x_n}\end{bmatrix}.

Vector by scalar (a column vector):

\frac{\partial \mathbf{y}}{\partial x} =\begin{bmatrix}\frac{\partial y_1}{\partial x}\\\frac{\partial y_2}{\partial x}\\\vdots\\\frac{\partial y_m}{\partial x}\\\end{bmatrix}.

Vector by vector:

\frac{\partial \mathbf{y}}{\partial \mathbf{x}} =\begin{bmatrix}\frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n}\\\frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n}\\\vdots & \vdots & \ddots & \vdots\\\frac{\partial y_m}{\partial x_1} & \frac{\partial y_m}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_n}\\\end{bmatrix}.

Scalar by matrix (for a p×q matrix X, the result is q×p):

\frac{\partial y}{\partial \mathbf{X}} =\begin{bmatrix}\frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{p1}}\\\frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{p2}}\\\vdots & \vdots & \ddots & \vdots\\\frac{\partial y}{\partial x_{1q}} & \frac{\partial y}{\partial x_{2q}} & \cdots & \frac{\partial y}{\partial x_{pq}}\\\end{bmatrix}.
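
As a sanity check on the transposed shape, the sketch below differentiates a scalar with respect to every entry of a matrix and stores the result in numerator layout. The scalar function y = a^T X b, the vectors `a` and `b`, and the matrix `X` are assumptions picked for illustration; for this choice, ∂y/∂x_{ij} = a_i b_j.

```python
def bilinear(a, X, b):
    """Compute the scalar y = a^T X b for list-based vectors and matrix."""
    return sum(a[i] * X[i][j] * b[j]
               for i in range(len(a))
               for j in range(len(b)))

def scalar_by_matrix_numerator_layout(f, X, h=1e-6):
    """Return the q x p matrix D with D[j][i] ~ d y / d x_ij (X is p x q)."""
    p, q = len(X), len(X[0])
    D = [[0.0] * p for _ in range(q)]
    for i in range(p):
        for j in range(q):
            X_plus = [row[:] for row in X]
            X_minus = [row[:] for row in X]
            X_plus[i][j] += h
            X_minus[i][j] -= h
            # Note the swapped indices: entry (j, i) holds d y / d x_ij.
            D[j][i] = (f(X_plus) - f(X_minus)) / (2 * h)
    return D

a = [1.0, 2.0]                 # p = 2
b = [3.0, 4.0, 5.0]            # q = 3
X = [[0.5, -1.0, 2.0],
     [1.5, 0.0, -0.5]]         # p x q = 2 x 3

D = scalar_by_matrix_numerator_layout(lambda M: bilinear(a, M, b), X)
print(len(D), len(D[0]))  # rows, columns of the derivative
```

With these values the derivative has q = 3 rows and p = 2 columns, and its entries should be close to [[3, 6], [4, 8], [5, 10]].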

Matrix by scalar:

\frac{\partial \mathbf{Y}}{\partial x} =\begin{bmatrix}\frac{\partial y_{11}}{\partial x} & \frac{\partial y_{12}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x}\\\frac{\partial y_{21}}{\partial x} & \frac{\partial y_{22}}{\partial x} & \cdots & \frac{\partial y_{2n}}{\partial x}\\\vdots & \vdots & \ddots & \vdots\\\frac{\partial y_{m1}}{\partial x} & \frac{\partial y_{m2}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x}\\\end{bmatrix}.

Properties

  • Chain rule (with z = f(y) and y = g(x)):

\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx} = f'(y)\,g'(x) = f'(g(x))\,g'(x).
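
The chain rule can be checked numerically. In the sketch below, the choices f = sin and g(x) = x^2, the helper `derivative`, and the point `x0` are assumptions for illustration only; derivatives are approximated with central differences.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def g(x):
    return x ** 2          # inner function, y = g(x)

f = math.sin               # outer function, z = f(y)

def z(x):
    return f(g(x))         # composition z = f(g(x))

x0 = 0.7
lhs = derivative(z, x0)                          # dz/dx computed directly
rhs = derivative(f, g(x0)) * derivative(g, x0)   # f'(g(x)) * g'(x)
analytic = math.cos(x0 ** 2) * 2 * x0            # closed form for this choice

print(lhs, rhs, analytic)
```

All three printed values should agree up to a small finite-difference error.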

  • Product rule:

(f \cdot g)' = f' \cdot g + f \cdot g'

\frac{d}{dx}(u \cdot v) = \frac{du}{dx} \cdot v + u \cdot \frac{dv}{dx}.

\frac{d}{dx}(u \cdot v \cdot w) = \frac{du}{dx} \cdot v \cdot w + u \cdot \frac{dv}{dx} \cdot w + u \cdot v \cdot \frac{dw}{dx}.
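
The three-factor form of the product rule can be verified the same way. The functions `u`, `v`, `w`, the helper `derivative`, and the point `x0` below are hypothetical choices made for illustration.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def u(x):
    return x ** 2

def v(x):
    return math.sin(x)

def w(x):
    return math.exp(x)

def product(x):
    return u(x) * v(x) * w(x)

x0 = 0.5
lhs = derivative(product, x0)  # d/dx (u * v * w) computed directly

# Right-hand side of the rule: differentiate one factor at a time.
rhs = (derivative(u, x0) * v(x0) * w(x0)
       + u(x0) * derivative(v, x0) * w(x0)
       + u(x0) * v(x0) * derivative(w, x0))

print(lhs, rhs)
```

The two printed values should match up to finite-difference error, which is what the rule predicts.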

  • Sum rule:

\frac{d}{dx}(u + v) = \frac{du}{dx} + \frac{dv}{dx}