Calculus of Variations
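A function \(y(x)\) takes a value of the input variable \(x\) and returns an output value \(y\). Analogously, a functional \(F[y]\) is a function of a function: it takes a function \(y\) as input and returns a value \(F\). The entropy \(\text{H}[x]\) is one example of a functional; it is defined on a probability density function \(p(x)\) and can equivalently be written \(\text{H}[p]\).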
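The calculus of variations is to functionals what finding extrema is to ordinary functions: it seeks a function \(y(x)\) that maximizes or minimizes the functional \(F[y]\). It can be used to show that the shortest path between two points is a straight line, and that the maximum-entropy distribution for a given mean and variance is a Gaussian.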
For a multivariate function \(y(\mathbf{x})=y(x_1,...,x_D)\), the Taylor expansion is
\[y(\mathbf{x}+\boldsymbol\epsilon)=y(\mathbf{x})+\boldsymbol\epsilon^\text{T}\frac{\partial y}{\partial \mathbf{x}}+O(\Vert\boldsymbol\epsilon\Vert^2).\]
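For a functional \(F[y]\), consider a small perturbation \(\epsilon\eta(x)\) of \(y(x)\), where \(\eta(x)\) is an arbitrary function. By analogy with the Taylor expansion above, letting the dimension of \(\mathbf{x}\) go to infinity gives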
\[F[y(x)+\epsilon\eta(x)]=F[y(x)]+\int \frac{\delta F}{\delta y(x)}\epsilon\eta(x){\rm{d}}x+O(\epsilon^2),\]
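where \(\delta F/\delta y(x)\) is the functional derivative (also called the functional gradient) of \(F\) with respect to \(y(x)\).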
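As a quick numerical sanity check of this expansion, the sketch below assumes a hypothetical functional \(F[y]=\int_0^1 y(x)^2\,\text{d}x\), whose functional derivative is \(2y(x)\). The functional, the choices of \(y\) and \(\eta\), and the helper names `integrate` and `F` are illustrative assumptions, not part of the original text.

```python
import math

def integrate(f, a=0.0, b=1.0, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def F(y):
    """Assumed example functional: F[y] = integral of y(x)^2 over [0, 1]."""
    return integrate(lambda x: y(x) ** 2)

def y(x):
    # Arbitrary smooth function chosen for illustration.
    return math.sin(x)

def eta(x):
    # Arbitrary perturbation direction chosen for illustration.
    return math.cos(3.0 * x)

eps = 1e-5

# Change in F under the perturbation, divided by eps.
finite_difference = (F(lambda x: y(x) + eps * eta(x)) - F(y)) / eps

# Integral of (delta F / delta y) * eta, with delta F / delta y = 2 y(x).
first_order_term = integrate(lambda x: 2.0 * y(x) * eta(x))

print(finite_difference, first_order_term)
```

The two printed values should agree up to a term of order \(\epsilon\), which is the \(O(\epsilon^2)\) remainder divided by \(\epsilon\).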
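For \(F[y]\) to attain an extremum, the first-order term in the expansion of \(F[y(x)+\epsilon\eta(x)]\) must vanish, which gives the condition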
\[\begin{aligned}\lim_{\epsilon\rightarrow 0}\frac{
\int \frac{\delta F}{\delta y(x)}\epsilon\eta(x){\rm{d}}x}{\epsilon}&=\int \frac{\delta F}{\delta y(x)}\eta(x){\rm{d}}x
\\&=0,
\end{aligned}\]
Since \(\eta(x)\) is arbitrary, the functional derivative must vanish everywhere, that is,
\[\frac{\delta F}{\delta y(x)}\equiv 0.\]
Consider the following example, in which the functional \(F[y]\) is defined by
\[F[y]=\int G(y(x),y'(x),x)\text{ d}x,\]
and assume that the value of \(y(x)\) is held fixed at the boundaries of the integration region, so that \(\eta(x)=0\) there.
Consider a variation of \(y(x)\):
\[\begin{aligned} F[y(x)+\epsilon\eta(x)]&= \int G\left(y(x)+\epsilon \eta(x),y'(x)+\epsilon\eta'(x),x\right)\text{ d} x\\&= \int\left\{G\left(y(x),y'(x),x\right)+\frac{\partial G}{\partial y}\epsilon\eta(x)+\frac{\partial G}{\partial y'}\epsilon\eta'(x)\right\}\text{ d}x+O(\epsilon^2)\\&= F[y(x)]+\epsilon\int\left\{\frac{\partial G}{\partial y}\eta(x)+\frac{\partial G}{\partial y'}\eta'(x)\right\}\text{ d}x+O(\epsilon^2) \end{aligned}\]
Note that, integrating by parts,
\[\begin{aligned} \int \frac{\partial G}{\partial y'}\eta'(x)\text{ d}x&=\int \frac{\partial G}{\partial y'}\text{ d}\eta(x)\\&= \eta(x)\frac{\partial G}{\partial y'}\Big\vert-\int\eta(x)\frac{\text{d}}{\text{d}x}\left(\frac{\partial G}{\partial y'}\right)\text{ d}x\\&= -\int\eta(x)\frac{\text{d}}{\text{d}x}\left(\frac{\partial G}{\partial y'}\right)\text{ d}x \end{aligned}\]
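where the boundary term vanishes because \(\eta(x)=0\) at the boundaries. Substituting this into the expansion of \(F[y(x)+\epsilon\eta(x)]\) and rearranging gives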
\[\begin{aligned} F[y(x)+\epsilon\eta(x)]&=F[y(x)]+\epsilon\int\left\{\frac{\partial G}{\partial y}-\frac{\text{d}}{\text{d}x}\left(\frac{\partial G}{\partial y'}\right)\right\}\eta(x)\text{ d}x+O(\epsilon^2), \end{aligned}\]
Setting the functional derivative to zero gives the Euler-Lagrange equation
\[\frac{\partial G}{\partial y}-\frac{\text{d}}{\text{d}x}\left(\frac{\partial G}{\partial y'}\right)=0,\]
The remaining work is to solve this differential equation for \(y(x)\).
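As an illustration, take the arc-length integrand \(G=\sqrt{1+y'(x)^2}\). Here \(\partial G/\partial y=0\) and \(\partial G/\partial y'=y'/\sqrt{1+y'^2}\), so the condition above reduces to
\[\frac{\text{d}}{\text{d}x}\left(\frac{y'}{\sqrt{1+y'^2}}\right)=0,\]
which forces \(y'\) to be constant, i.e. a straight line. The sketch below checks this numerically under stated assumptions: endpoints \((0,0)\) and \((1,1)\), a hypothetical perturbation \(\eta(x)=x(1-x)^2\) that vanishes at both boundaries, and helper names `arc_length` and `perturbed_slope` introduced only for this example.

```python
import math

def arc_length(y_prime, a=0.0, b=1.0, n=2000):
    """Midpoint-rule approximation of F[y] = integral of sqrt(1 + y'(x)^2)."""
    h = (b - a) / n
    return h * sum(math.sqrt(1.0 + y_prime(a + (i + 0.5) * h) ** 2)
                   for i in range(n))

def perturbed_slope(eps):
    """Slope of y(x) = x + eps * eta(x), with the assumed eta(x) = x (1 - x)^2.

    eta vanishes at x = 0 and x = 1, and eta'(x) = (1 - x)(1 - 3x).
    """
    return lambda x: 1.0 + eps * (1.0 - x) * (1.0 - 3.0 * x)

# Length of the straight line from (0, 0) to (1, 1); should be sqrt(2).
straight = arc_length(perturbed_slope(0.0))

# Lengths of a few perturbed paths with the same endpoints.
lengths = {eps: arc_length(perturbed_slope(eps)) for eps in (-0.2, -0.1, 0.1, 0.2)}

# Central difference in eps approximates the first variation at the straight line.
d = 1e-4
first_variation = (arc_length(perturbed_slope(d))
                   - arc_length(perturbed_slope(-d))) / (2.0 * d)

print(straight, lengths, first_variation)
```

Under these assumptions, every perturbed path should come out longer than the straight line, and the first variation should be zero up to discretization error.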