
Tikhonov regularization (吉洪諾夫正則化)


This topic is important, but I do not understand it yet.

Question 1: Why do we need regularization?

In mathematics, statistics, and computer science, particularly in the fields of machine learning and inverse problems, regularization is a process of introducing additional information in order to solve an ill-posed problem or to prevent overfitting.

And what is an ill-posed problem? In Hadamard's sense, a problem is well-posed if a solution exists, is unique, and depends continuously on the data; if any of these conditions fails, the problem is ill-posed.

And, what is overfitting? In statistics, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably", as the next figure shows.


Figure 1. The green curve represents an overfitted model and the black line represents a regularized model. While the green line best follows the training data, it is too dependent on that data and it is likely to have a higher error rate on new unseen data, compared to the black line.
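To make Figure 1 concrete, here is a minimal sketch, assuming NumPy is available (the data, polynomial degree, and penalty strength are made up for illustration): an unregularized degree-9 polynomial fit chases the noisy training points, while the same fit with a small L2 penalty stays closer to the underlying curve on test points.

```python
# Overfitting vs. L2-regularized fitting on noisy 1-D data (illustrative).
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + 0.2 * rng.standard_normal(10)

degree = 9
X = np.vander(x_train, degree + 1)          # polynomial design matrix

# Unregularized least squares: interpolates the noise.
w_ols = np.linalg.lstsq(X, y_train, rcond=None)[0]

# Ridge (Tikhonov) solution: (X^T X + alpha * I)^{-1} X^T y
alpha = 1e-3
w_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(degree + 1), X.T @ y_train)

x_test = np.linspace(0, 1, 100)
X_test = np.vander(x_test, degree + 1)
y_true = np.sin(2 * np.pi * x_test)

print("OLS   test RMSE:", np.sqrt(np.mean((X_test @ w_ols - y_true) ** 2)))
print("Ridge test RMSE:", np.sqrt(np.mean((X_test @ w_ridge - y_true) ** 2)))
```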

Question 2: What are the commonly used regularization methods? Common choices include the L1 (lasso) and L2 (ridge) penalties; this post focuses on the latter.

Question 3: What are the advantages of Tikhonov regularization?

Question 4: What is Tikhonov regularization?

Tikhonov regularization, named for Andrey Tikhonov, is the most commonly used method of regularization of ill-posed problems. In statistics, the method is known as ridge regression, in machine learning it is known as weight decay, and with multiple independent discoveries, it is also variously known as the Tikhonov–Miller method, the Phillips–Twomey method, the constrained linear inversion method, and the method of linear regularization. It is related to the Levenberg–Marquardt algorithm for non-linear least-squares problems.

Suppose that for a known matrix A and vector b, we wish to find a vector x such that:

$Ax = b$

The standard approach is ordinary least squares linear regression. However, if no x satisfies the equation, or if more than one x does (the solution is not unique), the problem is said to be ill-posed. In such cases, ordinary least squares estimation leads to an overdetermined (under-fitted) or, more often, an underdetermined (over-fitted) system of equations. Most real-world phenomena have the effect of low-pass filters in the forward direction where A maps x to b. Therefore, in solving the inverse problem, the inverse mapping operates as a high-pass filter that has the undesirable tendency of amplifying noise (eigenvalues/singular values are largest in the reverse mapping where they were smallest in the forward mapping). In addition, ordinary least squares implicitly nullifies every element of the reconstructed version of x that lies in the null space of A, rather than allowing a model to be used as a prior for $x$. Ordinary least squares seeks to minimize the sum of squared residuals, which can be compactly written as:

$\min_x \|Ax - b\|_2^2$, where $\|\cdot\|_2$ is the Euclidean norm.

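The noise-amplification point above is easy to reproduce. A minimal sketch, assuming NumPy, using a synthetic ill-conditioned A whose singular values decay over six orders of magnitude (all values here are made up for illustration):

```python
# Naive inversion of an ill-conditioned forward map amplifies data noise:
# error components scale like noise / singular value.
import numpy as np

rng = np.random.default_rng(1)
n = 50
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -6, n)                   # singular values decaying to 1e-6
A = U @ np.diag(s) @ V.T                    # ill-conditioned forward operator

x_true = rng.standard_normal(n)
b = A @ x_true + 1e-6 * rng.standard_normal(n)   # slightly noisy data

x_naive = np.linalg.solve(A, b)             # inverse map acts as a high-pass filter
print("relative error of naive inverse:",
      np.linalg.norm(x_naive - x_true) / np.linalg.norm(x_true))
```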

In order to give preference to a particular solution with desirable properties, a regularization term can be included in this minimization:

$\min_x \|Ax - b\|_2^2 + \|\Gamma x\|_2^2$ for some suitably chosen Tikhonov matrix $\Gamma$. In many cases, this matrix is chosen as a multiple of the identity matrix ($\Gamma = \alpha I$), giving preference to solutions with smaller norms; this is known as L2 regularization.[1] In other cases, high-pass operators (e.g., a difference operator or a weighted Fourier operator) may be used to enforce smoothness if the underlying vector is believed to be mostly continuous. This regularization improves the conditioning of the problem, thus enabling a direct numerical solution. An explicit solution, denoted by $\hat{x}$, is given by:

$\hat{x} = (A^T A + \Gamma^T \Gamma)^{-1} A^T b$
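This closed form follows from setting the gradient of the regularized objective to zero; a brief derivation:

```latex
\begin{aligned}
J(x) &= \|Ax - b\|_2^2 + \|\Gamma x\|_2^2 \\
\nabla J(x) &= 2A^T(Ax - b) + 2\Gamma^T\Gamma x = 0 \\
(A^T A + \Gamma^T \Gamma)\,\hat{x} &= A^T b \\
\hat{x} &= (A^T A + \Gamma^T \Gamma)^{-1} A^T b
\end{aligned}
```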

The effect of regularization may be varied via the scale of matrix $\Gamma$. For $\Gamma = 0$ this reduces to the unregularized least-squares solution, provided that $(A^T A)^{-1}$ exists.
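The formula translates directly into code. A minimal sketch, assuming NumPy, with $\Gamma = \alpha I$; as stated above, alpha = 0 reproduces ordinary least squares when $A^T A$ is invertible:

```python
# Explicit Tikhonov solution x_hat = (A^T A + Gamma^T Gamma)^{-1} A^T b
# for the special case Gamma = alpha * I.
import numpy as np

def tikhonov(A, b, alpha):
    """Solve min_x ||Ax - b||^2 + ||alpha * x||^2."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + (alpha ** 2) * np.eye(n), A.T @ b)

rng = np.random.default_rng(2)
A = rng.standard_normal((100, 5))           # tall, well-conditioned A
b = rng.standard_normal(100)

x_ols = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(tikhonov(A, b, 0.0), x_ols))   # True: alpha = 0 is OLS
```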

L2 regularization is used in many contexts aside from linear regression, such as classification with logistic regression or support vector machines,[2] and matrix factorization.[3]
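As an illustration of the weight-decay view mentioned above, here is a minimal logistic-regression sketch in plain NumPy (the synthetic data, learning rate, and penalty strength lam are made up for illustration); the L2 penalty contributes the term 2·lam·w to the gradient, shrinking the weights at every update step:

```python
# Logistic regression trained by gradient descent with an L2 penalty
# ("weight decay"): minimize cross-entropy + lam * ||w||^2.
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # linearly separable labels

w, lam, lr = np.zeros(2), 0.1, 0.5
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))        # predicted probabilities
    grad = X.T @ (p - y) / len(y) + 2 * lam * w   # loss gradient + L2 term
    w -= lr * grad                          # each step "decays" w toward zero

print("learned weights:", w)
```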

In summary: for $y = Xw$, if the system has no solution or has more than one solution, the problem is said to be ill-posed. In the ill-posed case, solving by ordinary least squares leads to over- or under-fitting; regularization is used to address this.

Let X be an m-by-n matrix:

  • Overfitted model: m << n, an underdetermined system, where multiple solutions are likely;
  • Underfitted model: m >> n, an overdetermined system, which may have no solution, or a solution of low accuracy (both regimes are illustrated in the sketch below).
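Both regimes can be simulated directly. A minimal sketch, assuming NumPy; the matrices are random and the alpha value is arbitrary. Ridge regularization yields a unique, stable solution in either case:

```python
# Underdetermined (m << n) vs. overdetermined (m >> n) systems under ridge.
import numpy as np

rng = np.random.default_rng(4)

def ridge(X, y, alpha=1e-2):
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n), X.T @ y)

# Underdetermined: infinitely many exact solutions; ridge picks a small-norm one.
X_under = rng.standard_normal((5, 50))
y_under = rng.standard_normal(5)
w = ridge(X_under, y_under)
print("m << n residual:", np.linalg.norm(X_under @ w - y_under))  # small

# Overdetermined: generally no exact solution exists.
X_over = rng.standard_normal((50, 5))
y_over = rng.standard_normal(50)
w = ridge(X_over, y_over)
print("m >> n residual:", np.linalg.norm(X_over @ w - y_over))    # clearly > 0
```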


REF:

https://blog.csdn.net/darknightt/article/details/70179848
