[Linear Models] Log-Odds Regression (Logistic Regression)
阿新 • Published: 2020-10-14
Derivation
Logistic regression is used for binary classification. Its mathematical basis is the log-odds function, a kind of sigmoid function:
\[y = \frac{1}{1+e^{-z}} \tag 1 \]Its graph looks as follows:
Set $z = \boldsymbol{w}^T\boldsymbol{x}+b$. Rearranging equation $(1)$ (since $1-y=\frac{e^{-z}}{1+e^{-z}}$, the odds are $\frac{y}{1-y}=e^{z}$) gives $$ \ln\frac{y}{1-y}= \boldsymbol{w}^T\boldsymbol{x}+b \tag 2\\ $$ This can be read as using the linear-regression prediction to approximate the log odds of the true label: when $y>0.5$ the left-hand side is greater than $0$, and when $y<0.5$ it is less than $0$. Viewing $y$ as the posterior probability $p(y=1|\boldsymbol x)$, equation $(2)$ can be rewritten as \[p(y=1|\boldsymbol x)=\frac{e^{\boldsymbol{w}^T\boldsymbol{x}+b}}{1+e^{\boldsymbol{w}^T\boldsymbol{x}+b}} \tag 3\\ \]\[p(y=0|\boldsymbol x)=\frac{1}{1+e^{\boldsymbol{w}^T\boldsymbol{x}+b}} \tag 4\\ \]Following the solution procedure of the linear-regression model, we now derive the solution for \(\boldsymbol w\) and $b$.
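Equation $(2)$ is simply the inverse of the sigmoid in $(1)$. A quick numerical check (Python is used here purely for illustration; the helper names are mine):

```python
import math

def sigmoid(z):
    # equation (1): y = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + math.exp(-z))

def logit(y):
    # left-hand side of equation (2): the log odds ln(y / (1 - y))
    return math.log(y / (1.0 - y))

# the log odds of the sigmoid output recovers z = w^T x + b
for z in (-3.0, -0.5, 0.0, 1.2, 4.0):
    assert abs(logit(sigmoid(z)) - z) < 1e-9
```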
Let \(\hat{\boldsymbol w} = (\boldsymbol w;b)\) and \(\hat{\boldsymbol x_i}=(\boldsymbol x_i;1)\). Maximum-likelihood estimation of \(\hat{\boldsymbol w}\) maximizes the function
\[L(\hat{\boldsymbol w}) = \sum\limits_{i=1}^m\ln p(y_i|\hat{\boldsymbol x_i},\hat{\boldsymbol w}) \tag 5\\ \]Let \(p_1(\hat{\boldsymbol x_i},\hat{\boldsymbol w}) = p(y=1|\hat{\boldsymbol x_i},\hat{\boldsymbol w})\) and \(p_0(\hat{\boldsymbol x_i},\hat{\boldsymbol w}) = 1-p_1(\hat{\boldsymbol x_i},\hat{\boldsymbol w})\). Since \(y_i \in \{0,1\}\), each likelihood term can be written as \[p(y_i|\hat{\boldsymbol x_i},\hat{\boldsymbol w}) = y_ip_1(\hat{\boldsymbol x_i},\hat{\boldsymbol w})+(1-y_i)p_0(\hat{\boldsymbol x_i},\hat{\boldsymbol w}) \tag 6\\ \]
Substituting equations \((3)\) and \((4)\) into \((6)\), maximizing \((5)\) is equivalent to minimizing
\[L(\hat{\boldsymbol w}) = \sum\limits_{i=1}^m (-y_i\hat{\boldsymbol w}^T\hat{\boldsymbol x_i}+\ln(1+e^{\hat{\boldsymbol w}^T\hat{\boldsymbol x_i}}))\tag 7\\ \]Newton's method yields the iterative update
\[\begin{align} \hat{\boldsymbol w} &\leftarrow \hat{\boldsymbol w}-\left(\frac{\partial^2L(\hat{\boldsymbol w})}{\partial \hat{\boldsymbol w} \partial\hat{\boldsymbol w}^T} \right )^{-1}\frac{\partial L(\hat{\boldsymbol w})}{\partial\hat{\boldsymbol w}} \tag 8\\ \frac{\partial L(\hat{\boldsymbol w})}{\partial\hat{\boldsymbol w}} &=-\sum\limits_{i=1}^m \hat{\boldsymbol x_i}(y_i-p_1(\hat{\boldsymbol x_i},\hat{\boldsymbol w})) \tag 9\\ \frac{\partial^2L(\hat{\boldsymbol w})}{\partial \hat{\boldsymbol w} \partial\hat{\boldsymbol w}^T} &=\sum\limits_{i=1}^m\hat{\boldsymbol x_i}\hat{\boldsymbol x_i}^Tp_1(\hat{\boldsymbol x_i},\hat{\boldsymbol w})(1-p_1(\hat{\boldsymbol x_i},\hat{\boldsymbol w})) \tag {10} \end{align} \]where equation \((9)\) can be vectorized as
\[\frac{\partial L(\hat{\boldsymbol w})}{\partial\hat{\boldsymbol w}} = \boldsymbol X^T(p_1({\boldsymbol X},\hat{\boldsymbol w})-\boldsymbol y) \tag{11} \]with the rows of \(\boldsymbol X\) being the augmented samples \(\hat{\boldsymbol x_i}^T\).
MATLAB Implementation
% Generate random training samples: points below the line y = 0.7x + 200
% are positive examples, points above it are negative examples
% Plot the distribution of the training samples
x = zeros(100, 2);
y = zeros(100, 1);
kb = [0.7, 200];
figure;
hold on;
for i = 1:100
    x(i,1) = randi(1000, 1);
    x(i,2) = randi(1000, 1);
    if kb(1)*x(i,1)+kb(2) > x(i,2)
        plot(x(i,1), x(i,2), 'r*');   % positive example (below the line)
        y(i) = 1;
    else
        plot(x(i,1), x(i,2), 'b*');   % negative example (above the line)
        y(i) = 0;
    end
end
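For readers without MATLAB, the labeling rule can be reproduced in Python as a sanity check (a sketch only; NumPy and the fixed seed are my additions, not part of the original listing):

```python
import numpy as np

rng = np.random.default_rng(0)

# 100 random integer points in [1, 1000]^2, mirroring randi(1000)
x = rng.integers(1, 1001, size=(100, 2)).astype(float)

# label 1 exactly when 0.7*x1 + 200 > x2, i.e. the point lies below the line
y = (0.7 * x[:, 0] + 200 > x[:, 1]).astype(int)
```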
% Solve for the weights by Newton's method
function w = cal(X, y, eps)
    [m, n] = size(X);
    X = [X ones(m,1)];               % append bias column: x_hat_i = (x_i; 1)
    n = n + 1;
    w = zeros(n, 1);
    prew = zeros(n, 1);
    while (true)
        flag = 0;
        % gradient, eq. (11): X'*(p1 - y), with p1 = 1 - 1./(1+exp(X*prew))
        sum1 = X'*(1-1./(1+exp(X*prew))-y);
        % Hessian, eq. (10): sum over i of x_hat_i*x_hat_i'*p1*(1-p1);
        % the outer product X(i,:)'*X(i,:) must be n-by-n, not a scalar
        sum2 = zeros(n, n);
        for i = 1:m
            p1 = 1-1./(1+exp(X(i,:)*prew));
            sum2 = sum2 + X(i,:)'*X(i,:)*p1*(1-p1);
        end
        % Newton update, eq. (8): solve the linear system instead of inverting
        w = prew - sum2\sum1;
        for i = 1:n
            if abs(w(i)-prew(i)) > eps
                flag = 1;
            end
        end
        if flag == 0
            break;
        end
        prew = w;
    end
end
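The routine above can be cross-checked against a compact Python/NumPy sketch of the same Newton iteration (the function name and the `max_iter` safety cap are my additions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def newton_logreg(X, y, eps=1e-6, max_iter=100):
    """Newton's method for logistic regression, following eqs. (8)-(11).

    X is an (m, n) data matrix; a bias column is appended, as in cal().
    max_iter is an added safety cap (the MATLAB loop runs until convergence).
    """
    m, n = X.shape
    Xh = np.hstack([X, np.ones((m, 1))])        # augmented rows x_hat_i = (x_i; 1)
    w = np.zeros(n + 1)
    for _ in range(max_iter):
        p1 = sigmoid(Xh @ w)                    # p_1(x_hat_i; w_hat)
        grad = Xh.T @ (p1 - y)                  # gradient, eq. (11)
        H = (Xh * (p1 * (1 - p1))[:, None]).T @ Xh   # Hessian, eq. (10)
        w_new = w - np.linalg.solve(H, grad)    # update, eq. (8)
        if np.max(np.abs(w_new - w)) < eps:
            return w_new
        w = w_new
    return w
```

One caveat: on a linearly separable sample, such as the randomly generated one above, the unregularized likelihood has no finite optimum, so the iteration can grow the weights without bound or hit a near-singular Hessian; the convergence threshold and iteration cap are what keep the loop bounded. The unscaled features in [1, 1000] can also overflow `exp`, so scaling the inputs first is advisable.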
% Test the model and plot the classification result
eps = 0.0001;
w = cal(x, y, eps);
figure;
hold on;
for i = 1:100
    % decision value w'*x_hat_i: positive values are classified as positive examples
    yy = w(1)*x(i,1) + w(2)*x(i,2) + w(3);
    if yy > 0
        plot(x(i,1), x(i,2), 'r*');
    else
        plot(x(i,1), x(i,2), 'b*');
    end
end
True distribution of the training set (red: positive examples, blue: negative examples):
Classification result of the model: