An Explanation of the ROC Curve (Very Intuitive)

Key Concepts

[Image]

Scenario

The linear combination of AdaBoost's base classifiers is

$$f(x)=\sum_{m=1}^{M}\alpha_m G_m(x)$$

The final classifier is

$$G(x)=\operatorname{sign}(f(x))=\operatorname{sign}\left(\sum_{m=1}^{M}\alpha_m G_m(x)\right)$$
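As a minimal sketch of these two formulas, the snippet below builds $f(x)$ and $G(x)$ from a few decision stumps. The `stump` helper, the thresholds, and the weights in `alphas` are made-up values for illustration only; they are not taken from a trained AdaBoost model.

```python
import numpy as np

def stump(feature, threshold, polarity):
    """Hypothetical base classifier G_m: a decision stump returning +1 or -1."""
    def predict(X):
        return polarity * np.where(X[:, feature] > threshold, 1.0, -1.0)
    return predict

# Illustrative (made-up) base classifiers G_m and their weights alpha_m.
classifiers = [stump(0, 0.5, 1.0), stump(1, 1.5, 1.0), stump(0, 2.0, -1.0)]
alphas = np.array([0.7, 0.4, 0.2])

def f(X):
    """Weighted sum f(x) = sum_m alpha_m * G_m(x), one value per row of X."""
    return sum(a * g(X) for a, g in zip(alphas, classifiers))

def G(X):
    """Final classifier G(x) = sign(f(x))."""
    return np.sign(f(X))

X = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, 0.0]])
scores = f(X)   # real-valued outputs, the quantity later used for the ROC curve
labels = G(X)   # hard +1 / -1 decisions
```

The ROC curve is built from the real-valued `scores`, not from the hard decisions, because it needs a quantity whose threshold can be varied.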

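Here we are given $\{f(x_i)\mid i=1,2,\dots,N\}$ and $\{\mathrm{label}_i\mid i=1,2,\dots,N\}$. The former is the weighted combination of the base classifiers' outputs for each sample $x_i$; the latter is the corresponding label data.

Next, we plot the ROC curve from these two sets of data.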
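Each point on the ROC curve corresponds to one threshold applied to the scores. The sketch below computes the true positive rate, TP / (TP + FN), and the false positive rate, FP / (FP + TN), for a single threshold. The function name `tpr_fpr` and the five-sample data set are assumptions made for illustration, with positives labelled `1.0` as in the plotting code.

```python
import numpy as np

def tpr_fpr(scores, labels, threshold):
    """Return (TPR, FPR) when samples with score > threshold are predicted positive."""
    predicted_pos = scores > threshold
    actual_pos = labels == 1.0
    tp = np.sum(predicted_pos & actual_pos)    # positives correctly predicted positive
    fp = np.sum(predicted_pos & ~actual_pos)   # negatives wrongly predicted positive
    return tp / np.sum(actual_pos), fp / np.sum(~actual_pos)

# Made-up scores f(x_i) and labels, for illustration only.
scores = np.array([-1.2, -0.4, 0.3, 0.8, 1.5])
labels = np.array([-1.0, 1.0, -1.0, 1.0, 1.0])

tpr, fpr = tpr_fpr(scores, labels, threshold=0.0)
```

Lowering the threshold below every score gives the point (1, 1); raising it to the largest score gives (0, 0). Sweeping it between those extremes traces the whole curve.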

Plotting

[Image]

Plotting code:

<code class="language-python">import numpy as np
import matplotlib.pyplot as plt

# predStrengths and classLabels are both ndarray objects with 299 elements.
ySum = 0.0  # accumulator used to calculate the AUC
N = classLabels.shape[0]  # total number of samples
numPosClas = np.sum(classLabels == 1.0)  # number of positive samples
yStep = 1.0 / numPosClas  # the TPR (vertical axis) denominator is the number of positives
xStep = 1.0 / (N - numPosClas)  # the FPR (horizontal axis) denominator is the number of negatives
srtidxs = predStrengths.argsort()  # indices that sort the scores from smallest to largest

fig = plt.figure()
fig.clf()
ax = plt.subplot(111)

cur = (1.0, 1.0)  # upper-right corner: every sample is predicted positive, so TPR and FPR are both 1
for idx in srtidxs:
    # Use each score, from smallest to largest, as the threshold: samples above it
    # are predicted positive, samples at or below it are predicted negative.
    if classLabels[idx] == 1.0:  # a positive sample is now misclassified, so the TPR drops one step
        delX = 0
        delY = yStep
    else:  # a negative sample is now classified correctly, so the FPR drops one step
        delX = xStep
        delY = 0
        # Each time x (the FPR) moves, add the current y value to ySum;
        # this is the height of a rectangle of width xStep under the curve.
        ySum += cur[1]

    ax.plot([cur[0], cur[0] - delX], [cur[1], cur[1] - delY], c='b')
    cur = (cur[0] - delX, cur[1] - delY)  # update the position; the curve is drawn from upper right to lower left
ax.plot([0, 1], [0, 1], 'b--')  # diagonal reference line from (0,0) to (1,1)

auc = "%.4f" % (ySum * xStep)  # area under the curve
plt.xlabel('False positive rate', fontsize=15)
plt.ylabel('True positive rate', fontsize=15)
plt.title('ROC curve (AUC = ' + auc + ')', fontsize=15)

ax.axis([0, 1, 0, 1])
fig.savefig('roc.png', dpi=300, bbox_inches='tight')</code>
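To check the AUC bookkeeping without drawing anything, the same sweep can be wrapped in a function and compared against an independent calculation. The sketch below is one way to do that: `roc_points` mirrors the loop in the plotting code, while `pairwise_auc` uses the fact that, when no scores are tied, the AUC equals the fraction of (positive, negative) pairs in which the positive sample has the higher score. Both function names and the sample data are assumptions for illustration.

```python
import numpy as np

def roc_points(pred_strengths, class_labels):
    """Walk from (1, 1) towards (0, 0) as in the plotting loop; return the points and the AUC."""
    n_pos = np.sum(class_labels == 1.0)
    n_neg = class_labels.shape[0] - n_pos
    y_step, x_step = 1.0 / n_pos, 1.0 / n_neg
    cur = (1.0, 1.0)
    points = [cur]
    y_sum = 0.0
    for idx in pred_strengths.argsort():
        if class_labels[idx] == 1.0:
            cur = (cur[0], cur[1] - y_step)   # one positive lost: TPR drops
        else:
            y_sum += cur[1]                   # rectangle height for this x step
            cur = (cur[0] - x_step, cur[1])   # one negative rejected: FPR drops
        points.append(cur)
    return points, y_sum * x_step

def pairwise_auc(pred_strengths, class_labels):
    """Fraction of (positive, negative) pairs ranked correctly; assumes no tied scores."""
    pos = pred_strengths[class_labels == 1.0]
    neg = pred_strengths[class_labels != 1.0]
    return np.mean(pos[:, None] > neg[None, :])

# Made-up scores and labels, for illustration only.
scores = np.array([-1.2, -0.4, 0.3, 0.8, 1.5])
labels = np.array([-1.0, 1.0, -1.0, 1.0, 1.0])

points, auc = roc_points(scores, labels)
auc_check = pairwise_auc(scores, labels)
```

On this small example both calculations give 5/6. With tied scores the two can differ, because the step-by-step sweep treats tied samples one at a time in whatever order `argsort` returns them.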