CNTK: Logistic Regression

阿新 • Published: 2019-01-05

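Logistic regression is one of the first topics covered when learning machine learning. This tutorial is aimed at newcomers to both machine learning and the CNTK platform, and it uses the Python API. A BrainScript version of the example is also available.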

Introduction

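Problem: a cancer hospital has provided data and wants us to determine whether a patient has a fatal malignant tumor or a benign one. This kind of task is called a classification problem. To help classify each patient, we are given their age and tumor size. Intuitively, younger patients and/or patients with small tumors are less likely to have a malignant tumor. In the plot below, red denotes malignant and blue denotes benign. Note: this is a learning example; in real life, a diagnosis or treatment decision requires many more features from different tests and examinations, together with a doctor's expertise.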

from IPython.display import Image
Image(url="https://www.cntk.ai/jup/cancer_data_plot.jpg", width=400, height=400)

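Goal: we want to learn a classifier that can automatically label any patient as benign or malignant from two features (age and tumor size). In this tutorial we build a linear classifier.

The classification result looks like this: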

Image(url ="https://www.cntk.ai/jup/cancer_classify_plot.jpg" , width = 400, height = 400)

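In the plot above, the green line is the model learned from the data; it separates the blue points from the red points.

A learning algorithm typically has five stages: data reading, data preprocessing, model creation, learning the model parameters, and model evaluation (also called testing or prediction).

1. Data reading: we generate a simulated dataset in which each sample has two features (shown below) representing age and tumor size.

2. Data preprocessing: features such as size or age usually need to be scaled, typically to the range between 0 and 1.

3. Model creation: this tutorial introduces a basic linear model.

4. Learning the model: this is also called training. A linear model can be fitted in various ways; CNTK uses stochastic gradient descent.

5. Evaluation: this is also called testing or prediction.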
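The scaling mentioned in the preprocessing step can be sketched in NumPy as follows. This is only an illustration: the helper name `min_max_scale` and the sample values are hypothetical and are not part of the tutorial, which generates data that needs no extra scaling.

```python
import numpy as np

def min_max_scale(x):
    """Scale each column of x linearly into the range [0, 1]."""
    x = np.asarray(x, dtype=np.float64)
    col_min = x.min(axis=0)
    span = x.max(axis=0) - col_min
    # Guard against constant columns to avoid dividing by zero.
    span[span == 0] = 1.0
    return (x - col_min) / span

# Hypothetical raw features: age in years, tumor size in cm.
raw = np.array([[25.0, 1.0],
                [50.0, 3.0],
                [75.0, 5.0]])
scaled = min_max_scale(raw)
print(scaled)
```

Each column is scaled independently, so a feature with a large numeric range (age) no longer dominates one with a small range (tumor size).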

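Logistic regression

Logistic regression is a fundamental machine learning technique. It takes a linearly weighted combination of the features and produces the probability of each class. Here the classifier outputs a probability in the range [0, 1], which is compared against a threshold (usually 0.5) to produce a binary label, 0 or 1. This tutorial treats a two-class problem, but the method extends to multi-class problems.

As the figure above shows, the contributions of the different input features are linearly weighted. The resulting sum is mapped to the range [0, 1] by the sigmoid function; for more than two classes, the softmax function can be used.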
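To make the two mapping functions concrete, here is a small NumPy sketch of sigmoid, softmax, and the 0.5 threshold. It is written for illustration only and is not how CNTK implements them; the function names and inputs are chosen for this example.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Map a vector of scores to probabilities that sum to 1."""
    z = np.asarray(z, dtype=np.float64)
    # Subtracting the maximum does not change the result but avoids overflow.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Binary case: compare the probability against the 0.5 threshold.
label = int(sigmoid(2.0) >= 0.5)

# Multi-class case: equal scores give equal probabilities.
probs = softmax([1.0, 1.0, 1.0])

print(label, probs)
```

For two classes the two functions agree: the second entry of `softmax([0, z])` equals `sigmoid(z)`, which is why a two-output softmax model can stand in for classic logistic regression.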

Check that CNTK is installed, and which version

from __future__ import print_function
import numpy as np
import sys
import os

import cntk as C

if 'TEST_DEVICE' in os.environ:
    if os.environ['TEST_DEVICE'] == 'cpu':
        C.device.try_set_default_device(C.device.cpu())
    else:
        C.device.try_set_default_device(C.device.gpu(0))
if not C.__version__ == "2.0":
    raise Exception("this notebook was designed to work with 2.0. Current Version: " + C.__version__)

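Data generation

We use the NumPy library to generate some simulated cancer data. There are two input features and two output classes. In this example every training sample carries one label, benign or malignant, so this is a binary classification problem.

Define the network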

input_dim  =  2 
num_output_classes  =  2

Features and labels

This tutorial generates the data with NumPy.

from __future__ import print_function
import numpy as np
import sys
import os

import cntk as C
# Plot the data 
import matplotlib.pyplot as plt
# Define the network
input_dim = 2
num_output_classes = 2

# Ensure that we always get the same results
np.random.seed(0)

# Helper function to generate a random data sample
def generate_random_data_sample(sample_size, feature_dim, num_classes):
    # Create synthetic data using NumPy. 
    Y = np.random.randint(size=(sample_size, 1), low=0, high=num_classes)

    # Make sure that the data is separable 
    X = (np.random.randn(sample_size, feature_dim)+3) * (Y+1)
    
    # Specify the data type to match the input variable used later in the tutorial 
    # (default type is double)
    X = X.astype(np.float32)    
    
    # convert class 0 into the vector "1 0 0", 
    # class 1 into the vector "0 1 0", ...
    class_ind = [Y==class_number for class_number in range(num_classes)]
    Y = np.asarray(np.hstack(class_ind), dtype=np.float32)
    return X, Y  
# Generate a small sample of synthetic data to visualize
mysamplesize = 32
features, labels = generate_random_data_sample(mysamplesize, input_dim, num_output_classes)

# let 0 represent malignant/red and 1 represent benign/blue 
colors = ['r' if label == 0 else 'b' for label in labels[:,0]]

plt.scatter(features[:,0], features[:,1], c=colors)
plt.xlabel("Age (scaled)")
plt.ylabel("Tumor size (in cm)")
plt.show()

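To make every run produce the same result, the random number generator is seeded, which guarantees the same random numbers each time. The data is then generated with NumPy and visualized with matplotlib.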
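The effect of the seed can be checked directly. The short sketch below, using only NumPy, resets the seed and draws the same numbers twice; the variable names are chosen for this example.

```python
import numpy as np

# Draw three numbers after seeding the global generator.
np.random.seed(0)
first = np.random.randn(3)

# Reset the seed and draw again: the sequence repeats exactly.
np.random.seed(0)
second = np.random.randn(3)

# Without resetting, the generator simply continues with new numbers.
third = np.random.randn(3)

print(np.array_equal(first, second))
print(np.array_equal(first, third))
```

This is why `np.random.seed(0)` appears once before the helper function: every run of the script then sees the same synthetic patients.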

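Model creation

Its mathematical form is:

$$z = \sum_{i=1}^{n} w_i \times x_i + b = \mathbf{w} \cdot \mathbf{x} + b$$

Here w is the weight vector of length n and b is the bias. A sigmoid or softmax function then maps the sum to the range 0 to 1.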
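As a worked instance of the formula, the NumPy sketch below computes z both as an explicit sum and as a dot product, and then for a small batch with one weight column per output class. All numbers are made up for illustration.

```python
import numpy as np

# Single sample with n = 2 features (illustrative values).
w = np.array([0.5, -1.0])
x = np.array([2.0, 3.0])
b = 0.25

# The sum form and the dot-product form of the same expression.
z_sum = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
z_dot = np.dot(w, x) + b

# Batch form: rows of X are samples, columns of W are output classes.
X = np.array([[2.0, 3.0],
              [1.0, 1.0]])
W = np.array([[0.5, 1.0],
              [-1.0, 0.0]])
B = np.array([0.25, 0.0])
Z = X @ W + B

print(z_sum, z_dot)
print(Z)
```

The batch form has the same shape convention as the weight parameter of shape (input_dim, output_dim) used later in the tutorial.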

Defining the input

feature = C.input_variable(input_dim, np.float32)

If the input were a 10×5 pixel image, the call would instead be written C.input_variable(10*5, np.float32).
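An input of size 10*5 expects each image as one flat vector of 50 values. The NumPy sketch below shows one way to prepare such data; the image contents are placeholders, and this preparation step is an assumption, not something the tutorial itself performs.

```python
import numpy as np

# A placeholder 10x5 "image" with float32 pixels.
image = np.arange(50, dtype=np.float32).reshape(10, 5)

# Flatten it to the 50-element vector a 10*5 input expects.
flat = image.reshape(-1)

# A minibatch of two images becomes a (2, 50) array.
batch = np.stack([image, image]).reshape(2, -1)

print(flat.shape, batch.shape)
```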

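Network setup

The linear_layer function is a straightforward implementation of the formula above. It performs two operations:

1. Multiply the weights W by the features x using the times operation.

2. Add the bias b.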

feature = C.input_variable(input_dim, np.float32)
# Define a dictionary to store the model parameters
mydict = {}

def linear_layer(input_var, output_dim):
    
    input_dim = input_var.shape[0]
    weight_param = C.parameter(shape=(input_dim, output_dim))
    bias_param = C.parameter(shape=(output_dim))
    
    mydict['w'], mydict['b'] = weight_param, bias_param

    return C.times(input_var, weight_param) + bias_param
output_dim = num_output_classes
z = linear_layer(feature, output_dim)

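z denotes the output of the network.

Learning the model parameters

The network is now defined, but the parameters W and b still have to be learned. To do so, we first map the output z to probabilities in the range 0 to 1 with the softmax function, an activation function that normalizes its inputs so that they sum to 1.

Training

The softmax function outputs a probability for each class. To train the classifier we need to define a loss function that measures the error between the output and the true labels; training then minimizes it. Here we use the cross-entropy:

$$H(p) = -\sum_{j=1}^{|y|} y_j \log(p_j)$$

where p is the predicted probability computed by softmax and y is the true label.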
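The loss formula can be evaluated by hand for a single sample. The sketch below is a plain NumPy illustration of softmax followed by cross-entropy, not CNTK's implementation; the scores and label are made-up values.

```python
import numpy as np

def softmax(z):
    """Convert scores to probabilities that sum to 1."""
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(p, y):
    """H(p) = -sum_j y_j * log(p_j) for a one-hot label y."""
    return -np.sum(y * np.log(p), axis=-1)

# Network output for one sample, and its one-hot label (class 0 is correct).
z = np.array([2.0, 0.0])
y = np.array([1.0, 0.0])

loss = cross_entropy(softmax(z), y)

# An undecided model (equal scores) gives log(2) for two classes.
undecided_loss = cross_entropy(softmax(np.array([0.0, 0.0])), y)

print(loss, undecided_loss)
```

Because y is one-hot, the sum keeps only the log-probability of the true class, so the loss is small when that probability is close to 1.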

label = C.input_variable(num_output_classes, np.float32)
loss = C.cross_entropy_with_softmax(z, label)

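Evaluation

To evaluate the classification, we can compute classification_error, which is 0 for a sample the model classifies correctly and 1 otherwise. The trainer reports its average over each minibatch.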
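The same measure can be written in a few lines of NumPy. The function below is a hypothetical stand-in that mimics the idea on arrays; the scores and labels are made-up values.

```python
import numpy as np

def classification_error(scores, one_hot_labels):
    """Fraction of samples whose highest-scoring class is not the true class."""
    predicted = np.argmax(scores, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return np.mean(predicted != actual)

# Four samples, two classes; the third sample is misclassified.
scores = np.array([[2.0, 1.0],
                   [0.1, 0.9],
                   [1.5, 0.5],
                   [0.2, 0.3]])
labels = np.array([[1, 0],
                   [0, 1],
                   [0, 1],
                   [0, 1]], dtype=np.float32)

err = classification_error(scores, labels)
print(err)
```

Applying softmax first would not change the result, because softmax preserves the ordering of the scores.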

eval_error = C.classification_error(z, label)

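Training

Training tries to minimize the loss. Here we use stochastic gradient descent (SGD). Typically one starts from a random initialization of the model parameters, computes the error between the predictions and the true labels, and applies gradient descent to produce a new set of model parameters.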
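The update loop described above can be sketched without CNTK. The NumPy code below trains a two-class softmax model with minibatch SGD on data generated in the same style as the tutorial's helper. The zero initialization, the learning rate of 0.02, and the number of steps are assumptions chosen for this sketch, not values taken from CNTK.

```python
import numpy as np

rng = np.random.RandomState(0)

def make_batch(n):
    """Synthetic two-class data in the style of the tutorial's helper."""
    y = rng.randint(0, 2, size=n)
    x = (rng.randn(n, 2) + 3.0) * (y[:, None] + 1)
    return x, np.eye(2)[y]  # features, one-hot labels

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

W = np.zeros((2, 2))      # assumed zero initialization
b = np.zeros(2)
learning_rate = 0.02      # assumed value for this sketch

losses = []
for step in range(2000):
    x, y = make_batch(25)
    p = softmax(x @ W + b)
    # Average cross-entropy over the minibatch.
    losses.append(-np.mean(np.sum(y * np.log(p + 1e-12), axis=1)))
    # Gradient of that loss with respect to the scores is (p - y) / batch size.
    grad_z = (p - y) / len(x)
    W -= learning_rate * (x.T @ grad_z)
    b -= learning_rate * grad_z.sum(axis=0)

print(losses[0], np.mean(losses[-100:]))
```

Each iteration sees only 25 samples, so individual losses are noisy; averaging over many minibatches shows the downward trend.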

# Define a utility function to compute the moving average.
# A more efficient implementation is possible with np.cumsum() function
def moving_average(a, w=10):
    if len(a) < w: 
        return a[:]    
    return [val if idx < w else sum(a[(idx-w):idx])/w for idx, val in enumerate(a)]


# Define a utility that prints the training progress
def print_training_progress(trainer, mb, frequency, verbose=1):
    training_loss, eval_error = "NA", "NA"

    if mb % frequency == 0:
        training_loss = trainer.previous_minibatch_loss_average
        eval_error = trainer.previous_minibatch_evaluation_average
        if verbose: 
            print ("Minibatch: {0}, Loss: {1:.4f}, Error: {2:.2f}".format(mb, training_loss, eval_error))
        
    return mb, training_loss, eval_error
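The comment on moving_average above notes that np.cumsum() allows a more efficient implementation. One possible version is sketched below; it is intended to return the same values as the list-based function, which is reproduced here under the hypothetical name `moving_average_reference` for comparison.

```python
import numpy as np

def moving_average_reference(a, w=10):
    """List-based version, as in the tutorial."""
    if len(a) < w:
        return a[:]
    return [val if idx < w else sum(a[(idx - w):idx]) / w
            for idx, val in enumerate(a)]

def moving_average_cumsum(a, w=10):
    """Same values, computed with one cumulative sum instead of repeated sums."""
    a = np.asarray(a, dtype=np.float64)
    if len(a) < w:
        return a.copy()
    # csum[k] holds the sum of the first k elements of a.
    csum = np.concatenate(([0.0], np.cumsum(a)))
    out = a.copy()
    idx = np.arange(w, len(a))
    # Mean of a[idx-w:idx], i.e. the w values before position idx.
    out[idx] = (csum[idx] - csum[idx - w]) / w
    return out

values = [float(v) for v in range(15)]
smoothed = moving_average_cumsum(values)
print(smoothed)
```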

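Running the training

With the steps above, the logistic regression model is set up. Typically a large share of the observed data, for example 70%, is used for training, and the rest is held out to evaluate the model.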
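A 70/30 split of a fixed dataset could look like the NumPy sketch below. The tutorial itself does not split anything, because it generates fresh samples on demand; the dataset and variable names here are placeholders for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)

# Placeholder dataset: 100 samples with 2 features and a binary label.
num_samples = 100
all_features = rng.randn(num_samples, 2).astype(np.float32)
all_labels = rng.randint(0, 2, size=num_samples)

# Shuffle the indices, then cut at 70%.
indices = rng.permutation(num_samples)
split = num_samples * 7 // 10
train_idx, test_idx = indices[:split], indices[split:]

train_x, train_y = all_features[train_idx], all_labels[train_idx]
test_x, test_y = all_features[test_idx], all_labels[test_idx]

print(train_x.shape, test_x.shape)
```

Shuffling before the cut matters when the stored data is ordered, for example by class or by date.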

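# Instantiate the trainer object that drives the model training.
# Assumed setup: plain SGD; the learning rate of 0.5 is an illustrative choice.
learning_rate = 0.5
lr_schedule = C.learning_rate_schedule(learning_rate, C.UnitType.minibatch)
learner = C.sgd(z.parameters, lr_schedule)
trainer = C.Trainer(z, (loss, eval_error), [learner])

# Initialize the parameters for the trainer
minibatch_size = 25
num_samples_to_train = 20000
num_minibatches_to_train = int(num_samples_to_train  / minibatch_size)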

from collections import defaultdict

# Run the trainer and perform model training
training_progress_output_freq = 50
plotdata = defaultdict(list)

for i in range(0, num_minibatches_to_train):
    features, labels = generate_random_data_sample(minibatch_size, input_dim, num_output_classes)
    
    # Assign the minibatch data to the input variables and train the model on the minibatch
    trainer.train_minibatch({feature : features, label : labels})
    batchsize, loss, error = print_training_progress(trainer, i, 
                                                     training_progress_output_freq, verbose=1)
    
    if not (loss == "NA" or error =="NA"):
        plotdata["batchsize"].append(batchsize)
        plotdata["loss"].append(loss)
        plotdata["error"].append(error)

The output is:

Minibatch: 0, Loss: 0.6931, Error: 0.32
Minibatch: 50, Loss: 4.4290, Error: 0.36
Minibatch: 100, Loss: 0.4585, Error: 0.16
Minibatch: 150, Loss: 0.7228, Error: 0.32
Minibatch: 200, Loss: 0.1290, Error: 0.08
Minibatch: 250, Loss: 0.1321, Error: 0.08
Minibatch: 300, Loss: 0.1012, Error: 0.04
Minibatch: 350, Loss: 0.1076, Error: 0.04
Minibatch: 400, Loss: 0.3087, Error: 0.08
Minibatch: 450, Loss: 0.3219, Error: 0.12
Minibatch: 500, Loss: 0.4076, Error: 0.20
Minibatch: 550, Loss: 0.6784, Error: 0.24
Minibatch: 600, Loss: 0.2988, Error: 0.12
Minibatch: 650, Loss: 0.1676, Error: 0.12
Minibatch: 700, Loss: 0.2772, Error: 0.12
Minibatch: 750, Loss: 0.2309, Error: 0.04

# Compute the moving average loss to smooth out the noise in SGD
plotdata["avgloss"] = moving_average(plotdata["loss"])
plotdata["avgerror"] = moving_average(plotdata["error"])

# Plot the training loss and the training error
import matplotlib.pyplot as plt

plt.figure(1)
plt.subplot(211)
plt.plot(plotdata["batchsize"], plotdata["avgloss"], 'b--')
plt.xlabel('Minibatch number')
plt.ylabel('Loss')
plt.title('Minibatch run vs. Training loss')

plt.show()

plt.subplot(212)
plt.plot(plotdata["batchsize"], plotdata["avgerror"], 'r--')
plt.xlabel('Minibatch number')
plt.ylabel('Label Prediction Error')
plt.title('Minibatch run vs. Label Prediction Error')
plt.show()

Evaluating the model

To evaluate the model, we feed the remaining data into the trained model and compare the true results with the predicted ones.

# Run the trained model on a newly generated dataset
test_minibatch_size = 25
features, labels = generate_random_data_sample(test_minibatch_size, input_dim, num_output_classes)

trainer.test_minibatch({feature : features, label : labels})

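Here the error on the test minibatch is 0.12. This is a key metric: if the test error greatly exceeds the training error, the model has overfitted during training.

Checking the predictions

Check how many predictions are wrong: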

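# Assumed definition of `result`: class probabilities from a softmax over z,
# evaluated on the test features.
out = C.softmax(z)
result = out.eval({feature : features})

print("Label    :", [np.argmax(label) for label in labels])
# Take the argmax over the class axis; ravel() copes with any extra leading axis.
print("Predicted:", np.argmax(np.asarray(result), axis=-1).ravel().tolist())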

Visualization

# Model parameters
print(mydict['b'].value)

bias_vector   = mydict['b'].value
weight_matrix = mydict['w'].value

# Plot the data 
import matplotlib.pyplot as plt

# let 0 represent malignant/red, and 1 represent benign/blue
colors = ['r' if label == 0 else 'b' for label in labels[:,0]]
plt.scatter(features[:,0], features[:,1], c=colors)
plt.plot([0, bias_vector[0]/weight_matrix[0][1]], 
         [ bias_vector[1]/weight_matrix[0][0], 0], c = 'g', lw = 3)
plt.xlabel("Patient age (scaled)")
plt.ylabel("Tumor size (in cm)")
plt.show()
