Hands-On Automated Machine Learning

阿新 • Published: 2018-11-03

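A machine learning (ML) model has many moving parts that must be connected before the model can run and produce results. Tying the different parts of the ML process together is called building a pipeline. For data scientists, the pipeline is a general but very important concept. In software engineering, people build pipelines to take software from source code to deployment. Similarly, in ML, a pipeline is built so that data can flow from its raw format to some useful information. A pipeline also provides a mechanism for building several ML pipelines in parallel so that the results of different ML methods can be compared.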

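Each stage of a pipeline is fed the data processed by the stage before it; that is, the output of one processing unit is supplied as the input to the next step. Data flows through the pipeline just as water flows through a pipe. Mastering the pipeline concept is an effective way to build ML models with fewer errors, and pipelines are a key element of AutoML systems.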
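The idea that each stage consumes the output of the previous one can be sketched in plain Python. This is only an illustration: the function run_pipeline and the three stage functions are hypothetical names invented for this sketch and are not part of scikit-learn.

```python
from functools import reduce


def run_pipeline(stages, data):
    """Pass data through each stage in order; each output feeds the next stage."""
    return reduce(lambda current, stage: stage(current), stages, data)


# Hypothetical stages, chosen only to illustrate the flow of data.
def drop_missing(values):
    """Remove entries that are None."""
    return [v for v in values if v is not None]


def scale_to_unit(values):
    """Rescale values linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def round_values(values):
    """Round each value to two decimal places."""
    return [round(v, 2) for v in values]


stages = [drop_missing, scale_to_unit, round_values]
result = run_pipeline(stages, [2.0, None, 4.0, 6.0])
print(result)
```

Because every stage takes a list and returns a list, stages can be reordered or swapped without touching the code that runs them, which is the property the pipeline abstraction relies on.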

A simple pipeline
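We will first import a dataset called Iris, which is available in scikit-learn's sample dataset library (http://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html). The dataset consists of four features and has 150 rows. We will develop the following steps in a pipeline to train our model on the Iris dataset. The problem statement is to predict the species of an Iris flower from four different features, as shown in the following flowchart: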

[Figure: pipeline flowchart]
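As a quick check of the dataset description above, the following sketch loads Iris and prints its dimensions, feature names, and species names. It assumes scikit-learn is installed and uses only load_iris and the attributes of the object it returns.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# iris.data holds the feature matrix: one row per flower, one column per feature.
print(iris.data.shape)

# Names of the four features and of the species (the prediction targets).
print(iris.feature_names)
print(iris.target_names)
```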

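In this pipeline, we will use MinMaxScaler to scale the input data and logistic regression to predict the Iris species. The model is then evaluated on the accuracy metric: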

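1. The first step is to import from scikit-learn the modules that provide what we need for the task. We must import the Pipeline class from sklearn.pipeline, which provides the functionality required to create an ML pipeline: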

from sklearn.datasets import load_iris
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
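2. The next step is to load the Iris data and split it into training and test datasets. In this example, we use 80% of the dataset to train the model and the remaining 20% to test its accuracy. We can use the shape attribute to view the dimensions of the dataset: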

# Load and split the data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)
X_train.shape
3. The following result shows that the training dataset has 4 columns and 120 rows, which is 80% of the Iris dataset, as expected:

[Figure: output of X_train.shape]
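The split sizes can also be checked directly, without relying on the screenshot. The sketch below repeats the split from step 2 with the same test_size and random_state and prints the shapes of both parts.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

# 80% of the 150 rows go to training, the remaining 20% to testing.
print(X_train.shape, X_test.shape)
```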

4. Next, we print the dataset:

print(X_train)
The preceding code produces the following output:

[Figure: printed contents of X_train]

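5. The next step is to create a pipeline. The Pipeline object takes a list of (key, value) pairs. The key is a string that names a particular step, and the value is the transformer or estimator object for that step. In the following code snippet, we name the MinMaxScaler() step minmax and the LogisticRegression() step lr: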

pipe_lr = Pipeline([('minmax', MinMaxScaler()),
                    ('lr', LogisticRegression())])
6. Then we fit the pipeline object pipe_lr to the training dataset:

pipe_lr.fit(X_train, y_train)
7. Running the preceding code produces the following output, which shows the final structure of the fitted model:

[Figure: structure of the fitted pipeline]
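The step names given to the pipeline are more than labels: after fitting, each step can be retrieved by its key through the named_steps attribute. The sketch below rebuilds the pipeline from the earlier steps and inspects what each step learned; it is a self-contained illustration, not additional output from the original article.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

pipe_lr = Pipeline([('minmax', MinMaxScaler()),
                    ('lr', LogisticRegression())])
pipe_lr.fit(X_train, y_train)

# Step names, in the order the data passes through them.
step_names = [name for name, _ in pipe_lr.steps]
print(step_names)

# The fitted scaler remembers the per-feature minimum and maximum
# of the training data.
scaler = pipe_lr.named_steps['minmax']
print(scaler.data_min_, scaler.data_max_)

# The fitted classifier remembers the class labels it was trained on.
classifier = pipe_lr.named_steps['lr']
print(classifier.classes_)
```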

8. The last step is to score the model on the test dataset using the score method:

score = pipe_lr.score(X_test, y_test)
print('Logistic Regression pipeline test accuracy: %.3f' % score)
We can see from the following result that the accuracy of the model is 0.900, that is, 90%:

[Figure: test accuracy output]
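For a classifier, the value returned by score is the accuracy: the fraction of test samples whose predicted label matches the true label. The sketch below computes that fraction by hand and compares it with score. The exact accuracy you see may differ from the value reported above, since defaults such as the LogisticRegression solver have changed between scikit-learn versions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

pipe_lr = Pipeline([('minmax', MinMaxScaler()),
                    ('lr', LogisticRegression())])
pipe_lr.fit(X_train, y_train)

# Accuracy by hand: share of test samples predicted correctly.
predictions = pipe_lr.predict(X_test)
manual_accuracy = np.mean(predictions == y_test)

# Accuracy as reported by the pipeline.
score = pipe_lr.score(X_test, y_test)

print('manual: %.3f, score(): %.3f' % (manual_accuracy, score))
```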

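In this example, we created a pipeline with two steps, minmax scaling and LogisticRegression. When we called the fit method on pipe_lr, MinMaxScaler ran its fit and transform methods on the input data and passed the result to the estimator, which is a logistic regression model. The intermediate steps in a pipeline are called transformers, and the last step is an estimator.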
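To make the transformer-then-estimator flow concrete, the following sketch performs the same steps by hand and compares the result with the pipeline. It is an illustration of the behaviour described above under default settings, not the internal implementation of Pipeline; the variable names are chosen for this sketch.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

# Manual version: fit and transform with the scaler, then fit the estimator.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
clf = LogisticRegression()
clf.fit(X_train_scaled, y_train)

# At test time the scaler only transforms, reusing the training min and max.
manual_score = clf.score(scaler.transform(X_test), y_test)

# Pipeline version: the same two steps behind a single fit and score call.
pipe_lr = Pipeline([('minmax', MinMaxScaler()),
                    ('lr', LogisticRegression())])
pipe_lr.fit(X_train, y_train)
pipe_score = pipe_lr.score(X_test, y_test)

# With the default feature range, min-max scaling is (x - min) / (max - min),
# computed per feature column.
col_min, col_max = X_train.min(axis=0), X_train.max(axis=0)
by_formula = (X_train - col_min) / (col_max - col_min)

print(manual_score, pipe_score)
print(np.allclose(X_train_scaled, by_formula))
```

Keeping the scaler inside the pipeline means the test data is always transformed with statistics learned from the training data only, which is easy to get wrong when the steps are written out by hand.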
