tf.estimator Quickstart [5]

阿新 • Published: 2017-10-07


tf.estimator Quickstart

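TensorFlow's high-level machine learning API (tf.estimator) makes it easy to configure, train, and evaluate a variety of machine learning models. In this tutorial, you will use tf.estimator to build a neural network classifier and train it on the Iris dataset to predict flower species from sepal and petal geometry. You will write code to perform the following five steps: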

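Load CSVs containing the Iris training/test data into a TensorFlow Dataset
Build a neural network classifier
Train the model using the training data
Evaluate the accuracy of the model
Classify new samples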

Complete Neural Network Source Code

Here is the full code for the neural network classifier:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os

import numpy as np
from six.moves.urllib.request import urlopen
import tensorflow as tf

# Data sets
IRIS_TRAINING = "iris_training.csv"
IRIS_TRAINING_URL = "http://download.tensorflow.org/data/iris_training.csv"

IRIS_TEST = "iris_test.csv"
IRIS_TEST_URL = "http://download.tensorflow.org/data/iris_test.csv"

def main():
  # If the training and test sets aren't stored locally, download them.
  # urlopen().read() returns bytes, so the files are written in binary mode.
  if not os.path.exists(IRIS_TRAINING):
    raw = urlopen(IRIS_TRAINING_URL).read()
    with open(IRIS_TRAINING, "wb") as f:
      f.write(raw)

  if not os.path.exists(IRIS_TEST):
    raw = urlopen(IRIS_TEST_URL).read()
    with open(IRIS_TEST, "wb") as f:
      f.write(raw)

  # Load datasets.
  training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
      filename=IRIS_TRAINING,
      target_dtype=np.int,
      features_dtype=np.float32)
  test_set = tf.contrib.learn.datasets.base.load_csv_with_header(
      filename=IRIS_TEST,
      target_dtype=np.int,
      features_dtype=np.float32)

  # Specify that all features have real-value data
  feature_columns = [tf.feature_column.numeric_column("x", shape=[4])]

  # Build 3 layer DNN with 10, 20, 10 units respectively.
  classifier = tf.estimator.DNNClassifier(feature_columns=feature_columns,
                                          hidden_units=[10, 20, 10],
                                          n_classes=3,
                                          model_dir="/tmp/iris_model")
  # Define the training inputs
  train_input_fn = tf.estimator.inputs.numpy_input_fn(
      x={"x": np.array(training_set.data)},
      y=np.array(training_set.target),
      num_epochs=None,
      shuffle=True)

  # Train model.
  classifier.train(input_fn=train_input_fn, steps=2000)

  # Define the test inputs
  test_input_fn = tf.estimator.inputs.numpy_input_fn(
      x={"x": np.array(test_set.data)},
      y=np.array(test_set.target),
      num_epochs=1,
      shuffle=False)

  # Evaluate accuracy.
  accuracy_score = classifier.evaluate(input_fn=test_input_fn)["accuracy"]

  print("\nTest Accuracy: {0:f}\n".format(accuracy_score))

  # Classify two new flower samples.
  new_samples = np.array(
      [[6.4, 3.2, 4.5, 1.5],
       [5.8, 3.1, 5.0, 1.7]], dtype=np.float32)
  predict_input_fn = tf.estimator.inputs.numpy_input_fn(
      x={"x": new_samples},
      num_epochs=1,
      shuffle=False)

  predictions = list(classifier.predict(input_fn=predict_input_fn))
  predicted_classes = [p["classes"] for p in predictions]

  print(
      "New Samples, Class Predictions:    {}\n"
      .format(predicted_classes))

if __name__ == "__main__":
    main()

The following sections walk through the code in detail.

Load the Iris CSV Data into TensorFlow

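The Iris dataset contains 150 rows of data, comprising 50 samples from each of three related Iris species: Iris setosa, Iris virginica, and Iris versicolor.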

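Each row contains the following data for one flower sample: sepal length, sepal width, petal length, petal width, and flower species. Species are represented as integers: 0 denotes Iris setosa, 1 denotes Iris versicolor, and 2 denotes Iris virginica.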

For this tutorial, the Iris data has been randomized and split into two separate CSVs:

A training set of 120 samples (iris_training.csv)
A test set of 30 samples (iris_test.csv)
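
As a rough illustration of how such a split can be produced, the sketch below shuffles 150 row indices and sets aside 120 for training and 30 for testing. The function name `split_indices` and the fixed seed are assumptions for illustration; the tutorial does not say how the published CSV files were actually generated.

```python
import random


def split_indices(n_rows, n_train, seed=0):
    """Shuffle row indices and split them into train and test index lists.

    Hypothetical helper: the seed and procedure are illustrative only.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(n_rows=150, n_train=120)
print(len(train_idx), len(test_idx))  # 120 30
```

Because every index lands in exactly one of the two lists, no sample is used for both training and testing.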

To get started, first import the necessary modules and define where to download and store the datasets:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os

import numpy as np
from six.moves.urllib.request import urlopen
import tensorflow as tf

IRIS_TRAINING = "iris_training.csv"
IRIS_TRAINING_URL = "http://download.tensorflow.org/data/iris_training.csv"

IRIS_TEST = "iris_test.csv"
IRIS_TEST_URL = "http://download.tensorflow.org/data/iris_test.csv"

Then, if the training and test sets are not already stored locally, download them:

# urlopen().read() returns bytes, so the files are written in binary mode.
if not os.path.exists(IRIS_TRAINING):
  raw = urlopen(IRIS_TRAINING_URL).read()
  with open(IRIS_TRAINING, "wb") as f:
    f.write(raw)

if not os.path.exists(IRIS_TEST):
  raw = urlopen(IRIS_TEST_URL).read()
  with open(IRIS_TEST, "wb") as f:
    f.write(raw)

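Next, load the training and test sets into Datasets using the load_csv_with_header() method in learn.datasets.base. The load_csv_with_header() method takes three required arguments:

filename, the file path to the CSV file
target_dtype, the numpy datatype of the dataset's target values
features_dtype, the numpy datatype of the dataset's feature values

Here, the target (the value you are training the model to predict) is the flower species, an integer from 0 to 2, so the appropriate numpy datatype is np.int: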

# Load datasets.
training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename=IRIS_TRAINING,
    target_dtype=np.int,
    features_dtype=np.float32)
test_set = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename=IRIS_TEST,
    target_dtype=np.int,
    features_dtype=np.float32)

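Datasets in tf.contrib.learn are named tuples; you can access feature data and target values via the data and target fields. Here, training_set.data and training_set.target contain the feature data and target values for the training set, respectively, and test_set.data and test_set.target contain the feature data and target values for the test set.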
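
To make the data/target structure concrete, here is a minimal standard-library sketch of a loader that returns a named tuple with those two fields. It is not the tf.contrib.learn implementation: `load_csv_sketch`, the sample rows, and the assumption that the header row starts with the sample count and feature count are all illustrative.

```python
import collections
import csv
import io

# Named tuple with the same two field names the tutorial describes.
Dataset = collections.namedtuple("Dataset", ["data", "target"])

# Made-up CSV text: an assumed header (sample count, feature count, class
# names) followed by rows of four measurements and an integer species label.
SAMPLE_CSV = """2,4,setosa,versicolor,virginica
5.1,3.5,1.4,0.2,0
6.4,3.2,4.5,1.5,1
"""


def load_csv_sketch(text):
    """Parse CSV text into a Dataset; the last column is the target."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    data, target = [], []
    for row in reader:
        data.append([float(value) for value in row[:-1]])
        target.append(int(row[-1]))
    return Dataset(data=data, target=target)


dataset = load_csv_sketch(SAMPLE_CSV)
print(dataset.data)    # [[5.1, 3.5, 1.4, 0.2], [6.4, 3.2, 4.5, 1.5]]
print(dataset.target)  # [0, 1]
```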

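Later on, in "Fit the DNNClassifier to the Iris Training Data," you will use training_set.data and training_set.target to train your model, and in "Evaluate Model Accuracy," you will use test_set.data and test_set.target. But first, you will construct your model in the next section.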

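Construct a Deep Neural Network Classifier

tf.estimator offers a variety of predefined models, called Estimators, which you can use "out of the box" to run training and evaluation operations on your data. Here, you will configure a deep neural network classifier model to fit the Iris data. Using tf.estimator, you can instantiate tf.estimator.DNNClassifier with just a couple of lines of code: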

# Specify that all features have real-value data
feature_columns = [tf.feature_column.numeric_column("x", shape=[4])]

# Build 3 layer DNN with 10, 20, 10 units respectively.
classifier = tf.estimator.DNNClassifier(feature_columns=feature_columns,
                                        hidden_units=[10, 20, 10],
                                        n_classes=3,
                                        model_dir="/tmp/iris_model")

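The code above first defines the model's feature columns, which specify the data type of the features in the dataset. All the feature data is continuous, so tf.feature_column.numeric_column is the appropriate function to use to construct the feature columns. There are four features in the dataset (sepal length, sepal width, petal length, and petal width), so shape must be set to [4] to hold all the data.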

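Then, the code creates a DNNClassifier model using the following arguments:

feature_columns=feature_columns. The set of feature columns defined above.
hidden_units=[10, 20, 10]. Three hidden layers, containing 10, 20, and 10 neurons, respectively.
n_classes=3. Three target classes, representing the three Iris species.
model_dir="/tmp/iris_model". The directory in which TensorFlow saves checkpoint data and TensorBoard summaries during model training.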
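
To get a feel for the size of this configuration, the sketch below counts the weights and biases of a fully connected network with 4 inputs, hidden layers of 10, 20, and 10 units, and 3 outputs. It assumes each layer is a plain dense layer with one bias per unit, which is a simplification; `count_dense_parameters` is a hypothetical helper, not part of tf.estimator.

```python
def count_dense_parameters(n_inputs, hidden_units, n_classes):
    """Return per-layer (inputs, outputs, parameters) for a dense network."""
    sizes = [n_inputs] + list(hidden_units) + [n_classes]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        # fan_in * fan_out weights plus one bias per output unit.
        layers.append((fan_in, fan_out, fan_in * fan_out + fan_out))
    return layers


layers = count_dense_parameters(4, [10, 20, 10], 3)
for fan_in, fan_out, n_params in layers:
    print("{} -> {}: {} parameters".format(fan_in, fan_out, n_params))
print("total:", sum(n_params for _, _, n_params in layers))  # total: 513
```

Under these assumptions the model has only a few hundred trainable values, which fits a dataset of 120 training samples.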

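Describe the Training Input Pipeline

The tf.estimator API uses input functions, which create the TensorFlow operations that generate data for the model. You can use tf.estimator.inputs.numpy_input_fn to produce the input pipeline: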

# Define the training inputs
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": np.array(training_set.data)},
    y=np.array(training_set.target),
    num_epochs=None,
    shuffle=True)
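
The roles of num_epochs and shuffle can be illustrated with a small pure-Python generator. This is only a sketch of the idea, not how numpy_input_fn is implemented; `batch_generator`, its `batch_size` parameter, and the seed are assumptions for illustration.

```python
import itertools
import random


def batch_generator(features, labels, batch_size, num_epochs=None,
                    shuffle=True, seed=0):
    """Yield (features, labels) batches; num_epochs=None repeats forever."""
    rng = random.Random(seed)
    epochs = itertools.count() if num_epochs is None else range(num_epochs)
    for _ in epochs:
        order = list(range(len(features)))
        if shuffle:
            rng.shuffle(order)  # new sample order each epoch
        for start in range(0, len(order), batch_size):
            chunk = order[start:start + batch_size]
            yield [features[i] for i in chunk], [labels[i] for i in chunk]


features = [[0.0], [1.0], [2.0], [3.0]]
labels = [0, 1, 0, 1]

# One unshuffled pass over the data, as used for evaluation and prediction.
one_pass = list(batch_generator(features, labels, 2, num_epochs=1,
                                shuffle=False))
print(len(one_pass))  # 2

# An endless shuffled stream, as used for training; take only five batches.
endless = batch_generator(features, labels, 2, num_epochs=None)
print(len(list(itertools.islice(endless, 5))))  # 5
```

With num_epochs=None the stream never ends, so the amount of training has to be bounded elsewhere, which is the job of the steps argument passed to train.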

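Fit the DNNClassifier to the Iris Training Data

Now that you have configured your DNN classifier model, you can fit it to the Iris training data using the train method. Pass train_input_fn as the input_fn, and the number of steps to train (here, 2000):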

# Train model.
classifier.train(input_fn=train_input_fn, steps=2000)

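The state of the model is preserved in the classifier, which means you can train iteratively if you like. For example, the above is equivalent to the following: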

classifier.train(input_fn=train_input_fn, steps=1000)
classifier.train(input_fn=train_input_fn, steps=1000)
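
The idea that training state persists between calls can be sketched without TensorFlow. The toy trainer below minimizes a one-variable quadratic and saves its state to a JSON file after each call, so two 1000-step calls land on the same result as one 2000-step call. `ToyTrainer`, the JSON checkpoint format, and the learning rate are all hypothetical; real estimators store checkpoints in model_dir in their own format, and with shuffled input the two runs need not match exactly.

```python
import json
import os
import tempfile


class ToyTrainer(object):
    """Gradient descent on (w - 3)^2 with state saved in model_dir."""

    def __init__(self, model_dir):
        self._path = os.path.join(model_dir, "state.json")

    def _load(self):
        if os.path.exists(self._path):
            with open(self._path) as f:
                return json.load(f)
        return {"w": 0.0, "global_step": 0}  # fresh model

    def train(self, steps, learning_rate=0.01):
        state = self._load()  # resume from the last checkpoint, if any
        for _ in range(steps):
            gradient = 2.0 * (state["w"] - 3.0)
            state["w"] -= learning_rate * gradient
            state["global_step"] += 1
        with open(self._path, "w") as f:
            json.dump(state, f)
        return state


once = ToyTrainer(tempfile.mkdtemp()).train(steps=2000)

resumed_trainer = ToyTrainer(tempfile.mkdtemp())
resumed_trainer.train(steps=1000)
twice = resumed_trainer.train(steps=1000)

print(once["global_step"], twice["global_step"])  # 2000 2000
print(once["w"] == twice["w"])  # True
```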

However, if you want to track the model while it trains, you will likely want to use a TensorFlow SessionRunHook to perform logging operations instead.

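Evaluate Model Accuracy

You have trained your DNNClassifier model on the Iris training data; now you can check its accuracy on the Iris test data using the evaluate method. Like train, evaluate takes an input function that builds its input pipeline. evaluate returns a dict with the evaluation results. The following code passes the Iris test data (test_set.data and test_set.target) to evaluate and prints the accuracy from the results: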

# Define the test inputs
test_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": np.array(test_set.data)},
    y=np.array(test_set.target),
    num_epochs=1,
    shuffle=False)

# Evaluate accuracy.
accuracy_score = classifier.evaluate(input_fn=test_input_fn)["accuracy"]

print("\nTest Accuracy: {0:f}\n".format(accuracy_score))

When you run the full script, it will print something close to:

Test Accuracy: 0.966667

Your accuracy result may vary a bit, but should be higher than 90%. Not bad for a relatively small dataset!
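
Accuracy here is simply the fraction of test samples whose predicted class matches the true label. The sketch below uses made-up labels, chosen so that 29 of 30 predictions are correct, which reproduces the 0.966667 figure; `accuracy` is a hypothetical helper, not the metric code inside evaluate.

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that equal the true labels."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / float(len(true_labels))


# Made-up example: 30 test labels with exactly one wrong prediction.
true_labels = [0] * 10 + [1] * 10 + [2] * 10
predicted_labels = list(true_labels)
predicted_labels[15] = 2  # one versicolor sample misclassified as virginica

print("Test Accuracy: {0:f}".format(accuracy(true_labels, predicted_labels)))
# Test Accuracy: 0.966667
```

On a 30-sample test set each misclassification costs about 3.3 percentage points, so staying above 90% means at most two errors.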

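Classify New Samples

Use the estimator's predict method to classify new samples. The code below defines two new flower samples and predicts their species: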

# Classify two new flower samples.
new_samples = np.array(
    [[6.4, 3.2, 4.5, 1.5],
     [5.8, 3.1, 5.0, 1.7]], dtype=np.float32)
predict_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": new_samples},
    num_epochs=1,
    shuffle=False)

predictions = list(classifier.predict(input_fn=predict_input_fn))
predicted_classes = [p["classes"] for p in predictions]

print(
    "New Samples, Class Predictions:    {}\n"
    .format(predicted_classes))

Your results should look as follows:

New Samples, Class Predictions:    [1 2]

Thus, the model predicts that the first sample is Iris versicolor and the second sample is Iris virginica.
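
The integer class IDs can be turned back into species names with a simple lookup based on the encoding given earlier (0 for Iris setosa, 1 for Iris versicolor, 2 for Iris virginica). The sketch assumes the IDs may arrive as integers, strings, or byte strings, since the exact element type returned by predict is not shown here; `species_name` is a hypothetical helper.

```python
SPECIES = ["Iris setosa", "Iris versicolor", "Iris virginica"]


def species_name(class_id):
    """Map a class ID (int, str, or bytes such as b"1") to a species name."""
    if isinstance(class_id, bytes):
        class_id = class_id.decode("ascii")
    return SPECIES[int(class_id)]


print([species_name(c) for c in [1, 2]])
# ['Iris versicolor', 'Iris virginica']
print(species_name(b"2"))  # Iris virginica
```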
