A First Look at Algorithms: Using TensorFlow and the PAI Platform

阿新 • • Published: 2018-12-22

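Preface

I have known the name TensorFlow for a long time, but my understanding of it never went beyond "heard of it". I once worked on a project that intelligently recognized image-adaptation problems on the wireless (mobile) side and implemented a GoogLeNet Inception V3 network on top of TensorFlow (a deep convolutional neural network from the GoogLeNet family; the original GoogLeNet has 22 layers), but that was mostly a case of "blind men feeling an elephant" and copying from examples. Since TensorFlow is one of the most frequently mentioned names in machine learning and deep learning today, it is worth finding out what it actually is.

PAI, as a one-stop machine learning and algorithm service platform, greatly simplifies the work of model building, model training, parameter tuning, model performance evaluation, and serving, and helps us run algorithm experiments and build applications more quickly.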

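Part I: A First Look at TensorFlow

1. Installation and Startup

Because Docker is already installed on my MacBook Pro, setting up a TensorFlow environment is very simple: pulling the official TensorFlow image is all it takes.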

# Pull the TensorFlow image
docker pull tensorflow/tensorflow

# Create a TensorFlow working directory to mount into the container
mkdir -p /Users/znifeng/docker-data/tensorflow/notebooks

# Start the container
docker run -it --rm --name myts -v /Users/znifeng/docker-data/tensorflow/notebooks:/notebooks -p 8888:8888 tensorflow/tensorflow

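Once the container starts successfully, it prints startup information that includes a Jupyter URL with an access token.

Copy that link, for example http://127.0.0.1:8888/?token=487c52e0aa0cd2a7b231bf909c1d6666482f8ed03353e510, into a browser to open the Jupyter page (an interactive notebook that supports writing and debugging Python online).

From there, you can write and debug TensorFlow code either in Jupyter or inside the Docker container; the container already includes all of the TensorFlow libraries.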

2. Basic Usage

2.1 Core Concepts

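A graph represents the computation.
A tensor represents data. Tensors generalize vectors: a vector is a rank-1 tensor, while a tensor can have any rank from 0 (a scalar) upward (multi-dimensional).
Each node in the graph is called an op (operation). An op takes zero or more tensors as input, performs a computation, and produces zero or more tensors as output.
A graph is executed within the context of a Session.
Variables maintain state.
Feeds and fetches push data into, or pull data out of, arbitrary operations.
A placeholder defines a slot whose value is supplied at run time.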

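A TensorFlow program is usually organized into a construction phase and an execution phase. In the construction phase, the steps to execute are described as a graph of ops; in the execution phase, a session executes the ops in the graph. In Python, a fetched tensor is returned as a numpy.ndarray object; in C/C++, it is returned as a tensorflow::Tensor instance.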
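The split between construction and execution can be illustrated without TensorFlow at all. The following is a minimal pure-Python sketch, not TensorFlow's actual implementation: the names Op, constant, matmul, and ToySession are hypothetical and exist only for this illustration. It shows that building the graph computes nothing, and that values appear only when a session-like object walks the graph.

```python
# Toy illustration of "build a graph first, run it later".
# Op, constant, matmul and ToySession are made-up names, not TensorFlow APIs.

class Op:
    """A graph node: remembers its inputs and how to compute its output."""
    def __init__(self, name, inputs, compute):
        self.name = name
        self.inputs = inputs      # upstream Op objects
        self.compute = compute    # function applied to the input values


def constant(value, name="const"):
    # A constant op has no inputs and simply returns its value when run.
    return Op(name, [], lambda: value)


def matmul(a, b, name="matmul"):
    def compute(x, y):
        # Plain nested-list matrix product.
        return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
                 for j in range(len(y[0]))]
                for i in range(len(x))]
    return Op(name, [a, b], compute)


class ToySession:
    """Evaluates an op by first evaluating everything it depends on."""
    def run(self, op):
        args = [self.run(dep) for dep in op.inputs]
        return op.compute(*args)


# Construction phase: only graph nodes are created, nothing is computed.
matrix1 = constant([[3.0, 3.0]])
matrix2 = constant([[2.0], [2.0]])
product = matmul(matrix1, matrix2)

# Execution phase: the session walks the graph and produces a value.
result = ToySession().run(product)
print(result)
```

Here product is only a description of a computation until run is called on it, which mirrors the role that sess.run plays in TensorFlow 1.x.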

2.2 Examples

2.2.1 A first hello-world program:

import tensorflow as tf

# Phase 1: build the graph
# Define a 1x2 matrix with elements [3 3]
matrix1 = tf.constant([[3., 3.]])

# Define a 2x1 matrix with elements [[2], [2]]
matrix2 = tf.constant([[2.],[2.]])

# Create a matrix-multiplication (matmul) op that takes 'matrix1' and 'matrix2' as inputs.
product = tf.matmul(matrix1, matrix2)

# Phase 2: run the graph
with tf.Session() as sess:
    print("matrix1: %s" % sess.run(matrix1))
    print("matrix2: %s" % sess.run(matrix2))
    print("result type: %s" % type(sess.run(product)))
    print("result: %s" % sess.run(product))

Output:

matrix1: [[3. 3.]]
matrix2: [[2.]
 [2.]]
result type: <type 'numpy.ndarray'>
result: [[12.]]

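As the listing above shows, in the first phase (building the graph) each statement is in fact an operation: two constant operations and one matrix-multiplication operation. The output of every operation is a tensor, which comes back as a numpy.ndarray when fetched in Python. During construction we only define ops; nothing is actually executed. Only in the second phase, after a session has been created, are the previously defined operations actually run inside that session and their results obtained.

2.2.2 Recognizing handwritten digits (0-9) with TensorFlow: Softmax Regression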

# input_data is the MNIST loader script from the TensorFlow tutorials
# (tensorflow/g3doc/tutorials/mnist/); it is assumed to be importable here
import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

import tensorflow as tf

## Define input data format
# input images, each image is represented by a tensor of 784 dimensions
x = tf.placeholder("float", [None, 784])
# input labels, each label is one digit in [0, 9], one-hot encoded as a tensor of 10 dimensions
y_ = tf.placeholder("float", [None, 10])

## Define model and algorithm
# weight matrix mapping each of the 784 features to each of the 10 classes
W = tf.Variable(tf.zeros([784, 10]))
# bias of each digit
b = tf.Variable(tf.zeros([10]))
# predicted probability array of an image, which is of 10 dimensions
# tf.matmul: matrix multiplication
y = tf.nn.softmax(tf.matmul(x, W) + b)
# cross-entropy, used as the loss function
# tf.reduce_sum: sum-reduction. tf.reduce_sum(x, 0) sums along axis 0 (one sum per column),
# tf.reduce_sum(x, 1) sums along axis 1 (one sum per row), tf.reduce_sum(x, [0, 1]) sums over
# both axes; with no axis given, as here, every element is summed
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
# gradient descent algorithm with learning rate 0.01
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

## Training model
# initialize all variables
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(1000):
        # each step trains on a mini-batch of 100 images
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

    ## Evaluation
    # tf.argmax(t, 1) returns, for each row of t, the index of its largest value
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

Output:

0.904

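The data used by the model comes from tensorflow/g3doc/tutorials/mnist/. It contains 70,000 images of the handwritten digits 0 to 9, each 28x28 (784) pixels: 55,000 are used as the training set, 5,000 as the validation set, and the remaining 10,000 as the test set.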

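Each image can therefore be represented as a 784-dimensional vector, where each element is the intensity of one pixel, between 0 and 1. The input set is declared as x = tf.placeholder("float", [None,784]); a placeholder reserves a slot, and None leaves the number of rows (images per batch) open until actual data is passed in through a feed. y_ holds the corresponding true labels. Because the labels range over 0 to 9, each one can be written as a ten-dimensional vector [y0,y1,...,y9]; the digit "1", for example, is [0,1,0,0,0,0,0,0,0,0]. The model in this example is linear: xW+b gives a preliminary result z, and the softmax function then maps z to a probability for each of the digits 0 to 9.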
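The label encoding and the softmax mapping described above can be tried out in plain Python. This is a minimal sketch under stated assumptions: the helper names one_hot and softmax are chosen for this illustration, and the scores list is made-up data standing in for the result z of xW+b for a single image.

```python
import math


def one_hot(digit, num_classes=10):
    """Encode a digit as a vector with a single 1 at index `digit`."""
    vec = [0.0] * num_classes
    vec[digit] = 1.0
    return vec


def softmax(z):
    """Map raw scores to probabilities that are positive and sum to 1."""
    m = max(z)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


label = one_hot(1)

# Made-up scores for one image, one entry per digit 0..9.
scores = [0.5, 2.0, -1.0, 0.0, 0.1, 0.3, -0.5, 1.0, 0.2, 0.0]
probs = softmax(scores)

# The predicted digit is the index with the highest probability,
# which is what tf.argmax computes in the listing above.
predicted = probs.index(max(probs))
print(label)
print(predicted)
```

Because softmax preserves the ordering of the scores, the predicted digit is simply the index of the largest score.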

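To fit the model, we need a metric for how good the model is. In machine learning it is more common to define a metric for how bad the model is and then minimize it to find the best solution; this metric is called the cost or loss function. This example uses cross-entropy as the loss, cross_entropy = -tf.reduce_sum(y_*tf.log(y)), and minimizes it with gradient descent, train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy), where 0.01 is the learning rate (the size of each descent step).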
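To make the loss and the descent step concrete, here is a small pure-Python sketch for a single training example. It is a simplification, not what the optimizer in the listing does internally: it applies gradient descent directly to the ten scores z rather than to W and b, and it relies on the standard result that the gradient of softmax cross-entropy with respect to z is (p - y_). All function and variable names are chosen for this illustration.

```python
import math


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true, y_pred):
    """-sum(y_true * log(y_pred)), the same formula as in the listing."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))


def descend(z, y_true, learning_rate=0.01, steps=1000):
    """Plain gradient descent on the scores z for one example."""
    losses = []
    for _ in range(steps):
        p = softmax(z)
        losses.append(cross_entropy(y_true, p))
        # Gradient of the loss with respect to each score is (p_i - y_i).
        z = [zi - learning_rate * (pi - ti)
             for zi, pi, ti in zip(z, p, y_true)]
    return z, losses


y_true = [0.0] * 10
y_true[3] = 1.0          # the true digit is 3 (made-up example)
z0 = [0.0] * 10          # all-zero scores give a uniform prediction

z_final, losses = descend(z0, y_true)
print(losses[0])
print(losses[-1])
```

With all-zero scores every digit gets probability 0.1, so the initial loss is log(10); each step then nudges the scores toward the true digit and the loss shrinks.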

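As shown, after 1,000 iterations the model reaches a prediction accuracy of 90.4% on the test set.

2.2.3 Recognizing handwritten digits (0-9) with TensorFlow: Deep CNN

The softmax model in the previous section reached about 90% accuracy. Next, let's try implementing a convolutional neural network in TensorFlow to recognize the handwritten digits.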

# input_data is the MNIST loader script from the TensorFlow tutorials; it is assumed to be importable here
import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

import tensorflow as tf

def weight_variable(shape):
    # small random initial weights
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    # small positive initial bias
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    # stride 1 with SAME padding keeps the spatial size unchanged
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding="SAME")

def max_pool_2x2(x):
    # 2x2 max pooling with stride 2 halves the height and width
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=(None, 10))

# First convolutional layer: 5x5 kernels, 1 input channel, 32 output channels
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1,28,28,1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer: 5x5 kernels, 32 input channels, 64 output channels
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Densely connected layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout to reduce overfitting
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Output layer: softmax
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# Train and evaluate the model
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(2000):
        batch = mnist.train.next_batch(50)
        if i%200 == 0:
            # keep_prob 1.0 disables dropout while measuring accuracy
            train_accuracy = sess.run(accuracy, feed_dict={x:batch[0], y_: batch[1], keep_prob: 1.0})
            print("step %d, training accuracy %g" % (i, train_accuracy))
        # keep_prob 0.5 keeps each unit with probability 0.5 during training
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    print("final training accuracy %g" % sess.run(accuracy, feed_dict={x:batch[0], y_: batch[1], keep_prob: 1.0}))

Output:

step 0, training accuracy 0.14
step 200, training accuracy 0.92
step 400, training accuracy 0.96
step 600, training accuracy 1
step 800, training accuracy 0.9
step 1000, training accuracy 0.96
step 1200, training accuracy 0.94
step 1400, training accuracy 0.94
step 1600, training accuracy 0.98
step 1800, training accuracy 1
final training accuracy 0.99

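As shown, after 2,000 iterations the deep CNN reaches 99% accuracy on the final training batch (the listing reports training accuracy, not test-set accuracy).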
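The 7*7*64 input size of the densely connected layer in the listing follows from how each layer changes the tensor shape. The sketch below traces a single 28x28x1 image through the two convolution and pooling stages. It assumes the usual rule that with SAME padding the output size is ceil(input size / stride); the helper name same_padding_output and the layers table are introduced only for this illustration.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size under SAME padding: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


height = width = 28
channels = 1

# (layer name, stride, output channels); None means channels are unchanged.
layers = [
    ("conv1", 1, 32),    # 5x5 convolution, stride 1
    ("pool1", 2, None),  # 2x2 max pooling, stride 2
    ("conv2", 1, 64),    # 5x5 convolution, stride 1
    ("pool2", 2, None),  # 2x2 max pooling, stride 2
]

for name, stride, out_channels in layers:
    height = same_padding_output(height, stride)
    width = same_padding_output(width, stride)
    if out_channels is not None:
        channels = out_channels
    print("%s -> %dx%dx%d" % (name, height, width, channels))

# Number of values per image after flattening the last pooling output.
flat_size = height * width * channels
print(flat_size)
```

Each pooling stage halves the height and width (28 to 14 to 7), while the convolutions only change the channel count, which is why the flattened vector has 7*7*64 entries.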

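Part II: Using the PAI Platform

The previous part covered the basic concepts and usage of TensorFlow. This part briefly walks through training an LR (logistic regression) model with PAI. The overall flow has roughly these steps: offline data development -> building the experiment on PAI -> serving the model

2.1 Offline Data Development

Data is the foundation of any algorithm. We first analyze the business to identify the features that influence the target result, then collect those features to obtain the raw data for each one. This part can be done in ODPS. In the "intelligent requirement-risk identification" project, for example, it involved joining several tables and extracting and computing metrics; missing values can be filled using different strategies (for example with 0, or with the mean of that field). The end result is the training data.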
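The two fill strategies mentioned above (fill with 0, or fill with the mean of the field) can be sketched as follows. In the article this step is carried out in ODPS; the pure-Python version here is only a stand-in to show the logic, and the function name fill_missing and the sample column are hypothetical.

```python
def fill_missing(values, strategy="zero"):
    """Replace None entries with 0 or with the mean of the present values."""
    present = [v for v in values if v is not None]
    if strategy == "zero":
        fill = 0.0
    elif strategy == "mean":
        # Fall back to 0.0 if the whole column is missing.
        fill = sum(present) / float(len(present)) if present else 0.0
    else:
        raise ValueError("unknown strategy: %s" % strategy)
    return [fill if v is None else v for v in values]


# Hypothetical feature column with two missing entries.
column = [4.0, None, 8.0, None, 6.0]

zero_filled = fill_missing(column, "zero")
mean_filled = fill_missing(column, "mean")
print(zero_filled)
print(mean_filled)
```

Filling with the mean leaves the column average unchanged, whereas filling with 0 pulls it down; which one is appropriate depends on what a missing value means for that feature.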

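2.2 Building the Experiment on PAI

In the PAI platform, create a new experiment named smarttest and build the experiment pipeline.

Each component in the pipeline can be found in the "Components" navigation bar on the left. The steps are as follows: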

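Read data table: enter the ODPS table name
Type conversion: convert the specified fields to double/int
Normalization: select the features used for training and normalize them, using y = (x - MinValue) / (MaxValue - MinValue)
Split: split the data set into a training set and a test set by ratio
Logistic regression (binary classification): train an LR model on the training set
Prediction: apply the trained model to the test-set inputs
Confusion matrix: produce the confusion matrix of the model's predictions
Binary classification evaluation: obtain evaluation results such as AUC, KS, and F1 Score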
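Two of the steps above are simple enough to sketch directly: the min-max normalization formula and the confusion matrix with the F1 score derived from it. This is an illustrative pure-Python sketch, not the PAI components themselves; the function names and the sample data are assumptions made for the example.

```python
def min_max_normalize(values):
    """y = (x - MinValue) / (MaxValue - MinValue), mapping values into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / float(hi - lo) for v in values]


def confusion_matrix(actual, predicted):
    """Return (tp, fp, fn, tn) for binary labels 0/1."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / float(tp + fp) if tp + fp else 0.0
    recall = tp / float(tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


normalized = min_max_normalize([10.0, 20.0, 40.0])

# Made-up true labels and model predictions for six samples.
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_matrix(actual, predicted)
f1 = f1_score(tp, fp, fn)
print(normalized)
print((tp, fp, fn, tn))
print(f1)
```

AUC and KS are computed from predicted probabilities rather than hard 0/1 predictions, so they are left out of this sketch.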

2.3 Serving the Model

Once the model is trained, it can be deployed as an online service.
