Basic Usage of Neural Networks in TensorFlow
阿新 • Published: 2019-01-03
Artificial Neural Networks (ANNs) in TensorFlow
Introduction
- Inspired by biological neural networks
- Historical development:
  - biological neuron models
  - logic computation units: and, or, xor, etc.
  - the perceptron: h_w(x) = step(w^T · x)
  - the multi-layer perceptron and backpropagation
Perceptron
- sklearn also provides a perceptron implementation; its weight-update rule is
  w_{i,j}^{(next step)} = w_{i,j} + η (y_j − ŷ_j) x_i,
  where η is the learning rate
- The perceptron learning rule is very similar to SGD
- Logistic regression outputs a probability for each class, while the perceptron only outputs a hard class decision based on a threshold, so logistic regression is generally preferred over the perceptron for classification
- A single perceptron is a linear model and struggles with non-linear problems; stacking multiple perceptrons into layers avoids this limitation
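To make the update rule above concrete, here is a minimal from-scratch NumPy sketch (not sklearn's implementation); the toy AND dataset and the name `train_perceptron` are illustrative assumptions, not part of the original post:

```python
import numpy as np

def train_perceptron(X, y, eta=0.1, n_epochs=20):
    """Train a single perceptron with the rule w <- w + eta * (y - y_hat) * x."""
    X_b = np.c_[np.ones(len(X)), X]        # prepend a bias input of 1
    w = np.zeros(X_b.shape[1])
    for _ in range(n_epochs):
        for x_i, y_i in zip(X_b, y):
            y_hat = int(x_i @ w >= 0)      # step activation
            w += eta * (y_i - y_hat) * x_i # perceptron update rule
    return w

# toy AND problem (linearly separable, so the perceptron converges)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w = train_perceptron(X, y)
preds = (np.c_[np.ones(4), X] @ w >= 0).astype(int)  # → [0, 0, 0, 1]
```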
Multi-layer perceptron and backpropagation
- The perceptron's activation is the step function, whose output is either 0 or 1; it cannot be used with backpropagation (which needs derivatives), so it is replaced by the logistic function
  σ(z) = 1 / (1 + exp(−z)),
  which is also called the activation function
- Commonly used activation functions:
  - the logistic function
  - the hyperbolic tangent: tanh(z) = 2σ(2z) − 1
  - ReLU: relu(z) = max(z, 0). ReLU is not differentiable at z = 0, but it is extremely cheap to compute, so it is very widely used in practice; it also has no upper bound, which avoids some problems during gradient descent
  - softmax
- MLPs are often used for classification, and the output layer then typically uses softmax as its activation. Softmax guarantees that the outputs of all nodes sum to 1, so each node's output can be read as the probability of that class. The softmax function is
  σ(z)_j = exp(z_j) / Σ_{k=1}^{K} exp(z_k)
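The softmax formula above can be sketched in NumPy as a quick sanity check; the max-subtraction step is a standard numerical-stability trick that is an addition of mine, not something from the original formula:

```python
import numpy as np

def softmax(z):
    """Softmax over the last axis: sigma(z)_j = exp(z_j) / sum_k exp(z_k)."""
    z = np.asarray(z, dtype=float)
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
# the probabilities sum to 1 and preserve the ordering of the raw scores
```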
# suppress Python warnings
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import os
# these GPU options only need to be passed when the first Session is created
gpu_options = tf.GPUOptions(allow_growth=True)
def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)
    return
with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
    print(sess.run(tf.constant(1)))
1
# sklearn perceptron
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
iris = load_iris()
X = iris.data[:, (2, 3)]
y = (iris.target == 0).astype(np.int)
per_clf = Perceptron(random_state=42)
per_clf.fit(X, y)
y_pred = per_clf.predict([[2, 0.5]])
print(y_pred)
[1]
# define some activation functions
def logit(z):
    return 1 / (1 + np.exp(-z))
def relu(z):
    return np.maximum(0, z)
def derivative(f, z, eps=0.000001):
    # numerical derivative via the central difference
    return (f(z + eps) - f(z - eps)) / (2 * eps)
# visualize the activation functions and their derivatives
z = np.linspace(-5, 5, 200)
plt.figure(figsize=(10,4))
plt.subplot(121)
plt.plot(z, np.sign(z), "r-", linewidth=2, label="Step")
plt.plot(z, logit(z), "g--", linewidth=2, label="Logit")
plt.plot(z, np.tanh(z), "b-", linewidth=2, label="Tanh")
plt.plot(z, relu(z), "m-.", linewidth=2, label="ReLU")
plt.grid(True)
plt.legend(loc="center right", fontsize=14)
plt.title("Activation functions", fontsize=14)
plt.axis([-5, 5, -1.2, 1.2])
plt.subplot(122)
plt.plot(z, derivative(np.sign, z), "r-", linewidth=2, label="Step")
plt.plot(0, 0, "ro", markersize=5)
plt.plot(0, 0, "rx", markersize=10)
plt.plot(z, derivative(logit, z), "g--", linewidth=2, label="Logit")
plt.plot(z, derivative(np.tanh, z), "b-", linewidth=2, label="Tanh")
plt.plot(z, derivative(relu, z), "m-.", linewidth=2, label="ReLU")
plt.grid(True)
#plt.legend(loc="center right", fontsize=14)
plt.title("Derivatives", fontsize=14)
plt.axis([-5, 5, -0.2, 1.2])
plt.show()
Training with an MLP
- TF ships an MLP implementation in tf.contrib.learn
- Below is an MLP training example
- infer_real_valued_columns_from_input infers the data type and feature dimensionality from the input data; see: http://www.cnblogs.com/wxshi/p/8053973.html
from tensorflow.examples.tutorials.mnist import input_data
from sklearn.metrics import accuracy_score
# load the data
mnist = input_data.read_data_sets("dataset/mnist")
X_train = mnist.train.images
X_test = mnist.test.images
y_train = mnist.train.labels.astype("int")
y_test = mnist.test.labels.astype("int")
feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier( hidden_units=[300,100], n_classes=10, feature_columns=feature_cols, model_dir="./models/mnist/" )
dnn_clf.fit( x=X_train, y=y_train, batch_size=2000,steps=1000 )
y_pred = list( dnn_clf.predict(X_test) )
print( "accuracy : ", accuracy_score(y_test, y_pred) )
Extracting dataset/mnist/train-images-idx3-ubyte.gz
Extracting dataset/mnist/train-labels-idx1-ubyte.gz
Extracting dataset/mnist/t10k-images-idx3-ubyte.gz
Extracting dataset/mnist/t10k-labels-idx1-ubyte.gz
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Restoring parameters from ./models/mnist/model.ckpt-1000
INFO:tensorflow:Saving checkpoints for 1001 into ./models/mnist/model.ckpt.
INFO:tensorflow:loss = 0.11683539, step = 1001
INFO:tensorflow:loss = 0.10939115, step = 1101 (0.908 sec)
INFO:tensorflow:loss = 0.082077585, step = 1201 (0.894 sec)
INFO:tensorflow:loss = 0.089471206, step = 1301 (0.920 sec)
INFO:tensorflow:loss = 0.073814414, step = 1401 (0.820 sec)
INFO:tensorflow:loss = 0.067025915, step = 1501 (0.940 sec)
INFO:tensorflow:loss = 0.07670402, step = 1601 (0.796 sec)
INFO:tensorflow:loss = 0.060902975, step = 1701 (0.848 sec)
INFO:tensorflow:loss = 0.057678875, step = 1801 (0.929 sec)
INFO:tensorflow:loss = 0.074146144, step = 1901 (0.913 sec)
INFO:tensorflow:Saving checkpoints for 2000 into ./models/mnist/model.ckpt.
INFO:tensorflow:Loss for final step: 0.057994846.
INFO:tensorflow:Restoring parameters from ./models/mnist/model.ckpt-2000
accuracy : 0.9747
Building a DNN in TensorFlow
- When initializing the trainable parameters, the weights can be drawn from a truncated normal distribution with standard deviation √(2/n_inputs), which speeds up convergence. In TF, truncated_normal discards samples that fall more than two standard deviations from the mean.
from tensorflow.contrib.layers import fully_connected
reset_graph()
# MNIST
n_inputs = 28*28
n_hidden1 = 300
n_hidden2 = 100
n_outputs = 10
X = tf.placeholder(tf.float32, shape=(None, n_inputs), name="X")
y = tf.placeholder(tf.int64, shape=(None), name="y")
# a hand-rolled layer, similar to tf's fully_connected
def neuron_layer(X, n_neurons, name, activation=None):
    with tf.name_scope(name):
        n_inputs = int(X.shape[1])
        stddev = np.sqrt(2 / n_inputs)
        # truncated normal clips away unusually large initial values
        init = tf.truncated_normal((n_inputs, n_neurons), stddev=stddev)
        W = tf.Variable(init, name="weights")
        b = tf.Variable(tf.zeros([n_neurons]), name="bias")
        z = tf.matmul(X, W) + b
        if activation == "relu":
            return tf.nn.relu(z)
        else:
            return z
# with tf.name_scope("dnn"):
#     hidden1 = neuron_layer(X, n_hidden1, "hidden1", activation="relu")
#     hidden2 = neuron_layer(hidden1, n_hidden2, "hidden2", activation="relu")
#     logits = neuron_layer(hidden2, n_outputs, "outputs")
with tf.name_scope("dnn"):
    hidden1 = fully_connected(X, n_hidden1, scope="hidden1")
    hidden2 = fully_connected(hidden1, n_hidden2, scope="hidden2")
    logits = fully_connected(hidden2, n_outputs, scope="outputs", activation_fn=None)
with tf.name_scope("loss"):
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
    loss = tf.reduce_mean(xentropy, name="loss")
lr = 0.01
with tf.name_scope("train"):
    optimizer = tf.train.GradientDescentOptimizer(lr)
    training_op = optimizer.minimize(loss)
with tf.name_scope("eval"):
    correct = tf.nn.in_top_k(logits, y, 1)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
init = tf.global_variables_initializer()
saver = tf.train.Saver()
n_epochs = 30
batch_size = 200
with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        for iteration in range(X_train.shape[0] // batch_size):
            X_batch, y_batch = mnist.train.next_batch(batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        acc_train = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
        acc_test = accuracy.eval(feed_dict={X: X_test, y: y_test})
        print(epoch, "train accuracy : ", acc_train, "; Test accuracy : ", acc_test)
    save_path = saver.save(sess, "./models/mnist/my_model_final.ckpt")
0 train accuracy : 0.82 ; Test accuracy : 0.8298
1 train accuracy : 0.89 ; Test accuracy : 0.8783
2 train accuracy : 0.88 ; Test accuracy : 0.8977
3 train accuracy : 0.885 ; Test accuracy : 0.9043
4 train accuracy : 0.925 ; Test accuracy : 0.9104
5 train accuracy : 0.9 ; Test accuracy : 0.9143
6 train accuracy : 0.915 ; Test accuracy : 0.9204
7 train accuracy : 0.925 ; Test accuracy : 0.9224
8 train accuracy : 0.93 ; Test accuracy : 0.9246
9 train accuracy : 0.925 ; Test accuracy : 0.9283
10 train accuracy : 0.92 ; Test accuracy : 0.9297
11 train accuracy : 0.91 ; Test accuracy : 0.9316
12 train accuracy : 0.95 ; Test accuracy : 0.933
13 train accuracy : 0.93 ; Test accuracy : 0.9356
14 train accuracy : 0.94 ; Test accuracy : 0.9373
15 train accuracy : 0.915 ; Test accuracy : 0.9382
16 train accuracy : 0.94 ; Test accuracy : 0.9398
17 train accuracy : 0.965 ; Test accuracy : 0.9415
18 train accuracy : 0.935 ; Test accuracy : 0.9425
19 train accuracy : 0.95 ; Test accuracy : 0.9433
20 train accuracy : 0.925 ; Test accuracy : 0.9447
21 train accuracy : 0.925 ; Test accuracy : 0.9455
22 train accuracy : 0.93 ; Test accuracy : 0.9461
23 train accuracy : 0.91 ; Test accuracy : 0.9484
24 train accuracy : 0.935 ; Test accuracy : 0.9485
25 train accuracy : 0.95 ; Test accuracy : 0.95
26 train accuracy : 0.94 ; Test accuracy : 0.9511
27 train accuracy : 0.93 ; Test accuracy : 0.9531
28 train accuracy : 0.95 ; Test accuracy : 0.9527
29 train accuracy : 0.965 ; Test accuracy : 0.9541
- To reuse the previously trained model for classification, simply restore it from the saved checkpoint file
with tf.Session() as sess:
    saver.restore(sess, "./models/mnist/my_model_final.ckpt")
    X_new_scaled = X_test[:20, :]
    Z = logits.eval(feed_dict={X: X_new_scaled})
    y_pred = np.argmax(Z, axis=1)
    print("real value : ", y_test[0:20])
    print("predict value :", y_pred[0:20])
INFO:tensorflow:Restoring parameters from ./models/mnist/my_model_final.ckpt
real value : [7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4]
predict value : [7 2 1 0 4 1 4 9 6 9 0 6 9 0 1 5 9 7 3 4]
Fine-tuning NN hyperparameters
- NNs are very flexible, but they have many hyperparameters to tune, such as the number of hidden layers and the choice of activation function
- In general, start by training and testing with a single hidden layer to get a baseline result
- Deep networks are more parameter-efficient than shallow ones: they can model more complex functions with fewer neurons
- During training, hidden layers can be added incrementally to obtain a more complex network
- The number of neurons in the input and output layers is fixed by the task: for MNIST, the input layer has 784 nodes (the number of features) and the output layer has 10 nodes (the number of classes)
- For hidden layers, one option is to increase the number of neurons gradually until the model starts to overfit; another common approach is to pick a large number of neurons and train with early stopping to obtain the best model
- ReLU is usually a good default activation function: it is fast to compute and does not saturate for large inputs. For the output layer, softmax is generally fine as long as the classes are mutually exclusive
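The early-stopping strategy mentioned above can be sketched framework-free; `train_with_early_stopping` and the simulated validation-loss curve below are hypothetical stand-ins for a real training loop, not TF API:

```python
import numpy as np

def train_with_early_stopping(val_losses, patience=3):
    """Stop when validation loss has not improved for `patience` consecutive checks.
    `val_losses` stands in for the per-epoch validation loss a real loop would compute."""
    best_loss = np.inf
    best_epoch = 0
    checks_without_progress = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss            # a real loop would also save a checkpoint here
            best_epoch = epoch
            checks_without_progress = 0
        else:
            checks_without_progress += 1
            if checks_without_progress >= patience:
                break                   # early stop: restore the best checkpoint
    return best_epoch, best_loss

# simulated validation-loss curve: improves, then the model starts to overfit
losses = [0.9, 0.5, 0.4, 0.35, 0.37, 0.39, 0.41, 0.45]
best_epoch, best_loss = train_with_early_stopping(losses)  # best at epoch 3
```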