
Kaggle Digit Recognizer: a handwritten-digit beginner competition with TensorFlow-GPU (Top 15%)

This is my own work, open-sourced in the hope that we can learn from each other.
P.S.: The top twenty or thirty entries in this competition currently score 1.0 on the test set. I suspect one way to do that is to feed every sample into the CNN (downloading the train set, the test set, and their labels from the official site) and train until training accuracy reaches 1.0, so that the test-set score comes out as 1.0 as well; but that defeats the purpose of the competition.

Here is the competition result; as of now it is in the top 15%, using a CNN. The full code is at the end of this post, and I will put the source code and the data set on GitHub later.
[Figure: screenshot of the Kaggle leaderboard result]

Cleaning the data

  1. The data set downloaded from Kaggle contains three files: train.csv, test.csv, and sample_submission.csv.
  2. The .csv files are read with the pandas package.
  3. train.csv is a 42000x785 array: 42000 samples, with the image label in the first column and the remaining 784 columns to be reshaped into 28x28 images.
  4. test.csv is a 28000x784 array; once the network is trained, it is run on these samples and the results go into sample_submission.csv, which is uploaded to Kaggle for scoring.
  5. The first 40000 samples of train are used as the training set and the remaining 2000 as the cross-validation set. Once the model is tuned, all 42000 samples can go into the training set (I was lazy and did not do this), which might raise accuracy slightly.
train = pd.read_csv('train.csv')
X_train = train.iloc[:40000,1:].values
X_train = np.float32(X_train/255.)  # scale the inputs into [0,1]
X_train = np.reshape(X_train, [40000,28,28,1])  # reshape the row vectors into images, since a CNN is used
Y_train = train.iloc[:40000,0].values
X_dev = train.iloc[40000:,1:].values
X_dev = np.float32(X_dev/255.)
X_dev = np.reshape(X_dev, [2000,28,28,1])
Y_dev = train.iloc[40000:,0].values

One-hot encoding

  1. Since this is a multi-class problem, the labels need to be one-hot encoded.
Y_train = np.eye(10)[Y_train.reshape(-1)]
Y_dev = np.eye(10)[Y_dev.reshape(-1)]
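The `np.eye` indexing trick above can be checked on a tiny example (a sketch with made-up labels):

```python
import numpy as np

# Hypothetical labels for three samples: digits 3, 0, and 9
labels = np.array([3, 0, 9])

# Row i of the 10x10 identity matrix is the one-hot vector for class i,
# so fancy-indexing the identity matrix with the labels encodes the whole batch at once
one_hot = np.eye(10)[labels.reshape(-1)]

print(one_hot.shape)  # (3, 10)
```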

CNN architecture

  1. The network uses three convolutional layers, each followed by a ReLU activation, a max-pooling layer, and dropout, and ends with two fully connected layers.

  2. The dropout settings follow the Kaggle kernel by Yassine Ghouzam, PhD; I have to say dropout works remarkably well against overfitting.

  3. The name arguments on the tensors make it possible to locate the graph's input and output later, when saving the network and restoring it for testing.

  4. While tuning, I stumbled on the fact that a (1,1) stride in the second pooling layer makes the network fit very quickly and reach high accuracy.

  5. All convolution kernels are 3x3; I tried 5x5 in the first convolutional layer and it made little difference.

# Three placeholders: the entry points for feeding data
X = tf.placeholder(tf.float32,(None,28,28,1), name = 'X')
Y = tf.placeholder(tf.float32,(None,10))
training = tf.placeholder(tf.bool, name = 'training')  # switches dropout on during training and off at test time
W1 = tf.get_variable('W1',[3,3,1,32],initializer=tf.contrib.layers.xavier_initializer())
W2 = tf.get_variable('W2',[3,3,32,64],initializer=tf.contrib.layers.xavier_initializer())
W3 = tf.get_variable('W3',[3,3,64,128],initializer=tf.contrib.layers.xavier_initializer())

conv_0 = tf.nn.conv2d(X, W1, strides = [1,1,1,1], padding = 'SAME')
act_0 = tf.nn.relu(conv_0)
pool_0 = tf.nn.max_pool(act_0, ksize = [1,2,2,1], strides = [1,2,2,1], padding = 'VALID')
dropout_0 = tf.contrib.layers.dropout(pool_0, keep_prob = 0.25, is_training = training)

conv_1 = tf.nn.conv2d(dropout_0, W2, strides = [1,1,1,1], padding = 'VALID')
act_1 = tf.nn.relu(conv_1)
pool_1 = tf.nn.max_pool(act_1, ksize = [1,2,2,1],strides = [1,1,1,1], padding = 'VALID')
dropout_1 = tf.contrib.layers.dropout(pool_1, keep_prob = 0.25, is_training = training)

conv_2 = tf.nn.conv2d(dropout_1, W3, strides = [1,1,1,1], padding = 'VALID')
act_2 = tf.nn.relu(conv_2)
pool_2 = tf.nn.max_pool(act_2, ksize = [1,2,2,1],strides = [1,1,1,1], padding = 'VALID')
dropout_2 = tf.contrib.layers.dropout(pool_2, keep_prob = 0.25, is_training = training)

flat = tf.contrib.layers.flatten(dropout_2)
full_0 = tf.contrib.layers.fully_connected(flat, 256, activation_fn = tf.nn.relu)
dropout_3 = tf.contrib.layers.dropout(full_0, keep_prob = 0.5, is_training = training)

full_1 = tf.contrib.layers.fully_connected(dropout_3, 10, activation_fn= None)

y = tf.argmax(full_1, 1, name = 'y')

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=full_1,labels=Y))
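The layer sizes above can be verified by hand: with VALID padding the output side length is floor((n - k) / s) + 1, and with SAME padding it is ceil(n / s). A plain-Python sketch, mirroring the kernel sizes and strides used above, traces the spatial size through the network:

```python
import math

def valid(n, k, s=1):
    # Output side length for VALID padding: floor((n - k) / s) + 1
    return (n - k) // s + 1

def same(n, s=1):
    # Output side length for SAME padding: ceil(n / s)
    return math.ceil(n / s)

n = 28                 # input images are 28x28
n = same(n, 1)         # conv_0: 3x3, stride 1, SAME  -> 28
n = valid(n, 2, 2)     # pool_0: 2x2, stride 2, VALID -> 14
n = valid(n, 3, 1)     # conv_1: 3x3, stride 1, VALID -> 12
n = valid(n, 2, 1)     # pool_1: 2x2, stride 1, VALID -> 11
n = valid(n, 3, 1)     # conv_2: 3x3, stride 1, VALID -> 9
n = valid(n, 2, 1)     # pool_2: 2x2, stride 1, VALID -> 8
print(n, n * n * 128)  # 8 8192
```

So the flatten layer feeds 8x8x128 = 8192 units into the first fully connected layer.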

GPU setup

  1. I use a 1050 Ti. Without this setting, TensorFlow grabs all of the GPU memory up front and training later runs out of it; readers without a GPU can skip this.
gpu_no = '0' # or '1'
os.environ["CUDA_VISIBLE_DEVICES"] = gpu_no
# TensorFlow configuration
config = tf.ConfigProto()
# Allocate GPU memory on demand rather than all at once; this is the key setting
config.gpu_options.allow_growth = True
sess = tf.Session(config = config)

Adaptive learning rate

  1. Near the global optimum, a large learning rate makes the parameters oscillate around it, so the learning rate needs to shrink as the number of iterations grows.
starter_learning_rate = 1e-4
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step, 1500, 0.96, staircase = True)
optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost, global_step=global_step)
init = tf.global_variables_initializer()
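With staircase = True, exponential_decay multiplies the base rate by 0.96 once every 1500 global steps, i.e. lr(step) = 1e-4 * 0.96 ** (step // 1500). A plain-Python sketch of that schedule:

```python
def decayed_lr(step, base=1e-4, decay_steps=1500, rate=0.96):
    # Staircase exponential decay: the rate drops in discrete jumps every
    # `decay_steps` steps, matching tf.train.exponential_decay(staircase=True)
    return base * rate ** (step // decay_steps)

for step in (0, 1500, 15000):
    print(step, decayed_lr(step))
```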

Making mini-batches

  1. Mini-batches make training much faster.
def random_mini_batches(X, Y, mini_batch_size):
    
    m = X.shape[0]                  
    mini_batches = []
    
    # Shuffle the data
    permutation = list(np.random.permutation(m))
    shuffled_X = X[permutation,:,:,:]
    shuffled_Y = Y[permutation,:]

    # Build the mini-batches
    num_complete_minibatches = math.floor(m/mini_batch_size) 
    for k in range(0, num_complete_minibatches):
        mini_batch_X = shuffled_X[k * mini_batch_size : k * mini_batch_size + mini_batch_size,:,:,:]
        mini_batch_Y = shuffled_Y[k * mini_batch_size : k * mini_batch_size + mini_batch_size,:]
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)
    
    # If the sample count is not divisible by the mini-batch size, collect the remainder into a final batch
    if m % mini_batch_size != 0:
        mini_batch_X = shuffled_X[num_complete_minibatches * mini_batch_size : m,:,:,:]
        mini_batch_Y = shuffled_Y[num_complete_minibatches * mini_batch_size : m,:]
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)
    
    return mini_batches
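As a sanity check on the batching logic above: with 40000 training samples and a batch size of 128, the function should return 312 full batches plus one remainder batch of 64. A self-contained sketch reproduces the slicing on dummy 1-D data:

```python
import math
import numpy as np

m, batch_size = 40000, 128
data = np.arange(m)  # stand-in for the (already shuffled) samples

num_full = math.floor(m / batch_size)  # 312 full batches
batches = [data[k*batch_size:(k+1)*batch_size] for k in range(num_full)]
if m % batch_size != 0:                # remainder batch: 40000 - 312*128 = 64 samples
    batches.append(data[num_full*batch_size:])

print(len(batches), len(batches[-1]))  # 313 64
```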

Training and saving the network

  1. GPU memory is limited, so feeding all the data at once during evaluation runs out of memory; training-set accuracy is therefore measured on the first 4000 samples only.
  2. The network is saved once cross-validation accuracy reaches 0.997 or after 1000 epochs. 1000 epochs took about an hour on a 1050 Ti; in practice 300-400 epochs are enough.
saver = tf.train.Saver()  # needed below to save the network
with tf.Session(config = config) as sess:
    sess.run(init)  # initialize the parameters
    for epoch in range(num_epochs):
        minibatch_cost = 0.
        num_minibatches = int(m / minibatch_size)
        minibatches = random_mini_batches(X_train, Y_train, minibatch_size)
        for minibatch in minibatches:
            (minibatch_X, minibatch_Y) = minibatch
            _ , temp_cost = sess.run([optimizer,cost],feed_dict={X:minibatch_X,Y:minibatch_Y,training:1})
            minibatch_cost += temp_cost / num_minibatches
        if epoch % 5 == 0:
            print ("Cost after epoch %i: %f" % (epoch, minibatch_cost))
            correct_prediction = tf.equal(tf.argmax(full_1,1), tf.argmax(Y,1))
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))          
            print ("Train Accuracy:", accuracy.eval({X: X_train[:4000,:,:,:], Y: Y_train[:4000,:], training:0}))
            Dev_Accuracy = accuracy.eval({X: X_dev, Y: Y_dev, training:0})
            print ("Dev Accuracy:", Dev_Accuracy)
            print(sess.run(global_step))
            print(sess.run(learning_rate))  # watch the learning rate decay
            if Dev_Accuracy >=0.997 or epoch == 1000:
                saver.save(sess,'Model/model.ckpt')  # save the network
                break
        costs.append(minibatch_cost)  # record the cost

Loading the network and creating submission.csv

  1. Because GPU memory is limited, the test set is fed through the network in batches.
  2. I needed a VPN to submit the results to Kaggle; the upload failed without one.
  3. The .csv creation follows this blog post.
import tensorflow as tf
import pandas as pd 
import numpy as np
test = pd.read_csv('test.csv')

X_test = test.iloc[:,:].values
X_test = np.float32(X_test/255.)
X_test = np.reshape(X_test, [28000,28,28,1])

with tf.Session() as sess:
    saver = tf.train.import_meta_graph('Model/model.ckpt.meta')
    saver.restore(sess, "Model/model.ckpt")
    graph = tf.get_default_graph()
    X = graph.get_tensor_by_name("X:0")
    training = graph.get_tensor_by_name("training:0")
    y = graph.get_tensor_by_name("y:0")
    for i in range(7):
        pred = sess.run(y, feed_dict={X:X_test[i*4000:(i+1)*4000,:,:,:], training:0})
        try:
            result = np.hstack((result, pred))
        except NameError:  # first batch: result does not exist yet
            result = pred
    pd.DataFrame({"ImageId": range(1, len(result) + 1), "Label": result}).to_csv('submission.csv', index=False, header=True)
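The try/except above works (the first iteration raises a NameError because `result` does not exist yet), but collecting the per-batch predictions in a list and concatenating once is a cleaner pattern. A sketch with a hypothetical stub in place of `sess.run`:

```python
import numpy as np

def predict_batch(batch):
    # Stub standing in for sess.run(y, feed_dict={X: batch, training: 0})
    return np.zeros(len(batch), dtype=np.int64)

X_test = np.zeros((28000, 28, 28, 1), dtype=np.float32)

# 28000 test samples fed in 7 chunks of 4000, then joined in one call
preds = [predict_batch(X_test[i*4000:(i+1)*4000]) for i in range(7)]
result = np.concatenate(preds)

print(result.shape)  # (28000,)
```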

Full code

# -*- coding: utf-8 -*-
"""
Created on Tue Nov 20 13:54:05 2018

@author: zjn
"""
import os
import math
import numpy as np
import tensorflow as tf
import pandas as pd

train = pd.read_csv('train.csv')

X_train = train.iloc[:40000,1:].values
X_train = np.float32(X_train/255.)
X_train = np.reshape(X_train, [40000,28,28,1])
Y_train = train.iloc[:40000,0].values
X_dev = train.iloc[40000:,1:].values
X_dev = np.float32(X_dev/255.)
X_dev = np.reshape(X_dev, [2000,28,28,1])
Y_dev = train.iloc[40000:,0].values

Y_train = np.eye(10)[Y_train.reshape(-1)]
Y_dev = np.eye(10)[Y_dev.reshape(-1)]

gpu_no = '0' # or '1'
os.environ["CUDA_VISIBLE_DEVICES"] = gpu_no
# TensorFlow configuration
config = tf.ConfigProto()
# Allocate GPU memory on demand rather than all at once; this is the key setting
config.gpu_options.allow_growth = True
sess = tf.Session(config = config)

num_epochs = 6000
minibatch_size = 128
costs = []
m = X_train.shape[0]

X = tf.placeholder(tf.float32,(None,28,28,1), name = 'X') 
Y = tf.placeholder(tf.float32,(None,10))
training = tf.placeholder(tf.bool, name = 'training')

W1 = tf.get_variable('W1',[3,3,1,32],initializer=tf.contrib.layers.xavier_initializer())
W2 = tf.get_variable('W2',[3,3,32,64],initializer=tf.contrib.layers.xavier_initializer())
W3 = tf.get_variable('W3',[3,3,64,128],initializer=tf.contrib.layers.xavier_initializer())
W4 = tf.get_variable('W4',[3,3,64,64],initializer=tf.contrib.layers.xavier_initializer())

conv_0 = tf.nn.conv2d(X, W1, strides = [1,1,1,1], padding = 'SAME')
act_0 = tf.nn.relu(conv_0)
pool_0 = tf.nn.max_pool(act_0, ksize = [1,2,2,1], strides = [1,2,2,1], padding = 'VALID')
dropout_0 = tf.contrib.layers.dropout(pool_0, keep_prob = 0.25, is_training = training)

conv_1 = tf.nn.conv2d(dropout_0, W2, strides = [1,1,1,1], padding = 'VALID')
act_1 = tf.nn.relu(conv_1)
pool_1 = tf.nn.max_pool(act_1, ksize = [1,2,2,1],strides = [1,1,1,1], padding = 'VALID')
dropout_1 = tf.contrib.layers.dropout(pool_1, keep_prob = 0.25, is_training = training)

conv_2 = tf.nn.conv2d(dropout_1, W3, strides = [1,1,1,1], padding = 'VALID')