The MNIST Dataset in TensorFlow (Part 12)

阿新 • Published: 2019-02-01

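The MNIST dataset

MNIST is a dataset of handwritten digits for which TensorFlow provides a loader. You can download the files yourself or let the loader function download them.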

import tensorflow.examples.tutorials.mnist.input_data as input_data
mnist = input_data.read_data_sets('data/',one_hot=True)

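Import the module and call read_data_sets to download the data automatically. The download may fail, though (a quick search on Baidu suggests it fails for almost everyone), so it is safer to download the files yourself. Once downloaded, put them in the data directory under the current directory, because the first argument above, 'data/', is the directory where the files are stored. one_hot=True means the labels use one-hot (0/1) encoding.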

This handwritten digit dataset is split into a training set and a test set:

print(mnist.train.num_examples)
print(mnist.test.num_examples)

num_examples gives the number of examples in each set.

The data in detail:

training_images = mnist.train.images

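This returns the pixel data of the training set. Every image is 28x28x1: the width and height are both 28 and the depth is 1, because the images are grayscale. With a depth of 1, each image is simply flattened into a row of 784 pixels, so the shape of the array is [55000, 784].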
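The flattening described above can be sketched with NumPy alone. The array below is a made-up stand-in for one image, not real MNIST data, and the variable names are only illustrative.

```python
import numpy as np

# Hypothetical stand-in for one MNIST image: 28 rows x 28 columns x 1 channel.
image = np.arange(28 * 28, dtype=np.float32).reshape(28, 28, 1)

# Flatten row by row into a single vector of 784 values.
flat = image.reshape(-1)

# Reshaping back recovers the original 2-D layout.
restored = flat.reshape(28, 28)

print(flat.shape)
print(restored.shape)
```

Because the flattening goes row by row, element 28 of the flat vector is the first pixel of the second row.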

training_label = mnist.train.labels

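These are the expected results for the dataset, that is, the labels. If an image shows a 9, its label is [0,0,0,0,0,0,0,0,0,1]: the position of the correct digit holds a 1 and every other position holds a 0.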
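As a rough illustration of this encoding, here is a minimal NumPy sketch. The helper one_hot is hypothetical (it is not part of TensorFlow or the MNIST loader) and only mirrors what one_hot=True produces for the labels.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Set the column given by each label to 1 in its own row.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

encoded = one_hot([9, 0, 3])
print(encoded[0])                   # row for the digit 9
print(np.argmax(encoded, axis=1))   # argmax recovers the original digits
```

Taking the argmax of each row turns the one-hot vector back into the digit, which is what the accuracy computation later relies on.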

Now let's display an image from the MNIST dataset.

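import numpy as np
import matplotlib.pyplot as plt
import tensorflow.examples.tutorials.mnist.input_data as input_data

mnist = input_data.read_data_sets('data/', one_hot=True)
training_image = mnist.train.images
training_label = mnist.train.labels
n = 20  # index of the training image to display
# Reshape the flat 784-pixel row back into a 28x28 image.
curr_img = np.reshape(training_image[n, :], (28, 28))
plt.matshow(curr_img, cmap=plt.get_cmap('gray'))
plt.show()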

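First load the dataset and get the training images and labels, then convert the flat pixel row into a 28x28 image, and finally display it with matshow.

matshow is called with cmap = plt.get_cmap('gray').

The result is a 28x28 grayscale image.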

Using MNIST for a logistic regression:

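import tensorflow as tf

# Placeholders for a batch of flattened images and their one-hot labels.
x = tf.placeholder("float", [None, 784])
y = tf.placeholder("float", [None, 10])
# Trainable parameters, initialized to zero.
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))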

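Prepare the data. x is a float placeholder of shape [None, 784]; in TensorFlow, None leaves that dimension unspecified, so a batch of any size can be fed in. The inputs are declared with placeholder because their values are supplied at run time. w and b need no placeholder and are simply initialized to zeros, but they should be Variable rather than constant, because the optimizer updates them during training.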

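# Predicted class probabilities.
actv = tf.nn.softmax(tf.matmul(x, w) + b)
# Mean cross-entropy over the batch (axis replaces the deprecated reduction_indices).
loss = tf.reduce_mean(-tf.reduce_sum(y * tf.log(actv), axis=1))
lenght = 0.01  # learning rate
optim = tf.train.GradientDescentOptimizer(lenght).minimize(loss)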

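actv applies the softmax activation to get the predicted class probabilities, loss is the cross-entropy loss, lenght is the learning rate, and optim uses gradient descent to optimize the values of w and b. The optimization process was covered in the earlier post on logistic regression.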
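To make the softmax and cross-entropy formulas concrete, here is a small NumPy sketch of the same computation. The logits and targets are made-up numbers, and the helper names softmax and cross_entropy are illustrative, not TensorFlow functions.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability; the result is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, targets):
    # Per-example -sum(y * log(p)), then the mean over the batch.
    return np.mean(-np.sum(targets * np.log(probs), axis=1))

# Made-up logits for two examples and three classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
targets = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = cross_entropy(probs, targets)
print(probs.sum(axis=1))  # each row of probabilities sums to 1
print(loss)
```

Only the probability assigned to the correct class contributes to each example's loss, because the one-hot target zeroes out every other term.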

preds = tf.equal(tf.argmax(actv , axis=1) , tf.argmax(y , axis=1))
accr = tf.reduce_mean(tf.cast(preds , tf.float32))

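preds records whether each prediction is correct. argmax finds the index of the largest value; the index from the predicted probabilities is compared with the index from the label, giving True if they match and False otherwise.

accr is the prediction accuracy: the mean of preds after casting it to float, that is, the fraction of correct predictions on whatever data is fed in.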
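The same argmax comparison can be tried on a handful of made-up values with NumPy. The probabilities and labels below are invented for illustration only.

```python
import numpy as np

# Made-up predicted probabilities for four examples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
# One-hot labels: the true classes are 0, 1, 2, 1.
labels = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

# Compare the predicted class index with the true class index per example.
correct = np.argmax(probs, axis=1) == np.argmax(labels, axis=1)
# Cast booleans to floats and average to get the fraction that is correct.
accuracy = correct.astype(np.float32).mean()

print(correct)
print(accuracy)
```

Here the last example is predicted as class 0 while its label is class 1, so three of the four predictions are correct.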

bacth_size = 100
# initialize_all_variables() is deprecated in favor of global_variables_initializer().
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
costs = []
# Number of mini-batches per epoch.
num_batch = mnist.train.num_examples // bacth_size
for i in range(100):
    cost = 0
    for n in range(num_batch):
        train_x, train_y = mnist.train.next_batch(bacth_size)
        feeds = {x: train_x, y: train_y}
        sess.run(optim, feed_dict=feeds)
        # Accumulate the average loss over the epoch.
        cost += sess.run(loss, feed_dict=feeds) / num_batch
    costs.append(cost)
    print(cost)

Running this in a Session prints the average loss for each of the 100 epochs.
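For readers who want to see what the optimizer does on each batch, the following is a rough NumPy sketch of mini-batch gradient descent for softmax regression. It runs on small synthetic data rather than MNIST, and every name and hyperparameter in it is an assumption chosen for illustration, not taken from the TensorFlow code above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption, not MNIST): 300 points, 4 features, 3 classes.
num_examples, num_features, num_classes = 300, 4, 3
features = rng.normal(size=(num_examples, num_features))
true_w = rng.normal(size=(num_features, num_classes))
class_ids = np.argmax(features @ true_w, axis=1)
labels = np.eye(num_classes)[class_ids]  # one-hot labels

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def mean_cross_entropy(w, b, x, y):
    return np.mean(-np.sum(y * np.log(softmax(x @ w + b)), axis=1))

w = np.zeros((num_features, num_classes))
b = np.zeros(num_classes)
learning_rate, batch_size = 0.1, 50
initial_loss = mean_cross_entropy(w, b, features, labels)

for epoch in range(20):
    order = rng.permutation(num_examples)
    for start in range(0, num_examples, batch_size):
        idx = order[start:start + batch_size]
        x_batch, y_batch = features[idx], labels[idx]
        # Gradient of the mean cross-entropy w.r.t. the logits is (p - y) / batch size.
        error = (softmax(x_batch @ w + b) - y_batch) / len(idx)
        w -= learning_rate * (x_batch.T @ error)
        b -= learning_rate * error.sum(axis=0)

final_loss = mean_cross_entropy(w, b, features, labels)
print(initial_loss, final_loss)
```

With all-zero parameters every class gets probability 1/3, so the starting loss equals ln 3; the loss after training should be lower.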

Plot the loss curve:

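plt.plot(range(100), costs)
plt.show()

You can see that the loss decreases.

Next, try the test data:

test_x, test_y = mnist.test.next_batch(bacth_size)
test = {x: test_x, y: test_y}
print(sess.run(accr, feed_dict=test))

Note that this measures accuracy on a single batch of 100 test images. Feeding mnist.test.images and mnist.test.labels instead would evaluate the whole test set.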
