
Implementing a Multilayer Perceptron and a Simple CNN in TensorFlow

This article implements a multilayer perceptron and a simple convolutional neural network in TensorFlow and applies them to the MNIST dataset. All of the code and the dataset file can be downloaded from the author's GitHub; the Jupyter Notebook provided there contains the code together with detailed comments (the purpose of every function used and explanations of its parameters).

import tensorflow as tf
from tensorflow import keras

print(tf.__version__)  # 2.0.0

The TensorFlow version used here is 2.0.0.

First, load the dataset:

from tensorflow.keras.datasets import mnist
(train_data, train_label), (test_data, test_label) = mnist.load_data('./mnist.npz')

Note that downloading the dataset may fail with an HTTP connection timeout; a VPN may be needed. Alternatively, download the mnist.npz file yourself and place it in the C:\Users\Administrator\.keras\datasets folder.
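
If you go the manual route, the file can also be loaded directly with NumPy, bypassing the Keras downloader entirely (a minimal sketch; it assumes mnist.npz sits next to your script):

import numpy as np

# Load a manually downloaded mnist.npz directly; the Keras MNIST archive
# stores the four arrays under these keys.
with np.load('./mnist.npz') as data:
    train_data, train_label = data['x_train'], data['y_train']
    test_data, test_label = data['x_test'], data['y_test']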

Implementing the multilayer perceptron:

# Define the model: flatten the 28x28 image, then two fully connected layers
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

Model structure:

print(model.summary())
"""
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
dense (Dense)                (None, 256)               200960    
_________________________________________________________________
dense_1 (Dense)              (None, 10)                2570      
=================================================================
Total params: 203,530
Trainable params: 203,530
Non-trainable params: 0
_________________________________________________________________
None
"""

Normalize the input data, set the model's hyperparameters, and train:

# Scale the pixel values from [0, 255] to [0, 1]
train_data = train_data / 255.0
test_data = test_data / 255.0

model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.5),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_data, train_label, epochs=5,
          batch_size=256,
          validation_data=(test_data, test_label),
          validation_freq=1)

Training output:

Train on 60000 samples, validate on 10000 samples
Epoch 1/5
60000/60000 [==============================] - 16s 259us/sample - loss: 0.3641 - accuracy: 0.8926 - val_loss: 0.2121 - val_accuracy: 0.9351
Epoch 2/5
60000/60000 [==============================] - 4s 63us/sample - loss: 0.1652 - accuracy: 0.9523 - val_loss: 0.1375 - val_accuracy: 0.9580
Epoch 3/5
60000/60000 [==============================] - 4s 63us/sample - loss: 0.1199 - accuracy: 0.9658 - val_loss: 0.1091 - val_accuracy: 0.9674
Epoch 4/5
60000/60000 [==============================] - 5s 85us/sample - loss: 0.0952 - accuracy: 0.9726 - val_loss: 0.1082 - val_accuracy: 0.9658
Epoch 5/5
60000/60000 [==============================] - 4s 70us/sample - loss: 0.0788 - accuracy: 0.9775 - val_loss: 0.0947 - val_accuracy: 0.9702
<tensorflow.python.keras.callbacks.History at 0x23036b99320>
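
Once training finishes, the model can be scored on the test set and used for inference (a minimal sketch, not part of the author's listing):

import numpy as np

# Final test-set score and a single-sample prediction
test_loss, test_acc = model.evaluate(test_data, test_label, verbose=0)
print(f'test accuracy: {test_acc:.4f}')

probs = model.predict(test_data[:1])         # shape (1, 10): class probabilities
print('predicted digit:', np.argmax(probs))  # index of the largest probability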

Beyond this, the author also experimented with adding fully connected layers to the MLP above and with changing their sizes, and observed how these changes affect the training results. Since the purpose of this article is to provide an implementation example of a multilayer perceptron, those experiments are not expanded here; the full code and results can be found on the author's GitHub. One such variant is sketched below.
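
For illustration only, a variant with one extra hidden layer might look like this (a hypothetical sketch; the author's actual configurations are in the GitHub notebook):

# Hypothetical deeper variant: one extra 128-unit hidden layer
model2 = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])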

Implementing the simple CNN:

model5 = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(filters=6, 
                           kernel_size=5, 
                           activation='relu', 
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPool2D(pool_size=2, strides=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

Model structure:

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_4 (Conv2D)            (None, 24, 24, 6)         156       
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 12, 12, 6)         0         
_________________________________________________________________
flatten_4 (Flatten)          (None, 864)               0         
_________________________________________________________________
dense_16 (Dense)             (None, 256)               221440    
_________________________________________________________________
dense_17 (Dense)             (None, 10)                2570      
=================================================================
Total params: 224,166
Trainable params: 224,166
Non-trainable params: 0
_________________________________________________________________
None
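
The output shapes follow from the layer arithmetic: a 5x5 convolution with 'valid' padding over a 28x28 input gives 28 - 5 + 1 = 24, the 2x2 pooling with stride 2 halves that to 12, and flattening 12 x 12 x 6 feature maps gives 864 features:

# Shape and parameter arithmetic for the summary above
conv_out = 28 - 5 + 1                  # 24 ('valid' padding)
pool_out = conv_out // 2               # 12 (2x2 pooling, stride 2)
assert pool_out * pool_out * 6 == 864  # flattened features
assert 5 * 5 * 1 * 6 + 6 == 156        # conv kernel weights + biases
assert 864 * 256 + 256 == 221440       # first dense layer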

Before training, the shape of the data needs to be changed to add a channel dimension:

train_data = tf.reshape(train_data, (-1, 28, 28, 1))
test_data = tf.reshape(test_data, (-1, 28, 28, 1))
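
tf.reshape converts the arrays to tensors as a side effect; if you would rather keep them as NumPy arrays, appending a channel axis is equivalent (an alternative sketch, not the author's code):

import numpy as np

# Equivalent alternative: (60000, 28, 28) -> (60000, 28, 28, 1)
train_data = train_data[..., np.newaxis]
test_data = test_data[..., np.newaxis]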

Set the hyperparameters and train the model:

model5.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
               loss='sparse_categorical_crossentropy',
               metrics=['accuracy'])

model5.fit(train_data, train_label, epochs=5, validation_split=0.1)

Training output:

Train on 54000 samples, validate on 6000 samples
Epoch 1/5
54000/54000 [==============================] - 34s 622us/sample - loss: 0.2047 - accuracy: 0.9400 - val_loss: 0.0763 - val_accuracy: 0.9797
Epoch 2/5
54000/54000 [==============================] - 32s 594us/sample - loss: 0.0688 - accuracy: 0.9792 - val_loss: 0.0605 - val_accuracy: 0.9833
Epoch 3/5
54000/54000 [==============================] - 32s 600us/sample - loss: 0.0479 - accuracy: 0.9846 - val_loss: 0.0476 - val_accuracy: 0.9870
Epoch 4/5
54000/54000 [==============================] - 32s 593us/sample - loss: 0.0338 - accuracy: 0.9892 - val_loss: 0.0566 - val_accuracy: 0.9855
Epoch 5/5
54000/54000 [==============================] - 35s 649us/sample - loss: 0.0258 - accuracy: 0.9916 - val_loss: 0.0522 - val_accuracy: 0.9858
<tensorflow.python.keras.callbacks.History at 0x230380d9518>
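
Because validation_split=0.1 validates on a slice of the training data, the held-out test set gives an independent final check (a minimal sketch):

# Score the CNN on the untouched test set
test_loss, test_acc = model5.evaluate(test_data, test_label, verbose=0)
print(f'test accuracy: {test_acc:.4f}')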

Before training model5, the author trained model4 with the same architecture, but with SGD as the optimizer and a learning rate of 0.9. After training, its accuracy was about 0.1, as if it were a randomly initialized model that had never been trained at all. The author therefore switched the optimizer to Adam with a learning rate of 0.001 (model5 above), which reached an accuracy of 0.98 after training. Then, changing only the learning rate while keeping SGD as the optimizer, the final accuracy was not as good as the Adam version but still reached 0.96. This shows how important the choice of hyperparameters is. A sketch of how such a comparison might be run follows.
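
A minimal sketch of that optimizer comparison (not the author's exact code; rebuilding the model gives each run fresh weights):

def make_cnn():
    # Same architecture as model5, rebuilt fresh for each run
    return tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(6, 5, activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPool2D(pool_size=2, strides=2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

for name, opt in [('SGD, lr=0.9', tf.keras.optimizers.SGD(learning_rate=0.9)),
                  ('Adam, lr=0.001', tf.keras.optimizers.Adam(learning_rate=0.001))]:
    m = make_cnn()
    m.compile(optimizer=opt, loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
    history = m.fit(train_data, train_label, epochs=5,
                    validation_split=0.1, verbose=0)
    print(name, '-> final val accuracy:', history.history['val_accuracy'][-1])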