Transfer Learning on MNIST with Keras

Transfer learning addresses the situation where task A has too little data by borrowing experience from a similar task B (that is, transferring from task B to task A). In this example, a model trained on the first five MNIST digits (0-4) is transferred to the task of classifying the last five digits (5-9) (some papers also call these the teacher and student tasks). The environment here is Windows 10 + Anaconda3 + Keras; Keras is a high-level package built on TensorFlow, and in my experience it is more concise to work with than raw TensorFlow. Installing Keras is straightforward, so I won't go into it here.

We start from the example on the official site.

mnist_transfer_cnn.py:

'''Transfer learning toy example.

1 - Train a simple convnet on the MNIST dataset for the first 5 digits [0..4].
2 - Freeze convolutional layers and fine-tune dense layers
   for the classification of digits [5..9].

Get to 99.8% test accuracy after 5 epochs
for the first five digits classifier
and 99.2% for the last five digits after transfer + fine-tuning.
'''

from __future__ import print_function

import datetime
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

# shorthand: now() returns the current time (used to time training below)
now = datetime.datetime.now

batch_size = 128
num_classes = 5  # number of classes in each sub-task
epochs = 5       # number of training epochs

# input image dimensions (28x28 pixels)
img_rows, img_cols = 28, 28
# number of convolutional filters to use
filters = 32
# size of pooling area for max pooling
pool_size = 2
# convolution kernel size
kernel_size = 3

if K.image_data_format() == 'channels_first':
    input_shape = (1, img_rows, img_cols)
else:
    input_shape = (img_rows, img_cols, 1)
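# with the default TensorFlow backend this is typically the 'channels_last'
# branch, i.e. input_shape == (28, 28, 1)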


def train_model(model, train, test, num_classes):
    x_train = train[0].reshape((train[0].shape[0],) + input_shape)
    x_test = test[0].reshape((test[0].shape[0],) + input_shape)
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255
    x_test /= 255
    print('x_train shape:', x_train.shape)
    print(x_train.shape[0], 'train samples')
    print(x_test.shape[0], 'test samples')

    # convert class vectors to binary class matrices (one-hot encoding)
    y_train = keras.utils.to_categorical(train[1], num_classes)
    y_test = keras.utils.to_categorical(test[1], num_classes)

    model.compile(loss='categorical_crossentropy',
                  optimizer='adadelta',
                  metrics=['accuracy'])

    t = now()
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              verbose=1,
              validation_data=(x_test, y_test))
    print('Training time: %s' % (now() - t))
    score = model.evaluate(x_test, y_test, verbose=0)
    print('Test score:', score[0])
    print('Test accuracy:', score[1])


# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# create two datasets: one with digits below 5 and one with 5 and above
x_train_lt5 = x_train[y_train < 5]
y_train_lt5 = y_train[y_train < 5]
x_test_lt5 = x_test[y_test < 5]
y_test_lt5 = y_test[y_test < 5]

x_train_gte5 = x_train[y_train >= 5]
y_train_gte5 = y_train[y_train >= 5] - 5
x_test_gte5 = x_test[y_test >= 5]
y_test_gte5 = y_test[y_test >= 5] - 5
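# note: the [5..9] labels are shifted down by 5 so that both tasks use the
# same 5-way softmax output (num_classes = 5)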

# define two groups of layers: feature (convolutions) and classification (dense)
# feature extractor = Conv + relu + Conv + relu + max-pooling + dropout + flatten
feature_layers = [
    Conv2D(filters, kernel_size,
           padding='valid',
           input_shape=input_shape),
    Activation('relu'),
    Conv2D(filters, kernel_size),
    Activation('relu'),
    MaxPooling2D(pool_size=pool_size),
    Dropout(0.25),
    Flatten(),
]

# classifier = Dense(128) + relu + dropout + Dense(num_classes) + softmax
classification_layers = [
    Dense(128),
    Activation('relu'),
    Dropout(0.5),
    Dense(num_classes),
    Activation('softmax')
]

# create complete model
model = Sequential(feature_layers + classification_layers)

# train model for 5-digit classification [0..4]
train_model(model,
            (x_train_lt5, y_train_lt5),
            (x_test_lt5, y_test_lt5), num_classes)

# freeze feature layers and rebuild model
for l in feature_layers:
    l.trainable = False
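# changes to layer.trainable only take effect once the model is (re)compiled;
# train_model() below calls model.compile(), so the freeze is applied there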

# transfer: train dense layers for the new classification task [5..9]
train_model(model,
            (x_train_gte5, y_train_gte5),
            (x_test_gte5, y_test_gte5), num_classes)
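A quick way to confirm the freeze (not part of the original example) is to inspect the parameter counts; after setting trainable = False on the feature layers, only the dense layers should be counted as trainable:

model.summary()  # 'Non-trainable params' should now cover all convolutional weights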

Training results (the log line below is the final epoch of the transfer + fine-tuning run):

29404/29404 [==============================] - 15s 512us/step - loss: 0.0447 - acc: 0.9868 - val_loss: 0.0238 - val_acc: 0.9922
Training time: 0:01:16.097530
Test score: 0.023793517283240938
Test accuracy: 0.9921826783508657

A note on the model.fit() function in Keras:

# fit() parameters in detail
# model.fit(
#     x=None,                # training data
#     y=None,                # labels for the training data
#     batch_size=None,       # number of samples per gradient update, default 32
#     epochs=1,              # number of training epochs
#     verbose=1,             # 0 = silent, 1 = progress bar, 2 = one line per epoch
#     callbacks=None,        # list of keras.callbacks.Callback objects to invoke during training
#     validation_split=0.,   # float in [0, 1): fraction of the training data to hold out as a
#                            # validation set; ignored when validation_data is provided
#     validation_data=None,  # explicit validation set (takes precedence over validation_split)
#     shuffle=True,          # bool or 'batch': whether to shuffle the samples before each epoch;
#                            # 'batch' shuffles within batch-sized chunks (for HDF5 data)
#     class_weight=None,     # dict mapping class indices to weights; increases the loss penalty
#                            # for misclassifying classes that need extra attention
#     sample_weight=None,    # array of per-sample weights, the same length as the input; for
#                            # temporal data, use a (samples, sequence_length) matrix
#     initial_epoch=0,       # epoch at which to resume a previously interrupted training run
#     steps_per_epoch=None,  # number of batches to draw per epoch (mainly for generators or data
#                            # of unknown length); not to be combined with batch_size
#     validation_steps=None, # only used when steps_per_epoch is set: number of validation
#                            # batches to draw at the end of each epoch
#     **kwargs               # used for interaction with the backend
# )
#
# Returns a History object; History.history records the training process (loss values, etc.)
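For reference, here is a minimal, self-contained sketch of calling fit() and reading the returned History object (toy random data and a hypothetical one-layer model, just to demonstrate the API):

import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense

# toy data: 100 samples, 10 features, 5 classes
x = np.random.random((100, 10))
y = keras.utils.to_categorical(np.random.randint(5, size=(100,)), 5)

m = Sequential([Dense(5, activation='softmax', input_shape=(10,))])
m.compile(loss='categorical_crossentropy', optimizer='adadelta',
          metrics=['accuracy'])

history = m.fit(x, y,
                batch_size=32,
                epochs=2,
                validation_split=0.1,  # hold out 10% of the samples for validation
                verbose=2)             # one log line per epoch
print(history.history['loss'])  # per-epoch training loss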