Transfer Learning with Keras on MNIST
阿新 · Published 2018-12-10
Transfer learning is used when some task A lacks sufficient data: experience learned on a similar task B is carried over (that is, transferred from task B to task A). Here the setting is the MNIST dataset: a model trained on the first five digits (0-4) is transferred to the task of classifying the last five digits (5-9) (some papers seem to call these the teacher and student tasks). My environment is Windows 10 + Anaconda 3 + Keras; Keras is a package built on top of TensorFlow, and in my experience it is more concise to work with than raw TensorFlow. Installing Keras is simple, so I won't go into the details here.
We start from the example on the official site.
mnist_transfer_cnn.py:
'''Transfer learning toy example.

1 - Train a simple convnet on the MNIST dataset, restricted to the
    first 5 digits [0..4].
2 - Freeze convolutional layers and fine-tune dense layers for the
    classification of digits [5..9].

Gets to 99.8% test accuracy after 5 epochs for the first five digits
classifier, and 99.2% for the last five digits after transfer + fine-tuning.
'''

from __future__ import print_function

import datetime
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

# shorthand for getting the current time
now = datetime.datetime.now

batch_size = 128
num_classes = 5  # number of classes
epochs = 5       # number of training epochs

# input image dimensions (28x28 pixels)
img_rows, img_cols = 28, 28
# number of convolutional filters to use
filters = 32
# size of pooling area for max pooling
pool_size = 2
# convolution kernel size
kernel_size = 3

if K.image_data_format() == 'channels_first':
    input_shape = (1, img_rows, img_cols)
else:
    input_shape = (img_rows, img_cols, 1)


def train_model(model, train, test, num_classes):
    x_train = train[0].reshape((train[0].shape[0],) + input_shape)
    x_test = test[0].reshape((test[0].shape[0],) + input_shape)
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255
    x_test /= 255
    print('x_train shape:', x_train.shape)
    print(x_train.shape[0], 'train samples')
    print(x_test.shape[0], 'test samples')

    # convert class vectors to binary (one-hot) class matrices
    y_train = keras.utils.to_categorical(train[1], num_classes)
    y_test = keras.utils.to_categorical(test[1], num_classes)

    model.compile(loss='categorical_crossentropy',
                  optimizer='adadelta',
                  metrics=['accuracy'])

    t = now()
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              verbose=1,
              validation_data=(x_test, y_test))
    print('Training time: %s' % (now() - t))
    score = model.evaluate(x_test, y_test, verbose=0)
    print('Test score:', score[0])
    print('Test accuracy:', score[1])


# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# create two datasets: one with digits below 5, one with digits 5 and above
x_train_lt5 = x_train[y_train < 5]
y_train_lt5 = y_train[y_train < 5]
x_test_lt5 = x_test[y_test < 5]
y_test_lt5 = y_test[y_test < 5]

x_train_gte5 = x_train[y_train >= 5]
y_train_gte5 = y_train[y_train >= 5] - 5  # shift labels [5..9] down to [0..4]
x_test_gte5 = x_test[y_test >= 5]
y_test_gte5 = y_test[y_test >= 5] - 5

# define two groups of layers: feature (convolutions) and classification (dense)
# features = Conv + relu + Conv + relu + pooling + dropout + flatten
feature_layers = [
    Conv2D(filters, kernel_size,
           padding='valid',
           input_shape=input_shape),
    Activation('relu'),
    Conv2D(filters, kernel_size),
    Activation('relu'),
    MaxPooling2D(pool_size=pool_size),
    Dropout(0.25),
    Flatten(),
]

# classifier = Dense(128) + relu + dropout + Dense(5) + softmax
classification_layers = [
    Dense(128),
    Activation('relu'),
    Dropout(0.5),
    Dense(num_classes),
    Activation('softmax')
]

# create complete model
model = Sequential(feature_layers + classification_layers)

# train model for 5-digit classification [0..4]
train_model(model,
            (x_train_lt5, y_train_lt5),
            (x_test_lt5, y_test_lt5), num_classes)

# freeze feature layers and rebuild model
for l in feature_layers:
    l.trainable = False

# transfer: train dense layers for the new classification task [5..9]
train_model(model,
            (x_train_gte5, y_train_gte5),
            (x_test_gte5, y_test_gte5), num_classes)
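As a quick sanity check (my own addition, not part of the official example), you can confirm that the freeze took effect by printing each layer's trainable flag; note that a changed trainable flag only takes effect once the model is compiled again, which train_model() does on every call:

# sanity check (not in the official example): after setting
# l.trainable = False, only the dense layers should still learn
for layer in model.layers:
    print(layer.name, 'trainable =', layer.trainable)

# model.summary() also reports trainable vs. non-trainable parameter counts
model.summary()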
Training results:
29404/29404 [==============================] - 15s 512us/step - loss: 0.0447 - acc: 0.9868 - val_loss: 0.0238 - val_acc: 0.9922
Training time: 0:01:16.097530
Test score: 0.023793517283240938
Test accuracy: 0.9921826783508657
Additionally: the model.fit() function in Keras
# fit() argument details
# keras.models.fit(
#     self,
#     x=None,                # training data
#     y=None,                # training labels
#     batch_size=None,       # samples per gradient update, default 32
#     epochs=1,              # number of training epochs
#     verbose=1,             # 0: silent, 1: progress bar, 2: one line per epoch
#     callbacks=None,        # list of keras.callbacks.Callback objects,
#                            # invoked during training
#     validation_split=0.,   # float in [0, 1]: fraction of the training data
#                            # held out as a validation set; ignored when
#                            # validation_data is given
#     validation_data=None,  # explicit validation set
#     shuffle=True,          # bool or str: whether to shuffle the samples
#                            # before each epoch; 'batch' is for HDF5 data
#     class_weight=None,     # dict: weights classes in the loss function, so
#                            # that misclassifying an important (e.g. rare)
#                            # class is penalized more heavily
#     sample_weight=None,    # array of the same length as the input samples,
#                            # weighting each sample; for temporal data use a
#                            # (samples, sequence_length) matrix
#     initial_epoch=0,       # epoch at which to resume a previous training run
#     steps_per_epoch=None,  # number of batches per epoch, e.g.
#                            # steps_per_epoch=10 splits the training set into
#                            # 10 parts; cannot be combined with batch_size
#     validation_steps=None, # only relevant when steps_per_epoch is set:
#                            # number of validation batches
#     **kwargs               # for backend interaction
# )
#
# Returns a History object; History.history records the training process
# (loss values and metrics per epoch).
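To make these arguments concrete, here is a minimal runnable sketch (the toy data and model are my own illustration, not part of the tutorial above), showing validation_split and the History object that fit() returns:

import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense

# toy data (hypothetical): 1000 samples, 20 features, 5 classes
x = np.random.random((1000, 20)).astype('float32')
y = keras.utils.to_categorical(np.random.randint(5, size=1000), 5)

model = Sequential([
    Dense(32, activation='relu', input_shape=(20,)),
    Dense(5, activation='softmax'),
])
model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

# hold out 20% of the training data for validation instead of
# passing an explicit validation_data tuple
history = model.fit(x, y,
                    batch_size=32,
                    epochs=3,
                    verbose=2,  # one line per epoch
                    validation_split=0.2,
                    shuffle=True)

# History.history is a dict of per-epoch metric lists
print(history.history['loss'])
print(history.history['val_loss'])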