Transfer Learning with Keras on MNIST
阿新 · Published 2018-12-10
Transfer learning is used when some task A lacks sufficient data: experience learned on a similar task B is carried over (that is, transferred from task B to task A). Here the setting is the MNIST dataset: a model trained on the first five digits (0-4) is transferred to the task of classifying the last five digits (5-9) (some papers seem to call these the teacher and student tasks). My environment is Windows 10 + Anaconda 3 + Keras; Keras is a package built on top of TensorFlow, and in my experience it is more concise to work with than raw TensorFlow. Installing Keras is simple, so I won't go into the details here.
We start from the example on the official site.
mnist_transfer_cnn.py:
'''Transfer learning toy example.

1 - Train a simple convnet on the MNIST dataset, restricted to the
    first 5 digits [0..4].
2 - Freeze convolutional layers and fine-tune dense layers for the
    classification of digits [5..9].

Gets to 99.8% test accuracy after 5 epochs for the first five digits
classifier, and 99.2% for the last five digits after transfer + fine-tuning.
'''

from __future__ import print_function

import datetime
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

# shorthand for getting the current time
now = datetime.datetime.now

batch_size = 128
num_classes = 5  # number of classes
epochs = 5       # number of training epochs

# input image dimensions (28x28 pixels)
img_rows, img_cols = 28, 28
# number of convolutional filters to use
filters = 32
# size of pooling area for max pooling
pool_size = 2
# convolution kernel size
kernel_size = 3

if K.image_data_format() == 'channels_first':
    input_shape = (1, img_rows, img_cols)
else:
    input_shape = (img_rows, img_cols, 1)


def train_model(model, train, test, num_classes):
    x_train = train[0].reshape((train[0].shape[0],) + input_shape)
    x_test = test[0].reshape((test[0].shape[0],) + input_shape)
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255
    x_test /= 255
    print('x_train shape:', x_train.shape)
    print(x_train.shape[0], 'train samples')
    print(x_test.shape[0], 'test samples')

    # convert class vectors to binary (one-hot) class matrices
    y_train = keras.utils.to_categorical(train[1], num_classes)
    y_test = keras.utils.to_categorical(test[1], num_classes)

    model.compile(loss='categorical_crossentropy',
                  optimizer='adadelta',
                  metrics=['accuracy'])

    t = now()
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              verbose=1,
              validation_data=(x_test, y_test))
    print('Training time: %s' % (now() - t))
    score = model.evaluate(x_test, y_test, verbose=0)
    print('Test score:', score[0])
    print('Test accuracy:', score[1])


# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# create two datasets: one with digits below 5, one with digits 5 and above
x_train_lt5 = x_train[y_train < 5]
y_train_lt5 = y_train[y_train < 5]
x_test_lt5 = x_test[y_test < 5]
y_test_lt5 = y_test[y_test < 5]

x_train_gte5 = x_train[y_train >= 5]
y_train_gte5 = y_train[y_train >= 5] - 5  # shift labels [5..9] down to [0..4]
x_test_gte5 = x_test[y_test >= 5]
y_test_gte5 = y_test[y_test >= 5] - 5

# define two groups of layers: feature (convolutions) and classification (dense)
# features = Conv + relu + Conv + relu + pooling + dropout + flatten
feature_layers = [
    Conv2D(filters, kernel_size,
           padding='valid',
           input_shape=input_shape),
    Activation('relu'),
    Conv2D(filters, kernel_size),
    Activation('relu'),
    MaxPooling2D(pool_size=pool_size),
    Dropout(0.25),
    Flatten(),
]

# classifier = Dense(128) + relu + dropout + Dense(5) + softmax
classification_layers = [
    Dense(128),
    Activation('relu'),
    Dropout(0.5),
    Dense(num_classes),
    Activation('softmax')
]

# create complete model
model = Sequential(feature_layers + classification_layers)

# train model for 5-digit classification [0..4]
train_model(model,
            (x_train_lt5, y_train_lt5),
            (x_test_lt5, y_test_lt5), num_classes)

# freeze feature layers and rebuild model
for l in feature_layers:
    l.trainable = False

# transfer: train dense layers for the new classification task [5..9]
train_model(model,
            (x_train_gte5, y_train_gte5),
            (x_test_gte5, y_test_gte5), num_classes)
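As a quick sanity check (my own addition, not part of the official example), you can confirm that the freeze took effect by printing each layer's trainable flag; note that a changed trainable flag only takes effect once the model is compiled again, which train_model() does on every call:

# sanity check (not in the official example): after setting
# l.trainable = False, only the dense layers should still learn
for layer in model.layers:
    print(layer.name, 'trainable =', layer.trainable)

# model.summary() also reports trainable vs. non-trainable parameter counts
model.summary()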
Training results:
29404/29404 [==============================] - 15s 512us/step - loss: 0.0447 - acc: 0.9868 - val_loss: 0.0238 - val_acc: 0.9922
Training time: 0:01:16.097530
Test score: 0.023793517283240938
Test accuracy: 0.9921826783508657
Additionally: the model.fit() function in Keras
# fit() argument details
# keras.models.fit(
#     self,
#     x=None,                # training data
#     y=None,                # training labels
#     batch_size=None,       # samples per gradient update, default 32
#     epochs=1,              # number of training epochs
#     verbose=1,             # 0: silent, 1: progress bar, 2: one line per epoch
#     callbacks=None,        # list of keras.callbacks.Callback objects,
#                            # invoked during training
#     validation_split=0.,   # float in [0, 1]: fraction of the training data
#                            # held out as a validation set; ignored when
#                            # validation_data is given
#     validation_data=None,  # explicit validation set
#     shuffle=True,          # bool or str: whether to shuffle the samples
#                            # before each epoch; 'batch' is for HDF5 data
#     class_weight=None,     # dict: weights classes in the loss function, so
#                            # that misclassifying an important (e.g. rare)
#                            # class is penalized more heavily
#     sample_weight=None,    # array of the same length as the input samples,
#                            # weighting each sample; for temporal data use a
#                            # (samples, sequence_length) matrix
#     initial_epoch=0,       # epoch at which to resume a previous training run
#     steps_per_epoch=None,  # number of batches per epoch, e.g.
#                            # steps_per_epoch=10 splits the training set into
#                            # 10 parts; cannot be combined with batch_size
#     validation_steps=None, # only relevant when steps_per_epoch is set:
#                            # number of validation batches
#     **kwargs               # for backend interaction
# )
#
# Returns a History object; History.history records the training process
# (loss values and metrics per epoch).
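To make these arguments concrete, here is a minimal runnable sketch (the toy data and model are my own illustration, not part of the tutorial above), showing validation_split and the History object that fit() returns:

import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense

# toy data (hypothetical): 1000 samples, 20 features, 5 classes
x = np.random.random((1000, 20)).astype('float32')
y = keras.utils.to_categorical(np.random.randint(5, size=1000), 5)

model = Sequential([
    Dense(32, activation='relu', input_shape=(20,)),
    Dense(5, activation='softmax'),
])
model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

# hold out 20% of the training data for validation instead of
# passing an explicit validation_data tuple
history = model.fit(x, y,
                    batch_size=32,
                    epochs=3,
                    verbose=2,  # one line per epoch
                    validation_split=0.2,
                    shuffle=True)

# History.history is a dict of per-epoch metric lists
print(history.history['loss'])
print(history.history['val_loss'])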