
Deep Learning in Practice | Predicting a Person's Age with Keras

01 Problem Description

Our task is to predict a person's age class ("Young", "Middle", or "Old") from facial features. The training set contains 19,906 photos, each labeled with its age class (all of them portraits of Indian faces), and the test set contains 6,636 photos. We first load the data, then build, compile, and train a model with the deep learning framework Keras, and finally predict the age class for each of the 6,636 test portraits.

02 Import the Required Modules

import os
import random
import pandas as pd
import numpy as np
from PIL import Image

03 Load the Dataset

root_dir = os.path.abspath('E:/data/age')
train = pd.read_csv(os.path.join(root_dir, 'train.csv'))
test = pd.read_csv(os.path.join(root_dir, 'test.csv'))

print(train.head())
print(test.head())
ID   Class
0    377.jpg  MIDDLE
1  17814.jpg   YOUNG
2  21283.jpg  MIDDLE
3  16496.jpg   YOUNG
4   4487.jpg  MIDDLE
          ID
0  25321.jpg
1    989.jpg
2  19277.jpg
3  13093.jpg
4   5367.jpg

04 Read a Random Image to Check

i = random.choice(train.index)
img_name = train.ID[i]
print(img_name)
img = Image.open(os.path.join(root_dir, 'Train', img_name))
img.show()
print(train.Class[i])
20188.jpg
MIDDLE

05 Challenges

If we open a few images at random, we can see that they differ considerably. Take a look:

Good-quality examples:

[Middle, Young, and Old sample images omitted]

Poor-quality example:

[Middle sample image omitted]

Here are the problems we have to deal with:

1. Varying image sizes: one image may be 66x46 pixels while another is 102x87.

2. Varying face angles: some photos show a profile, others a frontal view.

3. Uneven image quality.

4. Differences in brightness and contrast.

For now we tackle only the size issue, resizing every image to 32x32.
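The brightness/contrast issue in point 4 is not addressed later in the article; if it were, one minimal option (an assumption, not part of the author's pipeline) is PIL's `ImageOps.autocontrast`, which stretches each image's histogram to the full 0-255 range:

```python
import numpy as np
from PIL import Image, ImageOps

def normalize_contrast(img):
    """Stretch the histogram so the darkest pixel maps to 0 and the brightest to 255."""
    return ImageOps.autocontrast(img)

# Demo on a synthetic low-contrast image (values squeezed into 100..150)
arr = np.linspace(100, 150, 32 * 32, dtype=np.uint8).reshape(32, 32)
img = Image.fromarray(arr, mode='L')
out = np.array(normalize_contrast(img))
print(out.min(), out.max())  # 0 255
```

Applied inside the resize loop of section 06, this would even out gross exposure differences before the network sees the pixels.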

06 Resize the Images and Convert Them to NumPy Arrays

temp = []
for img_name in train.ID:
    img_path = os.path.join(root_dir, 'Train', img_name)
    img = Image.open(img_path)
    img = img.resize((32, 32))
    array = np.array(img)
    temp.append(array.astype('float32'))
train_x = np.stack(temp)
print(train_x.shape)
print(train_x.ndim)

(19906, 32, 32, 3)
4

temp = []
for img_name in test.ID:
    img_path = os.path.join(root_dir, 'Test', img_name)
    img = Image.open(img_path)
    img = img.resize((32, 32))
    array = np.array(img)
    temp.append(array.astype('float32'))
test_x = np.stack(temp)
print(test_x.shape)

(6636, 32, 32, 3)

We also scale pixel values into [0, 1], which helps the model train faster:

train_x = train_x / 255.
test_x = test_x / 255.

Let's look at the rough distribution of age classes:

train.Class.value_counts(normalize=True)
MIDDLE    0.542751
YOUNG     0.336883
OLD       0.120366
Name: Class, dtype: float64
As a quick baseline, we can predict the majority class, MIDDLE, for every test image and save a submission:

test['Class'] = 'MIDDLE'
test.to_csv('sub01.csv', index=False)

One-hot encoding the target variable makes the classes easier for the model to learn:
import keras
from sklearn.preprocessing import LabelEncoder
lb = LabelEncoder()
train_y = lb.fit_transform(train.Class)
print(train_y)
train_y = keras.utils.np_utils.to_categorical(train_y)
print(train_y)
print(train_y.shape)
[0 2 0 ..., 0 0 0]
[[ 1.  0.  0.]
 [ 0.  0.  1.]
 [ 1.  0.  0.]
 ...
 [ 1.  0.  0.]
 [ 1.  0.  0.]
 [ 1.  0.  0.]]
(19906, 3)

07 Build the Model

# Build the network
input_num_units = (32, 32, 3)
hidden_num_units = 500
output_num_units = 3
epochs = 5
batch_size = 128

from keras.models import Sequential
from keras.layers import Dense, Flatten, InputLayer

# Note: Sequential takes a *list* of layers (the original used {}, a set,
# which has no guaranteed order), and only the first layer needs an input shape.
model = Sequential([
    InputLayer(input_shape=input_num_units),
    Flatten(),
    Dense(units=hidden_num_units, activation='relu'),
    Dense(units=output_num_units, activation='softmax'),
])
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_23 (InputLayer)        (None, 32, 32, 3)         0
_________________________________________________________________
flatten_23 (Flatten)         (None, 3072)              0
_________________________________________________________________
dense_45 (Dense)             (None, 500)               1536500
_________________________________________________________________
dense_46 (Dense)             (None, 3)                 1503
=================================================================
Total params: 1,538,003
Trainable params: 1,538,003
Non-trainable params: 0
_________________________________________________________________

08 Compile the Model

model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_x, train_y, batch_size=batch_size, epochs=epochs, verbose=1)
Epoch 1/5
19906/19906 [==============================] - 4s - loss: 0.8878 - acc: 0.5809
Epoch 2/5
19906/19906 [==============================] - 4s - loss: 0.8420 - acc: 0.6077
Epoch 3/5
19906/19906 [==============================] - 4s - loss: 0.8210 - acc: 0.6214
Epoch 4/5
19906/19906 [==============================] - 4s - loss: 0.8149 - acc: 0.6194
Epoch 5/5
19906/19906 [==============================] - 4s - loss: 0.8042 - acc: 0.6305
<keras.callbacks.History at 0x1d3803e6278>
model.fit(train_x, train_y, batch_size=batch_size, epochs=epochs, verbose=1, validation_split=0.2)
Train on 15924 samples, validate on 3982 samples
Epoch 1/5
15924/15924 [==============================] - 3s - loss: 0.7970 - acc: 0.6375 - val_loss: 0.7854 - val_acc: 0.6396
Epoch 2/5
15924/15924 [==============================] - 3s - loss: 0.7919 - acc: 0.6378 - val_loss: 0.7767 - val_acc: 0.6519
Epoch 3/5
15924/15924 [==============================] - 3s - loss: 0.7870 - acc: 0.6404 - val_loss: 0.7754 - val_acc: 0.6534
Epoch 4/5
15924/15924 [==============================] - 3s - loss: 0.7806 - acc: 0.6439 - val_loss: 0.7715 - val_acc: 0.6524
Epoch 5/5
15924/15924 [==============================] - 3s - loss: 0.7755 - acc: 0.6519 - val_loss: 0.7970 - val_acc: 0.6346
<keras.callbacks.History at 0x1d3800a4eb8>

09 Optimization

With this most basic model we reach a validation accuracy of about 0.64. Next, we try to improve on it from the following angles:

1. Use a better network architecture

2. Train for more epochs

3. Convert the images to grayscale (for this problem, color is not an especially important feature)
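The grayscale idea in point 3 is never implemented later in the article; a minimal sketch of how it could slot into the section 06 loop, using PIL's `convert('L')` (an untested variation, not the author's pipeline), might look like:

```python
import numpy as np
from PIL import Image

def to_grayscale_array(img):
    """Convert a PIL image to a single-channel float32 grayscale array."""
    return np.array(img.convert('L'), dtype='float32')

# Demo on a synthetic 32x32 RGB image (a stand-in for one Train/ photo)
rgb = Image.fromarray(np.zeros((32, 32, 3), dtype=np.uint8))
arr = to_grayscale_array(rgb)
print(arr.shape)  # (32, 32) -- one channel instead of three
```

Feeding such arrays to the dense model would only require changing input_num_units to (32, 32); a convolutional model would need input_shape=(32, 32, 1) and a trailing channel axis (e.g. arr[..., np.newaxis]).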

10 Optimization 1: Use a Convolutional Neural Network

After adding a convolutional layer, accuracy improves from about 0.63 to 0.67. Raising the number of epochs from the initial 5 to 10 brings it to about 0.687; raising it further to 20 produces no change.

11 The Conv2D Layer

keras.layers.convolutional.Conv2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

  • filters: the number of output filters (the depth of the output)
  • strides: the stride of the convolution

For more about Conv2D, see the Conv2D layer page of the Keras documentation (http://keras-cn.readthedocs.io/en/latest/layers/convolutional_layer/#conv2d).
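With padding='valid' there is no zero-padding, so each convolution shrinks the feature map. A small helper (not from the article) reproduces the spatial sizes that appear in model.summary():

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution: floor((n - kernel + 2*padding) / stride) + 1."""
    return (n - kernel + 2 * padding) // stride + 1

# 32x32 input, 5x5 kernel, stride 1, 'valid' padding (no zero-padding):
print(conv_output_size(32, 5))       # 28 -> the (None, 28, 28, 10) row
# followed by 2x2 max pooling:
print(conv_output_size(32, 5) // 2)  # 14 -> the (None, 14, 14, 10) row
```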

# Parameter initialization
filters = 10
filtersize = (5, 5)
epochs = 10
batchsize = 128
input_shape = (32, 32, 3)

from keras.models import Sequential

model = Sequential()
model.add(keras.layers.InputLayer(input_shape=input_shape))
model.add(keras.layers.convolutional.Conv2D(filters, filtersize, strides=(1, 1), padding='valid', data_format="channels_last", activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(units=3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train_x, train_y, epochs=epochs, batch_size=batchsize, validation_split=0.3)
model.summary()
Train on 13934 samples, validate on 5972 samples
Epoch 1/10
13934/13934 [==============================] - 9s - loss: 0.8986 - acc: 0.5884 - val_loss: 0.8352 - val_acc: 0.6271
Epoch 2/10
13934/13934 [==============================] - 9s - loss: 0.8141 - acc: 0.6281 - val_loss: 0.7886 - val_acc: 0.6474
Epoch 3/10
13934/13934 [==============================] - 9s - loss: 0.7788 - acc: 0.6504 - val_loss: 0.7706 - val_acc: 0.6551
Epoch 4/10
13934/13934 [==============================] - 9s - loss: 0.7638 - acc: 0.6577 - val_loss: 0.7559 - val_acc: 0.6626
Epoch 5/10
13934/13934 [==============================] - 9s - loss: 0.7484 - acc: 0.6679 - val_loss: 0.7457 - val_acc: 0.6710
Epoch 6/10
13934/13934 [==============================] - 9s - loss: 0.7346 - acc: 0.6723 - val_loss: 0.7490 - val_acc: 0.6780
Epoch 7/10
13934/13934 [==============================] - 9s - loss: 0.7217 - acc: 0.6804 - val_loss: 0.7298 - val_acc: 0.6795
Epoch 8/10
13934/13934 [==============================] - 9s - loss: 0.7162 - acc: 0.6826 - val_loss: 0.7248 - val_acc: 0.6792
Epoch 9/10
13934/13934 [==============================] - 9s - loss: 0.7082 - acc: 0.6892 - val_loss: 0.7202 - val_acc: 0.6890
Epoch 10/10
13934/13934 [==============================] - 9s - loss: 0.7001 - acc: 0.6940 - val_loss: 0.7226 - val_acc: 0.6885
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_6 (InputLayer)         (None, 32, 32, 3)         0
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 28, 28, 10)        760
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 14, 14, 10)        0
_________________________________________________________________
flatten_6 (Flatten)          (None, 1960)              0
_________________________________________________________________
dense_6 (Dense)              (None, 3)                 5883
=================================================================
Total params: 6,643
Trainable params: 6,643
Non-trainable params: 0
_________________________________________________________________

12 Optimization 2: Add More Layers

We add a few more layers to the model and increase the output depth of the convolutional layers. This time the result improves markedly, to about 0.7509:

# Parameter initialization
filters1 = 50
filters2 = 100
filters3 = 100
filtersize = (5, 5)
epochs = 10
batchsize = 128
input_shape = (32, 32, 3)

from keras.models import Sequential

model = Sequential()
model.add(keras.layers.InputLayer(input_shape=input_shape))
model.add(keras.layers.convolutional.Conv2D(filters1, filtersize, strides=(1, 1), padding='valid', data_format="channels_last", activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.convolutional.Conv2D(filters2, filtersize, strides=(1, 1), padding='valid', data_format="channels_last", activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.convolutional.Conv2D(filters3, filtersize, strides=(1, 1), padding='valid', data_format="channels_last", activation='relu'))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(units=3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train_x, train_y, epochs=epochs, batch_size=batchsize, validation_split=0.3)
model.summary()
Train on 13934 samples, validate on 5972 samples
Epoch 1/10
13934/13934 [==============================] - 44s - loss: 0.8613 - acc: 0.5985 - val_loss: 0.7778 - val_acc: 0.6586
Epoch 2/10
13934/13934 [==============================] - 44s - loss: 0.7493 - acc: 0.6697 - val_loss: 0.7545 - val_acc: 0.6808
Epoch 3/10
13934/13934 [==============================] - 43s - loss: 0.7079 - acc: 0.6877 - val_loss: 0.7150 - val_acc: 0.6947
Epoch 4/10
13934/13934 [==============================] - 43s - loss: 0.6694 - acc: 0.7061 - val_loss: 0.6496 - val_acc: 0.7261
Epoch 5/10
13934/13934 [==============================] - 43s - loss: 0.6274 - acc: 0.7295 - val_loss: 0.6683 - val_acc: 0.7125
Epoch 6/10
13934/13934 [==============================] - 43s - loss: 0.5950 - acc: 0.7462 - val_loss: 0.6194 - val_acc: 0.7400
Epoch 7/10
13934/13934 [==============================] - 43s - loss: 0.5562 - acc: 0.7655 - val_loss: 0.5981 - val_acc: 0.7465
Epoch 8/10
13934/13934 [==============================] - 43s - loss: 0.5165 - acc: 0.7852 - val_loss: 0.6458 - val_acc: 0.7354
Epoch 9/10
13934/13934 [==============================] - 46s - loss: 0.4826 - acc: 0.7986 - val_loss: 0.6206 - val_acc: 0.7467
Epoch 10/10
13934/13934 [==============================] - 45s - loss: 0.4530 - acc: 0.8130 - val_loss: 0.5984 - val_acc: 0.7569
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_15 (InputLayer)        (None, 32, 32, 3)         0
_________________________________________________________________
conv2d_31 (Conv2D)           (None, 28, 28, 50)        3800
_________________________________________________________________
max_pooling2d_23 (MaxPooling (None, 14, 14, 50)        0
_________________________________________________________________
conv2d_32 (Conv2D)           (None, 10, 10, 100)       125100
_________________________________________________________________
max_pooling2d_24 (MaxPooling (None, 5, 5, 100)         0
_________________________________________________________________
conv2d_33 (Conv2D)           (None, 1, 1, 100)         250100
_________________________________________________________________
flatten_15 (Flatten)         (None, 100)               0
_________________________________________________________________
dense_7 (Dense)              (None, 3)                 303
=================================================================
Total params: 379,303
Trainable params: 379,303
Non-trainable params: 0
_________________________________________________________________

13 Output the Results
pred = model.predict_classes(test_x)
pred = lb.inverse_transform(pred)
print(pred)
test['Class'] = pred
test.to_csv('sub02.csv', index=False)
6636/6636 [==============================] - 7s
['MIDDLE' 'YOUNG' 'MIDDLE' ..., 'MIDDLE' 'MIDDLE' 'YOUNG']
i = random.choice(train.index)
img_name = train.ID[i]
img = Image.open(os.path.join(root_dir, 'Train', img_name))
img.show()
pred = model.predict_classes(train_x)
print('Original:', train.Class[i], 'Predicted:', lb.inverse_transform(pred[i]))
19872/19906 [============================>.] - ETA: 0s
Original: MIDDLE Predicted: MIDDLE

14 Results