Several Deep Learning Models Implemented in Keras
阿新 • Published: 2018-12-18
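#coding=utf-8
from keras.models import Sequential
from keras.layers import Dense,Flatten
from keras.layers.convolutional import Conv2D,MaxPooling2D

def LeNet(input_shape):
    # The key features of LeNet5 can be summarized as follows:
    # 1) The convolutional network uses a sequence of three layers: convolution, pooling, non-linearity
    # 2) Convolution is used to extract spatial features
    # 3) Subsampling is done by spatial averaging of the feature maps
    # 4) The non-linearity is a hyperbolic tangent (tanh) or sigmoid
    # 5) A multi-layer perceptron (MLP) serves as the final classifier
    # 6) Sparse connection matrices between layers avoid a large computational cost
    # Note: the implementation below uses ReLU and max pooling instead of tanh/sigmoid and average pooling.
    model = Sequential()
    model.add(Conv2D(32,(5,5),strides=(1,1),input_shape=input_shape,padding='valid',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Conv2D(64,(5,5),strides=(1,1),padding='valid',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Flatten())
    model.add(Dense(100,activation='relu'))
    model.add(Dense(10,activation='softmax'))
    model.compile(optimizer='sgd',loss='categorical_crossentropy',metrics=['accuracy'])
    model.summary()
    return model


'''
AlexNet carries the ideas of LeNet forward, applying the basic principles of CNNs to a much deeper and
wider network. The main new techniques used in AlexNet are as follows.
(1) It successfully uses ReLU as the CNN activation function and verifies that ReLU outperforms Sigmoid
    in deeper networks, mitigating the vanishing-gradient problem that Sigmoid suffers from when the
    network is deep. ReLU had been proposed long before, but it was AlexNet that popularized it.
(2) Dropout is used during training to randomly ignore a subset of neurons and so avoid overfitting.
    Dropout has its own paper, but AlexNet put it into practice and confirmed its effect empirically.
    In AlexNet, Dropout is applied mainly to the last few fully connected layers.
(3) Overlapping max pooling is used in the CNN. Earlier CNNs commonly used average pooling; AlexNet uses
    max pooling throughout, avoiding the blurring effect of average pooling. AlexNet also proposes making
    the stride smaller than the pooling kernel size, so that the outputs of the pooling layer overlap,
    which enriches the features.
(4) It proposes the LRN layer, which creates a competition mechanism among the activities of local
    neurons: relatively large responses become larger still and neurons with smaller responses are
    suppressed, which improves generalization.
(5) CUDA is used to accelerate the training of deep convolutional networks, exploiting the parallel
    computing power of GPUs for the large number of matrix operations in training. AlexNet was trained
    on two GTX 580 GPUs. A single GTX 580 has only 3GB of memory, which limits the maximum size of a
    trainable network, so the authors split AlexNet across the two GPUs and stored half of the neuron
    parameters in the memory of each GPU. Because GPUs can access each other's memory directly without
    going through host memory, using several GPUs at once is also efficient. In addition, the AlexNet
    design lets the GPUs communicate only at certain layers, which limits the communication overhead.
(6) Data augmentation: 224×224 regions (and their horizontal mirror images) are randomly cropped from
    the original 256×256 images, which amounts to increasing the data by a factor of
    (256-224)^2 × 2 = 2048. Without data augmentation, a CNN with this many parameters would overfit on
    the original data alone; with it, overfitting is greatly reduced and generalization improves.
    At prediction time, five crops are taken (the four corners plus the center) and flipped horizontally,
    giving 10 images; all of them are predicted and the 10 results are averaged. The AlexNet paper also
    mentions applying PCA to the RGB data of the images and adding a Gaussian perturbation with a
    standard deviation of 0.1 to the principal components, which adds some noise; this trick lowers the
    error rate by a further 1%.
'''
from keras.models import Sequential
from keras.layers import Dense,Flatten,Dropout
from keras.layers.convolutional import Conv2D,MaxPooling2D
from keras.utils.np_utils import to_categorical

def AlexNet(input_shape):
    model = Sequential()
    model.add(Conv2D(96,(11,11),strides=(4,4),input_shape=input_shape,padding='valid',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(3,3),strides=(2,2)))
    model.add(Conv2D(256,(5,5),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(3,3),strides=(2,2)))
    model.add(Conv2D(384,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(384,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(256,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(3,3),strides=(2,2)))
    model.add(Flatten())
    model.add(Dense(4096,activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096,activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1000,activation='softmax'))
    model.compile(loss='categorical_crossentropy',optimizer='sgd',metrics=['accuracy'])
    model.summary()
    return model


from keras.models import Sequential
from keras.layers import Dense,Flatten,Dropout
from keras.layers.convolutional import Conv2D,MaxPooling2D
from keras.utils.np_utils import to_categorical

def ZFNet(input_shape):
    model = Sequential()
    model.add(Conv2D(96,(7,7),strides=(2,2),input_shape=input_shape,padding='valid',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(3,3),strides=(2,2)))
    model.add(Conv2D(256,(5,5),strides=(2,2),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(3,3),strides=(2,2)))
    model.add(Conv2D(384,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(384,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))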
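    model.add(Conv2D(256,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(3,3),strides=(2,2)))
    model.add(Flatten())
    model.add(Dense(4096,activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096,activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1000,activation='softmax'))
    model.compile(loss='categorical_crossentropy',optimizer='sgd',metrics=['accuracy'])
    model.summary()
    return model


'''
VGGNet uses a small trick during training: the simple level-A network is trained first, and its weights
are then reused to initialize the more complex models that follow, so training converges faster.
At prediction time, VGG uses a Multi-Scale approach: the image is scaled to a size Q and fed into the
convolutional network. A sliding window is then applied on the last convolutional layer for
classification; the results of the different windows are averaged, and the results for the different
sizes Q are averaged again to obtain the final result. This makes better use of the image data and
improves prediction accuracy.
During training, VGGNet also uses Multi-Scale for data augmentation: the original image is scaled to
different sizes S and a 224×224 patch is then cropped at random, which adds a lot of data and works well
against overfitting. In practice, the authors let S take values in the interval [256,512], use
Multi-Scale to obtain several versions of the data, and train on all versions together. Figure 9 shows
the results of VGGNet with Multi-Scale training: both D and E reach an error rate of 7.5%. The version
finally submitted to ILSVRC 2014 was an ensemble of 6 Single-Scale networks of different levels and the
Multi-Scale D network, reaching an error rate of 7.3%. After the competition, however, the authors found
that ensembling only Multi-Scale D and E works better, with an error rate of 7.0%, and that further
optimization strategies bring the final error rate to about 6.8%, very close to that year's winner,
Google Inception Net. The authors also drew several conclusions from comparing the networks at each level.
'''
from keras.models import Sequential
from keras.layers import Dense,Flatten,Dropout
from keras.layers.convolutional import Conv2D,MaxPooling2D

def VGG_13(input_shape):
    model = Sequential()
    model.add(Conv2D(64,(3,3),strides=(1,1),input_shape=input_shape,padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(64,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    # kernel was (3,2) in the original listing; VGG uses 3x3 kernels throughout
    model.add(Conv2D(128,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(128,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Conv2D(256,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))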
    model.add(Conv2D(256,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Flatten())
    model.add(Dense(4096,activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096,activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1000,activation='softmax'))
    model.compile(loss='categorical_crossentropy',optimizer='sgd',metrics=['accuracy'])
    model.summary()
    return model


from keras.models import Sequential
from keras.layers import Dense,Flatten,Dropout
from keras.layers.convolutional import Conv2D,MaxPooling2D

def VGG_16(input_shape):
    model = Sequential()
    model.add(Conv2D(64,(3,3),strides=(1,1),input_shape=input_shape,padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(64,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    # kernel was (3,2) in the original listing; VGG uses 3x3 kernels throughout
    model.add(Conv2D(128,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(128,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Conv2D(256,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(256,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(256,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2,2)))
    model.add(Flatten())
    model.add(Dense(4096,activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096,activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1000,activation='softmax'))
    model.compile(loss='categorical_crossentropy',optimizer='sgd',metrics=['accuracy'])
    model.summary()
    return model


'''
Inception V1 reduces the number of parameters for two reasons. First, the more parameters a model has,
the larger it is and the more data it needs to learn from, and high-quality data is currently very
expensive. Second, more parameters consume more computing resources. Besides being deeper and more
expressive, Inception V1 achieves good results with few parameters for two further reasons.
The first is that it removes the final fully connected layers and replaces them with a global average
pooling layer (i.e. the feature map is reduced to 1×1). Fully connected layers account for almost 90% of
the parameters in AlexNet or VGGNet and cause overfitting; removing them makes training faster and
reduces overfitting. Replacing fully connected layers with global average pooling is borrowed from the
Network In Network (NIN) paper.
The second is the carefully designed Inception Module, which improves parameter efficiency; its
structure is shown in Figure 10. This part also draws on NIN: intuitively, an Inception Module is like
a small network inside the large network, and it can be stacked repeatedly to form the large network.
Inception V1 goes a step further than NIN by adding branches, whereas NIN mainly consists of cascaded
convolutional layers and MLPConv layers. In general, a convolutional layer gains expressive power mainly
by increasing the number of output channels, at the cost of more computation and overfitting. Each
output channel corresponds to one filter whose parameters are shared, so it can extract only one type
of feature; one output channel can therefore perform only one kind of feature processing. The MLPConv
in NIN is more powerful because it allows information to be combined across output channels, which is
why it works noticeably well. MLPConv is essentially equivalent to an ordinary convolutional layer
followed by a 1×1 convolution and a ReLU activation.
'''
from keras.models import Model
from keras.layers import Input,Dense,Dropout,Flatten,BatchNormalization,Conv2D,MaxPooling2D,AveragePooling2D,concatenate
from keras.layers.convolutional import Conv2D,MaxPooling2D,AveragePooling2D

def GoogleNet(input_shape):
    def Conv2d_BN(x, nb_filter,kernel_size, padding='same',strides=(1,1),name=None):
        if name is not None:
            bn_name = name + '_bn'
            conv_name = name + '_conv'
        else:
            bn_name = None
            conv_name = None
        x = Conv2D(nb_filter,kernel_size,padding=padding,strides=strides,activation='relu',name=conv_name)(x)
        x = BatchNormalization(axis=3,name=bn_name)(x)
        return x

    def Inception(x,nb_filter):
        # four parallel branches, concatenated along the channel axis
        branch1x1 = Conv2d_BN(x,nb_filter,(1,1), padding='same',strides=(1,1),name=None)
        branch3x3 = Conv2d_BN(x,nb_filter,(1,1), padding='same',strides=(1,1),name=None)
        branch3x3 = Conv2d_BN(branch3x3,nb_filter,(3,3), padding='same',strides=(1,1),name=None)
        branch5x5 = Conv2d_BN(x,nb_filter,(1,1), padding='same',strides=(1,1),name=None)
        # kernel was (1,1) in the original listing; the 5x5 branch needs a (5,5) kernel
        branch5x5 = Conv2d_BN(branch5x5,nb_filter,(5,5), padding='same',strides=(1,1),name=None)
        branchpool = MaxPooling2D(pool_size=(3,3),strides=(1,1),padding='same')(x)
        branchpool = Conv2d_BN(branchpool,nb_filter,(1,1),padding='same',strides=(1,1),name=None)
        x = concatenate([branch1x1,branch3x3,branch5x5,branchpool],axis=3)
        return x

    inpt = Input(shape=input_shape)
    # padding='same' pads the input so that the output size is ceil(input/stride);
    # an explicit ZeroPadding2D((3,3)) could be used instead
    x = Conv2d_BN(inpt,64,(7,7),strides=(2,2),padding='same')
    x = MaxPooling2D(pool_size=(3,3),strides=(2,2),padding='same')(x)
    x = Conv2d_BN(x,192,(3,3),strides=(1,1),padding='same')
    x = MaxPooling2D(pool_size=(3,3),strides=(2,2),padding='same')(x)
    x = Inception(x,64)#256
    x = Inception(x,120)#480
    x = MaxPooling2D(pool_size=(3,3),strides=(2,2),padding='same')(x)
    x = Inception(x,128)#512
    x = Inception(x,128)
    x = Inception(x,128)
    x = Inception(x,132)#528
    x = Inception(x,208)#832
    x = MaxPooling2D(pool_size=(3,3),strides=(2,2),padding='same')(x)
    x = Inception(x,208)
    x = Inception(x,256)#1024
    x = AveragePooling2D(pool_size=(7,7),strides=(7,7),padding='same')(x)
    # added: flatten the pooled features so the Dense output has shape (batch, 1000)
    x = Flatten()(x)
    x = Dropout(0.4)(x)
    x = Dense(1000,activation='relu')(x)
    x = Dense(1000,activation='softmax')(x)
    model = Model(inpt,x,name='inception')
    model.compile(loss='categorical_crossentropy',optimizer='sgd',metrics=['accuracy'])
    model.summary()
    return model


from keras.models import Model
from keras.layers import Input,Dense,Dropout,BatchNormalization,Conv2D,MaxPooling2D,AveragePooling2D,concatenate,Activation,ZeroPadding2D
from keras.layers import add,Flatten

def Resnet_34(input_shape):
    def Conv2d_BN(x, nb_filter,kernel_size, strides=(1,1), padding='same',name=None):
        if name is not None:
            bn_name = name + '_bn'
            conv_name = name + '_conv'
        else:
            bn_name = None
            conv_name = None
        x = Conv2D(nb_filter,kernel_size,padding=padding,strides=strides,activation='relu',name=conv_name)(x)
        x = BatchNormalization(axis=3,name=bn_name)(x)
        return x

    def Conv_Block(inpt,nb_filter,kernel_size,strides=(1,1), with_conv_shortcut=False):
        x = Conv2d_BN(inpt,nb_filter=nb_filter,kernel_size=kernel_size,strides=strides,padding='same')
        x = Conv2d_BN(x, nb_filter=nb_filter, kernel_size=kernel_size,padding='same')
        if with_conv_shortcut:
            # projection shortcut: matches the shape when stride or channel count changes
            shortcut = Conv2d_BN(inpt,nb_filter=nb_filter,strides=strides,kernel_size=kernel_size)
            x = add([x,shortcut])
            return x
        else:
            x = add([x,inpt])
            return x

    # was hard-coded to (224,224,3) in the original listing, ignoring input_shape;
    # the shape comments below and the final 7x7 pooling assume a 224x224 input
    inpt = Input(shape=input_shape)
    x = ZeroPadding2D((3,3))(inpt)
    x = Conv2d_BN(x,nb_filter=64,kernel_size=(7,7),strides=(2,2),padding='valid')
    x = MaxPooling2D(pool_size=(3,3),strides=(2,2),padding='same')(x)
    #(56,56,64)
    x = Conv_Block(x,nb_filter=64,kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=64,kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=64,kernel_size=(3,3))
    #(28,28,128)
    x = Conv_Block(x,nb_filter=128,kernel_size=(3,3),strides=(2,2),with_conv_shortcut=True)
    x = Conv_Block(x,nb_filter=128,kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=128,kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=128,kernel_size=(3,3))
    #(14,14,256)
    x = Conv_Block(x,nb_filter=256,kernel_size=(3,3),strides=(2,2),with_conv_shortcut=True)
    x = Conv_Block(x,nb_filter=256,kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=256,kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=256,kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=256,kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=256,kernel_size=(3,3))
    #(7,7,512)
    x = Conv_Block(x,nb_filter=512,kernel_size=(3,3),strides=(2,2),with_conv_shortcut=True)
    x = Conv_Block(x,nb_filter=512,kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=512,kernel_size=(3,3))
    x = AveragePooling2D(pool_size=(7,7))(x)
    x = Flatten()(x)
    x = Dense(1000,activation='softmax')(x)
    model = Model(inputs=inpt,outputs=x)
    model.compile(loss='categorical_crossentropy',optimizer='sgd',metrics=['accuracy'])
    model.summary()
    return model


from keras.models import Model
from keras.layers import Input,Dense,BatchNormalization,Conv2D,MaxPooling2D,AveragePooling2D,ZeroPadding2D
from keras.layers import add,Flatten
#from keras.layers.convolutional import Conv2D,MaxPooling2D,AveragePooling2D
from keras.optimizers import SGD

def Resnet_50(input_shape):
    def Conv2d_BN(x, nb_filter,kernel_size, strides=(1,1), padding='same',name=None):
        if name is not None:
            bn_name = name + '_bn'
            conv_name = name + '_conv'
        else:
            bn_name = None
            conv_name = None
        x = Conv2D(nb_filter,kernel_size,padding=padding,strides=strides,activation='relu',name=conv_name)(x)
        x = BatchNormalization(axis=3,name=bn_name)(x)
        return x

    def Conv_Block(inpt,nb_filter,kernel_size,strides=(1,1), with_conv_shortcut=False):
        # bottleneck: 1x1 reduce, 3x3, 1x1 expand; nb_filter holds the three channel counts
        x = Conv2d_BN(inpt,nb_filter=nb_filter[0],kernel_size=(1,1),strides=strides,padding='same')
        x = Conv2d_BN(x, nb_filter=nb_filter[1], kernel_size=(3,3), padding='same')
        x = Conv2d_BN(x, nb_filter=nb_filter[2], kernel_size=(1,1), padding='same')
        if with_conv_shortcut:
            shortcut = Conv2d_BN(inpt,nb_filter=nb_filter[2],strides=strides,kernel_size=kernel_size)
            x = add([x,shortcut])
            return x
        else:
            x = add([x,inpt])
            return x

    inpt = Input(shape=input_shape)
    x = ZeroPadding2D((3,3))(inpt)
    x = Conv2d_BN(x,nb_filter=64,kernel_size=(7,7),strides=(2,2),padding='valid')
    x = MaxPooling2D(pool_size=(3,3),strides=(2,2),padding='same')(x)
    x = Conv_Block(x,nb_filter=[64,64,256],kernel_size=(3,3),strides=(1,1),with_conv_shortcut=True)
    x = Conv_Block(x,nb_filter=[64,64,256],kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=[64,64,256],kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=[128,128,512],kernel_size=(3,3),strides=(2,2),with_conv_shortcut=True)
    x = Conv_Block(x,nb_filter=[128,128,512],kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=[128,128,512],kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=[128,128,512],kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=[256,256,1024],kernel_size=(3,3),strides=(2,2),with_conv_shortcut=True)
    x = Conv_Block(x,nb_filter=[256,256,1024],kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=[256,256,1024],kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=[256,256,1024],kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=[256,256,1024],kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=[256,256,1024],kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=[512,512,2048],kernel_size=(3,3),strides=(2,2),with_conv_shortcut=True)
    x = Conv_Block(x,nb_filter=[512,512,2048],kernel_size=(3,3))
    x = Conv_Block(x,nb_filter=[512,512,2048],kernel_size=(3,3))
    x = AveragePooling2D(pool_size=(7,7))(x)
    x = Flatten()(x)
    x = Dense(1000,activation='softmax')(x)
    model = Model(inputs=inpt,outputs=x)
    sgd = SGD(decay=0.0001,momentum=0.9)
    # the original listing passed the string 'sgd' here, so the configured optimizer was never used
    model.compile(loss='categorical_crossentropy',optimizer=sgd,metrics=['accuracy'])
    model.summary()
    # the original listing had no return statement
    return model


from keras.models import Model
from keras.layers import Input, Conv2D, GlobalAveragePooling2D, Dropout
from keras.layers import Activation, BatchNormalization, add, Reshape
# relu6 and DepthwiseConv2D are exported from this module only in older Keras releases
from keras.applications.mobilenet import relu6, DepthwiseConv2D
from keras.utils.vis_utils import plot_model
from keras import backend as K

'''
| - data/
    | - train/
        | - class 0/
            | - image.jpg
            ....
        | - class 1/
            ....
        | - class n/
    | - validation/
        | - class 0/
        | - class 1/
            ....
        | - class n/
'''

def _conv_block(inputs, filters, kernel, strides):
    """Convolution Block
    This function defines a 2D convolution operation with BN and relu6.
    # Arguments
        inputs: Tensor, input tensor of conv layer.
        filters: Integer, the dimensionality of the output space.
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        strides: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution along the width and height.
            Can be a single integer to specify the same value for
            all spatial dimensions.
    # Returns
        Output tensor.
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
    x = Conv2D(filters, kernel, padding='same', strides=strides)(inputs)
    x = BatchNormalization(axis=channel_axis)(x)
    return Activation(relu6)(x)


def _bottleneck(inputs, filters, kernel, t, s, r=False):
    """Bottleneck
    This function defines a basic bottleneck structure.
    # Arguments
        inputs: Tensor, input tensor of conv layer.
        filters: Integer, the dimensionality of the output space.
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        t: Integer, expansion factor.
            t is always applied to the input size.
        s: An integer or tuple/list of 2 integers, specifying the strides
            of the convolution along the width and height. Can be a single
            integer to specify the same value for all spatial dimensions.
        r: Boolean, whether to use the residuals.
    # Returns
        Output tensor.
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
    tchannel = K.int_shape(inputs)[channel_axis] * t
    # 1x1 expansion, 3x3 depthwise convolution, then a linear 1x1 projection
    x = _conv_block(inputs, tchannel, (1, 1), (1, 1))
    x = DepthwiseConv2D(kernel, strides=(s, s), depth_multiplier=1, padding='same')(x)
    x = BatchNormalization(axis=channel_axis)(x)
    x = Activation(relu6)(x)
    x = Conv2D(filters, (1, 1), strides=(1, 1), padding='same')(x)
    x = BatchNormalization(axis=channel_axis)(x)
    if r:
        x = add([x, inputs])
    return x


def _inverted_residual_block(inputs, filters, kernel, t, strides, n):
    """Inverted Residual Block
    This function defines a sequence of 1 or more identical layers.
    # Arguments
        inputs: Tensor, input tensor of conv layer.
        filters: Integer, the dimensionality of the output space.
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        t: Integer, expansion factor.
            t is always applied to the input size.
        strides: An integer or tuple/list of 2 integers, specifying the strides
            of the convolution along the width and height. Can be a single
            integer to specify the same value for all spatial dimensions.
        n: Integer, layer repeat times.
    # Returns
        Output tensor.
    """
    x = _bottleneck(inputs, filters, kernel, t, strides)
    for i in range(1, n):
        x = _bottleneck(x, filters, kernel, t, 1, True)
    return x


def MobileNetv2(input_shape, k):
    """MobileNetv2
    This function defines a MobileNetv2 architecture.
    # Arguments
        input_shape: An integer or tuple/list of 3 integers, shape
            of input tensor.
        k: Integer, number of output classes.
    # Returns
        MobileNetv2 model.
    """
    inputs = Input(shape=input_shape)
    x = _conv_block(inputs, 32, (3, 3), strides=(2, 2))
    x = _inverted_residual_block(x, 16, (3, 3), t=1, strides=1, n=1)
    x = _inverted_residual_block(x, 24, (3, 3), t=6, strides=2, n=2)
    x = _inverted_residual_block(x, 32, (3, 3), t=6, strides=2, n=3)
    x = _inverted_residual_block(x, 64, (3, 3), t=6, strides=2, n=4)
    x = _inverted_residual_block(x, 96, (3, 3), t=6, strides=1, n=3)
    x = _inverted_residual_block(x, 160, (3, 3), t=6, strides=2, n=3)
    x = _inverted_residual_block(x, 320, (3, 3), t=6, strides=1, n=1)
    x = _conv_block(x, 1280, (1, 1), strides=(1, 1))
    x = GlobalAveragePooling2D()(x)
    x = Reshape((1, 1, 1280))(x)
    x = Dropout(0.3, name='Dropout')(x)
    x = Conv2D(k, (1, 1), padding='same')(x)
    x = Activation('softmax', name='softmax')(x)
    output = Reshape((k,))(x)
    model = Model(inputs, output)
    plot_model(model, to_file='images/MobileNetv2.png', show_shapes=True)
    return model


if __name__ == '__main__':
    MobileNetv2((224, 224, 3), 1000)  # recommended input image size


'''
The following network (DenseNet) has these advantages:
1. It alleviates the vanishing-gradient problem and makes the model less prone to overfitting
2. It strengthens the flow of features between layers, because every layer is directly connected to
   the initial input and to the gradient obtained from the loss function at the end
3. It greatly reduces the number of parameters and improves training efficiency
'''
# -*- coding: utf-8 -*-
"""DenseNet models for Keras.

# Reference paper

- [Densely Connected Convolutional Networks]
  (https://arxiv.org/abs/1608.06993) (CVPR 2017 Best Paper Award)
  CVPR is the IEEE Conference on Computer Vision and Pattern Recognition.

# Reference implementation

- [Torch DenseNets]
  (https://github.com/liuzhuang13/DenseNet/blob/master/models/densenet.lua)
- [TensorNets]
  (https://github.com/taehoonlee/tensornets/blob/master/tensornets/densenets.py)
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os

# Relative imports: this file is meant to live inside the keras/applications package.
from .. import backend as K
from ..models import Model
from ..layers import Activation
from ..layers import AveragePooling2D
from ..layers import BatchNormalization
from ..layers import Concatenate
from ..layers import Conv2D
from ..layers import Dense
from ..layers import GlobalAveragePooling2D
from ..layers import GlobalMaxPooling2D
from ..layers import Input
from ..layers import MaxPooling2D
from ..layers import ZeroPadding2D
from ..utils.data_utils import get_file
from ..engine.topology import get_source_inputs
from . import imagenet_utils
from .imagenet_utils import decode_predictions
from .imagenet_utils import _obtain_input_shape

# Download URLs for the pre-trained weights
DENSENET121_WEIGHT_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.8/densenet121_weights_tf_dim_ordering_tf_kernels.h5'
DENSENET121_WEIGHT_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.8/densenet121_weights_tf_dim_ordering_tf_kernels_notop.h5'
DENSENET169_WEIGHT_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.8/densenet169_weights_tf_dim_ordering_tf_kernels.h5'
DENSENET169_WEIGHT_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.8/densenet169_weights_tf_dim_ordering_tf_kernels_notop.h5'
DENSENET201_WEIGHT_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.8/densenet201_weights_tf_dim_ordering_tf_kernels.h5'
DENSENET201_WEIGHT_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.8/densenet201_weights_tf_dim_ordering_tf_kernels_notop.h5'


def dense_block(x, blocks, name):
    """A dense block.

    # Arguments
        x: input tensor.
        blocks: integer, the number of building blocks.
        name: string, block label.

    # Returns
        output tensor for the block.
    """
    for i in range(blocks):
        x = conv_block(x, 32, name=name + '_block' + str(i + 1))
    return x


def transition_block(x, reduction, name):
    """A transition block.

    # Arguments
        x: input tensor.
        reduction: float, compression rate at transition layers.
        name: string, block label.

    # Returns
        output tensor for the block.
    """
    bn_axis = 3 if K.image_data_format() == 'channels_last' else 1
    x = BatchNormalization(axis=bn_axis, epsilon=1.001e-5,
                           name=name + '_bn')(x)
    x = Activation('relu', name=name + '_relu')(x)
    x = Conv2D(int(K.int_shape(x)[bn_axis] * reduction), 1, use_bias=False,
               name=name + '_conv')(x)
    x = AveragePooling2D(2, strides=2, name=name + '_pool')(x)
    return x


def conv_block(x, growth_rate, name):
    """A building block for a dense block.

    # Arguments
        x: input tensor.
        growth_rate: float, growth rate at dense layers.
        name: string, block label.

    # Returns
        output tensor for the block.
    """
    bn_axis = 3 if K.image_data_format() == 'channels_last' else 1
    x1 = BatchNormalization(axis=bn_axis, epsilon=1.001e-5,
                            name=name + '_0_bn')(x)
    x1 = Activation('relu', name=name + '_0_relu')(x1)
    x1 = Conv2D(4 * growth_rate, 1, use_bias=False,
                name=name + '_1_conv')(x1)
    x1 = BatchNormalization(axis=bn_axis, epsilon=1.001e-5,
                            name=name + '_1_bn')(x1)
    x1 = Activation('relu', name=name + '_1_relu')(x1)
    x1 = Conv2D(growth_rate, 3, padding='same', use_bias=False,
                name=name + '_2_conv')(x1)
    # the new feature maps are concatenated onto the block input
    x = Concatenate(axis=bn_axis, name=name + '_concat')([x, x1])
    return x


def DenseNet(blocks,
             include_top=True,
             weights='imagenet',
             input_tensor=None,
             input_shape=None,
             pooling=None,
             classes=1000):
    """Instantiates the DenseNet architecture.

    Optionally loads weights pre-trained
    on ImageNet. Note that when using TensorFlow,
    for best performance you should set
    `image_data_format='channels_last'` in your Keras config
    at ~/.keras/keras.json.

    The model and the weights are compatible with
    TensorFlow, Theano, and CNTK. The data format
    convention used by the model is the one
    specified in your Keras config file.

    # Arguments
        blocks: numbers of building blocks for the four dense layers.
        include_top: whether to include the fully-connected
            layer at the top of the network.
        weights: one of `None` (random initialization),
            'imagenet' (pre-training on ImageNet),
            or the path to the weights file to be loaded.
        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
            to use as image input for the model.
        input_shape: optional shape tuple, only to be specified
            if `include_top` is False (otherwise the input shape
            has to be `(224, 224, 3)` (with `channels_last` data format)
            or `(3, 224, 224)` (with `channels_first` data format)).
            It should have exactly 3 inputs channels.
        pooling: optional pooling mode for feature extraction
            when `include_top` is `False`.
            - `None` means that the output of the model will be
                the 4D tensor output of the
                last convolutional layer.
            - `avg` means that global average pooling
                will be applied to the output of the
                last convolutional layer, and thus
                the output of the model will be a 2D tensor.
            - `max` means that global max pooling will
                be applied.
        classes: optional number of classes to classify images
            into, only to be specified if `include_top` is True, and
            if no `weights` argument is specified.

    # Returns
        A Keras model instance.

    # Raises
        ValueError: in case of invalid argument for `weights`,
            or invalid input shape.
    """
    if not (weights in {'imagenet', None} or os.path.exists(weights)):
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization), `imagenet` '
                         '(pre-training on ImageNet), '
                         'or the path to the weights file to be loaded.')

    if weights == 'imagenet' and include_top and classes != 1000:
        raise ValueError('If using `weights` as imagenet with `include_top`'
                         ' as true, `classes` should be 1000')

    # Determine proper input shape
    input_shape = _obtain_input_shape(input_shape,
                                      default_size=224,
                                      min_size=221,
                                      data_format=K.image_data_format(),
                                      require_flatten=include_top,
                                      weights=weights)

    if input_tensor is None:
        img_input = Input(shape=input_shape)
    else:
        if not K.is_keras_tensor(input_tensor):
            img_input = Input(tensor=input_tensor, shape=input_shape)
        else:
            img_input = input_tensor

    bn_axis = 3 if K.image_data_format() == 'channels_last' else 1

    x = ZeroPadding2D(padding=((3, 3), (3, 3)))(img_input)
    x = Conv2D(64, 7, strides=2, use_bias=False, name='conv1/conv')(x)
    x = BatchNormalization(axis=bn_axis, epsilon=1.001e-5,
                           name='conv1/bn')(x)
    x = Activation('relu', name='conv1/relu')(x)
    x = ZeroPadding2D(padding=((1, 1), (1, 1)))(x)
    x = MaxPooling2D(3, strides=2, name='pool1')(x)

    x = dense_block(x, blocks[0], name='conv2')
    x = transition_block(x, 0.5, name='pool2')
    x = dense_block(x, blocks[1], name='conv3')
    x = transition_block(x, 0.5, name='pool3')
    x = dense_block(x, blocks[2], name='conv4')
    x = transition_block(x, 0.5, name='pool4')
    x = dense_block(x, blocks[3], name='conv5')

    x = BatchNormalization(axis=bn_axis, epsilon=1.001e-5,
                           name='bn')(x)

    if include_top:
        x = GlobalAveragePooling2D(name='avg_pool')(x)
        x = Dense(classes, activation='softmax', name='fc1000')(x)
    else:
        if pooling == 'avg':
            x = GlobalAveragePooling2D(name='avg_pool')(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D(name='max_pool')(x)

    # Ensure that the model takes into account
    # any potential predecessors of `input_tensor`.
    if input_tensor is not None:
        inputs = get_source_inputs(input_tensor)
    else:
        inputs = img_input

    # Create model.
    if blocks == [6, 12, 24, 16]:
        model = Model(inputs, x, name='densenet121')
    elif blocks == [6, 12, 32, 32]:
        model = Model(inputs, x, name='densenet169')
    elif blocks == [6, 12, 48, 32]:
        model = Model(inputs, x, name='densenet201')
    else:
        model = Model(inputs, x, name='densenet')

    # Load weights.
    if weights == 'imagenet':
        if include_top:
            if blocks == [6, 12, 24, 16]:
                weights_path = get_file(
                    'densenet121_weights_tf_dim_ordering_tf_kernels.h5',
                    DENSENET121_WEIGHT_PATH,
                    cache_subdir='models',
                    file_hash='0962ca643bae20f9b6771cb844dca3b0')
            elif blocks == [6, 12, 32, 32]:
                weights_path = get_file(
                    'densenet169_weights_tf_dim_ordering_tf_kernels.h5',
                    DENSENET169_WEIGHT_PATH,
                    cache_subdir='models',
                    file_hash='bcf9965cf5064a5f9eb6d7dc69386f43')
            elif blocks == [6, 12, 48, 32]:
                weights_path = get_file(
                    'densenet201_weights_tf_dim_ordering_tf_kernels.h5',
                    DENSENET201_WEIGHT_PATH,
                    cache_subdir='models',
                    file_hash='7bb75edd58cb43163be7e0005fbe95ef')
        else:
            if blocks == [6, 12, 24, 16]:
                weights_path = get_file(
                    'densenet121_weights_tf_dim_ordering_tf_kernels_notop.h5',
                    DENSENET121_WEIGHT_PATH_NO_TOP,
                    cache_subdir='models',
                    file_hash='4912a53fbd2a69346e7f2c0b5ec8c6d3')
            elif blocks == [6, 12, 32, 32]:
                weights_path = get_file(
                    'densenet169_weights_tf_dim_ordering_tf_kernels_notop.h5',
                    DENSENET169_WEIGHT_PATH_NO_TOP,
                    cache_subdir='models',
                    file_hash='50662582284e4cf834ce40ab4dfa58c6')
            elif blocks == [6, 12, 48, 32]:
                weights_path = get_file(
                    'densenet201_weights_tf_dim_ordering_tf_kernels_notop.h5',
                    DENSENET201_WEIGHT_PATH_NO_TOP,
                    cache_subdir='models',
                    file_hash='1c2de60ee40562448dbac34a0737e798')
        model.load_weights(weights_path)
    elif weights is not None:
        model.load_weights(weights)

    return model


def DenseNet121(include_top=True,
                weights='imagenet',
                input_tensor=None,
                input_shape=None,
                pooling=None,
                classes=1000):
    return DenseNet([6, 12, 24, 16],
                    include_top, weights,
                    input_tensor, input_shape,
                    pooling, classes)


def DenseNet169(include_top=True,
                weights='imagenet',
                input_tensor=None,
                input_shape=None,
                pooling=None,
                classes=1000):
    return DenseNet([6, 12, 32, 32],
                    include_top, weights,
                    input_tensor, input_shape,
                    pooling, classes)


def DenseNet201(include_top=True,
                weights='imagenet',
                input_tensor=None,
                input_shape=None,
                pooling=None,
                classes=1000):
    return DenseNet([6, 12, 48, 32],
                    include_top, weights,
                    input_tensor, input_shape,
                    pooling, classes)


def preprocess_input(x, data_format=None):
    """Preprocesses a numpy array encoding a batch of images.

    # Arguments
        x: a 3D or 4D numpy array consists of RGB values within [0, 255].
        data_format: data format of the image tensor.

    # Returns
        Preprocessed array.
    """
    return imagenet_utils.preprocess_input(x, data_format, mode='torch')


setattr(DenseNet121, '__doc__', DenseNet.__doc__)
setattr(DenseNet169, '__doc__', DenseNet.__doc__)
setattr(DenseNet201, '__doc__', DenseNet.__doc__)
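
The padding and stride choices in the listings above determine how quickly the feature maps shrink, and the AlexNet notes give the augmentation factor (256-224)^2 × 2 = 2048. The following is a minimal sketch in plain Python (no Keras needed) that traces the spatial size through the conv/pool stack of the `AlexNet` function above and reproduces that arithmetic. The helper names `conv_out`, `trace_sizes` and `crop_flip_factor` are hypothetical, the 224-pixel input is an assumption taken from the crop size mentioned in the notes, and the 'same'/'valid' formulas are the usual TensorFlow-backend conventions.

```python
import math

def conv_out(size, kernel, stride=1, padding="valid"):
    """Output length along one spatial axis for a conv or pooling layer."""
    if padding == "same":
        # 'same' pads so that only the stride reduces the size
        return math.ceil(size / stride)
    if padding == "valid":
        # 'valid' uses no padding: the window must fit entirely inside the input
        return (size - kernel) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")

# (kernel, stride, padding) of the conv/pool layers in the AlexNet function above
ALEXNET_LAYERS = [
    (11, 4, "valid"),  # Conv2D 96
    (3, 2, "valid"),   # MaxPooling2D
    (5, 1, "same"),    # Conv2D 256
    (3, 2, "valid"),   # MaxPooling2D
    (3, 1, "same"),    # Conv2D 384
    (3, 1, "same"),    # Conv2D 384
    (3, 1, "same"),    # Conv2D 256
    (3, 2, "valid"),   # MaxPooling2D
]

def trace_sizes(input_size, layers):
    """Return the spatial size before the first layer and after each layer."""
    sizes = [input_size]
    for kernel, stride, padding in layers:
        sizes.append(conv_out(sizes[-1], kernel, stride, padding))
    return sizes

def crop_flip_factor(full=256, crop=224):
    """Augmentation factor quoted in the AlexNet notes: crop offsets times two flips."""
    return (full - crop) ** 2 * 2

if __name__ == "__main__":
    sizes = trace_sizes(224, ALEXNET_LAYERS)  # assumed 224x224 input
    print(sizes)
    print("Flatten units:", sizes[-1] ** 2 * 256)
    print("Augmentation factor:", crop_flip_factor())
```

The same helper can be pointed at any of the other Sequential models by listing their layers. It is handy for checking that a fixed-size pooling such as `AveragePooling2D(pool_size=(7,7))` in the ResNet listings actually receives a 7×7 feature map for the input size you plan to use.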