MTCNN-Based Automatic Face Alignment: Principles and a TensorFlow Implementation Test
阿新 • Published: 2019-01-04
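Face recognition is a hot topic in computer vision research. It involves a number of steps; the overall technical pipeline is shown in the figure below (taken from http://www.techshino.com/upfiles/images/%E4%BA%BA%E8%84%B8%E8%AF%86%E5%88%AB%E6%8A%80%E6%9C%AF%E6%B5%81%E7%A8%8B(2).png):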
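Face detection is a critical step in this pipeline. In most application scenarios, surveillance video frames capture natural, unconstrained scenes, so the first task in such applications is to detect the faces.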
As the schematic above shows, the MTCNN model consists of three stages:
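Step 1: P-Net. This stage generates a large set of candidate bounding boxes and merges them with NMS (non-maximum suppression). The principle is similar to the one used in object detection.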
Step 2: R-Net. This stage refines the results of Step 1 to obtain more precise candidate regions.
Step 3: O-Net. This stage outputs the final result (face bounding boxes and facial landmark positions).
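The NMS merging mentioned in Step 1 can be sketched as follows. This is a minimal greedy NMS in NumPy, not the exact routine used by the MTCNN code in this article; the function name `nms`, the `[x1, y1, x2, y2]` box layout, and the default IoU threshold of 0.5 are all assumptions chosen for illustration.

```python
import numpy as np


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression (illustrative sketch).

    boxes:  (N, 4) array of [x1, y1, x2, y2], assumed to have positive area.
    scores: (N,) array of confidence scores.
    Returns the indices of the boxes that are kept, highest score first.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    if boxes.size == 0:
        return []

    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(scores)[::-1]  # indices sorted by descending score

    keep = []
    while order.size > 0:
        i = order[0]  # best remaining box
        keep.append(int(i))
        rest = order[1:]

        # Intersection of the best box with every remaining box
        xx1 = np.maximum(x1[i], x1[rest])
        yy1 = np.maximum(y1[i], y1[rest])
        xx2 = np.minimum(x2[i], x2[rest])
        yy2 = np.minimum(y2[i], y2[rest])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)

        # Intersection over union; drop boxes that overlap the best box too much
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]

    return keep


candidates = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
confidences = [0.9, 0.8, 0.7]
kept = nms(candidates, confidences)
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap anything and is kept.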
The convolutional architecture of these three networks is detailed in the figure below:
The experiment below is based on TensorFlow. The corresponding MTCNN code is as follows:
import os

import tensorflow as tf  # TensorFlow 1.x graph API (tf.placeholder, tf.variable_scope)

# Network is the layer-builder base class that provides feed/conv/prelu/
# max_pool/fc/softmax/load; its definition is not shown in this article.


class PNet(Network):
    def setup(self):
        # Shared trunk followed by the face / non-face classification branch
        (self.feed('data')  # pylint: disable=no-value-for-parameter, no-member
             .conv(3, 3, 10, 1, 1, padding='VALID', relu=False, name='conv1')
             .prelu(name='PReLU1')
             .max_pool(2, 2, 2, 2, name='pool1')
             .conv(3, 3, 16, 1, 1, padding='VALID', relu=False, name='conv2')
             .prelu(name='PReLU2')
             .conv(3, 3, 32, 1, 1, padding='VALID', relu=False, name='conv3')
             .prelu(name='PReLU3')
             .conv(1, 1, 2, 1, 1, relu=False, name='conv4-1')
             .softmax(3, name='prob1'))

        # Bounding-box regression branch (4 values per location)
        (self.feed('PReLU3')  # pylint: disable=no-value-for-parameter
             .conv(1, 1, 4, 1, 1, relu=False, name='conv4-2'))


class RNet(Network):
    def setup(self):
        # Shared trunk followed by the face / non-face classification branch
        (self.feed('data')  # pylint: disable=no-value-for-parameter, no-member
             .conv(3, 3, 28, 1, 1, padding='VALID', relu=False, name='conv1')
             .prelu(name='prelu1')
             .max_pool(3, 3, 2, 2, name='pool1')
             .conv(3, 3, 48, 1, 1, padding='VALID', relu=False, name='conv2')
             .prelu(name='prelu2')
             .max_pool(3, 3, 2, 2, padding='VALID', name='pool2')
             .conv(2, 2, 64, 1, 1, padding='VALID', relu=False, name='conv3')
             .prelu(name='prelu3')
             .fc(128, relu=False, name='conv4')
             .prelu(name='prelu4')
             .fc(2, relu=False, name='conv5-1')
             .softmax(1, name='prob1'))

        # Bounding-box regression branch
        (self.feed('prelu4')  # pylint: disable=no-value-for-parameter
             .fc(4, relu=False, name='conv5-2'))


class ONet(Network):
    def setup(self):
        # Shared trunk followed by the face / non-face classification branch
        (self.feed('data')  # pylint: disable=no-value-for-parameter, no-member
             .conv(3, 3, 32, 1, 1, padding='VALID', relu=False, name='conv1')
             .prelu(name='prelu1')
             .max_pool(3, 3, 2, 2, name='pool1')
             .conv(3, 3, 64, 1, 1, padding='VALID', relu=False, name='conv2')
             .prelu(name='prelu2')
             .max_pool(3, 3, 2, 2, padding='VALID', name='pool2')
             .conv(3, 3, 64, 1, 1, padding='VALID', relu=False, name='conv3')
             .prelu(name='prelu3')
             .max_pool(2, 2, 2, 2, name='pool3')
             .conv(2, 2, 128, 1, 1, padding='VALID', relu=False, name='conv4')
             .prelu(name='prelu4')
             .fc(256, relu=False, name='conv5')
             .prelu(name='prelu5')
             .fc(2, relu=False, name='conv6-1')
             .softmax(1, name='prob1'))

        # Bounding-box regression branch
        (self.feed('prelu5')  # pylint: disable=no-value-for-parameter
             .fc(4, relu=False, name='conv6-2'))

        # Landmark branch: 10 values (x and y for 5 facial landmarks)
        (self.feed('prelu5')  # pylint: disable=no-value-for-parameter
             .fc(10, relu=False, name='conv6-3'))


# Build the MTCNN model
def create_mtcnn(sess, model_path):
    # Default to the directory that contains this file
    if not model_path:
        model_path, _ = os.path.split(os.path.realpath(__file__))

    with tf.variable_scope('pnet'):
        # P-Net is fully convolutional, so height and width are left unspecified
        data = tf.placeholder(tf.float32, (None, None, None, 3), 'input')
        pnet = PNet({'data': data})
        pnet.load(os.path.join(model_path, 'det1.npy'), sess)
    with tf.variable_scope('rnet'):
        data = tf.placeholder(tf.float32, (None, 24, 24, 3), 'input')
        rnet = RNet({'data': data})
        rnet.load(os.path.join(model_path, 'det2.npy'), sess)
    with tf.variable_scope('onet'):
        data = tf.placeholder(tf.float32, (None, 48, 48, 3), 'input')
        onet = ONet({'data': data})
        onet.load(os.path.join(model_path, 'det3.npy'), sess)

    # Wrap each network in a function that maps an image batch to its outputs
    pnet_fun = lambda img: sess.run(
        ('pnet/conv4-2/BiasAdd:0', 'pnet/prob1:0'),
        feed_dict={'pnet/input:0': img})
    rnet_fun = lambda img: sess.run(
        ('rnet/conv5-2/conv5-2:0', 'rnet/prob1:0'),
        feed_dict={'rnet/input:0': img})
    onet_fun = lambda img: sess.run(
        ('onet/conv6-2/conv6-2:0', 'onet/conv6-3/conv6-3:0', 'onet/prob1:0'),
        feed_dict={'onet/input:0': img})
    return pnet_fun, rnet_fun, onet_fun
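To see how the layer parameters in the listing above determine the feature-map sizes, the spatial arithmetic can be traced by hand. The sketch below is plain Python and does not run the networks. It assumes that the `max_pool` calls without an explicit `padding` argument use 'SAME' padding; the `Network` base class is not shown in this article, so that default is an assumption. The helper names (`conv_valid`, `pool_same`, `pool_valid`, `pnet_out`, `rnet_out`, `onet_out`) are hypothetical.

```python
import math


def conv_valid(size, kernel, stride=1):
    """Output size of a 'VALID' convolution along one spatial dimension."""
    return (size - kernel) // stride + 1


def pool_valid(size, kernel, stride):
    """Output size of a 'VALID' pooling layer along one spatial dimension."""
    return (size - kernel) // stride + 1


def pool_same(size, stride):
    """Output size of a 'SAME' pooling layer (assumed default in the listing)."""
    return math.ceil(size / stride)


def pnet_out(size):
    size = conv_valid(size, 3)     # conv1, 3x3 VALID
    size = pool_same(size, 2)      # pool1, 2x2 stride 2
    size = conv_valid(size, 3)     # conv2, 3x3 VALID
    size = conv_valid(size, 3)     # conv3, 3x3 VALID
    return size                    # the 1x1 conv4-1 / conv4-2 keep this size


def rnet_out(size=24):
    size = conv_valid(size, 3)     # conv1
    size = pool_same(size, 2)      # pool1, 3x3 stride 2
    size = conv_valid(size, 3)     # conv2
    size = pool_valid(size, 3, 2)  # pool2, 3x3 stride 2 VALID
    size = conv_valid(size, 2)     # conv3, 2x2 VALID
    return size                    # spatial size fed to the fc(128) layer


def onet_out(size=48):
    size = conv_valid(size, 3)     # conv1
    size = pool_same(size, 2)      # pool1, 3x3 stride 2
    size = conv_valid(size, 3)     # conv2
    size = pool_valid(size, 3, 2)  # pool2, 3x3 stride 2 VALID
    size = conv_valid(size, 3)     # conv3
    size = pool_same(size, 2)      # pool3, 2x2 stride 2
    size = conv_valid(size, 2)     # conv4, 2x2 VALID
    return size                    # spatial size fed to the fc(256) layer


print(pnet_out(12), rnet_out(24), onet_out(48))
```

Under these assumptions a 12x12 patch reduces to a single P-Net output location, which is why P-Net can slide over an image of any size, while R-Net and O-Net reduce their fixed 24x24 and 48x48 inputs to 3x3 feature maps before the fully connected layers.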
Tested on the LFW dataset, the model performs quite well. The test results are shown below: