Mobile unet portrait segmentation model -- Part 3
阿新 • Posted: 2018-11-09
The previous two posts covered the basics of training the unet model in mxnet and converting it to ncnn. Three problems remain: 1. the model is fairly large; 2. a single frame takes about 15 seconds to process (on a Mac Pro, with ncnn built without OpenMP); 3. the resulting mask is not particularly good. This post adjusts the network structure to address all three.
1. The model is fairly large
By cutting the number of convolution kernels in the network by a factor of 4, the model size drops to 2 MB, and a rough test on images shows the results are still acceptable. To recover accuracy, the training set is augmented with flipped, cropped, and rotated copies of each sample. At the same time, the zero-padding used previously is replaced with border-replicate padding, because testing showed that detection errors kept appearing along the zero-padded borders. An earlier experiment also suggests that if 2 MB is still too large, the downsampling half of the U can be rebuilt MobileNet-style to compress the model further.
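To see why the MobileNet-style option would keep shrinking the model, the parameter counts can be compared directly. The sketch below is illustrative only: the channel counts are hypothetical examples, not the actual network's, and bias terms are ignored.

```python
# Compare parameter counts of a standard 3x3 convolution against a
# MobileNet-style depthwise-separable replacement (3x3 depthwise
# followed by 1x1 pointwise). Channel counts are hypothetical.

def conv_params(c_in, c_out, k=3):
    # standard convolution: one k x k kernel per (input, output) channel pair
    return c_in * c_out * k * k

def separable_params(c_in, c_out, k=3):
    # depthwise: one k x k kernel per input channel
    # pointwise: a 1x1 convolution that mixes channels
    return c_in * k * k + c_in * c_out

for c_in, c_out in [(16, 32), (64, 128)]:
    std = conv_params(c_in, c_out)
    sep = separable_params(c_in, c_out)
    print(c_in, c_out, std, sep, round(std / sep, 1))
# 16 32 4608 656 7.0
# 64 128 73728 8768 8.4
```

The saving grows with the channel count, which is why the deeper layers of the U benefit the most from the swap.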
#!/usr/bin/env python
# coding=utf8
import os
import sys
import random
import cv2
import mxnet as mx
import numpy as np
from mxnet.io import DataIter, DataBatch

sys.path.append('../')


def padding_and_resize(img, dstwidth, dstheight):
    """Pad the image to a square with replicated border pixels, then resize."""
    height = img.shape[0]
    width = img.shape[1]
    top = 0
    bottom = 0
    left = 0
    right = 0
    if width > height:
        top = int((width - height) / 2)
        bottom = int((width - height) - top)
    else:
        left = int((height - width) / 2)
        right = int((height - width) - left)
    tmp = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_REPLICATE)
    return cv2.resize(tmp, (dstwidth, dstheight))  # resize the padded image, not the original


def rotate_image(image, angle):
    # grab the dimensions of the image and then determine the center
    (h, w) = image.shape[:2]
    (cX, cY) = (w // 2, h // 2)

    # grab the rotation matrix (applying the negative of the angle to
    # rotate clockwise), then grab the sine and cosine
    # (i.e., the rotation components of the matrix)
    M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])

    # compute the new bounding dimensions of the image
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))

    # adjust the rotation matrix to take into account translation
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY

    # perform the actual rotation and return the image
    return cv2.warpAffine(image, M, (nW, nH),
                          flags=cv2.INTER_LINEAR,
                          borderMode=cv2.BORDER_REPLICATE)


def get_batch(items, root_path, nClasses, height, width):
    x = []
    y = []
    for item in items:
        flipped = False
        cropped = False
        rotated = False
        rotated_neg = False
        image_path = root_path + item.split(' ')[0]
        label_path = root_path + item.split(' ')[-1].strip()
        # augmented samples are listed with a suffix in the file name;
        # map them back to the original file and remember which
        # transform to apply on the fly
        if image_path.find('_flipped.') >= 0:
            image_path = image_path.replace('_flipped', '')
            flipped = True
        elif image_path.find('_cropped.') >= 0:
            image_path = image_path.replace('_cropped', '')
            cropped = True
        elif image_path.find('_rotated.') >= 0:
            image_path = image_path.replace('_rotated', '')
            rotated = True
        elif image_path.find('_rotated_neg.') >= 0:
            image_path = image_path.replace('_rotated_neg', '')
            rotated_neg = True
        im = cv2.imread(image_path, 1)
        lim = cv2.imread(label_path, 1)
        if cropped:
            # keep the middle 3/5 of the width (crop augmentation)
            tmp_height = im.shape[0]
            im = im[:, tmp_height // 5:tmp_height * 4 // 5]
            tmp_height = lim.shape[0]
            lim = lim[:, tmp_height // 5:tmp_height * 4 // 5]
        if flipped:
            im = cv2.flip(im, 1)
            lim = cv2.flip(lim, 1)
        if rotated:
            im = rotate_image(im, 13)
            lim = rotate_image(lim, 13)
        if rotated_neg:
            im = rotate_image(im, -13)
            lim = rotate_image(lim, -13)
        im = padding_and_resize(im, width, height)
        lim = padding_and_resize(lim, width, height)
        im = np.float32(im) / 255.0
        lim = lim[:, :, 0]
        # one-hot encode the label map, then flatten to (nClasses, H*W)
        seg_labels = np.zeros((height, width, nClasses))
        for c in range(nClasses):
            seg_labels[:, :, c] = (lim == c).astype(int)
        seg_labels = np.reshape(seg_labels, (width * height, nClasses))
        x.append(im.transpose((2, 0, 1)))
        y.append(seg_labels.transpose((1, 0)))
    return mx.nd.array(x), mx.nd.array(y)
2. A single frame takes about 15 seconds
After the changes in step 1, an image takes only about one second to process, and with OpenMP and a few more threads, several frames per second should be possible. One further idea: assign each layer of the network to its own thread and run the layers as a pipeline, which might be enough for real-time processing of video frames.
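The layer-per-thread idea can be sketched with nothing but the standard library: each "layer" runs in its own thread, and frames flow through bounded queues. The stages below are stand-in arithmetic functions, not real network layers, so this only demonstrates the plumbing.

```python
# Minimal pipeline-parallel sketch: one thread per stage, bounded
# queues between stages, a None sentinel for shutdown.
import threading
import queue

def make_stage(fn, q_in, q_out):
    def run():
        while True:
            item = q_in.get()
            if item is None:      # sentinel: shut down and pass it on
                q_out.put(None)
                break
            q_out.put(fn(item))
    t = threading.Thread(target=run)
    t.start()
    return t

# three stand-in "layers"
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
queues = [queue.Queue(maxsize=4) for _ in range(len(stages) + 1)]
threads = [make_stage(fn, queues[i], queues[i + 1])
           for i, fn in enumerate(stages)]

for frame in range(5):            # feed "frames" into the pipeline
    queues[0].put(frame)
queues[0].put(None)

results = []
while True:
    out = queues[-1].get()
    if out is None:
        break
    results.append(out)
for t in threads:
    t.join()
print(results)  # [-1, 1, 3, 5, 7]: each frame passes through all stages in order
```

Note that pipelining raises throughput (frames per second) but not the latency of any single frame; for real Python inference threads, a C++ backend like ncnn would have to release the GIL for the stages to actually overlap.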
3. The resulting mask is not particularly good
Besides augmenting the samples, the network's concat can be varied during training: for example, in up6 = mx.sym.concat(*[trans_conv6, conv5], dim=1, name='concat6'), swap conv5 for the output of the first convolution (originally the second one), train a few epochs, then switch back to the original network.
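A minimal sketch of what that skip-connection swap does to the graph, with numpy standing in for the symbolic mxnet concat. All shapes here are hypothetical (N, C, H, W), and conv1_resized is an assumed name for the first convolution's output brought to the same spatial size:

```python
# concat6 normally joins the upsampled decoder features with conv5;
# during the warm-up epochs it joins them with the first conv instead.
import numpy as np

trans_conv6 = np.zeros((1, 64, 32, 32))    # upsampled decoder features
conv5 = np.zeros((1, 64, 32, 32))          # original skip source (2nd conv)
conv1_resized = np.zeros((1, 16, 32, 32))  # alternate source (1st conv),
                                           # assumed to match H and W

def concat6(skip):
    # channel-wise concat, same role as mx.sym.concat(..., dim=1)
    return np.concatenate([trans_conv6, skip], axis=1)

print(concat6(conv5).shape)          # (1, 128, 32, 32)
print(concat6(conv1_resized).shape)  # (1, 80, 32, 32)
```

One practical caveat: if the two skip sources have different channel counts, as in this sketch, the convolution that consumes concat6 changes input shape, so its weights cannot be carried over directly when switching back.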
A few result images are attached below.