Study notes: Caffe2 tutorial, part two
Following on from part one of my Caffe2 tutorial notes, this is the second installment.
## 2. Loading Pretrained Models
On GitHub: https://github.com/caffe2/tutorials/blob/master/Loading_Pretrained_Models.ipynb
Description:
In this tutorial we will use the pretrained SqueezeNet model that ships with Caffe2 (it comes from the Caffe2 Model Zoo) to classify our own images. As input, we will provide the path (or URL) of the image we want to classify. It also helps to know the image's ImageNet object code so that we can verify our results. An "object code" is nothing more than the integer label of a class used during training; for example, 985 is the code for the daisy class. Note that although we use SqueezeNet here, this tutorial serves as a general method for running inference with a pretrained model.
If you came from the Image Pre-Processing Tutorial, you will see that we use the rescale and crop functions to prepare the image, reformatting it to CHW, then BGR, and finally NCHW. We also correct for the image mean, either using a mean computed from the provided npy file, or by statically subtracting 128 as a placeholder mean.
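The reformatting steps just described (HWC → CHW, RGB → BGR, mean subtraction, then adding the batch axis for NCHW) can be sketched on a toy array; the 4x4 size and the placeholder mean of 128 here are purely illustrative:

```python
import numpy as np

# Toy 4x4 "image" in HWC order with values in [0, 1],
# as skimage.img_as_float would produce.
img = np.linspace(0.0, 1.0, 4 * 4 * 3, dtype=np.float32).reshape(4, 4, 3)

# HWC -> CHW: bring the channel axis to the front.
chw = img.swapaxes(1, 2).swapaxes(0, 1)
assert chw.shape == (3, 4, 4)

# RGB -> BGR: reverse the order of the channel axis.
bgr = chw[(2, 1, 0), :, :]

# Scale to the 0-255 range and subtract the placeholder mean of 128.
centered = bgr * 255 - 128

# CHW -> NCHW: add the batch axis in front.
nchw = centered[np.newaxis, :, :, :].astype(np.float32)
print(nchw.shape)  # (1, 3, 4, 4)
```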
Hopefully you will find that loading a pretrained model is simple and syntactically concise. At a high level, these are the three steps required to run inference with a pretrained model:
- Read the init_net.pb and predict_net.pb protobuf files:
with open("init_net.pb", "rb") as f:
    init_net = f.read()
with open("predict_net.pb", "rb") as f:
    predict_net = f.read()
- Load the init and predict blobs with the workspace.Predictor() method:
p = workspace.Predictor(init_net, predict_net)
- Run the net on some data and get the (softmax) results:
results = p.run({'data': img})
Note that, assuming the last layer of the network is a softmax layer, the results come back as a multidimensional array of probabilities whose length equals the number of classes the model was trained on. The probabilities can be indexed by object code (an integer), so if you know the object code, you can index the results array at that code to see the network's confidence that the input image belongs to that class.
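To make that indexing concrete, here is a toy sketch with made-up probabilities; only the 985 → daisy mapping comes from the text above:

```python
import numpy as np

# Hypothetical softmax output of a 1000-class model; the values are made up.
results = np.zeros(1000, dtype=np.float32)
results[985] = 0.92  # pretend the net is 92% confident in class 985 (daisy)
results[309] = 0.05

# Knowing the object code, index the results array at that code to read
# the network's confidence that the input belongs to that class.
daisy_code = 985
print("Confidence for daisy (code 985):", results[daisy_code])
```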
Choosing a model to download:
Although we will use SqueezeNet here, you can browse and download a variety of pretrained models in the Model Zoo, or use Caffe2's caffe2.python.models.download module to easily grab pretrained models from GitHub (caffe2/models).
For our purposes, we will use the models.download command to download SqueezeNet into the /caffe2/python/models folder of our local Caffe2 installation:
python -m caffe2.python.models.download -i squeezenet
If the download above ran correctly, you should have a squeezenet folder under /caffe2/python/models containing both init_net.pb and predict_net.pb. Note that if you omit the -i flag, the model will be downloaded into your current working directory instead, although it will still end up in a squeezenet folder containing init_net.pb and predict_net.pb. Alternatively, if you want to download all of the models, you can clone the entire repository with:
git clone https://github.com/caffe2/models
Code (imports):
Before we begin, let's take care of all of the required imports. (These dependencies are used later in the run; you may see two warnings reporting that the code is not running on a GPU.)
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
from caffe2.proto import caffe2_pb2
import numpy as np
import skimage.io
import skimage.transform
from matplotlib import pyplot
import os
from caffe2.python import core, workspace, models
import urllib2
import operator
print("Required modules imported.")
Input (input configuration):
Here we specify the inputs for this run: the input image, the model location, the mean file (optional), the desired image size, and the location of the label-mapping file.
# Configuration --- Change to your setup and preferences!
# This directory should contain the models downloaded from the model zoo. To run this
# tutorial, make sure there is a 'squeezenet' directory at this location that
# contains both the 'init_net.pb' and 'predict_net.pb'
CAFFE_MODELS = "~/caffe2/caffe2/python/models"
# Some sample images you can try, or use any URL to a regular image.
# IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/thumb/f/f8/Whole-Lemon.jpg/1235px-Whole-Lemon.jpg"
# IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/7/7b/Orange-Whole-%26-Split.jpg"
# IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/a/ac/Pretzel.jpg"
# IMAGE_LOCATION = "https://cdn.pixabay.com/photo/2015/02/10/21/28/flower-631765_1280.jpg"
IMAGE_LOCATION = "images/flower.jpg"
# What model are we using?
# Format below is the model's: <folder, INIT_NET, predict_net, mean, input image size>
# You can switch 'squeezenet' out with 'bvlc_alexnet', 'bvlc_googlenet' or others that you have downloaded
MODEL = 'squeezenet', 'init_net.pb', 'predict_net.pb', 'ilsvrc_2012_mean.npy', 227
# codes - these help decipher the output, sourced from a list of ImageNet's object codes,
# to provide a result like "tabby cat" or "lemon" depending on what's in the picture
# you submit to the CNN.
codes = "https://gist.githubusercontent.com/aaronmarkham/cd3a6b6ac071eca6f7b4a6e40e6038aa/raw/9edb4038a37da6b5a44c3b5bc52e448ff09bfe5b/alexnet_codes"
print("Config set!")
Setting paths:
With the configuration set, we can now load the mean file (if it exists), along with the predict net and the init net.
# set paths and variables from model choice and prep image
CAFFE_MODELS = os.path.expanduser(CAFFE_MODELS)
# mean can be 128 or custom based on the model
# gives better results to remove the colors found in all of the training images
MEAN_FILE = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[3])
if not os.path.exists(MEAN_FILE):
print("No mean file found!")
mean = 128
else:
print ("Mean file found!")
mean = np.load(MEAN_FILE).mean(1).mean(1)
mean = mean[:, np.newaxis, np.newaxis]
print("mean was set to: ", mean)
# some models were trained with different image sizes, this helps you calibrate your image
INPUT_IMAGE_SIZE = MODEL[4]
# make sure all of the files are around...
INIT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[1])
PREDICT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[2])
# Check to see if the files exist
if not os.path.exists(INIT_NET):
print("WARNING: " + INIT_NET + " not found!")
else:
if not os.path.exists(PREDICT_NET):
print("WARNING: " + PREDICT_NET + " not found!")
else:
print("All needed files found!")
Image preprocessing:
Now that we have specified the input and verified the existence of the network files, we can load the image and preprocess it for ingestion by a Caffe2 convolutional neural network! This is a very important step, since a trained CNN requires an input image of a specific size, with values drawn from a particular distribution.
# Function to crop the center cropX x cropY pixels from the input image
def crop_center(img,cropx,cropy):
y,x,c = img.shape
startx = x//2-(cropx//2)
starty = y//2-(cropy//2)
return img[starty:starty+cropy,startx:startx+cropx]
# Function to rescale the input image to the desired height and/or width. This function will preserve
# the aspect ratio of the original image while making the image the correct scale so we can retrieve
# a good center crop. This function is best used with center crop to resize any size input images into
# specific sized images that our model can use.
def rescale(img, input_height, input_width):
# Get original aspect ratio
aspect = img.shape[1]/float(img.shape[0])
if(aspect>1):
# landscape orientation - wide image
res = int(aspect * input_height)
imgScaled = skimage.transform.resize(img, (input_width, res))
if(aspect<1):
# portrait orientation - tall image
res = int(input_width/aspect)
imgScaled = skimage.transform.resize(img, (res, input_height))
if(aspect == 1):
imgScaled = skimage.transform.resize(img, (input_width, input_height))
return imgScaled
# Load the image as a 32-bit float
# Note: skimage.io.imread returns a HWC ordered RGB image of some size
img = skimage.img_as_float(skimage.io.imread(IMAGE_LOCATION)).astype(np.float32)
print("Original Image Shape: " , img.shape)
# Rescale the image to comply with our desired input size. This will not make the image 227x227
# but it will make either the height or width 227 so we can get the ideal center crop.
img = rescale(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE)
print("Image Shape after rescaling: " , img.shape)
pyplot.figure()
pyplot.imshow(img)
pyplot.title('Rescaled image')
# Crop the center 227x227 pixels of the image so we can feed it to our model
img = crop_center(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE)
print("Image Shape after cropping: " , img.shape)
pyplot.figure()
pyplot.imshow(img)
pyplot.title('Center Cropped')
# switch to CHW (HWC --> CHW)
img = img.swapaxes(1, 2).swapaxes(0, 1)
print("CHW Image Shape: " , img.shape)
pyplot.figure()
for i in range(3):
# For some reason, pyplot subplot follows Matlab's indexing
# convention (starting with 1). Well, we'll just follow it...
pyplot.subplot(1, 3, i+1)
pyplot.imshow(img[i])
pyplot.axis('off')
pyplot.title('RGB channel %d' % (i+1))
# switch to BGR (RGB --> BGR)
img = img[(2, 1, 0), :, :]
# remove mean for better results
img = img * 255 - mean
# add batch size axis which completes the formation of the NCHW shaped input that we want
img = img[np.newaxis, :, :, :].astype(np.float32)
print("NCHW image (ready to be used as input): ", img.shape)
Prepare the CNN and run it:
Now that the image is ready to be fed to the CNN, let's open the protobufs, load them into the workspace, and run the net.
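The notebook cell for this step is not reproduced above; a minimal sketch, assuming the INIT_NET and PREDICT_NET paths and the preprocessed img from the earlier cells, looks like this:

```python
from caffe2.python import workspace
import numpy as np

# Read the serialized protobufs into local variables
# (INIT_NET and PREDICT_NET were set in the path-setup step above)
with open(INIT_NET, "rb") as f:
    init_net = f.read()
with open(PREDICT_NET, "rb") as f:
    predict_net = f.read()

# Initialize the predictor from the two protobufs
p = workspace.Predictor(init_net, predict_net)

# Run the net on the preprocessed NCHW image and collect the (softmax) results
results = p.run({'data': img})

# Turn the output into a numpy array so we can examine it
results = np.asarray(results)
print("results shape: ", results.shape)
```

This requires a working Caffe2 install and the downloaded model files; the same calls appear in the szMain.py listing at the end of these notes.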
Processing the results:
Recall that ImageNet is a 1000-class dataset, so it is no coincidence that the third axis of the results has length 1000. This axis holds the probability of each class in the pretrained model. So when you look at the results array at a specific index, that number can be interpreted as the probability that the input belongs to the class corresponding to that index. Now that we have run the predictor and collected the results, we can interpret them by matching them to their corresponding English labels.
# the rest of this is digging through the results
results = np.delete(results, 1)
index = 0
highest = 0
arr = np.empty((0,2), dtype=object)
arr[:,0] = int(10)
arr[:,1:] = float(10)
for i, r in enumerate(results):
# imagenet index begins with 1!
i=i+1
arr = np.append(arr, np.array([[i,r]]), axis=0)
if (r > highest):
highest = r
index = i
# top N results
N = 5
topN = sorted(arr, key=lambda x: x[1], reverse=True)[:N]
print("Raw top {} results: {}".format(N,topN))
# Isolate the indexes of the top-N most likely classes
topN_inds = [int(x[0]) for x in topN]
print("Top {} classes in order: {}".format(N,topN_inds))
# Now we can grab the code list and create a class Look Up Table
response = urllib2.urlopen(codes)
class_LUT = []
for line in response:
code, result = line.partition(":")[::2]
code = code.strip()
result = result.replace("'", "")
if code.isdigit():
class_LUT.append(result.split(",")[0][1:])
# For each of the top-N results, associate the integer result with an actual class
for n in topN:
print("Model predicts '{}' with {}% confidence".format(class_LUT[int(n[0])],float("{0:.2f}".format(n[1]*100))))
Run output:
Batch processing of multiple images:
The above is an example of how to feed one image at a time. We can achieve higher throughput if we feed multiple images in a single pass. Recall that the data fed to the classifier is in NCHW order, so to feed multiple images we will expand the N axis.
# List of input images to be fed
images = ["images/cowboy-hat.jpg",
"images/cell-tower.jpg",
"images/Ducreux.jpg",
"images/pretzel.jpg",
"images/orangutan.jpg",
"images/aircraft-carrier.jpg",
"images/cat.jpg"]
# Allocate space for the batch of formatted images
NCHW_batch = np.zeros((len(images),3,227,227))
print ("Batch Shape: ",NCHW_batch.shape)
# For each of the images in the list, format it and place it in the batch
for i,curr_img in enumerate(images):
img = skimage.img_as_float(skimage.io.imread(curr_img)).astype(np.float32)
img = rescale(img, 227, 227)
img = crop_center(img, 227, 227)
img = img.swapaxes(1, 2).swapaxes(0, 1)
img = img[(2, 1, 0), :, :]
img = img * 255 - mean
NCHW_batch[i] = img
print("NCHW image (ready to be used as input): ", NCHW_batch.shape)
# Run the net on the batch
results = p.run([NCHW_batch.astype(np.float32)])
# Turn it into something we can play with and examine which is in a multi-dimensional array
results = np.asarray(results)
# Squeeze out the unnecessary axis
preds = np.squeeze(results)
print("Squeezed Predictions Shape, with batch size {}: {}".format(len(images),preds.shape))
# Describe the results
for i,pred in enumerate(preds):
print("Results for: '{}'".format(images[i]))
# Get the prediction and the confidence by finding the maximum value
# and index of maximum value in preds array
curr_pred, curr_conf = max(enumerate(pred), key=operator.itemgetter(1))
print("\tPrediction: ", curr_pred)
print("\tClass Name: ", class_LUT[int(curr_pred)])
print("\tConfidence: ", curr_conf)
Run output:
End of the tutorial.
Below is my single-image run result:
Study code: szMain.py
#coding=utf-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
from caffe2.proto import caffe2_pb2
import numpy as np
import skimage.io
import skimage.transform
from matplotlib import pyplot
import os
from caffe2.python import core, workspace, models
import urllib2
import operator
##1. (Configure the model path)
# Configuration --- Change to your setup and preferences!
# This directory should contain the models downloaded from the model zoo. To run this
# tutorial, make sure there is a 'squeezenet' directory at this location that
# contains both the 'init_net.pb' and 'predict_net.pb' (i.e. configure this to point at your own Caffe2 models path)
CAFFE_MODELS = "/Users/zhangrong/anaconda2/lib/python2.7/site-packages/caffe2/python/models/"
##2. (Configure the image location)
# Some sample images you can try, or use any URL to a regular image.
# IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/thumb/f/f8/Whole-Lemon.jpg/1235px-Whole-Lemon.jpg"
# IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/7/7b/Orange-Whole-%26-Split.jpg"
IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/a/ac/Pretzel.jpg"
# IMAGE_LOCATION = "images/flower.jpg"  # (some sample image URLs are provided, plus the option of a local image path)
# IMAGE_LOCATION = "https://cdn.pixabay.com/photo/2015/02/10/21/28/flower-631765_1280.jpg"
##3. (Configure the model to use)
# What model are we using?
# Format below is the model's: <folder, INIT_NET, predict_net, mean, input image size>
# You can switch 'squeezenet' out with 'bvlc_alexnet', 'bvlc_googlenet' or others that you have downloaded (this is how to configure which model is used)
MODEL = 'squeezenet', 'init_net.pb', 'predict_net.pb', 'ilsvrc_2012_mean.npy', 227
##4. (Configure the ImageNet object codes used by the CNN)
# codes - these help decipher the output, sourced from a list of ImageNet's object codes,
# to provide a result like "tabby cat" or "lemon" depending on what's in the picture
# you submit to the CNN.
codes = "https://gist.githubusercontent.com/aaronmarkham/cd3a6b6ac071eca6f7b4a6e40e6038aa/raw/9edb4038a37da6b5a44c3b5bc52e448ff09bfe5b/alexnet_codes"
print("Config set!")
##5. (Check that the configured paths and model files exist, in preparation for later steps)
# set paths and variables from model choice and prep image
CAFFE_MODELS = os.path.expanduser(CAFFE_MODELS)
# mean can be 128 or custom based on the model
# gives better results to remove the colors found in all of the training images
MEAN_FILE = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[3])
if not os.path.exists(MEAN_FILE):
print("No mean file found!")
mean = 128
else:
print ("Mean file found!")
mean = np.load(MEAN_FILE).mean(1).mean(1)
mean = mean[:, np.newaxis, np.newaxis]
print("mean was set to: ", mean)
# some models were trained with different image sizes, this helps you calibrate your image
INPUT_IMAGE_SIZE = MODEL[4]
# make sure all of the files are around...
INIT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[1])
PREDICT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[2])
# Check to see if the files exist
if not os.path.exists(INIT_NET):
print("WARNING: " + INIT_NET + " not found!")
else:
if not os.path.exists(PREDICT_NET):
print("WARNING: " + PREDICT_NET + " not found!")
else:
print("All needed files found!")
##6. (Preprocess the image data)
# Function to crop the center cropX x cropY pixels from the input image (this is a center-crop helper)
def crop_center(img,cropx,cropy):
y,x,c = img.shape
startx = x//2-(cropx//2)
starty = y//2-(cropy//2)
return img[starty:starty+cropy,startx:startx+cropx]
# Function to rescale the input image to the desired height and/or width. This function will preserve
# the aspect ratio of the original image while making the image the correct scale so we can retrieve
# a good center crop. This function is best used with center crop to resize any size input images into
# specific sized images that our model can use. (this is the image-rescaling helper)
def rescale(img, input_height, input_width):
# Get original aspect ratio
aspect = img.shape[1]/float(img.shape[0])
if(aspect>1):
# landscape orientation - wide image
res = int(aspect * input_height)
imgScaled = skimage.transform.resize(img, (input_width, res))
if(aspect<1):
# portrait orientation - tall image
res = int(input_width/aspect)
imgScaled = skimage.transform.resize(img, (res, input_height))
if(aspect == 1):
imgScaled = skimage.transform.resize(img, (input_width, input_height))
return imgScaled
# Load the image as a 32-bit float
# Note: skimage.io.imread returns a HWC ordered RGB image of some size
img = skimage.img_as_float(skimage.io.imread(IMAGE_LOCATION)).astype(np.float32)
print("Original Image Shape: " , img.shape)
# Rescale the image to comply with our desired input size. This will not make the image 227x227
# but it will make either the height or width 227 so we can get the ideal center crop.
img = rescale(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE)
print("Image Shape after rescaling: " , img.shape)
pyplot.figure()
pyplot.imshow(img)
pyplot.title('Rescaled image')
# Crop the center 227x227 pixels of the image so we can feed it to our model
img = crop_center(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE)
print("Image Shape after cropping: " , img.shape)
pyplot.figure()
pyplot.imshow(img)
pyplot.title('Center Cropped')
# switch to CHW (HWC --> CHW) (convert the image to CHW layout)
img = img.swapaxes(1, 2).swapaxes(0, 1)
print("CHW Image Shape: " , img.shape)
pyplot.figure()
for i in range(3):
# For some reason, pyplot subplot follows Matlab's indexing
# convention (starting with 1). Well, we'll just follow it...
pyplot.subplot(1, 3, i+1)
pyplot.imshow(img[i])
pyplot.axis('off')
pyplot.title('RGB channel %d' % (i+1))
# switch to BGR (RGB --> BGR)
img = img[(2, 1, 0), :, :]
# remove mean for better results
img = img * 255 - mean
# add batch size axis which completes the formation of the NCHW shaped input that we want
img = img[np.newaxis, :, :, :].astype(np.float32)
print("NCHW image (ready to be used as input): ", img.shape)  # (the image is now in NCHW layout)
##7. (Use the model to predict class probabilities for the image)
# Read the contents of the input protobufs into local variables
with open(INIT_NET, "rb") as f:
init_net = f.read()
with open(PREDICT_NET, "rb") as f:
predict_net = f.read()
# Initialize the predictor from the input protobufs
p = workspace.Predictor(init_net, predict_net)
# Run the net and return prediction
results = p.run({'data': img})
# Turn it into something we can play with and examine which is in a multi-dimensional array
results = np.asarray(results)
print("results shape: ", results.shape)
# Quick way to get the top-1 prediction result
# Squeeze out the unnecessary axis. This returns a 1-D array of length 1000
preds = np.squeeze(results)
# Get the prediction and the confidence by finding the maximum value and index of maximum value in preds array
curr_pred, curr_conf = max(enumerate(preds), key=operator.itemgetter(1))
print("Prediction: ", curr_pred)
print("Confidence: ", curr_conf)
##8. (Sort the predicted probabilities, take the top five, and report the recognized classes)
# the rest of this is digging through the results
results = np.delete(results, 1)
index = 0
highest = 0
arr = np.empty((0,2), dtype=object)
arr[:,0] = int(10)
arr[:,1:] = float(10)
for i, r in enumerate(results):
# imagenet index begins with 1!
i=i+1
arr = np.append(arr, np.array([[i,r]]), axis=0)
if (r > highest):
highest = r
index = i
# top N results
N = 5
topN = sorted(arr, key=lambda x: x[1], reverse=True)[:N]
print("Raw top {} results: {}".format(N,topN))
# Isolate the indexes of the top-N most likely classes
topN_inds = [int(x[0]) for x in topN]
print("Top {} classes in order: {}".format(N,topN_inds))
# Now we can grab the code list and create a class Look Up Table
response = urllib2.urlopen(codes)
class_LUT = []
for line in response:
code, result = line.partition(":")[::2]
code = code.strip()
result = result.replace("'", "")
if code.isdigit():
class_LUT.append(result.split(",")[0][1:])
# For each of the top-N results, associate the integer result with an actual class
for n in topN:
print("Model predicts '{}' with {}% confidence".format(class_LUT[int(n[0])],float("{0:.2f}".format(n[1]*100))))