12. TensorFlow Image Processing
阿新 • Published: 2019-01-24
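I. Image Encoding and Decoding
An image file does not store the numbers of the pixel matrix directly; it stores the result of compressing and encoding them. Restoring an image to a three-dimensional matrix therefore requires a decoding step. In OpenCV, imread and imwrite perform this decoding and encoding, respectively. TensorFlow provides corresponding encoding and decoding functions.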
# Image decoding function
tf.image.decode_image(
contents,
channels=None,
name=None
)
# Parameters
contents: 0-D string. The encoded image bytes.
channels: An optional int. Defaults to 0. Number of color channels for the decoded image.
# Returns
Tensor with type uint8 with shape [height, width, num_channels] for BMP, JPEG, and PNG images and shape [num_frames, height, width, 3] for GIF images.
# Image encoding functions
tf.image.encode_jpeg()
tf.image.encode_png()
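To make the encode/decode distinction concrete, here is a minimal pure-Python sketch. It does not use TensorFlow or a real image codec: it assumes a tiny hypothetical 2x2 RGB image and uses zlib (the DEFLATE compression that PNG also builds on) as a stand-in for the encoder. The helper names encode and decode are made up for this illustration. The only point is that the stored bytes are not the pixel matrix, and that the matrix comes back only after decoding.

```python
import zlib

# Hypothetical 2x2 RGB image as a [height][width][channel] nested list of uint8 values.
pixels = [
    [[255, 0, 0], [255, 0, 0]],
    [[0, 0, 255], [0, 0, 255]],
]
height, width, channels = 2, 2, 3


def encode(image):
    """Flatten the pixel matrix to raw bytes and compress them (stand-in for an encoder)."""
    raw = bytes(v for row in image for px in row for v in px)
    return zlib.compress(raw)


def decode(blob, height, width, channels):
    """Decompress the bytes and rebuild the [height][width][channel] matrix."""
    values = iter(zlib.decompress(blob))
    return [[[next(values) for _ in range(channels)] for _ in range(width)]
            for _ in range(height)]


encoded = encode(pixels)            # opaque bytes, not a matrix
decoded = decode(encoded, height, width, channels)
```

With a real file, the decoder also reads the height, width, and channel count from the file header rather than receiving them as arguments.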
II. Image Resizing
# 1. Scaling
tf.image.resize_images(
images,
size,
method=ResizeMethod.BILINEAR,
align_corners=False
)
# Parameters
images: 4-D Tensor of shape [batch, height, width, channels] or 3-D Tensor of shape [height, width, channels].
size: A 1-D int32 Tensor of 2 elements: new_height, new_width. The new size for the images.
method can be one of:
ResizeMethod.BILINEAR: bilinear interpolation (the default)
ResizeMethod.NEAREST_NEIGHBOR: nearest-neighbor interpolation
ResizeMethod.BICUBIC: bicubic interpolation
ResizeMethod.AREA: area interpolation
# Returns (float)
If images was 4-D, a 4-D float Tensor of shape [batch, new_height, new_width, channels]. If images was 3-D, a 3-D float Tensor of shape [new_height, new_width, channels].
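As an illustration of the simplest of these methods, the following pure-Python sketch resizes a 2-D grid with nearest-neighbor sampling. It assumes the common mapping src = floor(dst * in_size / out_size); TensorFlow's exact pixel-center convention depends on the version and on align_corners, so treat this as a sketch of the idea rather than a reproduction of tf.image.resize_images. The name resize_nearest is hypothetical.

```python
def resize_nearest(image, new_height, new_width):
    """Nearest-neighbor resize of a [height][width] grid (assumed index mapping)."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(new_height):
        # Map the output row back to a source row, clamped to the last row.
        src_y = min(y * height // new_height, height - 1)
        row = []
        for x in range(new_width):
            src_x = min(x * width // new_width, width - 1)
            row.append(image[src_y][src_x])
        out.append(row)
    return out


small = [[1, 2],
         [3, 4]]
big = resize_nearest(small, 4, 4)
# big == [[1, 1, 2, 2],
#         [1, 1, 2, 2],
#         [3, 3, 4, 4],
#         [3, 3, 4, 4]]
```

Each grid element could equally be a list of channel values; the sampling logic does not change.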
# 2. Crop (centered) or zero-pad (evenly on all sides)
tf.image.resize_image_with_crop_or_pad(
image,
target_height,
target_width
)
# Parameters
image: 4-D Tensor of shape [batch, height, width, channels] or 3-D Tensor of shape [height, width, channels].
# Returns
Cropped and/or padded image. If images was 4-D, a 4-D float Tensor of shape [batch, new_height, new_width, channels]. If images was 3-D, a 3-D float Tensor of shape [new_height, new_width, channels]
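The centered crop and the even padding can be sketched in pure Python on a single-channel grid. Each dimension is handled independently: it is cropped around the center if it is too large, or zero-padded on both sides if it is too small. Where the difference is odd, this sketch puts the extra row or column at the bottom or right; that choice, and the helper names, are assumptions of the illustration.

```python
def crop_or_pad_1d(size, target):
    """Return (crop_offset, pad_before, pad_after) for one dimension."""
    if size >= target:
        return (size - target) // 2, 0, 0
    pad_before = (target - size) // 2
    return 0, pad_before, target - size - pad_before


def resize_with_crop_or_pad_sketch(image, target_height, target_width):
    height, width = len(image), len(image[0])
    crop_y, pad_top, pad_bottom = crop_or_pad_1d(height, target_height)
    crop_x, pad_left, pad_right = crop_or_pad_1d(width, target_width)
    # Centered crop (a no-op on any dimension that is being padded instead).
    rows = image[crop_y:crop_y + min(height, target_height)]
    rows = [r[crop_x:crop_x + min(width, target_width)] for r in rows]
    # Zero padding on the left/right, then on the top/bottom.
    rows = [[0] * pad_left + r + [0] * pad_right for r in rows]
    blank = [0] * target_width
    return ([list(blank) for _ in range(pad_top)] + rows
            + [list(blank) for _ in range(pad_bottom)])


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
wide = resize_with_crop_or_pad_sketch(image, 1, 5)   # crop height, pad width
tall = resize_with_crop_or_pad_sketch(image, 5, 1)   # pad height, crop width
# wide == [[0, 4, 5, 6, 0]]
# tall == [[0], [2], [5], [8], [0]]
```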
# 3. Centered crop by fraction
tf.image.central_crop(
image,
central_fraction
)
# 4. Crop regions from the input image and resize them by interpolation
tf.image.crop_and_resize(
image,
boxes,
box_ind,
crop_size,
method='bilinear',
extrapolation_value=0,
name=None
)
# 5. Crop along the given bbox coordinates
tf.image.crop_to_bounding_box(
image,
offset_height,
offset_width,
target_height,
target_width
)
# Parameters
image: 4-D Tensor of shape [batch, height, width, channels] or 3-D Tensor of shape [height, width, channels].
bbox: the top-left corner of the returned image is at offset_height, offset_width in image, and its lower-right corner is at offset_height + target_height, offset_width + target_width.
# Returns
If image was 4-D, a 4-D float Tensor of shape [batch, target_height, target_width, channels] If image was 3-D, a 3-D float Tensor of shape [target_height, target_width, channels]
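In terms of indexing, the bbox described above is just a slice: rows offset_height to offset_height + target_height and columns offset_width to offset_width + target_width. A minimal pure-Python sketch on a single-channel grid (the name crop_box is hypothetical, and no bounds checking is done):

```python
def crop_box(image, offset_height, offset_width, target_height, target_width):
    """Keep the target_height x target_width window whose top-left corner is at the offsets."""
    return [row[offset_width:offset_width + target_width]
            for row in image[offset_height:offset_height + target_height]]


image = [[0, 1, 2, 3],
         [10, 11, 12, 13],
         [20, 21, 22, 23]]
cropped = crop_box(image, 1, 1, 2, 2)
# cropped == [[11, 12],
#             [21, 22]]
```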
# 6. Zero-pad the original image to the specified height (target_height) and width (target_width)
tf.image.pad_to_bounding_box(
image,
offset_height,
offset_width,
target_height,
target_width
)
# How it works
Adds offset_height rows of zeros on top, offset_width columns of zeros on the left, and then pads the image on the bottom and right with zeros until it has dimensions target_height, target_width.
# Parameters
image: 4-D Tensor of shape [batch, height, width, channels] or 3-D Tensor of shape [height, width, channels].
offset_height: Number of rows of zeros to add on top.
offset_width: Number of columns of zeros to add on the left.
target_height: Height of output image.
target_width: Width of output image.
# Returns
If image was 4-D, a 4-D float Tensor of shape [batch, target_height, target_width, channels] If image was 3-D, a 3-D float Tensor of shape [target_height, target_width, channels]
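The padding rule described above can be written out directly. This pure-Python sketch works on a single-channel grid; the name pad_box and the error message are hypothetical, and the amounts added at the bottom and right are whatever is left after the top and left offsets.

```python
def pad_box(image, offset_height, offset_width, target_height, target_width):
    """Zero-pad so the image sits at (offset_height, offset_width) in the target canvas."""
    height, width = len(image), len(image[0])
    pad_bottom = target_height - offset_height - height
    pad_right = target_width - offset_width - width
    if pad_bottom < 0 or pad_right < 0:
        raise ValueError("target size is too small for the image plus offsets")
    rows = [[0] * offset_width + list(row) + [0] * pad_right for row in image]
    blank = [0] * target_width
    return ([list(blank) for _ in range(offset_height)] + rows
            + [list(blank) for _ in range(pad_bottom)])


padded = pad_box([[1, 2], [3, 4]], 1, 2, 4, 5)
# padded == [[0, 0, 0, 0, 0],
#            [0, 0, 1, 2, 0],
#            [0, 0, 3, 4, 0],
#            [0, 0, 0, 0, 0]]
```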
III. Image Flipping and Rotation
# 1. (Random) vertical flip
tf.image.flip_up_down(image)
tf.image.random_flip_up_down(image,seed=None)
# 2. (Random) horizontal flip
tf.image.flip_left_right(image)
tf.image.random_flip_left_right(image,seed=None)
# 3. Flip along the diagonal: swap the first and second dimensions of the image
tf.image.transpose_image(image)
# Parameters
image: 3-D tensor of shape [height, width, channels]
# Returns
A 3-D tensor of the same type and shape as image
# 4. Rotate the image counterclockwise by 90*k degrees
tf.image.rot90(image, k=1)
# Parameters
image: A 3-D tensor of shape [height, width, channels].
k: A scalar integer. The number of times the image is rotated by 90 degrees.
name: A name for this operation (optional).
# Returns
A rotated 3-D tensor of the same type and shape as image.
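The flips, the transpose, and the 90-degree rotation are closely related: one counterclockwise quarter turn is a transpose followed by a vertical flip. The pure-Python sketch below shows this on a single-channel grid. The function names mirror the operations above for readability, but they are local helpers, not the TensorFlow functions. Note that in this sketch height and width swap when k is odd.

```python
def flip_up_down(image):
    return image[::-1]


def flip_left_right(image):
    return [row[::-1] for row in image]


def transpose(image):
    """Swap the first and second dimensions (rows become columns)."""
    return [list(col) for col in zip(*image)]


def rot90(image, k=1):
    """Rotate counterclockwise by 90*k degrees using transpose + vertical flip."""
    for _ in range(k % 4):
        image = flip_up_down(transpose(image))
    return image


image = [[1, 2, 3],
         [4, 5, 6]]
quarter = rot90(image)       # [[3, 6], [2, 5], [1, 4]]
half = rot90(image, k=2)     # [[6, 5, 4], [3, 2, 1]]
```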
# 5. Rotate image(s) by the passed angle(s) in radians
tf.contrib.image.rotate(
images,
angles,
interpolation='NEAREST'
)
# Parameters
images: A tensor of shape (num_images, num_rows, num_columns, num_channels) (NHWC), (num_rows, num_columns, num_channels) (HWC), or (num_rows, num_columns) (HW).
angles: A scalar angle to rotate all images by, or (if images has rank 4) a vector of length num_images, with an angle for each image in the batch.
interpolation: Interpolation mode. Supported values: "NEAREST", "BILINEAR".
# Returns
Image(s) with the same type and shape as images, rotated by the given angle(s). Empty space due to the rotation will be filled with zeros.
IV. Image Color Adjustment
# 1. Adjust the brightness of RGB or grayscale images
# delta is the amount to add to the pixel values, should be in [0,1)
tf.image.adjust_brightness(
image,
delta
)
# 2. Adjust the hue of an RGB image; delta must be in the interval [-1, 1]
tf.image.adjust_hue(
image,
delta,
name=None
)
# 3. Adjust the contrast of RGB or grayscale images
tf.image.adjust_contrast(
images,
contrast_factor
)
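Brightness and contrast adjustment are both simple per-pixel arithmetic. The sketch below works on one channel of float pixel values in pure Python: brightness adds delta to every value, and contrast scales each value's distance from the channel mean, (x - mean) * contrast_factor + mean. The names brighten and scale_contrast are hypothetical, and the sketch does no clipping, so results can leave the [0, 1] range.

```python
def brighten(pixels, delta):
    """Add delta to every pixel value."""
    return [v + delta for v in pixels]


def scale_contrast(pixels, contrast_factor):
    """Scale each value's distance from the channel mean by contrast_factor."""
    mean = sum(pixels) / len(pixels)
    return [(v - mean) * contrast_factor + mean for v in pixels]


channel = [0.2, 0.4, 0.6, 0.8]            # one channel, floats in [0, 1]
brighter = brighten(channel, 0.1)          # about [0.3, 0.5, 0.7, 0.9]
stretched = scale_contrast(channel, 2.0)   # about [-0.1, 0.3, 0.7, 1.1]
```

A contrast_factor of 1 leaves the channel unchanged, and a factor of 0 collapses every pixel to the mean.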
# 4. Adjust the saturation of an RGB image
tf.image.adjust_saturation(
image,
saturation_factor,
name=None
)
# 5. Apply gamma correction to the input image
tf.image.adjust_gamma(
image,
gamma=1,
gain=1
)
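Gamma correction applies Out = gain * In**gamma to every pixel. A minimal pure-Python sketch on float values in [0, 1] follows; the name gamma_correct is hypothetical, and the negative-gamma check is an assumption of this sketch.

```python
def gamma_correct(pixels, gamma=1.0, gain=1.0):
    """Apply Out = gain * In**gamma to each pixel value."""
    if gamma < 0:
        raise ValueError("gamma should be non-negative")
    return [gain * (v ** gamma) for v in pixels]


values = [0.0, 0.25, 1.0]
lighter = gamma_correct(values, gamma=0.5)   # [0.0, 0.5, 1.0]
darker = gamma_correct(values, gamma=2.0)    # [0.0, 0.0625, 1.0]
```

For values in [0, 1], gamma below 1 brightens the mid-tones and gamma above 1 darkens them, while 0 and 1 stay fixed when gain is 1.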
# 6. Randomly adjust the image brightness by a delta in [-max_delta, max_delta]; a delta of 0 leaves the image unchanged
tf.image.random_brightness(
image,
max_delta,
seed=None
)
# 7. Randomly adjust the image hue by a delta in [-max_delta, max_delta]
# max_delta must be in the interval [0, 0.5]
tf.image.random_hue(
image,
max_delta,
seed=None
)
# 8. Randomly adjust the image contrast by a factor in [lower, upper]
tf.image.random_contrast(
image,
lower,
upper,
seed=None
)
# 9. Randomly adjust the image saturation by a factor in [lower, upper]
tf.image.random_saturation(
image,
lower,
upper,
seed=None
)
# 10. Color space conversion
tf.image.rgb_to_grayscale()
tf.image.grayscale_to_rgb()
tf.image.hsv_to_rgb()
tf.image.rgb_to_hsv() # the image must first be converted to a floating-point (float32) image
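The float requirement is easier to see with a concrete conversion. This sketch uses Python's standard-library colorsys module as a stand-in for the RGB/HSV conversions; like them, it expects channel values as floats in [0, 1], so a uint8 pixel has to be rescaled first. The helper uint8_to_unit is hypothetical, and TensorFlow's numerical results may differ in detail.

```python
import colorsys


def uint8_to_unit(rgb):
    """Rescale a uint8 (0-255) RGB triple to floats in [0, 1]."""
    return tuple(v / 255.0 for v in rgb)


red = uint8_to_unit((255, 0, 0))          # (1.0, 0.0, 0.0)
h, s, v = colorsys.rgb_to_hsv(*red)       # hue 0.0, saturation 1.0, value 1.0
back = colorsys.hsv_to_rgb(h, s, v)       # (1.0, 0.0, 0.0)
```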
# 11. Image dtype conversion, e.g. uint8 --> float32 divides by 255 to map values into [0, 1]
tf.image.convert_image_dtype(
image,
dtype,
saturate=False,
name=None
)
# 12. Image standardization (zero mean, unit variance)
tf.image.per_image_standardization(image)
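Standardization computes (x - mean) / adjusted_stddev over all elements of the image, where adjusted_stddev = max(stddev, 1 / sqrt(N)) and N is the number of elements; the lower bound avoids division by zero on a uniform image. A pure-Python sketch on a flat list of pixel values (the name standardize is hypothetical):

```python
import math


def standardize(pixels):
    """Shift to zero mean and scale to unit variance, guarding against stddev == 0."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((v - mean) ** 2 for v in pixels) / n
    adjusted_stddev = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(v - mean) / adjusted_stddev for v in pixels]


out = standardize([1.0, 2.0, 3.0, 4.0])     # mean ~0, variance ~1
flat = standardize([5.0, 5.0, 5.0, 5.0])    # uniform image: all zeros, no division by zero
```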
V. Working with Bounding Boxes
# 1. Draw bounding boxes on a batch of images
tf.image.draw_bounding_boxes(
images,
boxes,
name=None
)
# Parameters
images: A Tensor. Must be one of the following types: float32, half. 4-D with shape [batch, height, width, depth]. A batch of images.
boxes: A Tensor of type float32. 3-D with shape [batch, num_bounding_boxes, 4] containing bounding boxes.
# Returns
A Tensor. Has the same type as images. 4-D with the same shape as images. The batch of input images with bounding boxes drawn on the images.
# Notes on dtype and dimensions
images must be floating point, so first convert the image matrix to a float type and add a batch dimension of size 1, e.g.:
batched = tf.expand_dims(
tf.image.convert_image_dtype(images, tf.float32),
axis=0
)
# Notes on coordinate order and relative coordinates
The coordinates of each bounding box in boxes are encoded as [y_min, x_min, y_max, x_max]. The bounding box coordinates are floats in [0.0, 1.0] relative to the width and height of the underlying image.
For example, if an image is 100 x 200 pixels (height x width) and the bounding box is [0.1, 0.2, 0.5, 0.9], the box runs from its top-left corner at (y, x) = (10, 40) to its bottom-right corner at (50, 180).
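The conversion from relative to pixel coordinates is a multiplication by the image height (for y values) or width (for x values). The sketch below reproduces the 100 x 200 example; the name box_to_pixels and the rounding to the nearest integer are assumptions of this illustration.

```python
def box_to_pixels(box, height, width):
    """Convert [y_min, x_min, y_max, x_max] in [0, 1] to pixel coordinates."""
    y_min, x_min, y_max, x_max = box
    return (round(y_min * height), round(x_min * width),
            round(y_max * height), round(x_max * width))


pixel_box = box_to_pixels([0.1, 0.2, 0.5, 0.9], height=100, width=200)
# pixel_box == (10, 40, 50, 180), i.e. (y_min, x_min, y_max, x_max)
```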
# 2. Non-maximum suppression
tf.image.non_max_suppression(
boxes,
scores,
max_output_size,
iou_threshold=0.5,
name=None
)
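Non-maximum suppression is usually described as a greedy procedure: visit boxes in order of decreasing score, and keep a box only if its IoU (intersection over union) with every box already kept does not exceed iou_threshold. The pure-Python sketch below follows that textbook description; it is not TensorFlow's implementation, and the names iou and greedy_nms are hypothetical. Boxes use the [y_min, x_min, y_max, x_max] order, and the result is a list of selected indices.

```python
def iou(a, b):
    """Intersection over union of two [y_min, x_min, y_max, x_max] boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, max_output_size, iou_threshold=0.5):
    """Return indices of the boxes kept, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    selected = []
    for i in order:
        if len(selected) >= max_output_size:
            break
        # Keep box i only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in selected):
            selected.append(i)
    return selected


boxes = [[0, 0, 10, 10],    # best score
         [0, 1, 10, 11],    # overlaps box 0 heavily (IoU = 90/110)
         [0, 20, 10, 30]]   # far away from both
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, max_output_size=10)   # [0, 2]
```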
# 3、Generate a single randomly distorted bounding box for an image
tf.image.sample_distorted_bounding_box(
image_size,
bounding_boxes,
seed=None,
seed2=None,
min_object_covered=None,
aspect_ratio_range=None,
area_range=None,
max_attempts=None,
use_image_if_no_bounding_boxes=None,
name=None
)