PyTorch Object Detection: Image Preprocessing
阿新 • Published: 2018-11-15
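Faster R-CNN and RetinaNet both preprocess image data before feeding it to the network. The steps are largely the same as those described in the referenced blog post.
In fact, a third step can also be applied: padding the image width and height up to an integer multiple of 32, as RetinaNet does. Below is a simple PyTorch data preprocessing module: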
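import numpy as np
import skimage.transform
import torch


class Resizer:
    """Resize the shorter side to targetSize (capped by maxSize), then zero-pad to a multiple of pad_N."""

    def __call__(self, sample, targetSize=608, maxSize=1024, pad_N=32):
        # 'img' is an H x W x C array; 'ann' is assumed to be a float array with box coordinates in its first four columns.
        image, anns = sample['img'], sample['ann']
        rows, cols = image.shape[:2]

        # Scale the shorter side to targetSize, unless the longer side would then exceed maxSize.
        smaller_size, larger_size = min(rows, cols), max(rows, cols)
        scale = targetSize / smaller_size
        if larger_size * scale > maxSize:
            scale = maxSize / larger_size

        image = skimage.transform.resize(
            image, (int(round(rows * scale)), int(round(cols * scale))), mode='constant')
        rows, cols, cns = image.shape[:3]

        # Zero-pad on the bottom and right up to the next multiple of pad_N (no padding if already aligned).
        pad_w, pad_h = (-cols) % pad_N, (-rows) % pad_N
        new_image = np.zeros((rows + pad_h, cols + pad_w, cns), dtype=np.float32)
        new_image[:rows, :cols, :] = image.astype(np.float32)

        # Box coordinates scale with the image (modifies anns in place).
        anns[:, :4] *= scale
        return {'img': torch.from_numpy(new_image), 'ann': torch.from_numpy(anns), 'scale': scale}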
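
To make the arithmetic easier to follow, here is a minimal sketch that computes only the scale factor and the resulting shapes, without touching any pixel data. The function name `resized_and_padded_shape` is hypothetical and not part of the module above. It also assumes that a side which is already a multiple of `pad_n` gets no extra padding, which is an assumption about the intended behavior rather than something the post states.

```python
def resized_and_padded_shape(rows, cols, target_size=608, max_size=1024, pad_n=32):
    """Return (scale, resized_shape, padded_shape) for an image of size rows x cols.

    Hypothetical helper for illustration only: it mirrors the resize-then-pad
    arithmetic, using the same default values as the module's parameters.
    """
    smaller, larger = min(rows, cols), max(rows, cols)

    # Scale so that the shorter side becomes target_size ...
    scale = target_size / smaller
    # ... unless that would push the longer side past max_size.
    if larger * scale > max_size:
        scale = max_size / larger

    new_rows, new_cols = int(round(rows * scale)), int(round(cols * scale))

    # Padding needed to reach the next multiple of pad_n.
    # Assumption: 0 when the side is already a multiple of pad_n.
    pad_h, pad_w = (-new_rows) % pad_n, (-new_cols) % pad_n

    return scale, (new_rows, new_cols), (new_rows + pad_h, new_cols + pad_w)


if __name__ == "__main__":
    # A 480 x 640 image: the shorter side is scaled to 608.
    print(resized_and_padded_shape(480, 640))
    # A very wide 500 x 2000 image: the longer side is capped at 1024.
    print(resized_and_padded_shape(500, 2000))
```

For a 480 x 640 input the resized shape is (608, 811) and the padded shape is (608, 832). For a 500 x 2000 input the cap on the longer side applies, so the scale is 0.512 and both the resized and padded shapes are (256, 1024).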