A First Look at the KITTI Dataset
KITTI Overview
The KITTI dataset was created jointly by the Karlsruhe Institute of Technology in Germany and the Toyota Technological Institute at Chicago, and is currently one of the largest computer vision benchmark datasets for autonomous driving scenarios. It is used to evaluate the in-vehicle performance of computer vision techniques such as stereo matching, optical flow, visual odometry, 3D object detection, and 3D tracking. KITTI contains real image data collected in urban, rural, and highway scenes, with up to 15 cars and 30 pedestrians per image, as well as various degrees of occlusion and truncation. The full dataset consists of 389 stereo and optical flow image pairs, 39.2 km of visual odometry sequences, and images with more than 200k 3D-annotated objects, sampled and synchronized at 10 Hz.
Data Collection Platform
The KITTI data collection platform includes 2 grayscale cameras, 2 color cameras, a Velodyne 3D laser scanner, 4 optical lenses, and a GPS navigation system.
The coordinate system and relative position of each device can be seen in the figure above. For the principles of transforming between these coordinate systems, see the linked reference. In practice, the data KITTI provides already includes calibration files for all three sensors, so no manual derivation is required.
LiDAR Data
First, download the raw data development kit from the official KITTI website. Its readme file documents everything you need to know in detail: the data collection equipment, the data format of each sensor, the labels, and so on.
What form does the LiDAR data take? The laser striking object surfaces produces a large number of points; each point in KITTI has four dimensions: x, y, z, and reflectance (reflection intensity). The Velodyne 3D laser scanner produces this point cloud data, which is stored in .bin (binary) files.
Velodyne 3D laser scan data
===========================
The velodyne point clouds are stored in the folder 'velodyne_points'. To
save space, all scans have been stored as Nx4 float matrix into a binary
file using the following code:
stream = fopen (dst_file.c_str(),"wb");
fwrite(data,sizeof(float),4*num,stream);
fclose(stream);
Here, data contains 4*num values, where the first 3 values correspond to
x,y and z, and the last value is the reflectance information. All scans
are stored row-aligned, meaning that the first 4 values correspond to the
first measurement. Since each scan might potentially have a different
number of points, this must be determined from the file size when reading
the file, where 1e6 is a good enough upper bound on the number of values:
// allocate 4 MB buffer (only ~130*4*4 KB are needed)
int32_t num = 1000000;
float *data = (float*)malloc(num*sizeof(float));
// pointers
float *px = data+0;
float *py = data+1;
float *pz = data+2;
float *pr = data+3;
// load point cloud
FILE *stream;
stream = fopen (currFilenameBinary.c_str(),"rb");
num = fread(data,sizeof(float),num,stream)/4;
for (int32_t i=0; i<num; i++) {
point_cloud.points.push_back(tPoint(*px,*py,*pz,*pr));
px+=4; py+=4; pz+=4; pr+=4;
}
fclose(stream);
x,y and z are stored in metric (m) Velodyne coordinates.
IMPORTANT NOTE: Note that the velodyne scanner takes depth measurements
continuously while rotating around its vertical axis (in contrast to the cameras,
which are triggered at a certain point in time). This means that when computing
point clouds you have to 'untwist' the points linearly with respect to the
velodyne scanner location at the beginning and the end of the 360° sweep.
The timestamps for the beginning and the end of the sweeps can be found in
the timestamps file. The velodyne rotates in counter-clockwise direction.
Of course this 'untwisting' only works for non-dynamic environments.
The relationship between the camera triggers and the velodyne is the following:
We trigger the cameras when the velodyne is looking exactly forward (into the
direction of the cameras).
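The 'untwisting' the readme describes can be sketched as a per-point correction: each point was measured at some fraction s ∈ [0, 1] of the sweep (derivable from its azimuth, since the scanner rotates counter-clockwise at a known rate), and is rotated back by the motion accumulated up to that time. The sketch below is a deliberate simplification, assuming the only ego-motion during the sweep is a yaw of `delta_yaw` radians; real untwisting interpolates the full scanner pose between the sweep's start and end timestamps, and only holds for static scenes:

```cpp
#include <cmath>

struct Point { double x, y, z; };

// Simplified linear untwist: rotate a point about the vertical axis by
// the (negated) yaw the scanner had accumulated when the point was
// measured. s is the point's fraction of the sweep in [0, 1];
// delta_yaw is the total yaw between sweep start and end. The sign
// convention here is one plausible choice, not taken from the devkit.
Point untwist(Point p, double s, double delta_yaw) {
  double a = -s * delta_yaw;  // correction angle for this point
  Point q;
  q.x = std::cos(a) * p.x - std::sin(a) * p.y;
  q.y = std::sin(a) * p.x + std::cos(a) * p.y;
  q.z = p.z;  // vertical axis is unchanged by a yaw correction
  return q;
}
```

A point measured at the very start of the sweep (s = 0) is left untouched, and the correction grows linearly toward the end of the sweep, which is exactly the "linear" interpolation the readme refers to.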
The official LiDAR data is thus an N×4 float matrix. The matlab folder in the raw data development kit is the official MATLAB interface; its main purpose is to fuse the LiDAR data with the camera data and project the points onto the images (see the linked walkthrough of the MATLAB interface). The point cloud data can ultimately be saved in PCD format and then processed with PCL.