OpenCV Camera Calibration and Distance Estimation (Monocular)
Camera Calibration Basics
In the pinhole camera model, a view is formed by projecting points in 3D space onto the image plane through a perspective transformation. The projection formula is:

s * m' = A * [R|t] * M'

or, written out:

    [u]   [fx  0  cx]   [r11 r12 r13 t1]   [X]
s * [v] = [ 0 fy  cy] * [r21 r22 r23 t2] * [Y]
    [1]   [ 0  0   1]   [r31 r32 r33 t3]   [Z]
                                           [1]

Here (X, Y, Z) are the world coordinates of a point, and (u, v) are the coordinates of its projection on the image plane, in pixels. A is called the camera matrix, or the intrinsic parameter matrix. (cx, cy) is the principal point (usually at the image center), and fx, fy are the focal lengths expressed in pixels. So if for some reason an image from the camera is upsampled or downsampled, all of these parameters (fx, fy, cx and cy) are scaled (multiplied or divided) by the same factor. The intrinsic matrix does not depend on the scene viewed; once computed, it can be reused (as long as the focal length stays fixed). The rotation-translation matrix [R|t] is called the extrinsic parameter matrix; it describes the motion of the camera relative to a static scene, or conversely, the rigid motion of an object in front of a fixed camera. That is, [R|t] transforms the coordinates of a point (X, Y, Z) into a coordinate system that is fixed with respect to the camera. The transformation above is equivalent to the following (when z ≠ 0):

[x y z]^T = R * [X Y Z]^T + t
x' = x / z
y' = y / z
u = fx * x' + cx
v = fy * y' + cy
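The chain of transforms just described (world coordinates → camera coordinates → normalized coordinates → pixel coordinates) can be sketched in NumPy. Every numeric value below is an assumed example, not taken from any real calibration:

```python
import numpy as np

# Assumed example intrinsics and pose; the point and all values are made up.
fx, fy, cx, cy = 800.0, 800.0, 320.0, 240.0
A = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])  # camera (intrinsic) matrix
R = np.eye(3)                    # rotation part of [R|t] (identity for simplicity)
t = np.array([0.0, 0.0, 0.0])    # translation part of [R|t]

Pw = np.array([0.1, -0.05, 2.0])  # world point (X, Y, Z)
x, y, z = R @ Pw + t              # camera coordinates
xp, yp = x / z, y / z             # normalized coordinates: x' = x/z, y' = y/z
u = fx * xp + cx                  # u = fx*x' + cx
v = fy * yp + cy                  # v = fy*y' + cy
print(u, v)                       # pixel coordinates of the projection
```

With the identity pose the camera and world frames coincide, so the result depends only on the intrinsics and the depth z.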
Real lenses usually exhibit some distortion: mainly radial distortion, plus a slight tangential distortion. The model above is therefore extended to:

x' = x / z
y' = y / z
x'' = x' * (1 + k1*r^2 + k2*r^4) + 2*p1*x'*y' + p2*(r^2 + 2*x'^2)
y'' = y' * (1 + k1*r^2 + k2*r^4) + p1*(r^2 + 2*y'^2) + 2*p2*x'*y'
u = fx * x'' + cx
v = fy * y'' + cy

where r^2 = x'^2 + y'^2.

k1 and k2 are radial distortion coefficients, and p1 and p2 are tangential distortion coefficients. Higher-order coefficients are not considered in OpenCV. The distortion coefficients do not depend on the scene viewed, so they too are intrinsic parameters, and they are independent of the captured image resolution.
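A minimal sketch of this distortion model applied to normalized image coordinates (the coefficient values passed in the usage line are arbitrary):

```python
def distort(xp, yp, k1, k2, p1, p2):
    """Apply the radial (k1, k2) and tangential (p1, p2) distortion model
    from the text to normalized image coordinates (x', y')."""
    r2 = xp ** 2 + yp ** 2                     # r^2 = x'^2 + y'^2
    radial = 1 + k1 * r2 + k2 * r2 ** 2        # radial factor (1 + k1*r^2 + k2*r^4)
    xpp = xp * radial + 2 * p1 * xp * yp + p2 * (r2 + 2 * xp ** 2)
    ypp = yp * radial + p1 * (r2 + 2 * yp ** 2) + 2 * p2 * xp * yp
    return xpp, ypp

print(distort(0.1, 0.2, 0.0, 0.0, 0.0, 0.0))  # zero coefficients leave the point unchanged
print(distort(0.1, 0.2, 0.5, 0.0, 0.0, 0.0))  # positive k1 pushes the point outward
```

The distorted coordinates (x'', y'') are then mapped to pixels with the same u = fx*x'' + cx, v = fy*y'' + cy as before.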
OpenCV Calibration Function
double cv::calibrateCamera( InputArrayOfArrays objectPoints,
                            InputArrayOfArrays imagePoints,
                            Size imageSize,
                            InputOutputArray cameraMatrix,
                            InputOutputArray distCoeffs,
                            OutputArrayOfArrays rvecs,
                            OutputArrayOfArrays tvecs,
                            OutputArray stdDeviationsIntrinsics,
                            OutputArray stdDeviationsExtrinsics,
                            OutputArray perViewErrors,
                            int flags = 0,
                            TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON) )
Finds the camera intrinsic and extrinsic parameters from several views of a calibration pattern.
Parameters
objectPoints: In the new interface it is a vector of vectors of calibration pattern points in the calibration pattern coordinate space (e.g. std::vector<std::vector<cv::Vec3f>>). The outer vector contains as many elements as the number of pattern views. If the same calibration pattern is shown in each view and it is fully visible, all the vectors will be the same. It is, however, possible to use partially occluded patterns, or even different patterns in different views; then the vectors will differ. The points are 3D, but since they are given in a pattern coordinate system, if the rig is planar it may make sense to place the model on the XY coordinate plane so that the Z-coordinate of each input object point is 0. In the old interface all the vectors of object points from different views are concatenated together.
imagePoints: In the new interface it is a vector of vectors of the projections of calibration pattern points (e.g. std::vector<std::vector<cv::Vec2f>>). imagePoints.size() must be equal to objectPoints.size(), and imagePoints[i].size() must be equal to objectPoints[i].size() for each i. In the old interface all the vectors of image points from different views are concatenated together.
imageSize: Size of the image, used only to initialize the intrinsic camera matrix.
cameraMatrix: Output 3x3 floating-point camera matrix A = [fx 0 cx; 0 fy cy; 0 0 1]. If CV_CALIB_USE_INTRINSIC_GUESS and/or CALIB_FIX_ASPECT_RATIO are specified, some or all of fx, fy, cx, cy must be initialized before calling the function.
distCoeffs: Output vector of distortion coefficients (k1, k2, p1, p2[, k3[, k4, k5, k6[, s1, s2, s3, s4[, τx, τy]]]]) of 4, 5, 8, 12 or 14 elements.
rvecs: Output vector of rotation vectors (see Rodrigues) estimated for each pattern view (e.g. std::vector<cv::Mat>). That is, each k-th rotation vector together with the corresponding k-th translation vector (see the next output parameter description) brings the calibration pattern from the model coordinate space (in which object points are specified) to the world coordinate space, that is, a real position of the calibration pattern in the k-th pattern view (k = 0 .. M-1).
tvecs: Output vector of translation vectors estimated for each pattern view.
stdDeviationsIntrinsics: Output vector of standard deviations estimated for intrinsic parameters. Order of deviation values: (fx, fy, cx, cy, k1, k2, p1, p2, k3, k4, k5, k6, s1, s2, s3, s4, τx, τy). If one of the parameters is not estimated, its deviation is equal to zero.
stdDeviationsExtrinsics: Output vector of standard deviations estimated for extrinsic parameters. Order of deviation values: (R1, T1, …, RM, TM) where M is the number of pattern views and Ri, Ti are concatenated 1x3 vectors.
perViewErrors: Output vector of the RMS re-projection error estimated for each pattern view.
flags: Different flags that may be zero or a combination of values such as CALIB_USE_INTRINSIC_GUESS, CALIB_FIX_PRINCIPAL_POINT, CALIB_FIX_ASPECT_RATIO, CALIB_ZERO_TANGENT_DIST, CALIB_FIX_K1 .. CALIB_FIX_K6, CALIB_RATIONAL_MODEL, CALIB_THIN_PRISM_MODEL and CALIB_TILTED_MODEL.
criteria: Termination criteria for the iterative optimization algorithm.
Returns
The overall RMS re-projection error.
The function estimates the intrinsic camera parameters and extrinsic parameters for each of the views. The algorithm is based on [206] and [17] . The coordinates of 3D object points and their corresponding 2D projections in each view must be specified. That may be achieved by using an object with a known geometry and easily detectable feature points. Such an object is called a calibration rig or calibration pattern, and OpenCV has built-in support for a chessboard as a calibration rig (see findChessboardCorners ). Currently, initialization of intrinsic parameters (when CALIB_USE_INTRINSIC_GUESS is not set) is only implemented for planar calibration patterns (where Z-coordinates of the object points must be all zeros). 3D calibration rigs can also be used as long as initial cameraMatrix is provided.
The algorithm performs the following steps:
- Compute the initial intrinsic parameters (the option only available for planar calibration patterns) or read them from the input parameters. The distortion coefficients are all set to zeros initially unless some of CALIB_FIX_K? are specified.
- Estimate the initial camera pose as if the intrinsic parameters have been already known. This is done using solvePnP .
- Run the global Levenberg-Marquardt optimization algorithm to minimize the reprojection error, that is, the total sum of squared distances between the observed feature points imagePoints and the projected (using the current estimates for camera parameters and the poses) object points objectPoints. See projectPoints for details.
Note
If you use a non-square (=non-NxN) grid and findChessboardCorners for calibration, and calibrateCamera returns bad values (zero distortion coefficients, an image center very far from (w/2-0.5,h/2-0.5), and/or large differences between fx and fy (ratios of 10:1 or more)), then you have probably used patternSize=cvSize(rows,cols) instead of using patternSize=cvSize(cols,rows) in findChessboardCorners .
See also
findChessboardCorners, solvePnP, initCameraMatrix2D, stereoCalibrate, undistort
Reference
https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
Miscellaneous Notes
---
A single camera calibration is only an approximation of the physical camera model; more precisely, it is an approximation of that model within the sampling space covered during calibration. So when the object you are imaging lies in a space different from the one sampled at calibration time, you may never reach sufficiently high accuracy; if you substantially change the distance between the camera and the object, you had better recalibrate the camera.
What camera calibration tells us:
1. The extrinsic matrix: how a point in the real world (world coordinates) is rotated and translated onto another real-world point (camera coordinates).
2. The intrinsic matrix: how that point, building on step 1, passes through the camera lens, is imaged through the pinhole model, and is converted electronically into a pixel.
3. The distortion coefficients: why that pixel does not land where theory says it should, but instead ends up shifted and deformed.
In short, camera calibration seeks the mathematical relationship that converts between objects in the image and in the real world, establishing a quantitative link so that real-world measurements can be extracted from images.
---
Each pose of the calibration board corresponds to one homography. From the unit-orthogonality of the rotation matrix columns, each homography then yields 2 constraints, i.e. 2 equations.
The intrinsic matrix contains 5 degrees of freedom: the principal point (u0, v0), the focal lengths (fx, fy), and the skew between the X and Y axes. Solving for them therefore requires at least 3 homographies, hence at least 3 board poses.
On top of that, the distortion model typically adds 4 parameters, so the Levenberg-Marquardt (LM) method is used for nonlinear optimization. In practice around 20 poses are used; a variety of angles keeps the objective function closer to convex, which helps the iterative optimization of all parameters converge to a more accurate result.
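The two constraints per view can be checked numerically: writing B = K^-T K^-1, the homography H = K·[r1 r2 t] of a planar target satisfies h1ᵀBh2 = 0 and h1ᵀBh1 = h2ᵀBh2, because they reduce to r1ᵀr2 = 0 and |r1| = |r2|. A sketch with assumed example values (including a small skew in K):

```python
import numpy as np

# Assumed intrinsic matrix K (with a small skew term) and an arbitrary pose.
K = np.array([[800.0, 2.0, 320.0],
              [0.0, 790.0, 240.0],
              [0.0, 0.0, 1.0]])

# Rotation built as Rz(0.3) @ Rx(0.2), so its columns are orthonormal.
cz, sz, cx_, sx_ = np.cos(0.3), np.sin(0.3), np.cos(0.2), np.sin(0.2)
R = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]) @ \
    np.array([[1.0, 0.0, 0.0], [0.0, cx_, -sx_], [0.0, sx_, cx_]])
t = np.array([0.1, -0.2, 1.5])

H = K @ np.column_stack([R[:, 0], R[:, 1], t])  # homography of the planar target
B = np.linalg.inv(K).T @ np.linalg.inv(K)       # B = K^-T K^-1
h1, h2 = H[:, 0], H[:, 1]

print(h1 @ B @ h2)                 # constraint 1: ~0
print(h1 @ B @ h1 - h2 @ B @ h2)   # constraint 2: ~0
```

Since each view contributes 2 such equations and B has 6 unknowns (5 after fixing scale), at least 3 views are needed, matching the count above.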
---
Through the lens, an object in 3D space is usually mapped to an inverted, reduced image (a microscope magnifies, of course, but ordinary cameras reduce) that is perceived by the sensor.
Ideally, the optical axis of the lens (the line through the lens center perpendicular to the sensor plane) should pass through the exact center of the image; in practice, mounting tolerances always introduce an offset, and this error is described by the intrinsic parameters.
Ideally, the camera scales the x and y directions by the same factor; in practice, a lens that is not perfectly round, or sensor pixels that are not perfectly packed squares, can make the two scale factors differ. Two intrinsic parameters describe these scale factors: they not only convert lengths measured in pixels into lengths measured in other units (such as meters) in 3D space, but also capture the inconsistency between the x and y scales.
Ideally, the lens maps straight lines in 3D space to straight lines (a projective transformation); in practice, lenses are not that perfect, and straight lines come out curved, so the camera's distortion parameters are needed to describe this deformation.
As for why 20 images: that is just an empirical value; too many is as bad as too few. From a purely statistical view, more might seem better, but in practice too many images can make the optimized parameters worse. The detected chessboard corner coordinates carry errors that can hardly be claimed Gaussian, and the nonlinear iterative optimization used in calibration is not guaranteed to find the global optimum, so more images can increase the chance of the algorithm getting stuck in a local optimum.
---
To summarize: extrinsics map another (camera or world) coordinate system to this camera's coordinate system; intrinsics map this camera's coordinate system to the image coordinate system.
In practice, my personal experience with monocular calibration is: do not tilt the board at too steep an angle; try to have it appear everywhere within the camera's field of view; vary the distance as well, but there is no need to pull it very far away. I don't know whether there is a better procedure; the error was acceptable doing it this way, so this is how I have kept doing it.
The results in Zhang Zhengyou's calibration paper show that accuracy is already good beyond 11 images, and that rotating the board about one axis by about 45 degrees works best.