1. 程式人生 > 其它 >DCT變換以及定點化實現

DCT變換以及定點化實現

技術標籤:編解碼

Mpeg1/2標準的DCT變換與H264的整數DCT不太一樣,會有小數運算。硬體實現時會做定點化處理,因此也會產生誤差,誤差主要體現在某些數值小數部分在0.5左右。比如小數運算時數值為4.4999,四捨五入最終畫素值為4;如果定點化結果為4.50001,實際上與小數運算差異不大,但最終畫素值卻相差1。

1.DCT簡介

DCT和DCT反變換可用如下公式表示:
dct變換公式
其中,Xij是影象塊 X 中第 i 行第 j 列影象或殘差值,Ymn 是變換結果矩陣 Y 相應頻率點上的 DCT 系 數。可以用矩陣表示
Y = A X A T Y=AXA^T Y=AXAT
X = A T Y A X=A^TYA

X=ATYA

Mpeg1/2標準的DCT是8x8大小的。
a = 1 / 8 a=\sqrt{1/8} a=1/8 ,
b = 1 2 cos ⁡ ( π 8 ) b=\frac{1}{2}\cos{\left(\frac{\pi}{8}\right)} b=21cos(8π),
c = 1 2 cos ⁡ ( 3 π 8 ) c=\frac{1}{2}\cos{\left(\frac{3\pi}{8}\right)} c=21cos(83π) ,
d = 1 2 cos ⁡ ( π 16 ) d=\frac{1}{2}\cos{\left(\frac{\pi}{16}\right)}

d=21cos(16π),
e = 1 2 cos ⁡ ( 3 16 π ) e=\frac{1}{2}\cos{\left(\frac{3}{16}\pi\right)} e=21cos(163π),
f = 1 2 cos ⁡ 5 16 π f=\frac{1}{2}\cos{\frac{5}{16}\pi} f=21cos165π,
g = 1 2 cos ⁡ 7 16 π g=\frac{1}{2}\cos{\frac{7}{16}\pi} g=21cos167π,
則 矩陣A可以如下表示
( a a a a a a a a d e f g − g − f − e − d b c − c − b − b − c c b e − g − d − f f d g − e a − a − a a a − a − a a f − d g e − e − g d − f c − b b − c − c b − b c g − f e − d d − e f − g ) \begin{pmatrix} a & a & a & a& a& a& a& a \\ d & e&f &g&-g&-f&-e&-d \\ b&c&-c&-b&-b&-c&c&b \\ e&-g&-d&-f&f&d&g&-e \\ a& -a&-a&a&a&-a&-a&a \\ f&-d&g&e&-e&-g&d&-f \\ c&-b&b&-c&-c&b&-b&c\\ g&-f&e&-d&d&-e&f&-g \end{pmatrix}
adbeafcgaecgadbfafcdagbeagbfaecdagbfaecdafcdagbeaecgadbfadbeafcg

矩陣運算可以使用蝶形變換加速,以8x8反變換為例,分析 A T Y A^TY ATY的第一列,
其中間結果用acc0-acc7表示,如下
( a c c 0 a c c 1 a c c 2 a c c 3 ) = ( a b a c a c − a − b a − c − a b a − b a − c ) ( x 0 x 2 x 4 x 6 ) \begin{pmatrix} acc0 \\ acc1 \\ acc 2 \\ acc3 \\ \end{pmatrix} =\begin{pmatrix} a&b& a& c\\ a&c&-a&-b \\ a&-c&-a&b\\ a&-b&a&-c \end{pmatrix}\begin{pmatrix} x0\\x2 \\ x4\\ x6 \end{pmatrix} acc0acc1acc2acc3=aaaabccbaaaacbbcx0x2x4x6

( a c c 4 a c c 5 a c c 6 a c c 7 ) = ( d e f g e − g − d f f − d g e g − f e − d ) ( x 1 x 3 x 5 x 7 ) \begin{pmatrix} acc4 \\ acc5 \\ acc 6 \\ acc7 \\ \end{pmatrix} =\begin{pmatrix} d&e& f& g\\ e&-g&-d&f \\ f&-d&g&e\\ g&-f&e&-d \end{pmatrix}\begin{pmatrix} x1\\x3 \\ x5\\ x7 \end{pmatrix} acc4acc5acc6acc7=defgegdffdgegfedx1x3x5x7
最終結果為
y 0 = a c c 1 + a c c 2 y0 = acc1+acc2 y0=acc1+acc2
y 0 = a c c 0 + a c c 4 y0 = acc0+acc4 y0=acc0+acc4
y 1 = a c c 1 + a c c 5 y1 = acc1+acc5 y1=acc1+acc5
y 2 = a c c 2 + a c c 6 y2 = acc2+acc6 y2=acc2+acc6
y 3 = a c c 3 + a c c 7 y3 = acc3+acc7 y3=acc3+acc7
y 4 = a c c 3 − a c c 7 y4 = acc3-acc7 y4=acc3acc7
y 5 = a c c 2 − a c c 6 y5 = acc2-acc6 y5=acc2acc6
y 6 = a c c 1 − a c c 5 y6= acc1-acc5 y6=acc1acc5
y 7 = a c c 0 − a c c 4 y7 = acc0-acc4 y7=acc0acc4
DCT反變換的幾種python實現

  • 按照公式直接計算
def normal_dct(block):
    A = np.zeros((8, 8))  # 生成0矩陣
    shape = src.shape[1]  # 獲取維數
    for i in range(8):
        for j in range(8):
            if (i == 0):
                x = sqrt(1 / shape)
            else:
                x = sqrt(2 / shape)
            A[i][j] = x * cos(pi * (j + 0.5) * i / shape)  
    return np.dot(np.dot(A.T, src), A)
  • 使用scipy的dct函式
from scipy.fftpack import dct, idct
def idct2(block):
    return idct(idct(block.T, norm = 'ortho').T, norm = 'ortho')

2.定點化實現

定點主要是把小數轉成整型資料範圍,通過整形加減以及移位操作替換小數運算。DCT運算過程中矩陣 A A A中的引數都是小數,需要轉為整數進行運算。
小數可以用2的負數次冪相加來表示,以 a = 1 / 8 a=\sqrt{1/8} a=1/8 為例,可以表示為 2 − 2 + 2 − 4 + 2 − 5 + 2 − 7 + 2 − 9 2^{-2} + 2^{-4} + 2^{-5} + 2^{-7} + 2^{-9} 22+24+25+27+29。實現過程中以16個元素的陣列來表示,每個陣列元素所在陣列下標對應2的冪次,則a的表項為a_table = [0,0,1,0,1,1,0,1,0,1,0,0,0,0,0,0]。

# a = sqrt(1/8) = 0.35355 = 2^-2 + 2^-4 + 2^-5 + 2^-7 + 2^-9
a_table = [0,0,1,0,1,1,0,1,0,1,0,0,0,0,0,0]
# b = (1/2*cos(PI/8)) = 0.46194 = 2^-1 - 2^-5 - 2^-7 + 2^-10
b_table = [0,1,0,0,0,-1,0,-1,0,0,1,0,0,0,0,0]
# c = (1/2*cos(PI*3/8)) = 0.19134 = 2^-3 + 2^-4 + 2^-8 - 2^-14
c_table = [0,0,0,1,1,0,0,0,1,0,0,0,0,0,-1,0]
# d = 1/2*cos(PI/16) = 0.49039 = 2^-1 - 2^-7 - 2^-9 + 2^-13
d_table = [0,1,0,0,0,0,0,-1,0,-1,0,0,0,1,0,0]
# e = 1/2*cos(PI*3/16) = 0.41573 = 2^-2 + 2^-3 + 2^-5 + 2^-7 + 2^-9 - 2^-12
e_table = [0,0,1,1,0,1,0,1,0,1,0,0,-1,0,0,0]
# f = 1/2*cos(PI*5/16) = 0.27779 = 2^-2 + 2^-5 - 2^-8 + 2^-11
f_table = [0,0,1,0,0,1,0,0,-1,0,0,1,0,0,0,0]
# g = 1/2*cos(PI*7/16) = 0.09755 = 2^-4 + 2^-5 + 2^-8 - 2^-13
g_table = [0,0,0,0,1,1,0,0,1,0,0,0,0,-1,0,0]
# 定點化IDCT
def fixpoint_idct(block):
    input = block.reshape(64, 1)
    idct_data1 = [0 for i in range(64)]
    idct_data2 = [0 for i in range(64)]
    tmp_data1 = idct_data1
    tmp_data2 = idct_data2

    for i in range(64):
        tmp_data1[i] = input[i] << 10
    #兩次蝶形變換
    for k in range(2):
        for hor in range(8):
            a_x0 = b_x2 = a_x4 = c_x6 = 0
            c_x2 = b_x6 = 0
            d_x1 = e_x3 = f_x5 = g_x7 = 0
            e_x1 = g_x3 = d_x5 = f_x7 = 0
            f_x1 = d_x3 = g_x5 = e_x7 = 0
            g_x1 = f_x3 = e_x5 = d_x7 = 0
            i = hor << 3

            x0 = tmp_data1[i]
            x1 = tmp_data1[i+1]
            x2 = tmp_data1[i+2]
            x3 = tmp_data1[i+3]
            x4 = tmp_data1[i+4]
            x5 = tmp_data1[i+5]
            x6 = tmp_data1[i+6]
            x7 = tmp_data1[i+7]
			#遍歷長度為16的查詢表
            for j in range(16):
                a_x0 += (x0 >> j) * a_table[j]
                b_x2 += (x2 >> j) * b_table[j]
                a_x4 += (x4 >> j) * a_table[j]
                c_x6 += (x6 >> j) * c_table[j]
                c_x2 += (x2 >> j) * c_table[j]
                b_x6 += (x6 >> j) * b_table[j]

                d_x1 += (x1 >> j) * d_table[j]
                e_x1 += (x1 >> j) * e_table[j]
                f_x1 += (x1 >> j) * f_table[j]
                g_x1 += (x1 >> j) * g_table[j]

                d_x3 += (x3 >> j) * d_table[j]
                e_x3 += (x3 >> j) * e_table[j]
                f_x3 += (x3 >> j) * f_table[j]
                g_x3 += (x3 >> j) * g_table[j]
                d_x5 += (x5 >> j) * d_table[j]
                e_x5 += (x5 >> j) * e_table[j]
                f_x5 += (x5 >> j) * f_table[j]
                g_x5 += (x5 >> j) * g_table[j]
                d_x7 += (x7 >> j) * d_table[j]
                e_x7 += (x7 >> j) * e_table[j]
                f_x7 += (x7 >> j) * f_table[j]
                g_x7 += (x7 >> j) * g_table[j]

            acc0 = a_x0 + b_x2 + a_x4 + c_x6
            acc1 = a_x0 + c_x2 - a_x4 - b_x6
            acc2 = a_x0 - c_x2 - a_x4 + b_x6
            acc3 = a_x0 - b_x2 + a_x4 - c_x6
            acc4 = d_x1 + e_x3 + f_x5 + g_x7
            acc5 = e_x1 - g_x3 - d_x5 + f_x7
            acc6 = f_x1 - d_x3 + g_x5 + e_x7
            acc7 = g_x1 - f_x3 + e_x5 - d_x7

            tmp_data2[hor] = acc0 + acc4
            tmp_data2[(1<<3)+hor] = acc1 + acc5
            tmp_data2[(2<<3)+hor] = acc2 + acc6
            tmp_data2[(3<<3)+hor] = acc3 + acc7
            tmp_data2[(4<<3)+hor] = acc3 - acc7
            tmp_data2[(5<<3)+hor] = acc2 - acc6
            tmp_data2[(6<<3)+hor] = acc1 - acc5
            tmp_data2[(7<<3)+hor] = acc0 - acc4

        if k == 0:
            tmp_data1 = idct_data2
            tmp_data2 = idct_data1

    for i in range(64):
        tmp_data2[i] = (tmp_data2[i] + 512) >> 10
        if tmp_data2[i] > 255:
            tmp_data2[i] = 255
        if tmp_data2[i] < -256:
            tmp_data2[i] = -256

    return np.asarray(tmp_data2).reshape(8, 8)