HEVC Core Coding Techniques, Part 3: Interpicture Prediction

Overview of the High Efficiency Video Coding (HEVC) Standard, Part 4

H. Interpicture Prediction

1) PB Partitioning (Prediction Block Partitioning)

Compared to intrapicture-predicted CBs, HEVC supports more PB
partition shapes for interpicture-predicted CBs. The partitioning
modes of PART_2N×2N, PART_2N×N, and PART_N×2N indicate the cases
when the CB is not split, split into two equal-size PBs horizontally,
and split into two equal-size PBs vertically, respectively.
PART_N×N specifies that the CB is split into four equal-size PBs,
but this mode is only supported when the CB size is equal to the
smallest allowed CB size. In addition, there are four partitioning
types that support splitting the CB into two PBs having different
sizes: PART_2N×nU, PART_2N×nD, PART_nL×2N, and PART_nR×2N. These
types are known as asymmetric motion partitions.
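In short: compared with intrapicture-predicted CBs, HEVC offers more PB partition shapes for interpicture-predicted CBs.
The four symmetric modes partition the CB as follows:
PART_2N×2N,   the CB is not split;
PART_2N×N,    the CB is split horizontally into two equal-size PBs;
PART_N×2N,    the CB is split vertically into two equal-size PBs;
PART_N×N,     the CB is split into four equal-size PBs,
               but this mode is allowed only when the CB size equals the smallest allowed CB size.

Four further partitioning types split the CB into two PBs of different sizes:
PART_2N×nU,
PART_2N×nD,
PART_nL×2N,
PART_nR×2N.
These types are called asymmetric motion partitions.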
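
The eight partitioning modes can be tabulated in a few lines of Python. This is only a sketch: the function name `pb_sizes` and the ASCII mode strings are made up here, and the 1/4 : 3/4 split used for the asymmetric modes is an assumption that the excerpt above does not state (it is how asymmetric motion partitions are usually described).

```python
def pb_sizes(part_mode, cb_size):
    """Return the (width, height) of each PB for a square CB of cb_size luma samples."""
    n = cb_size // 2   # N, so that the CB is 2N x 2N
    q = cb_size // 4   # assumed short side of an asymmetric partition (1/4 of the CB)
    table = {
        "PART_2Nx2N": [(cb_size, cb_size)],                  # not split
        "PART_2NxN":  [(cb_size, n)] * 2,                    # two PBs stacked vertically
        "PART_Nx2N":  [(n, cb_size)] * 2,                    # two PBs side by side
        "PART_NxN":   [(n, n)] * 4,                          # smallest CB size only
        "PART_2NxnU": [(cb_size, q), (cb_size, cb_size - q)],
        "PART_2NxnD": [(cb_size, cb_size - q), (cb_size, q)],
        "PART_nLx2N": [(q, cb_size), (cb_size - q, cb_size)],
        "PART_nRx2N": [(cb_size - q, cb_size), (q, cb_size)],
    }
    return table[part_mode]


if __name__ == "__main__":
    for mode in ("PART_2NxN", "PART_2NxnU", "PART_nRx2N"):
        print(mode, pb_sizes(mode, 32))
```

Whatever the mode, the PB areas add up to the CB area, which is a quick sanity check on the table.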


Fig. 7. Integer and fractional sample positions for luma interpolation.

2) Fractional Sample Interpolation

The samples of the PB for an interpicture-predicted CB are obtained from those of
a corresponding block region in the reference picture identified
by a reference picture index, which is at a position displaced
by the horizontal and vertical components of the motion vector.
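In short: the PB samples of an interpicture-predicted CB come from the corresponding block region of the reference picture selected by the reference picture index,
displaced by the horizontal and vertical components of the motion vector.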

Except for the case when the motion vector has an integer
value, fractional sample interpolation is used to generate the
prediction samples for noninteger sampling positions. As in
H.264/MPEG-4 AVC, HEVC supports motion vectors with
units of one quarter of the distance between luma samples.
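In short: unless the MV has an integer value, fractional sample interpolation generates the prediction samples at the noninteger positions.
Like H.264/MPEG-4 AVC, HEVC supports MVs in units of one quarter of a luma sample.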

For chroma samples, the motion vector accuracy is determined
according to the chroma sampling format, which for 4:2:0
sampling results in units of one eighth of the distance between
chroma samples.
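In short: for chroma samples, the MV accuracy depends on the chroma sampling format;
for 4:2:0 sampling, the MV is in units of one eighth of a chroma sample.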
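
To make the units concrete, the sketch below splits one MV component into an integer sample offset and a fractional phase. The helper names are invented for this illustration, and it assumes the MV component is stored as a signed integer in quarter luma sample units, so that the same number read in chroma terms for 4:2:0 is in eighth chroma sample units.

```python
def split_luma_mv(mv):
    """Split a quarter-sample MV component into (integer offset, phase in 1/4 units)."""
    return mv >> 2, mv & 3


def split_chroma_mv_420(mv):
    """Read the same MV component in 1/8 chroma-sample units (4:2:0 assumption)."""
    return mv >> 3, mv & 7


if __name__ == "__main__":
    for mv in (13, -5):
        print(mv, split_luma_mv(mv), split_chroma_mv_420(mv))
```

Python's `>>` floors toward negative infinity, so the phase is always non-negative: `-5` becomes offset `-2` plus phase `3/4`, which is `-1.25` luma samples.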

The fractional sample interpolation for luma samples in
HEVC uses separable application of an eight-tap filter for the
half-sample positions and a seven-tap filter for the quarter-sample
positions. This is in contrast to the process used in
H.264/MPEG-4 AVC, which applies a two-stage interpolation
process by first generating the values of one or two
neighboring samples at half-sample positions using six-tap
filtering, rounding the intermediate results, and then averaging
two values at integer or half-sample positions. HEVC instead
uses a single consistent separable interpolation process for
generating all fractional positions without intermediate rounding
operations, which improves precision and simplifies the
architecture of the fractional sample interpolation. The interpolation
precision is also improved in HEVC by using longer
filters, i.e., seven-tap or eight-tap filtering rather than the six-tap
filtering used in H.264/MPEG-4 AVC. Using only seven
taps rather than the eight used for half-sample positions was
sufficient for the quarter-sample interpolation positions since
the quarter-sample positions are relatively close to integer sample
positions, so the most distant sample in an eight-tap
interpolator would effectively be farther away than in the half-sample
case (where the relative distances of the integer-sample
positions are symmetric). The actual filter tap values of the
interpolation filtering kernel were partially derived from DCT
basis function equations.
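In short: luma fractional sample interpolation in HEVC applies two filters separably:
an eight-tap filter for the half-sample positions;
a seven-tap filter for the quarter-sample positions.
This differs from H.264/MPEG-4 AVC, which uses a two-stage process:
it first generates one or two neighboring half-sample values with six-tap filtering and rounds the intermediate results,
and then averages two values at integer or half-sample positions.
HEVC instead uses one consistent separable interpolation process for all fractional positions, with no intermediate rounding,
which improves precision and simplifies the interpolation architecture.
Precision also improves because HEVC uses longer filters (seven or eight taps)
than the six-tap filter of H.264/MPEG-4 AVC.
Seven taps, rather than the eight used for half-sample positions, are enough for the quarter-sample positions
because a quarter-sample position lies relatively close to an integer sample position,
so the most distant sample of an eight-tap interpolator would effectively be farther away than in the half-sample case;
in the half-sample case the distances to the integer-sample positions are symmetric.
The actual filter tap values were partially derived from DCT basis function equations.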

In Fig. 7, the positions labeled with upper-case letters,
Ai,j , represent the available luma samples at integer sample
locations, whereas the other positions labeled with lower-case
letters represent samples at noninteger sample locations, which
need to be generated by interpolation.
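In short: in Fig. 7, the positions labeled with upper-case letters, Ai,j, are the luma samples available at integer sample locations;
the positions labeled with lower-case letters are samples at noninteger locations, which must be generated by interpolation.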

The samples labeled a0,j, b0,j, c0,j, d0,0, h0,0, and n0,0
are derived from the samples Ai,j by applying the eight-tap
filter for half-sample positions and the seven-tap filter for the
quarter-sample positions as follows:
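In short: the samples a0,j, b0,j, c0,j, d0,0, h0,0, and n0,0 are derived from the samples Ai,j,
using the eight-tap filter for half-sample positions
and the seven-tap filter for quarter-sample positions, as follows: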

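$$
\begin{aligned}
a_{0,j} &= \Big(\sum_{i=-3}^{3} A_{i,j}\,\mathrm{qfilter}[i]\Big) \gg (B - 8) \\
b_{0,j} &= \Big(\sum_{i=-3}^{4} A_{i,j}\,\mathrm{hfilter}[i]\Big) \gg (B - 8) \\
c_{0,j} &= \Big(\sum_{i=-2}^{4} A_{i,j}\,\mathrm{qfilter}[1 - i]\Big) \gg (B - 8) \\
d_{0,0} &= \Big(\sum_{j=-3}^{3} A_{0,j}\,\mathrm{qfilter}[j]\Big) \gg (B - 8) \\
h_{0,0} &= \Big(\sum_{j=-3}^{4} A_{0,j}\,\mathrm{hfilter}[j]\Big) \gg (B - 8) \\
n_{0,0} &= \Big(\sum_{j=-2}^{4} A_{0,j}\,\mathrm{qfilter}[1 - j]\Big) \gg (B - 8)
\end{aligned}
$$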

where the constant B ≥ 8 is the bit depth of the reference
samples (and typically B = 8 for most applications) and the
filter coefficient values are given in Table II. In these formulas,
>> denotes an arithmetic right shift operation.
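In short: B is the bit depth of the reference samples (B ≥ 8, typically 8);
the filter coefficient values are given in Table II;
in these formulas, >> denotes an arithmetic right shift.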
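
The first-stage formulas for a, b, and c can be written out directly. The sketch below is illustrative only: Table II did not survive in this copy, so the coefficient values are the ones commonly cited for the HEVC luma filters and should be checked against the standard, and the name `interp_row` is made up here.

```python
# Assumed Table II values, indexed by tap offset i (verify against the standard).
HFILTER = {-3: -1, -2: 4, -1: -11, 0: 40, 1: 40, 2: -11, 3: 4, 4: -1}
QFILTER = {-3: -1, -2: 4, -1: -10, 0: 58, 1: 17, 2: -5, 3: 1}


def interp_row(row, x, bit_depth=8):
    """Return the intermediate values (a, b, c) to the right of integer sample row[x].

    row must provide samples row[x - 3] .. row[x + 4]. The results are not yet
    normalized: they still carry the filter gain of 64 (reduced by B - 8 bits).
    """
    shift = bit_depth - 8
    a = sum(row[x + i] * QFILTER[i] for i in range(-3, 4)) >> shift
    b = sum(row[x + i] * HFILTER[i] for i in range(-3, 5)) >> shift
    c = sum(row[x + i] * QFILTER[1 - i] for i in range(-2, 5)) >> shift
    return a, b, c


if __name__ == "__main__":
    ramp = [0, 10, 20, 30, 40, 50, 60, 70]
    a, b, c = interp_row(ramp, 3)
    print(a / 64, b / 64, c / 64)   # values between ramp[3] and ramp[4]
```

Both filters sum to 64, so a flat region is reproduced exactly after normalization, and `qfilter[1 - i]` is simply the quarter-sample filter mirrored about the half-sample position.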

TABLE II
Filter Coefficients for Luma Fractional Sample Interpolation


The samples labeled e0,0, f0,0, g0,0, i0,0, j0,0, k0,0, p0,0, q0,0,
and r0,0 can be derived by applying the corresponding filters
to samples located at vertically adjacent a0,j, b0,j and c0,j
positions as follows:
In short: the samples e0,0, f0,0, g0,0, i0,0, j0,0, k0,0, p0,0, q0,0, and r0,0 are obtained
by applying the corresponding filters to the vertically adjacent a0,j, b0,j, and c0,j positions:

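$$
\begin{aligned}
e_{0,0} &= \Big(\sum_{v=-3}^{3} a_{0,v}\,\mathrm{qfilter}[v]\Big) \gg 6 \\
f_{0,0} &= \Big(\sum_{v=-3}^{3} b_{0,v}\,\mathrm{qfilter}[v]\Big) \gg 6 \\
g_{0,0} &= \Big(\sum_{v=-3}^{3} c_{0,v}\,\mathrm{qfilter}[v]\Big) \gg 6 \\
i_{0,0} &= \Big(\sum_{v=-3}^{4} a_{0,v}\,\mathrm{hfilter}[v]\Big) \gg 6 \\
j_{0,0} &= \Big(\sum_{v=-3}^{4} b_{0,v}\,\mathrm{hfilter}[v]\Big) \gg 6 \\
k_{0,0} &= \Big(\sum_{v=-3}^{4} c_{0,v}\,\mathrm{hfilter}[v]\Big) \gg 6 \\
p_{0,0} &= \Big(\sum_{v=-2}^{4} a_{0,v}\,\mathrm{qfilter}[1 - v]\Big) \gg 6 \\
q_{0,0} &= \Big(\sum_{v=-2}^{4} b_{0,v}\,\mathrm{qfilter}[1 - v]\Big) \gg 6 \\
r_{0,0} &= \Big(\sum_{v=-2}^{4} c_{0,v}\,\mathrm{qfilter}[1 - v]\Big) \gg 6
\end{aligned}
$$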
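
The two-stage computation of these nine positions can be sketched as one horizontal pass followed by one vertical pass. As before, this is an illustration under assumptions: the coefficient lists are the commonly cited HEVC luma filter values (Table II is missing from this copy), padded to eight taps with a zero, and `fir` and `interp_2d` are hypothetical names.

```python
HFILTER = [-1, 4, -11, 40, 40, -11, 4, -1]   # taps for offsets -3..4 (assumed values)
QFILTER = [-1, 4, -10, 58, 17, -5, 1, 0]     # seven taps, padded with 0 at offset 4
QMIRROR = QFILTER[::-1]                      # qfilter[1 - i] for offsets -3..4


def fir(samples, taps):
    """Apply an 8-tap filter to 8 consecutive samples (offsets -3..4)."""
    return sum(s * t for s, t in zip(samples, taps))


def interp_2d(block, hor_taps, ver_taps, bit_depth=8):
    """Interpolate one sample from an 8x8 block of integer samples.

    block[r][c] holds A at vertical offset r - 3 and horizontal offset c - 3.
    Stage 1 filters each row and shifts by B - 8; stage 2 filters the column
    of intermediate values and shifts by 6.
    """
    inter = [fir(row, hor_taps) >> (bit_depth - 8) for row in block]
    return fir(inter, ver_taps) >> 6


if __name__ == "__main__":
    block = [[(7 * r + 3 * c * c + r * c) % 256 for c in range(8)] for r in range(8)]
    j = interp_2d(block, HFILTER, HFILTER)   # half/half position j
    e = interp_2d(block, QFILTER, QFILTER)   # quarter/quarter position e
    r = interp_2d(block, QMIRROR, QMIRROR)   # three-quarter/three-quarter position r
    print(j, e, r)
```

With `bit_depth=8` the first stage involves no shift, so filtering the columns first and the rows second yields the same number; for larger bit depths the intermediate shift can make the two orders differ.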

The interpolation filtering is separable when B is equal to
8, so the same values could be computed in this case by
applying the vertical filtering before the horizontal filtering.
When implemented appropriately, the motion compensation
process of HEVC can be performed using only 16-b storage
elements (although care must be taken to do this correctly).
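In short: when B equals 8, the interpolation filtering is separable,
so the same values could also be computed by applying the vertical filtering before the horizontal filtering.
With a careful implementation, HEVC motion compensation needs only 16-bit storage elements.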

It is at this point in the process that weighted prediction
is applied when selected by the encoder. Whereas
H.264/MPEG-4 AVC supported both temporally implicit and
explicit weighted prediction, in HEVC only explicit weighted
prediction is applied, by scaling and offsetting the prediction
with values sent explicitly by the encoder. The bit depth of
the prediction is then adjusted to the original bit depth of
the reference samples. In the case of uniprediction, the interpolated
(and possibly weighted) prediction value is rounded,
right-shifted, and clipped to have the original bit depth. In the
case of biprediction, the interpolated (and possibly weighted)
prediction values from two PBs are added first, and then
rounded, right-shifted, and clipped.
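In short: weighted prediction is applied at this point if the encoder has selected it.
H.264/MPEG-4 AVC supported both temporally implicit and explicit weighted prediction,
whereas HEVC applies only explicit weighted prediction:
the prediction is scaled and offset with values sent explicitly by the encoder.
The bit depth of the prediction is then adjusted back to the original bit depth of the reference samples.
For uniprediction, the interpolated (and possibly weighted) value is rounded, right-shifted, and clipped to the original bit depth.
For biprediction, the interpolated (and possibly weighted) values from the two PBs are added first, and then rounded, right-shifted, and clipped.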
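
The final normalization step can be sketched as follows, without weighting. The shift amounts are an inference rather than something the excerpt states: the interpolation formulas above leave the intermediate values with a gain of 2^(14 − B), so uniprediction shifts by 14 − B and biprediction, which sums two such values, by 15 − B. The function names are made up, and integer-position samples are assumed to have been scaled to the same gain.

```python
def clip(value, bit_depth):
    """Clip to the valid sample range [0, 2^bit_depth - 1]."""
    return max(0, min((1 << bit_depth) - 1, value))


def uni_pred(p, bit_depth=8):
    """Round, right-shift, and clip one intermediate prediction value."""
    shift = 14 - bit_depth
    return clip((p + (1 << (shift - 1))) >> shift, bit_depth)


def bi_pred(p0, p1, bit_depth=8):
    """Add two intermediate values first, then round, shift, and clip once."""
    shift = 15 - bit_depth
    return clip((p0 + p1 + (1 << (shift - 1))) >> shift, bit_depth)


if __name__ == "__main__":
    print(uni_pred(6400))          # intermediate value 100 * 64
    print(bi_pred(6400, 6464))     # average of 100 and 101 at full precision
```

Because the two predictions are summed before any rounding, biprediction costs only one rounding operation at this stage, which is what keeps the worst-case rounding count low.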

In H.264/MPEG-4 AVC, up to three stages of rounding
operations are required to obtain each prediction sample (for
samples located at quarter-sample positions). If biprediction is
used, the total number of rounding operations is then seven
in the worst case. In HEVC, at most two rounding operations
are needed to obtain each sample located at the quarter-sample
positions, thus five rounding operations are sufficient in the
worst case when biprediction is used. Moreover, in the most
common usage, where the bit depth B is 8 b, the total number
of rounding operations in the worst case is further reduced
to 3. Due to the lower number of rounding operations, the
accumulated rounding error is decreased and greater flexibility
is enabled in regard to the manner of performing the necessary
operations in the decoder.
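In short: in H.264/MPEG-4 AVC, up to three rounding stages are needed to obtain each prediction sample
at a quarter-sample position;
with biprediction, the worst case is seven rounding operations.
In HEVC, at most two rounding operations are needed for each sample at a quarter-sample position,
so five rounding operations suffice in the worst case with biprediction.
Moreover, in the most common case, where the bit depth B is 8, the worst-case total drops
to three.
With fewer rounding operations, the accumulated rounding error decreases, and the decoder
gains flexibility in how it performs the necessary operations.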

The fractional sample interpolation process for the chroma
components is similar to the one for the luma component,
except that the number of filter taps is 4 and the fractional
accuracy is 1/8 for the usual 4:2:0 chroma format case. HEVC
defines a set of four-tap filters for eighth-sample positions, as
given in Table III for the case of 4:2:0 chroma format (where,
in H.264/MPEG-4 AVC, only two-tap bilinear filtering was applied).
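In short: fractional sample interpolation for the chroma components is similar to that for luma,
except that the filters have four taps and, for the usual 4:2:0 chroma format, the fractional accuracy is 1/8.
HEVC defines a set of four-tap filters for the eighth-sample positions,
given in Table III (H.264/MPEG-4 AVC applied only two-tap bilinear filtering):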

TABLE III
Filter Coefficients for Chroma Fractional Sample Interpolation


Filter coefficient values denoted as filter1[i], filter2[i], filter3[i],
and filter4[i] with i = −1, . . . , 2 are used for interpolating
the 1/8th, 2/8th, 3/8th, and 4/8th fractional positions
for the chroma samples, respectively. Using symmetry for the
5/8th, 6/8th, and 7/8th fractional positions, the mirrored values
of filter3[1 − i], filter2[1 − i], and filter1[1 − i] with i = −1, . . . ,
2 are used, respectively.
In short: the coefficients filter1[i], filter2[i], filter3[i], and filter4[i] with i = −1, . . . , 2
interpolate the 1/8th, 2/8th, 3/8th, and 4/8th fractional positions, respectively.
By symmetry, the 5/8th, 6/8th, and 7/8th fractional positions
use the mirrored values filter3[1 − i], filter2[1 − i], and filter1[1 − i] with i = −1, . . . , 2.
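
The mirroring rule can be sketched as a small lookup. Table III is missing from this copy, so the coefficient values below are the ones commonly cited for the HEVC chroma filters and should be verified against the standard; `chroma_taps` is a hypothetical helper name.

```python
# Assumed Table III values: filterK[i] for tap offsets i = -1, 0, 1, 2.
CHROMA_FILTERS = {
    1: [-2, 58, 10, -2],
    2: [-4, 54, 16, -2],
    3: [-6, 46, 28, -4],
    4: [-4, 36, 36, -4],
}


def chroma_taps(frac):
    """Return the four taps for fractional position frac/8, with frac in 1..7."""
    if not 1 <= frac <= 7:
        raise ValueError("frac must be in 1..7")
    if frac <= 4:
        return list(CHROMA_FILTERS[frac])
    # 5/8, 6/8, 7/8 reuse filter3, filter2, filter1 read as filterK[1 - i],
    # which is the same list in reverse order.
    return CHROMA_FILTERS[8 - frac][::-1]


if __name__ == "__main__":
    for frac in range(1, 8):
        print(f"{frac}/8", chroma_taps(frac))
```

Only four filters need to be stored; the half-sample filter is its own mirror image.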

3) Merge Mode

Motion information typically consists of
the horizontal and vertical motion vector displacement values,
one or two reference picture indices, and, in the case of prediction
regions in B slices, an identification of which reference
picture list is associated with each index. HEVC includes a
merge mode to derive the motion information from spatially
or temporally neighboring blocks. It is denoted as merge mode
since it forms a merged region sharing all motion information.
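In short: motion information typically consists of
the horizontal and vertical motion vector displacement values,
one or two reference picture indices, and, for prediction regions in B slices, an indication of which reference picture list each index refers to.
HEVC includes a merge mode that derives the motion information from spatially or temporally neighboring blocks.
It is called merge mode because it forms a merged region that shares all motion information.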

The merge mode is conceptually similar to the direct and
skip modes in H.264/MPEG-4 AVC. However, there are two
important differences. First, it transmits index information to
select one out of several available candidates, in a manner
sometimes referred to as a motion vector competition scheme.
It also explicitly identifies the reference picture list and reference
picture index, whereas the direct mode assumes that
these have some predefined values.
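In short: merge mode is conceptually similar to the direct and skip modes of H.264/MPEG-4 AVC,
but there are two important differences.
First, it transmits index information that selects one of several available candidates, a scheme sometimes called motion vector competition.
Second, it explicitly identifies the reference picture list and reference picture index, whereas direct mode assumes predefined values for them.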

Fig. 8. Positions of spatial candidates of motion information.


The set of possible candidates in the merge mode consists
of spatial neighbor candidates, a temporal candidate, and
generated candidates. Fig. 8 shows the positions of five spatial
candidates. For each candidate position, the availability is
checked according to the order {a1, b1, b0, a0, b2}. If the
block located at the position is intrapicture predicted or the
position is outside of the current slice or tile, it is considered
as unavailable.
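In short: the possible candidates in merge mode are
spatial neighbor candidates,
a temporal candidate,
and generated candidates.
Fig. 8 shows the positions of the five spatial candidates.
Availability is checked for each candidate position in the order {a1, b1, b0, a0, b2}.
A position is unavailable if its block is intrapicture predicted or if it lies outside the current slice or tile.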


After validating the spatial candidates, two kinds of redundancy
are removed. If the candidate position for the current
PU would refer to the first PU within the same CU, the
position is excluded, as the same merge could be achieved by
a CU without splitting into prediction partitions. Furthermore,
any redundant entries where candidates have exactly the same
motion information are also excluded.
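In short: after the spatial candidates are validated, two kinds of redundancy are removed.
If a candidate position for the current PU would refer to the first PU within the same CU, that position is excluded,
because the same merge could be achieved by a CU that is not split into prediction partitions.
Candidates with exactly the same motion information are also removed.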
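
The availability check and the two redundancy rules can be sketched as a single loop over the candidate positions. This is a simplification under stated assumptions: the `Motion` type, the function name, and the `same_cu_first_pu` argument are made up for this illustration, unavailable positions are represented as `None`, and every new candidate is compared against all earlier ones, whereas an actual codec may compare only selected pairs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

CHECK_ORDER = ("a1", "b1", "b0", "a0", "b2")


@dataclass(frozen=True)
class Motion:
    """Hypothetical container for the motion information of one block."""
    mv: Tuple[int, int]
    ref_idx: int
    ref_list: int


def spatial_merge_candidates(
    neighbors: Dict[str, Optional[Motion]],
    same_cu_first_pu: Optional[str] = None,
) -> List[Motion]:
    """Collect spatial merge candidates in the order a1, b1, b0, a0, b2.

    neighbors maps a position name to its Motion, or to None when the position
    is unavailable (intrapicture predicted, or outside the slice or tile).
    same_cu_first_pu names the position, if any, that falls inside the first
    PU of the same CU as the current PU.
    """
    candidates: List[Motion] = []
    for pos in CHECK_ORDER:
        motion = neighbors.get(pos)
        if motion is None:            # unavailable position
            continue
        if pos == same_cu_first_pu:   # would emulate an unsplit CU
            continue
        if motion in candidates:      # identical motion information
            continue
        candidates.append(motion)
    return candidates


if __name__ == "__main__":
    m1 = Motion((4, -2), 0, 0)
    m2 = Motion((0, 8), 1, 0)
    print(spatial_merge_candidates({"a1": m1, "b1": m1, "b0": None, "a0": m2, "b2": None}))
```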


For the temporal candidate, the right bottom position just
outside of the collocated PU of the reference picture is used if
it is available. Otherwise, the center position is used instead.
The way to choose the collocated PU is similar to that of prior
standards, but HEVC allows more flexibility by transmitting
an index to specify which reference picture list is used for the
collocated reference picture.
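In short: for the temporal candidate, the right-bottom position just outside the collocated PU of the reference picture is used if it is available;
otherwise, the center position is used instead.
The way the collocated PU is chosen is similar to prior standards,
but HEVC is more flexible because it transmits an index that specifies which reference picture list is used for the collocated reference picture.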
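
The fallback from the right-bottom position to the center position can be sketched in a few lines. The coordinates are assumptions made for this illustration: the PU is described by its top-left luma position and size, "just outside" is taken to mean `(x + w, y + h)`, the center is taken as `(x + w // 2, y + h // 2)`, and `col_motion` is a hypothetical lookup that returns `None` for unavailable positions.

```python
def temporal_candidate(col_motion, x, y, w, h):
    """Pick the temporal merge candidate for a PU at (x, y) of size w x h.

    col_motion(px, py) returns the motion stored in the collocated reference
    picture at luma position (px, py), or None if nothing usable is there.
    """
    motion = col_motion(x + w, y + h)                   # right-bottom, just outside the PU
    if motion is None:
        motion = col_motion(x + w // 2, y + h // 2)     # fall back to the center
    return motion


if __name__ == "__main__":
    stored = {(24, 24): "mv_center"}
    print(temporal_candidate(lambda px, py: stored.get((px, py)), 16, 16, 16, 16))
```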


One issue related to the use of the temporal candidate is
the amount of the memory to store the motion information
of the reference picture. This is addressed by restricting the
granularity for storing the temporal motion candidates to only
the resolution of a 16×16 luma grid, even when smaller
PB structures are used at the corresponding location in the
reference picture. In addition, a PPS-level flag allows the
encoder to disable the use of the temporal candidate, which is
useful for applications with error-prone transmission.
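In short: one issue with the temporal candidate is the memory needed to store the motion information of the reference picture.
This is addressed by restricting the storage granularity of the temporal motion candidates
to a 16×16 luma grid, even when smaller PB structures are used at the corresponding location in the reference picture.
In addition, a PPS-level flag lets the encoder disable the temporal candidate,
which is useful for applications with error-prone transmission.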
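
The memory saving from the 16×16 grid can be illustrated with a toy motion field. This sketch assumes that the motion kept for each 16×16 cell is the one at the cell's top-left luma sample, which the excerpt does not state; the function names and the dictionary layout are made up.

```python
GRID = 16  # storage granularity in luma samples


def cell_origin(x, y):
    """Top-left luma position of the 16x16 cell containing (x, y)."""
    return (x // GRID) * GRID, (y // GRID) * GRID


def compress_motion_field(motion_at):
    """Keep one entry per 16x16 cell (assumed: the entry at the cell origin)."""
    return {pos: m for pos, m in motion_at.items() if pos == cell_origin(*pos)}


def lookup_temporal_motion(stored, x, y):
    """Fetch the stored motion that represents luma position (x, y)."""
    return stored.get(cell_origin(x, y))


if __name__ == "__main__":
    # A 32x32 area with motion recorded on an 8x8 grid: 16 entries before compression.
    field = {(x, y): (x // 8, y // 8) for x in range(0, 32, 8) for y in range(0, 32, 8)}
    stored = compress_motion_field(field)
    print(len(field), "->", len(stored))
    print(lookup_temporal_motion(stored, 27, 9))
```

In this toy case the stored motion data shrinks by a factor of four; the cost is that motion from smaller PBs inside a cell is no longer distinguishable.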


The maximum number of merge candidates C is specified
in the slice header. If the number of merge candidates found
(including the temporal candidate) is larger than C, only the
first C – 1 spatial candidates and the temporal candidate
are retained. Otherwise, if the number of merge candidates
