Xcell Journal: The authoritative journal for programmable ... - Xilinx
Xcell Journal: The authoritative journal for programmable ... - Xilinx
Xcell Journal: The authoritative journal for programmable ... - Xilinx
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
DIGITAL SIGNAL PROCESSING<br />
Input<br />
Video<br />
Signal<br />
Split into<br />
Macroblocks<br />
16x16 pixels<br />
Coder<br />
Control<br />
Trans<strong>for</strong>m<br />
Scal./Quant.<br />
Decoder Scaling & Inv.<br />
Trans<strong>for</strong>m<br />
Intra-frame<br />
Prediction<br />
Motion-<br />
Compensation<br />
Motion<br />
Estimation<br />
Improved Prediction Methods<br />
Let’s highlight some of the main features of<br />
the H.264/AVC video coding standard<br />
design that enable its enhanced coding efficiency,<br />
evaluating these functional modules<br />
based on the design criteria discussed in the<br />
previous section.<br />
• Quarter-pixel-accurate motion compensation.<br />
Prior standards use half-pixel<br />
motion vector accuracy. <strong>The</strong> new design<br />
improves on this by providing quarterpixel<br />
motion vector accuracy. <strong>The</strong> prediction<br />
values at half-pixel positions are<br />
calculated by applying a one-dimensional<br />
six-tap FIR filter [1, -5, 20, 20, -5,<br />
1]/32 horizontally and vertically.<br />
Prediction values at quarter-pixel positions<br />
are generated by averaging samples<br />
at the full- and half-pixel positions.<br />
<strong>The</strong>se sub-sampling interpolation operations<br />
can be efficiently implemented<br />
in hardware inside the FPGA fabric.<br />
• Variable block-sized motion compensation<br />
with small block size. <strong>The</strong> standard<br />
provides more flexibility <strong>for</strong> the<br />
tiling structure in a macroblock size of<br />
16 x 16 pixels. It allows the use of 16 x<br />
16, 16 x 8, 8 x 16, 8 x 8, 8 x 4, 4 x 8,<br />
and 4 x 4 sub-macroblock sizes.<br />
Because of the increasing combinations<br />
of tiling geometry with a given<br />
De-blocking<br />
Filter<br />
Output<br />
Video<br />
Signal<br />
Control<br />
Data<br />
Quant.<br />
Transf. Coeffs<br />
Motion<br />
Data<br />
16 x 16 macroblock, to find a rate<br />
distortion optimal tiling solution is<br />
extremely computationally intensive.<br />
This additional feature places an<br />
enormous burden on the computational<br />
engines used in motion estimation,<br />
refinement, and mode decision<br />
process.<br />
• In-the-loop adaptive deblocking filtering.<br />
<strong>The</strong> deblocking filter has been successfully<br />
applied in H.263+ and<br />
MPEG-4 part 2 implementations as a<br />
post-processing filter. In H.264/AVC,<br />
the deblocking filter is moved inside<br />
the motion-compensated loop to filter<br />
block edges resulting from the prediction<br />
and residual difference coding<br />
stages of the decoding process. <strong>The</strong> filtering<br />
is applied on both 4 x 4 block<br />
and 16 x 16 macroblock boundaries, in<br />
which two pixels on either side of the<br />
boundary may be updated using a<br />
three-tap filter. <strong>The</strong> filter coefficients or<br />
“strength” are governed by a contentadaptive<br />
non-linear filtering scheme.<br />
• Directional spatial prediction <strong>for</strong> intra<br />
coding. In cases where motion estimation<br />
cannot be exploited, intra-directional<br />
spatial prediction is used to<br />
eliminate spatial redundancies. This<br />
technique attempts to predict the current<br />
block by extrapolating the neigh-<br />
boring pixels from adjacent blocks in a<br />
defined set of directions. <strong>The</strong> difference<br />
between the predicted block and<br />
the actual block is then coded.<br />
This approach is particularly useful in<br />
flat backgrounds where spatial redundancies<br />
exist. <strong>The</strong>re are a total of nine<br />
prediction directions <strong>for</strong> Intra_4x4<br />
prediction, and four prediction directions<br />
<strong>for</strong> Intra_16x16 prediction.<br />
Note that the data causality imposes<br />
quick memory access to the neighboring<br />
13 pixel values to the above and<br />
left of the current block in the case of<br />
Intra_4x4. For the Intra_16x16, 16<br />
neighboring pixels on each side are<br />
used to predict a 16 x 16 block.<br />
• Multiple reference picture motion compensation.<br />
<strong>The</strong> H.264/AVC standard<br />
offers the option <strong>for</strong> multiple reference<br />
frames in the inter-frame coding. Unless<br />
the number of the referenced pictures is<br />
one, the index at which the reference<br />
picture is located inside the multi-picture<br />
buffer has to be signaled. <strong>The</strong><br />
multi-picture buffer size determines the<br />
memory usage in the encoder and<br />
decoder. <strong>The</strong>se reference frame buffers<br />
must be addressed correspondingly during<br />
the motion estimation and compensation<br />
stages in the encoder.<br />
• Weighted prediction. <strong>The</strong> JVT recognizes<br />
that in encoding certain video<br />
scenes that involve fades, having a<br />
weighted motion-compensated prediction<br />
dramatically improves the coding<br />
efficiency.<br />
Improved Coding Efficiency<br />
In addition to improved prediction methods,<br />
other parts of the standard design were<br />
also enhanced <strong>for</strong> improved coding efficiency.<br />
Two additional features are most likely to<br />
impact the overall system architecture based<br />
on our design criteria <strong>for</strong> software and hardware<br />
partitioning:<br />
• Small block size, hierarchical, exactmatch<br />
inverse, and short word-length<br />
trans<strong>for</strong>m. <strong>The</strong> H.264/AVC, like other<br />
standards, also applies trans<strong>for</strong>m coding<br />
to the motion-compensated prediction<br />
42 <strong>Xcell</strong> <strong>Journal</strong> Winter 2004<br />
Entropy<br />
Coding<br />
Figure 1 – H.264/AVC macroblock encoder with functional blocks and data flows