(HEVC) Standard
2011 18th IEEE International Conference on Image Processing<br />
However, because the intercept has a fractional part caused by<br />
the accumulation of the angle, the selected reference samples<br />
may fall at a non-integer displacement. This means that the<br />
displacements of the five selected samples are discontinuous<br />
when predicting some rows of pixels in certain modes. This<br />
discontinuity also explains why an extended array was<br />
developed to refine the reference samples in such modes.<br />
The refinement technique wastes memory. In addition, it<br />
increases the processing latency, because the selected samples<br />
must be copied into the extended array before prediction.<br />
After summarizing the characteristics of these modes,<br />
we found a simple rule to detect the discontinuity:<br />
if the fractional parts of the intercept are nonzero<br />
and the upper-left corner sample A_L is used as the first<br />
selected reference sample when predicting the first row of a<br />
mode, the discontinuity occurs in those remaining rows<br />
of the mode where the first selected reference sample<br />
is not A_L. The rule can be implemented by a simple<br />
detection circuit with a comparator and a counter. Once<br />
these rows have been detected, the reference sample L1 is<br />
skipped when choosing the five selected reference samples<br />
if the mode is a vertical directional mode. Likewise, A1<br />
is skipped if the mode is a horizontal directional mode.<br />
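The detection rule above can be sketched in software as follows. This is only an illustrative model, not the authors' circuit: the function names, the per-row label list `first_refs`, and the per-row fraction list `fracs` are hypothetical, and the sketch assumes that "the fractional parts are nonzero" means at least one row of the mode has a nonzero fraction.

```python
def discontinuous_rows(first_refs, fracs, corner="A_L"):
    """Return the row indices where the discontinuity rule fires.

    first_refs[r]: label of the first selected reference sample for row r
                   (hypothetical labels, e.g. "A_L", "A1", "L2").
    fracs[r]:      fractional part of the intercept for row r.
    """
    if not first_refs:
        return []
    # Comparator: does the mode have a nonzero fractional intercept?
    has_fraction = any(f != 0 for f in fracs)
    # Condition on the first row: it must start at the corner sample.
    starts_at_corner = first_refs[0] == corner
    if not (has_fraction and starts_at_corner):
        return []
    # Counter over the remaining rows: flag those not starting at the corner.
    return [r for r in range(1, len(first_refs)) if first_refs[r] != corner]


def skipped_sample(is_vertical_mode):
    """Sample skipped in a flagged row: L1 for vertical modes, A1 for horizontal."""
    return "L1" if is_vertical_mode else "A1"


# Hypothetical 4-row mode: rows 2 and 3 no longer start at the corner sample.
rows = discontinuous_rows(["A_L", "A_L", "L2", "L3"], [13, 26, 7, 20])
print(rows, skipped_sample(is_vertical_mode=True))
```

The two conditions on the first row map naturally onto a comparator, and the row index onto a counter, which is why the rule is cheap to realize in hardware.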
The flexible reference samples selection technique<br />
therefore does not need to project samples from the side<br />
reference onto the main reference that is used by the<br />
linear interpolation filter, which saves memory resources.<br />
In addition, the skipped samples are detected during<br />
the prediction procedure itself. The technique thus<br />
reduces the processing latency compared with the<br />
method that copies samples into the extended array before<br />
the prediction procedure.<br />
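To illustrate why five selected reference samples are enough for one row of a 4×4 block, the sketch below interpolates four pixels from five consecutive samples with a two-tap linear filter. The 1/32-sample precision and the rounding offset are assumptions taken from the finalized HEVC angular prediction; the HM version used in this work may differ. The function name `predict_row` is hypothetical.

```python
def predict_row(samples, frac):
    """Interpolate four pixels from five consecutive reference samples.

    samples: the five selected reference samples for this row.
    frac:    fractional part of the intercept in 1/32 units (assumed precision).
    """
    if len(samples) != 5:
        raise ValueError("expected five selected reference samples")
    if not 0 <= frac < 32:
        raise ValueError("frac must be in the range [0, 32)")
    # Each output pixel blends two neighbouring samples; +16 rounds to nearest.
    return [((32 - frac) * samples[i] + frac * samples[i + 1] + 16) >> 5
            for i in range(4)]


print(predict_row([100, 104, 108, 112, 116], 0))   # integer position: plain copy
print(predict_row([100, 104, 108, 112, 116], 16))  # half-sample position
```

Because each pixel reads only its two neighbouring samples, selecting the right five samples per row is sufficient, and no projected copy of the side reference has to be stored.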
4. IMPLEMENTATION RESULT<br />
The proposed architecture is described in Verilog HDL and<br />
implemented in TSMC 0.13μm CMOS technology.<br />
Table 3 lists the specifications of the VLSI implementation.<br />
The total gate count of the<br />
proposed architecture is 9020, and the design runs at a<br />
maximum frequency of 150 MHz.<br />
Table 3. Specifications of VLSI implementation<br />
Technology | TSMC 0.13μm CMOS<br />
Logic Gate Count | 9020<br />
Max Operation Freq. | 150 MHz<br />
Processing Latency | 8 Clocks<br />
Average Cycles to Generate a Pixel | 1.5 Clocks<br />
By using the novel register array with the correlation<br />
parameters and the flexible reference samples selection<br />
technique, we need only half of the memory resources<br />
required by the software implementation described. Furthermore,<br />
no time is wasted copying the selected samples<br />
into the extended array. All of the predictions can be<br />
finished in 24 clocks with a processing latency of 8 clocks. On<br />
average, it takes 1.5 clocks to generate a prediction pixel.<br />
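The reported figures can be cross-checked with a little arithmetic. The sketch below assumes that the 24 clocks cover one 4×4 block of 16 pixels, which is consistent with the stated 1.5 clocks per pixel. The block time and peak pixel rate are derived values that assume blocks are processed back to back at 150 MHz; they are not results reported by the authors. The function name `timing_summary` is hypothetical.

```python
def timing_summary(total_clocks=24, latency_clocks=8, block_size=4, freq_hz=150e6):
    """Derive per-pixel and per-block timing from the reported clock counts."""
    pixels = block_size * block_size              # 16 pixels in a 4x4 block
    clocks_per_pixel = total_clocks / pixels      # 24 / 16
    return {
        "pixels": pixels,
        "clocks_per_pixel": clocks_per_pixel,
        "clocks_after_latency": total_clocks - latency_clocks,
        "block_time_ns": total_clocks / freq_hz * 1e9,
        "peak_pixels_per_second": freq_hz / clocks_per_pixel,
    }


for name, value in timing_summary().items():
    print(f"{name}: {value}")
```

Under these assumptions, the 16 clocks that remain after the 8-clock latency equal the number of pixels in the block, which would correspond to one pixel per clock once the pipeline is filled, although the source does not state this explicitly.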
5. CONCLUSIONS<br />
In this paper, we propose a highly efficient uniform VLSI<br />
architecture and a flexible reference samples selection<br />
technique for 4×4 intra prediction in HM. The architecture<br />
integrates the copying circuit and the interpolation circuit<br />
into a uniform circuit to save hardware resources. The<br />
new samples selection technique relieves the memory<br />
pressure and reduces the processing latency considerably.<br />
Implementation in TSMC 0.13μm CMOS technology<br />
shows that the proposed architecture operates at<br />
150 MHz and requires 9020 logic gates.<br />
The architecture can be extended to parts of the 8×8, 16×16,<br />
32×32 and 64×64 intra prediction, so that they can share the<br />
same logic and improve utilization.<br />
6. ACKNOWLEDGMENT<br />
The authors are grateful to Ji Zheng Xu and You Zhou for<br />
their valuable discussions.<br />
This work was supported in part by the National Science<br />
Foundation of China under Grants 60736043, 61033004,<br />
61070138, and the Fundamental Research Funds for the<br />
Central Universities of China under Grant K50510020032.<br />