IEEE COMSOC MMTC R-Letter

How to Analyze and Optimize the Encoding Latency for Multiview Video Coding

A short review for “A Framework for the Analysis and Optimization of Encoding Latency for Multiview Video”

Edited by Christian Timmerer

P. Carballeira, J. Cabrera, A. Ortega, F. Jaureguizar and N. García, “A Framework for the Analysis and Optimization of Encoding Latency for Multiview Video”, IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 5, pp. 583–596, Sep. 2012.

Multiview video with additional scene geometry information, such as depth maps, is a widely adopted data format to enable key functionalities in new visual media systems, such as 3D Video (3DV) and Free Viewpoint Video (FVV) [1]. Given that the data size of multiview video grows linearly with the number of cameras, while the available bandwidth is generally limited, new schemes for the efficient compression of multiview video [2] and additional data [3] have been under investigation in recent years.

The authors argue that the design of multiview prediction structures for multiview video coding [4] has mostly focused on improving rate-distortion (RD) performance, ignoring important differences in the latency behavior of the resulting codecs. These differences in latency may be critical for delay-constrained applications such as immersive video conferencing, in which the end-to-end delay, the communication latency, needs to be kept low in order to preserve interactivity [5]. In hybrid video encoders there is a clear trade-off between RD performance and encoding delay, mainly due to the use of backward prediction and hierarchical prediction structures.
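To make this trade-off concrete, consider a small single-view illustration (the GOP layouts, frame period, and delay measure below are assumptions for illustration, not taken from the paper): a frame that uses backward prediction cannot be encoded before its future reference has even been captured, so the structural encoding delay grows with the reordering depth of the prediction structure.

```python
# Illustrative comparison of the structural encoding delay of two
# single-view prediction structures. Numbers are hypothetical; a frame
# period of 40 ms (25 fps) is assumed.

FRAME_PERIOD_MS = 40.0

def structural_delay_ms(refs):
    """refs[i] = display indices of the frames that frame i predicts from.
    A frame cannot start encoding before its latest reference has been
    captured, so its delay is the capture-time gap to that reference
    (zero if no reference lies in the future)."""
    delay = 0.0
    for i, r in refs.items():
        latest = max([i] + list(r))  # latest frame that must be captured
        delay = max(delay, (latest - i) * FRAME_PERIOD_MS)
    return delay

# IPPP...: every frame predicts only from its past neighbor -> no waiting.
ippp = {0: [], 1: [0], 2: [1], 3: [2], 4: [3]}

# Hierarchical-B GOP of size 4: frame 2 references frame 4, so it must
# wait two frame periods for frame 4 to be captured.
hier_b = {0: [], 4: [0], 2: [0, 4], 1: [0, 2], 3: [2, 4]}

print(structural_delay_ms(ippp))    # 0.0
print(structural_delay_ms(hier_b))  # 80.0
```

This only counts capture-time waiting, not processing time, but it already shows why low-delay applications favor forward-only prediction structures.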
In single-view video encoders, the encoding delay can be easily estimated and reduced by simple decisions in the design of prediction structures. The analysis of the encoding delay in the case of multiview video is more challenging, as it requires handling more complex dependency structures than in single-view video, including not only temporal but also inter-view prediction. Additionally, the encoder may have to manage the encoding of several frames at the same time (frames from several views), due to the inherent parallel nature of multiview video, so the characteristics of multi-processor hardware platforms play a significant role in the analysis.

In this paper, the authors propose a general framework for the characterization of the encoding latency in multiview encoders that captures the influence of 1) the prediction structure and 2) the hardware encoder model. This framework allows a systematic analysis of the encoding latency for arbitrary multiview prediction structures in a multiview encoder. The primary element of the proposed framework is an encoding latency model based on graph theory algorithms that assumes that the processing capacity of the encoder is essentially unbounded: the directed acyclic graph encoding latency (DAGEL) model. It can be seen as a task scheduling model [6] (the encoding of a frame is the task unit) that is used to compute the encoding latency rather than the schedule length. The paper also demonstrates that, despite the assumption of unbounded processing capacity, the encoding latency values obtained with the DAGEL model are accurate for multiview encoders with a finite number of processors greater than a required minimum, which can be identified.
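The core idea of a DAG-based latency model can be sketched as follows (an illustrative reconstruction, not the authors' code or exact formulation): frames are nodes, prediction dependencies are edges, and with unbounded processors the finish time of a frame is set by the longest dependency chain leading to it. The two-view structure, capture interval, and uniform per-frame processing time below are all assumptions.

```python
# Sketch of a DAGEL-style encoding-latency computation over a frame
# dependency DAG (hypothetical structure and timings).

from functools import lru_cache

T_FRAME = 40.0  # capture interval in ms (25 fps), assumed
T_PROC = 10.0   # per-frame encoding time in ms, assumed uniform

# deps[frame] = frames it predicts from (temporal and inter-view).
# Two views over three instants; view v1 predicts inter-view from v0.
deps = {
    ("v0", 0): [],
    ("v0", 1): [("v0", 0)],
    ("v0", 2): [("v0", 1)],
    ("v1", 0): [("v0", 0)],
    ("v1", 1): [("v1", 0), ("v0", 1)],
    ("v1", 2): [("v1", 1), ("v0", 2)],
}

def capture_time(frame):
    _, t = frame
    return t * T_FRAME

@lru_cache(maxsize=None)
def finish_time(frame):
    """Earliest time the frame is fully encoded, assuming unbounded
    processors: encoding starts once the frame has been captured and
    all of its references have finished encoding."""
    start = max([capture_time(frame)] +
                [finish_time(r) for r in deps[frame]])
    return start + T_PROC

# Encoding latency of the structure: worst-case gap between a frame's
# capture and the end of its encoding, i.e. the longest path in the DAG
# measured against capture times.
latency = max(finish_time(f) - capture_time(f) for f in deps)
print(latency)  # 20.0 for this toy structure
```

With a finite processor pool the computed value becomes a lower bound, matching the paper's observation that the unbounded-capacity assumption is exact only above a minimum processor count.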
Otherwise, the results provided by the DAGEL model represent a lower bound on the actual encoding latency of the encoder.

As an example of the applications of the DAGEL model, the authors show how it can be used to reduce the encoding latency of a given multiview prediction structure to meet a target value while preserving the RD performance as much as possible. In this approach, the objective is to prune the minimum number of frame dependencies (those that introduce a higher encoding delay in the original structure) until the latency target is achieved. Therefore, the degradation of RD performance due to the removal of prediction dependencies is limited. Finally, the authors demonstrate that the pruned prediction structures still produce a minimum encoding latency, as compared to other pruning options, even on hardware platform models that
do not meet the minimum requirements of the DAGEL model in terms of the number of processors.

Following this research direction, future work includes the extension of this framework to multiview decoders and the use of graph models to analyze the delay behavior in more realistic encoder/decoder hardware architectures [7].

This paper is nominated by Cha Zhang of the MMTC 3D Processing, Rendering and Communication (3DPRC) Interest Group.

References:
[1] P. Merkle, K. Mueller, and T. Wiegand, “3D video: acquisition, coding, and display,” IEEE Transactions on Consumer Electronics, vol. 56, no. 2, pp. 946–950, 2010.
[2] A. Vetro, T. Wiegand, and G. Sullivan, “Overview of the stereo and multiview video coding extensions of the H.264/MPEG-4 AVC standard,” Proceedings of the IEEE, vol. 99, no. 4, pp. 626–642, Apr. 2011.
[3] ISO/IEC JTC1/SC29/WG11, “Call for Proposals on 3D Video Coding Technology,” MPEG output doc. N12036, Geneva, Switzerland, Mar. 2011.
[4] P. Merkle, A. Smolic, K. Müller, and T. Wiegand, “Efficient prediction structures for multiview video coding,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 11, pp. 1461–1473, Nov. 2007.
[5] G. Karlsson, “Asynchronous transfer of video,” IEEE Communications Magazine, vol. 34, no. 8, pp. 118–126, Aug. 1996.
[6] Y.-K. Kwok and I. Ahmad, “Static scheduling algorithms for allocating directed task graphs to multiprocessors,” ACM Computing Surveys, vol. 31, no. 4, pp. 406–471, Dec. 1999.
[7] P. Carballeira, J. Cabrera, F. Jaureguizar and N.
García, “Systematic Analysis of the Decoding Delay in Multiview Video”, Journal of Visual Communication and Image Representation, Special Issue on Advances in 3D Video Processing, in press (doi: 10.1016/j.jvcir.2013.04.004).

Christian Timmerer is an assistant professor at the Institute of Information Technology (ITEC), Alpen-Adria-Universität Klagenfurt, Austria. His research interests include immersive multimedia communication, streaming, adaptation, and Quality of Experience, with more than 100 publications in this domain. He was the general chair of WIAMIS’08, ISWM’09, EUMOB’09, AVSTP2P’10, WoMAN’11, and QoMEX’13, and has participated in several EC-funded projects, notably DANAE, ENTHRONE, P2P-Next, ALICANTE, QUALINET, and SocialSensor. He also participated in ISO/MPEG work for several years, notably in the areas of MPEG-21, MPEG-M, MPEG-V, and DASH/MMT. He received his PhD in 2006 from the Alpen-Adria-Universität Klagenfurt. Publications and MPEG contributions can be found at research.timmerer.com; follow him on twitter.com/timse7 and subscribe to his blog at blog.timmerer.com.

http://committees.comsoc.org/mmc 5/22 Vol.4, No.4, August 2013