IEEE COMSOC MMTC E-Letter

3D Visual Content Compression for Communications

Karsten Müller, Fraunhofer Heinrich Hertz Institute, Germany
Karsten.mueller@hhi.fraunhofer.de

The recent popularity of 3D media is driven by research and development projects worldwide, targeting all aspects from 3D content creation and capturing, format representation, coding and transmission, to new 3D display technologies and extensive user studies. Efficient compression technologies are developed with these aspects in mind: 3D production generates different types of content, such as natural content recorded by multiple cameras or synthetic, computer-generated content. Transmission networks have different conditions and requirements for stationary as well as mobile devices. Finally, different end-user devices and displays are entering the market, such as stereoscopic and auto-stereoscopic displays, as well as multi-view displays. From the compression point of view, visual data requires most of the data rate, and different formats with appropriate compression methods are currently being investigated. These methods can be clustered into vision-based approaches for natural content and (computer) graphics-based methods for synthetic content.

A straightforward representation for natural content is the use of N camera views as N color videos. Here, compression methods derive technology from 2D video coding, where spatial correlations in each frame as well as temporal correlations within the video sequence are exploited. In multi-view video coding (MVC) [1], correlations between neighboring cameras are also considered, and thus multi-view video can be coded more efficiently than coding each view individually.

The first types of 3D displays currently hitting the market are stereoscopic displays. Therefore, special emphasis has been given to 2-view or stereoscopic video, for which three main coding approaches are investigated. The first approach considers conventional stereo video and applies either individual coding of both views or multi-view video coding. Although the latter provides better coding efficiency, some mobile devices require reduced decoding complexity, such that individual coding might still be considered. The second approach applies data reduction prior to coding. This method is called mixed-resolution coding and is based on the binocular suppression theory, which states that the overall visual perception stays close to that of the original high resolution of both views if one original view and one low-pass filtered view (i.e., the upsampled smaller view) are presented to the viewer. The third coding approach is based on one original view with associated per-pixel depth data; the second view is generated from this data after decoding. Usually, depth data can be compressed better than color data, so coding gains are expected in comparison to the two-view color-only approaches. This video+depth method has another advantage: the baseline, or virtual eye distance, can be varied and thus adapted to the display and/or adjusted by the user to change the depth impression. However, the method also has disadvantages, namely the initial creation or provision of depth data by limited-capability range cameras or error-prone depth estimation. Also, areas only visible in the second view are not covered. All this may lead to a disturbed synthesized second view.
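To make the video+depth approach concrete, the following sketch shows how a second view could be synthesized from one decoded color view and its per-pixel depth map. It is not the author's implementation: the function name, the pinhole-camera disparity relation d = f·b/Z, and the focal_length and baseline parameters are illustrative assumptions, and a real renderer would additionally filter and inpaint the result.

```python
import numpy as np

def synthesize_second_view(color, depth, focal_length, baseline):
    """Illustrative depth-based view synthesis for the video+depth format.

    color        : (H, W, 3) uint8 array, the transmitted original view
    depth        : (H, W) float array, per-pixel depth (metric, assumption)
    focal_length : focal length in pixels (assumption)
    baseline     : virtual eye distance in metric units (assumption)

    Returns the warped second view and a mask of disoccluded holes,
    i.e. areas only visible in the second view that the format does
    not cover and that would need inpainting or residual data.
    """
    h, w = depth.shape
    target = np.zeros_like(color)
    zbuf = np.full((h, w), np.inf)        # z-buffer for occlusion handling
    holes = np.ones((h, w), dtype=bool)   # True where no pixel was warped

    # horizontal disparity from depth: d = f * b / Z
    disparity = focal_length * baseline / np.maximum(depth, 1e-6)

    for y in range(h):
        for x in range(w):
            xt = int(round(x - disparity[y, x]))  # shift along the baseline
            if 0 <= xt < w and depth[y, x] < zbuf[y, xt]:
                zbuf[y, xt] = depth[y, x]         # keep the closest surface
                target[y, xt] = color[y, x]
                holes[y, xt] = False
    return target, holes
```

Enlarging the baseline parameter directly changes the depth impression, which is the adaptation advantage mentioned above, while the returned hole mask grows accordingly.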
Currently, studies are being carried out on the suitability of each method, considering all aspects of the 3D content delivery chain.

Both approaches, multi-view video coding and stereo video coding, still have limitations, so newer 3D video coding work focuses on advanced multi-view formats. The most general format is multi-view video plus depth (MVD) [2]. It combines sparse multi-view color data with depth information and allows rendering arbitrarily dense intermediate views for any kind of auto-stereoscopic N-view display. The methods described above are a subset of this format. For possibly better data compression, variants of the MVD format are also considered, such as layered depth video (LDV). Here, one full view with depth is coded together with additional residual color and depth information from neighboring views; the residuals cover the disoccluded areas in those views. For these advanced formats, coding approaches will be evaluated with respect to best compression capability, but also with respect to down-scalability and backward compatibility with existing approaches, such as stereo video coding. The new approaches are more than a simple extension of existing methods: classical video coding is based on rate-distortion optimization, where the best compression at the best possible quality is sought. This objective quality is measured by the pixel-wise mean squared error (MSE) between the lossy decoded and the original signal.
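As background for the rate-distortion discussion, the sketch below shows the usual objective measures and a Lagrangian cost of the form J = D + λ·R, as commonly used for encoder mode decisions. The function names and the λ parameter are illustrative assumptions; the article does not prescribe a specific formulation.

```python
import numpy as np

def mse(original, decoded):
    """Pixel-wise mean squared error between the original and the lossy decoded frame."""
    diff = original.astype(np.float64) - decoded.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio derived from the MSE (8-bit content assumed)."""
    return 10.0 * np.log10(peak ** 2 / mse(original, decoded))

def rd_cost(distortion, rate_bits, lagrange_lambda):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lagrange_lambda * rate_bits

def choose_coding_option(original, candidates, lagrange_lambda):
    """Pick the (decoded_frame, rate_bits) candidate with the smallest RD cost."""
    return min(candidates,
               key=lambda c: rd_cost(mse(original, c[0]), c[1], lagrange_lambda))
```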
