2.3 Reversible Video Playback

Another example is reversible video playback:20 a video frame was encoded using distributed source coding (DSC), where both past and future frames were used as side information (predictors). This was done so that frames could be sent either forward or backward in time per the client's request, and the client could simply decode and play back the video in the transmission order with no excess buffering. Note that this was a more limited form of media interactivity than video browsing, but unlike JPEG2000-based approaches11,12 for video browsing, where each frame was encoded independently, the inter-frame redundancy was explicitly exploited here during actual media encoding, hence the resulting encoding rate is expected to be much lower.

2.4 Interactive Light Field Streaming

In the case of light fields,3 where a subset of a densely sampled 2-D array of images is used to interpolate a desired view using image-based rendering (IBR),5 the notion of interactive media streaming has been investigated extensively.13–17 These works were motivated by the very large size of the original image set,6 which would cause intolerable delay if the set had to be transmitted in its entirety before the user's interaction could begin.

To provide random access to images in the set, Aaron et al.16 (intra DSC) encoded non-key images independently using a Wyner-Ziv encoder (key images were encoded as I-frames), and transmitted a different amount of the encoded bits depending on the quality of the side information (frames already transmitted) available at the decoder. To exploit the large spatial correlation inherent among densely sampled images, however, the majority of these works13–15,17 encoded the image set using disparity compensation, i.e., differential coding using neighboring view images as predictors.

For a given image, Jagmohan et al.17 and Ramanathan et al.15 used DSC and SP-frame-like lossless coding, respectively, to eliminate frame differences caused by the use of different predictors along different decoding paths. Specifically, Jagmohan et al.17 proposed to encode disparity information for every possible neighboring view image, plus DSC-based coset bits, so that when both are applied, the resulting image is the same no matter which neighboring image was used as predictor.
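The papers above do not provide an implementation, but the coset principle behind such DSC-based, predictor-independent reconstruction can be illustrated with a minimal scalar sketch in Python. The function names and the fixed coset spacing below are illustrative assumptions, not the authors' codec: as long as any admissible predictor differs from the true sample by less than half the coset spacing, transmitting only the coset index lets every decoding path recover the identical value.

    # Illustrative coset coding: spacing chosen so that any predictor within
    # +/- COSET_STEP/2 of the true sample yields the same reconstruction.
    COSET_STEP = 8

    def encode_coset(x: int) -> int:
        # Encoder sends only the coset index of the quantized sample.
        return x % COSET_STEP

    def decode_coset(coset_idx: int, side_info: int) -> int:
        # Decoder picks, within the signalled coset, the member closest
        # to its side information (whichever predictor it happens to have).
        below = side_info - ((side_info - coset_idx) % COSET_STEP)
        above = below + COSET_STEP
        return below if side_info - below <= above - side_info else above

    # Any predictor within +/-3 of the true value 130 reconstructs exactly 130,
    # regardless of which neighboring view supplied it.
    assert all(decode_coset(encode_coset(130), p) == 130 for p in range(127, 134))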
Instead of DSC, Ramanathan et al.15 used a lossless coding scheme akin to SP-frames in H.26425 to encode the residues after disparity compensation, so that the exact same frame was reconstructed no matter which predictor was used.

Bauermann et al.13,14 took a different approach and analyzed the tradeoffs among four quantities: storage rate, distortion, transmission data rate and decoding complexity. In particular, they first assumed that each coding block of an image was encoded as INTRA, INTER or SKIP, as done in H.263.26 Then, for a requested INTER coding block to be correctly decoded, all blocks in its dependency chain∗ that are not already in the client cache must be transmitted, creating a cost both in transmission rate and in decoding complexity. We denote this technique as rerouting, since the dependency path of blocks, from the desired INTER block all the way back to the initial INTRA block, must be re-traced and transmitted if not residing in the observer's cache.

2.5 ROI Video Streaming

A recent application called ROI video streaming involves the transmission of high-resolution video to an observer with a low-resolution display terminal. In such a scenario, the observer can choose between viewing the entire spatial region in low resolution, or viewing a selected smaller ROI in higher resolution. Mavlankar et al.18,19 proposed a multi-resolution motion compensation (multi-res MC) technique in which a low-resolution version of the video, called the thumbnail, was first encoded; each frame at higher resolution was then divided into tiled spatial regions, which were motion-compensated using an up-sampled version of the thumbnail as predictor. Prediction for the high-resolution frames used only the thumbnail, so that ROIs could be chosen arbitrarily across time by observers without creating complicated inter-frame dependencies among the high-resolution frames. The procedure can be repeated for even smaller spatial regions at even higher resolutions.

∗By dependency chain we mean all the blocks that need to be decoded before the requested block can be decoded, i.e., an INTRA block followed by a succession of disparity-compensated INTER blocks.
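As an informal sketch of the rerouting step described above (the block and cache representations here are assumptions made for illustration, not Bauermann et al.'s data structures), the server walks the dependency chain of the requested INTER block back toward its INTRA anchor and transmits only the blocks the client does not already cache:

    # Hypothetical block record: an INTRA block has no predictor; an INTER block
    # references exactly one predictor block in a neighboring view image.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Block:
        block_id: str
        predictor: Optional["Block"]  # None for INTRA blocks

    def blocks_to_transmit(requested: Block, client_cache: set[str]) -> list[Block]:
        # Re-trace the dependency chain from the requested block back to the
        # initial INTRA block (or the first cached ancestor), collecting every
        # block missing from the client cache.
        chain = []
        block = requested
        while block is not None and block.block_id not in client_cache:
            chain.append(block)
            block = block.predictor
        # Transmit in decoding order, i.e., deepest ancestor first.
        return list(reversed(chain))

The transmission-rate and decoding-complexity costs in the four-way tradeoff then grow with the length of the returned list.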
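The tile structure of the multi-res MC scheme also lends itself to a simple request mapping. The following sketch (tile dimensions and coordinates are illustrative assumptions, not Mavlankar et al.'s parameters) computes which high-resolution tiles a client would request to cover a chosen ROI, while the thumbnail alone serves as their predictor:

    # Illustrative mapping from a requested ROI (in high-resolution pixel
    # coordinates) to the set of fixed-size tiles that must be fetched.
    TILE_W, TILE_H = 256, 256  # assumed tile dimensions

    def tiles_for_roi(x: int, y: int, w: int, h: int) -> set[tuple[int, int]]:
        # Each tile is identified by its (column, row) index in the tile grid.
        first_col, last_col = x // TILE_W, (x + w - 1) // TILE_W
        first_row, last_row = y // TILE_H, (y + h - 1) // TILE_H
        return {(c, r)
                for r in range(first_row, last_row + 1)
                for c in range(first_col, last_col + 1)}

    # A 500x300 ROI at (600, 100) touches columns 2..4 and rows 0..1: six tiles.
    assert tiles_for_roi(600, 100, 500, 300) == {(c, r) for c in (2, 3, 4) for r in (0, 1)}

Because every tile is predicted only from the up-sampled thumbnail, the requested tile set can change arbitrarily from frame to frame without triggering the rerouting cost discussed above.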
