EECE 541 Multimedia Systems Project Proposal: Logo ... - Courses

EECE 541 Multimedia Systems 

Project Proposal: 

Logo Insertion for H.264 Compressed Video 

Instructor: Dr. Panos Nasiopoulos 

Group members: Gopichand Yalamanchili (20322061) 

Abdul K.Murad Agha (80089972) 

Teresa Zhou (74417999) 

Di Xu (31679079) 

February 26, 2008 

TABLE OF CONTENTS

I. INTRODUCTION

II. NOTATION AND TERMINOLOGY

III. VIDEO CODING AND TRANSCODING

A. BASIC TRANSCODING STRUCTURES

B. MPEG2 VIDEO CODING

C. H.264 VIDEO CODING

C.1. INTRA PREDICTION

C.2. INTER PREDICTION

IV. EXISTING METHODS FOR MPEG2 COMPRESSED VIDEO LOGO INSERTION

A. LOGO INSERTION POSITION

B. LOGO INSERTION IN SPATIAL DOMAIN

C. LOGO INSERTION IN TRANSFORM DOMAIN

D. LOW COST AND EFFICIENT LOGO INSERTION

D.1. LOGO-AFFECTED RANGE OF FRAMES IN THE TEMPORAL DOMAIN

D.2. MOTION INFORMATION ADJUSTMENT IN THE LOGO AND LOGO-AFFECTED PARTS

E. QUANTIZATION SCALE ADJUSTMENT

E.1. CONSTANT QUALITY AT THE LOGO PART

E.2. BIT REALLOCATION

V. MAIN ISSUES FOR H.264 COMPRESSED VIDEO LOGO INSERTION

REFERENCES

I. INTRODUCTION 

Transcoding is the process of converting the content of a compressed video stream from 

one format to another. A format is determined by characteristics such as the bit rate, 

frame rate, spatial resolution, coding syntax, and the content. One useful and highly 

demanded application of transcoding is inserting a logo into a stream of encoded video. 

There are many commercial applications for this technology. As there are now many 

television networks, the inserted logo is extremely effective for the viewers to identify the 

station. Throughout the years, we have come to associate the “peacock” logo with NBC, 

or the “eye” logo with CBS. These logos can greatly improve a broadcaster’s chances of 

viewer recognition. Several logo-insertion methods have been proposed for MPEG2 [1]

compressed video, but little work has been done on the considerably more complicated

case of logo insertion for H.264 [2] compressed video. In this project, we aim to

insert logos into H.264 compressed video.

The remainder of the project proposal is structured as follows. Section II briefly 

introduces the notation and terminology used herein. Then, Section III introduces some 

essential background of video coding and transcoding. In Section IV, we present the 

existing logo-insertion methods for MPEG2 compressed video, and point out the

weaknesses of these methods, especially when they are applied to H.264 compressed

video. Finally, Section V states the problems that need to be solved in logo insertion for

H.264 compressed video.

II. NOTATION AND TERMINOLOGY 

Before further discussing logo insertion into the H.264 video stream, we need to define 

some terminology used herein. In what follows, a video frame is partitioned into logo-unrelated

and logo-related parts. The logo-related part further comprises the “logo part” and the

“logo-affected part”. The region covered by the logo is called the logo part, and the region

outside the logo that is predicted from the logo part is called the logo-affected

part. In MPEG2 compressed video, the logo-affected part exists only in P and B frames,

whereas in H.264 compressed video it exists in I, P, and B frames, because intra prediction

also propagates changes within a frame.

Logos have different features. Commonly used logos can be classified as non-transparent

(i.e., solid) logos, transparent logos, rectangular logos, and arbitrarily shaped

logos.

Logo insertion in the pixel domain can be performed by combining each pixel of the

background image B(x,y) with the corresponding logo pixel L(x,y) to obtain the output image P(x, y). The

operation is usually expressed as a linear combination of the form:

P(x, y) = α × (L(x,y)) + (1 – α) × (B(x,y)) , (1) 

where the transparency factor α determines the transparency of the logo. The value α is 

in the range of 0 < α ≤ 1. In particular, when α = 1, all pixels of the background image 

are replaced by the logo, giving rise to an opaque overlapping of the logo over the input 

image. 
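
As an illustration of equation (1), the following minimal sketch blends a small logo into a background frame stored as nested lists of 8-bit luma values. The function name `blend_logo`, the `top` and `left` placement parameters, and the rounding and clipping to the range 0 to 255 are our own assumptions and are not taken from any of the cited methods.

```python
def blend_logo(background, logo, alpha, top=0, left=0):
    """Return a copy of `background` with `logo` blended in as in equation (1).

    background, logo: lists of rows of 8-bit sample values (hypothetical layout).
    alpha: transparency factor, 0 < alpha <= 1 (alpha = 1 gives an opaque logo).
    top, left: position of the logo's top-left corner inside the frame.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    out = [row[:] for row in background]  # leave the input frame untouched
    for y, logo_row in enumerate(logo):
        for x, l_val in enumerate(logo_row):
            b_val = background[top + y][left + x]
            # P(x, y) = alpha * L(x, y) + (1 - alpha) * B(x, y)
            p_val = alpha * l_val + (1.0 - alpha) * b_val
            out[top + y][left + x] = min(255, max(0, int(round(p_val))))
    return out


background = [[100] * 4 for _ in range(4)]
logo = [[200, 200], [200, 200]]
half = blend_logo(background, logo, alpha=0.5)    # semi-transparent logo
opaque = blend_logo(background, logo, alpha=1.0)  # background fully replaced
```

Only the samples under the logo change; the rest of the frame is copied unmodified.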

A logo often occupies a small portion of a frame, and is static over a frame sequence. 

Logos often appear in a corner of a frame (e.g. the top left corner, bottom right corner). 

They can, however, be anywhere in a slice for H.264 compressed video, since slice

partitioning is rather flexible in the H.264 standard. Moreover, logos may be present only in groups

of successive frames, as opposed to all frames in a video sequence.

III. VIDEO CODING AND TRANSCODING 

Having introduced some logo-related terminology, in this section, we will present several 

essential concepts for video coding and video transcoding. We first introduce three 

commonly used transcoding structures in what follows. 

A. Basic Transcoding Structures 

One straightforward transcoding structure is the cascaded form. For the cascaded 

structure as shown in Figure 1, the decoder decodes the compressed video stream 

completely, and the encoder re-encodes the reconstructed video into the target format. 

The cascaded architecture achieves high video quality, but it is computationally very 

expensive. Therefore, the cascaded structure is not often used, especially in real-time

transcoding. It is better to re-use the information contained in the original bit stream to

simplify the architecture.

Figure 1. Cascaded architecture in pixel domain. 

The open-loop structure is another commonly used transcoding structure. In the open-loop

system, the bit stream is first variable-length decoded (VLD) to reconstruct the quantized 

discrete cosine transform (DCT) coefficients, motion vectors, prediction modes, and 

other macroblock-level information. The quantized coefficients are then inverse 

quantized and modified according to the transcoding requirements. Finally, the modified 

data is re-quantized and variable length coded to achieve the new output format. Figure 2 

shows a requantization transcoder as an open-loop structure example. 

Figure 2. Open-loop architecture. 
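
The requantization step of the open-loop structure can be sketched as follows. This is a simplified illustration that assumes a plain uniform quantizer with integer step sizes `q1` (input) and `q2` (output); real MPEG2 quantization also involves weighting matrices and dead zones, which are omitted here. The function name `requantize` is hypothetical.

```python
def requantize(levels, q1, q2):
    """Map quantized levels from step size q1 to step size q2 (uniform quantizer assumed)."""
    out = []
    for level in levels:
        coeff = level * q1                    # inverse quantization with the input step size
        sign = -1 if coeff < 0 else 1
        # re-quantization with the output step size, rounding to the nearest level
        out.append(sign * ((abs(coeff) + q2 // 2) // q2))
    return out


levels_in = [12, -5, 3, 1, 0]
levels_out = requantize(levels_in, q1=4, q2=10)
print(levels_out)
```

No reference frame is reconstructed in this path, which is why the requantization error of one frame is not compensated in the frames predicted from it.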

Open-loop systems are relatively simple. They do not include motion estimation, motion 

compensation, DCT, or inverse DCT (IDCT). Therefore, open-loop structures are

computationally very efficient, but they suffer from low video quality due to the drift 

problem. The drift problem is caused by the mismatch between the actual reference frame 

used for motion estimation in the encoder and the degraded reference frame used for 

motion compensation in the decoder. The drift problem accumulates and causes severe 

degradation to the video quality. 

Closed-loop systems avoid both the high complexity of the cascaded architecture and the poor video

quality of the open-loop structure, and thus provide a good tradeoff

between quality and computational complexity. In a closed-loop structure, significant

complexity savings can be achieved while still maintaining acceptable video quality.

Closed-loop systems provide drift compensation for re-quantized data. They aim at 

eliminating the mismatch between predictive and residual components by approximating 

the cascaded transcoding architecture. In the simplified closed-loop structure as shown in 

Figure 3, only one reconstruction loop is required, with one DCT and one IDCT. Some

inaccuracy is introduced by the non-linear way in which the reconstruction

loops are combined. However, it has been found that this approximation has

little effect on the video quality.

Figure 3. Closed-loop architecture. 

B. MPEG2 Video Coding 

Since we are relatively familiar with the MPEG2 coding process, we only present a point that

is easily overlooked. In MPEG2, the frame to be compressed is divided into 16×16 pixel 

macroblocks. Then, for each of these macroblocks in P and B frames, the reconstructed 

reference frame is searched to find a macroblock that best matches the macroblock to be 

compressed. The offset is encoded as a motion vector. The match between the two 

macroblocks will often not be perfect. To correct this, the encoder computes the residual

between the original pixel values and the predicted pixel values. The residual for each

macroblock is appended to the motion vector, and the spatial redundancy is further 

reduced by the DCT transform. Sometimes, no suitable match is found. Then, the 

macroblock is treated like an I-frame macroblock. Therefore, both inter and intra modes 

can be used to compress macroblocks in MPEG2 P and B frames. 

C. H.264 Video Coding 

In this project, we aim to add logos into H.264 coded video streams. The H.264 standard 

is the latest video coding standard. A brief description of the H.264 standard is given

below.

C.1. Intra Prediction 

If a block or macroblock is encoded in the intra mode, a prediction block is formed based 

on previously encoded and reconstructed (but un-filtered) blocks from the same slice. 

This prediction block is subtracted from the current block prior to encoding. For the 

luminance (luma) samples, a prediction block may be formed for each 4×4 subblock or 

for a 16×16 macroblock. There are a total of nine optional prediction modes for each 4×4 

luma block; four optional modes for a 16×16 luma block; and four optional modes for 

each 8×8 chroma block. Note that the same mode is always applied to both chroma 

blocks. 
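
To make the dependence on neighbouring samples concrete, the sketch below implements two of the nine 4×4 luma modes (vertical and DC) and shows what happens when the samples above a block are changed, for example by a logo, while the residual stored in the bit stream is kept. The function names and sample values are hypothetical, and the DC formula is the usual rounded mean of the four upper and four left neighbours when all of them are available.

```python
def predict_vertical_4x4(above):
    """Vertical mode: every row repeats the four reconstructed samples above the block."""
    return [list(above) for _ in range(4)]


def predict_dc_4x4(above, left):
    """DC mode: every sample is the rounded mean of the 4 above and 4 left neighbours."""
    dc = (sum(above) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


def reconstruct(prediction, residual):
    """Decoder side: prediction plus the residual carried in the bit stream."""
    return [[p + r for p, r in zip(prow, rrow)] for prow, rrow in zip(prediction, residual)]


above_original = [100, 102, 104, 106]
left_original = [98, 98, 100, 100]
block = [[100, 102, 104, 106] for _ in range(4)]

# Encoder: residual of the block with respect to its vertical prediction.
prediction = predict_vertical_4x4(above_original)
residual = [[b - p for b, p in zip(brow, prow)] for brow, prow in zip(block, prediction)]

# If a logo overwrites the samples above the block and the residual is re-used
# unchanged, the block outside the logo is reconstructed incorrectly.
above_with_logo = [200, 200, 200, 200]
drifted = reconstruct(predict_vertical_4x4(above_with_logo), residual)
dc_value = predict_dc_4x4(above_original, left_original)[0][0]
```

The block itself lies outside the logo, yet its reconstruction changes, which is exactly the logo-affected part introduced in Section II.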

C.2. Inter Prediction 

Inter prediction creates a prediction model from one or more previously encoded video 

frames. The model is formed by shifting samples in the reference frame(s) (i.e., motion 

compensated prediction). The H.264 codec uses block-based motion compensation. The 

H.264 standard supports motion compensation block sizes ranging from 16×16 to 4×4 

luminance samples with many options in between. The luminance component of each 

macroblock (16×16 samples) may be split up in four ways: 16×16, 16×8, 8×16 or 8×8. 

Each of the sub-divided regions is a macroblock partition. If the 8×8 mode is chosen, 

each of the four 8×8 macroblock partitions within the macroblock may be further split in

one of four ways: 8×8, 8×4, 4×8 or 4×4 (known as macroblock sub-partitions). These

partitions and sub-partitions give rise to a large number of possible combinations within 

each macroblock. 
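
The number of combinations can be counted directly. The short sketch below enumerates every luma partitioning of one macroblock and the number of motion vectors each one needs, assuming one motion vector per partition or sub-partition (a single prediction direction); the names are ours.

```python
from itertools import product

# Number of partitions produced by each macroblock mode other than 8x8.
MACROBLOCK_MODES = {"16x16": 1, "16x8": 2, "8x16": 2}
# Number of sub-partitions produced inside one 8x8 partition.
SUB_MODES = {"8x8": 1, "8x4": 2, "4x8": 2, "4x4": 4}


def enumerate_layouts():
    """Return (layout, motion_vector_count) for every partitioning of a macroblock."""
    layouts = [((mode,), count) for mode, count in MACROBLOCK_MODES.items()]
    # In 8x8 mode, each of the four 8x8 partitions picks its sub-mode independently.
    for subs in product(SUB_MODES, repeat=4):
        layouts.append((("8x8",) + subs, sum(SUB_MODES[s] for s in subs)))
    return layouts


layouts = enumerate_layouts()
mv_counts = [count for _, count in layouts]
print(len(layouts), min(mv_counts), max(mv_counts))  # 3 + 4**4 = 259 layouts, 1 to 16 vectors
```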

A separate motion vector is required for each partition or sub-partition. Each motion 

vector must be coded and transmitted. In addition, the choice of partition(s) must be 

encoded and stored in the compressed bit stream. Choosing a large partition size (e.g. 

16×16, 16×8, 8×16) means that a small number of bits are required to signal the choice of 

motion vector(s) and the type of partition; however, the motion compensated residual 

may contain a significant amount of energy in frame areas with high detail. Choosing a 

small partition size (e.g. 8×4, 4×4, etc.) may give a lower-energy residual after motion 

compensation, but requires a larger number of bits to signal the motion vectors and 

choice of partition(s). The choice of partition size, therefore, has a significant impact on 

compression performance. In general, a large partition size is appropriate for 

homogeneous areas of the frame and a small partition size may be beneficial for detailed 

areas. 

IV. EXISTING METHODS FOR MPEG2 COMPRESSED VIDEO LOGO INSERTION 

Most methods for logo insertion are developed for MPEG2 compressed video. In what 

follows, we will introduce such methods presented in several different papers, which

handle logo insertion from different perspectives. We will also analyze whether the 

methods are appropriate to be adopted in logo insertion for H.264 compressed video. 

A. Logo Insertion Position 

For logo insertion, the first step is to determine the position of the logo.

Liu’s paper [3] places the logo in the top left-hand corner. A further issue is

to choose the logo position so that the effect on the coded macroblocks

is minimized. In [3], the logo is aligned to the macroblock grid to minimize

the number of macroblocks affected. If the logo is not aligned, nine macroblocks

are affected, whereas if it is aligned to the macroblocks, only four macroblocks are

affected, as shown in Figure 4.

Figure 4. Diagram of logo aligned to macroblocks. 
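
The four-versus-nine count follows from simple grid arithmetic. The sketch below assumes 16×16 macroblocks and a 32×32-pixel logo (two macroblocks wide and two high, the size that yields exactly these counts); the function names and pixel coordinates are hypothetical.

```python
MB_SIZE = 16  # macroblock width and height in pixels


def macroblocks_covered(left, top, width, height, mb_size=MB_SIZE):
    """Count the macroblocks that a logo rectangle overlaps."""
    first_col = left // mb_size
    last_col = (left + width - 1) // mb_size
    first_row = top // mb_size
    last_row = (top + height - 1) // mb_size
    return (last_col - first_col + 1) * (last_row - first_row + 1)


def align_down(value, mb_size=MB_SIZE):
    """Snap a coordinate to the macroblock grid (towards the top-left corner)."""
    return (value // mb_size) * mb_size


unaligned = macroblocks_covered(left=8, top=8, width=32, height=32)
aligned = macroblocks_covered(left=align_down(8), top=align_down(8), width=32, height=32)
print(unaligned, aligned)
```

Snapping the requested position to the grid before insertion is one simple way to obtain the aligned case.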

B. Logo Insertion in Spatial Domain 

As explained in [3], one of the approaches for logo insertion is to insert the logo in the 

spatial domain. Figure 5 gives the block diagram for spatial domain logo insertion 

structure. 

Figure 5. Logo insertion in spatial domain. 

The transcoding works as follows. First, the input video stream (containing the I-frame data

and the motion vectors and residuals of the P and B frames) goes through entropy

decoding to recover the quantized coefficients. Then, the residual components of the P and B

frames are sent through an inverse quantizer for dequantization and fed

through an inverse DCT. The motion vectors go through two different paths.

The first path is through motion compensation, where they are combined with the

fed-back previous frame to form a prediction of the current frame without the

residual. The output of the first motion compensation is then added to the residual to form

the complete picture for the current frame. The output is also placed into a buffer so that

the next frame can use the previous frame’s picture for motion compensation. The motion

vectors also go through the second motion compensation block in the encoder part of the

loop, where they are used to correct the logo insertion errors, since the encoder re-uses the

motion vectors of the input video.

After the output of the first motion compensation block has been added to the residual

of the current frame, the logo is added at this point using formula (1), that is,

P(x, y) = α × (L(x,y)) + (1 – α) × (B(x,y)) . 

This is then fed in for error correction. After that, it will be encoded using a DCT block, 

quantized, and then entropy encoded for output. There is a rate control mechanism to 

keep a consistent bit rate output for the stream. This mechanism increases or decreases 

the quantizer parameter of Q2 as a means of controlling the bit rate, and hence also the quality, of

the output frame. 

The method in [4] shares the architecture described above except for the definition of the

motion vectors (MV). Early transcoders re-use the motion vectors from the input bit-stream.

However, such a motion vector may not point to the best match for

macroblocks that are close to the insertion area, since part of their content is always static.

To maintain high coding efficiency, [4] suggests setting the motion vector to

zero for logo macroblocks that are dominated by logo content, and using the original motion

vector from the input bit-stream for logo macroblocks that are dominated by video

content. For example,

MV(x, y) = (0, 0), when α is greater than or equal to a threshold, e.g., 0.5. 

MV(x, y) = MV’(x, y), otherwise. That is, using MV’s decoded from input bit-stream. 
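
The rule above is a simple per-macroblock selection. A minimal sketch follows, with the hypothetical function name `select_motion_vector` and the example threshold of 0.5 quoted above as the default:

```python
def select_motion_vector(alpha, input_mv, threshold=0.5):
    """Choose the motion vector of a logo macroblock.

    alpha: transparency factor of the logo over this macroblock.
    input_mv: (dx, dy) decoded from the input bit-stream.
    threshold: alpha value above which logo content is taken to dominate.
    """
    if alpha >= threshold:
        return (0, 0)    # logo dominates: the content is static, so use a zero vector
    return input_mv      # video dominates: keep the decoded vector


print(select_motion_vector(0.8, (3, -2)))
print(select_motion_vector(0.2, (3, -2)))
```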

The above scheme is developed for MPEG2 coded video. For H.264 coded video, 

however, this logo insertion architecture needs to be modified due to the multi-reference

capability of H.264. The above motion vector redefinition concept can

probably be used in the H.264 video stream logo insertion. However, the threshold of 

transparency factor α needs to be refined carefully in order to achieve a good coding 

efficiency for H.264 coded video. 

C. Logo Insertion in Transform Domain 

The transform-domain insertion algorithm described by Liu [3] is somewhat less complicated than the

spatial-domain one: fewer DCT and IDCT blocks are needed on the

decoding and encoding sides. Because of the linearity of the DCT,

shown in (2), additions in the transform domain are equivalent to additions in the spatial domain.

DCT(m + n) = DCT(m) + DCT(n) and

DCT{α × m(x,y) + (1 – α) × n(x,y)} = α × DCT{m(x,y)} + (1 – α) × DCT{n(x,y)}. (2)
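
Property (2) can be checked numerically. The sketch below uses a straightforward (unoptimized) orthonormal 8×8 DCT-II written with the standard library only; the block contents and the value of α are arbitrary test data. It compares blending in the pixel domain followed by the DCT with blending of the DCT coefficients themselves.

```python
import math


def dct_1d(samples):
    """Orthonormal 1-D DCT-II of a sequence of numbers."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(s * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i, s in enumerate(samples))
        out.append(scale * acc)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major order


def blend(m, n, alpha):
    """alpha * m + (1 - alpha) * n, element by element."""
    return [[alpha * a + (1 - alpha) * b for a, b in zip(ra, rb)] for ra, rb in zip(m, n)]


logo = [[(x * y) % 7 * 30 for x in range(8)] for y in range(8)]
background = [[(3 * x + 5 * y) % 256 for x in range(8)] for y in range(8)]
alpha = 0.3

blend_then_transform = dct_2d(blend(logo, background, alpha))
transform_then_blend = blend(dct_2d(logo), dct_2d(background), alpha)
max_diff = max(abs(a - b)
               for ra, rb in zip(blend_then_transform, transform_then_blend)
               for a, b in zip(ra, rb))
print(max_diff)  # only floating-point round-off remains
```

The equality holds before any rounding or clipping of the blended samples; those non-linear steps are not covered by (2).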

The architecture of the logo insertion in the transform domain is shown in Figure 6. The 

video stream is first entropy decoded and inverse quantized. This is similar to the spatial 

domain strategy. The logo is then inserted in the DCT domain. Next, motion vectors are 

fed back for DCT motion compensation, and subtraction is done for error correction. 

These steps are all done in the DCT domain. Then, the output is quantized before entropy

encoding and rate control. The quantized version is also inverse quantized and fed back

to the buffer for motion compensation of the next frame and error correction. 

Figure 6. Logo insertion in transform domain. 

D. Low Cost and Efficient Logo Insertion 

High accuracy and efficiency are two important criteria for logo insertion. One efficient

logo-insertion method, proposed by Xiao et al., is presented in [5]. In that paper, the

authors present efficient logo insertion methods for transparent and non-transparent

logos in MPEG2 compressed video. They consider the refinement of prediction modes

and motion vectors for different types of macroblocks. The method should be

applicable in both the spatial and transform domains.

In logo-insertion transcoding for MPEG2 compressed video, we need to compensate for the

changes caused by the logo insertion. Such changes propagate through frames when P

and B frames refer to their reference frames in the motion prediction process. For H.264

compressed video, however, changes also propagate within I frames due to the use

of intra prediction, as described in Section III. Moreover, logos are sometimes not inserted in all

frames of a video sequence. We now introduce the method used in [5] for

determining the range of frames affected by logo insertion in MPEG2 coded

video.

D.1. Logo-Affected Range of Frames in the Temporal Domain 

Let [l, h] denote the range of the video sequence where a logo is required to be inserted. 

That is, the indices l and h are the lowest and highest frame numbers of this range, 

respectively. We further let [L, H] represent the range of video sequence which is 

affected by the logo insertion due to the change propagation caused by frame reference. 

Clearly, we have [l, h] being a subset of [L, H]. Reference frames are frames of a 

compressed video that are used to define future frames. In MPEG2 video coding standard, 

reference frames are I and P frames. The previous reference frame lr of the range [l, h] is the

reference frame before l with the largest frame number, and the next reference frame hr is the

reference frame after h with the smallest frame number. Then, the indices L and H can

be determined using (3) and (4), as follows:

$$L = l_r + 1, \tag{3}$$

$$H = \begin{cases} h_r, & \text{if } h_r \text{ is not an I frame}, \\ h_r - 1, & \text{otherwise}. \end{cases} \tag{4}$$

Figure 7 gives two examples of the logo-affected range of frames. In these examples, we use

the typical group of pictures (GOP) structure IBBPBBPBBP…, where I and P frames are

reference frames. The frame dependencies are also drawn in the figure. In

example one, [5, 16] is the range of frames [l, h] where logos are added. The previous

reference frame lr of the range [l, h] is 3, and the next reference frame hr of the range [l, h]

is 18. According to (3) and (4), we obtain the logo-affected range of frames [L, H] as [4, 18].

Similarly, in example two, the range [L, H] is [4, 14].

Figure 7. Examples of logo affected range of frames. 
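
Rules (3) and (4) can be sketched in a few lines. The code below assumes zero-based frame numbers in display order and a GOP of 15 frames (an I frame at positions 0 and 15); these assumptions are ours, chosen because they reproduce the numbers quoted for example one. The logo range [5, 13] used in the second call is likewise a hypothetical input that yields the affected range quoted for example two.

```python
def affected_range(frame_types, l, h):
    """Return (L, H), the logo-affected frame range for a logo inserted in frames [l, h].

    frame_types: string of 'I', 'P', 'B', one character per frame.
    Reference frames are the I and P frames, as in MPEG2.
    """
    refs = [i for i, t in enumerate(frame_types) if t in "IP"]
    before = [r for r in refs if r < l]
    after = [r for r in refs if r > h]
    if not before or not after:
        raise ValueError("no reference frame on one side of the logo range")
    lr = max(before)   # previous reference frame of [l, h]
    hr = min(after)    # next reference frame of [l, h]
    L = lr + 1                                      # equation (3)
    H = hr if frame_types[hr] != "I" else hr - 1    # equation (4)
    return L, H


gop = "IBBPBBPBBPBBPBB"            # assumed 15-frame GOP
frame_types = gop + "IBBPBB"       # frames 0 to 20
print(affected_range(frame_types, 5, 16))  # next reference is the P frame 18
print(affected_range(frame_types, 5, 13))  # next reference is the I frame 15
```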

Once the logo affected frame range [L, H] is identified, we can focus on compensating 

the changes induced by logo insertion within this range of frames. The frames outside the 

range [L, H] are not affected, and therefore remain unchanged. 

The scheme described above cannot be directly applied to H.264 compressed video. This 

is because, in H.264 video coding standard, all three types of frames (i.e., I, P, and B 

frames) can be used as reference frames. Multiple reference frames are also used, which 

makes determining the range [L, H] more challenging. The affected-frame range for 

H.264 coded video therefore needs to be determined more carefully.

D.2. Motion Information Adjustment in the Logo and Logo-Affected Parts 

Having introduced the logo-affected range of frames, we now state the motion 

information adjustment in [5]. As mentioned earlier in Section II, a frame can be 

partitioned into the logo part, logo affected part, and logo unrelated part. For simplicity, 

in [5], the authors assume that the logo is rectangular and covers an integer number of

macroblocks. The coding modes and motion vectors for macroblocks at the logo 

unrelated part shall remain unchanged. In what follows, we discuss the motion-vector 

refinement of macroblocks in the logo and logo-affected parts, respectively. 

For I and P frames in the logo part, if the first reference frame in the range of [l, h] is a P 

frame, then set the macroblock mode (of those macroblocks in the logo part) to be intra-coded.

If the first reference frame in the range of [l, h] is an I frame instead, the 

macroblock modes for all I and P frames remain the same. 

For B frames in the logo part, if its frame number is smaller than the first reference frame 

in the range of [L, H], set the macroblock mode to be backward predicted. If its frame 

number is larger than the last reference frame in the range of [L, H], set the macroblock 

mode to be forward predicted or intra coded. Then, set the prediction mode for all other B 

frames in [L, H] to be forward predicted. 

Having the prediction modes refined, we set all motion vectors for inter-coded 

macroblocks in the logo part to be zero. Clearly, this scheme is suitable for non-transparent

logos. For transparent logos, however, especially for logos with their 

transparency factor α being small (e.g., α

E.1. Constant Quality at the Logo Part 

We begin with the non-transparent logo case. Non-transparent logos should

appear static throughout the video sequence. Therefore, we set the quantization scales for all the intra-coded

macroblocks in the logo area to be the same value, so that the logo region has 

approximately the same quality. For inter-coded macroblocks, the zero motion vector and 

stationary logo content ensure the prediction accuracy. No residual information is needed. 

Hence, we use the skip mode for P frames, and the maximum quantization scales for B 

frames due to the lack of skip mode for B frames. 

Unlike non-transparent logos, transparent logos are superimposed on the original video 

frames. Therefore, the logo parts are not identical over different frames. To ensure 

approximately the same quality in the logo area, we still use the same quantization scales 

for all the intra-coded macroblocks in the logo area. For the inter-coded macroblocks, the 

prediction residuals are no longer zero or negligible. In [5], a slightly larger

quantization scale is used for inter-coded macroblocks than for intra-coded macroblocks.

The quantization scales for both intra and inter macroblocks are inversely proportional to

the total output bit rate. In doing so, we prevent the logo part from consuming too many

bits at low bit rates.

E.2. Bit Reallocation 

Rate control is also an important requirement for video transcoding. It is usually 

accomplished by adjusting the quantization scales. The transcoded video with logo may 

consume more bits than the original coded video. In order to control the overall bit rate of 

the encoded video while achieving a constant visual quality in the logo area, we need to 

adjust the quantization scales for the macroblocks not covered by logo. One simple 

practical scheme proposed in [5] is to pre-encode the logo part, then deduct the consumed 

bits from the total target bits, and adjust the bit allocation of the non-logo-covered area 

according to the remaining bit budget.
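
The arithmetic of this scheme can be sketched as follows. The text only says that the bits consumed by the logo part are deducted from the target; sharing the remaining budget among the other macroblocks in proportion to the bits they used in the input stream is our own assumption for illustration, as are all names and numbers.

```python
def reallocate_bits(target_bits, logo_bits, original_mb_bits):
    """Distribute the bits left after coding the logo part over the other macroblocks.

    target_bits: bit budget of the frame.
    logo_bits: bits consumed by the pre-encoded logo part.
    original_mb_bits: bits each non-logo macroblock used in the input stream.
    """
    remaining = target_bits - logo_bits
    if remaining <= 0:
        raise ValueError("the logo part alone exceeds the frame budget")
    total = sum(original_mb_bits)
    # Proportional sharing (an assumption); integer division never overshoots the budget.
    return [remaining * bits // total for bits in original_mb_bits]


allocation = reallocate_bits(target_bits=10000, logo_bits=1000,
                             original_mb_bits=[4000, 3000, 3000])
print(allocation)
```

A rate controller would then pick, for each macroblock, the quantization scale that meets its share.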

Note that the bit reallocation method described above is not the most efficient under some 

circumstances. If the logo is inserted in the top left corner of a frame, the pre-encoding of 

the logo part is not necessary. This is because the logo part is coded before most of the 

other macroblocks are coded in this case, and the bits consumed by the logo-part are 

known before most adjustment can be done. Furthermore, if the logo transparency is high, 

adopting the original motion vectors might be more efficient than using zero motion 

vectors for the inter-coded macroblocks. 

V. MAIN ISSUES FOR H.264 COMPRESSED VIDEO LOGO INSERTION 

As indicated earlier, most existing methods for logo insertion are developed for MPEG2 

compressed video. The H.264 standard is much more complex and flexible, and therefore 

more difficult to handle and introduces more challenges. 

In H.264, intra prediction is used for coding the intra macroblocks in a frame. Hence, I

frames have a logo-affected part due to the dependence within the same slice. This challenge

did not exist in MPEG2. The logo-affected areas caused by intra prediction need to be

identified and properly compensated. The intra prediction issue poses a complex

challenge. Because every intra macroblock depends on neighboring macroblock pixels

(i.e., the bottom-right corner pixel of the top-left neighboring macroblock, the bottom row of

pixels of the top neighboring macroblock, or the right column of pixels of the left macroblock), the

boundary pixels of the logo-affected area should remain the same; otherwise, a

drift error will accumulate over time.

H.264 uses a more accurate half-pixel approximation (a 6-tap interpolation filter) to find a better

match in the motion estimation stage, whereas MPEG2 uses bilinear interpolation. There are

many proposals to re-use the motion vectors from the decoder; however, directly using those

motion vectors is not optimal, and an extra step is needed to compensate for the difference

between 6-tap and bilinear interpolation. In addition, H.264 supports quarter-pixel

accuracy to further refine the motion vectors, which did not exist in MPEG2 and

introduces another challenge. The half-pixel and quarter-pixel differences make many

of the MPEG2 motion-vector transcoding solutions not easily usable for

H.264 transcoding.
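
The difference between the two half-pixel interpolators is easy to see on a single row of samples. The sketch below uses the H.264 luma half-sample filter with taps (1, -5, 20, 20, -5, 1)/32 and the rounded two-sample average of MPEG2; the function names and sample values are hypothetical, and only the one-dimensional horizontal case is shown.

```python
def clip8(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))


def h264_half_pel(e, f, g, h, i, j):
    """H.264 luma half-sample between g and h: 6-tap filter (1, -5, 20, 20, -5, 1) / 32."""
    return clip8((e - 5 * f + 20 * g + 20 * h - 5 * i + j + 16) >> 5)


def mpeg2_half_pel(g, h):
    """MPEG2 half-sample between g and h: rounded average of the two neighbours."""
    return (g + h + 1) >> 1


row = [0, 0, 100, 200, 255, 255]            # six consecutive integer-position samples
six_tap = h264_half_pel(*row)
bilinear = mpeg2_half_pel(row[2], row[3])
print(six_tap, bilinear)
```

The same motion vector therefore points to slightly different prediction samples in the two codecs, so the residual has to be recomputed even when the vector is re-used.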

Several papers discuss and propose logo insertion in the DCT domain for

MPEG2, where the DCT is applied to 8×8 blocks. In H.264, the block size is different (i.e., 4×4)

and the transform is an integer DCT-like transform. The issues here are: 1) the transform

sizes differ; 2) the MPEG2 DCT is an independent process, whereas in H.264 the transform is

combined with quantization; 3) MPEG2 DCT-domain logo insertion is efficient for intra

macroblocks only because they are completely independent. These issues make transform-domain

logo insertion for H.264 extremely hard.

MPEG2 quantization has 32 quantization steps with some dead-zone assumptions,

and is applied by division operations. In H.264, quantization is combined

with the transform; it uses 52 quantization steps and is applied by lookup tables and

shifts. Several schemes have been suggested for MPEG2 quantization and requantization

mapping for rate control purposes. Almost all of them are inapplicable because the whole

process differs too much in terms of step ranges, quantization error approximation, and

the quality corresponding to each quantization step. A related issue is that the distortion

approximation used for rate control is completely different, which makes the optimization equations

for MPEG2 highly suboptimal for H.264.
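
To illustrate how the 52 H.264 quantization steps are organized, the sketch below builds the step-size table from six base values and a doubling for every increase of 6 in the quantization parameter (QP). The base values are the commonly tabulated ones and should be checked against the standard [2] before use; the names `BASE_QSTEP` and `h264_qstep` are ours.

```python
# Step sizes for QP 0..5 (commonly tabulated values; verify against the standard).
BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def h264_qstep(qp):
    """Quantizer step size for a quantization parameter in the range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    # Table lookup for QP mod 6, then one doubling (a left shift) per 6 QP units.
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))


QSTEP_TABLE = [h264_qstep(qp) for qp in range(52)]
print(len(QSTEP_TABLE), QSTEP_TABLE[0], QSTEP_TABLE[24], QSTEP_TABLE[51])
```

The step size thus grows exponentially with QP, so a requantization mapping designed for the MPEG2 quantizer scale does not carry over directly.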

The wide variety of block sizes in H.264 is an important feature that in general can

improve the coding efficiency. However, in the case of logo insertion, some of the modes

might not have an effect on quality because the transcoder is encoding previously

encoded frames. There are some existing techniques for mode selection in

MPEG2 logo insertion. However, the wider range of modes and block sizes in H.264

makes it hard to apply any of the current MPEG2 methods to H.264.

H.264 uses a variable-width in-loop deblocking filter affecting up to 2-3 border pixels of

each macroblock, and the deblocked frames are used as references for future

frames. This creates a significant challenge, in that more boundary pixels are

not to be changed. In other words, after logo insertion, more boundary pixels have to be

identical to their values before logo insertion.

Another issue with H.264 is that multiple reference frames can be used for motion

prediction in the inter prediction mode. The prediction modes and motion vectors therefore have

more combinations and possibilities. There are also multiple backward reference frames

for inter modes in B frames. This issue alone multiplies the resource requirements (e.g.,

temporary memory) of an H.264 transcoder compared with an MPEG2 one.

H.264 also offers new features that did not exist before, such as I_PCM (lossless

coding of raw sample values) and arbitrary slice shapes and sizes. These features might make logo

insertion in H.264 completely different from the existing MPEG2 work. They need to be

investigated to assess how efficiently they can be used. In this project, we aim to solve the logo-insertion

transcoding issues for H.264 coded video.

REFERENCES 

[1] ISO/IEC 13818-2, Generic Coding of Moving Pictures and Associated Audio Information: Video 

International Organization for Standardization, Nov. 1994, Draft International Standard (MPEG-2 

Video). 

[2] “Draft Text of Final Draft International Standard for Advanced Video Coding,” Int. Telecommun. 

Union-Telecommun. (ITU-T), Geneva, Switzerland, Recommendation H.264 (draft), Mar. 2003 

[3] Y. Liu, G. Li, Q. Tang, and J. Guo, “DCT Domain Logo Insertion of MPEG2 Transcoding,” in Proc. 

IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), vol.2, May 2003, pp. 

1219-1222.

[4] K. Panusopone, X. Chen, and F. Ling, “Logo insertion in MPEG transcoder,” in Proc. IEEE 

International Conference on Acoustics, Speech, and Signal Processing, Salt Lake City, USA, vol. 2,

May 2001, pp. 981-984. 

[5] S. Xiao, L. Lu, J. L. Kouloheris, and C. A. Gonzales, “Low-Cost and Efficient Logo Insertion Scheme 

in MPEG Video Transcoding,” Proc. of SPIE, Visual Communications and Image Processing, vol. 

4617, Jan. 2002, pp. 172-179. 

[6] J. Xin, C.-W. Lin, and M.-T. Sun, “Digital Video Transcoding,” in Proc. of the IEEE, vol. 93, Issue 1, 

Jan. 2005, pp. 84-97. 

[7] N. Roma and L. Sousa, “Insertion of Irregular-Shaped Logos in the Compressed DCT Domain,” 14th 

International Conference on Digital Signal Processing, vol.1, 2002, pp. 125-128. 

[8] D. G. Morrison, M. E. Nilson, and M. Ghanbari, “Reduction of the Bit-Rate of Compressed Video 

While in Its Coded Form,” in Proc. 6th Int. Workshop Packet Video, 1994, pp. D17.1–D17.4. 

[9] G. Keesman, R. Hellinghuizen, F. Hoeksema, and G. Heideman, “Transcoding of MPEG Bitstreams,” 

Signal Process. Image Commun., vol. 8, no. 6, pp. 481–500, Sep. 1996. 

[10] P. A. A. Assuncao and M. Ghanbari, “A Frequency-Domain Video Transcoder for Dynamic Bitrate 

Reduction of MPEG-2 Bit streams,” IEEE Trans. Circuits Syst. Video Technol., vol. 8, no. 8, pp. 953– 

967, Dec. 1998. 

[11] S.-F. Chang and D. G. Messerschmitt, “Manipulation and Compositing of MC-DCT compressed 

video,” IEEE J. Sel. Areas Commun., vol. 13, no. 1, pp. 1–11, Jan. 1995. 
