25.07.2013 Views

Source Coding


<strong>Source</strong> <strong>Coding</strong><br />

Prof. Ja-Ling Wu<br />

Department of Computer Science<br />

and Information Engineering<br />

National Taiwan University


<strong>Source</strong> → sequence of source symbols $u_i$ → Encoder → sequence of code symbols $a_i$<br />

<strong>Source</strong> alphabet and probabilities:<br />

$$X \triangleq \begin{Bmatrix} U = \{u_1, u_2, \ldots, u_M\} \\ P = \{p_1, p_2, \ldots, p_M\} \end{Bmatrix}$$<br />

Code alphabet:<br />

$$A = \{a_1, a_2, \ldots, a_n\}$$


Each message $u_i$ is mapped to a codeword:<br />

$$u_i \rightarrow X_1^{(i)}, X_2^{(i)}, \ldots, X_{N_i}^{(i)} \triangleq X_i$$<br />

where $X_k^{(i)} \in A$, $k = 1, 2, \ldots, N_i$, $i = 1, \ldots, M$<br />

$X_i$ : codeword<br />

$N_i$ : length of the codeword $X_i$<br />

Average length of codeword:<br />

$$\bar{N} = \sum_{i=1}^{M} p(X_i)\, N_i = \sum_{i=1}^{M} p(u_i)\, N_i = \sum_{i=1}^{M} p_i N_i$$


Ex:<br />

$$X = \begin{Bmatrix} u_1 & u_2 & u_3 & u_4 & u_5 & u_6 & u_7 & u_8 \\ \tfrac{1}{4} & \tfrac{1}{4} & \tfrac{1}{8} & \tfrac{1}{8} & \tfrac{1}{16} & \tfrac{1}{16} & \tfrac{1}{16} & \tfrac{1}{16} \end{Bmatrix}$$<br />

Probability $2^{-2}$: $X_1 = 00$, $X_2 = 01$<br />

Probability $2^{-3}$: $X_3 = 100$, $X_4 = 101$<br />

Probability $2^{-4}$: $X_5 = 1100$, $X_6 = 1101$, $X_7 = 1110$, $X_8 = 1111$<br />

[Figure: binary code tree with branches labeled 0 and 1; leaves $X_1$, $X_2$ at depth 2, $X_3$, $X_4$ at depth 3, and $X_5$, $X_6$, $X_7$, $X_8$ at depth 4.]


$$\underbrace{01}_{u_2}\ \underbrace{1110}_{u_7}\ \underbrace{1100}_{u_5}\ \underbrace{101}_{u_4}\ \underbrace{00}_{u_1}\ \underbrace{100}_{u_3}$$<br />

: uniquely decodable and instantaneously decodable<br />

$$\bar{N} = \frac{1}{4} \times 2 + \frac{1}{4} \times 2 + \frac{1}{8} \times 3 + \frac{1}{8} \times 3 + \frac{1}{16} \times 4 \times 4 = 2.75 \text{ bits}$$
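The decoding and the average-length calculation for this example can be sketched in a few lines of Python. The names `CODEBOOK`, `PROBS`, `decode`, and `average_length` are hypothetical helpers introduced here for illustration; they are not part of the lecture.<br />

```python
from fractions import Fraction

# Codewords and probabilities of the eight-symbol example.
CODEBOOK = {
    "u1": "00", "u2": "01", "u3": "100", "u4": "101",
    "u5": "1100", "u6": "1101", "u7": "1110", "u8": "1111",
}
PROBS = {
    "u1": Fraction(1, 4), "u2": Fraction(1, 4),
    "u3": Fraction(1, 8), "u4": Fraction(1, 8),
    "u5": Fraction(1, 16), "u6": Fraction(1, 16),
    "u7": Fraction(1, 16), "u8": Fraction(1, 16),
}


def decode(bits, codebook):
    """Decode a bit string one symbol at a time (assumes a prefix-free code)."""
    inverse = {word: symbol for symbol, word in codebook.items()}
    symbols, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:
            # No codeword is a prefix of another, so the first match is final.
            symbols.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("bit string ends in the middle of a codeword")
    return symbols


def average_length(codebook, probs):
    """Average codeword length: sum of p_i * N_i."""
    return sum(probs[symbol] * len(word) for symbol, word in codebook.items())


decoded = decode("01" "1110" "1100" "101" "00" "100", CODEBOOK)
avg = average_length(CODEBOOK, PROBS)
```

Here `decoded` is the sequence u2, u7, u5, u4, u1, u3 and `avg` is 11/4, i.e. 2.75 bits. Each symbol is emitted as soon as its last bit arrives, which is what "instantaneously decodable" means.<br />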


Entropy of information sources:<br />

$$H(x) = -\sum_{i=1}^{8} p_i \log_2 p_i = 2.75 \text{ bits/symbol}$$<br />

In general,<br />

$$\bar{N} \ge \frac{H(x)}{\log n}, \qquad n : \text{size of the code alphabet}$$<br />

(with $H(x)$ and $\log n$ taken to the same logarithm base)<br />

⇒ Entropy provides the lower bound of the average codeword length for a source code.
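As a numerical check of the bound for the eight-symbol example, the following sketch computes the entropy and compares it with the average codeword length. The function names `entropy_bits` and `lower_bound` are assumptions made for this illustration.<br />

```python
import math


def entropy_bits(probs):
    """Shannon entropy H(x) in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def lower_bound(probs, n):
    """H(x) / log n: lower bound on the average length with n code symbols."""
    return entropy_bits(probs) / math.log2(n)


probs = [1/4, 1/4, 1/8, 1/8, 1/16, 1/16, 1/16, 1/16]
lengths = [2, 2, 3, 3, 4, 4, 4, 4]

avg_len = sum(p * l for p, l in zip(probs, lengths))
bound_binary = lower_bound(probs, 2)      # binary code alphabet
bound_quaternary = lower_bound(probs, 4)  # four code symbols
```

For this source the binary bound and the average length coincide at 2.75, so the code meets the bound with equality. With a four-symbol code alphabet the bound halves, since each code symbol can carry up to two bits.<br />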


Only when $p_i = n^{-N_i}$ ⇒ $\bar{N} = \dfrac{H(x)}{\log n}$<br />

pf:<br />

$$\begin{aligned} H(x) &= -\sum_{i=1}^{M} p_i \log p_i = -\sum_{i=1}^{M} n^{-N_i} \log n^{-N_i} \\ &= \sum_{i=1}^{M} N_i\, n^{-N_i} (\log n) = (\log n) \sum_{i=1}^{M} N_i\, n^{-N_i} \\ &= (\log n) \sum_{i=1}^{M} N_i\, p_i = (\log n)\, \bar{N} \end{aligned}$$<br />

$$\therefore\ \bar{N} = \frac{H(x)}{\log n}$$<br />

100% efficiency can occur only when each probability $p_i$ of the prob. dist. is a negative integer power of $n$.
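The efficiency condition can be illustrated numerically. In the sketch below, `efficiency` is a hypothetical helper that returns $H(x) / (\bar{N} \log n)$; the first source has probabilities that are negative powers of 2, the second does not.<br />

```python
import math


def efficiency(probs, lengths, n=2):
    """H(x) / (N_bar * log n) for a code with the given codeword lengths."""
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    n_bar = sum(p * l for p, l in zip(probs, lengths))
    return h / (n_bar * math.log2(n))


# p_i = 2**(-N_i) for every symbol, so the code is 100% efficient.
dyadic = efficiency([1/2, 1/4, 1/8, 1/8], [1, 2, 3, 3])

# Probabilities that are not powers of 1/2; lengths from a binary prefix code.
non_dyadic = efficiency([0.4, 0.2, 0.2, 0.1, 0.1], [2, 2, 2, 3, 3])
```

The first value is 1, while the second is roughly 0.96: no binary code can reach the entropy of that source exactly.<br />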


[Figure: classes of codes. The instantaneously decodable codes are a subset of the uniquely decodable codes; the remaining uniquely decodable codes are not instantaneously decodable; all other codes are not uniquely decodable.]


Theorem:<br />

Let a code have codeword lengths $N_1, N_2, \ldots, N_M$ and have $n$ symbols in the code alphabet. If the code is uniquely decodable, then the Kraft inequality<br />

$$\sum_{i=1}^{M} n^{-N_i} \le 1$$<br />

must be satisfied.
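A minimal sketch of the Kraft check, using exact fractions so that sums equal to 1 are not disturbed by rounding. The names `kraft_sum` and `satisfies_kraft` are assumptions for this illustration.<br />

```python
from fractions import Fraction


def kraft_sum(lengths, n=2):
    """Sum of n**(-N_i) over all codeword lengths N_i."""
    return sum(Fraction(1, n ** length) for length in lengths)


def satisfies_kraft(lengths, n=2):
    """True when the lengths satisfy the Kraft inequality for n code symbols."""
    return kraft_sum(lengths, n) <= 1


example_sum = kraft_sum([2, 2, 3, 3, 4, 4, 4, 4])  # the eight-symbol example
too_short = satisfies_kraft([1, 1, 2])             # binary: 1/2 + 1/2 + 1/4 > 1
ternary_ok = satisfies_kraft([1, 1, 2], n=3)       # ternary: 1/3 + 1/3 + 1/9 <= 1
```

The same lengths can fail for a binary code alphabet and pass for a ternary one, because a larger $n$ makes every term of the sum smaller.<br />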


Lemma 1:<br />

A uniquely decodable code is a prefix code (prefix-free code) if it has the prefix property, which requires that no codeword be a proper prefix of any other codeword.<br />

Lemma 2: An instantaneously decodable code<br />

(i) is uniquely decodable (it is a prefix-free code)<br />

(ii) satisfies the Kraft inequality


&nbsp;&nbsp;&nbsp; C2 C3 C4 C5<br />

S1 00 0 0 1<br />

S2 01 10 01 01<br />

S3 10 110 011 011<br />

S4 11 111 0111 0111<br />

prefix-free code → uniquely decodable; the converse does not hold (counterexample: $C_4$)<br />

uniquely decodable → Kraft inequality; the converse does not hold (counterexample: $C_5$)<br />

uniquely decodable + prefix property ⇒ instantaneously decodable
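The properties of the four codes C2 to C5 can be checked mechanically. The sketch below tests the prefix property and the Kraft sum directly; for unique decodability it uses the Sardinas-Patterson test, a standard procedure that is brought in here for illustration and does not appear in the slides. All function names are hypothetical.<br />

```python
from fractions import Fraction

CODES = {
    "C2": ["00", "01", "10", "11"],
    "C3": ["0", "10", "110", "111"],
    "C4": ["0", "01", "011", "0111"],
    "C5": ["1", "01", "011", "0111"],
}


def is_prefix_free(code):
    """True when no codeword is a proper prefix of another codeword."""
    return not any(a != b and b.startswith(a) for a in code for b in code)


def kraft_sum(code, n=2):
    return sum(Fraction(1, n ** len(word)) for word in code)


def _suffixes(prefixes, words):
    """Non-empty strings w such that p + w is in words for some p in prefixes."""
    return {w[len(p):] for p in prefixes for w in words
            if len(w) > len(p) and w.startswith(p)}


def is_uniquely_decodable(code):
    """Sardinas-Patterson test: fail if a dangling suffix is itself a codeword."""
    code = set(code)
    current = _suffixes(code, code)  # dangling suffixes between codewords
    seen = set()
    while current and frozenset(current) not in seen:
        if current & code:
            return False
        seen.add(frozenset(current))
        current = _suffixes(code, current) | _suffixes(current, code)
    return True


summary = {
    name: (is_prefix_free(code), is_uniquely_decodable(code), kraft_sum(code) <= 1)
    for name, code in CODES.items()
}
```

Each entry of `summary` is (prefix-free, uniquely decodable, Kraft inequality holds). C4 is uniquely decodable without being prefix-free, and C5 satisfies the Kraft inequality without being uniquely decodable, which are the two counterexamples on the slide.<br />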


Encoding Algorithm (based on the Kraft inequality)<br />

Assume $l_1 \le l_2 \le \cdots \le l_q$.<br />

Let<br />

$$\begin{cases} S_1 = 0 \\ S_{i+1} = S_i + r^{-l_i} & (1 \le i < q) \end{cases}$$<br />

⇒ $S_i < S_j$ for $i < j$


$$S_{i+1} = S_i + r^{-l_i}$$<br />

Let $X_i$ and $X_{i+1}$ be the codewords obtained from $S_i$ and $S_{i+1}$:<br />

$$S_i \Rightarrow X_i, \qquad S_{i+1} \Rightarrow X_{i+1}$$<br />

⇒ $X_i$ is not a prefix of $X_{i+1}$, since $C^{(i)}_{-l_i}$ (the $l_i$-th digit of $S_i$) will be changed due to the addition of $r^{-l_i}$.


$$\begin{aligned} S_1 &= 0 \\ S_2 &= S_1 + r^{-l_1} = r^{-l_1} \\ S_3 &= S_2 + r^{-l_2} = r^{-l_1} + r^{-l_2} \\ &\ \ \vdots \\ S_q &= S_{q-1} + r^{-l_{q-1}} = \sum_{i=1}^{q-1} r^{-l_i} = \left( \sum_{i=1}^{q} r^{-l_i} \right) - r^{-l_q} \end{aligned}$$<br />

By the Kraft inequality assumption (← uniquely decodable):<br />

$$\sum_{i=1}^{q} r^{-l_i} \le 1$$<br />

$$\Rightarrow\ S_q = \sum_{i=1}^{q} r^{-l_i} - r^{-l_q} \le 1 - r^{-l_q} < 1$$<br />

⇒ $S_i < 1$ for all $i \ge 2$ (and $S_1 = 0$)


$r$-ary number representation of $S_i$:<br />

$$S_i = C^{(i)}_{-1} r^{-1} + C^{(i)}_{-2} r^{-2} + \cdots + C^{(i)}_{-l_i} r^{-l_i} + \cdots$$<br />

where $C^{(i)}_{z} \in \{0, 1, \ldots, r-1\}$<br />

or<br />

$$S_i = \left( .\, C^{(i)}_{-1}, C^{(i)}_{-2}, \ldots, C^{(i)}_{-l_i}, \ldots \right)_r$$<br />

codeword:<br />

$$X_i = C^{(i)}_{-1}\, C^{(i)}_{-2} \cdots C^{(i)}_{-l_i}$$


So this algorithm is guaranteed to produce instantaneously decodable codes.<br />

Ex: $l_i = 2, 2, 3, 3, 4, 4, 4, 4$; $r = 2$<br />

check: $2 \times 2^{-2} + 2 \times 2^{-3} + 4 \times 2^{-4} \le 1$<br />

$S_1 = 0 = (.0000\ldots)_2$, $X_1 = 00$<br />

$S_2 = S_1 + 2^{-2} = (.0100\ldots)_2$, $X_2 = 01$<br />

$S_3 = S_2 + 2^{-2} = (.1000\ldots)_2$, $X_3 = 100$<br />

$S_4 = S_3 + 2^{-3} = (.1010\ldots)_2$, $X_4 = 101$<br />

$S_5 = S_4 + 2^{-3} = (.1100\ldots)_2$, $X_5 = 1100$<br />

$S_6 = S_5 + 2^{-4} = (.11010\ldots)_2$, $X_6 = 1101$<br />

$S_7 = S_6 + 2^{-4} = (.1110\ldots)_2$, $X_7 = 1110$<br />

$S_8 = S_7 + 2^{-4} = (.11110\ldots)_2$, $X_8 = 1111$<br />

Remark: (1) the code is instantaneously decodable<br />

(2) this encoding procedure is independent of the $p_i$
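The construction (sort the lengths, accumulate $S_{i+1} = S_i + r^{-l_i}$, and read the first $l_i$ digits of the $r$-ary expansion of $S_i$) can be sketched as follows. The function name `kraft_code` and the restriction to $r \le 10$, so that each digit is a single character, are assumptions of this sketch.<br />

```python
from fractions import Fraction


def kraft_code(lengths, r=2):
    """Build codewords from lengths via S_1 = 0, S_{i+1} = S_i + r**(-l_i)."""
    if not 2 <= r <= 10:
        raise ValueError("this sketch writes digits as 0-9, so it needs 2 <= r <= 10")
    lengths = sorted(lengths)
    if sum(Fraction(1, r ** l) for l in lengths) > 1:
        raise ValueError("lengths violate the Kraft inequality")

    codewords = []
    s = Fraction(0)
    for l in lengths:
        # Lengths are non-decreasing, so s * r**l is an integer; its base-r
        # digits, padded to l places, are the first l digits of s.
        value = int(s * r ** l)
        digits = []
        for _ in range(l):
            value, digit = divmod(value, r)
            digits.append(str(digit))
        codewords.append("".join(reversed(digits)))
        s += Fraction(1, r ** l)
    return codewords


binary_code = kraft_code([2, 2, 3, 3, 4, 4, 4, 4])
ternary_code = kraft_code([1, 1, 2, 2], r=3)
```

`binary_code` reproduces the eight codewords of the example, 00 through 1111. Only the lengths enter the procedure, which is the point of remark (2): the probabilities matter when choosing the lengths, not when assigning the digits.<br />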


Noise in Huffman Coding Probabilities:<br />

Suppose that the estimates of the probs. $p_i$ are not accurate.<br />

How much does the average code length suffer?<br />

Let $p_i$ be the original Huffman code design probs.,<br />

and let $p'_i = p_i + e_i$ be the probs. for the source that is actually used.


Clearly, since $\sum_{i=1}^{q} p'_i = \sum_{i=1}^{q} p_i = 1$,<br />

$$\frac{1}{q} \sum_{i=1}^{q} e_i = 0 \qquad (1)$$<br />

As one measure of the size of the error, we get<br />

$$\frac{1}{q} \sum_{i=1}^{q} e_i^2 = \sigma^2 \qquad (2)$$<br />

Now the new average length per symbol:<br />

$$L' = \sum_{i=1}^{q} l_i\, p'_i = \sum_{i=1}^{q} l_i\, p_i + \sum_{i=1}^{q} l_i\, e_i = L + \sum_{i=1}^{q} l_i\, e_i$$<br />

so the change in the average length is $q$ times the mean $\frac{1}{q} \sum_{i=1}^{q} l_i e_i$.


With the two conditions on the $e_i$, (1) and (2), we resort to the method of Lagrange multipliers to find the extreme cases.<br />

$$\mathcal{L} = \frac{1}{q} \sum_{i=1}^{q} l_i e_i - \lambda \left( \frac{1}{q} \sum_{i=1}^{q} e_i - 0 \right) - \mu \left( \frac{1}{q} \sum_{i=1}^{q} e_i^2 - \sigma^2 \right)$$<br />

$$\frac{\partial \mathcal{L}}{\partial e_i} = \frac{1}{q} \left( l_i - \lambda - 2 \mu e_i \right) = 0 \qquad (i = 1, 2, \ldots, q)$$<br />

Summing these equations over $i$ and using (1), then multiplying each by $e_i$, summing, and using (1) and (2):<br />

$$\Rightarrow\ \lambda = \frac{1}{q} \sum_{i=1}^{q} l_i \quad \text{and} \quad 2 \mu \sigma^2 = \frac{1}{q} \sum_{i=1}^{q} l_i e_i$$<br />

Squaring $l_i - \lambda = 2 \mu e_i$ and averaging over $i$ gives $\frac{1}{q} \sum_{i=1}^{q} (l_i - \lambda)^2 = 4 \mu^2 \sigma^2$, so<br />

$$\Rightarrow\ \left( \frac{1}{q} \sum_{i=1}^{q} l_i e_i \right)^2 = \left[ \frac{1}{q} \sum_{i=1}^{q} l_i^2 - \left( \frac{1}{q} \sum_{i=1}^{q} l_i \right)^2 \right] \cdot \sigma^2 = [\text{variance of } (l_i)] \cdot [\text{variance of } (e_i)]$$
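The extreme-case relation, (mean of $l_i e_i$)² = [variance of $l_i$] · [variance of $e_i$], can be checked numerically. In this sketch the variances are the unweighted $1/q$ averages used in the derivation; `worst_case_errors` builds errors proportional to $l_i - \bar{l}$, which is the form the stationarity condition gives. The lengths, the value of `sigma`, and the vector `other` are hypothetical inputs, and the sketch does not check that $p_i + e_i$ remains a valid probability.<br />

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Unweighted (1/q) variance, as in conditions (1) and (2)."""
    m = mean(xs)
    return mean([(x - m) ** 2 for x in xs])


def mean_length_shift(lengths, errors):
    """(1/q) * sum(e_i * l_i): the change in average length divided by q."""
    return mean([e * l for e, l in zip(errors, lengths)])


def worst_case_errors(lengths, sigma):
    """Zero-mean errors proportional to (l_i - mean), with variance sigma**2."""
    m, sd = mean(lengths), math.sqrt(variance(lengths))
    return [sigma * (l - m) / sd for l in lengths]


lengths = [1, 2, 3, 4, 4]
sigma = 0.01
extreme = worst_case_errors(lengths, sigma)
other = [0.01, -0.01, 0.01, -0.01, 0.0]  # another zero-mean error vector

extreme_sq = mean_length_shift(lengths, extreme) ** 2
other_sq = mean_length_shift(lengths, other) ** 2
```

For `extreme` the squared shift equals the product of the two variances; for any other zero-mean error vector it stays at or below that product, by the Cauchy-Schwarz inequality.<br />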


The more variable the $l_i$, the more harm the errors in the estimates of the $p_i$ can cause in the average symbol length.


Ex: $p_1 = 0.4$, $p_2 = 0.2$, $p_3 = 0.2$, $p_4 = 0.1$, $p_5 = 0.1$<br />

[Approach 1]<br />

[Figure: Huffman tree for Approach 1, with merged-node probabilities 0.2, 0.4, and 0.6 and branches labeled 0 and 1.]<br />

Codewords: 0.4 → 1, 0.2 → 01, 0.2 → 000, 0.1 → 0010, 0.1 → 0011<br />

$$L = 0.4 \times 1 + 0.2 \times 2 + 0.2 \times 3 + 0.1 \times 4 + 0.1 \times 4 = 2.2 \text{ bits}$$


[Approach 2]<br />

[Figure: Huffman tree for Approach 2, with merged-node probabilities 0.2, 0.4, and 0.6 and branches labeled 0 and 1.]<br />

Codewords: 0.4 → 00, 0.2 → 10, 0.2 → 11, 0.1 → 010, 0.1 → 011<br />

$$L = 0.4 \times 2 + 0.2 \times 2 + 0.2 \times 2 + 0.1 \times 3 + 0.1 \times 3 = 2.2 \text{ bits}$$
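Both sets of codeword lengths come out of the same Huffman merging procedure; they differ only in how ties between equal probabilities are broken. The sketch below makes that explicit. The function name `huffman_lengths` and the `merged_last` flag are assumptions chosen to reproduce the two approaches, not a rule stated in the slides.<br />

```python
import heapq
from fractions import Fraction


def huffman_lengths(probs, merged_last=True):
    """Codeword lengths produced by Huffman merging.

    merged_last=True : among equal probabilities, merged nodes are combined
                       after original symbols.
    merged_last=False: merged nodes are combined before original symbols.
    """
    lengths = [0] * len(probs)
    # Heap entries: (probability, tie-break key, symbol indices under the node).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        for i in group1 + group2:
            lengths[i] += 1  # every symbol under the new node gains one digit
        tie = counter if merged_last else -counter
        counter += 1
        heapq.heappush(heap, (p1 + p2, tie, group1 + group2))
    return lengths


probs = [Fraction(4, 10), Fraction(2, 10), Fraction(2, 10),
         Fraction(1, 10), Fraction(1, 10)]
spread_out = huffman_lengths(probs, merged_last=False)  # lengths 1, 2, 3, 4, 4 in some order
balanced = huffman_lengths(probs, merged_last=True)     # lengths 2, 2, 2, 3, 3
```

With `merged_last=False` the two symbols of probability 0.2 may receive lengths 2 and 3 in either order, which does not affect the average. Exact fractions are used so that equal probabilities compare as equal.<br />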


$$\text{var}(1) = 0.4 (1 - 2.2)^2 + 0.2 (2 - 2.2)^2 + 0.2 (3 - 2.2)^2 + 0.1 (4 - 2.2)^2 + 0.1 (4 - 2.2)^2 = 1.36$$<br />

but<br />

$$\text{var}(2) = 0.4 (2 - 2.2)^2 + 0.2 (2 - 2.2)^2 + 0.2 (2 - 2.2)^2 + 0.1 (3 - 2.2)^2 + 0.1 (3 - 2.2)^2 = 0.16$$<br />

⇒ Approach 2 is preferable due to its smaller variance!
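The two variances can be reproduced with exact arithmetic. The helper `length_stats` is a hypothetical name; it computes the probability-weighted mean and variance of the codeword lengths, as in the calculation for the two approaches.<br />

```python
from fractions import Fraction


def length_stats(probs, lengths):
    """Probability-weighted mean and variance of the codeword lengths."""
    mean = sum(p * l for p, l in zip(probs, lengths))
    var = sum(p * (l - mean) ** 2 for p, l in zip(probs, lengths))
    return mean, var


probs = [Fraction(2, 5), Fraction(1, 5), Fraction(1, 5),
         Fraction(1, 10), Fraction(1, 10)]

mean1, var1 = length_stats(probs, [1, 2, 3, 4, 4])  # Approach 1
mean2, var2 = length_stats(probs, [2, 2, 2, 3, 3])  # Approach 2
```

Both means are 11/5 (2.2 bits), while the variances are 34/25 (1.36) and 4/25 (0.16), so the two codes compress equally well but Approach 2 has the more uniform codeword lengths.<br />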
