
Information Theory

- What is Information Theory?
- What is the purpose of information theory?
- Why we need to study information theory

What is information theory (IT)?

IT is the science that deals with the concept "Information":
its measurement and its application.

What is the purpose of IT?

Transmission of information in an efficient way: minimum time & space.

Why do we need to study information theory?

Because of the communication revolution, dealing efficiently with information
and its transmission has become a necessary requirement for a computer engineer.


What is Information?

There are 3 types of Information:

- Syntactic information: related to the structure of the messages (characters)
  that form the information
- Semantic information: related to the meaning of the messages
- Pragmatic information: related to the usage and effect of the messages

Example:

i.   I eat sushi by hashi
ii.  By hashi I eat sushi
iii. There is a typhoon in Japan
iv.  There is a typhoon in the Kanto area in Japan

i. and ii. are syntactically different but semantically and pragmatically equal.
iii. and iv. are syntactically, semantically and pragmatically different
(iv. gives more information than iii.).


INFORMATION TRANSFER ACROSS CHANNELS

[Block diagram] Messages sent by the source are encoded into symbols, pass
through the channel, and are decoded back into messages at the receiver:

source -> Source coding (compression: source entropy, rate vs distortion)
       -> Channel coding (error correction: channel capacity, capacity vs efficiency)
       -> channel
       -> Channel decoding
       -> Source decoding (decompression)
       -> receiver


Digital Communication Systems

[Block diagram]

Information Source -> Source Encoder -> Channel Encoder -> Modulator -> Channel
-> De-Modulator -> Channel Decoder -> Source Decoder -> User of Information


Information Source

Examples:

1. English text
2. A man speaking
3. Photographs
4. Motion pictures (films)
   ... etc.

A chief aim of information theory is to study how such sequences of symbols
(signals) can be most effectively encoded for transmission (by electrical means).

For an information source we have:

· Information expressed as a finite set of symbols: the Source Alphabet
· A (discrete) information source: a source that generates a sequence of symbols
· The symbols are denoted by a1, a2, ..., am and the alphabet by A = { a1, a2, ..., am }
· A finite sequence of symbols is called a word. The set of all words is denoted by A*


Information Source

Memoryless

Memoryless means the generated symbols (of a source message) are independent.

Stationary

Stationarity of a source demands no change with time.

Discrete

The source produces symbols at discrete (separate) units of time.


Memoryless

Memoryless means the generated symbols (of a source message) are independent.

For the channel output X:

    P( X = i | X1, X2, ..., Xn ) = P( X = i )

i.e. the probability of the output X is conditionally independent of the
previous channel inputs or outputs X1, ..., Xn.

Example:

Toss a coin 8 times: the probability of getting heads every time is
1/2 x ... x 1/2 = (1/2)^8 = 1/256 = 0.0039 = 0.4 %

Coin toss number 9: the probability of getting heads is still 1/2,
so it is independent of the previous 8 tosses.
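
A quick numeric check of both claims (a minimal simulation sketch of ours; the
trial count is arbitrary):

    import random

    # Probability that 8 independent fair-coin tosses all come up heads.
    p_all_heads = 0.5 ** 8
    print(p_all_heads)              # 0.00390625, i.e. 1/256, about 0.4 %

    # Memorylessness: toss 9 is heads with probability 1/2 regardless of
    # the first 8 outcomes. Estimate it by simulation.
    random.seed(0)
    trials = 100_000
    heads_on_ninth = 0
    for _ in range(trials):
        first_eight = [random.random() < 0.5 for _ in range(8)]  # ignored below
        if random.random() < 0.5:                                # toss number 9
            heads_on_ninth += 1
    print(heads_on_ninth / trials)  # ~0.5, independent of the previous tosses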


Stationary

A stochastic process is said to be stationary when the process is temporally
homogeneous, i.e. it remains invariant under every translation of the time scale.

Example 1: Assume a machine produces an infinite sequence of the form:

    AE AE AE AE AE ...

i.e. what comes later is like what has gone before. Stationary is a designation
of such a source of characters.

Example 2: Assume a machine that produces:

    AE AAEE AAAEEE ...

Such a machine is not stationary.

Stationarity of a source demands no change with time,

i.e. P( X = i ) = C (constant)
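
One way to see the difference empirically (our own sketch; the run-length
statistic is one choice among many): for the first machine the average run of
identical characters is the same in every time window, while for the second
machine it keeps growing, so its statistics are not invariant under time
translation.

    # Hypothetical generators mimicking the two machines above.
    def stationary_machine(n):
        return ("AE" * n)[:n]                          # AE AE AE ...

    def nonstationary_machine(blocks):
        return "".join("A" * k + "E" * k
                       for k in range(1, blocks + 1))  # AE AAEE AAAEEE ...

    def mean_run_length(stream, start, length):
        """Average length of runs of identical characters in a window."""
        window = stream[start:start + length]
        runs, count = [], 1
        for prev, cur in zip(window, window[1:]):
            if cur == prev:
                count += 1
            else:
                runs.append(count)
                count = 1
        runs.append(count)
        return sum(runs) / len(runs)

    s1 = stationary_machine(20_000)
    s2 = nonstationary_machine(140)   # 140 * 141 = 19,740 symbols

    for start in (0, 5_000, 15_000):
        print(start, mean_run_length(s1, start, 1_000),
                     mean_run_length(s2, start, 1_000))
    # The first machine gives 1.0 in every window; the second machine's runs
    # keep growing over time, so its statistics change with the time origin.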




Source Coding

· We consider a memoryless, stationary, discrete information source S:

      S = ( a1  a2  ...  am )
          ( p1  p2  ...  pm )

  where pi = P( Xn = ai ), so that

      P( Xn = ai ) = C  for a constant C

  i.e. the probability doesn't depend on the trial ( n ).
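
Such a source is easy to mimic in code: draw each symbol independently
(memoryless) from one fixed distribution (stationary). A minimal Python sketch;
the alphabet and probabilities are example values of ours:

    import random

    random.seed(42)

    alphabet = ["a1", "a2", "a3", "a4"]   # A = { a1, ..., am }, here m = 4
    probs    = [0.4, 0.3, 0.2, 0.1]       # p_i = P(X_n = a_i), same for every n

    # Each draw is independent (memoryless) and uses the same distribution
    # at every time step (stationary).
    sequence = random.choices(alphabet, weights=probs, k=20)
    print(" ".join(sequence))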


Source Coding

- Information source encoding (enciphering):
  a procedure for associating words constructed from a finite alphabet of a
  language with given words of another language in a one-to-one manner.

  i.e. encoding is a procedure for mapping a given set of messages
  { m1, m2, ..., mi } onto a new set of encoded messages { c1, c2, ..., ci }
  in a one-to-one manner.

- A measure of the encoded information is entropy:
  the length of the shortest mean encoding.

  The unit of measure of entropy is the bit: binary digit.

  Example: the entropy of a fair coin toss is 1 bit.

  Later on we will study entropy in more detail.
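
As a preview of that later material, entropy is computed with the standard
formula H = -sum of p_i * log2(p_i); a minimal sketch of ours (the formula is
not derived at this point in the slides):

    import math

    def entropy(probs):
        """Shannon entropy in bits: H = -sum p_i * log2(p_i)."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))   # fair coin toss: 1.0 bit
    print(entropy([0.9, 0.1]))   # biased coin: ~0.469 bits, less uncertainty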


Source Coding

- The goal of source coding is to make the encoding as short as possible.

- Code: for the alphabet A = { a1, ..., am },
  a code C is a nonempty set of words, i.e. C is a subset of A*.

- Code word: an element of the code.

- Code word length: the number of symbols in the code word.
  For a code word x, l(x) denotes its length.


Average Code Length

[Block diagram]  Information Source --( s_k )--> Source Encoder --( c_k )-->

• Source has K symbols
• Each symbol s_k has probability p_k
• Each symbol s_k is represented by a codeword c_k of length l_k bits
• Average codeword length:

      L = ∑ (k = 0 to K-1) p_k * l_k
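
A direct transcription of the formula (a minimal sketch of ours; the example
probabilities and lengths are those of code C3 on a later slide):

    def average_code_length(probs, lengths):
        """L = sum_{k=0}^{K-1} p_k * l_k, the mean codeword length in bits."""
        assert len(probs) == len(lengths)
        return sum(p * l for p, l in zip(probs, lengths))

    # Four symbols with probabilities 0.4, 0.3, 0.2, 0.1 and
    # codeword lengths 1, 2, 3, 3.
    print(average_code_length([0.4, 0.3, 0.2, 0.1], [1, 2, 3, 3]))  # 1.9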


Source Coding

Example: for the source information { Red, Blue } and the code alphabet { 0, 1 }:

Let C(Red) = 00 and C(Blue) = 11.

Then C = { 00, 11 } is a subset of { 0, 1 }*, and the lengths are
l(00) = 2 and l(11) = 2.

- Binary source code: a code whose alphabet is { 0, 1 }.


Decoding

Decoding is the inverse process of encoding,

i.e. the operation of assigning to code words the corresponding words of the
source information.

[Block diagram]  --( c_k )--> Source Decoder --( s_k )--> Information Destination



Various Coding Techniques:

- Comma code: a code C with the code words
  1, 01, 001, 0001, ....., 0...01, 000...0
  where the last two code words have n-1 and n 0's respectively.

- Variable-length code: a code with codewords of different lengths.

- Instantaneous code: a code in which each codeword in any string of codewords
  can be decoded (reading from left to right) as soon as it is received.

- Singular code: a code is singular if some codewords are not different
  (i.e. two codewords coincide).
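
Instantaneous decoding is easy to express in code: scan the bits left to right
and emit a symbol as soon as the buffer matches a codeword. A minimal sketch of
ours, using the comma code 1, 01, 001, 0001 as the example table:

    def decode_instantaneous(bits, codebook):
        """Decode left to right, emitting a symbol as soon as a codeword
        matches. Valid whenever no codeword is a prefix of another."""
        out, buffer = [], ""
        for b in bits:
            buffer += b
            if buffer in codebook:   # safe for prefix codes: no longer match
                out.append(codebook[buffer])
                buffer = ""
        if buffer:
            raise ValueError("leftover bits: " + buffer)
        return out

    # Comma code for symbols A, B, C, D (an example table of our own).
    codebook = {"1": "A", "01": "B", "001": "C", "0001": "D"}
    print(decode_instantaneous("1010010001", codebook))  # ['A','B','C','D']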


Examples of Information Source Coding

Symbol   Prob   C1    C2     C3    C4    C5   C6
A        0.4    00    1      1     1     0    0
B        0.3    01    01     01    00    1    0
C        0.2    10    001    001   010   10   10
D        0.1    11    0001   000   011   01   01
MCL             2     2      1.9   1.9   1.3  1.3

MCL = Mean code length

C1: fragile (recovery impossible if letters drop out)
C2: robust (recovery possible if letters drop out);
    a variable-length code (non-equal-length code)
C3: smarter comma code (incorporating a maximum length)
C3, C4: also (variable-length) instantaneous codes
C5: decoding impossible; e.g. the string 010010 decodes as
    01 0 0 10 (D A A C) or as 0 10 0 10 (A C A C)
C6: singular code (two codewords coincide, so decoding impossible)
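
The MCL row can be reproduced directly from the table (a small verification
sketch of ours):

    probs = [0.4, 0.3, 0.2, 0.1]   # P(A), P(B), P(C), P(D)
    codes = {
        "C1": ["00", "01", "10", "11"],
        "C2": ["1", "01", "001", "0001"],
        "C3": ["1", "01", "001", "000"],
        "C4": ["1", "00", "010", "011"],
        "C5": ["0", "1", "10", "01"],
        "C6": ["0", "0", "10", "01"],
    }
    for name, words in codes.items():
        mcl = sum(p * len(w) for p, w in zip(probs, words))
        print(name, mcl)   # 2, 2, 1.9, 1.9, 1.3, 1.3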


Example: Code I

Source     Binary Code
Symbol     Representation
s0         0
s1         1
s2         00
s3         11

[Block diagram]

Information Source --( s0 s0 s3 )--> Source Encoder --( 0011 )-->
Error-Free Channel --( 0011 )--> Source Decoder
--> s0 s0 s3, or s0 s0 s1 s1, or s2 s3, or s2 s1 s1

The received string 0011 can be decoded in four different ways, so Code I is
ambiguous (not uniquely decodable).
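
Non-unique decodability can be made concrete by enumerating every way the
received string splits into codewords. A small sketch of ours (the enumeration
routine is an illustration, not from the slides):

    def all_parses(bits, codebook):
        """Enumerate every way to split `bits` into codewords."""
        if not bits:
            return [[]]
        parses = []
        for word, symbol in codebook.items():
            if bits.startswith(word):
                for rest in all_parses(bits[len(word):], codebook):
                    parses.append([symbol] + rest)
        return parses

    code_I = {"0": "s0", "1": "s1", "00": "s2", "11": "s3"}
    for parse in all_parses("0011", code_I):
        print(" ".join(parse))
    # s0 s0 s1 s1 / s0 s0 s3 / s2 s1 s1 / s2 s3 -> four decodings,
    # so Code I is not uniquely decodable.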


Kraft-McMillan Inequality

      ∑ (k = 0 to K-1) 2^(-l_k) ≤ 1

• If the codeword lengths of a code satisfy the Kraft-McMillan inequality,
  then a prefix code with these codeword lengths can be constructed.

• For code D:

      2^-1 + 2^-2 + 2^-3 + 2^-2 = 9/8 > 1

  This means that Code D IS NOT A PREFIX CODE.

Code D:

Source       Symbol       Codeword
Symbol s_k   Codeword C_k  Length l_k
s0           0             1
s1           10            2
s2           110           3
s3           11            2
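
The Kraft sum is a one-liner to check (a minimal sketch of ours; Fraction keeps
the arithmetic exact):

    from fractions import Fraction

    def kraft_sum(lengths):
        """sum_k 2^(-l_k); if > 1, no prefix code with these lengths exists."""
        return sum(Fraction(1, 2 ** l) for l in lengths)

    code_D_lengths = [1, 2, 3, 2]
    s = kraft_sum(code_D_lengths)
    print(s, s <= 1)   # 9/8 False -> code D cannot be a prefix code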


Use of the Kraft-McMillan Inequality

• We may use it when the number of symbols is so large that we cannot judge
  simply by inspection whether a given code is a prefix code or not.

• WHAT the Kraft-McMillan inequality can do:
  – It can determine that a given code IS NOT A PREFIX CODE.
  – It can tell us that a prefix code could be constructed from a given set
    of codeword lengths.

• WHAT the Kraft-McMillan inequality cannot do:
  – It cannot guarantee that a given code is indeed a prefix code.


Example

• For code E:

      2^-1 + 2^-3 + 2^-3 + 2^-2 = 1

• IS CODE E A PREFIX CODE?
  – NO.
  – WHY? The codeword of s3 (11) is a prefix of the codeword of s2 (110).

Code E:

Source       Symbol        Codeword
Symbol s_k   Codeword c_k  Length l_k
s0           0             1
s1           100           3
s2           110           3
s3           11            2
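
A brute-force prefix check (our own sketch) confirms both points: code E
satisfies the Kraft-McMillan inequality, yet it is not a prefix code:

    from itertools import permutations

    def is_prefix_code(codewords):
        """True iff no codeword is a prefix of another codeword."""
        for a, b in permutations(codewords, 2):
            if b.startswith(a):
                return False
        return True

    code_E = ["0", "100", "110", "11"]
    print(sum(2 ** -len(w) for w in code_E))  # 1.0 -> inequality satisfied
    print(is_prefix_code(code_E))             # False: '11' is a prefix of '110'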
