
number of possibilities,

H = \log \Omega .   (14.166)

Because log 100 is just twice log 10, the logical problem is solved. The information measured in bits is obtained by using a base-2 logarithm.

A few simple features follow immediately. If the number of possible messages is Ω = 1, then the message provides no information, which agrees with log 1 = 0. If the context is binary, where a character can only be 1 or 0 (Ω = 2), then receiving a character provides 1 bit of information, which agrees with log_2 2 = 1.
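
These two cases, and the additivity behind the choice of a logarithm, can be checked with a few lines of Python (a minimal sketch; the values are illustrative):

    import math

    # One message with Omega possibilities carries H = log2(Omega) bits.
    print(math.log2(1))    # Omega = 1: no information, 0 bits
    print(math.log2(2))    # Omega = 2 (binary character): 1 bit
    # Two independent 10-way choices give Omega = 100 possibilities,
    # and the logarithm makes the information add: log 100 = 2 log 10.
    print(math.log2(100), 2 * math.log2(10))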

If the context is an alphabet with M possible symbols, and all of the symbols are equally probable, then a message with N characters has Ω = M^N possible outcomes and the information entropy is

H = \log M^N = N \log M ,   (14.167)

illustrating the additivity of information over the characters of the message.
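
A minimal Python check of (14.167), with arbitrary illustrative values of M and N:

    import math

    M, N = 27, 5                  # alphabet size and message length (illustrative)
    omega = M ** N                # number of equally probable messages
    print(math.log2(omega))       # H = log2(M^N)
    print(N * math.log2(M))       # N log2(M) -- the same number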

14.14.1 Shannon Entropy

Information theory becomes interesting when the probabilities of different symbols are different. Shannon [14.3, 4] showed that the information content per character is given by

H_c = -\sum_{i=1}^{M} p_i \log p_i ,   (14.168)

where p_i is the probability of symbol i in the given context.
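
Equation (14.168) translates directly into a few lines of Python; the function name shannon_entropy below is only illustrative, and the base-2 logarithm gives the result in bits:

    import math

    def shannon_entropy(probs):
        """Information per character, in bits, for symbol probabilities probs."""
        # Symbols with zero probability contribute nothing (0 log 0 -> 0).
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(shannon_entropy([0.5, 0.5]))   # equally probable binary symbols: 1 bit
    print(shannon_entropy([0.9, 0.1]))   # unequal probabilities: about 0.469 bit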

The rest of this section proves Shannon's formula. The proof begins with the plausible assumption that, if the probability of symbol i is p_i, then in a very long message of N characters the number of occurrences of character i, m_i, will be exactly m_i = N p_i.

The number of possibilities for a message of N characters in which the set of {m_i} is fixed by the corresponding {p_i} is

\Omega = \frac{N!}{m_1! \, m_2! \cdots m_M!} .   (14.169)

Therefore,

H = \log N! - \log m_1! - \log m_2! - \cdots - \log m_M! .   (14.170)
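
A small Python sketch of (14.169) and (14.170) for an illustrative two-symbol composition:

    import math

    # Illustrative composition: N = 4 characters with counts m = (3, 1).
    counts = [3, 1]
    N = sum(counts)

    omega = math.factorial(N)
    for m in counts:
        omega //= math.factorial(m)
    print(omega)            # 4 messages share this composition: aaab, aaba, abaa, baaa

    # H = log2(Omega) equals log2(N!) minus the sum of log2(m_i!), as in (14.170).
    h = math.log2(omega)
    h_check = math.log2(math.factorial(N)) - sum(math.log2(math.factorial(m)) for m in counts)
    print(h, h_check)       # both 2.0 bits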

One can write log N! as a sum,

\log N! = \sum_{k=1}^{N} \log k ,   (14.171)

and similarly for log m_i!.


For a long message one can replace the sum by an integral,

\log N! = \int_1^N \mathrm{d}x \, \log x = N \log N - N + 1 ,   (14.172)

and similarly for log m_i!. Therefore,

H = N \log N - N + 1 - \sum_{i=1}^{M} m_i \log m_i + \sum_{i=1}^{M} m_i - \sum_{i=1}^{M} 1 .   (14.173)

Because \sum_{i=1}^{M} m_i = N, this reduces to

H = N \log N + 1 - \sum_{i=1}^{M} m_i \log m_i - M .   (14.174)
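
A quick numerical sanity check of the approximation leading to (14.174), using natural logarithms (for which the integral in (14.172) is exact) and lgamma for the exact value of ln N!; the symbol counts are arbitrary illustrative values:

    import math

    counts = [600, 300, 100]          # m_i for M = 3 symbols (illustrative)
    N, M = sum(counts), len(counts)

    # Exact ln(Omega) = ln N! - sum of ln m_i!, via lgamma(n + 1) = ln(n!).
    exact = math.lgamma(N + 1) - sum(math.lgamma(m + 1) for m in counts)
    # Approximation (14.174): H = N ln N + 1 - sum of m_i ln m_i - M.
    approx = N * math.log(N) + 1 - sum(m * math.log(m) for m in counts) - M
    print(exact, approx)              # close; the relative difference shrinks as N grows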

The information per character is obtained by dividing the message entropy by the number of characters in the message,

H_c = \log N - \sum_{i=1}^{M} p_i \log m_i + (1 - M)/N ,   (14.175)

where we have used the fact that m_i / N = p_i.

In a long message, the last term can be ignored as small. Then, because the sum of the probabilities p_i is equal to 1,

H_c = -\sum_{i=1}^{M} p_i (\log m_i - \log N) ,   (14.176)

or

H_c = -\sum_{i=1}^{M} p_i \log p_i ,   (14.177)

which is (14.168), as advertised.
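
The convergence just proved can be illustrated numerically: for a long message whose composition matches the probabilities, (log Ω)/N approaches the Shannon value. A minimal sketch with illustrative probabilities:

    import math

    probs = [0.5, 0.3, 0.2]                     # p_i (illustrative)
    N = 100000                                  # a long message
    counts = [round(p * N) for p in probs]      # m_i = N p_i

    # Bits per character obtained by counting messages with this composition,
    # using lgamma(n + 1) = ln(n!) for the exact logarithm of (14.169) ...
    ln_omega = math.lgamma(N + 1) - sum(math.lgamma(m + 1) for m in counts)
    per_char = ln_omega / (N * math.log(2))
    # ... compared with the Shannon formula (14.168).
    hc = -sum(p * math.log2(p) for p in probs)
    print(per_char, hc)                         # nearly equal; they converge as N grows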

If the context of written English consists of 27 symbols (26 letters and a space), and if all symbols are equally probable, then the information content of a single character is

H_c = -1.443 \sum_{i=1}^{27} \frac{1}{27} \ln \frac{1}{27} = 4.75 \ \text{bits} ,   (14.178)

where the factor 1/ln 2 = 1.443 converts the natural log to a base-2 log.
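
The arithmetic in (14.178) amounts to computing log_2 27, as a one-line check shows (illustrative only):

    import math

    print(-1.443 * 27 * (1 / 27) * math.log(1 / 27))   # the sum in (14.178): about 4.76 with the rounded factor 1.443
    print(math.log2(27))                               # equivalently log2(27), about 4.75 bits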

However, in written English all symbols are not equally probable. For example, the most

