Perceptual Coherence: Hearing and Seeing

Information, Redundancy, and Prior Probabilities

Information Theory and Redundancy

We can use the framework of information theory (Shannon & Weaver, 1949) to quantify how much the neural response tells us about the stimulus. Our interest lies in the mutual information between stimuli and neural responses. Given the neural response, how much is the uncertainty reduced about which stimulus actually occurred? (This reduction equals the reduction in uncertainty about the neural response given the stimulus, hence mutual.) I frame all of these questions in terms of conditional probabilities: given the neural response, what are the probabilities of the possible stimuli as compared to the probabilities of the same stimuli before observing the response?
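
To make this prior-versus-posterior comparison concrete, here is a minimal Python sketch; the two-stimulus, two-response set and its joint probabilities are hypothetical values chosen only for illustration. It prints the prior probability of each stimulus, the conditional probability of each stimulus given an observed response, and the mutual information between the two variables.

    import math

    # Hypothetical joint distribution Pr(stimulus, response); values are illustrative only.
    joint = {
        ("A", "spike"): 0.32, ("A", "silence"): 0.08,
        ("B", "spike"): 0.12, ("B", "silence"): 0.48,
    }

    # Marginal (prior) probabilities of each stimulus and of each response.
    p_stim, p_resp = {}, {}
    for (s, r), p in joint.items():
        p_stim[s] = p_stim.get(s, 0.0) + p
        p_resp[r] = p_resp.get(r, 0.0) + p

    # Posterior Pr(stimulus | response = "spike") compared with the prior Pr(stimulus).
    for s in p_stim:
        posterior = joint[(s, "spike")] / p_resp["spike"]
        print(f"Pr({s}) = {p_stim[s]:.2f}   Pr({s} | spike) = {posterior:.2f}")

    # Mutual information I(S; R) = sum over (s, r) of p(s, r) * log2[ p(s, r) / (p(s) p(r)) ].
    mutual_info = sum(p * math.log2(p / (p_stim[s] * p_resp[r]))
                      for (s, r), p in joint.items() if p > 0)
    print(f"I(S; R) = {mutual_info:.3f} bits")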

In defining information, Shannon (1948) was guided by several commonsense guidelines.¹ The first was that the uncertainty due to several independent variables would be equal to the sum of the uncertainty due to each variable individually. Independence means that knowing the values of one variable does not allow you to predict the value of any other variable. The probability of any joint outcome of two or more independent variables is equal to the product of the probabilities of each of the variables:

Pr(w, x, y, z, . . .) = Pr(w)Pr(x)Pr(y)Pr(z) . . .    (3.1)

To make the information from independent variables add, we need a quantity that turns this product of probabilities into a sum, which is accomplished by converting the probabilities into logarithms: the logarithm of a product equals the sum of the logarithms of its factors.
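
As a quick numerical check of this additivity, here is a minimal Python sketch (the two probabilities are arbitrary illustrative values): the logarithm of the joint probability in equation 3.1 equals the sum of the logarithms of the individual probabilities, which is what makes information from independent variables additive.

    import math

    p_x, p_y = 0.5, 0.125    # illustrative probabilities of two independent events
    joint = p_x * p_y        # equation 3.1: independent probabilities multiply

    # The logarithm turns that product into a sum, so the contributions
    # of the two variables simply add.
    print(math.log2(joint))                   # -4.0
    print(math.log2(p_x) + math.log2(p_y))    # -1.0 + -3.0 = -4.0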

The second consideration was that the information of a single stimulus should be proportional to its “surprise,” that is, inversely related to its probability of occurrence. Thus, events that occur with probability close to 1 should have no information content, while events that occur with a low probability should have high information content.

Shannon (1948) demonstrated that the negative logarithm is the only definition of information that satisfies both considerations:²

Information = −log₂ Pr(x).    (3.2)
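
A small Python sketch of equation 3.2 (the two probabilities below are arbitrary illustrative values) shows how surprise behaves at the extremes: a nearly certain event carries almost no information, while a rare event carries a great deal.

    import math

    def information_bits(p):
        # Self-information of an event with probability p (equation 3.2).
        return -math.log2(p)

    print(information_bits(0.99))   # ~0.01 bits: nearly certain, almost no surprise
    print(information_bits(0.01))   # ~6.64 bits: rare event, high information content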

If there are several possible outcomes, then the information from each outcome should be equal to its surprise value multiplied by the probability of that event. This leads to the averaged information for a stimulus distribution or a response distribution:
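
The averaged quantity the sentence leads into is the familiar Shannon entropy, the sum over outcomes of Pr(x) times −log₂ Pr(x). A minimal Python sketch, using an illustrative four-stimulus distribution (the numbers are arbitrary):

    import math

    def entropy_bits(probs):
        # Averaged information: sum of Pr(x) * -log2 Pr(x) over the distribution.
        return sum(-p * math.log2(p) for p in probs if p > 0)

    uniform = [0.25, 0.25, 0.25, 0.25]   # illustrative, equally likely stimuli
    skewed  = [0.70, 0.15, 0.10, 0.05]   # illustrative, more predictable distribution

    print(entropy_bits(uniform))   # 2.0 bits: maximal average uncertainty for four stimuli
    print(entropy_bits(skewed))    # ~1.32 bits: less uncertainty, less information on average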

1. I use the terms information and uncertainty interchangeably. The uncertainty of an event equals its information value.

2. Traditionally, the logarithm is set to base 2 to make the information measure equivalent to the number of bits. In fact, the choice of base is completely arbitrary because simply multiplying by a constant converts between any two bases.
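
A one-line check of the footnote's point in Python (base e is chosen arbitrarily as the second base): information measured in another base is just a constant multiple of its value in bits.

    import math

    p = 0.2                           # an arbitrary illustrative probability
    bits = -math.log2(p)              # information in base 2 (bits)
    nats = -math.log(p)               # the same information in base e (nats)
    print(nats, bits * math.log(2))   # equal: changing base multiplies by ln(2) ≈ 0.693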
