

40 — Capacity of a Single Neuron

of its weights is a real number and so can convey an infinite number of bits'. We exclude this answer by saying that the receiver is not able to examine the weights directly, nor is the receiver allowed to probe the weights by observing the output of the neuron for arbitrarily chosen inputs. We constrain the receiver to observe the output of the neuron at the same fixed set of N points {x_n} that were in the training set. What matters now is how many different distinguishable functions our neuron can produce, given that we can observe the function only at these N points. How many different binary labellings of N points can a linear threshold function produce? And how does this number compare with the maximum possible number of binary labellings, 2^N? If nearly all of the 2^N labellings can be realized by our neuron, then it is a communication channel that can convey all N bits (the target values {t_n}) with small probability of error. We will identify the capacity of the neuron as the maximum value that N can have such that the probability of error is very small. [We are departing a little from the definition of capacity in Chapter 9.]

We thus examine the following scenario. The sender is given a neuron with K inputs and a data set D_N which is a labelling of N points. The sender uses an adaptive algorithm to try to find a w that can reproduce this labelling exactly. We will assume the algorithm finds such a w if it exists. The receiver then evaluates the threshold function on the N input values. What is the probability that all N bits are correctly reproduced? How large can N become, for a given K, without this probability becoming substantially less than one?
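The following sketch is not from the book; it simply makes the scenario concrete under stated assumptions. For random Gaussian points (which are in general position almost surely) and uniformly random target labellings, it estimates the probability that some weight vector w reproduces all N bits, using a linear-programming feasibility check in place of the sender's adaptive algorithm. The function names, the margin of 1, and the choice of solver are illustrative choices, not the book's.

```python
import numpy as np
from scipy.optimize import linprog

def separable(X, t):
    """Is there a w such that the threshold of X @ w reproduces the 0/1 labelling t?

    Tests strict separability with margin 1; since w may be rescaled freely,
    this matches realizability by the threshold function for generic points.
    """
    s = np.where(t == 1, 1.0, -1.0)            # map labels {0,1} to signs {-1,+1}
    A_ub = -s[:, None] * X                     # s_n (w . x_n) >= 1  <=>  -s_n x_n . w <= -1
    b_ub = -np.ones(len(t))
    res = linprog(c=np.zeros(X.shape[1]), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * X.shape[1], method="highs")
    return res.status == 0                     # feasible => labelling realizable

def p_all_bits_correct(K, N, trials=200, seed=0):
    """Estimate the probability that a random labelling of N random points
    in K dimensions can be reproduced exactly by the neuron."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        X = rng.standard_normal((N, K))        # general position, almost surely
        t = rng.integers(0, 2, size=N)
        hits += separable(X, t)
    return hits / trials

K = 10
for N in (5, 10, 20, 30, 40):
    print(N, p_all_bits_correct(K, N))
```

For small N the estimated probability stays close to one, and it falls away as N grows well beyond K, which is the behaviour the capacity argument quantifies.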

General position<br />

One technical detail needs to be pinned down: what set of inputs {x_n} are we considering? Our answer might depend on this choice. We will assume that the points are in general position.

Definition 40.1 A set of points {x_n} in K-dimensional space are in general position if any subset of size ≤ K is linearly independent, and no K + 1 of them lie in a (K − 1)-dimensional plane.

In K = 3 dimensions, for example, a set of points are in general position if no three points are collinear and no four points are coplanar. The intuitive idea is that points in general position are like random points in the space, in terms of the linear dependences between points. You don't expect three random points in three dimensions to lie on a straight line.
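As a small illustration (my own, not the book's), Definition 40.1 can be checked numerically by brute force for a modest set of points, testing the rank of every relevant subset; the helper name and examples are assumptions for this sketch.

```python
import numpy as np
from itertools import combinations

def in_general_position(X):
    """X: (N, K) array of N points in K dimensions. Brute force; only for small N."""
    N, K = X.shape
    # Every subset of size <= K must be linearly independent.
    for m in range(2, K + 1):
        for idx in combinations(range(N), m):
            if np.linalg.matrix_rank(X[list(idx)]) < m:
                return False
    # No K+1 points may lie in a (K-1)-dimensional plane: the K difference
    # vectors from any one of them must span all K dimensions.
    for idx in combinations(range(N), K + 1):
        pts = X[list(idx)]
        if np.linalg.matrix_rank(pts[1:] - pts[0]) < K:
            return False
    return True

rng = np.random.default_rng(1)
print(in_general_position(rng.standard_normal((6, 3))))  # random points: True, almost surely
X = rng.standard_normal((6, 3))
X[3] = 2 * X[2]                                           # force a linear dependence
print(in_general_position(X))                             # False
```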

The linear threshold function<br />

The neuron we will consider performs the function

y = f\left( \sum_{k=1}^{K} w_k x_k \right),   (40.1)

where

f(a) = \begin{cases} 1 & a > 0 \\ 0 & a \le 0. \end{cases}   (40.2)

We will not have a bias w_0; the capacity for a neuron with a bias can be obtained by replacing K by K + 1 in the final result below, i.e., considering one of the inputs to be fixed to 1. (These input points would not then be in general position; the derivation still works.)
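A minimal sketch of equations (40.1)-(40.2), including the bias variant obtained by fixing one extra input to 1; the function names are my own.

```python
import numpy as np

def threshold_neuron(w, x):
    """y = f(sum_k w_k x_k) with f(a) = 1 if a > 0, else 0."""
    a = np.dot(w, x)
    return 1 if a > 0 else 0

def threshold_neuron_with_bias(w, w0, x):
    """Equivalent to a bias-free neuron with K+1 inputs, one of them fixed to 1."""
    return threshold_neuron(np.append(w, w0), np.append(x, 1.0))

print(threshold_neuron(np.array([1.0, -2.0]), np.array([0.5, 0.1])))                  # 1
print(threshold_neuron_with_bias(np.array([1.0, -2.0]), -1.0, np.array([0.5, 0.1])))  # 0
```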
