Copyright Cambridge University Press 2003. On-screen viewing permitted. Printing not permitted. http://www.cambridge.org/0521642981. You can buy this book for 30 pounds or $50. See http://www.inference.phy.cam.ac.uk/mackay/itila/ for links.

40  Capacity of a Single Neuron

[Figure 40.1. Neural network learning viewed as communication: the targets {t_n}_{n=1}^N at locations {x_n}_{n=1}^N enter a learning algorithm, which produces weights w; the receiver uses w (and the locations) to produce reconstructions {t̂_n}_{n=1}^N.]

40.1 Neural network learning as communication

Many neural network models involve the adaptation of a set of weights w in response to a set of data points, for example a set of N target values D_N = {t_n}_{n=1}^N at given locations {x_n}_{n=1}^N. The adapted weights are then used to process subsequent input data. This process can be viewed as a communication process, in which the sender examines the data D_N and creates a message w that depends on those data. The receiver then uses w; for example, the receiver might use the weights to try to reconstruct what the data D_N was. [In neural network parlance, this is using the neuron for ‘memory’ rather than for ‘generalization’; ‘generalizing’ means extrapolating from the observed data to the value of t_{N+1} at some new location x_{N+1}.] Just as a disk drive is a communication channel, the adapted network weights w therefore play the role of a communication channel, conveying information about the training data to a future user of that neural net. The question we now address is, ‘what is the capacity of this channel?’ – that is, ‘how much information can be stored by training a neural network?’

If we had a learning algorithm that either produces a network whose response to all inputs is +1 or a network whose response to all inputs is 0, depending on the training data, then the weights allow us to distinguish between just two sorts of data set.
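As a sanity check on this counting argument, the information conveyed by a learner that can emit only a small set of distinguishable weight settings can be computed directly. A minimal sketch (the function name is my own, not from the text):

```python
import math

def bits_conveyed(num_distinguishable_settings):
    """Information (in bits) conveyed by a message drawn uniformly from a
    set of equiprobable, distinguishable outcomes: log2 of the set size."""
    return math.log2(num_distinguishable_settings)

# The all-(+1)s / all-0s learner above distinguishes two sorts of data set:
print(bits_conveyed(2))  # 1.0 bit
```
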
The maximum information such a learning algorithm could convey about the data is therefore 1 bit, this information content being achieved if the two sorts of data set are equiprobable. How much more information can be conveyed if we make full use of a neural network’s ability to represent other functions?

40.2 The capacity of a single neuron

We will look at the simplest case, that of a single binary threshold neuron. We will find that the capacity of such a neuron is two bits per weight. A neuron with K inputs can store 2K bits of information.

To obtain this interesting result we lay down some rules to exclude less interesting answers, such as: ‘the capacity of a neuron is infinite, because each
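The single binary threshold neuron just introduced can be sketched as follows; this is a minimal illustration, and the particular weight and input values are my own assumptions, not taken from the text:

```python
def threshold_neuron(w, x):
    """Binary threshold neuron: output 1 if the activation
    sum_k w_k x_k is positive, else 0."""
    activation = sum(wk * xk for wk, xk in zip(w, x))
    return 1 if activation > 0 else 0

# A neuron with K = 3 inputs; the chapter's result says the adapted
# weights of such a neuron can store about 2K bits of information.
w = [0.5, -1.0, 0.25]   # illustrative adapted weights
x = [1.0, 0.2, 1.0]     # one input location x_n
print(threshold_neuron(w, x))  # activation 0.5 - 0.2 + 0.25 = 0.55 > 0, so 1
```
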
