42.7: The capacity of the Hopfield network

Of these failure modes, modes 1 and 2 are clearly undesirable, mode 2 especially so. Mode 3 might not matter so much as long as each of the desired memories has a large basin of attraction. The fourth failure mode might in some contexts actually be viewed as beneficial. For example, if a network is required to memorize examples of valid sentences such as 'John loves Mary' and 'John gets cake', we might be happy to find that 'John loves cake' was also a stable state of the network. We might call this behaviour 'generalization'.

The capacity of a Hopfield network with I neurons might be defined to be the number of random patterns N that can be stored without failure-mode 2 having substantial probability. If we also require failure-mode 1 to have tiny probability then the resulting capacity is much smaller. We now study these alternative definitions of the capacity.

The capacity of the Hopfield network – stringent definition

We will first explore the information storage capabilities of a binary Hopfield network that learns using the Hebb rule by considering the stability of just one bit of one of the desired patterns, assuming that the state of the network is set to that desired pattern x^{(n)}. We will assume that the patterns to be stored are randomly selected binary patterns.

The activation of a particular neuron i is

    a_i = \sum_j w_{ij} x_j^{(n)},                                            (42.18)

where the weights are, for i \neq j,

    w_{ij} = x_i^{(n)} x_j^{(n)} + \sum_{m \neq n} x_i^{(m)} x_j^{(m)}.       (42.19)

Here we have split W into two terms, the first of which will contribute 'signal', reinforcing the desired memory, and the second 'noise'. Substituting for w_{ij}, the activation is

    a_i = \sum_{j \neq i} x_i^{(n)} x_j^{(n)} x_j^{(n)}
          + \sum_{j \neq i} \sum_{m \neq n} x_i^{(m)} x_j^{(m)} x_j^{(n)}     (42.20)

        = (I - 1) x_i^{(n)}
          + \sum_{j \neq i} \sum_{m \neq n} x_i^{(m)} x_j^{(m)} x_j^{(n)}.    (42.21)

The first term is (I - 1) times the desired state x_i^{(n)}. If this were the only term, it would keep the neuron firmly clamped in the desired state. The second term is a sum of (I - 1)(N - 1) random quantities x_i^{(m)} x_j^{(m)} x_j^{(n)}. A moment's reflection confirms that these quantities are independent random binary variables with mean 0 and variance 1.

Thus, considering the statistics of a_i under the ensemble of random patterns, we conclude that a_i has mean (I - 1) x_i^{(n)} and variance (I - 1)(N - 1). For brevity, we will now assume I and N are large enough that we can neglect the distinction between I and I - 1, and between N and N - 1. Then we can restate our conclusion: a_i is Gaussian-distributed with mean I x_i^{(n)} and variance IN.

What then is the probability that the selected bit is stable, if we put the network into the state x^{(n)}? The probability that bit i will flip on the first iteration of the Hopfield network's dynamics is

    P(i\ \text{unstable}) = \Phi\!\left( -\frac{I}{\sqrt{IN}} \right)
                          = \Phi\!\left( -\frac{1}{\sqrt{N/I}} \right).       (42.22)

[Figure 42.7. The probability density of the activation a_i in the case x_i^{(n)} = 1; the probability that bit i becomes flipped is the area of the tail.]
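The prediction (42.22) is easy to check numerically. The sketch below is not from the text; it stores N random ±1 patterns in an I-neuron network with the Hebb rule, sets the state to one stored pattern x^{(n)}, and measures the fraction of bits whose activation disagrees with that pattern. The values of I, N, and the trial count, and the use of numpy/scipy, are illustrative assumptions.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
I, N, trials = 500, 100, 200  # neurons, stored patterns, repetitions (illustrative values)

flip_fraction = 0.0
for _ in range(trials):
    X = rng.choice([-1, 1], size=(N, I))       # N random binary patterns x^(m), one per row
    W = X.T @ X                                # Hebb rule: w_ij = sum_m x_i^(m) x_j^(m)
    np.fill_diagonal(W, 0)                     # no self-connections (i != j)
    x = X[0]                                   # set the network state to the desired pattern x^(n)
    a = W @ x                                  # activations a_i = sum_j w_ij x_j^(n)
    flip_fraction += np.mean(np.sign(a) != x)  # bits whose sign disagrees with x^(n) would flip

print("empirical P(i unstable):   ", flip_fraction / trials)
print("predicted Phi(-sqrt(I/N)): ", norm.cdf(-np.sqrt(I / N)))

With I = 500 and N = 100 the Gaussian prediction is Phi(-sqrt(5)), roughly 0.013, and the empirical flip fraction should come out close to that value.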
