10.07.2015 Views

Information Theory, Inference, and Learning ... - Inference Group


Copyright Cambridge University Press 2003. On-screen viewing permitted. Printing not permitted. http://www.cambridge.org/0521642981
You can buy this book for 30 pounds or $50. See http://www.inference.phy.cam.ac.uk/mackay/itila/ for links.

20.1: K-means clustering

Figure 20.3. K-means algorithm applied to a data set of 40 points. K = 2 means evolve to stable locations after three iterations. [Panels: Data, followed by three Assignment/Update pairs.]

Figure 20.4. K-means algorithm applied to a data set of 40 points. Two separate runs (Run 1 and Run 2), both with K = 4 means, reach different solutions. Each frame shows a successive assignment step.

Exercise 20.1. [4, p.291] See if you can prove that K-means always converges. [Hint: find a physical analogy and an associated Lyapunov function.] [A Lyapunov function is a function of the state of the algorithm that decreases whenever the state changes and that is bounded below. If a system has a Lyapunov function then its dynamics converge.]

The K-means algorithm with a larger number of means, 4, is demonstrated in figure 20.4. The outcome of the algorithm depends on the initial condition. In the first case, after five iterations, a steady state is found in which the data points are fairly evenly split between the four clusters. In the second case, after six iterations, half the data points are in one cluster, and the others are shared among the other three clusters.

Questions about this algorithm

The K-means algorithm has several ad hoc features. Why does the update step set the ‘mean’ to the mean of the assigned points? Where did the distance d come from? What if we used a different measure of distance between x and m? How can we choose the ‘best’ distance? [In vector quantization, the distance
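The dependence on the initial condition, and the role of a Lyapunov function in exercise 20.1, can be explored with a minimal sketch of the assignment and update steps. Several things here are assumptions rather than details taken from this page: the distance is taken to be half the squared Euclidean distance, the candidate Lyapunov function is the total distance from each point to its assigned mean, a mean that receives no points is left where it is, and the 40 data points and the names `half_sq_dist`, `kmeans`, and `energies` are made up for illustration.

```python
import random


def half_sq_dist(x, m):
    # Assumed distance: half the squared Euclidean distance between x and m.
    return 0.5 * sum((xi - mi) ** 2 for xi, mi in zip(x, m))


def kmeans(points, means, max_iters=100):
    """Run hard K-means; return (means, assignments, energies).

    energies[t] is the total distance between each point and its assigned
    mean, measured just after the t-th assignment step.
    """
    means = [tuple(m) for m in means]
    assignments = None
    energies = []
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest mean.
        new_assignments = [
            min(range(len(means)), key=lambda k: half_sq_dist(x, means[k]))
            for x in points
        ]
        energies.append(
            sum(half_sq_dist(x, means[k]) for x, k in zip(points, new_assignments))
        )
        if new_assignments == assignments:
            break  # no assignment changed, so the state is stable
        assignments = new_assignments
        # Update step: move each mean to the mean of its assigned points.
        for k in range(len(means)):
            members = [x for x, a in zip(points, assignments) if a == k]
            if members:  # assumption: an empty cluster keeps its old mean
                dim = len(members[0])
                means[k] = tuple(
                    sum(x[d] for x in members) / len(members) for d in range(dim)
                )
    return means, assignments, energies


if __name__ == "__main__":
    # Hypothetical data set of 40 points drawn around two centres.
    rng = random.Random(0)
    points = [(rng.gauss(c, 1.0), rng.gauss(c, 1.0)) for c in (0.0, 4.0) for _ in range(20)]
    # Two runs with K = 4, differing only in the initial means.
    for seed in (1, 2):
        init = random.Random(seed).sample(points, 4)
        means, assignments, energies = kmeans(points, init)
        sizes = [assignments.count(k) for k in range(4)]
        print("cluster sizes:", sizes, "final energy:", energies[-1])
```

Under these assumptions the recorded `energies` never increase from one iteration to the next, which is the behaviour a Lyapunov function must have; whether two runs end in the same clustering depends on the initial means chosen.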
