11.07.2015 Views

Preface to First Edition - lib



334 CLUSTER ANALYSIS

18.4 Summary

Cluster analysis techniques provide a rich source of possible strategies for exploring complex multivariate data. In practice, however, cluster analysis does not simply involve applying one particular technique to the data under investigation; rather, it necessitates a series of steps, each of which may depend on the results of the preceding one. It is generally impossible a priori to anticipate what combination of variables, similarity measures and clustering technique is likely to lead to interesting and informative classifications. Consequently, the analysis proceeds through several stages, with the researcher intervening if necessary to alter variables, choose a different similarity measure, concentrate on a particular subset of individuals, and so on. The final, extremely important, stage concerns the evaluation of the clustering solutions obtained. Are the clusters ‘real’ or merely artefacts of the algorithms? Do other solutions exist that are better in some sense? Can the clusters be given a convincing interpretation? A long list of such questions might be posed, and readers intending to apply clustering to their data are recommended to read the detailed accounts of cluster evaluation given in Dubes and Jain (1979) and in Everitt et al. (2001).

Exercises

Ex. 18.1 Construct a three-dimensional drop-line scatterplot of the planets data in which the points are labelled with a suitable cluster label.

Ex. 18.2 Write an R function to fit a mixture of k normal densities to a data set using maximum likelihood.

Ex. 18.3 Apply complete linkage and average linkage hierarchical clustering to the planets data. Compare the results with those given in the text.

Ex. 18.4 Write a general R function that will display a particular partition from the k-means cluster method on both a scatterplot matrix of the original data and a scatterplot or scatterplot matrix of a selected number of principal components of the data.

© 2010 by Taylor and Francis Group, LLC
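
The staged workflow described in the summary (vary the dissimilarity measure and clustering method, then evaluate the resulting solutions) might be sketched in R roughly as follows. This is only an illustration under stated assumptions: it uses the built-in USArrests data as a stand-in rather than the data analysed in this chapter, and the number of clusters (three) and the range of k examined are arbitrary choices.

```r
# Stand-in data (an assumption): USArrests ships with base R.
x <- scale(USArrests)  # standardise so no single variable dominates the distances

# Stage 1: try more than one dissimilarity measure and linkage method.
d_euclid    <- dist(x, method = "euclidean")
d_manhattan <- dist(x, method = "manhattan")
hc_complete <- hclust(d_euclid,    method = "complete")
hc_average  <- hclust(d_manhattan, method = "average")

# Stage 2: cut both trees into the same number of groups and cross-tabulate
# the two partitions to see how far they agree.
n_groups <- 3  # assumed for illustration only
groups_complete <- cutree(hc_complete, k = n_groups)
groups_average  <- cutree(hc_average,  k = n_groups)
agreement <- table(complete = groups_complete, average = groups_average)
print(agreement)

# Stage 3: a crude look at whether another number of groups would do better:
# total within-group sum of squares from k-means for k = 2, ..., 6.
set.seed(1)  # k-means starts from randomly chosen centres
wss <- sapply(2:6, function(kk) kmeans(x, centers = kk, nstart = 20)$tot.withinss)
names(wss) <- 2:6
print(round(wss, 1))
```

A cross-tabulation with most observations concentrated in a few cells suggests the two solutions largely agree; widely scattered counts suggest the grouping depends on the method chosen. Neither outcome settles whether the clusters are ‘real’, which is the question the evaluation literature cited above addresses.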
