27.03.2013 Views

Brian S. Everitt A Handbook of Statistical Analyses using SPSS



The procedure continues until all individuals in a cluster are closer to their own cluster mean vector than to that of any other cluster.

Essentially, the technique seeks to minimize the variability within clusters and maximize the variability between clusters.
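The stopping rule described above can be illustrated with a minimal sketch. It assumes a batch form of k-means (all individuals are reassigned, then all mean vectors are recomputed), which may differ in detail from the relocation scheme SPSS uses; the function names, the random choice of starting means, and the toy data are illustrative assumptions only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def k_means(points, k, seed=0, max_iter=100):
    """Batch k-means sketch: stop when no individual changes cluster."""
    rng = random.Random(seed)
    # Assumption: k randomly chosen individuals serve as the starting means.
    means = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Allocate each individual to the cluster with the nearest mean vector.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, means[j]))
            for p in points
        ]
        # Unchanged labels mean every individual is already closest to
        # its own cluster mean, so the procedure stops.
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each cluster mean from its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mean
                means[j] = mean_vector(members)
    return labels, means


# Hypothetical toy data: two well-separated groups of three individuals.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, means = k_means(data, k=2)
print(labels)
print(means)
```

Because the starting means are chosen at random, the cluster numbering can differ between runs with different seeds even when the partition itself is the same.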

Finding the optimal number of groups is also an issue with this type of clustering. In practice, a k-means solution is usually found for a range of values of k, and then one of the largely ad hoc techniques described in Everitt et al. (2001) for indicating the correct number of groups is applied.
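As a rough illustration of fitting solutions over a range of k, the sketch below records the within-cluster sum of squares for each k. Inspecting how this quantity falls as k increases is one common informal check; it is used here as an assumption and is not necessarily one of the techniques the cited text describes. The helper names, the number of random restarts, and the toy data are likewise hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def k_means_wss(points, k, rng, max_iter=100):
    """Run one batch k-means fit and return its within-cluster sum of squares."""
    means = rng.sample(points, k)  # assumption: random individuals as starting means
    labels = None
    for _ in range(max_iter):
        new = [min(range(k), key=lambda j: sq_dist(p, means[j])) for p in points]
        if new == labels:
            break
        labels = new
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                means[j] = centroid(members)
    return sum(sq_dist(p, means[lab]) for p, lab in zip(points, labels))


def wss_by_k(points, k_values, restarts=10, seed=0):
    """Best (smallest) within-cluster sum of squares found for each k."""
    rng = random.Random(seed)
    return {
        k: min(k_means_wss(points, k, rng) for _ in range(restarts))
        for k in k_values
    }


# Hypothetical toy data: two well-separated groups of three individuals.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
wss = wss_by_k(data, range(1, 5))
for k, value in wss.items():
    print(k, round(value, 3))
```

A sharp drop followed by a levelling off suggests a candidate number of groups, but such a reading is informal and should be treated with the same caution as any other ad hoc indicator.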

Different methods of cluster analysis applied to the same set of data often result in different solutions. Many methods are really only suitable when the clusters are approximately spherical in shape; details are given in Everitt et al. (2001).

Box 12.2 Fisher’s Linear Discriminant Function<br />

A further aspect of the classification of multivariate data concerns the derivation of rules and procedures for allocating individuals/objects to one of a set of a priori defined groups in some optimal fashion on the basis of a set of q measurements, $x_1, x_2, \ldots, x_q$, taken on each individual or object.

This is the province of assignment or discrimination techniques. The sample of observations for which the groups are known is often called the training set.

Here we shall concentrate on the two group situation and on the most commonly used assignment procedure, namely, Fisher’s linear discriminant function (Fisher, 1936).

Fisher’s suggestion was to seek a linear transformation of the variables

$$z = a_1 x_1 + a_2 x_2 + \cdots + a_q x_q$$

such that the separation between the group means on the transformed scale, $\bar{z}_1$ and $\bar{z}_2$, would be maximized relative to the within group variation on the z-scale.

© 2004 by Chapman & Hall/CRC Press LLC
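To make Fisher’s criterion concrete, the following sketch computes the coefficients for q = 2 variables. It relies on the standard result, not stated in the passage above, that the maximizing coefficients are proportional to $S^{-1}(\bar{x}_1 - \bar{x}_2)$, where $S$ is the pooled within-group covariance matrix. The function names and the training data are hypothetical, and the code is restricted to two variables so that the matrix inverse can be written out explicitly.

```python
def mean_vector(rows):
    """Component-wise mean of a non-empty list of observations."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows, mean):
    """2 x 2 sum of outer products of deviations from the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_coefficients(group1, group2):
    """Coefficients a = S^{-1}(mean1 - mean2) for two variables (q = 2)."""
    m1, m2 = mean_vector(group1), mean_vector(group2)
    s1, s2 = scatter_matrix(group1, m1), scatter_matrix(group2, m2)
    dof = len(group1) + len(group2) - 2
    # Pooled within-group covariance matrix S.
    pooled = [[(s1[i][j] + s2[i][j]) / dof for j in range(2)] for i in range(2)]
    # Explicit inverse of the 2 x 2 matrix S.
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [
        [pooled[1][1] / det, -pooled[0][1] / det],
        [-pooled[1][0] / det, pooled[0][0] / det],
    ]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    a = [
        inv[0][0] * diff[0] + inv[0][1] * diff[1],
        inv[1][0] * diff[0] + inv[1][1] * diff[1],
    ]
    return a, m1, m2


def discriminant_score(a, x):
    """z = a_1 x_1 + a_2 x_2 for a single observation x."""
    return a[0] * x[0] + a[1] * x[1]


# Hypothetical training set with known group membership.
group1 = [(4, 2), (5, 3), (6, 2), (5, 1)]
group2 = [(1, 1), (2, 2), (3, 1), (2, 0)]
a, m1, m2 = fisher_coefficients(group1, group2)
z1_bar = discriminant_score(a, m1)
z2_bar = discriminant_score(a, m2)
print(a, z1_bar, z2_bar)
```

Because the transformation is linear, the group means on the z-scale are simply the discriminant scores of the group mean vectors, which is how `z1_bar` and `z2_bar` are obtained here.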
