
Brian S. Everitt A Handbook of Statistical Analyses using SPSS


z 30.463 (12.2)

The coefficients defining Fisher’s linear discriminant function in equation (12.1) are proportional to the unstandardized coefficients given in the “Canonical Discriminant Function Coefficients” table, which is produced when Unstandardized is checked in the Statistics sub-dialogue box (see Display 12.3). The latter (scaled) discriminant function is used by SPSS to calculate a discriminant score for each object (here, each skull). The discriminant scores are centered so that they have a sample mean of zero. These scores can be compared with the average of their group means (shown in the “Functions at Group Centroids” table) to allocate skulls into groups. Here the threshold against which a skull’s discriminant score is evaluated is

$$0.0585 = \tfrac{1}{2}\,(-0.877 + 0.994) \qquad (12.3)$$

Thus new skulls with discriminant scores above 0.0585 would be assigned to the Lhasa site (type B); otherwise, they would be classified as type A.
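The allocation rule just described can be sketched in a few lines of Python. This is only an illustration, not SPSS output: it assumes that the two values in equation (12.3) are the group centroids, with -0.877 belonging to type A and 0.994 to type B, and the three discriminant scores in the loop are hypothetical.

```python
# Group centroids, assumed to be the two values entering equation (12.3):
# -0.877 for type A and 0.994 for type B (Lhasa).
centroid_a = -0.877
centroid_b = 0.994

# The threshold is the average of the two group means (about 0.0585).
threshold = (centroid_a + centroid_b) / 2


def classify(score, cut=threshold):
    """Return 'B' if the discriminant score exceeds the cut-off, else 'A'."""
    return "B" if score > cut else "A"


# Hypothetical discriminant scores for three new skulls (not from the data set).
for score in (-1.2, 0.05, 0.8):
    print(score, classify(score))
```

A score only slightly above zero, such as 0.05, is still allocated to type A, because the cut-off sits at the midpoint of the two centroids rather than at the overall mean of zero.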

When variables are measured on different scales, the magnitude of an unstandardized coefficient provides little indication of the relative contribution of the variable to the overall discrimination. The “Standardized Canonical Discriminant Function Coefficients” listed attempt to overcome this problem by rescaling the variables to unit standard deviation. For our data, such standardization is not necessary since all skull measurements were in millimeters. Standardization should, however, not matter much since the within-group standard deviations were similar across the different skull measures (see Display 12.4). According to the standardized coefficients, skull height (meas3) seems to contribute little to discriminating between the two types of skulls.
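To make the effect of the rescaling concrete, here is a minimal Python sketch. It assumes that a standardized coefficient is obtained by multiplying the unstandardized coefficient by the pooled within-group standard deviation of its variable. The two variables, their values, and the coefficients are made up for illustration and have nothing to do with the skull data.

```python
from math import sqrt


def pooled_within_sd(group_a, group_b):
    """Pooled within-group standard deviation of one variable (two groups)."""
    def sum_of_squares(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    dof = len(group_a) + len(group_b) - 2
    return sqrt((sum_of_squares(group_a) + sum_of_squares(group_b)) / dof)


def standardize(coefs, sds):
    """Rescale unstandardized coefficients by the pooled within-group SDs."""
    return [b * s for b, s in zip(coefs, sds)]


# Hypothetical data: x2 is the same measurement as x1 on a scale ten times larger.
x1_a, x1_b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
x2_a, x2_b = [10.0, 20.0, 30.0], [40.0, 50.0, 60.0]

# Hypothetical unstandardized coefficients: the one for x2 is ten times smaller.
unstandardized = [0.5, 0.05]

sds = [pooled_within_sd(x1_a, x1_b), pooled_within_sd(x2_a, x2_b)]
standardized = standardize(unstandardized, sds)
print(sds)
print(standardized)
```

In this toy example the unstandardized coefficients differ by a factor of ten purely because of the measurement scale, while the standardized coefficients come out essentially equal, which is the comparison the standardized table is meant to support.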

A question of some importance about a discriminant function is: how well does it perform? One possible method of evaluating performance is to apply the derived classification rule to the data set and calculate the misclassification rate. This is known as the resubstitution estimate, and the corresponding results are shown in the “Original” part of the “Classification Results” table in Display 12.7. According to this estimate, 81.3% of skulls can be correctly classified as type A or type B on the basis of the discriminant rule. However, estimating misclassification rates in this way is known to be overly optimistic, and several alternatives for estimating misclassification rates in discriminant analysis have been suggested (Hand, 1997). One of the most commonly used of these alternatives is the so-called leaving one out method, in which the discriminant function is first derived from only n – 1 sample members, and then used to classify the observation left out. The procedure is repeated n times, each time omitting

© 2004 by Chapman & Hall/CRC Press LLC
