

8.5.5 Kappa Measure of Agreement

An alternative approach describes strength of agreement using a single summary index, rather than a model. The most popular index is Cohen's kappa. It compares the agreement with that expected if the ratings were independent. The probability of agreement equals $\sum_i \pi_{ii}$. If the observers' ratings were independent, then $\pi_{ii} = \pi_{i+}\pi_{+i}$ and the probability of agreement equals $\sum_i \pi_{i+}\pi_{+i}$.

Cohen's kappa is defined by

$$
\kappa = \frac{\sum_i \pi_{ii} - \sum_i \pi_{i+}\pi_{+i}}{1 - \sum_i \pi_{i+}\pi_{+i}}
$$

The numerator compares the probability of agreement with that expected under independence. The denominator replaces $\sum_i \pi_{ii}$ with its maximum possible value of 1, corresponding to perfect agreement. Kappa equals 0 when the agreement merely equals that expected under independence, and it equals 1.0 when perfect agreement occurs. The stronger the agreement is, for a given pair of marginal distributions, the higher the value of kappa.

For Table 8.7, $\sum_i \hat{\pi}_{ii} = (22 + 7 + 36 + 10)/118 = 0.636$, whereas $\sum_i \hat{\pi}_{i+}\hat{\pi}_{+i} = [(26)(27) + (26)(12) + (38)(69) + (28)(10)]/(118)^2 = 0.281$. Sample kappa equals

$$
\hat{\kappa} = \frac{0.636 - 0.281}{1 - 0.281} = 0.49
$$

The difference between the observed agreement and that expected under independence is about 50% of the maximum possible difference.
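As a check on this arithmetic, here is a minimal Python sketch (NumPy assumed; the helper name cohen_kappa is our own label, and since the full 4 x 4 cell counts of Table 8.7 are not reproduced above, the example reuses only the diagonal and marginal counts quoted in the text):

```python
import numpy as np

def cohen_kappa(table):
    """Cohen's kappa for a square table of counts:
    kappa = (sum_i pi_ii - sum_i pi_i+ pi_+i) / (1 - sum_i pi_i+ pi_+i)."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    p_agree = np.trace(table) / n                                    # sum_i pi_ii
    p_indep = (table.sum(axis=1) * table.sum(axis=0)).sum() / n**2   # sum_i pi_i+ pi_+i
    return (p_agree - p_indep) / (1.0 - p_indep)

# Table 8.7: redo the arithmetic from the diagonal counts and marginal
# totals quoted in the text (the full table of counts is not shown here).
n = 118
diag = np.array([22, 7, 36, 10])     # agreements, one count per category
marg_a = np.array([26, 26, 38, 28])  # one rater's marginal counts (n * pi_i+)
marg_b = np.array([27, 12, 69, 10])  # other rater's marginal counts (n * pi_+i)

p_agree = diag.sum() / n                      # 0.636
p_indep = (marg_a * marg_b).sum() / n**2      # 0.281
print((p_agree - p_indep) / (1 - p_indep))    # about 0.49
```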

Kappa treats the variables as nominal: when the categories are ordered, it treats a disagreement between categories that are close the same as a disagreement between categories that are far apart. For ordinal scales, a weighted kappa extension gives more weight to disagreements between categories that are farther apart.
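To illustrate the idea, here is a short sketch of weighted kappa using the common quadratic weights $w_{ij} = 1 - (i - j)^2/(I - 1)^2$ (one standard choice among several; the function name weighted_kappa is ours):

```python
import numpy as np

def weighted_kappa(table):
    """Weighted kappa with quadratic weights w_ij = 1 - (i - j)^2 / (I - 1)^2,
    so disagreements in nearby categories are penalized less than distant ones."""
    table = np.asarray(table, dtype=float)
    I = table.shape[0]
    i, j = np.indices((I, I))
    w = 1.0 - (i - j) ** 2 / (I - 1) ** 2          # 1 on the diagonal, 0 in the far corners
    p = table / table.sum()                        # cell proportions pi_ij
    e = np.outer(p.sum(axis=1), p.sum(axis=0))     # pi_i+ * pi_+j under independence
    return ((w * p).sum() - (w * e).sum()) / (1.0 - (w * e).sum())
```

Setting $w_{ii} = 1$ and $w_{ij} = 0$ for $i \ne j$ in this expression recovers ordinary kappa.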

Controversy surrounds the usefulness of kappa, primarily because its value depends strongly on the marginal distributions. The same diagnostic rating process can yield quite different values of kappa, depending on the proportions of cases of the various types. We prefer to construct models describing the structure of agreement and disagreement, rather than to depend solely on this summary index.

8.6 BRADLEY–TERRY MODEL FOR PAIRED PREFERENCES∗

Table 8.9 summarizes results of matches among five professional tennis players during 2004 and 2005. For instance, Roger Federer won three of the four matches that he and Tim Henman played. This section presents a model that applies to data of this sort, in which observations consist of pairwise comparisons that result in a preference for one category over another. The fitted model provides a ranking of the players. It also estimates the probabilities of win and of loss for matches between each pair of players.
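To make the fitting idea concrete before the model is developed formally, here is a rough sketch (ours, not code from the text) that estimates Bradley–Terry "ability" parameters $\beta_i$ by maximizing the likelihood implied by $\mathrm{logit}\,P(i \text{ beats } j) = \beta_i - \beta_j$. The function name fit_bradley_terry and the three-player wins matrix are hypothetical, since Table 8.9 itself is not reproduced above:

```python
import numpy as np
from scipy.optimize import minimize

def fit_bradley_terry(wins):
    """Fit Bradley-Terry abilities beta by maximum likelihood.

    wins[i, j] = number of times player i beat player j.
    Model: logit P(i beats j) = beta_i - beta_j, with the last player's
    beta fixed at 0 to identify the scale.
    """
    wins = np.asarray(wins, dtype=float)
    k = wins.shape[0]

    def negloglik(beta_free):
        beta = np.append(beta_free, 0.0)        # last player is the baseline
        diff = beta[:, None] - beta[None, :]    # beta_i - beta_j
        p = 1.0 / (1.0 + np.exp(-diff))         # P(i beats j)
        return -(wins * np.log(p)).sum()        # log-likelihood, up to a constant

    fit = minimize(negloglik, np.zeros(k - 1), method="BFGS")
    return np.append(fit.x, 0.0)

# Hypothetical 3-player wins matrix (made up for illustration, not Table 8.9):
wins = np.array([[0, 4, 3],
                 [1, 0, 2],
                 [2, 3, 0]])
beta = fit_bradley_terry(wins)
p01 = 1 / (1 + np.exp(-(beta[0] - beta[1])))  # estimated P(player 0 beats player 1)
print(beta, p01)
```

Ranking players by the fitted $\beta_i$ and converting differences $\beta_i - \beta_j$ back through the logistic function gives the estimated win and loss probabilities for each pairing.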
