

logic of the statistic, see the excellent introduction in Siegel & Castellan (1988:284-91).)

The omega statistic (Morey & Agresti 1984) is based on whether or not two raters classify each pair of stimuli in the same category, without regard to the classification of other pairs. Like kappa, it is corrected for chance agreement, so that it varies from 0 for chance agreement to 1.0 for perfect agreement. Omega is inherently less powerful than kappa, since it considers each pair of stimuli in isolation; however, it has the great advantage that it can be used in cases in which the number of categories differs from rater to rater. We will therefore use omega to measure agreement among subjects on the Sorting task, where different subjects created different numbers of categories.
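
Since the omega formula itself is not reproduced here, the following Python sketch only illustrates the underlying idea of a pairwise, chance-corrected agreement coefficient: for every pair of stimuli, each sorter either groups the pair together or keeps it apart, and agreement on these binary decisions is corrected for chance. The function name, the example data, and the independence-based chance correction are illustrative assumptions and may differ in detail from the exact statistic in Morey & Agresti (1984).

```python
from itertools import combinations

def pairwise_agreement(labels_a, labels_b):
    """Chance-corrected agreement between two sorters over pairs of stimuli.

    labels_a and labels_b are the category labels two subjects assigned to
    the same ordered list of stimuli.  The two category inventories need not
    match, because only "grouped together or not" is compared per pair.
    Returns 0 for chance-level and 1.0 for perfect agreement (a kappa-like
    correction; the exact Morey & Agresti 1984 formula may differ in detail).
    """
    assert len(labels_a) == len(labels_b)
    pairs = list(combinations(range(len(labels_a)), 2))

    together_a = [labels_a[i] == labels_a[j] for i, j in pairs]
    together_b = [labels_b[i] == labels_b[j] for i, j in pairs]

    # Observed proportion of pairs that both sorters treat the same way.
    p_obs = sum(a == b for a, b in zip(together_a, together_b)) / len(pairs)

    # Chance expectation, assuming the sorters group pairs independently.
    p_a = sum(together_a) / len(pairs)
    p_b = sum(together_b) / len(pairs)
    p_exp = p_a * p_b + (1 - p_a) * (1 - p_b)

    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical data: two subjects sorting six sentences, using different
# numbers of categories (three vs. four).
subject_1 = ["see", "see", "eye", "eye", "recognize", "recognize"]
subject_2 = ["A", "A", "B", "B", "C", "D"]
print(pairwise_agreement(subject_1, subject_2))
```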

Both kappa and omega are insensitive to the number of categories involved, or to the type of distribution of instances into categories. The variance of the sampling distribution is known for both, so that the probability of a particular outcome can be calculated.
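
As a point of reference, the chance correction shared by kappa and omega can be made concrete with a minimal two-rater Cohen's kappa in Python. The multi-rater kappa used in this study (following Siegel & Castellan 1988) generalizes the same idea; the sketch below is only an illustration of the correction, not the statistic actually reported.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Two-rater kappa over a shared category set.

    Observed agreement is corrected for the agreement expected if each rater
    assigned categories at random with their own marginal frequencies, so 0
    means chance-level agreement and 1.0 means perfect agreement.
    """
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)

    # Proportion of stimuli on which the two raters chose the same category.
    p_obs = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement under independence, from each rater's marginals.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_exp = sum((freq_a[c] / n) * (freq_b[c] / n)
                for c in set(freq_a) | set(freq_b))

    return (p_obs - p_exp) / (1 - p_exp)
```

Unlike the pairwise coefficient sketched above, this computation presupposes that both raters draw on the same list of categories, which is why kappa suits the Classification task and omega the Sorting task.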

Results and Analysis

Task 1: Sorting

The number of categories per subject ranged from 6 to 21, with a mean of 11. The proportion of examples in the categories varied greatly, from 33% for eye and 15% for recognize to 0 for some categories. The omega coefficient of agreement ranged from 0.09 to 0.49, with a median of 0.245.

Task 2: Classification

All subjects finished 99 sentences of the first set. Some subjects continued on to other sets, but the order of the sets was randomized across subjects, so that there was little overlap beyond the first set. The overall agreement among raters, measured by the kappa statistic, was .38⁵. This value is low, but understandable; not only were a large number of senses listed, but also many of the sentences were ambiguous when presented out of context.

Discussion

In evaluating the results of Experiment 1, the strengths and weaknesses of using corpus examples became apparent. On the one hand, we had learned something about the

⁵ This and all of the values of kappa reported in this study are statistically significant at p
