13.11.2012 Views

Introduction to Categorical Data Analysis


CONTINGENCY TABLES

n11 equals

$$P(n_{11}) = \frac{\binom{n_{1+}}{n_{11}} \binom{n_{2+}}{n_{+1} - n_{11}}}{\binom{n}{n_{+1}}} \tag{2.11}$$

The binomial coefficients equal $\binom{a}{b} = a!/[b!\,(a - b)!]$.

To test H0: independence, the P-value is the sum of hypergeometric probabilities for outcomes at least as favorable to Ha as the observed outcome. We illustrate for Ha: θ > 1. Given the marginal totals, tables having larger n11 values also have larger sample odds ratios θ̂ = (n11 n22)/(n12 n21); hence, they provide stronger evidence in favor of this alternative. The P-value equals the right-tail hypergeometric probability that n11 is at least as large as the observed value. This test, proposed by the eminent British statistician R. A. Fisher in 1934, is called Fisher’s exact test.
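The right-tail calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `hypergeom_pmf` and `fisher_right_tail`, the argument names `n1p`, `n2p`, `np1` (standing for n1+, n2+, n+1), and the example counts are all assumptions introduced here, not taken from the text.

```python
from math import comb


def hypergeom_pmf(n11, n1p, n2p, np1):
    """Hypergeometric probability of n11, given row totals n1p, n2p
    and first-column total np1 (hypothetical argument names)."""
    n = n1p + n2p
    lo = max(0, np1 - n2p)   # smallest n11 compatible with the margins
    hi = min(n1p, np1)       # largest n11 compatible with the margins
    if n11 < lo or n11 > hi:
        return 0.0
    return comb(n1p, n11) * comb(n2p, np1 - n11) / comb(n, np1)


def fisher_right_tail(n11, n1p, n2p, np1):
    """One-sided P-value for Ha: theta > 1, i.e. P(N11 >= observed n11)."""
    hi = min(n1p, np1)
    return sum(hypergeom_pmf(k, n1p, n2p, np1) for k in range(n11, hi + 1))


# Hypothetical 2 x 2 table (not from the text): rows (5, 2) and (1, 6),
# so n1+ = 7, n2+ = 7, n+1 = 6, and the observed n11 is 5.
p_value = fisher_right_tail(n11=5, n1p=7, n2p=7, np1=6)
print(round(p_value, 4))
```

Restricting the sum to values of n11 that the margins allow keeps every binomial coefficient well defined. Python's `math.comb` raises an error for a negative lower argument, so the support check comes first.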

2.6.2 Example: Fisher’s Tea Taster

To illustrate this test in his 1935 book, The Design of Experiments, Fisher described the following experiment: When drinking tea, a colleague of Fisher’s at Rothamsted Experiment Station near London claimed she could distinguish whether milk or tea was added to the cup first. To test her claim, Fisher designed an experiment in which she tasted eight cups of tea. Four cups had milk added first, and the other four had tea added first. She was told there were four cups of each type and she should try to select the four that had milk added first. The cups were presented to her in random order.

Table 2.8 shows a potential result of the experiment. The null hypothesis H0: θ = 1 for Fisher’s exact test states that her guess was independent of the actual order of pouring. The alternative hypothesis that reflects her claim, predicting a positive association between true order of pouring and her guess, is Ha: θ > 1. For this experimental design, the column margins are identical to the row margins (4, 4), because she knew that four cups had milk added first. Both marginal distributions are naturally fixed.

Table 2.8. Fisher’s Tea Tasting Experiment

                  Guess Poured First
Poured First      Milk      Tea      Total
Milk                3        1         4
Tea                 1        3         4
Total               4        4
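For the counts in Table 2.8, the hypergeometric probabilities and the right-tail P-value can be checked numerically. The sketch below is only an illustration: the variable names (`row1`, `row2`, `col1`, `null_dist`) are assumptions made here, and exact fractions are used so that no rounding enters the sum.

```python
from fractions import Fraction
from math import comb

# Margins from Table 2.8: both row totals and both column totals equal 4.
row1, row2, col1 = 4, 4, 4
n = row1 + row2
observed_n11 = 3  # cups correctly guessed as "milk first" in Table 2.8

# Exact null probability of each n11 value that the margins allow.
null_dist = {
    k: Fraction(comb(row1, k) * comb(row2, col1 - k), comb(n, col1))
    for k in range(max(0, col1 - row2), min(row1, col1) + 1)
}

# Right-tail P-value for Ha: theta > 1.
p_value = sum(p for k, p in null_dist.items() if k >= observed_n11)

for k, p in null_dist.items():
    print(k, p, round(float(p), 3))
print("P-value:", p_value, round(float(p_value), 3))
```

With these margins the right-tail sum is 16/70 + 1/70 = 17/70, about 0.243, so the result in Table 2.8 gives little evidence against H0.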

The null distribution of n11 is the hypergeometric distribution defined for all 2 × 2 tables having row and column margins (4, 4). The potential values for n11 are (0, 1,
