
Introduction to Categorical Data Analysis



CHAPTER 11

A Historical Tour of Categorical Data Analysis ∗

We conclude by providing a historical overview of the evolution of methods for categorical data analysis (CDA). The beginnings of CDA were often shrouded in controversy. Key figures in the development of statistical science made groundbreaking contributions, but these statisticians were often in heated disagreement with one another.

11.1 THE PEARSON–YULE ASSOCIATION CONTROVERSY

Much of the early development of methods for CDA took place in the UK. It is fitting that we begin our historical tour in London in 1900, because in that year Karl Pearson introduced his chi-squared statistic (X²). Pearson’s motivation for developing the chi-squared test included testing whether outcomes on a roulette wheel in Monte Carlo varied randomly and testing statistical independence in two-way contingency tables.
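As a rough illustration of the statistic mentioned above, the following minimal Python sketch computes Pearson's X² for a two-way contingency table by comparing each observed count with the count expected under independence (row total times column total, divided by the grand total). The function name `pearson_chi_squared` and the counts are hypothetical, chosen only for illustration; they do not come from the book.

```python
def pearson_chi_squared(table):
    """Return Pearson's X^2 statistic for a two-way table of counts.

    `table` is a list of rows; every row must have the same length and
    every row and column total must be positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
    return x2


# Hypothetical 2x2 counts, invented for this example.
counts = [[30, 10],
          [20, 40]]
x2 = pearson_chi_squared(counts)
print(round(x2, 3))
```

A table whose rows are exactly proportional, such as `[[10, 20], [20, 40]]`, gives a statistic of zero, since every observed count equals its expected count.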

Much of the literature on CDA in the early 1900s consisted of vocal debates about appropriate ways to summarize association. Pearson’s approach assumed that continuous bivariate distributions underlie cross-classification tables. He argued that one should describe association by approximating a measure, such as the correlation, for the underlying continuum. In 1904, Pearson introduced the term contingency as a “measure of the total deviation of the classification from independent probability,” and he introduced measures to describe its extent and to estimate the correlation.

George Udny Yule (1871–1951), an English contemporary of Pearson’s, took an alternative approach in his study of association between 1900 and 1912. He believed that many categorical variables are inherently discrete. Yule defined measures, such as the odds ratio, directly using cell counts without assuming an underlying continuum. Discussing one of Pearson’s measures that assumes underlying normality,

An Introduction to Categorical Data Analysis, Second Edition. By Alan Agresti

Copyright © 2007 John Wiley & Sons, Inc.

