29.11.2014 Views

Systematically Misclassified Binary Dependent Variables




tail, would assign herself a 1. A different person who believes she has an extremely high value of the characteristic gives herself a 4. Others with intermediate values move to the closest node: people between 3 and 3.25 move to 3, while those between 3.25 and 3.5 move to 3.5. In this case, the rounding errors offset: blue against red, purple against gray, green against orange, and the two halves of the yellow. The Likert scale provides an unbiased and consistent estimate of the mean value. The variance, however, is biased downward because the extreme values are truncated to the end nodes. We note that if the location on the distribution is covariate-dependent, then covariate analysis is biased because individual deviations persist. As already noted, the variance of the distribution of the characteristic is biased downward.
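The centered case described above can be checked with a small simulation. The sketch below is illustrative only and does not reproduce the distribution in Figure 1: it assumes a normal latent characteristic with mean 2.5 and standard deviation 1, a node grid running from 1 to 4 in steps of 0.5 (inferred from the 3 and 3.5 example), and hypothetical helper names `to_likert` and `simulate`. Under those assumptions the reported mean should stay close to the latent mean, while the reported variance should come out smaller.

```python
import random
import statistics

# Assumed node grid: 1 to 4 in steps of 0.5 (inferred from the 3 / 3.5 example,
# not stated explicitly in the text).
NODES = [1.0 + 0.5 * k for k in range(7)]


def to_likert(x, nodes=NODES):
    """Map a latent value to the closest node; tail values land on an end node."""
    return min(nodes, key=lambda node: abs(node - x))


def simulate(mu=2.5, sigma=1.0, n=100_000, seed=0):
    """Draw latent values (assumed normal) and return (latent, reported) lists."""
    rng = random.Random(seed)
    latent = [rng.gauss(mu, sigma) for _ in range(n)]
    reported = [to_likert(x) for x in latent]
    return latent, reported


latent, reported = simulate()

# Compare the first two moments of the latent and reported values.
mean_latent = statistics.fmean(latent)
mean_reported = statistics.fmean(reported)
var_latent = statistics.pvariance(latent)
var_reported = statistics.pvariance(reported)

print(f"mean: latent {mean_latent:.3f}, reported {mean_reported:.3f}")
print(f"variance: latent {var_latent:.3f}, reported {var_reported:.3f}")
```

Passing a different `mu` to `simulate` moves the latent distribution away from the center of the node grid, which makes it easy to see how the comparison changes when the symmetry is lost.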

Now suppose that some reason (for example, a bias against being associated with socially unacceptable behavior) keeps people from centering on the average of the Likert scale. In other words, the true distribution has a mean different from that offered by the Likert scale nodes. In terms of Figure 1, we can assume, for example, that the proffered Likert scale had nodes 1, 2, 3, and 4, with a mean value of 2.5, but the underlying distribution is as shown. Any respondent with a personal valuation below 1.5 assigns herself 1. Likewise, a respondent with a personal valuation above 3.5 assigns herself 4. Other intermediate values are likewise assigned to one of the integer nodes. Now, matching the colors we get a biased estimate of the mean, not just the variance. The blue and red areas together have an average value of 2.5, a bias of 0.25 above the mean. The purple and gray areas together have an average of 2, hence a bias of -0.25. Similarly,
