Fundamentals of epidemiology - an evolving text - Are you looking ...

greater nonresponse by heavy drinkers will still lower the estimate of alcohol consumption; greater
nonresponse by men may also lower the estimate of alcohol consumption).

3.4.6 Conditional imputation

Modern imputation methods achieve more accurate imputations by taking advantage of
relationships among variables. If, for example, female respondents are more likely to have a
confidant than are male respondents, then imputing a value for "presence of a confidant" can be
based on the respondent's sex. With this approach, confidant status among men will be imputed
based on the proportion of men with a confidant; confidant status among women will be imputed
based on the proportion of women with a confidant. In this way, the dataset that includes the
imputed values will give a less biased estimate of the population values than will the complete-data
cases alone.
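As a rough illustration of imputation conditional on sex, the following sketch fills each missing confidant value by drawing at random from the observed values of respondents of the same sex, so that the imputed values follow the sex-specific proportions in expectation. The function name `impute_by_stratum`, the record layout, and the data are all hypothetical; this is one simple way to do it under those assumptions, not the text's own procedure.

```python
import random

def impute_by_stratum(records, target, stratum, rng):
    """Return copies of records with missing target values drawn from the same stratum."""
    # Collect the observed (non-missing) target values within each stratum.
    observed = {}
    for rec in records:
        if rec[target] is not None:
            observed.setdefault(rec[stratum], []).append(rec[target])

    completed = []
    for rec in records:
        filled = dict(rec)  # copy, so the original records are left untouched
        if filled[target] is None:
            # A random draw from the stratum's observed values reproduces the
            # stratum-specific proportion in expectation. This raises KeyError
            # if a stratum has no observed values at all.
            filled[target] = rng.choice(observed[filled[stratum]])
        completed.append(filled)
    return completed

# Hypothetical data: 1 = has a confidant, 0 = does not, None = missing.
records = [
    {"sex": "F", "confidant": 1},
    {"sex": "F", "confidant": 1},
    {"sex": "F", "confidant": 1},
    {"sex": "F", "confidant": 0},
    {"sex": "F", "confidant": None},
    {"sex": "M", "confidant": 0},
    {"sex": "M", "confidant": 0},
    {"sex": "M", "confidant": 1},
    {"sex": "M", "confidant": None},
    {"sex": "M", "confidant": None},
]

completed = impute_by_stratum(records, "confidant", "sex", random.Random(0))
```

In this invented dataset a missing value for a woman is imputed as 1 with probability 3/4, and for a man with probability 1/3, matching the observed proportions within each sex.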

A simple extension from imputation conditional on a single variable is imputation
conditional on a set of strata formed from a number of variables simultaneously. If the number of
strata is too large, a regression procedure can be used to "predict" the value of the variable to be
imputed as a function of variables for which data are available. The coefficients in the regression
model are estimated from complete-data cases.
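The regression approach can be sketched with a single predictor: fit a least-squares line on the complete-data cases only, then use it to predict the missing value. The helper `fit_line`, the variable names, and the numbers below are hypothetical, and a real application would typically use several predictors and a model suited to the variable type (for example, logistic regression for a dichotomous variable).

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical cases as (x, y); y is None where the value is missing.
cases = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0), (5.0, None)]

# Estimate the coefficients from complete-data cases only.
complete = [(x, y) for x, y in cases if y is not None]
intercept, slope = fit_line([x for x, _ in complete], [y for _, y in complete])

# Predict the missing values from the fitted model.
predicted = [(x, y if y is not None else intercept + slope * x) for x, y in cases]
```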

Imputed values are then randomly assigned (using a procedure such as that outlined above)
using the stratum-specific distributions or predicted values from the regression model. This strategy
provides superior imputations for missing values and preserves associations between the variable
being imputed and the other variables in the model or stratification. The stronger the associations
among the variables, the more accurate the imputation. There remains, though, the
problem of what to do when the value of more than one variable is missing. If two
variables are in fact associated with each other, then imputing values to one independently of the value of
the other will weaken the observed association.
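The random-assignment step for a dichotomous variable can be sketched as follows, on the assumption that the stratum-specific proportion or the regression model supplies a predicted probability for each incomplete case. The procedure "outlined above" is not part of this excerpt, so the Bernoulli draw in `draw_binary`, the respondent identifiers, and the probabilities are all assumptions made for illustration.

```python
import random

def draw_binary(predicted_probability, rng):
    """Assign 1 with the given probability, otherwise 0."""
    # rng.random() is uniform on [0, 1), so the comparison is true with
    # probability equal to predicted_probability.
    return 1 if rng.random() < predicted_probability else 0

rng = random.Random(1999)  # fixed seed so the imputation can be reproduced

# Hypothetical predicted probabilities for two respondents with missing values.
predicted = {"resp_17": 0.75, "resp_42": 0.40}
imputed = {rid: draw_binary(p, rng) for rid, p in predicted.items()}
```

Drawing at random, rather than rounding each predicted probability to 0 or 1, avoids giving every case with the same prediction the same value, which would distort the distribution of the imputed variable.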

3.4.7 Joint imputation

Yet another step forward is joint imputation for all of the missing values in each observation.
Picture an array which categorizes all complete-data observations according to their values of the
variables being considered together and a second array categorizing all remaining observations
according to their configuration of missing values. Suppose there are three dichotomous (0-1)
variables, A, B, and C, and that A is known for all respondents but B and/or C can be missing. The
arrays might look like this:
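The author's arrays are not reproduced in this excerpt. Purely as an illustration of the idea, and with invented data, two such arrays could be tabulated as below: one counting complete-data observations by their (A, B, C) values, the other counting the remaining observations by their known values and pattern of missing values.

```python
from collections import Counter

# Hypothetical observations as (A, B, C); None marks a missing value.
# A is known for every respondent; B and/or C may be missing.
observations = [
    (0, 0, 0), (0, 1, 0), (1, 1, 1), (1, 1, 1), (1, 0, 1),
    (0, None, 1), (1, None, 1), (1, 1, None), (0, None, None),
]

# First array: complete-data observations counted by their (A, B, C) values.
complete = Counter(obs for obs in observations if None not in obs)

# Second array: remaining observations counted by their known values and
# their configuration of missing values.
incomplete = Counter(obs for obs in observations if None in obs)
```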

_____________________________________________________________________________________________
www.sph.unc.edu/EPID168/ © Victor J. Schoenbach 16. Data management and data analysis - 542
rev. 9/27/1999, 10/22/1999, 10/28/1999
