21.06.2014 Views

Questionnaire Dwelling Unit-Level and Person Pair-Level Sampling ...


The PMM method is only applicable to continuous outcome variables. With this method, a distance function is used to determine distances between the predicted mean for the recipient, obtained under a model, and the response variable outcomes for candidate donors. The respondent with the smallest distance is chosen as the donor. Unlike the NNHD, the donor is not randomly selected from a neighborhood. The advantages of PMM include the following:

• Model bias in the predicted mean can be minimized by using suitable covariates.

• The PMM method is not a pure model-based method, because the predicted mean is only used to assist in finding a donor. Hence, like NNHD, it has the flexibility of imposing certain constraints on the set of donors.

However, the choice of donor is nonrandom. This nonrandomness leads to bias in the estimators of means and totals. It also tends to make the distribution of outcome values skewed to the center. Furthermore, as mentioned earlier, the PMM method is not applicable to discrete variables, because the distance function between the recipient's predicted mean (which takes continuous values) and the donor's outcome value (which takes discrete values) is not well defined.
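The donor selection rule described above can be sketched as follows. This is a minimal illustration, not the survey's implementation: the function name `pmm_donor`, the use of absolute difference as the distance function, the optional `eligible` mask standing in for constraints on the donor set, and the first-donor-wins tie rule are all assumptions made for the example.

```python
import math


def pmm_donor(pred_mean, donor_outcomes, eligible=None):
    """Return the index of the donor whose observed outcome is closest
    to the recipient's model-predicted mean.

    pred_mean      -- predicted mean for the recipient (continuous)
    donor_outcomes -- observed outcome values of the candidate donors
    eligible       -- optional list of booleans; False excludes a donor
                      (hypothetical stand-in for donor-set constraints)
    """
    best_idx, best_dist = None, math.inf
    for i, outcome in enumerate(donor_outcomes):
        if eligible is not None and not eligible[i]:
            continue  # donor ruled out by a constraint
        dist = abs(outcome - pred_mean)  # assumed distance function
        if dist < best_dist:  # strict "<" means the first of tied donors wins
            best_idx, best_dist = i, dist
    if best_idx is None:
        raise ValueError("no eligible donor")
    return best_idx


# The recipient's predicted mean is 4.2; the donor with outcome 4.3 is nearest.
donors = [1.0, 3.9, 5.0, 4.3]
imputed_value = donors[pmm_donor(4.2, donors)]
```

Because the nearest donor is always taken, repeated runs on the same data give the same imputed value, which is the nonrandomness discussed above.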

N.2.2 Univariate and Multivariate Applications of the Predictive Mean Neighborhood Method

The PMN method is easily applicable to problems of both univariate and multivariate imputations. The need for univariate imputation arises when the value of a single variable, which cannot be easily grouped together with other variables, is missing for the respondent. On the other hand, the need for multivariate imputation arises when values of two or more related variables are missing for a single respondent. The case of a single polytomous variable with missing values also can be viewed as a multivariate imputation problem. An example of this in pair applications is a missing pair relationship for a pair where both respondents are in the 21- to 25-year-old range. In this instance, the possible outcomes are spouse-spouse without children, spouse-spouse with children, and all other pair relationships.
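One way to see why a single polytomous variable can be treated as a multivariate problem is to write each outcome as a vector of 0/1 indicators, one per category. The mean of such vectors is then a vector of category proportions rather than a single number. The sketch below uses the three pair-relationship outcomes named above; the encoding, the function names, and the sample data are illustrative assumptions, not taken from the survey's procedures.

```python
CATEGORIES = (
    "spouse-spouse without children",
    "spouse-spouse with children",
    "other pair relationship",
)


def to_indicators(category):
    """Encode one polytomous outcome as a vector of 0/1 indicators."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return [int(category == c) for c in CATEGORIES]


def mean_vector(rows):
    """Component-wise mean of indicator vectors (category proportions)."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


# Made-up outcomes for four pairs, for illustration only.
observed = [
    "spouse-spouse with children",
    "other pair relationship",
    "other pair relationship",
    "spouse-spouse without children",
]
indicators = [to_indicators(c) for c in observed]
proportions = mean_vector(indicators)  # one component per category
```

Since the components sum to one, any one of them is determined by the others, so a three-category outcome carries two free mean components.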

The standard approach to multivariate modeling, with a given set of outcome variables (including both discrete and continuous), is likely to be tedious in practice because of the computational problems due to the volume of model parameters and the difficulty in specifying a suitable covariance structure. Following Little and Rubin's (1987) proposal of a joint model for discrete and continuous variables, and its implementation by Schafer (1997), it is possible to fit a pure multivariate model for multivariate imputation, but it would require making distributional assumptions. Moreover, because of the obvious problem of specifying the probability distribution underlying survey data, none of the existing solutions take the survey design into account. However, since the 1999 survey, in the application of the multivariate predictive mean neighborhood (MPMN) method to the imputation procedures, a multivariate model has been fitted by a series of univariate parametric models (including the polytomous case), such that variables modeled earlier in the hierarchy have a chance to be included in the covariate set for subsequent models in the hierarchy. In the multivariate modeling with MPMN, the innovative idea is to express the likelihood in the superpopulation model as a product of marginal and
