Questionnaire Dwelling Unit-Level and Person Pair-Level Sampling ...
The PMM method is only applicable to continuous outcome variables. With this method,
a distance function is used to determine distances between the predicted mean for the recipient,
obtained under a model, and the response variable outcomes for candidate donors. The
respondent with the smallest distance is chosen as the donor. Unlike the NNHD, the donor is not
randomly selected from a neighborhood. The advantages of PMM include the following:
• Model bias in the predicted mean can be minimized by using suitable covariates.
• The PMM method is not a pure model-based method, because the predicted mean is
only used to assist in finding a donor. Hence, like NNHD, it has the flexibility of
imposing certain constraints on the set of donors.
However, the choice of donor is nonrandom. This nonrandomness leads to bias in the estimators
of means and totals. It also tends to pull the distribution of outcome values toward the
center. Furthermore, as mentioned earlier, the PMM method is not applicable to discrete
variables, because the distance function between the recipient's predicted mean (which takes
continuous values) and the donor's outcome value (which takes discrete values) is not well
defined.
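
The PMM donor rule described above can be illustrated with a minimal sketch. The function name pmm_donor, the use of absolute difference as the distance function, the tie-breaking rule, and the optional eligible constraint are all assumptions made for illustration; the text specifies only that the donor with the smallest distance to the recipient's predicted mean is chosen.

```python
from typing import Callable, Optional, Sequence


def pmm_donor(predicted_mean: float,
              donor_values: Sequence[float],
              eligible: Optional[Callable[[int], bool]] = None) -> int:
    """Return the index of the candidate donor whose observed outcome is
    closest to the recipient's predicted mean.

    Assumptions (not from the source): distance is the absolute
    difference, and ties go to the lowest index.
    """
    # Optional constraint on the donor set (hypothetical hook).
    candidates = [i for i in range(len(donor_values))
                  if eligible is None or eligible(i)]
    if not candidates:
        raise ValueError("no eligible donors")
    # Deterministic choice: the single nearest donor, with no random
    # draw from a neighborhood.
    return min(candidates,
               key=lambda i: abs(donor_values[i] - predicted_mean))


observed = [12.0, 30.5, 18.2, 25.0]   # donors' observed outcomes
nearest = pmm_donor(20.0, observed)   # index of the closest donor
restricted = pmm_donor(20.0, observed, eligible=lambda i: i != 2)
```

Because the same recipient always receives the same donor, the selection carries no random component, which is the nonrandomness noted above.
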
N.2.2 Univariate and Multivariate Applications of the Predictive Mean Neighborhood Method
The PMN method is easily applicable to problems of both univariate and multivariate
imputations. The need for univariate imputation arises when the value of a single variable, which
cannot be easily grouped together with other variables, is missing for the respondent. On the
other hand, the need for multivariate imputation arises when values of two or more related
variables are missing for a single respondent. The case of a single polytomous variable with
missing values also can be viewed as a multivariate imputation problem. An example of this in
pair applications is a missing pair relationship for a pair where both respondents are in the 21- to
25-year-old range. In this instance, the possible outcomes are spouse-spouse without children,
spouse-spouse with children, and all other pair relationships.
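
One way to read the polytomous case, offered here as an illustrative assumption rather than the report's stated derivation, is that a variable with three levels can be coded as a vector of indicator variables, so a model for it predicts a vector of probabilities instead of a single mean. The names CATEGORIES and to_indicators below are hypothetical.

```python
from typing import List

# The three pair-relationship outcomes named in the text.
CATEGORIES = (
    "spouse-spouse without children",
    "spouse-spouse with children",
    "all other pair relationships",
)


def to_indicators(relationship: str) -> List[int]:
    """Code one polytomous outcome as a vector of 0/1 indicators."""
    if relationship not in CATEGORIES:
        raise ValueError("unknown relationship: " + relationship)
    return [1 if relationship == c else 0 for c in CATEGORIES]


coded = to_indicators("spouse-spouse with children")
```

Exactly one indicator equals 1 for any outcome, so the three indicators sum to 1 and only two of them are free.
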
The standard approach to multivariate modeling, with a given set of outcome variables
(including both discrete and continuous), is likely to be tedious in practice because of the
computational problems due to the volume of model parameters and the difficulty in specifying a
suitable covariance structure. Following Little and Rubin's (1987) proposal of a joint model for
discrete and continuous variables, and its implementation by Schafer (1997), it is possible to fit a
pure multivariate model for multivariate imputation, but it would require making distributional
assumptions. Moreover, because of the obvious problem of specifying the probability
distribution underlying survey data, none of the existing solutions take the survey design into
account. However, since the 1999 survey, in the application of the multivariate predictive mean
neighborhood (MPMN) method to the imputation procedures, a multivariate model has been
fitted by a series of univariate parametric models (including the polytomous case), such that
variables modeled earlier in the hierarchy have a chance to be included in the covariate set for
subsequent models in the hierarchy. In the multivariate modeling with MPMN, the innovative
idea is to express the likelihood in the superpopulation model as a product of marginal and