21.06.2014 Views

Questionnaire Dwelling Unit-Level and Person Pair-Level Sampling ...


as regression, that quantify the relationship between a given covariate and the response variable, in the presence of other covariates. Thus, the response variable itself is indirectly used to determine donors.

• The problem of sparse neighborhoods is considerably reduced, making it easier to implement restrictions on the donor set. Because the distance function is defined as a continuous function of the predicted mean, it is possible to find donors arbitrarily close to the recipient. Thus, sparse neighborhoods are less likely to be a problem for hot decking. Moreover, having sufficient donors in the neighborhood allows extra constraints to be imposed on the donor set, which would be difficult to incorporate directly in the model.

• Sampling weights are easily incorporated in the models. The weighted hot deck can be viewed as a special case of PMN.

• Correlations across response variables are justified by making the imputation multivariate.

• The choice of donor can be made random by choosing delta large enough that the neighborhood contains more than one donor. Under the assumption that the recipient and the candidate donors in the neighborhood have approximately equal means, random selection mimics a draw from an error distribution with mean zero. This helps to avoid bias in estimating means and totals, variances of which can be estimated in two-phase sampling or by suitable resampling methods.
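The neighborhood-and-random-draw idea in the bullets above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `pmn_impute`, the use of a plain absolute difference of predicted means as the distance, and the fallback to the single nearest donor when the neighborhood is empty are all hypothetical choices, not details taken from the report.

```python
import random


def pmn_impute(recipient_mean, donors, delta, rng):
    """Draw an imputed value from donors whose predicted mean is near the recipient's.

    donors: list of (predicted_mean, observed_value) pairs for respondents.
    delta:  neighborhood half-width (assumed here to be an absolute tolerance).
    rng:    a random.Random instance, so the draw is reproducible.
    """
    # Assumed distance: absolute difference between predicted means.
    def distance(donor_mean):
        return abs(donor_mean - recipient_mean)

    # Observed values of all donors inside the delta-neighborhood.
    neighborhood = [obs for mean, obs in donors if distance(mean) <= delta]

    if not neighborhood:
        # Assumed fallback: use the single nearest donor.
        nearest = min(donors, key=lambda d: distance(d[0]))
        return nearest[1]

    # A neighborhood larger than one makes the donor choice random.
    return rng.choice(neighborhood)
```

With a larger delta the neighborhood holds several candidates and the draw is random; with a very small delta the sketch degenerates to a deterministic nearest-donor match.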

In comparison with other model-based methods, discrete and continuous variables can be handled jointly and relatively easily in MPMN by using the idea of univariate (conditional) modeling in a hierarchical manner. In MPMN, differential weights can be objectively assigned to different elements of the predictive mean vector, depending on the variability of predicted means in the dataset, via the Mahalanobis squared distance.
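To make the weighting effect concrete, here is a minimal pure-Python sketch of a Mahalanobis squared distance between two predictive mean vectors. The helper names (`covariance`, `solve`, `mahalanobis_sq`) and the example covariance matrix are hypothetical; the report does not specify how the covariance is estimated, so the sample covariance of the predicted means is an assumption.

```python
def covariance(rows):
    """Sample covariance matrix of a list of equal-length vectors."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(k)] for i in range(k)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            f = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        x[i] = (m[i][k] - sum(m[i][j] * x[j] for j in range(i + 1, k))) / m[i][i]
    return x


def mahalanobis_sq(u, v, cov):
    """Squared Mahalanobis distance (u - v)' cov^{-1} (u - v)."""
    d = [ui - vi for ui, vi in zip(u, v)]
    return sum(di * xi for di, xi in zip(d, solve(cov, d)))


# Made-up covariance: the first element varies more than the second.
cov = [[4.0, 0.0], [0.0, 1.0]]
print(mahalanobis_sq([2.0, 0.0], [0.0, 0.0], cov))  # gap in the high-variance element
print(mahalanobis_sq([0.0, 2.0], [0.0, 0.0], cov))  # same gap in the low-variance element
```

The same raw gap counts for less in the element whose predicted means vary more, which is the sense in which the weights follow from the data rather than from an analyst's choice.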

As noted earlier, the PMN method has some similarity with the predictive mean matching method of Rubin (1986) except that, for the donor records, the observed variable value, and not the predicted mean, is used for computing the distance function. Also, the well-known method of nearest neighbor imputation is similar to PMN, except that the distance function is in terms of the original predictor variables and would often require arbitrary scaling of discrete variables. Moreover, for this method, it is generally hard to make objective decisions about the relative weights for different predictor variables.
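The scaling issue with nearest neighbor imputation on the original predictors can be seen in a small sketch. Everything here is illustrative: the function `nearest_neighbor`, the `scale` multipliers, and the two predictors (an age in years and a 0/1 indicator) are made-up assumptions, not part of the report.

```python
import math


def nearest_neighbor(recipient, donors, scale):
    """Index of the donor closest to the recipient in scaled predictor space.

    scale: hypothetical per-predictor multipliers chosen by the analyst.
    """
    def distance(donor):
        return math.sqrt(sum((s * (a - b)) ** 2
                             for s, a, b in zip(scale, recipient, donor)))

    return min(range(len(donors)), key=lambda i: distance(donors[i]))


# Predictors: (age in years, indicator coded 0/1). Values are made up.
recipient = (30, 1)
donors = [(31, 0), (35, 1)]

# Same data, two scalings of the indicator: the selected donor changes.
print(nearest_neighbor(recipient, donors, scale=(1, 1)))   # age match wins
print(nearest_neighbor(recipient, donors, scale=(1, 10)))  # indicator match wins
```

Because the selected donor depends on a multiplier the analyst must pick, the relative weight of each predictor is a subjective input, whereas a distance defined on predicted means avoids that choice.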
