Questionnaire Dwelling Unit-Level and Person Pair-Level Sampling ...
7. Weight Calibration at Questionnaire Dwelling Unit and Pair Levels

The 2006 National Survey on Drug Use and Health (NSDUH) was based on probability
sampling so that valid inferences can be made from survey findings about the target population.
Probability sampling refers to sampling in which every unit on the frame is given a known,
nonzero probability of inclusion in the survey. This is required for unbiased estimation of the
population total. A nonzero joint inclusion probability for every pair of units in the
frame is also required for unbiased variance estimation. The basic sampling plan involved four
stages of selection across two phases of design: within Phase I, (1) the selection of census tracts
within each State sampling (SS) region; (2) the selection of subareas or segments (composed of
U.S. Bureau of the Census blocks) within SS regions; (3) the selection of dwelling units (DUs)
within these subareas; and, finally, within Phase II, (4) the selection of eligible individuals within
DUs. Specific details of the sample design and selection procedures can be found
in the 2006 NSDUH sample design report (Morton et al., 2007).

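To make the link between known, nonzero inclusion probabilities and unbiased estimation of a total concrete, the sketch below computes a design weight as the inverse of the product of conditional stage-level selection probabilities and then forms a weighted (Horvitz-Thompson type) total. It is only an illustration under stated assumptions: the function names and the probabilities are hypothetical and are not taken from the NSDUH design.

```python
from math import prod


def design_weight(stage_probs):
    """Inverse of the overall inclusion probability.

    stage_probs holds one conditional selection probability per stage
    (hypothetical values; the real ones come from the sample design).
    """
    if any(p <= 0 or p > 1 for p in stage_probs):
        # A zero probability would make the unit unreachable and the
        # weighted estimator biased, so it is rejected here.
        raise ValueError("each stage probability must be in (0, 1]")
    return 1.0 / prod(stage_probs)


def weighted_total(values, weights):
    """Estimate a population total as the weighted sum over respondents."""
    return sum(y * w for y, w in zip(values, weights))


# Four stages: tract, segment, dwelling unit, person (illustrative numbers).
w = design_weight([0.5, 0.25, 0.125, 0.5])
print(w)
```

Each respondent stands in for `w` members of the target population, which is why every unit on the frame needs a known, nonzero chance of selection.
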
As part of the postsurvey data-processing activities, analysis weights that reflected the
selection probabilities from various stages of the sample design were calculated for respondents.
These sample weights were adjusted at the DU (screening sample), questionnaire dwelling unit
(QDU), person, and paired respondent levels (the latter three all based on the drug
questionnaire sample) to account for bias due to extreme values (ev), nonresponse (nr), and
coverage.

The final sample weights for Phase I screener dwelling units (SDUs) and for the Phase II QDU,
person, and pair levels in the 2006 sample consisted of products of several factors, each
representing either a probability of selection at some particular stage or some form of ev, nr, or
poststratification (ps) calibration adjustment. In the following sections, we describe the QDU and
pair weight components in greater detail. In summary, the first 10 factors were defined for all SDUs
and reflected the fully adjusted SDU sample weight. The remaining components branched to reflect
QDU and pair selection probabilities, as well as additional adjustments for ev, nr, and ps. Note
that the final QDU and pair weights for the 2006 survey sample are the product of all weight
components for each type of sample, as illustrated in Exhibits 7.1 and 7.2.

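The product structure described above can be sketched as follows. This is a minimal illustration, not the survey's actual computation: the function name and every factor value are hypothetical, and the real components are those listed in Exhibits 7.1 and 7.2. The only detail taken from the text is that the first 10 factors are shared SDU-level factors and the remaining ones depend on the branch (QDU or pair).

```python
from math import prod


def final_weight(sdu_factors, branch_factors):
    """Multiply the shared SDU-level factors by the branch-specific factors.

    sdu_factors:    the 10 factors defined for all SDUs (hypothetical values)
    branch_factors: remaining QDU- or pair-specific factors (hypothetical)
    """
    if len(sdu_factors) != 10:
        raise ValueError("expected the 10 SDU-level factors")
    return prod(sdu_factors) * prod(branch_factors)


# Illustrative numbers only: the same SDU weight feeds both branches.
sdu = [2.0, 1.5, 1.0, 1.2, 1.0, 1.1, 1.0, 1.0, 1.05, 1.0]
qdu_weight = final_weight(sdu, [1.3, 1.1])
pair_weight = final_weight(sdu, [2.4, 1.2, 1.05])
print(qdu_weight, pair_weight)
```

Because the SDU-level factors are common to both branches, the QDU and pair weights for one dwelling unit differ only through the branch-specific selection probabilities and adjustments.
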
For QDU data, generalized exponential model (GEM) calibration was
applied by partitioning the data into four groups of States (Northeast, South, Midwest, and West)
based on census regions, in the interest of computational feasibility. Previous experience showed
that, with current computing power, the large number of variables and records prevented
combining the data into fewer modeling groups.

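The partitioning step can be pictured with a short sketch that splits records into one modeling group per census region, so that a separate calibration model could be fit to each group. The record layout, the `region` key, and the function name are assumptions made for illustration; the GEM fitting itself is not shown.

```python
from collections import defaultdict

CENSUS_REGIONS = ("Northeast", "South", "Midwest", "West")


def partition_by_region(records, region_key="region"):
    """Group records by census region (hypothetical record layout)."""
    groups = defaultdict(list)
    for rec in records:
        region = rec[region_key]
        if region not in CENSUS_REGIONS:
            raise ValueError(f"unknown census region: {region!r}")
        groups[region].append(rec)
    return dict(groups)


# Illustrative records only.
records = [
    {"id": 1, "region": "South"},
    {"id": 2, "region": "West"},
    {"id": 3, "region": "South"},
]
for region, group in partition_by_region(records).items():
    print(region, len(group))
```

Smaller groups keep each model computationally tractable, at the cost of fewer observations per model.
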
For pair data, GEM calibration was initially applied by partitioning the pair data into four
groups based on census regions. However, there were not enough observations in each group to
fit a model comprehensive enough to reduce bias. A single model for the entire pair data set was
then attempted, but it was rejected as impractical because of computational limitations. A
compromise approach was adopted by combining census regions into two groups: Northeast with
South and Midwest with West. This grouping proved both manageable and desirable as it